Wasp•10mo ago
Zuck

Anyone got RAG working with WASP?

My current approach is to use LangChain's built-in vector storage, but I want the vector data stored in the Postgres database that Wasp creates. - crappy coder that uses AI to code religiously
26 Replies
kapa.ai
kapa.ai•10mo ago
Hi @Jacob Ferrari. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

Based on the knowledge sources provided, there isn't a specific example of using RAG (Retrieval-Augmented Generation) with Wasp. However, there is a discussion on using Wasp with ChatGPT embeddings (pgvector) that may be close to what you're trying to achieve. In this GitHub issue (https://github.com/wasp-lang/wasp/issues/1215), a user describes using pgvector (https://github.com/pgvector/pgvector) to add vector embeddings to a Postgres database. Since Wasp's Prisma can't interact with the vector column that pgvector creates, you'll need Prisma's raw queries for all CRUD operations on that field. Code snippets provided by that user:
- Getting an embedding from the OpenAI API: link (https://github.com/maksim36ua/feedback_hub/blob/main/src/server/services/apiService.ts#LL34C23-L34C23)
- Saving a vector into the database: link (https://github.com/maksim36ua/feedback_hub/blob/main/src/server/actions/saveReducedFeedback.ts#L9)
- Running SQL queries against the vectors in the database: link (https://github.com/maksim36ua/feedback_hub/blob/main/src/server/actions/getFeedbackSimilarities.ts#L17)

Note that vector data types are not currently supported by Prisma, so you'll have to use raw queries with prismaClient.$queryRaw instead of Prisma's built-in methods. For more information on how Wasp interacts with databases, check the official documentation (https://wasp-lang.dev/docs/data-model/backends).
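[Editor's sketch] Since Prisma has no native vector type, the usual workaround is to serialize the embedding into pgvector's `[x,y,z]` text literal and pass it through `$queryRaw`/`$executeRaw` with a `::vector` cast. A minimal sketch, assuming a hypothetical `Document` table with an `embedding` column (names are illustrative, not from the linked app):

```typescript
// Format a float array as a pgvector text literal, e.g. "[0.1,0.2,0.3]".
// pgvector parses this form when the value is cast with ::vector.
export function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Hypothetical usage inside a Wasp action (table/column names assumed):
//
// await prisma.$executeRaw`
//   INSERT INTO "Document" (content, embedding)
//   VALUES (${content}, ${toVectorLiteral(embedding)}::vector)
// `;
//
// const matches = await prisma.$queryRaw`
//   SELECT id, content
//   FROM "Document"
//   ORDER BY embedding <=> ${toVectorLiteral(queryEmbedding)}::vector
//   LIMIT 5
// `;
```

Here `<=>` is pgvector's cosine-distance operator; `<->` (L2 distance) works the same way.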
Zuck
ZuckOP•10mo ago
Prisma | 🦜️🔗 Langchain
For augmenting existing models in PostgreSQL database with vector search, Langchain supports using Prisma together with PostgreSQL and pgvector Postgres extension.
miho
miho•10mo ago
Here's an example app with PG Vector 🙂 https://github.com/wasp-lang/ask-the-documents
GitHub
GitHub - wasp-lang/ask-the-documents: Ask The Documents (Embeddings...
Ask The Documents (Embeddings / ChatGPT) with Wasp & PG Vector - wasp-lang/ask-the-documents
Zuck
ZuckOP•10mo ago
yaaaaayy thank you!
howcaniexplain
howcaniexplain•8mo ago
How has your experience been so far using PG Vector with Wasp? I'm working on something similar and wondering if PG Vector is the best move, or if you've had success with something else.
Zuck
ZuckOP•8mo ago
I ended up using Pinecone for my vector database needs. It felt like a pain (for me at least) to muck around with the Postgres database. I rely heavily on AI to help write my code and found Claude 3.5 had a ton of knowledge on Pinecone, and I used Pinecone a looong time ago (lol, like a year ago), so I thought it was the best choice, and it's flexible. If I could, I would have just used the OpenAI Assistants API though.
AC
AC•4w ago
How did the example Wasp application get around Prisma limitations for vector columns? Was it through this kind of implementation? https://github.com/vercel/examples/blob/nuxt-blob-example/storage/postgres-pgvector/prisma/schema.prisma
GitHub
examples/storage/postgres-pgvector/prisma/schema.prisma at nuxt-blo...
Enjoy our curated collection of examples and solutions. Use these patterns to build your own robust and scalable applications. - vercel/examples
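[Editor's sketch] The common pattern in pgvector + Prisma setups like the linked example is to enable the `postgresqlExtensions` preview feature and declare the column with `Unsupported(...)`, which keeps Prisma migrations working while leaving reads/writes of that column to raw queries. A sketch, with hypothetical model and field names:

```prisma
generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["postgresqlExtensions"]
}

datasource db {
  provider   = "postgresql"
  url        = env("DATABASE_URL")
  extensions = [vector]
}

model Document {
  id        Int     @id @default(autoincrement())
  content   String
  // Prisma can't read or write this column directly;
  // use $queryRaw / $executeRaw for it instead.
  embedding Unsupported("vector(1536)")?
}
```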
AC
AC•4w ago
Got it working - Amazing!!! Just saved me a bunch of time. Thanks! @miho So, I'm having to make some adjustments to my first intended implementation. Basically, I want a system that creates the vectorized embeddings, and I realized that (of course) Node isn't the best fit for this. I've created a LangChain-powered Flask app to create the embeddings, which can be triggered from my Wasp app. Looking at the embeddings app in Wasp, I'm wondering if there's info on search latency? I took a look through the docs and couldn't find anything. The other option is to run the search in Flask, but I'm concerned about latency there as well.
miho
miho•4w ago
We don't have that info at hand, it would depend on your data set I guess 🙂 you should measure and see what works for you
AC
AC•4w ago
Like OP, I ended up going with a separate database (Chroma) after running some latency tests and further requirements analysis.
Stefan Vitória
Stefan Vitória•4w ago
i am
AC
AC•3w ago
What did you end up using @Stefan Vitória? I'm currently finishing up a RAG/AI model built in Flask, with Pydantic. I'm just about to start abstracting out all my AI API calls from Wasp into that app...the latency so far is much better.
Vinny (@Wasp)
Vinny (@Wasp)•3w ago
How's it going? Everything working?
AC
AC•3w ago
Working through it! I created a Flask microservice and integrated LangChain and Pydantic. I'm now doing vectorized embedding in Flask and managing all the HuggingFace API calls through there, then creating endpoints to call from Wasp/Node. So I'm going to extract all my previous API calls from Node into Flask; it's so much easier to handle with Pydantic. I made a quick dashboard in Wasp so that I can track latency, etc. It's obviously a bit buggy atm, but good enough to work with quickly so that I can refine my calls and then integrate them with frontend components. This approach is cutting down on dev time significantly for me.
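[Editor's sketch] The Wasp/Node side of a setup like this is just an HTTP call to the Python service. A minimal sketch of delegating embedding to it from a Wasp action; the service URL, `/embed` route, and payload shape are assumptions, not details from the actual app:

```typescript
// Assumed dev address of the Flask microservice (hypothetical).
const EMBED_SERVICE_URL = "http://localhost:5000";

// Build the request for the (hypothetical) POST /embed endpoint.
export function buildEmbedRequest(texts: string[]) {
  return {
    url: `${EMBED_SERVICE_URL}/embed`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ texts }),
    },
  };
}

// Hypothetical Wasp action delegating to the Python service:
//
// export const embedDocuments = async ({ texts }, context) => {
//   const { url, init } = buildEmbedRequest(texts);
//   const res = await fetch(url, init);
//   if (!res.ok) throw new HttpError(502, "embedding service failed");
//   return res.json(); // e.g. { embeddings: number[][] }
// };
```

Keeping the request-building pure makes it easy to unit-test without the service running.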
Vinny (@Wasp)
Vinny (@Wasp)•3w ago
That’s awesome. Great to hear. Let us know if you’ve got any other questions.
NEROX
NEROX•6d ago
So, nobody has accomplished it with pure wasp and the PGVector extension?
AC
AC•5d ago
I got it working, but the latency didn't make sense for my use case @NEROX . I have 7 different RAG queries during a 5 minute onboarding process. It was just easier and more efficient to create a Python layer with Pydantic.
miho
miho•5d ago
latency didn't make sense for my use case
What was the setup where the latency was bad? I'm just curious, was it Wasp + PgVector or you had something in-between?
AC
AC•5d ago
Wasp + PgVector. But also, keep in mind (1) I've only been coding in earnest for 5 months; (2) I wanted to be able to experiment with different chunking methods; (3) I did some Python data science in uni, so am a bit more familiar with it.
miho
miho•5d ago
Gotcha. In the background it's just Node.js connecting to Postgres, so the latency should be fine if the app and the db are geographically close. It might be related to how Prisma executes the queries or how you write your queries. I'd love to measure the latency at some point to know for sure.
AC
AC•5d ago
Yes, for sure. Happy to do a screenshare and show you what I've done once I'm past alpha testing, which should be in a week or two.
miho
miho•5d ago
Nice, ping me when you are ready 🙏🏼
NEROX
NEROX•5d ago
I never thought I'd get to build something like this, thanks guys for so much valuable info
Vinny (@Wasp)
Vinny (@Wasp)•4d ago
💣