Anyone got RAG working with WASP?
My current approach is to use Langchain's built-in vector storage, but I want the vector data to be stored in the Postgres database that Wasp creates. - crappy coder that uses AI to code religiously
26 Replies
Hi @Jacob Ferrari. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
----------------------
Based on the knowledge sources provided, there isn't a specific example of using RAG (Retrieval-Augmented Generation) with Wasp. However, there is a discussion on using Wasp with ChatGPT embeddings (pgvector) which might be similar to what you're trying to achieve.
In this GitHub issue (https://github.com/wasp-lang/wasp/issues/1215), a user mentioned using pgvector (https://github.com/pgvector/pgvector) to add vector embeddings to a Postgres database. Since Wasp's Prisma won't be able to interact with the vector field created by pgvector, you'll need to use Prisma's queryRaw for all your CRUD operations on that field.
Here are some code snippets provided by the user:
- Function to get an embedding from OpenAI API: link (https://github.com/maksim36ua/feedback_hub/blob/main/src/server/services/apiService.ts#LL34C23-L34C23)
- How to save your vector into the database: link (https://github.com/maksim36ua/feedback_hub/blob/main/src/server/actions/saveReducedFeedback.ts#L9)
- How to perform SQL queries with the vectors in the database: link (https://github.com/maksim36ua/feedback_hub/blob/main/src/server/actions/getFeedbackSimilarities.ts#L17)
Please note that vector data types are not currently supported by Prisma, so you'll have to use raw Prisma queries with prismaClient.$queryRaw instead of Prisma's built-in methods.
For more information on how Wasp interacts with databases, you can check out the official documentation (https://wasp-lang.dev/docs/data-model/backends).
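Since Prisma can't type the vector column, one common pattern (a sketch, not code from the linked repo) is to serialize the embedding array into pgvector's text literal and pass it through $queryRaw. The helper below is minimal; the "Document" table and "embedding" column names are assumptions for illustration:

```typescript
// Convert a JS number array into pgvector's text literal, e.g. "[0.1,0.2,0.3]".
// pgvector accepts this string when cast with ::vector in SQL.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Hypothetical usage with Prisma's raw queries (table/column names assumed):
//
// await prisma.$executeRaw`
//   UPDATE "Document"
//   SET embedding = ${toVectorLiteral(vec)}::vector
//   WHERE id = ${id}
// `;
```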
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Sorry @Jacob Ferrari, I seem to be having a bit of trouble. Don't worry though, I've pinged my support and they will try to fix me up! :technologist:
Perhaps try again in 15 mins?
can I use the PGVector extension?
https://js.langchain.com/v0.1/docs/integrations/vectorstores/prisma/
Prisma | 🦜️🔗 Langchain
For augmenting existing models in PostgreSQL database with vector search, Langchain supports using Prisma together with PostgreSQL and pgvector Postgres extension.
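For intuition about what the similarity search does: pgvector's `<=>` operator returns cosine distance between two vectors. A plain TypeScript equivalent (purely illustrative; this is not how Langchain or pgvector is invoked) looks like:

```typescript
// Cosine distance between two equal-length vectors: 1 - cos(theta).
// 0 means identical direction, 1 means orthogonal, 2 means opposite.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```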
Here's an example app with PG Vector 🙂 https://github.com/wasp-lang/ask-the-documents
GitHub
GitHub - wasp-lang/ask-the-documents: Ask The Documents (Embeddings...
Ask The Documents (Embeddings / ChatGPT) with Wasp & PG Vector - wasp-lang/ask-the-documents
yaaaaayy thank you!
How has your experience so far using PG Vector with WASP been? I'm working on something similar and wondering if PG Vector is the best move, or if you've had success with something else.
I ended up using Pinecone for my vector database needs.
It felt like a pain (for me at least) to muck around with the Postgres database.
I rely heavily on AI to help write my code and found Claude 3.5 had a ton of knowledge of Pinecone. I used Pinecone a looong time ago (lol, like a year ago), so I thought it was the best choice, and it's flexible.
If I could, I would have just used the OpenAI Assistants API though
How did the example Wasp application get around Prisma limitations for vector columns? Was it through this kind of implementation? https://github.com/vercel/examples/blob/nuxt-blob-example/storage/postgres-pgvector/prisma/schema.prisma
GitHub
examples/storage/postgres-pgvector/prisma/schema.prisma at nuxt-blo...
Enjoy our curated collection of examples and solutions. Use these patterns to build your own robust and scalable applications. - vercel/examples
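Roughly, yes. The usual workaround is to declare the column with Prisma's Unsupported type, so migrations keep the column while the generated client skips it, and all reads/writes go through $queryRaw / $executeRaw. A minimal sketch (model name and the 1536 dimension are assumptions; 1536 matches OpenAI's older embedding models):

```prisma
model Document {
  id        Int     @id @default(autoincrement())
  content   String
  // Prisma can't model pgvector's column type, so it's marked Unsupported;
  // CRUD for this field must use raw queries.
  embedding Unsupported("vector(1536)")?
}
```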
Got it working - Amazing!!! Just saved me a bunch of time. Thanks!
@miho So, I'm having to make some adjustments on my first intended implementation. Basically, I want a system that creates the vectorized embeddings, and realizing that (of course) Node is not best for this. I've created a Langchain powered Flask app to create the embeddings, which can be triggered via my Wasp app. Looking at the embeddings-app in Wasp, I'm wondering if there's info on search latency? I took a look through the docs and couldn't find anything. The other option is to run the search in Flask, but concerned about latency there as well.
Wohooo @AC, you just became a Waspeteer level 4!
We don't have that info at hand, it would depend on your data set I guess 🙂 you should measure and see what works for you
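One quick way to measure is to wrap the search call in a timer; a sketch (the commented-out query is hypothetical, substitute whatever you're benchmarking):

```typescript
// Minimal latency probe: run an async operation and report elapsed ms.
async function timed<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  const result = await fn();
  console.log(`${label}: ${(performance.now() - start).toFixed(1)} ms`);
  return result;
}

// Hypothetical usage against a pgvector search:
// const docs = await timed("pgvector search", () =>
//   prisma.$queryRaw`SELECT id FROM "Document"
//     ORDER BY embedding <=> ${vecLiteral}::vector LIMIT 5`);
```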
Like OP, I ended up going with a separate database (Chroma) after running some latency tests and further requirements analysis.
I am
What did you end up using @Stefan Vitória? I'm currently finishing up a RAG/AI model built in Flask, with Pydantic. I'm just about to start abstracting out all my AI API calls from Wasp into that app... the latency so far is much better.
How's it going? Everything working?
Working through it! I created a Flask microservice, integrated Langchain and Pydantic. I am now doing vectorized embedding in Flask, and managing all the HuggingFace API calls through there. Then creating endpoints to call on from Wasp/Node. So, going to extract out all my previous API calls from Node into Flask. So much easier to handle with Pydantic. I made a quick dashboard in Wasp so that I can track latency, etc. It's obviously a bit buggy atm, but good enough to work with quickly so that I can refine my calls and then integrate them with frontend components. This approach is cutting down on dev time significantly for me.

That’s awesome. Great to hear. Let us know if you’ve got any other questions.
So, nobody has accomplished it with pure wasp and the PGVector extension?
Yes, @miho did here --> https://discord.com/channels/686873244791210014/1239978307181740206/1239985367613177978
I got it working, but the latency didn't make sense for my use case @NEROX .
I have 7 different RAG queries during a 5 minute onboarding process. It was just easier and more efficient to create a Python layer with Pydantic.
> latency didn't make sense for my use case
What was the setup where the latency was bad? I'm just curious, was it Wasp + PgVector, or did you have something in between?
Wasp + PgVector. But also, keep in mind (1) I've only been coding in earnest for 5 months; (2) I wanted to be able to experiment with different chunking methods; (3) I did some Python data science in uni, so am a bit more familiar with it.
Gotcha. Because in the background it's just Node.js connecting to Postgres, the latency should be fine if the app and the db are geographically close.
It might be related to how Prisma executes the queries or how you write your queries. I'd love to measure the latency at some point to know for sure.
Yes, for sure, happy to do a screenshare and show you what I've done once I'm past alpha testing which should be in a week or two
Nice, ping me when you are ready 🙏🏼
I never thought I'd get to build something like this, thanks guys for so much valuable info
💣