Storing objects of a specific class

Hello. Could you please tell me how to store an object of type "Document" (from langchain.docstore.document import Document) in a table?
kostas (9mo ago)
Hello! Here's a snippet showing what a Document is and how to write it to Xata. Note that you need a column named "content" of type "Text", and any fields present in "metadata" (like "source") should correspond to column names in your Xata schema as well.
import os

from langchain_core.documents import Document
from langchain_community.vectorstores.xata import XataVectorStore
from langchain_openai import OpenAIEmbeddings

# Xata credentials; db_url is a placeholder for your own database URL
api_key = os.environ["XATA_API_KEY"]
db_url = "https://workspace-123456.us-east-1.xata.sh/db/langchaindb"

# A Document is page_content plus a metadata dict
docs = [
    Document(page_content="a user's message", metadata={'source': 'chatroom1'})
]

# Reads OPENAI_API_KEY from the environment
embeddings = OpenAIEmbeddings()

# Embeds the documents and writes them to the "vectors" table
vector_store = XataVectorStore.from_documents(
    docs, embeddings, api_key=api_key, db_url=db_url, table_name="vectors"
)
Note: the OPENAI_API_KEY environment variable must be set. Reference: https://python.langchain.com/docs/integrations/vectorstores/xata/#create-the-xata-vector-store
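One way to check the column requirement mentioned above before writing anything is to collect the metadata keys from your documents and compare them with the table's column names. This is only a sketch under assumptions: `schema_columns` is a hand-typed, hypothetical list (nothing here fetches the schema from Xata), `missing_columns` is a made-up helper name, and `FakeDoc` is a stand-in with the same two attributes as a LangChain Document so the snippet runs without any dependencies.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document (same two attributes), for illustration only."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def missing_columns(docs, schema_columns):
    """Return the names ("content" plus all metadata keys) that have no matching column."""
    needed = {"content"}
    for doc in docs:
        needed.update(doc.metadata.keys())
    return sorted(needed - set(schema_columns))


docs = [FakeDoc("a user's message", {"source": "chatroom1"})]

# Hypothetical column list, typed in by hand for this example.
schema_columns = ["content", "embedding"]

print(missing_columns(docs, schema_columns))
```

If the printed list is not empty, those columns would presumably need to be added to the table before the write succeeds.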
Андрей (OP, 9mo ago)
Thank you for your reply. My goal is not to create a vector store, but to save the chunks (Documents) returned by a query to the vector database so that I can analyze them.
kostas (9mo ago)
You can load the Document into a dictionary and pick the columns you want to store in Xata from there:
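from langchain_core.documents import Document
from xata.client import XataClient

docs = [
    Document(page_content="a user's message", metadata={'source': 'chatroom1'})
]

# Turn the Document into a plain dict with "page_content" and "metadata" keys
my_document = dict(docs[0])

# Keep only the fields that have matching columns in the table
payload = {
    "content": my_document["page_content"],
    "source": my_document["metadata"]["source"],
}

# db_url as in the earlier snippet; the API key is read from XATA_API_KEY
xata = XataClient(db_url=db_url)
record = xata.records().insert("mytable", payload)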
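If a query returns several chunks, the same idea can be wrapped in a small helper that turns each Document into a flat record. This is a rough sketch, not part of any library: `document_to_record` and its `metadata_columns` parameter are made-up names, and `FakeDoc` is a stand-in with the same two attributes as a LangChain Document so the snippet runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document (same two attributes), for illustration only."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def document_to_record(doc, metadata_columns):
    """Map page_content to "content" and copy the selected metadata keys as columns."""
    record = {"content": doc.page_content}
    for column in metadata_columns:
        # Skip keys a particular document does not carry.
        if column in doc.metadata:
            record[column] = doc.metadata[column]
    return record


docs = [
    FakeDoc("a user's message", {"source": "chatroom1"}),
    FakeDoc("another message", {"source": "chatroom2", "lang": "en"}),
]

records = [document_to_record(doc, ["source"]) for doc in docs]
print(records)

# With a real client, each record could then be inserted one by one:
#   xata.records().insert("mytable", record)
```

Listing the metadata columns explicitly keeps stray metadata keys (like "lang" here) out of the payload when the table has no column for them.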
Андрей (OP, 9mo ago)
Thank you, just what I needed! Learning ))
kostas (9mo ago)
Happy to help!
Андрей (OP, 9mo ago)
Help me, please. I can't understand the syntax of this query. I need to select all sub-queries that belong to the questions of a certain dialogue.
data = xata.data().query("subquery", {
    "columns": ["number", "subquery", "subanswer", "query.*"],
    "filter": {"query.dialog_id": dialog_id},
})
kostas (9mo ago)
It queries a table named subquery (xata.data().query("subquery", ...)), retrieves a list of columns, including all the columns of the linked table behind a link column named query ("query.*"), and applies a filter.
Documentation references that will help you understand the query structure (switch the code tab to Python): https://xata.io/docs/sdk/get#columns-selection, https://xata.io/docs/sdk/get#selecting-columns-from-the-linked-tables, https://xata.io/docs/sdk/filtering#exact-matching
Perhaps what's confusing you is that subquery refers to an actual table name, while query refers to a link column in the subquery table. Do you really have tables and columns named like that?
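To make the three parts of that query easier to see, here is a small sketch that builds the request body as a plain dictionary. The helper name `build_subquery_request` and the example id "dialog-42" are made up for illustration; the table, column, and link names are the ones from the question.

```python
def build_subquery_request(dialog_id):
    """Build the query body: which columns to return and how to filter the rows."""
    return {
        # Plain columns of the subquery table, plus every column of the
        # record that the link column "query" points to.
        "columns": ["number", "subquery", "subanswer", "query.*"],
        # Exact match on a column of the linked record, using dot notation.
        "filter": {"query.dialog_id": dialog_id},
    }


request = build_subquery_request("dialog-42")
print(request)

# With a real client, the body is passed together with the table name:
#   data = xata.data().query("subquery", request)
```

The dot notation only works if "query" is a link column in the subquery table, which is why the column name matters.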
Андрей (OP, 9mo ago)
I had named the link column in the subquery table incorrectly: it was "query_id", but it should have been "query".
kostas (9mo ago)
Great, sounds like you made progress!
Андрей (OP, 9mo ago)
Only with your help!