I can repro. It seems the dash only lists old (v1) indexes and not v2. I've let the team know, so hopefully they'll look into this next week. Until then you can use wrangler vectorize list to see your v2 indexes.
15 Replies
C J · 3mo ago
The v2 docs indicate that querying indexed metadata can truncate long text values. I understand not wanting to define the exact value in the API, although is there a guideline for this truncation threshold? I would like to store the original text associated with the vector. In my use case, these text values are generally a few hundred words or less.
Jerome · 3mo ago
You can store up to 10KiB of metadata per vector; Vectorize will either accept the data if it fits, or reject the upsert if not. The truncation happens in the metadata index at 64B, meaning you can filter indexed metadata properties on their first 64B. The original vector metadata is never truncated, and can be obtained verbatim on vector query by specifying the "metadata": "all" option.
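A minimal sketch of the two limits described above, for checking data before an upsert. The helper names are hypothetical, and two points are assumptions rather than confirmed behavior: that the 10KiB limit is measured on the UTF-8 JSON encoding of the metadata, and that the 64B cutoff counts UTF-8 bytes.

```typescript
// Limits as quoted in the message above; they may change.
const MAX_METADATA_BYTES = 10 * 1024; // 10 KiB of metadata per vector
const INDEXED_PREFIX_BYTES = 64; // the metadata index keeps the first 64 B

const encoder = new TextEncoder();

// Assumption: size is measured on the UTF-8 JSON encoding of the metadata.
function metadataFits(metadata: Record<string, unknown>): boolean {
  return encoder.encode(JSON.stringify(metadata)).length <= MAX_METADATA_BYTES;
}

// Assumption: the cutoff counts UTF-8 bytes. Returns the longest prefix of
// `value` that fits in 64 bytes without splitting a code point, i.e. the
// part of the value a filter could still distinguish.
function indexedPrefix(value: string): string {
  let used = 0;
  let out = "";
  for (const ch of value) {
    // for...of iterates by code point, so multi-byte characters stay whole
    const size = encoder.encode(ch).length;
    if (used + size > INDEXED_PREFIX_BYTES) break;
    used += size;
    out += ch;
  }
  return out;
}
```

Two values that share the same `indexedPrefix` would be indistinguishable to a filter under these assumptions, even though the full metadata is stored untouched.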
C J · 3mo ago
Thank you for those details. That's really helpful. So basically ~64 ASCII characters before truncation when using indexed metadata? The challenge with the metadata: all option for me is that it limits topK to 20 results. I'm hoping to return closer to 40 results, as I rerank and then trim the set before presenting it to the LLM. Metadata: indexed would let me return topK 100, but the truncation cutoff makes it less useful. Is the topK max of 20 with metadata: all expected to increase in the future? Otherwise I will need to completely re-architect and store metadata independently of vectors.
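The fallback mentioned at the end of this message, storing the text independently of the vectors, could look roughly like the sketch below. `VectorSearch` and `TextStore` are hypothetical stand-in interfaces defined here for illustration; they are not the Vectorize or KV types, and the query options shown are an assumption.

```typescript
// Hypothetical minimal shapes, defined only for this sketch.
interface Match {
  id: string;
  score: number;
}
interface VectorSearch {
  query(vector: number[], opts: { topK: number }): Promise<{ matches: Match[] }>;
}
interface TextStore {
  get(id: string): Promise<string | null>;
}
interface Candidate extends Match {
  text: string;
}

// Query for ids and scores only, then look up the original text by id in a
// separate store. Matches whose text is missing are dropped.
async function retrieveCandidates(
  index: VectorSearch,
  store: TextStore,
  vector: number[],
  topK = 40,
): Promise<Candidate[]> {
  const { matches } = await index.query(vector, { topK });
  const texts = await Promise.all(matches.map((m) => store.get(m.id)));
  const out: Candidate[] = [];
  matches.forEach((m, i) => {
    const text = texts[i];
    if (text != null) out.push({ ...m, text });
  });
  return out;
}
```

The returned candidates keep the order of the vector matches, so they can be passed straight to a reranker and trimmed afterwards.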
maro · 3mo ago
Hey guys! I'm aware that Vectorize is currently in beta, but I'm curious about its reliability for production use. My specific use case involves a chatbot that serves approximately 1 million users per month. If Vectorize isn't suitable, could you recommend any other alternatives? Pinecone? pgvector?
scotto · 3mo ago
Any plans to support arrays in metadata filtering?
hannojg · 3mo ago
Hey, quick question: is it expected that on the web dashboard you don't see any of the vector indexes you've created (while in wrangler you do)?
Ah, that has already been answered here: https://discord.com/channels/595317990191398933/1279371665859674132/1280103851298394145 thanks!
Is there a limit to how many vector indexes we can create?
Edit: found the limits docs, all good
hannojg · 3mo ago
Cool! Are there any early details on how the limits will increase? 😊
ac · 3mo ago
Is there any way to upgrade a v1 Vectorize index to v2, or do I need to create a completely new one?
Isaac McFadyen · 3mo ago
TIL that you can link a slash command in a message 😄
hannojg · 3mo ago
In our use case we don't have a lot of data belonging to one category, but a lot of smaller data belonging to many segments. So I am wondering how I can segment the vectors. The options I see are:
- creating an index per segment
- using namespaces in the same index
- metadata filtering (less preferred)
Both indexes and namespaces have quite a tight limit for our use case.
Kind of like this: if we already have 20,000 customers, and each customer has ~15 projects for which I'd need separate segments, it's hard to do that with the current limits.
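To make the numbers above concrete, here is a rough sketch of the metadata-filtering option using one composite key per customer/project pair. The property name `segment`, the key format, and the equality-filter shape are all assumptions for illustration; the 64-byte check reflects the filtering cutoff mentioned earlier in the thread.

```typescript
// Figures from the message above.
const customers = 20_000;
const projectsPerCustomer = 15;
const segmentCount = customers * projectsPerCustomer; // 300,000 segments

// Earlier in the thread: indexed metadata is filterable on its first 64 B.
const FILTERABLE_BYTES = 64;

// Builds a composite key such as "cust-123:proj-7" (hypothetical format) and
// rejects keys that would be cut off in the metadata index.
function segmentKey(customerId: string, projectId: string): string {
  const key = `${customerId}:${projectId}`;
  if (new TextEncoder().encode(key).length > FILTERABLE_BYTES) {
    throw new Error(`segment key exceeds ${FILTERABLE_BYTES} bytes: ${key}`);
  }
  return key;
}

// Assumed filter shape: equality on a single metadata property that would
// have to be set up as indexed beforehand.
function segmentFilter(customerId: string, projectId: string): { segment: string } {
  return { segment: segmentKey(customerId, projectId) };
}
```

With 300,000 segments, a single filterable property avoids needing one index or namespace per segment, at the cost of relying on the less preferred filtering path.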