There is something bad going on with Vectorize indexes. I tried three times to create a text embeddings index, each time defined with the cosine metric and 1536 dimensions. It appears that after inserting a certain amount of vectors my index breaks and my queries start returning absolutely nonsensical "scores". For example, right now my queries to this cosine index are returning scores like 28.955 (and that is breaking search results).
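For context on why 28.955 looks wrong: cosine similarity is bounded to the range [-1, 1], so a score far outside that range cannot be a plain cosine similarity. Below is a minimal sketch in plain Python (no Vectorize calls; `cosine_similarity` and `is_valid_cosine_score` are hypothetical helper names) that could be used to sanity-check returned scores on the client side.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_valid_cosine_score(score, tol=1e-6):
    """True if the score lies in [-1, 1], allowing a small float tolerance."""
    return -1.0 - tol <= score <= 1.0 + tol

# Scaling a vector does not change its cosine similarity to another vector.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
reported_ok = is_valid_cosine_score(28.955)
```

Logging any result for which `is_valid_cosine_score` returns `False` may help pin down at what point the index starts misbehaving.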
I don't see my Vectorize v2 index in the Cloudflare dashboard. I only see the v1 vector index. Listing it with the wrangler CLI works.
Hey Cloudflare team and community, I have a large dataset of 500M vectors, each with 256 dimensions. I've recently seen the changelog about Vectorize v2 being in public beta, which mentions support for up to 5 million vector dimensions per index, but there is no vector dimension limit mentioned on the limits page. I'd like some clarification and advice on how to best use this with my dataset.
Limits: https://developers.cloudflare.com/vectorize/platform/limits/
Changelog: https://developers.cloudflare.com/vectorize/platform/changelog/
Given my vector dimensions (256) and the new limit of 5 million vector dimensions per index, my understanding is that I could potentially store up to 19,531 vectors per index (5,000,000 / 256 = 19,531.25). Is this correct?
If so, I would need approximately 25,600 indexes to store all 500M vectors (500,000,000 / 19,531 ≈ 25,600). However, this seems impractical given the current limit of 100 indexes per account.
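The arithmetic above can be checked with a short sketch. It assumes the "5 million vector dimensions per index" reading (total dimensions across all vectors in one index), which is exactly the interpretation being questioned here; the constant names are made up for illustration.

```python
import math

TOTAL_VECTORS = 500_000_000       # size of the dataset
DIMENSIONS = 256                  # dimensions per vector
DIMENSION_BUDGET = 5_000_000      # assumed reading: total dimensions per index
MAX_INDEXES_PER_ACCOUNT = 100     # account limit quoted in the post

# Whole vectors that fit in one index under this reading.
vectors_per_index = DIMENSION_BUDGET // DIMENSIONS

# Indexes needed, rounded up because a partial index is still an index.
indexes_needed = math.ceil(TOTAL_VECTORS / vectors_per_index)

fits_in_account = indexes_needed <= MAX_INDEXES_PER_ACCOUNT
```

Rounding up gives 25,601 indexes rather than 25,600, which does not change the conclusion: under this reading the dataset is far beyond the 100-index account limit.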
My questions are:
1. Is my understanding of the "5 million vector dimensions" limit correct, or does it mean something different? I want to insert 5M vectors into each index, based on the Limits page statement: Maximum vectors per index = 5M.
2. If my understanding is correct, what would be the best approach to handle such a large dataset with Vectorize? Given the current beta limit of 5 million vectors per index (from the limits page, not the changelog), I am proposing to distribute my data across 100 indexes, each containing 5 million vectors. My insertion strategy uses a modulo operation to determine which index a vector should be inserted into. For querying, I plan to search all 100 indexes in parallel and then aggregate and rank the results.
3. Are there plans to increase the number of indexes allowed per account? If so, what is the maximum?
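The strategy in question 2 (modulo routing on insert, fan-out and merge on query) could be sketched roughly as follows. This is not Vectorize API code: `query_shard` is a hypothetical callable standing in for whatever actually queries one index, matches are assumed to be `(score, id)` tuples where a higher score means more similar, and `shard_for` and `query_all` are illustrative names.

```python
import heapq
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_INDEXES = 100  # 500M vectors / 5M vectors per index

def shard_for(vector_id, num_indexes=NUM_INDEXES):
    """Pick an index number for a vector ID using a stable hash and modulo.

    crc32 is used because Python's built-in hash() for strings changes
    between processes, which would route the same ID to different shards.
    """
    return zlib.crc32(vector_id.encode("utf-8")) % num_indexes

def query_all(query_shard, query_vector, top_k, num_indexes=NUM_INDEXES):
    """Query every shard in parallel and keep the global top_k matches.

    query_shard(shard_no, query_vector, top_k) is a hypothetical function
    that returns a list of (score, id) tuples for a single index.
    """
    with ThreadPoolExecutor(max_workers=32) as pool:
        per_shard = pool.map(
            lambda shard_no: query_shard(shard_no, query_vector, top_k),
            range(num_indexes),
        )
        candidates = [match for matches in per_shard for match in matches]
    # Each shard returns its own top_k, so the global top_k is among them.
    return heapq.nlargest(top_k, candidates, key=lambda match: match[0])
```

Every query then costs 100 index lookups, so latency is set by the slowest shard; that trade-off is worth weighing before committing to this layout.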
Any advice or insights would be greatly appreciated. Thank you!
I'm getting a similar error to this when trying to create metadata indexes
this is the command I ran
[ERROR] A request to the Cloudflare API Failed
Expected request with
Content-Type: application/json
[code: 40026]
If you think this is a bug, please open an issue at:
https://github.com/cloudflare/workers-sdk/issues/new/choose
This is the output I'm getting
additionally the docs have this command with the name property and type property being wrapped in quotes on one page and not being wrapped in quotes on another
Both commands return the same error (I'm guessing it's expecting a POST request and getting a GET request)
Happens on both wrangler versions: 3.72.0 and 3.72.2 (latest)
Hey @Joey. Thank you for reporting this! We are aware of this issue (https://github.com/cloudflare/workers-sdk/issues/6516) and we have pushed a fix. It should be available with the next release of Wrangler (expected early next week).
If you'd like to unblock yourself today, you can use a Wrangler pre-released version associated with this fix: https://github.com/cloudflare/workers-sdk/pull/6548#issuecomment-2302185968.
GitHub
🐛 BUG: Error when wrangler vectorize create-metadata-index on Vecto...
Which Cloudflare product(s) does this pertain to? Wrangler What version(s) of the tool(s) are you using? Wrangler 3.72.0 What version of Node are you using? 22.3.0 What operating system and version...
GitHub
fix: Fix Vectorize JSON payloads by garvit-gupta · Pull Request #65...
What this PR solves / how to test
Add content-type header to Vectorize POST operations
Fix Vectorize getVcetors, deleteVectors payload in Wrangler Client
Fixes #6516/VS-269 and VS-271.
Author has a...
thank you very much, I must have missed the issue as I was looking through open issues before asking here
I'll use the pre-release version and fix it today thanks for the help
hi cloudflare team, is vectorize still free for users with a workers paid plan?
the pricing page (https://developers.cloudflare.com/vectorize/platform/pricing/) says "Vectorize is currently in public beta and is free to use on Workers Paid plans." but also it says "We intend to enable billing for Vectorize usage in January 2024."
thanks 🙂
As I move from a proof of concept to a real application product, is there a way to migrate the data or rename an existing vector index? It would be very handy to be able to do this in the wrangler CLI.
It's still free to use, but we intend to enable billing next month.