Bump for visibility, also getting this. Hopefully it’s being looked into.
garvitg (2mo ago)
Hi @cloudkite. We are not observing any issues related to auth in Vectorize. Could you please share some details about the way you are sending requests to Vectorize? I am looking for your mode of connecting to Vectorize (HTTP, Worker Bindings or Wrangler), any details on the auth headers that you are sending, as well as the permissions that you have included (if you are using an API token). Feel free to DM me those details if you would find that preferable.
cloudkite (OP, 2mo ago)
I am using
wrangler dev --experimental-vectorize-bind-to-prod
About 80% of requests fail, but if I retry enough times a few will succeed. The other people who have observed this mentioned using it locally, so I assume they are also using the experimental flag. One thing I have observed is that it will work for a length of time, then fail consistently for a certain length of time. I don't know how Wrangler authenticates when using that flag, but maybe it's renewing its auth/refresh token at certain intervals? In terms of auth, I am using
pnpm exec wrangler login
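Since retried requests reportedly succeed some of the time, a stopgap for local dev could be a small retry wrapper around Vectorize calls. The following is only a sketch: `withRetry` and `RetryOptions` are hypothetical names, not part of the Vectorize or Wrangler APIs, and the wrapper retries on any error rather than only on authentication failures.

```typescript
// Hypothetical helper (not part of the Vectorize or Wrangler APIs):
// retries an async operation with exponential backoff.
interface RetryOptions {
  attempts: number; // total number of tries, including the first one
  baseDelayMs: number; // delay before the second try; doubles after each failure
}

async function withRetry<T>(
  operation: () => Promise<T>,
  { attempts, baseDelayMs }: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait before the next try, but not after the final failure.
      if (attempt < attempts - 1) {
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

A call such as `withRetry(() => env.VECTOR_INDEX.query(vector, { topK: 3 }), { attempts: 5, baseDelayMs: 200 })` would then wait 200 ms, 400 ms, 800 ms and 1600 ms between tries. This only masks the symptom; it does not explain why authentication fails.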
Astro (2mo ago)
I also experience this bug intermittently. It's a hard one to debug. I'm using Vectorize via bindings in a Pages/Workers app, so auth shouldn't be a problem, although I am logging in via the Wrangler CLI. It seems to happen more when I navigate between pages that query Vectorize frequently. Possibly a different edge case is causing it, but nothing I've been able to debug myself.
garvitg (2mo ago)
Hey @cloudkite. We are working on trying to repro the issue internally. Do you have an estimate for the time interval after which you start observing the error? Any details that you can share would be useful (including the index name, stack traces and code snippets). Please feel free to DM me with those details if you'd prefer that. Also, you will always need to pass the --experimental-vectorize-bind-to-prod flag to work with Vectorize indexes when using wrangler dev, so that step seems correct.
Hey @Astro. Are you observing this error in dev mode too?
Astro (2mo ago)
Yes, I am. It's more common in local dev because that's where I spend most of my time. I'm unsure just how noticeable it is in production, but I've definitely seen it a few times.
cloudkite (OP, 2mo ago)
Here's the stack trace:
[ERROR] Error: VECTOR_QUERY_ERROR (code = 10000): Authentication error

at VectorizeIndexImpl._send (cloudflare-internal:vectorize-api:164:23)
at async VectorizeIndexImpl.queryImplV2 (cloudflare-internal:vectorize-api:196:21)
... 7 lines matching cause stack trace ...
at async runWorkflow (file:///Users/jonas/Projects/askd-ai/packages/embed-worker/src/server/workflow.ts:516:19) {
[cause]: Error: Authentication error
at VectorizeIndexImpl._send (cloudflare-internal:vectorize-api:167:28)
at async VectorizeIndexImpl.queryImplV2 (cloudflare-internal:vectorize-api:196:21)
at async VectorizeIndexImpl.query (cloudflare-internal:vectorize-api:42:20)
Binding:
vectorize = [{ binding = "VECTOR_INDEX", index_name = "sources-index-dev" }]
Code snippet:
return env.VECTOR_INDEX.query(await model.createEmbedding(query), {
  // @ts-expect-error
  filter: { parentId: { $in: parentIds } },
  topK: 3,
  returnMetadata: "all",
  returnValues: true,
});
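To tell these authentication failures apart from other Vectorize errors (for logging, or to decide whether a retry is worthwhile), the message format in the stack trace above can be parsed. This is a sketch only: `parseVectorizeError` and `isAuthError` are hypothetical helpers, the regular expression is inferred from the messages quoted in this thread, and it assumes code 10000 always denotes the authentication failure seen here.

```typescript
// Hypothetical helpers; the pattern mirrors messages seen in this thread, e.g.
// "VECTOR_QUERY_ERROR (code = 10000): Authentication error".
interface VectorizeErrorInfo {
  kind: string; // e.g. "VECTOR_QUERY_ERROR"
  code: number; // e.g. 10000
  detail: string; // e.g. "Authentication error"
}

function parseVectorizeError(error: unknown): VectorizeErrorInfo | null {
  const message = error instanceof Error ? error.message : String(error);
  const match = /^(VECTOR_[A-Z_]+) \(code = (\d+)\): (.*)$/.exec(message);
  if (match === null) return null;
  return { kind: match[1], code: Number(match[2]), detail: match[3] };
}

// Assumption: code 10000 corresponds to "Authentication error", as in the
// traces posted here.
function isAuthError(error: unknown): boolean {
  return parseVectorizeError(error)?.code === 10000;
}
```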
Astro (2mo ago)
Mine is very similar; instead of querying for a search, this is for recommendations. I see the same error message as cloudkite.
// Grab the appropriate vectorize binding (one for local dev, one for production)
const vectorize =
  context.cloudflare.env.environment == "local"
    ? context.cloudflare.env.localvectorize
    : context.cloudflare.env.vectorize;

const videoVectors = await vectorize.getByIds([videoId]);
const videoVector = videoVectors?.[0];

const results = await vectorize.query(videoVector.values, {
  topK: 50,
});
Wrangler.toml:
[[vectorize]]
binding = "localvectorize"
index_name = "neuraly-videos-local"

[[vectorize]]
binding = "vectorize"
index_name = "neuraly-videos"
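Separately from the authentication error, the recommendation snippet above reads `videoVector.values` even when `getByIds` returns nothing, which would throw a `TypeError` that could be mistaken for the Vectorize failure. A guard might look like the sketch below; `StoredVector` is a simplified local type and `firstVectorValues` a hypothetical helper, neither taken from the Vectorize type definitions.

```typescript
// Simplified local type for illustration only; the real binding types come
// from @cloudflare/workers-types and carry more fields.
interface StoredVector {
  id: string;
  values?: number[] | Float32Array | Float64Array;
}

// Hypothetical helper: returns the first vector's values as a plain array,
// or null when the lookup came back empty or the entry has no values.
function firstVectorValues(
  vectors: StoredVector[] | null | undefined,
): number[] | null {
  const values = vectors?.[0]?.values;
  if (values === undefined || values.length === 0) return null;
  return Array.from(values);
}
```

The caller can then skip the `query` call (and return no recommendations) when the helper yields `null`, instead of failing on a vector that is not available yet.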
garvitg (2mo ago)
And do you observe the error in the local dev mode? After a delay?
Astro (4w ago)
I see it in dev mode, yes. I notice it mostly when Vite hot-refreshes after I make a change (it's a Remix app, so this causes my loader function to refresh data). I have a gut feeling it only happens when I first put a vector into the database: when I create a vector for a video and then go to that video's page, which should query Vectorize for similar videos (for recommendations), is when this happens. I'm testing my whole embedding flow right now, so I'll give this a bit more testing to hopefully get to the bottom of it.
Update on this behavior: a couple of days ago I started seeing an "address already in use" error, caused by Wrangler, that would time out my Remix app. Seeing this, I logged out via wrangler logout, then realized I couldn't actually log back in because I code via SSH, and clicking the OAuth URL to sign in redirects to localhost, which is NOT where the CLI was initiated from. I was able to log in via a "CLOUDFLARE_API_TOKEN" env variable on the remote system (not ideal, but it technically worked), but now I'm consistently getting the 10000 errors from Vectorize for "Authentication error". Here's the code that's consistently causing this error:
export const action: ActionFunction = async ({ request, context }) => {
  console.log("Action function started");

  const db = drizzle(context.cloudflare.env.database);
  const vectorize =
    context.cloudflare.env.environment == "local"
      ? context.cloudflare.env.localvectorize
      : context.cloudflare.env.vectorize;

  console.log(vectorize)

  console.log("Initialized database and vectorize");

  const { user, newToken } = await verifyUser(context, request);

  console.log("Verified user");

  const formData = await request.formData();
  const videoId = formData.get("videoId");

  console.log("Got form data");
  if (!videoId || !title || !thumbnail || !duration) {
    console.warn(`Missing required field(s)`);
    return AuthorizedResponse(
      { success: false, message: "Missing required field(s)" },
      newToken
    );
  }
  console.log("No early returns needed");

  // Embedding stuff

  const WEIGHTS = {
    thumbnails: 0.25,
    text: {
      title: 0.5,
      description: 0.25,
    },
  };

  const [titleEmb, descEmb, thumbnailEmb] = await Promise.all([...]);

  console.log("Got new embeddings");

  const combinedEmb = combineVideoEmbeddings(...);
  console.log("combined embeddings");

  await vectorize.insert([{ id: `${videoId}`, values: combinedEmb }]);

  console.log("inserted vector");
  await handleTopicVectorOperations(
    vectorize,
    db,
    null,
    topic as string | null,
    combinedEmb
  );
  console.log("handled vector operations");

  // Database stuff
  await db.insert(videos).values({...});

  console.log("inserted into db");
  // Video processor stuff
  await sendToSQS(context, `${videoId}`);

  console.log("sent to queue");
  await startEC2Instance(context);

  console.log("started ec2 instance");

  return AuthorizedResponse(
    { success: true, message: "Uploaded video" },
    newToken
  );
};
And here are the console logs when executing this code (showing where in the process it errors out)
Action function started
ProxyStub { name: 'VectorizeIndexImpl', poisoned: false }
Initialized database and vectorize
Verified user
Got form data
No early returns needed
Got new embeddings
combined embeddings
Error: VECTOR_INSERT_ERROR (code = 10000): Authentication error
at processTicksAndRejections (node:internal/process/task_queues:95:5)
at action (/home/astro/Development/deployed/neuraly/app/routes/videos.upload.tsx:145:3)
at Object.callRouteAction (/home/astro/Development/deployed/neuraly/node_modules/@remix-run/server-runtime/dist/data.js:36:16)
at /home/astro/Development/deployed/neuraly/node_modules/@remix-run/router/router.ts:4929:19
at callLoaderOrAction (/home/astro/Development/deployed/neuraly/node_modules/@remix-run/router/router.ts:4993:16)
at async Promise.all (index 2)
at /home/astro/Development/deployed/neuraly/node_modules/@remix-run/server-runtime/dist/single-fetch.js:44:19
at callDataStrategyImpl (/home/astro/Development/deployed/neuraly/node_modules/@remix-run/router/router.ts:4865:17)
at callDataStrategy (/home/astro/Development/deployed/neuraly/node_modules/@remix-run/router/router.ts:4022:19)
at submit (/home/astro/Development/deployed/neuraly/node_modules/@remix-run/router/router.ts:3785:21) {
[cause]: Error: Authentication error
at VectorizeIndexImpl._send (cloudflare-internal:vectorize-api:167:28)
at async VectorizeIndexImpl.insert (cloudflare-internal:vectorize-api:91:21)
at async #fetch (file:///home/astro/Development/deployed/neuraly/node_modules/miniflare/dist/src/workers/core/entry.worker.js:828:18)
at async ProxyServer.fetch (file:///home/astro/Development/deployed/neuraly/node_modules/miniflare/dist/src/workers/core/entry.worker.js:734:14) {
[cause]: undefined
}
}
The line erroring here is
await vectorize.insert([{ id: `${videoId}`, values: combinedEmb }]);
Very weird behavior, but I'm now experiencing the 10000 error everywhere in my app in local dev mode, not just in that code. Anywhere I use Vectorize, I see this error. Important to note: I'm still able to access other bindings, such as D1, without issue.
garvitg (4w ago)
Hey @Astro. Thank you for providing these details. We will pass them on to the team that looks after Wrangler auth. I do want to inform you that Vectorize does not operate in a "true" local mode, since each request will hit your "remote" Vectorize index and go through the Vectorize service. That might explain why you're able to work with other bindings and not Vectorize. Also wanted to check if your API token includes the "Vectorize Edit" Permissions since that is needed to insert vectors in a Vectorize index.
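One way to rule out a dead or mistyped token before looking at permissions is Cloudflare's token verification endpoint (`/user/tokens/verify`). The sketch below assumes that endpoint and its usual `{ success, result: { status } }` response shape; `verifyApiToken`, `isTokenActive` and `FetchLike` are hypothetical names introduced for illustration. As far as I know, this check only reports whether the token is active. It does not list permissions, so the "Vectorize Edit" permission still has to be confirmed on the token itself in the dashboard.

```typescript
// Assumption: Cloudflare's API token verification endpoint.
const VERIFY_URL = "https://api.cloudflare.com/client/v4/user/tokens/verify";

// Narrow local type so the sketch does not depend on DOM or Node typings;
// the global fetch can be passed in as-is.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ json(): Promise<unknown> }>;

// True when the response body reports success and a token status of "active".
function isTokenActive(body: unknown): boolean {
  if (typeof body !== "object" || body === null) return false;
  const { success, result } = body as { success?: unknown; result?: unknown };
  if (success !== true || typeof result !== "object" || result === null) {
    return false;
  }
  return (result as { status?: unknown }).status === "active";
}

// Hypothetical helper: asks the API whether the token is valid and active.
async function verifyApiToken(
  token: string,
  fetchImpl: FetchLike,
): Promise<boolean> {
  const response = await fetchImpl(VERIFY_URL, {
    headers: { Authorization: `Bearer ${token}` },
  });
  return isTokenActive(await response.json());
}
```

On the remote machine this could be run as `await verifyApiToken(process.env.CLOUDFLARE_API_TOKEN ?? "", fetch)`, using the same environment variable that Wrangler reads.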