Pgjs perf investigation

Hi @AJR, I saw in the Hyperdrive Changelog https://developers.cloudflare.com/hyperdrive/platform/changelog/#2024-05-22 that there were some driver performance improvements for postgres.js. However, I’m testing a hyperdrive connection using a worker and am finding that with caching enabled and { "prepare": true } I get 100% cache misses on a simple SELECT like const [user]: [User?] = await sql`SELECT * FROM users WHERE id = ${userId};. If I set { “prepare”: false } then latency improves, I get some cache hits but the majority are marked as uncacheable on the same repeated query. Any help is greatly appreciated, (happy to DM my connection id) - thanks.
Cloudflare Docs
Changelog · Hyperdrive
You can now create up to 25 Hyperdrive configurations per account, up from the previous maximum of 10.
23 Replies
AJR
AJR5mo ago
Setting "prepare: false" is going to have perf issues, for sure. We don't recommend that setting at this point, as the way it's implemented makes caching those queries very unreliable. That first bit, though, isn't how it should work and is surprising to me.
Couple questions just to clarify. 1. User id is the same on repeated queries? Obviously for different user Id passed in you would get cache misses. 2. What is your config ID? 3. What version of postgres.js are you using? 4. Would you mind sending an actual snippet of your code for connecting and for sending those queries? Ah. This actually jogs something I was just discussing this week too. @Ping for toast are you using "prepare: false" in your connection when you're testing too?
AJR
AJR5mo ago
I did just go and double check, and with prepare=true on the postgres-js connection string, parametrized/extended queries still cache on Hyperdrive. Here's the example javascript I just tossed together for reference.
elliott
elliottOP5mo ago
Hey apologies for slow reply. 1. Yes same user id used on all queries: 123e4567-e89b-12d3-a456-426614174000 2. Config ID: cab7a8666f1d41fa8ddd730463da0f48 3. Version of postgres.js is 3.4.4 4. Snippet below:
app.get('/users/:userId', async (c) => {
const userId = c.req.param('userId');
const sql = postgres(c.env.HYPERDRIVE.connectionString, { prepare: true });
const [user]: [User?] = await sql`SELECT * FROM platform.users WHERE id = ${userId}`;
if (!user) {
throw new Error('Handle error');
}
return c.json(user);
});
app.get('/users/:userId', async (c) => {
const userId = c.req.param('userId');
const sql = postgres(c.env.HYPERDRIVE.connectionString, { prepare: true });
const [user]: [User?] = await sql`SELECT * FROM platform.users WHERE id = ${userId}`;
if (!user) {
throw new Error('Handle error');
}
return c.json(user);
});
Database is an RDS instance in region: eu-west-1 With these settings I get 100% cache miss (and a greater latency than when prepare is set to false). I should add I'm getting the cache miss info from the Cloudflare Hyperdrive dashboard for this connection
AJR
AJR5mo ago
The dashboard feeds directly from the code that handles cache hits/misses, so if it says you're getting misses I'm pretty confident you are. I'll take a look through this. Thank you.
brain
brain5mo ago
Having pretty much the same problem on my end. database is a neon DB, using the pooler connection string for hyperdrive. using drizzle & postgres.js --- for the same (selecting a user by id) & prepare = true, I get uncacheable/misses & latency is relatively high when using prepare = false, I get some hits (not a high % though) and latency is lower @AJR
AJR
AJR5mo ago
Ok, noted, thank you. Looking at our global stats, we're still seeing something like 60% cache hits (which is the highest it's ever been), so there's definitely some specific pattern in this access causing issues (not that it should, obviously, this should work). We'll take a look. @brain same deal. Versions and config ID, if you wouldn't mind? When you say the pooler connection string, you mean you're using both Hyperdrive and Neon's built-in pooler?
brain
brain5mo ago
yes, using neon's built-in pooler & hyperdrive. sorry for the late reply. sent config ID by PM should I not be using neon's built-in pooler, might be causing issues?
AJR
AJR5mo ago
Generally multiple poolers on top of each other offers very little benefit, and can add some complexiy and be a source of issues. I don't know yet if that's the problem here, but in general I'd recommend not using their built-in unless you have some particular reason to know you need it.
brain
brain5mo ago
sure thing. will try without the pooler & let you know in a minute seems like by doing prepare: true I get an average of ~450ms per query hit (I'm pretty far from the DB which is un US & I'm in Argentina) --- by doing prepare: false I get an average of ~300ms per query hit, although, still seems that most queries are "uncacheable" or "misses". I'll send the other config ID by PM (the one without the built-in pooler) here's the code, for reference:
base.get("/user/:id", async (c) => {
const database = db(c.env.HYPERDRIVE);
const userId = c.req.param("id");
const startTime = new Date();
const user = await database
.select({
id: users.id,
email: users.email,
createdAt: users.createdAt,
})
.from(users)
.where(eq(users.id, userId))
.execute();
const executeTime = new Date();
console.log("Execute time", executeTime.getTime() - startTime.getTime());

if (!user) {
return c.json({ error: "User not found" }, 404);
}

return c.json(user);
});
base.get("/user/:id", async (c) => {
const database = db(c.env.HYPERDRIVE);
const userId = c.req.param("id");
const startTime = new Date();
const user = await database
.select({
id: users.id,
email: users.email,
createdAt: users.createdAt,
})
.from(users)
.where(eq(users.id, userId))
.execute();
const executeTime = new Date();
console.log("Execute time", executeTime.getTime() - startTime.getTime());

if (!user) {
return c.json({ error: "User not found" }, 404);
}

return c.json(user);
});
---
import postgres from "postgres";
import { drizzle } from "drizzle-orm/postgres-js";

export const db = (hd: Hyperdrive) => {
const sql = postgres(hd.connectionString, {
prepare: true, // or false
});
return drizzle(sql);
};
import postgres from "postgres";
import { drizzle } from "drizzle-orm/postgres-js";

export const db = (hd: Hyperdrive) => {
const sql = postgres(hd.connectionString, {
prepare: true, // or false
});
return drizzle(sql);
};
AJR
AJR5mo ago
That is very helpful, thank you!
brain
brain5mo ago
please keep me posted! still seeing the same
brain
brain5mo ago
it's looking like this with prepare: true
No description
brain
brain5mo ago
here's how it looks with prepare: false
No description
brain
brain5mo ago
note the difference:
No description
AJR
AJR5mo ago
Ah, to be clear, this is not something I'm dropping current work to solve. It'll be a bit.
brain
brain5mo ago
yes, of course! good luck with work
Ping for toast
Ping for toast5mo ago
sorry!!! I'm on vacation and IO missed this. Its unchanged
AJR
AJR5mo ago
No worries. You mean to say you're using default settings for your connection?
Ping for toast
Ping for toast5mo ago
Yup. Literally just
dialect: new PostgresJSDialect({
postgres: postgres(connectionString),
}),
dialect: new PostgresJSDialect({
postgres: postgres(connectionString),
}),
AJR
AJR5mo ago
Got it. Thanks!
Ping for toast
Ping for toast5mo ago
so is it reccomended to turn it off, based on this?
AJR
AJR5mo ago
Prepare: true is the default, and is what we recommend. You're good. I was hoping your issues were caused by that, but no such luck.
Ping for toast
Ping for toast5mo ago
sounds good. I will be trying ec2 in a couple days
Want results from more Discord servers?
Add your server