Smart or Idiotic? The Great BigInt-to-Hash Adventure!

Alright, fellow code-wranglers, here's a spicy tech taco for you: we're dancing with regular bigint IDs for all our database entities, thanks to our waltz with planetscale row-reads pricing. Now, I don't know about you, but I’d rather our users not play detective and guess how many entities are populating our database by spotting the latest ID. Feels a bit like showing your poker hand, right? Enter the possible hero of our story: a nifty method that takes these IDs and morphs them into encrypted random strings using a sprinkle of secret salt. In this magical world, our database is blissfully unaware, munching on regular int-type IDs, while our users get served fancy hashed strings that look like UUIDs. Fancy, huh? But... plot twist! What if our secret salt takes a vacation and ends up in the wrong hands? We'd be stuck with a bunch of links that might as well lead to Narnia. Picture a "https://tinyurl.com/WhereDidTheMagicGo" that's lost its sparkle. Given that I’ve been known to occasionally doubt the genius-level of my brainstorm sessions (hey, humility, right?), what say you? Smart or Idiotic? Bring on the verdict!
35 Replies
Angelelz
Angelelz14mo ago
Nice read! I started using ULIDs as DB IDs but what do I know Mostly because the use the fancy words like Lexicographically Sortable
Mendy
MendyOP14mo ago
Thanks for chiming in! Ah, ULIDs - the alluring blend of uniqueness and lexicographical sortability. I've come across them before, and they indeed have that certain appeal, especially with those "fancy words" in their selling points. But that's an interesting take. Have you faced any specific challenges using them as DB IDs? I'm curious to hear about your experience! Unusually (he said) i've been a tad lazy reading the docs recently, so pardon the direct ask: do you know if there's decent support for ULIDs with Drizzle and Planetscale?
Angelelz
Angelelz14mo ago
Well it's great not to care about transforming IDs before sending them to the client apps and still be able to use cursor based pagination
Mendy
MendyOP14mo ago
Definitely. I expect my "solution" to be the absolute last resort. I mainly came here to hear other options
Mendy
MendyOP14mo ago
interesting that planetscale chose to use nanoId in combination with an internal bigint id: https://planetscale.com/blog/why-we-chose-nanoids-for-planetscales-api
Why we chose NanoIDs for PlanetScale’s API — PlanetScale
Learn why PlanetScale used NanoID to generate obscure and URL friendly identifiers.
No description
Angelelz
Angelelz14mo ago
I don't know about planetscale, but since we use MySQL, I've found annoying not having a .returning() method. So I can't use the new $default() on the column definition
Mendy
MendyOP14mo ago
postgres is magic
Mendy
MendyOP14mo ago
also planetscale:
No description
Mendy
MendyOP14mo ago
so where are you generating the ulids?
Angelelz
Angelelz14mo ago
We didn't mind that pitfalls because I just maintain a small app used internally at our company Our users are just the employees
Mendy
MendyOP14mo ago
oh i see also, performance aside, ulids have the BEST pronunciation U-Lids
Angelelz
Angelelz14mo ago
There's a js library of course
Mendy
MendyOP14mo ago
@Angelelz I've been follwing your direction and I made a ulid custom column to use in my tables:
export const ulid = customType<{
data: string;
notNull: true;
default: false;
driverData: Buffer;
}>({
dataType() {
return "binary(16)";
},
toDriver(value) {
return Buffer.from(ULID.fromCanonical(value).toRaw(), "hex");
},
fromDriver(value) {
return ULID.fromRawTrusted(value.toString("hex")).toCanonical();
},
});
export const ulid = customType<{
data: string;
notNull: true;
default: false;
driverData: Buffer;
}>({
dataType() {
return "binary(16)";
},
toDriver(value) {
return Buffer.from(ULID.fromCanonical(value).toRaw(), "hex");
},
fromDriver(value) {
return ULID.fromRawTrusted(value.toString("hex")).toCanonical();
},
});
For reads and inserts this works prefect, but it won't work on update and deletes as the filter values get parametrized and the db gets the full string instead of the buffer - resulting in no match. Running this:
console.log(
db
.update(testUlid)
.set({ value: "Testing update" })
.where(eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W"))
.toSQL(),
);
console.log(
db
.update(testUlid)
.set({ value: "Testing update" })
.where(eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W"))
.toSQL(),
);
Results in the following print:
{
sql: 'update `test_ulid` set `value` = ? where `test_ulid`.`id` = ?',
params: [
'Testing update',
<Buffer 01 8b 9d 5d 0f dc 00 ba 8b a4 25 bb f0 4e 88 7c>
]
}
{
sql: 'update `test_ulid` set `value` = ? where `test_ulid`.`id` = ?',
params: [
'Testing update',
<Buffer 01 8b 9d 5d 0f dc 00 ba 8b a4 25 bb f0 4e 88 7c>
]
}
Do you have any suggestion on how to solve this? NM, solved by letting the DB handle the binary extraction from the hex string
Mendy
MendyOP14mo ago
If you're here looking for a solution, see this gist
Gist
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM - drizzle-ulid.md
Angelelz
Angelelz14mo ago
OMG, this is gold! We did not take the time to even attempt to store it as binary Does it behave properly when comparing like this?
eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W")
eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W")
Sorry I got you to that tangent in your project? What do you think of it so far?
Mendy
MendyOP14mo ago
Yeah it will work. The toDriver will transform it into hex and then binary You pointed me in the right path 🙂 I did some (limited) testing on a 1.3M rows table, and couldn't notice any performence difference
Angelelz
Angelelz14mo ago
Well, the fact that is binary hints that there should be no performance issues Right? But they talk about storage
Mendy
MendyOP14mo ago
I can only see 3 possible reasons not to use it: 1. You don't want to leak creation time - ulid will store that (as opposed to uuid or nanoid) 2. You need something shorter for nicer urls 3. 16 bytes is still more then 8 bytes for bigint
Angelelz
Angelelz14mo ago
This should be really easy to tests in a sqlite db
Mendy
MendyOP14mo ago
lmk if you get the chance! would love to see the results!
Angelelz
Angelelz14mo ago
I don't have the time right now but I wanna look into it in the future
Mendy
MendyOP14mo ago
Oh, one drawback to be ware of, is that updates and inserts on drizzle studio will not work. As well as filtering or relation views
Angelelz
Angelelz14mo ago
I think this solution is the best of any world Well, based on your comments there should be an easy fix And don't use the studio anyway lol
Mendy
MendyOP14mo ago
same, I literally only used it for testing out the custom type
Angelelz
Angelelz14mo ago
Are you planning to keep that gist up? I think we should link it somewhere In the docs or something
Mendy
MendyOP14mo ago
Yeah, it will remain up I agree. I posted it here and gave the gist a good title so people have better chance of finding it when looking ulid and drizzle, posting it in the docs will be helpful for someone for sure
Angelelz
Angelelz14mo ago
@Andrew Sherman I use ULIDs in my project but this implementation is a lot more efficient in terms of storage and performance. Do you think we can include a link to the gist maybe in the examples for customTypes or the usage of the $default() method?
Mendy
MendyOP14mo ago
@Angelelz just a heads up in case you’re considering using my type - I’ll advise against it. It appears that drizzle have LOTS of issues with custom types. The toDriver and fromDriver are not reliable, and will not fire on many occasions (prepared statements, nested query builder relationships and more…) In order to get binary ULID working this will have to be resolved. There are many issues open on GitHub related to this. For myself, I gave up on it for now, and will store my ids as in a varchar type.
Angelelz
Angelelz14mo ago
Hey Mendy, thank you for reaching out. I'm invested in this now. I've been pushing PRs to drizzle and I'm familiar with the codebase. Do you mind if I ask in particular what issues you've found that I could take a look into?
Mendy
MendyOP14mo ago
I sent the message before leaving my house, when back, I’ll setup a box with the issues I’ve seen. If you have a small schema I can use and you can link it here it will be helpful. If you are following through with this, I also made some changes to the type to help mitigate some issues, I’ll update the gist when I’m back.
Angelelz
Angelelz14mo ago
Sounds good! Thanks
Mendy
MendyOP14mo ago
I haven't forgotten, got caught up with urgent matters, will get to doing that soon.
Angelelz
Angelelz14mo ago
It's ok. But I believe there were some fixes in the latest release of drizzle-kit
Whit
Whit12mo ago
How are you doing the bit where you are storing ULID as a binary in the database and then converting it to UUID to send to the users? I think this is the path I want to go down. It makes sense to store ULID as binary for mostly sortable IDs which means better perorming indexes, however I dont want to expose the created time of the ULID to the client.
Kai Revona
Kai Revona12mo ago
Please give me the usage for MySQL Planetscale in my schema https://gist.github.com/MendyLanda/d5dc176707fd23cf6625c170038a8835
Gist
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM - drizzle-ulid.md
Want results from more Discord servers?
Add your server