Smart or Idiotic? The Great BigInt-to-Hash Adventure!
Alright, fellow code-wranglers, here's a spicy tech taco for you: we're dancing with regular bigint IDs for all our database entities, thanks to our waltz with planetscale row-reads pricing.
Now, I don't know about you, but I’d rather our users not play detective and guess how many entities are populating our database by spotting the latest ID. Feels a bit like showing your poker hand, right?
Enter the possible hero of our story: a nifty method that takes these IDs and morphs them into encrypted random strings using a sprinkle of secret salt. In this magical world, our database is blissfully unaware, munching on regular int-type IDs, while our users get served fancy hashed strings that look like UUIDs. Fancy, huh?
But... plot twist! What if our secret salt takes a vacation and ends up in the wrong hands? We'd be stuck with a bunch of links that might as well lead to Narnia. Picture a "https://tinyurl.com/WhereDidTheMagicGo" that's lost its sparkle.
Given that I’ve been known to occasionally doubt the genius-level of my brainstorm sessions (hey, humility, right?), what say you? Smart or Idiotic? Bring on the verdict!
35 Replies
Nice read!
I started using ULIDs as DB IDs but what do I know
Mostly because the use the fancy words like Lexicographically Sortable
Thanks for chiming in! Ah, ULIDs - the alluring blend of uniqueness and lexicographical sortability. I've come across them before, and they indeed have that certain appeal, especially with those "fancy words" in their selling points. But that's an interesting take. Have you faced any specific challenges using them as DB IDs?
I'm curious to hear about your experience!
Unusually (he said) i've been a tad lazy reading the docs recently, so pardon the direct ask: do you know if there's decent support for ULIDs with Drizzle and Planetscale?
Well it's great not to care about transforming IDs before sending them to the client apps and still be able to use cursor based pagination
Definitely. I expect my "solution" to be the absolute last resort. I mainly came here to hear other options
interesting that planetscale chose to use nanoId in combination with an internal bigint id:
https://planetscale.com/blog/why-we-chose-nanoids-for-planetscales-api
Why we chose NanoIDs for PlanetScale’s API — PlanetScale
Learn why PlanetScale used NanoID to generate obscure and URL friendly identifiers.
I don't know about planetscale, but since we use MySQL, I've found annoying not having a
.returning()
method. So I can't use the new $default()
on the column definitionpostgres is magic
also planetscale:
so where are you generating the ulids?
We didn't mind that pitfalls because I just maintain a small app used internally at our company
Our users are just the employees
oh i see
also, performance aside, ulids have the BEST pronunciation U-Lids
There's a js library of course
@Angelelz I've been follwing your direction and I made a ulid custom column to use in my tables:
For reads and inserts this works prefect, but it won't work on update and deletes as the filter values get parametrized and the db gets the full string instead of the buffer - resulting in no match.
Running this:
Results in the following print:
Do you have any suggestion on how to solve this?
NM, solved by letting the DB handle the binary extraction from the hex string
If you're here looking for a solution, see this gist
Gist
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM - drizzle-ulid.md
OMG, this is gold!
We did not take the time to even attempt to store it as binary
Does it behave properly when comparing like this?
Sorry I got you to that tangent in your project? What do you think of it so far?
Yeah it will work. The toDriver will transform it into hex and then binary
You pointed me in the right path 🙂
I did some (limited) testing on a 1.3M rows table, and couldn't notice any performence difference
Well, the fact that is binary hints that there should be no performance issues
Right?
But they talk about storage
I can only see 3 possible reasons not to use it:
1. You don't want to leak creation time - ulid will store that (as opposed to uuid or nanoid)
2. You need something shorter for nicer urls
3. 16 bytes is still more then 8 bytes for bigint
This should be really easy to tests in a sqlite db
lmk if you get the chance! would love to see the results!
I don't have the time right now but I wanna look into it in the future
Oh, one drawback to be ware of, is that updates and inserts on drizzle studio will not work. As well as filtering or relation views
I think this solution is the best of any world
Well, based on your comments there should be an easy fix
And don't use the studio anyway lol
same, I literally only used it for testing out the custom type
Are you planning to keep that gist up?
I think we should link it somewhere
In the docs or something
Yeah, it will remain up
I agree. I posted it here and gave the gist a good title so people have better chance of finding it when looking ulid and drizzle, posting it in the docs will be helpful for someone for sure
@Andrew Sherman I use ULIDs in my project but this implementation is a lot more efficient in terms of storage and performance. Do you think we can include a link to the gist maybe in the examples for customTypes or the usage of the
$default()
method?@Angelelz just a heads up in case you’re considering using my type - I’ll advise against it.
It appears that drizzle have LOTS of issues with custom types. The toDriver and fromDriver are not reliable, and will not fire on many occasions (prepared statements, nested query builder relationships and more…) In order to get binary ULID working this will have to be resolved. There are many issues open on GitHub related to this.
For myself, I gave up on it for now, and will store my ids as in a varchar type.
Hey Mendy, thank you for reaching out. I'm invested in this now.
I've been pushing PRs to drizzle and I'm familiar with the codebase. Do you mind if I ask in particular what issues you've found that I could take a look into?
I sent the message before leaving my house, when back, I’ll setup a box with the issues I’ve seen. If you have a small schema I can use and you can link it here it will be helpful.
If you are following through with this, I also made some changes to the type to help mitigate some issues, I’ll update the gist when I’m back.
Sounds good! Thanks
I haven't forgotten, got caught up with urgent matters, will get to doing that soon.
It's ok. But I believe there were some fixes in the latest release of drizzle-kit
How are you doing the bit where you are storing ULID as a binary in the database and then converting it to UUID to send to the users? I think this is the path I want to go down. It makes sense to store ULID as binary for mostly sortable IDs which means better perorming indexes, however I dont want to expose the created time of the ULID to the client.
Please give me the usage for MySQL Planetscale in my schema https://gist.github.com/MendyLanda/d5dc176707fd23cf6625c170038a8835
Gist
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM
Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM - drizzle-ulid.md