Drizzle Team•2y ago

Smart or Idiotic? The Great BigInt-to-Hash Adventure!

Alright, fellow code-wranglers, here's a spicy tech taco for you: we're dancing with regular bigint IDs for all our database entities, thanks to our waltz with planetscale row-reads pricing. Now, I don't know about you, but I’d rather our users not play detective and guess how many entities are populating our database by spotting the latest ID. Feels a bit like showing your poker hand, right? Enter the possible hero of our story: a nifty method that takes these IDs and morphs them into encrypted random strings using a sprinkle of secret salt. In this magical world, our database is blissfully unaware, munching on regular int-type IDs, while our users get served fancy hashed strings that look like UUIDs. Fancy, huh? But... plot twist! What if our secret salt takes a vacation and ends up in the wrong hands? We'd be stuck with a bunch of links that might as well lead to Narnia. Picture a "https://tinyurl.com/WhereDidTheMagicGo" that's lost its sparkle. Given that I’ve been known to occasionally doubt the genius-level of my brainstorm sessions (hey, humility, right?), what say you? Smart or Idiotic? Bring on the verdict!

35 Replies

Angelelz•2y ago

Nice read! I started using ULIDs as DB IDs but what do I know Mostly because the use the fancy words like Lexicographically Sortable

MendyOP•2y ago

Thanks for chiming in! Ah, ULIDs - the alluring blend of uniqueness and lexicographical sortability. I've come across them before, and they indeed have that certain appeal, especially with those "fancy words" in their selling points. But that's an interesting take. Have you faced any specific challenges using them as DB IDs? I'm curious to hear about your experience! Unusually (he said) i've been a tad lazy reading the docs recently, so pardon the direct ask: do you know if there's decent support for ULIDs with Drizzle and Planetscale?

Angelelz•2y ago

Well it's great not to care about transforming IDs before sending them to the client apps and still be able to use cursor based pagination

MendyOP•2y ago

Definitely. I expect my "solution" to be the absolute last resort. I mainly came here to hear other options

MendyOP•2y ago

interesting that planetscale chose to use nanoId in combination with an internal bigint id: https://planetscale.com/blog/why-we-chose-nanoids-for-planetscales-api

Why we chose NanoIDs for PlanetScale’s API — PlanetScale

Learn why PlanetScale used NanoID to generate obscure and URL friendly identifiers.

Angelelz•2y ago

I don't know about planetscale, but since we use MySQL, I've found annoying not having a .returning() method. So I can't use the new $default() on the column definition

MendyOP•2y ago

postgres is magic

MendyOP•2y ago

also planetscale:

MendyOP•2y ago

so where are you generating the ulids?

Angelelz•2y ago

We didn't mind that pitfalls because I just maintain a small app used internally at our company Our users are just the employees

MendyOP•2y ago

oh i see also, performance aside, ulids have the BEST pronunciation U-Lids

Angelelz•2y ago

There's a js library of course

MendyOP•2y ago

@Angelelz I've been follwing your direction and I made a ulid custom column to use in my tables:

export const ulid = customType<{
  data: string;
  notNull: true;
  default: false;
  driverData: Buffer;
}>({
  dataType() {
    return "binary(16)";
  },
  toDriver(value) {
    return Buffer.from(ULID.fromCanonical(value).toRaw(), "hex");
  },
  fromDriver(value) {
    return ULID.fromRawTrusted(value.toString("hex")).toCanonical();
  },
});

export const ulid = customType<{
  data: string;
  notNull: true;
  default: false;
  driverData: Buffer;
}>({
  dataType() {
    return "binary(16)";
  },
  toDriver(value) {
    return Buffer.from(ULID.fromCanonical(value).toRaw(), "hex");
  },
  fromDriver(value) {
    return ULID.fromRawTrusted(value.toString("hex")).toCanonical();
  },
});

For reads and inserts this works prefect, but it won't work on update and deletes as the filter values get parametrized and the db gets the full string instead of the buffer - resulting in no match. Running this:

console.log(
  db
    .update(testUlid)
    .set({ value: "Testing update" })
    .where(eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W"))
    .toSQL(),
);

console.log(
  db
    .update(testUlid)
    .set({ value: "Testing update" })
    .where(eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W"))
    .toSQL(),
);

Results in the following print:

{
  sql: 'update `test_ulid` set `value` = ? where `test_ulid`.`id` = ?',
  params: [
    'Testing update',
    <Buffer 01 8b 9d 5d 0f dc 00 ba 8b a4 25 bb f0 4e 88 7c>
  ]
}

{
  sql: 'update `test_ulid` set `value` = ? where `test_ulid`.`id` = ?',
  params: [
    'Testing update',
    <Buffer 01 8b 9d 5d 0f dc 00 ba 8b a4 25 bb f0 4e 88 7c>
  ]
}

Do you have any suggestion on how to solve this? NM, solved by letting the DB handle the binary extraction from the hex string

MendyOP•2y ago

If you're here looking for a solution, see this gist

Gist

Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM

Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM - drizzle-ulid.md

Angelelz•2y ago

OMG, this is gold! We did not take the time to even attempt to store it as binary Does it behave properly when comparing like this?

eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W")

eq(testUlid.id, "01HEENT3YW02X8Q915QFR4X23W")

Sorry I got you to that tangent in your project? What do you think of it so far?

MendyOP•2y ago

Yeah it will work. The toDriver will transform it into hex and then binary You pointed me in the right path 🙂 I did some (limited) testing on a 1.3M rows table, and couldn't notice any performence difference

Angelelz•2y ago

Well, the fact that is binary hints that there should be no performance issues Right? But they talk about storage

MendyOP•2y ago

I can only see 3 possible reasons not to use it: 1. You don't want to leak creation time - ulid will store that (as opposed to uuid or nanoid) 2. You need something shorter for nicer urls 3. 16 bytes is still more then 8 bytes for bigint

Angelelz•2y ago

This should be really easy to tests in a sqlite db

MendyOP•2y ago

lmk if you get the chance! would love to see the results!

Angelelz•2y ago

I don't have the time right now but I wanna look into it in the future

MendyOP•2y ago

Oh, one drawback to be ware of, is that updates and inserts on drizzle studio will not work. As well as filtering or relation views

Angelelz•2y ago

I think this solution is the best of any world Well, based on your comments there should be an easy fix And don't use the studio anyway lol

MendyOP•2y ago

same, I literally only used it for testing out the custom type

Angelelz•2y ago

Are you planning to keep that gist up? I think we should link it somewhere In the docs or something

MendyOP•2y ago

Yeah, it will remain up I agree. I posted it here and gave the gist a good title so people have better chance of finding it when looking ulid and drizzle, posting it in the docs will be helpful for someone for sure

Angelelz•2y ago

@Andrew Sherman I use ULIDs in my project but this implementation is a lot more efficient in terms of storage and performance. Do you think we can include a link to the gist maybe in the examples for customTypes or the usage of the $default() method?

MendyOP•2y ago

@Angelelz just a heads up in case you’re considering using my type - I’ll advise against it. It appears that drizzle have LOTS of issues with custom types. The toDriver and fromDriver are not reliable, and will not fire on many occasions (prepared statements, nested query builder relationships and more…) In order to get binary ULID working this will have to be resolved. There are many issues open on GitHub related to this. For myself, I gave up on it for now, and will store my ids as in a varchar type.

Angelelz•2y ago

Hey Mendy, thank you for reaching out. I'm invested in this now. I've been pushing PRs to drizzle and I'm familiar with the codebase. Do you mind if I ask in particular what issues you've found that I could take a look into?

MendyOP•2y ago

I sent the message before leaving my house, when back, I’ll setup a box with the issues I’ve seen. If you have a small schema I can use and you can link it here it will be helpful. If you are following through with this, I also made some changes to the type to help mitigate some issues, I’ll update the gist when I’m back.

Angelelz•2y ago

Sounds good! Thanks

MendyOP•2y ago

I haven't forgotten, got caught up with urgent matters, will get to doing that soon.

Angelelz•2y ago

It's ok. But I believe there were some fixes in the latest release of drizzle-kit

Whit•16mo ago

How are you doing the bit where you are storing ULID as a binary in the database and then converting it to UUID to send to the users? I think this is the path I want to go down. It makes sense to store ULID as binary for mostly sortable IDs which means better perorming indexes, however I dont want to expose the created time of the ULID to the client.

Trần Bá Hiền•16mo ago

Please give me the usage for MySQL Planetscale in my schema https://gist.github.com/MendyLanda/d5dc176707fd23cf6625c170038a8835

Gist

Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM

Implementing Efficient Binary ULID Storage in MySQL with Drizzle ORM - drizzle-ulid.md

Gaming

Programming

Smart or Idiotic? The Great BigInt-to-Hash Adventure!

Did you find this page helpful?