Created by Swart on 8/31/2024 in #help
Invalid Byte Encoding with Postgres JSONB columns
I've been running a fairly successful website with drizzle-orm, but I've recently come across a problem that I cannot for the life of me figure out. Any insight into a solution, or even how to debug this further, would be helpful.

I have a pretty standard setup:
- postgres 16 (Docker image)
- Web frontend using Nuxt; data for the backend API is provided by drizzle.

I have a few tables that store information in jsonb columns. This data contains strings (people's names) that often have characters outside the standard en-US character set. For a long time (almost a year) this has worked seemingly well for me and I had not noticed any errors. Recently, though, I've started to receive errors like this from postgres:

invalid byte sequence for encoding "UTF8": 0xf8
invalid byte sequence for encoding "UTF8": 0xe5 0x6b 0x61

What I found from investigating this is that the data being sent to postgres is using an encoding like UCS-2: the byte sequence reported in the error is the Unicode code point of the extended character, not its UTF-8 sequence. I determined this by running db.execute(sql`${JSON.stringify(entries)}::bytea`) and then dumping the resulting buffer with toString('hex').

What's really crazy about this is that I can only get it to happen on my production server, after about 30 minutes of uptime. Visiting a page triggers an update of a row in the database. The update, with extended characters, works immediately after the service comes up, but will then consistently fail after some amount of time.

I started with the postgres.js driver and assumed it was a bug in their implementation, so I switched to node-postgres and ran into the same issue. Right now I have no real guesses about what is going wrong; all the code is pretty straightforward. Maybe something with server-side rendering is changing the default encoding output of JSON.stringify? Any help is appreciated!
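For reference, here is roughly what that byte-level check looks like end to end. This is a minimal sketch assuming the node-postgres driver; the people table, its entries jsonb column, and the "Gøran" test value are made-up stand-ins for the real schema and data:

```ts
import { pgTable, serial, jsonb } from "drizzle-orm/pg-core";
import { sql } from "drizzle-orm";
import { drizzle } from "drizzle-orm/node-postgres";
import { Pool } from "pg";

// Hypothetical table standing in for the real schema: a jsonb column
// holding entries whose name strings contain non-ASCII characters.
const people = pgTable("people", {
  id: serial("id").primaryKey(),
  entries: jsonb("entries").$type<{ name: string }[]>(),
});

const db = drizzle(new Pool({ connectionString: process.env.DATABASE_URL }));

// Send the stringified JSON as a parameter, cast it to bytea, and hex-dump
// the bytes Postgres actually received. A healthy round trip should show
// the UTF-8 sequence for "ø" (0xc3 0xb8), not a single 0xf8 byte.
async function dumpEncoding(entries: { name: string }[]) {
  const result = await db.execute(
    sql`select ${JSON.stringify(entries)}::bytea as raw`,
  );
  const raw = result.rows[0].raw as Buffer;
  console.log(raw.toString("hex"));
}

dumpEncoding([{ name: "Gøran" }]);
```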
10 replies