Back up a database snapshot reliably
I'm using HNSWlib-node to create an in-memory vector DB. My code writes it to a file when new embeddings are added or deleted. Basically, does railway save to the disk? because it might save as json...
I've tended to find that these providers throw away the disk when the server restarts and start with a fresh one, so the file either has to be sent somewhere else or the disk can't be ephemeral
I want to hedge risk by making sure the DB gets backed up on exit too. So, my question is how do I make sure that this file gets saved again when the server is shut down for code updates?
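One way to approach the "save again on shutdown" part is a signal handler. The sketch below assumes the platform sends SIGTERM (or SIGINT) before stopping the old container and allows a short grace period; that behaviour is an assumption here, not something confirmed in this thread. `installShutdownBackup` and `saveSnapshot` are hypothetical names made up for the example.

```javascript
// Sketch: run one final snapshot when the process is asked to stop.
// `saveSnapshot` is a hypothetical async function supplied by the caller;
// it would write the index file and ship it somewhere durable.
function installShutdownBackup(saveSnapshot, options = {}) {
  const signals = options.signals || ['SIGTERM', 'SIGINT'];
  const exit = options.exit || ((code) => process.exit(code));
  let running = null; // guards against a second signal starting a second save

  const handler = (signal) => {
    if (running) return running;
    running = Promise.resolve()
      .then(() => saveSnapshot(signal))
      .then(() => exit(0))
      .catch((err) => {
        console.error('backup on shutdown failed:', err);
        exit(1);
      });
    return running;
  };

  for (const signal of signals) process.on(signal, handler);
  return handler; // returned so it can be triggered manually or tested
}
```

`saveSnapshot` would wrap whatever call already writes the index file. Since a hard kill never reaches the handler, this is best treated as a last-chance save on top of the regular saves, not a replacement for them.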
Project ID:
0314313d-b6e5-4895-98b0-f3402dfa9adc
you are right, containers on railway are ephemeral, so in a sense yes, the data does get thrown away. you will want to save the file to bucket storage instead of to disk, ideally with file revisions
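One simple way to get "file revisions" in a bucket is to give every upload its own timestamped key and prune old ones. The sketch below is only an illustration; the key layout (`<prefix>/index-<timestamp>.dat`) and the function names are assumptions made up for this example.

```javascript
// Build an object key that embeds a timestamp, so every upload is a new revision.
function revisionKey(prefix, date = new Date()) {
  // ISO timestamps sort lexicographically; ':' and '.' are swapped for '-' to keep keys tidy.
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return `${prefix}/index-${stamp}.dat`;
}

// Given the revision keys currently in the bucket, return the ones that fall
// outside the newest `keep` revisions and can be deleted.
function keysToPrune(keys, keep) {
  const sorted = [...keys].sort(); // oldest first, thanks to the timestamp format
  return sorted.slice(0, Math.max(0, sorted.length - keep));
}
```

Keeping a handful of revisions means a snapshot that was cut off mid-write during shutdown does not overwrite the last good one.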
Okay I'll look into that. Do you have recommendations? somewhere else that’s not ephemeral with low latency to your servers?
@Brody
cloudflare r2 is my recommendation
r2 is just storage, can't run a server there
Are there any options you'd recommend for running a server?
@Brody
We're trying to set up an API server. Our product deployed on railway needs to make requests to it.
So ideally we can get set up on a solution with:
(a) low latency to Railway's servers and
(b) is not ephemeral
yes r2 is storage, store the backup of your in memory database there
We have multiple railway deploys
so?
I'm just clarifying that our main product is deployed on Railway. And we have a second deploy that we want to just be a really low-latency API server that our main product can call - it doesn't have to be railway
I say this because ideally we don't need to set up separate services for storage and server
We'd prefer if they were all-in-one
But the main priority is latency to the Railway server our main product is hosted on
railway doesn't offer storage, you'll have to use an external service like cloudflare r2
I'm sure cloudflare has a region very close to railway's us-west1 region
Isn't railway hosted on GCP though? gcp has storage on servers
@Brody
railway is hosted on gcp, but railway has not brought the option for persistent storage to the user yet
Okay. So for absolute fastest latency for a persistent storage solution, your recommendation would still be cloudflare R2?
Latency could kill our product - that's why I ask 🙏🏼
if latency to storage is that big of a concern, there will always be much higher latency to an external provider. railway is great, but they don't have persistent storage volumes yet, and there are other PaaS providers that do offer storage volumes
but you have only been talking about database backups, and I don't see how the latency of a background task could affect the end users
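To illustrate the background-task point: the backup can be scheduled off the request path, so a burst of embedding changes triggers one upload a little later instead of one per request. This is a minimal sketch; `createBackupScheduler`, `runBackup`, and the 5 second default delay are assumptions made up for the example.

```javascript
// Sketch: debounce backups so request handlers never wait on storage.
// `runBackup` is a hypothetical async function that saves and uploads the index.
function createBackupScheduler(runBackup, delayMs = 5000) {
  let timer = null;
  return {
    // Call after every add/delete; restarts the countdown.
    markDirty() {
      if (timer) clearTimeout(timer);
      timer = setTimeout(() => {
        timer = null;
        Promise.resolve()
          .then(runBackup)
          .catch((err) => console.error('backup failed:', err));
      }, delayMs);
    },
    // True while a backup is scheduled but has not started yet.
    pending() {
      return timer !== null;
    },
    // Drop a scheduled backup, e.g. when a shutdown handler takes over.
    cancel() {
      if (timer) clearTimeout(timer);
      timer = null;
    },
  };
}
```

A request handler would call `markDirty()` and return immediately; the upload latency to the storage provider is then paid by the timer callback, not by the end user.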
Good point @Brody - thanks for pushing back on this. I think we'll host on railway and then set up R2 storage. Do you have any resources you'd recommend for this? never know what tricks the railway team might have so I've gotta ask lol
cloudflare r2 is accessed through an AWS S3-compatible api, so all you have to do is use Amazon's S3 sdk
cloudflare has these resources
https://developers.cloudflare.com/r2/examples/aws/
i don't know the language you will be using so I can't send specific links to the sdks since it's different for every language, but the aws S3 sdk docs are plenty easy to find
but if your language is node, don't use the v3 sdk, I've heard it has terrible performance compared to the v2
Yeah we're using node. Thanks for the heads up!
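For Node with the v2 SDK, the setup on the Cloudflare examples page linked above boils down to pointing the S3 client at the R2 endpoint. A minimal sketch, assuming the `aws-sdk` (v2) package is installed; the environment variable names and the `uploadSnapshot` helper are placeholders made up for this example.

```javascript
// Pure helper: derive S3 client options for R2 from environment-style settings.
// R2_ACCOUNT_ID, R2_ACCESS_KEY_ID and R2_SECRET_ACCESS_KEY are placeholder names.
function r2ClientOptions(env) {
  return {
    endpoint: `https://${env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
    accessKeyId: env.R2_ACCESS_KEY_ID,
    secretAccessKey: env.R2_SECRET_ACCESS_KEY,
    signatureVersion: 'v4',
  };
}

// Upload one snapshot file to a bucket. `aws-sdk` is required lazily so the
// helper above stays usable without the package installed.
async function uploadSnapshot(env, bucket, key, filePath) {
  const S3 = require('aws-sdk/clients/s3');
  const fs = require('fs');
  const s3 = new S3(r2ClientOptions(env));
  await s3
    .upload({ Bucket: bucket, Key: key, Body: fs.createReadStream(filePath) })
    .promise();
}
```

Usage would look like `await uploadSnapshot(process.env, 'my-bucket', 'backups/index.dat', './index.dat')`, where the bucket name, key, and file path are again just examples.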
also, it very much sounds like this will be a commercial product so you will want to upgrade to the team plan at some point
for Railway? yeah the account this project is on is just my personal for testing. We're going to deploy it to our team account shortly 🙂
yes, but prepare yourself for the inevitable upgrade to the v3 sdk at some point in the future when v3 is considered stable
v3 is considered stable by Amazon for some reason, but I've heard otherwise from developers