Railway•13mo ago

latency

Hello, I am building an app that performs object detection using machine learning. The inference server is hosted on AWS SageMaker (us-east), and I am using Railway to host a Node Express server as a sort of gateway between the client (us-east) and the SageMaker server. The client needs to send a single image with its request.

I have noticed that invoking the SageMaker server directly takes about 300 milliseconds to get a response back. I know from local testing that the inference time is about 150 milliseconds, so presumably the other 150 milliseconds go to sending the image data and getting the response back from SageMaker. When invoking the Express server hosted on Railway (us-west), it takes about 900 milliseconds to 1 second to get a response back.

I am slightly surprised by that, but I imagine most of it comes from passing the image data through an extra hop, i.e. client --> express --> sagemaker instead of just client --> sagemaker. It could also be that the Express server is in us-west while SageMaker is in us-east. I also need to do some authentication work on the Express server before passing the request along to SageMaker, but I have tried to run those steps in parallel rather than sequentially.

I would like to reduce the latency as much as possible. Please forgive me, I am mostly a front-end dev diving into territory I know very little about, so any thoughts/ideas/suggestions are appreciated.
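The numbers in the post can be turned into a rough latency budget. The sketch below only restates the figures quoted above; the variable names are made up for illustration, and the split of the gateway overhead between network travel, authentication, and forwarding is not known from the post.

```typescript
// Figures quoted in the post, in milliseconds.
const inferenceMs = 150; // inference time measured locally
const directMs = 300; // client -> SageMaker, full round trip
const viaGatewayMs = [900, 1000]; // client -> Express (Railway) -> SageMaker

// Time spent moving the image and the response when calling SageMaker directly.
const directTransferMs = directMs - inferenceMs;

// Extra time added by going through the gateway (network hop + auth + forwarding).
const gatewayOverheadMs = viaGatewayMs.map((t) => t - directMs);

console.log({ directTransferMs, gatewayOverheadMs });
```

So roughly 600 to 700 milliseconds are unaccounted for, which is the part worth measuring.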

12 Replies

Percy•13mo ago

Project ID: N/A

Brody•13mo ago

why not run the express app in us-east as well to cut down on the rtt to your aws service?

CeresMillerOP•13mo ago

express app is running us-west (oregon)

Brody•13mo ago

my bad

CeresMillerOP•13mo ago

run sagemaker in us-west you mean? Yeah that could be an option

Brody•13mo ago

I fixed my question

CeresMillerOP•13mo ago

Railway requires an upgrade of my plan afaik to get access to US-east

Brody•13mo ago

yes they do

CeresMillerOP•13mo ago

are you surprised that it's triple the latency? do us-west and us-east differ that much?

Brody•13mo ago

I'm sure not all of that latency is coming from the travel time, but you could also run your aws service in us-west if that's an option. though if this project will have a userbase or clients, then at some point you will need to upgrade to pro anyway

CeresMillerOP•13mo ago

do you have some thoughts on what it could be or how to approach this? should I just break it down step by step and see what is causing the latency?

Brody•13mo ago

I think you should just run the two things in the same region, eliminate that variable completely first
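If latency is still high once both services share a region, the step-by-step breakdown asked about earlier could look roughly like the sketch below. It is a minimal illustration, not the poster's actual gateway: `authenticate` and `invokeInference` are hypothetical stand-ins that just sleep, and `timed` and `handleRequest` are names invented here. In a real Express handler the same `timed` wrapper would go around the real auth and SageMaker calls.

```typescript
type Timings = Record<string, number>;

// Runs `fn`, stores its wall-clock duration (ms) under `label`, and returns its result.
async function timed<T>(timings: Timings, label: string, fn: () => Promise<T>): Promise<T> {
  const start = Date.now();
  try {
    return await fn();
  } finally {
    timings[label] = Date.now() - start;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for the gateway's authentication step.
async function authenticate(): Promise<boolean> {
  await sleep(20);
  return true;
}

// Hypothetical stand-in for the call to the inference endpoint.
async function invokeInference(image: Uint8Array): Promise<string> {
  await sleep(50);
  return `detections for ${image.length} bytes`;
}

// Measures each stage separately, plus the total time spent in the handler.
async function handleRequest(image: Uint8Array): Promise<{ result: string; timings: Timings }> {
  const timings: Timings = {};
  const start = Date.now();

  const authorized = await timed(timings, "auth", () => authenticate());
  if (!authorized) throw new Error("unauthorized");

  const result = await timed(timings, "inference", () => invokeInference(image));

  timings["total"] = Date.now() - start;
  return { result, timings };
}

handleRequest(new Uint8Array(1024)).then(({ timings }) => console.log(timings));
```

Logging these per-stage numbers on the server, and comparing `total` with the latency the client sees, shows how much time is spent inside the gateway versus on the network between client, gateway, and inference server.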
