Unstable Response Times
I deployed a Remix prototype that uses a SaaS API. I'm on a paid plan with Smart Placement enabled. Below are the response times I see after a fresh deployment; I refreshed the page 26 times. Response time varies between 288ms and 1.10s. The first two requests, both over 2 seconds, are understandable: the first hit an empty cache and Smart Placement used AMS, and on the second Smart Placement changed to FRA and settled there for the remaining requests. Why can't we have stable response times? I am sure it is not the SaaS API; on my local machine I get 60ms to 80ms response times.
Deployment ID: 30d04e5d-2fd0-4f4d-b866-ca98e9aad893
anyone?
same
I get between 100ms and 600ms response times, depending on the time of day and seemingly at random
for a Next.js project
hmm, at least this behavior is framework-agnostic 😄
I think the servers running Workers are overloaded
so sometimes you get 600ms when it's busy and 100ms when it's free
Workers have a flat deployment model: the same server that receives your request runs the worker, with no load balancer or anything in between, so that's unlikely
Smart Placement is... interesting, and has had some issues in the past. You'd have to look at the cf-placement header and the location of your origin to tell more, plus whatever else your app is doing
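a quick way to check those headers is from the browser console on the deployed page, something like this (the URL is a placeholder, not the actual deployment):

// run in the devtools console while on your deployed site
const res = await fetch("https://your-app.pages.dev/"); // hypothetical URL; use your own
console.log(res.headers.get("cf-placement")); // e.g. "local-FRA", or "remote-..." when Smart Placement kicks in
console.log(res.headers.get("cf-ray"));       // the suffix is the colo that served it, e.g. "...-FRA"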
yes, but why do I get 100ms response times at 5am and 600ms response times at 6pm?
and I don't use Smart Placement
as I said, it settles on FRA, which is in the same region as my SaaS
atm I get 500ms response times, while this morning it was around 100ms
it's either the origin you are connecting to, or an issue with Next.js. Next.js support on Pages has been an ongoing challenge for a while; Vercel doesn't want to help out, unlike most of the other frameworks. If it's not Smart Placement you should make your own thread though. I would check your Functions metrics (CPU time, etc.)
If you're in Germany/using DTAG, that currently has its own set of routing challenges on the free plan
ouch, look at that p99
I don't really know why, I'm not doing much
but atm I can't get any request under a 400ms response time
this is from the same page, all I do is refresh, so it's doing exactly the same thing on every request
Can you make your own thread? It seems your issue might be unrelated, and it's hard to keep both stories straight when you're not even using smart placement
okay
you're using Smart Placement, and are you in Germany/using DTAG or not?
no, I'm in Türkiye
Turkey has its own set of issues with IP blocking, but that's probably not what's going on here. You said even the slow ~1s requests are smart routed? Can you check what the cf-placement and cf-ray headers are on one of those?
local-FRA
89c922d18c552bf1-FRA
looks like Smart Placement isn't doing anything and you're being routed directly to FRA
what do you mean? if I disable Smart Placement, would I be routed through IST instead?
no, wouldn't change that
also probably wouldn't help latency
curious that you said before you were being routed to AMS. Well, it doesn't matter too much; if you go to your Functions metrics, what do you see for request latency? You should see smart-routed vs. non
unfortunately I don't see a Request duration chart
Doc says "The request duration chart is currently only available when your Worker has Smart Placement enabled."
this was the main reason I switched to Smart Placement...
well, it turns out you do have the exact same problem, unrelated to subrequests or smart routing: your CPU time is way too high. Something's taking a ton of time to execute, and CPU time does not include network wait time. In the past it's been silly things like huge SVGs that have to render or other expensive operations; it's going to depend on your framework/libs a bit
for reference, the free plan gets 10ms of CPU time and it's plenty. I have a Next.js app deployed on Pages and its p99 is ~17.8ms
I am on a paid plan because on free I got CPU-time-exceeded errors after 3-4 requests
but again
you found the wrong solution to the problem
this is the same request
doing the same API call
rendering the same HTML
and they're all pretty long, none under ~300-400ms or so
the way CPU limits work is that each isolate/instance of your worker running on a given metal gets a generous chunk of startup time, and if you're going way over you'll burn through it, which is why it took a few requests to fail
I am measuring the time in my loader (this is a Remix app), from entry to return; it's 12ms for this request
there are no API/backend calls after the loader finishes, just React rendering
if you're using performance.now() or Date.now(), the clock doesn't advance except on network I/O
Cloudflare docs: Performance and timers · Cloudflare Workers docs
ex:
const oldtime = Date.now()
// ...500ms of pure CPU work here...
const newtime = Date.now()
// newtime and oldtime will be equal: the clock didn't advance during the CPU work
that's only true for deployed Workers though; local wrangler advances the clock normally
I know that, I have at least 2 API calls
cached through KV
cache-miss loaders take around ~600ms, and as far as I know waiting for an API response does not count toward the CPU time limit
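a loader doing that would look roughly like this; the binding, key, and URL below are placeholders, and the context shape depends on the adapter setup:

// Remix loader with a KV-backed cache; MY_KV and the key are hypothetical names
export async function loader({ context }) {
  const env = context.cloudflare?.env ?? context.env; // depends on your adapter/getLoadContext
  const cached = await env.MY_KV.get("search-results", "json");
  if (cached) return cached; // cache hit: the fast ~12ms path
  const data = await (await fetch("https://api.example.com/search")).json(); // placeholder origin
  await env.MY_KV.put("search-results", JSON.stringify(data), { expirationTtl: 300 }); // 5 min TTL
  return data; // cache miss: the slow ~600ms path
}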
oh, you were saying before it was only 12ms for a cache hit of everything, and it's 600ms uncached?
yes, but that's just the measurement of loader function, not the overall response
and the Pages cache evicts too quickly, I think it's around 60 seconds or so
well, either way, if it's taking ~600ms that's way too long
Pages doesn't have its own cache, which one do you mean?
the Cache API
oh, you can pass in a Cache-Control max-age, but entries may still be evicted faster, and the cache is per location. Either way, it shouldn't be taking ~600ms of CPU time to run uncached
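in a worker that pattern looks roughly like this; the origin URL and max-age value are arbitrary placeholders, and per-colo eviction can still happen sooner:

// sketch of caching an upstream response via the Cache API
export default {
  async fetch(request, env, ctx) {
    const cache = caches.default;
    let response = await cache.match(request);
    if (!response) {
      const upstream = await fetch("https://api.example.com/search"); // placeholder origin
      response = new Response(upstream.body, upstream); // copy so headers are mutable
      response.headers.set("Cache-Control", "max-age=300"); // a hint; eviction may be faster
      ctx.waitUntil(cache.put(request, response.clone()));
    }
    return response;
  },
};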
that's not the execution time
it's waiting for the API response; I am not computing anything for 600ms
@st try not requesting the worker for 5 minutes, then test again
the chart you showed above is ~600ms p90 and ~150ms p75
yes, because I am the only user/developer
no, your site just shouldn't ever be that slow/use that much CPU, uncached or not
well, in my opinion, and for the way Workers are priced at least, that is just too slow
if you think it's the API call that's slowing it down, you could make a worker with smart routing that purely calls it, isolating each component
the loader is the only place I can measure time
I don't know what the cost of rendering on Pages is, but on my local machine it's around 100ms including the API response
Cloudflare Workers have too many issues and we can't even know why because there are no logs 😐
I don't need to wait 5 minutes, 60 seconds is enough
Workers are fine, and there are request logs too. Under each deployment you can tail it and see all requests
yes, but you don't see anything useful for debugging performance
you don't even see the CPU time for a single request
yea, a big part of that is Spectre and speculative-execution mitigations
you can still time each API/KV call and such though, and try calling from a worker, etc.
so, the same page takes 100ms in my local environment; add 200ms of latency to that and I should see a constant ~300ms response time
the bigger issue is just frameworks being a black box and doing so many things you don't know about. CF could help with that by giving timings of each function and such, if not for the Spectre mitigations
yeah, but when I time it my API calls are taking 30ms while my request time is 600ms, and I don't understand that
:NotLikeThis:
it's the cpu usage or other requests (if you have any)
yeah, the cpuTime makes no sense
but as you can see that is not the case, and even the Workers metrics claim it takes around a second
maybe Next.js has a memory leak
memory shouldn't cause that though; it'd error out on going over 128 MB
I had 600ms response times 5 minutes ago
now I have 200ms
changing nothing
just waiting
well something's changing internally lol
I think next-on-pages does take advantage of the Cache API internally
that's the CPU time that changed
so I don't understand
it's the same page
if it was caching rendering or page chunks
yes, but we are not talking about caches, we are talking about the cpu time
I just can't help much with the framework-specific stuff, other than saying add logs around everything you suspect and test locally (keeping in mind CPU differences). You can see from a standalone worker that the fetch to your origin doesn't take that long; it's a mix of CPU time + what the fetch actually takes + round-trip time
I showed you: 12ms using the cache, but the server response is ~1 second
one request later it's again 12ms but the response is ~300ms
the SaaS API is Algolia
there is no way their response times fluctuate that much
you mentioned having other network operations like KV in each call; I would time all of them separately and log them, it'll give you some more insight. You could also use a separate worker to call Algolia like I suggested before, to see the relative latency/how long it should take, and doing so would rule out (or point to) it being CF/Workers
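something like this inside the loader would break it down; the binding, key, and URL are placeholders, and on a deployed Worker these numbers capture only the network waits, not CPU:

export async function loader({ context }) {
  const env = context.cloudflare?.env ?? context.env; // depends on adapter setup
  const timings = {};
  let t = Date.now();
  const cached = await env.MY_KV.get("search-results"); // hypothetical binding/key
  timings.kv = Date.now() - t; // advances because the KV get is network I/O
  t = Date.now();
  const body = await (await fetch("https://api.example.com/search")).text(); // placeholder for the Algolia call
  timings.api = Date.now() - t;
  console.log(JSON.stringify({ ...timings, kvHit: cached !== null })); // visible when tailing the deployment
  return body;
}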
hmm, I think that's not possible, because I am using their React component, which handles the API calls internally
well, you could at least do the second part to try to debug it, and confirm (or rule out) that it's not Workers/smart routing alone causing it
I can try, but I have no clue
do you mean I should create a bare JS function and deploy it to Workers?
either an actual Worker or a plain Pages project using nothing but normal Functions
Workers are really easy to use, just the normal fetch interface, ex:
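(a minimal sketch of a standalone latency probe; the endpoint URL is a placeholder)

// standalone worker that times a single subrequest
export default {
  async fetch() {
    const start = Date.now();
    const upstream = await fetch("https://api.example.com/ping"); // hypothetical endpoint
    const ms = Date.now() - start; // the clock advances here because of the network I/O
    return new Response(`status=${upstream.status} fetch=${ms}ms`);
  },
};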
I can proxy the Algolia requests
to a worker, I guess
but won't this be the same as subrequests?
Subrequests: "Requests triggered by calling fetch from within your Functions."
I was saying use a worker (or a Pages Function) to do a subrequest to Algolia, the same as your actual deployed app would. By doing so, you're isolating the subrequest out, and you can see the latency of just that subrequest and whether it varies or not.
not make your app proxy the request or anything
Algolia's React library makes the API calls internally
https://www.algolia.com/doc/guides/building-search-ui/what-is-instantsearch/react/
and you're SSRing it, and thus calling that API internally?
that's right
well, to test it purely you'd need to find out the exact request it's making (if it supports client-side use, you could maybe use that)
otherwise you've just got too many pieces in play, like trying to figure out if the center of your cake is a lemon without pulling it apart
I can do that, but I need SSR for SEO
thanks anyway
sure, I mean just for testing: figure out the exact request it makes, reproduce it cleanly in a worker or function (separate from your app), and then you can see its latency directly
you figure out whether that subrequest is slow or not, and then you look at the rest of the layers (ex: is the lib making too many requests? is React doing something? etc.)
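for Algolia specifically, the raw search call can be reproduced against their REST endpoint, roughly like this; the app ID, index name, and key are all placeholders:

// standalone worker reproducing a single Algolia search query and timing it
export default {
  async fetch() {
    const start = Date.now();
    const res = await fetch("https://YOUR_APP_ID-dsn.algolia.net/1/indexes/YOUR_INDEX/query", {
      method: "POST",
      headers: {
        "X-Algolia-Application-Id": "YOUR_APP_ID",
        "X-Algolia-API-Key": "YOUR_SEARCH_ONLY_KEY",
      },
      body: JSON.stringify({ params: "query=test" }),
    });
    const ms = Date.now() - start;
    return new Response(`status=${res.status} algolia=${ms}ms`);
  },
};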
by the way, I did deploy the same app using Next.js, and it's much worse 😄
🥲