Admincraft•8mo ago
mitche

Trying to diagnose performance issues

Hello everyone, I am trying to figure out a strange hiccup that is happening on my server(s). Chunk loading performs normally most of the time, and then it will occasionally take an extremely long time. You can see it in the bandwidth chart: https://i.imgur.com/7yewJbQ.png It will just stop and hang for an indeterminate amount of time. Sometimes it's just a few seconds, other times it's 60 seconds and the server shuts down. I am trying to figure out whether some bottleneck with my hosting provider, such as the network or disk, could cause this. The server is running on 14GB of memory and 3 CPU cores, which provides very fast chunk loading outside of these spikes. I am testing it by flying around in creative mode, but these spikes happen in regular gameplay too. This happens on vanilla, but I am using Fabric in order to use spark, and it also occurs there. Thanks! Spark profile: https://spark.lucko.me/6tQQGFKim6
35 Replies
Admincraft Meta
Admincraft Meta•8mo ago
Spark Profile Analysis
❌ Processing Error
The bot cannot process this Spark profile. It appears that the platform is not supported for analysis. Platform: Fabric
Requested by p.uppy#0
ProGamingDk
ProGamingDk•8mo ago
have you pregenned?
Casper
Casper•8mo ago
looks like chunk saving. are you on an SSD or HDD?
mitche
mitcheOP•8mo ago
It's SSD, but over NFS
ProGamingDk
ProGamingDk•8mo ago
ouch very ouch
Casper
Casper•8mo ago
huge yikes moment here
[attached screenshot]
Casper
Casper•8mo ago
wait no, I misread. this is the wait for next tick
ProGamingDk
ProGamingDk•8mo ago
yes you do but yeah ssd over nfs not good
Casper
Casper•8mo ago
no wait wtf, this is on the netty thread, no? no, I'm wrong ffs, I'll just shut up
ProGamingDk
ProGamingDk•8mo ago
bruh what 😭 netty thread isn't profiled by default
Casper
Casper•8mo ago
ok ok ok so
ProGamingDk
ProGamingDk•8mo ago
thats what we have /spark profiler start --thread * for
mitche
mitcheOP•8mo ago
Is that net pause we see caused by waiting on the NFS write?
Casper
Casper•8mo ago
the wait for next tick is up here
[attached screenshot]
Casper
Casper•8mo ago
this is processing for the move packet, which leads to a getChunk call, which goes into the run tasks, which leads to the parkNanos. you see what I mean?
ProGamingDk
ProGamingDk•8mo ago
mesa tired, mesa go sleep
Casper
Casper•8mo ago
xd what I think is happening is that it is trying to read a chunk that is unloaded, and since you're on NFS, it is blocking the main thread until that chunk can be read over NFS. that is killing your perf, since normally the world would be on a local SSD or something
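One rough way to check whether storage reads really stall like this is to time block reads from a region file on the NFS mount and compare them with the same file on local disk. The sketch below is only an illustration: the function names, block size, and sample count are assumptions, and it is not something the thread participants ran. The OS page cache can hide slow reads, so a file that was read recently may look faster than a cold one.

```python
import os
import statistics
import time


def probe_read_latency(path, block_size=4096, samples=200):
    """Time block reads at varying offsets in `path`; return latencies in ms."""
    size = os.path.getsize(path)
    if size == 0:
        return []
    latencies = []
    # buffering=0 skips Python's own read buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        for i in range(samples):
            # Stride through the file so consecutive reads touch different blocks.
            offset = (i * 7919 * block_size) % size
            f.seek(offset)
            start = time.perf_counter()
            f.read(block_size)
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return median, p99 and max so that rare long stalls stand out."""
    if not latencies:
        return {"median_ms": 0.0, "p99_ms": 0.0, "max_ms": 0.0}
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }
```

Calling `summarize(probe_read_latency(path))` on a region file under the NFS mount, and again on a local copy, gives a quick comparison. A large gap between the median and the max on the NFS side would be consistent with the occasional hangs described above.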
mitche
mitcheOP•8mo ago
the problem is we are using NFS in order for server files to be available for dynamically scaling nodes and containers. is there a way to counter the effects of this?
Casper
Casper•8mo ago
get faster wifi. your bottleneck seems to be your network speed. you basically need network speed that is as fast as having the SSD locally on the device, which is why I'd never actually do NFS for server files myself xd
ProGamingDk
ProGamingDk•8mo ago
Get fast ethernet
Casper
Casper•8mo ago
You know what I meant
ProGamingDk
ProGamingDk•8mo ago
I do, some don't 😭 But this 100%
Casper
Casper•8mo ago
To mitigate it you can also preload your chunks onto your local system, further out than what NMS would use. You'd still have it bad when anyone does anything to move fast, like elytras. But early game it'd probably be fine, assuming you preload enough. Then you can also run into collisions from multiple servers having the same chunks. Overall, idk why you'd do it this way
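As an illustration of the preloading idea, the sketch below works out which region files cover a square of chunks around a center chunk and copies them to a local directory. It relies on the region file layout, where each `r.X.Z.mca` file holds a 32x32 block of chunks. The function names and directory arguments are hypothetical. It covers only the read-side copy of the `region` directory. Syncing changes back, and stopping two servers from writing the same region, are not addressed.

```python
import shutil
from pathlib import Path


def region_files_for_area(center_chunk_x, center_chunk_z, radius_chunks):
    """Names of the region files covering a square of chunks around a center chunk."""
    # A region spans 32x32 chunks, so region coordinate = chunk coordinate >> 5.
    min_rx = (center_chunk_x - radius_chunks) >> 5
    max_rx = (center_chunk_x + radius_chunks) >> 5
    min_rz = (center_chunk_z - radius_chunks) >> 5
    max_rz = (center_chunk_z + radius_chunks) >> 5
    return [
        f"r.{rx}.{rz}.mca"
        for rx in range(min_rx, max_rx + 1)
        for rz in range(min_rz, max_rz + 1)
    ]


def prefetch_regions(remote_region_dir, local_region_dir, names):
    """Copy the named region files that exist remotely; return the names copied."""
    local = Path(local_region_dir)
    local.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        src = Path(remote_region_dir) / name
        if src.is_file():  # regions that were never generated have no file yet
            shutil.copy2(src, local / name)
            copied.append(name)
    return copied
```

For example, `region_files_for_area(0, 0, 1)` spans four region files, because a one-chunk radius around chunk (0, 0) crosses into negative coordinates on both axes.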
walker
walker•8mo ago
Hi, server host owner here. For some context: NFS is definitely not a good solution for MC, but unfortunately it's sort of the best we've got considering our setup. We're a "pay by the minute" server host, so we dynamically start and stop servers when players are online/offline. We also dynamically add and remove nodes depending on how many servers are online. Because of this, there's no guarantee that the assigned node will be mounted to the volume that we store the servers on, so we use NFS to mount to the "always online" node that has the servers volume. While there are definitely performance issues, it's not too bad considering all the nodes are in the same data center, but it's issues like this that occasionally crop up. I guess I'm curious if y'all have any suggestions for this particular use case. I'm wondering if there's a way to tell MC to write chunks to disk more aggressively, rather than all at once, so we don't get these massive lag spikes where we're blocking the main thread, but I'm not sure that's possible. For some more context, we're using Longhorn with RWX volumes: https://longhorn.io/docs/1.6.2/nodes-and-volumes/volumes/rwx-volumes/
walker
walker•8mo ago
Maybe setting the sync-chunk-writes property to false would help improve performance? That way we're asynchronously writing instead of blocking the main thread
Casper
Casper•8mo ago
oh yeah, this isn't Paper. Paper forces that to be false
walker
walker•8mo ago
For performance reasons, presumably? I've read there are possible issues with data corruption on a crash (makes sense considering this is an async write). But I'm surprised Paper would force this to be false considering that possibility.
Casper
Casper•8mo ago
yeah, performance
walker
walker•8mo ago
Cool. Paper disabling that by default makes me more confident that data loss is rare except in the case of a crash. I think what we'll do is advise our users to disable that property and install a backup plugin.
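For a host that wants to apply this for users instead of only advising it, a minimal sketch of setting `sync-chunk-writes=false` in `server.properties` could look like the following. The helper name and file path are assumptions. The file is read at startup, so the change takes effect on the next server start.

```python
from pathlib import Path


def set_property(properties_path, key, value):
    """Set `key=value` in a server.properties-style file, keeping other lines."""
    path = Path(properties_path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without a key/value separator untouched.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


# Example with a hypothetical path:
# set_property("/srv/minecraft/server.properties", "sync-chunk-writes", "false")
```

Running it while the server is stopped avoids having the server overwrite the file with its own copy of the settings.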
Casper
Casper•8mo ago
for your case I don't see data loss being an issue, since part of the shutdown process will include flushing chunks to disk, and then you do whatever you need to do to store that (S3 or whatever you do)
QarthO
QarthO•8mo ago
realistically there's always some bit of data loss on server crashes. that can't really be stopped
walker
walker•8mo ago
Sounds good, thanks. After some ad hoc tests, this has significantly improved performance.
Snow Kit
Snow Kit•8mo ago
NFS is notoriously bad for Minecraft. If you need to use remote storage, iSCSI is probably a better option. I know SMB would likely be better for Minecraft if you were on Windows, but I'm unsure about the current state of SMB on Linux.
