Admincraft•11mo ago

Trying to diagnose performance issues

Hello everyone, I am trying to figure out a strange hiccup that is happening on my server(s). Chunk loading performs normally most of the time, but occasionally it takes an extremely long time. You can see it in the bandwidth chart: https://i.imgur.com/7yewJbQ.png It will just stop and hang for an undetermined amount of time. Sometimes it's just a few seconds; other times it's 60 seconds and the server shuts down. I am trying to figure out whether some bottleneck with my hosting provider (network, disk, etc.) could cause this. The server is running on 14 GB of memory and 3 CPU cores, which provides very fast chunk loading outside of these spikes. I am testing by flying around in creative mode, but the spikes happen in regular gameplay too. This happens on vanilla; I am using Fabric only so I can run spark, and it occurs there as well. Thanks! Spark profile: https://spark.lucko.me/6tQQGFKim6

35 Replies

Admincraft Meta•11mo ago

Spark Profile Analysis

❌ Processing Error

The bot cannot process this Spark profile. It appears that the platform is not supported for analysis. Platform: Fabric

Requested by p.uppy#0

ProGamingDk•11mo ago

have you pregenned?

Casper•11mo ago

Looks like chunk saving. Are you on an SSD or HDD?

mitcheOP•11mo ago

It's SSD, but over NFS

ProGamingDk•11mo ago

ouch very ouch

Casper•11mo ago

huge yikes moment here

Casper•11mo ago

wait no, I misread. This is the wait for next tick

ProGamingDk•11mo ago

yes you do, but yeah, SSD over NFS is not good

Casper•11mo ago

no wait wtf, this is on the netty thread, no? No, I'm wrong ffs, I'll just shut up

ProGamingDk•11mo ago

bruh what 😭 The netty thread isn't profiled by default

Casper•11mo ago

ok ok ok so

ProGamingDk•11mo ago

that's what we have /spark profiler start --thread * for

mitcheOP•11mo ago

Is that net pause we see caused by waiting on the NFS write?

Casper•11mo ago

the wait for next tick is up here

Casper•11mo ago

This is the processing for the move packet, which leads to a getChunk call, which goes into the run tasks, which leads to the parkNanos. You see what I mean?

ProGamingDk•11mo ago

mesa tired, mesa go sleep

Casper•11mo ago

xd. What I think is happening is that it is trying to read a chunk that is unloaded, and since you're on NFS, it blocks the main thread until that chunk can be read over NFS. That is killing your perf, since normally you would have the world on a local SSD or something
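One way to check the theory above is to time small random reads from the world's region files, once on the NFS mount and once on a local copy. The following is only a rough sketch: the function names `probe_region_reads` and `summarize` are made up for illustration, the directory path is whatever your world's `region` folder is, and the OS page cache can hide slow reads for data that was touched recently. It relies on region (`.mca`) files being organised in 4 KiB sectors.

```python
import random
import time
from pathlib import Path

SECTOR = 4096  # region (.mca) files are organised in 4 KiB sectors


def probe_region_reads(region_dir, samples=50, seed=None):
    """Time random 4 KiB reads from .mca files; return latencies in ms.

    Hypothetical helper for illustration, not part of Minecraft or spark.
    """
    rng = random.Random(seed)
    files = [p for p in Path(region_dir).glob("*.mca")
             if p.stat().st_size >= SECTOR]
    if not files:
        raise FileNotFoundError(f"no non-empty .mca files in {region_dir}")
    latencies = []
    for _ in range(samples):
        path = rng.choice(files)
        sectors = path.stat().st_size // SECTOR
        offset = rng.randrange(sectors) * SECTOR
        start = time.perf_counter()
        # buffering=0 avoids Python-level read-ahead; the OS page cache
        # may still answer the read without touching the network.
        with open(path, "rb", buffering=0) as fh:
            fh.seek(offset)
            fh.read(SECTOR)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return median, approximate p99 and max of the measured latencies."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median_ms": ordered[len(ordered) // 2],
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }
```

Calling `summarize(probe_region_reads("world/region", samples=200))` against both locations gives two sets of numbers to compare. If the NFS run shows occasional reads taking seconds while the local run does not, that would be consistent with the main thread stalling on remote reads.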

mitcheOP•11mo ago

The problem is we are using NFS so that server files are available to dynamically scaling nodes and containers. Is there a way to counter the effects of this?

Casper•11mo ago

Get faster wifi. Your bottleneck seems to be your network speed. You basically need a network that is as fast as having the SSD locally on the device, which is why I'd never actually use NFS for server files myself xd

ProGamingDk•11mo ago

Get fast ethernet

Casper•11mo ago

You know what I meant

ProGamingDk•11mo ago

I do, some don't 😭 But this 100%

Casper•11mo ago

To mitigate it, you can also preload your chunks onto your local system further out than what NMS would use. You'd still have it bad when anyone does anything to move fast, like elytras, but early game it'd probably be fine, assuming you preload enough. Then you can also run into collisions from multiple servers having the same chunks. Overall, idk why you'd do it this way

walker•11mo ago

Hi, server host owner here. For some context: NFS is definitely not a good solution for MC, but unfortunately it's sort of the best we've got considering our setup. We're a "pay by the minute" server host, so we dynamically start and stop servers when players are online/offline. We also dynamically add and remove nodes depending on how many servers are online. Because of this there's no guarantee that the assigned node will be mounted to the volume that we store the servers on, so we use NFS to mount to the "always online" node that has the servers volume. While there are definitely performance issues, it's not too bad considering all the nodes are in the same data center, but it's issues like this that occasionally crop up. I guess I'm curious if y'all have any suggestions for this particular use case. I'm wondering if there's a way to tell MC to write chunks to disk more aggressively rather than all at once, so there aren't these massive lag spikes where we're blocking the main thread, but I'm not sure that's possible. For some more context, we're using Longhorn with RWX volumes: https://longhorn.io/docs/1.6.2/nodes-and-volumes/volumes/rwx-volumes/

walker•11mo ago

Maybe setting the sync-chunk-writes property to false would help improve performance? That way we're asynchronously writing instead of blocking the main thread
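For a host that wants to apply this across many servers, the property can be flipped with a small script. This is a minimal sketch, not the host's actual tooling: `set_property` is a hypothetical helper, it assumes a plain `key=value` `server.properties` in UTF-8, and it should be run while the server is stopped, since the file is read at startup.

```python
from pathlib import Path


def set_property(properties_path, key, value):
    """Set key=value in a Java-style .properties file, keeping other lines.

    Hypothetical helper for illustration; appends the key if it is missing.
    """
    path = Path(properties_path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    updated, found = [], False
    for line in lines:
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            updated.append(f"{key}={value}")  # overwrite the existing entry
            found = True
        else:
            updated.append(line)  # leave comments and other keys untouched
    if not found:
        updated.append(f"{key}={value}")
    path.write_text("\n".join(updated) + "\n", encoding="utf-8")
```

For example, `set_property("server.properties", "sync-chunk-writes", "false")` switches chunk writes to asynchronous on the next start. The trade-off discussed in the replies that follow still applies: chunks that have not been flushed yet can be lost if the server crashes.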

Casper•11mo ago

oh yeah, this isn't Paper. Paper forces that to be false

walker•11mo ago

For performance reasons, presumably? I've read there are possible issues with data corruption on a crash (makes sense considering this is an async write). But I'm surprised Paper would force this to be false considering that possibility.

Casper•11mo ago

yeah, performance

walker•11mo ago

Cool. Paper disabling that by default makes me more confident that data loss is rare except in the case of a crash. I think what we'll do is advise our users to disable that property and install a backup plugin

Casper•11mo ago

For your case I don't see data loss being an issue, since part of the shutdown process will include flushing chunks to disk, and then you do whatever you need to do to store that (S3 or whatever you use)

QarthO•11mo ago

realistically there's always some bit of data loss on server crashes. that can't really be stopped

walker•11mo ago

Sounds good, thanks. After some ad hoc tests, this has significantly improved performance

Discount Milk•11mo ago

Oh. That'll do ya some problems https://discord.com/channels/348681414260293634/731261638652723362/1116484780452761631 https://discord.com/channels/348681414260293634/1201825233321861181

Snow Kit•11mo ago

NFS is notoriously bad for Minecraft. If you need to use remote storage, iSCSI is probably a better option. I know SMB would likely be better for Minecraft if you were on Windows, but I'm unsure about the current state of SMB on Linux
