Signal K8mo ago
Kees

High Signal K CPU load on Cerbo GX

I am seeing a high CPU load for the Signal K node process on my Cerbo GX. I have been disabling plugins one by one, but the CPU load stays high. It's on 2.5.0. Is there an easy way to debug CPU load per process inside Signal K?
38 Replies
Владимир Калачихин
Signal K is a very heavy application. It requires a lot of memory and CPU. I've seen swap usage increase after a few days of running. There may be a memory leak.
Kees
KeesOP8mo ago
For some reason it has been worse since I upgraded the Cerbo GX to firmware 3.20, which includes Signal K 2.5. I manually upgraded Signal K to the newest version, 2.7.1, but the CPU load is still high.
Teppo Kurki
Teppo Kurki8mo ago
What do you have connected to SK? How busy is it, how many deltas per second? As you may have done, disabling all plugins may rule out one problem area. Another would be discarding some non-relevant PGNs that have a high update rate. The bottom of Data Fiddler gives you access to the PGNs that sources produce. We could add stats there to help in analysis…
Kees
KeesOP8mo ago
I have NMEA on VE.Can enabled. AIS has a high update rate, let me play with that. I might just disable NMEA 2000 as a whole to test.
Kees
KeesOP8mo ago
I am trying to filter out address 43, which is my AIS, in the data connection settings, but it doesn't get filtered out. Am I forgetting anything?
[two attached images, no description]
Teppo Kurki
Teppo Kurki8mo ago
Devices on the N2K bus are a more likely culprit than AIS. What ID do you see in the data item list, 43 or canname? Ah, your AIS is N2K… nevertheless. Have you restarted? I can't remember if that is required for changes to take effect.
Kees
KeesOP8mo ago
Ah thanks, it was likely the restart. The data is not showing in the Data Browser anymore, so that works. But the CPU load is not decreasing, maybe because Signal K still has to process the input to decide whether to filter it.
If I disable N2K as a whole, I go from 50% CPU load (which is almost 80% on one core if I look at core 1 vs core 2) to 6%. So it's definitely somewhere in N2K. Wondering what the best option is to tune that down. How best to 'disregard' PGNs? Should I use the filter? It doesn't look like the load is going down, though.
Delta/s is around 60 without AIS and 100 with.
OK, if I filter out ALL N2K addresses (until delta is 0/s), the CPU load stays about the same and only reduces slightly, so the hypothesis that filtering won't reduce CPU load seems to be true. If I disable N2K, though, it goes down drastically, which is good, but then critical N2K data like speed and wind are unfortunately not available either.
I can just shut down the AIS hardware, but I like being a station for marinetraffic.com, hmmmm.
Should I consider going with more capable hardware? Like adding a Raspberry Pi next to the Cerbo, so that the Raspberry Pi can be dedicated to Signal K?
Teppo Kurki
Teppo Kurki8mo ago
Is something else on the Cerbo suffering from SK using CPU? Something we can do: capture some of your data by turning data logging on for the N2K connection (and then turn logging off!) and share the log file with me. I can take a look at the actual input data.
Kees
KeesOP8mo ago
Signal K itself is suffering from the CPU usage. It gets very, very slow, like opening Node-RED taking 1 minute instead of 5 seconds. Will turn on data logging!
OK, turned it on for a couple of minutes. Now I used:
find /data/conf/signalk -type f -name "*.log"
Which one should I choose? 😉 Just the raw one?
/data/conf/signalk/skserver-raw_2024-04-10T18.log
Teppo Kurki
Teppo Kurki8mo ago
the raw log. will have a look over the weekend
Kees
KeesOP8mo ago
I guess it's the AIS, lots of vessels around. The boat is in Amsterdam and I think it already picks up 100s of boats there. You see a lot of entries with PGN 129038 alone. Already around 1200 AIS targets tracked per minute, see here: https://www.marinetraffic.com/ais/details/stations/35031 (it's turned off now but you can see historical data)
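One rough way to check which PGNs and source addresses dominate a captured raw log is simply to count them. The following is a minimal Python sketch, not something posted in the thread. It assumes each raw log line looks like `<epoch ms>;A;<time>,<prio>,<pgn>,<src>,<dst>,<len>,<data...>`; that layout, the function name `count_pgns`, and the sample lines are all assumptions for illustration, so adjust the parsing if your log looks different.

```python
from collections import Counter
from typing import Iterable


def count_pgns(lines: Iterable[str]) -> Counter:
    """Count (pgn, source address) pairs in raw N2K log lines.

    Assumed line layout (an assumption, not verified for every server version):
        <epoch ms>;A;<iso time>,<prio>,<pgn>,<src>,<dst>,<len>,<data bytes...>
    Lines that do not match this layout are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        parts = line.strip().split(";", 2)
        if len(parts) != 3:
            continue
        fields = parts[2].split(",")
        if len(fields) < 4 or not fields[2].isdigit() or not fields[3].isdigit():
            continue
        counts[(int(fields[2]), int(fields[3]))] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = [
    "1712772000000;A;2024-04-10T18:00:00.000Z,4,129038,43,255,8,00,01,02,03,04,05,06,07",
    "1712772000100;A;2024-04-10T18:00:00.100Z,4,129038,43,255,8,00,01,02,03,04,05,06,07",
    "1712772000200;A;2024-04-10T18:00:00.200Z,2,130306,12,255,8,00,01,02,03,04,05,06,07",
]
top = count_pgns(sample).most_common()
```

To run it on a real capture, pass an open file instead of `sample`, for example `count_pgns(open("/data/conf/signalk/skserver-raw_2024-04-10T18.log"))`, and look at the first few entries of `most_common()`.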
Владимир Калачихин
Signal K by itself works fine with about 800 AIS targets. I checked. However, that was through NMEA 0183. Freeboard-SK does not.
Teppo Kurki
Teppo Kurki8mo ago
have you tried disabling node-RED?
Kees
KeesOP8mo ago
Yes, although not much change. Even with all plugins disabled, the CPU load is still high. The plugin whose disabling reduces the CPU load the most is the UDP NMEA 0183 one.
Teppo Kurki
Teppo Kurki8mo ago
That sounds really, really strange. Are you sure you had restarted for these results? The UDP sender's CPU consumption should be practically zero.
Kees
KeesOP8mo ago
Yeah, it's very hard to debug per plugin, and watching the CPU load I don't see it changing that much. The 'most' is quite relative here, maybe a few percent. The problem could also be memory instead of CPU load. Although the CPU load is at 60% for Signal K, the system is at 20%, which still leaves 20% idle....
Kees
KeesOP8mo ago
[attached image, no description]
Kees
KeesOP8mo ago
Maybe there is a memory leak or something. The performance right after Signal K starts is also better than after a couple of hours.
Greg Young
Greg Young8mo ago
FWIW (caveat: I'm not using a Cerbo for Signal K), on RPi/Signal K I had some very similar symptoms. In my case it was a combination of the InfluxDB plugin writing/saving a lot of data and Grafana. I couldn't see from the thread history whether you have the Influx plugin running? All was good after a fresh boot, but over time it bogged down and became really sluggish. (My solution was upgrading the RPi to an RPi 4 with more memory, and whitelisting paths for Influx.)
Kees
KeesOP8mo ago
No Influx running. OK, same problem with can0 on a Raspberry Pi directly. Especially when opening the Data Browser, things go wild, as it is trying to load 100s of boats besides only context 'self'.
Kees
KeesOP8mo ago
To give a sense of the volume:
[attached image, no description]
Kees
KeesOP8mo ago
I think it might be because I upgraded my antenna and coax cable to good quality haha
Владимир Калачихин
Try GaladrielMap SignalK Edition, it has no problems with hundreds of AIS targets. And their display can be quickly turned off, just in case.
Teppo Kurki
Teppo Kurki8mo ago
@Kees I've had the sample data file running for a while now: no evidence of a memory leak or performance degradation, everything is working. Maybe create a larger file, for a longer period? And maybe share it privately instead of on the channel, where it will stay for posterity...
Kees
KeesOP8mo ago
Thanks Teppo, I found the issue. The AIS targets are just updating too much (and/or there are too many). I moved from the Cerbo to a Raspberry Pi 4 and it can handle the NMEA stream. However, as soon as I start consuming the data in a client (Data Browser, WilhelmSK, KIP), the CPU load gets abnormal and breaks the experience. This is because it starts loading too much AIS stuff. I need to find a way to reduce the updates / data stream for that.
An interesting plugin would be one that 1. keeps a maximum of 100 targets, ascending from closest (this is what Axiom plotters do), or better 2. tunes the update rate of some PGNs, so that e.g. SOG and COG are not updated every second.
When I filter out the AIS source as a whole in my data connection, at least the clients don't crash, but then I don't have AIS info anymore, so I need to mitigate that a bit with the above suggestions. Will figure that out 😉
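The two plugin ideas above (keep only the closest N targets, and limit how often a given value is forwarded) can be sketched in a few lines. This is a hypothetical Python illustration of the logic only, not an existing Signal K plugin or API; the names `nearest_targets` and `Throttle`, the target ids, and the data shapes are assumptions made for the example.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: Position, b: Position) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))


def nearest_targets(own: Position, targets: Dict[str, Position],
                    limit: int = 100) -> List[str]:
    """Return the ids of the `limit` targets closest to `own`, nearest first."""
    ranked = sorted(targets, key=lambda tid: haversine_m(own, targets[tid]))
    return ranked[:limit]


class Throttle:
    """Let an update for a given key through at most once per `min_interval` seconds."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last: Dict[str, float] = {}

    def allow(self, key: str, now: float) -> bool:
        last = self._last.get(key)
        if last is not None and now - last < self.min_interval:
            return False  # too soon after the previous forwarded update
        self._last[key] = now
        return True
```

A filter built on this would forward a target's update only if its id is in `nearest_targets(...)` and `Throttle.allow(...)` returns true for a key such as the target id combined with the path, e.g. one SOG update per target every few seconds instead of every second.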
Teppo Kurki
Teppo Kurki8mo ago
Gotcha! Makes sense. A larger capture would help. I think this is a scenario that needs addressing, but I need to be able to replicate your problem to make progress. So opening Data Browser and Vesselpositions causes the problem?
Kees
KeesOP8mo ago
Vesselpositions is not so bad, for whatever reason. But Data Browser is, as well as the streams to KIP and WilhelmSK (once I start loading those clients on my devices, the Signal K app/CPU goes wild).
BTW, side question: on a Raspberry Pi 4B with OpenPlotter OS, if Signal K (node) is at 100% CPU load, the total CPU load of the Raspberry Pi is around 30%. Wondering why this is. Is there a setting that lets node use all the CPU power? Probably an architecture question, and I guess there is a reason for it. Maybe node is even designed like that, I'm quite a noob here.
Teppo Kurki
Teppo Kurki8mo ago
Designed like that, single threaded. Could you create a larger log file? Like I said, it would be much easier to figure out improvements if I could reproduce your problem.
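The numbers in the side question fit the single-threaded explanation. A small Python sketch of the arithmetic, assuming the Raspberry Pi 4B's four cores and that the node process can saturate at most one of them; the function name is made up for the example.

```python
def system_share(process_core_percent: float, cores: int) -> float:
    """Convert a per-core CPU percentage into a share of the whole machine."""
    return process_core_percent / cores


# One fully busy core on a four-core Pi 4B is a quarter of the machine.
pi4_share = system_share(100.0, 4)  # 25.0
```

So a node process pinned at 100% of one core shows up as roughly 25% of total CPU, and the rest of the observed ~30% would come from other processes.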
Kees
KeesOP8mo ago
cool, just started logging, how big you want it to be? 😉
Teppo Kurki
Teppo Kurki8mo ago
OK, I was able to reproduce the Data Browser reconnect loop after having left the large log file running! Two issues:
- https://github.com/SignalK/signalk-server/issues/1718
- https://github.com/SignalK/signalk-server/issues/1717
GitHub
Initial delta burst causes send buffer overflow and webapp reconnec...
If there are enough cached deltas a webapp connecting via ws will get its connection killed by the send buffer check mechanism. The webapp will then reconnect, only to be killed again. We are blast...
GitHub
Excessive memory consumption of tracking sent metadata · Issue #171...
We are currently tracking sending metadata per ws connection: if we have not sent metadata for a context-path combination we will send metadata and add a marker that is formed from concatenating co...
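The reconnect loop described in the first issue can be illustrated with a toy model. This Python sketch is not the server's code: the buffer limit, drain rate, message sizes, and the function name `connect_and_flush` are invented for illustration. It only shows why sending all cached deltas at once can trip a send-buffer check on every reconnect, and how pacing the initial burst, one possible mitigation chosen here for illustration, avoids that in the model.

```python
from typing import List


def connect_and_flush(cached_sizes: List[int], max_buffer: int,
                      drain_per_tick: int, send_per_tick: int) -> bool:
    """Simulate sending cached deltas to a newly connected client.

    Each tick the server queues up to `send_per_tick` messages, the buffer
    check runs, and then the socket drains `drain_per_tick` bytes.
    Returns False if the buffered bytes ever exceed `max_buffer`
    (the connection would be killed), True if everything was delivered.
    """
    if drain_per_tick <= 0 or send_per_tick <= 0:
        raise ValueError("drain_per_tick and send_per_tick must be positive")
    buffered = 0
    pending = list(cached_sizes)
    while pending or buffered > 0:
        batch, pending = pending[:send_per_tick], pending[send_per_tick:]
        buffered += sum(batch)
        if buffered > max_buffer:
            return False  # send buffer check fires, client gets disconnected
        buffered = max(0, buffered - drain_per_tick)
    return True


# Hypothetical numbers: 5000 cached deltas of 200 bytes each (1 MB in total).
cache = [200] * 5000
blasted = connect_and_flush(cache, max_buffer=512_000,
                            drain_per_tick=100_000, send_per_tick=len(cache))
paced = connect_and_flush(cache, max_buffer=512_000,
                          drain_per_tick=100_000, send_per_tick=500)
```

In this model `blasted` is False and `paced` is True. Because the cache is the same on every attempt, a client that reconnects after being killed hits the same overflow again, which matches the loop described in the issue.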
Teppo Kurki
Teppo Kurki8mo ago
to be continued
AdrianP
AdrianP8mo ago
Not sure if this is also related to this issue, but thought I'd mention. https://github.com/SignalK/freeboard-sk/issues/114
Teppo Kurki
Teppo Kurki8mo ago
@Kees here is a version of lib/interfaces/ws.js that supposedly fixes the WS connections getting severed and going into a reconnect loop (Data Browser), and also reduces memory consumption for WS clients.
If you can locate the version that came with your SK install and overwrite it with this one (first take a backup copy), you should be able to test drive this.
Once the server has gathered enough data and you open a WS client, like Data Browser, you should see a warning message that includes outgoing buffer > max message, but not immediately the dreaded terminating connection.
Maybe @Scott Bender can give pointers on where to find the installed file?
Gist
websocket code with fixes for too quick send buffer overflow and memory consumption for multiple ws clients - ws.js
Scott Bender
Scott Bender8mo ago
On Venus OS it's /usr/lib/node_modules/signalk-server/lib/interfaces/ws.js. It would require root access, and you need to run the script that makes the root filesystem writable. Probably the same location on the Pi, or it could be /usr/local/lib/...
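The "take a backup copy, then overwrite" step can be scripted so the original file is never lost. A minimal Python sketch under the assumption that Python 3 is available on the device and that you already have write access to the install directory; `replace_with_backup` and the `.bak` suffix are choices made for this example, not part of Signal K.

```python
import shutil
from pathlib import Path


def replace_with_backup(installed: Path, patched: Path) -> Path:
    """Copy `installed` to `<name>.bak` next to it, then overwrite it with `patched`.

    Returns the backup path. Refuses to run if a backup already exists, so a
    second run cannot overwrite the original copy with the patched file.
    """
    backup = installed.with_name(installed.name + ".bak")
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.copy2(installed, backup)   # keep the original
    shutil.copy2(patched, installed)  # put the patched file in place
    return backup
```

On Venus OS the call would look like `replace_with_backup(Path("/usr/lib/node_modules/signalk-server/lib/interfaces/ws.js"), Path("ws.js"))`, followed by a server restart. To roll back, copy the `.bak` file over `ws.js` again.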
Kees
KeesOP8mo ago
Thanks a lot. Sorry I am away for work this week. I will try later this week!
Kees
KeesOP8mo ago
Works better now; this was tested with the WilhelmSK client. There are still spikes every minute, but they are short, so easy to handle without things breaking. Data Browser is still challenging, though: when opening it, the spikes are longer, like 20 seconds, which then makes the other client (like a connected WilhelmSK) break.
[attached image, no description]
Teppo Kurki
Teppo Kurki8mo ago
Thanks! So there’s more work there to make it play nicer