high cpu load from signalk on cerbo gx
i am having high cpu load for the signalk node process on my cerbo gx. I am trying to disable plugins one by one, but the cpu load stays high. it's on 2.5.0 - is there any way i can easily debug my cpu load per process inside signalk?
SignalK is a very heavy application. It requires a lot of memory and CPU.
I've seen swap usage increase after a few days of running. There may be a memory leak.
for some reason it has been worse since i upgraded the cerbo gx to firmware 3.20, which includes signalk 2.5. I manually upgraded signalk to the newest version 2.7.1 but the cpu load is still high.
What do you have connected to sk? How busy is it, how many deltas per sec?
As you may have done already, disabling all plugins can rule out one problem area
Another would be discarding some not-relevant pgns that have a high update rate
The bottom of Data Fiddler gives you access to the pgns that sources produce. We could add stats there to help in analysis…
i have nmea over ve.can enabled
ais has a high update rate, let me play with that. i might just disable nmea2000 as a whole to test
i am trying to filter out address 43, which is my ais, in the data connection settings. But it doesn't get filtered out. Am i forgetting anything?
Devices on the n2k bus are a more likely culprit than ais
What id do you see in the data item list, 43 or canname?
Ah your ais is n2k…nevertheless
Have you restarted? Can’t remember if that is required for changes to take effect
Ah thanks, it was likely the restart. The data is not showing in the databrowser anymore, so that works. But the CPU load is not decreasing, maybe because signalk still has to process the input to make the decision to even filter it
If I disable n2k as a whole, I go from 50% cpu load (which is almost 80% on one core if i look at core1 vs core2) to 6%. So it's definitely somewhere in n2k. Wondering what the best option is to tune that down. How to best 'disregard' pgns? should i use the filter? doesn't look like the load is going down though
delta / s is around 60 without ais and 100 with
OK, if i filter out ALL n2k addresses (until deltas are at 0/s), the cpu load still stays about the same - it only reduces slightly, so the hypothesis that filtering won't reduce cpu load seems to be true. If i disable n2k altogether it goes down drastically, which is good, but then critical n2k data like speed and wind are not available either, unfortunately
I can just shut down the AIS hardware, but I like the functionality of being a station for marinetraffic.com, hmmmm. Should i consider going with bigger capacity in terms of hardware? Like adding a raspberry pi next to the cerbo, where the raspberry pi can then be dedicated to signalk?
is something else on the cerbo suffering from sk using cpu?
something we can do is have you capture some of your data by turning data logging on for the n2k connection (and then turn logging off!) and share the log file with me; i can take a look at the actual input data
signalk itself is suffering from the cpu usage. it gets very very slow, like opening node-red takes 1 minute instead of 5 seconds
will turn on data logging!
ok, turned it on for a couple of minutes, then used: find /data/conf/signalk -type f -name "*.log"
which one should i choose? 😉 just the raw one? /data/conf/signalk/skserver-raw_2024-04-10T18.log
the raw log. will have a look over the weekend
i guess it's the AIS, lots of vessels around. The boat is in Amsterdam and I think it already picks up hundreds of boats there
you see a lot of entries with PGN 129038 (AIS class A position report) alone already
around 1200 AIS targets tracked per minute, see here: https://www.marinetraffic.com/ais/details/stations/35031 (it's turned off now but you can see historical data)
SignalK by itself works fine with about 800 AIS targets. I checked. However, that was via NMEA 0183.
FreeboardSK, however, isn't.
have you tried disabling node-RED?
Yes, although not much change. Even with all plugins disabled, the cpu load is still high. The plugin that reduces cpu load the most is the udp nmea 0183 one
That sounds really really strange. Are you sure you restarted for these results?
Udp sender cpu consumption should be practically zero
yeah it's very hard to debug per plugin; watching the cpu load i don't see it changing that much. the 'most' is quite relative here, maybe a few percent.
the problem could also be memory rather than cpu load. Although signalk's cpu load is at 60% and system is at 20%, there is still 20% idle left....
maybe there is a memory leak or something
the performance at the start of signalk is also better than after a couple of hours
FWIW (caveat .. i'm not using a cerbo for signalk) … but on RPi/signalk i had some very similar symptoms; in my case it was a combination of the influxdb plugin… writing/saving a lot of data, and grafana.
.. couldn't see from the thread history if you have the influx plugin operating?
.. all was good after a fresh boot, but over time it bogged down and became really sluggish.
(my solution was upgrading the rpi to an rpi4 with more memory, and whitelisting paths for influx)
no influx running
ok, same problem with can0 on the raspberry pi directly. especially when opening the data browser things go wild, as it is trying to load hundreds of boats besides just the 'self' context
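(side note, just a sketch for custom clients: the signalk streaming api lets a client opt out of other-vessel data with the subscribe query parameter and explicit subscription messages, so it never receives the AIS firehose. The host name and paths below are only examples)
```js
// minimal sketch: subscribe only to own-vessel data instead of the full AIS stream
// assumes the "ws" npm package; host/port and paths are placeholders
const WebSocket = require('ws')

// subscribe=self avoids the initial burst of deltas for every AIS target
const socket = new WebSocket('ws://openplotter.local:3000/signalk/v1/stream?subscribe=self')

socket.on('open', () => {
  // optionally add explicit subscriptions for just the paths the client needs
  socket.send(
    JSON.stringify({
      context: 'vessels.self',
      subscribe: [
        { path: 'navigation.speedOverGround', period: 1000 },
        { path: 'environment.wind.*', period: 1000 }
      ]
    })
  )
})

socket.on('message', (data) => console.log(data.toString()))
```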
To give a sense of the volume:
I think it might be because I upgraded my antenna and coax cable to good quality haha
Try GaladrielMap SignalK Edition, it has no problems with hundreds of AIS targets. And their display can be quickly turned off - just for a case like this.
@Kees i've had the sample data file running now for a while, no evidence of a memory leak or performance degradation, everything working. maybe create a larger file, for a longer period? and maybe share it privately instead of on the channel, where it will stay for posterity...
Thanks Teppo, I found the issue. The ais targets are just updating too much (and/or there are too many). I moved from the cerbo to a raspberry pi 4 and it can handle the nmea stream; however, as soon as i start consuming the data in a client (databrowser, wilhelmsk, kip) the cpu load gets abnormal and breaks the experience. This is because it starts loading too much ais stuff. I need to find a way to reduce the updates / data stream for that. It would be an interesting plugin to 1. cap at max 100 targets, ascending from closest (this is what axiom plotters do), or better 2. tune the update rate of some PGNs, so that e.g. SOG and COG are not updating every second or something.
When i filter out the ais source as a whole in my data connection, at least the clients don't crash, but then i don't have AIS info anymore - so i need to mitigate that with the above suggestions a bit, will figure that out 😉
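(rough sketch of idea 2, not a finished plugin - it assumes the server's registerDeltaInputHandler plugin hook; the plugin id and the 10 s interval are just examples: drop AIS deltas for a target if one for that context was already forwarded recently)
```js
// rough sketch of throttling AIS target updates, not a finished plugin
// assumes the standard SignalK plugin shape and app.registerDeltaInputHandler;
// the id and MIN_INTERVAL_MS are illustrative
module.exports = (app) => {
  const MIN_INTERVAL_MS = 10000 // forward at most one delta per AIS target per 10 s
  const lastForwarded = new Map() // context -> timestamp of last forwarded delta

  return {
    id: 'ais-throttle-sketch',
    name: 'AIS throttle (sketch)',
    schema: {},
    start: () => {
      app.registerDeltaInputHandler((delta, next) => {
        const context = delta.context
        // let own-vessel data (and deltas without a context) through untouched
        if (!context || context === app.selfContext || context === 'vessels.self') {
          next(delta)
          return
        }
        const now = Date.now()
        const last = lastForwarded.get(context) || 0
        if (now - last >= MIN_INTERVAL_MS) {
          lastForwarded.set(context, now)
          next(delta) // forward this target's delta
        }
        // otherwise drop the delta by not calling next()
      })
    },
    // note: a real plugin would also need a way to unregister the handler on stop
    stop: () => {}
  }
}
```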
Gotcha! Makes sense. A larger capture would help. I think this is a scenario that needs addressing, but i need to be able to replicate your problem to make progress
So opening databrowser and vesselpositions cause the problem?
Vesselpositions is not as bad for whatever reason. But the databrowser, as well as streams to KIP and WilhelmSK (once i start loading those clients on my devices, the signalk app/cpu goes wild).
btw, side question: on a raspberry pi 4b with openplotter OS, if signalk (node) has 100% cpu load, the total cpu load of the raspberry is only around 30%. Wondering why this is - is there a setting so node can use all the cpu power? probably an architecture question and i guess there is a reason for it; maybe node is just designed like that, quite a noob here
designed like that, single threaded - node runs the signalk code on one core, so on a 4-core pi a fully busy signalk process shows up as roughly 25% of total cpu
could you create a larger log file? like i said, would be much easier to figure out improvements if i could reproduce your problem
cool, just started logging, how big you want it to be? 😉
ok, i was able to reproduce the Data Browser reconnect loop after having left the large log file running! two issues:
- https://github.com/SignalK/signalk-server/issues/1718 - initial delta burst causes send buffer overflow and webapp reconnect loop: if there are enough cached deltas, a webapp connecting via ws gets its connection killed by the send buffer check mechanism, then reconnects, only to be killed again
- https://github.com/SignalK/signalk-server/issues/1717 - excessive memory consumption of tracking sent metadata: metadata sending is tracked per ws connection, with a marker stored for every context-path combination
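(to illustrate the first issue - this is only a sketch of the idea, not the actual ws.js code; maxSendBufferSize and the function name are made up: the ws library exposes bufferedAmount per connection, and a freshly connected client that gets the whole cached AIS state at once briefly exceeds the limit and was being terminated instead of being given time to drain)
```js
// illustration of the idea behind issue 1718, not the actual server code
function sendDelta(ws, delta, maxSendBufferSize = 512 * 1024) {
  // ws.bufferedAmount: bytes queued on the socket but not yet sent (ws library)
  if (ws.bufferedAmount > maxSendBufferSize) {
    // terminating here immediately is what caused the reconnect loop for clients
    // receiving the initial delta burst; warning and letting the buffer drain is gentler
    console.warn(`outgoing buffer ${ws.bufferedAmount} > max ${maxSendBufferSize}`)
    return false // caller can skip this delta instead of killing the socket
  }
  ws.send(JSON.stringify(delta))
  return true
}
```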
to be continued
Not sure if this is also related to this issue, but thought I'd mention.
https://github.com/SignalK/freeboard-sk/issues/114
@Kees here is a version of lib/interfaces/ws.js that supposedly fixes the ws connections getting severed and going into a reconnect loop (databrowser), as well as reducing memory consumption for ws clients
if you can locate the version that came with your SK install and overwrite it with this (take a backup copy first), you should be able to test drive it
once the server has gathered enough data and you open a ws client like databrowser, you should see a warning message that includes outgoing buffer > max, but not immediately the dreaded terminating connection message
maybe @Scott Bender can give pointers on where to find the installed file?
Gist: websocket code with fixes for too quick send buffer overflow and memory consumption for multiple ws clients - ws.js
on VenusOS it's /usr/lib/node_modules/signalk-server/lib/interfaces/ws.js
would require root access, and you need to run the script to make the root filesystem writable
probably same location on the pi
or could be /usr/local/lib/...
Thanks a lot. Sorry I am away for work this week. I will try later this week!
Works better now, this is tested with the WilhelmSK client. Still spikes every minute, but they are short, so easy to handle without things breaking. The data browser is still challenging though: when opening it, the spikes are longer, like 20 seconds, which then makes the other connected client (like WilhelmSK) break.
Thanks! So there’s more work there to make it play nicer