Getting TCP connection state
i'm currently in the following situation:
we have a TCP architecture that more or less anything can connect to (local network).
since the connecting devices don't always behave properly, we can't count on connections being closed gracefully. as a workaround we poll IPGlobalProperties.GetActiveTcpConnections to force-close connections and require a new handshake and data sync. the problem now is that as the number of TCP sockets in Windows grows (checked via netstat), that call gets ridiculously slow and CPU heavy. more or less all i need is to check that a connection that is/was connected is still established.
are there any alternatives i could use? would p-invoking GetTcpTable be faster? are there better APIs?
21 Replies
just looking through the runtime source code... seems like GetActiveTcpConnections already does the p-invoking
so i need another plan
Don't you have your own list of connections which have been opened to you?
Can't you just send some heartbeat data to those, if they haven't sent anything recently?
i do, but as i wrote, i sadly can't count on the connections getting closed properly, and they will linger as connected until the TCP timeout
So what, you're getting halfway through the close handshake?
Or does the device just disappear?
not always possible... some devices have a specific protocol where no heartbeat is possible
Don't they have a "get status" command or anything you can use?
the device loses its connection temporarily or whatever, packets go missing; when it's back in range the device may want to reconnect and thinks it has to do the handshake again, but the server hasn't timed out yet -> nothing works anymore
these are all kinds of devices where i don't really have any say in the implementation, since they come from all kinds of companies
so the safest bet is to just kill everything and start anew, but the TCP timeout is not playing nice
So you've still got your side of the connection open? So you can just close that, or try to send data (which will cause a timeout quite quickly). Also, there being an existing dead connection on your side shouldn't stop the device from being able to establish a new connection
well what do i send without breaking protocol... how do i know that the connection is gone without sending anything?
You don't. GetActiveTcpConnections doesn't know either
the problem with the existing dead connection is that, like in the case above, the connection was only gone briefly
windows can query the connection state at a deeper OSI level, without the apps knowing anything. then it knows whether it's established, listening, whatever
If the connection is established but one side just disappears, the connection is kept open (for a long time) in case the other side reappears. Windows will also count the connection as active
That's the root of the problem: according to the spec, it's still open and healthy
(until you try to send some data and that's never acknowledged)
Hence my confusion around why you're using GetActiveTcpConnections rather than just using the list that your app knows about
Also, an existing open connection should not stop the device from opening another connection. I still don't understand that bit
(I hit exactly this problem with my company's own embedded devices fwiw)
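To make the "use the list your app already knows about" suggestion concrete, here is a minimal sketch in Python (chosen only for brevity; the thread's code base is .NET). It assumes the server calls `touch` whenever any bytes arrive on a connection and periodically closes whatever `stale` returns. `ConnectionTracker`, its method names and the timeout value are all hypothetical, not something from the original code.

```python
import time


class ConnectionTracker:
    """Remembers when each known connection last showed any activity."""

    def __init__(self, idle_timeout_s, clock=time.monotonic):
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}         # conn_id -> timestamp of last activity

    def touch(self, conn_id):
        """Call whenever data is received from (or acknowledged by) the peer."""
        self._last_seen[conn_id] = self._clock()

    def forget(self, conn_id):
        """Call after the connection has been closed on our side."""
        self._last_seen.pop(conn_id, None)

    def stale(self):
        """Return the ids of connections idle for longer than the timeout."""
        now = self._clock()
        return [
            conn_id
            for conn_id, seen in self._last_seen.items()
            if now - seen > self._idle_timeout_s
        ]
```

This only works for devices that are expected to send something at least every `idle_timeout_s` seconds; for fully silent protocols it would close healthy connections too, so the timeout would have to be chosen per device type.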
there were some parameters you could set to control the health-check interval of the TCP socket. i don't currently have the source code on me so i can't tell exactly how it was done. this checks the TCP connection at a deeper OSI level (windows knows about it). if the device is gone the connection state goes from established to time-wait, usually after a couple of tries before it's completely disconnected. we set that param pretty low, i think it was in the 10s of ms; the default is somewhere in the multiple seconds iirc. the crux is though that in time-wait the TCP socket still counts as connected, while what we want is to know when it goes into time-wait so we can treat it appropriately
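The parameters described here sound like TCP keepalive settings, though that is a guess since the original code isn't shown. A rough Python sketch of enabling keepalive with short timers follows; the function name and the default values are made up for illustration. On Windows it uses the `SIO_KEEPALIVE_VALS` ioctl, elsewhere the `TCP_KEEPIDLE` / `TCP_KEEPINTVL` / `TCP_KEEPCNT` options where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=5, interval_s=1, probes=3):
    """Turn on TCP keepalive probes with short timers (illustrative values)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle time in ms, probe interval in ms).
        sock.ioctl(
            socket.SIO_KEEPALIVE_VALS,
            (1, idle_s * 1000, interval_s * 1000),
        )
        return

    # Other platforms: set each timer only if the constant is available.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

With keepalive on, a peer that has silently vanished should eventually surface as an error on the next read or write of that socket, so the app can react on its own connection instead of polling the system-wide connection table.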
What "deeper osi level"? The layer below Transport is Network, and that's not relevant here
Time_wait means the connection had been closed, but the os is waiting for a period before reusing the port, to make sure that the other side doesn't try to keep using that port, no?
i count the app as the application layer.. at some point as you say you get into the session or transport layer
pretty much yea
Tcp is the Transport layer, yes
But, that means the connection has been closed. Which doesn't match up at all with your description of looking for connections to close?
And, why do you care if a connection is in time_wait? That shouldn't matter to anyone
Something else is going on I think, which you either haven't explained, or don't understand.
the connection will stay in time-wait until the timeout even if the device just dies. from the server's point of view everything is fine. but then the device revives, tries to establish a new connection on the same port, but can't anymore.. this is the shortest i can explain it.
"or don't understand": yep, might be the case... to be honest i had just hoped for a quick solution, i'm currently writing in my free time and don't want to invest too much time into work related stuff..xD
But the device should be able to establish a new connection
Multiple clients can talk to the same http server at the same time, for example
Unless the device is reusing a source port or something for multiple connections? Which would be horribly broken but conceivable
I'm afraid I need to get some sleep: I'll drop back in tomorrow
aah i forgot to mention that if we don't notice early enough that the device is gone, we produce 100s to 1000s of dollars of garbage, but ¯\_(ツ)_/¯ not my money
yeah, good night and thank you for the discussion
"But the device should be able to establish a new connection": TCP-wise most likely, but via the protocol to actually talk with the device, no. just in case someone else reads this: i'm going to sleep now as well though
depends what "breaking protocol" means, you can send an invalid message if you receive back a response that says "that was an invalid message"
What exactly do you mean by that?
I think there's a lot of XY stuff going on here. What exactly is the problem you see if you don't do any of these workarounds?