Is NoHostAvailableException losing/not including relevant error context (3.7.0 and above)?
Hey folks,
I've recently noticed that G.V() is "not as good" as it used to be at reporting some specific connectivity issues, and upon further investigation I managed to attribute this to a change in behaviour, with NoHostAvailableException seemingly losing some relevant context.
In my case, I'm testing the scenario of trying to connect to Azure Cosmos DB from an IP that is not allowed through their firewall, which usually results in a bespoke error message for Azure going as follows:
Attempting to submit a query to a Cosmos DB endpoint protected by a firewall results in the expected NoHostAvailableException, but the error's detailedMessage only states:
The rootCause for the thrown exception is itself a NoHostAvailableException and does not contain any reference to the actual error message returned when attempting the connection. That message is instead output separately by the driver before the NHA exception is thrown (I'm assuming this is what the "check the error log to find the actual reason" message refers to).
Is there a way to get this downstream failure from the NHA exception?
I've seen cases where it seemingly works (e.g. SecurityException from invalid credentials, serialization issue, etc) and others where the root cause seems to just get lost.
For reference I've attached what's output by the driver prior to throwing the exception (lengthy stack trace alert).
Solution
I don't think we've changed any behavior for NoHostAvailableException since 3.5.5: https://tinkerpop.apache.org/docs/current/upgrade/#_gremlin_driver_host_availability Since that time there is really only one way that an NHA is thrown: if the connection pool cannot initialize a connection to any host. We are selective in what exceptions are raised within the NHA because there are cases where the exception can be more confusing than helpful. In this case, we weren't including handshake exceptions, which was the cause in your log output. That seems like a sensible thing to include, so I quickly added that: https://github.com/apache/tinkerpop/commit/a37e93f3c3b3c1b404c34ecf77ac05a0d959e046 Thanks for bringing that up. cc/ @Kennh
Improved error messaging for NHA · apache/tinkerpop@a37e93f
Added another exception type to those than can be raised as a cause of NHA CTR
I think that's exactly what I need here, awesome! I'll close this out and wait for the next tinkerpop release
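For anyone landing here later: if the driver attaches the handshake failure as the cause of the NHA (which is what the commit above appears to aim for), the downstream error should be reachable by walking the standard Throwable cause chain. Below is a minimal sketch of that idea using only JDK classes. The CauseChain class and its method names are hypothetical helpers, and the exceptions in main are plain stand-ins, not the real gremlin-driver types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of gremlin-driver.
public class CauseChain {

    // Collects the throwable and each nested cause, outermost first.
    // An identity set guards against cyclic cause chains.
    static List<Throwable> causeChain(Throwable t) {
        List<Throwable> chain = new ArrayList<>();
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Throwable cur = t; cur != null && seen.add(cur); cur = cur.getCause()) {
            chain.add(cur);
        }
        return chain;
    }

    // Returns the innermost cause, or the throwable itself if it has no cause.
    // Expects a non-null argument.
    static Throwable rootCause(Throwable t) {
        List<Throwable> chain = causeChain(t);
        return chain.get(chain.size() - 1);
    }

    public static void main(String[] args) {
        // Stand-ins for an NHA wrapping a handshake failure (assumed shape).
        Throwable handshake = new IllegalStateException("handshake rejected by server");
        Throwable nha = new RuntimeException("no host available", handshake);

        for (Throwable t : causeChain(nha)) {
            System.out.println(t.getClass().getSimpleName() + ": " + t.getMessage());
        }
    }
}
```

In application code the same helper would be called from the catch block around the query submission. When the NHA carries no cause, as in the firewall case described above, rootCause simply returns the NHA itself, so the driver's log output remains the only place to find the real reason.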