Apache TinkerPop•8mo ago

[Bug?] gremlinpython hangs or does not recover connections after a connection error has occurred

Hello, TinkerPop team.

I am struggling to avoid problems after a connection error occurs, and I suspect they might be caused by a bug in gremlinpython. Are these bugs, or am I using it incorrectly? Please let me know.

Best regards,

Environment:
- WSL2 on Windows 11 (Ubuntu)
- Python 3.12.4
- gremlinpython 3.7.2
- TinkerPop server: JanusGraph 1.0.0

JanusGraph is launched with docker compose:

services:
  janusgraph:
    image: janusgraph/janusgraph:1.0.0
    ports:
      - '8182:8182'

Case 1: Script hangs once all pooled connections are consumed?

When I specify a wrong URL to simulate a network error, gremlinpython appears to consume connections without returning them to the pool. As a result, the script below hangs after all pooled connections are consumed.

Python script: see case1.py
Output: see case1-output.txt

The result changes when I pass a different value for the pool_size argument. My expectation is that the error message is shown 9 times and then the script ends.

Case 2: Manual transaction is never rolled back (closed)

As in case 1, the manual transaction never ends, so I cannot recover from the error.

Python script: see case2.py
Output: see case2-output.txt

My expectation is that the script ends after 9 attempts, all of which fail.

Case 3: Once a connection error has occurred, pooled connections stay broken

After I temporarily stopped the TinkerPop server (JanusGraph), some pooled connections are broken and are never recovered.

Python script: see case3.py
Output: see case3-output.txt

My expectation is that connections are refreshed if they turn out to be unavailable when they are taken from the pool.

case1.py

case1-output.txt

case2.py

case2-output.txt

case3.py

case3-output.txt

Solution:

What you're noticing here kind of boils down to how connection pooling works in gremlin-python. The pool is really just a queue that the connection adds itself back to after either an error or a success, but it's missing some handling for the scenarios you pointed out. One of the main issues is that the pool itself can't determine whether a connection is healthy or whether it is unhealthy and should be removed from the pool. I think you should go ahead and make a Jira for this. If it's easier for you, I can help you make one that references this post. I think the only workaround right now is to occasionally open a new Client to create a new pool of connections when you notice some of those exceptions.
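The workaround of opening a new Client when exceptions show up could be sketched roughly as follows. This is only an illustration, not part of gremlin-python: `ResettableClient`, `factory`, `retry_on` and `max_attempts` are hypothetical names, and which exceptions should trigger a rebuild is an assumption you would need to adjust for your own setup. It also does not address hangs that happen inside the driver itself.

```python
from typing import Any, Callable, Tuple, Type


class ResettableClient:
    """Hypothetical wrapper: rebuilds the underlying client after a failure."""

    def __init__(
        self,
        factory: Callable[[], Any],
        retry_on: Tuple[Type[BaseException], ...] = (Exception,),
        max_attempts: int = 3,
    ) -> None:
        self._factory = factory            # builds a brand-new client (and pool)
        self._retry_on = retry_on          # assumption: errors that mean "pool is bad"
        self._max_attempts = max_attempts
        self._client = factory()

    def run(self, operation: Callable[[Any], Any]) -> Any:
        """Call operation(client); on a listed error, swap in a fresh client and retry."""
        for attempt in range(1, self._max_attempts + 1):
            try:
                return operation(self._client)
            except self._retry_on:
                # Discard the possibly broken pool before retrying or giving up.
                self._replace_client()
                if attempt == self._max_attempts:
                    raise  # out of attempts: let the caller see the last error

    def _replace_client(self) -> None:
        try:
            self._client.close()
        except Exception:
            pass  # closing a broken client may itself fail; ignored in this sketch
        self._client = self._factory()


# Hypothetical usage with gremlinpython (URL and traversal source are assumptions):
# from gremlin_python.driver.client import Client
# wrapper = ResettableClient(lambda: Client("ws://localhost:8182/gremlin", "g"))
# wrapper.run(lambda c: c.submit("g.V().limit(1)").all().result())
```

Passing a factory instead of a ready-made client keeps the wrapper independent of how the client is configured, so the same pool_size and URL are reused every time the pool is rebuilt.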

6 Replies

spmallette•7mo ago

sorry it's taking a bit to reply here, but i think these cases might need a bit of investigation to get some answers. cc/ @Yang Xia

Yang Xia•7mo ago

Yes, we'll take a look this week, thanks for the details!

Solution

Kennh•7mo ago

e8lOP•7mo ago

Thank you for replying

I think you should go ahead and make a Jira for this. If it's easier for you, I can help you make one that references this post.

I am sorry, but I do not know what the 'Jira' you mentioned is or how to create one. So I would appreciate it if you could help me make it, or make it on my behalf.

I think the only workaround right now is to occasionally open a new Client to create a new pool of connections when you notice some of those exceptions.

Yes, I have already applied such a workaround. The main reason I posted this thread was to confirm what the expected behavior is and whether my usage is wrong.

Kennh•7mo ago

Jira ticket is: https://issues.apache.org/jira/browse/TINKERPOP-3114

e8lOP•7mo ago

Thank you so much for creating the ticket, @Kennh. I appreciate your cooperation; it is very helpful 🙇
