Stackoverflow when adding a larger list of property values using traverser.property()

Hey, we encounter a stack overflow:
Exception during Transaction, rolling back ...
org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(org/apache/tinkerpop/gremlin/process/traversal/step/util/AbstractStep.java:150): Java::JavaLang::StackOverflowError
from org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(org/apache/tinkerpop/gremlin/process/traversal/step/util/ExpandableStepIterator.java:55)
from org.apache.tinkerpop.gremlin.process.traversal.step.sideEffect.SideEffectStep.processNextStart(org/apache/tinkerpop/gremlin/process/traversal/step/sideEffect/SideEffectStep.java:38)
from org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(org/apache/tinkerpop/gremlin/process/traversal/step/util/AbstractStep.java:150)
when we try to add a large list of values (~4000 entries) via a traverser's property() calls. It seems the property method is implemented using recursion, which causes the problem. See also https://github.com/JanusGraph/janusgraph/issues/3479. Is this a known issue?
GitHub
StackOverflowError adding alot of vertex properties inside one trav...
Version: 0.6.1 Storage Backend: berkleyje Mixed Index Backend: lucene Expected Behavior: No Stackoverflow Error Current Behavior: Stackoverflow Error Running this gremlin query inside JRuby: traver...
10 Replies
kelvinl2816, 10mo ago
At some point, depending on the JVM thread stack setting (-Xss), if you have a few thousand Gremlin steps chained together, you are highly likely to encounter a stack overflow. The recommended mitigations are to either 1/ break the query up into multiple queries, or 2/ increase the -Xss value if that is something you have control over. Note that increasing the stack size is not an ideal fix, as it raises it for all JVM threads and thus can increase memory usage significantly. You can check the stack size a JVM is currently using by running a command like this one:
$ java -XX:+PrintFlagsFinal -version | grep ThreadStackSize

intx CompilerThreadStackSize = 0
intx ThreadStackSize = 1024
intx VMThreadStackSize = 1024
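
The first mitigation (breaking the query up) can be sketched in plain Ruby as follows. This is only a sketch under assumptions: the helper name in_batches and the batch size of 500 are made up for illustration, and the traversal calls appear only as comments because they need JRuby and a live graph. The point is that each traversal carries a bounded number of property() steps.

```ruby
# Yields successive slices of `values`, each at most `batch_size` long, and
# returns the number of slices. The caller builds one traversal per slice.
# (Hypothetical helper, not part of TinkerPop or JanusGraph.)
def in_batches(values, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?

  count = 0
  values.each_slice(batch_size) do |batch|
    yield batch
    count += 1
  end
  count
end

# Sample data shaped like the list in the thread (~4000 entries).
large_list = (1..4000).map { |i| { "value" => "value-#{i}" } }

# Stand-in block: records how many property() steps each traversal would get.
steps_per_traversal = []
batch_count = in_batches(large_list, 500) do |batch|
  # Under JRuby, the per-batch traversal would go here, for example:
  #   t = dbtx.traversal.v(vertex_id)   # vertex_id is hypothetical: the vertex must already exist
  #   batch.each { |v| t.property(VertexProperty::Cardinality::list, 'search_value', v["value"]) }
  #   t.iterate()
  steps_per_traversal << batch.size
end

puts "#{batch_count} traversals, at most #{steps_per_traversal.max} steps each"
```

A suitable batch size depends on the -Xss setting and on how many stack frames each step consumes, so 500 is a starting point to tune rather than a recommendation.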

mrckzgl (OP), 10mo ago
Thanks for the reply. But the thing is, those steps are not chained in "userspace". These are all separate traversal.property() calls. TinkerPop is internally calling things recursively inside traversal.iterate(). This is the problem, and IMHO it would not occur if traversal.iterate() were not implemented recursively... By the way, we have also figured out the workaround of splitting things up into separate traversals (see the linked JanusGraph issue). For simple traversals this might be feasible, but the more complex they get, the more problematic this will be to implement... Increasing the stack size is of course not an option.
kelvinl2816, 10mo ago
So just to clarify, you are not doing something like .property().property().property().... 4000 times but doing something different?
mrckzgl (OP), 10mo ago
No, we are not. Please have a look at the linked issue; it is described in more detail there. But the essence is:
traverser = dbtx.traversal.add_v()

large_list.each { |v|
  traverser.property(VertexProperty::Cardinality::list, 'search_value', v["value"])
}

traverser.iterate()
The stack overflow is happening inside the iterate() call.
kelvinl2816, 10mo ago
That's the same thing. You are just building it up stepwise. The stack overflow can (and does) still happen if you build them all inline.
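
Why stepwise and inline construction end up the same can be illustrated with a toy model in plain Ruby. This is not TinkerPop code: the classes SourceStep and PropertyStep are hypothetical, and the model only assumes what the stack trace above suggests, namely that each step asks the step before it for its next element (AbstractStep.hasNext -> ExpandableStepIterator.next -> SideEffectStep.processNextStart -> AbstractStep.hasNext).

```ruby
# Toy model of a pull-based step chain (hypothetical classes, not TinkerPop).
# Each step holds a reference to the step before it and must ask that step
# for an element before doing its own work.

class SourceStep
  # The first step in the chain: reports how deeply nested the call was
  # when the pull finally reached it.
  def pull(depth)
    depth
  end
end

class PropertyStep
  def initialize(previous)
    @previous = previous
  end

  # Pulling from this step first pulls from the previous one, which adds
  # one nested call per chained step.
  def pull(depth)
    @previous.pull(depth + 1)
  end
end

# Builds the chain one step at a time, like calling property() in a loop.
def build_chain(step_count)
  step = SourceStep.new
  step_count.times { step = PropertyStep.new(step) }
  step
end

# Nesting depth reached when pulling a single element through the chain.
def nesting_depth(step_count)
  build_chain(step_count).pull(1)
end

puts nesting_depth(10)    # 11
puts nesting_depth(1000)  # 1001
```

In this model the nesting depth grows linearly with the number of steps regardless of how the chain was assembled, which is why a few thousand property() steps can exhaust a fixed-size thread stack once the traversal is iterated.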
mrckzgl (OP), 10mo ago
But the problem is that iterate is handling all those steps recursively instead of iteratively. It does not need to be implemented like this. (And btw, it is not the same thing. In what you describe, the stack overflow happens outside of TinkerPop's code base, even before the iterate call; in what we do, it happens inside.) I guess the problem would not be so apparent if the property call / step accepted a list as the value argument instead of just a single value (and of course handled that list iteratively), but I haven't found a way to add multiple values inside one property call / step.
triggan, 10mo ago
stephen mallette
Inserting a Vertex Using a Map
The typical method for setting properties on a graph element, such as a Vertex, is to use the property()-step. This step looks a bit like the put() method of a Java Map which takes a key and a value as its argument (though property() can optionally take additional arguments for Cardinality and meta-properties). It’s fitting that these APIs are s...
mrckzgl (OP), 10mo ago
Oh, nice one. No, I did not. I'll have to try this out, thanks a lot.
spmallette, 10mo ago
i wonder if that will work. it isn't designed to handle Cardinality.list well and that looks like what's desired here
mrckzgl (OP), 10mo ago
Currently testing this:
traversal = db.traversal.v(some_id).as('vertex')
traversal.side_effect(
  T.inject(large_list)
    .unfold().as('value')
    .select('vertex').property(VertexProperty::Cardinality::list, 'search_id', T.select('value'))
)
traversal.iterate()
At first glance, it does not produce an error on a large list of 7000 values. If I understood correctly, it should be equivalent to:
large_list.each { |v|
  db.traversal.v(some_id).property(VertexProperty::Cardinality::list, 'search_id', v).next()
}
but hopefully much more performant. It does not produce the stack overflow, as the steps following unfold are handled iteratively and not recursively as in the OP case. Something is still wrong, though: the values can't be found in the db. Maybe there is a problem with passing the JRuby version of large_list directly; maybe it would work after converting it to a native Java List, but I won't pursue this further. We already implemented the split-traversal workaround, so this is fine at the moment. Still, it is not very good that TinkerPop is generating stack overflows on its own ...