Does the bulking optimization provided by LazyBarrierStrategy improve query performance?

I’m having a hard time understanding the usefulness of LazyBarrierStrategy, which supposedly adds a bulking optimization. In a nutshell, LazyBarrierStrategy simply adds barrier(2500) after a FlatMapStep. As I understand it, this means the preceding FlatMapStep is executed up to 2500 times before the traversal moves on to the next step. I struggle to see why inserting such a barrier step after a FlatMapStep is useful. Does anyone know a use case where LazyBarrierStrategy improves query performance or brings any other benefit?
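To make the buffering behaviour described above concrete, here is a minimal Java sketch. It is a simplified model, not TinkerPop's implementation: `BarrierBatchSketch`, `nextBatch` and `batchSizes` are hypothetical names, and the sketch deliberately leaves out the merging of equal traversers. It only shows that a barrier with a maximum size pulls at most that many items from the previous step before handing them on, instead of draining the whole upstream.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified model of a size-limited barrier; not TinkerPop code.
final class BarrierBatchSketch {

    // Pulls at most maxBarrierSize items from upstream and returns them as one batch.
    static <T> List<T> nextBatch(Iterator<T> upstream, int maxBarrierSize) {
        if (maxBarrierSize < 1) {
            throw new IllegalArgumentException("maxBarrierSize must be >= 1");
        }
        List<T> batch = new ArrayList<>();
        while (batch.size() < maxBarrierSize && upstream.hasNext()) {
            batch.add(upstream.next());
        }
        return batch;
    }

    // Returns the size of every batch the barrier would hand to the next step.
    static <T> List<Integer> batchSizes(Iterable<T> source, int maxBarrierSize) {
        Iterator<T> upstream = source.iterator();
        List<Integer> sizes = new ArrayList<>();
        while (upstream.hasNext()) {
            sizes.add(nextBatch(upstream, maxBarrierSize).size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        List<Integer> traversers = new ArrayList<>();
        for (int i = 0; i < 6000; i++) {
            traversers.add(i);
        }
        // 6000 upstream results with a limit of 2500 arrive in three batches.
        System.out.println(batchSizes(traversers, 2500)); // [2500, 2500, 1000]
    }
}
```

In this model the barrier never holds more than 2500 items at once, which is what distinguishes it from a barrier that collects the full result first.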
7 Replies
porunov (OP), 2y ago
Here is a simple test which shows that without LazyBarrierStrategy we touch fewer vertices during traversals. Moreover, I usually see better query performance without LazyBarrierStrategy and slightly worse performance with it. It would be great if anyone could shed a little light on this topic.
pieter, 2y ago
I don't know much about the lazy barrier; however, Sqlg removes all TinkerPop barrier steps. TinkerPop has no notion of managing memory, nor should it. Loading a full query result into memory risks crashing the JVM. It is better to let the underlying DB manage the results.
porunov (OP), 2y ago
Thanks for the insight. This strategy is enabled in JanusGraph by default, and it doesn't seem to provide any benefit (unless I'm missing a use case where LazyBarrierStrategy helps). Thus, we are thinking of disabling this strategy by default.
rngcntr, 2y ago
In relational databases, these types of aggregation operators are quite common as well. The principle is most commonly referred to as "vectorization" and helps utilize CPU caches efficiently. Vectorization is most helpful for CPU- or memory-intensive workloads. In JanusGraph, however, most of the query evaluation time is probably spent waiting for network traffic from the storage backend. I suppose that's why these barrier steps are not helpful in most queries.
Florian Hockmann
Not sure if you're already aware of it, but I think the original reasoning behind the LazyBarrierStrategy was described by Marko in this blog post under Section 3: Traversal Optimization via Bulking: https://www.datastax.com/blog/tales-tinkerpop
porunov (OP), 2y ago
Thanks for sharing this. I hadn't seen this blog post before, and it was interesting. So, basically, the main advantage of using a barrier step after a FlatMapStep is to reduce the number of operations needed for the same traversal. I.e., if out() returns 10 duplicate vertices, then the next step executes a single operation for them instead of executing the same operation 10 times. This is an interesting use case. Thanks for sharing! I'm not sure how often this use case comes up in practice, but I can see it being beneficial in some cases. I guess users should explicitly disable LazyBarrierStrategy when it doesn't help, or use barrier(1) instead.
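The "10 duplicate vertices" case can be sketched in plain Java. This is a simplified model of bulking under stated assumptions, not TinkerPop's implementation: `BulkingSketch` and its methods are hypothetical names, vertices are represented by string ids, and one "operation" stands for whatever the step after the barrier does once per incoming traverser.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of bulking; not TinkerPop code.
final class BulkingSketch {

    // Without a barrier: one downstream operation per incoming traverser.
    static int opsWithoutBulking(List<String> traversers) {
        return traversers.size();
    }

    // With a barrier: equal traversers are merged into one entry carrying a bulk count.
    static Map<String, Long> bulk(List<String> traversers) {
        Map<String, Long> bulks = new LinkedHashMap<>();
        for (String vertexId : traversers) {
            bulks.merge(vertexId, 1L, Long::sum);
        }
        return bulks;
    }

    // After merging: one downstream operation per distinct traverser.
    static int opsWithBulking(List<String> traversers) {
        return bulk(traversers).size();
    }

    public static void main(String[] args) {
        // out() returned the same vertex ten times.
        List<String> fromOut = Collections.nCopies(10, "v1");
        System.out.println(opsWithoutBulking(fromOut)); // 10
        System.out.println(opsWithBulking(fromOut));    // 1
        System.out.println(bulk(fromOut));              // {v1=10}
    }
}
```

When the incoming vertices are all distinct, both counts are equal in this model, so the merge step saves nothing and only adds buffering work. That is consistent with the observation earlier in the thread that many queries do not get faster with the strategy enabled.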
spmallette, 2y ago
a pity that the formatting of that post has fallen into such disrepair
