`next(n)` with Gremlin JavaScript

I'm trying to do some basic pagination. next(n) seems perfect, but according to the documentation it doesn't appear to be available for JavaScript. Is there a reason for this limitation?
Florian Hockmann
Interesting, I didn't know that we have that in the docs, and I especially wonder why it says that about Gremlin.Net, since we have support for Next(n) there: https://github.com/apache/tinkerpop/blob/82fe33939aa7058bd90d0dcc178817cc5720df17/gremlin-dotnet/src/Gremlin.Net/Process/Traversal/DefaultTraversal.cs#L220 For JavaScript, it really seems to be missing, but I can't say why.
triggan (8mo ago)
gremlin-javascript does support streaming, though. So you could likely get the same behavior using this: https://tinkerpop.apache.org/docs/current/reference/#_processing_results_as_they_are_returned_from_the_gremlin_server
Painguin (OP, 8mo ago)
Huh, I thought next(n) was going to do something special.
    public IEnumerable<TEnd?> Next(int amount)
    {
        for (var i = 0; i < amount; i++)
            yield return Next();
    }
This looks like it's just calling Next multiple times. Is this only possible by submitting a Gremlin script? As in, you can't do the same via a traversal like g.etc?
Florian Hockmann
Yep, the reason for this is that Gremlin.Net sends your complete traversal to the server for evaluation and gets all results back. So when you call Next(), you simply get the first result, but all other results are already available locally anyway. That's why you can simply call Next() again afterwards to get more results.
If you want the server to iterate your traversal only until it has produced X results, then you need to use something like Limit(x), which is sent to the server as part of the traversal.
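To illustrate the point for JavaScript, here is a minimal sketch of an equivalent helper. `nextN` is a hypothetical name, not part of gremlin-javascript; it only assumes that the traversal's next() returns a promise of a `{ value, done }` pair. Like the C# version, it does not reduce the work done by the server, so pair it with limit() if you want that.

```javascript
// Hypothetical helper (not a gremlin-javascript API): collects up to `n`
// results by calling next() repeatedly, mirroring Gremlin.Net's Next(int).
// Assumes traversal.next() resolves to an object shaped like { value, done }.
async function nextN(traversal, n) {
  const results = [];
  for (let i = 0; i < n; i++) {
    const { value, done } = await traversal.next();
    if (done) break; // traversal exhausted before n results were produced
    results.push(value);
  }
  return results;
}

// Possible usage inside an async function, with limit() bounding the
// server-side work (the 'Artist' label is only an example):
//   const firstTen = await nextN(g.V().hasLabel('Artist').limit(10), 10);
```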
triggan (8mo ago)
I also see range(x,y) used heavily for this sort of pattern. Just be careful, because many databases will not ensure that range() calls are idempotent if you're making changes to the database while performing subsequent queries with range().
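A rough sketch of that range() pattern in JavaScript follows. `rangePages` and `buildTraversal` are hypothetical names; the sketch assumes `buildTraversal` returns a fresh traversal on every call and that the traversal has a stable ordering (for example via order().by(...)), since otherwise pages may overlap or skip results.

```javascript
// Hypothetical helper: yields successive pages by re-running the traversal
// with range(low, high), where low is inclusive and high is exclusive.
// `buildTraversal` must return a new traversal each time, for example:
//   () => g.V().hasLabel('Artist').order().by('name')
async function* rangePages(buildTraversal, pageSize) {
  for (let offset = 0; ; offset += pageSize) {
    const page = await buildTraversal()
      .range(offset, offset + pageSize)
      .toArray();
    if (page.length === 0) return; // nothing left
    yield page;
    if (page.length < pageSize) return; // a short page is the last page
  }
}

// Possible usage inside an async function:
//   for await (const page of rangePages(build, 100)) { console.log(page); }
```

Each page is a separate query, so writes that happen between pages can still shift the results, as noted above.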
Solution
triggan (8mo ago)
AFAIK, that is only possible via scripts.
triggan (8mo ago)
But you can do something like this:
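    const gremlin = require('gremlin');
    const { Client } = gremlin.driver;
    const { Graph } = gremlin.structure;
    const __ = gremlin.process.statics;

    // conninfo (url and headers) is assumed to be defined elsewhere.
    const client = new Client(conninfo['url'], { traversalSource: 'g', headers: conninfo['headers'] });
    const graph = new Graph();
    const g = graph.traversal();
    const query = g.inject().V().has('Artist', 'name', 'James Earl Jones').
      repeat(
        __.in_('actor', 'actress').
          out('actor', 'actress').
          simplePath()).
      until(
        __.has('Artist', 'name', 'Kevin Bacon')
      ).
      path().by('name').by('title');

    let counter = 0;
    // batchSize: 1 asks the server to send results back one at a time.
    // Note: await must run inside an async function or an ES module.
    const result = await client.stream(query.bytecode, {}, { batchSize: 1 });
    try {
      for await (const res of result) {
        console.log('data', res.toArray());
        counter = counter + 1;
        console.log(counter);
      }
    } catch (err) {
      console.log(err);
    }

    client.close();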
Florian Hockmann
Also note that not all graph databases will be able to execute range(x, y) efficiently. If I'm not mistaken, JanusGraph, for example, will simply execute your traversal as if you had used limit(y). So it will still fetch the first x values from your backend, which of course defeats the purpose of using the range() step. I'm not sure though, so it's best to use profile() to try it out.
Painguin (OP, 8mo ago)
I'm not even sure I need pagination in my case tbh. My concern is that pulling in a large amount of data in one go might break something or trigger odd behavior. Should I even bother if there are, say, fewer than 10k results?
triggan (8mo ago)
I guess it depends on how computationally complex it is to find each result. If this is just a "find the first 10k vertices with x property", then there may not be a need. But if you're doing something like the code example above, where you want to find multiple paths between two objects with varying levels of depth/breadth, then streaming, pagination, or using a query cache are likely better patterns.
Painguin (OP, 8mo ago)
First time hearing about query cache, what's that about?
triggan (8mo ago)
"Query cache" meaning more the traditional way of building an external query cache using something like Redis or Memcached. Your application hashes the overall query and stores the result of the query with the hash. When further queries are submitted, your application first does a hash lookup in the cache to see if the results were previously retrieved. If so, it returns the cached results. We discuss this sort of pattern in relation to Neptune here: https://aws.amazon.com/blogs/database/part-3-accelerate-graph-query-performance-with-caching-in-amazon-neptune/
You can get really creative with this sort of pattern and derive your own cache invalidation strategies or use a TTL against each stored hash. You can even use this for pagination, which is where we find this used most.
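The query cache pattern described in this thread could look roughly like the sketch below. It is only an illustration: `QueryCache` and `runQuery` are hypothetical names, and an in-memory Map stands in for Redis or Memcached so the example stays self-contained. The query is hashed with SHA-256 and each entry carries a TTL.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory stand-in for an external cache such as Redis or
// Memcached. A real deployment would replace the Map with a cache client.
class QueryCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  // Hash the query text so that the cache key has a fixed size.
  static key(query) {
    return crypto.createHash('sha256').update(query).digest('hex');
  }

  // Return the cached result if present and not expired; otherwise call
  // runQuery (any async function that executes the query) and store it.
  async getOrRun(query, runQuery) {
    const key = QueryCache.key(query);
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.result;
    const result = await runQuery(query);
    this.entries.set(key, { result, expiresAt: Date.now() + this.ttlMs });
    return result;
  }
}
```

For pagination, one option under the same assumptions is to cache the full result list once and serve slices of it per page, at the cost of results going stale until the TTL expires.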