Apache TinkerPop•12mo ago

`next(n)` with Gremlin JavaScript

I'm trying to do some basic pagination. next(n) seems perfect, but according to the documentation it isn't available for JavaScript. Is there a reason for this limitation?

Florian Hockmann•12mo ago

Interesting, I didn't know that we have that in the docs, and I especially wonder why it says that about Gremlin.Net, since we do support Next(n) there: https://github.com/apache/tinkerpop/blob/82fe33939aa7058bd90d0dcc178817cc5720df17/gremlin-dotnet/src/Gremlin.Net/Process/Traversal/DefaultTraversal.cs#L220 For JavaScript it really does seem to be missing, but I can't say why.

triggan•12mo ago

gremlin-javascript does support streaming, though. So you could likely get the same behavior using this: https://tinkerpop.apache.org/docs/current/reference/#_processing_results_as_they_are_returned_from_the_gremlin_server

PainguinOP•12mo ago

Huh, I thought next(n) was going to do something special:

public IEnumerable<TEnd?> Next(int amount)
{
    for (var i = 0; i < amount; i++)
        yield return Next();
}

This looks like it's just calling Next multiple times. Is this only possible by submitting a Gremlin script? As in, you can't do the same via a traversal (g.etc)?

Florian Hockmann•12mo ago

Yep. The reason is that Gremlin.Net sends your whole traversal to the server for evaluation and gets all results back. So when you call Next(), you simply get the first result, but all other results are already available locally anyway. That's why you can call Next() again afterwards to get more results. If you want the server to iterate your traversal only until it has produced X results, you need to use something like Limit(x), which is sent to the server as part of the traversal.
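
A minimal sketch of how a `next(n)`-style helper might look in gremlin-javascript, following the explanation above. It assumes that `traversal.next()` returns a promise resolving to an iterator-style `{ value, done }` object; `nextN` is a hypothetical name and not part of the driver.

```javascript
// Hypothetical helper (not part of gremlin-javascript): pull up to n results
// by calling next() repeatedly, mirroring what Gremlin.Net's Next(n) does.
async function nextN(traversal, n) {
  const results = [];
  for (let i = 0; i < n; i++) {
    const { value, done } = await traversal.next();
    if (done) break; // traversal exhausted before n results were read
    results.push(value);
  }
  return results;
}
```

Usage might look like `const firstTen = await nextN(g.V().hasLabel('Artist'), 10);`. As with Gremlin.Net, this only changes how many results are read locally; a `limit(10)` step in the traversal is what bounds the work done on the server.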

triggan•12mo ago

I also see range(x,y) used heavily for this sort of pattern. Just be careful, because many databases will not ensure that range() calls are idempotent if you're making changes to the database while performing subsequent queries with range().
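
A rough sketch of the range()-based pattern, assuming zero-based page numbers. `pageBounds` and `fetchPage` are hypothetical helper names, and `buildTraversal` is a caller-supplied function that returns a fresh traversal for each call. The explicit `order()` in the example comment is an assumption added so that pages are comparable between calls; it does not protect against concurrent writes.

```javascript
// Hypothetical helper: translate a zero-based page number into range() bounds.
function pageBounds(page, pageSize) {
  const low = page * pageSize; // inclusive start index
  const high = low + pageSize; // exclusive end index
  return [low, high];
}

// buildTraversal is supplied by the caller and must return a new traversal,
// e.g. () => g.V().hasLabel('Artist').order().by('name')
async function fetchPage(buildTraversal, page, pageSize) {
  const [low, high] = pageBounds(page, pageSize);
  return buildTraversal().range(low, high).toArray();
}
```

Each call sends a separate traversal to the server, which is why results can shift between pages if the graph changes in the meantime.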

Solution

triggan•12mo ago

AFAIK, that is only possible via scripts.

triggan•12mo ago

But you can do something like this:

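    // Assumes this runs inside an async function and that conninfo holds the
    // endpoint URL and any required headers.
    const gremlin = require('gremlin');
    const { Client } = gremlin.driver;
    const { Graph } = gremlin.structure;
    const __ = gremlin.process.statics;

    const client = new Client(conninfo['url'], { traversalSource: 'g', headers: conninfo['headers'] });
    const graph = new Graph();
    const g = graph.traversal();
    // Build the traversal locally; only its bytecode is sent to the server.
    const query = (g.inject().V().has('Artist','name','James Earl Jones').
        repeat(
            __.in_('actor','actress').
            out('actor','actress').
            simplePath()).
        until(
            __.has('Artist','name','Kevin Bacon')
        ).
        path().by('name').by('title'));

    let counter = 0;
    // batchSize: 1 asks the server to send results back one at a time.
    const result = await client.stream(query.bytecode, {}, { batchSize: 1 });
    try {
        for await (const res of result) {
            console.log('data', res.toArray());
            counter = counter + 1;
            console.log(counter);
        }
    } catch (err) {
        console.log(err);
    }

    client.close();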
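
Building on the streaming approach, here is a sketch of a helper that stops reading after n results. `takeFromStream` is a hypothetical name. It only assumes that its input is an async iterable whose items expose `toArray()`, as the result sets in the snippet above do. Whether the server stops producing results once the client stops reading is not something this sketch establishes.

```javascript
// Hypothetical helper: read batches from any async iterable of result sets
// (such as the stream returned by client.stream) and stop after n results.
async function takeFromStream(stream, n) {
  const results = [];
  if (n <= 0) return results;
  for await (const batch of stream) {
    for (const item of batch.toArray()) {
      results.push(item);
      // Returning from inside for-await ends iteration of the stream.
      if (results.length >= n) return results;
    }
  }
  return results; // the stream ended before n results arrived
}
```

With the snippet above, this could be called as `const firstFive = await takeFromStream(result, 5);` in place of the `for await` loop.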

Florian Hockmann•12mo ago

Also note that not all graph databases can execute range(x, y) efficiently. If I'm not mistaken, JanusGraph, for example, will simply execute your traversal as if you had used limit(y), so it will still fetch the first x values from your backend even though they are then skipped. That of course defeats the purpose of using the range() step. I'm not sure though, so it's best to use profile() to check.

PainguinOP•12mo ago

I'm not even sure I need pagination in my case, tbh. My concern is that pulling in a large amount of data in one go might break something or trigger odd behavior. Should I not even bother if, say, there are fewer than 10k results?

triggan•12mo ago

I guess it depends on how computationally complex it is to find each result. If this is just a "find the first 10k vertices with x property", then there may not be a need. But if you're doing something like the code example above, where you want to find multiple paths between two objects with varying levels of depth/breadth, then streaming, pagination, or using a query cache are likely better patterns.

PainguinOP•12mo ago

First time hearing about query cache, what's that about?

triggan•12mo ago

By "query cache" I mean the traditional approach of building an external query cache using something like Redis or Memcached. Your application hashes the overall query and stores the result of the query under that hash. When further queries are submitted, your application first does a hash lookup in the cache to see if the results were previously retrieved. If so, it returns the cached results. We discuss this sort of pattern in relation to Neptune here: https://aws.amazon.com/blogs/database/part-3-accelerate-graph-query-performance-with-caching-in-amazon-neptune/ You can get really creative with this sort of pattern and derive your own cache invalidation strategies or use a TTL against each stored hash. You can even use this for pagination, which is where we find this used most.
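
The hash-and-lookup flow described above can be sketched as follows. This is only an illustration: an in-memory `Map` stands in for Redis or Memcached, and `QueryCache`, `keyFor`, `getOrRun`, and the `runQuery` callback are hypothetical names rather than anything from the linked article or a real library.

```javascript
const { createHash } = require('crypto');

// In-memory stand-in for an external cache such as Redis or Memcached.
class QueryCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // hash -> { value, expiresAt }
  }

  // Hash the query text so it can serve as a compact cache key.
  static keyFor(queryText) {
    return createHash('sha256').update(queryText).digest('hex');
  }

  // Return the cached result if present and not expired; otherwise run the
  // query through the caller-supplied runQuery function and store the result.
  async getOrRun(queryText, runQuery, now = Date.now()) {
    const key = QueryCache.keyFor(queryText);
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > now) return hit.value;
    const value = await runQuery(queryText);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }
}
```

For pagination, one option is to cache the full result list once and serve pages by slicing it, so later pages do not hit the database again until the TTL expires.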
