Apache TinkerPop•3y ago

Translating bytecode into a JupyterLab-compatible script

I've found that the Gremlin syntax in Jupyter is different from every other language variant, including Python. For now I've resorted to copying the Java PythonTranslator into a new class and modifying the code to suit the differences. Is there a better way?

10 Replies

triggan•3y ago

Are you using an existing library within Jupyter? Are you familiar with the Graph Notebook project, which has a custom %%gremlin magic? https://github.com/aws/graph-notebook

triggan•3y ago

^^ Not just meant to work with Neptune, but with other TinkerPop-enabled databases.

zendorphinsOP•3y ago

Yep, I'm using the graph-notebook library as per the Neptune documentation; I forgot to mention it. I'm trying to use a translator to get the query script out of an AWS Lambda (running the Java Gremlin driver) so that I can profile and sometimes debug the queries inside notebooks.

The Python translator's output differs from what the notebook accepts in a few ways: maps are written with {} instead of the required [], step names contain underscores (merge_v rather than mergeV), and boolean values are capitalised (True rather than true). The Groovy translator isn't a good fit either, because its output contains type casts and a lot of parentheses that the notebook can't parse.

Fortunately the translator code is pretty easy to understand (thanks spmallette and Marko), so I made myself a custom one for the time being, although I feel like I'm missing something if I'm the first one to run into this problem. I know there's a profile step, but I don't think it accounts for Neptune's optimisations; at least I haven't seen it mentioned in the profiling section of Neptune's docs, so I'm profiling queries in Jupyter.

spmallette•3y ago

could you provide an example of what is not translating properly? the entire Gremlin test suite relies on the translator infrastructure to work so i'm surprised to hear that there could be massive gaps there.

zendorphinsOP•3y ago

Sure, I'll provide an example tomorrow.

Translator outputs:

{
    custom=Script[
            (g.mergeV(['long':1,'float':0.22f,T.id:'test','name':'test','bigDecimal':0.22f,'double':0.22f,'int':1,'bigInteger':1,'boolean':true]).property(Cardinality.single,'name','test').property(Cardinality.single,'float',0.22f).property(Cardinality.single,'int',1).property(Cardinality.single,'long',1).property(Cardinality.single,'double',0.22f).property(Cardinality.single,'boolean',true).property(Cardinality.single,'bigInteger',1).property(Cardinality.single,'bigDecimal',0.22f),Optional.empty)
        ], 
    groovy=Script[
            (g.mergeV([("long"):(1L),("float"):(0.22f),(T.id):("test"),("name"):("test"),("bigDecimal"):(new BigDecimal('0.22')),("double"):(0.22d),("int"):((int) 1),("bigInteger"):(new BigInteger('1')),("boolean"):(true)]).property(VertexProperty.Cardinality.single,"name","test").property(VertexProperty.Cardinality.single,"float",0.22f).property(VertexProperty.Cardinality.single,"int",(int) 1).property(VertexProperty.Cardinality.single,"long",1L).property(VertexProperty.Cardinality.single,"double",0.22d).property(VertexProperty.Cardinality.single,"boolean",true).property(VertexProperty.Cardinality.single,"bigInteger",new BigInteger('1')).property(VertexProperty.Cardinality.single,"bigDecimal",new BigDecimal('0.22')),Optional.empty)
        ], 
    python=Script[
            (g.merge_v({'long':1,'float':float(0.22),T.id_:'test','name':'test','bigDecimal':float(0.22),'double':float(0.22),'int':1,'bigInteger':1,'boolean':True}).property(Cardinality.single,'name','test').property(Cardinality.single,'float',float(0.22)).property(Cardinality.single,'int',1).property(Cardinality.single,'long',1).property(Cardinality.single,'double',float(0.22)).property(Cardinality.single,'boolean',True).property(Cardinality.single,'bigInteger',1).property(Cardinality.single,'bigDecimal',float(0.22)),Optional.empty)
        ]
}

Driver code:

// Relies on static imports of java.util.Map.entry and VertexProperty.Cardinality.single.
Map<Object, Object> mergeMap = Map.ofEntries(
        entry(T.id, "test"), entry("name", "test"), entry("float", 0.22f), entry("int", 1),
        entry("long", 1L), entry("double", 0.22d), entry("boolean", true),
        entry("bigInteger", BigInteger.valueOf(1)),
        entry("bigDecimal", BigDecimal.valueOf(0.22d)));

GraphTraversal<Vertex, Vertex> traversal = g.mergeV(mergeMap)
        .property(single, "name", "test")
        .property(single, "float", 0.22f)
        .property(single, "int", 1)
        .property(single, "long", 1L)
        .property(single, "double", 0.22d)
        .property(single, "boolean", true)
        .property(single, "bigInteger", BigInteger.valueOf(1))
        .property(single, "bigDecimal", BigDecimal.valueOf(0.22d));

// Log the output of all three translators side by side.
logger.info(Map.ofEntries(
        entry("groovy", groovyTranslator.translate(traversal)),
        entry("python", pythonTranslator.translate(traversal)),
        entry("custom", customTranslator.translate(traversal))
).toString());

I'm not sure how well that custom translator handles all of the data types, because I haven't really used most of them, but the query works in Jupyter. I used logging to avoid any string escaping.

spmallette•3y ago

@neptune any ideas on where things are amiss here?

kelvinl2816•3y ago

Will take a look @spmallette. Is the issue here that the Gremlin generated from the code cannot be used in a notebook with %%gremlin? If so, the notebooks don't need the Python form. They just send the queries as text (over a WebSocket), so the Groovy form should work so long as it is compatible with the ANTLR grammar that TinkerPop uses. So the flow should just be Java --> Groovy form (aka basic Gremlin form) --> %%gremlin

zendorphinsOP•3y ago

Thanks for the response, the groovy translator I was using for this example is the default one you create with GroovyTranslator.of("g"). I mistakenly based my translator on the python one so I'm not as sure what differences there are between the script it returns and the one that works in jupyter, but I can at least say that it returns an error:

zendorphinsOP•3y ago

It seems most of the parentheses in the map need to be stripped, and non-primitive type construction with new, like the bigDecimal one, doesn't work either, although I have no clue about the type constraints in Neptune or TinkerPop, so I can't say for sure that it is a problem. "int":(int) 1 doesn't seem to be usable either, and neither does VertexProperty.Cardinality.single.
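As a stopgap, the differences listed above could be cleaned up by post-processing the Groovy translator's text before pasting it into a notebook. The following is only a minimal sketch under stated assumptions: GroovyScriptNormalizer is a hypothetical helper (not part of TinkerPop or graph-notebook), it works on plain text rather than on bytecode, it does not look inside string literals, and replacing new BigDecimal('0.22') with a bare 0.22 is an assumption that loses the exact numeric type.

```java
// Hypothetical stopgap helper; not part of TinkerPop or graph-notebook.
final class GroovyScriptNormalizer {

    private GroovyScriptNormalizer() {
    }

    /** Rewrites GroovyTranslator-style text into a plainer Gremlin form. */
    static String normalize(String script) {
        return script
                // new BigDecimal('0.22') -> 0.22 (assumption: the exact type is dropped)
                .replaceAll("new Big(?:Decimal|Integer)\\('([^']*)'\\)", "$1")
                // (int) 1 -> 1
                .replaceAll("\\(int\\)\\s*", "")
                // VertexProperty.Cardinality.single -> Cardinality.single
                .replace("VertexProperty.Cardinality.", "Cardinality.")
                // ("name"): -> "name":   and   (T.id): -> T.id:
                .replaceAll("\\((\"[^\"]*\"|T\\.\\w+)\\):", "$1:")
                // :("test") -> :"test"   (only values without nested parentheses)
                .replaceAll(":\\(([^()]*)\\)", ":$1");
    }

    public static void main(String[] args) {
        String groovy = "g.mergeV([(\"name\"):(\"test\"),(\"int\"):((int) 1)])"
                + ".property(VertexProperty.Cardinality.single,\"int\",(int) 1)";
        System.out.println(normalize(groovy));
    }
}
```

The order of the replacements matters: casts and constructor calls are unwrapped first, so that the final step only has to strip parentheses around simple values. Text-level rewriting like this is fragile (a string value that happens to contain ":(" would be altered), so it is best treated as a temporary debugging aid rather than a replacement for a proper translator.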

spmallette•3y ago

Oh, I think I see it now. Neptune doesn't use Groovy to parse Gremlin. It uses the Gremlin ANTLR grammar. The ANTLR grammar is fairly close to Groovy syntax but not quite in all places (those places are typically in language areas that core Groovy syntax doesn't support). I think I'd classify this as a bug in the translator, because the translator is just playing it safe with the keys. The result isn't really idiomatic Groovy, and that non-idiomatic form is not something the ANTLR grammar expects. Created this issue: https://issues.apache.org/jira/browse/TINKERPOP-2922 - it will be fixed for 3.6.3, which is hopefully releasing soon. Thanks for reporting this, as it's definitely an important issue to fix given the prevalence of mergeV/E() usage that takes Map instances. Appreciate the patience in getting to an understanding of what was going on.
