Apache TinkerPop

Apache TinkerPop is an open source graph computing framework and the home of the Gremlin graph query language.

How to Work with Transactions with Gremlin Python

I'm trying to implement transactions, but I have two scenarios. In the first, I start a transaction, but when I call .iterate() on every add_v, it saves to my gremlin-server before the commit. In the second, if I take out the .iterate() and run commit(), it doesn't save to gremlin-server. What am I doing wrong?...

mergeV with onMerge when extra properties are unknown

I'm in the following situation: ``` jobId = "spark:bdx_job_1" ...

Using java/gremlin inside python with Jpype!

I recently experimented with using Jpype to give the Python world at my day job access to Sqlg. It seems a very easy and powerful way to give Python code full access to any Java API. In my case I am making SqlgGraph available to Python. It is about 5 lines of setup code and voilà, the Python code has the same functionality as native Java. Does anyone use Jpype? Any caveats I should know about?...

Structure Test Suite - Test Data Types and Serialization Types Don't Match?

This issue is based on some assumptions I have and knowledge from my team. Please correct me if any of it is wrong or misguided. An ongoing thing we're doing is better supporting the structure test suite and having a more accurate features list for our Graph. Array types are supported by our Graph, and we handle them using Lists, since when GLVs serialize property values of type array or list they come in as an ArrayList. However, the structure test suite, namely PropertyTest, sends the property value directly to the graph as int[]{1, 2, 3}, for example, which breaks our Graph since we only expected ArrayList based on how serialization behaves....

What's the significance of done: false ? (after calling .next())

Hi, I've encountered a query that I execute, and it usually never returns "done" false. But in a specific case, it does. I run 2 queries, and sometimes I'm not calling .next() or any terminal steps. ...

Profiling Neptune from javascript

Hi, I'm looking to profile some existing Gremlin queries via Neptune so I can understand the current performance and then optimise. Looking at the documentation, there is an HTTP endpoint at "<endpoint>:<port>/gremlin/profile". I've been able to access this through curl, sending a serialised string....
Solution:
Profiles for Gremlin queries either require using the HTTPS endpoint: https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-profile-api.html Or you can use the AWS SDK: https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/neptunedata/command/ExecuteGremlinProfileQueryCommand/ There is no way to get this profile using the Gremlin drivers. ...

select T.id + optional properties

I am trying to work out how to select vertex id and some optional properties. select().by does not work as it filters out not productive properties. Here is the sample graph I am testing this on. ...

Is there a way to specify a query execution timeout via the GremlinLangScriptEngine?

I'm adding a way to specify a query timeout when running queries via G.V(). On the G.V() playground which uses TinkerGraph internally, we "submit" queries directly to the in-memory graph via the GremlinLangScriptEngine. Is there an equivalent of adding a timeout as seen in the Client object via the RequestOptions?
Solution:
No, it's just like standard ScriptEngine implementations in that it operates in the current thread without interrupt. We wrapped the GremlinScriptEngine in the GremlinExecutor to try to generalize behavior for timeouts and Future-based execution. You would have to use that class to get that sort of behavior and avoid using the GremlinLangScriptEngine directly.
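The general pattern described in this answer (run the evaluation on another thread and wait on a Future with a deadline) can be sketched in Python as follows. This is only an illustration of the idea, not the GremlinExecutor API: `eval_with_timeout` and the `evaluate` callable are hypothetical names, and unlike a real interruptible executor, a Python worker thread keeps running after the caller stops waiting.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def eval_with_timeout(evaluate, script, timeout_s):
    """Run evaluate(script) on a worker thread and stop waiting after timeout_s.

    `evaluate` is a hypothetical stand-in for a script engine's eval call.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(evaluate, script)
    try:
        # Block the caller for at most timeout_s seconds.
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # cancel() only helps if the task has not started yet; a running
        # task is not interrupted and will finish in the background.
        future.cancel()
        raise TimeoutError(f"evaluation exceeded {timeout_s}s")
    finally:
        # Do not block the caller while the worker winds down.
        pool.shutdown(wait=False)
```

Calling `eval_with_timeout(engine_eval, "g.V().count()", 5.0)` with some `engine_eval` function would either return its result or raise `TimeoutError` after five seconds, which is the caller-side behavior that evaluating directly in the current thread cannot provide.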

What algorithms exist for this hypergraph data structure?

This is very minimal, but it hints at a type of ontology structure and software system I want to develop. Does it remind you of any known, studied data structures and algorithms? ```python ontology = set()...

Basic vertex querying does not work in Amazon Neptune but it works with local Gremlin Server

``` const fankode : any = await this.gremlinService.readClientSource .V( profileId ) .hasLabel( 'FAN' ) .next();...

CollectingBarrierStep bug

Solution:
Overriding this function seems to fix ``` @Override public Traverser.Admin<Vertex> processNextStart() {...

pymogwai

https://github.com/juupje/pyMogwai is an attempt at a Python-native implementation of the Gremlin query language. There is a demo at: https://mogwai.bitplan.com/ Comments, issues, and feedback are welcome!...

Naming multiple vertices

I've got a list of vertices and a list of unique names. I'm looking to apply one name to each node, but struggling with the syntax. I think I should be using some sort of query builder where I can do: names.forEach(index, name)...

Possibilities to improve performance on query?

I have a Python application with FastAPI that performs 3 actions when the endpoint is called: querying the structure that returns vertices and edges (21 items in total), removing edges, and registering/updating vertices and edges. The query takes a long time to return the values, averaging 1.4 seconds. Before, I separated this query (which I will leave attached) and sent it in threads, but it still took a little longer. I'm also using the cluster reader URL and both...

Neptune Cluster Balancing Configuration

I'm trying to reach 40 rps in the registration flow; currently I'm reaching 20 rps. I have 6 instances of a Python application in FastAPI, and I notice that each interaction takes around 200-400ms to communicate with Neptune. For query flows, I notice that there is a bottleneck where some of these queries take around 200ms-1s. The cluster and the application are both in the same region and the same VPC. When analyzing the cluster I notice that some instances are having more CPU consump...
Solution:
What are you using to send queries to Neptune? Are you using the gremlin-python client and connecting via websockets? If so, each websocket connection is going to act like a "sticky session". It will connect to the same instance for the life of the connection. The reader endpoint is a DNS endpoint that is configured to resolve to a different read replica approximately every 5 seconds. So depending on when you establish your websocket connections or if you're just sending http requests, those could all go to the same instance if sent in quick succession. Customers have solved this in a number of ways. Some will create load balancers in front of Neptune read replicas that can more precisely "load balance" requests across the instances....
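Besides a load balancer in front of the replicas, one way to avoid every connection landing on the same instance is to rotate across the individual replica endpoints on the client side. The sketch below is an assumption, not something stated in the answer: `RoundRobinEndpoints` is a hypothetical helper and the hostnames are placeholders. It only picks the endpoint; opening the actual connection is left to whatever driver is in use.

```python
import itertools
import threading


class RoundRobinEndpoints:
    """Hand out replica endpoints in rotation (hypothetical helper)."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)
        # Guard the iterator so concurrent request handlers can share it.
        self._lock = threading.Lock()

    def next_endpoint(self):
        with self._lock:
            return next(self._cycle)


# Placeholder hostnames; real instance endpoints would go here.
picker = RoundRobinEndpoints([
    "wss://replica-1.example.com:8182/gremlin",
    "wss://replica-2.example.com:8182/gremlin",
])
```

Each new long-lived connection would then be opened against `picker.next_endpoint()` rather than the shared reader endpoint, so that connections are spread across replicas regardless of when DNS happens to rotate.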

[Bug] clone query affects original cloned query

[X] I am using the latest release (3.7.2) I believe the clone function code in gremlin-javascript contains a bug: Notes: - I renamed process to gProcess as it's a global variable in Nodejs...
Solution:
Hi @Jonathan Fridja, thanks for reporting the issue. I agree that this appears to be a bug. Unlike the clone() implementation in the Java driver, which this was based on, the JS clone only creates a shallow copy, which does not clone the underlying bytecode. For reference, this is how clone is implemented in Java: https://github.com/apache/tinkerpop/blob/76190de1086d8be4e207e69d1cc599c9d036a8b5/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/DefaultTraversal.java#L267-L290. Would you be willing to submit a PR to 3.6-dev with a fix that creates a deep copy of the bytecode? If you could, that would be great; otherwise I can open up a JIRA to track this bug....
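The difference between a shallow and a deep clone can be shown with a toy model. The class below is a hypothetical stand-in written in Python, not the gremlin-javascript or Java implementation: it keeps its steps in a mutable `bytecode` list, which is enough to reproduce the symptom where modifying a clone also changes the original.

```python
import copy


class ToyTraversal:
    """Toy stand-in for a traversal that records its steps as bytecode."""

    def __init__(self):
        self.bytecode = []  # list of (step_name, args) tuples

    def add_step(self, name, *args):
        self.bytecode.append((name, args))
        return self

    def shallow_clone(self):
        # New object, but it shares the same bytecode list with the original.
        return copy.copy(self)

    def deep_clone(self):
        # New object with its own independent copy of the bytecode list.
        return copy.deepcopy(self)


base = ToyTraversal().add_step("V")
leaky = base.shallow_clone().add_step("count")
# The step added to the shallow clone also shows up on the original.
shared = base.bytecode is leaky.bytecode

fresh = ToyTraversal().add_step("V")
safe = fresh.deep_clone().add_step("count")
# The original keeps its single step when the clone is a deep copy.
```

With the shallow clone, `base` ends up with both the `V` and `count` steps; with the deep clone, `fresh` still has only `V`.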

Neo4j news

Why is there Neo4j news in graph-news? Is Neo4j built on top of TinkerPop?
Solution:
The TinkerPop Community has long maintained compatibility with neo4j, but recent releases of neo4j haven't been made easily compatible for ongoing maintenance of that support. As a result, support for neo4j has been pinned to a really old version at 3.x. Recent discussions within the TinkerPop community are generally in favor of dropping support for neo4j for TinkerPop 4.x, which was an easier decision now that TinkerGraph supports basic transactions, giving us a way to test that functionality. As for your question about why we keep neo4j news in the #graph-news channel, I suppose I'm mostly responsible for that. As someone who has been working on TinkerPop since its earliest days, I've long thought of TinkerPop as a place to talk about graphs, not just TinkerPop-enabled graphs, but all graphs. Traditionally, it's been that way, but in more recent times that general conversation seems to have drifted to other places. You're the second person to question the inclusion of neo4j here in graph-news, so perhaps there are more folks who find it confusing as to why it is present. It's also fairly noisy, as they post with great consistency, and if you follow graphs generally, you're probably getting that information in other places already. I've been thinking about removing it. I'd be happy to hear if you or others agree with that happening....

Confusing behavior of `select()`.

The following traversal acts as a counter: ``` g.withSideEffect("map", [3: "foo", 4: "bar"]). inject("a", "b", "c", "d")....
Solution:
This is caused (if you are using TinkerGraph for example) by Java's HashMap implementation and the fact that a Long will not match an Integer type. ``` g.withSideEffect("map", [3L: "foo", 4L: "bar"]). inject("a", "b", "c", "d"). aggregate(local, "x")....

Tinkerpop Server OOM

Hi Tinkerpop team, I'm trying to make sense of this OOMing that seems to consistently occur in my environment over the course of usually a couple hours. Attached is a screenshot of the JVM GC behavior metrics showing before & after a GC. It's almost like the underlying live memory continues to grow but I'm not sure why....
Solution:
Sorry for the delayed response. I'll try to take a look at this soon. But for now, I just wanted to point out that SingleTaskSession and the like are part of the UnifiedChannelizer. From what I remember, the UnifiedChannelizer isn't quite production ready, and in fact is being removed in the next major version of TinkerPop. We can certainly still make bug/performance fixes to this part of the code for 3.7.x though.

Good CLI REPL allowing unlabeled edges?

Is there another tool like Gremlin with a REPL but perhaps overall simpler? I’m mainly looking for the ability to make labeled nodes and unlabeled directed binary edges (arrows) between nodes. (On the other hand, I can use a generic label for every level in my Gremlin graph, I guess.)...
Solution:
I think the recommendation would be to do as you suggested at the end of your question: just use default labels and ignore them in Gremlin, like g.V().out() as opposed to g.V().out('default'). Speaking more to your questions, I'm not sure what other graph frameworks you might use. I could be wrong, but I think NetworkX lets you create label-less graph elements: https://networkx.org/