Apache TinkerPop

Apache TinkerPop is an open source graph computing framework and the home of the Gremlin graph query language.

[Bug?] gremlinpython hangs or does not recover connections after a connection error

Hello, TinkerPop team. I am struggling to avoid problems after a connection error occurs, and I now suspect the cause may be a bug in gremlinpython... ...
Solution:
What you're noticing here boils down to how connection pooling works in gremlin-python. The pool is really just a queue that a connection adds itself back to after either an error or a success, but it's missing some handling for the scenarios you pointed out. One of the main issues is that the pool itself can't determine whether a connection is healthy or whether it is unhealthy and should be removed from the pool. I think you should go ahead and make a Jira for this. If it's easier for you, I can help you make one that references this post. I think the only workaround right now is to occasionally open a new Client to create a new pool of connections when you notice some of those exceptions...
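The workaround described above can be sketched as a small wrapper that throws away the client (and therefore its whole connection pool) when a connection error surfaces, then retries on a fresh one. This is a hedged sketch, not gremlin-python API: `make_client` is a placeholder factory (with gremlinpython you would pass one that returns `gremlin_python.driver.client.Client`) and `ConnectionError` stands in for the driver's actual connection exceptions.

```python
# Sketch of the suggested workaround: recreate the client (and its pool)
# when a connection error occurs, then retry the query once.
# `make_client` and `ConnectionError` are stand-ins, not gremlinpython API.

class PoolRecreatingClient:
    def __init__(self, make_client):
        self._make_client = make_client
        self._client = make_client()

    def submit(self, query):
        try:
            return self._client.submit(query)
        except ConnectionError:
            # Drop the possibly poisoned pool and retry once on a new one.
            try:
                self._client.close()
            except Exception:
                pass
            self._client = self._make_client()
            return self._client.submit(query)
```

In a real service you would also bound how often recreation happens, so a flapping server does not turn every request into a reconnect.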

Vertex hashmaps

Hi, I'm looking to copy subgraphs; if there are better practices for this in general, please let me know. I'm currently looking at emitting a subtree, then creating new vertices, storing a mapping of the original to the copy, and reusing this mapping to build out the relationships for the copied vertices. I'm not sure how I should be doing this; currently I'm trying to use the aggregate step to store the original/copy pairs, but I'm not sure how to select nodes from this in future steps...
Solution:
since you tagged this question with javascript i think that aggregate() is probably your best approach. in java, you would probably prefer subgraph() because it gives you a Graph representation which you could in turn run Gremlin on and as a result is quite convenient. we hope to see better support for subgraph() in javascript (and other language variants) in future releases.
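The two-pass pattern the question describes (clone vertices while recording an original-to-copy mapping, then rewire edges through that mapping) is what `aggregate()` would hold server-side. Here it is sketched over plain Python dicts and edge triples rather than a live graph, purely to show the shape of the algorithm:

```python
# Two-pass subgraph copy, sketched with plain data structures:
# pass 1 clones each vertex and records original-id -> copy-id;
# pass 2 re-creates edges through that mapping.
def copy_subgraph(vertices, edges, start_id=100):
    mapping = {vid: start_id + i for i, vid in enumerate(vertices)}
    copied_vertices = {mapping[v]: label for v, label in vertices.items()}
    copied_edges = [(mapping[u], lbl, mapping[v]) for (u, lbl, v) in edges]
    return copied_vertices, copied_edges
```

With Gremlin the same idea applies: create the copies first, keep the old/new pairs in a side-effect, then add edges by looking each endpoint up in that side-effect.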

Benchmarking

Hi everyone, how do you benchmark with Gremlin?
Solution:
that's a fairly broad question, so i'll give a broad answer. one of the nice things about TinkerPop is that it lets you connect to a lot of different graph databases with the same code, so it does allow you to compare performance of different graph databases. that said, doing a good benchmark is still a bit hard as it's not enough to just use Gremlin to generate a random graph and issue a few queries. among other things, a critical step is to gain a decent understanding of the workings of the gr...

How to improve Performance using MergeV and MergeE?

I made an implementation similar to this:
```
g.mergeV([(id): 'vertex1']).
    option(onCreate, [(label): 'Person', 'property1': 'value1', 'updated_at': 'value2']).
    option(onMatch, ['updated_at': 'value2']).
  mergeV([(id): 'vertex2']).
    option(onCreate, [(label): 'Person', 'property1': 'value1', 'updated_at': 'value2']).
    option(onMatch, ['updated_at': 'value2']).
  mergeV([(id): 'vertex3']).
    option(onCreate, [(label): 'Person', 'property1': 'value1', 'updated_at': 'value2']).
    option(onMatch, ['updated_at': 'value2'])
```
So I'm sending 2 requests to Neptune: the first with 11 vertices and the second with 10 edges, and I'm running a performance test against Neptune. The duration of the process for this amount of content is around 200ms-500ms. Is there a way to make this query faster? For the connection I'm using
```
gremlin = client.Client(neptune_url, 'g',
    transport_factory=lambda: AiohttpTransport(call_from_event_loop=True),
    message_serializer=serializer.GraphSONMessageSerializer())
```
and I send the query with gremlin.submit(query)...
Solution:
In general, the way to get the best write performance/throughput on Neptune is to batch multiple writes into a single request and then issue multiple batched writes in parallel. Neptune stores each atomic component of the graph (node, edge, and property) as a separate record. For example, a node with 4 properties turns into 5 records in Neptune. A batched write query with around 100-200 records is the sweet spot we've found in testing, so issuing queries with that many records and running them in parallel should provide better throughput. Conditional writes will slow things down, as additional locks are taken to ensure data consistency, so writes that use plain addV(), addE(), and property() steps will be faster than mergeV() or mergeE(). The latter can also incur more deadlocks (exposed in Neptune as ConcurrentModificationExceptions), so it is also good practice to implement exponential backoff and retries whenever doing parallel writes into Neptune...
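The batching-plus-retry pattern described above can be sketched in a few lines. This is a hedged sketch, not Neptune-specific code: `submit` is a placeholder for whatever sends one batched write request, and the 150-record chunk size follows the 100-200 sweet spot mentioned in the answer.

```python
# Chunk records into write batches of ~150, and retry a failed batch with
# exponential backoff (e.g. on a ConcurrentModificationException).
# `submit` is a placeholder for the function issuing one batched write.
import time

def chunk(records, size=150):
    return [records[i:i + size] for i in range(0, len(records), size)]

def submit_with_backoff(submit, batch, retries=5, base_delay=0.1):
    for attempt in range(retries):
        try:
            return submit(batch)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...
```

The batches themselves can then be dispatched in parallel (threads or asyncio), with each worker calling `submit_with_backoff` on its own batch.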

Why is T.label immutable and do we have to create a new node to change a label?

We cannot do g.V('some label').property(T.label, 'new label').iterate()? Is this correct? Thank you
Solution:
you have a few questions here @Julius Hamilton
Why is T.label immutable
i'm not sure there's a particular reason except to say that many graphs have not allowed that functionality so TinkerPop hasn't offered a way to do it.
and do we have to create a new node to change a label?...

Simple question about printing vertex labels

I am creating a graph in the Gremlin Console by doing graph = TinkerGraph.open(), g = graph.traversal(), g.addV("somelabel"). I can confirm a vertex was created, and I can do g.V().valueMap(true) and it shows ==>[id:0,label:documents]. But so far I do not know how to print information about a vertex via its id. I have tried g.V(0) but it doesn't print anything.
Solution:
By default, IDs are stored as longs. You likely need to use g.V(0L) in Gremlin Console to return the vertex that you created.

Defining Hypergraphs

I want to create a software system where a person can create labeled nodes, and then define labeled edges between the nodes. However, edges also count as nodes, which means you can have edges between edges and edges, edges and nodes, edges-between-edges-and-nodes and edges-between-nodes-and-nodes, and so on. This type of hypergraph is described well here in this Wikipedia article:
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ad infinitum. In essence, every edge is just an internal node of a tree or directed acyclic graph, and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph....
Solution:
I always understood that back in the day Marko and crew decided that hypergraphs can be modeled by a property graph: you stick a vertex in the middle that represents the hyperedge. This leaves the query language without any first-class constructs for navigating hyperedges, but everything is reachable. Another problem would be performance: the more abstraction away from the on-disk implementation, the slower the graph becomes...
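The vertex-in-the-middle trick above can be sketched concretely. In this hedged sketch (plain dicts standing in for a property graph), a hyperedge becomes an ordinary vertex connected by "member" edges to whatever it joins, and since a hyperedge is itself a vertex, another hyperedge can include it, which is exactly the edges-pointing-at-edges case from the question:

```python
# Model a hyperedge as a vertex with "member" edges to its participants.
# Because the hyperedge is a vertex, other hyperedges can include it.
def add_hyperedge(graph, edge_id, label, members):
    graph["vertices"][edge_id] = label  # the hyperedge is just a vertex
    for m in members:
        graph["edges"].append((edge_id, "member", m))
    return edge_id

g = {"vertices": {"a": "node", "b": "node", "c": "node"}, "edges": []}
e1 = add_hyperedge(g, "e1", "relates", ["a", "b"])
# an "edge between an edge and a node":
add_hyperedge(g, "e2", "annotates", [e1, "c"])
```

In Gremlin terms, each `add_hyperedge` call corresponds to one addV() for the hyperedge vertex plus one addE("member") per participant.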

JanusGraph AdjacentVertex Optimization

Hiya, I'm wondering if anyone has any advice on how to inspect the provider-side optimizations being applied to my Gremlin code by JanusGraph. Currently when I call explain() I get the following output. ``` Original Traversal [GraphStep(vertex,[]), HasStep([plabel.eq(Person)])@[a], VertexStep(OUT,vert...
Solution:
TinkerPop applies all optimization strategies to all queries (including JanusGraph's internal optimizations), but JanusGraph skips some of the optimizations as it sees necessary. We don't currently store information on whether an optimization strategy modified any part of the query or was simply skipped (a potential feature request). Thus, the way I would test whether an optimization strategy actually makes any changes is to debug the query with a breakpoint placed in the relevant optimization strategy. In your case I would place a breakpoint here: https://github.com/JanusGraph/janusgraph/blob/c9576890b5e9dc48676ccc16a58552b8a665e5f0/janusgraph-core/src/main/java/org/janusgraph/graphdb/tinkerpop/optimize/strategy/AdjacentVertexOptimizerStrategy.java#L58C13-L58C28 If this part is triggered during your query execution, the optimization is working in this case...

Efficient degree computation for traversals of big graphs

Hello, we're trying to use Neptune with Gremlin for a fairly big (XX m nodes) graph, and our queries usually have to filter out low-degree vertices at some point in the query for both efficiency and product reasons. According to the profiler, this operation takes the brunt of our computation time. At the same time, the degree computation is something that could be pre-computed on the database (we don't even need it to be 100% accurate; a "recent" snapshot computation would be good enough), which would significantly speed up the query. Does anyone have a trick up their sleeve for degree computation that would work well, either as an ad-hoc snippet in a query or as a nice way to precompute it on the graph? ...
Solution:
I would first mention that the profile() step in Gremlin is different than the Neptune Profile API. The latter is going to provide a great deal more info, including whether or not the entire query is being optimized by Neptune: https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-profile-api.html If you have 10s of millions of nodes, you could use Neptune Analytics to do the degree calculations. Then extract the degree properties from NA, delete the NA graph, and bulk load those values back into NDB. We're working to make this round-trip process more seamless. But it isn't too hard to automate in the current form....

Gremlin query to order vertices with some locked to specific positions

I'm working with a product catalog in a graph database using Gremlin. The graph structure includes: 1. Product vertices...
Solution:
you can play tricks like this to move the 0 index to last place, but that still leaves the 7 one row off ``` gremlin> g.V().hasLabel("Category"). ......1> inE("belongsTo").as('a').outV(). ......2> path()....

Is it possible to configure SSL with PEM certificate types?

Hi all, I'm new to this group and currently working getting an implementation of Gremlin (Aerospike Graph) to listen over SSL. The certificates we get from our provider's API are only served in PEM format. It appears, according to the documentation that the keyStoreType and trustStoreType either JKS or PKCS12 format: https://tinkerpop.apache.org/javadocs/current/full/org/apache/tinkerpop/gremlin/server/Settings.SslSettings.html Is this true? Is there any way for us to configure SSL with PEM format certificates?...
Solution:
Hi @joshb, am I correct in assuming you are using the Java driver to connect to Aerospike? The java driver uses the JSSE keyStore and trustStore, which as far as I understand does not support the PEM format. You may be able to use a 3rd party tool such as openssl to convert from PEM to PKCS12 (https://docs.openssl.org/1.1.1/man1/pkcs12/). Perhaps @aerospike folks may have more direct recommendations for driver configuration....

Query works when executed in console but not in javascript

``` const combinedQuery = gremlin.V(profileId) .project('following', 'follows') .by( __.inE('FOLLOWS').outV().dedup().id().fold()...

Very slow regex query (AWS Neptune)

We have a query that searches a data set of about ~400,000 vertices, matching properties using a case insensitive TextP.regex() expression. We are observing very bad query performance; even after several other optimizations, it still takes 20-45 seconds, often timing out. Simplified query: ``` g.V()...
Solution:
When Neptune stores data, it stores it in 3 different indexed formats (https://docs.aws.amazon.com/neptune/latest/userguide/feature-overview-data-model.html#feature-overview-storage-indexing), each of which is optimized for a specific set of common graph patterns. Each of these indexes is optimized for exact-match lookups, so when running queries that require partial text matches, such as a regex query, all the matching property data needs to be scanned to see if it matches the provided expression.
To get a performant query for partial text matches, the suggestion is to use the full-text search integration (https://docs.aws.amazon.com/neptune/latest/userguide/full-text-search.html), which integrates with OpenSearch to provide robust full-text searching capabilities within a Gremlin query...

How can we extract values only

"Latitude", { "@type": "g:Double", "@value": 45.2613104 },...
Solution:
If I understand this correctly, you are first trying to take the result of a Gremlin query that returns Latitude and Longitude (as in your initial post), and use those values in the math() step that calculates the Haversine formula (your latest post). If that is the case, you have two options. 1. You could combine this into one Gremlin query: save the results of the Latitude and Longitude to variables, or use them in a by() modulator to the math() step. Assuming those values are properties on a vertex called 'lat' and 'lon', it would look something like g.V().project('Latitude', 'Longitude').by('lat').by('lon').math(...). You would replace the ... in the math() step with the Haversine formula. 2. If you want to keep these as two separate queries, then you should use one of the Gremlin Language Variants (GLVs), which are essentially drivers that automatically deserialize the result into the appropriate type so you don't have to deal with the GraphSON (which is what your initial post shows). Read triggan's answer above for more details about that...
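For reference, the Haversine formula mentioned above in plain Python, useful as a cross-check for whatever expression you end up building inside math(). Inputs are in degrees; the 6371 km mean Earth radius gives results in kilometers:

```python
# Great-circle distance between two (lat, lon) points via the Haversine
# formula. Inputs in degrees, result in kilometers (mean Earth radius 6371 km).
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2, r=6371.0):
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * r * asin(sqrt(a))
```

One degree of longitude at the equator is about 111.2 km, which is a handy sanity check for the math() version too.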

How to speed up gremlin query

Hi, I am working with JanusGraph and my query is taking a while to execute (around 2.8 seconds), but I would like it to be faster. I read that I should create a composite index to improve speed and performance, or something of that sort, but I am unfamiliar with how to do that in Python. Here is my query: g.V().has("person", "name", "Bob").outE("knows").has("weight", P.gte(0.5)).inV().values("name").toList() What my query does is find all the nodes that Bob has the "knows" relationship with, as long as the weight of the edge to those nodes is >=0.5. Bob is connected to around ~600 nodes with the "knows" relationship. It's fairly slow and takes 2.5-2.8 secs to complete...
Solution:
I read that I should create a composite index to improve speed and performance or something of that sort, but I am unfamiliar with how to do that in Python.
as a point of clarification around indices, you wouldn't likely do that step with python. you typically use gremlinpython just to write Gremlin. For index management, you need to use JanusGraph's APIs directly. often those are just commands you would execute directly in Gremlin Console against a JanusGraph instance....
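Since JanusGraph's index management is a Groovy API, the usual route from Python is to build the management commands as a script string and submit it through the driver (e.g. gremlinpython's `Client.submit`). A hedged sketch; the property key ("name") and index name below are illustrative assumptions, and JanusGraph may additionally require the index to be enabled/reindexed before it serves queries:

```python
# Build a JanusGraph composite-index creation script (Groovy) to submit as a
# string from Python. Property key and index name are example values.
def composite_index_script(prop="name", index="byNameComposite"):
    return (
        "mgmt = graph.openManagement()\n"
        f"key = mgmt.getPropertyKey('{prop}') ?: "
        f"mgmt.makePropertyKey('{prop}').dataType(String.class).make()\n"
        f"mgmt.buildIndex('{index}', Vertex.class)."
        "addKey(key).buildCompositeIndex()\n"
        "mgmt.commit()"
    )

script = composite_index_script()
# then, with a gremlinpython Client connected to the server:
# client.submit(script).all().result()
```

Running the same commands interactively in Gremlin Console against the JanusGraph instance, as the answer suggests, is equivalent.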

logging and alerting inside a gremlin step

I am trying to add a log statement inside a step but I am getting an error: ResponseError: Server error: null (599). This is the code:
```
// Log the creation of a shell flight
.sideEffect(() => logger.info(`Shell flight created for paxKey ${paxKey} with flightId ${flightLegID}`))
```
Solution:
Since you are getting a "Server error" I assume you are using Gremlin in a remote context. I'd further assume you are not sending a script to the server, but are using bytecode or a graph like Amazon Neptune. Those assumptions all point to the fact that you can't use a lambda that way in remote contexts. The approach you are using to write your lambda is for embedded use cases only (i.e. where the query is executed in the same JVM where it was created). If you want to send a lambda remotely, you would need a server that supports them (e.g. Neptune does not, but Gremlin Server with Groovy ScriptEngine processing does) and then follow these instructions: https://tinkerpop.apache.org/docs/current/reference/#gremlin-java-lambda The other thing to consider is that the lambda will be executed remotely, so it might not know what "logger" is, and the log message will appear on the server, not the client. ...

Optimizing connection between Python API (FastAPI) and Neptune

Hi guys. I've been working with gremlin-python in my company for the past 4 years, using Neptune as the database. We are running a FastAPI server, where Neptune has been the main database since the beginning. We have always struggled to get good performance out of the API, but recently it has become a more pressing pain, with endpoints taking more than 10s to respond. We took some actions to improve this performance, such as updating the cluster to the latest engine version, and the same for the FastAPI and gremlin-python dependencies. ...
Solution:
There's a lot to unpack here...
1. We state in our docs that t4g.medium instances are really not great for production workloads. We support them for initial development so users can keep costs down, but the amount of resources available, and the fact that they are burstable instances, really constrains their usability. Once you've used up your CPU credits, you're going to get throttled.
2. Neptune's concurrency model is based on instance size and the number of vCPUs per instance. For each vCPU there are two query-execution threads. So on a t4g.medium or an r6g.large instance, there are 2 vCPUs, which means that instance can only be computing 4 concurrent requests at a time. If you need more concurrency, you should look to scale to a larger instance with more vCPUs. If your workload varies over time, you may want to investigate Neptune Serverless, which can automatically scale vertically to meet the needs of the application. There's a good presentation from last year's re:Invent that discusses when Serverless works best and when not to use it: https://youtu.be/xAdWa0Ahiok?si=OeSe-_L3ErcYH-XU...
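The sizing rule above (two query-execution threads per vCPU) lends itself to a quick back-of-the-envelope helper. A trivial sketch; the vCPU counts in the comments are the examples from the answer, not a full instance catalog:

```python
# Neptune concurrency rule of thumb: 2 query-execution threads per vCPU.
import math

def max_concurrent_queries(vcpus):
    # e.g. t4g.medium / r6g.large have 2 vCPUs -> 4 concurrent queries
    return 2 * vcpus

def vcpus_needed(target_concurrency):
    # smallest vCPU count whose thread pool covers the target concurrency
    return math.ceil(target_concurrency / 2)
```

So an API that routinely has 10 in-flight Gremlin requests needs an instance (or reader fleet) providing at least 5 vCPUs, or it will queue.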

Breadth-First Traversal Fact Check

Using Neptune query profiling, I have found that Gremlin queries seem to use a depth-first strategy to search things, and as a result it tends to be both time and resource intensive, especially when what I am looking for is a node just 1 or 2 levels below. To do a breadth-first traversal the following approach has been suggested, but I am not sure if it really does the trick. If my goal is to find the nearest nodes quickly, what would be efficient approaches?...
Solution:
Neptune uses BFS as the default traversal strategy. You can change the method in which a repeat() is executed via the query hint as noted here: https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-query-hints-repeatMode.html

How to find the edges of a node that have a weight of x or greater?

I have a graph with nodes that are connected to one another, and weights on the edges that connect these nodes. The query g.V().has("person", "name", "A").out("is friends with").values("name").to_list() returns the names of people with whom person "A" has the "is friends with" relation, but I would like to filter by the weight value. Person A has a weight of 0.9 with person B, and a weight of 0.7 with person C. I would like to only get back the people person A shares an edge with whose weight is greater than 0.8; how can I do that? I am using gremlin-python. I have tried g.V().has("person", "name", "A").out("is friends with").has("weight", gte(0.8)) but I get an error saying NameError: name "gte" is not defined. Did you mean: 'g'?...
Solution:
g.V().has("person", "name", "A").out(... returns vertices; to get at the edges you need to use outE(), something like a_friends = g.V().has("person", "name", "A").outE("is friends with").has("weight", P.gt(0.75)).inV().to_list()...
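Note also that the NameError in the question comes from `gte` not being imported: in gremlin-python the predicates live on `P` (`from gremlin_python.process.traversal import P`), used as `.has("weight", P.gte(0.8))`. To show the filter's semantics without a running server, here is the same edge filter mirrored over plain `(person, friend, weight)` triples standing in for the outgoing "is friends with" edges (a local sketch, not driver code):

```python
# Local mirror of: outE("is friends with").has("weight", P.gte(t)).inV()
# over (person, friend, weight) triples standing in for the edges.
def friends_over(edges, threshold=0.8):
    return [friend for (_person, friend, w) in edges if w >= threshold]

edges = [("A", "B", 0.9), ("A", "C", 0.7)]
```

With the threshold at 0.8, only B survives, matching the behavior the question is after.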

op_traversal P98 Spikes

Hi TinkerPop team! I'm observing these abrupt spikes in my gremlin server P98 metrics in my JanusGraph environment. I've been looking at the TraversalOpProcessor (https://github.com/apache/tinkerpop/blob/master/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/op/traversal/TraversalOpProcessor.java) code over the last couple days for some ideas of what could be causing it but I'm not seeing an obvious smoking gun so figured I'd ask around. The traversal in question is being submitted from Rust via Bytecode using gremlin-rs. Specifically with some additions I've made that added mergeV & mergeE among other things. It's in a PR awaiting the maintainer's review, but that's not really relevant to my question, but over here if you want to see it: https://github.com/wolf4ood/gremlin-rs/pull/214. The GraphSONV3 serializer is what's being used by the library....
Solution:
For anyone else who finds this thread, the things I ended up finding to be issues:
- Cassandra's disk I/O throughput (EBS gp3 is 125MB/s by default; at least for my use case I was periodically maxing that out, and increasing to 250MB/s resolved that apparent bottleneck). So when long sustained writing occurred, 125MB/s was not sufficient.
- Optimizing traversals to use mergeE/mergeV where they had been either older Groovy-script-based evaluations I was submitting or older fold().coalesce(unfold(), ...) style "get or create" vertex mutations...