Apache TinkerPop

mergeE(): increment counter on match

Hi, is there an easy way to increment an existing edge property based on its current value using mergeE() in one single query? (e.g., counter += 1) Something similar to this: ``` g.mergeE([(T.label):'called', (from): person1, (to):person2])....
Solution:
gremlin> g.mergeE([(Direction.from):44,(Direction.to):8]).valueMap(true)
==>[id:5062,label:route,dist:549]
gremlin> g.mergeE([(Direction.from):44,(Direction.to):8]).valueMap(true)
==>[id:5062,label:route,dist:549]
and then...

Serialization Issue

I'm seeing a strange error. When I connect with the JanusGraph Gremlin client using conf/remote-graph-binary.yaml, I get results. But when I use my Java application, I get: java.io.IOException: Serializer for custom type 'janusgraph.RelationIdentifier' not found. From searching around, this appears to be a serialization issue. The Gremlin client and my Java application seem to have similar configs, yet the Gremlin client has no serialization problem. ``` hosts: [localhost] port: 8182...
Solution:
I have faced a similar issue in the past (but mostly related to gremlin-python) and @Boxuan Li suggested a solution in the JanusGraph Discord server. It was something along these lines: ``` private static MessageSerializer createGraphBinaryMessageSerializerV1() { final GraphBinaryMessageSerializerV1 serializer = new GraphBinaryMessageSerializerV1();...

Design decision related to multiple heterogeneous relational graphs

I'm working with over 100k instances of heterogeneous, relational node-and-edge attributed graphs, each graph having around 5k vertices and 10k edges. Vertices are of 3 types with 10 attributes (7 numerical, 3 string), and edges are of 5 types with 8 attributes (4 numerical, 4 string). Considering the complexity and size of the data, running queries like traversal paths, average clustering coefficients, and identifying nodes in clustering triangles across all these instances presents a significant challenge. I've been using a naive gremlin-server setup with an in-memory database to run my queries on one graph instance, but it's becoming clear that this approach isn't sustainable for multi-graph persistence or memory efficiency, as a single graph instance consumes about 1.2 GB of RAM. I'm exploring the possibility of switching to JanusGraph with a Berkeley DB backend to support persistent storage of multiple graphs (based on the feedback I got from the gremlin google group, https://groups.google.com/g/gremlin-users/c/UotOZFVvi3k/m/-hVd2oNNAQAJ). Given the data structure and requirements, especially the need for efficient loading and querying of individual graph instances in a possibly serializable fashion, do you think JanusGraph with Berkeley DB is a viable solution, or are there alternative approaches I should consider for managing and querying this volume of graph data effectively?...
Solution:
No, we actually recommend using user-defined IDs

Stack overflow when adding a larger list of property values using traverser.property()

Hey, we encounter a stack overflow: ``` Exception during Transaction, rolling back ... org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(org/apache/tinkerpop/gremlin/process/traversal/step/util/AbstractStep.java:150): Java::JavaLang::StackOverflowError from org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(org/apache/tinkerpop/gremlin/process/traversal/step/util/ExpandableStepIterator.java:55)...

java: package org.apache.tinkerpop.shaded.jackson.core does not exist

While trying to run `mvn clean install` with JDK 11, I ran into the above error using the master branch. Any idea?

Performance issue in large graphs

When making changes to a large graph (ca. 100K nodes, 500K edges) stored in a single Kryo file, I am experiencing huge delays. For example, on initial writes I can change 10K nodes in minutes, but once the graph is big the same changes take more than an hour. Is there an easy solution, such as breaking the graph down and saving it in smaller files? Any suggestion is helpful. My initial preference is saving to the file system (local or network). Thanks for your suggestions/sol...
Solution:
I'm not sure if any of the other serializations such as GraphML or GraphSON might perform better, but I would say this is likely not a common way we see those graphs used so we may not have too much data on which techniques may work best. With the exception of TinkerGraph which is often used as an in-memory, somewhat ephemeral, graph, we typically see persistent graph stores used where the data is persisted on disk by the database and you do not need to constantly reload the data each time. If y...

Concurrent queries to authentication-required server resulted in 401 error

Hey guys, playing around with gremlin & encountered this very odd error where concurrent queries will break authentication: ```js import gremlin from "gremlin"; ...
Solution:
Looks like a bug. Could you create an issue in https://issues.apache.org/jira/projects/TINKERPOP ?...

Discrepancy between console server id conventions and Neptune

So I'm working with my test server and on Neptune--and I'm noticing a difference in the type of the T.id field. Is there any way to configure the type of id generated by the gremlin server?
Solution:
Amazon Neptune uses strings for all IDs. You can configure a Gremlin Server to also use String IDs. There is a nice writeup here that may be useful (it's from the graph-notebook repo but the steps still apply) https://github.com/aws/graph-notebook/tree/main/additional-databases/gremlin-server

How to connect the amothic/neptune container to the volume?

I need to know which directory to attach to the container so that the data is stored safely, even after a restart.
Solution:
Check out the graphLocation and graphFormat config options here: https://tinkerpop.apache.org/docs/current/reference/#tinkergraph-configuration You may also want to use a mapped directory from your local machine to ensure data is not lost if the container is deleted: https://docs.docker.com/storage/volumes/...

Docker yaml authentication settings (gremlinserver.authentication) question

Does anyone have any experience setting up authentication on Docker by using the supplied .yaml file? I'm having trouble passing in a map to properly set one of the options: gremlinserver.authentication.config. Additional info, but not related to my main problem: I have a file with the contents of username/password pairs which follow the schema: ...
Solution:
This is due to Gremlin Server expecting a map, while Docker is unable to pass it to the server in the expected format.
I think you simply have a slight misunderstanding of the YAML format here. YAML is basically a nested map of maps. Now, if your YAML looks like this: ...

Gremlin Injection Attacks?

Is anyone talking about or looking into attacks and mitigations for Gremlin Injection Attacks? That is, just like all the commentary on how to design your PHP-based web frontend with Postgres backend to not be a sucker for an easy SQL Injection Attack, is anyone looking at how to handle your users of your Gremlin Server when those users give you Groovy lambdas that are rich in aggressive behavior?
Solution:
I think this goes back to a different thread we had where I mentioned that security was a reason driving an idea that lambdas should not be allowed outside of embedded use cases and why they should be removed otherwise. For some lightweight security you can try to sandbox the ScriptEngine in the server: https://tinkerpop.apache.org/docs/current/reference/#script-execution but it is not a perfect solution and really just a reference implementation that we have. Some commercial offerings in the...
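
For the narrower, SQL-injection-like case where user input is spliced into a script string, one common mitigation is to send a constant script plus bindings. The sketch below is only an illustration: `buildNameLookup`, the `person` label, and the `name`/`age` keys are hypothetical, and it does not address users submitting their own lambdas, which is what the sandboxing above is about.

```javascript
// Hypothetical helper: keeps the script text constant and passes user
// input only as a binding, never by string concatenation.
function buildNameLookup(userInput) {
  return {
    script: "g.V().has('person', 'name', name).values('age')",
    bindings: { name: String(userInput) },
  };
}

// Made-up hostile input that would alter a concatenated script.
const hostile = "x').drop();g.V().has('name','";
const request = buildNameLookup(hostile);

// With the gremlin package, this pair would be sent as
// client.submit(request.script, request.bindings).
console.log(request.script.includes(hostile)); // false
```

Because the script never contains the user's text, the input can only ever be treated as a value for `name`.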

Returned vertex properties (JS client)

Hi, I've got a question regarding the returned vertex value when using the JS client. How come non-array properties are parsed & returned as an array of length 1, as seen in the example below? Thank you. ```json { "id": 4104, "label": "account",...
Solution:
An array is used to support properties whose cardinality is list or set:
gremlin> g.addV('test').property(list,'a','1').property(list,'a','2')
==>v[13]
gremlin> g.V(13).valueMap()
==>[a:[1,2]]...
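
If single values are preferred on the client side, one option is to collapse one-element arrays after the result arrives. This is a minimal sketch, not part of the JS client: `unwrapSingleValues` is a hypothetical helper and the sample data is made up. On the Gremlin side, `valueMap().by(unfold())` or `elementMap()` gives a similar shape without client code.

```javascript
// Hypothetical helper: collapse single-element arrays into scalars.
// Accepts a Map (as valueMap() results typically arrive) or a plain object.
function unwrapSingleValues(props) {
  const entries =
    props instanceof Map ? [...props.entries()] : Object.entries(props);
  const out = {};
  for (const [key, value] of entries) {
    // Arrays with more than one element (list/set cardinality) stay arrays.
    out[key] = Array.isArray(value) && value.length === 1 ? value[0] : value;
  }
  return out;
}

// Sample data made up for illustration.
const raw = new Map([
  ["name", ["alice"]],
  ["tags", ["a", "b"]],
]);
const flat = unwrapSingleValues(raw);
console.log(flat);
```

Note that a list-cardinality property that happens to hold exactly one value is also unwrapped, so this only suits properties known to be single-valued.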

Anyone using Tinkerpop docker as a local Cosmos replacement

Running into some random issues. Looking for tips and tricks.
Solution:
One thing to consider in trying to do this is that you would likely use TinkerGraph and Gremlin Server for this local replacement. CosmosDB has a number of limitations and differences that this local environment would not catch, so it's possible that you could write some Gremlin that works locally but then fails when you try the same query on CosmosDB. That said, if you stay aware of those differences, stick to sending scripts and prefer the 3.4.x server release it could give you a basic but not...

Configuring Websockets connection to pass through a proxy server

Hey, I'm working on making G.V() fully proxy aware, but I can't seem to get WebSocket connections to pass through a SOCKS/HTTP proxy configuration. I've got all the proxy configuration Java system properties set and working for HTTP connections. Is there any specific configuration to add to let the Gremlin driver use a configured proxy?...

python goblin vs spring-data-goblin for interactions with gremlin server

I want an OGM to interact with my gremlin server. What would be a good choice?
Solution:
I've not kept up with the latest changes to these libraries. Goblin might be the most currently maintained one. If you're using Python I suppose I'd start there. Not sure if anyone here can chime in with some success stories around using OGMs. Most applications I hear about tend to just use Gremlin directly.

Is there any open source version of data visualizer for AWS Neptune?

Is there an open-source data visualizer for AWS Neptune? I need one since it is essential for my small-scale use of Neptune. I have used G.V(), and it was perfect for my use case, but because of budget constraints I can't afford it. Any solutions?
Solution:
AWS maintains Graph Explorer and Graph Notebook (https://docs.aws.amazon.com/neptune/latest/userguide/visualization-graph-explorer.html and https://github.com/aws/graph-notebook), there's some overlap with what G.V() offers. I was gonna suggest to hit me up re your budget constraints to see if we can work on something there too!

Dynamic select within query not working.

Any insights or help would be greatly appreciated. I have to pass a list of lists in the format below. Hundreds of them which is why I'm trying to iterate in a single query. Please explain why accessing element 0 within the row data works here:...
Solution:
Sorry it took a while for someone to get to this. I think your problem here is that you are trying to use has(String, Traversal) in __.V().hasLabel('UsdValue').has('date', select('row').limit(local, 1).unfold()) but it doesn't work the way you expect. Basically, the result of the traversal you give to has() is not given as the value to the comparator. More generally, P does not take a Traversal, making any such usage impossible. It is designed to work such that the value of "UsdValue" i...

Adding multiple properties to a vertex using gremlin-go

Hello Community, I have a question regarding how multiple properties can be added to a vertex using gremlin-go. I did something like this ...
Solution:
To add all properties from a map to the same vertex, you can use something like `t := g.AddV("Person") for k, v := range prop { t = t.Property(k,v) }...

Is it possible to walk 2 different graphs using custom TraversalStrategy in Gremlin?

I have 2 different graphs in 2 different Neptune clusters. Both of them can have a few reference vertices referring to a vertex in the other graph. E.g., as we walk through graph A and reach a reference vertex (referring to a vertex in graph B), we should be able to continue traversing normally inside graph B and get the results of the query. Basically, graph A + graph B should act as a single virtual graph.
Solution:
At this time, there would be no easy way to do this and I don't think a custom TraversalStrategy would help in any way I can imagine. Maybe the closest thing I can imagine would be to subgraph the two graphs with their reference vertices and merge them to a single TinkerGraph in your application and then run additional queries on that directly. I'm not sure that suits a lot of use cases we hear about though in relation to this feature so that suggestion may not be helpful. cc/ @Dave Bechberg...

SideEffect a variable, Use it later after BarrierStep?

I seek a query that builds a list and then needs to both sum the list's mapped values and divide the resulting sum by the count of the original list. This would be the mean() step - if the mapped list was still a Gremlin traversal object that offered mean(). However, the mapped list is, by that time, a Groovy list and mean() is no longer available....
Solution:
It can be done with all Gremlin steps in 3.7.1 where you have date functions available. Assuming: ```groovy g.addV().as('a'). addE('link').from('a').to('a').property('createdDate', '2023-01-01T00:00:00Z'). addE('link').from('a').to('a').property('createdDate', '2023-01-01T00:30:00Z')....