ColeGreer Comments - Answer Overflow

ColeGreer

Apache TinkerPop

•Created by thorOdinson on 3/5/2025 in #questions

MergeE with a conditional on the value being set

Hi @thorOdinson, I can add some context that might be helpful here. The "onMatch traversal" is not intended to directly modify the found edge. It is meant to produce a map that defines any properties to be updated. In your case, you would want a traversal that produces ["modifiedTime" : 'Wed Mar 05 12:15:39 EST 2025'] if the timestamp needs to be updated, or an empty map [:] otherwise.

Your approach of directly modifying the edge can certainly work, with the caveat that this is not how the mergeE step was intended to be used. One thing you are missing if you want to continue with this approach is that the "onMatch" traversal must still produce a map. The simplest fix is to return an empty map if you don't want mergeE to make any changes beyond what is already done by your "match traversal". This might look something like this:

option(onMatch, __.filter(__.coalesce(
                __.values("modifiedTime").is(lt('Wed Mar 05 12:15:39 EST 2025')),
                __.constant(true))).
                property("modifiedTime", 'Wed Mar 05 12:15:39 EST 2025')
                .constant([:]))

I haven't taken a close look at that traversal to ensure it's doing what you want, but hopefully this helps explain what's going on and helps you move forward. If you're still having issues with this query, it would be helpful to know which version of TinkerPop you are using, and whether this is with TinkerGraph or some other graph provider. It would also be helpful to know specifically what sort of error or unintended behaviour you're seeing.
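
The map-producing contract described above can be modelled in plain Python. This is only a sketch of the idea, not TinkerPop code: `on_match_updates` and `apply_on_match` are hypothetical names, and the edge is represented as a plain dict. It uses datetime values because text timestamps in the 'Wed Mar 05 ...' format do not sort chronologically when compared as strings.

```python
from datetime import datetime, timezone


def on_match_updates(existing_props, new_time):
    """Return the map of properties to update, or {} to leave the edge as-is."""
    current = existing_props.get("modifiedTime")
    # Update when the property is missing or older than the new timestamp.
    if current is None or current < new_time:
        return {"modifiedTime": new_time}
    return {}


def apply_on_match(existing_props, new_time):
    """Model of a merge step applying whatever map the onMatch logic produced."""
    updated = dict(existing_props)
    updated.update(on_match_updates(existing_props, new_time))
    return updated


older = datetime(2025, 3, 1, tzinfo=timezone.utc)
newer = datetime(2025, 3, 5, 17, 15, 39, tzinfo=timezone.utc)

print(on_match_updates({"modifiedTime": older}, newer))  # non-empty map: update
print(on_match_updates({"modifiedTime": newer}, older))  # {}
```

The point of the sketch is that the "onMatch" logic only decides what the update map contains; the merge step itself applies it.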

6 replies

Apache TinkerPop

•Created by masterhugo on 1/30/2025 in #questions

Gremlin python MergeV update properties

In that case, none of the new syntax to specify cardinalities in property maps is supported. If upgrading Neptune is an option for you, it looks like 1.3.2.0 is the earliest version that supports the new syntax. https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-client.html If you are stuck on the older version, a query such as this might work for you:

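from gremlin_python.process.graph_traversal import __
from gremlin_python.process.traversal import Cardinality, Merge, T

# Parentheses let the chained steps span multiple lines in Python.
(g.merge_v({T.id_: "x1234"})
        .option(Merge.on_create, {T.label: 'Dog', 'name': 'Toby', 'age': 10})
        .option(Merge.on_match, __.side_effect(__.property(Cardinality.single, "age", 11)).constant(dict()))
        .toList())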

I don't like recommending a query such as this, as it abuses the ability to pass a sub-traversal that produces a map in order to modify the matched vertex directly. This isn't how mergeV was intended to be used, but it might solve your issue.

9 replies

Apache TinkerPop

•Created by masterhugo on 1/30/2025 in #questions

Gremlin python MergeV update properties

That's strange, I'm able to run the exact query I shared above with gremlin python 3.7.1 and Neptune 1.3.2.1. Could I ask you to double check the versions you are using here? The syntax I shared above is a relatively recent addition to TinkerPop (3.7.0) (docs). The error you are seeing seems consistent with what I would expect from an older server which does not yet support the syntax to set individual cardinalities on each property. I'm not recognizing what that dict represents. Where are you extracting that dict from? It might be helpful if you could share the full query you are attempting to run (with any sensitive property keys and values replaced with dummy values).

9 replies

Apache TinkerPop

•Created by masterhugo on 1/30/2025 in #questions

Gremlin python MergeV update properties

If I'm understanding your question correctly, I think what you are seeing is a result of Neptune defaulting to set cardinality for properties. Essentially, that means that if I start with a vertex with property("name", "Alice") and try to overwrite the property with property("name", "Bob"), Neptune will instead add the new value to a set such that vertex.name = {"Alice", "Bob"}. I think this set cardinality behaviour is what you are seeing when using mergeV(). If you want to use mergeV and enforce single cardinality for properties (overwrite existing values instead of appending), you can try a query like this:

from gremlin_python.process.traversal import Merge, T, CardinalityValue

# Parentheses let the chained steps span multiple lines in Python.
(g.merge_v({T.id_: "x1234"})
        .option(Merge.on_create, {T.label: 'Dog', 'name': 'Toby', 'age': 10})
        .option(Merge.on_match, {'age': CardinalityValue.single(11)})
        .toList())
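
The difference between set and single cardinality can be illustrated with a small plain-Python model. This is a sketch only: `set_property` is a hypothetical helper and the vertex is just a dict of sets, not how Neptune or TinkerPop store properties internally.

```python
def set_property(vertex, key, value, cardinality="set"):
    """Write a property value using either set or single cardinality."""
    values = vertex.setdefault(key, set())
    if cardinality == "single":
        # Single cardinality replaces whatever was stored before.
        values.clear()
    # Set cardinality keeps existing values and adds the new one.
    values.add(value)
    return vertex


vertex = {}
set_property(vertex, "name", "Alice")
set_property(vertex, "name", "Bob")
print(vertex["name"] == {"Alice", "Bob"})  # True

set_property(vertex, "name", "Bob", cardinality="single")
print(vertex["name"])  # {'Bob'}
```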

9 replies

Apache TinkerPop

•Created by danielcraig23 on 1/28/2025 in #questions

How can I use a subquery to translate airport code DAL into icao airport code KDAL, w air-routes?

Hi @danielcraig23, I might try to search for the specific airports with the necessary iata->icao mapping instead of the "cross join then filter" approach. This is the traversal I came up with:

g.V().
  hasLabel("aircraft").as("ac").
  values("aircraftLocation").as("iata").
  select("ac").
  project("tailNumber", "aircraftLocationIcao").
    by("tailNumber").
    by(
      V().
      hasLabel("airport").
      where(values("code").where(eq('iata'))).
      values("icao"))

In my very small scale testing I'm seeing this run about an order of magnitude faster than the "cross join then filter" approach.
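
The difference between the two approaches can be sketched in plain Python. This is an analogy rather than a model of any graph provider: the sample aircraft and airports are made up, both function names are hypothetical, and the dict stands in for whatever lookup the provider uses to find an airport by its code.

```python
aircraft = [
    {"tailNumber": "N100", "aircraftLocation": "DAL"},
    {"tailNumber": "N200", "aircraftLocation": "AUS"},
]
airports = [
    {"code": "DAL", "icao": "KDAL"},
    {"code": "AUS", "icao": "KAUS"},
]


def cross_join_then_filter(aircraft, airports):
    """Pair every aircraft with every airport, then keep the matching pairs."""
    return [
        {"tailNumber": ac["tailNumber"], "aircraftLocationIcao": ap["icao"]}
        for ac in aircraft
        for ap in airports
        if ap["code"] == ac["aircraftLocation"]
    ]


def lookup_per_aircraft(aircraft, airports):
    """For each aircraft, look up only the airport with the matching code."""
    icao_by_code = {ap["code"]: ap["icao"] for ap in airports}
    return [
        {"tailNumber": ac["tailNumber"],
         "aircraftLocationIcao": icao_by_code[ac["aircraftLocation"]]}
        for ac in aircraft
        if ac["aircraftLocation"] in icao_by_code
    ]


print(cross_join_then_filter(aircraft, airports) == lookup_per_aircraft(aircraft, airports))  # True
```

Both functions return the same rows; the first examines every aircraft/airport pair, while the second does one lookup per aircraft.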

12 replies

Apache TinkerPop

•Created by lijinv on 1/26/2025 in #questions

Gremlin.net for .net 8

I see. @Florian Hockmann do you know if it's intentional that the nuget docs only list .NET 6 and .NET Standard 2?

8 replies

Apache TinkerPop

•Created by lijinv on 1/26/2025 in #questions

Gremlin.net for .net 8

Hi @lijinv, could you expand on any issues you may be seeing when attempting to use Gremlin.net with .net 8? .net 8 should be fully supported as of v3.7.2, and we currently run all of our Gremlin.net testing in .net 8. https://issues.apache.org/jira/browse/TINKERPOP-3030. If you have found any incompatibilities with v3.7.3 in .net 8, that should likely be considered a bug.

8 replies

Apache TinkerPop

•Created by Memo on 10/16/2024 in #questions

Basic vertex querying does not work in Amazon Neptune but it works with local Gremlin Server

@Memo Can I ask how you are constructing and configuring your GraphTraversalSource (gremlinService.readClientSource)? I ran a quick test with the most basic setup (using gremlin 3.7.2 and Neptune 1.3.2.1) and it works as I would expect:

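const gremlin = require('gremlin');
const traversal = gremlin.process.AnonymousTraversalSource.traversal;
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;

async function main() {
    const g = traversal().withRemote(new DriverRemoteConnection('wss://my-neptune.cluster-xxxxxxxxxxxx.region.neptune.amazonaws.com:8182/gremlin'));
    const results = await g.V().limit(1).elementMap().next();
    console.log(results.value);
}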

Map(4) {
  EnumValue { typeName: 'T', elementName: 'id' } => '2',
  EnumValue { typeName: 'T', elementName: 'label' } => 'person',
  'name' => 'vadas',
  'age' => 27
}

10 replies

Apache TinkerPop

•Created by Wolfgang Fahl on 10/16/2024 in #questions

pymogwai

Hey thanks for sharing, that's pretty cool. That's the first time I've seen an embedded graph implementation in python with native support for gremlin. Do you have any future plans for the project?

6 replies

Apache TinkerPop

•Created by Max on 9/28/2024 in #questions

Best practices for local development with Neptune.

To add to Daniel's comment, the String ID manager will be included in the 3.7.3 release which is expected to be published by the end of October.

11 replies

Apache TinkerPop

•Created by Jonathan Fridja on 10/6/2024 in #questions

[Bug] clone query affects original cloned query

Hi @Jonathan Fridja, thanks for reporting the issue, I agree that this appears to be a bug. Unlike the clone() implementation in the java driver which this was based on, the js clone only creates a shallow copy which does not clone the underlying bytecode. For reference this is how clone is implemented in Java: https://github.com/apache/tinkerpop/blob/76190de1086d8be4e207e69d1cc599c9d036a8b5/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/util/DefaultTraversal.java#L267-L290. Would you be willing to submit a PR to the 3.6-dev branch with a fix that creates a deep copy of the bytecode? If you could, that would be great, otherwise I can open up a JIRA to track this bug.
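
The shallow versus deep copy problem can be reproduced with a small plain-Python analogy. The `Bytecode` and `Traversal` classes below are hypothetical stand-ins written for this sketch, not the actual JavaScript or Java driver classes.

```python
import copy


class Bytecode:
    """Stand-in for the list of steps a traversal has accumulated."""

    def __init__(self):
        self.steps = []


class Traversal:
    def __init__(self, bytecode=None):
        self.bytecode = bytecode if bytecode is not None else Bytecode()

    def add_step(self, name, *args):
        self.bytecode.steps.append((name, args))
        return self

    def shallow_clone(self):
        # Shares the same Bytecode object, so later steps leak into the original.
        return Traversal(self.bytecode)

    def deep_clone(self):
        # Copies the Bytecode, so the clone can diverge safely.
        return Traversal(copy.deepcopy(self.bytecode))


base_a = Traversal().add_step("V")
base_a.shallow_clone().add_step("hasLabel", "person")
print(len(base_a.bytecode.steps))  # 2, the original was modified

base_b = Traversal().add_step("V")
base_b.deep_clone().add_step("hasLabel", "person")
print(len(base_b.bytecode.steps))  # 1, the original is untouched
```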

5 replies

Apache TinkerPop

•Created by joshb on 8/29/2024 in #questions

Is it possible to configure SSL with PEM certificate types?

Hi @joshb, am I correct in assuming you are using the Java driver to connect to Aerospike? The java driver uses the JSSE keyStore and trustStore, which as far as I understand does not support the PEM format. You may be able to use a 3rd party tool such as openssl to convert from PEM to PKCS12 (https://docs.openssl.org/1.1.1/man1/pkcs12/). Perhaps @aerospike folks may have more direct recommendations for driver configuration.

6 replies

Apache TinkerPop

•Created by Balan on 8/23/2024 in #questions

How can we extract values only

@Balan Are you trying to calculate the distance between 2 points in gremlin, or are you trying to fetch the lat/lon from some point in your graph and then compute the distance between points in some external language of your choice (python, java, js...)? If you are trying to do the distance calculation in gremlin, could you share an example of your current query? If you are trying to fetch a point and get the results back in some json without the "@type"/"@value" metadata, then I believe @Kennh and @triggan's suggestions around using untyped GraphSON are what you are looking for.

9 replies

Apache TinkerPop

•Created by dracule_redrose on 3/19/2024 in #questions

Design decision related to multiple heterogenous relational graphs

Hosting 100k small graph instances isn't a usage pattern I've seen a whole lot. JanusGraph seems like a reasonable choice to me, although I see you've been running into issues with conflicting vertex/edge id's. I'm unsure if JanusGraph supports non-globally unique id's in multiple graph deployments. My understanding is that JanusGraph generally recommends avoiding using user-defined id's whenever possible, in favour of automatically generated id's from JanusGraph. Perhaps some @janusgraph folks with more familiarity with configuring multiple graphs can give some clearer advice for your setup.

9 replies

Apache TinkerPop

•Created by Dseguy on 12/28/2023 in #questions

Splitting a query with range()

In my opinion using range() like this is the easiest and most flexible method of splitting queries. As long as you are aware of the considerations mentioned above I think it's a good way to go. TinkerGraph works well with this as it produces results in a guaranteed order (g.V() in TinkerGraph will always return vertices in the order they were added to the graph). I believe the same is true for Neo4j and the Neo4j plugin although my experience there is limited and I have not seen it documented anywhere or properly tested. If you need a robust guarantee of ordering with Neo4j that likely warrants further investigation.

9 replies

Apache TinkerPop

•Created by Dseguy on 12/28/2023 in #questions

Splitting a query with range()

I have often used range() steps to break up queries. It can be a useful technique but does come with several caveats. The most important piece is that this will only work if your database guarantees that the common part of your query will always produce results in a consistent order. The default implementation of range(x, x + 1000) will first iterate and discard the first x results, then pass the next 1000. If the result ordering changes on each execution, then you will essentially be taking a random sample of 1000 results each time, instead of progressively going batch by batch.

You already mentioned the performance concerns with the common part of the query being executed each time. Due to the way this is implemented, this performance penalty is proportional to x (minimal penalty when x is small as almost no results are skipped, larger penalty with large x as many results need to be processed and skipped). Results will depend greatly on your DB and your data, but in general, if the left-hand side of the query is fast and efficient in your DB, and the right-hand side is slow and complex, then this technique works quite well. I've mostly used such queries in the form of g.V().range(x, x+1000).foo()... and the results have generally been acceptable for my purposes.

Assuming that your database is able to efficiently look up vertices by label and that there are no ordering concerns there, your proposed solution seems reasonable in my opinion. Other alternatives may be to filter based on vertex id's (depends largely on what type/structure your graph uses for id's), or to add some sort of metadata to your graph to help with partitioning.
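
The skip-then-take behaviour and its cost can be sketched in plain Python. This is a simplified model, not TinkerPop's implementation: `range_step`, `scan`, and `fetch_in_batches` are hypothetical names, and `scan` stands in for the common left-hand part of the query, which is re-run for every batch.

```python
from itertools import islice


def range_step(results, low, high):
    """Model of range(low, high): skip the first `low` results, pass the rest up to `high`."""
    return list(islice(results, low, high))


def scan(n, counter):
    """Stand-in for the common part of the query; counts every result it produces."""
    for i in range(n):
        counter["produced"] += 1
        yield i


def fetch_in_batches(n, batch_size):
    seen, x = [], 0
    counter = {"produced": 0}
    while True:
        # The scan restarts from the beginning for every batch.
        batch = range_step(scan(n, counter), x, x + batch_size)
        if not batch:
            break
        seen.extend(batch)
        x += batch_size
    return seen, counter["produced"]


seen, produced = fetch_in_batches(10, 3)
print(seen)                  # every result exactly once, in order
print(produced > len(seen))  # True: skipped results were still produced
```

The sketch relies on `scan` returning results in the same order on every run. If the order changed between runs, the batches would overlap or miss results, which is the ordering caveat described above.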

9 replies

Apache TinkerPop

•Created by billmanh on 8/26/2023 in #questions

Trying to run a local version for a test, what is the correct serializer?

Platform differences shouldn't matter here. I'm wondering if that server metrics error is somehow related to the server disconnecting. I believe I can reproduce that metrics error to confirm this.

33 replies

Apache TinkerPop

•Created by billmanh on 8/26/2023 in #questions

Trying to run a local version for a test, what is the correct serializer?

@billmanh I'm not sure this is related to your issue here but I wanted to point out that in 3.7.0, the serializers were migrated from a gremlin.driver.ser package to gremlin.util.ser. Therefore the serializer in the config in your original question should be changed to org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1
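
As a small illustration of the rename, the helper below rewrites a serializer class name from the old package to the new one. It is a hypothetical convenience written for this sketch (`migrate_serializer_class` is not part of TinkerPop) and only performs a string substitution.

```python
OLD_PACKAGE = "org.apache.tinkerpop.gremlin.driver.ser."
NEW_PACKAGE = "org.apache.tinkerpop.gremlin.util.ser."


def migrate_serializer_class(class_name):
    """Map a pre-3.7.0 serializer class name to its 3.7.0+ package location."""
    if class_name.startswith(OLD_PACKAGE):
        return NEW_PACKAGE + class_name[len(OLD_PACKAGE):]
    # Names outside the old package are returned unchanged.
    return class_name


print(migrate_serializer_class(
    "org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1"))
# org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1
```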

33 replies

Apache TinkerPop

•Created by shivam.choudhary on 7/31/2023 in #questions

User-Agent Metric Not Exposed in Gremlin Server - Need Help Troubleshooting

Hi @shivam.choudhary, yes, unfortunately throughput isn't something which is currently recorded in the metrics. This is something that could be added by extending one of the Channelizers and adding a new handler to the pipeline (let me know if you want any guidance with doing something like this). It would also be good to open up a JIRA for this. It would be useful to have throughput metrics built into the server out of the box, as well as an easier extension point for adding custom handlers.

9 replies

Apache TinkerPop

•Created by shivam.choudhary on 7/31/2023 in #questions

User-Agent Metric Not Exposed in Gremlin Server - Need Help Troubleshooting

Hey @shivam.choudhary, sorry I may have brushed over some details earlier. The server will look for a 'User-Agent' header from the initial connection request on the completion of the web socket handshake. It will not look for that header in subsequent messages sent via the existing connection. How are you currently setting the header? Also a bit of an unrelated question but what sort of insights are you most interested in observing? I've always wanted to come back to this feature at some point to add in additional extensibility on the server side.

9 replies