Dave
ATApache TinkerPop
•Created by red on 10/22/2024 in #questions
Profiling Neptune from javascript
Profiles for Gremlin queries either require using the HTTPS endpoint: https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-profile-api.html
Or you can use the AWS SDK: https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/client/neptunedata/command/ExecuteGremlinProfileQueryCommand/
There is no way to get this profile using the Gremlin drivers.
Neptune supports both driver/WS based connections to the server and HTTPS/REST calls directly to the cluster endpoints. Besides the connection protocol there are a few differences, most notably that HTTPS responses will include all properties of elements in the graph by default and WS responses do not.
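As a rough sketch of the HTTPS route from JavaScript/TypeScript, the helper below only builds the POST request for the /gremlin/profile endpoint described in the first link. The names buildProfileRequest, ProfileOptions and ProfileRequest are made up for illustration, the port 8182 default is an assumption about your cluster, and it assumes IAM authentication is disabled (with IAM auth enabled the request would also need SigV4 signing, which is not shown).

```typescript
// Hypothetical helper: builds (but does not send) a Gremlin profile request.
interface ProfileOptions {
  results?: boolean;  // sent as "profile.results"
  chop?: number;      // sent as "profile.chop"
  indexOps?: boolean; // sent as "profile.indexOps"
}

interface ProfileRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildProfileRequest(
  host: string,
  query: string,
  opts: ProfileOptions = {},
  port: number = 8182 // assumed port; adjust for your cluster
): ProfileRequest {
  // The Gremlin query string goes in the "gremlin" field of the JSON body.
  const payload: Record<string, string | number | boolean> = { gremlin: query };
  if (opts.results !== undefined) payload["profile.results"] = opts.results;
  if (opts.chop !== undefined) payload["profile.chop"] = opts.chop;
  if (opts.indexOps !== undefined) payload["profile.indexOps"] = opts.indexOps;

  return {
    url: `https://${host}:${port}/gremlin/profile`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Usage sketch (needs a runtime with a global fetch and network access
// to the cluster; the host name is a placeholder):
//   const req = buildProfileRequest("your-neptune-endpoint", "g.V().limit(1)");
//   const res = await fetch(req.url, req);
//   const report = await res.text(); // the profile comes back as a text report
```

If you would rather not build the request by hand, the SDK command linked above wraps the same endpoint.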
5 replies
ATApache TinkerPop
•Created by Julius Hamilton on 9/24/2024 in #questions
Why is T.label immutable and do we have to create a new node to change a label?
@Julius Hamilton What database are you using? Amazon Neptune does allow you to have multiple labels but the syntax is different.
20 replies
ATApache TinkerPop
•Created by Andys1814 on 8/26/2024 in #questions
Very slow regex query (AWS Neptune)
If 99% of the time is being spent on the regex then there is not much you can do from an optimization perspective outside of adding full-text search (FTS) support or redoing the data model to make them exact lookups.
7 replies
ATApache TinkerPop
•Created by Andys1814 on 8/26/2024 in #questions
Very slow regex query (AWS Neptune)
When Neptune stores data it stores it in 3 different indexed formats (https://docs.aws.amazon.com/neptune/latest/userguide/feature-overview-data-model.html#feature-overview-storage-indexing), each of which is optimized for a specific set of common graph patterns. Each of these indexes is optimized for exact match lookups, so when running queries that require partial text matches, such as a regex query, all the matching property data needs to be scanned to see if it matches the provided expression.
To get a performant query for partial text matches the suggestion is to use the Full Text search integration (https://docs.aws.amazon.com/neptune/latest/userguide/full-text-search.html), which will integrate with OpenSearch to provide robust full text searching capabilities within a Gremlin query.
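To make the exact-match versus scan point concrete, here is a toy illustration. It is not how Neptune's indexes are implemented; ToyValueIndex and its methods are invented for this sketch. It only shows why a lookup keyed by the full value is a single step while a regex has to test every stored value.

```typescript
type VertexId = string;

// Toy index keyed by the complete property value (illustration only).
class ToyValueIndex {
  // Object.create(null) avoids clashes with inherited keys like "constructor".
  private byValue: Record<string, VertexId[]> = Object.create(null);

  add(id: VertexId, value: string): void {
    const bucket = this.byValue[value] || (this.byValue[value] = []);
    bucket.push(id);
  }

  // Exact match: one keyed lookup, independent of how many values exist.
  exact(value: string): VertexId[] {
    return this.byValue[value] || [];
  }

  // Partial match: every distinct value has to be tested against the pattern.
  // (Pass a pattern without the "g" flag so test() stays stateless.)
  regex(pattern: RegExp): { ids: VertexId[]; valuesScanned: number } {
    const ids: VertexId[] = [];
    let valuesScanned = 0;
    for (const value of Object.keys(this.byValue)) {
      valuesScanned++;
      if (pattern.test(value)) ids.push(...(this.byValue[value] || []));
    }
    return { ids, valuesScanned };
  }
}
```

The valuesScanned counter grows with the amount of stored data, which is the cost a full-text index is meant to avoid.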
7 replies
ATApache TinkerPop
•Created by salman_walmart on 8/12/2024 in #questions
logging and alerting inside a gremlin step
I hadn't actually thought of custom logging as a use of call() but I think that makes sense. Since the available logging features/locations are going to be different per provider, this use would fit well into that paradigm. In this case though it would still have some amount of blocking even if it was a fire-and-forget type request.
16 replies
ATApache TinkerPop
•Created by Painguin | Tiến on 4/24/2024 in #questions
Query optimisation
I would suggest looking at the profile of the query and posting it here, along with which database you are using (e.g. Gremlin Server, JanusGraph, Neptune, etc.), to see where the time is being spent. Without more information it is difficult to give specifics as to why the query is slow.
Without any additional context I would guess that most of the difference in time is being spent doing has("account", "id", "my_account") since the first version is doing that filter twice.
12 replies
ATApache TinkerPop
•Created by RN on 2/15/2024 in #questions
Is it possible to walk 2 different graphs using custom TraversalStrategy in Gremlin?
Yeah if you are using a provider like that then neither a strategy nor the call() step approach would work. I think the only way to do it is what you pointed to about merging into a TinkerGraph.
14 replies
ATApache TinkerPop
•Created by RN on 2/15/2024 in #questions
Is it possible to walk 2 different graphs using custom TraversalStrategy in Gremlin?
I think you would need to write a custom procedure, which is the path I experimented with, in order to do this.
14 replies
ATApache TinkerPop
•Created by RN on 2/15/2024 in #questions
Is it possible to walk 2 different graphs using custom TraversalStrategy in Gremlin?
This only ever got as far as the POC stage but here is the custom procedure call that I wrote https://github.com/bechbd/tinkerpop/blob/query_federation/tinkergraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/tinkergraph/services/TinkerQueryFederationFactory.java
14 replies
ATApache TinkerPop
•Created by RN on 2/15/2024 in #questions
Is it possible to walk 2 different graphs using custom TraversalStrategy in Gremlin?
I don't know if it would be quite as seamless as that but I have done some previous POC work on the ability to federate queries across graphs that could be applicable here
14 replies
ATApache TinkerPop
•Created by Lonnie VanZandt on 1/7/2024 in #questions
May I suggest a new topic-channel for us? Like "really-big-data" or "pagination"?
Is this system expected to serve mostly transactional or analytic traffic?
8 replies
ATApache TinkerPop
•Created by Lonnie VanZandt on 1/7/2024 in #questions
May I suggest a new topic-channel for us? Like "really-big-data" or "pagination"?
That being said, if you are having to traverse through hundreds of thousands to millions of edges in a single traversal, that is going to take a significant amount of time and server memory, so I would expect you to run into other issues surrounding that.
8 replies
ATApache TinkerPop
•Created by Lonnie VanZandt on 1/7/2024 in #questions
May I suggest a new topic-channel for us? Like "really-big-data" or "pagination"?
There is not a feature in Gremlin itself that will handle this for you automatically, but the drivers do let you stream back results instead of collecting them all at once, which can help mitigate transferring large result sets. If you are using Amazon Neptune it also has a query results cache to assist with paging: https://docs.aws.amazon.com/neptune/latest/userguide/gremlin-results-cache.html#gremlin-results-cache-paginating
8 replies
ATApache TinkerPop
•Created by Lonnie VanZandt on 1/7/2024 in #questions
May I suggest a new topic-channel for us? Like "really-big-data" or "pagination"?
I am not sure I exactly understand your question. Let me try to rephrase it to see if I understand. You have a query where you want to paginate in the middle of the query after which you want to continue the traversal? (e.g. find me the first 10 followers of Taylor Swift (ordered by name) and then find me their friends and group by the common friends). You then want to paginate over the middle portion of the step (i.e. find the first 10 friends, then a second call to find friends 10-20, then 20-30, etc.)
Is that understanding correct?
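For reference, the "first 10, then 10-20, then 20-30" windows in that rephrasing map onto range(low, high) bounds as sketched below. pageRange and rangeStep are hypothetical helper names, and the sketch assumes the traversal applies a stable order() before the range step, since otherwise pages are not guaranteed to line up between calls.

```typescript
// Hypothetical helpers: turn a zero-based page number into range() bounds.
// range(low, high) keeps results from index low (inclusive) to high (exclusive).
function pageRange(page: number, pageSize: number): { low: number; high: number } {
  if (page < 0 || Math.floor(page) !== page) {
    throw new Error("page must be a non-negative integer");
  }
  if (pageSize <= 0 || Math.floor(pageSize) !== pageSize) {
    throw new Error("pageSize must be a positive integer");
  }
  const low = page * pageSize;
  return { low, high: low + pageSize };
}

// Renders the step as text, e.g. to append to a query string.
function rangeStep(page: number, pageSize: number): string {
  const { low, high } = pageRange(page, pageSize);
  return `range(${low}, ${high})`;
}
```

Each call re-runs the traversal up to the range step, so deep pages still pay for everything before them.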
8 replies
ATApache TinkerPop
•Created by Andys1814 on 11/7/2023 in #questions
Sequential IDs in Neptune?
@andys1814 How much do you care about the ids being truly sequential or is having some gaps acceptable as long as they are human readable?
I ask as this was a common request when I was working with Cassandra. A common practice was to allot a range of ids to each client on connection versus getting a new one each time. When a client exhausts its assigned range it then reaches out to get a new range.
This helps to minimize the single point of failure and additional overhead of having to go to a single coordinator to get an id value for each request. It does, however, mean that inserts will not be in sequential order and that you may have gaps in the numbers. This may or may not be an issue depending on your use case.
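A minimal in-memory sketch of that range-leasing idea follows. RangeCoordinator and ClientIdAllocator are hypothetical names; in a real deployment the coordinator's counter would have to live in some shared, durable store rather than in process memory, which is not shown here.

```typescript
interface IdRange {
  start: number; // first id in the leased block (inclusive)
  end: number;   // last id in the leased block (inclusive)
}

// Hands out non-overlapping blocks of ids (in-memory stand-in for a shared counter).
class RangeCoordinator {
  private next: number;

  constructor(private readonly blockSize: number, start: number = 1) {
    this.next = start;
  }

  lease(): IdRange {
    const start = this.next;
    this.next += this.blockSize;
    return { start, end: start + this.blockSize - 1 };
  }
}

// Each client draws ids locally and only contacts the coordinator
// when its current block is used up.
class ClientIdAllocator {
  private current: number;
  private end: number;

  constructor(private readonly coordinator: RangeCoordinator) {
    const range = coordinator.lease();
    this.current = range.start;
    this.end = range.end;
  }

  nextId(): number {
    if (this.current > this.end) {
      const range = this.coordinator.lease();
      this.current = range.start;
      this.end = range.end;
    }
    return this.current++;
  }
}
```

With a block size of 3 and two clients, the first client gets 1-3 and the second 4-6; once the first client runs out it jumps to 7, so ids are unique but neither strictly ordered by insert time nor gap-free if a client disconnects with ids left in its block.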
16 replies
ATApache TinkerPop
•Created by bulletlegend on 8/1/2023 in #questions
Gremlin query has strange behavior with range() and limit()
I was only able to reproduce it for that singular case
17 replies
ATApache TinkerPop
•Created by bulletlegend on 8/1/2023 in #questions
Gremlin query has strange behavior with range() and limit()
I have found out the offset and limit needed to reproduce this vary based on the size of the graph and the total results.
- Can you provide a few examples here? I tried this on the sample graph you provided
17 replies
ATApache TinkerPop
•Created by bulletlegend on 8/1/2023 in #questions
Gremlin query has strange behavior with range() and limit()
@bulletlegend I was able to reproduce this only if I used range(global, 0, 1). Any other combination of offset and limit did not seem to reproduce this error. Is this what you were able to observe?
17 replies
ATApache TinkerPop
•Created by sofiane0097 on 7/31/2023 in #questions
Gremlin Syntax Highlighter
Here is how the graph-notebook implements syntax highlighting: https://github.com/aws/graph-notebook/blob/main/src/graph_notebook/nbextensions/gremlin_syntax/static/main.js
9 replies
ATApache TinkerPop
•Created by sofiane0097 on 7/31/2023 in #questions
Gremlin Syntax Highlighter
In graph-notebook we use Groovy syntax highlighting, which does pretty well and would be a good place to start. I agree that having enums/tokens highlighted would be a nice addition, as would highlighting parameters for parameterized query support.
9 replies