Best practices for local development with Neptune.
I would like to use a local Gremlin server with TinkerGraph for local development, and then deploy changes to Neptune later. However, there are several differences between TinkerGraph and Neptune that impact the portability of the code.
The most important one is probably that vertex and edge ids are numeric in TinkerGraph, but they are strings in Neptune. Also, I think there are some differences in how properties are handled when the cardinality is list.
What is the recommended workflow to minimize discrepancies between my local environment and Neptune?
Solution:
There's a blog post here that contains some of the details on what properties you can change in TinkerGraph to get close: https://aws.amazon.com/blogs/database/automated-testing-of-amazon-neptune-data-access-with-apache-tinkerpop-gremlin/
It's unlikely that you'll find anything that emulates things like the result cache, the lookup cache, full-text search features, etc.
I would be curious to hear what the needs are for local dev.
7 Replies
There's https://docs.localstack.cloud/user-guide/aws/neptune/ if you're looking for a quick solution that emulates the Neptune querying API.
Thanks. Does it support Neptune-specific APIs like g.with('Neptune#enableResultCache', true)?
By the way, I figured out that I can start a Gremlin Server container with modified TinkerGraph properties:
This allows ids to be strings, which solves one of the major pain points.
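For reference, here is a minimal sketch of what such a modified tinkergraph.properties might look like. It is an assumption rather than the poster's actual file: the id manager keys are standard TinkerGraph settings, and the cardinality line is an optional extra motivated by the cardinality concern raised in the question.

```properties
# Sketch only: assumed values, not the file used by the poster.
gremlin.graph=org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

# The properties file shipped with Gremlin Server sets vertexIdManager=LONG.
# ANY keeps ids as supplied, so a string id such as "person-1" stays a string.
gremlin.tinkergraph.vertexIdManager=ANY
gremlin.tinkergraph.edgeIdManager=ANY
gremlin.tinkergraph.vertexPropertyIdManager=ANY

# Optional: Neptune defaults to set cardinality for vertex properties,
# while TinkerGraph defaults to single.
gremlin.tinkergraph.defaultVertexPropertyCardinality=set
```

The file takes effect when the graphs entry in gremlin-server.yaml points at it. With ANY, ids are only strings when the client supplies them as strings, so queries meant to run on both systems should pass explicit string ids when creating elements.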
Could you possibly clarify why the example code for Neptune includes a different gremlin-server.yaml configuration compared to the default Gremlin Server? I understand that it needs to reference a distinct tinkergraph.properties file (to allow IDs as strings), but are there any additional configuration changes that justify the divergence from the default?
Neptune sample code: https://github.com/aws-samples/automated-testing-graph-queries/blob/main/test/db/conf/gremlin-server.yaml
Default config: https://github.com/apache/tinkerpop/blob/master/gremlin-server/conf/gremlin-server.yaml
Many of the differences there reflect the preferences of the person who wrote that blog post. When I create a Gremlin Server Docker container to emulate Neptune (as closely as possible), I typically just use something like the following (Dockerfile), and the comments explain the changes:
In an upcoming version of Gremlin Server, you should even be able to use "STRING" instead of "ANY" for an idManager, so that whenever you don't provide an id during element creation, the autogenerated id will also be of type string (instead of UUID).
To add to Daniel's comment, the String ID manager will be included in the 3.7.3 release which is expected to be published by the end of October.
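Once that release is available, the change would presumably be a one-word swap in the TinkerGraph properties. A sketch, assuming the option is spelled STRING as described above and that Gremlin Server is at 3.7.3 or later:

```properties
# Sketch only: requires a TinkerPop release that includes the String ID manager (3.7.3+ per the comment above).
gremlin.graph=org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph

# STRING instead of ANY: ids generated by the server are strings too,
# not only the ids supplied by the client.
gremlin.tinkergraph.vertexIdManager=STRING
gremlin.tinkergraph.edgeIdManager=STRING
```

On older versions this value is not recognized, so ANY with explicitly supplied string ids remains the fallback.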