Testing against AWS Neptune

Hi, the app that I'm maintaining has a Neptune integration for which I've written some integration tests. For those tests I'm using gremlin-server, which so far is fine, but because of the cardinality difference between Neptune and other Gremlin implementations I get different behaviour in prod vs testing. Of course, this is a known difference, but I'm wondering if there's a solution that emulates the Neptune engine locally for such use cases. I could spin up a new Neptune instance at testing time, but that seems a bit of an anti-pattern. What do people do to test their clients against multiple engines, if at all?
8 Replies
Solution
3x1, 14mo ago
Hi, the main solution I've seen for this use case (and the one I use myself) is to start gremlin-server with a custom configuration. Basically, you create a custom 'gremlin-server.yaml' file with the server config. This file has a field called graphs, which you can use to specify the path to a custom properties file. That properties file is where you define your graph properties and, in your case, the defaultVertexPropertyCardinality. Here is the configuration that I use to mimic Neptune:
gremlin.graph=org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
gremlin.tinkergraph.vertexIdManager=ANY
gremlin.tinkergraph.edgeIdManager=ANY
gremlin.tinkergraph.defaultVertexPropertyCardinality=set
I believe this is explained somewhere in the gremlin-server documentation, but I don't remember where. A good starting point is this book, which I found very useful for getting started with gremlin-server: https://www.kelvinlawrence.net/book/PracticalGremlin.html#serverconfig (Kelvin Lawrence is in the Discord, by the way).
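To make the effect of the defaultVertexPropertyCardinality=set line concrete, here is a toy model in plain Python. It is not TinkerPop code and does not talk to any server; the function names are made up for illustration. It only mimics what happens when the same property key is written twice under "single" cardinality (the TinkerGraph default) versus "set" cardinality (the Neptune default).

```python
# Toy model only: plain Python that mimics repeated property writes under two
# cardinalities. add_property and write_twice are hypothetical helper names,
# not part of any Gremlin library.


def add_property(vertex, key, value, cardinality):
    """Mimic writing one property value under the given default cardinality."""
    if cardinality == "single":
        # single: the new value replaces whatever was stored before
        vertex[key] = [value]
    elif cardinality == "set":
        # set: the new value is kept next to existing ones, without duplicates
        values = vertex.setdefault(key, [])
        if value not in values:
            values.append(value)
    else:
        raise ValueError(f"unsupported cardinality: {cardinality}")
    return vertex


def write_twice(cardinality):
    """Write two different values to the same key and return what is stored."""
    vertex = {}
    add_property(vertex, "name", "alice", cardinality)
    add_property(vertex, "name", "bob", cardinality)
    return vertex["name"]


if __name__ == "__main__":
    print("single:", write_twice("single"))  # single: ['bob']
    print("set:", write_twice("set"))        # set: ['alice', 'bob']
```

A test that asserts on a single property value after an update can therefore pass against a default TinkerGraph and fail against Neptune, which is the mismatch the properties file above is meant to remove.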
3x1, 14mo ago
The graphs field I mentioned looks like this in the yaml file:
graphs: { graph: ./gremlin/neptune_graph.properties }
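If the test setup generates its config files rather than checking them in, the two pieces above could be produced with a small script. This is only a sketch: the helper names write_graph_properties and graphs_entry are hypothetical, the property values are the ones quoted in this thread, and the rest of gremlin-server.yaml (host, port, serializers and so on) is assumed to come from the stock file shipped with gremlin-server.

```python
import tempfile
from pathlib import Path

# Values copied from the properties quoted in this thread.
NEPTUNE_LIKE_PROPERTIES = {
    "gremlin.graph": "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph",
    "gremlin.tinkergraph.vertexIdManager": "ANY",
    "gremlin.tinkergraph.edgeIdManager": "ANY",
    "gremlin.tinkergraph.defaultVertexPropertyCardinality": "set",
}


def write_graph_properties(directory):
    """Write neptune_graph.properties into `directory` and return its path."""
    path = Path(directory) / "neptune_graph.properties"
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"{key}={value}" for key, value in NEPTUNE_LIKE_PROPERTIES.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def graphs_entry(properties_path):
    """Render the `graphs` line that points gremlin-server.yaml at the file."""
    return f"graphs: {{ graph: {properties_path} }}"


if __name__ == "__main__":
    # A temporary directory stands in for the project's config folder.
    with tempfile.TemporaryDirectory() as tmp:
        written = write_graph_properties(Path(tmp) / "gremlin")
        print(written.read_text(encoding="utf-8"))
    print(graphs_entry("./gremlin/neptune_graph.properties"))
```

The path in the graphs entry is resolved by gremlin-server, so it has to match wherever the properties file ends up relative to the directory the server is started from.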
Dragos Ciupureanu (OP), 14mo ago
I see, thanks. From your answer, though, I infer that there's no way to emulate the real Neptune engine locally: as soon as some gremlin-server setting can't be changed to match Neptune, we're out of luck, and the only option left is to test against a real Neptune cluster.
spmallette, 14mo ago
Looks like we overlooked this question. Thanks for providing some explanation, @sgaufret.
3x1, 14mo ago
That's my understanding, yes: either run a local server with a config as close to Neptune as possible, or keep a small Neptune instance ready for integration tests. To add to that, I remember seeing some differences between gremlin-server and Neptune. For example, I had a query like:
traversal.by(
  __.coalesce(
    __.select('v').values(params.eventTimeAttribute),
    __.constant(0)
  )
)
which would run fine in gremlin-server when eventTimeAttribute was null, but not in Neptune. So, TL;DR:
- gremlin-server for local unit tests: fast and zero cost, but not a 100% match
- Neptune to make sure tests run exactly as they would in production
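One way to keep both options open is to let the test suite pick its Gremlin endpoint from the environment, defaulting to the local gremlin-server. The sketch below is an assumption about how that could look, not something from this thread: the NEPTUNE_ENDPOINT and NEPTUNE_PORT variable names and the resolve_gremlin_url helper are hypothetical. It relies only on the default gremlin-server address (ws://localhost:8182/gremlin) and the usual wss://host:8182/gremlin form of a Neptune endpoint, and it leaves out authentication such as IAM signing.

```python
import os

# Default address of a gremlin-server started locally with stock settings.
LOCAL_GREMLIN_URL = "ws://localhost:8182/gremlin"


def resolve_gremlin_url(env=None):
    """Return the WebSocket URL the integration tests should connect to.

    NEPTUNE_ENDPOINT and NEPTUNE_PORT are hypothetical variable names chosen
    for this sketch. With no endpoint set, tests use the local server.
    """
    env = os.environ if env is None else env
    endpoint = env.get("NEPTUNE_ENDPOINT", "").strip()
    if not endpoint:
        return LOCAL_GREMLIN_URL
    port = env.get("NEPTUNE_PORT", "8182")
    # Neptune endpoints are reached over TLS, hence wss rather than ws.
    return f"wss://{endpoint}:{port}/gremlin"


if __name__ == "__main__":
    print(resolve_gremlin_url({}))
    print(resolve_gremlin_url({"NEPTUNE_ENDPOINT": "my-cluster.example.internal"}))
```

With something like this, everyday runs stay on the local server, and a scheduled or pre-release job can set the endpoint variable to run the same tests against a real Neptune instance.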
Dragos Ciupureanu (OP), 14mo ago
Yep, that's my thinking too. Thanks for confirming. I'm inclined to keep a serverless instance lying around just for integration tests (which is good in theory, but as soon as you have multiple tests running against the same instance you're screwed). Alternatively, each test can spin up its own instance, but that just takes too much time... Thanks all for your input.
Kennh, 14mo ago
I'm not sure if you've seen this blog post before but it touches upon a lot of what was said here (https://aws.amazon.com/blogs/database/automated-testing-of-amazon-neptune-data-access-with-apache-tinkerpop-gremlin/)
Kennh, 14mo ago
"What do people do to test the clients against multiple engines, if at all?"
I think there are enough subtle differences between providers that client code needs some adjustments to work against each of them. In general, though, providers test their read queries against this suite (https://github.com/apache/tinkerpop/tree/master/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features). A provider may opt out of some of these tests if it doesn't support a specific feature. There aren't, however, any tests for write operations (most providers use different ways of representing Id) so you may find a lot of differences there.
Actually, a correction to what I said earlier:
"There aren't, however, any tests for write operations (most providers use different ways of representing Id) so you may find a lot of differences there."
This isn't true; there actually are tests for write operations like addV(), mergeV(), etc.
