What is the ordering of group?
For instance (on the TinkerPop modern graph):
g.V().group().by('age').by('name')
gives
[{32=[josh], 35=[peter], 27=[vadas], 29=[marko]}]
There must be some logic to it?
(Understanding this would help me for unit testing, allowing me to compare results that are actually the same)
Solution:
Gremlin doesn't offer many order guarantees. I think group() uses a regular old HashMap for its data structure by default. If you want an order, you would want to specify one explicitly with order().
I suppose you could also force use of a LinkedHashMap:
gremlin> g.withSideEffect('m', [:]).V().hasLabel('person').group('m').by('age').by('name').cap('m')...
With a LinkedHashMap the result keeps insertion order: in the first case that is however the underlying graph iterates vertices from V(), and in the second it is set by an explicit order(). I'd say it's possibly better to do the explicit order against vertices rather than my first example, which does an in-memory sort of the Map, as you might get some performance improvement there depending on your graph. Hard to say which is nicer, as the withSideEffect() approach isn't quite as readable. (Note that in my Groovy examples above, the [:] will resolve to a LinkedHashMap by default.)
Ah, thanks! A hash map does explain it.
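The difference between the two map types can be seen with plain Java collections. The following is a minimal sketch, not TinkerPop code: the class name `GroupOrderDemo`, the helper `fill`, and the insertion order 29, 27, 32, 35 are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupOrderDemo {
    // Hypothetical helper: inserts the age -> names pairs from the question
    // in the order 29, 27, 32, 35 (an assumed vertex iteration order).
    static Map<Integer, List<String>> fill(Map<Integer, List<String>> m) {
        m.put(29, Arrays.asList("marko"));
        m.put(27, Arrays.asList("vadas"));
        m.put(32, Arrays.asList("josh"));
        m.put(35, Arrays.asList("peter"));
        return m;
    }

    public static void main(String[] args) {
        // HashMap: iteration order follows the hash buckets, not insertion.
        System.out.println(fill(new HashMap<>()));
        // LinkedHashMap: iteration order is the insertion order.
        System.out.println(fill(new LinkedHashMap<>()));
    }
}
```

On a typical OpenJDK build the HashMap line happens to print the keys as 32, 35, 27, 29, the same order as in the question, but that is an implementation detail and not something to rely on. The LinkedHashMap line prints them as inserted: 29, 27, 32, 35.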
I assumed the maps were ordered (unique keys, ordered by insertion). So I guess I can now refactor my own implementation (the one I'm testing) back to my native maps, which use = and > for their binary tree. But I'll keep your solution for examination.
The easy way out is to simply overwrite my unit test results, but not knowing the 'why' kept me awake...
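For the unit-testing side of this, one option is to compare results by content instead of by printed order. This is only a sketch in plain Java, assuming the results can be held in `java.util.Map` instances; `MapCompareDemo`, `sameContent` and `sameOrder` are hypothetical names, not part of TinkerPop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapCompareDemo {
    // Order-insensitive: Map.equals only compares the key/value mappings.
    static <K, V> boolean sameContent(Map<K, V> a, Map<K, V> b) {
        return a.equals(b);
    }

    // Order-sensitive: compares the entries in iteration order.
    static <K, V> boolean sameOrder(Map<K, V> a, Map<K, V> b) {
        return new ArrayList<>(a.entrySet()).equals(new ArrayList<>(b.entrySet()));
    }

    public static void main(String[] args) {
        Map<Integer, String> expected = new LinkedHashMap<>();
        expected.put(27, "vadas");
        expected.put(29, "marko");

        Map<Integer, String> actual = new LinkedHashMap<>();
        actual.put(29, "marko");
        actual.put(27, "vadas");

        System.out.println(sameContent(expected, actual)); // true
        System.out.println(sameOrder(expected, actual));   // false
    }
}
```

A test for an unordered group() would use the first check; the second is only appropriate when the traversal includes an explicit order().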
good question - thanks
A small follow up question...
If the maps use hashtables, how does ordering them work:
gremlin> g.V().group().by('age').order(local).by(keys)
==>[{27=[v[2]], 29=[v[1]], 32=[v[4]], 35=[v[6]]}]
The result appears to be a map (based on the {}), but how can a (hashtable) map have order?
The TinkerPop processing engine uses LinkedHashMap internally. Graph implementations should abide by similar order requirements if they don't use the TinkerPop engine.
Thanks. I assumed LinkedHashMap couldn't be ordered, but apparently you can order by insertion.
Still, with the query "g.V().groupCount().by(inE().count()).order(local).by(values, desc)"
g.V()'s console returns (ordered)
[{1=3, 0=2, 3=1}]
yet g.V()'s JSON returns (unordered)
[ {"0": 2, "1": 3, "3": 1 } ]
I'm not even sure if g.V() uses the TinkerPop engine. Probably so obvious it's hard to find.
gdotv Ltd
Gremlin Graph Database IDE, Debugging & Visualization Tool
A powerful IDE to query, debug and visualize your Apache TinkerPop™ graph DBs. Compatible with Azure Cosmos DB, JanusGraph, Amazon Neptune & more. Free to try!
Ha, right in my face. Guess it's just not easy to google or time to take a break.
So, TinkerPop orders the map by reinserting the elements in order?
Why is the JSON output not ordered?
So, TinkerPop orders the map by reinserting the elements in order?
Yes.
Why is the JSON output not ordered?
I'd agree that's a bit strange. I confirmed the serializer should be writing the JSON in entrySet() iterator order, but this is why I asked about https://gdotv.com/, i.e. I wanted to know how you are getting that JSON.
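The "reinsert in order" idea confirmed above can be sketched in plain Java. This is not TinkerPop's actual implementation, just an illustration of the technique; `SortByReinsertDemo` and `sortByValueDesc` are hypothetical names, and the input mirrors the groupCount() result from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortByReinsertDemo {
    // Copies the entries, sorts them by value (descending), and reinserts
    // them into a LinkedHashMap so that iteration follows the sorted order.
    static <K, V extends Comparable<V>> Map<K, V> sortByValueDesc(Map<K, V> in) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(in.entrySet());
        entries.sort(Map.Entry.<K, V>comparingByValue().reversed());
        Map<K, V> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        counts.put(0, 2);
        counts.put(1, 3);
        counts.put(3, 1);
        System.out.println(sortByValueDesc(counts)); // {1=3, 0=2, 3=1}
    }
}
```

The result is still a map; its order comes only from the order in which the sorted entries were put back in.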
Hey, just to chime in on this: it appears the JSON-serialized version of the results that gdotv displays is somehow losing the order of the results (and instead ordering by key name).
I'm guessing this is due to how serializing maps to JSON works in Java; it may somehow not preserve the order.
The console output is the accurate one in this instance; the JSON representation is more for convenience in gdotv.
(Just a brief update: I've identified the issue in gdotv and fixed it. It will be out in the next release.)
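As an illustration only of how an ordered result can end up sorted by key, here is a small Java sketch in which an ordered map is copied into a key-sorted one. It is not a description of the actual gdotv bug, whose cause is not stated in this thread; `KeySortedCopyDemo` and `keySortedCopy` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class KeySortedCopyDemo {
    // Copying into a TreeMap discards the source order and sorts by key.
    static Map<String, Integer> keySortedCopy(Map<String, Integer> ordered) {
        return new TreeMap<>(ordered);
    }

    public static void main(String[] args) {
        // Ordered by value, descending, as in the console output.
        Map<String, Integer> ordered = new LinkedHashMap<>();
        ordered.put("1", 3);
        ordered.put("0", 2);
        ordered.put("3", 1);

        System.out.println(ordered);                // {1=3, 0=2, 3=1}
        System.out.println(keySortedCopy(ordered)); // {0=2, 1=3, 3=1}
    }
}
```

Any step between the traversal result and the display that rebuilds the map this way would produce the key-ordered JSON seen above, even though the underlying result was ordered by value.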