Aggregating vertices with set-cardinality properties

I am aggregating traversed vertices that have both single- and set-cardinality properties. When I capture a vertex using elementMap(), it assumes single cardinality for all properties and only includes the last element of the set when building the map. However, when I use valueMap(true).by(unfold()) instead (as described in this SO reply: https://stackoverflow.com/a/75225994/3516889), it returns only the first property value in the set.
Query using valueMap(true).by(unfold()) (1):
g.V()
.hasLabel('SampleContainer')
.filter(properties('samples').count().is(P.gte(2))) // only consider vertices with 2 samples or more
.limit(100)
.valueMap(true).by(unfold())
Query 1 returns only the first-added sample:
{<T.id: 1>: '8ac3ec3b-ffbd-8734-83c0-9f7f7bdbd468', <T.label: 4>: 'SampleContainer', 'name': 'SomeName', 'samples': 'first'}
Query using elementMap() (2):
g.V()
.hasLabel('SampleContainer')
.filter(properties('samples').count().is(P.gte(2))) // only consider vertices with 2 samples or more
.limit(100)
.elementMap()
Query 2 returns only the last-added sample:
{<T.id: 1>: '8ac3ec3b-ffbd-8734-83c0-9f7f7bdbd468', <T.label: 4>: 'SampleContainer', 'name': 'SomeName', 'samples': 'second'}
What I need is the following result:
{<T.id: 1>: '8ac3ec3b-ffbd-8734-83c0-9f7f7bdbd468', <T.label: 4>: 'SampleContainer', 'name': 'SomeName', 'samples': [ 'first', 'second'] }
I tried playing with local() but I can't seem to get a proper map of the elements with a set-cardinality property. Appreciate your help!
spmallette (2y ago)
I think it's typically easiest to use project() in these cases:
gremlin> g.V().hasLabel('person').elementMap()
==>[id:1,label:person,name:marko,location:santa fe]
==>[id:7,label:person,name:stephen,location:purcellville]
==>[id:8,label:person,name:matthias,location:seattle]
==>[id:9,label:person,name:daniel,location:aachen]
gremlin> g.V().hasLabel('person').valueMap()
==>[name:[marko],location:[san diego,santa cruz,brussels,santa fe]]
==>[name:[stephen],location:[centreville,dulles,purcellville]]
==>[name:[matthias],location:[bremen,baltimore,oakland,seattle]]
==>[name:[daniel],location:[spremberg,kaiserslautern,aachen]]
gremlin> g.V().hasLabel('person').valueMap().by(unfold())
==>[name:marko,location:san diego]
==>[name:stephen,location:centreville]
==>[name:matthias,location:bremen]
==>[name:daniel,location:spremberg]
gremlin> g.V().hasLabel('person').
......1> project('name','location').
......2> by('name').
......3> by(values('location').fold())
==>[name:marko,location:[san diego,santa cruz,brussels,santa fe]]
==>[name:stephen,location:[centreville,dulles,purcellville]]
==>[name:matthias,location:[bremen,baltimore,oakland,seattle]]
==>[name:daniel,location:[spremberg,kaiserslautern,aachen]]
There's no way to make valueMap() (or elementMap(), for that matter) selectively treat some properties as multi-properties and others as single-valued. You could write some Gremlin to transform the results of valueMap() into the form you want, but that Gremlin is harder to read than the simple project() approach. It looks like this:
gremlin> g.V().hasLabel('person').valueMap().
......1> map(unfold().
......2> group().
......3> by(keys).
......4> by(choose(select(keys).is('name'),
......5> select(values).unfold(),
......6> select(values))))
==>[name:marko,location:[san diego,santa cruz,brussels,santa fe]]
==>[name:stephen,location:[centreville,dulles,purcellville]]
==>[name:matthias,location:[bremen,baltimore,oakland,seattle]]
==>[name:daniel,location:[spremberg,kaiserslautern,aachen]]
Basically, you take each Map and deconstruct it into entries with unfold(). Then, for each entry, you do an if/then on the key: if it is the "name" property, you unfold() its value to a single item; otherwise you keep the values as-is, in list form.
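The same per-entry if/then can also be sketched client-side on a plain Python dict shaped like one valueMap() result. This only illustrates the logic and is not the Gremlin itself: the helper name flatten_except and the sample rows are hypothetical, and it assumes property values arrive as lists while id and label (when present) arrive as bare values.

```python
def flatten_except(value_map, multi_keys):
    """Unwrap single-valued properties; keep the listed keys as lists.

    value_map  -- dict shaped like one valueMap() result
    multi_keys -- property names to keep as lists (the set-cardinality ones)
    """
    result = {}
    for key, values in value_map.items():
        if key in multi_keys:
            # Counterpart of select(values): keep the whole list.
            result[key] = list(values)
        elif isinstance(values, list) and values:
            # Counterpart of select(values).unfold(): take the single item.
            result[key] = values[0]
        else:
            # Bare values (for example an id or label) pass through unchanged.
            result[key] = values
    return result


# Hypothetical rows modelled on the console output above.
row = {"name": ["marko"],
       "location": ["san diego", "santa cruz", "brussels", "santa fe"]}
flat = flatten_except(row, multi_keys={"location"})

row_with_id = {"id": 1, "label": "person", "name": ["marko"], "location": ["santa fe"]}
flat_with_id = flatten_except(row_with_id, multi_keys={"location"})
print(flat)
print(flat_with_id)
```

Doing this in the traversal, as in the Gremlin above, keeps the shaping on the server; the client-side version is only worth considering when the query itself cannot be changed.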
sap13n (OP, 2y ago)
Thanks @spmallette! I was able to adjust the last suggestion to make it work the way I expected (I need the element id and label as well). Unfortunately, I couldn't figure out how to include this information with project(...). This is the query I ended up with:
g.V()
.hasLabel('SampleContainer')
.filter(properties('samples').count().is(P.gte(2)))
.valueMap(true)
.local(unfold()
.group()
.by(keys)
.by(choose(
select(keys).not(__.is('samples')),
select(values).unfold(),
select(values)
)
)
)
Oh, I see that I replaced map(...) with local(...) in my query, but the result is the same in my tests. I was just tinkering with the query...
spmallette (2y ago)
Careful with local(); it is not exactly interchangeable with map(): https://stephen.genoprime.com/snippet/2023/04/13/snippet-15.html
As for:
"Unfortunately I couldn't figure out how to include this information with project(...)"
You could just add them to the project(), like this:
gremlin> g.V().hasLabel('person').
......1> project('name','location', 'vid', 'vlabel').
......2> by('name').
......3> by(values('location').fold()).
......4> by(id).
......5> by(label)
==>[name:marko,location:[san diego,santa cruz,brussels,santa fe],vid:1,vlabel:person]
==>[name:stephen,location:[centreville,dulles,purcellville],vid:7,vlabel:person]
==>[name:matthias,location:[bremen,baltimore,oakland,seattle],vid:8,vlabel:person]
==>[name:daniel,location:[spremberg,kaiserslautern,aachen],vid:9,vlabel:person]
sap13n (OP, 2y ago)
Whoa, thanks for the warning about local! Regarding vlabel and vid, looks like they are not returned from AWS Neptune. @spmallette
spmallette (2y ago)
Really? That's strange... What if you replace those last lines with:
by(id()).by(label())
Does that make a difference?
sap13n (OP, 2y ago)
Apologies, that was a copy-paste issue on my part. I wrapped the parameters in quotes, as in by('id'), when it should just be by(id). So, perfect, this works! I'm curious, though: in the least-favorable solution you shared, there is a filter referencing keys and values:
choose(
select(keys).not(__.is('samples')),
select(values).unfold(),
select(values)
)
Do they exist in the Java client? I can't find __.keys(), but there are __.key() and __.values().
spmallette (2y ago)
They are part of the Column enum (Column.keys and Column.values).
sap13n (OP, 2y ago)
You're a saint! Thanks!