Apache TinkerPop•3y ago

store edges of a node in a sorted manner

Is there any way for me to store the edges of a specific node in a sorted manner? Is there any Gremlin based database that implements indices perhaps?

Solution:

Is there any way for me to store the edges of a specific node in a sorted manner?

to my knowledge none of the common ones will allow that. i dont think most even preserve insertion order for edges.

Is there any Gremlin based database that implements indices perhaps?

i assume you are still referring to edges here. Titan-like graphs such as JanusGraph, HugeGraph, etc. and DataStax Graph allow you to explicitly define indices on edges...

Jump to solution

16 Replies

Solution

spmallette•3y ago

Is there any way for me to store the edges of a specific node in a sorted manner?

to my knowledge none of the common ones will allow that. i dont think most even preserve insertion order for edges.

Is there any Gremlin based database that implements indices perhaps?

i assume you are still referring to edges here. Titan-like graphs such as JanusGraph, HugeGraph, etc. and DataStax Graph allow you to explicitly define indices on edges

kyano_kOP•3y ago

are you familiar with anything that will allow me to do that? so I'm assuming also i wouldn't be able to perform binary search on a sorted index correct?

spmallette•3y ago

all i think you could do is add a property to your edge that indicates the order your want (e.g. some numeric value) and then use that to order() your edges. whether that ordering is done in-memory or in some optimized manner will be entirely up to the graph you are using.

edolivares•3y ago

@kyano what would it be sorted by? Like time it was created? Or do you mean to analyze which edges are most commonly traversed or something like that?

kyano_kOP•3y ago

it will be sorted by a numeric property shared by all

kelvinl2816•3y ago

In general I think it is best to not depend on specific DB behavior for edge retrieval order, and to always order explicitly as needed. How many edges will you typically be needing to sort/order by a property? I suppose it might be possible to also use the edge ID for this, IFF the DB allows user defined edge IDs (not all do).

kyano_kOP•3y ago

I'll give you an example and maybe you could suggest the best solution (if there is one): I have an a node V which holds the property value: x. All of the incoming edges hold desiredValue: y for some y. I define a certain operation for all nodes V1 with incident edges with V that have desiredValue < value, and for all node V2 with incident edges with V that have desiredValue > value. Therefore in this instance it would be very efficient to keep all of those edges sorted, because with a single binary search of value I can identify which nodes need what operation. @KelvinL @edolivares @spmallette tagging for visibility

spmallette•3y ago

Since there aren't any graphs (that i'm aware of) that can store edges in some specified order, I think you would have to use order() step (whether by id as kelvin suggested or by property) and rely on the underlying graph to do its best to return that sort in whatever way it optimizes to do that.

kyano_kOP•3y ago

i see but if i have that property indexed, is it safe to assume that the order step will work faster than a naive order?

spmallette•3y ago

whether adding order() outperforms a naïve order will be dependent on your graph database. you will need to profile your queries to determine how the query is being optimized. that said, an "index" will typically help with filtering actions and not ordering actions. for example, a vertex centric index in JanusGraph will improve your ability to find certain edges with has() but won't do anything to help with order(). in other words doing outE().order().by('value').has('value', gt(desiredValue)) doesn't offer any improvement to the has(). i'd again call this general advice to consider. i think @KelvinL had a good question that i don't see answered:

How many edges will you typically be needing to sort/order by a property?

kyano_kOP•3y ago

variable, it could be thousands it could be a few hmmm i see. so essentially, I would have improvement in the has step correct? Meaning, that by using a graph like janus graph, filtering with has('value', gt(desiredValue)) would be more efficient since value is indexed as opposed to if it wasn't?

spmallette•3y ago

yes. in your case though, i wonder if you need to worry too much about this issue at all yet. i don't know the full context of your query or data but a few thousand edges should be pretty fast to filter through in-memory.

kyano_kOP•3y ago

correct but this is per node. I expect to have a couple 10s if not 100s of thousands of node, so it can get quite large

spmallette•3y ago

i see. what kind of performance are you seeing now? i think i remember that you're using Neptune correct?

kyano_kOP•3y ago

Actually, this is for a different project. Haven't reached development yet, just doing research atm for the architecture. I can share it with you privately if you're interested.

spmallette•3y ago

ok, feel free to share with me privately if you like. have i answered your question here at this point? or is there still something that isn't clear?

Gaming

Programming

store edges of a node in a sorted manner

Did you find this page helpful?