store edges of a node in a sorted manner

Is there any way for me to store the edges of a specific node in a sorted manner? Is there any Gremlin based database that implements indices perhaps?
Solution:
Is there any way for me to store the edges of a specific node in a sorted manner?
to my knowledge none of the common ones will allow that. i dont think most even preserve insertion order for edges.
Is there any Gremlin based database that implements indices perhaps?
i assume you are still referring to edges here. Titan-like graphs such as JanusGraph, HugeGraph, etc. and DataStax Graph allow you to explicitly define indices on edges...
Jump to solution
16 Replies
Solution
spmallette
spmallette2y ago
Is there any way for me to store the edges of a specific node in a sorted manner?
to my knowledge none of the common ones will allow that. i dont think most even preserve insertion order for edges.
Is there any Gremlin based database that implements indices perhaps?
i assume you are still referring to edges here. Titan-like graphs such as JanusGraph, HugeGraph, etc. and DataStax Graph allow you to explicitly define indices on edges
kyano_k
kyano_k2y ago
are you familiar with anything that will allow me to do that? so I'm assuming also i wouldn't be able to perform binary search on a sorted index correct?
spmallette
spmallette2y ago
all i think you could do is add a property to your edge that indicates the order your want (e.g. some numeric value) and then use that to order() your edges. whether that ordering is done in-memory or in some optimized manner will be entirely up to the graph you are using.
edolivares
edolivares2y ago
@kyano what would it be sorted by? Like time it was created? Or do you mean to analyze which edges are most commonly traversed or something like that?
kyano_k
kyano_k2y ago
it will be sorted by a numeric property shared by all
kelvinl2816
kelvinl28162y ago
In general I think it is best to not depend on specific DB behavior for edge retrieval order, and to always order explicitly as needed. How many edges will you typically be needing to sort/order by a property? I suppose it might be possible to also use the edge ID for this, IFF the DB allows user defined edge IDs (not all do).
kyano_k
kyano_k2y ago
I'll give you an example and maybe you could suggest the best solution (if there is one): I have an a node V which holds the property value: x. All of the incoming edges hold desiredValue: y for some y. I define a certain operation for all nodes V1 with incident edges with V that have desiredValue < value, and for all node V2 with incident edges with V that have desiredValue > value. Therefore in this instance it would be very efficient to keep all of those edges sorted, because with a single binary search of value I can identify which nodes need what operation. @KelvinL @edolivares @spmallette tagging for visibility
spmallette
spmallette2y ago
Since there aren't any graphs (that i'm aware of) that can store edges in some specified order, I think you would have to use order() step (whether by id as kelvin suggested or by property) and rely on the underlying graph to do its best to return that sort in whatever way it optimizes to do that.
kyano_k
kyano_k2y ago
i see but if i have that property indexed, is it safe to assume that the order step will work faster than a naive order?
spmallette
spmallette2y ago
whether adding order() outperforms a naïve order will be dependent on your graph database. you will need to profile your queries to determine how the query is being optimized. that said, an "index" will typically help with filtering actions and not ordering actions. for example, a vertex centric index in JanusGraph will improve your ability to find certain edges with has() but won't do anything to help with order(). in other words doing outE().order().by('value').has('value', gt(desiredValue)) doesn't offer any improvement to the has(). i'd again call this general advice to consider. i think @KelvinL had a good question that i don't see answered:
How many edges will you typically be needing to sort/order by a property?
kyano_k
kyano_k2y ago
variable, it could be thousands it could be a few hmmm i see. so essentially, I would have improvement in the has step correct? Meaning, that by using a graph like janus graph, filtering with has('value', gt(desiredValue)) would be more efficient since value is indexed as opposed to if it wasn't?
spmallette
spmallette2y ago
yes. in your case though, i wonder if you need to worry too much about this issue at all yet. i don't know the full context of your query or data but a few thousand edges should be pretty fast to filter through in-memory.
kyano_k
kyano_k2y ago
correct but this is per node. I expect to have a couple 10s if not 100s of thousands of node, so it can get quite large
spmallette
spmallette2y ago
i see. what kind of performance are you seeing now? i think i remember that you're using Neptune correct?
kyano_k
kyano_k2y ago
Actually, this is for a different project. Haven't reached development yet, just doing research atm for the architecture. I can share it with you privately if you're interested.
spmallette
spmallette2y ago
ok, feel free to share with me privately if you like. have i answered your question here at this point? or is there still something that isn't clear?
Want results from more Discord servers?
Add your server