M. alhaddad
Apache TinkerPop
Created by M. alhaddad on 5/8/2024 in #questions
Using dedup with Neptune
I remember once coming across an AWS Neptune optimization guide, though I can't recall where it is now. It mentions that the .dedup() step is not optimized on Neptune, which makes performance worse. However, I have a scenario where I need deduplication and pagination at the same time. The only approaches I can think of are .dedup() followed by .range(), or .groupCount(), then selecting the keys, then .range(). But I am not sure whether grouping always maintains the order. What could be done?
11 replies
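A minimal sketch of the first option from the question (.dedup() followed by .range()), written as a Python helper that builds the Gremlin text for one page. It assumes an explicit order() between the two steps so that consecutive pages are cut from the same sequence. The label "Word" and the property "term" are placeholders, and the function names are hypothetical. Nothing here measures how Neptune executes dedup(), and sorting the whole deduplicated set has its own cost.

```python
def page_bounds(page, page_size):
    """Translate a zero-based page number into range(low, high) bounds.

    Gremlin's range() treats low as inclusive and high as exclusive.
    """
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    low = page * page_size
    return low, low + page_size


def distinct_page_query(page, page_size, label="Word", key="term"):
    """Build Gremlin text for one page of distinct, sorted property values.

    label and key are placeholder names (assumptions, not from the question).
    They are restricted to identifier-like strings so that nothing unexpected
    is spliced into the query text.
    """
    if not (label.isidentifier() and key.isidentifier()):
        raise ValueError("label and key must be simple identifiers")
    low, high = page_bounds(page, page_size)
    return (
        f"g.V().hasLabel('{label}').values('{key}')"
        f".dedup().order().range({low}, {high})"
    )


if __name__ == "__main__":
    # Third page of 25 items: rows 50 (inclusive) to 75 (exclusive).
    print(distinct_page_query(2, 25))
```

The explicit order() is the design choice worth noting: without it, the sequence that range() slices is whatever order the upstream steps happen to emit, so pages fetched in separate requests may overlap or skip items.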
Apache TinkerPop
Created by M. alhaddad on 2/14/2024 in #questions
Memory issue on repeat
No description
5 replies
Apache TinkerPop
Created by M. alhaddad on 4/11/2023 in #questions
Recommendation of a cache method
I have a complex query that involves ordering and traversing over several node types to obtain something like a cluster. The issue is that when I have a large number of nodes, the query starts to take longer to evaluate. The use case is a user clicking on a word in a UI, and I return the cluster of terms that occur with it. What would be a possible way to cache the result? The result is neither a number nor a string; it is an array of projections of several attributes calculated by the query. I thought about using a Redis hash to cache the results so that I avoid re-running the same query again and again, but that would need tons of memory. The environment is AWS Neptune + Gremlin + Python.
3 replies
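One way to keep the memory cost of such a cache bounded is to cap the number of entries, expire them after a while, and store each result as compressed JSON. The sketch below is an in-process stand-in for that policy, assuming the projected result is JSON-serializable. `ClusterCache`, `get_cluster`, and `run_query` are hypothetical names, and `run_query` stands for whatever function actually submits the Gremlin query.

```python
import json
import time
import zlib
from collections import OrderedDict


class ClusterCache:
    """Bounded LRU cache with a per-entry TTL; values are compressed JSON."""

    def __init__(self, max_entries=1000, ttl_seconds=300.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries = OrderedDict()  # word -> (expires_at, compressed bytes)

    def __len__(self):
        return len(self._entries)

    def get(self, word):
        """Return the cached result for word, or None if absent or expired."""
        item = self._entries.get(word)
        if item is None:
            return None
        expires_at, blob = item
        if self._clock() >= expires_at:
            del self._entries[word]
            return None
        self._entries.move_to_end(word)  # mark as most recently used
        return json.loads(zlib.decompress(blob))

    def put(self, word, result):
        """Store result and evict least recently used entries beyond the cap."""
        blob = zlib.compress(json.dumps(result).encode("utf-8"))
        self._entries[word] = (self._clock() + self._ttl, blob)
        self._entries.move_to_end(word)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)


def get_cluster(word, cache, run_query):
    """Return the cluster for word, calling run_query only on a cache miss."""
    cached = cache.get(word)
    if cached is not None:
        return cached
    result = run_query(word)
    cache.put(word, result)
    return result
```

With this shape, memory use is governed by `max_entries` and the compressed size of each result rather than by the number of distinct words users click on. The same policy (size cap, expiry, compact serialization) can be applied to an external store instead of a Python dict.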
Apache TinkerPop
Created by M. alhaddad on 3/9/2023 in #questions
Traversal is propagating to further edges?
5 replies
Apache TinkerPop
Created by M. alhaddad on 3/3/2023 in #questions
Cannot access a stored value after fold
3 replies
Apache TinkerPop
Created by M. alhaddad on 2/14/2023 in #questions
Big graph makes timeouts
I am having trouble querying a big graph, especially when applying filters. I want to order the nodes so that I can take the highest-degree ones, but the query always times out, and the only trick I am applying is pre-limiting the accessed nodes:
g.V()
.hasLabel("Word")
.tail(100000)
.order().by(outE("RetrievedBy").count(), desc)
.limit(100)
.project("term", "degree")
.by("term")
.by(outE("RetrievedBy").count())
I am using Neptune on a db.r6g.xlarge instance.
11 replies