M. alhaddad Posts - Answer Overflow

M. alhaddad

•Created by M. alhaddad on 5/8/2024 in #questions

Using dedup with Neptune

I remember once coming across an AWS Neptune optimization guide, though I can no longer find it. It mentioned that the .dedup() step is not optimized on Neptune, which hurts performance. However, I have a scenario where I need deduplication and pagination at the same time. The only approaches I can think of are .dedup() followed by .range(), or .groupCount(), then selecting the keys, then .range(). But I am not sure whether grouping always preserves the order. What could be done?
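To make the first option concrete, here is a minimal client-side sketch in Python of what "deduplicate, then take a range" does to an already ordered stream. It is not a Neptune query and says nothing about Neptune's performance. The `dedup` and `page` helpers and the sample `terms` list are hypothetical names for illustration. In Gremlin the same shape would be roughly order().by(...).dedup().range(low, high), where an explicit order() is what keeps pages stable between requests.

```python
from itertools import islice


def dedup(items):
    """Yield each item the first time it is seen, preserving stream order."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


def page(items, low, high):
    """Return the de-duplicated items at positions [low, high)."""
    return list(islice(dedup(items), low, high))


# Hypothetical, already ordered stream of terms that contains duplicates.
terms = ["a", "b", "a", "c", "b", "d", "e"]

first_page = page(terms, 0, 2)
second_page = page(terms, 2, 4)
```

Because duplicates are dropped before the positions are counted, the two pages never overlap and no item is skipped, as long as the input order is the same on every call.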

11 replies

Apache TinkerPop

•Created by M. alhaddad on 2/14/2024 in #questions

Memory issue on repeat

5 replies

Apache TinkerPop

•Created by M. alhaddad on 4/11/2023 in #questions

Recommendation of a cache method

I have a complex query that involves ordering and traversing several node types to obtain something like a cluster. The issue is that when there is a large number of nodes, the query takes longer to evaluate. The use case is a user clicking on a word in a UI, after which I fetch the cluster of what comes with it. What would be a good way to cache the result? The result is neither a number nor a string; it is an array of projections of several attributes calculated by the query. I thought about using a hashmap or Redis to cache the results, which would save me from re-running the same query again and again, but that would need a lot of memory. The environment is AWS Neptune + Gremlin + Python.
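One way to keep the memory concern in check is to bound the cache instead of storing every result forever. The following is a minimal in-process sketch, assuming results can be keyed by the clicked word and that slightly stale clusters are acceptable. `ClusterCache`, `fake_cluster_query`, and the default limits are hypothetical; in practice the `compute` callback would run the real Gremlin traversal.

```python
import time
from collections import OrderedDict


class ClusterCache:
    """Small LRU cache with a time-to-live, keyed by the clicked word."""

    def __init__(self, max_entries=1000, ttl_seconds=300.0, clock=time.monotonic):
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # word -> (expires_at, result)

    def get_or_compute(self, word, compute):
        now = self._clock()
        hit = self._data.get(word)
        if hit is not None and hit[0] > now:
            # Fresh entry: mark it as most recently used and reuse it.
            self._data.move_to_end(word)
            return hit[1]
        # Missing or expired: run the expensive query and store the result.
        result = compute(word)
        self._data[word] = (now + self._ttl, result)
        self._data.move_to_end(word)
        # Evict least recently used entries beyond the size limit.
        while len(self._data) > self._max:
            self._data.popitem(last=False)
        return result


calls = []


def fake_cluster_query(word):
    """Stand-in for the real traversal; records each invocation."""
    calls.append(word)
    return [{"term": word, "degree": len(word)}]


cache = ClusterCache(max_entries=2, ttl_seconds=60.0)
first = cache.get_or_compute("graph", fake_cluster_query)
second = cache.get_or_compute("graph", fake_cluster_query)  # served from cache
```

The size limit caps memory at roughly `max_entries` results, and the time-to-live bounds how stale a cluster can get. The same key and expiry scheme could be moved to a shared store such as Redis if several application processes need to share the cache.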

3 replies

Apache TinkerPop

•Created by M. alhaddad on 3/9/2023 in #questions

Traversal is propagating to further edges?

5 replies

Apache TinkerPop

•Created by M. alhaddad on 3/3/2023 in #questions

Cannot access a stored value after fold

3 replies

Apache TinkerPop

•Created by M. alhaddad on 2/14/2023 in #questions

Big graph makes timeouts

I am having trouble querying a big graph, especially when applying filters. I want to order the nodes so that I can take the highest-degree ones, but the query always times out, and the only trick I am applying is pre-limiting the accessed nodes:

g.V()
.hasLabel("Word")
.tail(100000)
.order().by(outE("RetrievedBy").count(), desc)
.limit(100)
.project("term", "degree")
.by("term")
.by(outE("RetrievedBy").count())

I am using Neptune on a db.r6g.xlarge instance.

11 replies
