M. alhaddad
Apache TinkerPop
Created by M. alhaddad on 5/8/2024 in #questions
Using dedup with Neptune
Thanks, I will check it out.
11 replies
That means I can do this?
.in_("HasSite")
.dedup()
.range_(offset, offset + limit)
.values("company_id")
So that won't affect performance much, even if in_("HasSite") yields thousands of results?
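The ordering question above can be sketched in plain Python, with no Gremlin server needed (the functions below are stand-ins for the Gremlin steps, not gremlin_python): dedup() before range_() paginates over unique values, while range_() before dedup() can return fewer than `limit` items per page and can repeat values across pages.

```python
# Sketch: plain-Python stand-ins for Gremlin's dedup() and range() steps.

def dedup(items):
    """Keep the first occurrence of each value, like Gremlin's dedup()."""
    seen = set()
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

def range_(items, lo, hi):
    """Slice a window of results, like Gremlin's range(lo, hi)."""
    return items[lo:hi]

# Duplicate company_ids, as in_("HasSite") produces by design.
hits = ["c1", "c1", "c2", "c3", "c2", "c4", "c5"]

offset, limit = 0, 3
page_dedup_first = range_(dedup(hits), offset, offset + limit)
page_range_first = dedup(range_(hits, offset, offset + limit))

print(page_dedup_first)  # ['c1', 'c2', 'c3'] -- a full page of unique ids
print(page_range_first)  # ['c1', 'c2']       -- only 2 items for a limit of 3
```

This is why dedup() has to sit before range_() for stable pagination, whatever the per-step cost: with range_() first, the page size depends on how many duplicates happen to land in the window.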
What benefits does toList() bring? Maybe it's just that I adopted fold() & next() three years ago. Is there a benefit, or is it the same?
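For what it's worth, in gremlin-python both forms end in a Python list: toList() drains the result stream directly, while fold() collects the whole stream into a single list element that next() then pulls. The equivalence can be sketched with plain iterators as stand-ins for a traversal (an assumption of this sketch; it does not use gremlin_python):

```python
# Sketch: an iterator as a stand-in for a traversal's result stream.
def results():
    return iter([10, 20, 20, 30])  # fresh "traversal" each call

# toList()-style: drain the stream into a list.
to_list = list(results())

# fold().next()-style: fold the whole stream into ONE list element,
# then pull that single element with next().
folded_stream = iter([list(results())])  # fold() emits exactly one list
fold_next = next(folded_stream)

print(to_list == fold_next)  # True: same list either way
```

So in practice toList() is just the more idiomatic spelling; both also return [] on an empty traversal (fold() always emits one element, so next() doesn't raise there).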
comp_ids = (
g.V()
.hasLabel("Word")
.has("term", "Neptune#fts {}*".format(term))
.in_("Has")
.in_("HasSite")
.range_(offset, offset + limit)
.values("company_id")
.fold()
.next()
)
At in_("HasSite") I will have duplicate output, as intended by the design. However, for the final output I want the paginated part to be unaffected by duplicates, so I need to apply range() to unique results. Performance would drop too much if I dedup() before range(), because dedup() is not optimized.
Created by Gil on 4/5/2024 in #questions
Fulltext-search-like features without ElasticSearch, OpenSearch, Solr and such?
By "fail" I meant failing to retrieve results; it returns empty output.
18 replies
@Gil I used TextP a long time ago, but it became slower as data size increased, so I integrated an Elasticsearch engine with AWS Neptune. But today I'm facing a new problem: it fails to retrieve anything containing hyphens.
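One likely culprit, assuming the full-text backend uses Lucene-style query_string syntax (an assumption, not confirmed in the thread): `-` is a reserved operator there, so an unescaped hyphen is parsed as negation rather than as part of the term. A minimal sketch of backslash-escaping the reserved characters:

```python
# Sketch (assumption: the fts backend uses Lucene query_string syntax, where
# + - && || ! ( ) { } [ ] ^ " ~ * ? : \ / are reserved characters).
LUCENE_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')

def escape_fts_term(term: str) -> str:
    """Backslash-escape Lucene special characters in a search term."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)

print(escape_fts_term("foo-bar"))  # foo\-bar
```

Worth noting: escaping only helps the query parser. If the index analyzer splits "foo-bar" into "foo" and "bar" at index time, a quoted phrase query may be needed instead; which applies depends on the index mapping.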
Created by M. alhaddad on 2/14/2024 in #questions
Memory issue on repeat
@spmallette
5 replies
Created by M. alhaddad on 3/9/2023 in #questions
Traversal is propagating to further edges?
Ah, I see. Thanks!
Created by M. alhaddad on 2/14/2023 in #questions
Big graph makes timeouts
Thanks @triggan. I just tried to run a query with a limit of 1 million and it was fine, so a possible solution is to batch the target within range(). I've just read that Neptune is not intended for analytics; it is intended for simple OLTP tasks (https://www.infoworld.com/article/3394860/amazon-neptune-review-a-scalable-graph-database-for-oltp.html#:~:text=Gremlin%20and%20SPARQL%20address%20different,with%20SELECT%20and%20WHERE%20clauses). Could that be the reason, i.e. am I doing something on a DB not optimized to do so? If so, what can I do? I have a large number of nodes; sometimes I need to traverse relationships, and sometimes I need to do prediction/analytics tasks.
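The batching idea above (walking the target in range() windows) can be sketched generically. Here `fetch_page` is a hypothetical stand-in for running the Gremlin query with .range_(offset, offset + batch_size), not an actual Neptune call:

```python
# Sketch of range()-based batching (fetch_page is a hypothetical stand-in
# for a Gremlin query ending in .range_(offset, offset + size)).
def fetch_page(data, offset, size):
    return data[offset:offset + size]

def fetch_all_batched(data, batch_size):
    """Pull results window by window until an empty page comes back."""
    out, offset = [], 0
    while True:
        page = fetch_page(data, offset, batch_size)
        if not page:
            break
        out.extend(page)
        offset += batch_size
    return out

all_ids = list(range(10))
print(fetch_all_batched(all_ids, 3) == all_ids)  # True
```

One caveat worth keeping in mind: on a real graph, each range() window still makes the server walk past the skipped results, so total work grows with the number of pages; batching bounds memory per request, not total traversal cost.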
Although this https://groups.google.com/g/aureliusgraphs/c/TOQ2618KDnY worked, I was not sure it would allow further computations. The simplest score I'm computing is the number of outbound edges, yet even that is an expensive traversal.
I have around 20 million nodes, and pre-limiting is just not a solution.
Yes, I have maxed out the timeout; now I start to get memory errors instead.