Performance issue in large graphs
When performing changes in a large graph (ca. 100K nodes, 500K edges) stored in a single Kryo file, I am experiencing huge delays. As an example, when initially writing the graph I can change 10K nodes in minutes, but once the graph is large the same changes take more than an hour. Is there any easy solution, e.g., breaking the graph down and saving it in smaller files? Any suggestion is helpful. My initial preference is saving to a file system (local or network). Thanks for your suggestions/solutions.
Are you using TinkerGraph and saving the updates to file, or are you using some other graph database store? In general what you describe is actually quite a small graph, but I'm not sure what technology stack you are using.
Hi Kelvin, I am using TinkerGraph. I also tried JanusGraph - it was even slower there, so I had to switch back to TinkerGraph.
How exactly are you using Kryo here? Are you serializing the whole graph out periodically and then reloading it?
I am reading the whole graph and then adding the nodes.
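For context, a minimal sketch of that read-modify-rewrite pattern using TinkerGraph's Gryo (Kryo-based) I/O might look like the following; the file name, label, and property are hypothetical:

```java
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.apache.tinkerpop.gremlin.structure.io.IoCore;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph;

public class GryoRoundTrip {
    public static void main(String[] args) throws Exception {
        // Load the entire graph from the Gryo (Kryo-based) file into memory.
        TinkerGraph graph = TinkerGraph.open();
        graph.io(IoCore.gryo()).readGraph("graph.kryo"); // hypothetical file name

        // Apply the updates in memory, e.g. adding/changing 10K nodes.
        for (int i = 0; i < 10_000; i++) {
            Vertex v = graph.addVertex("node"); // hypothetical label
            v.property("idx", i);               // hypothetical property
        }

        // Persisting means re-serializing the WHOLE graph, no matter how
        // small the change set is -- the cost grows with total graph size
        // (100K nodes / 500K edges), not with the number of updated nodes.
        graph.io(IoCore.gryo()).writeGraph("graph.kryo");
        graph.close();
    }
}
```

If this matches your setup, the last step would explain why save time scales with the size of the graph rather than with the size of the update.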
Solution
I'm not sure whether any of the other serializations, such as GraphML or GraphSON, might perform better, but this is likely not a common way we see these graphs used, so we may not have much data on which techniques work best. With the exception of TinkerGraph, which is often used as an in-memory, somewhat ephemeral graph, we typically see persistent graph stores used, where the data is persisted on disk by the database and you do not need to constantly reload it each time. If you ever need to look at commercial graph database engines, you will find tools like bulk loaders that make loading the data easier/faster. I wonder if @spmallette has any thoughts on this one?
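If you want to experiment, the same io() API can target the other formats mentioned above; a minimal comparison sketch, with hypothetical file names:

```java
import org.apache.tinkerpop.gremlin.structure.io.IoCore;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph;

public class FormatComparison {
    public static void main(String[] args) throws Exception {
        TinkerGraph graph = TinkerGraph.open();
        graph.io(IoCore.gryo()).readGraph("graph.kryo"); // hypothetical source file

        // Write the same graph out in the alternative formats to compare
        // file size and write time. GraphSON is text-based (JSON) and
        // GraphML is XML, so both are often larger/slower than binary Gryo,
        // but measuring on your own data is the only reliable check.
        graph.io(IoCore.graphson()).writeGraph("graph.json");
        graph.io(IoCore.graphml()).writeGraph("graph.xml");
        graph.close();
    }
}
```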
Thanks a lot for the prompt response. Agreed, a persistent store will help a lot. I think that is probably the solution I have to opt for in the long run.
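For the long run, the key difference is that a persistent store commits changes incrementally instead of re-serializing the whole graph on every save. As a rough sketch of what that could look like with JanusGraph and a local BerkeleyDB backend (the storage directory, label, and property are hypothetical, and whether it outperforms your current setup would need measuring, given it was slower for you before):

```java
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

public class PersistentStoreSketch {
    public static void main(String[] args) {
        // Open (or create) a local, disk-backed graph; no upfront full load.
        JanusGraph graph = JanusGraphFactory.build()
                .set("storage.backend", "berkeleyje")
                .set("storage.directory", "/tmp/graphdb") // hypothetical path
                .open();

        // Only the changed elements are written on commit, rather than
        // rewriting the entire graph file.
        Vertex v = graph.addVertex("node"); // hypothetical label
        v.property("idx", 1);               // hypothetical property
        graph.tx().commit();

        graph.close();
    }
}
```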