How to investigate latency
I'm using JanusGraph as my db. When I issue the query via
gremlin-console
with .profile()
step at the end, it finishes execution within 5–8 milliseconds, but when I perform the same request in a parametrized from an application, it takes up 25–40 milliseconds for the database to answer.
What takes this time, and how can I minimize it? I have a feeling that most of the time is taken by the backend storage layer (Apache Cassandra), and this time is not shown in the .profile()
. Am I correct?Solution:Jump to solution
So just switching to a static
.repeat()
traversal with .map{myArray[it.loops]}
saved my day.23 Replies
Have you tried using the clock class in gremlin? https://tinkerpop.apache.org/javadocs/3.2.1/full/org/apache/tinkerpop/gremlin/util/TimeUtil.html
You could use .path() at the end of your traversals to get what the path is. Perhaps you are making expensive traversals and don’t realize it
If I use
TimeUtil.clock({-> <query>})
, then it outputs number somewhere around 0.1
.
That's milliseconds, obviously not seconds.
So I guess that TimeUtil.clock in this case just outputs the time needed to execute the Groovy code to create a query, which is fast enough. The problem is in how long it takes to execute the query.
And yeah, the query is indeed not the simplest one, but I'm just trying to understand how 3 milliseconds of execution (or at least that tells me the .profile()
) turns into 30 milliseconds of wait time when the same query is executed from the application.Oh I see. So you mean in the gremlin-console the execution of the query takes 3ms but, it’s 30ms when you call if from your application which I am assuming is sending an HTTP request to a gremlin server?
It connects over a websocket and not http, but yes.
i think @edolivares is on to something. how are you connecting to JanusGraph in Gremlin Console? could you share that code?
I just
gremlin-console
and then
Not a clever way indeed.
remote.yaml
:
when you configure with
serializeResultToString
you are effectively bypassing the serializers. that could account for some of the difference you're seeing there given it's just a few millisecondsThe query results just a number. But I still changed the value to
false
.
Still, .profile()
reports the task being finished within 3ms and no 30.
Hm. I checked a simpler query. And with it:
.profile()
reports ~2 milliseconds of execution time.
My application itself takes ~5 milliseconds to fulfil the request.
So it's not something related to how the connection to the db is made, I guess. It's just .profile
not reporting 27 milliseconds which are... I don't know where, and that's why I initially asked this question.profile()
should cover the time when the first traverser is created to start the traversal all the way up to the last traverser to die when the traversal terminates. that would include time spent interacting with any underlying database like cassandra that is supplying the data to produce those traversers.
how are you making your request from your application? are you sending a Gremlin script as a string?Unfortunately, yes. But it's in a parametrised form.
Maybe that's a problem with a driver I'm using. I could check it with another driver later...
there will definitely be some cost on the first request. the script has to be passed through the
GremlinGroovyScriptEngine
. that groovy compilation/caching will add some time. subsequent calls will be faster. not sure if that can help fill some gaps with your missing time. not really sure what else that could be. given the difference of just a few milliseconds (compared to seconds) it might be hard to track down fully given all the variables at play that could steal a millisecond here or there.
i will say it's a bit odd if you can consistently replicate the difference given the variables though.Nah, 30 ms is after many requests made. On the first one it is as high as 50.
If no ideas are left, I will check how another language with another driver handles this situation. Thanks for the help.
maybe you should rule out
serializeResultToString
like make your driver use the exact same serializer configuration as the ConsoleI'll also try that, thanks
... I had wrong assumptions about how query is compiled.
How the traversal is compiled, to be exact.
The gremlin code was static. And it really fires quick. The problem is, this gremlin code generates different traversals given different input.
And this is deadly. Given some inputs (not to huge), the compilation of the traversal takes seconds.
The execution time stays pretty much the same, but the compilation is ridiculous.
Just making a sanity check:
this SHOULD NOT recompile every time
someArray
changes, right?i would not expect that to recompile if you sent that exact string again. they key to recompilation is an exact string match of the query. if the strings are the same you should get a cache hit and avoid the compilation penalty. groovy compilation in this context is terribly slow which is why we recommend parameterization. fwiw, we're slowly moving away from groovy. the first attempt at this came with bytecode based Gremlin requests which we've long recommended over scripts. going forward though it looks like
gremlin-language
and the ANTLR grammar is the better approach. recently it was noted here in discord in the #implementers channel that GremlinLangScriptEngine
was 200% to 500% faster than GremlinGroovyScriptEngine
. The former is not a direct replacement for the latter though...it will only handle the Gremlin language so any Groovy syntax will be invalid.The problem is not in the Groovy compilation.
The problem is in the compilation of the traversal that is generated by the query.
As I understand the process, it is:
Compile Groovy -> Run Groovy (Compile Traversals -> Execute Traversals) -> Return Result.
Am I right?
not sure what you mean by "compile traversals". the only step that is taken at the very start of the traversal is to apply TraversalStrategy implementations that may re-write the traversal to optimize it for execution. generally that is incredibly fast.
if you meant something other than this for "compilation" when you wrote:
The problem is, this gremlin code generates different traversals given different input.i'm not sure what you mean.
That's what I mean. But it seems like "apply traversal strategy" turns into a nightmare in my case.
are you only testing:
g.inject(0).repeat(map{someArray[it.loops]}).emit().times(someArray.size())
? if someArray
is really big that could be an issue
still, though, usually not terribly expensive to apply the strategy even in that case. RepeatUnrollStrategy
will rewrite the repeat as a inline query which should have the effect of speeding it up during execution... well, it's not huge enough to cause performance issues in my case.
My problem's solved.
Morale of the story: dynamically generating traversal via query is bad sometimes.
glad you figured it out. i agree that i've often found dynamically produced Gremlin to (1) be hard to debug and (2) sometimes doesn't perform well. it would be nice if you could clarify what the root cause of the problem was. do you have a clear understanding of the problem to share?
It all dates back to here:
https://discord.com/channels/838910279550238720/844295727487647744/1047948951233499156
I was using a Groovy
for
loop to generate a complex traversal.
And for some reason, the time needed to optimize the result traversal was growing up exponentially.
So with n=2
it was tolerable 30 ms
, for n>10
it takes minutes just to optimize the query.Solution
So just switching to a static
.repeat()
traversal with .map{myArray[it.loops]}
saved my day.