Using Spark inside Gremlin-Server
I am trying to invoke Spark in Java inside a call step inside my graph implementation, which is loaded in Gremlin-Server, but I am running into a difficult error.
Outside of gremlin-server everything works properly.
I was wondering if anyone has experienced this or if there is anything that gremlin-server does with the class loader that might affect this?
The stack trace is attached.
It seems to be a problem with class loading that I do not get outside of gremlin-server
not much of an error
i guess some static initializer is failing?
i suppose the question is, what is trying to initialize in that moment?
Sorry, I was out of town for the Canadian long weekend. I was trying to invoke Spark to do a read() on a CSV. If you look at the middle of the stack trace, it kind of implies that Scala is hitting some class loader issue.
Seems to me like the classloader is failing to find stuff, but this only happens in gremlin-server, which I thought might be doing something fancy with class loading.
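One way to narrow down a suspicion like this is to compare the class loaders in play at the point where the call step runs, inside and outside the server. The following is only a minimal sketch under stated assumptions: `ClassLoaderProbe` and its methods are hypothetical helpers, not part of TinkerPop or Spark, and nothing here is confirmed to be the cause of the error in this thread. The wrapper at the end temporarily swaps the thread context class loader, which is a common thing to try when a library resolves classes through it.

```java
import java.util.concurrent.Callable;

// Hypothetical diagnostic helper; not part of TinkerPop or Spark.
final class ClassLoaderProbe {

    /** Describes the thread context class loader and the loader that defined the given class. */
    static String describe(Class<?> type) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader defining = type.getClassLoader(); // null means the bootstrap loader
        return "context=" + context + ", defining=" + defining;
    }

    /** Returns true if the named class can be found through the given loader (without initializing it). */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Runs the task with the given context class loader, then restores the previous one. */
    static <T> T withContextClassLoader(ClassLoader loader, Callable<T> task) throws Exception {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return task.call();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Print the loaders as seen from this class; run the same call from inside the step to compare.
        System.out.println(describe(ClassLoaderProbe.class));
    }
}
```

Logging `describe(...)` from the step both inside and outside the server, and calling `isVisible(...)` with a class name taken from the stack trace, would show whether the two environments actually differ.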
not sure if it's relevant or not, but there are some manifest entries in spark-gremlin that i don't think have been looked at in a while. maybe upgrading to newer spark versions broke something? https://github.com/apache/tinkerpop/blob/3.6.4/spark-gremlin/pom.xml#L325-L344
I will have a look at this, any lead is useful right now, thanks Stephen
you can see how those manifest entries are used here in the DependencyGrabber that gremlin-server.sh install invokes: https://github.com/apache/tinkerpop/blob/3.6.4/gremlin-groovy/src/main/groovy/org/apache/tinkerpop/gremlin/groovy/util/DependencyGrabber.groovy
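To see which manifest entries a given build of a jar actually carries, the main manifest attributes can be dumped with the standard `java.util.jar` API. This is a rough sketch only: `ManifestDump` is a hypothetical helper, the jar path is whatever is installed locally, and it does not reproduce how DependencyGrabber itself reads or interprets these entries.

```java
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper for inspecting a jar; not part of TinkerPop.
final class ManifestDump {

    /** Returns the main manifest attributes of a jar as a sorted name -> value map. */
    static Map<String, String> mainAttributes(String jarPath) throws IOException {
        Map<String, String> result = new TreeMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            if (manifest != null) {
                manifest.getMainAttributes()
                        .forEach((name, value) -> result.put(name.toString(), value.toString()));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: ManifestDump <path-to-jar>");
            return;
        }
        // Print every main attribute so the entries can be compared against the pom.xml.
        mainAttributes(args[0]).forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

Running it against the locally installed spark-gremlin jar would list the entries to compare with the linked pom.xml lines.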
So I kind of figured it out.
If I go run [...], take the output, and add it to gremlin-server.sh as [...],
it works. Not sure on a general solution for this yet but that is a start
oh - well, i think you still have to do the standard steps that you would take if you were using the Gremlin Console: https://tinkerpop.apache.org/docs/current/reference/#hadoop-gremlin
so setting up stuff like HADOOP_GREMLIN_LIBS and the like
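A quick sanity check for that setup is to confirm, from the same environment the server is started in, that HADOOP_GREMLIN_LIBS is set and points at directories that really contain jars. The sketch below is an illustration only: `HadoopGremlinLibsCheck` is a hypothetical helper, and splitting the value on the platform path separator (":" on Linux and macOS) is an assumption about the variable's format, so check the linked reference docs for the authoritative description.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sanity check; not part of TinkerPop.
final class HadoopGremlinLibsCheck {

    /** Splits a path-list value on the platform path separator, skipping empty entries. */
    static List<String> splitPaths(String value) {
        List<String> paths = new ArrayList<>();
        if (value == null) {
            return paths;
        }
        // Assumption: entries are separated the same way as a classpath.
        for (String part : value.split(Pattern.quote(File.pathSeparator))) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }

    /** Counts the .jar files directly inside a directory; returns -1 if it is not a readable directory. */
    static int countJars(File dir) {
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        return jars == null ? -1 : jars.length;
    }

    public static void main(String[] args) {
        String value = System.getenv("HADOOP_GREMLIN_LIBS");
        if (value == null) {
            System.out.println("HADOOP_GREMLIN_LIBS is not set");
            return;
        }
        for (String path : splitPaths(value)) {
            System.out.println(path + " -> " + countJars(new File(path)) + " jar(s)");
        }
    }
}
```

A count of -1 for an entry means the path does not exist or is not a directory in the environment the check ran in.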