Comma-Separated Config Option Values Via Environment Variable?
Hey @Bo & @porunov, I was trying out the retry logic added by my PR (https://github.com/JanusGraph/janusgraph/pull/4409) by setting environment variables in my Docker Compose file like so:
janusgraph.index.some_index_name_here.elasticsearch.retry-error-codes=408,429
Much like was done in the unit test I wrote (https://github.com/JanusGraph/janusgraph/blob/487e10ca276678862fd8fb369d6d17188703ba67/janusgraph-es/src/test/java/org/janusgraph/diskstorage/es/rest/RestClientSetupTest.java#L240). But to my surprise it doesn't seem to be received well during startup:
janusgraph | Caused by: java.lang.NumberFormatException: For input string: "408,429"
janusgraph |     at java.base/java.lang.NumberFormatException.forInputString(Unknown Source) ~[?:?]
.....
janusgraph |     at org.janusgraph.diskstorage.es.rest.RestClientSetup.connect(RestClientSetup.java:83) ~[janusgraph-es-1.1.0-20240427-134121.4bddfb4.jar:?]
What surprised me is that this option uses the same String[] parsing logic as other options, which got me curious whether others fail too. So I tried a more mundane one:
janusgraph.index.some_index_name_here.hostname=elasticsearch,elasticsearch
And it too doesn't seem to get parsed correctly:
janusgraph | Caused by: java.net.UnknownHostException: elasticsearch,elasticsearch
janusgraph |     at java.base/java.net.InetAddress$CachedAddresses.get(Unknown Source) ~[?:?]
....
janusgraph |     at org.elasticsearch.client.RestClient.performRequest(RestClient.java:288) ~[elasticsearch-rest-client-8.10.4.jar:8.10.4]
janusgraph |     at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.clusterHealthRequest(RestElasticSearchClient.java:171) ~[janusgraph-es-1.1.0-20240427-134121.4bddfb4.jar:?]
So I just wondered: is there a special syntax intended here?
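For context, the failure mode itself is easy to reproduce in plain Java. This is only an illustration (not JanusGraph's actual parsing code, and `parseCodes` is a hypothetical helper): `Integer.parseInt` chokes on the raw comma-separated value, while splitting first then parsing each piece works fine.

```java
import java.util.Arrays;

// Toy illustration (not JanusGraph's actual code) of why a raw
// comma-separated value blows up when handed straight to Integer.parseInt,
// while split-then-parse handles it.
public class CommaParseDemo {
    // Hypothetical helper: split on commas, trim, parse each piece.
    static int[] parseCodes(String raw) {
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        try {
            Integer.parseInt("408,429"); // raw value reaching parseInt directly
        } catch (NumberFormatException e) {
            // Same message as the startup error above
            System.out.println("NumberFormatException: " + e.getMessage());
        }
        System.out.println(Arrays.toString(parseCodes("408,429"))); // [408, 429]
    }
}
```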
I found the env var parsing in the docker startup script (https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-dist/docker/docker-entrypoint.sh#L41-L49)
but it seems like it should be faithfully passing things through: https://github.com/JanusGraph/janusgraph/blob/487e10ca276678862fd8fb369d6d17188703ba67/janusgraph-dist/docker/docker-entrypoint.sh#L48
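To make the pass-through concrete, here is a rough Java rendition of what that shell loop appears to do, under my reading of the script (the class and method names are illustrative, the real logic lives in docker-entrypoint.sh): every environment entry prefixed with `janusgraph.` gets the prefix stripped and is written out as a properties line, value untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Rough, illustrative Java rendition of the docker-entrypoint.sh loop:
// take every env entry prefixed with "janusgraph.", strip the prefix, and
// emit a properties entry with the value passed through verbatim.
public class EnvToProperties {
    static Map<String, String> translate(Map<String, String> env) {
        Map<String, String> props = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().startsWith("janusgraph.")) {
                props.put(e.getKey().substring("janusgraph.".length()), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("janusgraph.index.search.elasticsearch.retry-error-codes", "408,429");
        env.put("PATH", "/usr/bin"); // unrelated vars are ignored
        // The comma-separated value survives translation intact, so any
        // breakage has to happen later, at config parsing time.
        System.out.println(translate(env));
    }
}
```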
Is there any chance you could try it out without using docker?
But Docker is provided as an official way to deploy JanusGraph, so shouldn't we be able to do this through Docker?
yeah I am just wondering if there's something wrong with our Docker image
I've never seen this bug before - but I've never used Docker to deploy JanusGraph
Solution
Okay, after quite a journey of vetting the parsing behavior from environment variables into property files, through the various Configuration implementations, into ConfigOption, etc., I've found where this went sideways... and like the worst of these situations, it looks like this one was self-inflicted 🤦♂️.
A long, long time ago, when I was starting to work with JanusGraph, I was tired of deployments that failed silently, leaving my container up but dead inside, for reasons that weren't its fault (e.g. starting up before Cassandra was ready and timing out the storage wait).
Trying to find a solution for that, I stumbled upon TinkerPop's CheckedGraphManager (https://github.com/apache/tinkerpop/blob/master/gremlin-server/src/main/java/org/apache/tinkerpop/gremlin/server/util/CheckedGraphManager.java), which automatically terminates the process if the specified graphs did not successfully open.
So I switched to that graph manager like so:
ENV gremlinserver.graphManager=org.apache.tinkerpop.gremlin.server.util.CheckedGraphManager
But this has the unintended consequence of bypassing the configuration parsing that JanusGraphManager provides (https://github.com/JanusGraph/janusgraph/blob/487e10ca276678862fd8fb369d6d17188703ba67/janusgraph-core/src/main/java/org/janusgraph/graphdb/management/JanusGraphManager.java#L73), like handling String array config options 🤦♂️.
Would there be appetite for something like a CheckedJanusGraphManager that extends JanusGraphManager and duplicates the logic that TinkerPop's CheckedGraphManager performs? I could attempt to put that up. It's handled in my k8s environment so the container doesn't just hang around; it's a faster alternative to waiting for probes to fail.
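The fail-fast behavior in question can be sketched without any TinkerPop dependencies. The types and names below are stand-ins, not the real GraphManager API; a hypothetical CheckedJanusGraphManager would extend JanusGraphManager (keeping its config handling) and add roughly this check after attempting to open the configured graphs.

```java
import java.util.Map;

// Dependency-free sketch of the fail-fast behavior discussed above. All
// names here are stand-ins for illustration; the real CheckedGraphManager
// lives in TinkerPop and performs a check like this at construction time.
public class CheckedManagerSketch {
    // After attempting to open all configured graphs, refuse to keep
    // running if none of them opened successfully.
    static void failIfNoGraphsOpened(Map<String, Object> openedGraphs) {
        if (openedGraphs.isEmpty()) {
            // The real implementation terminates the process so the
            // container dies instead of idling with no graphs.
            throw new IllegalStateException("No graphs opened successfully; shutting down");
        }
    }

    public static void main(String[] args) {
        try {
            failIfNoGraphsOpened(Map.of()); // nothing opened: fail fast
        } catch (IllegalStateException e) {
            System.out.println("Would exit: " + e.getMessage());
        }
        failIfNoGraphsOpened(Map.of("graph", new Object())); // at least one graph: fine
        System.out.println("ok");
    }
}
```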
But such a graph manager would also only work for problems during startup, right?
Doesn't our wait-for-storage logic during the startup of the Docker container also handle this problem?
https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-dist/docker/docker-entrypoint.sh#L71
I also wanted the container to die if it failed to successfully apply the schema I had defined for it to load on start.
I see, that's of course a different requirement
Doesn't it take quite some time to load the schema every time a container starts?
Most of the time the schema should already be fully loaded, no?
Of course, no argument there. But it's what drew me to the checked graph manager, albeit at the cost of unwittingly opting out of the JG add-ons with its graph manager.
Hence asking if there would be appetite for a PR that'd add a "Checked"JanusGraphManager
(Just trying to better understand your use case here)
The scripts check whether the schema was already loaded and skip doing it again.
Each script is versioned and writes a "completed version X" indicator into the graph
Loosely inspired by Liquibase
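That versioned-script approach can be sketched generically; everything below is illustrative rather than the poster's actual tooling, with a plain Map standing in for the "completed version X" marker stored in the graph.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of version-gated schema scripts (loosely
// Liquibase-style): each script records a "completed version X" marker,
// and reruns are skipped once the marker exists. A Map stands in for the
// marker that would really be written into the graph.
public class SchemaVersioner {
    private final Map<Integer, Boolean> completed = new LinkedHashMap<>();

    // Runs the script only if its version hasn't completed yet; returns
    // whether it actually ran.
    boolean applyIfNeeded(int version, Runnable script) {
        if (Boolean.TRUE.equals(completed.get(version))) {
            return false; // already applied on a previous startup
        }
        script.run();
        completed.put(version, true); // write the "completed version X" marker
        return true;
    }

    public static void main(String[] args) {
        SchemaVersioner v = new SchemaVersioner();
        System.out.println(v.applyIfNeeded(1, () -> System.out.println("applying v1")));
        System.out.println(v.applyIfNeeded(1, () -> System.out.println("applying v1")));
    }
}
```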
A problem I'm seeing with that approach is that schema changes require collaboration of all running JanusGraph instances which is hard to guarantee if you're automatically applying them every time a container starts.
At my company we are only applying schema changes manually and stop all other JanusGraph instances and also all clients so only a single instance is up and running where we then execute the schema script.
But in general, schema management is really a hard topic with JanusGraph which leads to a lot of headaches and it's definitely an area where JanusGraph could be improved a lot
Well, to be clear I'm not doing it in the containers that are running JG for purposes of hosting, it's done in a secondary loader container that runs just once at the deployment start
Ah, I see, then it even sounds similar to what we're doing
But now that I'm retracing the steps (I automated this a long time ago, so it's gotten a little dusty in my memory), I may have conflated an issue here
The reason I did it wasn't for schema
I know that problem 😅
Yeah, I ended up doing something similar with our old HBase system. It feels fairly commonplace to rebuild these tools 😅
The storage wait script does watch for Cassandra to be up, but IIRC it would just time out and not kill the container? So the graph would attempt to open, fail, and then JG would just hang around without any graphs open.
I guess if such a graph manager is useful for you, then it will probably also be useful for others. As a first step you could create an issue for this where you describe what that graph manager does differently and what benefits it brings: https://github.com/JanusGraph/janusgraph/issues/new?assignees=&labels=&projects=&template=feature-request.md
You could then wait a bit to see whether it gets any reactions and then provide a PR
Vs what I wanted was the container to die in that case. Which the checked graph manager would do.
With storage wait script you mean the docker-entrypoint.sh logic I mentioned earlier? If I'm not completely mistaken, that should make the startup fail before even starting JanusGraph Server, which means the container won't get up and running at all
It first tries to open the graph locally in the Gremlin Console via JanusGraphFactory.open() with your config, and only if that works does it start JanusGraph Server with the same config.
I'll have to reproduce the scenario before I create the feature request.
TBH this was a change I made over a year ago, so the details have fallen into the memory aether 😅
We also sharpened other k8s infra with liveness probes since then, so we may have addressed this issue, at least for ourselves, via other means.
Looking at my commit message for the change, I described it as: "Kill JG if the graph fails to open instead of idly hosting no graphs". So it seems I'm remembering the symptom correctly, but having defined reproduction steps would be beneficial for the feature request.