Best configuration for a single server

Server spec: 256 vCPU (threads) / 4 TB memory
RAID5: 35 TB XFS filesystem

I currently have a single server that I'd like to test with in a semi-production state. I realize it's not HA, but it has plenty of horsepower for a decently sized dataset.

Having never installed JanusGraph or ScyllaDB, I'm not sure how much memory to assign to each service. I imagine the underlying DB will consume the most resources. Please correct me if I'm wrong.

Currently thinking:
JanusGraph + OS (500 GB memory)
ScyllaDB bare-metal install (3.5 TB memory)

Furthermore, would you install each service on the base OS, or would you use Docker containers? If Docker, how many for ScyllaDB?

To those with experience using JG + ScyllaDB, what would your recommended memory allocations and configuration be? Any help would be greatly appreciated.
6 Replies
Florian Hockmann, 10mo ago
For hardware recommendations in general: we don't really have any for JanusGraph itself, at least not that I know of. But JanusGraph Server can be thought of mostly as a translator of Gremlin traversals into queries to the backend, so Scylla in your case. Therefore most of the load is typically on your backend and not so much on JanusGraph Server.

It's important to know, however, that JanusGraph Server is a JVM application, so its memory usage comes mostly from the JVM heap. (JanusGraph caches data to avoid unnecessary lookups into the backend.) So I'd look into resources on configuring the JVM for large heaps, which at least used to be a problem for older JVM versions. But maybe someone else can also share their experiences with this. We only use JanusGraph Server in Docker containers at my company, so we only scale JanusGraph horizontally and keep each container relatively small (at most 8 GB RAM per container).

For Scylla, you can find good recommendations directly in their documentation, which you already cited in your other question.

Regarding bare metal vs. Docker: Scylla is heavily optimized to use all available resources. For example, it creates one thread per CPU core and assigns each thread a share of the data for which it is responsible. This means it doesn't make much sense to host more than one Scylla instance on one server; you only add unnecessary coordination between the instances. Scylla also comes with a setup script that configures the server to optimize Scylla's performance. This doesn't work in Docker either, so you need to perform this configuration yourself if you want to benefit from it. Additionally, Scylla has its own TCP/IP stack, which means performance will suffer if it has to use the Docker network stack. You can, however, configure the Scylla container to use the host network to avoid this problem. So overall you get the best performance if you install Scylla on bare metal.
I think it can still make sense to use Scylla in Docker, for example if you just want to try it out to see whether it suits your needs in general. Just keep in mind that you won't see Scylla reach its full potential performance-wise due to the Docker overhead. Later on, you can also still use Scylla in Docker in production without much impact on performance if you put in the effort to properly configure your host and Docker for it. We're doing that at my company, as we find it easier to operate Scylla in Docker containers. Installing updates, for example, is very easy with containers.

For JanusGraph, I don't see good reasons against running it inside Docker. In particular, it makes scaling easier. Just note that you then need a load balancer (Nginx, for example) in front of JanusGraph Server.

Regarding how many JanusGraph containers you need: it depends on how big each container is. As I've mentioned, we have good experience with several small containers at my company, but that's also because we're using a k8s cluster with several small hosts. Your situation is of course different if you have just one very big server. So I think you need to find out by trial how much memory JanusGraph Server can handle efficiently, given the limitations of the JVM heap. If the result is, for example, that performance suffers with more than 64 GB of heap space (really just a random number) due to longer garbage collection pauses, then you can use Docker containers with 64 GB of memory each. How many containers you'll need then depends on how many traversals you want to execute in parallel and how many resources they need. You'll have to try this out for your specific workload. JanusGraph's metrics can help you get better insight into this.
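To make the container arithmetic above concrete, here is a minimal sketch. It only computes the ceiling that a memory budget allows; the number of containers actually needed still depends on traversal concurrency, as described above. The 500 GB budget comes from the question, the 64 GB heap cap is the placeholder number from this answer, and the 25% allowance for non-heap JVM memory is purely an assumption for illustration, as is the function name.

```python
def plan_janusgraph_containers(total_mem_gb, heap_cap_gb, non_heap_overhead=0.25):
    """Return (max_containers, mem_per_container_gb) for a memory budget.

    heap_cap_gb is the largest JVM heap that still performs well in your
    own tests. non_heap_overhead is an assumed allowance for memory the
    JVM uses outside the heap (not a measured value).
    """
    if total_mem_gb <= 0 or heap_cap_gb <= 0:
        raise ValueError("memory sizes must be positive")
    mem_per_container = heap_cap_gb * (1 + non_heap_overhead)
    max_containers = int(total_mem_gb // mem_per_container)
    return max_containers, mem_per_container


# Hypothetical inputs: 500 GB set aside for JanusGraph, 64 GB heap cap.
count, per_container = plan_janusgraph_containers(500, 64)
print(f"up to {count} containers at {per_container:.0f} GB each")
```

Treat the result as an upper bound from memory alone, then adjust it based on what JanusGraph's metrics show under your real workload.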
Florian Hockmann, 10mo ago
Note also that JanusGraph itself doesn't store anything persistently, so it doesn't need storage. So, at least if you want to turn this into a production setup, I think it makes sense to install only Scylla on your server and use a different server for JanusGraph. That way, Scylla can use all of the CPU cores without JanusGraph competing for them. Alternatively, you can assign specific CPU cores to Scylla and other cores to the JanusGraph Docker container(s):
https://stackoverflow.com/a/25999490/6753576
https://opensource.docs.scylladb.com/stable/operating-scylla/procedures/tips/production-readiness#cpuset-conf
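A minimal sketch of the core split described above, assuming 256 logical CPUs numbered 0 to 255 and a hypothetical reservation of 32 of them for JanusGraph. It prints a `cpuset.conf` line in the format the linked Scylla page describes and a `docker run --cpuset-cpus` flag as in the linked Stack Overflow answer. The function name, the 32-core figure, and the contiguous ranges are illustrative assumptions; a real split should follow the machine's NUMA and hyperthread topology, and the exact file format should be checked against the Scylla docs for your version.

```python
def split_cores(total_cores, janusgraph_cores):
    """Split CPUs 0..total_cores-1 into two contiguous cpuset ranges.

    Returns (scylla_range, janusgraph_range) as strings such as "0-223".
    Contiguous ranges are a simplification; they ignore NUMA nodes and
    hyperthread siblings.
    """
    if not 0 < janusgraph_cores < total_cores:
        raise ValueError("janusgraph_cores must be between 1 and total_cores - 1")
    scylla_last = total_cores - janusgraph_cores - 1
    scylla_range = f"0-{scylla_last}"
    janusgraph_range = f"{scylla_last + 1}-{total_cores - 1}"
    return scylla_range, janusgraph_range


# Hypothetical split: 224 CPUs for Scylla, 32 for JanusGraph containers.
scylla_range, janusgraph_range = split_cores(256, 32)

# Line for Scylla's cpuset.conf (verify the format against the Scylla docs).
print(f'CPUSET="--cpuset {scylla_range}"')

# Flag for the JanusGraph container(s); the image name is an assumption.
print(f"docker run --cpuset-cpus={janusgraph_range} janusgraph/janusgraph")
```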
HailDevil, 4mo ago
@Florian Hockmann Do you use the same JanusGraph servers to write the data as well, or are these servers only responsible for reading the data? I want to understand whether running JanusGraph as a server brings any performance improvement for writing data. Currently we use embedded JanusGraph, which fails with many exceptions due to memory and performance problems. What do you recommend based on your experience? cc: @NeO
Florian Hockmann
We actually separate JanusGraph servers between writing and reading, but that's not because of resources but for security reasons. It allows us to use different credentials for apps that can only read data vs. apps that can also write. So, yes, we are also using JanusGraph Server instances in Docker for writing. I am not sure, though, whether that improves performance compared to embedded JanusGraph. Maybe it helps because it makes it easier to scale JanusGraph independently of your app? I personally prefer dedicated JanusGraph Server instances because of the scaling issue and because it also makes it easier to monitor JanusGraph, for example its resource usage.
NeO, 3mo ago
@Florian Hockmann Did you face a similar issue on your end when using a remote setup to create vertices and edges? https://discord.com/channels/981533699378135051/1272925642832478228/1272925642832478228
Florian Hockmann
No, sorry, I would have responded there if I could help. "Connection reset by peer" is something that I also see from time to time in our logs, but I never investigated further as it's not unusual in a setup with pods restarting regularly. In the linked question, it's also related to some SSL error, and I can't say anything about that.