pm_osc
JanusGraph
Created by pm_osc on 6/26/2024 in #questions
Incremental schema changes - Property Key constraint does not exist
Thanks a lot for your help on this. Indeed, closing the graph helped with the constraints.
5 replies
JanusGraph
Created by pm_osc on 4/2/2024 in #questions
JanusGraph authentication - restricted privileges
In the janusgraph-server.yaml, this is the way to configure authorization:
------
authorization:
  authorizer: my.package.JanusGraphAuthorizer
  config:
    admins: jg_admin
    # for specifying multiple admins, a list can be used
    # admins:
    #   - jg_admin1
    #   - jg_admin2
------
Find enclosed the Java source code for "my.package.JanusGraphAuthorizer". The logic is very simple: if the Gremlin query contains the "JanusGraphFactory", "ConfiguredGraphFactory" or "openManagement" keywords, the query can be executed only by users configured as admin users. This logic was packaged into a JAR, and the JAR was added to the lib folder of JanusGraph.

It is probably also useful to mention here how to add new users to JanusGraph (to the credentials graph):
1. open Gremlin console and connect to JanusGraph
2. for creating a jg_user user, execute: JanusGraphFactory.open("conf/credentials.properties").traversal(org.apache.tinkerpop.gremlin.groovy.jsr223.dsl.credential.CredentialTraversalSource.class).user("jg_user", "pw2")

I hope the above helps others in case they want to secure their JanusGraph.
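The keyword check described in this post can be sketched roughly as follows. This is a plain-Python illustration of the decision logic only; it is not the enclosed Java class and does not use TinkerPop's authorizer API. The names `normalize_admins` and `is_query_allowed` are made up for this sketch, and a simple substring match is assumed.

```python
# Keywords that, per the post, mark a query as touching graph management.
RESTRICTED_KEYWORDS = ("JanusGraphFactory", "ConfiguredGraphFactory", "openManagement")


def normalize_admins(admins):
    """Accept either a single admin name or a list of names, as in the YAML config."""
    if isinstance(admins, str):
        return {admins}
    return set(admins)


def is_query_allowed(query, username, admins):
    """Return True if `username` may run `query` (hypothetical helper).

    Assumption: a plain substring match, so a keyword that appears inside
    a string literal or comment also counts as restricted.
    """
    restricted = any(keyword in query for keyword in RESTRICTED_KEYWORDS)
    return (not restricted) or username in normalize_admins(admins)


if __name__ == "__main__":
    admins = "jg_admin"
    print(is_query_allowed("g.V().count()", "jg_user", admins))
    print(is_query_allowed("graph.openManagement()", "jg_user", admins))
    print(is_query_allowed("graph.openManagement()", "jg_admin", admins))
```

A substring match is easy to reason about but deliberately coarse, so it may also block harmless queries that merely mention one of the keywords.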
5 replies
JanusGraph
Created by pm_osc on 4/2/2024 in #questions
JanusGraph authentication - restricted privileges
As "hadoopmarc" also answered on the janusgraph-users list recently, the main pointers are:
1. the authorization section of the Apache TinkerPop documentation (https://tinkerpop.apache.org/docs/current/reference/#authorization)
2. a sample file in the Gremlin Server source code (https://github.com/apache/tinkerpop/blob/master/gremlin-server/src/test/java/org/apache/tinkerpop/gremlin/server/authz/AllowListAuthorizer.java)

For reference, I thought it might be useful for others if I shared some snippets showing how I made authorization work with JanusGraph 1.0.0. The main purpose of authorization was to restrict users from accessing the JanusGraph Management System (e.g. opening a graph and making schema changes on it).

To implement meaningful authorization, we first need authentication. This requires a credentials graph, which can be defined with the below credentials.properties file (for a Cassandra backend) stored in the conf folder:
------
gremlin.graph = org.janusgraph.core.JanusGraphFactory
graph.graphname = credentials
storage.backend = cql
storage.hostname = cassandra
storage.cql.keyspace = credentials
------
In the janusgraph-server.yaml, this is the way to configure authentication:
------
authentication:
  config:
    defaultUsername: jg_admin
    defaultPassword: pw1
    credentialsDb: conf/credentials.properties
  authenticator: org.janusgraph.graphdb.tinkerpop.gremlin.server.auth.JanusGraphSimpleAuthenticator
  authenticationHandler: org.apache.tinkerpop.gremlin.server.handler.SaslAuthenticationHandler
------
Keep reading as the authorization part comes in the next message...
5 replies
JanusGraph
Created by pm_osc on 3/10/2024 in #questions
JG 0.6 vs JG 1.0 different behaviour for same Gremlin query
I did some more digging. It seems that this is not related to JanusGraph but to TinkerPop. When I used Gremlin Console 3.5.5 and loaded the "Modern" graph, executing "g.V().project('name', 'age').by('name').by('age')" returned:
==>[name:marko,age:29]
==>[name:vadas,age:27]
==>[name:lop,age:null]
==>[name:josh,age:32]
==>[name:ripple,age:null]
==>[name:peter,age:35]

Doing the same with Gremlin Console 3.7.0 results in:
==>[name:marko,age:29]
==>[name:vadas,age:27]
==>[name:lop]
==>[name:josh,age:32]
==>[name:ripple]
==>[name:peter,age:35]

So, it seems that TinkerPop 3.7.0 omits the keys of missing properties from the project result instead of returning null for them. After some more digging, it turned out that this behaviour was already introduced in TinkerPop 3.6.0. More details here: https://tinkerpop.apache.org/docs/current/upgrade/#_consistent_by_behavior
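The difference between the two console outputs can be imitated with plain Python dictionaries. This is only a sketch of the two result shapes, not Gremlin and not TinkerPop code; the function names are hypothetical, and the data is the "Modern" graph values shown in the outputs of this post.

```python
# Name and age values as printed in the post; the two software vertices have no age.
MODERN_VERTICES = [
    {"name": "marko", "age": 29},
    {"name": "vadas", "age": 27},
    {"name": "lop"},
    {"name": "josh", "age": 32},
    {"name": "ripple"},
    {"name": "peter", "age": 35},
]


def project_with_nulls(vertex, keys):
    """Imitates the 3.5.5 output: a missing property appears with a None value."""
    return {key: vertex.get(key) for key in keys}


def project_omitting_missing(vertex, keys):
    """Imitates the 3.7.0 output: a missing property drops its key entirely."""
    return {key: vertex[key] for key in keys if key in vertex}


if __name__ == "__main__":
    for vertex in MODERN_VERTICES:
        print(project_with_nulls(vertex, ["name", "age"]),
              project_omitting_missing(vertex, ["name", "age"]))
```

Client code that indexes results directly by key may therefore need to switch to a lookup that tolerates absent keys, such as `row.get("age")` in Python.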
4 replies
JanusGraph
Created by pm_osc on 2/5/2024 in #questions
Elasticsearch mixed index performance
Hi @Boxuan Li, sorry for the late reply on this. It took some time to play around with various configs. Thanks for the suggestions. For now, we kept the custom ID as a property, but this is probably not expected to have any effect on the mixed index performance. We played around with some settings but no success so far.

We have our stack as follows:
* RedHat Linux virtual machine - 8 vcpus, 64 GiB memory
* the following components running as Docker containers
  * JanusGraph 1.0
  * Cassandra 4.0.11
  * three node cluster Elasticsearch 8.0

We tried running node ingestion with the following configs:
* Scenario 1 - baseline:
  * ingesting person nodes with id, firstName, lastName
  * only composite index on id
* Scenario 2:
  * ingesting person nodes with id, firstName, lastName
  * composite index on id
  * mixed index on firstName
  * Default configs on JG and Elastic side.
* Scenario 3:
  * ingesting person nodes with id, firstName, lastName
  * composite index on id
  * mixed index on firstName
  * adjusted configs:
    * storage.batch-loading set to true on the dynamic graph
    * storage.buffer-size set to 6144 on the dynamic graph
    * ids.block-size set to 100000 on the dynamic graph
    * for the firstname index, refresh-interval set to 30s

We executed all three scenarios for 5,000 and 100,000 person nodes. Below are the results:
* 5,000 nodes
  * Scenario 1: 16.49 sec
  * Scenario 2: 107.43 sec (1.79 min)
  * Scenario 3: 119.3 sec (1.99 min)
* 100,000 nodes
  * Scenario 1: 141.84 sec (2.36 min)
  * Scenario 2: 1883.28 sec (31.39 min)
  * Scenario 3: 1798.08 sec (29.97 min)

The strange thing is that the results with the default config and the adjusted configs are not very different, and compared to the baseline, ingestion is still much slower. Could you please have a look at our configs, as we might have misconfigured something? We welcome any suggestion on how we could improve our ingestion speed with a mixed index. Thank you.
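To put the timings above side by side, the slowdown relative to the baseline and the ingestion rate can be derived with a few lines of Python. This is just arithmetic on the numbers reported in this post; the helper names are made up for the sketch.

```python
# Ingestion times in seconds, copied from the results in the post.
TIMINGS = {
    5_000: {"Scenario 1": 16.49, "Scenario 2": 107.43, "Scenario 3": 119.3},
    100_000: {"Scenario 1": 141.84, "Scenario 2": 1883.28, "Scenario 3": 1798.08},
}


def slowdown_vs_baseline(timings, baseline="Scenario 1"):
    """Return each scenario's run time as a multiple of the baseline run time."""
    return {scenario: seconds / timings[baseline] for scenario, seconds in timings.items()}


def throughput(node_count, seconds):
    """Return the ingestion rate in nodes per second."""
    return node_count / seconds


if __name__ == "__main__":
    for node_count, timings in TIMINGS.items():
        for scenario, factor in slowdown_vs_baseline(timings).items():
            rate = throughput(node_count, timings[scenario])
            print(f"{node_count:>7} nodes  {scenario}: "
                  f"{factor:5.1f}x baseline, {rate:6.1f} nodes/sec")
```

On these numbers, the mixed-index runs come out roughly 6.5 to 7.2 times slower than the baseline at 5,000 nodes and roughly 12.7 to 13.3 times slower at 100,000 nodes.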
9 replies
JanusGraph
Created by pm_osc on 2/5/2024 in #questions
Elasticsearch mixed index performance
Thanks Boxuan for the suggestion. I have opened the PR with id 4258. Thank you.
9 replies
JanusGraph
Created by pm_osc on 2/5/2024 in #questions
Elasticsearch mixed index performance
Thanks a lot @Boxuan Li for your suggestions. We will have a look at the referenced resource and get back to you. Please note that under the "Write optimization" section in the JG docs, the "this blog post" external link is not reachable any more. Shall I open an issue on JG github regarding this? Thank you.
9 replies
JanusGraph
Created by pm_osc on 11/20/2023 in #questions
JanusGraph 1.0 full-text search predicate in python - broken
Thanks a lot Florian for all the answers and clarification.
18 replies
JanusGraph
Created by pm_osc on 11/20/2023 in #questions
JanusGraph 1.0 full-text search predicate in python - broken
Thanks a lot Boxuan and Florian for looking into this, I very much appreciate your help. With JanusGraph 0.6 and gremlin-python 3.5.4, I am sure that I was able to use all full-text search predicates and that those utilized the index backend. Just for reference, here is a small example:
* g.V().has('firstName', P('textPrefix', 'John')).toList() -- this version used the index backend
* g.V().has('firstName', TextP.startingWith('John')).toList() -- with this version I got a warning in the JG log like "Query requires iterating over all vertices [()]. For better performance, use indexes"

Probably, the above "textPrefix" (and other JG full-text search predicates) worked for me due to the fallback mechanism mentioned by Florian. If I understand correctly, even with JG 0.6, there was no "official" way of using JanusGraph predicates with gremlin-python (only the fallback mechanism I used), right?

I see that the nicest way to move forward would be to implement a Python library for JanusGraph which implements serializers for JanusGraph-specific types. Do I understand correctly that the only benefit of creating and maintaining such a Python library would be to be able to use the JanusGraph-specific full-text predicates in Python?

I don't know how complicated the "fallback mechanism" that was removed in JG 1.0 was. If it was not that complicated, would it be an option to add the "fallback mechanism" back so that JG full-text search predicates could be used with gremlin-python without the need to write and maintain a dedicated Python library? I see that there was an attempt to probably do something similar 5 years ago (https://github.com/JanusGraph/janusgraph-python), but based on that I would suspect that not a lot of people are interested in using JanusGraph with Python.

Please let me know your thoughts on the drawbacks and benefits of adding the "fallback mechanism" back to JG 1.0 vs. writing and maintaining a JanusGraph-specific Python serializer. Thanks a lot for your help and time.
18 replies
JanusGraph
Created by pm_osc on 11/20/2023 in #questions
JanusGraph 1.0 full-text search predicate in python - broken
Sure. Please find enclosed. Note that I am using GraphSONSerializersV3d0 as I faced some other issues (unrelated to this) with the GraphBinaryMessageSerializerV1. Thank you.
18 replies