JanusGraph•8mo ago
CosmoBean

Issues faced for consistent indexing (both Composite & Mixed) [ElasticSearch]

--> The schema was provisioned, then verified, and the management objects were closed. [All indexes, both mixed and composite, were in the ENABLED state.]
--> [Issue 1] The index wasn't created in ElasticSearch: a direct index query for vertex totals returned a 404.
--> As a workaround, 1,000 documents of sample data were ingested as initial data; as expected, the indexes were still not present.
--> The data was reindexed. The indexes were then created in ElasticSearch, and some composite indexes needed reindexing as well. After reindexing, performance was as expected.
--> Closed the graph instances after testing the reindexed data.
--> Restarted the ingestion process for the full data set [23k documents].
--> [Issue 2] Once ingestion was done, I opened the graph and the indexes had not been updated.
--> The index query returned the same old response as before, not the updated count [mixed index]. The composite index worked as expected in this case and gave the right count.
--> [Issue 3] [SOLVED] (edit) Even though a unique index constraint was placed on this vertex, it was not enforced and some duplicates were created.
--> A unique index was placed on both the composite and the mixed index.

The unreliable nature of the indexes wasn't something I was expecting when working with JanusGraph: I constantly need to reindex the data for the indexes to work as expected. Please help me, and tell me if I got any step wrong. If any additional information is needed, please feel free to ask.
Bo•8mo ago
"Unique Index was placed on both Composite as well as the Mixed Index"
A unique index only works for composite indexes: https://docs.janusgraph.org/schema/index-management/index-performance/#index-uniqueness
Remember that you also need to use locking to enforce uniqueness: https://docs.janusgraph.org/advanced-topics/eventual-consistency/#data-consistency
CosmoBeanOP•8mo ago
Thanks a lot for the reply. I think it is about the locking: I thought that once a uniqueness index is in place (a composite unique index has been placed), uniqueness would be handled automatically by JanusGraph. I will look into this. Thanks once again.
Any idea about the inconsistent indexes, where the indexes aren't getting updated unless I reindex the whole data set again and again? This would be critical with the huge amount of data that we are trying to take into production.
Bo•8mo ago
Issue 1 and issue 2 are unclear to me. Please describe them in more detail, including every step you took. Ideally, attach screenshots of what you did.
CosmoBeanOP•7mo ago
Hey, I'm sorry that I'm currently not able to give you screenshots. Here's a more detailed explanation of what happened.
1. The schema was created using the management object, to insert structured data, along with the indexes, which were enabled using the REINDEX schema action before the start of insertion.
2. Once the schema and indexes were created, I verified that the status of each index had been set to ENABLED [which is a good-to-go sign that whatever data is inserted is going to be indexed].
ISSUE 1: After this schema creation step, which used MIXED and COMPOSITE indexes, I wanted to see whether I could find the mixed index that JanusGraph is supposed to create. The index wasn't available/created on ElasticSearch. I assumed that, since no data had been inserted yet, JanusGraph might be waiting to create the index. I inserted the data into JanusGraph and waited to see whether any indexed data appeared on ElasticSearch. I still couldn't find any indexed data.
ISSUE 2: To mitigate the issue of the index not being created, I inserted part of the data and used the REINDEX operation to make sure the indexes were created. This operation worked as expected and I was able to find the indexed data. As the issue of dangling management objects is known, I closed all dangling management objects. The partial data that was reindexed performed as expected. I then ingested all of the remaining data, and found that the composite index was being updated but the mixed index data was not, causing data inconsistency when index queries are used.
Apologies for the delayed reply, I missed the message. Please feel free to tag me as required. Thanks a lot for the help.
Bo•7mo ago
ISSUE 1: That doesn't sound right. I suspect you did something wrong. Could you please provide the code for each step you've done?
ISSUE 2: This sounds like essentially the same problem as the first issue. @CosmoBean
CosmoBeanOP•7mo ago
Hey @Boxuan Li, here are the initial details of the code:
1. I'm using the embedded version of JanusGraph through a Java Spring Boot application.
2. Before creating the schema, all transactions and dangling management objects are closed through the code.
3. Vertex properties and edge labels, along with their properties, are created.
4. Based on the config that I have given inside a HashMap, specific properties are indexed for a vertex [both composite and mixed in my case] --> unique constraint indexes.
5. Once we build the index [according to the documentation], we check whether the index is in the REGISTERED state.
6. Once the REGISTERED state is confirmed through the call, a REINDEX action is called to enable the index.
7. Once all of the indexes are enabled, a call is made to the HTTP endpoint of JanusGraph to create an entry in the "configurationManagementGraph" with the ElasticSearch endpoints and ScyllaDB endpoints for the specific graph.
8. Once all of the schema creation is done, I open the Gremlin console in a terminal, open the graph, and open the management to print out the schema and check that all the indexes are ENABLED. This part is successful without any issue.
9. When I check whether any index has been created in ElasticSearch, I can't find any trace of the index.
If you need the Java code for any specific part, please let me know, as the overall code is big.
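The check in step 9 can be scripted. The following is a minimal sketch, not code from this thread: it assumes Elasticsearch is reachable over plain HTTP at a base URL such as `http://localhost:9200` (an assumption), and the class name `EsIndexProbe` and the index names passed in are hypothetical. It relies on Elasticsearch answering `HEAD /<index>` with 200 when the index exists and 404 when it does not.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: probes Elasticsearch for the presence of an index.
final class EsIndexProbe {

    // Builds the URI of an index under the given Elasticsearch base URL,
    // tolerating a trailing slash on the base URL.
    static URI indexUri(String baseUrl, String indexName) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + "/" + indexName);
    }

    // Sends HEAD /<index>. Returns true only for a 200 response;
    // a 404 (index missing) or any other status returns false.
    static boolean indexExists(HttpClient client, String baseUrl, String indexName)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(indexUri(baseUrl, indexName))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return response.statusCode() == 200;
    }
}
```

Calling `EsIndexProbe.indexExists(HttpClient.newHttpClient(), baseUrl, name)` once per candidate index name shows which name, if any, the mixed index was actually created under.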
Bo•7mo ago
"Once REGISTERED state is confirmed through the call, a REINDEX action is called to enable the index."
You need REINDEX if and only if you have existing data that needs to be reindexed. Otherwise your steps seem clear and correct. I suspect your code doesn't follow what you describe. In order for the community to help, could you please post your code here? If they are spread over places, could you please put them into a single place/method? Make sure you reproduce the problem using a minimal example.
Solution
CosmoBean•7mo ago
@Boxuan Li, while I was using multi-tenancy, there were more configs to be given to ElasticSearch as well, for the graphName. Marking this as resolved. The index was being created as janusgraph_IndexName, whereas it was supposed to be created as tenant_IndexName. It was a misconfiguration that wasn't carefully considered. Thanks a lot for the patience and for answering the questions.
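To illustrate the misconfiguration, here is a minimal sketch of a per-tenant configuration map. It is not the poster's code, and several things in it are assumptions: the index backend is registered under the name `search`, the tenant's graph name is reused as the value of JanusGraph's `index.[X].index-name` option (which defaults to `janusgraph`), and the class and method names are hypothetical. Storage backend settings are omitted. The `expectedEsIndex` helper only mirrors the `<index-name>_<mixedIndexName>` pattern reported in this thread; lowercasing the mixed index name is an assumption, based on Elasticsearch requiring lowercase index names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: builds the index-related settings for one tenant graph.
final class TenantIndexConfig {

    // Settings for a tenant graph whose mixed indexes should live under
    // "<graphName>_..." in Elasticsearch instead of the default "janusgraph_...".
    static Map<String, Object> forTenant(String graphName, String esHost) {
        Map<String, Object> cfg = new LinkedHashMap<>();
        cfg.put("graph.graphname", graphName);
        cfg.put("index.search.backend", "elasticsearch");
        cfg.put("index.search.hostname", esHost);
        // Without this entry the prefix falls back to the default "janusgraph".
        cfg.put("index.search.index-name", graphName);
        return cfg;
    }

    // Name under which a mixed index is expected to appear in Elasticsearch,
    // following the "<index-name>_<mixedIndexName>" pattern from the thread.
    static String expectedEsIndex(Map<String, Object> cfg, String mixedIndexName) {
        Object prefix = cfg.getOrDefault("index.search.index-name", "janusgraph");
        return prefix + "_" + mixedIndexName.toLowerCase(Locale.ROOT);
    }
}
```

With the `index.search.index-name` entry present, the expected name starts with the tenant prefix; with it missing, the same lookup falls back to the `janusgraph_` prefix, which matches the symptom described above.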
CosmoBeanOP•7mo ago
Really Sorry for all the trouble 😓
Bo•7mo ago
No problem. Good to know you figured it out!