qfel Posts - Answer Overflow

qfel

•Created by qfel on 6/15/2024 in #questions

Setting index in gremlin-python

I was trying to build the vote graph from the tutorial on loading data, using gremlin-python. As far as I know, you can't simply add an index from non-JVM languages because, for example, there is no TinkerGraph that you could .open(). I don't know how much an index on 'userId' would improve performance, but my code simply takes too long to go through the queries from the vote file. I tried using the client functionality

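from gremlin_python.driver.client import Client
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

ws_url = 'ws://localhost:8182/gremlin'

# Create index on userId by submitting Groovy scripts as strings
client = Client(ws_url, 'g')
client.submit('graph = TinkerGraph.open()')
client.submit("graph.createIndex('userId', Vertex.class)")
client.close()

# Open a separate remote connection for regular traversals
conn = DriverRemoteConnection(ws_url, 'g')
g = traversal().with_remote(conn)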

to do it from a string query, and I'm not sure whether with_remote(conn) uses the previously assigned graph. Let me know how to do it correctly; I'm not sure how to assign to g from client.submit(...). Additionally, how does one speed up those queries if setting an index won't do it? In my implementation

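from gremlin_python.process.graph_traversal import GraphTraversalSource, __

def idToNode(g: GraphTraversalSource, id: str):
    # Get-or-create: return the 'user' vertex with this userId, adding it if missing
    return g.V().has('user', 'userId', id) \
            .fold() \
            .coalesce(__.unfold(),
                      __.add_v('user').property('userId', id)) \
            .next()

def loadVotes():
    with open("/tmp/wiki-Vote.txt", "r") as file:
        # Skip the four header lines
        for _ in range(4):
            next(file)

        for line in file:
            # strip() drops the trailing newline, which would otherwise end up in ids[1]
            ids = line.strip().split('\t')
            from_node = idToNode(g, ids[0])
            to_node = idToNode(g, ids[1])
            g.add_e('votesFor').from_(from_node).to(to_node).iterate()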

the call to idToNode for each line takes too long.
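
On the index part of the question, here is a minimal sketch rather than a confirmed fix. It assumes the server was started with a TinkerGraph configuration that exposes the graph to submitted scripts under the name `graph`, and that script submission is enabled; both are assumptions about the server setup, not something stated in the post. Under those assumptions, calling `TinkerGraph.open()` inside a script would create a new in-memory graph unrelated to the one `g` traverses, so the sketch calls `createIndex` on the existing `graph` binding instead. `create_vertex_index` and its `submit` parameter are hypothetical names introduced for illustration.

```python
from typing import Any, Callable


def create_vertex_index(submit: Callable[[str], Any],
                        key: str,
                        graph_var: str = "graph") -> str:
    """Send a createIndex script for `key` through `submit` and return the script.

    `graph_var` is the ASSUMED name of the server-side graph binding.
    `submit` is any callable that sends a script string to the server.
    """
    # Only allow plain identifiers so the key cannot break out of the quoted string.
    if not key.isidentifier() or not graph_var.isidentifier():
        raise ValueError("key and graph_var must be plain identifiers")
    script = f"{graph_var}.createIndex('{key}', Vertex.class)"
    submit(script)
    return script
```

With the `Client` from the first snippet, `submit` could be `lambda s: client.submit(s).all().result()`, which blocks until the server has finished running the script.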

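On the speed part, one option that does not depend on an index is to stop asking the server for the same vertex twice. The loop above calls `idToNode` two times per line, even though each user ID probably appears on many lines of the file. The sketch below keeps a client-side dictionary of already-resolved vertices, so each distinct ID costs one round trip. `parse_votes`, `load_votes`, `get_or_create` and `add_edge` are hypothetical names; the two callables are passed in so the caching logic can be tried without a running server.

```python
from typing import Callable, Dict, Iterable, List, Tuple, TypeVar

V = TypeVar("V")


def parse_votes(lines: Iterable[str], header_lines: int = 4) -> List[Tuple[str, str]]:
    """Return (from_id, to_id) pairs, skipping the header and malformed lines."""
    pairs: List[Tuple[str, str]] = []
    for i, line in enumerate(lines):
        if i < header_lines:
            continue
        parts = line.split()  # splits on tabs/spaces and drops the newline
        if len(parts) >= 2:
            pairs.append((parts[0], parts[1]))
    return pairs


def load_votes(pairs: Iterable[Tuple[str, str]],
               get_or_create: Callable[[str], V],
               add_edge: Callable[[V, V], None]) -> Dict[str, V]:
    """Add one edge per pair, resolving each distinct user ID only once."""
    cache: Dict[str, V] = {}

    def lookup(user_id: str) -> V:
        # Only hit the server the first time an ID is seen.
        if user_id not in cache:
            cache[user_id] = get_or_create(user_id)
        return cache[user_id]

    for from_id, to_id in pairs:
        add_edge(lookup(from_id), lookup(to_id))
    return cache
```

With the functions from the post, this could be wired up as `load_votes(parse_votes(file), lambda uid: idToNode(g, uid), lambda a, b: g.add_e('votesFor').from_(a).to(b).iterate())`. It still sends one request per edge, so it reduces the vertex lookups rather than removing the per-line round trip entirely.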
54 replies
