Limosin18
Apache TinkerPop
Created by Limosin18 on 7/3/2024 in #questions
Optimising gremlin-python for FastAPI
PS: Please give me some rope here as I am new to gremlin-python!

I have a FastAPI service that serves REST API calls based on data I fetch from a Neptune instance. I am using the gremlin-python library to query the graph instance. Some points to note:
1. I'm using Uvicorn workers with FastAPI. You can assume a single worker for any stats I share.
2. There may be multiple graph calls in a single API request.
3. I am currently getting around 8-10 RPS.

Having worked on a lot of FastAPI services, I know that this RPS is far too low. In the past I have done a lot of optimisation by moving to asyncio-"native" libraries like AioHttp, AioRedis, etc., reaching 90-100 RPS per worker in some instances.

I have scoured the gremlin-python source code and am confused about a couple of things:
1. Even though the library creates a separate event loop, its coroutines are run in a blocking manner during I/O by calling loop.run_until_complete. Why is this the case? Is the separate event loop the very reason?
2. I have tried exploring other libraries like aiogremlin and aiogoblin, but they all seem to have lost community support and are no longer being updated, so I am somewhat hesitant to use them.
3. Is the main event loop being blocked while I am running queries? For example:
vertices_list = (
    self.__g.V(hopped_vertex_ids)
    .has_label(within(hop_via_vertices))
    .both_e(*hop_via_edges)
    .both_v()
    .has(T.id, without(hopped_vertex_ids))
    .has_not('supernode_identified_on')
    .dedup()
    .limit(vertex_count)
    .value_map(True)
    .to_list()
)
If the main FastAPI event loop is indeed being blocked, what can I do to unblock it when making the gremlin queries? Any suggestions to increase the throughput on each worker are highly appreciated!
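To make question 3 concrete, here is a minimal sketch of the behaviour I suspect. It does not use gremlin-python at all: `blocking_query` is a hypothetical stand-in that uses `time.sleep` to imitate a synchronous call like `.to_list()`, and `count_ticks`, `measure` and `run_demo` are illustrative names only. It compares how often the event loop gets to run while the "query" is in flight, first when it is called directly inside a coroutine and then when it is handed to a worker thread with `asyncio.to_thread` (Python 3.9+). Whether a real gremlin-python connection can safely be used from a worker thread like this is an assumption I have not verified.

```python
import asyncio
import time


def blocking_query(delay: float = 0.2) -> list:
    """Hypothetical stand-in for a synchronous call such as `.to_list()`."""
    time.sleep(delay)  # blocks the calling thread, like a sync network round trip
    return ["v1", "v2"]


async def count_ticks(stop: asyncio.Event, interval: float = 0.01) -> int:
    """Count how many times the event loop lets this task run before `stop` is set."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(interval)
        ticks += 1
    return ticks


async def measure(offload: bool) -> int:
    """Run one fake query and report how many ticks the loop managed meanwhile."""
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # yield once so the ticker task gets started

    if offload:
        # Runs in a worker thread; the event loop stays free for other tasks.
        result = await asyncio.to_thread(blocking_query)
    else:
        # Runs on the event loop thread; nothing else can be scheduled meanwhile.
        result = blocking_query()

    stop.set()
    ticks = await ticker
    assert result == ["v1", "v2"]
    return ticks


def run_demo() -> tuple[int, int]:
    """Return (ticks with direct call, ticks with offloaded call)."""
    blocked = asyncio.run(measure(offload=False))
    offloaded = asyncio.run(measure(offload=True))
    return blocked, offloaded


if __name__ == "__main__":
    blocked, offloaded = run_demo()
    print(f"ticks with direct call: {blocked}, ticks with to_thread: {offloaded}")
```

If the direct call yields far fewer ticks than the offloaded one, the loop was starved for the duration of the call, which is what I suspect is happening to my other requests. Offloading to a thread would only be a workaround, since throughput would then be bounded by the thread pool size rather than by the event loop.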