Vitor Martins
Apache TinkerPop
Created by Vitor Martins on 8/8/2024 in #questions
Optimizing connection between Python API (FastAPI) and Neptune
Hi guys. I've been working with gremlin-python at my company for the past 4 years, using Neptune as the database. We run a FastAPI server, and Neptune has been its main database since the beginning.
We have always struggled to get good performance from the API, but recently it has become a more pressing problem, with endpoints taking more than 10 s to respond.
We took some actions to try to improve this performance, such as updating the cluster to the latest engine version and doing the same for the FastAPI and gremlin-python dependencies.
Right now we're running three db.t4g.medium instances (one writer and two read replicas). We also tested with a single db.r6g.large, but we didn't see a significant improvement.
While trying to understand what's causing the slowness, we created a proof-of-concept API. The source code is in this repo: https://github.com/aca-so/neptune-poc/.
We also created a new connector to Neptune, different from the one we use in our main application, because our main application has a keep-alive mechanism to stop Neptune from closing the connections. For this PoC we took a different approach: recycling the connections every 5 minutes, based on the instances available in the cluster.
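To make the "recycle every 5 minutes" idea concrete, here is a minimal sketch of it. This is not the code from the repo: the class name `RecyclingConnection`, the `factory` callable, and the injectable `clock` are all made up for illustration. With gremlin-python the factory would presumably be something like `lambda: DriverRemoteConnection(url, "g")`, but the sketch only assumes the returned object has a `close()` method.

```python
import threading
import time
from typing import Any, Callable


class RecyclingConnection:
    """Hands out one shared connection and replaces it once it is too old."""

    def __init__(
        self,
        factory: Callable[[], Any],
        max_age_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory          # hypothetical: builds a new connection
        self._max_age = max_age_seconds  # 300 s mirrors the 5 minutes above
        self._clock = clock              # injectable so the age check is testable
        self._lock = threading.Lock()
        self._conn: Any = None
        self._created_at = 0.0

    def get(self) -> Any:
        """Return the current connection, recreating it if it has expired."""
        with self._lock:
            now = self._clock()
            if self._conn is not None and now - self._created_at >= self._max_age:
                self._close_quietly(self._conn)
                self._conn = None
            if self._conn is None:
                self._conn = self._factory()
                self._created_at = now
            return self._conn

    @staticmethod
    def _close_quietly(conn: Any) -> None:
        # The old connection may already be dead; ignore errors on close.
        try:
            conn.close()
        except Exception:
            pass
```

One caveat with this sketch: a request that already holds the old connection when it is closed will fail, so a real version would probably need to let in-flight requests finish before closing.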
So the first question is:
1. What's the best way to handle these connections? We thought of three approaches: keep-alive (we know it doesn't fit well with a connection pool), using a connection until it is closed and then renewing it, or renewing every X minutes. Is there another way? Which one is best?
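For reference, the second approach (use until closed, then renew) could look roughly like the sketch below. The names `RenewingConnection`, `factory`, and `retry_on` are hypothetical, and which exception types actually signal a closed connection depends on the driver and transport, so the `ConnectionError` default is only an assumption.

```python
from typing import Any, Callable, Tuple, Type


class RenewingConnection:
    """Keeps using one connection and rebuilds it only after it fails."""

    def __init__(
        self,
        factory: Callable[[], Any],
        retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    ) -> None:
        self._factory = factory    # hypothetical: builds a new connection
        self._retry_on = retry_on  # assumption: errors that mean "connection is gone"
        self._conn: Any = None

    def run(self, query: Callable[[Any], Any]) -> Any:
        """Call query(conn); on a connection error, reconnect and retry once."""
        if self._conn is None:
            self._conn = self._factory()
        try:
            return query(self._conn)
        except self._retry_on:
            self._discard()
            self._conn = self._factory()
            # A second failure propagates to the caller.
            return query(self._conn)

    def _discard(self) -> None:
        conn, self._conn = self._conn, None
        try:
            conn.close()
        except Exception:
            pass  # the connection is already broken; nothing more to do
```

The single retry is only safe for queries that can be repeated without side effects (reads, or idempotent writes), and the sketch is not thread-safe as written.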