kip
RunPod
Created by kip on 8/24/2024 in #⚡|serverless
Execution time discrepancy
I built a custom text embedding worker. When I time the request on the pod, it takes about 20 ms to process from start to finish. The full request takes much longer (about 1.5 seconds), and RunPod returns executionTime: 1088ms in the response object. Do you know where this discrepancy might come from? As it stands, it severely limits my worker's throughput, and there isn't much point in using a GPU if it's this heavily bottlenecked. Thanks in advance! I'm also happy to share the code or any logs if that would help diagnose what's going on.
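For anyone trying to reproduce the comparison, here is a minimal sketch of one way to measure the in-handler time so it can be set against the reported executionTime. It is not the worker's actual code: `timed`, `embed`, `overhead_ms`, and the `texts` field are hypothetical names, and the only assumption about the job format is that the payload arrives under an `"input"` key.

```python
import time
from typing import Any, Callable, Dict, List


def timed(handler: Callable[[Dict[str, Any]], Any]) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Wrap a handler so its result carries the time spent inside the handler."""
    def wrapper(job: Dict[str, Any]) -> Dict[str, Any]:
        start = time.perf_counter()
        result = handler(job)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Returning the timing with the output lets the client compare it
        # against the platform-reported execution time for the same request.
        return {"output": result, "handler_ms": elapsed_ms}
    return wrapper


def embed(job: Dict[str, Any]) -> List[List[float]]:
    """Hypothetical stand-in for the real embedding model call."""
    texts = job["input"]["texts"]  # assumed payload shape
    return [[float(len(text))] for text in texts]


def overhead_ms(reported_execution_ms: float, handler_ms: float) -> float:
    """Time the platform attributes to the job that the handler itself did not spend."""
    return reported_execution_ms - handler_ms


handler = timed(embed)
```

With the numbers from the post, `overhead_ms(1088.0, 20.0)` gives 1068.0, so roughly a second of the reported time falls outside the measured processing. The remaining gap up to the 1.5 second round trip would have to be measured on the client side.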
75 replies