Getting CancelledError in Python FastAPI application using OpenAI chat completion

I have a Python FastAPI application that uses OpenAI's chat completion API with streaming. I am getting CancelledError very frequently. I also added timeout middleware with a 2 minute timeout, and I still get the same error. The same prompts work fine on my local machine. It's happening only on Railway, so I guess I may have to tweak some timeout setting in Railway. I will provide the full stack trace below. Please let me know how to resolve it.
CancelledError('Cancelled by cancel scope 7fa0af45f910')
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 472, in astream
    async for chunk in self._astream(
  File "/usr/local/lib/python3.10/site-packages/langchain_openai/chat_models/base.py", line 1989, in astream
    async for chunk in super().astream(*args, **kwargs):
  File "/usr/local/lib/python3.10/site-packages/langchain_openai/chat_models/base.py", line 779, in _astream
    async for chunk in response:
  File "/usr/local/lib/python3.10/site-packages/openai/_streaming.py", line 147, in __aiter__
    async for item in self._iterator:
  File "/usr/local/lib/python3.10/site-packages/openai/_streaming.py", line 160, in __stream__
    async for sse in iterator:
  File "/usr/local/lib/python3.10/site-packages/openai/_streaming.py", line 151, in _iter_events
    async for sse in self._decoder.aiter_bytes(self.response.aiter_bytes()):
  File "/usr/local/lib/python3.10/site-packages/openai/_streaming.py
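(For context: the failing setup is a FastAPI endpoint that relays an OpenAI chat completion stream. Below is a minimal sketch of that shape, using the plain OpenAI SDK rather than the LangChain wrapper visible in the trace; the model name and endpoint path are illustrative, and the actual code is in the repo linked later in the thread.)

```python
# Minimal sketch of the kind of endpoint described above (illustrative,
# not the OP's actual code). Requires OPENAI_API_KEY in the environment.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

app = FastAPI()
client = AsyncOpenAI()

async def generate(prompt: str):
    # Stream tokens from the chat completion API as they arrive.
    stream = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    async for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content

@app.get("/stream")
async def stream_endpoint(prompt: str):
    return StreamingResponse(generate(prompt), media_type="text/event-stream")
```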
Percy · 2mo ago
Project ID: ab55d7b2-61b7-4e97-b893-4b10bd69492b
erajasekar (OP) · 2mo ago
ab55d7b2-61b7-4e97-b893-4b10bd69492b
Brody · 2mo ago
for clarity, this is not an issue with railway and you would not have to tweak settings in railway; this is an application level issue. how long are these streams?
erajasekar (OP) · 2mo ago
@Brody I don't have this issue when running locally. I also tried increasing the async timeout using middleware, with no luck. The response streams for only a few seconds.
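(The timeout middleware the OP mentions isn't shown in the thread; a typical Starlette-style version would look like the sketch below. Worth noting: asyncio.wait_for cancels the wrapped handler when the timeout fires, so middleware like this is itself one way a CancelledError can surface inside a streaming generator.)

```python
# Sketch of a 2-minute timeout middleware like the one described
# (hypothetical; the OP's real middleware isn't shown in the thread).
# Caveat: asyncio.wait_for cancels the wrapped handler on timeout, and
# that cancellation surfaces as CancelledError inside any generator
# the handler is streaming from.
import asyncio

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import PlainTextResponse

class TimeoutMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        try:
            return await asyncio.wait_for(call_next(request), timeout=120)
        except asyncio.TimeoutError:
            return PlainTextResponse("Request timed out", status_code=504)

# Installed on the app with: app.add_middleware(TimeoutMiddleware)
```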
Brody · 2mo ago
it's a good start that it works locally, but please keep in mind that this does not mean you don't have an application level issue. since this is an application level issue, we would need a minimal reproducible example.
erajasekar (OP) · 2mo ago
Thank you @Brody. I tried a lot of troubleshooting options today, like changing from gunicorn to uvicorn, setting the keep-alive timeout in uvicorn, refactoring the code to handle timeouts, etc. None of it worked. Then I installed Docker and ran my app inside Docker, and it works perfectly fine. I will attach debug logs from local and from the Railway app to see if that helps. I have spent the whole day trying to debug this with no luck. Any help will be greatly appreciated.
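(For reference, the uvicorn keep-alive tweak mentioned above can be set programmatically; a sketch, assuming the repo's fastapp module. As the thread later shows, this was not the actual fix.)

```python
# Sketch of the keep-alive tweak mentioned above. uvicorn's
# timeout_keep_alive sets how many seconds an idle keep-alive
# connection is held open; it does not bound an active stream.
import uvicorn

from fastapp import app  # module name taken from the linked repo

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000, timeout_keep_alive=120)
```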
Brody · 2mo ago
since this is an application level issue we would need a minimal reproducible example.
erajasekar (OP) · 2mo ago
Thank you @Brody. Here is a minimal example to reproduce. App URL: https://streamingfastapiexample-production.up.railway.app/ - when you click, the response will stream, but it will stop before finishing. Code: https://github.com/erajasekar/StreamingFastAPIExample (use the master branch).
erajasekar (OP) · 2mo ago
FYI - I deployed the same exact code on another cloud provider, Render, and it works fine: https://streamingfastapiexample.onrender.com/. I am guessing that there is some HTTP connection or timeout parameter we need to tweak on Railway.
Brody · 2mo ago
just for the record, other cloud providers tend to monkey patch away a lot of user mistakes, but Railway will only ever run your code as-is. this is SSE; please see this proof of concept that has no such issues - https://utilities.up.railway.app/sse. i've had it run for multiple minutes without stopping, which further cements the fact that this is an application level issue.
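(Brody's proof of concept isn't shown in the thread; a rough FastAPI equivalent of an SSE endpoint that emits an event every second would look like this.)

```python
# Rough FastAPI equivalent of the SSE proof of concept linked above
# (illustrative sketch; the original implementation is not shown).
import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def event_source():
    n = 0
    while True:
        # Server-Sent Events framing: "data: ..." followed by a blank line.
        yield f"data: tick {n}\n\n"
        n += 1
        await asyncio.sleep(1)

@app.get("/sse")
async def sse():
    return StreamingResponse(event_source(), media_type="text/event-stream")
```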
erajasekar (OP) · 2mo ago
hmm. Thank you for the example. But it's sending events every second. What happens when the delay between events is more than 1 second, and random? The example I provided is the minimal code of my application, and it uses OpenAI's streaming SDK. If you could show my minimal example working or help me troubleshoot, I would really appreciate it. Thank you for your support on the weekend.
Brody · 2mo ago
i'll try to implement a similar endpoint in my example app. done - though it just streams the embedded text, it doesn't use openai - https://utilities.up.railway.app/sse2
erajasekar (OP) · 2mo ago
Thanks. It's a very tricky problem. I tried a lot of things at the application level, but nothing worked.
Brody · 2mo ago
this is clearly an application level issue, i suggest getting even more minimal
erajasekar (OP) · 2mo ago
This is a minimal MVP example to confirm that I could deploy my app on Railway.
Brody · 2mo ago
it can get even more minimal, remove openai
erajasekar (OP) · 2mo ago
I have to use OpenAI for my product. I doubt it's an issue in the OpenAI Python library, as it is used in ChatGPT and many other applications.
Brody · 2mo ago
right but you need to eliminate all variables
erajasekar (OP) · 2mo ago
Hi @Brody, is there any proxy used for outbound communication from Railway servers to external APIs? I use a Python library for extracting YouTube transcripts which works fine on Railway, but doesn't work on Render. When I researched further, I found that YouTube blocks requests from cloud provider servers, and a possible way around that is to use a proxy. But YouTube is not blocking requests from Railway.
Brody · 2mo ago
we do not have any proxy / filtering / firewall on outbound traffic
erajasekar (OP) · 2mo ago
Ok, it appears that there is some streaming-related issue with outbound communication from Railway to OpenAI.
Brody · 2mo ago
nope, this would be an application level issue, it is not productive for you to pin this issue on the platform
erajasekar (OP) · 2mo ago
This is based on what I found from troubleshooting. My goal is not to pin the problem on the platform, but to figure out a solution for my issue. Let me know if this is something that could be resolved through support, or whether I should move to another platform instead of spending more time troubleshooting this. Thank you for being with me in debugging this.
Brody · 2mo ago
unfortunately we cannot provide application level support, so there's not much more we can do here, as this is not an issue with the platform
Brody · 2mo ago
switched my own program over to using the exact same openai completion: same prompt, same model, same settings, same everything, but written in golang. i can reproduce this locally, meaning this is very much not an issue with railway. [STREAM_CLOSED] means openai closed the stream.
Solution
Brody · 2mo ago
found your issue. you set max_tokens to 512 - https://github.com/erajasekar/StreamingFastAPIExample/blob/master/fastapp.py#L21 - and so did i in my 1:1 golang rewrite. the response i showed above reached exactly 512 tokens.
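(The truncation is visible in the OpenAI SDK itself: when a completion is cut off by max_tokens, the final streamed chunk carries finish_reason == "length" instead of the normal "stop". A sketch of checking for it; the model name and wrapper function are illustrative.)

```python
# Sketch: detecting that a stream ended because it hit max_tokens.
# When the model is cut off at the token limit, the final chunk's
# finish_reason is "length" instead of the normal "stop".
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def generate(prompt: str):
    stream = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
        max_tokens=512,  # the setting that capped the example app's output
    )
    async for chunk in stream:
        if not chunk.choices:
            continue
        choice = chunk.choices[0]
        if choice.delta.content:
            yield choice.delta.content
        if choice.finish_reason == "length":
            # Truncated by the token limit, not a transport error.
            yield "\n[truncated: max_tokens reached]"
```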
Brody · 2mo ago
@erajasekar for visibility
erajasekar (OP) · 2mo ago
Thank you @Brody. The example application works after increasing the token count, but my actual application still fails. Will troubleshoot.
Brody · 2mo ago
during the however many dozens of times I've used your Railway hosted site, it never outright failed to stream me text. what issue are you getting?
erajasekar (OP) · 2mo ago
async CancelledError. Attaching the logs and source code of the original application that is failing. (The minimal example version works fine after increasing the token size; it's still not clear why it works locally with a 512 token limit but not from Railway.)
Brody · 2mo ago
that just looks like you not handling an error properly
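(One common pattern for the kind of error handling Brody is alluding to: log failures inside the streaming generator but always re-raise CancelledError so cancellation still propagates. A sketch, not the app's actual code.)

```python
# Sketch: wrap stream consumption so errors are logged, while
# CancelledError is re-raised and the event loop's cancellation
# semantics stay intact (swallowing it hides the real cause).
import asyncio
import logging

logger = logging.getLogger(__name__)

async def safe_stream(source):
    try:
        async for chunk in source:
            yield chunk
    except asyncio.CancelledError:
        # The surrounding task/cancel scope was cancelled (e.g. client
        # disconnect or timeout middleware); log and propagate.
        logger.warning("stream cancelled by cancel scope")
        raise
    except Exception:
        logger.exception("stream failed")
        raise

# Usage: StreamingResponse(safe_stream(generate(prompt)), ...)
```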
erajasekar (OP) · 2mo ago
I added logging to the exception handling, and it gives me this stack trace:
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7fa2af3bd900

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/api/v1/chat.py", line 60, in streaming_response
    async for chunk in llm_service.continue_chat(user_id, session_id, chat_request.message):
  File "/app/services/llm_service.py", line 128, in continue_chat
    async for chunk in self.systhesis_video_transcript_stream(user_id, session['transcript']):
  File "/app/services/llm_service.py", line 88, in systhesis_video_transcript_stream
    async for chunk in chain.astream(input="Synthesize the video transcript."):
  File "/usr/local/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3430, in astream
    async for chunk in self.atransform(input_aiter(), config, **kwargs):
  File "/usr/local/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 3413, in atransform
    async for chunk in self._atransform_stream_with_config(
  File "/usr/local/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2332, in _atransform_stream_with_config
    await run_manager.on_chain_error(e, inputs=final_input)
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7fa2af3bd900
Not sure what is causing this
Brody · 2mo ago
you are now using langchain? i thought you were using openai. either way, we can't help here, since this is a coding issue and we don't offer coding help