400 Errors with allenai-olmocr on Serverless SGLang - Need Payload Help!
I'm trying to deploy the allenai/olmOCR-7B-0225-preview model (a fine-tuned Qwen/Qwen2-VL-7B model) on RunPod using the Serverless SGLang endpoint template, but I consistently get 400 Bad Request errors when sending requests. The endpoint is running on an L40S.
I'm trying to send PDF documents for OCR, and I hope the issue is just with the input payload. I've tried various common input formats based on the RunPod documentation and examples, but no luck so far. I've tried sending a PDF file and page number, as well as what I originally tried (PDF anchor text and a page image). In the code below, I'm using the PDF retrieved from https://molmo.allenai.org/paper.pdf
I'm using the allenai-olmocr model (Hugging Face link: https://huggingface.co/allenai/olmOCR-7B-0225-preview), deployed as a Serverless SGLang endpoint on RunPod. I deployed it the lazy way, providing the Hugging Face handle and mostly default settings, and I'm wondering whether I need to set up a handler and deploy using Docker to get it to work.
I've checked the RunPod documentation for Serverless requests (https://docs.runpod.io/serverless/overview) and the olmOCR documentation and examples (https://github.com/allenai/olmocr), but I'm still struggling to get the input payload correct. I've mostly tried sending base64 data; I also experimented with an S3 URL, but it didn't seem to make a difference.
Could someone please help me understand the exact JSON payload format expected by RunPod Serverless SGLang for a multimodal model like allenai/olmocr? Specifically, I'm unsure about the correct structure for including both text (anchor text) and image data in the request, and I can't find any clear answers. I was able to get this working on Replicate (using an existing setup, https://github.com/lucataco/cog-olmocr), but that was very much a point-and-click setup, and Replicate was pretty expensive and slow.
I've checked the RunPod logs and they indicate a "Bad Request", pointing to an issue with the input data format.
Any guidance or example payloads would be greatly appreciated!
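For reference, here is a minimal sketch of how a text-plus-image request is usually shaped in the OpenAI chat-completions format, which SGLang's server can also speak. It is not a confirmed fix: the `/openai/v1` route on the RunPod endpoint is an assumption based on how RunPod's OpenAI-compatible workers are typically exposed, and `build_chat_payload` and `openai_route_url` are hypothetical helper names. Note that this format uses `max_tokens` rather than `max_new_tokens`, has no `do_sample` or `num_return_sequences` fields, and under that assumption is sent without the `"input"` wrapper.

```python
import json


def build_chat_payload(prompt_text, image_base64, model, max_tokens=512, temperature=0.8):
    """Build an OpenAI-style chat-completions body with one text part and one PNG image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt_text},
                    {
                        "type": "image_url",
                        # Base64 PNG passed inline as a data URL
                        "image_url": {"url": f"data:image/png;base64,{image_base64}"},
                    },
                ],
            }
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def openai_route_url(endpoint_id):
    """ASSUMPTION: the endpoint exposes an OpenAI-compatible route under /openai/v1."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1/chat/completions"


if __name__ == "__main__":
    payload = build_chat_payload(
        prompt_text="<anchor text or prompt here>",
        image_base64="<base64 PNG here>",
        model="allenai/olmOCR-7B-0225-preview",
    )
    print(openai_route_url("<endpoint id>"))
    print(json.dumps(payload, indent=2)[:300])
```

The body would be POSTed to that URL with the same `Authorization: Bearer` header. The olmOCR examples also pass the anchor text through `build_finetuning_prompt` before sending it, so the text part may need that prompt rather than the raw anchor text.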
import requests
import dotenv
import os
import base64
from io import BytesIO
from PIL import Image
from olmocr.data.renderpdf import render_pdf_to_base64png
from olmocr.prompts.anchor import get_anchor_text
import json # Import json module
import time
dotenv.load_dotenv()
RUNPOD_API_KEY = os.getenv('RUNPOD_API_KEY')
RUNPOD_ENDPOINT_ID = os.getenv('RUNPOD_ENDPOINT_ID')
if not RUNPOD_API_KEY or not RUNPOD_ENDPOINT_ID:
    print("Error: RUNPOD_API_KEY or RUNPOD_ENDPOINT_ID not found in .env file.")
    exit()
headers = {
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {RUNPOD_API_KEY}'
}
pdf_file_path = 'paper.pdf' # Replace with your PDF file or keep 'paper.pdf' in the same directory
if not os.path.exists(pdf_file_path):
    import urllib.request
    urllib.request.urlretrieve("https://molmo.allenai.org/paper.pdf", pdf_file_path)
image_base64 = render_pdf_to_base64png(pdf_file_path, 1, target_longest_image_dim=1024)
anchor_text = get_anchor_text(pdf_file_path, 1, pdf_engine="pdfreport", target_length=4000)
print("Anchor Text:", anchor_text)
data = {
    "input": {  # Keep the "input" wrapper as per RunPod documentation
        "messages": [  # Use "messages" as expected by SGLang Chat Completion API
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": anchor_text},  # Anchor text as text content
                    {
                        "type": "image_url",  # Use "image_url" type for image data
                        "image_url": {"url": f"data:image/png;base64,{image_base64}"}  # Data URL format for base64 image
                    },
                ],
            }
        ],
        "model": "allenai/olmocr",  # Model name - important to include this for SGLang
        "temperature": 0.8,
        "max_new_tokens": 50,
        "num_return_sequences": 1,
        "do_sample": True,
    }
}
endpoint_url = f'https://api.runpod.ai/v2/{RUNPOD_ENDPOINT_ID}/runsync' # Use /runsync for synchronous endpoint
response = requests.post(endpoint_url, headers=headers, json=data)
print("Status Code:", response.status_code)
print("Response Body:", response.json())
if response.status_code != 200:
    print("\nRequest submission failed. Check the 'error' in the response body and RunPod logs for more details.")
else:
    print("\nRequest submitted successfully! Output:")
    print(json.dumps(response.json(), indent=2))  # Directly print the output in readable JSON format
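When debugging, it can help to avoid calling `response.json()` unconditionally, since an error response may not contain JSON, and to look at the job `status` separately from the HTTP status code. A small sketch follows; `summarize_response` is a hypothetical helper, and the `status`, `output`, and `error` field names are assumptions based on RunPod's usual job response shape.

```python
import json


def summarize_response(status_code, body_text):
    """Summarize an HTTP response from the endpoint without assuming the body is JSON."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        # Error pages or gateway errors may be plain text or HTML
        return f"HTTP {status_code}, non-JSON body: {body_text[:200]}"

    if not isinstance(body, dict):
        return f"HTTP {status_code}, unexpected JSON: {body!r}"

    job_status = body.get("status")  # assumed field, e.g. "COMPLETED" or "FAILED"
    if status_code != 200 or job_status == "FAILED":
        return f"HTTP {status_code}, job status {job_status}: {body.get('error', body)}"
    if job_status == "COMPLETED":
        return f"HTTP {status_code}, completed: {json.dumps(body.get('output'))[:200]}"
    return f"HTTP {status_code}, job status {job_status} (may still be running)"
```

In the script above this would be called as `print(summarize_response(response.status_code, response.text))`.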