503 (server) Error when I send GET request
I built a phone scraper for a personal assistant app using puppeteer. It runs locally, but when I deploy it I get 503 errors. Here are the specific errors I get on different GET requests:
GET https://phone-scraper-production.up.railway.app/search
HTTP/1.1 400 Bad Request
x-powered-by: Express
access-control-allow-origin: *
content-type: application/json; charset=utf-8
content-length: 35
etag: W/"23-Z/nEQkTnjqNIEOkyXk5wBEV9AFg"
date: Tue, 06 Jun 2023 19:14:38 GMT
server: railway
connection: close
GET https://phone-scraper-production.up.railway.app/search?query=test&number=2
HTTP/1.1 503 Service Unavailable
content-type: text/html
x-railway-fallback: true
content-length: 2942
date: Tue, 06 Jun 2023 19:13:35 GMT
server: railway
connection: close
My project ID is f0c9e61c-b4bb-4387-be5b-939b5a0f9b9e
web scraping is not allowed as stated by their tos
https://railway.app/legal/terms
Ah gotcha. Does that mean that something like AutoGPT or AgentGPT that interacts with web pages wouldn't work either? @Brody
are those just free chatgpt libraries?
I don't think so
Let me send you some links:
https://js.langchain.com/docs/use_cases/autonomous_agents/auto_gpt
https://js.langchain.com/docs/use_cases/autonomous_agents/baby_agi
That's basically the use-case. However, I'm trying to minimize the amount of AI in the app to reduce cost. Hence puppeteer. I think there's a lot of bad actors in the space but if Railway won't support AI projects like this then I think that's a major turn off for devs trying to build AI products...
i dont see any problem with what either of those links do
I thought you just wanted context on AutoGPT
looks good to me
So I sent those
yes thank you
Okay. Well, to make them able to interact with webpages they need a library like puppeteer
is that against TOS?
Just trying to understand if we need to move away from Railway for this
the puppeteer part is
Dang
Do you know what would work?
Ideally we can stay - but being able to interact with web pages is pretty important
using the official apis then feeding the data from the api into the ai things
unless you have permission to scrape the sites??
I mean, they're public sites. I'm not super well-versed on the legalities truthfully. But if we want an AI to, let's say, place a pizza order.... then we've gotta get local pizzeria numbers. Not exactly an API for that, is there?
I'm open to suggestions. This definitely puts a damper on things. Don't really want to set up an EC2
i mean does your scraper even respect robots.txt?
this is a shared hosting platform after all, with shared IPs, if anyone blocks your bot, they also block railway, and hopefully you can see why this type of situation is undesirable
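On the robots.txt question, here is a rough sketch of what such a check could look like before a scraper requests a page. It is an illustration only, not the scraper from this thread: `parseRobots`, `isAllowed`, and the `phone-scraper` user agent are made-up names, and the matching is plain prefix matching (longest rule wins, Allow wins ties) with no support for `*` or `$` inside paths. A maintained robots.txt parser would likely be the better choice for real use.

```typescript
// Simplified robots.txt check (assumption: prefix matching only,
// no wildcard or end-of-path ($) support inside rule paths).
type Rule = { allow: boolean; path: string };

function parseRobots(robotsTxt: string, userAgent: string): Rule[] {
  const ua = userAgent.toLowerCase();
  const matchesUs = (a: string) => a !== "*" && a.length > 0 && ua.includes(a);
  const specific: Rule[] = []; // rules from groups that name our agent
  const wildcard: Rule[] = []; // rules from "User-agent: *" groups
  let sawSpecific = false;     // true if any group names our agent
  let agents: string[] = [];   // agents of the group currently being read
  let inRules = false;         // true once the current group has a rule line

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      if (inRules) { agents = []; inRules = false; } // a new group starts
      const agent = value.toLowerCase();
      agents.push(agent);
      if (matchesUs(agent)) sawSpecific = true;
    } else if (field === "allow" || field === "disallow") {
      inRules = true;
      if (value === "") continue; // an empty rule restricts nothing
      const rule: Rule = { allow: field === "allow", path: value };
      if (agents.some(matchesUs)) specific.push(rule);
      else if (agents.includes("*")) wildcard.push(rule);
    }
  }
  // A group naming our agent replaces the wildcard group entirely.
  return sawSpecific ? specific : wildcard;
}

function isAllowed(robotsTxt: string, userAgent: string, path: string): boolean {
  let best: Rule | undefined = undefined;
  for (const rule of parseRobots(robotsTxt, userAgent)) {
    if (!path.startsWith(rule.path)) continue;
    // Longest matching rule wins; on equal length, Allow wins.
    if (!best || rule.path.length > best.path.length ||
        (rule.path.length === best.path.length && rule.allow)) {
      best = rule;
    }
  }
  return best ? best.allow : true; // no matching rule means allowed
}

// Hypothetical robots.txt content, for illustration only.
const sample = [
  "User-agent: *",
  "Disallow: /private/",
  "Allow: /private/menu",
  "",
  "User-agent: phone-scraper",
  "Disallow: /",
].join("\n");

console.log(isAllowed(sample, "other-bot", "/private/menu")); // true
console.log(isAllowed(sample, "other-bot", "/private/x"));    // false
console.log(isAllowed(sample, "phone-scraper", "/contact"));  // false
```

In a puppeteer flow, the idea would be to fetch the site's `/robots.txt` once, then call something like `isAllowed` before each page visit and skip any path that comes back `false`.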
Totally understand. We'd hate to be those guys
I'll figure something out. Thanks for your time man!
no problem