Railway2y ago
luna

Does the TOS allow scraping websites?

“crawls,” “scrapes,” or “spiders” any page, data, or portion of or relating to the Services or Content (through use of manual or automated means);
Does this part of the TOS apply to apps that scrape sites and/or people scraping railway's site?
31 Replies
Percy
Percy2y ago
Project ID: N/A
luna
lunaOP2y ago
Curious about scraping websites that don't have an API. Don't want to start working on something if it's not allowed by railway.
Brody
Brody2y ago
it's more of a grey area, and railway would just prefer you didn't, so I would personally avoid it
luna
lunaOP2y ago
Okay. What if a site allows scraping? Guessing that’s where the grey area is?
Brody
Brody2y ago
if it explicitly allows scraping then yes that would be fine
eirk
eirk2y ago
what kind of website just says "pls scrape"
luna
lunaOP2y ago
I know a fair few people who have personal sites, all of which are fine with scraping.
eirk
eirk2y ago
whats there to scrape 🤔
luna
lunaOP2y ago
blog articles, etc. all sorts of different things.
eirk
eirk2y ago
mm ig
Brody
Brody2y ago
there's a difference between "fine to scrape" and "explicitly allows scraping"
eirk
eirk2y ago
i think they meant that its like the website says "u can scrape if u want"
luna
lunaOP2y ago
guessing railway also needs to keep within the laws of the countries they're hosting and operating in. scraping isn't clearly addressed in law in a lot of places, hence why i asked
eirk
eirk2y ago
if its not in the law isnt it legal by default
luna
lunaOP2y ago
you'd be surprised how different countries treat that.
eirk
eirk2y ago
fine im in the usa
Brody
Brody2y ago
I mean, even if the site says you can scrape them, I'd just avoid it anyway
eirk
eirk2y ago
what if it says "pls scrape or else"
luna
lunaOP2y ago
For example BOM, the aussie weather site, allowed scraping for years before their API came out. that was the official way, and they're even run by the government lol
Brody
Brody2y ago
web scraping just sounds like a bad idea regardless, unless you have some guarantee that the sites structure won't ever change
luna
lunaOP2y ago
true, for something like BOM for example it never changed, so it was as simple as set it and forget it.
Brody
Brody2y ago
well if you want to scrape a site, and the site says somewhere that it fully allows scraping, and you are sure the structure won't change, go for it
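One common place a site states its crawling preferences is `robots.txt`. The following is a minimal sketch, assuming the site publishes one, of checking a URL against it with Python's standard library. The rules in `EXAMPLE_ROBOTS`, the bot name, and the `example.com` URLs are made up for illustration, and a permissive `robots.txt` is only a crawling convention, not a statement that a site's terms allow scraping.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, not taken from any real site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    print(is_allowed(EXAMPLE_ROBOTS, "example-bot", "https://example.com/blog/post"))
    print(is_allowed(EXAMPLE_ROBOTS, "example-bot", "https://example.com/private/x"))
```

In a real bot the text would come from fetching the site's own `/robots.txt` first, and the check would run before every request.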
luna
lunaOP2y ago
Also this was fun for learning to build a search engine in school http://www.scrapethissite.com/
Scrape This Site | A public sandbox for learning web scraping
Brody
Brody2y ago
"Be a good web scraping citizen"
the site's owner is a fellow Brody, sounds like a good guy
luna
lunaOP2y ago
any way we can get an official response on this from a staff member? there's a few features I'd like to add to a bot but they require me to scrape a page.
would connecting to another service that DOES allow this to grab the data still be considered under this? 🤔 like railway -> middle server where it's allowed -> site
Brody
Brody2y ago
@Angelo - care to throw a quick answer at this?
angelo
angelo2y ago
you are good here, the DMCA/Fair Use clause is there to prevent some person just hosting a whole ass movie on the platform
luna
lunaOP2y ago
figured that would be the case but wanted to check
angelo
angelo2y ago
which people have attempted to do before and got 🔨
luna
lunaOP2y ago
sweet, thanks for the clarification. im also likely going to include abuse headers so if a site im scraping wants me to go away they have a way to contact me. same goes for using a clear user agent. marking my bot as a bot should make everyone happy
angelo
angelo2y ago
appreciate it 🙂
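The abuse-contact and clear user-agent idea mentioned above could look roughly like this. It is a sketch only: the bot name, info URL, and contact address are placeholders, and it assumes the standard HTTP `From` header is used to carry the contact address. The thread does not say which headers luna actually planned to send.

```python
from urllib.request import Request


def build_request(url: str, bot_name: str, info_url: str, contact_email: str) -> Request:
    """Build a request that clearly identifies itself as a bot and says how to reach its operator."""
    headers = {
        # A descriptive user agent with a link explaining what the bot does.
        "User-Agent": f"{bot_name}/1.0 (+{info_url})",
        # "From" is the standard HTTP header for the operator's email address.
        "From": contact_email,
    }
    return Request(url, headers=headers)


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    req = build_request(
        "https://example.com/blog",
        "example-bot",
        "https://example.com/bot-info",
        "abuse@example.com",
    )
    # urllib stores header names capitalized as "User-agent" and "From".
    print(req.header_items())
```

The request is only constructed here, not sent. Passing it to `urllib.request.urlopen` would perform the fetch with these headers attached.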