Memory Usage
Could someone please explain what might be causing the memory to rise continuously? I just have a single script running continuously, but it's not storing any new information.
idk if it's just me, but I can't see the image you sent
it's literally invisible
wait gimme a sec lol
its just this
i deployed at 6PM
I went on my website and clicked a button that ran one script continuously
it's definitely something wrong with your script 🤔
what does your script do exactly
it webscrapes a website given a runtime
i set the script to run for 12 hours
i set it to scrape a website every 10 minutes
and on each iteration the only variable I change is i
which increments the number of times the script has looped
keep in mind webscraping is a somewhat gray area of Railway
make sure the website you're scraping from is allowing webscraping and that you respect its robots.txt
wdym by robots.txt
where can i find a websites robots.txt
each website has a robots.txt on its root
example: https://youtube.com/robots.txt
it tells robots what they can and can't do on any given website
who wrote youtube's robots.txt 💀
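A minimal sketch of checking robots.txt rules from Python with the standard library's `urllib.robotparser`. The sample rules, the `class-watcher` user agent, and the example.com URLs are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules that are already in memory
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: everything is allowed except paths under /private/.
SAMPLE_ROBOTS = [
    "User-agent: *",
    "Disallow: /private/",
]

print(is_allowed(SAMPLE_ROBOTS, "class-watcher", "https://example.com/schedule"))
print(is_allowed(SAMPLE_ROBOTS, "class-watcher", "https://example.com/private/data"))
```

For a live site you would call `parser.set_url("https://<site>/robots.txt")` followed by `parser.read()` instead of `parse`, which downloads the file before checking.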
anyway, the memory leak is very likely an issue with your code and not railway
maybe whatever you're using to scrape has a memory leak in it
the website im scraping doesn't have that webpage
its just a scheduler for a school
like where students can check if classes are open
then it's all good
oh ok
but yea im still confused on what might be causing the gradual increase in ram usage
it seems to be increasing by 1MB every 8 minutes
bump
it has now risen to almost 300MB
i made sure all processes are completed, so why is it still using 300MB?
does someone have an explanation?
i'm just trying to lower it to conserve costs
I'm sorry but there's generally not all that much the community can do for you about a memory leak in your code, but lets do some basic debugging
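One possible first debugging step, sketched with the standard library's `tracemalloc`: take a snapshot before and after a few iterations and see which source lines grew. `leaky_iteration` is a hypothetical stand-in for one pass of the scrape loop, not the real task. Note that `tracemalloc` only sees allocations made by the Python process itself, so memory held by a separate Chrome process would not show up here.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations changed the most between snapshots."""
    return after.compare_to(before, "lineno")[:limit]


def leaky_iteration(store):
    # Hypothetical stand-in for one scrape: it keeps 100 kB alive on purpose.
    store.append(bytearray(100_000))


tracemalloc.start()
before = tracemalloc.take_snapshot()

store = []
for _ in range(20):
    leaky_iteration(store)

after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
for stat in growth:
    print(stat)  # file, line number, and size difference for each entry

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
```

If the Python side stays flat while the container's memory still climbs, the growth is probably coming from a child process rather than from Python objects.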
Oh so this is a memory leak?
from your screenshot earlier I saw a ramp on the memory usage
but either way, what kind of app is this
its a script that sends an email when an element in a website updates
how is it being ran
wdym
what is the set start command?
oh the script starts running on button click
and it runs for a set amount of time (one of the inputs on my frontend)
not quite what i asked lol
sorry what do you mean by this then
can you just send the repo instead?
yea sure
https://github.com/julianxchang/UCI_Class_Watcher
wait i think its private gimme a sec
public now
the .env file 😐
the script that is being ran is inside
ye lol
remove it lol
gonna get rid of that
going forward use a .gitignore file
thats why i set it to private lol
yea initially i had it but i didn't know how to deploy to railway without the .env
never save secrets to your repo regardless of public or private
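A rough sketch of reading secrets from environment variables instead of a committed `.env` file, assuming the same values are set as variables on the hosting service. The names `EMAIL_USER` and `EMAIL_PASSWORD` are hypothetical placeholders, not the names used in the repo.

```python
import os

# Hypothetical variable names, for illustration only.
REQUIRED_VARS = ("EMAIL_USER", "EMAIL_PASSWORD")


def load_settings(env=os.environ):
    """Read required secrets from the environment and fail early if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Demo with a plain dict standing in for the real environment.
demo = load_settings({"EMAIL_USER": "watcher@example.com", "EMAIL_PASSWORD": "not-a-real-password"})
print(sorted(demo))
```

Locally, the `.env` file can stay on disk as long as `.env` is listed in `.gitignore`, so it never reaches the repo.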
GitHub: UCI_Class_Watcher/app/huey/tasks.py at main · julianxchang/UCI_Class_Watcher
but yea the script being ran is here
Ooo thanks I will take a look
im thinking the memory leak is because I'm not setting driver = None after each loop?
but I feel like that shouldn't matter because it should just overwrite driver after each iteration
gotta delete these __pycache__ folders too
what are these folders?
cache folders
they keep coming up after i debug locally
you are using chrome in this app, given that fact, your memory usage isnt bad
oh really?
have you never seen a chrome memory usage meme??
but the thing is when i deploy it started at around 140mb
yea ive seen them before lol
the thing is I do driver.quit() after each run
so shouldn't the chrome instance stop
but you do create a new driver on every loop, you dont need to do that
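A sketch of what reusing one driver could look like: create it once, run every check with it, and quit in a `finally` block so the browser is closed even if a check raises. `watch` and `check_once` are hypothetical names, and the headless flag is an assumption about how Chrome is launched. This is not the code from the repo.

```python
import time


def make_chrome_driver():
    # Imported lazily so the loop below can be tried without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")  # assumption: no display on the server
    return webdriver.Chrome(options=options)


def watch(check_once, iterations, interval_seconds,
          make_driver=make_chrome_driver, sleep=time.sleep):
    """Run check_once(driver) `iterations` times with a single shared driver."""
    driver = make_driver()  # one browser for the whole run
    results = []
    try:
        for i in range(iterations):
            results.append(check_once(driver))
            if i < iterations - 1:
                sleep(interval_seconds)  # wait between checks, not after the last
    finally:
        driver.quit()  # always runs, even when check_once raises
    return results
```

`check_once` would hold the page navigation and element lookup; because `make_driver` and `sleep` are parameters, the loop can be tested with a fake driver before pointing it at a real browser.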
Ooo could that be the issue?
I think for better readability I might just make a function to create the chrome driver
no clue, i dont do stuff with selenium as mostly what its used for is scraping and thats in railways tos
oh shoot i see
this is just a small project im testing
to practice deployment
youre just getting class info every 30 minutes right?
i don't plan on having it up for long
every 2 minutes
i don't think interval should be issue tho
since its performing the same task
it just might use more vcpu
to an extent it would be, im talking about if it breaks railways tos or not
oh yea thats all im doing
nothing more
railway isnt going to care if you are looking at a site every 2 minutes, they would if you were looking at it every 2 milliseconds though
oh yea i don't need to check that often haha
but that would use a good amount of vcpu too no?
a good amount of vcpu? 0.0 to 0.1 would be a good vcpu
yea thats around what im getting rn
but yeah just try to optimize your code, and if you want to drop your memory usage considerably, dont use a web browser lol
oh btw
all scripts have finished running and now its just consistently stuck at 300MB
it seems like the memory usage only goes up when a script is running
prob still have chrome running
ah ok ig i just have to add driver.quit() to the end once script stops
wait...but I did have that before tho...
guess selenium isnt killing the chrome process
Hmm
do you even need to use a web browser to do this?
no, it doesnt look like you do (ignore the no results, i dont have a clue what to put in to get a result but that doesnt matter)
make the request with the python requests module, then parse the html in code with bs4
Yea the problem is there’s no unique url for this website
why does there need to be?
I need to get specific element after entering specific inputs into website
The url doesn’t change
why does the url need to change?
Cuz the page that I’m scraping isn’t this
I reach the page after entering the class I want and clicking search for class button
The page contents change but the url remains the same
Hopefully that makes sense
of course its not the same page you get, my page says no results
Python requests isn’t able to click the button to load the new content tho
Or am I missing sth
why would you need to do that though, look at my screenshot, all im doing is giving the same data the web page gives itself
in the end, its just form data, skip chrome and do that part yourself in code
What are u using to see this page?
thats postman, but theres nothing stopping you from doing the same in code
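A rough sketch of doing the same thing in code: send the form fields as a POST body and pull the table cells out of the returned HTML. To keep it runnable without installs it uses only the standard library (`urllib` and `html.parser`) rather than requests and bs4; the idea is the same. The URL and the field names `Dept` and `CourseNum` are hypothetical, and the real ones would have to be copied from the form the page submits.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_search_request(url, fields):
    """Build the POST request a browser would send when the form is submitted."""
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


class CellText(HTMLParser):
    """Collect the text inside every <td> cell."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def extract_cells(html):
    parser = CellText()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


def fetch_cells(url, fields, timeout=30):
    """POST the form data and return the text of each table cell in the response."""
    with urlopen(build_search_request(url, fields), timeout=timeout) as response:
        return extract_cells(response.read().decode("utf-8", errors="replace"))


# Offline demo on a made-up HTML fragment.
print(extract_cells("<table><tr><td>34250</td><td> OPEN </td></tr></table>"))
```

With requests and bs4 installed, `requests.post(url, data=fields)` and `BeautifulSoup(response.text, "html.parser")` would cover the same two steps with less code, and no Chrome process is involved either way.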
^
Hmmm ok will look into this thx
no problem