Recommended architecture for my simple web scraping data visualizer for gym capacity

Hello. So far I have a Lambda function that runs every 15 minutes (scheduled by AWS EventBridge). It hits an API to get my gym's current capacity, then calls the Google Sheets API to store that reading as a row in a Google Sheet. I am now building a Vite + React app to visualize the data and provide a bunch of over-engineered stats about how busy my gym is. For now, I built a Node.js script that reads from the sheet and writes the data to a JSON file stored within the React app. I would eventually like to host this site and have the data update live. Does anyone have suggestions for a lightweight way to give the React app access to the latest data? A new row is added every 15 minutes, and I don't mind making the user refresh to get it. I don't really want to spin up an entire database. One idea is a Lambda behind API Gateway that returns all of the rows from the sheet, with the Vite/React app hitting that endpoint whenever it loads, but that doesn't seem like it would scale if traffic got a bit high. Maybe I could store the processed JSON file in S3 and somehow have the Vite app always read the latest S3 file?
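A minimal sketch of the S3 idea from the question, assuming the scheduled Lambda overwrites one JSON object at a fixed key and the React app fetches that same URL on every page load. The row shape, the `Snapshot` type, and the names `buildSnapshot` and `loadSnapshot` are hypothetical and only for illustration; the actual S3 upload is left out.

```typescript
// Hypothetical row shape: [ISO timestamp, head count], both as strings.
type SheetRow = [string, string];

interface CapacityPoint {
  timestamp: string;
  count: number;
}

interface Snapshot {
  generatedAt: string;
  points: CapacityPoint[];
}

// Writer side (the scheduled Lambda): turn raw rows into the JSON document
// that would be uploaded to a fixed object key. Malformed rows are skipped.
function buildSnapshot(rows: SheetRow[], now: Date = new Date()): Snapshot {
  const points: CapacityPoint[] = [];
  for (const [timestamp, rawCount] of rows) {
    const count = Number(rawCount);
    const validTime = !Number.isNaN(Date.parse(timestamp));
    const validCount = rawCount.trim() !== "" && Number.isFinite(count);
    if (validTime && validCount) {
      points.push({ timestamp, count });
    }
  }
  return { generatedAt: now.toISOString(), points };
}

// Minimal structural type so the reader can be handed the global `fetch`.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Reader side (the React app): load the snapshot from its fixed URL.
async function loadSnapshot(url: string, fetchImpl: FetchLike): Promise<Snapshot> {
  const res = await fetchImpl(url);
  if (!res.ok) {
    throw new Error(`Snapshot request failed with status ${res.status}`);
  }
  return (await res.json()) as Snapshot;
}
```

In the browser this would be called as `loadSnapshot(snapshotUrl, fetch)`. Because the key never changes, "latest" is simply whatever the last Lambda run wrote; how quickly visitors see it depends on the `Cache-Control` metadata set on the object, and the bucket needs a CORS rule if the site is served from a different origin.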
10 Replies
n3sonline (OP), 2y ago
Still looking for a solution here :/
Liam, 2y ago
One way to do it: if you deploy on Vercel and use Next.js instead of React + Vite, you could just use the Next cache. This essentially generates a static version of the site and serves it for the next 15 minutes, at which point it regenerates the site. In that case you could also use the Sheets API for the data without having to spin up a db. In Next, this is as simple as adding
export const revalidate = 900; /* 15 min in seconds, or any other refresh rate */
to the page file, at the top level of the module. If not that, then the most "scalable" way might be to put the data in a JSON blob on something like S3, though I'd be wary of caching weirdness and such.
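To make the 15-minute revalidation idea concrete outside of Next, here is a framework-free sketch of a time-based cache. It is not how Next implements `revalidate`; the helper name `cachedFor` and its injectable clock are assumptions for illustration. The same wrapper could sit in front of a Sheets read in an API Gateway Lambda so that a traffic spike does not mean one Sheets call per request.

```typescript
// Wrap an async loader so that its result is reused for `ttlSeconds`.
// `now` is injectable so the expiry logic can be exercised with a fake clock.
function cachedFor<T>(
  loader: () => Promise<T>,
  ttlSeconds: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let cached: { value: T; expiresAt: number } | undefined;
  let inFlight: Promise<T> | undefined;

  return async () => {
    // Fresh enough: serve the cached value without calling the loader.
    if (cached !== undefined && now() < cached.expiresAt) {
      return cached.value;
    }
    // Expired or empty: start one reload and let concurrent callers share it.
    if (inFlight === undefined) {
      inFlight = (async () => {
        const value = await loader();
        cached = { value, expiresAt: now() + ttlSeconds * 1000 };
        return value;
      })();
    }
    const pending = inFlight;
    try {
      return await pending;
    } finally {
      // Clear the shared promise so a failed reload can be retried.
      if (inFlight === pending) inFlight = undefined;
    }
  };
}

const REVALIDATE_SECONDS = 900; // 15 minutes, matching the scrape interval
```

One difference worth knowing: in this sketch the first request after expiry waits for the reload, whereas Next's time-based revalidation generally keeps serving the previous page while it regenerates in the background.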
Josh, 2y ago
Your data will easily fit in the free tier of something like PlanetScale for a very long time. My recommendation is to just go with something like that and, like lermatroid said, go with Next.js.
Liam, 2y ago
Agreed, but if they really don't want to spin up a db...
Josh, 2y ago
It sounds like the reason they don't want to spin up a db is complexity. PlanetScale is like 3 clicks and you're done, and with Prisma it's a breeze.
Liam, 2y ago
For sure. If that is the reason, n3s, def check out PlanetScale.
Josh, 2y ago
better to start w/ sql now vs later if / when you need to start scaling
chocolatebananarhino
Instead of storing the data in a Google Sheet, store it in something like Supabase. Then you can query the data using their realtime API, or even just their regular serverless API.
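As a rough sketch of the "regular API" route: Supabase exposes tables over a PostgREST-style REST interface, so the React app could read recent rows with a plain `fetch`. The table name `gym_capacity` and the columns `recorded_at` and `count` are hypothetical, as is the helper name; the project URL and anon key would come from your own project.

```typescript
interface SupabaseRequest {
  url: string;
  headers: Record<string, string>;
}

// Build a request for the most recent readings, newest first.
// The default of 96 rows is one day of data at one row per 15 minutes.
function buildCapacityRequest(
  projectUrl: string,
  anonKey: string,
  limit: number = 96,
): SupabaseRequest {
  const query = [
    `select=${encodeURIComponent("recorded_at,count")}`, // hypothetical columns
    `order=${encodeURIComponent("recorded_at.desc")}`,
    `limit=${limit}`,
  ].join("&");
  return {
    // `gym_capacity` is a hypothetical table name.
    url: `${projectUrl}/rest/v1/gym_capacity?${query}`,
    headers: {
      apikey: anonKey,
      Authorization: `Bearer ${anonKey}`,
    },
  };
}
```

In the app this would be used as `const { url, headers } = buildCapacityRequest(projectUrl, anonKey); const rows = await (await fetch(url, { headers })).json();`. Since the anon key ships to the browser, the table would need read-only row level security policies.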
n3sonline (OP), 2y ago
Yeah, I think some kind of actual db + Next.js is optimal. I have just already used all of my free tiers on PlanetScale and Vercel. Probably the right move though, thanks.
Liam, 2y ago
I'd consider giving https://railway.app a shot if you have run out of free tiers and are looking to host a db easily and cheaply. In my experience, even apps with moderate usage only cost $2 / month on the db side there. Highly recommend.
Railway
Railway is an infrastructure platform where you can provision infrastructure, develop with that infrastructure locally, and then deploy to the cloud.