data pipeline for custom user analytics dashboards

I have a project with user-generated content and would like to provide a personalized analytics dashboard to each user. I am using Mixpanel (but am open to switching providers) to log analytics events like page views and engagements. I am also using T3/Vercel/PlanetScale.

I am curious whether anyone has experience or guidance building a feature like this. I can query Mixpanel's APIs for the data I need to build these dashboards, but because of the rate limits it does not make sense to do this on demand in an API route.

- Should I schedule a cron job to export/parse the data and write the reports I need to my primary database?
- At what scale does it make sense to export to something like a data warehouse (https://aws.amazon.com/redshift/)?

Thanks for any insights.
3 Replies
Unknown User, 2y ago
[Message not public]
zthomas (OP), 2y ago
Thanks for the reply! As far as event volume goes, a daily event summary for an individual user might have at most a few thousand events. I do not need realtime updates, and I think updating as often as once a day would be fine.

Mixpanel API queries are limited to 60 queries per hour, and I have many more users than I could accommodate with some kind of scheduled query queue, so I would more or less be exporting raw events and parsing them myself. I suspect I would just dump a daily raw export, parse the events, create a summary for each user, and write these summaries to the database. I'd like to avoid dumping every analytics event into my primary database and would be fine with storing only summaries that can be queried quickly and updated periodically (hourly or daily).

Explo looks like a nice product, but I don't think I can justify their price point, and I have no issues building the front end myself.

The downside of this approach, as you mentioned, is that I'm redoing the work of generating reports that Mixpanel is already capable of. While we use Mixpanel for our own internal reports and it's not painfully expensive, it's not ideal that I can't leverage their query engine to solve this particular problem. If I used a data warehouse to allow for more granular reporting, it would feel like paying for the same service twice. This project doesn't need to scale to millions of users, so I'm trying not to over-engineer.
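For reference, the "parse events and create a summary for each user" step could be sketched roughly like this in TypeScript (matching the T3 stack). This is only a minimal sketch under assumptions: it assumes the raw export arrives as newline-delimited JSON with one event per line, that `time` is a Unix timestamp in seconds, and that each event carries a hypothetical `ownerId` property identifying the user whose dashboard it belongs to. The `RawEvent` and `DailySummary` shapes and both function names are illustrative, not part of any provider's API.

```typescript
// Assumed shape of one exported event line. `ownerId` is a hypothetical
// custom property naming the user whose dashboard the event belongs to.
interface RawEvent {
  event: string;
  properties: { time: number; ownerId?: string; [key: string]: unknown };
}

// Illustrative summary row: one per user per day.
interface DailySummary {
  ownerId: string;
  day: string; // YYYY-MM-DD in UTC
  counts: Record<string, number>; // event name -> number of occurrences
}

// Parse newline-delimited JSON, skipping blank or malformed lines.
function parseExport(ndjson: string): RawEvent[] {
  const events: RawEvent[] = [];
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (typeof parsed === "object" && parsed !== null) {
        events.push(parsed as RawEvent);
      }
    } catch {
      // Malformed line: skip it rather than fail the whole batch.
    }
  }
  return events;
}

// Group events by (ownerId, day) and count occurrences of each event name.
function summarize(events: RawEvent[]): DailySummary[] {
  const byKey = new Map<string, DailySummary>();
  for (const e of events) {
    const ownerId = e.properties?.ownerId;
    const time = e.properties?.time;
    if (typeof ownerId !== "string" || typeof time !== "number") continue;
    if (typeof e.event !== "string") continue;
    // Assumes `time` is in seconds; convert to milliseconds for Date.
    const day = new Date(time * 1000).toISOString().slice(0, 10);
    const key = `${ownerId}|${day}`;
    let summary = byKey.get(key);
    if (!summary) {
      summary = { ownerId, day, counts: {} };
      byKey.set(key, summary);
    }
    summary.counts[e.event] = (summary.counts[e.event] ?? 0) + 1;
  }
  return Array.from(byKey.values());
}
```

A scheduled job could call `summarize(parseExport(body))` once per export and upsert each row keyed on `(ownerId, day)`, so that rerunning the same day overwrites rather than duplicates the summary.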
Unknown User, 2y ago
[Message not public]