Large database results in long data loading on frontend
What is a good way to sort and/or filter data from a database table that is pretty large? Right now it takes a lot of time. When a user goes to the page that shows this data paginated, it queries the database, but the method I use now takes pretty long. Is there an efficient sort/filter algorithm I should use with Postgres to reduce this time?
21 Replies
Hi @Sven. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
----------------------
To address the issue of long data loading times for large database tables in Wasp, there are a few approaches you can consider:
1. Pagination: Implement server-side pagination to limit the amount of data fetched and transferred at once. This can significantly reduce query time and improve frontend performance (see the sketch after this list).
2. Indexing: Ensure that your PostgreSQL database has proper indexes on the columns you're filtering and sorting by. This can dramatically speed up query execution times.
3. Optimized Queries: Write efficient SQL queries, possibly using Prisma's query optimization features. Wasp uses Prisma as its ORM, which provides various optimization techniques; see Prisma Configuration (https://wasp-lang.dev/docs/data-model/backends#prisma-configuration).
4. Caching: Implement caching strategies to store frequently accessed data and reduce database load.
5. Asynchronous Loading: Use asynchronous loading techniques on the frontend to improve perceived performance.
6. Database Extensions: Consider using PostgreSQL extensions for specific optimization needs. Wasp allows you to enable these through configuration; see PostgreSQL Extensions (https://wasp-lang.dev/docs/data-model/backends#postgresql-extensions).
7. Query Invalidation: Be mindful of Wasp's automatic query invalidation feature, which can cause unnecessary updates. While convenient, it may lead to performance issues in large apps; see Automatic Query Invalidation (https://wasp-lang.dev/docs/tutorial/actions#invoking-the-action-on-the-client).
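A minimal sketch of point 1 (server-side pagination) as a Wasp query. The query name, the `Events` entity, and the `page`/`pageSize` arguments are illustrative, not from this thread:
```ts
// src/queries.ts -- a hypothetical Wasp query using server-side pagination.
// `context.entities.Events` is the Prisma delegate Wasp injects for an
// entity listed on the query declaration in main.wasp.
type PageArgs = { page: number; pageSize: number }

export const getPagedEvents = async ({ page, pageSize }: PageArgs, context: any) => {
  const skip = (page - 1) * pageSize
  // Only one page of rows travels over the wire; Postgres does the sorting,
  // and an index on `name` lets it avoid a full-table sort.
  const [events, total] = await Promise.all([
    context.entities.Events.findMany({
      skip,
      take: pageSize,
      orderBy: { name: 'asc' },
    }),
    context.entities.Events.count(),
  ])
  return { events, total }
}
```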
For more specific optimizations, you might need to analyze your particular use case and database schema. If you're still experiencing issues after implementing these strategies, you may want to consider more advanced techniques or consult with database optimization experts.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Sorry @Sven, I seem to be having a bit of trouble. Don't worry though, I've pinged my support and they will try to fix me up! :technologist:
Perhaps try again in 15mins?
Are you filtering the data on the server or on the client?
Server
This is more a general DB question than a Wasp related question. If you are filtering on the server, that's a good first step. Now, if you are fetching all the data and then filtering in JS - that's bad. If you are writing filtering queries with Prisma - that's good. And if you are doing that - then database indexes will be helpful for you. I'd research that direction 🙂
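For example, a sketch of the "filter with Prisma, then index" direction (the `category` column is hypothetical):
```ts
// Inside a Wasp query: filter in the database with Prisma, not in JS after
// fetching everything. `category` is a hypothetical column.
// To keep this fast on a large table, add an index in schema.prisma:
//   model Events { ... @@index([category]) }
export const getFilteredEvents = async (args: { category: string }, context: any) => {
  return context.entities.Events.findMany({
    where: { category: args.category }, // becomes SQL WHERE; runs in Postgres
    orderBy: { name: 'asc' },
  })
}
```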
I'm working on a feature that involves querying daily event data and aggregating metrics over a specified timeframe (e.g., last 30 days or 90 days) for each event. Based on these aggregates, I need to filter and sort the events according to certain thresholds, such as total revenue or average price within the selected timeframe. Here’s a more detailed breakdown of my issue:
Data Structure:
- Events table: Stores information about each event, like its name and category.
- EventData table: Contains daily metrics for each event, including:
  - totalRevenue: Total revenue generated by the event on a given day.
  - totalTicketsSold: Total tickets sold on that day.
  - avgTicketPrice: Average ticket price for that day.
  - eventDate: The date of the event metrics.

The Events table has a relation to EventData, where EventData holds the daily details.

What I need to do:
- Query for all events within a given timeframe (e.g., last 30 or 90 days).
- For each event, aggregate the totalRevenue, totalTicketsSold, and avgTicketPrice over the timeframe.
- Apply filters on these aggregated values. For example:
  - Only retrieve events where the sum of totalRevenue in the timeframe is between a specified minimum and maximum.
  - Similarly, apply filters for totalTicketsSold and avgTicketPrice (where avgTicketPrice is the average price over that timeframe) based on specified thresholds.
- Finally, sort the results based on one of these aggregate fields (e.g., sort by totalRevenue in descending order).
Challenges I'm facing:
- Filtering on Aggregated Data: I need to filter events based on the sum or average of fields in EventData over the specified timeframe. For example, filtering events with a total revenue between 1000 and 5000 within the last 30 days.
- Sorting: I want to sort the filtered results based on these aggregated values (e.g., total revenue or average ticket price).
-----------------------
I know this is not really Wasp related, but I know you guys can maybe give some input or recommendations.
Using specialised databases would help here, i.e. what are called time series DBs. Stuff like https://www.timescale.com/ or even going with https://clickhouse.com/ would be nice.
I'm not an expert, but if you are dealing with aggregates, either you calculate them and save them in the DB under some fields, or you calculate them on the fly (but your DB needs to be fit for that so it's fast enough).
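A sketch of the "calculate and save" option as a Prisma model (all names here are hypothetical, not from the thread):
```prisma
// schema.prisma -- one row of rolled-up metrics per event and timeframe,
// refreshed by a scheduled job. All names are hypothetical.
model EventAggregate {
  id             Int      @id @default(autoincrement())
  eventId        Int
  timeframeDays  Int      // e.g. 30 or 90
  totalRevenue   Float
  avgTicketPrice Float
  updatedAt      DateTime @updatedAt

  @@unique([eventId, timeframeDays])
  @@index([totalRevenue]) // makes filtering/sorting on the aggregate cheap
}
```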
PostgreSQL ++ for time series and events | Timescale
Engineered to handle demanding workloads, like time series, vector, events, and analytics data. Built on PostgreSQL, with expert support at no extra charge.
ClickHouse
Fast Open-Source OLAP DBMS - ClickHouse
ClickHouse is a fast open-source column-oriented database management system that allows generating analytical data reports in real-time using SQL queries
Thanks, someone else recommended this to me. Is this suitable for Wasp? And if so, how would I implement it?
The best approach would be to use TypedSQL as Prisma can't handle all the requirements you've listed. You can create a new SQL file in the prisma/sql directory, for example, getEventMetrics.sql:
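The SQL file itself didn't survive in this thread; a sketch of what it might contain, using the table and column names from the question (the FK name "eventId" and exact table names are assumptions that depend on your Prisma mappings):
```sql
-- prisma/sql/getEventMetrics.sql (sketch)
-- $1 = timeframe start date, $2/$3 = min/max total revenue
SELECT
  e."id",
  e."name",
  SUM(d."totalRevenue")     AS "totalRevenue",
  SUM(d."totalTicketsSold") AS "totalTicketsSold",
  AVG(d."avgTicketPrice")   AS "avgTicketPrice"
FROM "Events" e
JOIN "EventData" d ON d."eventId" = e."id"
WHERE d."eventDate" >= $1
GROUP BY e."id", e."name"
HAVING SUM(d."totalRevenue") BETWEEN $2 AND $3
ORDER BY "totalRevenue" DESC;
```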
Generate Prisma Client with the SQL flag:
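Per Prisma's TypedSQL docs, that means enabling the typedSql preview feature and running generate with the --sql flag:
```bash
# in schema.prisma: generator client { previewFeatures = ["typedSql"] }
npx prisma generate --sql
```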
Then you can use the generated TypedSQL function in your TypeScript code:
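A sketch of the usage, assuming the getEventMetrics.sql file above (argument values are illustrative):
```ts
import { PrismaClient } from '@prisma/client'
// TypedSQL generates one typed function per file in prisma/sql:
import { getEventMetrics } from '@prisma/client/sql'

const prisma = new PrismaClient()

async function main() {
  // Arguments map to $1, $2, $3 in the SQL file:
  // timeframe start, then min/max total revenue.
  const rows = await prisma.$queryRawTyped(
    getEventMetrics(new Date('2024-09-01'), 1000, 5000)
  )
  console.log(rows)
}

main()
```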
Wasp doesn't support TypedSQL - we'll probably work on it in the future, but it's not a priority atm.
The good thing is, TypedSQL is just fancy raw SQL, which you can use with Wasp without trouble: https://www.prisma.io/docs/orm/prisma-client/using-raw-sql/raw-queries
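A sketch of the same aggregation as a plain raw query inside a Wasp query. It assumes `prisma` is importable from 'wasp/server' (the case in recent Wasp versions; check yours), and the FK name "eventId" is again an assumption:
```ts
// src/queries.ts -- aggregate filter/sort as a raw query in a Wasp query.
import { prisma } from 'wasp/server'
import { Prisma } from '@prisma/client'

type Args = { since: Date; minRevenue: number; maxRevenue: number }

export const getEventMetrics = async (
  { since, minRevenue, maxRevenue }: Args,
  _context: any
) => {
  // The tagged template keeps the query parameterized (no SQL injection).
  return prisma.$queryRaw(Prisma.sql`
    SELECT e."id", e."name",
           SUM(d."totalRevenue")     AS "totalRevenue",
           SUM(d."totalTicketsSold") AS "totalTicketsSold",
           AVG(d."avgTicketPrice")   AS "avgTicketPrice"
    FROM "Events" e
    JOIN "EventData" d ON d."eventId" = e."id"
    WHERE d."eventDate" >= ${since}
    GROUP BY e."id", e."name"
    HAVING SUM(d."totalRevenue") BETWEEN ${minRevenue} AND ${maxRevenue}
    ORDER BY "totalRevenue" DESC
  `)
}
```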
Thanks, someone else recommended this to me. Is this suitable for Wasp? And if so, how would I implement it?

Wasp is just a Node.js server, so you'd do it in any way you see fit to connect a Node.js app with one of those databases. I'd probably set things up in the server.setupFn function and import the DB client (e.g. for ClickHouse) from there in the rest of the server.
Raw queries | Prisma Documentation
Learn how you can send raw SQL and MongoDB queries to your database using the raw() methods from the Prisma Client API.
What do you mean with server.setupFn? I don't understand from there.
Server Config | Wasp
You can configure the behavior of the server via the server field of app declaration:
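For reference, the rough shape of that in main.wasp (a sketch; the app and function names are hypothetical):
```wasp
app myApp {
  // ...
  server: {
    // Runs once when the server starts -- a natural place to construct
    // e.g. a ClickHouse client and share it with the rest of the server.
    setupFn: import { setupServer } from "@src/serverSetup"
  }
}
```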
So I need to do this setup in the server in order to use raw SQL?
I think calculating them and saving them under some fields is the best option for me. Would a cron job be best?
Also, let's say I add a new model in my schema.prisma and migrate it in the dev version. What happens when I deploy then? What is best practice?
@kapa.ai
This job does not run at 22:40 in my local time.
Let me check the knowledge base for you again. One moment! :books:
----------------------
Based on the information provided in the knowledge sources, I can explain why your job might not be running at 22:40 local time as expected.
The cron expression in your job declaration is set to run at 22:40 (10:40 PM) every day. However, it's important to note that the time zone for cron jobs in Wasp (which uses pg-boss) is typically based on the server's time zone, not your local time zone.
According to the pg-boss documentation (https://wasp-lang.dev/docs/advanced/jobs#pg-boss-considerations), Wasp uses pg-boss for job scheduling. The cron expression is interpreted based on the server's time zone where your application is running.
To address this issue, you have a few options:
1. Adjust the cron expression to match your desired local time, taking into account the time difference between your local time and the server time (see the example after this list).
2. If you're running this locally for development, ensure your server's time zone is set correctly.
3. For production environments, you might need to configure the server's time zone to match your desired execution time.
4. Alternatively, you could adjust your job logic to account for time zone differences if needed.
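For example (a sketch of option 1): if the server runs in UTC and your local time is UTC+2, then 22:40 local is 20:40 server time:
```wasp
schedule: {
  // 22:40 local == 20:40 UTC when local time is UTC+2
  cron: "40 20 * * *"
}
```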
If you need more precise control over the execution time or need to handle multiple time zones, you might need to implement additional logic within your dataJobfn to manage these time-related complexities.
Remember, when deploying to production, the behavior might differ based on the hosting environment's time zone settings. Always test your scheduled jobs in an environment that closely matches your production setup to ensure they run at the expected times.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Sorry @Sven, I seem to be having a bit of trouble. Don't worry though, I've pinged my support and they will try to fix me up! :technologist:
Perhaps try again in 15mins?
Fixed, no data lost, just deployed to Fly.
Also, my job didn't run because I had schedule below entities.
It has to be the other way around.
Well, to be honest, sometimes I notice that it just does not run
even with
cron: "* * * * *",
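For reference, a sketch of the job declaration shape being discussed, with schedule placed above entities as described; the function name echoes the dataJobfn mentioned earlier, and the entities list is hypothetical:
```wasp
job dataJob {
  executor: PgBoss,
  perform: {
    fn: import { dataJobFn } from "@src/jobs"
  },
  schedule: {
    cron: "40 22 * * *"
  },
  entities: [Events, EventData]
}
```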
Did you accomplish it in the end?
Yes
How did you NOT lose data?
in prod
Wait
I just added a column to a table
Then did wasp db migrate-dev
And then wasp deploy fly deploy
Fly migrates these changes to prod
And I didn't lose data
And if I did, Fly makes a snapshot of the database to go back to
Fly has a snapshot option?!?
omg
It does it daily
In which section can you see that?