Large database results in long data loading on frontend
What is a good way to sort and/or filter data from a database table that is pretty large? Right now it takes a lot of time. When a user goes to the page that shows this data paginated, it queries the database, but the method I use now takes pretty long. Is there an efficient sort/filter algorithm I should use with Postgres to reduce this time?
21 Replies
Hi @Sven. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
----------------------
To address the issue of long data loading times for large database tables in Wasp, there are a few approaches you can consider:
1. Pagination: Implement server-side pagination to limit the amount of data fetched and transferred at once. This can significantly reduce query time and improve frontend performance (see the sketch after this list).
2. Indexing: Ensure that your PostgreSQL database has proper indexes on the columns you're filtering and sorting by. This can dramatically speed up query execution times.
3. Optimized Queries: Write efficient SQL queries, possibly using Prisma's query optimization features. Wasp uses Prisma as its ORM, which provides various optimization techniques; see Prisma Configuration (https://wasp-lang.dev/docs/data-model/backends#prisma-configuration).
4. Caching: Implement caching strategies to store frequently accessed data and reduce database load.
5. Asynchronous Loading: Use asynchronous loading techniques on the frontend to improve perceived performance.
6. Database Extensions: Consider using PostgreSQL extensions for specific optimization needs. Wasp allows you to enable these through configuration; see PostgreSQL Extensions (https://wasp-lang.dev/docs/data-model/backends#postgresql-extensions).
7. Query Invalidation: Be mindful of Wasp's automatic query invalidation feature, which can cause unnecessary updates. While convenient, it may lead to performance issues in large apps; see Automatic Query Invalidation (https://wasp-lang.dev/docs/tutorial/actions#invoking-the-action-on-the-client).
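A minimal sketch of point 1 (server-side pagination) as a Wasp query. The query name, the `Events` entity, and the `page`/`pageSize` arguments are illustrative, not from this thread:
```ts
// src/queries.ts -- a hypothetical Wasp query using server-side pagination.
// `context.entities.Events` is the Prisma delegate Wasp injects for an
// entity listed on the query declaration in main.wasp.
type PageArgs = { page: number; pageSize: number }

export const getPagedEvents = async ({ page, pageSize }: PageArgs, context: any) => {
  const skip = (page - 1) * pageSize
  // Only one page of rows travels over the wire; Postgres does the sorting,
  // and an index on `name` lets it avoid a full-table sort.
  const [events, total] = await Promise.all([
    context.entities.Events.findMany({
      skip,
      take: pageSize,
      orderBy: { name: 'asc' },
    }),
    context.entities.Events.count(),
  ])
  return { events, total }
}
```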
For more specific optimizations, you might need to analyze your particular use case and database schema. If you're still experiencing issues after implementing these strategies, you may want to consider more advanced techniques or consult with database optimization experts.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Sorry @Sven, I seem to be having a bit of trouble. Don't worry though, I've pinged my support and they will try to fix me up! :technologist:
Perhaps try again in 15mins?
Are you filtering the data on the server or on the client?
Server
This is more a general DB question than a Wasp related question. If you are filtering on the server, that's a good first step. Now, if you are fetching all the data and then filtering in JS - that's bad. If you are writing filtering queries with Prisma - that's good. And if you are doing that - then database indexes will be helpful for you. I'd research that direction 🙂
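For example, a sketch of the "filter with Prisma, then index" direction (the `category` column is hypothetical):
```ts
// Inside a Wasp query: filter in the database with Prisma, not in JS after
// fetching everything. `category` is a hypothetical column.
// To keep this fast on a large table, add an index in schema.prisma:
//   model Events { ... @@index([category]) }
export const getFilteredEvents = async (args: { category: string }, context: any) => {
  return context.entities.Events.findMany({
    where: { category: args.category }, // becomes SQL WHERE; runs in Postgres
    orderBy: { name: 'asc' },
  })
}
```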
I'm working on a feature that involves querying daily event data and aggregating metrics over a specified timeframe (e.g., last 30 days or 90 days) for each event. Based on these aggregates, I need to filter and sort the events according to certain thresholds, such as total revenue or average price within the selected timeframe. Here’s a more detailed breakdown of my issue:
Data Structure:
- Events table: Stores information about each event, like its name and category.
- EventData table: Contains daily metrics for each event, including:
  - totalRevenue: Total revenue generated by the event on a given day.
  - totalTicketsSold: Total tickets sold on that day.
  - avgTicketPrice: Average ticket price for that day.
  - eventDate: The date of the event metrics.

The Events table has a relation to EventData, where EventData holds the daily details.

What I need to do:
- Query for all events within a given timeframe (e.g., last 30 or 90 days).
- For each event, aggregate the totalRevenue, totalTicketsSold, and avgTicketPrice over the timeframe.
- Apply filters on these aggregated values. For example:
  - Only retrieve events where the sum of totalRevenue in the timeframe is between a specified minimum and maximum.
  - Similarly, apply filters for totalTicketsSold and avgTicketPrice (where avgTicketPrice is the average price over that timeframe) based on specified thresholds.
- Finally, sort the results based on one of these aggregate fields (e.g., sort by totalRevenue in descending order).
Challenges I'm facing:
- Filtering on Aggregated Data: I need to filter events based on the sum or average of fields in EventData over the specified timeframe. For example, filtering events with a total revenue between 1000 and 5000 within the last 30 days.
- Sorting: I want to sort the filtered results based on these aggregated values (e.g., total revenue or average ticket price).
-----------------------
I know this is not really Wasp related, but I know you guys can maybe give some input or recommendations.
Using specialised databases would help here, i.e. what are called time series DBs. Stuff like https://www.timescale.com/ or even going with https://clickhouse.com/ would be nice.
I'm not an expert, but if you are dealing with aggregates, either you calculate them and save them in the DB under some fields, or you calculate them on the fly (but your DB needs to be fit for that so it's fast enough).
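A sketch of the "calculate and save" option as a Prisma model (all names here are hypothetical, not from the thread):
```prisma
// schema.prisma -- one row of rolled-up metrics per event and timeframe,
// refreshed by a scheduled job. All names are hypothetical.
model EventAggregate {
  id             Int      @id @default(autoincrement())
  eventId        Int
  timeframeDays  Int      // e.g. 30 or 90
  totalRevenue   Float
  avgTicketPrice Float
  updatedAt      DateTime @updatedAt

  @@unique([eventId, timeframeDays])
  @@index([totalRevenue]) // makes filtering/sorting on the aggregate cheap
}
```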
PostgreSQL ++ for time series and events | Timescale
Engineered to handle demanding workloads, like time series, vector, events, and analytics data. Built on PostgreSQL, with expert support at no extra charge.
ClickHouse
Fast Open-Source OLAP DBMS - ClickHouse
ClickHouse is a fast open-source column-oriented database management system that allows generating analytical data reports in real-time using SQL queries
Thanks, someone else recommended this to me. Is this suitable for Wasp? And if so, how would I implement it?
The best approach would be to use TypedSQL as Prisma can't handle all the requirements you've listed. You can create a new SQL file in the prisma/sql directory, for example, getEventMetrics.sql:
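The SQL file itself didn't survive in this thread; a sketch of what it might contain, using the table and column names from the question (the FK name "eventId" and exact table names are assumptions that depend on your Prisma mappings):
```sql
-- prisma/sql/getEventMetrics.sql (sketch)
-- $1 = timeframe start date, $2/$3 = min/max total revenue
SELECT
  e."id",
  e."name",
  SUM(d."totalRevenue")     AS "totalRevenue",
  SUM(d."totalTicketsSold") AS "totalTicketsSold",
  AVG(d."avgTicketPrice")   AS "avgTicketPrice"
FROM "Events" e
JOIN "EventData" d ON d."eventId" = e."id"
WHERE d."eventDate" >= $1
GROUP BY e."id", e."name"
HAVING SUM(d."totalRevenue") BETWEEN $2 AND $3
ORDER BY "totalRevenue" DESC;
```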
Generate Prisma Client with the SQL flag:
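Per Prisma's TypedSQL docs, that means enabling the typedSql preview feature and running generate with the --sql flag:
```bash
# in schema.prisma: generator client { previewFeatures = ["typedSql"] }
npx prisma generate --sql
```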
Then you can use the generated TypedSQL function in your TypeScript code:
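A sketch of the usage, assuming the getEventMetrics.sql file above (argument values are illustrative):
```ts
import { PrismaClient } from '@prisma/client'
// TypedSQL generates one typed function per file in prisma/sql:
import { getEventMetrics } from '@prisma/client/sql'

const prisma = new PrismaClient()

async function main() {
  // Arguments map to $1, $2, $3 in the SQL file:
  // timeframe start, then min/max total revenue.
  const rows = await prisma.$queryRawTyped(
    getEventMetrics(new Date('2024-09-01'), 1000, 5000)
  )
  console.log(rows)
}

main()
```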
Wasp doesn't support TypedSQL - we'll probably work on it in the future, but it's not a priority atm.
The good thing is, TypedSQL is just fancy raw SQL, which you can use with Wasp without trouble: https://www.prisma.io/docs/orm/prisma-client/using-raw-sql/raw-queries
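A sketch of the same aggregation as a plain raw query inside a Wasp query. It assumes `prisma` is importable from 'wasp/server' (the case in recent Wasp versions; check yours), and the FK name "eventId" is again an assumption:
```ts
// src/queries.ts -- aggregate filter/sort as a raw query in a Wasp query.
import { prisma } from 'wasp/server'
import { Prisma } from '@prisma/client'

type Args = { since: Date; minRevenue: number; maxRevenue: number }

export const getEventMetrics = async (
  { since, minRevenue, maxRevenue }: Args,
  _context: any
) => {
  // The tagged template keeps the query parameterized (no SQL injection).
  return prisma.$queryRaw(Prisma.sql`
    SELECT e."id", e."name",
           SUM(d."totalRevenue")     AS "totalRevenue",
           SUM(d."totalTicketsSold") AS "totalTicketsSold",
           AVG(d."avgTicketPrice")   AS "avgTicketPrice"
    FROM "Events" e
    JOIN "EventData" d ON d."eventId" = e."id"
    WHERE d."eventDate" >= ${since}
    GROUP BY e."id", e."name"
    HAVING SUM(d."totalRevenue") BETWEEN ${minRevenue} AND ${maxRevenue}
    ORDER BY "totalRevenue" DESC
  `)
}
```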
Thanks, someone else recommended this to me. Is this suitable for Wasp? And if so, how would I implement it?

Wasp is just a Node.js server, so you'd do it in any way you see fit to connect a Node.js app with one of those databases. I'd probably set things up in the server.setupFn function and import the DB client (e.g. for ClickHouse) from there in the rest of the server.
Raw queries | Prisma Documentation
Learn how you can send raw SQL and MongoDB queries to your database using the raw() methods from the Prisma Client API.
What do you mean with server.setupFn? I don't understand from there.
Server Config | Wasp
You can configure the behavior of the server via the server field of app declaration:
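For reference, the rough shape of that in main.wasp (a sketch; the app and function names are hypothetical):
```wasp
app myApp {
  // ...
  server: {
    // Runs once when the server starts -- a natural place to construct
    // e.g. a ClickHouse client and share it with the rest of the server.
    setupFn: import { setupServer } from "@src/serverSetup"
  }
}
```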
So I need to do this setup in the server in order to use raw SQL?
I think calculating them and saving them under some fields is the best option for me. Would a cron job be best?
Also, let's say I add a new model in my schema.prisma and migrate it in the dev version. What happens when I deploy then? What is best practice?
@kapa.ai
This job does not run at 22:40 in my local time.
Let me check the knowledge base for you again. One moment! :books:
----------------------
Based on the information provided in the knowledge sources, I can explain why your job might not be running at 22:40 local time as expected.
The cron expression in your job declaration is set to run at 22:40 (10:40 PM) every day. However, it's important to note that the time zone for cron jobs in Wasp (which uses pg-boss) is typically based on the server's time zone, not your local time zone.
According to the pg-boss documentation (https://wasp-lang.dev/docs/advanced/jobs#pg-boss-considerations), Wasp uses pg-boss for job scheduling. The cron expression is interpreted based on the server's time zone where your application is running.
To address this issue, you have a few options:
1. Adjust the cron expression to match your desired local time, taking into account the time difference between your local time and the server time (see the example after this list).
2. If you're running this locally for development, ensure your server's time zone is set correctly.
3. For production environments, you might need to configure the server's time zone to match your desired execution time.
4. Alternatively, you could adjust your job logic to account for time zone differences if needed.
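For example (a sketch of option 1): if the server runs in UTC and your local time is UTC+2, then 22:40 local is 20:40 server time:
```wasp
schedule: {
  // 22:40 local == 20:40 UTC when local time is UTC+2
  cron: "40 20 * * *"
}
```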
If you need more precise control over the execution time or need to handle multiple time zones, you might need to implement additional logic within your dataJobfn to manage these time-related complexities.
Remember, when deploying to production, the behavior might differ based on the hosting environment's time zone settings. Always test your scheduled jobs in an environment that closely matches your production setup to ensure they run at the expected times.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Sorry @Sven, I seem to be having a bit of trouble. Don't worry though, I've pinged my support and they will try to fix me up! :technologist:
Perhaps try again in 15mins?
Fixed, no data lost, just deployed to Fly.
Also, my job didn't run because I had schedule below entities.
It has to be the other way around.
Well, to be honest, sometimes I notice that it just does not run
even with
cron: "* * * * *",
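For reference, a sketch of the job declaration shape being discussed, with schedule placed above entities as described; the function name echoes the dataJobfn mentioned earlier, and the entities list is hypothetical:
```wasp
job dataJob {
  executor: PgBoss,
  perform: {
    fn: import { dataJobFn } from "@src/jobs"
  },
  schedule: {
    cron: "40 22 * * *"
  },
  entities: [Events, EventData]
}
```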
Did you accomplish it in the end?
Yes
How did you NOT lose data?
in prod
Wait
I just added a column to a table
Then did wasp db migrate-dev
And then wasp deploy fly deploy
Fly migrates these changes to prod
And I didn't lose data
And if I did, Fly makes a snapshot of the database to go back to
Fly has a snapshot option?!?
omg
It does it daily
In which section can you see that?