Wasp, 6mo ago
mb23

Best way to use jobs/workers

I've created jobs/workers for most of the functions that use external APIs (such as my SERP API or OpenAI) to ensure the tasks are properly executed. Unfortunately, I now sometimes run into problems with PgBoss and the Postgres DB saying that there are too many connections. At the moment, I create a PgBoss instance for each job/worker. Could you tell me if:
1) I should rather create one PgBoss instance per user session to reduce the number of connections?
2) I should rather avoid using jobs/workers for smaller tasks (that still use an external API but might take less time than others)?
Would love to get your feedback on best practices here! Thanks ❤️
15 Replies
kapa.ai, 6mo ago
Hi @mb23. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
----------------------
Based on the information provided in the knowledge sources, it's clear that Wasp uses PgBoss as a job executor to handle scheduling, monitoring, and execution of jobs. However, there are some considerations to keep in mind when using PgBoss:
1. PgBoss runs alongside your other server-side code, sharing the CPU. Therefore, it's not appropriate for CPU-heavy workloads. If you're running into issues with too many connections, it might be worth evaluating the nature of your tasks and whether they are CPU-intensive. If they are, you might want to consider handling them differently.
2. When you add PgBoss to a Wasp project, it automatically adds a new schema to your database with some internal tracking tables. If you're creating a new PgBoss instance for each job/worker, this could be contributing to the high number of connections. It might be more efficient to create one PgBoss instance per user session, as you suggested, but this would depend on the specifics of your application and its requirements.
3. If you're using jobs/workers for smaller tasks that use an external API but might take less time than others, it could be worth considering whether these tasks really need to be handled as jobs. If they're not CPU-intensive and don't require the scheduling and monitoring provided by PgBoss, it might be more efficient to handle them directly.
Remember, if you need to customize the creation of the PgBoss instance, you can set an environment variable called PG_BOSS_NEW_OPTIONS to a stringified JSON object containing initialization parameters. Note that setting this overwrites all Wasp defaults, so you must include database connection information as well.
For more detailed information, you can refer to the Wasp documentation on jobs (https://wasp-lang.dev/docs/advanced/jobs#declaring-jobs) and the PgBoss documentation (https://github.com/timgit/pg-boss/blob/8.4.2/docs/readme.md#newoptions).
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Sorry @mb23, I seem to be having a bit of trouble. Don't worry though, I've pinged my support and they will try to fix me up! :technologist: Perhaps try again in 15mins?
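A minimal sketch of what a value for PG_BOSS_NEW_OPTIONS could look like, assuming you want to cap the pool size. The helper name `buildPgBossNewOptions` is hypothetical; `connectionString` and `max` are pg-boss constructor options, but check the linked 8.4.2 docs before relying on them, and remember that this value replaces all Wasp defaults.

```typescript
// Hypothetical helper: builds the stringified JSON for PG_BOSS_NEW_OPTIONS.
// Because setting the variable overwrites Wasp's defaults, the database
// connection string has to be included explicitly.
function buildPgBossNewOptions(databaseUrl: string, maxConnections: number): string {
  if (!Number.isInteger(maxConnections) || maxConnections < 1) {
    throw new Error("maxConnections must be a positive integer");
  }
  return JSON.stringify({
    connectionString: databaseUrl, // same database the app uses
    max: maxConnections,           // upper bound on connections for this instance
  });
}

// Example: print a line you could paste into a .env file.
// The URL below is a placeholder, not a real database.
const exampleValue = buildPgBossNewOptions("postgresql://user:pass@localhost:5432/app", 5);
console.log(`PG_BOSS_NEW_OPTIONS=${exampleValue}`);
```

This only helps once there is a single PgBoss instance; capping the pool does little if a new instance is created on every job run.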
miho, 6mo ago
Could you share how you did this?
"At the moment, I create a PgBoss instance for each job/worker."
You shouldn't need to do this with Wasp, we create a PgBoss instance and make it happen for you 😄
mb23 (OP), 6mo ago
@miho oh, good to know 😄 Here's an extract of how I do it in the worker:
import PgBoss from 'pg-boss';
....
export const createOutlineWorker: CreateOutlineJob<Input, Output> = async ({ keywordId, keywordType }, context) => {
  const boss = new PgBoss({ connectionString: process.env.DATABASE_URL_PGBOSS });
  await boss.start();
  try {
miho, 6mo ago
Oooh yeah, that's not necessary 😄 Did you use the workers without that and conclude you needed it, or did you start like this?
miho, 6mo ago
You can just write functions and Wasp will execute them: https://wasp-lang.dev/docs/advanced/jobs
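A minimal sketch of the worker from the extract above with the manual PgBoss setup removed. The `Input`/`Output` shapes, the `JobFn` stand-in for Wasp's generated `CreateOutlineJob` type, and the `buildOutline` helper are all assumptions made so the snippet runs on its own; the thread does not show the real ones.

```typescript
// Hypothetical input/output shapes; the real ones are not shown in the thread.
type Input = { keywordId: string; keywordType: string };
type Output = { outline: string[] };

// Stand-in for the job type Wasp generates (CreateOutlineJob<Input, Output>
// in the extract above), so this file compiles without a Wasp project.
type JobFn<I, O> = (args: I, context: unknown) => Promise<O>;

// Note what is missing: no pg-boss import, no `new PgBoss(...)`, no
// `boss.start()`. The function only does the work; Wasp schedules and runs it.
const createOutlineWorker: JobFn<Input, Output> = async ({ keywordId, keywordType }, _context) => {
  const outline = await buildOutline(keywordId, keywordType);
  return { outline };
};

// Hypothetical helper standing in for the OpenAI / SERP API calls.
async function buildOutline(keywordId: string, keywordType: string): Promise<string[]> {
  return [`Outline for ${keywordType} keyword ${keywordId}`];
}
```

In a real Wasp project the function would be exported and referenced from the job declaration, as described in the jobs docs linked above.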
mb23 (OP), 6mo ago
@miho I've just started like this and didn't question it haha. Good to know! So, this will probably also solve my problem of having too many connections, right? 🙂
miho, 6mo ago
Yep, we only start one PgBoss instance and that should get rid of that warning 🙂 Please let us know if the current system limits you in any way so we can learn and improve Wasp!
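To illustrate why a single shared instance helps, here is a small sketch that counts how many connection pools get opened under each approach. `FakeQueueClient` is a made-up stand-in, not the pg-boss API, and the assumption is that each started client opens its own pool of database connections.

```typescript
// Made-up stand-in for a queue client; it only counts how many times a
// "pool" is opened. This is NOT the pg-boss API.
class FakeQueueClient {
  static openPools = 0;

  async start(): Promise<FakeQueueClient> {
    FakeQueueClient.openPools += 1; // pretend a new connection pool is opened
    return this;
  }
}

// Pattern from the question: a fresh client inside every job run.
async function runJobWithOwnClient(): Promise<void> {
  const client = new FakeQueueClient();
  await client.start();
}

// Shared pattern: start once, memoize the promise, reuse it everywhere.
let sharedClient: Promise<FakeQueueClient> | undefined;

function getSharedClient(): Promise<FakeQueueClient> {
  if (sharedClient === undefined) {
    sharedClient = new FakeQueueClient().start();
  }
  return sharedClient;
}

async function runJobWithSharedClient(): Promise<void> {
  await getSharedClient();
}

// Runs five jobs each way and reports how many pools each approach opened.
async function demo(): Promise<{ perJob: number; shared: number }> {
  FakeQueueClient.openPools = 0;
  for (let i = 0; i < 5; i++) await runJobWithOwnClient();
  const perJob = FakeQueueClient.openPools;

  FakeQueueClient.openPools = 0;
  for (let i = 0; i < 5; i++) await runJobWithSharedClient();
  const shared = FakeQueueClient.openPools;

  return { perJob, shared };
}
```

The per-job count grows with every run while the shared count stays at one, which is the same shape as the "too many connections" problem described above.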
mb23 (OP), 6mo ago
@miho ok great! At the moment, I'm also using the id and the state of the current job to be able to cancel the job:
const boss = new PgBoss({ connectionString: process.env.DATABASE_URL_PGBOSS });
await boss.start();
...
let bosscancel = await boss.cancel(jobId);
Can you tell me how I can access the PgBoss instance that's automatically started? 🙂
miho, 6mo ago
This is not really a public API, but we should add it to wasp/server/jobs since people need it. Here's how you can get the PgBoss instance for now:
import { pgBossStarted } from 'wasp/server/jobs/core/pgBoss/pgBoss'

const pgBossInstance = await pgBossStarted
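Putting the two pieces together, a minimal sketch of cancelling a job through the shared instance instead of a new one. The `CancellableBoss` interface and the `cancelJob` helper are hypothetical; the only call assumed on the instance is the `cancel(jobId)` that mb23 already uses. The promise is passed in as a parameter so the snippet runs without a Wasp project.

```typescript
// Minimal shape of what is needed from the instance: the cancel(jobId) call
// used earlier in the thread. Hypothetical interface, for illustration only.
interface CancellableBoss {
  cancel(id: string): Promise<void>;
}

// In a Wasp app, `bossPromise` would be `pgBossStarted`, imported from the
// non-public path 'wasp/server/jobs/core/pgBoss/pgBoss' shown above.
async function cancelJob(bossPromise: Promise<CancellableBoss>, jobId: string): Promise<boolean> {
  const boss = await bossPromise; // reuse the instance Wasp already started
  try {
    await boss.cancel(jobId);
    return true;
  } catch (err) {
    // Report the failure instead of crashing the caller.
    console.error(`Could not cancel job ${jobId}:`, err);
    return false;
  }
}
```

Usage in a Wasp action would then look like `await cancelJob(pgBossStarted, jobId)`. Since the import path is internal, it may change between Wasp versions.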
mb23 (OP), 6mo ago
@miho nice, thanks a lot, you've helped me a lot!
miho, 6mo ago
Created an issue to follow this: https://github.com/wasp-lang/wasp/issues/2229
matijash, 6mo ago
@mb23 I think it's also a valid question when to use a job vs. simply making an API request and waiting for the response. I think most API requests can actually be consumed directly. Typically, if an API request takes longer, the API will offer a webhook to notify you once your result is ready. Jobs are mostly for when you're doing something yourself (e.g. some extensive analytics combing through your whole DB, and you know it will take a while); then a job is a good choice. Feel free to share specific examples with us and we can comment here.
mb23 (OP), 6mo ago
@matija Thanks for your feedback! I've created an AI blog post generation tool: contentforce.ai
I'm using jobs for all content creation related tasks at the moment, for example:
- creating an outline (multiple OpenAI API requests + ValueSERP API request) --> task duration: ~ 5 minutes
- doing research for blog articles (multiple OpenAI API requests + ValueSERP API request + DataForSEO API request + Google Cloud NLP API request) --> task duration: ~ 5 minutes
When testing it without jobs, I had the feeling that these tasks are not properly executed when something "unexpected" happens, such as closing the tab/browser or going to another page.
matijash, 6mo ago
Got it, these do indeed sound like a good fit for jobs, given they take so long.
mb23 (OP), 6mo ago
@matija ok great, thanks 🙂