I am trying to understand the necessity of Kafka

Suppose I have a dashboard that shows users how a stock changes over a span of time (e.g. 7 days, 1 month, 1 year, 5 years, etc.). We have a database where the stock data gets updated every 2 hours. Can I just read the data from the time series DB? Is Kafka even necessary in this use case?
Lois (OP), 2y ago
Or this
JulieCezar, 2y ago
It depends on what that script does and how you want to handle it. Kafka can aggregate and manipulate data and then share it with your consumers. If you are already doing that in your script, you can skip Kafka: in this use case, to my knowledge, you wouldn't gain anything by adding it. On the contrary, it would just be a useless extra step.

However, if you don't do aggregation etc. in the script and just save raw data in the time_series_db, then putting Kafka as a layer between your DB and your app would make sense. This also depends on what you want to do: if you want realtime data, Kafka would be a better solution than having to query every X seconds to get the newest data from your DB.
Neto, 2y ago
Kafka as a whole is a giant broker setup. You can have multiple subscribers that it streams data to: Kafka can send into the time_series_db, to consumer2, consumer3, consumer4, etc. If it's just reading data and storing the data, a "simpler" queue would be enough.
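The difference between a broker with many subscribers and a "simpler" queue can be shown with a toy in-memory sketch. `TinyLog` and `TinyQueue` are made-up names for illustration only; this is not Kafka's API, just the idea of per-subscriber offsets versus take-once delivery.

```python
from collections import deque

class TinyLog:
    """Append-only log; each subscriber keeps its own read offset."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # subscriber name -> next index to read

    def publish(self, record):
        self._records.append(record)

    def poll(self, subscriber):
        """Return every record this subscriber has not seen yet."""
        start = self._offsets.get(subscriber, 0)
        batch = self._records[start:]
        self._offsets[subscriber] = len(self._records)
        return batch

class TinyQueue:
    """Plain work queue: a record is gone once any one reader takes it."""

    def __init__(self):
        self._items = deque()

    def publish(self, record):
        self._items.append(record)

    def take(self):
        return self._items.popleft() if self._items else None

log = TinyLog()
log.publish({"symbol": "ACME", "price": 10.0})  # synthetic demo record
log.publish({"symbol": "ACME", "price": 10.5})

# Both subscribers independently receive the full stream.
for_tsdb = log.poll("time_series_db")
for_consumer2 = log.poll("consumer2")

queue = TinyQueue()
queue.publish({"symbol": "ACME", "price": 10.0})
first = queue.take()   # the single reader gets the record
second = queue.take()  # nothing left for anyone else
```

If the only reader is the process that stores the data, the second shape is all that is needed.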
JulieCezar, 2y ago
And yes, this is also a valid point: you need to deploy Kafka somewhere, or use something like Pusher.
Neto, 2y ago
Kafka infra is heavy af and expensive, because usually it's a cluster + something to manage it.
Lois (OP), 2y ago
@JulieCezar atm it is on Upstash. I'm just doubting whether it's necessary for the time being.
jingleberry, 2y ago
I wouldn't have the FE read directly from Kafka. Reading from the time series DB is perfectly fine.

Kafka is useful for things such as keeping two other services up to date with some stream of data. In your case, the source of that stream is the script which gets the stock data. You can then have a lambda/server or whatever read from the Kafka stream and update an Elasticsearch cluster, so you can run aggregations or other random access patterns on that stock data. You can also have another lambda/server read from the Kafka stream and update the time series DB.

From there your FE can have all the query patterns it wants to access the data. It can read from your Elasticsearch cluster or it can read from your time series DB. Either way, you have a system where all your data stays consistent with the data from your original script.

For your case though, unless the write throughput straight to your time series DB is a limiting factor, I'd skip Kafka as the middleman entirely.
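For the dashboard in the question, reading straight from the time series DB might look roughly like the sketch below. Python's built-in `sqlite3` is used as a stand-in; the `stock_prices` table, the `ACME` symbol, the generated prices, and both helper names are assumptions for illustration. A real time series DB will have its own query language and probably built-in time bucketing.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def make_demo_db(now, days=30):
    """Create an in-memory table with one synthetic point every 2 hours."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock_prices (symbol TEXT, ts INTEGER, price REAL)")
    start = now - timedelta(days=days)
    rows = []
    for i in range(days * 12 + 1):  # 12 two-hour steps per day
        t = start + timedelta(hours=2 * i)
        rows.append(("ACME", int(t.timestamp()), 100.0 + i))  # fake prices
    conn.executemany("INSERT INTO stock_prices VALUES (?, ?, ?)", rows)
    return conn

def daily_averages(conn, symbol, now, span):
    """Return (day, avg_price) rows for the look-back span, oldest first."""
    since = int((now - span).timestamp())
    cur = conn.execute(
        """
        SELECT date(ts, 'unixepoch') AS day, AVG(price)
        FROM stock_prices
        WHERE symbol = ? AND ts >= ?
        GROUP BY day
        ORDER BY day
        """,
        (symbol, since),
    )
    return cur.fetchall()

now = datetime(2024, 1, 31, tzinfo=timezone.utc)
conn = make_demo_db(now)

# One query per dashboard range; only the look-back span changes.
last_week = daily_averages(conn, "ACME", now, timedelta(days=7))
last_month = daily_averages(conn, "ACME", now, timedelta(days=30))
```

Longer ranges such as 1 year or 5 years would use the same query shape with a coarser bucket (for example weekly or monthly) to keep the number of points manageable.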
Lois (OP), 2y ago
Thanks. That's a good answer, and now I know what to do next. I have also asked in the Upstash server and they responded.