Any suggestions for architecture improvements?
I have the following architecture (in blue = our microservices, in red = external):
At the top of the diagram: clients create/update products using the Products-Domain REST endpoints. Each time a product is created/updated, we store it in the database and send a JSON message to the ProductPublisher microservice. This microservice converts it to an XML format understood by ExternalSystem1 and sends it to that external system. Each product has an id and a type ("New," "Used," "Refurbished"). Each product also has other information like dimensions and weight.
At the bottom of the diagram: the Stock-Consumer receives XML messages from ExternalSystem2 containing updates about the stock of products. For each product id, we receive information about available-stock, blocked-stock, etc. The Stock-Consumer converts this information to JSON understood in our domain and sends it to Stock-Domain. The domain sums the stock in the received message with the existing stock information.
Problem: We want to create a front end for only a subtype of products (Refurbished, representing only 5% of all products). In the front end, we need to aggregate data about refurbished products and their stocks. For example, we need to provide filtering for (refurbished) products whose stock is below or above a user-given level.
Solution 1: The Aggregator microservice (in green in the diagram) will have its own database with aggregated information about refurbished products and their stock. The front end will use this service to provide its features.
Each time a product is created/updated, the Aggregator will update its own database. To do so, it will listen for events on the products Kafka topic. The messages on that topic contain a field indicating the product type, so the Aggregator will read all messages but "skip" those whose type is not Refurbished (by "skip" we mean we don't make any database calls, we just do a continue in the code).
Every time we receive stock information, the Aggregator will update its own database. To do so, it will listen for events on the stock Kafka topic. The messages on that topic contain only a productId, no information about whether the product is Refurbished or not. So the Aggregator will read each message and call its own database to see whether the product exists (if yes = it's refurbished, if no = it's not). If it exists, it will update its stock information.
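Roughly, the product listener looks like this (a sketch with Confluent.Kafka; the topic name, the "type" field, and the broker address are assumptions, not our real config):
```csharp
using System.Text.Json;
using System.Threading;
using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "localhost:9092",   // assumption: real broker address differs
    GroupId = "aggregator-products"
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();
consumer.Subscribe("products");            // hypothetical topic name

while (true)
{
    var result = consumer.Consume(CancellationToken.None);
    using var doc = JsonDocument.Parse(result.Message.Value);

    // "skip" = no database call at all, just move on to the next message
    if (doc.RootElement.GetProperty("type").GetString() != "Refurbished")
        continue;

    // upsert the refurbished product into the Aggregator's own database
    // (persistence code omitted)
}
```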
Drawbacks:
When listening for stock messages, each message requires a call to the Aggregator database. This is far from optimal, since 95% of the calls are wasted: only 5% of products are refurbished!
Also, since the Aggregator listens to a topic that sits BEFORE the domain, all the relevant domain logic has to be duplicated (for example the summing logic).
Any other drawbacks of this architecture? Any improvement suggestions?
I do something similar in one scenario, but I use an actor framework; actors carry state, have behavior, and are typically single-threaded.
if the event state and the actor state are identical, the only change to the actor is a last-check date.
If the state of the event is different from the actor's, the actor updates its state and ships the delta to the aggregation database.
If the actor hasn't heard a message in some time, it reaches out to the data sources, updates its state, and ships the delta if it's different.
The main drawback is, actor frameworks tend to be complex; they're harder to code against, harder to debug, etc.
I use Orleans, an actor framework created and maintained by Microsoft.
I don't think I can use another framework at my job, I'm not the one who decides, unfortunately :/
I don't know if Kafka can act like an actor; that might be an option, where an event is filtered out if the previous state is identical.
Otherwise, to a greater degree, you're going to have 95% wasted effort.
There are data patterns you could follow where you have write-optimized tables and read-optimized tables; that way you get a fast write to a temporary landing table, and another process flow-controls into the read tables.
That being said I'd be certain you need to solve this problem.
but how will Kafka know if a stock event concerns the Aggregator or not?
It won't
This is assuming there is some magic filter which could sit on top of Kafka with the needed criteria
I don't know if such a thing exists
no it doesn't :/
So, do you even need flow control to the db?
95% of calls are wasted; so what?
it's a lot of waste!
Are you paying per transaction?
no 😆
Do you know the rate of messages
I'm assuming huge rate of messages because Kafka
yes, it's huge, we receive a stock update every hour containing stock info for 1,000 to 10,000 products
Oh
Just write to the database
That's not huge
it will result in 150+ calls a minute
the Aggregator will listen to that stock Kafka topic, and for each message it needs to call its own database to see whether the product exists
Unless your databases are multi tenant or extremely write optimized, you likely don't have a flooding message problem
Why?
Are you worried about having products in the aggregator that you don't need?
yes, I am only allowed to have Refurbished products in the aggregator
I will not put every type of product there, only Refurbished (which is 5%)
so first the Aggregator listens to the products topic; there is a field there indicating the product type. If it's Refurbished, it saves it to the Aggregator database
Does the message contain detail if it's refurbished?
in the product topic yes
but in the stock topic no
and that's the problem
when listening for stock events, a message does not contain any info about the product being refurbished or not
Is there a real case for this, or someone feels like limiting is important?
How many SKUs?
yes there is a real case, the front end will only show refurbished products, it's an app about refurbished products
a message will contain info about 1,000 products or so
we receive a message every 20 minutes to 1 hour
10,000 **
It's just not a lot of data
I'd confirm you have a problem worth fixing
If it's a real problem you'll need flow control somewhere
what do you mean by "flow control" ?
You have a message with 1_000_000 SKUs, too much to handle; the database inflates, memory requirements for producer/consumer blow up, you hit some bottleneck
So you need flow control; a pool where work can wait and get pulled in some at a time
But you have 10_000 every 20 min; is the problem real?
the problem is not having 10,000 products every 20 minutes
the consumer is able to handle this without any problem whatsoever
the problem is that the aggregator will need to consume every single message on the stock Kafka topic. For every single product it will check its database to see whether the product exists (that's the only way to keep only refurbished products)
Oh I think I see
You mean you get bajillion messages in a polluted topic
this will result in 10,000 calls, whereas only 500 are refurbished
yes, for the aggregator it doesn't care about all that crap, it only cares about products that are refurbished and that's all
I'm still stuck on 10k, is it a real problem, or you're concerned it might be a problem?
I don't want to have a service calling its own database 10,000 times when in the end only 500 products will be found and updated. It's a scaling problem
in a period of 6 months, it could be 100,000 per minute
the growth rate is very high
How does this know it's refurbished or not?
it doesn't need to, it must store all the info
it doesn't care about distinguishing refurbished or not, it will just keep all the information
the filtering by Refurbished is only the Aggregator's problem
the Stock-Domain is used for all types of products, we don't make any distinction between refurbished/new/used
Okay, so same question, how does the aggregator know
First, it will listen to the products topic. Every message there has info on whether it's refurbished or not
so the aggregator will have a Kafka consumer; if the product is refurbished, it will store it in its own database
How much footprint is the aggregator?
thus, we can safely assume that the Aggregator will have all the refurbished products stored in its own database
hmm, what do you mean?
Can you store all products in memory in the aggregator?
no I can't
there are 20 million products (new + refurbished + used)
Can you store all refurbished products
yes, that's the point of listening to the product topic
every time a new Refurbished product is created, the Aggregator will save it in its own database
I mean in memory
Like a list of refurbished SKUs
At some point if you want to scale something will need to change
I don't think so, it's not very scalable, but I am interested to know, if that's the case, what we can do with it?
Rather than calling its DB, the aggregator would check its memory?
Yes, it's got problems though
Because if you know most of the products you can filter in the aggregator
And quarantine messages you don't know
Then check the couple thousand you don't know
okay, it's like a small cache then?
Yes, but caching only refurbished won't work unless you can guarantee one of two things: you know about all the refurbished products all the time, or a missed message will be healed later by another message.
And the healing happens in a reasonable time
You said 95% of calls would be wasted; so if you have SKUs 1, 2, 3 as refurbished, but the message has 1, 2, 3, 4, 5, 6 and 6 is a refurbished SKU, you lost the refurbished message
But later on you'll know about 6
And a later message contains 6
So now you're good again
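The filter itself is tiny in code; a minimal sketch, assuming the products topic feeds a set of refurbished ids (all type and method names here are made up):
```csharp
using System.Collections.Generic;

public record StockUpdate(string ProductId, int AvailableStock, int BlockedStock);

public class StockFilter
{
    // Populated from the products topic: every refurbished product id seen so far.
    private readonly HashSet<string> _refurbishedSkus = new();

    // Unknown SKUs are parked instead of dropped, so a late product
    // message can "heal" them on a re-check.
    private readonly List<StockUpdate> _unknown = new();

    public void OnProductEvent(string productId, string type)
    {
        if (type == "Refurbished")
            _refurbishedSkus.Add(productId);
    }

    public void OnStockUpdate(StockUpdate update)
    {
        if (_refurbishedSkus.Contains(update.ProductId))
            WriteToAggregatorDb(update);   // the ~5% we actually care about
        else
            _unknown.Add(update);          // quarantine, re-check later
    }

    private void WriteToAggregatorDb(StockUpdate update) { /* persistence omitted */ }
}
```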
This is where, in my scenario, I use an actor model
Because now you have a trivial number of actors that all know whether their product is refurbished, used, new, or not known
If not known the actor goes to find out
And any time a state change happens where write to aggregator conditions are met, boom, write to database
hmm, I still can't see how the actor "goes to find out"
Actors live in memory; so when a new stock update comes in, either an actor exists for that SKU or not
If not, an actor is created
So an undifferentiated actor gets created, whose whole job is to "find out" and create a differentiated actor
Then the undifferentiated actor is either deleted or points to the differentiated actor
So a differentiated actor would be an instance of RefurbishedStock for instance
Or NewStock
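Hand-waving the framework away, the shape is roughly this (a plain C# sketch; Orleans or akka.net would give you the lifecycle and single-threading for free, and every name here is hypothetical):
```csharp
using System.Threading.Tasks;

public record StockUpdate(string ProductId, int AvailableStock, int BlockedStock);
public interface IStockActor { Task HandleAsync(StockUpdate update); }
public interface IProductLookup { Task<string> GetTypeAsync(string sku); }

// The undifferentiated actor's whole job is to "find out" what the
// product is, then hand every message off to the differentiated actor.
public class UndifferentiatedActor : IStockActor
{
    private readonly string _sku;
    private readonly IProductLookup _lookup;   // hypothetical: asks Products-Domain
    private IStockActor? _differentiated;

    public UndifferentiatedActor(string sku, IProductLookup lookup)
        => (_sku, _lookup) = (sku, lookup);

    public async Task HandleAsync(StockUpdate update)
    {
        if (_differentiated is null)
        {
            var type = await _lookup.GetTypeAsync(_sku);   // "goes to find out"
            _differentiated = type == "Refurbished"
                ? new RefurbishedStockActor(_sku)
                : (IStockActor)new IgnoredStockActor();    // used/new: do nothing
        }
        await _differentiated.HandleAsync(update);
    }
}

public class RefurbishedStockActor : IStockActor
{
    private readonly string _sku;
    public RefurbishedStockActor(string sku) => _sku = sku;

    // a state change meeting the write conditions => ship the delta
    public Task HandleAsync(StockUpdate update) => Task.CompletedTask; // persistence omitted
}

public class IgnoredStockActor : IStockActor
{
    public Task HandleAsync(StockUpdate update) => Task.CompletedTask;
}
```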
this sounds a little bit complicated 🤔
I mean yes and no
Writing your own actor framework is hard
Using orleans, akka.net, etc, much easier
It would be a learning curve no matter what
Orleans has some advantages around the idea of virtual actors
The main advantage of a virtual actor is, the moment I need you, either you exist or you will at that moment
Orleans deactivates virtual actors after some time
Now all that being said
Go back in time to "so what"
100_000 every 20 min. So what?
you're right! also, if I'm able to keep the list of refurbished products in the Aggregator's memory, that's enough I guess
The worst that happens is you miss a message
Err a sku in a message
It comes later
not necessarily
I mean; so do this
I guess the database in the Aggregator will always have all the refurbished products
oh no :/
assuming something bad happens on the products topic (some Kafka partition is down), then we will miss some refurbished products
the Aggregator will not have them :/ then when receiving stock messages we will not know that the product is refurbished, so the message is lost as you said
because it will be skipped
Message comes in, all known refurbished SKUs write immediately, then use a channel in code to quarantine and check the messages that aren't well known
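With System.Threading.Channels in .NET that could look something like this; a sketch with the slow lookup passed in as a delegate (all names hypothetical):
```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public record StockUpdate(string ProductId, int AvailableStock, int BlockedStock);

public class StockQuarantine
{
    private readonly Channel<StockUpdate> _parked =
        Channel.CreateUnbounded<StockUpdate>();        // bounded would add backpressure
    private readonly HashSet<string> _refurbishedSkus;
    private readonly Func<string, Task<bool>> _isRefurbished;  // hypothetical slow lookup

    public StockQuarantine(HashSet<string> refurbishedSkus,
                           Func<string, Task<bool>> isRefurbished)
        => (_refurbishedSkus, _isRefurbished) = (refurbishedSkus, isRefurbished);

    // Hot path: well-known refurbished SKUs write straight through,
    // everything else gets parked instead of dropped.
    public void OnStockUpdate(StockUpdate update)
    {
        if (_refurbishedSkus.Contains(update.ProductId))
            WriteToAggregatorDb(update);
        else
            _parked.Writer.TryWrite(update);
    }

    // Background drain: re-check only the parked SKUs (the couple
    // thousand you don't know), not every message on the topic.
    public async Task DrainAsync(CancellationToken ct)
    {
        await foreach (var update in _parked.Reader.ReadAllAsync(ct))
        {
            if (await _isRefurbished(update.ProductId))
            {
                _refurbishedSkus.Add(update.ProductId);  // heal the cache
                WriteToAggregatorDb(update);
            }
            // else: drop it, or dead-letter it for a later replay
        }
    }

    private void WriteToAggregatorDb(StockUpdate update) { /* persistence omitted */ }
}
```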
You're always going to have this chance, you have no guarantees with this.
The third-party (red) product feed could be down, for instance
But stock comes in still
Or, stock arrives before product does
😆 yeah, it's a nightmare, all of this
Eh
There are a lot of mitigations; if there is no product for a SKU, then you dead-letter the stock message
You can always replay dead letters later
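With Kafka, dead-lettering can be as simple as producing the unmatched message to a separate topic; a sketch with Confluent.Kafka (the topic name, SKU, and payload are made up):
```csharp
using Confluent.Kafka;

using var producer = new ProducerBuilder<string, string>(
    new ProducerConfig { BootstrapServers = "localhost:9092" }).Build();

// no product known for this SKU yet: park the raw message on a
// dead-letter topic instead of losing it
string sku = "12345";                                  // from the unmatched stock message
string rawJson = "{ \"productId\": \"12345\", \"availableStock\": 7 }";

await producer.ProduceAsync("stock-dead-letter",       // hypothetical topic name
    new Message<string, string> { Key = sku, Value = rawJson });

// replaying is then just another consumer on "stock-dead-letter" that you
// run whenever you like, e.g. after the product feed has caught up
```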
You can also just build out a data store of all messages and products, and periodically mine the data for your aggregate
But that's a different design
Essentially your aggregator says to the product store: give me all refurbished products; and then you read all stock and update the aggregation database
Or combine both
Concepts
@Schrödinger's Cat if you can't guarantee messages won't be replayed naturally at some point; how do you handle stale data
In either case you still need to deal with I haven't heard about this stock in X time frame
not sure I understand "combining both"
You have messages that are optimistically processed, but you still build out message stores and have a grand rectifier run once a day or at some interval
@Schrödinger's Cat the other option: maybe consider the outbox messaging pattern
you mean something like this:
In regards to combining both or outbox pattern?
The Aggregator will periodically ask the Products-Domain "give me the refurbished products", and then, since it has the list of refurbished product ids, it will ask the Stock-Domain "give me the stock of all the following products"
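something like this maybe? (a sketch; both endpoints and the DTO are invented, the real APIs would differ)
```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;

public record StockInfo(string ProductId, int AvailableStock, int BlockedStock);

public class Reconciler
{
    private readonly HttpClient _http;
    public Reconciler(HttpClient http) => _http = http;

    public async Task ReconcileAsync()
    {
        // 1) ask Products-Domain for all refurbished product ids
        var refurbishedIds = await _http.GetFromJsonAsync<List<string>>(
            "products-domain/products?type=Refurbished");  // hypothetical endpoint

        // 2) ask Stock-Domain for the stock of exactly those ids
        var response = await _http.PostAsJsonAsync(
            "stock-domain/stocks/query", refurbishedIds);  // hypothetical endpoint
        var stocks = await response.Content.ReadFromJsonAsync<List<StockInfo>>();

        // 3) upsert the joined result into the Aggregator database (omitted)
    }
}
```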
regarding "both", what you described here
Right, but you can do this pretty infrequently if you're also processing messages.
Er multiple subscribers to the same topic
you mean I still need to subscribe to the Kafka messages, but in addition I will do that "direct asking" to resolve inconsistencies
hmm, since we know we have a stock update every 20 minutes or so, why not just do the direct asking then? without the Kafka thing
That's fine too
It's not a lot of data
with 10,000 refurbished products it should be fine I guess
there is enough time to process all the data and update before the next update
With 100_000 it's fine too really
You don't really care about this
okay.
From a robustness point of view:
Solution 1: subscribing to Kafka topics + caching the refurbished products in memory
Solution 2: Solution 1 + direct asking to resolve inconsistencies
Solution 3: only direct asking
The problem I'm solving is similar; I have millions of products and I need to reach out to systems to get data; they're mostly archaic and have no ability to emit change events
I feel like solution 3 is more robust, since we don't depend on Kafka being down or something. It will be a Stock-Domain problem, not the Aggregator's problem
Literally I have to talk to an as/400 to get stock data
In my scenario
what is an AS/400 😆?
Mainframe
? 😮
So my actors have to be really smart about how they get updates
That guy
oh that's really archaic 😄
didn't think it would still exist in 2023
It's one of IBM's largest books of business
why do people still use this :D
don't they have money to modernize?
IBM iSeries/AS400 is absurdly reliable
okay, interesting to know!
So, fun fact, no one knows how to run them any longer
Or write code against them
Or modify existing code
So like the banking industry is gonna fall apart as gray hairs retire and die off
yeah, I always wonder how banking systems still run on COBOL with very few people knowing how to write/read COBOL
Huge monolithic footprint and working code
I have one final question (or two) :p how do I do that direct asking periodically in a clean way?
I mean, is there another way other than doing some infinite loop in a background service, each time waiting some amount of time provided in a config (say 20 minutes or so)?
Start there
There are better ways
You have Kafka, raise an event
hmm, what do you mean by this?
Consider that your data sources are a data clearinghouse; so in theory, once stock has done its magic, it publishes a message to Kafka suggesting the data is prepared and ready to read
Start with the forever loop though
is that something similar to the Stock-Domain having some webhook or something?
And let requirements evolve complexity; don't let complexity evolve for its own sake
More like, when the data is ready to read from products, you get a message from a Kafka topic: ProductDataReady
From stock, StockDataReady
But I mean, start with a loop
yes, I will start with the forever loop, but I'm interested in learning about all this stuff 😆
what kind of evolution might I see that could change the system from a forever loop to some Kafka event (products/stocks are ready)?
Maybe a manual update of some sort that needs to be brought in
But don't plan for what if
In a distributed messaging system everything must be simple
If something is too complex break it up and use messages to make it simple
okay, I will do an infinite loop that sends GET requests to the Products-Domain to get the refurbished products, and then GET requests to the Stock-Domain with the list of refurbished products
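i.e. roughly this kind of background service (a sketch; the interval is hard-coded here but would come from config, and the loop body would call something like the hypothetical Reconciler from my earlier sketch):
```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

public class ReconciliationService : BackgroundService
{
    private static readonly TimeSpan Interval = TimeSpan.FromMinutes(20); // from config in practice

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using var timer = new PeriodicTimer(Interval);
        while (await timer.WaitForNextTickAsync(stoppingToken))
        {
            // 1) GET the refurbished products from Products-Domain
            // 2) GET their stock from Stock-Domain
            // 3) upsert into the Aggregator database
            //    (e.g. the hypothetical Reconciler from the earlier sketch)
        }
    }
}
```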
I'd start there and see what breaks
Don't be afraid to simulate messages with a million rows
Always consider, am I solving a problem I don't have?
Follow that up with; am I creating a problem in the near future
So I know how many SKUs I need to plan for, since I own them all; I know growth over the years, so I doubled my capacity, because it gets me 3-4 years of planned capacity
I know my org is slow to make changes too
that's exactly the problem, my org is VERY slow to make changes
I mean, to get a new Kafka topic, you need to make a request and pass a committee and get an OK
and only then do you get the topic
that could take up to 1 month
same for database collections, Redis, everything infrastructure-related takes a very long time
Moreover, I am not very experienced at all 🙄
I am trying to do some personal projects on my own to learn how to simulate problems and find solutions to them
today I've learned about actors, but I need to simulate a problem, then do some implementations with actors to solve it, to really grasp how this works
I need to build up a knowledge base of problems with their solutions and a discussion of the tradeoffs and so on
thanks !
your problem is similar to mine, but you have a lot more data and that's why you're using actors, right?
I need to update millions of SKU and stock positions every 20 minutes, and I can't make millions of calls in 20 minutes without negatively impacting some transitive dependencies. So I couple a number of channels that send events to actors; then if actors haven't heard any updates for an hour or so, they make requests to update.
Since I have the events as the primary source, I can wait much longer, and set up time-based tranches where different classes of products can wait longer
So for some I might wait an entire day
Hmm, I still can't see clearly how actors help with not making millions of calls every 20 minutes?
So, actors have state, and that state might include a last-updated date. Actor frameworks usually have a way to schedule work for actors, so at the 20-minute mark my actors check their state; if they've been updated by an event, they're current, and I don't need to make a request to update.
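Per actor, that check is tiny. A framework-agnostic sketch (the scheduling hook and the fetch delegate depend on your framework; all names here are made up):
```csharp
using System;
using System.Threading.Tasks;

public record StockUpdate(string ProductId, int AvailableStock, int BlockedStock);

public class SkuActor   // one instance per SKU; the framework keeps it single-threaded
{
    private static readonly TimeSpan MaxAge = TimeSpan.FromMinutes(20);
    private DateTimeOffset _lastUpdated;

    // Called when an event arrives: keeps the actor current for free.
    public void OnEvent(StockUpdate update)
    {
        Apply(update);
        _lastUpdated = DateTimeOffset.UtcNow;
    }

    // Called by the framework's scheduled-work mechanism at the 20-minute mark.
    public async Task OnScheduledCheckAsync(string sku, Func<string, Task<StockUpdate>> fetch)
    {
        if (DateTimeOffset.UtcNow - _lastUpdated < MaxAge)
            return;                        // current via events: no outbound call

        Apply(await fetch(sku));           // only stale actors "phone home"
        _lastUpdated = DateTimeOffset.UtcNow;
    }

    private void Apply(StockUpdate update)
    {
        // diff against current state; if different, ship the delta
        // to the aggregation database (omitted)
    }
}
```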
So as I start to build in messaging to older parts of my application to backfill events I need to "phone home" less and less
okay, thanks for the clarification, I will give it a try