Modular•9mo ago

Energy usage benchmarks?

I've read a lot about Mojo performance benchmarks, but not much about energy consumption. Are there any good articles from which one could roughly derive the energy savings?

19 Replies

Darkmatter•9mo ago

Savings compared to what? If you compare pure Python to Mojo it's almost a spite match. If you are comparing Mojo to other systems languages (C/C++/Rust), then it's a lot less clear and is more a question of "how much effort did the programmer put in?" Since I see you're also a distributed systems person: I'm pushing for more efficient network APIs that should let you maximize the use of a single server. So while it's probably more expensive if you have a single 400 watt server (due to polling and similar optimizations), at datacenter scale you need fewer servers to do the same work.
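To make the datacenter-scale argument concrete, here is a rough sketch of the arithmetic. Only the 400 watt figure comes from the message above; the request rates and the 300 watt figure are made-up illustrative values, not measurements, and the function names are hypothetical.

```python
import math


def servers_needed(target_rps, rps_per_server):
    """Smallest whole number of servers that can carry the target load."""
    return math.ceil(target_rps / rps_per_server)


def fleet_power_watts(target_rps, rps_per_server, watts_per_server):
    """Total power draw of the fleet, assuming every server runs at the given wattage."""
    return servers_needed(target_rps, rps_per_server) * watts_per_server


# Hypothetical numbers, for illustration only:
# an interrupt-driven server drawing 300 W vs. a polling server drawing 400 W
# that handles more requests per second.
TARGET_RPS = 1_000_000
interrupt_fleet = fleet_power_watts(TARGET_RPS, rps_per_server=100_000, watts_per_server=300)
polling_fleet = fleet_power_watts(TARGET_RPS, rps_per_server=250_000, watts_per_server=400)

print(f"interrupt-driven fleet: {interrupt_fleet} W")
print(f"polling fleet:          {polling_fleet} W")
```

With these assumed inputs the polling fleet draws less in total even though each machine draws more, which is the shape of the argument; real numbers would have to come from measurement.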

SiaOP•9mo ago

@Owen Hilyard Sorry, let me clarify with an example. Let's say you have a piece of Python code and port it to Mojo. Is there an understanding of the energy savings? For example, when doing a very large matrix multiplication.

Darkmatter•9mo ago

Code dependent. If you are comparing pure Python (not NumPy, PyTorch, etc.) to Mojo, it's like saying "Let's race a Raspberry Pi and a quad-socket server for raw computational power". It's such a big difference in compute usage that it's hard to compare. For example, other languages in the same performance class as Mojo can do web apps with tens of millions of requests per second on a single server. How many servers would you need to do that in Python?

SiaOP•9mo ago

a) The energy used in the end should be an absolute number, or am I wrong?

b) Not thinking servers or network IO here - just pure algorithmic, based on utilising more efficient instruction sets.

---

Let's step back. I expect a piece of Python, ported to Mojo, to run more efficiently, and thereby not just faster but also with lower energy consumption. Essentially, I expect Mojo to do to normal Python code what simdjson does to JSON parsing: https://github.com/simdjson/simdjson

So, my question is, are there any benchmarks out there? I do understand it will vary depending on the task at hand, and on factors such as where the bottlenecks will be (network IO, etc.). But still, reading some benchmarks would help me form an understanding.

Darkmatter•9mo ago

The language is still in flux, but generally is very performance focused right now. We're in the process of integrating a 10x performance increase over the old Mojo algorithm for UTF-8 validation. Any benchmark you take right now will probably not be valid in 3 months since we keep pushing performance upwards.

SiaOP•9mo ago

@Owen Hilyard How does one get involved? 😛

Darkmatter•9mo ago

Pick up a pet issue and either make a library or submit a PR to the standard library. Some stuff is only available to Modular employees, since they are keeping some things closed for now to prevent a "too many cooks in the kitchen" issue.

SiaOP•9mo ago

@Owen Hilyard Can I make a wish? That the next benchmark article also, if possible, looks into the energy savings? Actually, if you'd be interested, I could collaborate to profile that...

Darkmatter•9mo ago

As I said, we keep pushing performance upwards, so it's hard to measure. The original "Python and JS are bad for energy usage, C++ and Rust are good" paper has been widely criticized for its methodology. Too much depends on which algorithms you use and what hardware you have. For instance, if you don't have enough work, polling for IO is wasteful, but if you do, interrupts are wasteful. The same goes for a throughput-optimized vs. latency-optimized implementation of an algorithm, or simply for different ways of compiling Python: I've seen the way you compile the runtime swing energy usage by 5-10% in some cases. I'd say the best way to do it is to take your use case, get out a multimeter (or use a server with a BMC that does power measurements), and test both full-throttle Python and full-throttle Mojo.
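Once you have power readings from a meter or BMC, turning them into an energy figure is a matter of integrating power over time. A minimal sketch, assuming the readings are already available as `(timestamp_seconds, watts)` pairs; how you collect them depends on your hardware and is not shown, and the function names and the idle-power subtraction are illustrative choices, not a standard method.

```python
def energy_joules(samples):
    """Integrate (timestamp_s, power_w) samples with the trapezoidal rule.

    Returns the energy in joules (watt-seconds) over the sampled interval.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to integrate")
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def net_energy_joules(samples, idle_watts):
    """Energy attributable to the workload, after subtracting an idle baseline.

    idle_watts is assumed to be measured separately on the same machine.
    """
    duration = samples[-1][0] - samples[0][0]
    return energy_joules(samples) - idle_watts * duration
```

Running the same workload in Python and in Mojo on the same machine and comparing the two `energy_joules` results gives the absolute numbers asked about earlier in the thread. Runs should be long enough that the sampling interval is small relative to the run time.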

Darkmatter•9mo ago

Anything other than that will likely be invalid, especially if you are going to make an argument to someone to switch to Mojo based on energy usage. The last Mojo marathon was matmul, so you can use the winner of that and then hand-roll a pure Python matmul, since every library I know of does matmul in C or C++.
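For the Python side of that comparison, a hand-rolled baseline could look roughly like the sketch below. It uses only the standard library and nested lists; the matrix contents, sizes, and the `time_matmul` helper are arbitrary choices for illustration, not part of any existing benchmark.

```python
import time


def matmul(a, b):
    """Naive pure-Python matrix multiply on lists of lists (no NumPy)."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    # Transpose b once so each output cell is a dot product of two sequences.
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def time_matmul(n, repeats=3):
    """Best-of-N wall-clock time in seconds for an n x n multiply."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul(a, b)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"128 x 128 matmul: {time_matmul(128):.3f} s")
```

For an energy measurement you would likely want to call `matmul` in a loop for a fixed duration, so the power meter sees a sustained load rather than a single short burst.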

SiaOP•9mo ago

That is exactly what I intend to do... 😉

Darkmatter•9mo ago

Have fun, I'll be interested to see the results.

SiaOP•9mo ago

I mean advocating for Mojo based on energy consumption. It might look stupid now, but from where I stand, analysing the energy required to keep all AI workloads afloat, 1-2 years from now it might actually be the better argument...

Darkmatter•9mo ago

We need to actually have GPGPU and support for other misc accelerators working first. Models are getting more efficient too, Llama3-8b is competitive with the full size Llama2.

SiaOP•9mo ago

Doesn't that indicate less efficiency if Llama 3-8b is competitive with 70b Llama 2? Or am I misunderstanding?

Darkmatter•9mo ago

No, it means a much smaller model that is easier to run inference on is getting the same results. Less work for the same result.

SiaOP•9mo ago

Ah, you meant the model's performance - gotcha, and agreed.
