Python integration and performance
I am looking into the performance of the Python integration in Mojo. I use a dict here as an example, but that choice is arbitrary; my question is not about `Dict` specifically but about Python integration in general.
The following Python program measures the time needed on my computer to fill and modify a dictionary:
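The listing itself is not shown here. A minimal sketch of such a benchmark, assuming the same workload as the `get_dict_and_modify` helper posted later in this thread (the function name `fill_and_modify` and the use of `time.perf_counter` are illustrative choices, not the original code), might look like this:

```python
import time

NUM = 1_000_000  # same problem size as the Mojo snippet later in the thread


def fill_and_modify(num):
    """Fill a dict with `num` string keys, then double every value."""
    di = {}
    # First loop: fill the dictionary.
    for i in range(num):
        di[str(i * 2)] = i % 3
    # Second loop: modify every value in place (keys are unchanged,
    # so iterating over the dict while assigning is safe).
    for key in di:
        di[key] *= 2
    return di


if __name__ == "__main__":
    start = time.perf_counter()
    result = fill_and_modify(NUM)
    elapsed = time.perf_counter() - start
    print("time:", elapsed, "sec")
```

Timings from a sketch like this depend on the machine and the Python version, so they are only meaningful relative to each other.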
When I use the dict from Mojo instead, the performance drops significantly.
Now when I move the first loop into a Python program
and use this in Mojo as follows:
I get
which is 1.5 times faster.
What I am mainly wondering about are the last two examples. If performance is crucial and we need to rely on Python integration, is it advisable in certain cases to perform some calculations directly in Python instead of just importing the Python object into Mojo? It feels odd, but here it brings a speedup.
Thanks for any thoughts on that.
7 Replies
In the code that runs in 15.87 sec, why do you have `dict2` instead of `dict` as in the initial Python version? It is also not initialized as
Sorry for that, corrected it. I have various dict implementations running here in one program and just extracted the code wrongly.
If you often invoke CPython functionality in a loop, there will indeed be considerable overhead from crossing the border between Mojo and CPython on every call.
Mind you, there are more mature languages where the overhead is much larger. And I can imagine that Mojo still can improve in this area.
wondering if, in this example, we can minimize the number of calls between mojo and python. perhaps batch operations together in python as much as possible before passing the result to mojo
also, just building on the comment above, for now it seems like we'd have to handle such issues on a case-by-case basis rather than a general approach. here it could be something like:
utils.py
def get_dict_and_modify(num):
    di = {}
    # fill the dictionary
    for i in range(num):
        di[str(i*2)] = i%3
    # double every value (keys stay the same)
    for key in di.keys():
        di[key] *= 2
    return di
and then
from python import Python
from time import now
alias NUM = 1_000_000
fn main() raises:
    var start = now()
    Python.add_to_path("./utils")
    var utils: PythonObject = Python.import_module("utils")
    # both loops run inside CPython; only one call crosses the boundary
    var dict = utils.get_dict_and_modify(NUM)
    var elapsed = (now()-start)/1_000_000_000
    print("time:", elapsed, "sec")
    # keep dict alive until after the timing
    _ = dict["112"]
The example is just to illustrate the difference in performance when doing some of the operation in python. In a regular program we wouldn't loop twice here in the first place (we could combine mod and * of course)
Sorry for being unclear on that.
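For reference, the combined version mentioned above (doing the mod and the multiplication in one pass) could be sketched as follows. The names `two_pass` and `single_pass` are hypothetical and only serve to compare the two variants:

```python
def two_pass(num):
    """The benchmark as posted: fill first, then modify in a second loop."""
    di = {}
    for i in range(num):
        di[str(i * 2)] = i % 3
    for key in di:
        di[key] *= 2
    return di


def single_pass(num):
    """Same result in one loop: apply the mod and the doubling together."""
    return {str(i * 2): (i % 3) * 2 for i in range(num)}


if __name__ == "__main__":
    # Both variants should produce identical dictionaries.
    print(two_pass(1000) == single_pass(1000))
```

The two-loop form is kept in the benchmark on purpose, since the point is to compare where the loops run, not to write the fastest dictionary code.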
ohh got it..makes sense!