How to improve the performance of formatting output

I wrote a piece of scientific simulation code as shown below:
fn run(inout self) raises -> None:
    with open(self.output_path, "w") as f:
        var out: String = "location,velocity,acceleration\n"
        for _ in range(1, self.numsteps + 1):
            self.velocity = self.velocity + self.acceleration * self.deltaT / 2
            self.location = self.location + self.velocity * self.deltaT
            self.acceleration = -1 * self.location
            out += str(self.location) + "," + str(self.velocity) + "," + str(self.acceleration) + "\n"

        f.write(out)
It runs much slower than the equivalent C/C++ and Swift versions. I know I’m a rookie at coding… From what I’ve checked, the slow part is this line: out += str(self.location) + "," + str(self.velocity) + "," + str(self.acceleration) + "\n". Are there any good ways to optimize the data output?
11 Replies
Darkmatter (4mo ago)
Right now, not really. In the future there will be a way to do vectorized writes instead of all of those string copies. What value do you expect from numsteps?
DayDayUp (OP, 4mo ago)
Thanks for the reply! numsteps is a constant representing the "number of steps", so it’s just an integer.
Darkmatter (4mo ago)
Can you give me a rough order of magnitude of the value? numsteps=10 is going to need to be treated very differently than numsteps=1000000.
DayDayUp (OP, 4mo ago)
It is set to 160000 in my test. It might be 10^7 in actual operation.
Darkmatter (4mo ago)
You probably want buffered IO, which Mojo doesn't do by default. Ideally you would have a struct with location, velocity and acceleration that you add to a list for each round, run through the entire simulation, then print out the list. That way the computation part finishes as fast as possible and you move on to writing as fast as your disk can handle. Interleaving the two is going to cause unpredictable effects from a performance standpoint, since syscalls tend to trash your cache. Since Mojo has no data analysis ecosystem to speak of yet, you might also be able to just write the bytes of the list out to a file when you are done with the simulation and then open it up in numpy. If I were doing this in another language, I would have 2 threads: one does all of the IO using vector writes while the other does the sim, with a queue between them for backpressure. But that's hard in Mojo for a variety of reasons.
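A minimal sketch of the "simulate first, write afterwards" pattern described above. It is written in Python rather than Mojo, since Mojo's file APIs are still in flux, so treat it as an illustration of the structure and not as Mojo code. The names simulate and write_csv, and the starting values, are hypothetical.

```python
def simulate(numsteps, delta_t, location=1.0, velocity=0.0):
    """Run the update loop from the question and keep every step in memory."""
    acceleration = -1 * location  # assumed starting acceleration
    rows = []
    for _ in range(numsteps):
        velocity = velocity + acceleration * delta_t / 2
        location = location + velocity * delta_t
        acceleration = -1 * location
        # No formatting and no IO here: just record the raw numbers.
        rows.append((location, velocity, acceleration))
    return rows


def write_csv(path, rows):
    """Format all rows after the simulation, then issue a single write."""
    lines = ["location,velocity,acceleration"]
    lines.extend(f"{x},{v},{a}" for x, v, a in rows)
    with open(path, "w") as f:
        # join builds the text in one pass instead of repeated "+=" copies.
        f.write("\n".join(lines) + "\n")
```

Calling write_csv(path, simulate(160000, 0.001)) keeps the numeric loop free of string work, which is the separation the reply argues for; the step size 0.001 is only an example value.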
DayDayUp (OP, 4mo ago)
Yes, that is what happened. I don't mind storing the data in a list and writing it out every N steps, but I haven't found a good example to learn from.
Darkmatter (4mo ago)
The problem that you're running into is that Mojo doesn't have good ways to do IO.
DayDayUp (OP, 4mo ago)
🥲
Darkmatter (4mo ago)
You could call into libc to open up the file, write the pointer from the list, deal with errors, and then clear the list. That would mean opening in numpy later would be fine and big batch IOs like that should be fast.
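The raw-bytes idea can be sketched roughly as follows. This is Python standing in for the libc calls, not Mojo code. The names dump_raw and load_raw are hypothetical, and the file layout (native-endian float64, three values per step, no header) is an assumption of this sketch.

```python
import os
from array import array


def dump_raw(path, rows):
    """Write (location, velocity, acceleration) rows as packed float64 values."""
    flat = array("d")  # "d" = C double, 8 bytes per value
    for row in rows:
        flat.extend(row)
    with open(path, "wb") as f:
        flat.tofile(f)  # one bulk write of the whole buffer, no text formatting


def load_raw(path):
    """Read the packed float64 values back and regroup them three per step."""
    flat = array("d")
    count = os.path.getsize(path) // flat.itemsize
    with open(path, "rb") as f:
        flat.fromfile(f, count)
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
```

If NumPy is available, numpy.fromfile(path, dtype=numpy.float64).reshape(-1, 3) should read the same layout back as an N-by-3 array, provided the file was written on a machine with the same byte order.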
sb (4mo ago)
does mojo have string interpolation yet?
DayDayUp (OP, 4mo ago)
Thank you so much for these suggestions! It is indeed a feasible approach, even if not a very elegant one.
