How to improve the performance of formatting output

I wrote a piece of scientific simulation code as shown below:

    fn run(inout self) raises -> None:
        with open(self.output_path, "w") as f:
            var out: String = "location,velocity,acceleration\n"
            for _ in range(1, self.numsteps + 1):
                self.velocity = self.velocity + self.acceleration * self.deltaT / 2
                self.location = self.location + self.velocity * self.deltaT
                self.acceleration = -1 * self.location
                out += str(self.location) + "," + str(self.velocity) + "," + str(self.acceleration) + "\n"
            
            f.write(out)

It runs much slower than the equivalent C/C++ and Swift versions. I know I’m a rookie at coding… As far as I can tell, the slow part is this line: out += str(self.location) + "," + str(self.velocity) + "," + str(self.acceleration) + "\n". Is there a good way to optimize the data output?

11 Replies

Darkmatter•7mo ago

Right now, not really. In the future there will be a way to do vectorized writes instead of all of those string copies. What values do you expect for numsteps?

DayDayUpOP•7mo ago

Thanks for the reply! numsteps is a constant representing the "number of steps", so it’s just an integer.

Darkmatter•7mo ago

Can you give me a rough order of magnitude of the value? numsteps=10 is going to need to be treated very differently than numsteps=1000000.

DayDayUpOP•7mo ago

It is set to 160000 in my test. It might be 10^7 in actual operation.

Darkmatter•7mo ago

You probably want buffered IO, which Mojo doesn't do by default. Ideally, you would have a struct with location, velocity and acceleration that you append to a list on each step, run through the entire simulation, then write out the list. That way the computation part finishes as fast as possible, and you then move on to writing as fast as your disk can handle. Interleaving the two is going to cause unpredictable effects from a performance standpoint, since syscalls tend to trash your cache. You might also be able to just write the bytes of the list out to a file when you are done with the simulation and then open it up in numpy, since Mojo has no data analysis ecosystem to speak of yet. If I were doing this in another language, I would have 2 threads: one does all of the IO using vector writes while the other does the sim, with a queue between them for backpressure. But that's hard in Mojo for a variety of reasons.
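A minimal sketch of the "simulate first, write afterward" idea from the reply above. It is written in Python rather than Mojo, so it only illustrates the structure and is not a Mojo implementation. The names simulate and write_csv, the initial conditions, the step size and the output file name are all hypothetical; the update rule is copied from the question.

```python
import csv
import os
import tempfile


def simulate(numsteps, delta_t, location=1.0, velocity=0.0):
    """Run the whole simulation in memory and return one record per step.

    Assumption: the initial acceleration is -location, matching the
    update rule used inside the loop.
    """
    acceleration = -1 * location
    records = []
    for _ in range(numsteps):
        # Same update rule as in the question.
        velocity = velocity + acceleration * delta_t / 2
        location = location + velocity * delta_t
        acceleration = -1 * location
        records.append((location, velocity, acceleration))
    return records


def write_csv(path, records):
    """Write all records in one pass through Python's buffered file object."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["location", "velocity", "acceleration"])
        writer.writerows(records)


if __name__ == "__main__":
    # Hypothetical output location; 160000 is the step count from the thread.
    out_path = os.path.join(tempfile.gettempdir(), "sim_output.csv")
    write_csv(out_path, simulate(numsteps=160000, delta_t=0.001))
```

The point of the split is that the loop never touches the file, so no formatting or IO work is interleaved with the computation.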

DayDayUpOP•7mo ago

Yes, that is what happened. I don't mind storing the data in a list and writing it out every N steps, but I haven't found a good example that I can learn from.
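One possible shape for "store in a list, write every N steps", sketched in Python rather than Mojo. The function name run_chunked, the flush_every parameter and the initial conditions are hypothetical, and the batch size of 10000 is an arbitrary starting point, not a tuned value.

```python
import os
import tempfile


def run_chunked(path, numsteps, delta_t, flush_every=10000,
                location=1.0, velocity=0.0):
    """Simulate and write CSV rows in batches of `flush_every` steps."""
    # Assumption: the initial acceleration follows the same rule as the loop.
    acceleration = -1 * location
    pending = []  # formatted rows that have not been written yet
    with open(path, "w") as f:
        f.write("location,velocity,acceleration\n")
        for step in range(1, numsteps + 1):
            # Same update rule as in the question.
            velocity = velocity + acceleration * delta_t / 2
            location = location + velocity * delta_t
            acceleration = -1 * location
            pending.append(f"{location},{velocity},{acceleration}\n")
            if step % flush_every == 0:
                # One write per batch instead of one ever-growing string.
                f.write("".join(pending))
                pending.clear()
        # Write whatever is left after the last full batch.
        if pending:
            f.write("".join(pending))


if __name__ == "__main__":
    # Hypothetical output location; 160000 is the step count from the thread.
    out_path = os.path.join(tempfile.gettempdir(), "sim_chunked.csv")
    run_chunked(out_path, numsteps=160000, delta_t=0.001)
```

Memory use is bounded by the batch size rather than by numsteps, which matters more at 10^7 steps than at 160000.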

Darkmatter•7mo ago

The problem that you're running into is that Mojo doesn't have good ways to do IO.

DayDayUpOP•7mo ago

🥲

Darkmatter•7mo ago

You could call into libc to open the file, write the list's contents through its pointer, deal with errors, and then clear the list. That way, opening the file in numpy later would work fine, and big batch IOs like that should be fast.
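The batch pattern in this reply (fill a buffer, write its raw bytes, clear it) can be sketched as follows. This is not the libc-from-Mojo approach itself: it swaps in Python's standard array module to show the same idea, and the names run_binary, read_binary and flush_every are hypothetical. The file holds raw 64-bit floats in native byte order, three per step, with no header.

```python
from array import array
import os
import tempfile


def run_binary(path, numsteps, delta_t, flush_every=10000,
               location=1.0, velocity=0.0):
    """Simulate and dump (location, velocity, acceleration) as raw doubles."""
    # Assumption: the initial acceleration follows the same rule as the loop.
    acceleration = -1 * location
    batch = array("d")  # contiguous buffer of 64-bit floats
    with open(path, "wb") as f:
        for step in range(1, numsteps + 1):
            # Same update rule as in the question.
            velocity = velocity + acceleration * delta_t / 2
            location = location + velocity * delta_t
            acceleration = -1 * location
            batch.extend((location, velocity, acceleration))
            if step % flush_every == 0:
                batch.tofile(f)  # write the buffer's bytes in one call
                del batch[:]     # clear the buffer for the next batch
        if batch:
            batch.tofile(f)


def read_binary(path):
    """Read the dump back as a list of 3-tuples, one per step."""
    values = array("d")
    count = os.path.getsize(path) // values.itemsize
    with open(path, "rb") as f:
        values.fromfile(f, count)
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]


if __name__ == "__main__":
    # Hypothetical output location; 160000 is the step count from the thread.
    out_path = os.path.join(tempfile.gettempdir(), "sim_output.bin")
    run_binary(out_path, numsteps=160000, delta_t=0.001)
    print(len(read_binary(out_path)))
```

With numpy installed, the same file could be loaded with numpy.fromfile(path, dtype=numpy.float64).reshape(-1, 3), assuming it is read on a machine with the same byte order as the one that wrote it.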

sb•7mo ago

does mojo have string interpolation yet?

DayDayUpOP•7mo ago

Thank you so much for these suggestions! It is indeed a feasible approach, even if not a very elegant one.
