Random123: Splittable pseudorandom number generators
GitHub
GitHub - YichengDWu/random123: Splittable pseudorandom number gener...
Splittable pseudorandom number generators. Contribute to YichengDWu/random123 development by creating an account on GitHub.
Very interesting. Do you have any concrete applications in mind for random number generation as a pure function in Mojo? I just started to learn about JAX, which is interestingly different from PyTorch and TF.
It's important for reproducibility in the async world. JAX's docs explain it pretty well, so please refer to them.
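To make the reproducibility point concrete, here is a minimal Python sketch. It is not the random123 API: `split` and `uniform` are hypothetical helpers built on SHA-256 purely for illustration, where a real library would use a counter-based generator such as Threefry or Philox. Each task's draw depends only on the key it was handed, so the order in which tasks finish does not change the results.

```python
import asyncio
import hashlib
import random


def split(key: bytes, num: int = 2) -> list[bytes]:
    """Toy split (assumption): derive `num` child keys by hashing key + index."""
    return [hashlib.sha256(key + i.to_bytes(8, "little")).digest() for i in range(num)]


def uniform(key: bytes) -> float:
    """Toy draw (assumption): map a key to a float in [0, 1), as a pure function."""
    digest = hashlib.sha256(b"draw" + key).digest()
    # Keep the top 53 bits so the result is exactly representable and < 1.0.
    return (int.from_bytes(digest[:8], "little") >> 11) / 2**53


async def worker(key: bytes, delay: float) -> float:
    await asyncio.sleep(delay)  # completion order depends on the delay
    return uniform(key)         # the value depends only on the key


async def run(seed: int) -> list[float]:
    root = seed.to_bytes(8, "little")
    keys = split(root, 4)
    # Deliberately nondeterministic delays to mimic async scheduling.
    delays = [random.random() / 100 for _ in keys]
    return await asyncio.gather(*(worker(k, d) for k, d in zip(keys, delays)))


first = asyncio.run(run(42))
second = asyncio.run(run(42))
print(first == second)
```

With a single shared stateful generator, the values each task sees would depend on which task reached the generator first.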
I noticed that you used the following pattern in several places:
which causes an unnecessary allocation on line (*). resize could also allocate, though not in this case. Could have written
Yeah, I tested it and it showed no impact on performance. Allocation seems to be very cheap when no actual copy happens.
I think your way of writing it can cause a memory leak: the keys.data allocated by List[PRNGKey](capacity=num) is never freed.
Good point. I will fix it. Thank you for pointing it out!
I don't know if the use cases are similar. I ported the xoshiro256 PRNGs to Mojo. You can jump the streams to produce multiple independent sequences. These can be computed in parallel using SIMD. https://github.com/Mojo-Numerics-and-Algorithms-group/MojoSci/blob/main/src/stochasticity/xoshiro.mojo
It's different. Xoshiro256 is not splittable. You still need to explicitly keep track of each PRNG instance and its state to ensure they remain independent. Splittable PRNGs are more scalable and flexible.
Suppose you have a function f that calls g1 and g2, g1 then calls h1 and h2. g1 somehow needs to know the existence of g2. So you still need a global state that keeps track of everything.
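For contrast, a rough sketch of how the same call tree looks with splittable keys. The `split` and `uniform` helpers are hypothetical SHA-256 stand-ins, not the random123 API. Each function splits the key it receives and passes the children down, so g1 never needs to know that g2 exists and there is no global state.

```python
import hashlib


def split(key: bytes, num: int = 2) -> list[bytes]:
    """Toy split (assumption): derive `num` child keys by hashing key + index."""
    return [hashlib.sha256(key + i.to_bytes(8, "little")).digest() for i in range(num)]


def uniform(key: bytes) -> float:
    """Toy draw (assumption): map a key to a float in [0, 1), as a pure function."""
    digest = hashlib.sha256(b"draw" + key).digest()
    return (int.from_bytes(digest[:8], "little") >> 11) / 2**53


def h1(key: bytes) -> float:
    return uniform(key)


def h2(key: bytes) -> float:
    return uniform(key)


def g1(key: bytes) -> float:
    k1, k2 = split(key)      # g1 only manages the keys of its own callees
    return h1(k1) + h2(k2)


def g2(key: bytes) -> float:
    return uniform(key)


def f(key: bytes) -> tuple[float, float]:
    k1, k2 = split(key)      # one child key per callee, no shared state
    return g1(k1), g2(k2)


root = (0).to_bytes(8, "little")
result = f(root)
```

Changing how many draws g1 makes internally leaves the output of g2 untouched, because g2's key is fixed by the split in f.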
That does sound different. I've used xoshiro et al. in large MPI applications where each process gets its own PRNG, all seeded from a common global seed, but each process jumps according to its MPI rank, thus providing reproducible results with independent streams running in parallel.
Yeah, in general that's all good. But you need to know the number of PRNGs before launching. In some cases you might not know how many are needed.
Interesting. Splittable sounds ideal for a lot of uses. Do you generate a deterministic sequence of keys from an initial seed in each code block? I ran across this: https://github.com/ElsevierSoftwareX/SOFTX-D-23-00704
Yes. More specifically each key splits before use.
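A minimal sketch of the split-before-use discipline, again with hypothetical SHA-256 based `split` and `uniform` helpers rather than the actual random123 API. The loop does not know in advance how many draws it needs: on each iteration it splits the current key into a carry key and a subkey, consumes only the subkey, and continues with the carry key.

```python
import hashlib


def split(key: bytes, num: int = 2) -> list[bytes]:
    """Toy split (assumption): derive `num` child keys by hashing key + index."""
    return [hashlib.sha256(key + i.to_bytes(8, "little")).digest() for i in range(num)]


def uniform(key: bytes) -> float:
    """Toy draw (assumption): map a key to a float in [0, 1), as a pure function."""
    digest = hashlib.sha256(b"draw" + key).digest()
    return (int.from_bytes(digest[:8], "little") >> 11) / 2**53


def draw_until(key: bytes, threshold: float, max_draws: int = 1000) -> list[float]:
    """Draw until a value exceeds `threshold` (or `max_draws` is reached)."""
    draws = []
    for _ in range(max_draws):
        key, subkey = split(key)  # split first; the old key is never reused
        x = uniform(subkey)       # consume only the subkey
        draws.append(x)
        if x > threshold:
            break
    return draws


seed_key = (7).to_bytes(8, "little")
samples = draw_until(seed_key, 0.9)
```

Rerunning with the same seed key reproduces the same sequence, and a key is never both consumed and split again.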
I was under the impression that all counter-based PRNGs can be splittable.