PhoToN 飞得高
Modular
Created by Helehex on 8/5/2024 in #community-showcase
Thermo
Sure, thanks!
53 replies
Modular
Created by Helehex on 8/5/2024 in #community-showcase
Thermo
I am mostly working on scientific right now. but I am interested in trying out game dev sometime too.
53 replies
Modular
Created by Helehex on 8/5/2024 in #community-showcase
Thermo
@Ghostfire The Insatiable this looks amazing! Since I’m working on something similar, this is a great resource to learn from!
53 replies
Modular
Created by PhoToN 飞得高 on 7/6/2024 in #community-showcase
Tenka 点火 - Mojo package manager
🎉 Announcing Tenka v0.1 - A Package Manager for Mojo 🔥
Hey Mojicians! I'm excited to introduce Tenka, a package manager for Mojo. Here's what you can do with this first release:
✨ Key Features:
• Create, activate, and manage multiple Mojo environments with different Mojo versions
• Search, install, and uninstall Mojo packages from GitHub
• Package your local Mojo files and add them to the active environment
• Easy-to-use CLI tool
🚀 Getting Started
Please check out the README for a detailed explanation of all features and the installation procedure; it's easy to install and use! GitHub: https://github.com/shivasankarka/Tenka
Tenka is just getting started, and we'd love your feedback! Try it out and let us know what you think. If you're interested in contributing, please reach out. Feel free to open an issue or a PR if you have any ideas or suggestions. Together, we can become better Mojicians 🪄
2 replies
Modular
Created by mad alex 1997 on 7/3/2024 in #community-showcase
NuMojo’s got NDArrays! 🥳
Hey @benny, sorry about that, it slipped past the radar. We were testing your implementation (which is really fast!), but it's not part of NuMojo right now, since we don't have some of the compile-time features it needs and it can't be used yet. I will remove it for now, and we will add the acknowledgment once we integrate it in a later version. Thank you for understanding!
5 replies
Modular
Created by Sam.H.W on 6/27/2024 in #questions
Unable to run the kmean blog example.
For all the changes, please look at "Standard library changes" in the latest changelog here: https://docs.modular.com/mojo/changelog. The kmeans example should work once you implement these changes.
3 replies
Modular
Created by Sam.H.W on 6/27/2024 in #questions
Unable to run the kmean blog example.
@Sam.H.W the main problems I see in the error text file you provided are import problems. In the latest Mojo version, some of these math functions were moved around, if I am not wrong. For example: 1) You should use SIMD.__div__ instead of math.div in the above case. 2) slice.__len__() was removed in 24.4, so you have to calculate the length explicitly instead.
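Since the slice length now has to be computed explicitly, here is a minimal Python sketch of the same arithmetic (the formula is language-independent; the `slice_len` name is just illustrative, not a Mojo API):

```python
def slice_len(start, stop, step=1):
    # Number of elements range(start, stop, step) yields --
    # the same arithmetic one would inline in Mojo 24.4+.
    if step > 0:
        return max(0, (stop - start + step - 1) // step)
    return max(0, (start - stop - step - 1) // (-step))

assert slice_len(0, 10, 3) == len(range(0, 10, 3))    # 4
assert slice_len(10, 0, -2) == len(range(10, 0, -2))  # 5
```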
3 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
These are the plots corresponding to matmul.
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
Ah, now I see the confusion! I'm so sorry @benny. The above graph is for torch.mul, not torch.matmul; I think the label wasn't changed when the graph was switched from torch.matmul.
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
where time_mojo is generated from the Mojo benchmark code above, and time_torch is from torch running in the same .py file. As for the matmul, we are using almost the same implementation as the Modular one (https://docs.modular.com/mojo/notebooks/Matmul), except for changes in the store[] and load[] methods, where we use a single index to reduce some overhead instead of the two indices used in the Modular version.
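The single-index trick described above is just row-major flattening; a tiny NumPy sketch (illustrative only, not the NuMojo code) shows why one index suffices for a 2-D buffer:

```python
import numpy as np

n = 4
flat = np.arange(n * n)    # buffer addressed with a single index
grid = flat.reshape(n, n)  # same buffer viewed with two indices
i, j = 2, 3
# One multiply-add replaces the two-index address computation.
assert grid[i, j] == flat[i * n + j]
```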
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
@benny, the following is the plotting code:
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np

mpl.rcParams["text.latex.preamble"] = r"\usepackage{mathpazo}"
plt.rcParams["axes.linewidth"] = 2
plt.rc("text", usetex=True)
plt.rc("font", family="serif")
plt.rcParams["axes.linewidth"] = 2

fig = plt.figure(figsize=(10,6))
fig.tight_layout()

time_torch = [[],[],[]] # times for float16, float32, float64
time_mojo = [[],[],[]] # times for float16, float32, float64
size = [16, 256, 512, 1024, 2048]

# Bar width
bar_width = 0.35
index = np.arange(len(size))

# Adjusted x-axis labels for clarity
x_labels = [f"{s}" for s in size]

# Plot for float16
plt.bar(index - bar_width/2, time_torch[0], bar_width, label='Torch (cpu)', align='center')
plt.bar(index + bar_width/2, time_mojo[0], bar_width, label='NuMojo', align='center', fill=False)

# Plot for float32
plt.bar(index - bar_width/2, time_torch[1], bar_width, align='center')
plt.bar(index + bar_width/2, time_mojo[1], bar_width, align='center', fill=False)

# Plot for float64
plt.bar(index - bar_width/2, time_torch[2], bar_width, align='center')
plt.bar(index + bar_width/2, time_mojo[2], bar_width, align='center', fill=False)

plt.xlabel('Size')
plt.ylabel('Time')
plt.yscale('log')
plt.xticks(index, x_labels) # Use adjusted x-axis labels
plt.legend()
plt.show()
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
Hi @benny, for benchmarking it's the same as what I shared above, except for the matplotlib code. Do you want the plotting part of the code?
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
@Martin Dudek Update: As expected (I fell for my own trap xD, gotta be careful when benchmarking), I made a mistake in the parallelization, which led to the reduction in time for larger sizes. I am adding the new plots; please find them here. I will update the text above in accordance with the new plots.
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
@Martin Dudek you are right, we haven't released the array (NDArray) yet, as we are polishing it and running tests to clear out edge cases. We will release it soon (pretty exciting! 😁) along with many other functionalities, and get some community feedback. Looking forward to seeing Mojo and NuMojo evolve! 🔥
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
Hope that helps! Cheers fellow mojicians 🪄
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
The code I used for Pytorch benchmark is the following:
import torch
import torch.utils.benchmark as benchmark

def batched_dot_mul_sum(a, b):
    '''Computes batched dot by multiplying and summing'''
    return torch.matmul(a, b)

def measure_matmul_time_pytorch(rows_a, cols_a, cols_b, dev, dtype):
    # Input for benchmarking
    x = torch.randn(rows_a, cols_a, dtype=dtype).to(dev)
    y = torch.randn(cols_a, cols_b, dtype=dtype).to(dev)

    t0 = benchmark.Timer(
        stmt='batched_dot_mul_sum(x, y)',
        setup='from __main__ import batched_dot_mul_sum',
        globals={'x': x, 'y': y})

    print("done")
    return t0.timeit(50).mean

size = [10, 100, 500, 1000, 2000, 5000, 10000]
dtype = torch.float64
times = [measure_matmul_time_pytorch(s, s, s, "cpu", dtype) for s in size]
times_mps = [measure_matmul_time_pytorch(s, s, s, "mps", dtype) for s in size]
and for NuMojo it is the following:
import numojo
import time                 # for time.now()
from benchmark import keep  # prevents the result from being optimized away
# matmul refers to NuMojo's matmul implementation, assumed to be in scope

fn main() raises:
    var size: VariadicList[Int] = VariadicList[Int](10, 100, 500, 1000, 2000, 5000, 10000)
    var times: List[Float64] = List[Float64]()
    alias type: DType = DType.float16
    measure_time[type](size, times)

fn measure_time[dtype: DType](size: VariadicList[Int], inout times: List[Float64]) raises:
    for i in range(size.__len__()):
        var arr1 = numojo.array[dtype](size[i], size[i], random=True)
        var arr2 = numojo.array[dtype](size[i], size[i], random=True)

        var t0 = time.now()
        for _ in range(50):
            var arr_mul = matmul[dtype](arr1, arr2)
            keep(arr_mul.unsafe_ptr())
        times.append(((time.now() - t0) / 1e9) / 50)  # mean seconds per run

    for i in range(size.__len__()):
        print(times[i])
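The timing scheme in the Mojo snippet above is a plain mean over 50 runs; the same pattern sketched in Python, with perf_counter standing in for Mojo's time.now() (names here are illustrative):

```python
import time

def mean_time(fn, repeats=50):
    # Average wall-clock seconds per call over `repeats` runs,
    # mirroring the 50-iteration loop in the Mojo benchmark.
    t0 = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - t0) / repeats

t = mean_time(lambda: sum(range(1_000)))
print(t)  # mean seconds per call
```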
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
Please refer to the plots attached a few messages below. 1) In the float16 plot, the NuMojo implementation is on par with, and ever so slightly faster than, Torch (CPU). 2) In float32 and float64, Torch performs better than the current NuMojo implementation, especially in float32, since I think Torch is well optimized for that precision. There are still a lot of compile-time optimizations and tricks we can use in Mojo to reduce overhead; we aim to take advantage of these and optimize all our math implementations step by step. As of now, NuMojo in its early infancy is on par with or slightly faster than Torch (CPU) in some cases, and roughly an order of magnitude slower in others. With Mojo GPU support in the future, better compile-time optimizations, and some tricks, we can hope to close these gaps.
NOTE:
1) There are small fluctuations in the mean time value on every run, so I plotted these multiple times; irrespective of the fluctuations, the behaviour stays the same. 2) There are parameters in the matmul that affect the time, such as the unroll factor and parallelize size. I fixed a certain set of values and went with them; I haven't tried optimizing these values, and the plots could scale differently on different systems depending on them. 3) I honestly don't know whether torch.float and DType.float have the same representation, so it might not be an exact apples-to-apples comparison; if anyone knows more about this, please do share. "Update: There was an error in the earlier plots, so I have updated this text to reflect the correct plots attached below."
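On note 3: as far as I know, both torch.float16 and Mojo's DType.float16 are IEEE 754 binary16, and the format's properties can be checked from NumPy's type metadata. This is a hedged sanity check on the format itself, not proof that the two runtimes compute identically:

```python
import numpy as np

# IEEE 754 binary16: 1 sign bit, 5 exponent bits, 10 stored mantissa bits
info = np.finfo(np.float16)
assert info.bits == 16
assert info.nmant == 10
print(info.eps)  # machine epsilon for half precision (2**-10)
```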
26 replies