Martin Dudek
Modular
Created by TilliFe on 7/18/2024 in #community-showcase
Endia
Exciting project. :mojo: @TilliFe In your Endia Stack Concept image https://raw.githubusercontent.com/endia-org/Endia/nightly/assets/endia_stack_concept.png machine learning is a box above the Endia box. Does that mean that functionality like in torch.nn won't become part of the core Endia lib, or are you planning to integrate functions to build and train neural networks?
46 replies
Modular
Created by Jack Clayton on 7/26/2024 in #community-showcase
MAX tutorials community feedback and questions
I second @Darin Simmons's feedback; the order within the tutorial seems mixed up.
One thing I feel is missing in the MAX docs, for people coming in with some grasp of what is possible with Mojo, is an initial explanation of what MAX offers that can't be achieved equally well with pure Mojo. After all, there is Basalt in pure Mojo, and there is llama2.mojo. While it of course becomes clearer once you learn about MAX, I think a few words about this would help motivate many people to learn about it. I assume many come to MAX after having dived into Mojo for a while. Of course, I might just have missed this part in the docs ...
9 replies
Modular
Created by TilliFe on 7/18/2024 in #community-showcase
Endia
Eager Mode: Iter: 10 x 1000 Avg Loss: 0.22449450194835663
Functional Eager Mode with Grad: Iter: 10 x 1000 Avg Loss: 0.28279870748519897
JIT: Iter: 10 x 1000 Avg Loss: 0.099444642663002014
46 replies
Modular
Created by TilliFe on 7/18/2024 in #community-showcase
Endia
@TilliFe Just opened a PR with a simple implementation that runs the benchmarks multiple times and calculates average results. Feel free to use it or modify it as needed. If it doesn't fit, just ignore it. 😉
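The pattern itself is tiny; a minimal sketch in Python (not Endia's actual API; run_benchmark_once is a hypothetical stand-in for one full 1000-iteration training run with freshly initialized weights):

import random

def run_benchmark_once():
    # Hypothetical stand-in: re-initialize the weights, train for
    # 1000 iterations, and return the final loss.
    return random.random()

def average_loss(runs=10):
    # Accumulate the final loss of each independent run and average.
    total = 0.0
    for _ in range(runs):
        total += run_benchmark_once()
    return total / runs

print("Iter: 10 x 1000 Avg Loss:", average_loss())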
46 replies
Modular
Created by TilliFe on 7/18/2024 in #community-showcase
Endia
I can give it a try but would need to dig into your implementation. Basically, the weights would need to be initialized for each of the 1000 loops, I assume. That seems straightforward for mlp_func and mlp_imp; for JIT, I will try 😉
46 replies
Modular
Created by TilliFe on 7/18/2024 in #community-showcase
Endia
Here are the results of 10 runs ... the loss of MAX JIT is not always the lowest; it seems to depend on the random weight initialization, as you said already ... if you want to extend the benchmarks to calculate averages over multiple runs, I am happy to run another test ...
46 replies
Modular
Created by TilliFe on 7/18/2024 in #community-showcase
Endia
I just ran the benchmarks on macOS

❯ max --version
max 24.4.0 (59977802)
Modular version 24.4.0-59977802-release
❯ mojo --version
mojo 24.4.0 (59977802)

and noticed the loss is significantly smaller with MAX JIT compilation:

Running MLP benchmark in eager mode.
Iter: 1000 Loss: 0.22504070401191711
Total: 0.0069106340000000023 Fwd: 0.00096554800000000021 Bwd: 0.0015250959999999984 Optim: 0.0023210129999999963

Running MLP benchmark in a functional eager mode with grad:
Iter: 1000 Loss: 0.25778612494468689
Total: 0.0048792460000000003 Value_and_Grad: 0.0027390779999999994 Optim: 0.0021332430000000025

Running MLP benchmark with MAX JIT compilation:
JIT compiling a new subgraph...
Iter: 1000 Loss: 0.061800424009561539
Total: 0.022694156999999975 Value_and_Grad: 0.020552729000000027 Optim: 0.0021339400000000013
46 replies
Modular
Created by TilliFe on 7/18/2024 in #community-showcase
Endia
super cool @TilliFe 🔥
46 replies
Modular
Created by White Frost on 7/16/2024 in #questions
Anyone here has tried using Mojo with local Whisper from OpenAI?
Mojo right now does not compile to code that runs on the GPU; it runs on the CPU. MAX is about to get GPU support, so that would be the way to go. But I haven't looked into the MAX graph engine myself so far, so I can't help with that (I'm just about to start learning about it).
7 replies
Modular
Created by White Frost on 7/16/2024 in #questions
Anyone here has tried using Mojo with local Whisper from OpenAI?
Do you have an Nvidia GPU or Apple Silicon? If not, I am afraid you need to use the smaller models to get acceptable run times.
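If you use the official openai-whisper Python package, switching to a smaller checkpoint is a one-word change; a minimal sketch (the audio path is a placeholder):

import whisper

# "tiny" and "base" are the smaller checkpoints that still run at
# acceptable speed on CPU; "medium" and "large" really want a GPU.
model = whisper.load_model("base")
result = model.transcribe("audio.wav")
print(result["text"])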
7 replies
Modular
Created by White Frost on 7/16/2024 in #questions
Anyone here has tried using Mojo with local Whisper from OpenAI?
You might want to try https://github.com/ggerganov/whisper.cpp; it runs very well on my Mac (it uses Metal). I thought of looking into porting it to Mojo, but without GPU support it won't lead to anything impressive, I am afraid ...
7 replies
Modular
Created by Jack Clayton on 6/27/2024 in #community-showcase
Beta test Llama3 serving and GUI on MAX Nightly
Installation went without issues on my M2 MacBook Pro, thanks for sharing this app. It works well except for the issue @Darin Simmons already mentioned. A cancel button would be great.
38 replies
Modular
Created by Kevin Thomas on 6/20/2024 in #community-showcase
mojonet
Mojo does not run on the GPU yet. As for how that works out when using the Python integration, let's see what response you get on your bug report.
14 replies
Modular
Created by Kevin Thomas on 6/20/2024 in #community-showcase
mojonet
Sorry, but I haven't even had the chance to set this up for my own projects. I've been juggling too many projects at once and don't have the expertise yet to establish meaningful benchmark tests.
14 replies
Modular
Created by Kevin Thomas on 6/20/2024 in #community-showcase
mojonet
Interesting. So it's basically a PyTorch wrapper?
...
var torch = Python.import_module("torch")
...
Have you done any benchmark tests comparing it to using PyTorch directly? By the way, there is https://discord.com/channels/1087530497313357884/1238547362012729355 in pure Mojo 😉
14 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
I am sure the community with its Mojo gurus will be able to give you valuable feedback once it is published. What you are implementing is just so central for many applications 🙏 Thanks for keeping me updated here, but please don't take too much time for that. I am hooked anyway 😉
26 replies
Modular
Created by Ethan on 6/19/2024 in #community-showcase
Random123: Splittable pseudorandom number generators
Very interesting. Do you have any concrete applications in mind for random as a pure function in Mojo? I just started to learn about JAX; it's so interestingly different from PyTorch and TF.
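To illustrate what "random as a pure function" looks like in JAX: randomness is an explicit function of a key, and you split keys instead of mutating hidden global state. A minimal sketch:

import jax

# Same key, same numbers: generation is a pure function of the key.
key = jax.random.PRNGKey(0)
key, subkey = jax.random.split(key)  # derive a fresh, independent subkey
x = jax.random.normal(subkey, (3,))
print(x)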
17 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
Thanks a lot for the update, really looking forward to learning how you guys implemented these functions to be so close to torch.cpu :mojo:
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
Wow, thank you so much for all your efforts, this all looks very promising. I hope others also see these results. Right now numojo.array is not in the repo yet (correct me if I am wrong); any plans for when you guys might make it available? I was honestly getting a bit unsure whether Mojo is the right language for me right now for the projects I want to implement (like KANs), but seeing this makes me look at Mojo in a more positive light again. Thanks a lot for that :mojo:
26 replies
Modular
Created by Martin Dudek on 6/17/2024 in #questions
Seeking Clarification on Current and Future Tensor Library Support in Mojo
This sounds fantastic and will surely be a big contribution to the Mojo community, thank you so much. :mojo: 🙏 I don't know if this comparison makes sense at all, so I want to ask you: I compared my simple matmul implementation with the following PyTorch-based one, and the PyTorch one is nearly 100 times faster, which came as a bit of a shock. Have you done any comparisons with PyTorch or equally performant libs? Is this a realistic comparison at all? I have no clue about compiler/performance/optimization possibilities ... thanks a lot
import torch
import time

def measure_matmul_time(rows_a, cols_a, cols_b):
    # Generate random matrices
    A = torch.randn(rows_a, cols_a)
    B = torch.randn(cols_a, cols_b)

    # Warm-up
    for _ in range(10):
        _ = torch.matmul(A, B)

    # Measure the time of a single matrix multiplication
    start_time = time.time()
    C = torch.matmul(A, B)
    end_time = time.time()

    elapsed_time = 1000. * (end_time - start_time)
    print(f"Time {elapsed_time:.6f} ms")

    _ = C[23, 6]

measure_matmul_time(1024, 2048, 512)
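Presumably part of the gap is that PyTorch dispatches CPU matmul to multi-threaded, SIMD-optimized BLAS kernels, so a roughly 100x difference over a naive single-threaded implementation is plausible. In case it helps, a slightly more robust variant of the timing (a sketch; time.perf_counter, the median over 100 repeats, and printing the thread count are my additions):

import time
import torch

def measure_matmul_time(rows_a, cols_a, cols_b, repeats=100):
    A = torch.randn(rows_a, cols_a)
    B = torch.randn(cols_a, cols_b)

    # Warm-up so one-time initialization doesn't skew the measurement
    for _ in range(10):
        _ = torch.matmul(A, B)

    # The median over many repeats is more robust against scheduler
    # noise than a single time.time() measurement
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        _ = torch.matmul(A, B)
        times.append(time.perf_counter() - start)
    times.sort()
    print(f"Median over {repeats} runs: {1000.0 * times[repeats // 2]:.6f} ms")
    print(f"PyTorch intra-op threads: {torch.get_num_threads()}")

measure_matmul_time(1024, 2048, 512)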
26 replies