duck_tape
Modular
Created by duck_tape on 1/7/2025 in #questions
Mojo for-loop performance
https://github.com/sstadick/rust-vs-mojo-loop While profiling other code and trying to get my perf to match Rust, I noticed that my vanilla for-loop seemed to be one large source of the difference. I'm not great with assembly, but from what was generated it looked like Rust was able to skip bounds checks when indexing into the array, since the length of the array is given in the range. Has anyone else run into this? While this is a toy example, I've run into it in more complex scenarios and with real data as well. The two programs in question are below; the repo has the benching script:
import sys


fn main() raises:
    var times = sys.argv()[1].__int__()

    var array = List[UInt64]()
    for i in range(0, times):
        array.append(i)

    var sum: UInt64 = 0
    for _ in range(0, times):
        for i in range(0, times):
            sum += array[i]
    print(sum)
use std::env::args;

fn main() {
    let times = args()
        .skip(1)
        .next()
        .unwrap()
        .parse::<usize>()
        .expect("Expected number as first arg");

    // I don't think filling the array with the macro has any hidden optimizations, but just in case:
    let mut array: Vec<u64> = vec![];
    for i in 0..times {
        array.push(i as u64)
    }

    let mut sum = 0;
    for _ in 0..times {
        for i in 0..times {
            sum += array[i];
        }
    }
    println!("{}", sum)
}
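One way to test the bounds-check theory on the Rust side is to compare the indexed loop against a version that cannot need per-element checks. The sketch below is only an illustration: `sum_indexed` and `sum_iter` are hypothetical names, not code from the linked repo, and whether the optimizer actually removes the checks in the indexed version depends on the compiler version and optimization level.

```rust
/// Indexed inner loop, mirroring the program in the post.
/// Each `array[i]` is bounds-checked unless the optimizer can prove `i < array.len()`.
fn sum_indexed(array: &[u64], times: usize) -> u64 {
    let mut sum = 0u64;
    for _ in 0..times {
        for i in 0..times {
            sum += array[i];
        }
    }
    sum
}

/// Re-slice once up front, then iterate.
/// The single `&array[..times]` check panics early if `times` is too large,
/// and the iterator never indexes, so no per-element check is needed.
fn sum_iter(array: &[u64], times: usize) -> u64 {
    let window = &array[..times];
    let mut sum = 0u64;
    for _ in 0..times {
        sum += window.iter().sum::<u64>();
    }
    sum
}
```

If both versions benchmark the same in release mode, that would suggest the checks in the indexed loop are already being removed, which is consistent with the guess in the post but does not prove it.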
5 replies
Modular
Created by duck_tape on 1/3/2025 in #community-showcase
A Benchmark with Files and Bytes
Crossposting my forum post since the formatting is a bit nicer there: https://forum.modular.com/t/showcase-a-benchmark-with-files-and-bytes-standard-benchmark-warnings-apply/420 tl;dr: pretty vanilla Mojo was beating out pretty vanilla Rust (all the normal caveats about benchmarks being worthless apply).
24 replies
Modular
Created by duck_tape on 12/13/2024 in #questions
Passing a Slice to a function
What is happening when I pass a slice of a list to a function? With these examples (very contrived, but reflective of what I'm seeing in larger real code), passing a slice of a list to a function with the signature borrowed items: List[UInt8] is waaaaayyyy slower than any other way. Is it allocating a new copy for each slice? Should I be looking into Spans instead? Or is this an area that is still being looked at (https://github.com/modularml/mojo/issues/3653)? The signature of __getitem__ for List makes it look like it returns a ref to itself, though, which seemingly wouldn't need to allocate?
fn count_items(borrowed items: List[UInt8]) -> Int:
    return len(items)


fn count_items_list(borrowed items: List[UInt8], offset: Int) -> Int:
    return len(items) - offset


fn count_items_tensor(borrowed items: Tensor[DType.uint8], offset: Int) -> Int:
    """Mock function that would work on the tensor based on the offset instead of a slice
    """
    return items.num_elements() - offset

fn main():
    var item_list = List[UInt8]()
    for i in range(10000):
        item_list.append(i)

    fn test_passing_slice() raises:
        var sum = 0
        for i in range(10000):
            sum += count_items(item_list[i:])

    fn test_passing_list() raises:
        var sum = 0
        for i in range(10000):
            sum += count_items_list(item_list, i)

    var item_tensor = Tensor(item_list)

    fn test_passing_tensor() raises:
        var sum = 0
        for i in range(10000):
            sum += count_items_tensor(item_tensor, i)
    # Custom benchmark code that needs to be changed to the stdlib benchmark code
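For comparison, the distinction the question is circling (a slice that copies versus a slice that only borrows) can be sketched in Rust, where the two are spelled differently. This is an analogy, not a statement about what Mojo's List does internally; `tail_view` and `tail_copy` are hypothetical helper names.

```rust
/// Counts the items it is given; works on any borrowed byte slice.
fn count_items(items: &[u8]) -> usize {
    items.len()
}

/// Borrowing a sub-range: no allocation, just a pointer and a length
/// into the original buffer (roughly what a span or view type provides).
fn tail_view(items: &[u8], offset: usize) -> &[u8] {
    &items[offset..]
}

/// Copying a sub-range: allocates a new Vec and copies the bytes.
/// Doing this once per loop iteration makes the loop quadratic in bytes copied.
fn tail_copy(items: &[u8], offset: usize) -> Vec<u8> {
    items[offset..].to_vec()
}
```

If the slice operator on a list behaves like `tail_copy`, calling it 10,000 times on a 10,000-element list would copy on the order of 50 million bytes, which would be enough to explain a large slowdown relative to passing the list plus an offset.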
8 replies
Modular
Created by duck_tape on 12/11/2024 in #questions
I can no longer post in the other channels
And this feels like a dumb question, but I can't figure out why? I wanted to post in the advent of code channel. I've gone through all the "Start Here" material. Am I missing something or are channels locked down?
8 replies
Modular
Created by duck_tape on 12/10/2024 in #questions
Why is alias U8 = DType.uint8 not a type?
alias U8 = DType.uint8

@inline_always
fn is_digit(value: U8) -> Bool:
    return value >= 48 && value <= 57
This gives the error "expected a type, not a value" on the U8 in the function signature. What am I missing?
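The error message points at a general distinction: a name bound to a value (such as a tag that identifies an element type) is not itself a type, even when it describes one. A rough Rust analogy is sketched below; the `DType` enum, `U8_TAG`, and `U8` here are hypothetical stand-ins written for illustration and are not Mojo's definitions.

```rust
/// Hypothetical stand-in for a "data type tag" enum.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum DType {
    UInt8,
}

/// A value: it names a variant of `DType`, so it cannot be used as a parameter type.
/// Writing `fn is_digit(value: U8_TAG)` would be rejected by the compiler.
#[allow(dead_code)]
const U8_TAG: DType = DType::UInt8;

/// A type alias: this one can appear in a function signature.
type U8 = u8;

/// True when `value` is an ASCII digit (bytes 48 through 57).
fn is_digit(value: U8) -> bool {
    (48..=57).contains(&value)
}
```

Under this reading, the alias in the post is the analogue of `U8_TAG`, and the signature needs something that is the analogue of `U8`, that is, an actual scalar type rather than the tag that parameterizes it.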
6 replies
Modular
Created by duck_tape on 12/10/2024 in #community-showcase
ExtraMojo
https://github.com/sstadick/ExtraMojo ExtraMojo has been updated to support the latest Mojo and Magic, and now has more tests and examples! ExtraMojo is just things I wish were in the stdlib, which mostly means a buffered file reader that can read files by line (or any delimiter), and a tiny regex implementation. Feedback welcome and appreciated.
68 replies
Modular
Created by duck_tape on 2/14/2024 in #questions
Creating a string from a DynamicVector drops the last character in the vector
This example sums up the question:
fn test_stringify() raises:
    var example = DynamicVector[Int8]()
    example.append(ord("e"))
    example.append(ord("x"))

    var container = DynamicVector[Int8]()
    for i in range(len(example)):
        container.append(example[i])
    let stringifed = String(container)
    assert_equal("ex", stringifed)
    # Unhandled exception caught during execution: AssertionError: ex is not equal to e
Is it assuming it's a null terminated string or something?
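The null-terminator guess would produce exactly this symptom. The Rust sketch below reproduces it under that assumption: `string_assuming_terminator` is a hypothetical function that treats the last byte of the buffer as a terminator and drops it, which is only a guess at what the String constructor does, not something the post confirms. `string_checked` uses the real `CStr::from_bytes_with_nul` to show the stricter alternative that refuses unterminated input.

```rust
use std::ffi::CStr;

/// Hypothetical model of a constructor that assumes the buffer ends with a
/// terminator byte and therefore discards the last element unconditionally.
fn string_assuming_terminator(buf: &[u8]) -> String {
    let without_last = &buf[..buf.len().saturating_sub(1)];
    String::from_utf8_lossy(without_last).into_owned()
}

/// Stricter alternative: only accept buffers that really end in a single NUL.
fn string_checked(buf: &[u8]) -> Option<String> {
    CStr::from_bytes_with_nul(buf)
        .ok()
        .map(|c| c.to_string_lossy().into_owned())
}
```

With this model, passing the bytes of "ex" yields "e", matching the assertion failure above, while appending a 0 byte before constructing the string yields "ex".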
2 replies