why does this String init work without len?

buf = InlineArray[UInt8, 640](0)
...
s = String(buf.unsafe_ptr())
buf = InlineArray[UInt8, 640](0)
...
s = String(buf.unsafe_ptr())
init sig:
fn __init__(inout self, ptr: UnsafePointer[UInt8], len: Int):
fn __init__(inout self, ptr: UnsafePointer[UInt8], len: Int):
37 Replies
Ryulord
Ryulord4mo ago
it expects your buffer to be null terminated so if you don't provide a length then it can just scan for the first null to figure out the length itself
Michael K
Michael K4mo ago
I think you end up using this init from String:
fn __init__(inout self, str: StringRef):
fn __init__(inout self, str: StringRef):
because StringRef has this init which doesn't require the length:
fn __init__(inout self, ptr: UnsafePointer[UInt8]):
fn __init__(inout self, ptr: UnsafePointer[UInt8]):
I am not 100% sure I am correct on the UnsafePointer to StringRef auto conversion but I am sure you are not using a String.__init__ signature requiring a length. Is you code magically working with the correct String length? The StringRef init I pointed to walks the pointer to find the length.
aurelian
aurelianOP4mo ago
interesting... what control flow causes it to use another type's constructor?
sora
sora4mo ago
I think you are correct and that's horrible, we should fix it
aurelian
aurelianOP4mo ago
when I put in the len (buf len + 1 for the 0 terminator), i get this crash:
src/tcmalloc.cc:302] Attempt to free invalid pointer 0x16b69edc8
fish: Job 1, './y3' terminated by signal SIGABRT (Abort)
src/tcmalloc.cc:302] Attempt to free invalid pointer 0x16b69edc8
fish: Job 1, './y3' terminated by signal SIGABRT (Abort)
sora
sora4mo ago
Could you include the whole program? Nothing to do with the constructor per se. Mojo finds the overload where each argument undergoes at most one implicit conversion.
aurelian
aurelianOP4mo ago
sock = send_cmd[og_slice]("query", args)
buf = InlineArray[UInt8, 2048](0)
read = sock.read_into(buf)
s = String(buf.unsafe_ptr(), read + 1)
sock = send_cmd[og_slice]("query", args)
buf = InlineArray[UInt8, 2048](0)
read = sock.read_into(buf)
s = String(buf.unsafe_ptr(), read + 1)
those are the relevant lines
sora
sora4mo ago
I think i know the problem, buf got freed too earily @Michael Kowalski was right, plus you are trying to free stack memory
Michael K
Michael K4mo ago
When you initialize a String from an UnsafePointer the String takes ownership to avoid a copy. So both buf and s own the same pointer here and both try to free it.
sora
sora4mo ago
If you are also not happy with implicit conversions, be very loud here.
aurelian
aurelianOP4mo ago
I haven't formed an opinion yet, if it's a bug I don't want to rush to judgment otherwise, I how are you supposed to know it would init the StringRef first?
sora
sora4mo ago
Implicit conversion itself is not a bug. Accidentally making String constructable from UnsafePointer[Byte] is.
aurelian
aurelianOP4mo ago
is there a solution?
sora
sora4mo ago
Yea, UnsafePointer[Byte].alloc, or String._buffer_type(capacity=...)
aurelian
aurelianOP4mo ago
so copy
Michael K
Michael K4mo ago
If you don't need the data to be in buf anymore then I think you should make buf a List[Byte] instead of InlineArray and then use the String initialization from List (which avoids a copy).
aurelian
aurelianOP4mo ago
once inline array supports slicing this gets easier though it didn't occur to me that creating a list would take ownership of the inline array, that is so much easier
sora
sora4mo ago
I'm not sure since inline array is allocated on the stack
aurelian
aurelianOP4mo ago
I mean you wouldn't return that s = String(List[UInt8](buf)) zero copy string right?
sora
sora4mo ago
This is more the idiom
l = String._buffer_type(...) # == List[Byte, hint_trivial_type=True]
String(impl=l)
l = String._buffer_type(...) # == List[Byte, hint_trivial_type=True]
String(impl=l)
aurelian
aurelianOP4mo ago
actually, creating the list allocates but that's beside the point StringSlice(buf[:read]) would be ideal in this scope and raise if not utf8
sora
sora4mo ago
You can in fact avoid alloc, using InlineArray would be equivalent.
from memory import UnsafePointer, stack_allocation
from utils import StringSlice

fn main():
p = stack_allocation[9, DType.uint8]()

# fill
s = String("123456789")
for i in range(9):
p[i] = s.as_bytes()[i]
#

s = StringSlice[MutableAnyOrigin](unsafe_from_utf8_ptr=p, len=9)
from memory import UnsafePointer, stack_allocation
from utils import StringSlice

fn main():
p = stack_allocation[9, DType.uint8]()

# fill
s = String("123456789")
for i in range(9):
p[i] = s.as_bytes()[i]
#

s = StringSlice[MutableAnyOrigin](unsafe_from_utf8_ptr=p, len=9)
aurelian
aurelianOP4mo ago
yep I use the unsafe StringSlice elsewhere, thanks. Would be nice to have a safe version could the span created when slicing an InlineArray have a local lifetime? sorry to bring up rust again, but arrays are sliced all the time
sora
sora4mo ago
What do you mean by local lt?
aurelian
aurelianOP4mo ago
how does rust prevent this:
let buf: [u8; 1024] = [0; 1024];
let slice = &buf[..10];
return slice; // compile error
let buf: [u8; 1024] = [0; 1024];
let slice = &buf[..10];
return slice; // compile error
sora
sora4mo ago
Ah, Mojo does "prevent" you from returning the Span unless you manually rebind the lifetime parameter
aurelian
aurelianOP4mo ago
so there's no unique danger in supporting __getitem__ then?
sora
sora4mo ago
Not at all, it's just people haven't implemented it yet. You could add that to the stdlib if you are interested.
aurelian
aurelianOP4mo ago
it's getting there let me check if I'm allowed to megacorp rules
sora
sora4mo ago
FYI, Mojo stdlib is apache 2 with LLVM exceptions, so if you are allowed to contribute to most libs for Rust, you should be fine.
aurelian
aurelianOP4mo ago
oh I see, the origin would be the array, nice ok this is cool StringRef.__init__(inout self, ptr: UnsafePointer[C_char], len: Int):
sora
sora4mo ago
Except StringRef is an unsafe type mainly meant for C++ interop which you probably should avoid using, people are trying to replace them with string slice in the stdlib
aurelian
aurelianOP4mo ago
yeah, this is the easy way, thx
data = List[UInt8](capacity=2048)
data.size = sock.read(data.unsafe_ptr(), data.capacity - 1)
s = String(data^)
data = List[UInt8](capacity=2048)
data.size = sock.read(data.unsafe_ptr(), data.capacity - 1)
s = String(data^)
do you want a gh issue?
sora
sora4mo ago
I actually already made a PR for it: https://github.com/modularml/mojo/pull/3698
GitHub
[stdlib] Make StringRef ctor from UnsafePointer[Byte] keyword o...
This issues was reported by @aurelian on Discord. This prevents accidental UnsafePointer[Byte] to String implicit conversion through String ctor from StringRef.
aurelian
aurelianOP4mo ago
sweet
sora
sora4mo ago
Could you share your GitHub handle so I can give you credit in the description?
aurelian
aurelianOP4mo ago
diocletiann. thanks!

Did you find this page helpful?