why does this String init work without len?

buf = InlineArray[UInt8, 640](0)
...
s = String(buf.unsafe_ptr())
buf = InlineArray[UInt8, 640](0)
...
s = String(buf.unsafe_ptr())
init sig:
fn __init__(inout self, ptr: UnsafePointer[UInt8], len: Int):
fn __init__(inout self, ptr: UnsafePointer[UInt8], len: Int):
37 Replies
Ryulord
Ryulord3mo ago
it expects your buffer to be null terminated so if you don't provide a length then it can just scan for the first null to figure out the length itself
Michael K
Michael K3mo ago
I think you end up using this init from String:
fn __init__(inout self, str: StringRef):
fn __init__(inout self, str: StringRef):
because StringRef has this init which doesn't require the length:
fn __init__(inout self, ptr: UnsafePointer[UInt8]):
fn __init__(inout self, ptr: UnsafePointer[UInt8]):
I am not 100% sure I am correct on the UnsafePointer to StringRef auto conversion but I am sure you are not using a String.__init__ signature requiring a length. Is you code magically working with the correct String length? The StringRef init I pointed to walks the pointer to find the length.
aurelian
aurelianOP3mo ago
interesting... what control flow causes it to use another type's constructor?
sora
sora3mo ago
I think you are correct and that's horrible, we should fix it
aurelian
aurelianOP3mo ago
when I put in the len (buf len + 1 for the 0 terminator), i get this crash:
src/tcmalloc.cc:302] Attempt to free invalid pointer 0x16b69edc8
fish: Job 1, './y3' terminated by signal SIGABRT (Abort)
src/tcmalloc.cc:302] Attempt to free invalid pointer 0x16b69edc8
fish: Job 1, './y3' terminated by signal SIGABRT (Abort)
sora
sora3mo ago
Could you include the whole program? Nothing to do with the constructor per se. Mojo finds the overload where each argument undergoes at most one implicit conversion.
aurelian
aurelianOP3mo ago
sock = send_cmd[og_slice]("query", args)
buf = InlineArray[UInt8, 2048](0)
read = sock.read_into(buf)
s = String(buf.unsafe_ptr(), read + 1)
sock = send_cmd[og_slice]("query", args)
buf = InlineArray[UInt8, 2048](0)
read = sock.read_into(buf)
s = String(buf.unsafe_ptr(), read + 1)
those are the relevant lines
sora
sora3mo ago
I think i know the problem, buf got freed too earily @Michael Kowalski was right, plus you are trying to free stack memory
Michael K
Michael K3mo ago
When you initialize a String from an UnsafePointer the String takes ownership to avoid a copy. So both buf and s own the same pointer here and both try to free it.
sora
sora3mo ago
If you are also not happy with implicit conversions, be very loud here.
aurelian
aurelianOP3mo ago
I haven't formed an opinion yet, if it's a bug I don't want to rush to judgment otherwise, I how are you supposed to know it would init the StringRef first?
sora
sora3mo ago
Implicit conversion itself is not a bug. Accidentally making String constructable from UnsafePointer[Byte] is.
aurelian
aurelianOP3mo ago
is there a solution?
sora
sora3mo ago
Yea, UnsafePointer[Byte].alloc, or String._buffer_type(capacity=...)
aurelian
aurelianOP3mo ago
so copy
Michael K
Michael K3mo ago
If you don't need the data to be in buf anymore then I think you should make buf a List[Byte] instead of InlineArray and then use the String initialization from List (which avoids a copy).
aurelian
aurelianOP3mo ago
once inline array supports slicing this gets easier though it didn't occur to me that creating a list would take ownership of the inline array, that is so much easier
sora
sora3mo ago
I'm not sure since inline array is allocated on the stack
aurelian
aurelianOP3mo ago
I mean you wouldn't return that s = String(List[UInt8](buf)) zero copy string right?
sora
sora3mo ago
This is more the idiom
l = String._buffer_type(...) # == List[Byte, hint_trivial_type=True]
String(impl=l)
l = String._buffer_type(...) # == List[Byte, hint_trivial_type=True]
String(impl=l)
aurelian
aurelianOP3mo ago
actually, creating the list allocates but that's beside the point StringSlice(buf[:read]) would be ideal in this scope and raise if not utf8
sora
sora3mo ago
You can in fact avoid alloc, using InlineArray would be equivalent.
from memory import UnsafePointer, stack_allocation
from utils import StringSlice

fn main():
p = stack_allocation[9, DType.uint8]()

# fill
s = String("123456789")
for i in range(9):
p[i] = s.as_bytes()[i]
#

s = StringSlice[MutableAnyOrigin](unsafe_from_utf8_ptr=p, len=9)
from memory import UnsafePointer, stack_allocation
from utils import StringSlice

fn main():
p = stack_allocation[9, DType.uint8]()

# fill
s = String("123456789")
for i in range(9):
p[i] = s.as_bytes()[i]
#

s = StringSlice[MutableAnyOrigin](unsafe_from_utf8_ptr=p, len=9)
aurelian
aurelianOP3mo ago
yep I use the unsafe StringSlice elsewhere, thanks. Would be nice to have a safe version could the span created when slicing an InlineArray have a local lifetime? sorry to bring up rust again, but arrays are sliced all the time
sora
sora3mo ago
What do you mean by local lt?
aurelian
aurelianOP3mo ago
how does rust prevent this:
let buf: [u8; 1024] = [0; 1024];
let slice = &buf[..10];
return slice; // compile error
let buf: [u8; 1024] = [0; 1024];
let slice = &buf[..10];
return slice; // compile error
sora
sora3mo ago
Ah, Mojo does "prevent" you from returning the Span unless you manually rebind the lifetime parameter
aurelian
aurelianOP3mo ago
so there's no unique danger in supporting __getitem__ then?
sora
sora3mo ago
Not at all, it's just people haven't implemented it yet. You could add that to the stdlib if you are interested.
aurelian
aurelianOP3mo ago
it's getting there let me check if I'm allowed to megacorp rules
sora
sora3mo ago
FYI, Mojo stdlib is apache 2 with LLVM exceptions, so if you are allowed to contribute to most libs for Rust, you should be fine.
aurelian
aurelianOP3mo ago
oh I see, the origin would be the array, nice ok this is cool StringRef.__init__(inout self, ptr: UnsafePointer[C_char], len: Int):
sora
sora3mo ago
Except StringRef is an unsafe type mainly meant for C++ interop which you probably should avoid using, people are trying to replace them with string slice in the stdlib
aurelian
aurelianOP3mo ago
yeah, this is the easy way, thx
data = List[UInt8](capacity=2048)
data.size = sock.read(data.unsafe_ptr(), data.capacity - 1)
s = String(data^)
data = List[UInt8](capacity=2048)
data.size = sock.read(data.unsafe_ptr(), data.capacity - 1)
s = String(data^)
do you want a gh issue?
sora
sora3mo ago
I actually already made a PR for it: https://github.com/modularml/mojo/pull/3698
GitHub
[stdlib] Make StringRef ctor from UnsafePointer[Byte] keyword o...
This issues was reported by @aurelian on Discord. This prevents accidental UnsafePointer[Byte] to String implicit conversion through String ctor from StringRef.
aurelian
aurelianOP3mo ago
sweet
sora
sora3mo ago
Could you share your GitHub handle so I can give you credit in the description?
aurelian
aurelianOP3mo ago
diocletiann. thanks!
Want results from more Discord servers?
Add your server