`DynamicVector`, the name
Why is it not called `DynamicArray`? I find "dynamic vector" a bit tautological, given that the vector terminology (likely) comes from C++, where it literally means "dynamic sized array" [ref].
Yeah, I agree. Also, `Vector` is a questionable term to use, because it doesn't match either of the existing meanings of "vector" in engineering and computing:
- a mathematical vector
- a SIMD vector (which is inspired by mathematical vectors)
The idea of being "resizable" and being a mathematical "vector" are orthogonal. C++ decided to use the term "vector" to mean "resizable array" which was a super weird idea. Even the guy who invented the name agrees:
The name vector in STL was taken from the earlier programming languages Scheme and Common Lisp. Unfortunately, this was inconsistent with the much older meaning of the term in mathematics and violates Rule 3; this data structure should have been called array. Sadly, if you make a mistake and violate these principles, the result might stay around for a long time. (source)
@sora maybe it would be reasonable to create a feature request to keep track of this idea?
Yes. I’m just not sure if that counts as bike shedding or not. Thus, I think gathering some community responses/consensus here can really help.
For me, `Array` also looks better. Even if it is bikeshedding, proper and consistent identifiers are important, and a mess here could be painful in the future due to compatibility.
Even more, I remember a discussion somewhere that `Tensor` is also not a valid name for such a structure.
That was also me 😂
So, you can put it at once xD
100% agree with DynamicVector and Tensor needing a rename
Feel free to file a GH issue. We plan on renaming it at some point, as I don't think we love the existing name either. Better to do this sort of thing earlier rather than later, before even more code starts using the stdlib. There are other issues with the API that we need to resolve. For example, it's a bit silly to have both `append` and `push_back` do the same thing. It's been a weird mix of Python and C++ naming since its origin.
Naming and getting these things consistent is important, but hasn't been a high priority in the short term yet. It's all easy to change though.
@Joe Loser so when are you planning to adjust the names? I reported another issue related to consistency, and I really hope it will be solved sooner rather than later for compatibility reasons 😉
I'll bring it up with the team next week and circle back.
Thank you 🙂
FWIW the name `List` might be ideal, if I am correct in assuming that `DynamicVector` is meant to serve the same role as Python's `list` type. And you could use a term like `CappedList` to refer to a capacity-limited list, instead of `InlinedFixedVector`. (IMO the term "inlined" is misleading for that struct, because the data isn't actually guaranteed to be stored inline.)
The best would be to rename all members of the `Vector` family and pick a name that fits all of them. I mean: `DynamicVector`, `InlinedFixedVector`, and the whole module, `vector.mojo`.
Pro: It's what Python uses; `list` is backed by an array.
Cons: Just like `vector`, it's not what the type is. "List" is usually reserved for linked lists etc. In a systems programming language like Mojo, that's bad, because people will make different assumptions about the complexity of its methods.
"Capped something" sounds horrible, and feels like too much of a novelty, at least to me. Also, what do you mean the data is not guaranteed to be stored inline?
In my opinion, Mojo should primarily conform to the expectations we can assume Python programmers have. Due to its positioning in the stack, work previously done in C++ will be performed in Mojo, so onboarding those programmers has value, but I still think Python expectations should be the default.
Good argument. Though Python's list and Mojo's DV are really very different, since the former is a reference type and the latter a value type. Making them look superficially similar will cause more problems than it solves, I think.
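To make the reference-type side of this concrete, here is a minimal Python sketch. Assigning a list binds a second name to the same object, so a change made through one name is visible through the other. The explicit `list(a)` copy only approximates what assignment of a value type would give; it is an emulation in Python, not Mojo code, and the variable names are illustrative.

```python
a = [1, 2, 3]

# Reference semantics: `alias` is a second name for the same list object.
alias = a
alias.append(4)

# Emulated value semantics: an explicit copy is independent of the original.
snapshot = list(a)
snapshot.append(5)

print(a)         # the append through `alias` is visible here
print(snapshot)  # the append to the copy is not visible in `a`
```

A programmer who expects one behavior and gets the other will see either surprising shared mutation or surprising copies, which is the mismatch being described.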
OK, I might be wrong. According to Wikipedia, list is the name for the abstract type and array the data structure, like dict vs. hash map. If we use `Dict` for one, we'd better use `List` for the other.
I still wish the implementation were called `DynamicArray` and `List` were an alias/wrapper for it. I wish the implementation of `Dict` were called `DenseHashMap` or something, with `Dict` as an alias.
All of these are not pressing matters IMO, so maybe we can just wait and see.
Why go the route of calling it something else and aliasing it to some other thing when it could be called one thing?
If it's a list it should be called a list. Go has a similar datatype it calls a slice, and Rust has something similar it calls a vector.
It's just a name
Because calling it a list hides the implementation and thus hides the basis from which developers can reasonably deduce operational complexity.
I agree. A name should conform to its behavior and implementation.
If you go to the doc for Rust Vec, first line reads "A contiguous growable array type". That’s what it is.
Why can't Mojo's list say the same thing in the first line?
Go's slice also mentions the same thing in the doc. It's a growable array backed by a fixed array.
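For readers who want the "growable array backed by a fixed array" idea spelled out, here is a rough Python sketch. The class name `DynamicArray`, the starting capacity, and the doubling growth factor are all assumptions made for illustration; this is not how Mojo, Rust, or Go actually implement their types, and none of them promise this exact growth policy.

```python
class DynamicArray:
    """Hypothetical growable array backed by a fixed-capacity buffer."""

    def __init__(self, capacity=4):
        self._capacity = max(1, capacity)
        self._size = 0
        # A preallocated Python list stands in for a fixed-size allocation.
        self._buffer = [None] * self._capacity

    def __len__(self):
        return self._size

    @property
    def capacity(self):
        return self._capacity

    def __getitem__(self, index):
        # Indexing is O(1): a bounds check plus one buffer lookup.
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._buffer[index]

    def append(self, value):
        # Amortized O(1): reallocate only when the buffer is full.
        if self._size == self._capacity:
            self._grow(2 * self._capacity)  # doubling is an assumed policy
        self._buffer[self._size] = value
        self._size += 1

    def _grow(self, new_capacity):
        # Allocate a larger buffer and copy the existing elements over.
        new_buffer = [None] * new_capacity
        new_buffer[: self._size] = self._buffer[: self._size]
        self._buffer = new_buffer
        self._capacity = new_capacity


arr = DynamicArray(capacity=2)
for i in range(5):
    arr.append(i)
print(len(arr), arr.capacity)
```

The complexity argument in this thread comes down to these two properties: constant-time indexing and amortized constant-time append. A linked list offers neither in the same form.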
You said it should be called by its real name. There are other issues: what do we call the Python compatible list with reference semantics?
So it seems to me that the ambiguity is something that can be fixed with documentation.
Names should be unambiguous by default. If a name requires explanation and diving into the docs, it means that something is wrong.
Go, unlike Rust and like Python, has this type implemented in C; it's something provided by the language, high-level-language style, so of course it's called whatever it's called.
Mojo is not Python, though I do not see what the problem is. Most of Mojo already uses value semantics where Python uses reference semantics. What issue would stem from having a list that is backed by value semantics?
I see.
Compatible source having slightly different semantics is quite bad. And one day we will have that type, once classes are in place. What do we call it? `list`?
On the flip side, I quite like the ergonomics provided by a short name like `List`.
This is actually also why I suggested the alias solution.
Well, there's already the semantics of `Int` and `int`: an uppercase letter for the Mojo implementation, though I do not know how much I like that solution. It seems to create a two-world problem inside the language, where you must always be vigilant to catch differences. I was thinking of not having a specialised solution for Python; Mojo's list might have some differences, and if you move over you'll be expected to adjust some of your implementation to fit.
Exactly
Int is alright (I guess); bool is more troublesome. `~False` produces different results in Mojo and Python.
List being a reference type is likely a CPython implementation detail; neither the language reference nor the library reference specifies what the nature of the elements should be. https://docs.python.org/3/library/stdtypes.html#lists
I wonder whether something like GraalPy implements lists as a reference type or a value type, since it compiles to the JVM
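As a concrete illustration of the `~False` remark earlier in the thread, here is the Python half only. In Python, `bool` is a subclass of `int`, so `~` performs integer bitwise inversion rather than logical negation. The Mojo result is not shown here; the thread only says that it differs.

```python
# bool is a subclass of int in Python, so ~ acts on the integers 0 and 1.
# (Python 3.12+ emits a DeprecationWarning for ~ applied to a bool.)
inverted_false = ~False   # bitwise inversion of 0
inverted_true = ~True     # bitwise inversion of 1

# Logical negation is spelled `not` in Python.
logical_not_false = not False

print(inverted_false, inverted_true, logical_not_false)  # -1 -2 True
```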
Think it's handled as a Java array
List has reference semantics because Python classes are supposed to have reference semantics (people have even suggested calling variables "names" in Python), and that's because the whole language is built around that. It's kind of expected by a Python programmer that `=` doesn't mean copy.
Java arrays also have reference semantics, I think.
Yes, yes. But I'm asking: is that a CPython implementation detail (because it's interpreted) or a requirement of the Python language itself? Python does not have a specification, but it has a language reference, which intentionally stays vague on some of these decisions to allow flexibility.
No matter, since Mojo aims to provide the full reference semantics for the dynamic part I think I agree with your position.
How could it be an implementation detail (how we make things work) when it's semantics (how things are supposed to work)? Of course there will be performance concerns and implementation concerns when the design choice is made; still, the semantics is something separate from the implementation. Do you not expect to modify `a` through `b` when `b = a` and `b` is some class? This is what Guido wants. This is vastly off topic; let's move somewhere else.