How to pattern-match a `Rune`?
is this the best I can do?
That's very noisy visually and it's unclear to me if the compiler is smart enough to perform the `(Rune)` cast on `char` literals at compile-time.
Is there not a way to specify a constant/literal Rune value?
`Rune` is firstly very old I think and secondly has no kind of compiler/runtime support, so no it has no constant representation
One would think that since it's the way to represent valid utf-16 scalars and that it is old (according to you, I have no idea), it would have decent language support, no?
do c# devs just not care about supporting basic utf-16?
`char` is UTF-16
utf-16 scalars can be made of one or two chars
which is what rune represents
is that not correct?
at least for me, my tasks that involve parsing just don't have to handle non-ascii characters
that's fair, but in a world of international communication things tend to be more complex than that in general
is there something specific you're trying to achieve?
I think that's stated in the original question, let me know if I can clarify
i mean the overall goal
I need to parse strings that may contain any valid utf-16 scalar
do you need to specifically handle characters that may be 2 code units?
or is the fact that they may exist in the input irrelevant to the actual parsing logic
here is something you can do:
yeah was about to suggest that
then you can pattern match as follows:
i assume you do not need to match on actual surrogate pairs, though you can add another overload of Deconstruct to do that
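A minimal sketch of what such a `Deconstruct` extension and the matching patterns could look like, assuming .NET Core 3.0 or later (where `System.Text.Rune` is available) and C# 8 or later. The names `RuneExtensions` and `Classify`, and the choice to expose the scalar value as an `int`, are illustrative assumptions rather than the exact code discussed here:

```csharp
using System;
using System.Text;

public static class RuneExtensions
{
    // Hypothetical helper: exposes the rune's scalar value as an int so a
    // positional pattern can compare it against char literals, which
    // convert implicitly to int.
    public static void Deconstruct(this Rune rune, out int value)
        => value = rune.Value;

    // Hypothetical overload: exposes the UTF-16 code units so a pattern can
    // match on a surrogate pair. For a BMP rune, 'second' is '\0'.
    public static void Deconstruct(this Rune rune, out char first, out char second)
    {
        Span<char> buffer = stackalloc char[2];
        int written = rune.EncodeToUtf16(buffer);
        first = buffer[0];
        second = written == 2 ? buffer[1] : '\0';
    }

    // Illustrative use: the explicit 'Rune' type keeps the single-element
    // positional pattern from being read as a parenthesized pattern.
    public static string Classify(Rune rune) => rune switch
    {
        Rune('a') => "lowercase a",
        Rune('+') => "plus sign",
        Rune('\uD83D', '\uDE00') => "grinning face (U+1F600)",
        _ => "other",
    };
}
```

If the goal is only to compare against `char` literals, matching on `rune.Value` directly (for example `rune.Value is 'a'`) should also work without any extension method, since a `char` constant converts implicitly to `int`.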
interesting!
didn't know `Deconstruct` allowed you to use that pattern matching syntax
Bear in mind I come from a place where you don't have to ask this question since `char` is a whole scalar, not half of it. So the answer is "I don't know because I don't know which characters are part of the extended set or not".
Now, I'm aware the pragmatic answer is "if you don't know, then you probably don't need it". Nonetheless, I've been learning a lot by studying these more complex situations. It allows me to see different corners of the languages and the techniques that are used to work with them.
reflectronic's answer looks like it covers exactly what I need in this regard
there are not many commonly-used symbols that are not in the Basic Multilingual Plane (the set of characters which can be encoded with one UTF-16 code unit) so most people in C# will cook up the most cursed string processing algorithms in existence, largely because the APIs are not very good, and nobody will notice
yeah that makes sense
it is mostly historical scripts outside of the BMP. there are some mathematical symbols which are probably used more often than those. there are also many emoji
[emoji] is probably the most common non-BMP character
That is good to know! Definitely niche in the context of parsing, but good to know nonetheless.
In the context of general string manipulation this is paramount though since you need to be careful not to break up surrogate pairs.
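To illustrate the point about not breaking up surrogate pairs, here is a small sketch that truncates a string by scalar values instead of by `char` index. It assumes .NET Core 3.0 or later for `string.EnumerateRunes()`; the `RuneText` class and `TruncateRunes` method are hypothetical names made up for this example:

```csharp
using System.Text;

public static class RuneText
{
    // Hypothetical helper: keeps at most maxRunes Unicode scalar values.
    // Because it walks the string rune by rune, it never cuts a surrogate
    // pair in half the way Substring can.
    public static string TruncateRunes(string text, int maxRunes)
    {
        var builder = new StringBuilder();
        int count = 0;
        foreach (Rune rune in text.EnumerateRunes())
        {
            if (count >= maxRunes)
            {
                break;
            }
            builder.Append(rune.ToString());
            count++;
        }
        return builder.ToString();
    }
}
```

For an input such as `"a\U0001F600b"`, `Substring(0, 2)` ends on a lone high surrogate, while `TruncateRunes(text, 2)` keeps the whole emoji. Note that `EnumerateRunes()` substitutes U+FFFD for any unpaired surrogate already present in the input, and that this sketch works on scalar values, not on grapheme clusters.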
@reflectronic if you don't mind me asking, how long have you been learning C#, and do you work professionally with it?
i'm not a professional, just a college sophomore :)
as for how long, it's hard to say, it's been at least three or four years, before that i was sort of on-and-off with programming
You've been answering a bunch of my more obscure questions, so I figure you're pretty familiar with the language. I was curious how long it might take to develop said familiarity.
one could say that, uh, my life skill tree is not very balanced
wdym? lots of CS?
many people have been using C# for far longer than me, but are probably more familiar with very different, more practical, things about the language
ah I see. Well, I can relate to you in a sense. I'm not a dev by trade (actually graduated in physics), so programming is more like a fun puzzle/hobby for me. Recently I've been thinking about becoming a dev though.
which is part of why I'm learning C#. Not a ton of Rust jobs in my area.
i like to understand how things work and read about the reasons why things are the way they are, it is very fun for me, and there is certainly a great depth of knowledge that comes from that. i am not sure how useful it is, or if other people like to do the same thing