❔ Enum with data
I'm working on a lexer, and I'm encountering this problem where I need to associate different data with different token types. My first attempt was this:
then you can use it like so:
I kind of dislike this implementation though. For one it seems kind of messy with the null-forgiving stuff, but it also makes the token type very bulky since it requires a separate field for every data variant. Since there will be a ton of tokens, it's important to minimize their size.
It's also error-prone in the sense that someone can try to read data without first checking the token kind.
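The code blocks from the original post did not survive in this archive. As a rough illustration only, the kind of type being criticized here might look like the sketch below: one struct with a kind tag and one nullable field per possible payload. Every name in it (`TokenKind`, `IntegerValue`, `IdentifierName`, `Describe`) is an assumption, not the poster's actual code.

```csharp
#nullable enable
using System;

// Hypothetical reconstruction; none of these names come from the original post.
public enum TokenKind { Plus, Minus, Integer, Identifier }

public readonly struct Token
{
    public TokenKind Kind { get; }

    // One field per possible payload: every token carries all of them,
    // even though at most one is meaningful for a given kind.
    public long? IntegerValue { get; }
    public string? IdentifierName { get; }

    private Token(TokenKind kind, long? integerValue, string? identifierName)
    {
        Kind = kind;
        IntegerValue = integerValue;
        IdentifierName = identifierName;
    }

    public static Token Simple(TokenKind kind) => new Token(kind, null, null);
    public static Token Integer(long value) => new Token(TokenKind.Integer, value, null);
    public static Token Identifier(string name) => new Token(TokenKind.Identifier, null, name);
}

public static class FirstAttemptDemo
{
    public static string Describe(Token token) => token.Kind switch
    {
        // The caller has to remember which field belongs to which kind,
        // and nothing stops them from reading the wrong one.
        TokenKind.Integer => $"integer {token.IntegerValue!.Value}",
        TokenKind.Identifier => $"identifier {token.IdentifierName!}",
        _ => token.Kind.ToString(),
    };
}
```

Both complaints show up directly: the struct grows with every new payload type, and the `!` operators hide the fact that the compiler cannot check the kind/field pairing.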
What I'm trying to do is conceptually this:
It doesn't have to look like this, but this is the kind of behavior I'm looking for (Rust enums, or algebraic sum types for those who know).
What would be the clean way to make this in C#? Bear in mind there are actually a lot of token kinds, and not all token kinds have data associated with them (in fact the majority do not).
Unfortunately C# does not allow for data in Enums, I personally love Rust's Enums.
So do I, and I am aware that it's not directly possible in C#, but surely there's some way to achieve similar semantics, no? How do C# devs solve these kinds of problems?
You can just use a struct called something like 'Token'
with properties such as Value, Index and Type.
Type property being an enum.
I'm not sure what Index refers to here, and what would be the type of `Value`?
Type property being an enum.
yeah I get `Type`, what about the others
Value being the value of the token, example `(`.
Index refers to the index of where the token is found.
Oh, I don't need those as my lexer works with spans. So let's set those aside for a sec.
So, is there truly not a way to achieve this using polymorphism or whatever?
Because that doesn't seem like it solves the problem at all, it doesn't include the type-specific data
unless I'm misunderstanding
This is a pattern I use a lot
That's better. But .. Yeah.
Although this won't work if you're using spans and ref structs
(to be clear, by span I meant a struct that contains a start byte offset and an end byte offset, not like the C# `Span` type)
the record pattern is definitely the best here
you get all of the nice pattern matching syntax, you get equality, and it is not entirely unpleasant to add more cases
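The snippet showing the pattern is missing from this archive. Going by the names mentioned later in the thread (`Token`, `Regular`, `Integer`, `TokenKind Kind`), a minimal sketch of the abstract/sealed record approach could look like this. The enum members, the `Identifier` case, and the `Describe` helper are assumptions added for illustration.

```csharp
using System;

// Hypothetical kinds for the tokens that carry no data of their own.
public enum TokenKind { Plus, Minus, LeftParen, RightParen }

// The abstract base plays the role of the Rust enum as a whole.
public abstract record Token;

// Each sealed record is one variant, with only the data it needs.
public sealed record Regular(TokenKind Kind) : Token;
public sealed record Integer(long Value) : Token;
public sealed record Identifier(string Name) : Token;

public static class TokenDemo
{
    public static string Describe(Token token) => token switch
    {
        // Data is only reachable after matching on the variant.
        Regular r => $"regular {r.Kind}",
        Integer i => $"integer {i.Value}",
        Identifier id => $"identifier {id.Name}",
        // The compiler does not know the hierarchy is closed, so a fallback arm is still needed.
        _ => throw new ArgumentOutOfRangeException(nameof(token)),
    };
}
```

Unlike the struct-with-nullable-fields approach, there is no way to read `Value` from a token that is not an `Integer`. The trade-off is that records are reference types, so each token is a heap allocation.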
As someone who has never done traditional OOP, Thinker's suggestion is very foreign to me. It'll take me some time to figure out how it actually works, but it does look like a promising solution
You seem hesitant about that solution. Any particular reason why?
No, nothing.
I like it better than my idea 😛
abstract / record is definitely the correct way of doing it, and possibly a lot easier.
Alright then 🙂 I'll read some docs and give it a go
I think I grok `abstract` and `sealed`, but I'm not too sure what's the reason for `record` here
a `record` type comes with built-in support for equality. that is sort of the biggest feature
so, `a1 == a2` if all their fields are equal
oh ok
but, in this case, i think what is more important is the concise syntax
I expect all my data to be `Copy` (I believe you call them "value types" in C#, right?). Would it make sense to go with `record struct`?
because `Regular` uses a "primary constructor" (the `(TokenKind Kind)` part), the record automatically gains a constructor, a public property for `Kind`, and a deconstructor
which makes pattern matching just a breeze
ah I see how that makes records a good fit for enum variants
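To make the record features mentioned above concrete, here is a small sketch of value equality and of the deconstructor being used in a positional pattern. It assumes the same hypothetical `Token`, `Regular`, and `Integer` records as discussed in the thread; `SameToken` and `ValueOrZero` are made-up helper names.

```csharp
public enum TokenKind { Plus, Minus }

public abstract record Token;
public sealed record Regular(TokenKind Kind) : Token;   // gains a constructor, a Kind property, and Deconstruct
public sealed record Integer(long Value) : Token;

public static class RecordFeaturesDemo
{
    // Records compare by value: same variant and equal fields means equal.
    public static bool SameToken(Token a, Token b) => a == b;

    // The generated deconstructor lets a positional pattern pull the field out directly.
    public static long ValueOrZero(Token token) => token switch
    {
        Integer(var value) => value,
        _ => 0,
    };
}
```

For example, `SameToken(new Integer(1), new Integer(1))` is true even though the two objects are distinct instances, while comparing an `Integer` against a `Regular` is always false.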
you can't do inheritance with `struct`s which makes this pattern much more unpleasant
ahh ok I see!
Out of curiosity, what does "unpleasant" imply here?
it seems like this whole solution kind of relies on inheritance, so I'm curious what it would even look like without it
yeah i don't think unpleasant was the right word
you would have to do something awful
lol
Well, in any case, thanks to everyone who helped. The abstract record solution looks like it's going to be a useful pattern 🙂
@reflectronic sorry to ping, please let me know if that's not OK:
As I mentioned I'm not super familiar with inheritance, especially with all the record biz. Anyways, so
1) What would be the correct way to add a field shared by all tokens? It should be accessible without having to match the token variant.
2) I also see that all variants share a `TokenKind Kind` field. Is it possible/desirable to "pull" that into the parent?
Again, I can't stress enough how unfamiliar I am with all of this, so what I'm asking might not even make sense. Please let me know if that's the case.
to add something that's shared, you can add it to `Token`. something like:
this adds a `Span` property to `Token`, which is inherited by `Regular` and `Integer`. we add `Span` as a parameter to each derived record & pass it to `Token`'s primary constructor, so it can actually be set
For the sake of discussion, here's what I have now:
i would not have a `TokenKind`. the token kind is the type
if the instance of `Token` you have is a `LitInteger` then that is the token kind
Oh okay that makes sense. So I'd just ditch the `enum` entirely?
right, it will likely just get in the way. i would get rid of it
Alright, cool. Thanks 😄
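Putting the two pieces of advice together (a `Span` property on the base record, and no `TokenKind` enum), a sketch might look like the following. The original snippets are missing, so the `SourceSpan` type, the `Plus` variant, and the helper methods are assumptions; only `Token`, `Span`, and `LitInteger` are names taken from the thread.

```csharp
// Hypothetical span type: start and end byte offsets, as described earlier in the thread.
public readonly record struct SourceSpan(int Start, int End);

// The shared data lives on the base record's primary constructor.
public abstract record Token(SourceSpan Span);

// Each variant takes Span as a parameter and forwards it to Token.
// No TokenKind enum: the variant's type is the kind.
public sealed record Plus(SourceSpan Span) : Token(Span);
public sealed record LitInteger(long Value, SourceSpan Span) : Token(Span);

public static class SharedFieldDemo
{
    // Span is available on any Token without matching on the variant first.
    public static int Length(Token token) => token.Span.End - token.Span.Start;

    // The kind is recovered with a type pattern instead of an enum field.
    public static string KindName(Token token) => token switch
    {
        Plus => "plus",
        LitInteger => "integer literal",
        _ => "unknown",
    };
}
```

Because the derived records reuse the inherited `Span` property rather than declaring their own, `new LitInteger(42, new SourceSpan(0, 2)).Span` and the base `Token.Span` are the same property.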
Was this issue resolved? If so, run `/close` - otherwise I will mark this as stale and this post will be archived until there is new activity.
@reflectronic sorry to ping on this old discussion. I've been playing around with the sealed record pattern and I hit a limitation. Since you're pretty knowledgeable about stuff like this, I thought I might ask 🙂
My problem is that it doesn't seem possible to change the variant through a method. For example:
is it possible to work around this?
you can't reassign `this`/change the type of `this`. it would not make sense because each variant exists as its own type. so, you can have `Baz b = ...`. if `b.ProblematicMethod();` changed `b` to a `Bar` it would be an issue
Hmm thinking about this a bit more, I'm not even sure that would work with inheritance at all. Doing something like that seems like it could change the variable's type (from say Baz to Bar) which is clearly not allowed.
right, it's a little different from other languages where each variant isn't its own type
Hmm, unfortunate. It seems like this makes it impossible to implement some fundamental enum stuff (like state machines and whatnot)
thanks for explaining
if you had something like a `static void Transition(ref Foo foo)` you could get close
oh ok ok I think I could work with that!
wait actually idk if that works 🤔 Let's say I have a `Bar` that I want to pass to `Transition`, how would I do that?
the variable shouldn't have type `Bar`. it should have type `Foo`, but you store a `Bar` into it
gotcha, so I have to make Foo concrete?
(i.e. remove `abstract`)?
no, it's still allowed to be abstract
i can't seem to make that work
what isn't working about it
`Bar` needs to inherit from `Foo`
i think you meant to do that in your example but you didn't
ohhh yeah my bad, thanks for pointing it out
Ok cool, that should work well enough for what I need
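The example code from this exchange is missing, but the `ref Foo` idea can be sketched as below. `Foo`, `Bar`, `Baz`, and the `Transition` signature come from the thread; the `Count` field, the transition rules, and the `Demo` method are assumptions made up for illustration.

```csharp
using System;

public abstract record Foo;               // stays abstract
public sealed record Bar : Foo;           // both variants must inherit from Foo
public sealed record Baz(int Count) : Foo;

public static class StateMachine
{
    // A method cannot replace `this`, but it can replace what a Foo variable refers to.
    public static void Transition(ref Foo foo)
    {
        foo = foo switch
        {
            Bar => new Baz(1),
            Baz(var count) when count < 3 => new Baz(count + 1),
            Baz => new Bar(),
            _ => throw new InvalidOperationException("unknown state"),
        };
    }

    public static Foo Demo()
    {
        // The variable is typed as Foo even though it currently holds a Bar.
        Foo state = new Bar();
        Transition(ref state);
        return state;
    }
}
```

The key point is the declared type of the variable: `Bar state = new Bar();` could not be passed as `ref Foo`, because the method might store a `Baz` into it.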
Thanks a lot, once again, @reflectronic 😄