C#16mo ago
__dil__

❔ Enum with data

I'm working on a lexer, and I'm encountering this problem where I need to associate different data with different token types. My first attempt was this:
internal enum TokenKind
{
    A,
    B,
    C
}

internal struct Token
{
    private TokenKind kind;

    // Optional properties depending on which token kind
    internal long? AData { get; }
    internal string? BData { get; }
    internal bool? CData { get; }
}
Then you can use it like so:
switch (token.kind)
{
    case TokenKind.A:
        // Do something with AData
        Console.WriteLine(token.AData!);
        break;
    case TokenKind.B:
        // Do something with BData
        Console.WriteLine(token.BData!);
        break;
    case TokenKind.C:
        // Do something with CData
        Console.WriteLine(token.CData!);
        break;
}
I kind of dislike this implementation though. For one it seems kind of messy with the null-forgiving stuff, but it also makes the token type very bulky since it requires a separate field for every data variant. Since there will be a ton of tokens, it's important to minimize their size. It's also error-prone in the sense that someone can try to read data without first checking the token kind. What I'm trying to do is conceptually this:
enum TokenKind {
    A(i64),
    B(String),
    C(bool),
}
It doesn't have to look like this, but this is the kind of behavior I'm looking for (Rust enums, or algebraic sum types for those who know). What would be the clean way to do this in C#? Bear in mind there are actually a lot of token kinds, and not all token kinds have data associated with them (in fact the majority do not).
49 Replies
Buddy
Buddy16mo ago
Unfortunately, C# does not allow data in enums. I personally love Rust's enums.
__dil__
__dil__OP16mo ago
So do I, and I am aware that it's not directly possible in C#, but surely there's some way to achieve similar semantics, no? How do C# devs solve these kinds of problems?
Buddy
Buddy16mo ago
You can just use a struct called something like 'Token' with properties such as Value, Index and Type. Type property being an enum.
__dil__
__dil__OP16mo ago
I'm not sure what Index refers to here, and what would be the type of Value?
Buddy
Buddy16mo ago
Type property being an enum.
__dil__
__dil__OP16mo ago
yeah I get Type, what about the others
Buddy
Buddy16mo ago
Value being the value of the token, for example "(". Index refers to the index where the token is found.
__dil__
__dil__OP16mo ago
Oh, I don't need those as my lexer works with spans. So let's set those aside for a sec. So, is there truly not a way to achieve this using polymorphism or whatever? Because that doesn't seem like it solves the problem at all, it doesn't include the type-specific data unless I'm misunderstanding
Thinker
Thinker16mo ago
This is a pattern I use a lot
public abstract record Token
{
    public sealed record Regular(TokenKind Kind) : Token;

    // Every positional parameter needs a name, so TokenKind is named Kind here too
    public sealed record Integer(TokenKind Kind, int Value) : Token;
}
Buddy
Buddy16mo ago
That's better. But .. Yeah.
Thinker
Thinker16mo ago
Although this won't work if you're using spans and ref structs
__dil__
__dil__OP16mo ago
(to be clear, by span I meant a struct that contains a start byte offset and an end byte offset, not like the C# Span type)
reflectronic
reflectronic16mo ago
the record pattern is definitely the best here. you get all of the nice pattern matching syntax, you get equality, and it is not entirely unpleasant to add more cases
__dil__
__dil__OP16mo ago
As someone who has never done traditional OOP, Thinker's suggestion is very foreign to me. It'll take me some time to figure out how it actually works, but it does look like a promising solution.
You seem hesitant about that solution. Any particular reason why?
Buddy
Buddy16mo ago
No, nothing. I like it better than my idea 😛 abstract / record is definitely the correct way of doing it, and possibly a lot easier.
__dil__
__dil__OP16mo ago
Alright then 🙂 I'll read some docs and give it a go.
I think I grok abstract and sealed, but I'm not too sure what the reason for record is here
reflectronic
reflectronic16mo ago
a record type comes with built-in support for equality. that is sort of the biggest feature. so, a1 == a2 if all their fields are equal
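(Editor's sketch) To make the equality point concrete, here is a minimal example, assuming C# 9 or later and a cut-down version of Thinker's Token records. The TokenKind members and the EqualityDemo class are placeholders invented for illustration, not part of the thread.

```csharp
using System;

// Placeholder enum, standing in for the OP's real token kinds.
public enum TokenKind { A, B }

public abstract record Token
{
    public sealed record Regular(TokenKind Kind) : Token;
    public sealed record Integer(TokenKind Kind, int Value) : Token;
}

public static class EqualityDemo
{
    public static void Main()
    {
        Token a1 = new Token.Integer(TokenKind.B, 42);
        Token a2 = new Token.Integer(TokenKind.B, 42);
        Token a3 = new Token.Integer(TokenKind.B, 7);

        // Records compare by runtime type and by the values of their fields.
        Console.WriteLine(a1 == a2);                // True
        Console.WriteLine(a1 == a3);                // False

        // They are still two separate objects on the heap.
        Console.WriteLine(ReferenceEquals(a1, a2)); // False
    }
}
```

With an ordinary class instead of a record, a1 == a2 would compare references and come out False unless equality were written by hand.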
__dil__
__dil__OP16mo ago
oh ok
reflectronic
reflectronic16mo ago
but, in this case, i think what is more important is the concise syntax
__dil__
__dil__OP16mo ago
I expect all my data to be Copy (I believe you call them "value types" in C#, right?). Would it make sense to go with record struct?
reflectronic
reflectronic16mo ago
because Regular uses a "primary constructor" (the (TokenKind Kind) part), the record automatically gains a constructor, a public property for Kind, and a deconstructor which makes pattern matching just a breeze
var m = token switch
{
    Regular(TokenKind.A) => 1,
    Integer(TokenKind.B, > 100) => 2,
    // ...
};
__dil__
__dil__OP16mo ago
ah I see how that makes records a good fit for enum variants
reflectronic
reflectronic16mo ago
you can't do inheritance with structs which makes this pattern much more unpleasant
__dil__
__dil__OP16mo ago
ahh ok I see! Out of curiosity, what does "unpleasant" imply here? it seems like this whole solution kind of relies on inheritance, so I'm curious what it would even look like without it
reflectronic
reflectronic16mo ago
yeah i don't think unpleasant was the right word. you would have to do something awful
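(Editor's sketch) reflectronic does not say what the struct-based version would look like. One possibility, offered only as a guess, is a hand-rolled tagged union: a tag enum plus payload fields that overlap in memory via StructLayout. PackedToken, its members, and the TokenKind values below are all hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical tag enum; most kinds carry no payload.
public enum TokenKind : byte { Plus, LParen, LitInteger, LitFloat, LitBool }

// Hypothetical tagged union: one tag plus payload fields sharing the same bytes.
[StructLayout(LayoutKind.Explicit)]
public readonly struct PackedToken
{
    [FieldOffset(0)] public readonly TokenKind Kind;

    // All three payloads start at offset 8, so they occupy the same storage.
    [FieldOffset(8)] private readonly long _integer;
    [FieldOffset(8)] private readonly double _float;
    [FieldOffset(8)] private readonly bool _bool;

    // ": this()" zero-initializes every field before the body sets the relevant ones.
    private PackedToken(TokenKind kind) : this() { Kind = kind; }
    private PackedToken(long value) : this() { Kind = TokenKind.LitInteger; _integer = value; }
    private PackedToken(double value) : this() { Kind = TokenKind.LitFloat; _float = value; }
    private PackedToken(bool value) : this() { Kind = TokenKind.LitBool; _bool = value; }

    public static PackedToken Simple(TokenKind kind) => new PackedToken(kind);
    public static PackedToken Integer(long value) => new PackedToken(value);
    public static PackedToken Float(double value) => new PackedToken(value);
    public static PackedToken Bool(bool value) => new PackedToken(value);

    // Each accessor checks the tag, so the wrong payload is never handed out.
    public bool TryGetInteger(out long value)
    {
        bool ok = Kind == TokenKind.LitInteger;
        value = ok ? _integer : default;
        return ok;
    }

    public bool TryGetFloat(out double value)
    {
        bool ok = Kind == TokenKind.LitFloat;
        value = ok ? _float : default;
        return ok;
    }

    public bool TryGetBool(out bool value)
    {
        bool ok = Kind == TokenKind.LitBool;
        value = ok ? _bool : default;
        return ok;
    }
}

public static class PackedTokenDemo
{
    public static void Main()
    {
        PackedToken t = PackedToken.Integer(42);
        if (t.TryGetInteger(out long n))
            Console.WriteLine(n);               // 42
        Console.WriteLine(t.TryGetBool(out _)); // False
    }
}
```

This keeps tokens small and allocation-free, but every variant needs its own constructor and accessor, nothing stops a new accessor from forgetting the tag check, and reference-type payloads such as string cannot overlap with value-type fields in an explicit layout. That is a fair amount of ceremony compared with the record hierarchy.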
__dil__
__dil__OP16mo ago
lol
Well, in any case, thanks to everyone who helped. The abstract record solution looks like it's going to be a useful pattern 🙂
@reflectronic sorry to ping, please let me know if that's not OK. As I mentioned, I'm not super familiar with inheritance, especially with all the record biz. Anyway:
1) What would be the correct way to add a field shared by all tokens? It should be accessible without having to match the token variant.
2) I also see that all variants share a TokenKind Kind field. Is it possible/desirable to "pull" that into the parent?
Again, I can't stress enough how unfamiliar I am with all of this, so what I'm asking might not even make sense. Please let me know if that's the case.
reflectronic
reflectronic16mo ago
to add something that's shared, you can add it to Token. something like:
public abstract record Token(TextSpan Span)
{
    public sealed record Regular(TokenKind Kind, TextSpan Span) : Token(Span);
    // The TokenKind parameter needs a name to compile
    public sealed record Integer(TokenKind Kind, int Value, TextSpan Span) : Token(Span);
}
this adds a Span property to Token, which is inherited by Regular and Integer. we add Span as a parameter to each derived record & pass it to Token's primary constructor, so it can actually be set
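(Editor's sketch) A small usage example for the shared property, assuming C# 10 or later. TextSpan here is a hypothetical start/end offset pair matching the OP's earlier description of a span, and the TokenKind members and SpanDemo class are placeholders.

```csharp
using System;

// Placeholder enum, standing in for the OP's real token kinds.
public enum TokenKind { Plus, LitInteger }

// Hypothetical span type: a start and an end byte offset.
public readonly record struct TextSpan(int Start, int End);

public abstract record Token(TextSpan Span)
{
    public sealed record Regular(TokenKind Kind, TextSpan Span) : Token(Span);
    public sealed record Integer(TokenKind Kind, int Value, TextSpan Span) : Token(Span);
}

public static class SpanDemo
{
    // Works for any token: Span lives on the base record, so no matching is needed.
    public static int Length(Token token) => token.Span.End - token.Span.Start;

    public static void Main()
    {
        Token[] tokens =
        {
            new Token.Regular(TokenKind.Plus, new TextSpan(0, 1)),
            new Token.Integer(TokenKind.LitInteger, 123, new TextSpan(2, 5)),
        };

        foreach (Token t in tokens)
            Console.WriteLine($"{t.GetType().Name} at {t.Span.Start}..{t.Span.End}, length {Length(t)}");
        // Regular at 0..1, length 1
        // Integer at 2..5, length 3
    }
}
```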
__dil__
__dil__OP16mo ago
For the sake of discussion, here's what I have now:
public abstract record Token
{
    // I'd like to add a field for the location of the token...

    // Variants with no data
    public sealed record Regular(TokenKind Kind) : Token;

    // Variants for literals
    public sealed record LitInteger(TokenKind Kind, long Value) : Token;

    public sealed record LitFloat(TokenKind Kind, double Value) : Token;

    public sealed record LitBool(TokenKind Kind, bool Value) : Token;

    // Identifiers
    public sealed record Identifier(TokenKind Kind, InternerId InternerId) : Token;
}
reflectronic
reflectronic16mo ago
i would not have a TokenKind. the token kind is the type. if the instance of Token you have is a LitInteger, then that is the token kind
__dil__
__dil__OP16mo ago
Oh okay that makes sense. So I'd just ditch the enum entirely?
reflectronic
reflectronic16mo ago
right, it will likely just get in the way. i would get rid of it
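(Editor's sketch) One way the hierarchy might look with the enum removed, assuming C# 9 or later. The specific variants (Plus, LParen, and so on), the string-based Identifier payload, and the Describer class are illustrative guesses, not the OP's actual token set.

```csharp
using System;

public abstract record Token
{
    // Data-free tokens become empty records; the type itself is the "kind".
    public sealed record Plus : Token;
    public sealed record LParen : Token;

    // Tokens with data carry only the payload they need.
    public sealed record LitInteger(long Value) : Token;
    public sealed record LitBool(bool Value) : Token;
    public sealed record Identifier(string Name) : Token;
}

public static class Describer
{
    public static string Describe(Token token) => token switch
    {
        Token.Plus => "plus",
        Token.LParen => "left paren",
        Token.LitInteger(var value) => $"integer {value}",
        Token.LitBool { Value: true } => "true literal",
        Token.LitBool => "false literal",
        Token.Identifier(var name) => $"identifier {name}",
        // The compiler cannot prove the hierarchy is closed, so a fallback arm is needed.
        _ => throw new ArgumentOutOfRangeException(nameof(token)),
    };

    public static void Main()
    {
        Console.WriteLine(Describe(new Token.LitInteger(42)));   // integer 42
        Console.WriteLine(Describe(new Token.Plus()));           // plus
        Console.WriteLine(Describe(new Token.LitBool(false)));   // false literal
    }
}
```

The payload is only reachable after matching on the variant, which addresses the OP's original worry about reading data without checking the kind first. The trade-off is that each token is a heap-allocated object rather than a small struct.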
__dil__
__dil__OP16mo ago
Alright, cool. Thanks 😄
Accord
Accord16mo ago
Was this issue resolved? If so, run /close - otherwise I will mark this as stale and this post will be archived until there is new activity.
__dil__
__dil__OP15mo ago
@reflectronic sorry to ping on this old discussion. I've been playing around with the sealed record pattern and I hit a limitation. Since you're pretty knowledgeable about stuff like this, I thought I might ask 🙂 My problem is that it doesn't seem possible to change the variant through a method. For example:
public abstract record Foo {
    public sealed record Bar
    {
    }

    public sealed record Baz
    {
    }

    public void ProblematicMethod()
    {
        this = new Bar(); // Error: Reference to 'this' is immutable in record declarations. The assignment target must be an assignable variable, property, or indexer.
    }
}
is it possible to work around this?
reflectronic
reflectronic15mo ago
you can't reassign this/change the type of this. it would not make sense because each variant exists as its own type. so, say you have Baz b = .... if b.ProblematicMethod(); changed b to a Bar, it would be an issue, because b is declared as a Baz
__dil__
__dil__OP15mo ago
Hmm thinking about this a bit more, I'm not even sure that would work with inheritance at all. Doing something like that seems like it could change the variable's type (from say Baz to Bar) which is clearly not allowed.
reflectronic
reflectronic15mo ago
right, it's a little different from other languages where each variant isn't its own type
__dil__
__dil__OP15mo ago
Hmm, unfortunate. It seems like this makes it impossible to implement some fundamental enum stuff (like state machines and whatnot).
thanks for explaining
reflectronic
reflectronic15mo ago
if you had something like a static void Transition(ref Foo foo) you could get close
__dil__
__dil__OP15mo ago
oh ok ok I think I could work with that! wait actually idk if that works 🤔 Let's say I have a Bar that I want to pass to Transition, how would I do that?
reflectronic
reflectronic15mo ago
the variable shouldn't have type Bar. it should have type Foo, but you store a Bar into it
__dil__
__dil__OP15mo ago
gotcha, so I have to make Foo concrete? (i.e. remove abstract)?
reflectronic
reflectronic15mo ago
no, it's still allowed to be abstract
Foo state;

// initial state
state = new Bar();

// state may become something else
Foo.Transition(ref state);

// it might be something else now
switch (state)
{
    case Bar: Console.WriteLine("bar"); break;
    case Baz: Console.WriteLine("baz"); break;
}
__dil__
__dil__OP15mo ago
i can't seem to make that work
reflectronic
reflectronic15mo ago
what isn't working about it
__dil__
__dil__OP15mo ago
[image attachment; no description available]
reflectronic
reflectronic15mo ago
Bar needs to inherit from Foo. i think you meant to do that in your example but you didn't
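(Editor's sketch) Putting the pieces of this exchange together, a complete version might look like the following, assuming C# 9 or later. The transition rule (Bar and Baz simply swap) and the TransitionDemo class are made up for illustration; the thread never says what the real state machine does.

```csharp
using System;

public abstract record Foo
{
    // Both variants inherit from Foo, so a Foo variable can hold either one.
    public sealed record Bar : Foo;
    public sealed record Baz : Foo;

    // Replaces the caller's variable with the next state.
    // "ref" is what lets the method swap in an object of a different variant.
    public static void Transition(ref Foo foo)
    {
        foo = foo switch
        {
            Bar => new Baz(),
            Baz => new Bar(),
            _ => throw new InvalidOperationException("unknown state"),
        };
    }
}

public static class TransitionDemo
{
    public static void Main()
    {
        // The variable is typed as Foo, not Bar, so it may later hold a Baz.
        Foo state = new Foo.Bar();

        Foo.Transition(ref state);
        Console.WriteLine(state is Foo.Baz); // True

        Foo.Transition(ref state);
        Console.WriteLine(state is Foo.Bar); // True
    }
}
```

Declaring the variable as Foo.Bar instead would make the call fail to compile, since a ref argument must match the parameter type exactly.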
__dil__
__dil__OP15mo ago
ohhh yeah my bad, thanks for pointing it out.
Ok cool, that should work well enough for what I need.
Thanks a lot, once again, @reflectronic 😄
