Which is faster - a type LUT or a switch?
Hello again, C# Discord! I've got a question about performance, which is absolutely critical since I'm making an emulator. When decoding Z80 instructions that involve registers, there's a clever little hack to pull out the "operating register", or the one that the opcode is referencing. I get that a simple switch could work, but it creates a lot of repeating code. For example:
Notice how the only thing changing is the register it's referencing. So, I want to use the hack to extract the operating register out of the current opcode and then set the corresponding register accordingly, so I can combine all 3 cases (
0x01
, 0x11
, and 0x21
) into one that uses the operating register value.
I need to know how I should implement this, and I've landed on two options - a switch statement (switch (operatingRegister) ... etc
), or, a LUT using a Dictionary<byte, Action>
. Which one do you think would perform better? There are 26 possible values that the operating register can be, if that helps you all decide.
Thanks a lot!47 Replies
for something like this id say benchmark and see
how would you recommend I do a proper benchmark? just stopwatch a loop that gets and sets all registers?
GitHub
GitHub - dotnet/BenchmarkDotNet: Powerful .NET library for benchmar...
Powerful .NET library for benchmarking. Contribute to dotnet/BenchmarkDotNet development by creating an account on GitHub.
awesome - thanks!
SharpLab
C#/VB/F# compiler playground.
A switch statement will normally outperform a dictionary. The compiler has all sorts of tricks it can use, including generating a lookup
To be honest, having those 3 separate cases is probably going to be quicker than trying to be clever. In both cars the compiler had to find right case for the opcode, but in your second approach it's then got to do a bunch of bitwise arithmetic and maybe some branching to find the right register on top of that
there’s a lotttt of cases where this register lookup is used so I feel like it’ll make it readable and make my life easier developing it
How many times are you repeating
switch (opcode)
? I can't imagine you need to copy/paste that a lot - and if you do, can't you call a helper function which has the switch?I’ve seen someone else do this same hack, but their registers are just represented by an array of bytes so they can just take the output of this operating register operation and index into the array accordingly. I really don’t to make my registers be an array of bytes though cause it’s not readable and its hard to remember what goes where
switch (opcode) is executed in a loop until the PC hits the end of memory or if the processor is halted
I can do a function call but it doesn’t eliminate the issue I have, which is repeated code
I want this emulator to perform well but be very readable and easy to understand
I still don't understand why it's repeated. If it's called in a loop, that doesn't mean the code is repeated...
I mean the code inside of the 3 cases
it’s the same code but it’s referencing a different type. And again, this “hack” is useful in a lot of scenarios so it’ll make everything a lot cleaner and easier to write
I still don't understand why you can't put the
switch (opcode)
(first code sample in your question) into a helper function, if needed
(bear in mind I don't know the Z80 instruction set)
(and have no idea what the "byte array hack" is)someone else made a z80 emulator, whose register set is defined as an array of bytes. this allows them to index into it given the
operatingRegister
value directly and get/set that part accordingly. my register set is defined as separate typesim against this idea of using a byte array as the register set because it's unreadable and makes it a pain for anyone who doesn't understand the Z80
I don't see a byte array in that code, or how that relates to opcode decoding
the screenshot is my register set
see, this is how it's currently done
but this code is all repetitive
there's no technical reason why this wouldn't work, but it's just not ideal and i'd like to use the operating register value
How many cases are there? Why are you copy/pasting that switch statement around?
these are the cases for the ld dd, nn opcodes
every single instruction will be parsed under this root switch
so i want to keep it as small as possible
and moving each little set of instructions out to helper functions doesnt solve the problem, it just moves the repeated code out to another spot
Ah so there are many more cases than the 3 you've referred above?
yes
this is just an example
this will have to be done all over the place if i don't implement this clever little lookup
Right! You didn't say that. You said you didn't want to keep repeating the three cases you already had
lol, sorry. but this repetitiveness happens all over the place because im lacking an implementation for the operating register lookup
Or not? Why "all over the place"? Why does anything except the opcode decoder care about this?
nothing else does
this massive switch doesnt effect anything from a code standpoint
this is all semantic
it makes the code a pain to read
So you're not repeating them "all over the place"? Just in that one switch statement?
again, i want this to be as simple, clean, and readable as possible
correct
repeating code is still bad regardless
Right OK
im personally not a fan, like at all
Well, ish. Often verbose code is more understandable more quickly, as you don't need to learn how all of the abstraction work
sort of, yeah, but thisll end up being a 4000 line file if everything has to be repeated lol
Also your debug code doesn't need to be on the fast path, so you can potentially reduce the body if each case down to 1 line
this is a library so i cant use preprocessor directives. the debug stuff has to be done at runtime to enable/disable logging depending on the mode of the processor
If you want to move the complexity out of the switch, you can make a register an object (rather than a property), and have a method which returns the register for a particular opcode
it has another mode where logging is disabled for performance
but thats besides the point
You misunderstand. I mean you can use slower constructs for logging, such as dictionary lookups etc
wdym? the logging happens right when an instruction is decoded so it kind of has to be inlined with the rest of the actual instruction behavior
But anyway. Make a register an object. Write a method which gets the register for a given "byte array offset"
thats kinda what i was thinking of doing
but wouldnt it require reflection to pull that off
to return a type
reflection is slow
For example, if a register is an object, it can return its name, and you can use that the construct the log message
No?
yeah, but im not worried about the logging code rn
im more worried about the actual setting of the right register
whether it be BC, DE, or HL
Stick your register objects in an array, do a O(1) lookup based on the same index as the byte array hack uses
ahhhh wait
youre on to something
yeah okay i get what youre saying
thats actually a great idea, thanks!
Your register objects can just be references back to your existing register bank storage thingy, if you still want to keep all of the register values in one place