C# · 3mo ago
Pdawg

Optimizing the pipeline on my Z80 emulator

Hello! I've come to ask you all a question about optimization of lambda expressions. Currently, my pipeline looks like this:
private readonly Action[] instructionTable = new Action[256];

// in ctor, for example:
instructionTable[0x48] = () => LD_R_R(C, B); // LD C, B
instructionTable[0x49] = () => LD_R_R(C, C); // LD C, C
instructionTable[0x4A] = () => LD_R_R(C, D); // LD C, D

private void LD_R_R(byte dest, byte source)
{
    Registers.RegisterSet[dest] = Registers.RegisterSet[source];
    LogInstructionExec($"0x{_currentInstruction:X2}: LD {Registers.RegisterName(dest)}, {Registers.RegisterName(source)}:0x{Registers.RegisterSet[source]:X2}");
}

// calling:
_currentInstruction = Fetch();
switch (_currentInstruction)
{
    case 0xDD:
        DDInstructionTable[_currentInstruction](); break;
    case 0xFD:
        FDInstructionTable[_currentInstruction](); break;
    case 0xED:
        EDInstructionTable[_currentInstruction](); break;
    case 0xCB:
        CBInstructionTable[_currentInstruction](); break;

    default:
        instructionTable[_currentInstruction](); break;
}
This is nice because it's very reusable, but the lambda indirection costs time: every instruction goes through a delegate invocation before it reaches the actual LD_R_R(C, B) call. In this case, the lambdas seem to be pretty slow. How should I go about optimizing this while still keeping it reusable?
6 Replies
reflectronic · 3mo ago
you would need to write a separate method with a separate switch statement for all of the cases in instructionTable that calls LD_R_R(C, B); etc. directly
Pdawg (OP) · 3mo ago
what, so like
private void LD_C_B() => LD_R_R(C, B);

instructionTable[0x48] = LD_C_B;
reflectronic · 3mo ago
no, like:
switch (_currentInstruction)
{
    case 0x48: LD_R_R(C, B); break;
    // etc.
}
i do not think it is that much more onerous to write
case 0x48: LD_R_R(C, B);
than
_instructionTable[0x48] = () => LD_R_R(C, B);
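A minimal, self-contained sketch of the switch-based dispatch described above. The `MiniZ80` class, the register indices, and the plain `byte[]` register file are assumptions made for illustration, since the original `Registers` class is not shown, and logging is left out. The idea is that each `case` calls the handler directly with constant arguments, so there is no delegate invocation per instruction and the JIT can see the call target statically.

```csharp
using System;

// Hypothetical stand-in for the emulator core; not the original class.
public sealed class MiniZ80
{
    // Assumed register indices; the real layout is not shown in the thread.
    public const byte B = 0, C = 1, D = 2;

    public readonly byte[] RegisterSet = new byte[8];

    // Same reusable handler as in the question, minus the logging call.
    private void LD_R_R(byte dest, byte source)
    {
        RegisterSet[dest] = RegisterSet[source];
    }

    // Direct dispatch: one case per opcode, each a plain method call.
    public void Execute(byte opcode)
    {
        switch (opcode)
        {
            case 0x48: LD_R_R(C, B); break; // LD C, B
            case 0x49: LD_R_R(C, C); break; // LD C, C
            case 0x4A: LD_R_R(C, D); break; // LD C, D
            default:
                throw new NotImplementedException($"Opcode 0x{opcode:X2}");
        }
    }
}
```

`LD_R_R` stays a single reusable method; only the table of lambdas is replaced by `case` labels. Whether this is measurably faster depends on the runtime and the rest of the loop, so it is worth profiling both versions.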
Pdawg (OP) · 3mo ago
hmmm yeah i guess. only thing is id like to separate the switch blocks out for each "table". should i just
_currentInstruction = Fetch();

switch (_currentInstruction)
{
    case 0xDD:
        //DDInstructionTable[_currentInstruction](); break;
        DDSwitchTable(); break;
    //etc
}
like move each switch out to another method
reflectronic · 3mo ago
yes that would be fine i think
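A rough sketch of the per-table split agreed on here, with one method per prefix. Everything except the opcode values is an assumption: the `PrefixedDecoder` class, the flat `byte[]` memory, the register indices, and the `ExecuteMain`/`ExecuteED` names are hypothetical. It also assumes the prefixed method fetches the next byte itself, since on the Z80 the byte after an `0xED` prefix selects the instruction; flag updates are omitted.

```csharp
using System;

// Hypothetical decoder illustrating one switch method per opcode table.
public sealed class PrefixedDecoder
{
    // Assumed register indices; the real layout is not shown in the thread.
    public const byte B = 0, C = 1, A = 7;

    public readonly byte[] RegisterSet = new byte[8];
    private readonly byte[] _memory;
    private ushort _pc;
    private byte _currentInstruction;

    public PrefixedDecoder(byte[] program) { _memory = program; }

    private byte Fetch() => _memory[_pc++];

    // Top-level step: the prefix byte only chooses which method decodes next.
    public void Step()
    {
        _currentInstruction = Fetch();
        switch (_currentInstruction)
        {
            case 0xED: ExecuteED(); break;
            default:   ExecuteMain(); break;
        }
    }

    private void ExecuteMain()
    {
        switch (_currentInstruction)
        {
            case 0x00: break;                    // NOP
            case 0x48: LD_R_R(C, B); break;      // LD C, B
            default: throw new NotImplementedException($"0x{_currentInstruction:X2}");
        }
    }

    private void ExecuteED()
    {
        _currentInstruction = Fetch();           // opcode byte after the prefix
        switch (_currentInstruction)
        {
            case 0x44: NEG(); break;             // NEG (flags not modelled here)
            default: throw new NotImplementedException($"ED 0x{_currentInstruction:X2}");
        }
    }

    private void LD_R_R(byte dest, byte source) => RegisterSet[dest] = RegisterSet[source];

    private void NEG() => RegisterSet[A] = unchecked((byte)(0 - RegisterSet[A]));
}
```

Keeping each table in its own method keeps the individual switches short enough to read, which is the readability concern raised above.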
Pdawg (OP) · 3mo ago
mk, just really trying to optimize it while keeping it readable. ive seen some horribly unreadable emulators that are fast, and some emulators that are all pretty but eat 20% of your cpu trying to emulate a 4MHz z80