Anyway, I managed to do some work on the E-UAE JIT in a few stolen moments.
In the update for this month you will find these little eggs:
- Implementation of ADD.x Dy,mem, ADD.x mem,Dy, ADD.x #imm,mem, ADDA.x mem,Ay, ADDQ.x #imm,mem,
AND.x mem,Dy, AND.x reg,mem, ANDI.x #imm,mem,
BCLR.B #imm,mem, BCHG.B #imm,mem, BCHG.L #imm,reg, BSET.B #imm,mem,
NEG.x Dy,
OR.x Dy,mem, OR.x mem,Dy, ORI.x #imm,mem,
UNLK.x Ay instructions.
- Fixed unintended modification of the source register for some register-to-memory operations.
- Memory read helper tweaked to use the R3 register as the result register, so there is no need to copy the data back and forth. (More optimal compiled code.)
- Memory reader and writer helper functions cleaned up to be more independent from caller data.
The tricky part was accessing the memory while keeping the allocated temporary registers accessible. A minor workaround solves this for now: the temporary registers are saved and, when needed, reloaded after the memory access.
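To illustrate the save-and-occasionally-reload idea, here is a minimal sketch. It is not the actual E-UAE code: the bitmask model, the names `temps_to_save` and `temps_to_reload`, and the assumption that the usual 32-bit PowerPC SysV convention applies (R0 and R3 to R12 may be overwritten by a called function) are all mine, for illustration only.

```c
#include <stdint.h>

/* Hypothetical model: bit n set means host register Rn holds a live temporary. */
typedef uint32_t regmask_t;

/* Assumption: 32-bit PowerPC SysV convention, where R0 and R3..R12
 * are volatile, i.e. a called helper is free to overwrite them. */
#define VOLATILE_REGS ((regmask_t)0x00001FF9u)

/* Temporaries that have to be saved before calling a memory helper:
 * only the live ones that the helper may clobber. */
static regmask_t temps_to_save(regmask_t live_temps)
{
    return live_temps & VOLATILE_REGS;
}

/* Temporaries to reload after the call: the saved ones that the rest of
 * the compiled code still needs, except the register that now carries
 * the helper's result (pass -1 when there is no result register). */
static regmask_t temps_to_reload(regmask_t saved, regmask_t still_needed,
                                 int result_reg)
{
    regmask_t result_bit = 0;
    if (result_reg >= 0 && result_reg < 32)
        result_bit = (regmask_t)1u << result_reg;
    return saved & still_needed & ~result_bit;
}
```

For example, with live temporaries in R3, R4 and R14, only R3 and R4 would be saved. If the result comes back in R3, only R4 is reloaded, and only when something still reads it.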
I am not too happy about how the whole register mapping works. Unfortunately, some limitations of the C language make it complicated to come up with a more robust solution. So, right now the whole thing is just a bit hacky and wacky. Maybe it will need an overhaul in the future.
The question I get most often is how many instructions are left to implement. There is an easy way to find out the progress: check the table68k_comp descriptor file.
Each instruction that is (or will be) supported by the JIT compiler is already listed there. Next to the name of the instruction there is a number, 0 or 1: 1 means it is already done, 0 means it remains to be implemented.
The instructions which will not be supported by the JIT compiler (so the interpreter will handle them) are not listed in this file.
So, all we need to do is count the instructions which are already supported and those which remain to be done. The current state without the FPU instructions is: 181 done out of 388 (~46% done).
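The counting can be sketched in a few lines of C. This is only an illustration: the line format assumed here (`<name> <flag>`, one instruction per line) is hypothetical and much simpler than the real table68k_comp layout, and the names `count_progress` and `percent_done` are made up for this example.

```c
#include <stdio.h>
#include <string.h>

struct jit_progress {
    int done;   /* rows flagged 1 */
    int total;  /* rows flagged 0 or 1 */
};

/* Counts the rows of a descriptor text.
 * Assumed (hypothetical) row format: "<name> <flag>", flag being 0 or 1.
 * Rows that do not match this pattern are skipped. */
static struct jit_progress count_progress(const char *text)
{
    struct jit_progress p = {0, 0};
    const char *line = text;

    while (*line) {
        size_t len = strcspn(line, "\n");   /* length of the current row */
        char buf[128];
        char name[64];
        int flag;

        if (len < sizeof buf) {
            memcpy(buf, line, len);
            buf[len] = '\0';
            if (sscanf(buf, "%63s %d", name, &flag) == 2 &&
                (flag == 0 || flag == 1)) {
                p.total++;
                p.done += flag;
            }
        }
        line += len;
        if (*line == '\n')
            line++;                         /* step over the line break */
    }
    return p;
}

/* Percentage of finished instructions; 0 for an empty table. */
static double percent_done(struct jit_progress p)
{
    return p.total ? 100.0 * p.done / p.total : 0.0;
}
```

Feeding the counts from above (181 done, 388 total) into `percent_done` gives a value between 46 and 47 percent.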
As you can see, there is more work to do, but it is really hard to tell how long it will take. What I can see is that the time I have to spend on each instruction is getting shorter and shorter, because the infrastructure which had to be built first is now mostly done. Also, some instructions are very similar, so I can simply reuse parts of an already finished instruction.
We are not there yet, but the donkey is not that stubborn anymore. Giddy-up buddy!
Very good to see progress :)
I tested it with Siedler (Settlers) on MorphOS, and there I get the message: "JIT macroblock array run out of space, please enlarge the static array".
Hm, interesting. Usually this is the sign of an overrunning block, like jumping into the "empty" memory to a random address. Anyway, you can try to increase the size of the macroblock buffer: open the compemu_compiler_ppc.c source and look for this line:
#define MAXMACROBLOCKS (MAXRUN*4)
The assumption is that the average number of macroblocks per instruction is around 4, which might or might not be a valid assumption. Bump it up to 5 or 6 and test it again.
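To show why this define matters, here is a minimal sketch of a fixed-size pool with a bounds check. It is not the real compiler source: the `MAXRUN` value, the `macroblock` struct and the helper names are placeholders for illustration; only the `MAXRUN*4` sizing comes from the line above.

```c
#include <stddef.h>

#define MAXRUN 8                        /* illustrative value only */
#define MAXMACROBLOCKS (MAXRUN*4)       /* assumes ~4 macroblocks per instruction */

struct macroblock {
    int opcode;                         /* hypothetical payload */
};

static struct macroblock pool[MAXMACROBLOCKS];
static size_t pool_used = 0;

/* Hands out the next free slot of the static array,
 * or NULL when the array has run out of space. */
static struct macroblock *macroblock_alloc(void)
{
    if (pool_used >= MAXMACROBLOCKS)
        return NULL;
    return &pool[pool_used++];
}

/* Starts a new block: every slot becomes free again. */
static void macroblock_reset(void)
{
    pool_used = 0;
}
```

A block that needs more macroblocks than the multiplier allows hits the NULL case, which is the situation the error message reports. A runaway block hits it too, no matter how large the multiplier is.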
Even 10 is not enough ;) I set it to 15, and the error was gone. But at 15 and above I get a new error: unexplained cache miss.
Then the execution simply ran amok. It jumped into the middle of the memory, and when it ran out of the mapped memory, the unexplained cache miss happened. Some fixes are coming in the pipeline which might have a positive effect on the current behavior.