- Implementation of Bcc.x addr, BCHG.B Dx,mem, BCHG.L Dx,Dy, BCLR.B Dx,mem, BCLR.L Dx,Dy, BRA.x abs, BSET.B Dx,mem, BSET.L Dx,Dy, BSR.x abs, BTST.B #imm,mem, BTST.B Dx,#imm, BTST.B Dx,mem, BTST.L Dx,Dy, CMP.x #imm,mem, CMP.x mem,Dy, CMP.x reg,Dy, CMPA.L reg,Ax, CMPA.W reg,Ax, CMPA.x mem,Ay, DBF.W Dx,addr, EOR.x #imm,mem, JMP.L abs, JMP.L mem, JSR.L abs, JSR.L mem, NEG.x mem, NOT.x mem, RTS, TAS.B Dx, TAS.B mem instructions.
- Cache invalidation fix for OSX 10.3.9 and below. (Thanks to Mike Blackburn again.)
- Fixed mask handling in BCHG.B Dx,mem instruction.
- Fixed missing register mapping in ASL.x #imm,Dy implementation.
- Fixed input dependency overwriting in certain memory-related allocation functions.
- Fixed dependency for destination memory pointer register in special memory reading.
- Fixed post address handler for condition code addressing modes; previously it might crash or call some random handler from the other addressing modes.
- Fixed instructions where temporary registers were allocated but not freed.
- Optimized masking for register-to-register bit instructions.
- Optimized the temporary register usage in the helper_test_bit_register_register function.
- Optimized flag extraction in several shift operations.
- Branch scheduling is more flexible: adding multiple interleaved branches is possible.
- Added a comment about the missing implementation of the exception raised when an odd address is loaded into the PC.
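A note on the bit instruction entries in the list above: on the 68000 the bit number is taken modulo 8 when the destination is a byte in memory and modulo 32 when it is a data register, which is why the mask matters for BCHG.B Dx,mem. The following is only a minimal sketch of that rule in plain C, not the code the JIT emits; the function names bchg_byte and bchg_long are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical illustration of BCHG.B Dx,mem semantics:
 * the bit number is masked to 0..7 before use.
 * *z_flag receives 1 if the tested bit was clear before the change. */
static uint8_t bchg_byte(uint8_t value, uint32_t bit_number, int *z_flag)
{
    uint8_t mask = (uint8_t)(1u << (bit_number & 7u));
    *z_flag = (value & mask) == 0;
    return (uint8_t)(value ^ mask);
}

/* Hypothetical illustration of BCHG.L Dx,Dy semantics:
 * the bit number is masked to 0..31 before use. */
static uint32_t bchg_long(uint32_t value, uint32_t bit_number, int *z_flag)
{
    uint32_t mask = UINT32_C(1) << (bit_number & 31u);
    *z_flag = (value & mask) == 0;
    return value ^ mask;
}
```

With these helpers, a bit number of 9 flips bit 1 of a memory byte, while the same bit number flips bit 9 of a data register.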
A few highlights
First of all, let me brag a little about the number of freshly implemented instructions. Right now 237 of the 388 instructions are implemented, so a solid 61% is done. (Previously the ratio was ~46%.)
More Mac OS X versions are supported now: Mike fixed up the cache flushing a little and added the pre-10.4 versions too. Please read the included README file for the compiling instructions.
While I was working on the instructions I discovered a few bugs and glitches, which are now fixed in this release, improving the overall stability.
I have also managed to optimize the compiled code for some instructions. Together with the implementation of some previously missing instructions, this improved the JIT results for the Mandelbrot test (mandel_though_hw.kick.gz among the test kick files) by roughly 15% compared to the previous release:
Interpretive: 108 seconds (no change there...);
JIT compiled without optimization: 44 seconds (previously it was 52 seconds);
JIT compiled with optimization: 27 seconds (previously it was 32 seconds).
That was enough time for self-polishing; now back to work...