- Thanks to Anonymous #1 (Thore) and Anonymous #2 (itix) from the comments section of the previous post, MorphOS support for the JIT compiler is now implemented. (I had no way to test it, but fingers crossed...)
- A bug is fixed in the memory read/write handling. It caused illegal memory accesses when the 3.x Kickstart was running: the stackframe was trashed due to a wrong offset calculation for the register saving.
Unfortunately, this is not yet the fix needed to let AmigaOS boot, but at least it is one more baby step in that direction.
P.S.: Anonymous MorphOS devs, don't you want to reveal yourselves? :)
Anonymous #1: My Name is Thore.
Unfortunately, I still get random crashes when trying mandel_hw.kick or iamalive. Sometimes it works, sometimes not. And sometimes mandel crashes after drawing some lines.
Btw: You forgot to include proto/exec.h and exec/system.h for MorphOS in memory.c.
Thanks, includes are fixed now.
Do you have a crashlog or something to go on with? If it happens randomly then most likely it is some trashed register or the stackframe is still wrong somehow.
You got rid of the anonymous function? ;) Okay.
I do not have a crash log but I know when it happens:
1. The bi->handler is set to "compile_p_at_start" (compemu_support.c/compile_block)
2. CL List is updated (This one is important, without the update it works, but not in JIT mode)
3. m68k_run_2a (newcpu.c) tries to execute the code (I compared the handler addresses, so this is exactly the compile_p_at_start)
4. It won't return from this code, but other handlers worked 4 times before this happens.
So it could indeed be a stackframe or register issue.
What you described here is the calling of the compiled code chunk.
Could you please turn on the JIT logs (comp_log=true and comp_log_compiled)? That way you can check exactly which code chunk was translated before the crash.
Here is the log; right after it the window stays gray and it crashes. It's the mandelbrot.
JIT: Init compiling
JIT: Compiled code start: 2b984f48
JIT: Comp: 00f8005e 12d8 MOVE.B (A0)+,(A1)+
JIT: Comp: 00f80060 51c8 fffc DBF.W D0,#$fffc == 00f8005e (FALSE)
JIT: Unsupported opcode: 0x51c8
JIT: M68k: 00f8005e 12d8 MOVE.B (A0)+,(A1)+
JIT: Mblk: load_memory_long
JIT: Dism: 2b984fac: lwz r15,64(r14)
JIT: Mblk: load_memory_long
JIT: Dism: 2b984fb0: lwz r3,68(r14)
JIT: Mblk: rotate_and_copy_bits
JIT: Dism: 2b984fb4: rlwimi r15,r3,16,26,26
JIT: Mblk: load_memory_long
JIT: Dism: 2b984fb8: lwz r3,32(r14)
JIT: Mblk: copy_register_long
JIT: Dism: 2b984fbc: mr r4,r3
JIT: Mblk: add_register_imm
JIT: Dism: 2b984fc0: addi r3,r3,1
JIT: Mblk: load_memory_long
JIT: Dism: 2b984fc4: lwz r5,36(r14)
JIT: Mblk: copy_register_long
JIT: Dism: 2b984fc8: mr r6,r5
JIT: Mblk: add_register_imm
JIT: Dism: 2b984fcc: addi r5,r5,1
JIT: Mblk: save_reg_stack
JIT: Dism: 2b984fd0: stw r6,16(r1)
JIT: Mblk: save_memory_long
JIT: Dism: 2b984fd4: stw r3,32(r14)
JIT: Mblk: save_memory_long
JIT: Dism: 2b984fd8: stw r5,36(r14)
JIT: Mblk: load_memory_spec
JIT: Dism: 2b984fdc: mr r3,r4
JIT: Dism: 2b984fe0: rlwinm r0,r3,18,14,29
JIT: Dism: 2b984fe4: lis r5,10669
JIT: Dism: 2b984fe8: ori r5,r5,40152
JIT: Dism: 2b984fec: lwzx r5,r5,r0
JIT: Dism: 2b984ff0: lwz r5,8(r5)
JIT: Dism: 2b984ff4: mtlr r5
JIT: Dism: 2b984ff8: blrl
JIT: Dism: 2b984ffc: mr r4,r3
JIT: Mblk: load_reg_stack
JIT: Dism: 2b985000: lwz r3,16(r1)
JIT: Mblk: check_byte_register
JIT: Dism: 2b985004: extsb. r0,r4
JIT: Mblk: copy_nz_flags_to_register
JIT: Dism: 2b985008: mfcr r5
JIT: Mblk: rotate_and_copy_bits
JIT: Dism: 2b98500c: rlwimi r15,r5,0,0,2
JIT: Mblk: rotate_and_mask_bits
JIT: Dism: 2b985010: rlwinm r15,r15,0,11,8
JIT: Mblk: save_memory_spec
JIT: Dism: 2b985014: rlwinm r0,r3,18,14,29
JIT: Dism: 2b985018: lis r5,10669
JIT: Dism: 2b98501c: ori r5,r5,40152
JIT: Dism: 2b985020: lwzx r5,r5,r0
JIT: Dism: 2b985024: lwz r5,20(r5)
JIT: Dism: 2b985028: mtlr r5
JIT: Dism: 2b98502c: blrl
JIT: M68k: 00f80060 51c8 fffc DBF.W D0,#$fffc == 00f8005e (FALSE)
JIT: Mblk: save_memory_long
JIT: Dism: 2b985030: stw r15,64(r14)
JIT: Mblk: save_memory_word
JIT: Dism: 2b985034: sth r15,68(r14)
JIT: Mblk: load_register_long
JIT: Dism: 2b985038: lis r3,10543
JIT: Dism: 2b98503c: ori r3,r3,53816
JIT: Mblk: save_memory_long
JIT: Dism: 2b985040: stw r3,76(r14)
JIT: Mblk: opcode_unsupported
JIT: Dism: 2b985044: li r3,20936
JIT: Dism: 2b985048: lis r4,10669
JIT: Dism: 2b98504c: ori r4,r4,39904
JIT: Dism: 2b985050: bl 0x2cf64ee4
JIT: Done compiling
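The load_memory_spec and save_memory_spec blocks in the log above look like a table lookup followed by an indirect call: `rlwinm r0,r3,18,14,29` turns the 68k address in r3 into a byte offset, `lwzx` fetches a pointer from a table at that offset, and `lwz r5,8(r5)` or `lwz r5,20(r5)` then picks a handler to call through `blrl`. A minimal C sketch of just the offset arithmetic follows. The function names are hypothetical, and reading the table as "one 4-byte pointer per 64 KB of 68k address space" is an assumption inferred from the disassembly, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates PowerPC "rlwinm rA,rS,SH,MB,ME": rotate rS left by SH bits,
 * then keep only bits MB..ME. PowerPC numbers bits from the most
 * significant end (bit 0 = MSB). This sketch handles MB <= ME only,
 * so wrap-around masks such as "rlwinm r15,r15,0,11,8" are not covered. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    assert(sh < 32 && mb <= me && me < 32);
    uint32_t rot = sh ? (rs << sh) | (rs >> (32 - sh)) : rs;
    uint32_t mask = (0xFFFFFFFFu >> mb) & (0xFFFFFFFFu << (31 - me));
    return rot & mask;
}

/* Hypothetical helper: the byte offset computed by
 * "rlwinm r0,r3,18,14,29" for a given 68k address. */
static uint32_t bank_table_offset(uint32_t m68k_addr)
{
    return rlwinm(m68k_addr, 18, 14, 29);
}
```

For any address the result equals `(addr >> 16) << 2`, i.e. the upper 16 bits of the 68k address scaled by the size of a 32-bit pointer. The MOVE.B source address range starting at 00f80000 would therefore map to offset 0x3e0.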
That is the very first compiled code chunk. It seems that something goes wrong when it tries to execute it.
As far as I can tell there might be two possibilities:
1. the cache flushing for the translated code area is not working,
2. the allocated memory for the code cache requires some special flag or MMU mapping.
Unfortunately, I have no access to any MOS device, but you might try to cook up a simple test app that copies a few lines of code into an allocated memory area, flushes the cache, then tries to execute it, and see what happens.
Very frustrating. My test program did not crash, but it stopped at the cache flush. When I disabled the cache flush, the code ran correctly multiple times without stopping.
So I tried some stuff in uae.
I disabled cache flushing. Then the mandel works, even multiple times.
After running iamalive, which can work once, it becomes unstable. iamalive will not run a second time.
With cache flush enabled, I cannot run any demo; it crashes immediately. Is there really no way around this cache flush?
Here is the log for iamalive after the second attempt to run it (first one worked, second one crashed), note this is without cache flush!
JIT: Compiled code start: 37ec4be8
JIT: Comp: 0000001a 5281 ADD.L #$00000001,D1
JIT: Comp: 0000001c 3141 0180 MOVE.W D1,(A0,$0180) == $00dff180
JIT: Comp: 00000020 60f8 BT.B #$fffffff8 == 0000001a (TRUE)
JIT: Unsupported opcode: 0x60f8
JIT: M68k: 0000001a 5281 ADD.L #$00000001,D1
JIT: Mblk: load_memory_long
JIT: Dism: 37ec4c4c: lwz r15,64(r14)
JIT: Mblk: load_memory_long
JIT: Dism: 37ec4c50: lwz r3,68(r14)
JIT: Mblk: rotate_and_copy_bits
JIT: Dism: 37ec4c54: rlwimi r15,r3,16,26,26
JIT: Mblk: load_memory_long
JIT: Dism: 37ec4c58: lwz r3,4(r14)
JIT: Mblk: load_register_long
JIT: Dism: 37ec4c5c: li r4,1
JIT: Mblk: add_with_flags
JIT: Dism: 37ec4c60: addco. r3,r3,r4
JIT: Mblk: copy_nzcv_flags_to_register
JIT: Dism: 37ec4c64: mcrxr cr2
JIT: Dism: 37ec4c68: mfcr r15
JIT: Mblk: rotate_and_copy_bits
JIT: Dism: 37ec4c6c: rlwimi r15,r15,16,26,26
JIT: M68k: 0000001c 3141 0180 MOVE.W D1,(A0,$0180) == $00dff180
JIT: Mblk: load_memory_long
JIT: Dism: 37ec4c70: lwz r4,32(r14)
JIT: Mblk: add_register_imm
JIT: Dism: 37ec4c74: addi r5,r4,384
JIT: Mblk: check_word_register
JIT: Dism: 37ec4c78: extsh. r0,r3
JIT: Mblk: copy_nz_flags_to_register
JIT: Dism: 37ec4c7c: mfcr r6
JIT: Mblk: rotate_and_copy_bits
JIT: Dism: 37ec4c80: rlwimi r15,r6,0,0,2
JIT: Mblk: rotate_and_mask_bits
JIT: Dism: 37ec4c84: rlwinm r15,r15,0,11,8
JIT: Mblk: save_memory_long
JIT: Dism: 37ec4c88: stw r3,4(r14)
JIT: Mblk: save_memory_spec
JIT: Dism: 37ec4c8c: mr r4,r3
JIT: Dism: 37ec4c90: mr r3,r5
JIT: Dism: 37ec4c94: rlwinm r0,r3,18,14,29
JIT: Dism: 37ec4c98: lis r5,12549
JIT: Dism: 37ec4c9c: ori r5,r5,45400
JIT: Dism: 37ec4ca0: lwzx r5,r5,r0
JIT: Dism: 37ec4ca4: lwz r5,16(r5)
JIT: Dism: 37ec4ca8: mtlr r5
JIT: Dism: 37ec4cac: blrl
JIT: M68k: 00000020 60f8 BT.B #$fffffff8 == 0000001a (TRUE)
JIT: Mblk: save_memory_long
JIT: Dism: 37ec4cb0: stw r15,64(r14)
JIT: Mblk: save_memory_word
JIT: Dism: 37ec4cb4: sth r15,68(r14)
JIT: Mblk: load_register_long
JIT: Dism: 37ec4cb8: lis r3,13673
JIT: Dism: 37ec4cbc: ori r3,r3,32640
JIT: Mblk: save_memory_long
JIT: Dism: 37ec4cc0: stw r3,76(r14)
JIT: Mblk: opcode_unsupported
JIT: Dism: 37ec4cc4: li r3,24824
JIT: Dism: 37ec4cc8: lis r4,12549
JIT: Dism: 37ec4ccc: ori r4,r4,45152
JIT: Dism: 37ec4cd0: bl 0x398467f4
JIT: Done compiling
There is no way around the cache flush. The data cache must be written back to the memory and the instruction cache must be invalidated to let it read the new instructions from memory.
This should work unless we are missing some very important detail.
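The write-back-then-invalidate sequence described above can be sketched in C roughly as follows. This is a minimal sketch, not E-UAE's actual code: the function names are hypothetical, and the fixed 32-byte cache line size is an assumption that real code should replace with the value reported by the CPU or OS. On PowerPC it uses the `dcbst`, `sync`, `icbi` and `isync` instructions; on other targets it falls back to GCC's `__builtin___clear_cache`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; must be a power of two. */
#define CACHE_LINE_SIZE 32u

/* Number of cache lines touched by the range [start, start + len). */
static size_t cache_lines_covered(uintptr_t start, size_t len, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    if (len == 0)
        return 0;
    uintptr_t first = start & ~(uintptr_t)(line - 1);
    uintptr_t last = (start + len - 1) & ~(uintptr_t)(line - 1);
    return (size_t)((last - first) / line) + 1;
}

/* Make freshly written code in [start, start + len) visible to
 * instruction fetch. */
static void flush_code_range(void *start, size_t len)
{
    if (len == 0)
        return;
#if defined(__powerpc__) || defined(__ppc__) || defined(__PPC__)
    uintptr_t first = (uintptr_t)start & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end = (uintptr_t)start + len;
    uintptr_t p;

    /* 1. Write the modified data cache lines back to memory. */
    for (p = first; p < end; p += CACHE_LINE_SIZE)
        __asm__ volatile ("dcbst 0,%0" : : "r"(p) : "memory");
    __asm__ volatile ("sync" : : : "memory");

    /* 2. Invalidate the matching instruction cache lines. */
    for (p = first; p < end; p += CACHE_LINE_SIZE)
        __asm__ volatile ("icbi 0,%0" : : "r"(p) : "memory");
    __asm__ volatile ("sync; isync" : : : "memory");
#else
    /* Non-PowerPC fallback: let the compiler emit whatever is needed. */
    __builtin___clear_cache((char *)start, (char *)start + len);
#endif
}
```

A caller would invoke `flush_code_range(code_start, code_length)` once after emitting a compiled block and before jumping into it. If the assumed line size is larger than the real one, some lines in the middle of the range are skipped, which is one reason to query the real value.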
I checked the code several times and am confused as to why this should crash. I found a site that describes the usage of the registers; maybe something is messed up here?
http://library.morphzone.org/An_Introduction_to_MorphOS_PPC_Assembly
From that description MorphOS is also SysV ABI compliant (which is not too surprising). I don't think that anything is wrong with the register layout. Maybe the stackframe is different somehow, but I doubt that.
Finally I found out the MorphOS "bug". I thought about the random crashes, and your hint "stackframe" brought me to the stack itself. So I decreased the cachesize (e.g. 512) in the config and increased the stacksize of the CLI (e.g. stack 1000000), and then the mysterious crashes were gone. Even after the crash of trying to boot the kickstart rom, the demos still work.
So it is indeed just the amount of free stack space. Now we can go further in looking for the "one mysterious bug" which prevents the OS from booting. Nice vacations :)
Sounds weird. In Snoopium and SnoopDos, where small PPC assembly code is used to patch system calls, it works just fine. You could try querying the L2 cache line size or MMU page size to see if it makes any difference. Or just use some fixed value (4k or 8k). OS4 code is using MEMF_HW_ALIGNED, which is aligned to the MMU page size. It shouldn't make a difference but at least you could try.
This stack problem is a different issue. It looks like UAE JIT is using the 68k stack. There are two stacks (stack pointers) in MorphOS: one is the legacy 68k stack pointer found in REG_A7 for 68k code (tc_SPReg, tc_SPLower, tc_SPUpper), and the other is the PPC stack pointer found in r1 and maintained in struct ETask. It is interesting that UAE JIT is using the 68k stack pointer, but that is not necessarily bad. It can only be a problem if EUAE JIT assumes a mixed 68k/PPC stack.
int __stack = 1000000; only sets the PPC stack, because the 68k stack is rarely used and defaults to 2048 bytes. To fix this you should use the StackSwap() call in Exec to set a properly sized 68k stack, or modify EUAE JIT to use the PPC stack. I don't know which method is better. OS4 is using a mixed stack (ppc and 68k together), so maybe that stack layout is assumed in the JIT code and that is why it crashes. But I couldn't find any references to tc_SPReg or tc_SPLower or anything like that... definitely this stack is not working in MorphOS as it is intended to work.
I don't see why the JIT would use the 68k stack on MOS. The code is the same on both MOS and OS4; it depends on the pointer in r1. On OS4, if the app was built for PPC then there is no emulated 68k access to the stack and you can follow the SysV ABI stack frames along the whole stack.
E-UAE is a purely PPC app (in our case), so it won't try to access the 68k stack or any 68k emulation dependency.
Since the JIT code is single threaded and never gets called through the GCC compiled code directly, I can move the saved registers to the global context array structure instead of storing them on the stack. Probably it is even possible to remove the SysV ABI stack frame creation altogether somehow; in this case there won't be any stack usage at all.
Lowering the code cache size to 512 bytes might cause side effects, like frequent flushing of the compiled code or simply skipping compilation altogether. Do not do that; it makes no sense to use less than 1 MB of code cache.
Maybe I should introduce a check for this setting.
Sorry, I just realized that the code cache size is specified in KBytes. So, specifying 512 as cache is 0.5 MB code cache, which should be fine.
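To make the unit explicit, here is a small C sketch of the conversion together with the kind of sanity check mentioned above. The helper names and the 512 KB lower bound are hypothetical; the thread only says that 512 KB "should be fine", not where the real limit lies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lower bound for a useful code cache, in kilobytes. */
#define MIN_CODE_CACHE_KB 512u

/* The config value is given in kilobytes; convert it to bytes. */
static size_t cache_kb_to_bytes(unsigned kb)
{
    assert(kb <= SIZE_MAX / 1024u); /* guard against overflow */
    return (size_t)kb * 1024u;
}

/* Returns 1 if the configured size reaches the assumed minimum. */
static int cache_size_ok(unsigned kb)
{
    return kb >= MIN_CODE_CACHE_KB;
}
```

With these helpers a config value of 512 gives 524288 bytes (0.5 MB) and passes the check, while a value such as 256 would be flagged.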
Wait what? I'm not any of the anonymous', nor do I think I helped with anything. (In case the blog article has been edited by the time of this post and my name removed: my name is/was listed as one of the people who helped with MorphOS support. I am not one of the anonymous people, nor do I have any skills in coding in C, nor do I have MorphOS at this time.)
Sorry about that, somehow I connected your name with Anonymous #2.
The real Anonymous #2, please stand up!
Anon #2 is me, an ex-Kiwi ;)
Gotcha! ;)