*Phew*, there were times when I thought I would never get to write this down on the blog. On some days I was this >> << close to giving up. Since the world didn't end in 2012, I realized I have to go on; there is no escape.
I am proud to announce that the project has entered the Alpha stage on SourceForge with this update. The details of the changes are:
For the Kickstart boot these fixes were needed:
- Added the compiling stop (jump) flag to instructions which might trigger an interrupt for supervisor mode: OR.W #imm,SR, AND.W #imm,SR, EOR.W #imm,SR
- Temporary registers are flushed at the end of the compiling cycle, but before the code generation.
- Stop the block processing when the special flags are set in the block (an interrupt might be triggered).
- Reload the emulated program counter register when the block finishes with a supported instruction.
- Fixed the wrong function epilog implementation: the return address was read from the wrong position in the stack.
- The old executed instruction pointer and the emulated PC register are synchronized on PC reload.
Other fixes for bugs I discovered while extensively debugging the emulation:
- Fixed the missing release of the compiling buffer when quitting the emulator (memory leak).
- Prevent compiling of tiny blocks (fewer than 4 instructions in a row): the overhead of calling the block is too high.
- Fixed the compiling buffer overflow checking and the misleading help text for the compiling buffer unit size.
- Removed the supported status of the not-yet-implemented EOR.x reg,mem instruction, which had been added accidentally.
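A minimal sketch of the tiny-block rule from the list above, written in Python for brevity and not taken from the emulator sources. The names `MIN_BLOCK_LENGTH` and `plan_blocks` are hypothetical, and the input (one boolean per traced instruction, saying whether the compiler supports it) is an assumption made for illustration only.

```python
MIN_BLOCK_LENGTH = 4  # threshold from the change list: fewer than 4 is "tiny"


def plan_blocks(supported):
    """Group a trace into runs and decide how each run would be executed.

    `supported` holds one boolean per instruction (hypothetical input):
    True if the compiler has an implementation for that instruction.
    Returns a list of (mode, start_index, length) tuples.
    """
    plan = []
    i = 0
    while i < len(supported):
        # Find the end of the run of instructions with the same status.
        j = i
        while j < len(supported) and supported[j] == supported[i]:
            j += 1
        length = j - i
        # Only long enough runs of supported instructions are worth the
        # overhead of calling a compiled block.
        if supported[i] and length >= MIN_BLOCK_LENGTH:
            mode = "compile"
        else:
            mode = "interpret"
        plan.append((mode, i, length))
        i = j
    return plan
```

With this rule a run of three supported instructions is left to the interpretive emulation, while a run of five is compiled.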
So, there is no practical use of the sources yet, but this was the big question: is the compiled code handling able to deal with something as complex as the Kickstart? And for a long time the answer was: no.
Some über-geeky details about the fixes (you like when I'm talking dirty, right?):
As you can see from the changes, there were numerous problems around the code, and all of these fixes were needed for the final result. There were the usual ridiculous issues, like a missing negative sign in the function epilog (line 2308): it tried to read the return address back from the stack, but from the wrong offset.
It essentially means that the execution returned from every compiled block to the parent function instead of the block call loop. The funny familiar feeling that every coder experiences sooner or later: how on EARTH did this thing ever work? ;)
The trickiest part was finding the bug in the interrupt handling: I waded through a few hundred gigabytes of debug dump following the execution. Unfortunately, as I mentioned earlier in the comments on the previous post, I was not able to compare the different execution sessions.
At last, I found out that the OS switches between User and Supervisor mode in the Exec.lib/Supervisor() function using the special OR immediate instruction, by flipping the S flag in the Status Register. This step triggers an exception, which is captured by the OS, and the exact position of the triggering instruction in the ROM is identified.
The bug was that I never considered this OR instruction to be similar to a TRAP or an ILLEGAL instruction, both of which change the Program Counter by raising an exception, which is essentially a jump. Thus the compiling did not stop there to hand the execution back to the interpretive emulation.
As a result, the compiled block contained the instruction following the OR as well, and the exception was triggered separately later, train-wrecking the boot completely. The only way out for the OS was to reboot, and this is how the reboot loop happened.
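The effect of the stop flag can be shown with a small Python model. This is a sketch under assumptions, not the emulator's code: `STOPS_COMPILING` and `split_blocks` are hypothetical names, and instructions are represented simply by their mnemonic strings.

```python
# Instructions after which compiling has to stop, because they may raise an
# exception and thus change the program counter. The mnemonics are the ones
# named in the post; the table itself is a hypothetical stand-in.
STOPS_COMPILING = frozenset({
    "TRAP",
    "ILLEGAL",
    "OR.W #imm,SR",
    "AND.W #imm,SR",
    "EOR.W #imm,SR",
})


def split_blocks(instructions, stops=STOPS_COMPILING):
    """Cut a sequence of instructions into blocks for compiling.

    A block is closed right after any instruction listed in `stops`, so the
    instruction that follows it always starts a new block.
    """
    blocks = []
    current = []
    for instruction in instructions:
        current.append(instruction)
        if instruction in stops:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks
```

Calling `split_blocks` with an empty `stops` set models the old, buggy behaviour: the OR to the Status Register and the instruction after it end up in the same block.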
Promises?
From now on it gets a lot easier to fix up the different instruction implementations and to implement the rest of the missing instructions. As soon as the former is done the OS will be usable, and the latter can be done gradually.
I would like to thank Toni Wilen for his hints regarding the possible ways of tracking down bugs inside UAE. His suggestions gave me ideas which eventually led to the required fixes.
A lot more work remains to be done, but at least I can see the light at the end of the tunnel. (What a cliché, man. Pull yourself together!)
And finally, here is a picture of me, taken by my wife to capture the moment when the freakin’ thing started up for the very first time:
OMG, who is this ork-face?
Congratulations, that was a really good job :]
No wonder that you are so proud of yourself, because you should be :D
Very good job! I hope we don't need to wait a couple more months before we can use it. :-)
@Joeled
.. and you were so pessimistic :)
I can't blame him: he wants everything, but right now. ;)
Sorry kas1e, I accidentally clicked on the delete link for your comment and there was no confirmation. :(
@Almos
It doesn't matter, you know what I think then :) cool stuff
Great work, keep the momentum!
Thanks for the nice words guys!
Good work! I hate it when those bugs end up being so obvious later on! 8-)
But that's how we learn I guess....
WooHoo !!!!!!! GO Rachi GO !!!!!!!
Great news Álmos, I'm really impressed. Well done and congratulations. I think nothing more on this project could be as challenging as the hurdle you've just beaten, so here's to smooth sailing from here on in!!
Hi Almos, do you have any news?
Eager to get a full JIT working on OS4. Again, I would like to know whether the full chipset could be implemented in a similar way as in Petunia.
Would this be hard to implement?
I am making my way through the not yet implemented instructions slowly.
I doubt that could be achieved using Petunia. For the full chipset emulation every memory access has to go through specific functions, which try to interpret the data flow as input-output data for the specific chips. This would require a complete overhaul of the memory access emulation. In Petunia the memory is accessed directly; there is no layer between the emulation and the physical memory.
It is also usually highly important to maintain the instruction timing; many old applications rely on it.
One possible (limited) solution would be some MMU-mapped memory area on the chipset address space, but this would be way too slow as far as I can tell.
This question pops up from time to time, maybe I could add it to the FAQ.
Great! I will try to compile it on MorphOS later! Thank you.
Okay, it compiles, but the Kickstart still reboots with Kick 1.3 (with a yellow screen) and ends in a Guru on 3.x. So far... I will be excited when all the missing instructions are implemented.
Thanks for trying. Please read the FAQ: Kick 1.3 is not supported by the JIT, because the JIT depends on the processor cache (which is not available under 1.3). It should work with the emulation, as far as I can tell, simply because it never turns on the cache. So if it is not working, then maybe something is wrong with the compiling?
Regarding the crash at boot: in the meantime I found a few interesting issues, wait for the next update. Even if not all of the instructions are implemented, the emulation still might work, because it reuses the interpretive emulation for the missing instructions.
This is really interesting, because when I turn off the cache, it works on 1.3. If I turn it on I get the yellow screen and a reboot. Did you try it with 1.3?
I also rebuilt the comp stuff (gencomp) so I think I have all the instructions in my code.
Hm, interesting. I have just got another report about stuff failing with the emulator which does not use cache. I will check this out.