..with a catch.
This week I've been posting about various speedups I've been making by implementing various opcodes in the new dynarec engine. Although I have implemented most of the commonly used opcodes now, after my previous post I decided to add some temporary logging to the emulator to see how often the remaining unhandled opcodes were being used. Two that immediately jumped out at me were JAL (Jump And Link, which is used to perform a function call) and JR (Jump Register, which is used to return from a function call.)
These two instructions are very heavily used, and I was surprised to realise that I'd not implemented them! They're pretty easy to code - in fact, due to the way I construct the instruction traces that are fed into the dynamic recompiler I could effectively ignore the JAL instruction and JR just required a couple of lines of code.
So far so good. I was expecting a modest speedup - maybe another 2-3% on top of all the previous changes I've made this week. After compiling and running the new code, I was amazed to see an improvement of over 10%! Surprised with the figures I was seeing, I did a full rebuild and checked the results again with several different roms. They all showed the same kind of speedup.
I've been programming (and more importantly debugging) long enough now when I should trust my instincts - call it my programming 'Spider-Sense' tingling if you will, but I knew something didn't quite add up :). In situations like this in the past, rather than taking an unexpected speedup for granted I've spent time investigating the root cause to find out exactly what's going on. At the very least I'll simply satisfy my own curiosity, but often I'll find some useful information along the way too (e.g. other related improvements and optimisations etc.)
So I started looking through the code and rerunning a few roms to try and get a handle on what was causing such a significant improvement. After a short while I realised that all the roms were now generating a lot more potential traces for the dynarec engine to consider for recompiling. This confused me even more, because this behaviour should slow the emulator down rather than speed it up. Another puzzling thing was that the only observable behaviour of my changes should be the speed of emulation - but it looked like my change was somehow changing the flow of execution in the rom.
After bit more head scratching and debugging, I finally realised what had happened. In making my changes, I had inadvertently fixed a bug in the dynarec engine that was causing the recompiled code to jump back out to the interpreter whenever a JAL instruction was encountered! This bug had been in the dynarec engine since the first day or so, but because its only side effect was to slow down the emulator rather than something more obvious (i.e. a crash!) it had remained undetected for a couple of months.
So I had figured out what the reason for the 10% speedup was, and I could finally get to bed safe in the knowledge that I had fixed a nasty, subtle bug along the way. Brilliant!
It was only then that I noticed a couple of new problems that I hadn't seen before: The emulator began hanging in places that had previously been fine - such as Peach's letter at the start of Mario 64. On the occasions the emulator managed to get past that point, I found out that Mario wouldn't move or jump (but strangely the c-buttons and pause menu worked fine)
I've spent the last couple of evenings trying to figure out why fixing one bug is causing another. I've finally managed to find a way of reliably reproducing a hang within a few seconds of starting the emulator up. This is important because my best chance of identifying and fixing the problem is to be able to run the PC build of the emulator with my 'fragment simulator' enabled. The simulator is very slow however (about 100x slower than running the emulator normally), which is why it's important to find a way of reliably reproducing the bug very early on in the emulation.
So that's what I've been up to over the past couple of days, and why I haven't been able to reply to people's emails or comments on this blog. Now that I can reproduce the bug in the fragment simulator I'm confident that I can get to the bottom of it. I'll keep you posted with any developments and try to go through emails/comments just as soon as I've cracked it.