Saturday, July 29, 2006

Fixed!

I'm very pleased to be able to say that I've finally managed to fix the nasty bug I blogged about on Thursday.

I'll go into more details in a later post, but in essence the problem was due to very rare situations where the trace recorder would exit a trace when there was still a branch delay instruction pending. This caused the fragment generator to inadvertently skip the branch instruction, causing the odd behaviour I was seeing.
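
To illustrate the kind of failure described above, here is a simplified Python sketch (not code from Daedalus; the `Op` type, `record_trace` and the length limit are all made up for the example). On MIPS the instruction after a branch, the delay slot, executes before the branch takes effect, so a recorder must not end a trace between a branch and its delay slot.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    is_branch: bool = False


def record_trace(ops, max_len, guard_delay_slot=True):
    """Record up to max_len ops, optionally refusing to stop mid-branch."""
    trace = []
    delay_pending = False  # True when the previous op was a branch
    for op in ops:
        # Hypothetical exit condition: the trace has reached its length limit.
        # With the guard on, we keep going while a delay slot is still pending.
        if len(trace) >= max_len and not (guard_delay_slot and delay_pending):
            break
        trace.append(op)
        delay_pending = op.is_branch
    return trace


def ends_mid_branch(trace):
    """A trace ending on a branch has lost that branch's delay slot."""
    return bool(trace) and trace[-1].is_branch


ops = [Op("ADDIU"), Op("BNE", is_branch=True), Op("NOP"), Op("LW")]
buggy = record_trace(ops, max_len=2, guard_delay_slot=False)
fixed = record_trace(ops, max_len=2)
```

Here `buggy` stops right after the `BNE`, while `fixed` runs one op past the limit so the branch and its delay slot stay together.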

For reference, here are some updated figures for Super Mario 64 and Mario Kart (initial results are from a previous post). Generally the current changes seem to indicate an overall speedup of 20%-25%, which is great for a few days' work. What's even better is that I've still not implemented all the optimisations that I have planned for R7, so hopefully these numbers will look even better soon.

Scene                        R4 Framerate (Hz)   R5 Framerate (Hz)   Current Framerate (Hz)
Mario Head                   3                   6                   8
Mario Main Menu              14                  25                  30
Mario Peach Letter           6-7                 11                  13
Mario Flyby (under bridge)   6                   10                  12
Mario In Game                5-6                 9                   11
Mario Kart Nintendo logo     10                  23                  24
Mario Kart Flag              6                   11                  13
Mario Kart Menu              7                   11                  13


(I'll update with results from Zelda shortly - I have to go to a BBQ now!)

-StrmnNrmn

Thursday, July 27, 2006

Captain Morgan's compatibility results

It looks like Captain Morgan has put a lot of effort into doing compatibility testing with Daedalus R6. Thanks Cap'n! (Thanks for your email too - I promise I'll get back to you just as soon as I sort this issue out!)

[Update]

Don't miss Wally*Won_Kenobie's R6 compatibility list, which is also excellent. Wally has been collecting missing_mux.txt files for me too, for which I am indebted :)

-StrmnNrmn

A great optimisation/bugfix..

..with a catch.

This week I've been posting about various speedups I've been making by implementing various opcodes in the new dynarec engine. Although I have implemented most of the commonly used opcodes now, after my previous post I decided to add some temporary logging to the emulator to see how often the remaining unhandled opcodes were being used. Two that immediately jumped out at me were JAL (Jump And Link, which is used to perform a function call) and JR (Jump Register, which is used to return from a function call.)
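
As an illustration of this kind of temporary logging, here is a minimal Python sketch (not the actual Daedalus code; the `HANDLED` set and the function names are made up, and the decoder only knows the few opcodes needed for the example). It tallies how often each opcode that the dynarec can't handle shows up.

```python
from collections import Counter

HANDLED = {"ADDU", "LW"}  # hypothetical set of opcodes the dynarec already handles


def opcode_name(instr):
    """Decode just enough of a 32-bit MIPS instruction word for this example."""
    primary = (instr >> 26) & 0x3F  # the top 6 bits select the operation
    if primary == 0x03:
        return "JAL"
    if primary == 0x23:
        return "LW"
    if primary == 0x00:  # SPECIAL: the operation is in the low 6 bits
        funct = instr & 0x3F
        return {0x08: "JR", 0x21: "ADDU"}.get(funct, f"SPECIAL_{funct:02X}")
    return f"OP_{primary:02X}"


def count_unhandled(instructions):
    """Tally every executed instruction the dynarec has no handler for."""
    names = map(opcode_name, instructions)
    return Counter(n for n in names if n not in HANDLED)


# jal 0x1000 / jr $ra / addu $v0,$a0,$a1 / lw $v0,0($a0) / jal 0x1000
words = [0x0C000400, 0x03E00008, 0x00851021, 0x8C820000, 0x0C000400]
stats = count_unhandled(words)
```

Calling `stats.most_common()` then lists the unhandled opcodes with the most heavily used first.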

These two instructions are very heavily used, and I was surprised to realise that I'd not implemented them! They're pretty easy to code - in fact, due to the way I construct the instruction traces that are fed into the dynamic recompiler I could effectively ignore the JAL instruction and JR just required a couple of lines of code.
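
For anyone unfamiliar with these two instructions, here is roughly what they do, written as a small Python sketch of an interpreter (this is not the dynarec code, the function names are invented, and registers are treated as 32-bit for simplicity):

```python
RA = 31  # index of the MIPS link register ($ra)


def jal_target(pc, instr):
    """Target of a JAL at address pc: the 26-bit field, shifted left by 2,
    replaces the low 28 bits of the delay slot's address."""
    return ((pc + 4) & 0xF0000000) | ((instr & 0x03FFFFFF) << 2)


def exec_jal(pc, instr, regs):
    """Store the return address in $ra and return the address to jump to."""
    regs[RA] = (pc + 8) & 0xFFFFFFFF  # return to just past the delay slot
    return jal_target(pc, instr)


def exec_jr(instr, regs):
    """Jump to the address held in register rs (bits 25..21)."""
    rs = (instr >> 21) & 0x1F
    return regs[rs]


regs = [0] * 32
pc = 0x80000100
callee = exec_jal(pc, 0x0C000400, regs)  # jal 0x80001000
back = exec_jr(0x03E00008, regs)         # jr $ra
```

In this example `callee` is the function's entry point and `back` is the address just after the JAL's delay slot, which is where the call returns to.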

So far so good. I was expecting a modest speedup - maybe another 2-3% on top of all the previous changes I've made this week. After compiling and running the new code, I was amazed to see an improvement of over 10%! Surprised with the figures I was seeing, I did a full rebuild and checked the results again with several different roms. They all showed the same kind of speedup.

I've been programming (and more importantly debugging) long enough now to know when I should trust my instincts - call it my programming 'Spider-Sense' tingling if you will, but I knew something didn't quite add up :). In situations like this in the past, rather than taking an unexpected speedup for granted I've spent time investigating the root cause to find out exactly what's going on. At the very least I'll simply satisfy my own curiosity, but often I'll find some useful information along the way too (e.g. other related improvements and optimisations).

So I started looking through the code and rerunning a few roms to try and get a handle on what was causing such a significant improvement. After a short while I realised that all the roms were now generating a lot more potential traces for the dynarec engine to consider for recompiling. This confused me even more, because this behaviour should slow the emulator down rather than speed it up. Another puzzling thing was that the only observable behaviour of my changes should be the speed of emulation - but it looked like my change was somehow changing the flow of execution in the rom.

After a bit more head scratching and debugging, I finally realised what had happened. In making my changes, I had inadvertently fixed a bug in the dynarec engine that was causing the recompiled code to jump back out to the interpreter whenever a JAL instruction was encountered! This bug had been in the dynarec engine since the first day or so, but because its only side effect was to slow down the emulator rather than something more obvious (e.g. a crash!) it had remained undetected for a couple of months.

So I had figured out what the reason for the 10% speedup was, and I could finally get to bed safe in the knowledge that I had fixed a nasty, subtle bug along the way. Brilliant!

It was only then that I noticed a couple of new problems that I hadn't seen before: the emulator began hanging in places that had previously been fine, such as Peach's letter at the start of Mario 64. On the occasions the emulator managed to get past that point, I found that Mario wouldn't move or jump (but strangely the c-buttons and pause menu worked fine).

:(

I've spent the last couple of evenings trying to figure out why fixing one bug is causing another. I've finally managed to find a way of reliably reproducing a hang within a few seconds of starting the emulator up. This is important because my best chance of identifying and fixing the problem is to be able to run the PC build of the emulator with my 'fragment simulator' enabled. The simulator is very slow however (about 100x slower than running the emulator normally), which is why it's important to find a way of reliably reproducing the bug very early on in the emulation.

So that's what I've been up to over the past couple of days, and why I haven't been able to reply to people's emails or comments on this blog. Now that I can reproduce the bug in the fragment simulator I'm confident that I can get to the bottom of it. I'll keep you posted with any developments and try to go through emails/comments just as soon as I've cracked it.

-StrmnNrmn

Wednesday, July 26, 2006

Further dynarec optimisation

I've spent the last couple of evenings working on adding support for additional instructions to the dynamic recompiler. With every instruction I add, the generated code becomes a bit more efficient as I can avoid various bookkeeping work (such as flushing all the cached registers out to memory).
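
To give a feel for why that flushing costs something, here is a toy Python model of a register cache (purely illustrative; the `RegisterCache` class and its fields are invented for this example and don't correspond to the real dynarec data structures). Every flush writes each modified register back to memory, so the fewer flushes a fragment needs, the less work the generated code does.

```python
class RegisterCache:
    """Tracks which emulated registers are currently held in host registers."""

    def __init__(self, memory_regs):
        self.memory_regs = memory_regs  # emulated register file in memory
        self.cached = {}                # reg index -> value held in a host register
        self.dirty = set()              # cached regs modified since the last flush
        self.stores = 0                 # number of write-backs, to show the cost

    def read(self, reg):
        if reg not in self.cached:
            self.cached[reg] = self.memory_regs[reg]
        return self.cached[reg]

    def write(self, reg, value):
        self.cached[reg] = value
        self.dirty.add(reg)

    def flush(self):
        """Write back dirty registers, e.g. before calling a generic handler
        for an instruction the recompiler doesn't know how to translate."""
        for reg in sorted(self.dirty):
            self.memory_regs[reg] = self.cached[reg]
            self.stores += 1
        self.cached.clear()
        self.dirty.clear()


regs = [0] * 32
cache = RegisterCache(regs)
cache.write(2, 5)
cache.write(3, cache.read(2) + 1)
cache.flush()
```

After the single flush at the end, both modified registers have been stored back to `regs`, and `cache.stores` shows how many write-backs that took.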

I've added code to handle the following ops:


  • MULT, MULTU (multiply, multiply unsigned)

  • DIV, DIVU (divide, divide unsigned)

  • MFLO, MFHI (move from lo/hi)

  • MTLO, MTHI (move to lo/hi)

  • LB, LBU (load byte, load byte unsigned)

  • LH, LHU (load halfword, load halfword unsigned)



So far I'm seeing around a 5-6% speedup with these changes (on top of the 10-12% speedup I talked about on Sunday). I am generating slightly more code as a result of this work, but given the large savings I made over the weekend this isn't much of an issue.
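
For reference, here is what a few of the ops listed above compute, as a simplified Python sketch (the function names are made up, and values are treated as 32-bit, ignoring the 64-bit sign extension the real CPU applies to its registers):

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a signed integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def mult(rs, rt):
    """MULT: signed 32x32 -> 64-bit product, returned as (hi, lo)."""
    product = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    return product >> 32, product & 0xFFFFFFFF


def multu(rs, rt):
    """MULTU: the same, but both operands are treated as unsigned."""
    product = (rs & 0xFFFFFFFF) * (rt & 0xFFFFFFFF)
    return product >> 32, product & 0xFFFFFFFF


def lb(byte):
    """LB sign-extends the loaded byte to 32 bits."""
    byte &= 0xFF
    return (byte | 0xFFFFFF00) if byte & 0x80 else byte


def lbu(byte):
    """LBU zero-extends the loaded byte."""
    return byte & 0xFF


signed_hi, signed_lo = mult(0xFFFFFFFF, 2)       # -1 * 2
unsigned_hi, unsigned_lo = multu(0xFFFFFFFF, 2)  # 4294967295 * 2
```

The same bit patterns give different HI values depending on whether the operands are treated as signed, which is why the signed and unsigned variants each need their own handling. MFLO/MFHI and MTLO/MTHI simply copy values between the HI/LO pair and the general purpose registers.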

My next job is to look at optimising the remaining load/store instructions - I just have LWU/SB/SH to do (ignoring the 64 bit instructions for now). Once that's done I'm going to have a look at optimising sequences of load/store operations by caching the base address between uses. I think that should give a significant speed up for memory intensive chunks of code.
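
The idea of caching the base address can be sketched like this in Python (a toy model only: the flat `RDRAM` array, the `translate` function and the counter are assumptions for the example, and the real generated code would look nothing like it). When several loads share the same base register, the address translation only needs to be done once.

```python
RDRAM = bytearray(8 * 1024 * 1024)  # hypothetical flat 8 MB RAM for the example
translations = 0                    # counts how often we pay for a translation


def translate(vaddr):
    """Map a KSEG0/KSEG1 virtual address to a physical offset."""
    global translations
    translations += 1
    return vaddr & 0x1FFFFFFF


def load_word(phys):
    """Read a big-endian 32-bit word from physical memory."""
    return int.from_bytes(RDRAM[phys:phys + 4], "big")


def load_words_naive(base, offsets):
    """Translate the full address for every single load."""
    return [load_word(translate(base + off)) for off in offsets]


def load_words_cached(base, offsets):
    """Translate the base once and reuse it for every offset.
    Only valid while base + offset stays within the same memory region."""
    phys_base = translate(base)
    return [load_word(phys_base + off) for off in offsets]


RDRAM[0x104:0x108] = (0xDEADBEEF).to_bytes(4, "big")
base = 0x80000100
naive = load_words_naive(base, [0, 4, 8])
cached = load_words_cached(base, [0, 4, 8])
```

Both versions return the same values, but the cached one pays for one translation instead of three.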

-StrmnNrmn

Sunday, July 23, 2006

A productive weekend

I've had a fairly productive weekend. I've been working on a few improvements on the dynamic recompiler, and I've managed to both decrease the amount of memory it's using (by approximately 10-12%), and increase the emulator's speed (by around 7-10%).

I don't want to get into too many details just yet, as I'm planning on putting together a more detailed post with all the gory details (and a few graphs :) later in the week.

In a related change, I've also managed to identify and fix an issue (read 'bug') with the dynarec that may have been causing stability problems on v1.0 firmware PSPs. This may explain some of the differences in stability users of the different builds have been seeing.

-StrmnNrmn

Wednesday, July 19, 2006

Windows CE device emulator source

For anyone reading this blog with an interest in emulator development, Microsoft have just released their Windows CE device emulator as shared source.

As the post mentions, it includes the source for an ARM -> x86 JIT compiler, which makes for interesting reading (take a peek at armcpu.cpp).

-StrmnNrmn

R6 Released

I've got to keep this short as blogger.com is going down in 5 minutes :(

Change log:

[+] Added over 50 new combiner modes
[+] Added support for c-buttons
[+] Load roms from ms0:\N64 in addition to local roms directory
[!] Fixed backface culling issues
[!] Correctly implemented flipping to avoid flickering with certain roms
[!] Plugged memory leak in texture handling code, fixing various crashes
[!] Fixed issue which caused screenshot function to hang the emulator

You can grab it here.

For R6 I've mostly been focusing on fixing a number of graphical issues (namely adding combiner modes to popular roms). I've also managed to add a couple of nice usability improvements (in particular mapping the c-buttons to the dpad, using the circle button to toggle back to the N64 dpad)*. I've also been able to track down a couple of bugs that affected stability.

I've still not decided what to concentrate my attention on for R7. The main areas are:


  • Speed

  • Compatibility

  • Graphics

  • Usability



These are all quite broad areas, but it would be good to get a feeling for what people are most interested in seeing improved. Any comments would be most appreciated.

-StrmnNrmn

* I should point out that I had dozens of people suggest this to me via email and through comments on this blog, so I can't take any credit for this idea :)

R6 up within the hour..

I had to pull a long shift at work (14 hours!) so I've only just got in. I'm in the process of rebuilding a release build, updating docs, zipping things etc and should have a new build uploaded by 1am (BST). It looks like blogger.com is down at 5pm (PST) so if I don't post an update soon, check the SourceForge site for the update.

-StrmnNrmn

Monday, July 17, 2006

R6 Tomorrow (hopefully!)

I don't mean to tease, but I am hoping to release the next build of Daedalus PSP tomorrow. I've spent the last couple of days polishing a number of graphical issues and I need to set myself a target date otherwise I'll just keep tinkering for ages :) I'll have a longer post tomorrow with details of the changes that have made it into R6.

In the meantime, I've answered a few of the most recent questions on the previous comments page.

-StrmnNrmn

Tuesday, July 11, 2006

Graphical fixes

Here are a few of the significant graphical fixes I've made so far for R6.

Firstly, I managed to fix the horrible flickering that happened when running various roms (Paper Mario was a good example). It turned out that I was making an assumption that roms executed exactly one display list per frame. I assumed that each display list would clear the screen, render everything, and then wait for the screen to flip. As it turns out, some roms execute multiple display lists per frame. In the case of Paper Mario it executes 2 display lists per frame (one which clears the screen, then another which renders everything). By making sure that I only flip after the second display list executes, I avoid the flickering (the actual solution is a little more involved but this is the general idea).
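
Here is a toy Python model of the problem (a sketch of the general idea only, not the real fix; the command strings and the fixed `lists_per_frame` parameter are invented, and a real emulator has to work out at run time when a frame is actually complete):

```python
def frames_shown(display_lists, lists_per_frame):
    """Return what is on screen after each flip.

    display_lists: a sequence of command lists, where each command is
    either "clear" or "draw:<name>".
    lists_per_frame: how many display lists make up one frame.
    """
    shown = []
    back_buffer = []
    for i, dl in enumerate(display_lists, start=1):
        for cmd in dl:
            if cmd == "clear":
                back_buffer = []
            else:
                back_buffer.append(cmd.split(":", 1)[1])
        if i % lists_per_frame == 0:
            shown.append(list(back_buffer))  # flip: present the back buffer
    return shown


# Two frames, each made of a "clear" list followed by a "draw" list.
paper_mario = [["clear"], ["draw:scene"], ["clear"], ["draw:scene"]]
flickering = frames_shown(paper_mario, lists_per_frame=1)
steady = frames_shown(paper_mario, lists_per_frame=2)
```

Flipping after every display list alternates between an empty screen and the rendered scene, which is the flicker. Flipping after every second list only ever shows complete frames.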

The next significant glitch I've fixed was to do with backface culling of triangles. Basically, when I ported the graphics engine over from the PC version, I forgot to implement the two or three lines of code which handle this. It was a very small fix, but it corrects a number of significant graphical issues (notably all the walls getting in the way in Quest 64).
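
For anyone wondering what backface culling involves, here is a generic Python sketch of the usual test (this is the textbook approach rather than the actual Daedalus code, and the winding convention is an assumption chosen for the example):

```python
def signed_area2(v0, v1, v2):
    """Twice the signed area of a screen-space triangle (a 2D cross product)."""
    return (v1[0] - v0[0]) * (v2[1] - v0[1]) - (v1[1] - v0[1]) * (v2[0] - v0[0])


def is_culled(v0, v1, v2, cull_back=True):
    """Assume counter-clockwise (positive area, y pointing up) is front-facing."""
    area = signed_area2(v0, v1, v2)
    if area == 0:
        return True  # degenerate triangle, nothing to draw
    return cull_back and area < 0


front = ((0, 0), (1, 0), (0, 1))  # counter-clockwise
back = ((0, 0), (0, 1), (1, 0))   # same triangle, opposite winding
```

With culling enabled, `back` is skipped and `front` is drawn. Without the test, triangles facing away from the camera get drawn too, which would be consistent with walls getting in the way.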

Finally, I've managed to track down and fix a significant memory leak in the texture handling code. I believe this was causing many of the random crashes that were occurring when leaving the emulator running for several minutes or more (basically through running out of memory). Before applying the fix I found that Super Mario 64 would crash within 4-5 minutes. After applying the fix I've been able to run Mario with no problems for over 30 minutes.

I'll keep you posted as to when you can expect a new release. I'm quite excited about the memory leak fix, so I'd like to get a new release out as soon as I can implement some of the other things I promised for R6.

-StrmnNrmn

PS I know I've been crap at replying to emails :( I'm hoping to get this release out and then I'll spend a few hours sorting out my mailbox and replying to various comments here.

Sunday, July 09, 2006

Congratulations Italy!

Congratulations to Italy on their win tonight!

Just a quick note to apologise for the lack of updates recently. I've been busy with work for the past couple of weeks, but hopefully it should be business as usual now. I'll try to post a more interesting update tomorrow or on Tuesday.

-StrmnNrmn