I'll go into more detail in a later post, but in essence the problem came from very rare situations where the trace recorder would exit a trace while a branch delay instruction was still pending. This caused the fragment generator to inadvertently skip the branch instruction, which produced the odd behaviour I was seeing.
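As a rough illustration of this kind of bug, here is a minimal Python sketch of a trace recorder that must not end a trace between a branch and its delay slot. It is not the emulator's actual code: `Instr`, `record_trace`, `max_len` and `honour_delay_slot` are all hypothetical names invented for this example, and the real recorder will have other exit conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    # On MIPS, the instruction after a branch (the delay slot) always
    # executes, so a branch and its delay slot belong together.
    is_branch: bool = False


def record_trace(instrs, max_len, honour_delay_slot=True):
    """Record roughly max_len instructions (hypothetical recorder).

    Returns (trace, delay_pending). delay_pending is True when the trace
    ended on a branch whose delay slot was never recorded.
    """
    trace = []
    delay_pending = False
    for ins in instrs:
        want_exit = len(trace) >= max_len
        # The fix: never exit while a delay slot is still outstanding.
        blocked = honour_delay_slot and delay_pending
        if want_exit and not blocked:
            break
        trace.append(ins)
        delay_pending = ins.is_branch
    return trace, delay_pending


if __name__ == "__main__":
    prog = [Instr("addiu"), Instr("lw"), Instr("bne", is_branch=True),
            Instr("nop"), Instr("sw")]
    for honour in (False, True):
        trace, pending = record_trace(prog, 3, honour_delay_slot=honour)
        print(honour, [i.name for i in trace], "delay pending:", pending)
```

With `honour_delay_slot=False` the trace stops right after the branch and reports a pending delay slot, which is the situation a later stage can mishandle. With the check enabled, the recorder takes one extra instruction so the branch and its delay slot stay together.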
For reference, here are some updated figures for Super Mario 64 and Mario Kart (the initial results are from a previous post). Overall, the current changes seem to give a speedup of around 20%-25%, which is great for a few days' work. What's even better is that I've still not implemented all the optimisations that I have planned for R7, so hopefully these numbers will look even better soon.
| Scene | R4 Framerate (Hz) | R5 Framerate (Hz) | Current Framerate (Hz) |
|---|---|---|---|
| Mario Head | 3 | 6 | 8 |
| Mario Main Menu | 14 | 25 | 30 |
| Mario Peach Letter | 6-7 | 11 | 13 |
| Mario Flyby (under bridge) | 6 | 10 | 12 |
| Mario In Game | 5-6 | 9 | 11 |
| Mario Kart Nintendo logo | 10 | 23 | 24 |
| Mario Kart Flag | 6 | 11 | 13 |
| Mario Kart Menu | 7 | 11 | 13 |
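
For anyone who wants to check the per-scene gains, here is a small Python sketch that works out the percentage change from R5 to the current build using the figures in the table above. The helper name `speedup_percent` is made up for this example, and because the framerates are whole numbers the percentages are only approximate.

```python
# (scene, R5 framerate in Hz, current framerate in Hz), copied from the table
results = [
    ("Mario Head", 6, 8),
    ("Mario Main Menu", 25, 30),
    ("Mario Peach Letter", 11, 13),
    ("Mario Flyby (under bridge)", 10, 12),
    ("Mario In Game", 9, 11),
    ("Mario Kart Nintendo logo", 23, 24),
    ("Mario Kart Flag", 11, 13),
    ("Mario Kart Menu", 11, 13),
]


def speedup_percent(before, after):
    """Relative framerate gain from `before` to `after`, as a percentage."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for scene, r5, current in results:
        print(f"{scene}: {speedup_percent(r5, current):.0f}%")
```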
(I'll update with results from Zelda shortly - I have to go to a BBQ now!)
-StrmnNrmn