Sunday, March 18, 2007

Daedalus R10 Optimisation Progress

I didn't mean to leave it quite so long since last weekend's update, but I've been working hard on a number of optimisations for R10. Oddly enough, these are mostly new issues that I've found; most of them weren't in the list of tasks I came up with a couple of weeks ago. I think that shows how much scope there is for optimising Daedalus!

Firstly, I finally managed to get Daedalus compiling with GCC's '-O3' setting. This is GCC's highest standard optimisation level, and it turns on more aggressive optimisations such as additional function inlining. When I've tried to enable this flag in the past I've had numerous strange crashes and odd behaviour, so all releases of Daedalus to date have been compiled with -O1.

I updated my local installation of the PSPSDK last weekend and decided to try the -O3 setting again. I was pleased to find that Daedalus ran without crashing, but there was still some odd behaviour which I eventually tracked down to my use of the famous InvSqrt function. You can read a bit more about my findings on the pspdev forums.
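For reference, here is a minimal sketch of the well-known InvSqrt trick. It is not Daedalus's actual code. The widely circulated version reinterprets the float's bits through a pointer cast, which breaks the C++ aliasing rules and is a common source of odd results at higher optimisation levels; whether that was the cause here is an assumption. This variant copies the bits with memcpy instead, which compilers generally reduce to a plain register move.

```cpp
#include <cstdint>
#include <cstring>

// Approximate 1/sqrt(x) for positive x using the classic bit trick.
// Sketch only: the memcpy-based bit copy is an assumed fix, not the
// author's confirmed change.
inline float InvSqrt(float x)
{
    const float half = 0.5f * x;

    // Copy the float's bit pattern into an integer without a pointer cast.
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);

    // Magic-constant initial guess.
    bits = 0x5f3759df - (bits >> 1);

    float y;
    std::memcpy(&y, &bits, sizeof y);

    // One Newton-Raphson step to refine the guess.
    return y * (1.5f - half * y * y);
}
```

With a single refinement step the result is only approximate, so it suits lighting and normalisation work rather than anything needing exact values.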

Enabling -O3 tends to slightly increase the code size (the EBOOT.PBP has increased from around 850KB to 900KB), but the speedup is quite noticeable - my estimate is that Daedalus runs around 5% faster with -O3 than with -O1.

As a result of the thread I started on the pspdev forums, hlide and Raphael both came up with some great suggestions for how I could optimise my use of the VFPU.

When I originally wrote the VFPU code for TnL and clipping there were still many undocumented/unsupported instructions. A few months down the line, hlide and co have discovered a couple of instructions which are perfect for my needs - namely vuc2i and vc2i. These two instructions take a 32-bit value comprising 4 (un)signed 8-bit chars and unpack them into a vector of 4 32-bit fixed point numbers. It turns out that this is exactly what's needed for converting the N64's packed colour and normal values into a format I can use in the VFPU code.
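To illustrate the kind of unpacking involved, here is a scalar C++ sketch. It is not the VFPU instructions themselves, nor Daedalus's code: the function names, the lowest-byte-first channel order and the float output ranges are all assumptions made for the example.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: unpack four unsigned 8-bit channels (e.g. a packed
// colour) into floats in [0, 1]. Lowest byte first is an assumption.
inline std::array<float, 4> UnpackUnsigned(std::uint32_t packed)
{
    std::array<float, 4> out{};
    for (int i = 0; i < 4; ++i)
    {
        const std::uint8_t c = static_cast<std::uint8_t>(packed >> (8 * i));
        out[i] = c / 255.0f;
    }
    return out;
}

// Hypothetical helper: unpack four signed 8-bit components (e.g. a packed
// normal) into floats in roughly [-1, 1]. Assumes two's complement bytes.
inline std::array<float, 4> UnpackSigned(std::uint32_t packed)
{
    std::array<float, 4> out{};
    for (int i = 0; i < 4; ++i)
    {
        const std::uint8_t raw = static_cast<std::uint8_t>(packed >> (8 * i));
        const std::int8_t c = static_cast<std::int8_t>(raw);
        out[i] = c / 127.0f;  // -128 lands slightly below -1
    }
    return out;
}
```

The appeal of a dedicated instruction is that it replaces this whole loop of shifts, masks and conversions with a single operation on a vector register.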

The various VFPU tweaks I've made have given Daedalus another 5% or so speedup.

The final set of changes I've been working on this week have been to do with how I handle certain blend modes. Some of the N64 blend modes are too complex for the PSP to deal with precisely, so I have a large table of 'override' blend modes which allow me to make as good an approximation of the N64 mode as possible. It turned out that looking up these blend modes was very expensive, so I've rewritten how this is handled to make it more efficient. The end result is another small speedup.
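As a rough illustration of a cheaper lookup, the sketch below keys the overrides by mode value in a hash map, so each lookup takes roughly constant time instead of scanning a large table. This is only one possible approach and not necessarily what Daedalus does; the BlendOverride struct, the BlendOverrideTable class and the 64-bit key are all hypothetical.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for whatever data describes an approximated mode.
struct BlendOverride
{
    int approximationId;
};

// Hypothetical table mapping an N64 blend mode value to its override.
class BlendOverrideTable
{
public:
    void Add(std::uint64_t n64Mode, BlendOverride entry)
    {
        mOverrides[n64Mode] = entry;
    }

    // Returns nullptr when the mode has no override, so the caller can
    // fall back to its default handling.
    const BlendOverride* Find(std::uint64_t n64Mode) const
    {
        const auto it = mOverrides.find(n64Mode);
        return it != mOverrides.end() ? &it->second : nullptr;
    }

private:
    std::unordered_map<std::uint64_t, BlendOverride> mOverrides;
};
```

Since games tend to reuse the same handful of modes many times per frame, remembering the most recent result would be another option worth measuring.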

Overall these three changes give a combined 10-15% speedup on the various games I've tested, although there are roms that lie outside this range (some show an even greater speedup while others are more or less unaffected by the changes).

There's still quite a lot more in the way of optimisations that I want to get in for Daedalus R10 (mostly stuff I mentioned earlier) so hopefully these numbers will improve even further over the next couple of weeks.

-StrmnNrmn

29 comments:

Jinder said...

sorry this is completely random but i really need help with R9

everything works until i get to the splash screen where it says "daedalus psp" then it freezes on the screen after that and it turns my psp off

how can i fix this plz help i havent been able to play an n64 for a while now

Anonymous said...

Great news!

Looking forward to R10!

Did you manage to find out why Mario's head is slower than the rest of the game?

Unknown said...

Woot! First Comment!

Those look like cool optimisations. I can't wait.

Austin said...

This is Great news...wow 10-15%!...Can't wait!!Well Thx for the weekly posts...I check ur blog like 10 times a day on saturday and sunday to stay up w/ the updates!...Keep up the good work and before we all know it Daedalus will be as good as Project64 but on a psp!...simply incredible.

Unknown said...

Glad to see you're still working.. =D

Ryu Icaguri said...

Sounds like some serious improvements are being implemented. Can't wait for another update on this.

skatterfelt said...

I love your optimistic approach. Upon finding countless areas to improve, you don't think of R9 as hopelessly flawed--but rather as hopefully filled with potential.

Or something.

Anyway, the improvements so far sound awesome. Keep us updated, it makes for a very interesting read. However, after a couple more posts like this, I'm going to have to remind you of your promise to release R10 sooner rather than later.

Morgan said...

Looking great StrmnNrmn, I can't wait for R10, your optimisations look very promising. I know you would have liked to update us sooner but as long as you work a lot and get Daedalus better I have no problem with waiting for an update. Can't wait man, R10 should be great, especially with the OFFICIAL frameskip option from you. R10 should be a really great update from R9, I just hope that compatibility will increase in R11.

BrendanL said...

Wow! You must be working really hard on R10! I can't wait!

Unknown said...

awesome keep it up. im waiting your next release

prototype said...

First comment yes! anyways, it sounds like you are working out those little things that make the whole thing better. Good going. Just find some more inefficient functions and we'll be good to go!

Anonymous said...

Awesome, keep up the good work!

Unknown said...

Good work man. Have you seen the UO version of Daedalus? It has a built-in frameskip. Maybe you should look into it?

Unknown said...

good work man. I love this Emulator. ITS ONE OF THE BEST. Have you checked the Unofficial build of Daedalus with the implemented frameskip feature? Maybe you should adapt some of its optimizations into the current build?

Unknown said...

It’s good to see a dev that updates us on his progress, good luck man great emulator.

Unknown said...

Keep the good work Strmn, really, you are a great programmer :D

Cheers from Mexico

Nico said...

There's nothing like arriving to work on a sleepy monday and finding your update to cheer up! ;)

Keep it up!

Trexx said...

Congrats,
Can't wait for this release

BlauerCrystal said...

Great!
Really good work, thanks.
But my Memory Stick is broken unfortunately…:)
Unfortunately I can't help you, as I'm only just learning C++. Until then…

Syne49 said...

Rock on man, it's awesome that you're putting so much work into this. I commend you greatly for your efforts, and will accept anything you manage to accomplish.

reow. :3 said...

Great, can't wait for the release.

Good luck :)

Simon White said...

That's a piece of good news! :)

Keep it up! :)

Jody said...

Great work. I hope I get ungrounded from my PSP before R10 comes out or else I'll go crazy and buy a whole new one just for this emulator. You have earned an award for Best Programmer of Daedalus.

Unknown said...

Can you PLEASE add an option to adjust the control stick's dead zone? I constantly have problems with Mario walking very slowly in a random direction when I'm not touching the stick at all.

Unknown said...

How about the use of gcc 4.1?
psptoolchain support gcc-4.1(but need to edit toolchain.sh).

Simon said...

hey strmn nrmn, some guy posted a version of r9 on the dcemu forums, it's got frame skip. maybe yours should have frame skip?

Anonymous said...

wally: It's because the N64 works like a 3D software program. Large models take FOREVER to render by the engine. Like this, Mario's head is a big 3D model and thus takes longer to render

Karl said...

I agree with the dead zone issue above... As far as I'm concerned it's the single biggest problem with the emulator (in Mario 64 anyway) now that speed isn't so much of an issue.

Unknown said...

It's too bad that you didn't end up bringing out R10 by the end of March... But we can understand that it's difficult to program these things; here's hoping that it comes out soon.

But don't rush it, quality is key.

Peace Stormin' Normin' =)