Monday, June 26, 2006

Deciding what to optimise

Whenever I start to answer questions on the comment pages I always go into too much detail for a quick response and end up deciding to put up a new post instead. I hope this isn't too annoying :)

In response to Plans for R6, xiringu and ukcuf16 had a couple of interesting suggestions for performance improvements.

First up, from xiringu:

instead of working with a 300x200 screen, work with only half height 150x200 and then display an empty line every other line to get the final 300x200.


That's an interesting idea - it's a trick that's been used by demo coders for years to get a few extra fps. I'm not sure this is going to provide all that much of a speedup to Daedalus though :( The reason is that rendering currently contributes only a small amount to the overall cost of each frame, so even if rendering time was totally eliminated, the framerate wouldn't change much. As an example, let's take something like Zelda which currently runs at around 4 fps. At 4fps each frame takes 1000/4 = 250 milliseconds, which is broken down something like this:

CPU emulation: 200 ms
Display list parsing: 40 ms
Rendering: 10 ms
Total: 200 + 40 + 10 = 250 ms (i.e. 1000/250 = 4fps)

Assuming that we could totally eliminate the rendering time, this would now look like:

CPU emulation: 200 ms
Display list parsing: 40 ms
Rendering: 0 ms (no cost)
Total: 200 + 40 = 240 ms (i.e. 1000/240 = 4.17fps)

So the very best we could hope for in this case would be a 0.17fps improvement in the framerate :(
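
For anyone who wants to play with the numbers, the arithmetic above can be sketched in a few lines of Python. The function name `fps_from_costs` and the dictionary layout are made up for illustration, and the millisecond figures are the rough estimates from this post rather than real profiler output:

```python
def fps_from_costs(costs_ms):
    """Frames per second for a frame whose stages cost the given milliseconds."""
    return 1000.0 / sum(costs_ms.values())

# Rough per-frame estimates for Zelda from this post (not measured values).
current = {"cpu": 200, "display_list": 40, "rendering": 10}

# Best case for the half-height trick: assume rendering becomes completely free.
no_render = dict(current, rendering=0)

baseline_fps = fps_from_costs(current)
best_case_fps = fps_from_costs(no_render)
gain = best_case_fps - baseline_fps

print(f"{baseline_fps:.2f} fps -> {best_case_fps:.2f} fps (+{gain:.2f})")
```

Halving the screen height would in practice save less than the full 10 ms, so the real gain would be smaller still.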

ukcuf16 wrote:

Just wanted to ask if there is ever going to be frame skip in later versions :)


What ukcuf16 is suggesting is that the emulator renders one frame, then skips the next. Alternating frames like this should halve the cost of rendering, at the cost of making the framerate a little less smooth.

Again, this is an interesting idea, but I don't really see this having much impact on the framerate as things stand at the moment. Working out the potential speedup is a little more complicated, as we have to take the average time over two frames. The numbers look something like this:

Frame1 CPU emulation: 200 ms
Frame1 Display list parsing: 40 ms
Frame1 Rendering: 10 ms
Frame2 CPU emulation: 200 ms
Frame2 Display list parsing: 0 ms (skipped)
Frame2 Rendering: 0 ms (skipped)
Total: 200 + 40 + 10 + 200 = 450 ms
Average: 450 / 2 = 225 ms (i.e. 1000/225 = 4.44fps)

So even implementing a frame skip mechanism would only give a tiny speedup of around 0.44fps.
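
The same averaging can be written out for any amount of frame skip. This is only a sketch: `average_frame_ms` and its `skip` parameter are hypothetical names, and it assumes (as in the figures above) that a skipped frame still pays the full CPU cost but no display list parsing or rendering cost:

```python
def average_frame_ms(cpu_ms, display_list_ms, render_ms, skip):
    """Average cost per emulated frame when `skip` frames are dropped after each drawn frame.

    CPU emulation is paid on every frame; display list parsing and
    rendering are only paid on the one frame that is actually drawn.
    """
    frames = skip + 1
    total = frames * cpu_ms + display_list_ms + render_ms
    return total / frames

# skip=0 is the current behaviour, skip=1 is the alternating scheme above.
for skip in range(4):
    avg = average_frame_ms(200, 40, 10, skip)
    print(f"skip={skip}: {avg:.1f} ms/frame, {1000 / avg:.2f} fps")
```

With `skip=1` this gives the 225 ms average from the table. However many frames are skipped, the average can never drop below the 200 ms spent on CPU emulation, so 5fps is the ceiling with these numbers.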

To take this example to its ultimate conclusion, let's assume that I could somehow eliminate the entire cost of display list parsing and rendering:

CPU emulation: 200 ms
Display list parsing: 0 ms (no cost)
Rendering: 0 ms (no cost)
Total: 200 ms (i.e. 1000/200 = 5fps)

Even if I could somehow (magically) reduce the cost of display list parsing and rendering to 0 milliseconds, we'd still only see a 1fps speedup. However, if I can halve the cost of CPU emulation (which is much more likely given the speedups already seen with the new dynarec engine) this is what the calculations look like:

CPU emulation: 100 ms (now twice as fast)
Display list parsing: 40 ms
Rendering: 10 ms
Total: 100 + 40 + 10 = 150 ms (i.e. 1000/150 = 6.67fps)

At the moment I feel that there are more gains to come from optimising the CPU emulation, which is why I've been concentrating on this area recently. As the cost of CPU emulation falls relative to rendering then the ideas suggested by xiringu and ukcuf16 will start to become more attractive.
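
To see why the graphics-side ideas get more attractive later on, here is a small sketch that keeps the display list and rendering estimates from this post fixed and shrinks the CPU cost. The CPU figures below 200 ms are purely hypothetical, and `fps` is just an illustrative helper, not anything from the Daedalus source:

```python
DISPLAY_LIST_MS = 40  # rough estimate from this post
RENDER_MS = 10        # rough estimate from this post

def fps(cpu_ms, frameskip=False):
    """Frame rate for a given CPU cost, optionally drawing only every other frame."""
    gfx_ms = DISPLAY_LIST_MS + RENDER_MS
    if frameskip:
        gfx_ms /= 2  # graphics work is only paid on one frame in two
    return 1000.0 / (cpu_ms + gfx_ms)

# Hypothetical CPU costs as the dynarec improves.
for cpu_ms in (200, 100, 50, 25):
    plain = fps(cpu_ms)
    skipped = fps(cpu_ms, frameskip=True)
    print(f"CPU {cpu_ms:3d} ms: {plain:5.2f} fps, {skipped:5.2f} fps with frameskip "
          f"(+{100 * (skipped / plain - 1):.0f}%)")
```

The relative gain from frame skip grows as the CPU cost shrinks, which is the point of tackling the CPU emulation first.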

-StrmnNrmn

30 comments:

Morgan said...

Strmnnrmn we don't get annoyed by these updates actually we look forward to them. It's nice to read something new about the progress, keep it up! I don't know anything about coding so, but yeah what is the next big thing you could do to help the speed and smoothness increase. Lastly in the comments on the last update PSDonkey said he made some changes to the R5 source AGAIN* and uploaded them and emailed it to you, are his changes anything to help out the progress? And if possible could you provide a link to compiled source he made?

Morgan said...

I want it smooth so I wouldn't like it for right now. I just want Stmnnrmn to speed it up in the best ways he knows how, so that it runs great.

Morgan said...

Psmonkey good to see your working on your emulator! Do have a release maybe in the future? If so when maybe? How far along is the emulation comming? Like how fast is it running games?

Laxer3A said...

Hi,

1/ Rendering only the odd or even lines is definitely more difficult than it sounds, considering that you probably translate the N64 graphics calls into native PSP GU calls.

2/ Your benchmarking definitely shows that the dynarec is going to be the key to performance improvement in your emu, and is a MUST as I said before.

Now it starts to be very clear that you need it to be REALLY smart from now on to get good speed improvements.
To get all the potential out of the dynarec you will need to analyze the generated chunk completely, and optimize it globally too (useless moves, better use of the delay slot, maybe avoiding useless pipeline flushes?).

When I see that, I am not sure that you should generate ASM instructions directly into the chunk while decoding the N64 code; rather, rely on an "intermediate" encoding that allows you to build a tree of operations, detect register usage and so on...
And then generate your PSP code at the end, once a chunk has been parsed and optimized.

Anyway, just throwing out ideas, as usual, coz I dont have so much time to look at the source in details.

Regards.

frmariam said...

Well regardless of any other improvement I really hope you can fix the cause of freezing in 2.0 tiff... Nowadays seems that tiff is always a problem...

R4 froze like less than 10 minutes after starting Mario 64 and R5 freezes before I even enter the castle...

I'd love to pass Mario64 all over again in the PSP... And Mario Kart 64...

I can tell this is a really tough job and that you put a lot of hard work into it.

If you ever try to tackle the gameplay time crashes in 2.0 tiff and need anyone to report on it mail it to me at sousa.jff@gmail.com (sorry I bet you are tired of ppl begging to beta but it's just that so many new apps work in GTA and not in tiff... like all GBA emus apart from PSPGBA 1.0 by psp298)

Morgan said...

Hey why don't you downgrade to 1.5 and use devhook? I was a 2.0 user because I loved GTA but then when devhook came out I said I'll test it out. I love 1.5 because I know I get the fastest and best homebrew and I can play any UMD up to 2.5! I'm staying where I'm at forever now and I can't wait till Devhook is up to 2.7 and it looks like that could be very soon!

Ps. Strmnnrmn please gives us a link to the changes donkey made and sent to you on the R5. The changes he said in the last update in the comments! You have a link to a compiled source code?!

Exophase said...

One thing I've been wondering. What's the native N64 frame speed supposed to be for Zelda 64? Is it 16.7ms (60FPS) or 33.3ms (30FPS)? Or something else altogether...

Morgan said...

Strmnnrmn please answer my question!

PSdonkey said...

Yes, Morgan. Norman is very busy right now. For those of you who want the updated changes to the source, here are the links to the 1.0 and 1.5 binaries. I already sent the source to norman so anyone who wants to take a look at the source feel free to either email norman or myself for the updated source code and one of us will send you the source.
Basically this new R5 edition has the new eboot icons and sound from before and also the new menu backgrounds from before as well. The newer code that was implemented is the addition of menu music while you are browsing through the options and rom list. As soon as you choose a rom to load, the music stops and its memory is freed so that the rom can be loaded. You can also use your own custom mp3 music for the menu by putting it inside your Daedalus folder and naming it "menu.mp3". It doesn't really matter how long your music is because once it finishes, the music will repeat itself until you choose a rom to play, then the music will be freed from memory so that the emulator will still have all the memory that it needs. Also, once again, this new R5 update uses the same rom folder as Monkey64. You need to place all your rom files in a folder called "n64" in the root of the memory stick, just like Monkey64 does. So there is no more "Rom" folder in Daedalus. I am working on a few more additions to Daedalus for when R6 comes out. Once again to make things clear, all 100% credit goes to StrmnNrmn for this awesome work of his. Anyone else who has some new ideas for StrmnNrmn's future releases, just drop him an email with the updated source and I am sure he will incorporate it into the new release. Here are the links to the binary files, cut in half so that you can see the whole link. Just copy and paste the top part and the bottom part of the link into your browser.

Link for R5 Changes for 1.00

http://rapidshare.de/files/24398732/
Daedalus_100_R5_New.rar.html

Link for R5 Changes for 1.50

http://rapidshare.de/files/24398995/
Daedalus_150_R5_New.rar.html

Here are the full links if you can view them

http://rapidshare.de/files/24398732/Daedalus_100_R5_New.rar.html

http://rapidshare.de/files/24398995/Daedalus_150_R5_New.rar.html

kaiser said...

Thanks psdonkey!
I love the new changes!

Morgan said...

PSDonkey thanks for providing the links I'm going to test them out now! But yeah thanks for dedicating your time to this emulator. But really if you have a working N64 emulator why not release it? Or couldn't you at least give Norman some serious help on getting his very fast? Either way thanks and sorry for getting a little impatient, StrmnNrmn keep up the great work!

PSdonkey said...

I don't come in here as other alias as someone said nor have I EVER went to dcemu under any other name that was not mine. You can goto psp-hacks.com and see that the person named kersplatty is someone from over there and not some made up name.It's sad that if someone says thank you to someone for doing something, they are automatically branded as someone coming in under different names. This is the reason why I stay away from dcemu.com and just post updates and C++ tutorials and lessons over at psp-hacks.com

Now to clear something up.

I never said I had a full speed working emulator. Yes the fps were pretty high, but all games would crash within a few minutes of playing and I still haven't overcome that problem. I had mentioned that months ago. The few people that do have it tell me that games get worse and worse every time they play it. I know it's some kind of memory problem but I have put the whole project on hold now. Looking at StrmnNrmn's source, I can see that he is getting a lot farther with every build. His source is a lot more organized and solidly built also. His dynamic recompiler is a lot different than what I had expected as well. In his next release, I'm going to take a deeper look into the core of his source and his DynaRec and see if I can improve the overall speed of the emulator by adding my own code or changing his code a bit. Other than that, something that people need to know is that not all emulators are coded the same way. They can actually be very different in code and in approach. You can't just transfer one part of the code of one emulator into another and expect anything to happen.

Unknown said...

Guys, would you please drop this Donkey debate already. Dont let this ruin StrmnNrmn's nice dev blog.
Take this shit elsewhere, to whatever PSP forums you frequent.

Why do you think StrmnNrmn closed the comments on the previous news update?

StrmnNrmn:
"I'm closing comments on this post now, as I'd rather the PSdonkey debate didn't ramble on forever. Cheers!"

Drop it please.
Thank you.

Morgan said...

I'm sorry guys I think I brought the Donkey issue back up with asking for his changes, I just got excited. But yeah let's not talk about it anymore guys it's leading no where.

PSdonkey said...

Yes PSmonkey, you are right. I know my brother has caused a lot of drama over at dcemu.com and I have apologized several times already for his actions. However, I have never ever gone to dcemu.com under any different names as "kaiser" has suggested. I was never anybody named gaydar or coors or bigboy or pspuser or any other name that "kaiser" is making up. In fact I have seen over there that whenever anybody says that I have released something new or done an update to anything, "kaiser" automatically tells everyone that that person is me and then he bans them on the spot. I have also seen "kaiser" ban anyone who even says thank you to me over at that website. "kaiser" obviously abuses his power over there and needs to grow up.
On another note, how can there be 2 "kaiser" here posting with the same name? There can't be. If this person really is the kaiser over at dcemu.com, then he is playing some sick joke/game to everyone. There can only be one kaiser and one password and if this kaiser is the one from dcemu like he says he is then he is just spamming this blog and playing a joke on everyone. Once again, he needs to grow up.
On a final note, this is StrmnNrmn's blog. Please keep all comments on here regarding his work. If anyone is interested in what I am doing that is NOT related to StrmnNrmn's emulator, you can either send me a PM at psp-hacks.com or a personal email if you wish. Quite frankly, it is simply rude to talk about other things on someone's blog that are not related to the blog's subject.

cms108 said...

long time no see.

must be about 5 or 6 years now... which is about the length of time it's been since i did any programming... so bear with me if this makes no sense at all... or has all already been implemented.

is there a reason why dynamic recompilation has to be done dynamically?

i understand that you don't know what execution path the code is going to take, so you need to do most stuff dynamically. but there's bound to be blocks of code that can be optimised outside of the emulation itself.

you could have a kind of static recompilation. a pre-processor that would either create a custom rom for use with the emulator... or maybe generate an intermediate file.. like object code to be used as a kind of patch for the rom. could also perform any texture conversions and any other conversions on other things. that might need... ummm... converting...

anyway... other thing is... doesn't the psp have about 18 processors or something? specifically... doesn't it have 2 x R4000s? could these be used together to either run 2 threads simultaneously or perhaps to speed up stuff like 64 bit operations that aren't supported.

also, doesn't the psp have 2 graphics cores, which from what i can tell, seem analogous to the n64's rdp and rsp. i have no idea how they work... but it'd be good if you could offload some of that rsp/rdp stuff to them... but then i spose it'd be good if we had flying cars too.

it's no wonder i got kicked out of 2 universities....

Morgan said...

Looks as if they have Goldeneye, Mario 64, and Zelda running full speed! Check out pspupdates, Strmnnrmn what is your opinion on this? If this thing is true are you going to continue building?

Morgan said...

StrmnNrmn I would also like to see a ME (media engine) version of the emulator. Most psps are 1.5 now so it would make sense to use the Media Engine. I'm sure it would help performance and hey it may open some doors for you.

Ps. Please answer my question above also when you can, and keep up the great work I know your busy working on it now!

Morgan said...

StrmnNrmn you said by early July you could have something for us, such as R6? Well it's now July 5th so maybe could you post here about progress or whatever you've been working on since you returned from vacation. Also I mentioned using the Media Engine now that most PSP's are 1.5 is this a possibility? Thanks for your hard work thus far and please never give up!

Morgan said...

That is very true I don't ever really remember using the dpad really EVER! But yeah StrmnNrmn are we close to a release because you did say at around this time you "would have something for us". I'm not trying to hurry you along I'm patient and I want the best emulator you can build I just want to know if we are close and if you maybe had a date?

Morgan said...

If you are cheering on StrmnNrmn to build this emulator so you can play pokemon stadium. Are you a complete clown or what, take that stupid game and throw it away. Give me some goldeneye, starfox, diddy kong racing, mario 64! Pokemon you should be ashamed of yourself. Kepp up the great work StrmnNrmn!

Morgan said...

Plus how are you going to play pokemon stadium if you had to have your gameboy cartridge hooked up, come on man think!

Morgan said...

Sound won't probably be for a while because sound would GREATLY slow down the emulator. Sound will probably be implimented in a much later release. Keep up the great work StrmnNrmn!

Morgan said...

Sound won't probably be for a while because sound would GREATLY slow down the emulator. Sound will probably be implimented in a much later release. Keep up the great workk StrmnNrmn!

Morgan said...

Sound won't probably be for a while because sound would GREATLY slow down the emulator. Sound will probably be implimented in a much later release. Keep up the great work StrmnNrmn!

Morgan said...

Sound would greatly slow down the emulator so I don't expect sound for a while.

Morgan said...

Are you talking about the build with the music in the backround (r5 remix) that was just a pic from the emu and the sound from mario in atrac3. Just a eboot file customized, I do that with all my eboots.

Disturbd1 said...

Wow, no updates in like what, 2 weeks already... Where is this guy, lol... Any1 heard from him? Thought there was going to be a release of R6 rly soon.

Morgan said...

StrmnNrmn probably is hard at work, I hope he's working and that's why we haven't heard from him. But yeah StrmnNrmn you could at least give us an update or at least a comment saying something new.

StrmnNrmn said...

Argh - tons and tons of comments here! Apologies for the late replies :( I'm going to tackle a few of these before I head off to bed :)

frmariam: I suspect the memory leak I just fixed may have been responsible for the freezing you talk about. The main difference between R4 and R5 is the dynarec (which uses a lot more memory). I think the reason you're seeing the freezing happening earlier is because R5 is running out of memory more quickly. Hopefully the memory leak fix will sort this out for you.

flyingbuzz: There are actually three areas I'm looking at doing this for: matrix/vector manipulation, texture format conversion, and triangle setup. I think these three things account for most of the time spent processing display lists, so I'll be looking at doing this as I improve the dynarec performance.

sroon: I think a frameskip option is always going to involve a tradeoff between lagginess and overall speed. At least if it's an option people can choose whichever they prefer.

cms108: long time no see indeed mate :)) Seems like eons ago that we were disassembling roms for the first time in Wellington street!
You could probably preprocess some stuff in advance. I think with the dynarec the idea is just to get the cost of recompiling so low that it's insignificant compared to the time spent executing it. It might make more sense to pre-convert textures, but it would take a lot longer to read them from the memory stick, and I'm not sure there's enough free memory to hang on to them there :(
It's definitely worth trying to get both PSP cores fully utilised. One idea I was playing with was to emulate the CPU on the first core, and RSP (including display/audiolist HLE) on the second core. The n64 code should handle all the synchronisation issues itself, which is nice.
Anyway, I'll have to drop you a line soon. Did you know Steve had a sprog?!

morgan: Any links about a fullspeed emu for those roms seem to have gone now :( In any case, competition is always good :)
Yup, I agree taking advantage of the ME needs to be investigated.

flyinghippo/ukcuf16/morgan: A few people have suggested this usage of the O button. I've got this working in my R6 build, but I'd like to make it configurable (i.e. at least whether the default state is for Dpad or Cbuttons)

exoskeletor: Unfortunately I think sound is going to be some time away :( Like morgan mentions (several times :) it will slow things down somewhat, and while the framerate is like it is now, it'll sound all choppy and horrible.

disturbd1: Me too :)