(This is quite technical post, probably only of interest to other C++/PSP developers.)
Since I fixed the issue stopping the
Expansion Pak support from working, I've been testing daedalus with the feature enabled to see if the added pressure on available RAM causes any new issues. Most roms that I've been tested have been running perfectly, but I've been experiencing occasional crashes through running out of memory. I believe this is due to a small leak somewhere, so in an effort to track it down I've been improving the tools I use to track memory usage.
In the improved tracker I override the global new and delete operators, which lets me perform some logging on every allocation/deallocation made during the course of executing the emulator (this isn't quite true, as I don't currently log calls to malloc etc, but it's good enough for my purposes). At the start of each overriden implementation, I keep track of the calling function's return address with the following snippet of code:
u32 ra;
asm volatile
(
"sw $ra, %0\n"
: "+m"(ra) : : "memory"
);
I log this return address along with the allocation size and the returned pointer for every call to new and new[]. For calls to delete and delete[], I just log the address of the memory being freed and the caller's address. This data is all logged to disk across the USB connection using
PSPLink.
The logfiles look a little like this:
Allocating 36 bytes - 09c40620 - RA is 08953a94
Allocating[] 8192 bytes - 09c40a00 - RA is 0895e8bc
Allocating 12 bytes - 09c3fce0 - RA is 08964898
Allocating 12 bytes - 09c40970 - RA is 089648b0
Allocating 12 bytes - 09c409d0 - RA is 089648c8
Allocating 12 bytes - 09c408e0 - RA is 089648e0
Allocating 20 bytes - 091ceb20 - RA is 089645e8
Freeing 09c3fce0 - RA is 08962374
Freeing 09c40970 - RA is 089621fc
In order to analyse the results I've written a PC tool which scans through the logfile one line at a time 'replaying' the allocations and deallocations in the same order that they occured on the PSP. The analyser keeps track of the current state of allocations at any point in time, matching up any calls to delete with the corresponding call to new. This means that at any given time the tool has a complete list of all outstanding allocations.
I can discover any memory leaks by shutting down the emulator and continuing to record the logfile while it frees up all allocated resources. By replaying the logfile the analyser can identify any leaks, as these are (mostly) the only remaining allocations at the end of the logfile. I can then run the corresponding return addresses through psp-addr2line to discover where the leaked memory is being allocated from.
The other cool feature of the tool is that it builds up a graphical representation of the state of memory allocations at any point in time. This is really useful for figuring out where all the available RAM is going.
Here's a picture showing where most of the PSP's memory is being used while emulating Mario 64 (I've added the labels on by hand, the tool doesn't do that :)
Each pixel corresponds to 16 bytes of RAM. The smallest blocks are 64x64x16 bytes = 64KB. 1MB chunks are formed from 4x4 64KB blocks. The black space corresponds to both unallocated memory, and also memory allocated outside of the tracker (e.g. calls to malloc, memory used by PSPLink, the CRT, static data areas etc.) You can see that the "Emulated RAM" accounts for just over 8MB - this is because I enabled the Expansion Pak while testing. Dynarec currently uses about 6MB - Ideally I'd like to reduce this down to around 4MB soon.
You'll notice that almost all the memory Daedalus uses is for these big fixed-size allocations. What little dynamic allocation it does at runtime is limited to:
- Keeping track of hot-trace hit counts in the Dynamo implementation
- Textures and texture caching
As it turns out, the out-of-memory issues I've been having have been due to the texture cache going crazy and chewing up around 3MB of memory (typically it just uses 200-300KB or so). I've not figured out the root cause yet, but the tool has helped point me in the right direction.
All in all I think this is a pretty nifty utility as it stands, but I've been thinking about a few features that would make it even better:
- Log the time and current frame alongside each allocation/deallocation. The analyser can then use this information to see how much 'churn' there is over time. Minimising this should help improve performance and reduce fragmentation.
- Every memory allocation has a small housekeeping overhead (for alignment, keeping track of the allocation size etc). The tool could generate a histogram of allocation sizes to demonstrate how much memory is being wasted through tiny allocations, and give some indication of where pooling or freelists might help.
These are probably features for some point down the line however :)
That pretty much sums up the tool. If this code (either for the tracker or the logfile analyser) would be useful to anyone, let me know and I'll add it to Subversion alongside the rest of the Daedalus code with R10.
StrmnNrmn