WedNESday wrote: __forceinline is the best option IMO.
Just to reiterate...
__forceinline is not a standard C++ keyword. I'd say it's never a good idea to use it simply because it may give you trouble with compilers that don't support it -- which will make portability or even public source release somewhat problematic.
Plus inline (which is a standard keyword) does the same job. The only difference is that __forceinline overrides the compiler's cost/benefit analysis, so it won't back off in cases where inlining isn't favorable. You say it's favorable for CPU functions and I don't disagree -- but the truth is you shouldn't substitute your judgement for the compiler's. Inline doesn't always mean faster... and on the off chance you make a function inline where inlining reduces performance, the compiler will correct that (that's its job) -- whereas with __forceinline you end up screwing yourself.
In any situation where it is favorable to have stuff inlined, inline works just as well as __forceinline.
So yeah -- __forceinline should never be the way to go, IMO.
But again -- this is another reason to #define calling conventions, since whoever compiles your source can change the define to inline rather than __forceinline if they choose -- rather than having to go and change every function.
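Something like this is roughly what I mean by #defining it. Treat it as a sketch only: EMU_INLINE and EMU_USE_FORCEINLINE are names I made up for illustration, and addWithCarry is just a hypothetical CPU helper, not anything from your source.

```cpp
#include <cstdint>

// EMU_INLINE and EMU_USE_FORCEINLINE are hypothetical names for this sketch.
// __forceinline is only used when the builder asks for it AND the compiler
// is MSVC; everyone else gets the standard keyword.
#if defined(EMU_USE_FORCEINLINE) && defined(_MSC_VER)
  #define EMU_INLINE __forceinline
#else
  #define EMU_INLINE inline
#endif

// Hypothetical CPU helper: 8-bit add with carry in and carry out.
EMU_INLINE std::uint8_t addWithCarry(std::uint8_t a, std::uint8_t b,
                                     bool carryIn, bool& carryOut) {
    unsigned sum = unsigned(a) + unsigned(b) + (carryIn ? 1u : 0u);
    carryOut = sum > 0xFF;                  // carry out of bit 7
    return static_cast<std::uint8_t>(sum);  // keep the low 8 bits
}
```

Whoever builds the source flips one define (or edits one line) instead of touching every function.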
EDIT (MottZilla replied while I was typing)
What does __forceinline do? It sounds to me like it replaces the function call with inline code? So it sounds to me like the compiler doesn't actually call your function but instead any part of code that uses it actually has it placed right in there? Let me know if my idea is close. :p
Yeah sounds like you have it right. Function inlining makes it so that when you call a function, it doesn't actually jump to that function -- rather, the function gets sort of copy/pasted into the area that calls it.
This is good because there's a little overhead for function calling (variables pushed on stack and whatnot) which is avoided if the function is inlined.
But it can also be bad because it can greatly bloat code size, which can hurt instruction cache usage and make the program run slower.
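To picture the copy/paste, here is a small sketch. The function names are hypothetical, and the second function only approximates by hand what a compiler might produce after inlining the first; real generated code will differ.

```cpp
#include <cstdint>

// Hypothetical helper, small enough that a compiler would likely inline it.
inline std::uint8_t lowNibble(std::uint8_t value) {
    return value & 0x0F;
}

// As written in the source: every iteration calls lowNibble().
unsigned sumNibblesWithCall(const std::uint8_t* data, int count) {
    unsigned total = 0;
    for (int i = 0; i < count; ++i)
        total += lowNibble(data[i]);
    return total;
}

// Hand-written equivalent of the inlined version: the body of lowNibble()
// is pasted at the call site, so no call/return happens in the loop.
unsigned sumNibblesInlined(const std::uint8_t* data, int count) {
    unsigned total = 0;
    for (int i = 0; i < count; ++i)
        total += data[i] & 0x0F;
    return total;
}
```

Both give the same result; the only thing that changes is whether a call happens. If lowNibble() were large and called from many places, every call site would get its own copy of the body, which is where the bloat comes from.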
Anyway, curious what do you guys know about BattleToads?
It's very picky about timing. If your NMI isn't timed just right, or if your sprite 0 hit is a little off, the game can very easily deadlock on level 2. It's also picky about when in the scanline you reset the horizontal scroll and increment the Y scroll, etc. Doing these at the wrong times can cause it to deadlock.
I do seem to have the name table switching by writing to 2006 correct, at least enough so that Super Mario's status bar doesn't flicker. But I'm not clear on how you change the scroll offset.
How it works is that the PPU address set by $2006 is the same address the PPU uses to fetch tiles to render. During rendering, every time the PPU fetches a tile it increments the address so that it points to the next tile to be displayed. I'm not really sure it helps to think of it in terms of scroll offset.
For example... if the game sets the PPU address to $1234 by writing to $2006, this means that the next tile fetched comes from $2234 ($0234 + $2000), with a fine Y scroll of 1 ($1000 >> 12). In effect this translates to:
Y scroll: $89
X scroll: $A0 (to $A7... depending on the fine X scroll set by $2005)
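If it helps, the same breakdown can be sketched in code. This assumes the usual layout of the PPU address during rendering (bits 0-4 coarse X, bits 5-9 coarse Y, bits 10-11 nametable select, bits 12-14 fine Y). ScrollInfo and decodePpuAddress are hypothetical names made up for this example.

```cpp
#include <cstdint>

// Hypothetical struct for this sketch: the PPU address broken into parts.
struct ScrollInfo {
    std::uint16_t tileAddress; // nametable address the next tile comes from
    unsigned nametable;        // which of the four nametables (0-3)
    unsigned fineY;            // pixel row within the tile (0-7)
    unsigned scrollX;          // coarse X * 8; fine X from $2005 is NOT included
    unsigned scrollY;          // coarse Y * 8 + fine Y
};

// Split a PPU address into its scroll components.
// Assumed layout: bits 0-4 coarse X, 5-9 coarse Y, 10-11 nametable, 12-14 fine Y.
ScrollInfo decodePpuAddress(std::uint16_t v) {
    ScrollInfo s;
    s.tileAddress = static_cast<std::uint16_t>(0x2000 | (v & 0x0FFF));
    s.nametable   = (v >> 10) & 0x3;
    s.fineY       = (v >> 12) & 0x7;
    const unsigned coarseX = v & 0x1F;
    const unsigned coarseY = (v >> 5) & 0x1F;
    s.scrollX = coarseX * 8;
    s.scrollY = coarseY * 8 + s.fineY;
    return s;
}
```

Feeding it $1234 gives tile address $2234, fine Y 1, Y scroll $89 and X scroll $A0, matching the numbers above.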