higan CPU emulation mode bug? (attn: byuu or any 65816 guru)
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
https://github.com/awjackson/bsnes-clas ... a6100569b8
For byuu: the functions that need changing (based on the datasheet) are op_pei, op_read_ildp_[bw], op_read_ildpy_[bw], op_sta_ildp_[bw], and op_sta_ildpy_[bw].
Actually, I just realized that as a microoptimization you could use readdpn() for the _w versions of all the addressing modes. Those are never going to be called in emulation mode, after all.
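A minimal sketch of the wrapping distinction being discussed, under stated assumptions: `Regs`, `dpAddress` and `dpnAddress` are hypothetical names for illustration, not the actual bsnes signatures. The wrapping form models the 6502-style behaviour in emulation mode when the low byte of D is zero; the non-wrapping form is modelled on what readdpn() appears to do.

```cpp
#include <cstdint>

// Hypothetical register subset; not the real bsnes layout.
struct Regs {
  bool e;      // emulation mode flag
  uint16_t d;  // direct page register
};

// Wrapping form: in emulation mode with DL == 0, the offset wraps within the page.
inline uint16_t dpAddress(const Regs& regs, uint16_t offset) {
  if (regs.e && (regs.d & 0xff) == 0) {
    return static_cast<uint16_t>((regs.d & 0xff00) | (offset & 0xff));
  }
  return static_cast<uint16_t>(regs.d + offset);
}

// Non-wrapping form: always a plain 16-bit add, regardless of mode.
inline uint16_t dpnAddress(const Regs& regs, uint16_t offset) {
  return static_cast<uint16_t>(regs.d + offset);
}
```

With D = $0200 in emulation mode, offset $100 (the second byte of a pointer stored at dp $FF) resolves to $0200 in the wrapping form and $0300 in the non-wrapping form, which is the difference the test is sensitive to.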
Along with fixing the bug, I got rid of all the separate _e versions of opcode handlers. Nearly all of them only differed in that they fix up the high byte of the SP after doing readstackn() or writestackn(), or they force the M and X flags to remain set when popping or otherwise modifying the flags register. The only instruction that's different enough between emulation and native mode to make the unified handler a tiny bit ugly is RTI.
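As a rough illustration of the unified-handler idea, here is a sketch that does the stack access with the full 16-bit S and then pins the high byte in emulation mode. `StackCPU`, `fixStack` and `push16` are made-up names; only the readstackn()/writestackn() naming pattern comes from the post.

```cpp
#include <array>
#include <cstdint>

struct StackCPU {
  bool e = true;        // emulation mode flag
  uint16_t s = 0x01ff;  // stack pointer
  std::array<uint8_t, 0x10000> ram{};

  // "n" variants: move S as a full 16-bit value.
  void writeStackN(uint8_t v) { ram[s--] = v; }
  uint8_t readStackN() { return ram[++s]; }

  // In emulation mode the stack lives in page 1, so force SH back to 0x01.
  void fixStack() {
    if (e) s = static_cast<uint16_t>(0x0100 | (s & 0x00ff));
  }

  // One handler for both modes: a 16-bit push followed by the fix-up.
  void push16(uint16_t v) {
    writeStackN(static_cast<uint8_t>(v >> 8));
    writeStackN(static_cast<uint8_t>(v & 0xff));
    fixStack();
  }
};
```

The only mode-dependent part is the single `if (e)` in the tail, which is why separate _e handlers become unnecessary.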
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
byuu wrote:
> I don't know of a single official game in the entire SNES library that runs in emulation mode, so there's bound to be more issues lurking in there than just this.
Doubt any official software does, but I wouldn't be surprised if the firmware of some copier or whatever does. I mean, copiers on the Mega Drive usually run in Master System mode, so I wouldn't be surprised if copiers on SNES run in emulation mode, especially seeing as you can access all the SNES registers from the first bank anyway (the biggest limit being the ROM size).
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
Is there something to gain from running in Master System mode?
Conversely, there's really no apparent benefit from intentionally running SNES code in emulation mode unless you really need to make your init two bytes smaller.
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
Revenant wrote:
> Is there something to gain from running in Master System mode?
I'm taking a guess that they had more information about the Master System than the Mega Drive available at the time (and possibly the gain from using a single 8-bit ROM, which would have been cheaper). I can't think of any other reason.
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
Revenant wrote:
> there's really no apparent benefit from intentionally running SNES code in emulation mode unless you really need to make your init two bytes smaller.
(suppresses urge to compare Street Fighter II emulated in higan or MAME to the decidedly inferior PC-native port of Street Fighter II to make an "emulation mode" joke)
But thanks for the Super NES port of the test. I wonder whether more Super NES consoles or Apple IIGS computers are still in operation.
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
tepples wrote:
> I wonder whether
This is it. I hereby officially start a "tepples wonders whether" counter.
Some of my projects:
Furry RPG!
Unofficial SNES PowerPak firmware
(See my GitHub profile for more)
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
Ramsis wrote:
> tepples wrote:
> > I wonder whether
> This is it. I hereby officially start a "tepples wonders whether" counter.
Search feature says 3 pages of posts by him with "wonder whether". Let's make it 4!
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
Alright, your fixes are in place for v098r11. Thanks again everyone, especially AWJ.
I also took part of your advice and merged the pe?_(e,n) functions, and simplified the ridiculous paranoia-masking in memory.hpp. There's a lot more cleanups needed on this core, but these alone get us a nice speed boost from 127fps to 131fps. I then gave up a bit of speed to 130fps in order to eliminate all op_*_(e,n) variants and just merge them into single opcodes. The only opcode that became really ugly as a result was op_rti, so I hard-coded a split for the function tail in order to preserve the seamless flow of instructions and the L prefixes, but it keeps us from duplicating the first 80% of the instruction at least.
I added "#define E if(regs.e)" and "#define N if(regs.n)", which I realize is kind of evil, but it's no worse than "#define L lastCycle();" was.
If we could come up with some alternative to remove the need for L, then we could add M/X for if(!regs.p.m) and if(!regs.p.x) and merge every single _b/_w split instruction together, which would probably be a big boost for cache locality (less code.)
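To show how the one-letter macros read inside a merged handler, here is a sketch of a unified PLP. Everything except the macro definitions quoted above is an assumption: `MacroCPU`, the `trace` string, and the stand-in `pull()` exist only for this illustration.

```cpp
#include <cstdint>
#include <string>

struct MacroCPU {
  struct { bool e = true; uint8_t p = 0x34; uint16_t x = 0, y = 0; } regs;
  std::string trace;          // records bus activity, for illustration only
  uint8_t stackValue = 0x00;  // stand-in for the byte pulled from the stack

  void lastCycle() { trace += 'L'; }  // where a real core would poll interrupts
  void io() { trace += 'i'; }
  uint8_t pull() { trace += 'r'; return stackValue; }

  void op_plp();
};

#define L lastCycle();
#define E if(regs.e)

void MacroCPU::op_plp() {
  io();
  io();
L regs.p = pull();     // L marks the final bus cycle of the instruction
E regs.p |= 0x30;      // emulation mode: M and X stay set
  if (regs.p & 0x10) { // X set: index register high bytes are cleared
    regs.x &= 0xff;
    regs.y &= 0xff;
  }
}

#undef L
#undef E
```

The prefixes keep the handler body aligned as a straight list of cycles, at the cost of hiding control flow behind single letters.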
If you weren't aware, Screwtape is now hosting WIP releases publicly via Gitlab, so there's full version history information now. There's a small delay in my WIPs making it there, so you won't see it if you look right now.
https://gitlab.com/higan/higan/tree/master/
> For anyone who cares, here's a quick and dirty SNES version of koitsu's test. As expected(?), it shows BBBB on hardware, but BBAA in bsnes-plus (not sure about current higan).
It shows nothing at all in higan because you aren't initializing the system correctly.
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
Whoops, fixed it. (See what I mean by "quick and dirty"?)
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
byuu wrote:
> Alright, your fixes are in place for v098r11. Thanks again everyone, especially AWJ.
> [...]
> If you weren't aware, Screwtape is now hosting WIP releases publicly via Gitlab, so there's full version history information now.
I was aware of that Gitlab repository, but thanks all the same.
I don't think there's anything you can do about L; it's fundamental to how the 65816 checks interrupts. At least the ugliness would be confined to the tails of the addressing mode methods.
Here's what I had in mind for unifying the _b and _w and reducing code duplication. Define a few helper functions:
Code:
alwaysinline uint16_t regmask(bool eight)
{
  return eight ? 0xff : 0xffff;
}
alwaysinline uint16_t signbit(bool eight)
{
  return eight ? 0x80 : 0x8000;
}
alwaysinline void setreg(uint16_t &reg, uint16_t value, bool eight)
{
  reg = (reg & ~regmask(eight)) | (value & regmask(eight));
}
The idea behind these helper functions is to minimize branches by encouraging the compiler to use conditional moves instead. For even stronger protection against gratuitous branches, put the mask and signbit lookup values into static const arrays inside the respective functions (I have no idea whether this would actually be smaller/faster or not).
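As a usage sketch of the helpers above, here is what a merged byte/word load might look like. The `NZ` struct and the `lda` function are hypothetical; `alwaysinline` is replaced by plain `inline` so the block compiles on its own.

```cpp
#include <cstdint>

inline uint16_t regmask(bool eight) { return eight ? 0xff : 0xffff; }
inline uint16_t signbit(bool eight) { return eight ? 0x80 : 0x8000; }
inline void setreg(uint16_t& reg, uint16_t value, bool eight) {
  reg = static_cast<uint16_t>((reg & ~regmask(eight)) | (value & regmask(eight)));
}

struct NZ { bool n = false; bool z = false; };

// Hypothetical unified LDA: `m` is the accumulator width flag (true = 8-bit).
inline void lda(uint16_t& a, NZ& p, uint16_t data, bool m) {
  setreg(a, data, m);                // 8-bit loads leave the high byte alone
  p.n = (a & signbit(m)) != 0;       // sign bit depends on the width
  p.z = (a & regmask(m)) == 0;       // zero test only covers the active width
}
```

Loading $80 in 8-bit mode into A = $1234 gives A = $1280 with N set and Z clear, without a separate _b handler.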
Another idea I had was making p (the flags) incorporate e, and make the class itself enforce e/m/x consistency. But then you'll have to use getters and setters to access the individual flags and I know you hate that style (I trust you have the good judgement not to consider attaching an operator=() callback pointer to each flag, like you did with the SuperFX registers!) One other advantage to making p an opaque class is that you can experiment with things like byte-packing the less volatile flags, or storing n/z lazily (store the last result that affected them and convert it to a bool on demand) and see if they have any effect on performance.
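A small sketch of what such an opaque flags class could look like, assuming hypothetical names throughout (`StatusFlags`, `setNZ`, and so on). It only covers e/m/x consistency and lazy N/Z. A real version would also need a way to represent N and Z both being set, which a single remembered result cannot express (PLP can load that combination).

```cpp
#include <cstdint>

class StatusFlags {
public:
  bool e() const { return e_; }
  bool m() const { return m_; }
  bool x() const { return x_; }

  // Entering emulation mode forces M and X on.
  void setE(bool v) {
    e_ = v;
    if (e_) { m_ = true; x_ = true; }
  }
  // While in emulation mode, M and X cannot be cleared.
  void setM(bool v) { m_ = v || e_; }
  void setX(bool v) { x_ = v || e_; }

  // Lazy N/Z: remember the last result and its width, decode on demand.
  void setNZ(uint16_t result, bool eight) { last_ = result; eight_ = eight; }
  bool n() const { return (last_ & (eight_ ? 0x80 : 0x8000)) != 0; }
  bool z() const { return (last_ & (eight_ ? 0xff : 0xffff)) == 0; }

private:
  bool e_ = true, m_ = true, x_ = true;
  uint16_t last_ = 1;  // nonzero and positive: N and Z both start clear
  bool eight_ = true;
};
```

Callers never touch the raw booleans, so the storage strategy can change without editing any opcode handler.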
Off topic, by any chance have you read this article? http://blog.codef00.com/2014/12/06/port ... using-c11/ It reminded me a lot of your new integer classes, though the specifics are a bit different (the templates described in the article don't support accessing arbitrary numeric bit ranges, only fields defined and named at compile time).
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
AWJ wrote:
> Another idea I had was making p (the flags) incorporate e
Google Search results for "envmxdizc" show that this idea may be viable.
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
> I don't think there's anything you can do about L; it's fundamental to how the 65816 checks interrupts
The basic idea is obviously that it's a two-stage pipeline. So if we could manipulate the read/write/io functions to have a one-slot delay, and have the "execute new instruction" trigger the delay slot, it would work. But obviously it's not nearly that simple, since by the time we get to the next opcode execute, the last opcode work cycle has executed already.
I don't even want to imagine how we would emulate more than two stages of pipeline accurately. L is a crude hack.
> alwaysinline uint16_t regmask(bool eight) { return eight ? 0xff : 0xffff; }
That's not going to address the difference between A (leaves upper 8-bits alone) and X/Y (upper 8-bits always zero.)
You'd probably want regmaskM / regmaskX or something.
I kind of use what you're saying here though in my other cores, like V30MZ and its 8/16/32-bit modes (32-bit for the long multiply/division.) But it's much easier to merge the byte/word/dword instructions there because there's no lastCycle nonsense to worry about (or at least, we don't bother trying to.)
> you can make the registers plain uint16_t and not compiler-dependent unions
I wanted to build them around Natural<16>, and then a.l == a.byte(0), a.h == a.byte(1), but that's too much extra typing.
> your auto-bitmasking integer classes (which will perform extremely poorly when the range of bits to access isn't a compile-time constant--trust me, don't even try it)
They only experience a very minor slowdown due to poor compiler code generation. The logic to handle dynamic bit-masking would be identical if you write it yourself in your own code (in your case above, you're optimizing because there are only two possibilities, of course.)
> I trust you have the good judgement not to consider attaching an operator=() callback pointer to each flag, like you did with the SuperFX registers!
Say what you will, that avoids the need to have manual checks for r15 (pc) modifications in every single instruction handler.
> One other advantage to making p an opaque class is that you can experiment with things like byte-packing the less volatile flags, or storing n/z lazily
That will definitely speed things up (not by a lot, probably 1-2% of the total emulation time.)
This should be obvious by now, but I am intentionally writing a lot of code in higan sub-optimally. The goal is to make the code easier to read, so I don't employ 100% safe tricks like delayed computations of register flags. Even when they're expensive and rarely used, like V30MZ's ridiculous parity flag.
My goal (I don't always achieve it) is always "the least possible amount of code with the least amount of caches/copies."
I'm all for you applying them to your fork, but it's unfortunate the way they've become so different that there's no hope of you merging upstream changes anymore. You're gonna have to add any of my changes/fixes by hand to your code. That was already the case though since you're still working off a five-year old fork.
> Off topic, by any chance have you read this article?
No, but I'm very familiar with bitfields.
Whereas the order_[ml]sb macros can result in union/structs on bytes that work on 100% of platforms I've tested, bit-packing is vastly more volatile in doing what you want. C doesn't make any guarantees for bit-ordering, when things will actually compact versus pad, etc.
The code gen for them is also worse than my flag classes are. (Yes, I had a dumb issue where I was recomputing all of P everywhere. That's been fixed for a while now.) I'm referring to both when I used to pack the eight bits of P into one uint8_t, and the current method where I split them into eight booleans.
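For comparison with bit-fields, here is a sketch of the "eight booleans plus explicit shifts" approach, which does not depend on how a compiler lays out bit-fields. The `Flags`, `pack` and `unpack` names are made up for illustration; the bit order follows the nvmxdizc layout of the 65816 status register.

```cpp
#include <cstdint>

// One bool per flag, in nvmxdizc order (bit 7 down to bit 0).
struct Flags {
  bool n, v, m, x, d, i, z, c;
};

// Build the byte form only when it is actually needed (PHP, BRK, and so on).
inline uint8_t pack(const Flags& f) {
  return static_cast<uint8_t>(f.n << 7 | f.v << 6 | f.m << 5 | f.x << 4 |
                              f.d << 3 | f.i << 2 | f.z << 1 | f.c << 0);
}

// Split a byte back into individual flags (PLP, RTI).
inline Flags unpack(uint8_t p) {
  return Flags{
    (p & 0x80) != 0, (p & 0x40) != 0, (p & 0x20) != 0, (p & 0x10) != 0,
    (p & 0x08) != 0, (p & 0x04) != 0, (p & 0x02) != 0, (p & 0x01) != 0,
  };
}
```

The shifts spell out the bit positions, so the result is the same on any compiler or byte order.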
> the templates described in the article don't support accessing arbitrary numeric bit ranges, only fields defined and named at compile time
What I really wanted to see was C++ support unified function call syntax, including on native types.
So I could say:
Code:
uint bits(uint16_t&, uint lo, uint hi);
uint16_t x;
x.bits(2, 3);
Because I am never going to be happy with "bits(x, 2, 3);" for many obvious reasons.
If I could get that, then uint8/16/32/64 could continue to be their native types in higan, and I'd only need Natural<T> for non-power-of-two integers, which are a whole lot less common in the codebase.
Unfortunately, some dipshit on the Jacksonville panel torpedoed the idea. It would have been the most revolutionary feature added to C++ since at least 1998 had they done it. We could have started implementing truly encapsulated data structures without resorting to "useless" (std) or "kitchen sink" (nall) style classes.
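For reference, the free-function spelling `bits(x, 2, 3)` that works in current C++ can be sketched with a small proxy object. This is an illustrative sketch only: `BitRange16` is a made-up name, the range is assumed to be inclusive with `lo` first, and this is not how nall's Natural class is implemented.

```cpp
#include <cstdint>

// Hypothetical proxy for the inclusive bit range [lo, hi] of a 16-bit value.
struct BitRange16 {
  uint16_t& source;
  unsigned lo, hi;

  uint16_t mask() const {
    return static_cast<uint16_t>(((1u << (hi - lo + 1)) - 1u) << lo);
  }
  // Reading: extract the field and shift it down to bit 0.
  operator uint16_t() const {
    return static_cast<uint16_t>((source & mask()) >> lo);
  }
  // Writing: replace only the selected bits of the referenced value.
  BitRange16& operator=(uint16_t value) {
    source = static_cast<uint16_t>((source & ~mask()) | ((value << lo) & mask()));
    return *this;
  }
};

inline BitRange16 bits(uint16_t& source, unsigned lo, unsigned hi) {
  return BitRange16{source, lo, hi};
}
```

Both `uint16_t v = bits(x, 2, 3);` and `bits(x, 2, 3) = 0;` work with this, while `x` itself stays a plain native integer.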
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
byuu wrote:
> > alwaysinline uint16_t regmask(bool eight) { return eight ? 0xff : 0xffff; }
> That's not going to address the difference between A (leaves upper 8-bits alone) and X/Y (upper 8-bits always zero.)
As long as you make sure you clear the upper bits of X/Y every time the x flag changes from 0 to 1, that doesn't matter.
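A sketch of that invariant, with hypothetical names (`IndexRegs`, `setXFlag`): if X and Y are truncated once at the moment the x flag becomes set, the same masked write can serve both A and the index registers, because an index register's high byte is already zero whenever it is written in 8-bit mode.

```cpp
#include <cstdint>

struct IndexRegs {
  uint16_t a = 0, x = 0, y = 0;
  bool m = false;   // accumulator width flag (true = 8-bit)
  bool xf = false;  // index width flag (true = 8-bit)

  // Truncate X and Y once, when the flag goes from 0 to 1.
  void setXFlag(bool v) {
    if (v && !xf) { x &= 0xff; y &= 0xff; }
    xf = v;
  }

  // Shared masked write: untouched bits keep their previous value.
  static void setreg(uint16_t& reg, uint16_t value, bool eight) {
    const uint16_t mask = eight ? 0xff : 0xffff;
    reg = static_cast<uint16_t>((reg & ~mask) | (value & mask));
  }

  void lda(uint16_t v) { setreg(a, v, m); }
  void ldx(uint16_t v) { setreg(x, v, xf); }
};
```

After `setXFlag(true)`, an 8-bit `ldx` leaves the high byte at zero, while an 8-bit `lda` still preserves the hidden high byte of A.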
byuu wrote:
> > your auto-bitmasking integer classes (which will perform extremely poorly when the range of bits to access isn't a compile-time constant--trust me, don't even try it)
> They only experience a very minor slowdown due to poor compiler code generation. The logic to handle dynamic bit-masking would be identical if you write it yourself in your own code (in your case above, you're optimizing because there are only two possibilities, of course.)
The reason the slowdown is only "minor" is because the compiler can optimize for only one possibility when it knows what the bit range is at compile time. Remember what I said once about how "inline does more than chopping off call and ret"?
byuu wrote:
> > I trust you have the good judgement not to consider attaching an operator=() callback pointer to each flag, like you did with the SuperFX registers!
> Say what you will, that avoids the need to have manual checks for r15 (pc) modifications in every single instruction handler.
I thought of a way to improve on that, but haven't tested it for performance yet (mainly because I know it will break Revenant's debugger enhancements). My idea is to have each opcode return the index of the register it modified (SuperFX instructions never modify multiple arbitrary registers; if they modify two registers, one of them is always one with no side effects). Then put the r14/r15 check after the opcode dispatch:
Code:
auto changedreg = op_exec();
if(changedreg == 14) { /* handle data caching */ }
else if(changedreg == 15) { /* handle pipeline flushing */ }
(the previous parenthesized snark was targeted at Revenant as much as at byuu)
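Fleshed out into something compilable, the dispatch idea might look like the sketch below. The `GSU` struct, `op_load`, and the two boolean markers are placeholders invented for illustration; the real side effects of writing r14 and r15 are only stubbed.

```cpp
#include <cstdint>

struct GSU {
  uint16_t r[16] = {};
  bool dataCacheUpdated = false;  // stand-in for the r14 side effect
  bool pipelineFlushed = false;   // stand-in for the r15 side effect

  // Hypothetical opcode: writes a register and reports which one changed.
  int op_load(unsigned n, uint16_t value) {
    r[n & 15] = value;
    return static_cast<int>(n & 15);
  }

  // Side effects are handled once, after dispatch, instead of in every handler.
  void step(unsigned n, uint16_t value) {
    const int changedreg = op_load(n, value);
    if (changedreg == 14) { dataCacheUpdated = true; }
    else if (changedreg == 15) { pipelineFlushed = true; }
  }
};
```

Registers stay plain integers here, and the cost is one compare chain per instruction rather than a callback per register write.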
byuu wrote:
> What I really wanted to see was C++ support unified function call syntax, including on native types.
> So I could say:
> Code:
> uint bits(uint16_t&, uint lo, uint hi); uint16_t x; x.bits(2, 3);
> Because I am never going to be happy with "bits(x, 2, 3);" for many obvious reasons.
> [...]
Your Natural class supports writing to arbitrary ranges of bits, not just reading from them. I don't see how you could achieve that with unified function call syntax even if it was in the language.
ETA: Completely offtopic, but while we're communicating on more-or-less civil terms I'm going to take this opportunity to call out something you said a while back about how hypothetical Java-style inner classes could be a magic fast box for C++:
byuu wrote:
> In this case, PPU::vblank knows exactly where the cpu. object is. It doesn't have to do any pointer lookups. When it comes to accessing properties, we can compute it as a single displacement against the object's address, as if the property existed inside the PPU class itself.
Even in Java, instances of inner classes contain hidden pointers to their parent outer-class instance. That's why Java programming guides advise you to make inner classes "static" unless they really need access to non-static members of their parent outer class ("static" in the context of Java inner classes means "omit the implicit pointer-to-parent-instance").
Why can't the compiler just apply a displacement to the address of the inner object to get the address of its parent? Think about what happens if you have multiple instances of an inner class inside the outer class, as in your own hypothetical example:
Code:
class SuperFamicom {
subclass CPU cpu;
subclass PPU {
subclass BG { ... } bg1, bg2, bg3, bg4;
subclass OBJ { ... } obj;
} ppu;
subclass SMP smp;
subclass DSP dsp;
};
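Presumably the point is the one sketched below in standard C++ (the `subclass` keyword above is hypothetical): each BG instance sits at a different offset inside its parent, so code compiled once for all BG instances cannot recover the parent with a single fixed displacement. The struct members and `brightnessSeenBy` are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

struct BG { uint16_t hoffset; uint16_t voffset; };

struct PPU {
  uint8_t displayBrightness;
  BG bg1, bg2, bg3, bg4;
};

// Each BG instance lives at a different offset inside PPU...
constexpr std::size_t bg1Offset = offsetof(PPU, bg1);
constexpr std::size_t bg2Offset = offsetof(PPU, bg2);
static_assert(bg1Offset != bg2Offset, "instances cannot share one displacement");

// ...so code shared by all BG instances needs the parent passed in or stored.
inline uint8_t brightnessSeenBy(const PPU& parent, const BG&) {
  return parent.displayBrightness;
}
```

The alternatives are a hidden parent pointer per instance (what Java does) or a separate copy of the code per instance.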
Last edited by AWJ on Wed May 25, 2016 2:48 pm, edited 1 time in total.
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
I've got no problem reworking my register editor (or any other feature I'm responsible for) to support / allow for a cleaner approach to modifying state. I remember we briefly discussed it a while ago but I haven't really looked into it at all since then.
Re: higan CPU emulation mode bug? (attn: byuu or any 65816 g
> (the previous parenthesized snark was targeted at Revenant as much as at byuu)
You can always write your own emulator from scratch if mine is really so bad.
We really do need someone to write an Snes9X replacement, given said project is dead in all but name. And PCs aren't getting any faster, so bsnes is unlikely to run on cheap portable hardware any time soon.
> Your Natural class supports writing to arbitrary ranges of bits, not just reading from them. I don't see how you could achieve that with unified function call syntax even if it was in the language.
The same way it does it now:
Code:
template<typename T> struct bitrange_t {
  T& source;
  const uint Lo;
  const uint Hi;
  operator T() const;
  auto& operator=(T);
};
bitrange_t<uint16_t> bits(uint16_t& source, uint lo, uint hi);
The key advantage here would be that uint(8,16,32,64) would be native types for when .bits is not being used on them. This would be huge. Things like vector<Natural<8>> are not compatible with a function that takes a vector<uint8_t>.
> hypothetical Java-style inner classes could be a magic fast box for C++:
Speaking for the PPU, I never said it would be faster. I said the code would be nicer to look at. We'd lose a bunch of excess "ppu." prefixes all over the PPU core. Even if it ends up slower than capturing PPU& inside PPU::BG1, etc (and I can't see how); I'd still do it.
As for the general case of a "class SuperFamicom", it would be the same performance as I had now (if things were static), or it would incur the same 10% performance penalty from when I used to have global CPU*, PPU*, etc handles.
The magic speedup I speak of happens from having "CPU cpu;" in the global namespace. Which makes instantiating multiple SNES cores with bsnes impossible, but allows the PPU's reference to cpu.foo to be turned into a static address without the need for displacement.
> while we're communicating on more-or-less civil terms
I never made things personal. I asked you several times not to. I very much didn't want to block you on my own forums, and I gave you more chances than anyone.
I am extremely grateful for your help in addressing legitimate bugs, and always will be no matter what transpires in the future between us.
I wish you wouldn't keep attacking every last detail of my programming style. I hate that, but I can mostly take it. I know I'm weird. What do you expect from someone who self-taught himself everything? At the end of the day, crazy as my software is, it works. Right now, no one can point to a game bug that isn't possible on real hardware in the entire SNES library. Something no one else can claim. And yes, that was thanks to the help of probably two dozen other people, including yourself. I'm not claiming all the credit, but I am the one that took everything and put it together into a real product. I'm saying that the horrible codebase that's so heaped with scorn ... works. So surely it can't be all bad. (And I mean seriously, have you seen other SNES emulator codebases? One of them has a CRC32 table inside the memmap.c file for the SNES CPU address bus handling, and has code like the block below. Another one is pure commentless x86 assembler. Why are you being so critical of my work? >_<)
Code:
//x is a uint16_t type
if(x == 0xffff)
{
  x = 0;
}
else
{
  x = x + 1;
}
Here's the thing I'll admit to: I am not as smart as you! Or as smart as most of bsnes' contributors. The codebase is simple not just because I believe in simplicity, but because anything more and I can't comprehend it. I keep it this simple, this slow, this in my own personal style, because that's what I need to do to keep it all straight in my head. And when you laugh at many of my ideas, you're attacking code that's 12+ years old. I'm still not a perfect programmer, but I like to think I'm better than some of the code in there from 2004. Singling out a line like a XOR swap in a 370,000-line codebase is kind of unfair.
When it comes to stupid design choices: you can achieve the same results by politely pointing out, "have you tried X instead of Y here?", and I may or may not agree. But if I don't, being rude about it is only going to make me less likely to make the changes you want (I'm somewhat petty in that regard), and will just damage our relationship. I know it can be frustrating when someone won't agree with you. We both experience that about each other on our disagreements.
It's when you go around insulting my integrity, competence, and anyone who doesn't take your side that it gets to be a problem. As important as bsnes is to me, I'm not going to put up with ad hominem attacks against my character. If you do that here, then I'll stop reading your messages and stop responding to you for good. I really don't want to do that. I don't want to miss out on bugfixes like from this thread. But if that's what I have to do, then I will.
If you can avoid insulting me as a person, then I'd like us to consider the past water under the bridge. If there's something I can do to help facilitate that, please let me know. I'd like to make an equal effort here.
If you have to attack me, please at least do it in places where it won't get back to me. Admittedly not easy, I seem to unintentionally have more little birds than Lord Varys >_> I've even had to threaten to ban people on my board to try and get them to stop relaying stuff like that to me.
Last edited by Near on Tue Aug 14, 2018 3:07 am, edited 1 time in total.