SNES Doom Source Released! Now What?

Discussion of hardware and software development for Super NES and Super Famicom. See the SNESdev wiki for more information.

Moderator: Moderators

Forum rules
  • For making cartridges of your Super NES games, see Reproduction.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: SNES Doom Source Released! Now What?

Post by tepples »

Would any of these popcount algorithms be more efficient than a LUT?
creaothceann
Posts: 611
Joined: Mon Jan 23, 2006 7:47 am
Location: Germany
Contact:

Re: SNES Doom Source Released! Now What?

Post by creaothceann »

Those seems to be loops, LUTs, and some complicated-looking operations with larger registers.

A LUT (that fits into a CPU cache) is usually very fast, at the expense of space. Would the game be space-restrained?
My current setup:
Super Famicom ("2/1/3" SNS-CPU-GPM-02) → SCART → OSSC → StarTech USB3HDCAP → AmaRecTV 3.10
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 »

creaothceann wrote: Sat Apr 16, 2022 1:40 amA LUT (that fits into a CPU cache)
That's actually an issue. Without sketching out the surrounding code, I can't say how fast a LUT would be, because the Super FX has no data cache, so loading a byte takes five cycles no matter what. If I don't have four cycles of useful work to do in between setting R14 and reading the value, I'm wasting time waiting. RAM is worse because it doesn't buffer in parallel, and reading a byte is more inefficient because the 8-bit RISC instruction set prioritizes reading words; if my assumptions are correct, reading a byte from RAM consumes 8 cycles in 21 MHz mode regardless of what the surrounding code looks like.

I'm also pretty much out of registers, so anything I do that requires registers is going to take extra time. Loading an immediate byte is two cycles on Super FX, and an immediate word is three. The more unique values I need, the more extra time it will take (which means a popcount based on magic constants is going to be slow too).
Would the game be space-restrained?
We're seriously considering putting a three-layer compression algorithm in between the most time-critical part of the game and the data it needs. Yes, I'd say the game is space-constrained.

Then again, a 256-byte LUT is not all that large, even if there's one in every bank (to avoid needing to switch banks, which costs cycles and registers). If it's part of the decompressor, I'd say it is what it is.

And there's still the possibility that the unconnected pin 21 of the GSU2 could turn out to be what nocash thought it was, which would permit 4 MB of ROM and ease things up considerably. It wasn't an official configuration, though, which makes me think twice about whether this would constitute cheating in the case of a port of a '90s-vintage game...
M2m
Posts: 24
Joined: Mon Feb 15, 2021 12:53 pm

Re: SNES Doom Source Released! Now What?

Post by M2m »

93143 wrote: Sat Apr 16, 2022 7:19 am And there's still the possibility that the unconnected pin 21 of the GSU2 could turn out to be what nocash thought it was, which would permit 4 MB of ROM and ease things up considerably. It wasn't an official configuration, though, which makes me think twice about whether this would constitute cheating in the case of a port of a '90s-vintage game...
My take on that one. If it would be a feasible option in the 90s, then I wouldn't consider it cheating. So in that sense, if connecting above mentioned pin would add the possibility of 4MB, I wouldn't consider it cheating.

Adding raspberry pi 4 onto a chart to do all the heavy lifting, I'd consider cheating (while technically interesting).
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 »

I'm leaning towards using the spare pin if it's what nocash thinks it is. The memory problem for this game in particular is quite severe, at least if the objective is to not have to make cuts.

But using RAM on the GSU's ROM bus to allow the S-CPU to change out the data would have been far too expensive back then, even discounting the substantial engineering required to make it work. That one's probably a trick for a different game...

Optional MSU1 music would be cool, as long as the game defaults to DSP music. Even apart from the better quality, it would allow the sound effects engine to use the full resources of the S-DSP without turning the music off. And if you try really hard, you can almost pretend it's a Voicer-kun (looping and two-player mode kinda ruin the illusion)...

MSU1 data, though, is hard to justify in this case. A few megabytes could potentially be defended in an emergency, but I'd rather not have to. The S-DD1 is period hardware and might work in a pinch...
M2m
Posts: 24
Joined: Mon Feb 15, 2021 12:53 pm

Re: SNES Doom Source Released! Now What?

Post by M2m »

93143 wrote: Thu Apr 21, 2022 4:48 pm I'm leaning towards using the spare pin if it's what nocash thinks it is. The memory problem for this game in particular is quite severe, at least if the objective is to not have to make cuts.
Well you can try to create a new wad format, to reduce memory footprint.

For example for the THINGS Lump in a wad is 10 bytes long (in DOS Doom standard), but you could adopt an "optimized" format (only 6 bytes):

Code: Select all

Things

Offset 	Size (bits) 	Description 	Format 	Example
0 		16 	x position 	int16_t 	-200
16 		16 	y position 	int16_t 	-128
32 		8 	Thing Type 	uint8_t 	1
40 		4 	Angle facing 	4bit BAM 	0
34 		4 	Flags 		4bit 	7

Flags

bit 	hex 	description 
0 	0x01 	Thing is on skill levels 1 & 2 	
1 	0x02 	Thing is on skill level 3 	
2 	0x04 	Thing is on skill levels 4 & 5 	
3 	0x08 	Thing is deaf 	
4 	0x0010 	Thing is not in single player 	<- drop to fit 1 byte

Thing type:
Doom, Doom II, Final Doom contain 123 different THINGS (https://doom.fandom.com/wiki/Thing_types or https://doomwiki.org/wiki/Thing_types)
Like this you drop only the NOT in single player bit, so maybe you don't support multiplayer, or you just show it also in single player.

There is room to optimize other Lumps of the map format, too
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 »

I could probably take some cues from Randy's port - he managed to jam nearly the whole game into just 2 MB.

If the extra pin enables 4 MB and the texture compression works really well, it might be possible to include the level data in the Super FX ROM as well as the CPU ROM, eliminating the need to transfer geometry data to the Super FX RAM. Space saving on level data would then become more important. If it's only on the CPU side, I'm reasonably confident that everything will fit, because you've got up to 6 MB (or 18 MB with an S-DD1) that doesn't have to contain texture data. The original game was about 10-11 MB depending on version, and supposedly over half of it was texture data. You'd just have to be careful not to blow the ROM limit with unrolled loops and lookup tables (and even that's probably impossible if an S-DD1 can safely share a cart with a GSU)...

The above discussion is predicated on the S-CPU being powerful enough to preprocess level data in a heavily rethought Doom-equivalent engine, and send the result up to GPRAM. if it isn't, then either the largest single level has to fit in GPRAM in its entirety alongside the framebuffers and working data, or all of them have to go in GPROM whether the extra pin is real or not. Or, technically, a small amount of U-ROM could be implemented - 64 KB might be enough, and it wouldn't break the bank like 2 MB would - but it's still a hit to authenticity and feels a lot more like cheating than just connecting an existing unused pin does.
M2m
Posts: 24
Joined: Mon Feb 15, 2021 12:53 pm

Re: SNES Doom Source Released! Now What?

Post by M2m »

The Doom 1 full wad is a little over 11MB. But for each level you only need to load a subset of the 11MB obviously.

That’s also what PC doom does, only load the required subset data into RAM.

Besides texture compression there are certain other optimization potentials on the doom level format:

- Texture references for walls, floors, ceilings are in 8byte (8 character) format. You could replace this with a UINT16 reference, reducing the need by 6bytes for each wall in level. Of course then you have a 64k max textures limit, but I don’t think Doom uses that many. Also this is what they went for in Doom64 for the N64.

- Flags and types entries in Doom are mostly 16bit, but these are unnecessarily large, and 8bit entries could be used. Example the Things type. It’s a 16bit entry originally. But Doom, Doom II, Final Doom use only 123 different THINGS (https://doom.fandom.com/wiki/Thing_types). So you can easily just use an 8bit entry. You have however convert some numbers compared to the original doom format. A 'conversion' of the type number would be fairly simple, however:
Type Number < 100 -> keep
2000 ≤ Type Number ≤ 3000 -> remove 1st two digits and replace with 1 (Type Number = Type Number - 2000 + 100). Example: 2019 (Megaarmor) -> 119
3000 < Type Number -> remove 1st two digits and replace with 2 (Type Number = Type Number - 2800). Example: 3006 (Lost soul) -> 206

- the SEGS entry in a map lump als has huge saving potentials in my opinion. The original format is 12bytes per entry. But I guess you could use an alternative format using 6bytes only (you’d add some performance hits however).

Code: Select all

SEGS entry

Offset	Bytes 	Description   	 	 	 Idea
0	2	Starting vertex number	 	(drop?)
2	2	Ending vertex number	 	(drop?)
4	2	Angle
6	2	Linedef number	 	 	"encode" direction.
8	2	Direction  	 			 1 byte (or drop)
10	2	Offset

Idea: Drop Starting and Ending Vertex Info and get it via the Linedef Number. The direction indicates if Start and End need to be reversed (compared to Linedef).
But you can also Drop Direction and "encode" it in Linedef number. An opposite direction would be a negative linedef number. So -623 would mean line 623 opposite direction.
So you only keep angle, linedef and offset entry which is 6bytes.


Last but not least the blockmap entry in the maps lump can be compressed with the use ZokumBSP for example: https://github.com/zokum-no/zokumbsp

Just some ideas.
Last edited by M2m on Fri Apr 22, 2022 7:20 pm, edited 3 times in total.
ehaliewicz
Posts: 18
Joined: Thu Oct 10, 2013 3:30 pm

Re: SNES Doom Source Released! Now What?

Post by ehaliewicz »

93143 wrote: Thu Mar 31, 2022 3:24 am There's a renderer for Mega Drive called PortalView, that uses "sorted linedefs". I'm imagining something where the linedefs associated with a subsector are ordered clockwise or something, so you can stop searching when you run off the edge of the portal. With subsectors of arbitrary convex geometry, you might want to be able to merge multiple portals opening into the same subsector so you don't have to process that subsector multiple times, but if you could come up with a visible linedef quickly and just search left and right until you get an occluded vertex, it might not be so bad. And a split portal is really two portals anyway, so maybe it's good to handle a split sector twice. It's hard to see how you could easily pick a linedef that can be seen through the portal, though - perhaps the portal data could suggest a likely candidate, resulting in a shorter search...

I'm the author of the megadrive portal engine you mention. I haven't read through the entire discussion, but I figured I could give a quick explanation of the sorted linedef idea. As far as I know, KK/Altair came up with this idea for Dread, a doom "clone" on Amiga.

Essentially, it's a merging of the visible surface determination algorithm used in build, and quake's pre-calculated PVS.

Build uses portals to search through the game world for rendering walls, but it uses concave sectors. This allows them to use far less sectors than a naive portal engine, which reduces clipping and transformation costs, but it also means that objects/walls can potentially occlude other walls in the same sector. It uses a complex test to determine which is in front, and this problem is solvable in 2D but not in 3D, because cyclic overlap cannot happen in 2D. Interestingly, this problem is why Ken Silverman didn't finish a 3D engine to compete with quake after build.

Anyway, the sorted linedef idea is to calculate all potentially visible walls for each sector, ala quake style PVS, and then sort those linedefs such that from any position in the sector, any wall B drawn after wall A cannot occlude wall A. There should always be a valid sorting order. As far as I can tell, this is probably the fastest way to handle visibility in "2.5D" worlds.
M2m
Posts: 24
Joined: Mon Feb 15, 2021 12:53 pm

Re: SNES Doom Source Released! Now What?

Post by M2m »

ehaliewicz wrote: Fri Apr 22, 2022 6:21 pm I'm the author of the megadrive portal engine you mention.
Just came here to say, that your MD Doom "clone" is the most amazing thing, I have seen in 2021/22. Keep it up !
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 »

ehaliewicz wrote: Fri Apr 22, 2022 6:21 pmI'm the author of the megadrive portal engine you mention.
Oh hey. Nice work.
Anyway, the sorted linedef idea is to calculate all potentially visible walls for each sector, ala quake style PVS, and then sort those linedefs such that from any position in the sector, any wall B drawn after wall A cannot occlude wall A. There should always be a valid sorting order. As far as I can tell, this is probably the fastest way to handle visibility in "2.5D" worlds.
Wow, I was way off.

Are you using convex sectors or subsectors? I don't see how a single sorted list could handle a concave sector. The whole problem with concave sectors is that the Z order of potentially mutually-occluding linedefs is different when viewed from different parts of the sector. Isn't it?

Also, does this scheme require more storage than something like the Doom engine? I imagine it needs more than a Build-like engine at least.

It strikes me that this scheme pretty much gets rid of a significant chunk of the geometry preprocessing I was expecting the S-CPU to have to do in order to avoid needing to store level data in Super FX ROM. Frustum culling would still be helpful, I suppose, although it requires a lot of multiplication...

Funny - it had occurred to me that one could store per-subsector (or per-portal) lists of potentially visible subsectors, but I didn't think of taking it to the linedef level and I never considered the possibility that that's what you meant. I'm not familiar with the Quake engine; it seemed a bit too advanced to be relevant to Doom, but clearly that's not quite true...
ehaliewicz
Posts: 18
Joined: Thu Oct 10, 2013 3:30 pm

Re: SNES Doom Source Released! Now What?

Post by ehaliewicz »

93143 wrote: Fri Apr 22, 2022 8:27 pm Wow, I was way off.

Are you using convex sectors or subsectors? I don't see how a single sorted list could handle a concave sector. The whole problem with concave sectors is that the Z order of potentially mutually-occluding linedefs is different when viewed from different parts of the sector. Isn't it?
Yes, this requires you to break down concave sectors to convex. Build gets away with concave sectors because it only ever calculates this order from a single view-point, at runtime.
93143 wrote: Fri Apr 22, 2022 8:27 pm Also, does this scheme require more storage than something like the Doom engine? I imagine it needs more than a Build-like engine at least.
Yet, it takes quite a bit of space, but it can be in ROM since it never needs to be updated.

93143 wrote: Fri Apr 22, 2022 8:27 pm It strikes me that this scheme pretty much gets rid of a significant chunk of the geometry preprocessing I was expecting the S-CPU to have to do in order to avoid needing to store level data in Super FX ROM. Frustum culling would still be helpful, I suppose, although it requires a lot of multiplication...

Funny - it had occurred to me that one could store per-subsector (or per-portal) lists of potentially visible subsectors, but I didn't think of taking it to the linedef level and I never considered the possibility that that's what you meant. I'm not familiar with the Quake engine; it seemed a bit too advanced to be relevant to Doom, but clearly that's not quite true...
Yes, quake stores lists of potentially visible sectors, per-sector, so it's not quite as fine grained. It's also not pre-sorted in the same kind of order (because it's not possible in 3D).
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 »

ehaliewicz wrote: Fri Apr 22, 2022 8:39 pmYes, this requires you to break down concave sectors to convex. Build gets away with concave sectors because it only ever calculates this order from a single view-point, at runtime.
Okay, that makes sense. It also seems to match what I saw in that YouTube video, where the portals were actually shown on screen to demonstrate how the renderer worked...
it takes quite a bit of space, but it can be in ROM since it never needs to be updated.
That might still turn out to be an issue. Doom has a lot of geometry, and according to another poster the level data in the IWAD takes up nearly 4 MB already. Then again, the fact that the existing SNES port fits most of the levels in 2 MB along with the entire rest of the game, without nerfing the geometry like the Jaguar-derived ports, suggests that the PC version might have been a little expansive with its level data format.

Well, I guess there's no way to know without doing it. At this stage (basically I want to do this, but don't have the time), it's probably wise to keep my options open...
ehaliewicz
Posts: 18
Joined: Thu Oct 10, 2013 3:30 pm

Re: SNES Doom Source Released! Now What?

Post by ehaliewicz »

Well just storing potentially visible subsectors will probably help already, and it is very compressible (most sectors will not be visible from any given sector). The only problem is that it wouldn't provide a near to far sort order, you'll have to traverse a bsp tree or portal graph to determine the correct sorting order while checking if the sectors you're visiting are in the pvs.

One potential problem is that doom maps are less tightly constrained than quake maps, which often contain 90 degree angle tunnels and such to cut off visibility for the pvs generator.
93143
Posts: 1717
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 »

Is there a good way to do coarse frustum culling without having to transform every vertex?

I guess if you want very coarse culling you could just check if either vertex of a linedef is in either of the quadrants the view frustum is in. This would only require a single coordinate comparison per vertex, and no multiplication.
Post Reply