Continuum 93, an assembly machine emulator that... does not have a hardware

Gilbert
Posts: 561
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by Gilbert »

EnthusiastGuy wrote: Wed Oct 18, 2023 2:34 pm 1. deA cannot work, since it skips f. efA would work, so would fA or defA (which is quite a weirdly cool name for a register that also makes me think: define A)
2. YZA is correct, but YAB cannot be, only YZAB or ZAB. Or ZA (which is yet another weird name for a register, but since I also have KLM, I won't press the topic further)

Most likely those are typos, but since I noticed two, I said it's safer to make this note. :D
Yeah, they're typos. Thanks for pointing them out.
(And the fact that the main task of my full-time job is proofreading doesn't help. :oops: )

Anyway, this is exactly what I meant by easily making mistakes with 32 symbols (though I also made typos when using 26 symbols :roll: ).

The suggestion of using names such as P01, T01, etc. would help a bit, but I still think the alphabet system is better and more readable, especially since parameters/return values of functions/bios calls/interrupts are supposed to be registers continuously linked together, so for example if a certain function returns a 3-byte value and a 2-byte value in order, it would be easier to read if they are ZAB and CD, but not T32 and P03 (it would be very easy to mistake the second value as P02 or P04 here).

For users who don't care/know how stuff really works internally (like me) you can just say that the 6 wasted bytes are used by some other internal functions of the chip and call it a day.
Gilbert
Posts: 561
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by Gilbert »

EnthusiastGuy wrote: Wed Oct 18, 2023 2:19 pm Hi there
Below you can see a program I wrote the other day with this upgraded architecture.

https://youtu.be/DOjuLj5TM6M
Nice! And the simplicity of the demo made it easy to adapt to any platform that supports trigonometric functions, so I immediately tried porting it to another game engine, but performance is not guaranteed. I think rewriting the algorithm in an ASM-like form would be interesting, and it seems that the performance is quite good on Continuum 93 too.

I know these fantasy systems are all about limitations, but would you mind me making some suggestions here?
I'd like to see the following features to be considered, if they're not already planned:
1. Able to toggle visibility of each of the 8 graphics layers (maybe by using a byte as a bit mask). This would be helpful in situations like using one layer for the UI which can be easily disabled and enabled on the fly (instead of having to clear it and redraw it).
2. Able to set an x offset and a y offset to each graphics layer, so yeah, scrolling, and parallax scrolling, since there are 8 layers.
I think it's worth implementing these two features; otherwise it seems to be a missed opportunity as there are already 8 layers of graphics.
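To make suggestion 1 a bit more concrete, here is a minimal Python sketch of a one-byte visibility mask, one bit per layer. The function names are made up for illustration and are not part of Continuum 93; only the "8 layers, one byte" idea comes from the suggestion above.

```python
LAYER_COUNT = 8  # one bit per graphics layer, so the mask fits in a byte


def set_layer_visible(mask: int, layer: int, visible: bool) -> int:
    """Return a new 8-bit mask with the bit for `layer` set or cleared."""
    if not 0 <= layer < LAYER_COUNT:
        raise ValueError("layer must be in 0..7")
    if visible:
        return (mask | (1 << layer)) & 0xFF
    return mask & ~(1 << layer) & 0xFF


def is_layer_visible(mask: int, layer: int) -> bool:
    """True if the bit for `layer` is set in the mask."""
    return bool((mask >> layer) & 1)


mask = 0xFF                               # all layers visible
mask = set_layer_visible(mask, 7, False)  # hide a hypothetical UI layer 7
```

Hiding and showing the UI then costs a single byte write instead of clearing and redrawing the layer.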
EnthusiastGuy
Posts: 12
Joined: Tue Sep 12, 2023 10:42 am

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by EnthusiastGuy »

Gilbert wrote: Thu Oct 19, 2023 6:47 am Nice! And the simplicity of the demo made it easy to adapt to any platform that supports trigonometric functions, so I immediately tried porting it...
Yeah, I know. I had the same reaction when I first saw it. By coincidence, I had just finished implementing the floating-point registers in Continuum, so I felt an itch to try this out. Of course, I had to implement SIN and COS, but I was going to do that anyway; it took about 30 minutes, and then I started working on this gem.
Gilbert wrote: Thu Oct 19, 2023 6:47 am ...to another game engine,
Well, well, small world. :) I've known AGS since the early 2000s. I actually started a project almost 2 decades ago in your nice editor. I was terrible at graphics, still am, but I went pretty far with it before my mechanical HDD crashed and I lost everything. Nice to meet the author (if I assume correctly)!
Gilbert wrote: Thu Oct 19, 2023 6:47 am ... but performance is not guaranteed. I think rewriting the algorithm in an ASM-like form would be interesting, and it seems that the performance is quite good on Continuum 93 too.
Well, Continuum's CPU speed IS linked to the single-core processing power of the host system. I toyed a lot with limiting it, but some technical penalties held me back. Also, I realized it's actually better to leave it loose and guide the developer to a different programming pattern that does not depend on the CPU speed. That's why I'm also making the clock available through an interrupt. Still a bit of work there, but I expect the "machine" to run at around 20 MHz on a Raspberry Pi Zero 2W, up to 200 MHz on a powerful laptop.

So, the performance you see there is mainly due to it running on my laptop. I'll get around to testing it on lower spec machines eventually.

By the way, your implementation is flipped since you're drawing it the other way. While it's basically the same thing, if you change the minus at line 17 to a plus, your implementation will be aligned with everyone else's. ;)

Also, I see a lot of casting from int to float and float to int. Those might be the culprits behind your performance penalty. Not sure how fast they are. I have yet to revisit your editor properly to figure that out, if I even can.
Gilbert wrote: Thu Oct 19, 2023 6:47 am I know these fantasy systems are all about limitations, but would you mind me making some suggestions here?
Not at all, I actually welcome it along with critique as well.
Gilbert wrote: Thu Oct 19, 2023 6:47 am I'd like to see the following features to be considered, if they're not already planned:
1. Able to toggle visibility of each of the 8 graphics layers (maybe by using a byte as a bit mask). This would be helpful in situations like using one layer for the UI which can be easily disabled and enabled on the fly (instead of having to clear it and redraw it).
Indeed, that also crossed my mind at some point, but at the time I figured I'd fill the respective palette with transparent colors to obtain the same effect. Since then, I gave up on transparency, thus the palettes have 24-bit colors and that idea got lost along the way. I agree it's useful and I'll implement it as a pair of interrupts (one to set, one to read) taking a byte that conveniently has just enough bits for each layer. Thanks for this!
Gilbert wrote: Thu Oct 19, 2023 6:47 am 2. Able to set an x offset and a y offset to each graphics layer, so yeah, scrolling, and parallax scrolling, since there are 8 layers.
I think it's worth implementing these two features; otherwise it seems to be a missed opportunity as there are already 8 layers of graphics.
I did visit this topic at some point. Having an offset on x/y would mean the actual layer is larger and you'd just move the visible area on it, if I understand your thought correctly. Then, the question is, how large should it be? 10% more on each side? 25%? The memory cost of this would increase quickly. That's why I considered that any parallax effect should be built sequentially by also allowing negative x/y's for sprites. The sprites will eventually be able to be drawn outside the screen if necessary and only the visible part of them would show up; everything else will be clipped.
I am currently doing that with text rendering. From that, the background should be built from bits and, depending on the performance requirement, the layer will be scrolled (that is in my backlog) to allow drawing only at the edges that the composite image moves away from.

I hope this is aligned with your proposal, but if you have specific cases you'd like to mention, or if I missed something, just let me know.
EnthusiastGuy
Posts: 12
Joined: Tue Sep 12, 2023 10:42 am

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by EnthusiastGuy »

aa-dav wrote: Wed Oct 18, 2023 9:18 pm ...
Indeed, point well taken. Especially since I will need to be using negative numbers for some interrupts. I did neglect this aspect, but I will settle it well before I reach version 1.
aa-dav
Posts: 201
Joined: Tue Apr 14, 2020 9:45 pm
Location: Russia

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by aa-dav »

(continued)
You can check my implementation of the signed/unsigned arithmetic flags in the sources of SimpX for reference: https://github.com/aa-dav/SimpX/blob/ma ... leton4.cpp
The function void CPU::mathOverflow(bool sub) calculates the overflow flag after an operation ('x' and 'y' are the arguments, 'a' is the answer, and 'sub' is true if the operation was a subtraction).
The 'case OP_CADD' block uses the flags for conditions. The words 'above/below' are for unsigned integers and 'greater/less' are for signed integers.
Remember that my CPU is 16-bit, so things like 'y & 0x8000' are testing upper (sign) bit of y.
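For readers who don't want to dig through the C++ right away, here is a rough Python sketch of the usual two's complement rule for the carry and overflow flags on a 16-bit machine. It is not copied from SimpX; the helper names add16 and sub16 are invented here, and only the names x, y, a and the 0x8000 sign-bit test follow the description above.

```python
MASK16 = 0xFFFF  # keep results within 16 bits
SIGN16 = 0x8000  # upper (sign) bit of a 16-bit value


def add16(x: int, y: int):
    """Return (a, carry, overflow) for the 16-bit addition x + y."""
    a = (x + y) & MASK16
    carry = (x + y) > MASK16  # unsigned result did not fit
    # Signed overflow: x and y have the same sign, but a has a different one.
    overflow = bool(~(x ^ y) & (x ^ a) & SIGN16)
    return a, carry, overflow


def sub16(x: int, y: int):
    """Return (a, borrow, overflow) for the 16-bit subtraction x - y."""
    a = (x - y) & MASK16
    borrow = x < y  # unsigned "below"
    # Signed overflow: x and y have different signs, and a's sign differs from x's.
    overflow = bool((x ^ y) & (x ^ a) & SIGN16)
    return a, borrow, overflow
```

Unsigned comparisons (above/below) only need the carry/borrow flag, while signed ones (greater/less) need the overflow flag together with the sign of the result.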


Bubble Universe is amazing! I want to implement it on my PC too, but with sin/cos tables and fixed point arithmetic. It may run at a decent speed.
EnthusiastGuy wrote: Thu Oct 19, 2023 3:08 pm No equivalent of the ADC/SBC exists since I am still in doubt whether it would be needed. Probably for situations where certain existing assembly algorithms would make use of that. That is still a bit on the debate side, but I'd welcome and appreciate your input on this, if you want.
Add/sub with carry were important on 8/16-bit machines to support addition/subtraction at a higher bit width via chained add/adc.
They still exist on most modern CPUs, so a 32-bit machine can easily do 64-bit addition/subtraction, or a 64-bit machine can do 128-bit.
I don't think a 'retroPC' really needs 64-bit arithmetic, so it's up to you. Maybe limited support of just two opcodes, ADC XXXX, YYYY and SBC XXXX, YYYY, is enough: just registers and 32-bit only (other addressing modes and bit widths are not that important).
Also note that add with carry can be emulated via manual bit inspection in the program (some C programs do that for universal long arithmetic support).
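As a small illustration of the chaining, here is a Python sketch of a 64-bit addition built from two 32-bit additions, where the second one consumes the carry of the first (the role ADC would play). The function names and the lo/hi register split are assumptions for the example, not Continuum 93 or SimpX instructions.

```python
MASK32 = 0xFFFFFFFF


def add32(x: int, y: int, carry_in: int = 0):
    """32-bit add with carry-in; returns (result, carry_out)."""
    total = x + y + carry_in
    return total & MASK32, total >> 32


def add64(a_lo: int, a_hi: int, b_lo: int, b_hi: int):
    """64-bit add from 32-bit halves; returns (lo, hi, carry_out)."""
    lo, carry = add32(a_lo, b_lo)         # plain ADD on the low words
    hi, carry = add32(a_hi, b_hi, carry)  # ADC on the high words
    return lo, hi, carry
```

For example, add64(0xFFFFFFFF, 0, 1, 0) carries out of the low word and gives lo = 0, hi = 1.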
Hey, that was really cool to see.
Thanks!
Not to mention, total respect for actually making a working Wolfenstein prototype. You only need some enemies and a gun and you're there!
It's not my work. :)

A guy with the nickname Total Vacuum made a wolf3d-like demo for SimpX.

If you take a closer look at the assembler source code you will probably be amazed by its cryptic nature.
This is because it was not written manually, but generated by a Forth compiler which Total Vacuum has and adapted for my platform. :D
This is what the real program looks like:

Code:

% UF\SIMPX\core.uf
% UF\SIMPX\stdio.uf

: rshift [ $ ` r0 <= [ sp ]` ` r1 = r0 & $8000` ` r0 = r0 & $FFFE` ` r0 <= r0 >> 1` ` [ sp ] = r0 | r1` $ ] ;

:: [ ] LocX ; :: [ ] LocY ;
:: [ ] Angle ;

:: [ ] color ;
:: [ ] x ; :: [ ] y ;

:: [ ] h ;
:: [ ] j1 ; :: [ ] j2 ;

:: [ ] i0 ; :: [ ] j0 ;
:: [ ] u0 ; :: [ ] v0 ;
:: [ ] da ; :: [ ] db ;
:: [ ] a ;  :: [ ] b ;
:: [ ] u ;  :: [ ] v ;
:: [ ] u1 ; :: [ ] v1 ;
:: [ ] a1 ; :: [ ] b1 ;
:: [ ] i ;  :: [ ] j ;
:: [ ] di ; :: [ ] dj ;
:: [ ] wall ;

:: [ 13 ] Map ; :: [ 16 ] CosTable ;

:: WIDTH 64 ; :: HEIGHT 48 ;
:: BITS 5 ; :: STEP 32 \ 1<<BITS \ ; :: MASK 31 \ STEP-1 \ ;

:: over 1 ? ;
: abs # 15 rshift # + 1 + * ;

: coz abs 8 * 64 - abs 2 * 64 - ;

: cos Angle @          CosTable + @ ;
: sin Angle @ 4 - 15 & CosTable + @ ;

:: span
   color ! x !
   HEIGHT # # # h ! [ # y @ * b1 @ < { h ! 0 } ]
   h @ - 1 rshift j1 !
   j1 @ h @ + j2 !
   j1 @ [ x @ over 0 putpixel ]
   h @ [ x @ over j1 @ + color @ putpixel ]
   j2 @ - [ x @ over j2 @ + 0x2222 putpixel ]
;

: scene
   LocX @ # BITS rshift i0 ! MASK & u0 !
   LocY @ # BITS rshift j0 ! MASK & v0 !

   sin # 3 rshift da ! cos 1 rshift + ` [ sp ] <= [ sp ] >> 13` \ 8 * \ b !
   cos # 3 rshift db ! sin 1 rshift - ` [ sp ] <= [ sp ] >> 13` \ 8 * \ a !

   WIDTH [
      a @ 0 < { 0xFFFF u0 @ 0 a @ - ~ 1 STEP u0 @ - a @ } a1 ! u ! di !
      b @ 0 < { 0xFFFF v0 @ 0 b @ - ~ 1 STEP v0 @ - b @ } b1 ! v ! dj !
      u @ # STEP - x ! b1 @ # ` [ sp ] <= [ sp ] >> 11` \ 32 * \ \ <<BITS \ u1 ! * u !
      v @ # STEP - y ! a1 @ # ` [ sp ] <= [ sp ] >> 11` \ 32 * \ \ <<BITS \ v1 ! * v !
      i0 @ i !
      j0 @ j !

      1 (
         u @ v @ < # {
            x @ STEP + x !
            v @ u @ - v !
            u1 @ u !
            i @ di @ + i !
         ~
            y @ STEP + y !
            u @ v @ - u !
            v1 @ v !
            j @ dj @ + j !
         }
         wall !
      i @ Map + @ j @ rshift 1 & )

      # wall @ { a1 @ b1 ! x @ y ! } i @ j @ + 3 & { 0x1111 ~ 0x4444 } span

      a @ da @ + a !
      b @ db @ - b !
   ]
;

:: run
   LocX @ cos 4 rshift + # x ! BITS rshift i !
   LocY @ sin 4 rshift + # y ! BITS rshift j !
   i @ Map + @ j @ rshift 1 & { x @ LocX ! y @ LocY ! scene }
;

: turn Angle @ + 15 & Angle ! scene ;

:: main
   0b000000000000000
   0b011111111111110
   0b010101010101010
   0b011111111111110
   0b011110010011110
   0b011110111011110
   0b011111101111110
   0b011110111011110
   0b011110010011110
   0b011111111111110
   0b010101010101010
   0b011111111111110
   0b000000000000000
   13 [ $ over Map + ! ]
   16 [ # # coz $ CosTable + ! ]
   4 Angle !
   40 # LocX ! LocY !
   scene
   1 (
      key # #
      'w' = { run }
      'a' = { 0xFFFF turn }
      'd' = { 1 turn }
   1 )
;

main
`@end`
IMHO he is proud of how easy it is to port his minimalistic Forth compiler to different platforms, and he does it for fun. Maybe he'll be interested in your Virtual PC too...
Last edited by aa-dav on Fri Oct 20, 2023 5:45 am, edited 1 time in total.
Gilbert
Posts: 561
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by Gilbert »

EnthusiastGuy wrote: Thu Oct 19, 2023 2:56 pm Well, well, small world. :) I've known AGS since the early 2000s. I actually started a project almost 2 decades ago in your nice editor. I was terrible at graphics, still am, but I went pretty far with it before my mechanical HDD crashed and I lost everything. Nice to meet the author (if I assume correctly)!
I'm not the author of AGS but I've been lurking there like forever and have never released a game yet.
Indeed, that also crossed my mind at some point, but at the time I figured I'd fill the respective palette with transparent colors to obtain the same effect. Since then, I gave up on transparency, thus the palettes have 24-bit colors and that idea got lost along the way. I agree it's useful and I'll implement it as a pair of interrupts (one to set, one to read) taking a byte that conveniently has just enough bits for each layer. Thanks for this!
This would be great.
I did visit this topic at some point. Having an offset on x/y would mean the actual layer is larger and you'd just move the visible area on it, if I understand your thought correctly. Then, the question is, how large should it be? 10% more on each side? 25%? The memory cost of this would increase quickly. That's why I considered that any parallax effect should be built sequentially by also allowing negative x/y's for sprites. The sprites will eventually be able to be drawn outside the screen if necessary and only the visible part of them would show up; everything else will be clipped.
I am currently doing that with text rendering. From that, the background should be built from bits and, depending on the performance requirement, the layer will be scrolled (that is in my backlog) to allow drawing only at the edges that the composite image moves away from.
Actually in my mind, the layers don't need to be larger than the whole screen, otherwise they eat up memory fast.
Just keep them the same as the screen resolution as it is now.
My idea is that one of the following methods may be adopted:
1. The layers just wrap around, so if you have, say, scrolled one pixel in the horizontal direction, the coder needs to redraw a vertical line's worth of graphics. This is a bit difficult to use since, as on real consoles, it's usually more effective to update a whole column or row of tiles (say 8 or 16 pixels wide) than a thin line. But this would be handy if, say, you have a looping distant scenery layer for parallax scrolling; then you can just use one single graphics layer for that scenery and nothing needs to be updated. Just scroll it.
2. The layers do not wrap around, so when you, say, move a layer halfway to the right, it only occupies the right half of the screen (this also means that negative offsets have to be supported). To facilitate a horizontally scrolling screen, just use two layers arranged side by side horizontally (and similarly for vertical ones), and if the screen needs to be scrolled in both x and y directions, you need to use 4 layers. This wastes layers quickly but I'll accept it as a sacrifice to do stuff with intentionally limited hardware. This is inspired by how the Famicom/NES has only enough internal memory for two screens of tile maps, and you can arrange the two maps horizontally or vertically for scrolling.

The perfect solution is to be able to toggle wrapping in the x and y directions independently for each layer, so coders are free to use both #1 and #2 above for different needs: #1 for looping and #2 for a normal scrolling map. Being able to disable wrapping in either direction has the benefit of allowing, e.g., a layer that loops in the horizontal direction but not in the vertical direction.
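To show what I mean by per-axis wrapping, here is a rough Python sketch of how a screen pixel could be mapped back to a layer pixel. Everything in it (function names, the idea that a positive offset moves the layer right/down, layers stored as lists of rows) is an assumption for illustration and not how Continuum 93 actually works.

```python
SCREEN_W, SCREEN_H = 480, 270  # layers are kept at the screen resolution


def layer_coord(screen_pos: int, offset: int, size: int, wrap: bool):
    """Map a screen coordinate to a layer coordinate on one axis.

    Returns None when wrapping is off and the layer does not cover
    this screen position.
    """
    pos = screen_pos - offset
    if wrap:
        return pos % size  # method #1: the layer loops around
    return pos if 0 <= pos < size else None  # method #2: nothing drawn outside


def sample(layer, sx, sy, off_x, off_y, wrap_x, wrap_y):
    """Return the layer pixel shown at screen (sx, sy), or None if none."""
    lx = layer_coord(sx, off_x, SCREEN_W, wrap_x)
    ly = layer_coord(sy, off_y, SCREEN_H, wrap_y)
    if lx is None or ly is None:
        return None
    return layer[ly][lx]
```

With wrap_x on and wrap_y off, a layer loops horizontally but simply ends at its top and bottom edges, which is the mixed case mentioned above.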

I think the layers could still be a bit larger than the screen though, mainly because the vertical screen resolution of 270 bothers me a bit (I know, it came from 1080/4). For computer or "digital" stuff, we like to have things be a multiple of 8 or 16. Also, in games it's common to populate a screen with 16x16 tiles. Having a 270-line layer means that the bottom row of tiles has to be cut off, and it wouldn't work well if you want the layer to loop in the vertical direction while scrolling, so it would be nice if a layer were 480x272 or even 480x320 instead (while the screen resolution is kept at 480x270).
Gilbert
Posts: 561
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by Gilbert »

EnthusiastGuy wrote: Thu Oct 19, 2023 2:56 pm By the way, your implementation is flipped since you're drawing it the other way. While it's basically the same thing, if you change the minus at line 17 to a plus, your implementation will be aligned with everyone else's. ;)
Thank you! Fixed now.
Also, I see a lot of casting from int to float and float to int. Those might be the culprits behind your performance penalty. Not sure how fast they are. I have yet to revisit your editor properly to figure that out, if I even can.
Casting has been a pain in AGS, since floating point support was added to the engine as some form of workaround.
I see that the main culprit is that the loop counter i of the outer loop has to be cast to float 200 x 4 = 800 times in each loop, so I now cast it to another float variable in the outer loop and then use it in the inner loop.
It doesn't make much difference though. There are possibly some optimisations done at compile time, and possible bottlenecks in the script system don't help. (The framerate when running it on my office system, a generation 7 Core i7 with an internal Intel GPU, isn't much different from running it on my home system, a generation 10 Core i7 with an Nvidia GPU.)
EnthusiastGuy
Posts: 12
Joined: Tue Sep 12, 2023 10:42 am

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by EnthusiastGuy »

aa-dav wrote: Thu Oct 19, 2023 6:01 pm You can check my implementation of the signed/unsigned arithmetic flags in the sources of SimpX for reference: https://github.com/aa-dav/SimpX/blob/ma ... leton4.cpp
Yeah, I ignored this part of the architecture for quite some time while fixing some other stuff. I've moved it up in my backlog. I have a pretty good idea how to go about it, but will definitely take a look at your implementation as well. Thanks!
aa-dav wrote: Thu Oct 19, 2023 6:01 pm A guy with the nickname Total Vacuum made a wolf3d-like demo for SimpX.

If you take a closer look at the assembler source code you will probably be amazed by its cryptic nature.
This is because it was not written manually, but generated by a Forth compiler which Total Vacuum has and adapted for my platform. :D
I noticed that when I played the game. Didn't look into it closely though. I'm also planning on having some compilers on Continuum, but I will need to stabilize the instruction set. It's still a bit wobbly. After that, I would also like to implement a native BASIC variant for it. But I'd definitely not say no to a Forth compiler if the author likes that as well. :D
EnthusiastGuy
Posts: 12
Joined: Tue Sep 12, 2023 10:42 am

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by EnthusiastGuy »

Gilbert wrote: Thu Oct 19, 2023 6:28 pm ...
1. The layers just wrap around, so if you have, say, scrolled one pixel in the horizontal direction, the coder needs to redraw a vertical line's worth of graphics. This is a bit difficult to use since, as on real consoles, it's usually more effective to update a whole column or row of tiles (say 8 or 16 pixels wide) than a thin line. But this would be handy if, say, you have a looping distant scenery layer for parallax scrolling; then you can just use one single graphics layer for that scenery and nothing needs to be updated. Just scroll it.
2. The layers do not wrap around, so when you, say, move a layer halfway to the right, it only occupies the right half of the screen (this also means that negative offsets have to be supported). To facilitate a horizontally scrolling screen, just use two layers arranged side by side horizontally (and similarly for vertical ones), and if the screen needs to be scrolled in both x and y directions, you need to use 4 layers. This wastes layers quickly but I'll accept it as a sacrifice to do stuff with intentionally limited hardware. This is inspired by how the Famicom/NES has only enough internal memory for two screens of tile maps, and you can arrange the two maps horizontally or vertically for scrolling.
...
Well, the idea is that indeed I will support negative coordinates. And I expect that to fix quite a lot of problems. You mentioned drawing a vertical line of pixels when the screen scrolls, for instance. "Hardware scrolling" will indeed be available and since the tile map could reside in a state, there's no problem drawing it at x = 479. The "hardware" will clip anything not visible, so no performance penalty will exist either, while the developer can simply implement an abstract way of drawing parallaxed backgrounds or layers in general.

I'll probably be more explicit and clear when, reaching that stage, I will make some tutorials and samples available. But rest assured. I did design 8 layers on this machine to use them especially with parallax situations. ;)
Gilbert wrote: Thu Oct 19, 2023 6:28 pm I think the layers could still be a bit larger than the screen though, mainly because the vertical screen resolution of 270 bothers me a bit (I know, it came from 1080/4). For computer or "digital" stuff, we like to have things be a multiple of 8 or 16. Also, in games it's common to populate a screen with 16x16 tiles. Having a 270-line layer means that the bottom row of tiles has to be cut off, and it wouldn't work well if you want the layer to loop in the vertical direction while scrolling, so it would be nice if a layer were 480x272 or even 480x320 instead (while the screen resolution is kept at 480x270).
Well, I avoid making hardware that custom, but the good news is that you can still achieve what you want. Just reserve a portion of RAM, draw your scene there as much as you want and then just use the INT 0x10 (Sprite draw) to draw any portion of that to one of the layers. It will be very fast also since it will take only 2-3 instructions. ;)

I just finished implementing correct buffering of the video layers, so now you can take full control of when the actual video draw layers are updated from the video RAM. So this will allow you even more flexibility when tinkering with the video. But it will take a bit until I release it along with updated documentation.
Gilbert wrote: Thu Oct 19, 2023 6:57 pm Casting has been a pain in AGS, since floating point support was added to the engine as some form of workaround.
I see that the main culprit is that the loop counter i of the outer loop has to be cast to float 200 x 4 = 800 times in each loop, so I now cast it to another float variable in the outer loop and then use it in the inner loop.
It doesn't make much difference though. There are possibly some optimisations done at compile time, and possible bottlenecks in the script system don't help. (The framerate when running it on my office system, a generation 7 Core i7 with an internal Intel GPU, isn't much different from running it on my home system, a generation 10 Core i7 with an Nvidia GPU.)
I actually had another look recently. I stripped down all SIN/COS and everything. Just had a simple plot x, y which practically drew a square pixel by pixel. This reduced performance from 30 to 24 fps. So, I am pretty certain now that AGS needs some optimizations under the hood. I think I also tried without the debugger on and I felt the performance penalty regardless.
Gilbert
Posts: 561
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by Gilbert »

EnthusiastGuy wrote: Mon Oct 23, 2023 10:57 am (Graphics layer related stuff)
Great. I'll just wait and see.
I actually had another look recently. I stripped down all SIN/COS and everything. Just had a simple plot x, y which practically drew a square pixel by pixel. This reduced performance from 30 to 24 fps. So, I am pretty certain now that AGS needs some optimizations under the hood. I think I also tried without the debugger on and I felt the performance penalty regardless.
Indeed, pixel drawing is slow in AGS. I've actually tried commenting out the DrawPixel line and keeping the sine/cosine lines and there is a 5 FPS boost, so doing 40,000 sets of trigonometric/floating point operations and drawing 40,000 pixels in each frame both contribute to the performance drop.

Well, anyway 12 FPS is actually not that shabby, as I also toyed around with running the demo on an Apple ][. If set to authentic speed in the AppleWin emulator, it took more than an hour to render a single frame (if the emulator was set to run as fast as possible it completed in a breeze). Yeah, I know. Running it on a 1 MHz 6502, with Microsoft's Applesoft BASIC being so slow, doesn't help.
Anyway, the first frame is attached in this post. Because of how the graphic modes are arranged on the Apple ][ I didn't use colours.
One problem is that when you run the demo at lower resolutions, a lot of pixels are drawn onto the same coordinates, so a lot of time is wasted. In my Apple ][ case, the stuff was actually drawn onto a 160x160 square, so there were only 25,600 active pixels in total (actually, since the pixels are restricted to a circle, only pi*80^2 ~ 20,000 pixels are active), and in each frame there are pixels that are never set, so there was A LOT of repetition. Reducing n to a smaller number can help a lot with the performance by drawing fewer pixels, but since the variables are iterated in each step, changing n would probably give a vastly different result.

I may try to write an assembly version with fixed point arithmetic and look up tables to see how much it helps with the performance as an exercise.
Attachments
apple_bubbles.png
EnthusiastGuy
Posts: 12
Joined: Tue Sep 12, 2023 10:42 am

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by EnthusiastGuy »

Gilbert wrote: Mon Oct 23, 2023 7:28 pm it took more than an hour to render a single frame
Wow... 1fph. That's gotta be worth something in someone's record book. :)

I'd say, drop floats altogether. If you can implement asm, that would be fine, otherwise maybe try to switch to integers. Just build up some look-up tables with 360 sin/cos values as ints. Modify your code to work with ints and when providing input to the function that returns from your look-up table, you'd just find the closest matching int index value to your look-up and use that. Then, you can do some divisions to return the value to representable pixel space.
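Here's roughly what I mean, as a small Python sketch. The scale factor of 1024 and all the names are arbitrary choices for the example; the tables are built once with floats, and after that only integer operations are needed per pixel.

```python
import math

SCALE = 1024  # table values are sin/cos multiplied by this and rounded

# Built once at startup; afterwards only integer maths is used.
SIN_LUT = [round(math.sin(math.radians(d)) * SCALE) for d in range(360)]
COS_LUT = [round(math.cos(math.radians(d)) * SCALE) for d in range(360)]


def lut_sin(angle_deg) -> int:
    """Scaled sine of the closest whole degree, wrapped into 0..359."""
    return SIN_LUT[round(angle_deg) % 360]


def lut_cos(angle_deg) -> int:
    """Scaled cosine of the closest whole degree, wrapped into 0..359."""
    return COS_LUT[round(angle_deg) % 360]


def to_pixel(center: int, radius: int, scaled: int) -> int:
    """Bring a scaled table value back to pixel space with one division."""
    return center + (radius * scaled) // SCALE
```

So a point on a circle of radius 80 around (80, 80) is (to_pixel(80, 80, lut_cos(a)), to_pixel(80, 80, lut_sin(a))), with no floats involved at draw time.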

Also, maybe reduce the n. It's rather difficult to explain how it works (not sure I fully understand it either), but try to imagine a rectangle of thin plastic 4cm high and 20cm wide. It has some random patterns printed on it. Then, you fold it every 4 cm until you have 5 superimposed layers. The patterns are now on top of each other, showing a more complex pattern in a 4cmx4cm square. If you reduce the n, you will basically reduce the length of that sheet. You will get fewer patterns but you should still get consistent patterns all over the circle area.
aa-dav
Posts: 201
Joined: Tue Apr 14, 2020 9:45 pm
Location: Russia

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by aa-dav »

EnthusiastGuy wrote: Tue Oct 24, 2023 5:39 am Just build up some look-up tables with 360 sin/cos values as ints.
In binary it is cheaper to work with fixed point arithmetic with 256 begrees in circle. Begree = binary degree.
For example let's describe 8.8 arithmetic - 1 byte for whole part and 1 byte for fractional part.
Each bit in the fractional part is worth half as much as the previous one. So we can treat the fractional byte as a count of 1/256 units.
And it is convenient to choose that circle has this amount of begrees, so:
1. table of angles is power-of-two size
2. to implement periodic nature of sin/cos we just get fractional part of number and treat fractional part as index in the sin/cos array
So, 90 degrees is 64 begrees and 256 begrees is 0 begrees automatically because of overflow.
P.S.
Also note that -1 begrees in integer arithmetic is not distinguishable from 255 begrees in terms of 'two's complement', which also fully corresponds to the repetitive nature of angles.
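A tiny Python sketch of the idea, for illustration only: a 256-entry sine table in 8.8 fixed point (1.0 is stored as 256), indexed by the low byte of the angle. The names fsin/fcos and the table layout are my assumptions here, not code from any of the projects mentioned.

```python
import math

BEGREES = 256  # binary degrees in a full circle
ONE = 256      # 8.8 fixed point: 1.0 is stored as 256

# One table is enough: cosine is sine shifted by a quarter turn (64 begrees).
SIN_TABLE = [round(math.sin(2 * math.pi * i / BEGREES) * ONE)
             for i in range(BEGREES)]


def fsin(angle: int) -> int:
    """Sine in 8.8 fixed point; only the low byte of the angle is used."""
    return SIN_TABLE[angle & 0xFF]


def fcos(angle: int) -> int:
    """Cosine in 8.8 fixed point, via the quarter-turn shift."""
    return SIN_TABLE[(angle + 64) & 0xFF]
```

The `& 0xFF` is the whole "periodicity" logic: 256 maps to 0 and -1 maps to 255 with no extra range checks.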
Gilbert
Posts: 561
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by Gilbert »

If AppleSoft BASIC is still used, using integers won't help, as the stupid thing is that every arithmetic operation is carried out in floating point.
Even if you use something like A% to define integer variables, they're only temporarily stored as integers, and are still cast to floating point numbers during calculation, so it'll be even slower (just blame Bill Gates or whoever coded it in Micro Soft), and even if I use a LUT for the trigo values, there are still a lot of additions and multiplications of integers (i.e. floating point numbers). So, ASM is the definite path (or use Woz's original 'integer' BASIC, but I've never been familiar with it).

I've actually built up a simulation of using fixed point maths and LUT tables in AGS to see how things are done.
It is more or less working, though it's even slower than using floating point maths and the trigo functions, as there are a few more operations involved (e.g. I use 16 bits for a fixed point number 0<=x<1, and when you multiply two of these numbers together, you need to divide the result by 65536 (or shift it to the right by 16 bits) afterwards, but this would definitely make a huge speed difference in 6502 assembly). However, there are some noticeable differences in the generated frames (the "bubbles" are a bit smaller) and I am still checking whether it is because of some mistakes in my coding or because of the limited precision of fixed point maths. I've tried a Sine/Cosine LUT of 1024 (i.e. 256 entries for each 90 degrees), 4096 or even 16384 entries, but it doesn't make much difference.
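For reference, the multiply-then-shift step I mean looks like this as a minimal Python sketch (the names are made up for the example; my actual AGS script is different):

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # 65536 represents 1.0


def to_fixed(x: float) -> int:
    """Convert a float to 16-bit-fraction fixed point (rounded)."""
    return int(round(x * ONE))


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed point numbers.

    The raw product has 32 fractional bits, so it is shifted right by 16
    (i.e. divided by 65536) to get back to 16 fractional bits.
    """
    return (a * b) >> FRAC_BITS
```

Note that the shift simply drops the low bits instead of rounding, so every multiplication loses a tiny amount; whether that is related to the smaller bubbles I haven't confirmed.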
EnthusiastGuy
Posts: 12
Joined: Tue Sep 12, 2023 10:42 am

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by EnthusiastGuy »

Aaaannd I'm back with Version 0.6.8 of Continuum 93!

Just released it.
There's a devlog published there, but the general update revolves around:
Emulator:
- Video card improvements;
- Multi-platform (right now, aside from Windows, this can be brought to Raspberry Pi 3, 4, 5, 400 and Zero 2W, and to the Steam Deck);
- Signed values support for regular registers + new instructions in that sense (thanks for your feedback there, guys!);
- Floating point registers and a lot of instructions that make use of them;
- Improved and added interrupts;
- Added more instructions for logic, flags, trigonometry, general math;
- Gamepad support (though a bit shady on Linux);
- Accelerator interrupts to load png tilemaps to memory and draw from them;
Debugger:
- A 3D visualization of the video memory that also looks cool;
- Showing flags, history;

- ... and a lot of fixes, small improvements etc...

Again, thanks for your feedback, it was well used!

Here's also a recent trailer

https://youtu.be/UCmCEudSpeA
Last edited by EnthusiastGuy on Sat Jan 27, 2024 12:09 pm, edited 1 time in total.
Gilbert
Posts: 561
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Continuum 93, an assembly machine emulator that... does not have a hardware

Post by Gilbert »

Great! Will check it out.