[lazy] Can't get this floating point code

Discuss technical or other issues relating to programming the Nintendo Entertainment System, Famicom, or compatible systems.

Moderator: Moderators

michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

[lazy] Can't get this floating point code

Post by michikaze »

This code: http://6502.org/source/floats/wozfp1.txt
I wandered around aimlessly as usual, and stumbled at this. Happy at the possibility to finally improve my math-fu, I started reading. Then confusion, then lazyness, then this thread.

You can either discuss this somehow, or answer questions below made exclusively for you.

I want to look at this code in fceux's debugger. Maybe anyone have this or simular code in some demo? Running code in my head isn't a pleasant experience. Or maybe it is...///

How is it possible to have logarithm on a processor that doesn't support even multiplication? I lack proper math background to understand it.

Aaaand, the most stupid of theeeeem......! What the hell IS logarithm, in that game it can be useful? Yes, 6502 core hasn't been used only in NES, I know, but blargh.
tepples
Posts: 22414
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: [lazy] Can't get this floating point code

Post by tepples »

michikaze wrote:How is it possible to have logarithm on a processor that doesn't support even multiplication? I lack proper math background to understand it.
Probably CORDIC, a way of computing transcendental functions (sine, cosine, exponent, logarithm, etc.) using only adds and shifts. It is derived using a bunch of trigonometric identities.

Sine and cosine and arctangent are useful for things like determining which way a player's unit is pointed.
User avatar
tokumaru
Posts: 12254
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: [lazy] Can't get this floating point code

Post by tokumaru »

michikaze wrote:I want to look at this code in fceux's debugger.
That would require you to build a valid NES ROM around the code. May I suggest this 6502 Simulator instead? It has some nice debugging features and you can easily get code running with jus an ORG statement indicating where the code should be located in memory.
michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

Re: [lazy] Can't get this floating point code

Post by michikaze »

tepples wrote:CORDIC
Nice, thank you!
From wikipedia page I got only how to calculate sine and cosine. But it also methoned that it can be used for exponential functions and logarithms. Huh...
tokumaru wrote:6502 Simulator
Nice, thank you! Going to try soon.
tepples
Posts: 22414
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: [lazy] Can't get this floating point code

Post by tepples »

"CORDIC" in the Digital Circuits textbook on Wikibooks explains how it's generalized to exp and log.
doynax
Posts: 162
Joined: Mon Nov 22, 2004 3:24 pm
Location: Sweden
Contact:

Re: [lazy] Can't get this floating point code

Post by doynax »

Looks more like a couple of iterations of some other numerical method to me, though I must confess to being to lazy (and/or stupid) to decipher it. In any event Feynman's algorithm is probably faster and certainly easier to understand than CORDIC, though it boils down to a suspiciously similar shift-and-add sequence.

Let's say you want to compute the base-2 logarithm of 1≤x<2. This is easily generalized to any binary floating-point number by computing the logarithm of the normalized mantissa and adding the exponent (plus one) to the result. The trick, then, is to factorize x into a series of 1+2^-i factors. That is we start with y=1 and from i=0 on down successively multiply by 1+2^-i, keeping the new factor if the result does not exceed x. Note that these multiplications may be implemented through shift-and-add, e.g. y'=y>>i.

The logarithm y is then simply the sum of the logarithms of all individual 1+2^-i factors which may be pre-calculated and stored in a table.

Here's a quick-and-dirty C implementation in 1.15-bit fixed-point:

Code: Select all

unsigned int feynlog(unsigned int x) {
	static const unsigned int tab[] = {
		/* i=1  */ 19168, /* i=2  */ 10549,
		/* i=3  */ 5568,  /* i=4  */ 2866,
		/* i=5  */ 1455,  /* i=6  */ 733,
		/* i=7  */ 378,   /* i=8  */ 184,
		/* i=9  */ 92,    /* i=10 */ 46,
		/* i=11 */ 23,    /* i=12 */ 12,
		/* i=13 */ 6,     /* i=14 */ 3,
		/* i=15 */ 1,     /* i=16 */ 1
	};
	unsigned int y = 32768;
	unsigned int z = 0;
	for(unsigned int i = 1; i < 16; ++i) {
		unsigned int t = y + (y >> i);
		if(t <= x) {
			y = t;
			z += tab[i - 1];
		}
	}
	return z;
}
As for uses the most frequent on the 6502 is probably to speed up multiplication and division, much as a human would use a slide-rule, though typically you'd store pre-calculated logarithm and exponent tables rather than computing them at start-up.
michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

Re: [lazy] Can't get this floating point code

Post by michikaze »

doynax wrote:Feynman's algorithm
......Please let me to stare at this for a few more days, I'll get this for sure......
Don't get that those numbers in the array are.

Hey, google showed something funny:
The Feynman Algorithm:

Write down the problem.
Think real hard.
Write down the solution.

The Feynman algorithm was facetiously suggested by Murray Gell-Mann, a colleague of Feynman, in a New York Times interview.
Now retyping the code to get it running in the emulator, please watch warmly.
doynax
Posts: 162
Joined: Mon Nov 22, 2004 3:24 pm
Location: Sweden
Contact:

Re: [lazy] Can't get this floating point code

Post by doynax »

michikaze wrote:......Please let me to stare at this for a few more days, I'll get this for sure......
Don't get that those numbers in the array are.
Perhaps an example might serve to illustrate the scheme.

The factorization here is analogous to the positional number system. Imagine that you wanted to encode any number 1≤x<2 by hand as a binary fraction. You might start with 1, then successively add further smaller terms. If the result is still larger after adding a term then include it (a one bit), otherwise omit it and move on (a zero bit).

Let's pick x=1.4142:

Code: Select all

1+2^-1                          = %1.1      = 1.5 > x, so omit the 2^-1 term
1     +2^-2                     = %1.01     = 1.25 < x, so include the 2^-2 term
1     +2^-2+2^-3                = %1.011    = 1.375 < x, so include the 2^-3 term
1     +2^-2+2^-3+2^-4           = %1.0111   = 1.4375 > x, so omit the 2^-4 term
1     +2^-2+2^-3     +2^-5      = %1.01101  = 1.40625 < x, so include the 2^-5 term
1     +2^-2+2^-3     +2^-5+2^-6 = %1.011011 = 1.421875 > x, so omit the 2^-6 term
This is a variant of binary search. At each step the approximation grows more precise, rapidly converging on the true value one bit at a time.

Instead of using a sum of 2^-i terms Feynman's algorithm uses a product of 1+2^-i factors to encode the number:

Code: Select all

1·(1+2^-1)                                               = 1.5 > x, so omit the 1+2^-1 factor
1         ·(1+2^-2)                                      = 1.25 < x, so include the 1+2^-2 factor
1         ·(1+2^-2)·(1+2^-3)                             = 1.40625 < x, so include the 1+2^-3 factor
1         ·(1+2^-2)·(1+2^-3)·(1+2^-4)                    = 1.494140625 > x, so omit the 1+2^-4 factor
1         ·(1+2^-2)·(1+2^-3)         ·(1+2^-5)           = 1.4501953125 > x, so omit the 1+2^-5 factor
1         ·(1+2^-2)·(1+2^-3)                  ·(1+2^-6)  = 1.42822265625 > x, so omit the 1+2^-6 factor
In this case it is less apparent that any x in the range is reachable. The base 2 used here is in fact sub-optimal but is close enough to the ideal one (~2.2656) while being easily evaluated through shifts and adds. In general smaller bases are finer grained, meaning that the results converge more slowly, whereas larger bases have unreachable gaps in the output.

The point of all this is to make use of the logarithmic identity which lowers multiplication into addition, e.g. log(p*q) = log p + log q. Therefore if we can factorize x into a series of 1+2^-1 factors then log x is simply the sum of the logarithms of the included factors.
In our example y = 1·(1+2^-2)·(1+2^-3) ≈ x, therefore log x ≈ log y = log(1·(1+2^-2)·(1+2^-3)) = log 1 + log(1+2^-2) + log(1+2^-3).

This is in fact what the table in the code above is all about. A table of pre-calculated 1+2^-i logarithms. Binary logarithms in fixed-point to be precise, e.g. log2(1+2^-1) · 2^15 ≈ 19168.
michikaze wrote:Hey, google showed something funny:
The Feynman Algorithm:
Write down the problem.
Think real hard.
Write down the solution.
Good advice. He was active in a different field so I suppose I should have called it Feynman's logarithm algorithm to help your Google efforts but logarithm algorithm sounds somehow redundant to me, even though I know the etymologies are quite distinct.

I apologize for these half-on-topic ramblings but I'm a sucker for elegant algorithms and this one certainly qualifies. Despite its original application.
michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

Re: [talking with myself] Can't get this floating point code

Post by michikaze »

First - you guys are awesome.

Second - I'm playing with function "float" now. And trying to comment. Look if you have nothing better to do.

Emulator's editor and this forum's formatter have different tab lengths, so it's not pretty...

The first part is commented out, because with it, execution starts from adress $0.

Code: Select all

;	.org $0
;	.db 0,0,0	;not used?
;sign:	.db $ea
;x2:	.db $ea		;exponent 2
;m2:	.db 0,0,0	;mantissa 2
;x1:	.db $ea
;m1:	.db 0,0,0
;e:	.db 0,0,0,0	;scratch
;z:	.db 0,0,0,0
;t:	.db 0,0,0,0
;sexp:	.DB 0,0,0,0
;int:	.db 0


sign = $3
x2 = $4
m2 = $5
x1 = $8
m1 = $9
e = $c
z = $10
t = $14
sexp = $18
int = $1c					

	.org $600

io_putc	= $e001
io_posx	= $e005
io_posy	= $e006

	;LDA #9
	;STA io_posx
	;LDA #2
	;STA io_posy

	; print 'aaa'
	;LDA #'a'
	;STA io_putc
	;STA io_putc
	;STA io_putc

	; SET 16 BIT INTEGER
	LDA #$0
	STA m1		;HIGH
	LDA #$03
	STA m1+1	;low
	
	; CONVERT 16 BIT INTEGER TO 32 BIT FLOAT
	JSR float
	

	BRK
	
	;;;;;;;;;;;;;;;;;;;;FLOAT;;;;;;;;;;;;;;;;;;
	;CONVERT 16 BIT INTEGER IN M1(HIGH) AND M1+1(LOW) TO F.P.
        ;RESULT IN EXP/MANT1.  EXP/MANT2 UNEFFECTED
        
        ;So... how it works.
        ;It sets exponent to $e (maximal possible for 16bit input)
        ;and [dec exponent,
        ;     shiftleft 3byte mantissa bit by bit]
        ;untill done.
        
        ;examples of output:
        ; 00 01 -> 80 40 00 00
        ; 00 02 -> 81 40 00 00
        ; 00 03 -> 81 60 00 00
        ; 00 41 -> 86 41 00 00
        ; 41 00 -> 8e 41 00 00
        ; 00 81 -> 87 40 80 00
        ; 40 00 -> 8e 40 00 00
        
        ; NEGATIVE VALUES!
        ; 80 00 -> 8e 80 00 00 ;finally mant. starts from 8
        ; ff ff -> 7f 80 00 00 ;negative?
        ; 8f ff -> 8e 8f ff 00 ;didn't quite get this
        ;				;"complement" mantissa
        ; 00 00 -> 00 00 00 00 ;and it loops a lot 
        				;(until exp decs to 0)
        
float:	LDA #$8e	;init $08 and $0b ($09 and $0a should've been
				;set by you already)
	STA x1		;set exponent to 14dec ($0e in hex)
	LDA #0		;clear most right byte
	STA m1+2	;/init
	BEQ norm	;normalize result
	; Jump is always taken. Zero flag is set by LDA #0.
	; They used BEQ insead of JMP because it supports relative
	; addressing mode, and so takes one byte less.
norm1:	DEC x1		;decrease exponent by one
			; SHIFT!
	ASL m1+2	;     carry <- [$0b] <- 0
				;WAIT, isn't [$0b] always zero?
				;*proud* *proud*
	ROL m1+1	;     carry <- [$0a] <- carry
	ROL m1		; who cares <- [$09] <- carry
norm:	LDA m1		; huh...
	ASL		; ...continue until...
	EOR m1		; ...left two bits of mantissa are...
	BMI rts1	; ...either 10 or 01.
			; 10 is for negative floats
	LDA x1		
	BNE norm1	; or until we got all zeroes
rts1:	RTS		;return! Finally!
michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

Re: [lazy] Can't get this floating point code

Post by michikaze »

And new version, this time with add. Wait a few years, and I will get to the interesting stuff.

God, retyping stuff is so hard.
Attachments
wozfp1_notready.65s.txt
(4.5 KiB) Downloaded 168 times
michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

Re: [lazy] Can't get this floating point code

Post by michikaze »

Now with fsub, fmul, fdiv and fix.

I think these kind of things are better to read this way, from short to complex. So, I will post them slowly growing up, step by step.
Attachments
wozfp1_notready.65s.txt
(8.8 KiB) Downloaded 165 times
michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

Re: [lazy] Can't get this floating point code

Post by michikaze »

Anyone knows how to make these work in 6502 simulator? It doesn't support "DCM" instruction, it seems.

Code: Select all

1DCD  7E 6F     LN10   DCM 0.4342945
      2D ED
1DD1  80 5A     R22    DCM 1.4142136   SQRT(2)
      02 7A
1DD5  7F 58     LE2    DCM 0.69314718  LOG BASE E OF 2
      B9 0C
1DD9  80 52     A1     DCM 1.2920074
      80 40
1DDD  81 AB     MB     DCM -2.6398577
      86 49
1DE1  80 6A     C      DCM 1.6567626
      08 66
1DE5  7F 40     MHLF   DCM 0.5
      00 00
User avatar
tokumaru
Posts: 12254
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: [lazy] Can't get this floating point code

Post by tokumaru »

To use undocumented instructions you usually have to use .db and .dw statements. Look up the opcode for the instruction (considering the addressing mode) in a document like this, and use .db followed by that value. Then comes the address or immediate value, which can be 8 or 16 bits long, so use .db or .dw accordingly.
User avatar
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮
Contact:

Re: [lazy] Can't get this floating point code

Post by thefox »

tokumaru wrote:To use undocumented instructions you usually have to use .db and .dw statements. Look up the opcode for the instruction (considering the addressing mode) in a document like this, and use .db followed by that value. Then comes the address or immediate value, which can be 8 or 16 bits long, so use .db or .dw accordingly.
Those aren't undocumented instructions, they are floating point values. :) I'm not aware of any assembler that supports them. Maybe a macro hackery based on a string could be possible on some assemblers.
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi
michikaze
Posts: 36
Joined: Sat Jun 28, 2014 2:35 am

Re: [lazy] Can't get this floating point code

Post by michikaze »

I just realised, those numbers are in format that is probably used only by this code. Still curious if whatever those guys at Apple used does actually support this keyword.

Ok, put it in as .db chain.
Post Reply