NDS Wifi specs
Re: NDS Wifi specs
I had a couple test programs back then, sender and receiver. I did get the empty replies when I didn't set TXBUF_REPLY1 on the receiver side. maybe it's something like having to make sure you have W_BSSID and W_AID registers set properly, but I think you have those set up already.
Re: NDS Wifi specs
Yes, W_BSSID and W_AID_LOW are set as usual, I had just removed the W_TXBUF_REPLY1 write. Don't know why that didn't cause an empty reply, maybe that feature can be disabled somewhere, or something had filtered it out on the other console.
___
Emulating the DS Download Play tool is getting difficult.
The ARM9 code is generating a Dummy reply, and then one Username reply.
But the ARM7 code is only sending the Dummy reply, and never happens to send the Username reply.
Instead, the Dummy gets retransmitted about 43h times, and ARM7 does then issue a "shutdown" message to ARM9, switching ARM9 into an endless loop.
My emulated reply timings are rather crappy, but I don't know if that is causing the problem. I've quite deeply disassembled the Download Play code, but it's hard to understand how the ARM7+ARM9 functions interact with each other.
Presumably, something makes the ARM7 think that it isn't yet ready to send the next REPLY. But I've no idea what/why/where : /
The state is REPLY1=0, REPLY2=8000h, and TXBUF[04h] incremented on each (re-)transmit. So ARM7 should know that the Dummy reply was sent, and that the next reply could be written to REPLY1.
___
Btw. there are two nds-wifi hardware versions, with W_ID chip ID values 1440h=Old NDS, and C340h=Newer DS-Lite/DSi/3DS.
The DSi's local multiplayer code (and maybe also DS-lite-era code) is doing something like this:
Code: Select all
if W_ID = 1440h then normal behaviour
else insert some extra 1000us (?) delays in various places
That looks as if the newer hardware is slower, and requires extra delays. On the other hand, older NDS launch titles didn't support that stuff, and they are apparently working on newer hardware either way.
Re: NDS Wifi specs
Two months later - new year, new try.
I've given up on disassembling the DSi's DS Download Play tool, and instead switched to disassembling the much older Metroid First Hunt cartridge (which looks only half as complicated: less special-case-do-this-if-that stuff, and also I/O port access less obscured from compiler optimization).
Unfortunately, the underlying mess is about the same as in the newer wifi driver versions... there are lots of function numbers that are passed on as callback requests from one thread to another thread (which is then passing another callback number to another thread, and so on), all without giving much clue about which callbacks are intended, and which are error handlers.
But I think that I've tracked down the source of the problem. The RX_done IRQ handler is checking REPLY2, and when disliking it, triggers Deauth via the call at 37FA350h (which does then go through at least four callback functions (15h, 186h, 25h, 05h) before actually generating the Deauth packet).
Removing the opcode at 37FA350h does fix the problem. And to fix it without that patch... I guess the critical part is when to forward REPLY1 to REPLY2, or when to set the reply's TXHDR[00h] to 01h.
After finding the issue in Metroid, I've now also tracked down the same problem in the DSi's DS Download Play tool. It's checking REPLY1 instead of REPLY2, but otherwise it seems to be the same issue. Except, DS Download Play is additionally hanging elsewhere if incoming packets are arriving too fast: The cmd_ack callback (185h) is then trying to send another callback (2Ch) to another thread, but without ever finding time to execute that thread.
Re: NDS Wifi specs
I've logged the register changes that are happening on slave side on real hardware.
The important part is that the REPLY registers (and their TXHDR's) must be updated simultaneously with the RX Done IRQ0, for CMD DATA. Nintendo's RX Done IRQ handler is checking that stuff to determine if the replies are delivered. And, for whatever reason, it does insist on reply delivery to occur ONLY alongside the CMD DATA's RX Done IRQ.
Here's a summary:
Code: Select all
Flowchart (at Slave side)
At incoming CMD DATA packet: ;\
RF_STATUS=6 ;RX processing incoming stuff ;
After RX preamble: ; CMD
IRQ6 (RX Start, for CMD DATA) ; DATA
After RX data: ;
IRQ0 (RX Done, for CMD DATA) ;
WRCSR=WRCSR+(size of CMD DATA) ;
RF_STATUS=5 ;preparing REPLY ;
if REPLY2.bit15=1 ;
TXHDR[1]=TXHDR[0] ;<-- or sometimes random? ;\adjust TXHDR[0,1] ;
TXHDR[0]=01h ;<-- mark done/discarded ;/for <old> REPLY2 ;
REPLY2=REPLY1, REPLY1=0000h ;-forward new reply ;
if REPLY2.bit15=1 ;
TXHDR[4] incremented (unless already max FFh) ;\adjust TXHDR[4,5] ;
TXHDR[5]=00h ;/for <new> REPLY2 ;
TX_SEQNO incremented ;<-- done here if REPLY2 exists ;/
After some moment (at the AID_LOW slot?): ;\
RF_STATUS=8 ;TX sending REPLY ;
After TX preamble: ; REPLY
IRQ7 (TX Start, for REPLY) ;
After TX data: ;
RF_STATUS=1 ;RX awaiting next packet ;
optional: IRQ1 (TX Done) (only if enabled in TXSTATCNT, and REPLY2.bit15=1)
optional: TXSTAT=0401h (only if enabled in TXSTATCNT) ;
if REPLY2.bit15=0 ;
SEQNO increased ;<-- done here when REPLY2 is empty ;/
After some moment: ;\
RF_STATUS=6 ;RX processing incoming stuff ;
After RX preamble: ; CMD
IRQ6 (RX Start, for CMD ACK) ; ACK
After RX data: ;
IRQ0 (RX Done, for CMD ACK) ;
WRCSR=WRCSR+(size of CMD ACK) ;
RF_STATUS=1 ;RX awaiting next packet ;/
Thereafter, Nintendo's software seems to require a delay (at least
100h microseconds) before receiving the next CMD DATA packet.
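The reply bookkeeping in the slave flowchart above can be sketched roughly as follows. This is only a minimal Python model for emulator authors, not Nintendo's or the hardware's actual logic: the class and method names are hypothetical, the 12-byte header size and the 12-bit SEQNO wrap are assumptions, and the "sometimes random" TXHDR[1] behaviour is not modelled.

```python
from dataclasses import dataclass, field

ENABLE = 0x8000  # bit15 of W_TXBUF_REPLY1/2 marks a pending reply


@dataclass
class SlaveReplyModel:
    """Hypothetical model of the slave-side REPLY registers."""
    reply1: int = 0
    reply2: int = 0
    seqno: int = 0
    headers: dict = field(default_factory=dict)  # reply value -> TX header bytes

    def txhdr(self, reply):
        # One header per reply buffer (12 bytes is an assumption).
        return self.headers.setdefault(reply & 0x7FFF, bytearray(12))

    def on_cmd_data_rx_done(self):
        """Everything here happens together with IRQ0 (RX Done, CMD DATA)."""
        if self.reply2 & ENABLE:
            old = self.txhdr(self.reply2)
            old[1] = old[0]
            old[0] = 0x01                    # mark old REPLY2 as done/discarded
        self.reply2, self.reply1 = self.reply1, 0   # forward the new reply
        if self.reply2 & ENABLE:
            new = self.txhdr(self.reply2)
            new[4] = min(new[4] + 1, 0xFF)   # transmit counter, saturates at FFh
            new[5] = 0x00
            self.seqno = (self.seqno + 1) & 0xFFF   # 12-bit wrap is an assumption

    def on_reply_tx_done(self):
        """After the REPLY was sent: SEQNO is bumped here only for empty replies."""
        if not (self.reply2 & ENABLE):
            self.seqno = (self.seqno + 1) & 0xFFF


model = SlaveReplyModel(reply1=0x84D0)
model.on_cmd_data_rx_done()   # REPLY1 -> REPLY2, TXHDR[4] becomes 1
model.on_reply_tx_done()
```

The point of keeping it as two event handlers is that nothing is forwarded at CMD ACK time; all REPLY/TXHDR changes are tied to the CMD DATA's RX Done event.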
With that, my multiplayer wifi emulation is now more or less working (yet not always perfectly stable, in Metroid it does still disconnect after some seconds).
For reference, below is the whole log. That's manually typed up from the log shown on the console's LCD screen (hopefully without typos). And, I've merged it with data from several test runs (eg. with different TXSTATCNT settings) (and with different test runs having different WRCSR values).
The RXTX_ADDR values are only summarized (in reality, the transfer would extend throughout the following microseconds, with about one halfword each 8us, at 2Mbit/s).
The US_COUNT values are showing the duration without register changes, then followed by the new register values. The logging granularity is about 7-8us (ie. changes that seem to occur after 8us may have in fact occurred simultaneously with the preceding changes).
Code: Select all
nds_wifi_state_logger (when downloading from nanostray)
US=0020h ;\
IF=80h (TX Start, for REPLY) (that is, empty dummy reply?) ;
RF_PINS=46h (formerly 44h) ;
US=006Ch ; REPLY
SEQNO+1 (for REPLY) (increase from 3 to 4) ;
US=0007h ;
RF_STAT=1 ;RX awaiting (formerly 8) ;
RF_PINS=04h ;/
US=0033h
RF_PINS=84h
US=0020h
RF_STAT=6 ;RX processing
RF_PINS=85h
US=0053h ;<--- partial preamble? (minus time needed for sensing preamble?)
(RXTX_ADDR+0..14h) (eg. 08CAh..08DEh)
IF=40h (RX Start, for CMD ACK) ;\
RF_PINS=87h ;
US=007Fh ;
WRCSR+14h (now 920h) (eg. or 8DCh) ; CMD ACK
RF_STAT=1 ;RX awaiting ;
(sometimes shortly: RF_PINS=07h) :
RF_PINS=84h ;
US=0007h ;
IF=01h (RX Done, for CMD ACK) ;/
(sometimes US=80h, IF=4000h)
US=05F6h ;--- huge gap
RF_STAT=6 ;RX processing
RF_PINS=85h
(RXTX_ADDR+0..18h) (eg. 08DEh..08F6h)
US=0053h ;<--- partial preamble?
IF=40h (RX Start, for CMD DATA) ;\
RF_PINS=87h ;
US=00A6h ;<--- data time? (1Ah*4+more?) ;
IF=01h (RX Done, for CMD DATA) ;
WRCSR+1Ah (now 93Ah) ; CMD DATA
REPLY1=0000h (formerly 84D0h) ;ram 48049A0h (960h+40h) ;
REPLY2=84D0h (formerly 0) ;
(RXTX_ADDR=04D6h) (aka REPLY+6) ;
RF_STAT=5 ;preparing REPLY ;
RF_PINS=04h ;
SEQNO+1 ;
TXHDR.4.B=0001h (formerly 0000h) ;per new REPLY2 ;/
(TXHDR.0.B=E701h (formerly 0000h) ;per old? REPLY2 (if any);/
US=0006h
RF_STAT=8 ;TX sending REPLY
RF_PINS=44h ;TXing PREAMBLE
(RXTX_ADDR=04D6h..04D7h)
US=0060h ;<-- preamble time
IF=80h (TX Start, for REPLY) ;\
RF_PINS=46h ;TXing DATA ;
US=0099h ;<-- data time ;
(RXTX_ADDR=04D8h..04EBh) (aka max 4D0h+1Bh) ;+(0Ch+26h)/2+2? ;
RF_STAT=1 ;RX awaiting ; REPLY
(sometimes shortly: RF_PINS=06h) ;
RF_PINS=04h ;
(optional: IF=02h (when enabled in TXSTATCNT) ;
(optional: TXSTAT=0401h (when enabled in TXSTATCNT) ;
(else stays at TXSTAT=2001h (or whatever old value) ;/
US=000Dh
RF_PINS=84h
US=0020h
RF_STAT=6 ;RX processing
RF_PINS=85h
US=0053h
IF=40h (RX Start, for CMD ACK) ;\
RF_PINS=87h ;
US=0080h ;
RF_STAT=1 ;RX awaiting ; CMD ACK
RF_PINS=84h ;
US=0006h ;
IF=01h (RX Done, for CMD ACK) ;
WRCSR+14h (now 94Eh) ;/
(if REPLY's forward REPLY1 to REPLY2) ;\uh, done here??? probably blah
(if REPLY's raise TXHDR.4) ;/
US=053Dh ;--- huge gap
RF_PINS=85h ;RXing... preamble?
US=0006h
RF_STAT=6 ;RX processing
US=004Dh
RF_PINS=87h ;RXing... data?
US=0006h ;\
IF=40h (RX Start, for CMD DATA) ;
US=00A6h ;
IF=01h (RX Done, for CMD DATA) ; CMD DATA
WRCSR+1Ah (now 968h) ;
REPLY2=0000h (formerly 84D0h) ;
RF_STAT=5 ;preparing REPLY ;
RF_PINS=04h ;
TXHDR.0.B=A201h (or xx01h) (formerly 0000h) ;old REPLY2 ;
(TXHDR.0.A=0001h) ;new REPLY2, if any ;/
US=0007h
RF_STAT=8 ;TX sending REPLY
RF_PINS=44h
US=0060h ;<-- preamble time
IF=80h (TX Start, for REPLY) ;\
RF_PINS=46h ;
US=006Ch ;<-- reply time (short/empty?) ;
SEQNO+1 (for REPLY) ; REPLY
US=0006h ;
RF_STAT=1 ;RX awaiting ; hm, IRQ-less?
RF_PINS=04h ;/ (if REPLY's, IF=02h)
US=0034h
RF_PINS=84h
US=0020h
RF_STAT=6 ;RX processing
RF_PINS=85h
US=0053h
IF=40h (RX Start, for CMD ACK) ;\
RF_PINS=87h ;
US=007Fh ;
WRCSR+14h (now 97Ch) ; CMD ACK
RF_STAT=1 ;RX awaiting ;
RF_PINS=84h ;
US=0007h ;
IF=01h (RX Done, for CMD ACK) ;/
US=0556h ;--- huge gap
RF_STAT=6 ;RX processing
RF_PINS=85h
US=004Dh
RF_PINS=87h
US=0006h
IF=40h (RX Start, for CMD DATA)
US=00A6h
IF=01h (RX Done, for CMD DATA)
WRCSR+1Ah (now 996h)
N/A REPLY1=0000h (formerly 84D0h)
N/A REPLY2=84D0h (formerly 0)
RF_STAT=5 ;preparing REPLY
RF_PINS=04h
N/A SEQNO+1
N/A TXHDR.4.B=0001h (formerly 0000h)
US=0007h
RF_STAT=8 ;TX sending REPLY
RF_PINS=44h
US=0060h ;<-- preamble time
IF=80h (TX Start, for REPLY) xx
RF_PINS=46h xx
US=0073h ;<-- shorter?
RF_STAT=1
RF_PINS=04h
SEQNO+1 ;<------- HERE (instead above)
US=0033h
RF_PINS=84h
US=0020h
RF_STAT=6 ;RX processing
RF_PINS=85h
US=0053h
IF=40h (RX Start, for CMD ACK)
RF_PINS=87h
US=0080h
RF_STAT=1
RF_PINS=84h
US=0006h
IF=01h (RX Done, for CMD ACK)
N/A WRCSR+14h
US=03FEh ;--- huge gap
RF_STAT=6 ;RX processing
RF_PINS=85h
US=0053h
RF_PINS=87h
US=0006h
IF=40h (RX Start, for CMD DATA)
US=00A6h
IF=01h (RX Done, for CMD DATA)
WRCSR+1Ah (now 9C4h)
N/A REPLY2=0000h (formerly 84D0h)
RF_STAT=5 ;preparing REPLY
RF_PINS=04h
N/A TXHDR.0.B=A201h (formerly 0000h)
US=0007h
RF_STAT=8 ;TX sending REPLY
RF_PINS=44h
US=0060h ;<-- preamble time
IF=80h (TX Start, for REPLY)
RF_PINS=46h
Re: NDS Wifi specs
Now I've logged the Multiplay Master side, too.
RF_STATUS=7 is actually a transition state (occurs shortly before RF_STATUS=8).
RXTX_ADDR=0FC0h stays constant throughout the whole ACK transfer (it doesn't seem to increase from 0FBxh to 0FC0h). I guess they've just used that special "invalid" address value to indicate that none of the TXBUF's is in use.
It's been surprisingly difficult to receive REPLY's. That could be bad luck (CMD packets or REPLY's getting lost). On the other hand, I had no problems with lost packets when receiving intact streams of CMD+ACK pairs. One reason for not getting REPLY's seems to have been that I had set TXSTATCNT=FFFFh for logging TXSTAT changes. Setting the LSBs of TXSTATCNT seems to have somehow completely disabled replies (TXSTATCNT=F000h works better). And, RXFILTER=0FFFh also seems to increase the chances of receiving replies (not sure if that's really needed, it's working also with other RXFILTER settings, but setting more RXFILTER bits appeared to work a bit better).
I don't fully understand how CMD_COUNT works. It's usually set to the time needed to transfer one set of CMD+REPLY+ACK packets. The problem is that the CMD transfer may start about 500us after writing to CMD_COUNT and TXBUF_CMD, so CMD_COUNT is already half elapsed at the beginning of the transfer. I think that was part of the problem when I had used TXSTATCNT=FFFFh (I sometimes got REPLY RX start, but then immediately followed by ACK TX Done, without actually transferring the REPLY and ACK).
Here's a summary of the log, followed by the full log...
Code: Select all
Flowchart (at Master side)
After starting transfer via TXREQ and TXBUF_CMD write:
TXBUSY=2 (formerly 0) (after TXBUF_CMD write, or sometimes a bit later)
After about 50-500 microseconds: ;\
RF_STAT=3 (TXing) (formerly 2) ;
RXTX_ADDR=0006h..0008h (TXbuf+0Ch..) (formerly in RXBUF) ; CMD
SEQNO+1 ;
After TX preamble: ;
IF=80h (TX Start, for CMD) ;
RXTX_ADDR=0009h..0xxxh (TXbuf..) ;
After TX data: ;
optional: IF=02h (TX Done, for CMD) (if enabled in TXSTATCNT);
optional: TXSTAT=0800h (CMD done) (if enabled in TXSTATCNT);
RF_STAT=5 (CMD done, prepare for REPLY) ;/
US=0017h ;\
RXTX_ADDR=rxbuf.. ;
After RX preamble: ;
IF=40h (RX Start, for REPLY) ; REPLY
RXTX_ADDR=rxbuf.. ; (if any)
After RX data: ;
IF=01h (RX Done, for REPLY) ;
WRCSR+18h (for REPLY) ;/
After a dozen microseconds: ;\
RF_STAT=7 ;Switching from REPLY to ACK ;
RF_STAT=8 ;TXing ACK (shortly after above STAT=7) ;
RXTX_ADDR=0FC0h (special dummy addr during TX ACK) ;
After TX preamble: ; ACK
IF=80h (TX Start, for ACK) ;
After TX data: ;
optional: IF=02h (TX Done, for ACK) (if enabled in TXSTATCNT);
optional: TXSTAT=0B01h (ACK done) (if enabled in TXSTATCNT);
TXBUSY=0000h (formerly 0002h) ;
TXBUF_CMD.bit15=0 ;
TXHDR_0=0001h (okay) (formerly 0000h) ;
TXHDR_2=0000h (no error flags) (formerly 0002h) ;
SEQNO+1 ;
RF_STAT=1 ;RX awaiting ;
IF=1000h (CMD timeslot done) (shortly AFTER above IF=02h) ;/
Code: Select all
nds_wifi_state_log (when uploading to ds download play on dsi)
(after starting transfer via TXREQ and TXBUF_CMD write)
TXBUSY=2 (formerly 0) (right after TXBUF_CMD write, or sometimes a bit later)
US=0148h (other day: 002Ch or 01E5h) ;\
RF_STAT=3 (TXing) (formerly 2) ;
RF_PINS=04h (formerly 84h) ;
RF_PINS=44h (shortly after above 04h) ; CMD
RXTX_ADDR=0006h..0008h (TXbuf+0Ch..) (formerly in RXBUF) ;
SEQNO+1 ;
US=0060h <-- preamble time for CMD ;
RF_PINS=46h ;
US=0006h ;
IF=80h (TX Start, for CMD) ;
RXTX_ADDR=0009h..0xxxh (TXbuf..) ;
US=0885h <-- data time for CMD ;
optional: IF=02h (TX Done, for CMD) (if enabled in TXSTATCNT);
optional: TXSTAT=0800h (CMD done) (if enabled in TXSTATCNT);
RF_STAT=5 (CMD done, prepare for REPLY) ;
RF_PINS=06h ;
RF_PINS=04h (shortly after above 06h) ;
RF_PINS=84h (shortly after above 04h) ;/
US=0017h ;\
RF_PINS=85h ;
RXTX_ADDR=rxbuf.. ;
US=0050h <-- partial preamble time for REPLY ; REPLY
IF=40h (RX Start, for REPLY) ; (if any)
RF_PINS=87h ;
RXTX_ADDR=rxbuf.. ;
US=0090h <-- data time for REPLY ;
RF_PINS=84h ;
US=0006h ;
IF=01h (RX Done, for REPLY) ;
WRCSR+18h (for REPLY) ;/
US=000Dh ;\
RF_STAT=7 ;Switching from REPLY to ACK ;
RF_STAT=8 ;TXing ACK (shortly after above STAT=7) ;
RF_PINS=04h ;
US=0006h ; ACK
RF_PINS=44h (shortly after above PINS=04h) ;
RXTX_ADDR=0FC0h (special dummy addr during TX ACK) ;
US=0060h <-- preamble time for ACK ;
RF_PINS=46h ;
US=0006h ;
IF=80h (TX Start, for ACK) ;
US=007Dh <-- data time for ACK ;
optional: IF=02h (TX Done, for ACK) (if enabled in TXSTATCNT);
optional: TXSTAT=0B01h (ACK done) (if enabled in TXSTATCNT);
TXBUSY=0000h (formerly 0002h) ;
TXBUF_CMD.bit15=0 ;
TXHDR_0=0001h (okay) (formerly 0000h) ;
TXHDR_2=0000h (no error flags) (formerly 0002h) ;
SEQNO+1 ;
RF_STAT=1 ;RX awaiting ;
RF_PINS=84h ;
IF=1000h (CMD timeslot done) (shortly AFTER above IF=02h) ;/
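To get a feel for how long one CMD+REPLY+ACK exchange takes on the master side, the US= intervals from the log above can simply be added up. This is only arithmetic on the logged values (with the 7-8us logging granularity they are approximate); the labels and the function name are made up for illustration.

```python
# (label, microseconds) pairs copied from the US= lines of the master log above.
MASTER_INTERVALS_US = [
    ("CMD preamble", 0x60),
    ("CMD preamble end -> TX Start IRQ", 0x06),
    ("CMD data", 0x885),
    ("gap before REPLY", 0x17),
    ("REPLY partial preamble", 0x50),
    ("REPLY data", 0x90),
    ("REPLY end -> RX Done IRQ", 0x06),
    ("gap before ACK", 0x0D),
    ("ACK pin switching", 0x06),
    ("ACK preamble", 0x60),
    ("ACK preamble end -> TX Start IRQ", 0x06),
    ("ACK data", 0x7D),
]


def exchange_time_us(intervals, start_delay_us=0):
    """Total time of one logged exchange, optionally including the delay
    between the TXBUF_CMD write and the actual start of the CMD transfer."""
    return start_delay_us + sum(us for _, us in intervals)


print(exchange_time_us(MASTER_INTERVALS_US))          # 2782
print(exchange_time_us(MASTER_INTERVALS_US, 0x148))   # 3110
```

For this particular run the start delay (148h) is small compared to the exchange itself, but with the other observed delays (2Ch up to 1E5h) the total varies noticeably, which may be relevant when picking a CMD_COUNT value.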
Re: NDS Wifi specs
Now I have managed to receive Empty Replies, too. Receiving those is enabled by setting RXFILTER.bit8. What is weird is that I've received different types of empty packets (both with HWHDR[0]=801Fh, Rate=2Mbit/s, 18h-byte IEEE header, and 0-byte frame body):
Code: Select all
FC=0000, Duration=00A2, Addr1=Own, Addr2=Remote, Addr3=Own, Seq=0010
FC=0158, Duration=0126, Addr1=Own, Addr2=Remote, Addr3=0309BF000010, Seq=0020
The one with FC=0158h is what you have described for empty multiplayer REPLY.
And the other one with FC=0000h, that would be an empty Assoc Request (or maybe just garbage with FC=zero). It does occur shortly after the actual (non-empty) Assoc Request. I've no idea why it's sending that empty one.
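For reference, the two FC values can be split into their standard IEEE 802.11 Frame Control fields like this. The bit layout is the generic 802.11 one; the function name and the returned dict keys are just illustrative.

```python
def decode_frame_control(fc):
    """Split a 16-bit 802.11 Frame Control value into its main fields."""
    return {
        "version": fc & 0x3,          # bits 0-1
        "type": (fc >> 2) & 0x3,      # bits 2-3: 0=management, 1=control, 2=data
        "subtype": (fc >> 4) & 0xF,   # bits 4-7
        "to_ds": bool(fc & 0x0100),   # bit 8
        "from_ds": bool(fc & 0x0200), # bit 9
        "retry": bool(fc & 0x0800),   # bit 11
    }


print(decode_frame_control(0x0158))  # type 2 (data), subtype 5, ToDS set
print(decode_frame_control(0x0000))  # type 0 (management), subtype 0 (Assoc Request)
```

So FC=0158h decodes as a data frame with subtype 5 and ToDS set, and FC=0000h as management subtype 0, which is why the second packet looks like an (empty) Association Request.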
Re: NDS Wifi specs
nice findings!
also re: RFSTATUS=7. I kinda suspected it was a transition state of that kind, just had not observed it in hardware.
also reminds me: I should look into how the hardware calculates packet durations. those are set in a specific way during MP exchanges, I guess the purpose is to tell other wifi devices "this exchange is going to last this long, please don't use this channel in the meantime". setting durations properly isn't too important as far as emulators are concerned, but might be more important if we ever try connecting an emulator to an actual DS (not that I have high hopes about the feasibility of such a thing).
Re: NDS Wifi specs
I don't really know the IEEE specs for the Duration values. I guess they are reserving extra time for responses?
I also don't know if the NDS durations are used or adjusted by hardware or software (or if that differs per packet type).
What I have observed when logging incoming multiplayer REPLY's is that smaller packets are having larger duration values. So there, Nintendo seems to adjust them to get constant timings per "packet_size + duration".
Eg. if a packet is 8 bytes (64bit) smaller, then duration is 32 us larger (for 2Mbit/s).
They could as well append 8-byte padding, but that would (minimally) increase risk of transfer errors during those padding bytes.
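The 8 bytes / 32 us relation is just the airtime of those bytes at 2Mbit/s. A small sketch of the "packet_size + duration stays constant" idea, with made-up function names and made-up example numbers (the base duration and slot size below are not taken from any log):

```python
def airtime_us(num_bytes, rate_mbit=2):
    """Time on air for num_bytes of frame data at the given bit rate."""
    return num_bytes * 8 // rate_mbit


def compensated_duration_us(packet_bytes, slot_bytes, base_duration_us, rate_mbit=2):
    """Duration value that keeps airtime + duration constant when the packet
    is shorter than the full reply slot (hypothetical helper)."""
    return base_duration_us + airtime_us(slot_bytes - packet_bytes, rate_mbit)


# A packet that is 8 bytes smaller gets a duration that is 32 us larger.
full = compensated_duration_us(24, 24, 100)    # 100
short = compensated_duration_us(16, 24, 100)   # 132
assert airtime_us(24) + full == airtime_us(16) + short
```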
Btw. one random question/idea: Are TXBUF_LOC1..3 really all the same, or could they have specific purposes? Like one of them being specifically used for sending responses to management requests.
Well, I am currently sending everything through LOC3: data frames (for internet), management requests, and management replies. So, at least I know that LOC3 is working for all purposes.
In case of management replies, I am sending them immediately after the request, which, I think, will ensure them being transferred during the "Duration" period of the request. However, I still seem to need retries for getting the management reply transferred, and the retries would definitely be outside of the "Duration" period, so I may be doing something wrong there.
Re: NDS Wifi specs
I have somehow managed to upload the updated nds-wifi specs...
https://problemkaputt.de/gbatek-ds-wire ... ations.htm
Apart from the already mentioned changes to hardware register specs, the main news are more details about nintendo's transfer protocol...
http://problemkaputt.de/gbatek-ds-wifi- ... eacons.htm
http://problemkaputt.de/gbatek-ds-wifi- ... d-play.htm
I think that's about twice as much as what was previously known about the protocol. With some important details needed to get multiboot uploads working, and some more obscure details like exchanging information about the favorite colors of the different users. There are also still some unknown entries.
Re: NDS Wifi specs
Not sure if that interests you, but I got WIFIWAITCNT documented:
Code: Select all
WIFIWAITCNT (ARM7 - 0x04000206)
Bit 0-1: WS0 nonsequential timing (0-3 = 10, 8, 6, 18 cycles)
Bit 2: WS0 sequential timing (0-1 = 6, 4 cycles)
Bit 3-4: WS1 nonsequential timing (0-3 = 10, 8, 6, 18 cycles)
Bit 5: WS1 sequential timing (0-1 = 10, 4 cycles)
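For emulators, the table above can be turned into a small decoder. This is only a sketch of the bit layout as listed; the function name and dict keys are made up.

```python
NONSEQ_CYCLES = (10, 8, 6, 18)   # used for both WS0 (bits 0-1) and WS1 (bits 3-4)
WS0_SEQ_CYCLES = (6, 4)          # bit 2
WS1_SEQ_CYCLES = (10, 4)         # bit 5


def decode_wifiwaitcnt(value):
    """Return the access cycles selected by a WIFIWAITCNT value."""
    return {
        "ws0_nonseq": NONSEQ_CYCLES[value & 3],
        "ws0_seq": WS0_SEQ_CYCLES[(value >> 2) & 1],
        "ws1_nonseq": NONSEQ_CYCLES[(value >> 3) & 3],
        "ws1_seq": WS1_SEQ_CYCLES[(value >> 5) & 1],
    }


print(decode_wifiwaitcnt(0x00))  # slowest sequential settings: 10/6/10/10
print(decode_wifiwaitcnt(0x36))  # 6/4/6/4
```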
Re: NDS Wifi specs
Thanks! I seem to have never tested or documented that properly. The WS0 cycle values are looking same as for GBA slot access time (in EXMEMCNT register).
Are those cycle values per halfword, or per word?
In your table, Bit2=0 is 6 cycles for WS0/sequential, but Bit5=0 is 10 cycles for WS1/sequential. That looks like a "special" hardware feature... or is it a typo?
Re: NDS Wifi specs
The values are per halfword, the bus interface seems to be the same as for the GBA slot.
"Bit2=0 is 6 cycles for WS0/sequential, but Bit5=0 is 10 cycles for WS1/sequential"
Yeah, it's like that for some reason. That's not a typo.
Re: NDS Wifi specs
Reusing the GBA cart interface, including WAITCNT/EXMEMCNT, would make sense if Nintendo were originally developing the DS Wi-Fi radio as an Option Pak in SLOT-2 before putting it on the mainboard.
Re: NDS Wifi specs
Okay, thanks for confirming. Now I am wondering what would be the fastest way to read from Wifi RAM... it looks like 16bit DMA from the W_RXBUF_RD_DATA register would be faster than directly accessing the RAM via 32bit DMA or LDMIA.
I should change the "arm7_Wifi_MACCopy" function in wifiboot to use DMA from W_RXBUF_RD_DATA instead of normal "memcopy", not sure if that could actually improve the transfer speed, I guess the function is already fast enough to handle all incoming data.
Re: NDS Wifi specs
as I'm working on DS wifi again, I'll make a new post here with what I found. I don't know if nocash will see it, but regardless, here we go.
as I worked on melonDS to get local multiplayer connections stable, I found out some details that turn out to be important:
* when receiving a CMD frame, if the currently configured reply frame is too big (that is, if the preamble+data time length exceeds the maximum per-client reply time defined in the CMD frame), the hardware will ignore it and send an empty reply instead. when this happens, everything is still done as normal, except txheader[4] isn't incremented on the configured frame. the frame being sent is the same as when no reply frame is configured.
* when powering down the transceiver (via W_MODE_RST or W_POWERFORCE):
when doing so while a frame is being sent or received, the hardware will first finish that operation before powering down. the first three register changes are applied immediately, but RFSTATUS/RFPINS don't reflect idle mode until the current transmit/receive is finished.
* when receiving a beacon frame with a BSSID that matches W_BSSID, W_US_COUNT is set to the beacon's timestamp, like so:
these values are observed at a 2MB/s rate. for a 1MB/s rate, you'd multiply received_len by 8 I guess,and the 76 offset might be different too.
for me, these have greatly helped achieve a stable local-multiplayer connection.
I also did some reverse-engineering on the RX filtering, and found interesting things:
* when receiving a frame, the hardware always checks the first address field against W_MACADDR, regardless of the frame type/subtype.
* similarly, it always considers the second address field to be the source MAC -- this is used when acknowledging non-broadcast frames.
* local-multiplayer frames (RX-flags types C/D/E/F) are detected based on their MAC address fields.
if the 3rd address field is 03:09:BF:00:00:10, the frame is considered a MP reply frame (gets type E, or F if the framectl subtype is 5).
if the 1st address field is 03:09:BF:00:00:00, the frame is considered a MP CMD frame and gets type C. if that address field is 03:09:BF:00:00:03, it is considered a MP ack and gets type D. note that the aforementioned MP-reply check has priority over this one.
it's weird, because in my tests I haven't observed that. maybe I missed something?
I reverse-engineered the RXFILTER registers:
bits 0, 9, 10 and 12 in RXFILTER are still mysterious, but most of it is figured out I think. I also observed that port 0D8 had an effect on receiving data frames, so there might be more unknown RX-filter registers.
as I worked on melonDS to get local multiplayer connections stable, I found out some details that turn out to be important:
* when receiving a CMD frame, if the currently configured reply frame is too big (that is, if the preamble+data time length exceeds the maximum per-client reply time defined in the CMD frame), the hardware will ignore it and send an empty reply instead. when this happens, everything is still done as normal, except txheader[4] isn't incremented on the configured frame. the frame being sent is the same as when no reply frame is configured.
* when powering down the transceiver (via W_MODE_RST or W_POWERFORCE):
Code:
Setting W_POWERFORCE=8001h whilst W_POWERSTATE.Bit9=0 acts immediately:
(Doing this is okay. Switches to power down mode. Similar to IRQ13.)
[4808034h]=0002h ;W_INTERNAL
[480803Ch]=02xxh ;W_POWERSTATE
[48080B0h]=0000h ;W_TXREQ_READ
[480819Ch]=0046h ;W_RF_PINS
[4808214h]=0009h ;W_RF_STATUS (idle)
* when receiving a beacon frame with a BSSID that matches W_BSSID, W_US_COUNT is set to the beacon's timestamp, like so:
Code:
W_US_COUNT = beacon_timestamp + (received_len * 4) - 76
for me, these have greatly helped achieve a stable local-multiplayer connection.
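To make the two timing rules above concrete, here is a minimal C sketch. It is not taken from melonDS or from any hardware documentation: the function names, the `us_per_byte` parameter and the idea of passing the preamble time in as an argument are my own assumptions for illustration. The factor 4 and the offset 76 are the values observed at 2 Mbit/s; other rates are untested.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values observed at a 2 Mbit/s rate (4 us of airtime per byte).
   Other rates are unverified and may need a different offset. */
enum { US_PER_BYTE_2MBPS = 4, BEACON_SYNC_OFFSET_US = 76 };

/* Value loaded into W_US_COUNT when a beacon with a BSSID matching
   W_BSSID is received. received_len is the received frame length in bytes. */
static uint64_t beacon_sync_us_count(uint64_t beacon_timestamp,
                                     uint32_t received_len)
{
    return beacon_timestamp
         + (uint64_t)received_len * US_PER_BYTE_2MBPS
         - BEACON_SYNC_OFFSET_US;
}

/* Hypothetical helper: does the configured MP reply fit into the per-client
   reply time announced by the CMD frame? If this returns false, the hardware
   is described as sending an empty reply instead, without incrementing
   txheader[4] on the configured frame. All parameters are assumptions:
   preamble_us and max_reply_us are in microseconds, data_len in bytes. */
static bool mp_reply_fits(uint32_t preamble_us, uint32_t data_len,
                          uint32_t us_per_byte, uint32_t max_reply_us)
{
    return preamble_us + data_len * us_per_byte <= max_reply_us;
}
```

An emulator would call `beacon_sync_us_count()` from its beacon RX path, and `mp_reply_fits()` right before deciding whether to transmit the configured reply or the empty one.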
I also did some reverse-engineering on the RX filtering, and found interesting things:
* when receiving a frame, the hardware always checks the first address field against W_MACADDR, regardless of the frame type/subtype.
* similarly, it always considers the second address field to be the source MAC -- this is used when acknowledging non-broadcast frames.
* local-multiplayer frames (RX-flags types C/D/E/F) are detected based on their MAC address fields.
if the 3rd address field is 03:09:BF:00:00:10, the frame is considered a MP reply frame (gets type E, or F if the framectl subtype is 5).
if the 1st address field is 03:09:BF:00:00:00, the frame is considered a MP CMD frame and gets type C. if that address field is 03:09:BF:00:00:03, it is considered a MP ack and gets type D. note that the aforementioned MP-reply check has priority over this one.
Code:
0Fh: Also ALL empty packets (raw IEEE header, with 0-byte body)
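The MAC-based detection of local-multiplayer frames described above can be sketched in C as follows. This is only an illustration of the stated rules, not the hardware's actual logic: the function name and the return convention (-1 for "not an MP frame") are made up here, and the header is assumed to be a standard 24-byte IEEE 802.11 header with the address fields at offsets 4, 10 and 16.

```c
#include <stdint.h>
#include <string.h>

/* Multicast MACs used by local multiplayer, as listed in the post. */
static const uint8_t MP_CMD_MAC[6]   = {0x03, 0x09, 0xBF, 0x00, 0x00, 0x00};
static const uint8_t MP_REPLY_MAC[6] = {0x03, 0x09, 0xBF, 0x00, 0x00, 0x10};
static const uint8_t MP_ACK_MAC[6]   = {0x03, 0x09, 0xBF, 0x00, 0x00, 0x03};

/* Returns the RX-flags type (0xC..0xF) for an MP frame, or -1 if none of
   the MP rules match (the -1 convention is hypothetical).
   hdr points to the 802.11 header: FC at 0, addr1 at 4, addr3 at 16. */
static int classify_mp_frame(const uint8_t *hdr)
{
    unsigned subtype = (hdr[0] >> 4) & 0xF;  /* FC bits 4-7 */

    /* The MP-reply check on the 3rd address field has priority. */
    if (memcmp(hdr + 16, MP_REPLY_MAC, 6) == 0)
        return (subtype == 5) ? 0xF : 0xE;

    if (memcmp(hdr + 4, MP_CMD_MAC, 6) == 0)
        return 0xC;                          /* MP CMD */
    if (memcmp(hdr + 4, MP_ACK_MAC, 6) == 0)
        return 0xD;                          /* MP ack */

    return -1;
}
```

The order of the checks matters: a frame whose first address is the CMD or ack MAC but whose third address is the reply MAC still ends up as type E/F.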
I reverse-engineered the RXFILTER registers:
Code:
RXFILTER
* bit0: receive beacons with mismatched BSSID, receive data frame retransmits (FC.bit11 set) (only if frame BSSID-matches or bit11 is set), possibly others too
* bit1: receive non-MP-reply/ack data frames with FC=0x18 (subtype 1)
* bit2: receive non-MP-CMD data frames with FC=0x28 (subtype 2)
* bit3: receive data subtype 3
* bit4: receive non-MP-reply data subtype 5
* bit5: receive data subtype 6
* bit6: receive data subtype 7
* bit7: receive MP acks
* bit8: receive empty MP replies (non-MP FC=0158 ignores this bit)
* bit9: receive non-beacon management frames with mismatched BSSID
* bit10: same function as bit9
* bit11: receive control/data frames with mismatched BSSID
* bit12: (GBAtek) "Update W_RXBUF_WRCSR after IEEE header" -- no effect observed on IRQs
* bits 13-15 don't exist
frames with a destination MAC other than W_MACADDR or broadcast are always ignored. NOTE: the hardware always checks the frame's first address field and considers that to be the frame's destination MAC, regardless of the frame type.
similarly, non-broadcast frames require the second address field to be the sender's MAC, otherwise the hardware fails to correctly acknowledge the received frame.
BSSID check knows to look for the correct address field based on frame type. requires exact BSSID match, no 'broadcast' BSSID or anything.
bit8=0 causes empty MP replies to not be exposed, but they still count as a valid reply.
management subtype 9 is ignored/can't be sent.
control subtypes B..F are either not sent or ignored.
data subtypes 0 and 4 are always received
data subtypes 1 and 2 are special-cased when in a MP frame (if the destination MAC is one of the MP broadcast MACs)
data subtypes 8..F can't be sent
retransmit detection seems to be based on seqno (and maybe also FC.bit11?).
there's different retransmit handling for MP CMD frames. the hardware seems to keep track of the last CMD-frame seqno specifically, and doesn't seem to be controllable by RXFILTER.bit0.
RXFILTER2
filters data frames based on their FromDS/ToDS bits
* bit0: ignore FC=00x8 (fromSTA/toSTA)
* bit1: ignore FC=01x8 (fromSTA/toDS)
* bit2: ignore FC=02x8 (fromDS/toSTA)
* bit3: ignore FC=03x8 (fromDS/toDS)
* bit4-15 don't exist
bit1=1 causes MP replies not to be exposed, but they still count as a valid reply.
bit2=1 causes a MP client to attempt sending something? but it doesn't seem to send anything valid -- counts as an error on the host side.
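As a rough illustration of the data-subtype bits in RXFILTER and the FromDS/ToDS bits in RXFILTER2, here is a small C sketch. It deliberately models only plain (non-MP) data frames and leaves out the BSSID checks, retransmit handling and all MP special cases listed above. The function names are invented for this example, and returning "not accepted" for subtypes 8..F is an assumption (the notes only say those subtypes can't be sent).

```c
#include <stdbool.h>
#include <stdint.h>

/* 802.11 frame control helpers: type in bits 2-3, subtype in bits 4-7. */
static unsigned fc_type(uint16_t fc)    { return (fc >> 2) & 0x3; }
static unsigned fc_subtype(uint16_t fc) { return (fc >> 4) & 0xF; }

/* RXFILTER check for a plain (non-MP) data frame, by subtype only.
   Non-data frames are outside this sketch and simply return false. */
static bool rxfilter_accepts_plain_data(uint16_t rxfilter, uint16_t fc)
{
    if (fc_type(fc) != 2)
        return false;
    switch (fc_subtype(fc)) {
    case 0: case 4: return true;             /* always received */
    case 1: return (rxfilter >> 1) & 1;      /* bit1: FC=0x18 */
    case 2: return (rxfilter >> 2) & 1;      /* bit2: FC=0x28 */
    case 3: return (rxfilter >> 3) & 1;      /* bit3 */
    case 5: return (rxfilter >> 4) & 1;      /* bit4 */
    case 6: return (rxfilter >> 5) & 1;      /* bit5 */
    case 7: return (rxfilter >> 6) & 1;      /* bit6 */
    default: return false;                   /* 8..F: assumption */
    }
}

/* RXFILTER2: bit n set means "ignore data frames whose FC bits 8-9
   (ToDS = bit 8, FromDS = bit 9) equal n". */
static bool rxfilter2_ignores(uint16_t rxfilter2, uint16_t fc)
{
    if (fc_type(fc) != 2)
        return false;                        /* only data frames filtered */
    unsigned ds = (fc >> 8) & 0x3;
    return (rxfilter2 >> ds) & 1;
}
```

A frame would be passed on only if `rxfilter_accepts_plain_data()` is true and `rxfilter2_ignores()` is false. The remaining RXFILTER bits (0, 7..12) would need the BSSID and MP state, which this sketch does not track.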