Reverse-engineering DLDI specs for NDS

coto
Posts: 102
Joined: Wed Mar 06, 2019 6:00 pm
Location: Chile

Re: Reverse-engineering DLDI specs for NDS

Post by coto »

nocash wrote:
coto wrote: and otherwise, I just copy -> paste the DLDI in there, I get segfaults just by executing it from that address. Even when updating the DLDI headers.
Yes, do that, simply memcopy the block from 0200xxxxh to 0680xxxxh, without any address adjustments.
There should be no segfaults if you have correctly specified 0680xxxxh as dldi target address in the "empty dldi area" at time when creating the .nds file.
"Empty dldi area"? What is that supposed to mean?

-

Anyway, I can demonstrate what I am saying (I'm not lying, and I can prove it):

Sources: https://bitbucket.org/Coto88/gbarunner2/src/master/

I enabled DLDI @ ARM9, recompiled GBARunner2, and just DMA-copied the DLDI from EWRAM -> VRAM (I cannot use memcpy, because it may use 8-bit writes, and 8-bit writes to VRAM fail on the Nintendo DS). The result: undefined exceptions.
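As an alternative to DMA, the same restriction can be respected with a plain CPU loop that only ever issues 32-bit stores. This is a minimal sketch, not code from GBARunner2: `copy_words32` is a hypothetical helper, and it assumes both pointers are 4-byte aligned and the length is a multiple of 4 (true for a DLDI block whose size is a power of two).

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `bytes` bytes using only 32-bit loads and stores.
 * Assumptions: dst and src are 4-byte aligned, bytes is a multiple of 4.
 * The volatile destination keeps the compiler from replacing the loop
 * with a library memcpy that might fall back to byte stores. */
static void copy_words32(volatile uint32_t *dst, const uint32_t *src, size_t bytes)
{
    size_t words = bytes / 4u;
    for (size_t i = 0; i < words; i++) {
        dst[i] = src[i];
    }
}
```

A call would look like `copy_words32((volatile uint32_t *)target, (const uint32_t *)source, size)`, with `target` pointing into VRAM.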

Steps to reproduce that:

dldigba.c -> replace this function:

(this one does the relocation perfectly)
#ifdef ARM9
PUT_IN_VRAM
#endif
bool dldiPatchLoader(bool clearBSS, u32 DldiRelocatedAddress, u32 dldiSourceInRam)
{
    addr_t memOffset; // Offset of DLDI after the file is loaded into memory
    addr_t patchOffset; // Position of patch destination in the file
    addr_t relocationOffset; // Value added to all offsets within the patch to fix it properly
    addr_t ddmemOffset; // Original offset used in the DLDI file
    addr_t ddmemStart; // Start of range that offsets can be in the DLDI file
    addr_t ddmemEnd; // End of range that offsets can be in the DLDI file
    addr_t ddmemSize; // Size of range that offsets can be in the DLDI file

    addr_t addrIter;

    data_t *pDH;
    data_t *pAH;

    size_t dldiFileSize = 0;

    // Target the DLDI we want to use as stub copy and then relocate it to a DldiRelocatedAddress address
    DLDI_INTERFACE* dldiInterface = (DLDI_INTERFACE*)DldiRelocatedAddress;
    pDH = (data_t*)dldiInterface;
    pAH = (data_t *)dldiSourceInRam;

    dldiFileSize = 1 << pAH[DO_driverSize];

    // Copy the DLDI patch into the application
    dmaCopyWords(0, (void*)pAH, (void*)pDH, dldiFileSize);

    if (*((u32*)(pDH + DO_ioType)) == DEVICE_TYPE_DLDI) {
        // No DLDI patch
        return false;
    }

    if (pDH[DO_driverSize] > pAH[DO_allocatedSpace]) {
        // Not enough space for patch
        return false;
    }


    memOffset = DldiRelocatedAddress; //readAddr (pAH, DO_text_start);
    if (memOffset == 0) {
        memOffset = readAddr (pAH, DO_startup) - DO_code;
    }
    ddmemOffset = readAddr (pDH, DO_text_start);
    relocationOffset = memOffset - ddmemOffset;

    ddmemStart = readAddr (pDH, DO_text_start);
    ddmemSize = (1 << pDH[DO_driverSize]);
    ddmemEnd = ddmemStart + ddmemSize;

    // Remember how much space is actually reserved
    pDH[DO_allocatedSpace] = pAH[DO_allocatedSpace];


    // Fix the section pointers in the DLDI @ VRAM header
    writeAddr (pDH, DO_text_start, readAddr (pAH, DO_text_start) + relocationOffset);
    writeAddr (pDH, DO_data_end, readAddr (pAH, DO_data_end) + relocationOffset);
    writeAddr (pDH, DO_glue_start, readAddr (pAH, DO_glue_start) + relocationOffset);
    writeAddr (pDH, DO_glue_end, readAddr (pAH, DO_glue_end) + relocationOffset);
    writeAddr (pDH, DO_got_start, readAddr (pAH, DO_got_start) + relocationOffset);
    writeAddr (pDH, DO_got_end, readAddr (pAH, DO_got_end) + relocationOffset);
    writeAddr (pDH, DO_bss_start, readAddr (pAH, DO_bss_start) + relocationOffset);
    writeAddr (pDH, DO_bss_end, readAddr (pAH, DO_bss_end) + relocationOffset);

    // Fix the function pointers in the header
    writeAddr (pDH, DO_startup, readAddr (pAH, DO_startup) + relocationOffset);
    writeAddr (pDH, DO_isInserted, readAddr (pAH, DO_isInserted) + relocationOffset);
    writeAddr (pDH, DO_readSectors, readAddr (pAH, DO_readSectors) + relocationOffset);
    writeAddr (pDH, DO_writeSectors, readAddr (pAH, DO_writeSectors) + relocationOffset);
    writeAddr (pDH, DO_clearStatus, readAddr (pAH, DO_clearStatus) + relocationOffset);
    writeAddr (pDH, DO_shutdown, readAddr (pAH, DO_shutdown) + relocationOffset);

    if (pDH[DO_fixSections] & FIX_ALL) {
        // Search through and fix pointers within the data section of the file
        for (addrIter = (readAddr(pDH, DO_text_start) - ddmemStart); addrIter < (readAddr(pDH, DO_data_end) - ddmemStart); addrIter++) {
            if ((ddmemStart <= readAddr(pAH, addrIter)) && (readAddr(pAH, addrIter) < ddmemEnd)) {
                writeAddr (pAH, addrIter, readAddr(pAH, addrIter) + relocationOffset);
            }
        }
    }


    if (pDH[DO_fixSections] & FIX_GLUE) {
        // Search through and fix pointers within the glue section of the file
        for (addrIter = (readAddr(pDH, DO_glue_start) - ddmemStart); addrIter < (readAddr(pDH, DO_glue_end) - ddmemStart); addrIter++) {
            if ((ddmemStart <= readAddr(pAH, addrIter)) && (readAddr(pAH, addrIter) < ddmemEnd)) {
                writeAddr (pAH, addrIter, readAddr(pAH, addrIter) + relocationOffset);
            }
        }
    }

    if (pDH[DO_fixSections] & FIX_GOT) {
        // Search through and fix pointers within the Global Offset Table section of the file
        for (addrIter = (readAddr(pDH, DO_got_start) - ddmemStart); addrIter < (readAddr(pDH, DO_got_end) - ddmemStart); addrIter++) {
            if ((ddmemStart <= readAddr(pAH, addrIter)) && (readAddr(pAH, addrIter) < ddmemEnd)) {
                writeAddr (pAH, addrIter, readAddr(pAH, addrIter) + relocationOffset);
            }
        }
    }

    /*
    if (clearBSS && (pDH[DO_fixSections] & FIX_BSS)) {
        // Initialise the BSS to 0, only if the disc is being re-inited
        for(int i = 0; i < (readAddr(pDH, DO_bss_end) - readAddr(pDH, DO_bss_start)) / 4; i++)
        {
            ((uint32_t*)&pAH[readAddr(pDH, DO_bss_start) - ddmemStart]) = 0;
        }

    }
    */
    return true;
}
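
The core step of the function above, adding a relocation offset to every word that points into the driver's old address range, can be sketched in isolation. This is a simplified illustration and not the GBARunner2 code: `rebase_range`, `rd32` and `wr32` are hypothetical names, the scan steps in aligned 32-bit words (the loops above step byte by byte), and the helpers use byte accesses, so the sketch is meant to run on a copy in main RAM rather than directly on VRAM.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value at a byte offset. */
static uint32_t rd32(const uint8_t *p, size_t off)
{
    return (uint32_t)p[off] | ((uint32_t)p[off + 1] << 8) |
           ((uint32_t)p[off + 2] << 16) | ((uint32_t)p[off + 3] << 24);
}

/* Write a little-endian 32-bit value at a byte offset. */
static void wr32(uint8_t *p, size_t off, uint32_t v)
{
    p[off]     = (uint8_t)(v & 0xFFu);
    p[off + 1] = (uint8_t)((v >> 8) & 0xFFu);
    p[off + 2] = (uint8_t)((v >> 16) & 0xFFu);
    p[off + 3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Rebase every word in image[from..to) whose value lies inside
 * [old_base, old_base + size). Returns the number of words changed. */
static size_t rebase_range(uint8_t *image, size_t from, size_t to,
                           uint32_t old_base, uint32_t size, uint32_t new_base)
{
    uint32_t delta = new_base - old_base; /* wraps modulo 2^32 on purpose */
    size_t fixed = 0;

    for (size_t off = from; off + 4 <= to; off += 4) {
        uint32_t v = rd32(image, off);
        /* Unsigned subtraction: true only if old_base <= v < old_base + size */
        if ((uint32_t)(v - old_base) < size) {
            wr32(image, off, v + delta);
            fixed++;
        }
    }
    return fixed;
}
```

For example, with a driver moved from 0x02000000 to 0x06860000, a word holding 0x02000010 becomes 0x06860010, while a value outside the old range is left alone.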


And instead replace it with:


(just DMA copy DLDI from EWRAM -> VRAM)
#ifdef ARM9
PUT_IN_VRAM
#endif
bool dldiPatchLoader(bool clearBSS, u32 DldiRelocatedAddress, u32 dldiSourceInRam)
{
    addr_t memOffset; // Offset of DLDI after the file is loaded into memory
    addr_t patchOffset; // Position of patch destination in the file
    addr_t relocationOffset; // Value added to all offsets within the patch to fix it properly
    addr_t ddmemOffset; // Original offset used in the DLDI file
    addr_t ddmemStart; // Start of range that offsets can be in the DLDI file
    addr_t ddmemEnd; // End of range that offsets can be in the DLDI file
    addr_t ddmemSize; // Size of range that offsets can be in the DLDI file

    addr_t addrIter;

    data_t *pDH;
    data_t *pAH;

    size_t dldiFileSize = 0;

    // Target the DLDI we want to use as stub copy and then relocate it to a DldiRelocatedAddress address
    DLDI_INTERFACE* dldiInterface = (DLDI_INTERFACE*)DldiRelocatedAddress;
    pDH = (data_t*)dldiInterface;
    pAH = (data_t *)dldiSourceInRam;

    dldiFileSize = 1 << pAH[DO_driverSize];

    // Copy the DLDI patch into the application
    dmaCopyWords(0, (void*)pAH, (void*)pDH, dldiFileSize);

    /*
    if (*((u32*)(pDH + DO_ioType)) == DEVICE_TYPE_DLDI) {
        // No DLDI patch
        return false;
    }

    if (pDH[DO_driverSize] > pAH[DO_allocatedSpace]) {
        // Not enough space for patch
        return false;
    }


    memOffset = DldiRelocatedAddress; //readAddr (pAH, DO_text_start);
    if (memOffset == 0) {
        memOffset = readAddr (pAH, DO_startup) - DO_code;
    }
    ddmemOffset = readAddr (pDH, DO_text_start);
    relocationOffset = memOffset - ddmemOffset;

    ddmemStart = readAddr (pDH, DO_text_start);
    ddmemSize = (1 << pDH[DO_driverSize]);
    ddmemEnd = ddmemStart + ddmemSize;

    // Remember how much space is actually reserved
    pDH[DO_allocatedSpace] = pAH[DO_allocatedSpace];


    // Fix the section pointers in the DLDI @ VRAM header
    writeAddr (pDH, DO_text_start, readAddr (pAH, DO_text_start) + relocationOffset);
    writeAddr (pDH, DO_data_end, readAddr (pAH, DO_data_end) + relocationOffset);
    writeAddr (pDH, DO_glue_start, readAddr (pAH, DO_glue_start) + relocationOffset);
    writeAddr (pDH, DO_glue_end, readAddr (pAH, DO_glue_end) + relocationOffset);
    writeAddr (pDH, DO_got_start, readAddr (pAH, DO_got_start) + relocationOffset);
    writeAddr (pDH, DO_got_end, readAddr (pAH, DO_got_end) + relocationOffset);
    writeAddr (pDH, DO_bss_start, readAddr (pAH, DO_bss_start) + relocationOffset);
    writeAddr (pDH, DO_bss_end, readAddr (pAH, DO_bss_end) + relocationOffset);

    // Fix the function pointers in the header
    writeAddr (pDH, DO_startup, readAddr (pAH, DO_startup) + relocationOffset);
    writeAddr (pDH, DO_isInserted, readAddr (pAH, DO_isInserted) + relocationOffset);
    writeAddr (pDH, DO_readSectors, readAddr (pAH, DO_readSectors) + relocationOffset);
    writeAddr (pDH, DO_writeSectors, readAddr (pAH, DO_writeSectors) + relocationOffset);
    writeAddr (pDH, DO_clearStatus, readAddr (pAH, DO_clearStatus) + relocationOffset);
    writeAddr (pDH, DO_shutdown, readAddr (pAH, DO_shutdown) + relocationOffset);

    if (pDH[DO_fixSections] & FIX_ALL) {
        // Search through and fix pointers within the data section of the file
        for (addrIter = (readAddr(pDH, DO_text_start) - ddmemStart); addrIter < (readAddr(pDH, DO_data_end) - ddmemStart); addrIter++) {
            if ((ddmemStart <= readAddr(pAH, addrIter)) && (readAddr(pAH, addrIter) < ddmemEnd)) {
                writeAddr (pAH, addrIter, readAddr(pAH, addrIter) + relocationOffset);
            }
        }
    }


    if (pDH[DO_fixSections] & FIX_GLUE) {
        // Search through and fix pointers within the glue section of the file
        for (addrIter = (readAddr(pDH, DO_glue_start) - ddmemStart); addrIter < (readAddr(pDH, DO_glue_end) - ddmemStart); addrIter++) {
            if ((ddmemStart <= readAddr(pAH, addrIter)) && (readAddr(pAH, addrIter) < ddmemEnd)) {
                writeAddr (pAH, addrIter, readAddr(pAH, addrIter) + relocationOffset);
            }
        }
    }

    if (pDH[DO_fixSections] & FIX_GOT) {
        // Search through and fix pointers within the Global Offset Table section of the file
        for (addrIter = (readAddr(pDH, DO_got_start) - ddmemStart); addrIter < (readAddr(pDH, DO_got_end) - ddmemStart); addrIter++) {
            if ((ddmemStart <= readAddr(pAH, addrIter)) && (readAddr(pAH, addrIter) < ddmemEnd)) {
                writeAddr (pAH, addrIter, readAddr(pAH, addrIter) + relocationOffset);
            }
        }
    }
    */

    /*
    if (clearBSS && (pDH[DO_fixSections] & FIX_BSS)) {
        // Initialise the BSS to 0, only if the disc is being re-inited
        for(int i = 0; i < (readAddr(pDH, DO_bss_end) - readAddr(pDH, DO_bss_start)) / 4; i++)
        {
            ((uint32_t*)&pAH[readAddr(pDH, DO_bss_start) - ddmemStart]) = 0;
        }

    }
    */
    return true;
}


shared.h -> disable DLDI @ ARM7 (makes DLDI @ ARM9):

#ifndef __SHARED_H__
#define __SHARED_H__

//DLDI ARM7 Support (requires to recompile the project)
//#define ARM7_DLDI

#define SOUND_BUFFER_SIZE (8192)
#define SAVE_DATA_SIZE (0x20000) //128K SRAM/EEprom/Flash 8bit/16bit/32bit write compatible RAM memory.
#define ROM_DATA_LENGTH (0x3A0000 - (32*1024) ) //0x400000 - 0x40000 (hypervisor) - 0x20000 (128K) = 0x3A0000
#define ROM_ADDRESS_MAX (0x08000000 + ROM_DATA_LENGTH)
#define MAIN_MEMORY_ADDRESS_SAVE_DATA (0x02400000 - SAVE_DATA_SIZE) //-> (0x023E0000 ~ 0x023FFFFF) -> mirror: 0x01FE0000
#define MAIN_MEMORY_ADDRESS_SDCACHE (MAIN_MEMORY_ADDRESS_SAVE_DATA - (32*1024)) //-> (0x23D8000): -- used for shared DLDI sector memory between ARM7/ARM9

#define SOUND_EMU_QUEUE_LEN 64
#define address_dtcm (0x02C00000) //@0x04F00000 @0x01800000
#define MAIN_MEMORY_ADDRESS_ROM_DATA (0x02040000)

//VRAM Layout
#define sd_cluster_cache_addr (0x06840000)
#define sd_access_driver (0x06860000)

#define FIFO_CNT_EMPTY (1 << 8)
#define REG_FIFO_CNT (*((vu32*)0x04000184))
#define REG_SEND_FIFO (*((vu32*)0x04000188))
#define REG_RECV_FIFO (*((vu32*)0x04100000))

#ifdef __ASSEMBLER__

sd_cluster_cache = sd_cluster_cache_addr
sd_data_base = sd_access_driver
sd_is_cluster_cached_table = (sd_data_base + 32768 + (64*1024) ) @(sd_data_base + (224 * 1024))
sd_cluster_cache_info = (sd_is_cluster_cached_table + (16 * 1024))
sd_sd_info = (sd_cluster_cache_info + (256 * 8 + 4)) @0x0685C404
pu_data_permissions = 0x33600603 @0x33600003 @0x33660003

#endif

#ifndef __ASSEMBLER__

#ifdef ARM9
#define PUT_IN_VRAM __attribute__((section(".vram")))
#define ITCM_CODE __attribute__((section(".itcm"), long_call))
#endif

#define PACKED __attribute__ ((packed))

#endif //__ASSEMBLER__

#endif



Now rebuild GBARunner2.

You'll get a build that raises undefined exceptions: the registers still point into the original DLDI address range, while the Program Counter is in VRAM.

Thus, I cannot simply copy (or dmacopy) the DLDI from EWRAM to VRAM, which is why I am replying here and calling it "a trick".
nocash
Posts: 1405
Joined: Fri Feb 24, 2012 12:09 pm

Re: Reverse-engineering DLDI specs for NDS

Post by nocash »

The "empty dldi area" is the memory area in your .nds file that is reserved for the flashcart driver. If the driver hasn't been installed yet, then it is initially "empty", or "almost empty", because there are a few non-empty bytes in there: the dldi ID bytes/string EDh,A5h,8Dh,BFh,20h,"Chishm",00h at offset 00h, the size byte at offset 0Fh, and the load address at offset 40h.
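
To make the layout above concrete, here is a minimal sketch that locates the reserved area in a loaded .nds image by its ID bytes and reads the size byte. The function names are hypothetical. Two things are assumptions rather than statements from this post: that the area starts on a 4-byte boundary, and that the size byte is a log2 value (inferred from the `1 << pAH[DO_driverSize]` expression and the driverSize/allocatedSpace comparison in the code posted earlier in this thread).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ID bytes at offset 00h: EDh,A5h,8Dh,BFh,20h,"Chishm",00h */
static const uint8_t DLDI_ID[12] = {
    0xED, 0xA5, 0x8D, 0xBF, 0x20, 'C', 'h', 'i', 's', 'h', 'm', 0x00
};

/* Return the byte offset of the DLDI area inside buf, or -1 if absent.
 * Assumption: the area is 4-byte aligned, so only those offsets are tried. */
static long find_dldi_area(const uint8_t *buf, size_t len)
{
    for (size_t off = 0; off + sizeof DLDI_ID <= len; off += 4) {
        if (memcmp(buf + off, DLDI_ID, sizeof DLDI_ID) == 0) {
            return (long)off;
        }
    }
    return -1;
}

/* Size byte at offset 0Fh, interpreted as log2 of the reserved size
 * (assumption, see the lead-in). Returns 0 for implausible values. */
static size_t dldi_allocated_bytes(const uint8_t *area)
{
    uint8_t log2_size = area[0x0F];
    if (log2_size > 31) {
        return 0;
    }
    return (size_t)1 << log2_size;
}
```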

The important value here is the load address at offset 40h, I assume you (or your devkit) have configured that value to 02xxxxxxh, and that is wrong because your code will actually load it to VRAM, so the correct value would be 068xx000h (or wherever you have it in VRAM).
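
A quick way to check this is to read the 32-bit load address at offset 40h and compare it with the address the block will actually be copied to. The sketch below is only an illustration: the helper names are hypothetical, the value is assumed to be stored little-endian (as is usual on the NDS), and 0x06860000 in the usage note is an example address, not a known-correct value for any particular build.

```c
#include <stdbool.h>
#include <stdint.h>

enum { DLDI_OFF_LOAD_ADDR = 0x40 }; /* load address field, per the post above */

/* Read the little-endian load address from the start of the DLDI area. */
static uint32_t dldi_load_address(const uint8_t *area)
{
    const uint8_t *p = area + DLDI_OFF_LOAD_ADDR;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* True if the driver was patched for the address it will be copied to,
 * i.e. a plain copy without further address adjustments should work. */
static bool dldi_matches_target(const uint8_t *area, uint32_t copy_target)
{
    return dldi_load_address(area) == copy_target;
}
```

If `dldi_matches_target(area, 0x06860000u)` is false while the loader copies the block to 0x06860000, the pointers inside the driver still refer to the old location.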

If you figure out how to change that value, then you won't need to adjust the FIX_GLUE and FIX_GOT stuff manually. That should make things considerably more reliable, and it will probably fix mysterious issues like problems with SD vs SDHC... unless that issue was caused by missing 8bit write support in VRAM.
NightScript
Posts: 3
Joined: Mon Apr 27, 2020 9:46 pm

Re: Reverse-engineering DLDI specs for NDS

Post by NightScript »

nocash wrote: Fri Nov 09, 2018 9:18 pm Would there be any interest in DLDI support on DSi? I have no idea how many games & tools are actually needing DLDI support (and if they were really worth adding DLDI support).
The "top ten titles" might already have native support for DSi SD/MMC slot, if that's so, then DLDI would be needed only for some rather obscure old NDS titles, ie. the kind of stuff that wasn't updated (and maybe not even used or downloaded) in the past some years.
I'm 2 years late, but a DLDI driver for the Nintendo DSi has been made: https://github.com/ahezard/nds-bootstra ... er/hb/dldi
Trouble is, it's not that well written. For example, it uses FIX_ALL, something that's highly discouraged for DLDI drivers.
It'd be cool if you could take a look at it and see what could be done to improve/fix it.
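
For anyone checking a driver for this, a minimal sketch of testing the fix flags follows. The flag names appear in the code posted earlier in this thread; the numeric values and the 0Eh offset of the flags byte are assumptions recalled from Chishm's DLDI patcher sources and should be verified against them. `dldi_uses_fix_all` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    DLDI_OFF_FIX_SECTIONS = 0x0E, /* assumed offset of the fix-flags byte */
    FIX_ALL  = 0x01,              /* assumed values, see the lead-in */
    FIX_GLUE = 0x02,
    FIX_GOT  = 0x04,
    FIX_BSS  = 0x08
};

/* True if the driver asks the patcher to scan its whole text/data range
 * for anything that looks like a pointer into the driver. */
static bool dldi_uses_fix_all(const uint8_t *hdr)
{
    return (hdr[DLDI_OFF_FIX_SECTIONS] & FIX_ALL) != 0;
}
```

The concern with FIX_ALL is that such a scan cannot tell real pointers from data or instruction words that merely happen to fall inside the driver's address range.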

Here's a link to download it: https://download-directory.github.io/?u ... er/hb/dldi
lifehackerhansol
Posts: 1
Joined: Mon Dec 06, 2021 1:19 am

Re: Reverse-engineering DLDI specs for NDS

Post by lifehackerhansol »

I've run dldiscan on the DLDI archive repository hosted at https://github.com/DS-Homebrew/DLDI.

Seems GitHub isn't for everyone so I just attached the whole thing here along with the results stdout'd to result.txt and result_dumped.txt.

result.txt covers the raw DLDI files we received from wherever we could: bundled with flashcart kernels, released on Chishm's site, or from other sources. result_dumped.txt covers DLDI files that were dumped from memory, for which a few tools exist. These dumped DLDI files do not have proper ddmem addresses, since they were dumped from memory and are therefore already patched to the required memory address.

Some come with source code. Not all of these sources build correctly with the latest devkitARM environment (or rather, they compile but don't actually work on real hardware), which is something I'd like to fix one day but don't really know how.
Attachments
DLDI_scan.zip
(727.82 KiB)