Automatically determining track lengths?

Discuss NSF files, FamiTracker, MML tools, or anything else related to NES music.

Moderator: Moderators

tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Automatically determining track lengths?

Post by tepples »

blargg wrote:Even if a track has no timing information, a player should fade it out after some number of minutes. The proper solution is for the song file itself to encode the length of tracks in the file itself. If this can be done automatically, great, but do that on the archive copy of the file rather than requiring a player to go through backflips. For the NSF format, this has been solved, and it's called NSFE.
Is there a way to mass-convert NSF files to NSFE with accurate track lengths without having to pay somebody at least minimum wage to sit down and listen to each track in each NSF and write down when to fade them out?
User avatar
blargg
Posts: 3717
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA
Contact:

Post by blargg »

A nice distributed solution would be to capture this data as a part of people's natural NSF playing behavior. The main piece of data desired is the length to play the song before fading out. The rest can be determined automatically or doesn't matter.

I've written command-line tools to convert between NSF+text file and NSFE (both ways), and have done some experimentation with automatic loop finding by looking at registers and frequency (FFT) data. Even with an automated tool someone would need to check the files for cases where automated detection fails.

I guess I don't listen to game music that much, since I get by fine with my player which fades out after 2.5 minutes (or different if the track specifies a length).
NewRisingSun
Posts: 1312
Joined: Thu May 19, 2005 11:30 am

Post by NewRisingSun »

by looking at registers and frequency (FFT) data
Frequency data?

Under what scenario might the following similiar approach fail: For each frame, make a snapshot of both CPU RAM and hardware registers --- then compare each new frame's snapshot with all previous ones. If a match has been found (all variables are the same), the track is looping or stuck in silence. If no match has been found after, say, five minutes worth of snapshots --- or any time after which it can be said that the music MUST be over by now ---, there must be some "rogue" variables that never get reset upon looping, because they count "how much time has passed since playing started" or something like that.
Determine which variables they are, because they don't match any previous snapshot, and go through all frame snapshots again, comparing each one with all previous ones, but this time ignoring said variables.

Either way, the first snapshot that matches a previous one is the end of the song. If one snapshot takes about 4 KB (more for NSFs from games that had extra RAM), said five minutes would need 4 KB * 60 snapshots per second * 300 seconds = 72000 KB, which shouldn't be a problem with today's harddisks. Since it would run all by itself, speed should not be that big an issue either, and it doesn't need to run in real-time...
User avatar
blargg
Posts: 3717
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA
Contact:

Post by blargg »

My idea with frequencies was to perform a Fourier transform on blocks of samples. Below is a frequency plot of the beginning of Super Mario Bros. main theme. You can follow the upper light gray bars for the first few notes of the melody.

Image

I figured this might make it easier to look at the important data, since the waveforms might differ.

Your idea about looking at system state sounds excellent. I just threw together a simple version saves 32-bit checksums of the low 2K RAM and it works on several tracks from Destiny of an Emperor. I have it play 5 minutes of the track (only takes a little over a second of host CPU time) and then start at the beginning of the checksums for each 1/60 second frame and find a run of 20 or more that repeat later in the track. I'll have to experiment with the "rogue variables" issue.
If one snapshot takes about 4 KB (more for NSFs from games that had extra RAM), said five minutes would need 4 KB * 60 snapshots per second * 300 seconds = 72000 KB, which shouldn't be a problem with today's harddisks.
That also shouldn't be a problem just keeping it all in RAM, with today's memory capacities. :) It might even be simpler to just keep checksums of the snapshots and then regenerate them when actually needed (by running multiple NSF emulators at once or something).
NewRisingSun
Posts: 1312
Joined: Thu May 19, 2005 11:30 am

Post by NewRisingSun »

I think there's at least one scenario where my approach fails: if the track itself doesn't loop, but the voices loop out of sync at different times, like in Gradius.nsf song 2.
User avatar
blargg
Posts: 3717
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA
Contact:

Post by blargg »

I don't think the "rogue variable" problem can be eliminated without making further assumptions. Say a music engine increments a 16-bit value once each frame. Every 4.27 seconds the low 8 bits will loop. How do you eliminate this from consideration?

I was thinking more about what state to monitor and it seems that it would be simpler to examine only sound register writes each frame, looking for total repetition of the same writes. This will never be more strict than looking at total system state, since identical system state will always result in the same register writes, while a given set of register writes could occur from several possible system states. Looking at register writes eliminates the "rogue variable" problem.

The one remaining problem I foresee is songs with tempos that aren't an integral division of 3600 bpm. With these, the song's loop might not coincide with the same tempo divisor fraction as when the song first passed that point, causing slight variation in the timing that would throw off a loop detector. Perhaps it could ignore +/- 1/60 second differences in timing.
User avatar
blargg
Posts: 3717
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA
Contact:

Post by blargg »

The register writes approach is working pretty well (actually, just a checksum of the register writes right now), and I was able to put in some allowance for +/- 1/60 second jitter. It's given proper times for tracks from several different soundtracks from different games, having trouble on a couple of tracks from each.

At this point I could put together an open-source command-line tool that took an NSF and generated a text file timesheet, which you could then fill in with the track names, adjust times, add playlist order, and adjust any other fields, then feed this and the NSF through an NSFE builder tool to generate the final NSFE. It'd be nice if we had a good automated NSF/NSFE submission site as I described in another thread.

The format is something like this (... parts snipped to save space).

Code: Select all

Game: Snake's Revenge
Author: T. Ogura
Copyright: 1990 Konami/Ultra Games
Ripper: Matt Conte

Playlist:
05      0:24    Introduction
00      0:38    Mission Introduction
...
09      1:23    Ending/Staff Roll
18      0:05    Mission Failed

Tracks:
20      -       Pause
21      -       Tap
22      -       Tap 2
...
Knurek

Post by Knurek »

blargg wrote:The register writes approach is working pretty well (actually, just a checksum of the register writes right now), and I was able to put in some allowance for +/- 1/60 second jitter. It's given proper times for tracks from several different soundtracks from different games, having trouble on a couple of tracks from each.
Groovy. The big question - does it have loop number support? I'd like to have my track play for two loops (a bit of a standard in VGM community). Doubling the detected time would only work for the tunes that don't have an intro part.

And does your method work with songs that have this structure: loop, loop, small outro, loop, loop, small outro. There was a song in Ys3 that did that (Theme of Chester).

Also, keep in mind that some of us actually don't use NSFE (NotSoFatso works like ass with XMPlay, and actually I prefer the sound of in_yansf (Farty preset)), so even if you integrate a NSFE creator in the detection tool, please keep the text output option, as converting from that to M3U playlists or whatever would be n times easier than from NSFE files.
User avatar
Dwedit
Posts: 4470
Joined: Fri Nov 19, 2004 7:35 pm
Contact:

Post by Dwedit »

Is it possible to detect song loops in games like Wizards and Warriors 3 where the song speeds up each time it loops?
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
User avatar
blargg
Posts: 3717
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA
Contact:

Post by blargg »

As far as number of loops, that information is available. NSFE doesn't have a block for it, but one could be added, allowing the player to decide how many loops based on the loop start and duration. The normal play time before fade out would be based on the number of loops, but up to final adjustement based on how long it's pleasant to listen for. The loop algorithm currently verifies that the found loop continues through the entire time it sampled (I think I'm using 10 minutes currently), thus songs with sub loops won't throw it off.

I didn't realize that NSFE wasn't the preferred format. Can't someone just modify players to support it? I've got some LGPL code that reads an NSFE and then converts in memory to an NSF for use with a normal NSF sound engine, and allows access to the track names. Or perhaps the issue is that multiple tracks in a single file are problematic, and that this has already been solved in a general way with the use of m3u files?

I'm making two separate tools: loopnsf to find the loop points and output these as a text file, and nsf2nsfe to build an NSFE from an NSF and text file. I'll modularize loopnsf a bit more to make it easy to output an m3u instead. I've also written an nsfe2nsf tool, which could be made to output an m3u from an NSFE, which might be the most helpful of all. They'll all be open-source so you can tweak these things once I'm done. What format should the m3u file be (I'm not at all familiar with m3u)?

I tried it with Wizards and Warriors and it didn't find the loops. So the songs steadily speed up, eventually going really fast? The main goal of the tool is to find the loop times of most tracks, leaving the problematic ones to a person to do manually.

Running on the version of Ys III that I have, loopnsf outputs the following track times (reformatted to save space). Tracks marked with a dash eventually end, which I don't have the tool detect yet.

Code: Select all

   0    1    2    3    4    5    6    7    8    9
0      1:36 1:08 2:28 0:56 1:24 1:30  -    -   3:38
1  -    -    -   1:06  -   3:23 0:42 1:42 1:14 2:33	
2 1:56 2:32 1:36 0:50 0:28 2:06  -	
Knurek

Post by Knurek »

blargg wrote:As far as number of loops, that information is available.
Joy. Listening to NSFs is gonna get more pleasant soon.
blargg wrote:I didn't realize that NSFE wasn't the preferred format. Can't someone just modify players to support it?
Well, both NSFPlug (in_yansf) and NEZPlug (in_nez) are open source, so it's possible. Still, both haven't been updated for a while (well, NEZPlug got recently updated by RuRuRu, but that was only compatibility fixes).
blargg wrote:Or perhaps the issue is that multiple tracks in a single file are problematic, and that this has already been solved in a general way with the use of m3u files?
The thing is - with M3U playlist I can listen to timed songs whenever I want, but if I want to listen to the tunes endlessly, I just open NSF file. No plugin reconfiguration required.
blargg wrote:What format should the m3u file be (I'm not at all familiar with m3u)?
Extended M3U format used by both NEZPlug and NSFPlug:

# Solstice (CSG Imagesoft) [1989]
# Composer: Tim Follin

(comments start with # at the beginning)

Solstice (1989)(CSG Imagesoft).nsf::NSF,06,Title - Tim Follin - Solstice - ©1989 CSG Imagesoft,177,,1
Solstice (1989)(CSG Imagesoft).nsf::NSF,02,Main Theme - Tim Follin - Solstice - ©1989 CSG Imagesoft,112,113,7

(entry structure is: name_of_the_nsf_file::NSF,[1 based songno(dec)|$songno(hex)],[title(text, no commas obviously)],[time(h:m:s)|time(s)],[loop(h:m:s)|time(s)][-],[fade(h:m:s)|fade(s)],[loopcount])

Use the decimal songnumbers, as the hex ones aren't compatible between NSFPlug and NEZPlug (ie: NSFPlug starts from $0x01, NEZPlug from $0x00).
Time format is a bit tricky.
The first track doesn't loop, so I just set the time (i'm doing seconds, but you can swith 177 with 2:57, both work), skip the loop declaration, and add a 1 second fade to be on the safe side.

The second track loops, so i set the time to the duration of the first loop (112 seconds), the loop to the duration of the second loop (113 seconds), and fadeout to 7 seconds (my personal standard).

This could be done this way too: 112,-,8 (where - means that the loop equals the time).

I could also add a loopcount after the fade (ie, if I like to hear 4 loops, I'd just set: 112,-,8,3).
blargg wrote: Tracks marked with a dash eventually end, which I don't have the tool detect yet.
Smiles at the yet part.
blargg wrote:

Code: Select all

   0    1    2    3    4    5    6    7    8    9
0      1:36 1:08 2:28 0:56 1:24 1:30  -    -   3:38
1  -    -    -   1:06  -   3:23 0:42 1:42 1:14 2:33	
2 1:56 2:32 1:36 0:50 0:28 2:06  -	
Tracks #7 and 8 do loop. The rest is fine, including the detected times (which are spot on from what I've checked).
I was talking about the Megadrive version of Ys3, seems like the NSF file doesn't have "The Theme of Chester" song. Not sure if it's a incomplete rip, or if the tune just wasn't in the NES convertion.
kingshriek
Posts: 98
Joined: Tue Sep 13, 2005 7:53 am
Contact:

Post by kingshriek »

"The Theme of Chester" is only in the X68k, MD, SFC versions (source: http://homepage2.nifty.com/tkdate/ysmusic/cd/bgm3.html).
User avatar
blargg
Posts: 3717
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA
Contact:

Post by blargg »

I was thinking more about what conversions to support and it makes most sense to have a loop finder that outputs a text file for building an NSFE, the NSFE builder, and an NSFE to NSF+m3u convertor. This way anyone timing an NSF will always generate an NSFE, thus encouraging them to contribute that to the NSFE archive. Because there are other NSFE tagging tools, the nsfe2nsf+m3u tool will be more useful.

The loop finding algorithm could use more work. I figured I'd get the overall tool organization worked out before discussing refinements. Perhaps at that time everyone can be trying a test build and tweaking the algorithm in parallel. I also stopped being lazy and added the "notice if the track went silent" bit to properly time tracks that end normally.
User avatar
blargg
Posts: 3717
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA
Contact:

Post by blargg »

I've finally got back to working on the NSF timing and NSFE generation tools. Sorry for the sudden absence from when it was being discussed over a month ago. At this point three command-line tools seem best:

timensf - Automatically times NSF tracks, both those that end and those that loop endlessly. It writes a text file with all the information. You'd then edit this text file as necessary (add track names, fix times, move tracks to playlist).

nsf2nsfe - Takes original NSF and text file to produce NSFE.

nsfe2m3u - Generates M3U and NSF file given an NSFE. This allows any NSFE file to be used, not just those created with the above tools.

They are almost done. I'd like to get things fine-tuned before I release the source code. In particular, the M3U functionality needs to be evaluated. I used Duck Tales as an example and made an archive of the files at each step in the process, going from original NSF to timed and tagged NSFE to M3U:

nsf_tools_example.zip

As for Win32 builds, we'll need someone who can compile the C++ source code.
NewRisingSun
Posts: 1312
Joined: Thu May 19, 2005 11:30 am

Post by NewRisingSun »

Nice, but consider adding a finer resolution than just seconds.
Post Reply