Random questions (mostly APU)

za909
Posts: 229
Joined: Fri Jan 24, 2014 9:05 am
Location: Hungary

Post by za909 »

Hey! It's been a while since I last posted anything, mainly because I never had time to even attempt working on anything, but then I realised that these things take a while anyway to make.

I don't exactly know what I'm after, maybe all I'm gonna do is just a simple sound engine, maybe more, but I have a couple questions before I go deeper (the only things I've made so far were simple DPCM playback and a ROM that plays a pitch sweeping noise sound, so at least I understand how game logic and NMI have to cooperate)
So what I want to make for now is a simple sound driver, if possible I'm trying to go for just a 32k NROM program, and if the size of the individual programs gets too large (assuming that I'm going for a full game), switch to 128k UNROM (preferably without CHR-RAM).

I've spent a lot of time drawing and sketching my engine on paper, and I know how everything is supposed to work out (bitflags, manipulation of shadow registers, etc. )
but still, there are a couple of features I'm not sure how to implement (or if it's possible to implement them at all)
So no code here yet folks, just my ideas of how I want to do these things.
What I want to know ultimately, is whether this is a good way of thinking when it comes to programming or not

1. I'm planning to implement vibrato using the sweep units (so no triangle vibrato) and I'm not sure if this is possible. I want it this way though, so that I can get away with less code: all I have to do is alternate between upward and downward pitch bends on a fixed timer (half a period down, a full period up, then half a period down to complete the triangle shape). The vibrato kicks in once half the note duration is reached, if vibrato "mode" has been enabled with an effect.

2. Effects to apply to notes are all contained in a single byte, which is changed by an effect
msb| TDV- IIII |lsb
T - Tone drum mode (pulse only): if set, discard note and play a pre-defined drum sound with the sweep unit and envelope
D - Detune mode: if set, add one to the low period-shadow register after fetching it if a new note byte is read
V - Vibrato mode (pulse only): if set, apply vibrato at a pre-defined rate with the sweep units at half note duration.
I - Select instrument (pulse only): Selects the instrument for the pulse channels. All instruments are 8 bytes, successively written to $4000/$4004, and at entry $03 playback waits for note off (so the last 4 entries are the "release" phase)

3. Should all possible note pitches be accessible directly with a single byte, with duration set independently (which would allow more flexibility with song speed and such)? Or should the 4 high bits select a duration in frames from a lookup table and the low 4 bits select the note (with octaves set by an effect)? Which one is more efficient?

4. What size should I expect? What always gets me and takes away my motivation is thinking about possibly ending up with a program that is way too large or way too CPU-intensive. Or should I just not worry about that at all?
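As a rough illustration of the effect byte from item 2, here is a minimal sketch of how it could be split into flags and an instrument offset. The variable names (eff_byte, p1_flags, p1_instr_ofs) and the branch targets are made up for this example and are not part of any existing engine.

```asm
; Hypothetical zero-page variables: eff_byte, p1_flags, p1_instr_ofs
; Effect byte layout: TDV- IIII

DecodeEffect:
	lda eff_byte
	and #%00001111      ; IIII = instrument number 0-15
	asl a
	asl a
	asl a               ; times 8, since each instrument is 8 bytes
	sta p1_instr_ofs    ; largest offset is 15 * 8 = 120, fits in a byte
	lda eff_byte
	and #%11100000      ; keep only T, D and V
	sta p1_flags
	rts

; Later, when a new note byte is read:
CheckFlags:
	bit p1_flags        ; bit 7 (T) goes to N, bit 6 (D) goes to V
	bmi DoToneDrum      ; hypothetical handler for tone drum mode
	bvs DoDetune        ; hypothetical handler for detune mode
	rts
```

Putting T and D in bits 7 and 6 means a single bit instruction can test both without disturbing A; the vibrato bit (bit 5) would still need an and #%00100000.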
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Random questions (mostly APU)

Post by tepples »

A simple sound driver and its data will certainly fit in 32 kilobytes, more likely 16 if you don't use any DPCM. Look at the size of a well-optimized NSF for instance. And even then, dealing with CHR RAM isn't that hard; you just need to copy an 8192 byte chunk from one of the banks of PRG ROM to PPU $0000.

Are you planning to use the NES as an instrument, where the music driver has most of the RAM and CPU time available for itself? Or are you planning a music engine to be used by a game?

As for pitch, my music engine's phrase data format uses a range of 0-24 semitones above the phrase's base pitch. Semitone 25 means "tie", or hold the previous note and don't start a new note. Semitone 26 means "rest", or cut the current note. Semitones 27-31 have special meanings related to effects such as arpeggio, shifting the phrase's base pitch, and the like. This leaves three bits for duration, selected among 1, 2, 3, 4, 6, 8, 12, or 16 rows, where in-between durations are made with ties. A phrase's initial base pitch is specified in units of a semitone in the "conductor track" that tells when to play each phrase, not in the phrase itself, so that the phrase can be transposed up or down at various parts of a song.

Code: Select all

76543210  Phrase bytecodes $00-$C7
||||||||
|||||+++- Duration index
+++++---- Pitch (0-24: semitone offset from base pitch,
          25: continue note, 26: stop note)
Vibrato using the sweep units may drift from the intended center pitch when the APU's updates don't line up with the music engine's updates. In Tetris for NES, for example, I can change the line clear sound by rotating the falling piece at just the right moment.
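The phrase byte layout shown above (5 bits of pitch, 3 bits of duration index) could be unpacked roughly as follows. This is only a sketch of the idea and not the code from the engine being described; note_rows, note_pitch and durationTBL are names invented for the example.

```asm
; Input: A = phrase byte (pitch in bits 7-3, duration index in bits 2-0)
; Hypothetical variables: note_rows, note_pitch

DecodeNote:
	pha                 ; keep a copy of the byte
	and #%00000111      ; duration index 0-7
	tax
	lda durationTBL,x
	sta note_rows       ; length of the note in rows
	pla
	lsr a
	lsr a
	lsr a               ; pitch field moved down to bits 4-0
	sta note_pitch      ; 0-24 semitone, 25 tie, 26 rest
	rts

durationTBL:
	.db 1,2,3,4,6,8,12,16
```

A duration that is not in the table, such as 5 rows, would be written as a 4-row note followed by a 1-row tie.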
za909
Posts: 229
Joined: Fri Jan 24, 2014 9:05 am
Location: Hungary

Re: Random questions (mostly APU)

Post by za909 »

I would use it in a game if anything (I want to get better at programming by doing something with the part of the NES I'm the most familiar with) and sound effects would only be played with the Pulse 2 and Noise channels. The SFX data format is simply 16 note bytes (raw period for noise straight away), and then 16 raw data bytes for $4004 and $400C. Sweep is disabled for sfx. Or maybe I could go for having vibrato only on Pulse 1 and on Pulse 2 I only use it for sfx, I'll see how it goes.

And so, whenever I play sfx, I interrupt music playback by letting all the updates for the channel run, except that I never write the shadow registers to the real ones while an sfx is active, right? Not that I'd have to be overly concerned with this in the case of noise sounds. I want to use the envelope generator for the drums, so I can't recover the current volume and continue the envelope anyway.
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Random questions (mostly APU)

Post by tepples »

My music engine uses software envelopes for everything, with the attack phase and start of the decay specified frame-by-frame and the rest as a linear decrease in volume specified as a starting level of x and a decrease rate in units per 16 frames.

For each channel, it interprets the phrase bytecode if needed, runs the envelope code, and writes the pitch and duty/volume values to locations in low zero page. Then it reads from the current sound effect on that channel, and if the sound effect is louder, it uses the duty/volume and pitch from the sound effect instead. A sound effect interrupts an existing effect only if it is longer than the existing effect's data. Square wave sound effects get played on channel 1 or 2, whichever has less remaining data.
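The "louder one wins" rule for one pulse channel might look something like the sketch below. It is an illustration under assumed variable names (music_vol, sfx_vol, sfx_left and so on), not the actual code from the engine, and it assumes the volume sits in the low nibble as in the $4000/$4004 register format.

```asm
; Hypothetical zero-page variables:
;   music_vol, music_lo  - what the music wants this frame
;   sfx_vol, sfx_lo      - what the sound effect wants this frame
;   sfx_left             - frames of sound effect data remaining
;   out_vol, out_lo      - values that get written to the APU

PickPulseSource:
	lda sfx_left
	beq UseMusic        ; no effect playing
	lda music_vol
	and #$0F            ; keep only the volume nibble
	sta temp_0
	lda sfx_vol
	and #$0F
	cmp temp_0
	bcc UseMusic        ; effect is quieter than the music
	beq UseMusic        ; equal volume: music keeps the channel
	lda sfx_vol
	sta out_vol
	lda sfx_lo
	sta out_lo
	rts

UseMusic:
	lda music_vol
	sta out_vol
	lda music_lo
	sta out_lo
	rts
```

Because the music state keeps updating in the background, the music simply reappears once the effect fades below it or runs out of data.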

Drums are actually sound effects in my engine. For example, a kick drum has two components: noise at high pitch ($3) for one frame followed by a few frames of noise at the lowest pitch ($F), plus a few frames of triangle at descending pitches. This allows the triangle part of the drum to interrupt the bass line in a reasonable way. Hi-hats alternate between the long-period (hiss) and short-period (tonal) modes of the noise channel, which to me sounds slightly more realistic especially for open hats.

You can listen to an NSF of my engine, and the source code for the latest version is part of RHDE.
rainwarrior
Posts: 8062
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Random questions (mostly APU)

Post by rainwarrior »

za909 wrote: 4. What size should I expect? What always gets me and takes away my motivation is thinking about possibly ending up with a program that is way too large or way too CPU-intensive. Or should I just not worry about that at all?
I wrote a music engine that is about 1.6k of code, and it supports a subset of Famitracker features, so I can use Famitracker to make tunes and SFX for it. It runs in about 1700 cycles, typically, peaking at around 2400. I haven't done any significant optimization of it, so it could probably be a little smaller or a little faster if I needed either of those things. I have fit my whole game soundtrack and SFX and music driver into a single 32k bank (I am using BNROM), which seemed like a pretty reasonable size target.
za909
Posts: 229
Joined: Fri Jan 24, 2014 9:05 am
Location: Hungary

Re: Random questions (mostly APU)

Post by za909 »

Alright, I've finally started working on this, and at least I think I'll get this. It's not like you people figure out everything the first time in a matter of seconds, right? (Because I feel like an alien here, my brain is wired for art stuff and not exactly "science" stuff like this, but that doesn't make me give up without trying you know)
You know, I just want to know if I'm on the right track or should find another hobby.

So this is what I have for now, initiating playback of a new song. I load the song header address to temporary zero page ram and then the pointers for the 4 channels. I hope I'll be able to do this with the planned virtual registers and counters in 32 zero page bytes and 32 regular RAM bytes. (Maybe that's too much as it is, but I have nothing to compare to)
It does what it's supposed to in a test file, and then crashes with stack overflow, because it just kind of ends for now.

Code: Select all

.org $8200

SoundBegin:
; First of all, check if the sound engine is
; enabled at all
	
	lda prog_flag1
	and #%10000000
	bpl EndSound
	jmp PlayBack

EndSound:
	rts

PlayBack:
; First, check if the song to be played is a new
; one, or the same as in the last frame	

	lda cur_song 	; Other stuff requests a song here
	cmp prev_song
	sta prev_song
	bne InitSongl_00
	jmp ProcessFrame

InitSongl_00:
; Fetch the start address for all 4 channels

	ldy #$00

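InitSongl_01:
; songTBL is built with .dw, which stores the low byte first.
; (temp_0),y expects the low byte in temp_0 and the high byte in
; temp_0+1 (assumed here to be what temp_1 was meant for).
; Assumes cur_song is already a byte offset (song number * 2).
	ldx cur_song
	lda songTBL,x
	sta temp_0
	inx
	lda songTBL,x
	sta temp_0+1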

InitSongl_02:	
; CH addresses will have to be copied to temp. memory for
; use with indirect read!

	lda (temp_0),y
	sta p1_addrhi,y
	cpy #$07
	beq InitSongl_03
	iny 
	jmp InitSongl_02

InitSongl_03:
; Clear sound memory

	lda #$00
	ldx #$00

InitSongl_04:

	sta p1_shvol,x
	cpx #$15
	beq InitSongl_05
	inx
	jmp InitSongl_04
	
InitSongl_05:
; Clear on page 3
	
	ldx #$00

InitSongl_06:

	sta p1_timerload,x
	cpx #$08
	beq ProcessFrame
	inx
	jmp InitSongl_06
	
songTBL:
; The data doesn't make any sense for now

	.dw $9000,$9000,$9000,$9000,$9000,$9000,$9000,$9000
	.dw $9000,$9000,$9000,$9000,$9000,$9000,$9000,$9000

ProcessFrame:
	nop
za909
Posts: 229
Joined: Fri Jan 24, 2014 9:05 am
Location: Hungary

Re: Random questions (mostly APU)

Post by za909 »

I have a few pretty easy questions here, since it could save a couple bytes during certain conditional jumps, and additions if it works. So for example if I lda #$00, the zero flag is set. But what happens if I load a non-zero value? Is the zero flag cleared or unaffected?

Does the state of the carry flag actually affect the result in the accumulator during additions and subtractions? As in, does it change any of its 8 bits? So can I get rid of clc in this piece of code?

Code: Select all

EffEnd:
; Increment ch address and return
; If wrap around occurs, increment hi address	
	
	clc
	adc p1_addrlo,x
	sta p1_addrlo,x
	bcs EffEndl_00
	rts

EffEndl_00:
	
	inc p1_addrhi,x
	rts
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: Random questions (mostly APU)

Post by tepples »

Yes. The adc instruction computes (A + value from memory + value from carry). Bits 7-0 of the result go to A, and bit 8 goes to carry. So if the carry is set when the adc instruction is executed, the CPU adds 1 to the result. The clc prevents the CPU from adding 1 by ensuring that the contribution of the value from carry to the sum is 0. If you can find some other way of proving that carry is 0 before it hits that line of code, you can drop the clc.

The carry does not affect the inc and dec instructions (including inx, iny, dex, and dey), nor do they affect the carry.
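A short sketch of the usual place where the carry input is wanted on purpose: adding to a 16-bit value one byte at a time. ptr_lo and ptr_hi are placeholder names for this example.

```asm
; Add 3 to the 16-bit value in ptr_lo/ptr_hi (hypothetical variables)
	clc                 ; make sure the carry contributes 0 to the low byte
	lda ptr_lo
	adc #$03
	sta ptr_lo          ; carry is now set if the low byte wrapped past $FF
	lda ptr_hi
	adc #$00            ; adds only the carry from the low byte
	sta ptr_hi
```

The second adc has no clc in front of it because the carry left by the first adc is exactly what needs to be added to the high byte.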
za909
Posts: 229
Joined: Fri Jan 24, 2014 9:05 am
Location: Hungary

Re: Random questions (mostly APU)

Post by za909 »

Thanks, so really in this case it acts as if the accumulator was a sort of "9-bit" register. My first question did not get answered though, I guess you just forgot about it or didn't notice it.

No big deal because while I'm at it I feel the need to get the rest out of the way (it might be worth just contacting some of you privately instead of polluting the forum with my shit in the long run)

1. BRK. Why do I see no use for this (yet)? It's all dependent on software, so the only way I can make use of it, is by endlessly looping and waiting for NMI to do something. But even then, I can just jsr or jmp. Is this just some feature that's not very useful for the NES, but instead in other 6502 based machines? OR am I missing the point entirely?

2. ASL & LSR vs. ROL & ROR. How are they any different from each other? They seem to be doing the same things, have the same addressing modes available and even affect the status flags in the exact same way.

3. This one is not very important because I'm not going to use unofficial opcodes, but are there any that affect the unused status flag?
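On the zero flag question from earlier in the thread: on the 6502 the load instructions (lda, ldx, ldy) always update both Z and N from the value loaded, so loading a non-zero value clears Z. A minimal sketch, where some_var and WasZero are placeholder names:

```asm
	lda #$00            ; Z set,   N clear
	lda #$01            ; Z clear, N clear
	lda #$80            ; Z clear, N set (bit 7 of the value)

; This means a compare against zero right after a load is redundant:
	lda some_var
	beq WasZero         ; taken when some_var holds $00
```

Store instructions such as sta do not touch any flags, so a branch can still follow a load even with a store in between.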
JRoatch
Formerly 43110
Posts: 394
Joined: Wed Feb 05, 2014 7:01 am
Location: us-east

Re: Random questions (mostly APU)

Post by JRoatch »

1. Sharing its vector with IRQ, NMI/IRQ hijacking, and the stack juggling necessary to read the byte after the opcode (like in other 6502-based machines) all make it less useful.

2. ROL and ROR shift in the carry bit, whereas ASL and LSR shift in a constant 0 bit.

3. No, and as far as I know the unofficial opcodes that use the ALU are not affected by the D flag.
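The difference in point 2 shows up when shifting a value wider than 8 bits. A minimal sketch, with val_lo and val_hi as placeholder names:

```asm
; Shift the 16-bit value in val_lo/val_hi left by one bit
	asl val_lo          ; 0 goes into bit 0, old bit 7 goes into carry
	rol val_hi          ; carry goes into bit 0, old bit 7 goes into carry

; Shift it right by one bit
	lsr val_hi          ; 0 goes into bit 7, old bit 0 goes into carry
	ror val_lo          ; carry goes into bit 7, old bit 0 goes into carry
```

Using asl or lsr for the second instruction would throw away the bit that just left the first byte.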
thefox
Posts: 3139
Joined: Mon Jan 03, 2005 10:36 am
Location: Tampere, Finland

Re: Random questions (mostly APU)

Post by thefox »

43110 wrote: 3. No, and as far as I know the unofficial opcodes that use the ALU are not affected by the D flag.
I don't know what he meant by the "unused" flag, but bits 4 and 5 don't even physically exist in the CPU, so they can't be affected by anything. The decimal flag (bit 3) is also completely disconnected in 2A03, so it doesn't affect anything, be it official or unofficial instructions, even though the flag exists.
Movax12
Posts: 529
Joined: Sun Jan 02, 2011 11:50 am

Re: Random questions (mostly APU)

Post by Movax12 »

You can still set the D flag with CLD / SED but there is not much point. You could use it for a boolean, but it's probably more trouble than it is worth to read it. Maybe:

Code: Select all

    php
    pla
    and #$08
    ; is D set?
    bne there
    ; is it clear?
    beq here
za909
Posts: 229
Joined: Fri Jan 24, 2014 9:05 am
Location: Hungary

Re: Random questions (mostly APU)

Post by za909 »

Thank you for your answers, I'd really like to keep this going because programming is such a good remedy for stress for me. (Does anyone else feel that way too?)
So really there's not much point in trying to use the D flag. (If you're really that craving for every single bit of memory you can get you're probably not doing a very good job?)

And again, I've run into something fairly specific. I want to test if a value is higher than another fixed value, so I use subtraction and if the result is below zero, I do a certain action. Though I don't know which status flag to check.
The 6502 reference I'm using tells me this about sbc:
Carry Flag Clear if overflow in bit 7
...
Negative Flag Set if bit 7 set

So do I check N, or C to know what the result is? In my code (to avoid having to include a bunch of $00 bytes in the high period table) I use sec before the subtraction, so C should be cleared after the instruction if the result is negative, and if I subtract a value too large, I can't rely on N getting set.

Code: Select all

; If the note is high, automatically load 0 for hi period	
	
	lda temp_3
	sec
	sbc #$25
	bcc LoadPeriodl_00 ; Manually find the correct hi period
	lda #$00
	sta p1_shhi,x
	rts
rainwarrior
Posts: 8062
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Random questions (mostly APU)

Post by rainwarrior »

I recommend reading these:
http://6502.org/tutorials/compare_instructions.html
http://6502.org/tutorials/compare_beyond.html

For unsigned comparisons you want to check C and/or Z. CMP is equivalent to SEC SBC, but it doesn't change the value in A.

For signed comparisons you need to do a little more work, see the second article.
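Applied to the snippet in the question, the unsigned test could be written with cmp as sketched below. It reuses the labels from that snippet and behaves the same way, except that A still holds the note value afterwards.

```asm
; If the note is high, automatically load 0 for hi period
	lda temp_3
	cmp #$25            ; C clear if A < $25, C set if A >= $25
	bcc LoadPeriodl_00  ; below $25: look up the hi period manually
	lda #$00
	sta p1_shhi,x
	rts
```

For unsigned values the N flag is not a reliable "less than" test, because a large difference can wrap around and leave bit 7 clear.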
za909
Posts: 229
Joined: Fri Jan 24, 2014 9:05 am
Location: Hungary

Re: Random questions (mostly APU)

Post by za909 »

Alright, now I'm at the point where the pulse channel handling is 90% finished, so I need to start thinking about what to do with the triangle. I'm planning to have two modes for it, one with infinitely held notes (but upon reading a delay $00 byte I turn it off with its bit in $4015) and one using the linear counter (in which case I'll probably ignore note delays, since silencing the channel is automated)
An effect is used to change this, the parameter is saved to RAM, and during data processing for the triangle channel, it reads this value (if it's $00, use infinite length, if non-zero, use it as the linear counter load)

But I don't see how the counter load affects the note length at all. I thought it was just a 7-bit countdown at 240Hz, and that's it, and I tested it with the SNDTEST.nes ROM but it's just all weird and I'm not sure what's going on, and why I randomly get endless notes even though bit 7 of $4008 is set.
So linear counters, how do they work?
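A rough sketch of the documented behaviour, not tested on hardware: bits 6-0 of $4008 are the reload value and bit 7 is a control flag. Writing $400B sets a reload flag, and on every quarter-frame clock (about 240 Hz on NTSC) the counter is reloaded if that flag is set, otherwise it counts down. The reload flag is only cleared while bit 7 is clear, so with bit 7 set the counter is reloaded on every clock and the note never ends. The variable names tri_len, tri_shlo and tri_shhi are assumptions for this example.

```asm
; Register names are just labels for this sketch
TRI_LINEAR = $4008      ; CRRR RRRR: control flag + reload value
TRI_LO     = $400A      ; period low 8 bits
TRI_HI     = $400B      ; LLLL LHHH: length index + period high 3 bits

; tri_len, tri_shlo, tri_shhi are hypothetical RAM variables
PlayTriNote:
	lda tri_len         ; 0 = hold forever, 1-127 = quarter frames
	bne TimedNote
	lda #%11111111      ; control set: counter keeps reloading
	bne WriteLinear     ; always taken, A is non-zero
TimedNote:
	and #%01111111      ; control clear: reload once, then count down
WriteLinear:
	sta TRI_LINEAR
	lda tri_shlo
	sta TRI_LO
	lda tri_shhi
	and #%00000111
	ora #%00001000      ; length index 1 = 254, longer than any linear count
	sta TRI_HI          ; this write also sets the reload flag
	rts
```

With bit 7 clear the length counter is no longer halted, which is why the sketch loads a long length value so that the linear counter is the one that ends the note. The triangle's bit in $4015 has to be set before any of this produces sound.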