Practical audio streaming while limiting kbps and CPU usage
Posted: Fri Jan 20, 2012 3:13 am
It's a fact that audio streaming is possible on the SNES; blargg proved it.
However, I wonder if there is a way to limit both the size of the data (kbps) and the CPU usage.
I'll explain:
In blargg's demo, he writes data into the echo buffer at 32 kHz (the output rate of the SPC). This works well, but it uses raw uncompressed data, which is not acceptable on a system like the SNES where memory is limited.
Uncompressed 16-bit mono data at 32 kHz is 512 kbps, so a one-minute song takes about 3.8 MB, more than the 3 MB of a big game like Final Fantasy VI, which is not acceptable.
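To make that arithmetic explicit, here is a small sketch of the size calculation. Python is used only for convenience, the function name is made up for illustration, and 16-bit mono samples are assumed since that is what 512 kbps at 32 kHz implies.

```python
def stream_bytes(bitrate_bps, seconds):
    """Size in bytes of a constant-bitrate stream lasting `seconds`."""
    return bitrate_bps * seconds // 8

SAMPLE_RATE_HZ = 32_000   # SPC output rate
BITS_PER_SAMPLE = 16      # assumption: 16-bit mono PCM

raw_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
raw_minute = stream_bytes(raw_bps, 60)

print(raw_bps)      # 512000 bits per second
print(raw_minute)   # 3840000 bytes for one minute
```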
It also almost monopolizes the CPU, much like playing raw samples through $4011 on the NES.
The first idea is to use the SNES's native BRR format. It compresses 16-bit data at a ratio of 9/32 (9 bytes per 16 samples), to about 144 kbps. That way a one-minute song takes about 1080 KB (roughly 1 MB), which is more acceptable.
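The BRR figures can be checked the same way. This is only a sketch: the helper names are hypothetical, and it relies on the BRR block layout of 9 bytes (1 header byte plus 8 bytes of 4-bit samples) decoding to 16 samples.

```python
BRR_BLOCK_BYTES = 9      # 1 header byte + 8 bytes of 4-bit samples
BRR_BLOCK_SAMPLES = 16   # samples decoded from one block

def brr_bytes(num_samples):
    """Bytes needed to store num_samples as BRR, rounded up to whole blocks."""
    blocks = -(-num_samples // BRR_BLOCK_SAMPLES)  # ceiling division
    return blocks * BRR_BLOCK_BYTES

def brr_bitrate_bps(sample_rate_hz):
    """Bits per second consumed when playing BRR at the given sample rate."""
    return sample_rate_hz * BRR_BLOCK_BYTES * 8 // BRR_BLOCK_SAMPLES

print(brr_bitrate_bps(32_000))   # 144000 bits per second
print(brr_bytes(32_000 * 60))    # 1080000 bytes for one minute
```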
This can be done by keeping one huge sample that takes up a significant part of the SNES's audio RAM, updating the first half while the second half is playing and vice versa (double buffering).
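The double-buffering idea can be sketched as a small simulation. This is not SNES code: the class, its method names and the all-zero filler block are assumptions made for illustration, and in the real thing the refill would be the CPU uploading data while the SPC plays the other half.

```python
BRR_BLOCK_BYTES = 9

class DoubleBuffer:
    """Looping sample split in two halves; the half not playing gets refilled."""

    def __init__(self, blocks_per_half, source):
        self.blocks_per_half = blocks_per_half
        self.source = source  # iterator yielding 9-byte BRR blocks
        self.buffer = [self._next_block() for _ in range(2 * blocks_per_half)]
        self.play_block = 0   # index of the block about to be played

    def _next_block(self):
        # Placeholder all-zero block once the stream runs out.
        return next(self.source, bytes(BRR_BLOCK_BYTES))

    def _current_half(self):
        return self.play_block // self.blocks_per_half

    def advance(self):
        """Play one block; refill the half just left when crossing a boundary."""
        block = self.buffer[self.play_block]
        old_half = self._current_half()
        self.play_block = (self.play_block + 1) % (2 * self.blocks_per_half)
        if self._current_half() != old_half:
            start = old_half * self.blocks_per_half
            for i in range(start, start + self.blocks_per_half):
                self.buffer[i] = self._next_block()
        return block

# Usage: blocks tagged 0, 1, 2, ... must come out in stream order.
stream = (bytes([i]) * BRR_BLOCK_BYTES for i in range(64))
player = DoubleBuffer(blocks_per_half=4, source=stream)
print([player.advance()[0] for _ in range(12)])
```

The refill only ever touches the half that has just finished playing, which is the whole point: the playback position never runs into data that is being rewritten.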
The problem is syncing the updates of the BRR sample between the CPU and the SPC. If you do it open loop (with carefully timed code), chances are it will depend on the NTSC/PAL setting, and it may not even work well on every SNES, since there are two different crystals for the SPC and the CPU/PPU (I think).
So you'll need some way of tracking the playback position in order to tell the CPU what to update and when. If that is possible, then streaming at a more reasonable bitrate without monopolizing the CPU becomes possible too.
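To illustrate why open-loop timing drifts while position tracking does not, here is a toy model. All numbers are invented for the example (a nominal 300 bytes per frame and an SPC that actually consumes 300.25), and the mechanism by which the SPC would report its position to the CPU is left abstract; it is an assumption that such a report is available once per frame.

```python
def buffer_lead(frames, actual_per_frame, assumed_per_frame, closed_loop):
    """Bytes the CPU is ahead (+) or behind (-) of playback after `frames` frames."""
    consumed = 0.0  # bytes the SPC has really played
    sent = 0.0      # bytes the CPU has uploaded
    for _ in range(frames):
        consumed += actual_per_frame
        if closed_loop:
            # Hypothetical: the CPU reads the reported position and catches up.
            sent = consumed
        else:
            # Open loop: the CPU trusts its own timing assumption.
            sent += assumed_per_frame
    return sent - consumed

one_minute = 3600  # frames, assuming roughly 60 frames per second
print(buffer_lead(one_minute, 300.25, 300.0, closed_loop=False))  # -900.0
print(buffer_lead(one_minute, 300.25, 300.0, closed_loop=True))   # 0.0
```

With these made-up rates the open-loop version falls 900 bytes behind after one minute, so sooner or later the playback position would catch up with stale data, while the closed-loop version stays in step whatever the real clock ratio is.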
The best option would be low-bitrate Ogg Vorbis encoding, which can go as low as 45 kbps with an acceptable loss of quality; a one-minute song would then take only about 340 KB (45 kbps × 60 s / 8)!
The problem, of course, is that the SPC can't decode this format natively, so it would be up to the SPC and/or the CPU to handle the decoding and use the echo buffer for playback. So I wonder whether the computing power of both is sufficient for Vorbis decoding.
The best would be for the CPU to send compressed data to the SPC, which would decode it on the fly by itself and write it into its echo buffer. However, the SPC is clocked at only 1.024 MHz, while the CPU can reach about 3.58 MHz. So if the SPC can't decode Vorbis, it'll be up to the CPU to do it, and then of course that will monopolize it.
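A rough way to judge feasibility is to look at the cycle budget per output sample. The sketch below assumes the decoder has to produce samples at the full 32 kHz output rate and takes 3,579,545 Hz as the CPU's fastest speed; both are assumptions, and a real decoder could target a lower rate.

```python
SPC_CLOCK_HZ = 1_024_000   # SPC clock quoted above
CPU_CLOCK_HZ = 3_579_545   # assumption: CPU at its fastest, about 3.58 MHz
OUTPUT_RATE_HZ = 32_000    # samples per second to produce

def cycles_per_sample(clock_hz, output_rate_hz=OUTPUT_RATE_HZ):
    """Clock cycles available to produce one output sample."""
    return clock_hz / output_rate_hz

print(cycles_per_sample(SPC_CLOCK_HZ))          # 32.0
print(round(cycles_per_sample(CPU_CLOCK_HZ)))   # 112
```

Since every instruction takes several cycles on either processor, that leaves only a handful of instructions per sample on the SPC and a few dozen on the CPU, before counting anything else either of them has to do.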