Becoming an FPGA Engineer

rainwarrior
Posts: 8734
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Becoming an FPGA Engineer

Post by rainwarrior »

Your development kit looks rad. :)
Ben Boldt
Posts: 1149
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Becoming an FPGA Engineer

Post by Ben Boldt »

rainwarrior wrote: Sat Jun 05, 2021 10:06 pm Your development kit looks rad. :)
Thanks. That NES might be destined to stay that way forever I think.
Ben Boldt
Posts: 1149
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Becoming an FPGA Engineer

Post by Ben Boldt »

Ben Boldt wrote: Sat Jun 05, 2021 9:34 pm
Quietust wrote: Sat Jun 05, 2021 7:20 pm I see one minor issue: writing with D7=1 is supposed to also set both of the PRG bank mode bits (i.e. something along the lines of control_reg(3 downto 2) <= "11";).
Okay, I will give that a try. I see what you are talking about in the wiki now as well. That may explain why I have to cycle power to the CPLD.
Quietust wrote: Sat Jun 05, 2021 7:20 pm I'm pretty sure those should be using all 5 bits of the register, sending the upper bit to CHR A16, otherwise it'll only be able to access 64KB instead of 128KB.
Oh wow, good catch! I will give it a try when I go back in tomorrow. Will let you know, thanks!
Those were definitely bugs that needed to be fixed but they didn't solve any of the issues I was observing unfortunately. I just now found that if I remove my CHR-ROM chip, the graphics then work correctly... So something isn't working correctly with this statement:

Code:

chr_rom_ce_n <= '1';  -- disable CHR-ROM.
This could be a hardware problem. I will verify the signal itself and make sure it is wired properly.

Edit:
Check this out:
disconnected.png

:oops:


Edit 2:
Very interesting, the problem where I had to power cycle the CPLD is solved when I initialize chr_bank_0_reg to "11111".

Code:

signal chr_bank_0_reg:  std_logic_vector(4 downto 0) := "11111";  -- $A000-BFFF
Without this, an incorrect reset vector value will be fetched ($FFFC/$FFFD = $00, $00). I get $89, $FE when it works. Pressing reset does not help. I am not quite sure what the deal is there yet; the CHR bank should have nothing to do with that. The level shifter and ROM /CE's all appear to be correct. I will keep poking at it.
TmEE
Posts: 960
Joined: Wed Feb 13, 2008 9:10 am
Location: Norway (50 and 60Hz compatible :P)

Re: Becoming an FPGA Engineer

Post by TmEE »

You'll probably want to create a dedicated reset signal for your devboard. Many pirate carts use a diode+capacitor on M1 (I think it was that, or maybe it was A0) signal to create it.
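
A minimal sketch of how such a reset signal might be used inside the CPLD, assuming an extra active-low input pin. The names reg_with_reset, reset_n, load, d and q are hypothetical and are not part of the MMC1 code in this thread; the external diode+capacitor circuit that would drive the pin is not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a 5-bit register with an asynchronous, active-low reset.
-- reset_n is an assumed extra input; it does not exist in the MMC1_MAPPER entity.
entity reg_with_reset is
    port (
        reset_n : in  std_logic;                     -- assumed reset input, active low
        m2      : in  std_logic;                     -- CPU M2 clock
        load    : in  std_logic;                     -- '1' when the register should capture d
        d       : in  std_logic_vector(4 downto 0);
        q       : out std_logic_vector(4 downto 0)
    );
end reg_with_reset;

architecture logic of reg_with_reset is
    signal q_reg : std_logic_vector(4 downto 0);
begin
    process (reset_n, m2)
    begin
        if reset_n = '0' then
            q_reg <= "11111";           -- forced to a known state while reset is asserted
        elsif falling_edge(m2) then
            if load = '1' then
                q_reg <= d;             -- normal register write on the M2 falling edge
            end if;
        end if;
    end process;

    q <= q_reg;
end logic;
```

With an explicit reset branch like this, the register value after reset does not depend on the signal's declared initial value.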
Ben Boldt
Posts: 1149
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Becoming an FPGA Engineer

Post by Ben Boldt »

TmEE wrote: Mon Jun 07, 2021 12:19 pm You'll probably want to create a dedicated reset signal for your devboard. Many pirate carts use a diode+capacitor on M1 (I think it was that, or maybe it was A0) signal to create it.
Always starting from a known state is a great idea, and probably something that should always be done in practice. The initial value of MMC1's ROM banking mode apparently isn't ever 2 (per the wiki), so it must rely on initial states to some degree.

If I am trying to recreate a mapper chip that starts up in a random state, and therefore has ROM code designed around that, are there any additional CPLD-specific reasons that I should do a reset? After programming the CPLD, would I be safe to assume that it automatically runs its initialization, or should I not trust that and always force a reset? I am trying to re-learn all the pitfalls and strangeness of these things...

I can play Rad Racer almost flawlessly. It does usually crash after 5-10 minutes, usually at some specific event (if I literally crash, win the race, game over, etc). Probably accessing a different PRG-ROM bank. So something isn't quite perfect. My logic may be too slow or fast in different situations, or maybe my wires are too long and picking up noise (hopefully not, looks pretty clean on the scope). This is where the real learning happens for me, dealing with bugs and 'what ifs' and edge cases, etc.

Here is my latest MMC1 code, I am now preventing consecutive register write cycles per the wiki:

Code:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity MMC1_MAPPER is
port
    (
        -- CPU/PRG signals
        cpu_data_bus     : inout std_logic_vector(7 downto 0) := "ZZZZZZZZ";
        
        cpu_rw           : in    std_logic;
        m2               : in    std_logic;
        cpu_address_bus  : in    std_logic_vector(15 downto 11);
        cpu_a0           : in    std_logic;
        
        cpu_shifter_oe_n : out   std_logic := '1';
        prg_ram_we_n     : out   std_logic := '1';
        prg_ram_ce_n     : out   std_logic := '1';
        prg_ram_oe_n     : out   std_logic := '1';
        prg_rom_ce_n     : out   std_logic := '1';
        prg_rom_address  : out   std_logic_vector(18 downto 13) := "111111";
        irq_inverted     : out   std_logic := '0';
        
        
        -- PPU/CHR Signals
        ppu_rd_n         : in    std_logic;
        ppu_wr_n         : in    std_logic;
        ppu_address_bus  : in    std_logic_vector(13 downto 10);
        
        ppu_shifter_oe_n : out   std_logic;
        chr_ram_we_n     : out   std_logic;
        chr_ram_ce_n     : out   std_logic;
        chr_ram_oe_n     : out   std_logic;
        chr_rom_ce_n     : out   std_logic;
        chr_rom_address  : out   std_logic_vector(18 downto 10);
        ciram_a10        : out   std_logic;
        ciram_ce_n       : out   std_logic
    );
end MMC1_MAPPER;

architecture logic of MMC1_MAPPER is

    signal control_reg:     std_logic_vector(4 downto 0) := "11111";  -- $8000-9FFF
    signal chr_bank_0_reg:  std_logic_vector(4 downto 0) := "11111";  -- $A000-BFFF
    signal chr_bank_1_reg:  std_logic_vector(4 downto 0) := "11111";  -- $C000-DFFF
    signal prg_bank_reg:    std_logic_vector(4 downto 0) := "11111";  -- $E000-FFFF
    
    signal reg_shift:       std_logic_vector(3 downto 0) := "1111";
    signal reg_bit_counter: integer range 0 to 4 := 0;  -- counts shifted-in bits (0..4)
    signal consecutive_write: std_logic := '0';
    
begin
    -- CPU/PRG signals
    --cpu_data_bus <= "ZZZZZZZZ";  -- Set data bus as input.
    cpu_shifter_oe_n <= cpu_address_bus(15);
    prg_ram_we_n <= cpu_rw;
    --prg_ram_ce_n <= '1';  -- Disable PRG-RAM
    --prg_ram_oe_n <= '1';
    prg_rom_ce_n <= not cpu_rw;  -- Prevent bus conflict writing to ROM.
    --irq_inverted <= '0';
    
    
    -- PPU/CHR Signals
    ppu_shifter_oe_n <= ppu_address_bus(13);
    chr_ram_we_n <= ppu_wr_n;
    chr_ram_ce_n <= ppu_address_bus(13);
    chr_ram_oe_n <= '0';
    chr_rom_ce_n <= '1';  -- disable CHR-ROM.

    ciram_ce_n <= not ppu_address_bus(13);
    
    -- Handle Register Writes:
    process( m2, cpu_rw, cpu_address_bus, reg_bit_counter, control_reg, prg_bank_reg )
    begin
        
        if (m2'event) and (m2='0') then
            if (cpu_rw='1') then  -- Any CPU read cycle, release the consecutive write latch.
                consecutive_write <= '0';
            else  -- Any CPU write cycle
                if consecutive_write = '0' then
                    consecutive_write <= '1';  -- Allow the register write and require a read allowing another.
                    if(cpu_address_bus(15)='0') then
                        if cpu_data_bus(7) = '1' then  -- Reset detected by CPU D7 = 1.
                            reg_bit_counter <= 0;
                            control_reg(3 downto 2) <= "11";
                        elsif reg_bit_counter < 4 then  -- Accumulating bits.
                            reg_shift <= cpu_data_bus(0) & reg_shift(3 downto 1);
                            reg_bit_counter <= reg_bit_counter + 1;
                        elsif cpu_address_bus(14 downto 13) = "00" then  -- $8000-9FFF
                            control_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                        elsif cpu_address_bus(14 downto 13) = "01" then  -- $A000-BFFF
                            chr_bank_0_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                        elsif cpu_address_bus(14 downto 13) = "10" then  -- $C000-DFFF
                            chr_bank_1_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                        elsif cpu_address_bus(14 downto 13) = "11" then  -- $E000-FFFF
                            prg_bank_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                        end if;
                    end if;
                end if;
            end if;
        end if;
    end process;
    
    -- Handle PRG Banking:
    process( cpu_address_bus, control_reg, prg_bank_reg )
    begin
        if control_reg(3) = '0' then  -- 32kbyte bank mode
            prg_rom_address <= (17=>prg_bank_reg(3), 16=>prg_bank_reg(2), 15=>prg_bank_reg(1), 14=>cpu_address_bus(14), 13=>cpu_address_bus(13), others=>'0');
        elsif control_reg(2) = '0' then  -- 16kbyte bank mode, fix first bank at $8000-BFFF.
            if( cpu_address_bus(14) = '0' ) then  -- CPU is accessing $8000-BFFF.
                prg_rom_address <= (17=>'0', 16=>'0', 15=>'0', 14=>'0', 13=>cpu_address_bus(13), others=>'0');
            else  -- CPU is accessing the switchable bank at $C000-FFFF.
                prg_rom_address <= (17=>prg_bank_reg(3), 16=>prg_bank_reg(2), 15=>prg_bank_reg(1), 14=>prg_bank_reg(0), 13=>cpu_address_bus(13), others=>'0');
            end if;
        else  -- 16kbyte bank mode, fix last bank at $C000.
            if( cpu_address_bus(14) = '0' ) then  -- CPU is accessing $8000-BFFF.
                prg_rom_address <= (17=>prg_bank_reg(3), 16=>prg_bank_reg(2), 15=>prg_bank_reg(1), 14=>prg_bank_reg(0), 13=>cpu_address_bus(13), others=>'0');
            else  -- CPU is accessing last bank at $C000-FFFF.
                prg_rom_address <= (17=>'1', 16=>'1', 15=>'1', 14=>'1', 13=>cpu_address_bus(13), others=>'0');
            end if;
        end if;
    end process;
    
    -- Handle CHR Banking:
    process( ppu_address_bus, control_reg, chr_bank_0_reg, chr_bank_1_reg )
    begin
        if control_reg(4) = '0' then  -- 1 x 8kbyte mode
            chr_rom_address <= (16=>chr_bank_0_reg(4), 15=>chr_bank_0_reg(3), 14=>chr_bank_0_reg(2), 13=>chr_bank_0_reg(1), 12=>ppu_address_bus(12), 11=>ppu_address_bus(11), 10=>ppu_address_bus(10), others=>'0');
        else  -- 2 x 4kbyte mode
            if ppu_address_bus(12)='0' then  -- low bank
                chr_rom_address <= (16=>chr_bank_0_reg(4), 15=>chr_bank_0_reg(3), 14=>chr_bank_0_reg(2), 13=>chr_bank_0_reg(1), 12=>chr_bank_0_reg(0), 11=>ppu_address_bus(11), 10=>ppu_address_bus(10), others=>'0');
            else  -- high bank
                chr_rom_address <= (16=>chr_bank_1_reg(4), 15=>chr_bank_1_reg(3), 14=>chr_bank_1_reg(2), 13=>chr_bank_1_reg(1), 12=>chr_bank_1_reg(0), 11=>ppu_address_bus(11), 10=>ppu_address_bus(10), others=>'0');
            end if;
        end if;
    end process;
    
    -- Handle Mirroring:
    process( ppu_address_bus, control_reg )
    begin
        if control_reg(1 downto 0) = "00" then  -- one-screen, lower bank
            ciram_a10 <= '0';
        elsif control_reg(1 downto 0) = "01" then  -- one-screen, upper bank
            ciram_a10 <= '1';
        elsif control_reg(1 downto 0) = "10" then  -- Vertical Mirroring
            ciram_a10 <= ppu_address_bus(10);
        else  -- Horizontal Mirroring
            ciram_a10 <= ppu_address_bus(11);
        end if;
    end process;
   
end logic;
It just occurred to me: /ROMSEL probably won't be available until a little while after the M2 falling edge, is that correct? So do I need to introduce a delay after the M2 falling edge to read /ROMSEL? I am not doing that presently and that would affect my register writes. /ROMSEL is called cpu_address_bus(15) in my code.


Edit:
I added this:

signal m2_delayed: std_logic := '0';

m2_delayed <= m2 after 20 ns;

And then I used "m2_delayed" instead of "m2". It can still crash with that. The sprite graphics stay good and the background graphics all get messed up, so it seems to have written to one of the CHR banks incorrectly, maybe when it meant to write to the PRG bank.


Edit 2:
I knew I just read about this somewhere:
http://forums.nesdev.com/viewtopic.php?p=273548#p273548
aquasnake wrote: Mon May 31, 2021 10:25 pm When M2 rising edge comes, the data lines are not stable.

A15 has a delay of almost 33ns after the rising edge of M2, which depends on the gate delay of romsel generation circuit. An NOAC has very short delay and better compatibility.

It is not a good method to use M2 rising edge latch strictly. Even if the address is locked, A15 will be misjudged. That is to say, if the address lines are used to be locked, A15 must be ignored, and the locked A15 may not be the exact state of the internal bus. A15/romsel can only be used as the input condition of the latch, but not as the stored value of the latch.

Powerpak does not simply latch at the rising edge of M2, it has a digital filtering of M2, and even has been inverted internally.
Maybe I should trigger on rising edge M2 with delay of 40 nsec, will try that.

...

Tried it, that doesn't work at all, black screen. I have not yet gotten it to crash with 40 nsec delay and falling edge M2 though. I know that MMC5 latches writes after a delay after the rising edge of M2, I have that still stuck up on a sticky note from back then.

I bet the crashes are something to do with M2 delays. I am going to let it run with 40ns delay, falling edge, playing its demo overnight and see if it is still running in the morning.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Becoming an FPGA Engineer

Post by lidnariq »

If you can time things to the falling edge of M2, that should be reliable: all delays should align to be valid at that point.

However, this won't work for chip enables: there you'll have to delay M2 until after /ROMSEL would have changed, and depending on the external part, you may have to wait until after the data bus is stable.

See my diagram here: viewtopic.php?p=244126#p244126
Ben Boldt
Posts: 1149
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Becoming an FPGA Engineer

Post by Ben Boldt »

lidnariq wrote: Mon Jun 07, 2021 10:17 pm If you can time things to the falling edge of M2, that should be reliable: all delays should align to be valid at that point.
Based on what you are saying and your diagram, I should be able to use M2 falling edge directly to trigger register writes. Or if I wanted to handle writes on the rising edge, if I waited 100nsec, it should also work. Adding a 40 nsec delay after the M2 falling edge seemed to prevent crashing though; the demo mode ran all night and was still playing this morning. This is confusing because adding that delay should have made things worse and maybe even prevented it from working at all... I am not understanding how that test result is possible.

write_cycle.png
lidnariq wrote: Mon Jun 07, 2021 10:17 pm However, this won't work for chip enables: there you'll have to delay M2 until after /ROMSEL would have changed, and depending on the external part, you may have to wait until after the data bus is stable.
Presently I have my level shifter's /OE routed directly to /ROMSEL but I will have to keep this in mind when adding PRG-RAM support which will then not be able to do that anymore. I wasn't aware -- thanks for pointing that out.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Becoming an FPGA Engineer

Post by lidnariq »

Ben Boldt wrote: Tue Jun 08, 2021 10:05 am Adding a 40 nsec delay after the M2 falling edge seemed to prevent crashing though; the demo mode ran all night and was still playing this morning. This is confusing because adding that delay should have made things worse and maybe even prevented it from working at all...
How did you implement the 40ns delay?
Ben Boldt
Posts: 1149
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Becoming an FPGA Engineer

Post by Ben Boldt »

lidnariq wrote: Tue Jun 08, 2021 12:31 pm How did you implement the 40ns delay?
Like this:

m2_delay_code.png

My M2 does not get inverted or anything; it feeds straight through a level shifter like all the other signals. Note that I am very prone to rookie mistakes at this point, I may easily have done something too silly to be obvious.

Here is the full source code:

Code:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity MMC1_MAPPER is
port
    (
        -- CPU/PRG signals
        cpu_data_bus     : inout std_logic_vector(7 downto 0) := "ZZZZZZZZ";
        
        cpu_rw           : in    std_logic;
        m2               : in    std_logic;
        cpu_address_bus  : in    std_logic_vector(15 downto 11);
        cpu_a0           : in    std_logic;
        
        cpu_shifter_oe_n : out   std_logic := '1';
        prg_ram_we_n     : out   std_logic := '1';
        prg_ram_ce_n     : out   std_logic := '1';
        prg_ram_oe_n     : out   std_logic := '1';
        prg_rom_ce_n     : out   std_logic := '1';
        prg_rom_address  : out   std_logic_vector(18 downto 13) := "111111";
        irq_inverted     : out   std_logic := '0';
        
        
        -- PPU/CHR Signals
        ppu_rd_n         : in    std_logic;
        ppu_wr_n         : in    std_logic;
        ppu_address_bus  : in    std_logic_vector(13 downto 10);
        
        ppu_shifter_oe_n : out   std_logic;
        chr_ram_we_n     : out   std_logic;
        chr_ram_ce_n     : out   std_logic;
        chr_ram_oe_n     : out   std_logic;
        chr_rom_ce_n     : out   std_logic;
        chr_rom_address  : out   std_logic_vector(18 downto 10);
        ciram_a10        : out   std_logic;
        ciram_ce_n       : out   std_logic
    );
end MMC1_MAPPER;

architecture logic of MMC1_MAPPER is

    signal control_reg:     std_logic_vector(4 downto 0) := "11111";  -- $8000-9FFF
    signal chr_bank_0_reg:  std_logic_vector(4 downto 0) := "11111";  -- $A000-BFFF
    signal chr_bank_1_reg:  std_logic_vector(4 downto 0) := "11111";  -- $C000-DFFF
    signal prg_bank_reg:    std_logic_vector(4 downto 0) := "11111";  -- $E000-FFFF
    
    signal reg_shift:       std_logic_vector(3 downto 0) := "1111";
    signal reg_bit_counter: integer range 0 to 4 := 0;  -- counts shifted-in bits (0..4)
    signal consecutive_write: std_logic := '0';
    
    signal m2_delayed:      std_logic := '0';
    
begin
    -- CPU/PRG signals
    --cpu_data_bus <= "ZZZZZZZZ";  -- Set data bus as input.
    cpu_shifter_oe_n <= cpu_address_bus(15);
    prg_ram_we_n <= cpu_rw;
    --prg_ram_ce_n <= '1';  -- Disable PRG-RAM
    --prg_ram_oe_n <= '1';
    prg_rom_ce_n <= not cpu_rw;  -- Prevent bus conflict writing to ROM.
    --irq_inverted <= '0';
    
    
    -- PPU/CHR Signals
    ppu_shifter_oe_n <= ppu_address_bus(13);
    chr_ram_we_n <= ppu_wr_n;
    chr_ram_ce_n <= ppu_address_bus(13);
    chr_ram_oe_n <= '0';
    chr_rom_ce_n <= '1';  -- disable CHR-ROM.

    ciram_ce_n <= not ppu_address_bus(13);
    
    m2_delayed <= m2 after 40 ns;
    
    -- Handle Register Writes:
    process( m2_delayed, cpu_rw, cpu_address_bus, reg_bit_counter, control_reg, prg_bank_reg )
    begin
        
        if (m2_delayed'event) and (m2_delayed='0') then
            if (cpu_rw='1') then  -- Any CPU read cycle, release the consecutive write latch.
                consecutive_write <= '0';
                prg_ram_oe_n <= '0';  -- TEST
            else  -- Any CPU write cycle
                if consecutive_write = '0' then
                    consecutive_write <= '1';  -- Allow the register write and require a read allowing another.
                    --wait for 20 ns;
                    if(cpu_address_bus(15)='0') then
                        if cpu_data_bus(7) = '1' then  -- Reset detected by CPU D7 = 1.
                            reg_bit_counter <= 0;
                            control_reg(3 downto 2) <= "11";
                        elsif reg_bit_counter < 4 then  -- Accumulating bits.
                            reg_shift <= cpu_data_bus(0) & reg_shift(3 downto 1);
                            reg_bit_counter <= reg_bit_counter + 1;
                        elsif cpu_address_bus(14 downto 13) = "00" then  -- $8000-9FFF
                            control_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                        elsif cpu_address_bus(14 downto 13) = "01" then  -- $A000-BFFF
                            chr_bank_0_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                        elsif cpu_address_bus(14 downto 13) = "10" then  -- $C000-DFFF
                            chr_bank_1_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                        elsif cpu_address_bus(14 downto 13) = "11" then  -- $E000-FFFF
                            prg_bank_reg <= cpu_data_bus(0) & reg_shift;
                            reg_bit_counter <= 0;
                            prg_ram_oe_n <= '1';  -- TEST
                        end if;
                    end if;
                end if;
            end if;
        end if;
    end process;
    
    -- Handle PRG Banking:
    process( cpu_address_bus, control_reg, prg_bank_reg )
    begin
        if control_reg(3) = '0' then  -- 32kbyte bank mode
            prg_rom_address <= (17=>prg_bank_reg(3), 16=>prg_bank_reg(2), 15=>prg_bank_reg(1), 14=>cpu_address_bus(14), 13=>cpu_address_bus(13), others=>'0');
        elsif control_reg(2) = '0' then  -- 16kbyte bank mode, fix first bank at $8000-BFFF.
            if( cpu_address_bus(14) = '0' ) then  -- CPU is accessing $8000-BFFF.
                prg_rom_address <= (17=>'0', 16=>'0', 15=>'0', 14=>'0', 13=>cpu_address_bus(13), others=>'0');
            else  -- CPU is accessing the switchable bank at $C000-FFFF.
                prg_rom_address <= (17=>prg_bank_reg(3), 16=>prg_bank_reg(2), 15=>prg_bank_reg(1), 14=>prg_bank_reg(0), 13=>cpu_address_bus(13), others=>'0');
            end if;
        else  -- 16kbyte bank mode, fix last bank at $C000.
            if( cpu_address_bus(14) = '0' ) then  -- CPU is accessing $8000-BFFF.
                prg_rom_address <= (17=>prg_bank_reg(3), 16=>prg_bank_reg(2), 15=>prg_bank_reg(1), 14=>prg_bank_reg(0), 13=>cpu_address_bus(13), others=>'0');
            else  -- CPU is accessing last bank at $C000-FFFF.
                prg_rom_address <= (17=>'1', 16=>'1', 15=>'1', 14=>'1', 13=>cpu_address_bus(13), others=>'0');
            end if;
        end if;
    end process;
    
    -- Handle CHR Banking:
    process( ppu_address_bus, control_reg, chr_bank_0_reg, chr_bank_1_reg )
    begin
        if control_reg(4) = '0' then  -- 1 x 8kbyte mode
            chr_rom_address <= (16=>chr_bank_0_reg(4), 15=>chr_bank_0_reg(3), 14=>chr_bank_0_reg(2), 13=>chr_bank_0_reg(1), 12=>ppu_address_bus(12), 11=>ppu_address_bus(11), 10=>ppu_address_bus(10), others=>'0');
        else  -- 2 x 4kbyte mode
            if ppu_address_bus(12)='0' then  -- low bank
                chr_rom_address <= (16=>chr_bank_0_reg(4), 15=>chr_bank_0_reg(3), 14=>chr_bank_0_reg(2), 13=>chr_bank_0_reg(1), 12=>chr_bank_0_reg(0), 11=>ppu_address_bus(11), 10=>ppu_address_bus(10), others=>'0');
            else  -- high bank
                chr_rom_address <= (16=>chr_bank_1_reg(4), 15=>chr_bank_1_reg(3), 14=>chr_bank_1_reg(2), 13=>chr_bank_1_reg(1), 12=>chr_bank_1_reg(0), 11=>ppu_address_bus(11), 10=>ppu_address_bus(10), others=>'0');
            end if;
        end if;
    end process;
    
    -- Handle Mirroring:
    process( ppu_address_bus, control_reg )
    begin
        if control_reg(1 downto 0) = "00" then  -- one-screen, lower bank
            ciram_a10 <= '0';
        elsif control_reg(1 downto 0) = "01" then  -- one-screen, upper bank
            ciram_a10 <= '1';
        elsif control_reg(1 downto 0) = "10" then  -- Vertical Mirroring
            ciram_a10 <= ppu_address_bus(10);
        else  -- Horizontal Mirroring
            ciram_a10 <= ppu_address_bus(11);
        end if;
    end process;
   
end logic;


Edit:
The delay is not working actually. After M2 falls, the delay should be there before the debug signal rises, and it is not.

tek00021.png
Trirosmos
Posts: 50
Joined: Mon Aug 01, 2016 4:01 am
Location: Brinstar, Zebes

Re: Becoming an FPGA Engineer

Post by Trirosmos »

Ben Boldt wrote: Tue Jun 08, 2021 2:08 pm
lidnariq wrote: Tue Jun 08, 2021 12:31 pm How did you implement the 40ns delay?
Like this:

Code:

library ieee;
  m2_delayed <= m2 after 40 ns;
I'm more used to Verilog than VHDL, but either way, I can't see how an after statement could be synthesizable. After all, the CPLD has no concept of time whatsoever. Your VHDL toolchain might have some things to say about that... or maybe the default behaviour is to just silently ignore wait or after statements, dunno.
Some CPLDs have internal timers, while some others might require an external crystal. Either way, the only way to delay M2 is to count cycles based on some reference clock, afaik.
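
As a rough sketch of that idea, the delay can be built from a shift register clocked by a reference clock. Everything here is an assumption for illustration: the entity name m2_delay, the clk_ref input and the STAGES generic are hypothetical, and the reference clock has to be much faster than M2 for the result to be useful.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: delay M2 by sampling it with a faster reference clock.
entity m2_delay is
    generic (
        STAGES : integer range 2 to 16 := 3   -- number of flip-flops in the delay line
    );
    port (
        clk_ref    : in  std_logic;           -- assumed free-running reference clock
        m2         : in  std_logic;
        m2_delayed : out std_logic
    );
end m2_delay;

architecture logic of m2_delay is
    signal taps : std_logic_vector(STAGES - 1 downto 0) := (others => '0');
begin
    process (clk_ref)
    begin
        if rising_edge(clk_ref) then
            -- Shift the sampled M2 one stage further on every reference clock edge.
            taps <= taps(STAGES - 2 downto 0) & m2;
        end if;
    end process;

    -- The output follows M2 between (STAGES - 1) and STAGES clk_ref periods late,
    -- because M2 is asynchronous to clk_ref and is first captured on the next edge.
    m2_delayed <= taps(STAGES - 1);
end logic;
```

The delay resolution is one reference clock period, so the achievable delay depends entirely on how fast clk_ref is. The first stages also act as a synchronizer for the asynchronous M2 input.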
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Becoming an FPGA Engineer

Post by lidnariq »

My understanding is that not all synthesis software supports delays, but I would have expected an error for something unsynthesizable...
Trirosmos
Posts: 50
Joined: Mon Aug 01, 2016 4:01 am
Location: Brinstar, Zebes
Contact:

Re: Becoming an FPGA Engineer

Post by Trirosmos »

lidnariq wrote: Tue Jun 08, 2021 5:35 pm My understanding is that not all synthesis software supports delays, but I would have expected an error for something unsynthesizable...
Well, at least when it comes to Verilog in Quartus, things such as $display, $write or delay statements result in a warning, but are otherwise ignored. I suspect other synthesis tools would behave the same way, as it's quite handy to be able to use the same file, containing both a module and its simulation testbench, for both synthesis and simulation.
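
For comparison, here is a small simulation-only sketch where an after clause does what it says. The testbench name after_demo_tb and the timing values are made up for illustration; this is meant for a VHDL simulator, not for synthesis.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench: 'after' and 'wait for' are honored by a simulator.
entity after_demo_tb is
end after_demo_tb;

architecture sim of after_demo_tb is
    signal m2         : std_logic := '0';
    signal m2_delayed : std_logic;
begin
    -- In simulation this really delays the signal by 40 ns.
    m2_delayed <= m2 after 40 ns;

    stimulus : process
    begin
        wait for 100 ns;
        m2 <= '1';                        -- M2 rises at 100 ns
        wait for 39 ns;                   -- now at 139 ns
        assert m2_delayed = '0'
            report "m2_delayed rose too early" severity error;
        wait for 2 ns;                    -- now at 141 ns
        assert m2_delayed = '1'
            report "m2_delayed did not follow m2" severity error;
        wait;                             -- suspend the stimulus process forever
    end process;
end sim;
```

Timing clauses like these describe simulated behaviour for stimulus and checking, which is why they are useful in testbenches even though they do not produce a real delay in hardware.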
Ben Boldt
Posts: 1149
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Becoming an FPGA Engineer

Post by Ben Boldt »

I literally just googled how to do a delay in VHDL and put that statement in there, not having the slightest clue. Good to know that it doesn't make sense to do that. In what situation(s) could you correctly use a 'delay' or 'after' statement? Is that something for a different type of hardware?

This demo board does have a great big 10MHz crystal and also a couple other crystals. I am thinking at least 1 of those crystals is for the built-in USB-Blaster (probably the 24MHz one). I have done absolutely nothing to configure them. I will look into this and see if I can follow your suggestion creating a counter based on an external clock source.

Here is my build info with lots of warnings if you are interested. Nothing jumps out at me about this delay.

Code:

Info: *******************************************************************
Info: Running Quartus Prime Analysis & Synthesis
	Info: Version 20.1.1 Build 720 11/11/2020 SJ Lite Edition
	Info: Processing started: Tue Jun 08 20:08:32 2021
Info: Command: quartus_map --read_settings_files=on --write_settings_files=off nrom -c nrom-top
Warning (18236): Number of processors has not been specified which may cause overloading on shared machines.  Set the global assignment NUM_PARALLEL_PROCESSORS in your QSF to an appropriate value for best performance.
Info (20030): Parallel compilation is enabled and will use 6 of the 6 processors detected
Info (12021): Found 1 design units, including 1 entities, in source file nrom-top.bdf
	Info (12023): Found entity 1: nrom-top
Info (12021): Found 2 design units, including 1 entities, in source file nrom_block.vhd
	Info (12022): Found design unit 1: NROM_MAPPER-logic
	Info (12023): Found entity 1: NROM_MAPPER
Info (12021): Found 2 design units, including 1 entities, in source file mmc1_block.vhd
	Info (12022): Found design unit 1: MMC1_MAPPER-logic
	Info (12023): Found entity 1: MMC1_MAPPER
Info (12127): Elaborating entity "nrom-top" for the top level hierarchy
Info (12128): Elaborating entity "MMC1_MAPPER" for hierarchy "MMC1_MAPPER:inst"
Warning (10540): VHDL Signal Declaration warning at mmc1_block.vhd(18): used explicit default value for signal "prg_ram_ce_n" because signal was never assigned a value
Warning (10540): VHDL Signal Declaration warning at mmc1_block.vhd(22): used explicit default value for signal "irq_inverted" because signal was never assigned a value
Warning (13039): The following bidirectional pins have no drivers
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
	Warning (13040): bidirectional pin "cpu_data_bus" has no driver
Warning (13024): Output pins are stuck at VCC or GND
	Warning (13410): Pin "prg_ram_ce_n" is stuck at VCC
	Warning (13410): Pin "irq_inverted" is stuck at GND
	Warning (13410): Pin "chr_ram_oe_n" is stuck at GND
	Warning (13410): Pin "chr_rom_ce_n" is stuck at VCC
	Warning (13410): Pin "chr_rom_address[18]" is stuck at GND
	Warning (13410): Pin "chr_rom_address[17]" is stuck at GND
	Warning (13410): Pin "prg_rom_address[18]" is stuck at GND
Warning (21074): Design contains 4 input pin(s) that do not drive logic
	Warning (15610): No output dependent on input pin "cpu_a0"
	Warning (15610): No output dependent on input pin "ppu_rd_n"
	Warning (15610): No output dependent on input pin "cpu_address_bus[12]"
	Warning (15610): No output dependent on input pin "cpu_address_bus[11]"
Info (21057): Implemented 138 device resources after synthesis - the final resource count might be different
	Info (21058): Implemented 14 input pins
	Info (21059): Implemented 28 output pins
	Info (21060): Implemented 8 bidirectional pins
	Info (21061): Implemented 88 logic cells
Info: Quartus Prime Analysis & Synthesis was successful. 0 errors, 25 warnings
	Info: Peak virtual memory: 4829 megabytes
	Info: Processing ended: Tue Jun 08 20:08:43 2021
	Info: Elapsed time: 00:00:11
	Info: Total CPU time (on all processors): 00:00:22
Info: *******************************************************************
Info: Running Quartus Prime Fitter
	Info: Version 20.1.1 Build 720 11/11/2020 SJ Lite Edition
	Info: Processing started: Tue Jun 08 20:08:44 2021
Info: Command: quartus_fit --read_settings_files=off --write_settings_files=off nrom -c nrom-top
Info: qfit2_default_script.tcl version: #1
Info: Project  = nrom
Info: Revision = nrom-top
Warning (18236): Number of processors has not been specified which may cause overloading on shared machines.  Set the global assignment NUM_PARALLEL_PROCESSORS in your QSF to an appropriate value for best performance.
Info (20030): Parallel compilation is enabled and will use 6 of the 6 processors detected
Info (119006): Selected device 5M570ZF256C5 for design "nrom-top"
Info (21077): Low junction temperature is 0 degrees C
Info (21077): High junction temperature is 85 degrees C
Info (171003): Fitter is performing an Auto Fit compilation, which may decrease Fitter effort to reduce compilation time
Warning (292013): Feature LogicLock is only available with a valid subscription license. You can purchase a software subscription to gain full access to this feature.
Info (176444): Device migration not selected. If you intend to use device migration later, you may need to change the pin assignments as they may be incompatible with other devices
	Info (176445): Device 5M570ZF256I5 is compatible
	Info (176445): Device 5M1270ZF256C5 is compatible
	Info (176445): Device 5M1270ZF256I5 is compatible
	Info (176445): Device 5M2210ZF256C5 is compatible
	Info (176445): Device 5M2210ZF256I5 is compatible
Info (332104): Reading SDC File: 'nrom-top.sdc'
Info (332144): No user constrained base clocks found in the design
Info (332128): Timing requirements not specified -- optimizing circuit to achieve the following default global requirements
	Info (332127): Assuming a default timing requirement
Info (332111): Found 1 clocks
	Info (332111):   Period   Clock Name
	Info (332111): ======== ============
	Info (332111):    1.000           m2
Info (186079): Completed User Assigned Global Signals Promotion Operation
Info (186215): Automatically promoted signal "m2" to use Global clock
Info (186228): Pin "m2" drives global clock, but is not placed in a dedicated clock pin position
Info (186079): Completed Auto Global Promotion Operation
Info (176234): Starting register packing
Info (186468): Started processing fast register assignments
Info (186469): Finished processing fast register assignments
Info (176235): Finished register packing
Info (171121): Fitter preparation operations ending: elapsed time is 00:00:00
Info (14896): Fitter has disabled Advanced Physical Optimization because it is not supported for the current family.
Info (170189): Fitter placement preparation operations beginning
Info (170190): Fitter placement preparation operations ending: elapsed time is 00:00:00
Info (170191): Fitter placement operations beginning
Info (170137): Fitter placement was successful
Info (170192): Fitter placement operations ending: elapsed time is 00:00:00
Info (170193): Fitter routing operations beginning
Info (170195): Router estimated average interconnect usage is 4% of the available device resources
	Info (170196): Router estimated peak interconnect usage is 4% of the available device resources in the region that extends from location X0_Y0 to location X13_Y8
Info (170199): The Fitter performed an Auto Fit compilation.  Optimizations were skipped to reduce compilation time.
	Info (170201): Optimizations that may affect the design's routability were skipped
Info (170194): Fitter routing operations ending: elapsed time is 00:00:00
Info (11888): Total time spent on timing analysis during the Fitter is 0.11 seconds.
Info (11218): Fitter post-fit operations ending: elapsed time is 00:00:00
Warning (169064): Following 8 pins have no output enable or a GND or VCC output enable - later changes to this connectivity may change fitting results
	Info (169065): Pin cpu_data_bus[7] has a permanently disabled output enable
	Info (169065): Pin cpu_data_bus[6] has a permanently disabled output enable
	Info (169065): Pin cpu_data_bus[5] has a permanently disabled output enable
	Info (169065): Pin cpu_data_bus[4] has a permanently disabled output enable
	Info (169065): Pin cpu_data_bus[3] has a permanently disabled output enable
	Info (169065): Pin cpu_data_bus[2] has a permanently disabled output enable
	Info (169065): Pin cpu_data_bus[1] has a permanently disabled output enable
	Info (169065): Pin cpu_data_bus[0] has a permanently disabled output enable
Warning (169174): The Reserve All Unused Pins setting has not been specified, and will default to 'As output driving ground'.
Info (144001): Generated suppressed messages file C:/altera/projects/mmc1_vhdl/output_files/nrom-top.fit.smsg
Info: Quartus Prime Fitter was successful. 0 errors, 4 warnings
	Info: Peak virtual memory: 5716 megabytes
	Info: Processing ended: Tue Jun 08 20:08:46 2021
	Info: Elapsed time: 00:00:02
	Info: Total CPU time (on all processors): 00:00:03
Info: *******************************************************************
Info: Running Quartus Prime Assembler
	Info: Version 20.1.1 Build 720 11/11/2020 SJ Lite Edition
	Info: Processing started: Tue Jun 08 20:08:48 2021
Info: Command: quartus_asm --read_settings_files=off --write_settings_files=off nrom -c nrom-top
Warning (18236): Number of processors has not been specified which may cause overloading on shared machines.  Set the global assignment NUM_PARALLEL_PROCESSORS in your QSF to an appropriate value for best performance.
Info (115031): Writing out detailed assembly data for power analysis
Info (115030): Assembler is generating device programming files
Info: Quartus Prime Assembler was successful. 0 errors, 1 warning
	Info: Peak virtual memory: 4730 megabytes
	Info: Processing ended: Tue Jun 08 20:08:49 2021
	Info: Elapsed time: 00:00:01
	Info: Total CPU time (on all processors): 00:00:01
Info (293026): Skipped module Power Analyzer due to the assignment FLOW_ENABLE_POWER_ANALYZER
Info: *******************************************************************
Info: Running Quartus Prime Timing Analyzer
	Info: Version 20.1.1 Build 720 11/11/2020 SJ Lite Edition
	Info: Processing started: Tue Jun 08 20:08:50 2021
Info: Command: quartus_sta nrom -c nrom-top
Info: qsta_default_script.tcl version: #1
Warning (18236): Number of processors has not been specified which may cause overloading on shared machines.  Set the global assignment NUM_PARALLEL_PROCESSORS in your QSF to an appropriate value for best performance.
Info (20030): Parallel compilation is enabled and will use 6 of the 6 processors detected
Info (21077): Low junction temperature is 0 degrees C
Info (21077): High junction temperature is 85 degrees C
Info (334003): Started post-fitting delay annotation
Info (334004): Delay annotation completed successfully
Info (332104): Reading SDC File: 'nrom-top.sdc'
Info (332142): No user constrained base clocks found in the design. Calling "derive_clocks -period 1.0"
Info (332105): Deriving Clocks
	Info (332105): create_clock -period 1.000 -name m2 m2
Info: Found TIMING_ANALYZER_REPORT_SCRIPT_INCLUDE_DEFAULT_ANALYSIS = ON
Info: Can't run Report Timing Closure Recommendations. The current device family is not supported.
Critical Warning (332148): Timing requirements not met
Info (332146): Worst-case setup slack is -24.329
	Info (332119):     Slack       End Point TNS Clock 
	Info (332119): ========= =================== =====================
	Info (332119):   -24.329           -1260.450 m2 
Info (332146): Worst-case hold slack is 3.129
	Info (332119):     Slack       End Point TNS Clock 
	Info (332119): ========= =================== =====================
	Info (332119):     3.129               0.000 m2 
Info (332140): No Recovery paths to report
Info (332140): No Removal paths to report
Info (332146): Worst-case minimum pulse width slack is -2.289
	Info (332119):     Slack       End Point TNS Clock 
	Info (332119): ========= =================== =====================
	Info (332119):    -2.289              -2.289 m2 
Info (332001): The selected device family is not supported by the report_metastability command.
Info (332102): Design is not fully constrained for setup requirements
Info (332102): Design is not fully constrained for hold requirements
Info: Quartus Prime Timing Analyzer was successful. 0 errors, 2 warnings
	Info: Peak virtual memory: 4755 megabytes
	Info: Processing ended: Tue Jun 08 20:08:52 2021
	Info: Elapsed time: 00:00:02
	Info: Total CPU time (on all processors): 00:00:02
Info (293000): Quartus Prime Full Compilation was successful. 0 errors, 32 warnings
Trirosmos
Posts: 50
Joined: Mon Aug 01, 2016 4:01 am
Location: Brinstar, Zebes
Contact:

Re: Becoming an FPGA Engineer

Post by Trirosmos »

Ben Boldt wrote: Tue Jun 08, 2021 6:20 pm Good to know that it doesn't make sense to do that. In what situation(s) could you correctly use a 'delay' or 'after' statement? Is that something for a different type of hardware?
They're only useful for simulations.
I believe some FPGAs have I/O blocks that allow for delaying a signal by a tiny bit. A chain of inverters could also introduce some delay, but getting precise timings out of such an arrangement is pretty much impossible. Other than those options, the only way to get a delay really is with some extra logic and a clock source.
I suppose synthesis tools could try to automatically generate said logic and configure a PLL for you, but that would get very messy very quickly.
Ben Boldt wrote: Tue Jun 08, 2021 6:20 pm I will look into this and see if I can follow your suggestion creating a counter based on an external clock source.
In Verilog, it could look something like this:

Code: Select all

// Two-stage shift register that samples M2 on each external_clk edge
reg [1:0] m2_buffer;
// High only after M2 has been sampled high on two consecutive external_clk edges
wire m2_delayed = m2_buffer[1] & m2_buffer[0];

always @(posedge external_clk) m2_buffer <= {m2_buffer[0], m2};

always @(posedge m2_delayed) begin
  // Mapper logic
end

fwiw, mapper register writes can all be triggered on the falling edge of M2, which makes all of this much simpler. As for chip enables and whatnot, those can be driven by asynchronous logic. That's what both the PowerPak and EverDrive do. Edit: nope, was wrong about that. They do the delayed M2 thing.
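To make the falling-edge idea concrete, here is a minimal sketch of a single mapper register latched on the falling edge of M2. Everything in it is an assumption for illustration (the module name, the port names `cpu_rw`, `cpu_a15`, `cpu_d`, and the single 8-bit `bank_reg`); it is not taken from any real mapper, and it relies on the CPU still holding address and write data valid when M2 falls.

```verilog
// Hypothetical sketch: all names below are assumptions, not from a real mapper.
module m2_negedge_reg (
    input  wire       m2,       // CPU M2 clock
    input  wire       cpu_rw,   // 1 = read, 0 = write
    input  wire       cpu_a15,  // high for CPU addresses $8000-$FFFF
    input  wire [7:0] cpu_d,    // CPU data bus (input only in this sketch)
    output reg  [7:0] bank_reg  // latched register value
);
    // Capture the write at the end of the cycle, when M2 falls.
    // Assumes address, R/W and data are still valid at that edge.
    always @(negedge m2) begin
        if (!cpu_rw && cpu_a15)
            bank_reg <= cpu_d;
    end
endmodule
```

No external clock or delayed copy of M2 is needed for this part; only the register write is covered here, not the chip enables.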
Ben Boldt wrote: Tue Jun 08, 2021 6:20 pm Here is my build info with lots of warnings if you are interested. Nothing jumps out at me about this delay.
Oh, yeah, my bad. Delays do seem to be completely ignored. But if I try to use unsynthesizable tasks such as $write, Quartus does throw an "ignoring unsupported system task" warning.
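For comparison, this is roughly where a delay does mean something: a simulation-only testbench. This is a sketch with made-up names (`delay_demo_tb`, `m2_late`) and an arbitrary 200 ns test clock that is not the real M2 period. A simulator models the `#50`, while a synthesis tool would typically drop it and leave a plain connection.

```verilog
`timescale 1ns/1ps

// Simulation-only sketch: every name here is hypothetical.
module delay_demo_tb;
    reg  m2 = 1'b0;
    wire m2_late;

    // Simulators model this 50 ns delay; synthesis tools typically
    // ignore the "#50" and treat this as a plain assignment.
    assign #50 m2_late = m2;

    // Arbitrary 200 ns test clock (not the real M2 period).
    always #100 m2 = ~m2;

    initial begin
        // Print every change so the 50 ns lag is visible in the log.
        $monitor("t=%0t m2=%b m2_late=%b", $time, m2, m2_late);
        #1000 $finish;
    end
endmodule
```

Running it in a simulator should show `m2_late` following `m2` with a lag; none of that lag survives into hardware.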
Last edited by Trirosmos on Tue Jun 08, 2021 7:44 pm, edited 2 times in total.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Becoming an FPGA Engineer

Post by lidnariq »

Trirosmos wrote: Tue Jun 08, 2021 7:25 pm As for chip enables and whatnot, those can be driven by asynchronous logic. That's what both the PowerPak and EverDrive do.
It's probably what the Sunsoft 3 does, too. (nesdevwiki:Sunsoft 3 pinout)