YCPU: an imaginary 16-bit processor.

tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Addressing modes for a fake cpu - what would you include

Post by tepples »

Joe wrote:
tepples wrote:nor what registers a subroutine can expect to preserve
That kind of information belongs in the specification for the ABI
True. Perhaps I should have said "You'll need to define what JSR, RTS, SWI, and RTI do in order to give ABI designers something to work with."
Bregalad
Posts: 8036
Joined: Fri Nov 12, 2004 2:49 pm
Location: Caen, France

Re: Addressing modes for a fake cpu - what would you include

Post by Bregalad »

When thinking about the stack, I realized that in order to have the stack grow downwards, I would have to implement two more addressing modes - the opposites of the Post-increment and Pre-decrement indirects. Do I have this correct?
Not really. There are really four kinds of stack. Using ARM's notation:
- Full decreasing
- Empty decreasing
- Full increasing
- Empty increasing

The full / empty distinction determines whether the stack pointer points at the address where the next element will be pushed (empty), or at the address where the last element was pushed (full).

The increasing / decreasing distinction determines in which direction the stack grows in memory.

There is no major difference between any of these for the user, and there is no technical reason to use one instead of another, except the limitations of the addressing modes. For instance, the 6502 and its family are hardwired to empty decreasing, while the ARM in Thumb mode uses full decreasing (please correct me if I'm wrong).

So I really think you should use the stack that makes the most sense for your addressing modes; there's no need to add more modes just to support a different kind of stack.
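To make the four conventions concrete, here is a minimal Python sketch. The `Stack` class, its field names, and the word-addressed memory list are all illustrative assumptions, not part of any spec in this thread.

```python
# Sketch of the four stack conventions (ARM naming).
# "full"  : SP points at the most recently pushed word.
# "empty" : SP points at the next free slot.
class Stack:
    def __init__(self, mem, sp, full, descending):
        self.mem = mem
        self.sp = sp
        self.full = full
        self.step = -1 if descending else 1

    def push(self, value):
        if self.full:
            self.sp += self.step       # move first, then store (pre-modify)
            self.mem[self.sp] = value
        else:
            self.mem[self.sp] = value  # store first, then move (post-modify)
            self.sp += self.step

    def pop(self):
        if self.full:
            value = self.mem[self.sp]  # load first, then move back
            self.sp -= self.step
        else:
            self.sp -= self.step       # move back first, then load
            value = self.mem[self.sp]
        return value
```

In this model a full decreasing stack only needs a pre-decrement store for push and a post-increment load for pop, so the two modes mentioned in the question would already cover it.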
Jarhmander
Formerly ~J-@D!~
Posts: 521
Joined: Sun Mar 12, 2006 12:36 am
Location: Rive nord de Montréal

Re: Addressing modes for a fake cpu - what would you include

Post by Jarhmander »

In addition...
  • if the offset (in your indirect offset and indirect indexed modes) can only be positive, a downward-growing stack is much more useful than an upward-growing one, because you can then access a stacked element with that offset;
  • the decision between full and empty descending is a bit of a personal choice; however, a full descending stack gives a useful meaning to LOD R0, [SP, 0]: it accesses the most recently stacked element (with an empty descending stack, the same access is invalid, or at least meaningless). The only cost, as you mentioned, is that you initialize SP to an "invalid" location, e.g. if your stack uses words $1FF downward then you initialize SP with $200.
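A small sketch of that point, assuming word-addressed memory, a 16-bit SP, and an unsigned offset; `push` and `load_sp_offset` are hypothetical helper names standing in for a push instruction and `LOD Rx, [SP, offset]`.

```python
mem = [0] * 0x10000
sp = 0x200                      # one past the top stack word ($1FF)

def push(value):
    """Full descending push: pre-decrement SP, then store."""
    global sp
    sp = (sp - 1) & 0xFFFF
    mem[sp] = value & 0xFFFF

def load_sp_offset(offset):
    """Model of LOD Rx, [SP, offset] with a non-negative offset."""
    return mem[(sp + offset) & 0xFFFF]

push(0x1111)
push(0x2222)
top = load_sp_offset(0)         # most recently pushed word
below = load_sp_offset(1)       # the word pushed before it
```

With an upward-growing stack the same two reads would need negative offsets.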
pops
Posts: 91
Joined: Sun Apr 04, 2010 4:28 pm

Re: Addressing modes for a fake cpu - what would you include

Post by pops »

I've updated the specification based on your suggestions, and placed it on the first post of this thread.
tepples wrote:ZCPU might be confused with IBM or Infocom ISAs.
Thanks for pointing that out. I've decremented the first letter of the spec; it's now YCPU
tepples wrote:You'll need to define what JSR, RTS, SWI, and RTI do in order to give ABI designers something to work with.
JSR/RTS save and restore only the PC. There is no guarantee that registers will remain unchanged; thus, saving and restoring them must be done manually.

SWI and RTI save and restore only the PC and FL. Save/restore of registers is handled in some cases by the calling thread, and in other cases by the interrupt routine. For interrupts which are called by a thread (an example would be a system call), the programmer should be aware of the operation of the interrupt and thus know which registers they want to save before calling the interrupt. For interrupts which occur without a thread calling them (hardware interrupts, for example), the interrupt routine must save and restore any registers that it modifies.

Question about system calls - how should I expose the hardware to the CPU? Should I allow any program to query hardware and send messages to hardware, or should that function be restricted to the Supervisor thread only, which can expose hardware functionality via a SWI?
tepples wrote:Signed and unsigned multiplication are the same operation for 16x16=16. They differ only for 16x16=32.
I'm sorry, I should have clarified: MUL/MLI are always 16x16=32b, with the high 16b of the result stored in R0. If you specify R0 as the source/dest register, the following will result:
1. Operation completes.
2. High 16b are written to R0.
3. Low 16b are written to Rx - in this case, R0. R0 contains the low 16b, and the high 16b are wiped out.
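The three steps can be sketched as follows. The register file as a list of eight words and the `signed` parameter are assumptions made for illustration; the thread does not say which of MUL/MLI is the signed variant.

```python
def to_signed16(x):
    """Interpret a 16-bit word as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def mul16(regs, dest, operand, signed):
    """regs[dest] = regs[dest] * operand as a 16x16=32 multiply."""
    a, b = regs[dest] & 0xFFFF, operand & 0xFFFF
    if signed:
        a, b = to_signed16(a), to_signed16(b)
    product = (a * b) & 0xFFFFFFFF   # step 1: operation completes
    regs[0] = product >> 16          # step 2: high word goes to R0
    regs[dest] = product & 0xFFFF    # step 3: low word goes to Rx
                                     # (overwrites the high word if dest is 0)
```

For example, $FFFF times $FFFF leaves $0001 in the destination either way, but R0 holds $FFFE for the unsigned product and $0000 for the signed one, which matches the remark that the two operations differ only in the high word.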
tepples wrote:What are 5 % -3, -5 % 3, and -5 % -3 in this ISA?
I've decided that the result has the same sign as the dividend (as in C++11, C#, and Java).
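A quick sketch of that rule (truncating division), with `trunc_divmod` as a hypothetical helper name. Python is used here only as a calculator; note that Python's own `%` follows the sign of the divisor instead, so it cannot be used directly.

```python
def trunc_divmod(a, b):
    """Quotient truncated toward zero; remainder takes the dividend's sign."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b

r1 = trunc_divmod(5, -3)[1]    # remainder  2
r2 = trunc_divmod(-5, 3)[1]    # remainder -2
r3 = trunc_divmod(-5, -3)[1]   # remainder -2
```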

Finally, thanks to everyone who posted descriptions of the different ways in which a stack can work. I've decided to go with a full decreasing stack. I'm thinking about adding bounds-checking functionality to the processor via additional special registers: for example, a pair of stack bounds registers could help the processor determine whether the stack has overflowed or underflowed. Is there a better way to handle this?
zzo38
Posts: 1080
Joined: Mon Feb 07, 2011 12:46 pm

Re: Addressing modes for a fake cpu - what would you include

Post by zzo38 »

Can't you use POP PC in place of RTS (and POP PC,FL in place of RTI)? You seem to have defined these instructions already, so why do you need the RTS and RTI instructions?
psycopathicteen
Posts: 3001
Joined: Wed May 19, 2010 6:12 pm

Re: Addressing modes for a fake cpu - what would you include

Post by psycopathicteen »

There's one thing that would've made a true 16-bit CPU difficult to use in a system, and that's being limited to 128 kilobytes of memory. Maybe you can give every register its own bank register, and have an instruction that swaps the data between the main register and its bank register.
pops
Posts: 91
Joined: Sun Apr 04, 2010 4:28 pm

Re: YCPU: an imaginary 16-bit processor

Post by pops »

I do intend to keep the flat address space of 64 kilowords. When I know enough to implement it, I'll also implement an MMU spec that allows for bank switching 4kb segments and knows enough about thread context to make sure that non-privileged threads can't access memory they don't own.
pops
Posts: 91
Joined: Sun Apr 04, 2010 4:28 pm

Re: Addressing modes for a fake cpu - what would you include

Post by pops »

I've updated the specification again, and updated the first post of this thread.
zzo38 wrote:Can't you use POP PC in place of RTS (and POP PC,FL in place of RTI)? You seem to have defined these instructions already, so why do you need the RTS and RTI instructions?
Your comment made me realize that I had neglected some interrupt functionality - I've now given the processor a 'supervisor/interrupt' mode where the processor has 100% access to all features, in contrast to a 'user' mode, which has a different stack pointer and cannot directly access all memory, etc. However, your comment still holds true for RTS.

==========================[ Interrupt Instructions ]============================

  RTI               Return from Interrupt
  Returns from an interrupt.
  Processor is currently in Supervisor mode.
    1.  N bit in Flags register is cleared.
    2.  FL is popped from the Supervisor Stack.
    3A. If Supervisor Mode Bit is set, we are still in Supervisor mode:
        M is popped from the stack [using SSP]. PC is set to M.
    3B. If Supervisor Mode Bit is not set, we exit Supervisor mode:
        M is popped from the stack [using USP]. PC is set to M.
    4.  Execution continues.
  
  
  SWI               Call Software Interrupt
  Calls interrupt with index = R0.
    1. PC is pushed to the current active stack.
    2. FL is pushed to Supervisor Stack.
    3. Supervisor Bit in FL is set.
    4. N bit in Flags register is set.
    5. PC is set to Mem[IA + (R0 & $00FF)].
    6. Execution continues.
I'm not sure that this is the best way to exit user mode, and I'm not sure how I can enter user mode. My current idea is to have an opcode that jumps to a subroutine and turns off Supervisor mode at the same time. Thoughts?
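To check the SWI/RTI steps against each other, here is a minimal Python model. Everything not in the listing above is an assumption: the bit positions `FL_S` and `FL_N`, the table base `IA`, the initial stack pointers, word-addressed memory, a full decreasing stack, and PC already pointing at the return address when SWI runs.

```python
FL_S = 0x8000   # Supervisor Mode bit (assumed position)
FL_N = 0x4000   # interrupt-in-process bit, "N" in the listing (assumed position)
IA = 0x0100     # interrupt table base address (assumed value)

class Cpu:
    def __init__(self):
        self.mem = [0] * 0x10000
        self.r0 = self.pc = self.fl = 0
        self.ssp = 0x0400           # supervisor stack pointer (assumed)
        self.usp = 0x0800           # user stack pointer (assumed)

    def _push(self, name, value):   # full decreasing: pre-decrement, store
        sp = (getattr(self, name) - 1) & 0xFFFF
        setattr(self, name, sp)
        self.mem[sp] = value & 0xFFFF

    def _pop(self, name):           # load, then post-increment
        sp = getattr(self, name)
        setattr(self, name, (sp + 1) & 0xFFFF)
        return self.mem[sp]

    def swi(self):
        active = "ssp" if self.fl & FL_S else "usp"
        self._push(active, self.pc)                    # 1. PC to active stack
        self._push("ssp", self.fl)                     # 2. FL to supervisor stack
        self.fl |= FL_S                                # 3. enter supervisor mode
        self.fl |= FL_N                                # 4. mark interrupt in process
        self.pc = self.mem[(IA + (self.r0 & 0x00FF)) & 0xFFFF]  # 5. vector fetch

    def rti(self):
        self.fl &= ~FL_N & 0xFFFF                      # 1. clear N
        self.fl = self._pop("ssp")                     # 2. restore saved FL
        source = "ssp" if self.fl & FL_S else "usp"    # 3A / 3B
        self.pc = self._pop(source)
```

In this model, step 1 of RTI has no lasting effect because step 2 replaces the whole FL register, and whether RTI leaves supervisor mode depends only on the S bit of the FL word it pops.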

I've also been thinking about an implementation of a memory paging scheme. This is what I'm currently thinking: the address space is divided into 4kb pages. Each active page can contain any page from the 64kw of memory, a blank page (all 0x0000), or a page from a ROM chip. Each page also has bits specifying if the page can be Executed, if the page can be written, and if the page is supervisor only. There will be some ability to page out and page in - perhaps to some slower storage device - but this will come at a performance penalty.

Aside from my remaining Supervisor mode and MMU questions, are there any other obvious features that my processor is missing?
zzo38
Posts: 1080
Joined: Mon Feb 07, 2011 12:46 pm

Re: YCPU: an imaginary 16-bit processor.

Post by zzo38 »

  • The new "Interrupt In Process" flag is still labeled "N", which is also the label of the negative flag. It should be labeled with a different letter.
  • It says that RTS is a synonym for "POP PC", although it still has its own opcode number. It could be simplified by not giving it its own opcode number and instead making it just an assembler macro for "POP PC".
  • It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
  • It says RTI clears the interrupt in process flag before popping the flag. This doesn't seem to make much sense; it will be overridden anyways. It seems to me that "POP FL,PC" would do the same thing if it is currently in supervisor mode (and that in user mode, it would just remain in user mode since supervisor-only flags are ignored).
  • You could use the high three bits of the TSR and TSW opcodes to select which register to apply to, rather than always R0.
  • If you divide by zero, what will be written into the result register?
  • LOD and STO are not mentioned in ALU section.
  • A signed divide of -32768 by -1 would not have a representable result; perhaps it should set the carry flag in this case (and leave alone the register value)?
  • Flag affected by shifting is not mentioned either. (Does it affect carry like in 6502?)
  • Is this rotation through a register or through the carry flag? (You could have both; an unused SHF sub-opcode exists which could be defined for this purpose.)
  • When you test a bit, what flags are affected to show you the result? (Maybe it should store the result in the carry flag?)
pops wrote:I'm not sure how I can enter user mode...
You could have an opcode that jumps and enters user mode, but another way would be to push stuff into the stack using other instructions and then use RTI (or POP FL,PC) to enter user mode.
pops wrote:Each page also has bits specifying if the page can be Executed, if the page can be written, and if the page is supervisor only.
I do not think an execution bit is really necessary; x86 has it, but I do not think it is useful there either. It seems to be used only to compensate for badly written programs, and that is not a good excuse.
Joe
Posts: 469
Joined: Mon Apr 01, 2013 11:17 pm

Re: YCPU: an imaginary 16-bit processor.

Post by Joe »

pops wrote:I've also been thinking about an implementation of a memory paging scheme. This is what I'm currently thinking: the address space is divided into 4kb pages. Each active page can contain any page from the 64kw of memory, a blank page (all 0x0000), or a page from a ROM chip. Each page also has bits specifying if the page can be Executed, if the page can be written, and if the page is supervisor only. There will be some ability to page out and page in - perhaps to some slower storage device - but this will come at a performance penalty.
Why are you limiting the MMU to only 64kW of memory? You could have the MMU map those 4kB pages from a larger address space (perhaps 1MW/2MB, 20 bits), which would remove the need for separate ROM/RAM spaces, and give the system designer more flexibility in how the address space may be allocated for ROM, RAM, and memory-mapped peripherals.

How do you intend to implement virtual memory? For that to work, you need some way to signal the supervisor when the user tries to access a page that must be loaded from the slower storage device. You might want to define interrupt handling first; once you have your interrupts defined, you can reserve one of them for the MMU "page fault" exception. (You could even use separate interrupts for "page is valid but must be loaded from slow storage" and "page is not valid", but that's not really necessary.)

I should probably add the disclaimer that I'm approaching this from the viewpoint of an operating system developer, since that's the primary (if not single) use for separate user and supervisor modes.
zzo38 wrote:
  • If you divide by zero, what will be written into the result register?
If you divide by zero, will that trigger the "divide by zero" interrupt handler?
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: YCPU: an imaginary 16-bit processor.

Post by tepples »

zzo38 wrote:[Data execution prevention] seems to be used only to compensate for badly written programs, and that is not a good excuse.
I disagree with your claim that it "is not a good excuse". Most programmers lack the resources to formally prove that a sufficiently large program is not "badly written". Even I make mistakes.
Jarhmander
Formerly ~J-@D!~
Posts: 521
Joined: Sun Mar 12, 2006 12:36 am
Location: Rive nord de Montréal

Re: YCPU: an imaginary 16-bit processor.

Post by Jarhmander »

I second that, for it is also an essential part of making a system "secure". Nothing is perfect, but if you can help prevent buffer overflows from injecting malicious code into memory and running it, that's a huge plus. Also, while it could incur some severe overhead, I see some uses for this feature outside of security considerations, e.g. trapping on execution of a page in order to reload it or swap it with actual instructions for the YCPU. It could be used to cheaply emulate another system while having to "translate" into native machine code only the pages that are actually executed.
pops
Posts: 91
Joined: Sun Apr 04, 2010 4:28 pm

Re: YCPU: an imaginary 16-bit processor.

Post by pops »

I've updated the specification again on the first post of this thread.

zzo38, thanks for all your notes. I've implemented the following per your suggestions:
* The "Interrupt In Process" bit is now labeled Q.
* RTS is now an assembler macro for "POP PC".
* I've clarified how RTI clears the "Interrupt In Process" bit - it is now in the new 'Processor Status' register.
* The high three bits of TSR/TSW now select which register is read from/written to.
* CPU's response to divide by zero, as well as divide -32768 by -1, is specified. Result register in this case remains unchanged.
* LOD and STO are now specified in the ALU section.
* There is now an opcode that jumps and enters user mode: JMU.
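The two division edge cases can be sketched as below. Only "the result register remains unchanged" comes from the list above; how the condition is reported (flag or interrupt) is not stated here, so this hypothetical `div_signed16` simply returns a boolean alongside the register value.

```python
def to_signed16(x):
    """Interpret a 16-bit word as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def div_signed16(dest_value, divisor):
    """Return (new_dest_value, fault); on fault the destination is unchanged."""
    a, b = to_signed16(dest_value), to_signed16(divisor)
    if b == 0 or (a == -32768 and b == -1):
        return dest_value & 0xFFFF, True   # no representable 16-bit result
    q = abs(a) // abs(b)                   # truncate toward zero
    if (a < 0) != (b < 0):
        q = -q
    return q & 0xFFFF, False
```

The overflow case exists because +32768 does not fit in a signed 16-bit word.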

I have a question about the SWI suggestion:
It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
I thought that there might be cases where a program would need to determine which interrupt to call at run-time. Is this not the case, even in modern operating systems? (Not that this is what I'm targeting, mind you).

I'd like someone to look over to see if my flag specifications for bit shifting and testing make sense:
* Flag status after a shift operation is now specified.
* I've specified that bit testing should use the zero and carry flags.

Joe, I've added a preliminary memory paging specification. Under this spec, the processor has 16 pages of on-chip memory, but the MMU can page in memory from a device on the bus - which could be an additional memory chip with 256 Mw of memory (2^16 pages of 4 Kw). However, the processor itself - and the active process running on it - are limited to a flat $10000 address space. A kernel that is aware of additional memory or slow storage could provide interrupts to a process that would switch out pages on demand.

I don't know what would be the best way to determine if a process is accessing a page that doesn't belong to it, since all user mode processes will be sharing the same address space (imagine a switch from one user mode process to another: all the user mode pages loaded for the first user mode are still available to the second process). Do I need to add some other functionality to address this issue - or would it be enough to expect the kernel to save and restore the MMU status on process switch?

=========================[ 1.F. Memory Management ]=============================
(From 0.1e)
The processor has an integrated memory manager, which can switch 4kw 'pages'
into the address space. Each of the 16 x 4kw pages in address space is
described by 2 words:

WORD 0 (flags)
FEDC BA98 7654 3210
SWEM TT.. hhhh hhhh
    S - Supervisor only. User mode accesses to this page cause a page fault.
    W - Write protect, 1 = writing to this page causes a page fault.
    E - Execute protect, 1 = executing on this page causes a page fault.
    M - Page has been modified since load.
    T - Page type:
        00: Use processor memory page with index = (word 1 & 0x000F)
        01: Use blank page, reads/executes are 0x0000, writes fail silently.
        10: Use hardware page, device = h, page index = (word 1)
        11: Use processor ROM, page index = (word 1)

WORD 1 (index)
FEDC BA98 7654 3210
iiii iiii iiii iiii
    i - Index of device page mapped to this address page.
Again - as before - I really appreciate all the suggestions.
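As a cross-check of the bit layout in section 1.F, here is a small decoding sketch. The `PageDescriptor` type and the `decode`, `faults`, and `page_slot` helpers are hypothetical names; only the bit positions and fault conditions come from the listing.

```python
from dataclasses import dataclass

@dataclass
class PageDescriptor:
    supervisor_only: bool   # S, bit F
    write_protect: bool     # W, bit E
    execute_protect: bool   # E, bit D
    modified: bool          # M, bit C
    page_type: int          # TT, bits B-A
    device: int             # h, bits 7-0
    index: int              # word 1 (low 4 bits only for type 00)

def decode(word0, word1):
    page_type = (word0 >> 10) & 0x3
    index = word1 & 0x000F if page_type == 0 else word1 & 0xFFFF
    return PageDescriptor(
        supervisor_only=bool(word0 & 0x8000),
        write_protect=bool(word0 & 0x4000),
        execute_protect=bool(word0 & 0x2000),
        modified=bool(word0 & 0x1000),
        page_type=page_type,
        device=word0 & 0x00FF,
        index=index,
    )

def faults(desc, access, user_mode):
    """access is 'r', 'w', or 'x'; True if the access would page-fault."""
    if user_mode and desc.supervisor_only:
        return True
    if access == "w" and desc.write_protect:
        return True
    return access == "x" and desc.execute_protect

def page_slot(address):
    """Which of the 16 descriptors covers a 16-bit address (4 Kw pages)."""
    return (address >> 12) & 0xF
```

The sketch does not model the blank page type, whose writes fail silently rather than faulting.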
tepples
Posts: 22345
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: YCPU: an imaginary 16-bit processor.

Post by tepples »

pops wrote:
It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
I thought that there might be cases where a program would need to determine which interrupt to call at run-time.
The Game Boy Advance BIOS determines which syscall number was called by peeking at the return address.
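A sketch of how the same trick could look on YCPU if the interrupt number lived in the high eight bits of the SWI opcode. The `SWI_OPCODE` value, the one-word instruction length, and both helper names are assumptions for illustration only.

```python
SWI_OPCODE = 0x00B4   # hypothetical low-byte encoding of SWI

def encode_swi(number):
    """Build an SWI instruction word with the interrupt number in bits 15-8."""
    return ((number & 0xFF) << 8) | SWI_OPCODE

def syscall_number(mem, return_pc):
    """What a handler could do: read the word just before the return address."""
    instruction = mem[(return_pc - 1) & 0xFFFF]
    return instruction >> 8

mem = [0] * 0x10000
mem[0x2000] = encode_swi(0x2A)
number = syscall_number(mem, 0x2001)   # return address follows the SWI
```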
I don't know what would be the best way to determine if a process is accessing a page that doesn't belong to it, since all user mode processes will be sharing the same address space
Writing to memory management registers might need to be a privileged operation so that the kernel can virtualize the address space, translating between page numbers that the application sees and physical page numbers that the hardware sees.
Do I need to add some other functionality to address this issue - or would it be enough to expect the kernel to save and restore the MMU status on process switch?
It's the kernel's job to configure the MMU for each process. I'd also recommend having a seventeenth page that replaces one of the pages in supervisor mode, so that the user process can see the full 64 Kwords.

And what's this "processor memory page" and "hardware page"? Is it that there's a fast 64 Kword memory in the CPU package and a slower, larger memory accessed through 36-bit "hardware page" (8 bits h, 16 bits i, 12 bits address)?
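If that reading is right, the 36-bit address would be assembled roughly like this (`hardware_address` is a hypothetical name, and the field order is the one given in the parenthesis above):

```python
def hardware_address(h, i, offset):
    """Concatenate 8-bit device, 16-bit page index, and 12-bit page offset."""
    return ((h & 0xFF) << 28) | ((i & 0xFFFF) << 12) | (offset & 0xFFF)
```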
zzo38
Posts: 1080
Joined: Mon Feb 07, 2011 12:46 pm

Re: YCPU: an imaginary 16-bit processor.

Post by zzo38 »

I looked and I believe it is much better now.
pops wrote:I have a question about SWI suggestion:
It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
I thought that there might be cases where a program would need to determine which interrupt to call at run-time. Is this not the case, even in modern operating systems? (Not that this is what I'm targeting, mind you).
If it is needed, you can still make an interrupt handler that reads it and calls another one (I think DOS does something similar?), or else use self-modifying code.
I'd like someone to look over to see if my flag specifications for bit shifting and testing make sense:
* Flag status after a shift operation is now specified.
* I've specified that bit testing should use the zero and carry flags.
I notice you can tell it to shift out zero bits. What happens in these cases?
  • Does LSR still always clear the negative flag if no bits are shifted out?
  • If ROL is used shifting nothing, will the carry flag copy the low bit (low four bits of the result of (16-0) is 0)?
  • If ROR is used shifting nothing, will the carry flag copy the high bit (low four bits of the result of (0-1) is 15)?
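The two rotate questions can be tried out with a small model. It assumes a 4-bit rotate count and that the carry is taken from original bit (16 - n) & 15 for ROL and (n - 1) & 15 for ROR, which is what the parenthetical remarks above imply; the actual spec may define it differently.

```python
def rol16(value, n):
    """Rotate left by n (0-15); returns (result, carry)."""
    n &= 0xF
    value &= 0xFFFF
    result = ((value << n) | (value >> (16 - n))) & 0xFFFF
    carry = (value >> ((16 - n) & 0xF)) & 1   # n == 0 selects bit 0
    return result, carry

def ror16(value, n):
    """Rotate right by n (0-15); returns (result, carry)."""
    n &= 0xF
    value &= 0xFFFF
    result = ((value >> n) | (value << (16 - n))) & 0xFFFF
    carry = (value >> ((n - 1) & 0xF)) & 1    # n == 0 selects bit 15
    return result, carry
```

Under these assumptions a rotate by zero leaves the value alone but still changes the carry, so the spec would need to say whether a zero count skips the flag update.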