tepples wrote: nor what registers a subroutine can expect to preserve
Joe wrote: That kind of information belongs in the specification for the ABI
True. Perhaps I should have said "You'll need to define what JSR, RTS, SWI, and RTI do in order to give ABI designers something to work with."
YCPU: an imaginary 16-bit processor.
Re: Addressing modes for a fake cpu - what would you include
Quote: When thinking about the stack, I realized that in order to have the stack grow downwards, I would have to implement two more addressing modes - the opposites of the Post-increment and Pre-decrement indirects. Do I have this correct?
Not really. There are four kinds of stacks. Using ARM's notation:
- Full decreasing
- Empty decreasing
- Full increasing
- Empty increasing
The full/empty distinction determines whether the stack pointer points at the address where the next element will be pushed (empty) or at the address where the last element was pushed (full).
The increasing/decreasing distinction determines the direction in which the stack grows in memory.
There is no major difference between them for the user, and no technical reason to prefer one over another, except for the limitations of the addressing modes. For instance, the 6502 and its family are hardwired to empty decreasing, while the ARM in Thumb mode uses full decreasing (please correct me if I'm wrong).
So I really think you should use the stack that makes the most sense for your addressing modes; there's no need to add more modes just to support a different kind of stack.
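To make the four combinations concrete, here is a minimal sketch in Python. The class name `Stack` and the `full` / `descending` parameters are illustrative only and do not come from the YCPU spec; memory is modelled as a plain list of words.

```python
# Sketch only: `Stack`, `full`, and `descending` are made-up names.
class Stack:
    def __init__(self, memory, sp, full, descending):
        self.mem = memory
        self.sp = sp
        self.full = full
        self.step = -1 if descending else 1

    def push(self, value):
        if self.full:
            # Full: SP points at the last pushed element, so move first.
            self.sp += self.step
            self.mem[self.sp] = value
        else:
            # Empty: SP points at the next free slot, so store first.
            self.mem[self.sp] = value
            self.sp += self.step

    def pop(self):
        if self.full:
            value = self.mem[self.sp]
            self.sp -= self.step
        else:
            self.sp -= self.step
            value = self.mem[self.sp]
        return value


fd_mem = [0] * 16
fd = Stack(fd_mem, sp=8, full=True, descending=True)
fd.push(0xAAAA)   # pre-decrement store: SP becomes 7, then fd_mem[7] is written

ed_mem = [0] * 16
ed = Stack(ed_mem, sp=8, full=False, descending=True)
ed.push(0xBBBB)   # post-decrement store: ed_mem[8] is written, then SP becomes 7
```

In this model a full descending stack needs only a pre-decrement store (push) and a post-increment load (pop), and an empty ascending stack needs a post-increment store and a pre-decrement load. So a CPU that already has post-increment and pre-decrement modes can support either of those two without new modes.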
- Jarhmander
- Formerly ~J-@D!~
- Posts: 521
- Joined: Sun Mar 12, 2006 12:36 am
- Location: Rive nord de Montréal
Re: Addressing modes for a fake cpu - what would you include
In addition...
- If the offset (in your indirect offset and indirect indexed modes) can only be positive, a downward-growing stack is much more useful than an upward-growing one, because you can then access a stacked element with that offset.
- The decision between full and empty descending is a bit of a personal choice, but a full descending stack gives a useful meaning to LOD R0, [SP, 0]: it accesses the most recently stacked element (with an empty descending stack, the same access is invalid or otherwise meaningless). The only cost, as you mentioned, is that you initialize SP to an "invalid" location, e.g. if your stack uses words from $1FF downward, then you initialize SP with $200.
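A small sketch of the second point, assuming word-addressed memory and an unsigned offset. The helper names `push` and `load_sp_offset` are made up for illustration; `load_sp_offset` stands in for LOD Rn, [SP, offset].

```python
mem = [0] * 0x201   # hypothetical word-addressed memory, just large enough
sp = 0x200          # one word past the top of the stack region ($1FF downward)

def push(value):
    """Full descending push: pre-decrement SP, then store."""
    global sp
    sp -= 1
    mem[sp] = value & 0xFFFF

def load_sp_offset(offset):
    """Models LOD Rn, [SP, offset] with a non-negative offset."""
    return mem[sp + offset]

push(0x1111)
push(0x2222)
push(0x3333)
top = load_sp_offset(0)      # most recently pushed word
oldest = load_sp_offset(2)   # first word pushed, reachable with a positive offset
```

Here the first push lands at $1FF even though SP started at $200, and every stacked word sits at a non-negative offset from SP.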
((λ (x) (x x)) (λ (x) (x x)))
Re: Addressing modes for a fake cpu - what would you include
I've updated the specification based on your suggestions, and placed it on the first post of this thread.
tepples wrote: ZCPU might be confused with IBM or Infocom ISAs.
Thanks for pointing that out. I've decremented the first letter of the spec; it's now YCPU.
tepples wrote: You'll need to define what JSR, RTS, SWI, and RTI do in order to give ABI designers something to work with.
JSR/RTS save and restore only the PC. There is no guarantee that registers will remain unchanged; thus, saving and restoring them must be done manually.
SWI and RTI save and restore only the PC and FL. Save/restore of registers is handled in some cases by the calling thread, and in other cases by the interrupt routine. For interrupts which are called by a thread (an example would be a system call), the programmer should be aware of the operation of the interrupt and thus know which registers they want to save before calling the interrupt. For interrupts which occur without a thread calling them (hardware interrupts, for example), the interrupt routine must save and restore any registers that it modifies.
Question about system calls - how should I expose the hardware to the CPU? Should I allow any program to query hardware and send messages to hardware, or should that function be restricted to the Supervisor thread only, which can expose hardware functionality via a SWI?
tepples wrote: Signed and unsigned multiplication are the same operation for 16x16=16. They differ only for 16x16=32.
I'm sorry, I should have clarified: MUL/MLI are always 16x16=32b, with the high 16b of the result stored in R0. If you specify R0 as the source/dest register, the following will result:
1. Operation completes.
2. High 16b are written to R0.
3. Low 16b are written to Rx - in this case, R0. R0 contains the low 16b, and the high 16b are wiped out.
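The write order above can be sketched as follows. This models only the unsigned case, and the register file size (8) and the function name `mul` are assumptions made for the example, not part of the spec.

```python
def mul(regs, x, operand):
    """Unsigned 16x16=32 multiply of regs[x] by operand (sketch)."""
    product = (regs[x] & 0xFFFF) * (operand & 0xFFFF)   # step 1
    regs[0] = (product >> 16) & 0xFFFF                  # step 2: high word to R0
    regs[x] = product & 0xFFFF                          # step 3: low word to Rx
    return regs

# Normal case: Rx is R1, so both halves survive.
regs = [0] * 8                 # assumed: 8 general-purpose registers
regs[1] = 0x1234
mul(regs, 1, 0x5678)
full_product = (regs[0] << 16) | regs[1]

# R0 as source/dest: the low word overwrites the high word.
regs_r0 = [0] * 8
regs_r0[0] = 0xFFFF
mul(regs_r0, 0, 0xFFFF)        # 0xFFFF * 0xFFFF = 0xFFFE0001
```

Because step 3 runs after step 2, the R0 case ends with only the low 16 bits of the product in R0.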
tepples wrote: What are 5 % -3, -5 % 3, and -5 % -3 in this ISA?
I've decided that the result has the same sign as the dividend (as with C++2011, C#, and Java).
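For reference, this is what "same sign as the dividend" gives for the three cases in the question. The sketch is written in Python, whose own `%` operator follows the sign of the divisor instead, so the truncating behaviour is spelled out by hand; the function name `trunc_divmod` is made up for the example.

```python
def trunc_divmod(dividend, divisor):
    """Division that truncates toward zero; remainder takes the dividend's sign."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient, remainder

results = {
    "5 % -3": trunc_divmod(5, -3)[1],
    "-5 % 3": trunc_divmod(-5, 3)[1],
    "-5 % -3": trunc_divmod(-5, -3)[1],
}
# results == {"5 % -3": 2, "-5 % 3": -2, "-5 % -3": -2}
```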
Finally, thanks to everyone who posted descriptions of the different ways in which a stack can work. I've decided to go with a full decreasing stack. I'm thinking about adding bounds-checking functionality to the processor via additional special registers: for example, a pair of stack bounds registers could help the processor determine whether the stack has overflowed or underflowed. Is there a better way to handle this functionality?
Re: Addressing modes for a fake cpu - what would you include
Can't you use POP PC in place of RTS (and POP PC,FL in place of RTI)? You seem to have defined these instructions already, so why do you need the RTS and RTI instructions?
[url=gopher://zzo38computer.org/].[/url]
-
psycopathicteen
- Posts: 3001
- Joined: Wed May 19, 2010 6:12 pm
Re: Addressing modes for a fake cpu - what would you include
There's one thing that would have made a true 16-bit CPU difficult to use in a system: being limited to 128 kilobytes of memory. Maybe you could give every register its own bank register, and have an instruction that swaps the data between the main register and its bank register.
Re: YCPU: an imaginary 16-bit processor
I do intend to keep the flat address space of 64 kilowords. When I know enough to implement it, I'll also implement an MMU spec that allows for bank switching 4kb segments, and that knows enough about thread context to make sure that non-privileged threads can't access memory they don't own.
Re: Addressing modes for a fake cpu - what would you include
I've updated the specification again, and updated the first post of this thread.
I'm unsure that this is the best way to exit User mode, and I'm not sure how I can enter User mode. My current idea is to have an opcode that jumps to a subroutine and turns off Supervisor mode at the same time. Thoughts?
zzo38 wrote: Can't you use POP PC in place of RTS (and POP PC,FL in place of RTI)? You seem to have defined these instructions, and then why do you need the RTS and RTI instructions?
Your comment made me realize that I had neglected some interrupt functionality - I've now given the processor a 'supervisor/interrupt' mode where the processor has 100% access to all features, in contrast to a 'user' mode, which has a different stack pointer and cannot directly access all memory, etc. However, your comment still holds true for RTS.
Code:
==========================[ Interrupt Instructions ]============================
RTI Return from Interrupt
Returns from an interrupt.
Processor is currently in Supervisor mode.
1. N bit in Flags register is cleared.
2. FL is popped from the Supervisor Stack.
3A. If Supervisor Mode Bit is set, we are still in Supervisor mode:
M is popped from the stack [using SSP]. PC is set to M.
3B. If Supervisor Mode Bit is not set, we exit Supervisor mode:
M is popped from the stack [using USP]. PC is set to M.
4. Execution continues.
SWI Call Software Interrupt
Calls interrupt with index = R0.
1. PC is pushed to the current active stack.
2. FL is pushed to Supervisor Stack.
3. Supervisor Bit in FL is set.
4. N bit in Flags register is set.
5. PC is set to Mem[IA + (R0 & $00FF)].
6. Execution continues.
I've also been thinking about an implementation of a memory paging scheme. This is what I'm currently thinking: the address space is divided into 4kb pages. Each active page can contain any page from the 64kw of memory, a blank page (all 0x0000), or a page from a ROM chip. Each page also has bits specifying whether the page can be executed, whether it can be written, and whether it is supervisor only. There will be some ability to page out and page in, perhaps to some slower storage device, but this will come at a performance penalty.
Aside from my remaining Supervisor mode and MMU questions, are there any other obvious features that my processor is missing?
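As a way to sanity-check the SWI and RTI step lists earlier in this post, here is a rough Python model of them. Several things are assumptions made only for the sketch: the bit positions `S_BIT` and `N_BIT`, the initial values of the stack pointers and of IA, the use of full descending stacks, and the idea that PC already holds the return address when SWI executes.

```python
S_BIT = 0x8000   # hypothetical position of the Supervisor bit in FL
N_BIT = 0x4000   # hypothetical position of the interrupt-in-process bit


class Cpu:
    def __init__(self):
        self.mem = [0] * 0x10000
        self.r0 = 0
        self.pc = 0
        self.fl = 0
        self.usp = 0x8000   # assumed initial user stack pointer
        self.ssp = 0xF000   # assumed initial supervisor stack pointer
        self.ia = 0x0100    # assumed base of the interrupt address table

    def _push(self, which, value):
        sp = getattr(self, which) - 1        # full descending: pre-decrement
        setattr(self, which, sp)
        self.mem[sp] = value & 0xFFFF

    def _pop(self, which):
        sp = getattr(self, which)
        setattr(self, which, sp + 1)         # post-increment
        return self.mem[sp]

    def _active(self):
        return "ssp" if self.fl & S_BIT else "usp"

    def swi(self):
        self._push(self._active(), self.pc)                 # 1. PC to active stack
        self._push("ssp", self.fl)                          # 2. FL to supervisor stack
        self.fl |= S_BIT                                    # 3. enter Supervisor mode
        self.fl |= N_BIT                                    # 4. mark interrupt in process
        self.pc = self.mem[self.ia + (self.r0 & 0x00FF)]    # 5. vector through table

    def rti(self):
        self.fl &= ~N_BIT                                   # 1. (replaced by step 2 here)
        self.fl = self._pop("ssp")                          # 2. restore FL
        self.pc = self._pop(self._active())                 # 3A/3B. PC from SSP or USP


cpu = Cpu()
cpu.pc, cpu.fl, cpu.r0 = 0x1234, 0x0001, 5   # user mode, about to call interrupt 5
cpu.mem[cpu.ia + 5] = 0x2000                 # handler address for interrupt 5
cpu.swi()
in_handler = (cpu.pc, cpu.fl)
cpu.rti()
```

In this model a SWI from user mode followed by RTI returns PC, FL, and both stack pointers to their starting values, and a SWI issued from Supervisor mode also round-trips because both words then go through the supervisor stack in last-in, first-out order.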
Re: YCPU: an imaginary 16-bit processor.
- The new "Interrupt In Process" flag is still labeled as "N" which the negative flag also is. It should be labeled with a different letter.
- It says that RTS is a synonym for "POP PC", although it still has its own opcode number. It could be simplified by making it not have its own opcode number and instead just be assembler macro for "POP PC".
- It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
- It says RTI clears the interrupt in process flag before popping the flag. This doesn't seem to make much sense; it will be overridden anyways. It seems to me that "POP FL,PC" would do the same thing if it is currently in supervisor mode (and that in user mode, it would just remain in user mode since supervisor-only flags are ignored).
- You could use the high three bits of the TSR and TSW opcodes to select which register to apply to, rather than always R0.
- If you divide by zero, what will be written into the result register?
- LOD and STO are not mentioned in ALU section.
- A signed divide of -32768 by -1 would not have a representable result; perhaps it should set the carry flag in this case (and leave alone the register value)?
- Flag affected by shifting is not mentioned either. (Does it affect carry like in 6502?)
- Is this rotation through a register or through the carry flag? (You could have both; an unused SHF sub-opcode exists which could be defined for this purpose.)
- When you test a bit, what flags are affected to show you the result? (Maybe it should store the result in the carry flag?)
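To illustrate why the two division cases in the list above need a defined behaviour, here is a small sketch. The function name `signed_div16` and the choice to report failure with a flag are illustrative only; what the CPU actually does in these cases is for the spec to decide.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def signed_div16(dividend, divisor):
    """Return (quotient, ok); ok is False when no 16-bit signed result exists."""
    if divisor == 0:
        return None, False
    quotient = abs(dividend) // abs(divisor)   # truncate toward zero
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    if not INT16_MIN <= quotient <= INT16_MAX:
        return None, False                     # e.g. -32768 / -1 would be +32768
    return quotient, True

overflow_case = signed_div16(-32768, -1)
zero_case = signed_div16(1, 0)
ordinary_case = signed_div16(7, -2)
```

The mathematical result of -32768 / -1 is +32768, which is one more than the largest signed 16-bit value, so it is the only quotient of two 16-bit signed operands that cannot be stored.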
pops wrote: I'm not sure how I can enter user mode...
You could have an opcode that jumps and enters user mode, but another way would be to push stuff into the stack using other instructions and then use RTI (or POP FL,PC) to enter user mode.
pops wrote: Each page also has bits specifying if the page can be Executed, if the page can be written, and if the page is supervisor only.
I do not think an execution bit is really necessary; x86 has it, but I do not think it is useful there either. It seems to be used only to compensate for badly written programs, and that is not a good excuse.
Re: YCPU: an imaginary 16-bit processor.
pops wrote: I've also been thinking about an implementation of a memory paging scheme. This is what I'm currently thinking: the address space is divided into 4kb pages. Each active page can contain any page from the 64kw of memory, a blank page (all 0x0000), or a page from a ROM chip. Each page also has bits specifying if the page can be Executed, if the page can be written, and if the page is supervisor only. There will some ability to page out and page in - perhaps to some slower storage device - but this will come at a performance penalty.
Why are you limiting the MMU to only 64kW of memory? You could have the MMU map those 4kB pages from a larger address space (perhaps 1MW/2MB, 20 bits), which would remove the need for separate ROM/RAM spaces, and give the system designer more flexibility in how the address space may be allocated for ROM, RAM, and memory-mapped peripherals.
How do you intend to implement virtual memory? For that to work, you need some way to signal the supervisor when the user tries to access a page that must be loaded from the slower storage device. You might want to define interrupt handling first; once you have your interrupts defined, you can reserve one of them for the MMU "page fault" exception. (You could even use separate interrupts for "page is valid but must be loaded from slow storage" and "page is not valid", but that's not really necessary.)
I should probably add the disclaimer that I'm approaching this from the viewpoint of an operating system developer, since that's the primary (if not single) use for separate user and supervisor modes.
zzo38 wrote: - If you divide by zero, what will be written into the result register?
If you divide by zero, will that trigger the "divide by zero" interrupt handler?
Re: YCPU: an imaginary 16-bit processor.
zzo38 wrote: [Data execution prevention] seems to be used only to compensate for badly written programs, and that is not a good excuse.
I disagree with your claim that it "is not a good excuse". Most programmers lack the resources to formally prove that a sufficiently large program is not "badly written". Even I make mistakes.
Re: YCPU: an imaginary 16-bit processor.
I second that, for it is also an essential part of making a system "secure". Nothing is perfect, but if you can help prevent buffer overflows from injecting malicious code into memory and running it, that's a huge plus. Also, while it could incur some severe overhead, I see some uses of this feature outside of security considerations, e.g. trapping on execution of a page in order to reload it or swap it with actual instructions for the YCPU. It could be used to cheaply emulate another system while having to "translate" into native machine code only the pages that are actually executed.
((λ (x) (x x)) (λ (x) (x x)))
Re: YCPU: an imaginary 16-bit processor.
I've updated the specification again on the first post of this thread.
zzo38, thanks for all your notes. I've implemented the following per your suggestions:
* The "Interrupt In Process" bit is now labeled Q.
* RTS is now an assembler macro for "POP PC".
* I've clarified how RTI clears the "Interrupt In Process" bit - it is now in the new 'Processor Status' register.
* The high three bits of TSR/TSW now select which register is read from/written to.
* The CPU's response to divide by zero, as well as to dividing -32768 by -1, is specified. The result register remains unchanged in both cases.
* LOD and STO are now specified in the ALU section.
* There is now an opcode that jumps and enters user mode: JMU.
I have a question about your SWI suggestion:
Quote: It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
I thought that there might be cases where a program would need to determine which interrupt to call at run-time. Is this not the case, even in modern operating systems? (Not that this is what I'm targeting, mind you).
I'd like someone to look over my flag specifications for bit shifting and testing to see if they make sense:
* Flag status after a shift operation is now specified.
* I've specified that bit testing should use the zero and carry flags.
Joe, I've added a preliminary memory paging specification. Under this spec, the processor has 16 pages of on-chip memory, but the MMU can page in memory from a device on the bus - which could be an additional memory chip with 256mw of memory (2^16 4kw pages). However, the processor itself - and the active process running on it - are limited to a flat $10000 address space. A kernel that is aware of additional memory or slow storage could provide interrupts to a process that would switch out pages on demand.
I don't know what would be the best way to determine if a process is accessing a page that doesn't belong to it, since all user mode processes will be sharing the same address space (imagine a switch from one user mode process to another: all the user mode pages loaded for the first user mode process are still available to the second process). Do I need to add some other functionality to address this issue - or would it be enough to expect the kernel to save and restore the MMU status on process switch?
Again - as before - I really appreciate all the suggestions.
Code:
=========================[ 1.F. Memory Management ]=============================
(From 0.1e)
The processor has an integrated memory manager, which can switch 4kw 'pages'
into the address space. Each of the 16 x 4kw pages in address space are
described by 2 words:
WORD 0 (flags)
FEDC BA98 7654 3210
SWEM TT.. hhhh hhhh
S - Supervisor only. User mode accesses to this page cause a page fault.
W - Write protect, 1 = writing to this page causes a page fault.
E - Execute protect, 1 = executing on this page causes a page fault.
M - Page has been modified since load.
T - Page type:
00: Use processor memory page with index = (word 1 & 0x000F)
01: Use blank page, reads/executes are 0x0000, writes fail silently.
10: Use hardware page, device = h, page index = (word 1)
11: Use processor ROM, page index = (word 1)
WORD 1 (index)
FEDC BA98 7654 3210
iiii iiii iiii iiii
i - Index of device page mapped to this address page.
Re: YCPU: an imaginary 16-bit processor.
Quote: It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
pops wrote: I thought that there might be cases where a program would need to determine which interrupt to call at run-time.
The Game Boy Advance BIOS determines which syscall number was called by peeking at the return address.
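A rough sketch of that "peek at the return address" technique, transplanted onto a word-addressed 16-bit machine. Everything about the encoding here is hypothetical: the low-byte opcode value `SWI_OPCODE`, the one-word instruction length, and the helper names are made up for illustration and are not taken from the YCPU spec or from the GBA.

```python
SWI_OPCODE = 0x00B5   # hypothetical low-byte opcode value for SWI

def encode_swi(number):
    """Pack the interrupt number into the high eight bits of the opcode word."""
    return ((number & 0xFF) << 8) | SWI_OPCODE

def syscall_number(mem, return_address):
    """Recover the interrupt number inside the handler.

    The saved return address points at the word after the SWI instruction,
    so the instruction itself is one word earlier (assuming a one-word SWI).
    """
    instruction = mem[(return_address - 1) & 0xFFFF]
    return instruction >> 8

mem = [0] * 0x10000
mem[0x0400] = encode_swi(0x2A)            # program executes SWI $2A at $0400
number = syscall_number(mem, 0x0401)      # handler sees return address $0401
```

With this kind of encoding the interrupt number costs no register, at the price of being fixed when the program is assembled.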
Quote: I don't know what would be the best way to determine if a process is accessing a page that doesn't belong to it, since all user mode processes will be sharing the same address space
Writing to memory management registers might need to be a privileged operation so that the kernel can virtualize the address space, translating between page numbers that the application sees and physical page numbers that the hardware sees.
Quote: Do I need to add some other functionality to address this issue - or would it be enough to expect the kernel to save and restore the MMU status on process switch?
It's the kernel's job to configure the MMU for each process. I'd also recommend having a seventeenth page that replaces one of the pages in supervisor mode, so that the user process can see the full 64 Kwords.
And what's this "processor memory page" and "hardware page"? Is it that there's a fast 64 Kword memory in the CPU package and a slower, larger memory accessed through 36-bit "hardware page" (8 bits h, 16 bits i, 12 bits address)?
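For anyone following along, here is a sketch of how the two descriptor words in the quoted spec could be decoded. The field positions come from the bit diagram in the spec; the names `PageDescriptor`, `decode`, and `hardware_address` are made up, and the 36-bit address layout follows the reading proposed in the question above, which has not been confirmed.

```python
from typing import NamedTuple


class PageDescriptor(NamedTuple):
    supervisor_only: bool   # S, bit F
    write_protect: bool     # W, bit E
    execute_protect: bool   # E, bit D
    modified: bool          # M, bit C
    page_type: int          # TT, bits B-A
    device: int             # h, bits 7-0
    index: int              # i, all of word 1


def decode(word0, word1):
    return PageDescriptor(
        supervisor_only=bool(word0 & 0x8000),
        write_protect=bool(word0 & 0x4000),
        execute_protect=bool(word0 & 0x2000),
        modified=bool(word0 & 0x1000),
        page_type=(word0 >> 10) & 0x3,
        device=word0 & 0x00FF,
        index=word1 & 0xFFFF,
    )


def hardware_address(desc, address):
    """Assumed layout: 8 bits device, 16 bits page index, 12 bits offset."""
    return (desc.device << 28) | (desc.index << 12) | (address & 0x0FFF)


desc = decode(0xC803, 0x1234)   # S=1, W=1, type 10 (hardware page), device 3
full = hardware_address(desc, 0xABCD)
```

Only the low 12 bits of the CPU address survive as the offset within the page; the top 4 bits select which of the 16 descriptors is used.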
Re: YCPU: an imaginary 16-bit processor.
I looked and I believe it is much better now.
Quote: It seems to me like SWI using the high eight bits of the opcode as the interrupt number, would be better than using R0 for this purpose.
pops wrote: I have a question about SWI suggestion: I thought that there might be cases where a program would need to determine which interrupt to call at run-time. Is this not the case, even in modern operating systems? (Not that this is what I'm targeting, mind you).
If that is needed, you can still write an interrupt handler that reads the number and calls another handler (I think DOS does something similar?), or else use self-modifying code.
Quote: I'd like someone to look over to see if my flag specifications for bit shifting and testing make sense:
* Flag status after a shift operation is now specified.
* I've specified that bit testing should use the zero and carry flags.
I notice you can tell it to shift out zero bits. What happens in these cases?
- Does LSR still always clear the negative flag if no bits are shifted out?
- If ROL is used shifting nothing, will the carry flag copy the low bit (low four bits of the result of (16-0) is 0)?
- If ROR is used shifting nothing, will the carry flag copy the high bit (low four bits of the result of (0-1) is 15)?