Spec for HLL targeting NES
Edit: Back to working on this, made some real progress, spec attached below. Some of the post below is outdated.
Before posting this, I went back and reread this thread (Among others) to see if there were any ideas I overlooked.
http://nesdev.com/bbs/viewtopic.php?t=7976
The op's ideas were pretty close to my own: The syntax is going to be very simple, almost BASIC-like, but it does adapt things from C:
-Pointers are a separate type and must be declared. Uses Indirect-Y.
-Structs are very much in play. An array of structs would use Direct Indexing.
-Local variables will be simulated, but there's no stack manipulation with Indirect.
The plan is to generate asm roughly close to what would be hand-written.
Some issues that might crop up (or, things from C that didn't make it):
-No 24-bit or 32-bit vars: These will be implemented when the compiler targets 16-bit cpus. What do you call a 24-bit variable when 'int' is reserved for 32-bits and 'long' for 64-bits? For now, the carry flag is available if more than 16-bits are needed.
-1D arrays only: Readily maps to processor addressing modes. On x86 and ARM, arrays of two or more dimensions would be trivial thanks to their multiply instructions, but Noism will not be targeting those platforms. You'll have to use a pointer table to simulate 2D arrays.
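A minimal C sketch of the pointer-table idea, since the spec's own syntax for it isn't shown here. The names (`map_cells`, `map_rows`, `map_init`, `map_get`, `map_set`) and the 4x3 size are made up for illustration. The table is filled by repeated addition, so no multiply is needed at any point:

```c
#include <stdint.h>

#define MAP_W 4   /* hypothetical width  */
#define MAP_H 3   /* hypothetical height */

/* One flat 1D array holds every cell, row after row. */
static uint8_t map_cells[MAP_W * MAP_H];

/* Table of row start addresses: the "pointer table". */
static uint8_t *map_rows[MAP_H];

/* Fill the table by stepping a pointer one row at a time (no multiply). */
void map_init(void)
{
    uint8_t *p = map_cells;
    for (uint8_t r = 0; r < MAP_H; r++) {
        map_rows[r] = p;
        p += MAP_W;
    }
}

/* A 2D access is one table lookup plus one indexed access. */
uint8_t map_get(uint8_t row, uint8_t col)
{
    return map_rows[row][col];
}

void map_set(uint8_t row, uint8_t col, uint8_t value)
{
    map_rows[row][col] = value;
}
```

On a 6502 the lookup would presumably become "load the row address into a zero-page pointer, then index with Y", which is why this maps well onto Indirect-Y.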
One change to the spec: I'm not crazy about retaining any feature of C pointer syntax, so
ptr pointer_var = &var
might become
ptr pointer_var = [var]
Also, forgot negation (~).
Reasons to use Noism:
-Scoping removes the need to juggle temporary variables
-Create a portable code base for NES and GB (Others to be added later)
-Syntax maps closely to a handful of asm instructions
Right now the compiler does syntax error checking. Once I get the compiler functional, the plan is to release it with at least one demo that will run on both NES and GB. Looking forward to feedback -- Hopefully someone else will use it.
Attachment: noism_spec.txt (17.52 KiB)
Last edited by strat on Sat Nov 30, 2013 8:23 pm, edited 2 times in total.
I have a note on the whole idea. An HLL is a High Level Language, i.e. very abstracted from the low level, the hardware. So there is a dilemma: either you target efficient resulting code, which brings some hardware limitations back up to the abstraction level (like 1D arrays and the lack of 32-bit math) and so complicates use of the language, or you aim to simplify programming a lot by hiding all these details, which leads to not very efficient code.
Another thing: in my personal opinion, a new language whose syntax is far from a popular one (BASIC, C, Java) is doomed to be used only by its author and maybe a few other people. People can't draw on their previous experience, they can't reuse what they learn elsewhere later, and it is difficult to get help with an unpopular, new thing.
This is no doubt an experiment. But if nothing else, it's pretty fun to work on a really simple compiler. I agree with the first point: efficient code is the aim here. One way an xD array might be simulated is to restrict the higher dimensions to a power of 2. Then again, that will result in a crazy amount of bit shifts. The compiler could also create and load the pointer table automatically; I don't like the idea of surprise code, so that might be enabled with a preprocessor option.
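To make the power-of-2 idea concrete, here is a rough C sketch. It assumes a row width of 8 (so the shift count is 3); `GRID_W_LOG2`, `grid_index`, `grid_get` and `grid_set` are hypothetical names, not anything from the spec:

```c
#include <stdint.h>

#define GRID_W_LOG2 3u                 /* assumed: width is 2^3 = 8 */
#define GRID_W      (1u << GRID_W_LOG2)
#define GRID_H      4u

static uint8_t grid[GRID_W * GRID_H];

/* row * 8 + col, written as shift-and-OR; valid only while col < GRID_W. */
static uint8_t grid_index(uint8_t row, uint8_t col)
{
    return (uint8_t)((row << GRID_W_LOG2) | col);
}

uint8_t grid_get(uint8_t row, uint8_t col)
{
    return grid[grid_index(row, col)];
}

void grid_set(uint8_t row, uint8_t col, uint8_t value)
{
    grid[grid_index(row, col)] = value;
}
```

Each access costs `GRID_W_LOG2` shifts, and on a CPU that shifts one bit per instruction that is where the "crazy amount of bit shifts" would come from.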
I may just go ahead and include 32-bit math. But 24-bit vars are also a necessity.
On the second point, I see where you're coming from, but I don't really agree. Most programming languages demand experienced programmers change their habits a bit. Lua had the gall to break the tradition of indexing an array with zero, and those Blizzard guys love it. This language will never be popular anyway unless it gets new people into deving on old systems.
Also, please ignore the embarrassing self-contradiction in section XIV. of the spec.
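For the 24-bit case mentioned above, this is a sketch in C of what byte-at-a-time addition with a carry looks like, which is roughly what a chain of add-with-carry instructions does on an 8-bit CPU. The `u24` struct and the helper names are invented for the example; C itself has no 24-bit integer type:

```c
#include <stdint.h>

/* Hypothetical 24-bit value stored as three little-endian bytes. */
typedef struct { uint8_t b[3]; } u24;

u24 u24_from(uint32_t v)
{
    u24 r = {{ (uint8_t)(v & 0xFFu),
               (uint8_t)((v >> 8) & 0xFFu),
               (uint8_t)((v >> 16) & 0xFFu) }};
    return r;
}

uint32_t u24_to(u24 v)
{
    return (uint32_t)v.b[0] | ((uint32_t)v.b[1] << 8) | ((uint32_t)v.b[2] << 16);
}

/* Add low byte first and pass the carry upward, one byte per step.
   The final carry (0 or 1) is written to *carry_out if it is not NULL. */
u24 u24_add(u24 x, u24 y, uint8_t *carry_out)
{
    u24 r;
    uint8_t carry = 0;
    for (int i = 0; i < 3; i++) {
        uint16_t sum = (uint16_t)(x.b[i] + y.b[i] + carry);
        r.b[i] = (uint8_t)sum;          /* keep the low 8 bits   */
        carry  = (uint8_t)(sum >> 8);   /* bit 8 is the carry    */
    }
    if (carry_out) {
        *carry_out = carry;
    }
    return r;
}
```

The same loop extends to 32 bits by going to four bytes, so supporting both widths in a compiler is mostly a matter of how many add steps get emitted.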
Nothing is really wrong with C-style syntax itself, but C itself has some really backwards features in it.
Lack of forward declarations is the single most annoying part of C and C++. I write a function, then need to copy-paste the first line somewhere else just so I can call it in code that happens to be before the function. That is absolutely ridiculous.
Also, C doesn't have a good way for a function to return multiple values. You can return a struct, but that mainly leads to the compiler throwing it on the stack instead of returning it in several registers.
What else is wrong with C and C++? Assignments in If expression. Infinite while loops because you accidentally put a semicolon before the open brace. The postincrement operator having undefined meaning when there is more than one use of that variable. The wrong order of operations makes bitwise arithmetic lower priority than expected (OR should be like addition, AND and bit shifts should be like multiplication), but they are all low priority instead. Leading zeroes magically make your numbers octal. Tons of annoying legacy crap.
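Two of the pitfalls listed above, sketched in C. To keep the snippet free of compiler warnings, the functions spell out how the compiler groups the expression and the comment gives the form a programmer would actually type. Function names are made up for the example:

```c
#include <stdbool.h>

/* Typed as:  x & 1 == 0
   C gives == higher precedence than &, so it is grouped like this,
   which is x & 0 and therefore always false. */
bool low_bit_clear_as_parsed(unsigned x)
{
    return x & (1 == 0);
}

/* What was meant: test the low bit, then compare. */
bool low_bit_clear(unsigned x)
{
    return (x & 1) == 0;
}

/* Typed as:  1 << 2 + 1
   + binds tighter than <<, so this is 1 << 3, not (1 << 2) + 1. */
int shift_as_parsed(void)
{
    return 1 << (2 + 1);
}

/* A leading zero makes the literal octal: 010 is eight, not ten. */
int octal_ten(void)
{
    return 010;
}
```

Many compilers will warn about the unparenthesized forms when warnings are enabled, but the code is still legal C.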
Last edited by Dwedit on Thu Aug 02, 2012 12:35 pm, edited 1 time in total.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Dwedit wrote: Lack of forward declarations is the single most annoying part of C and C++. I write a function, then need to copy-paste the first line somewhere else just so I can call it in code that happens to be before the function. That is absolutely ridiculous.
I agree with you on this one!
Quote: Also, C doesn't have a good way for a function to return multiple values. You can return a struct, but that mainly leads to the compiler throwing it on the stack instead of returning it in several registers.
To return two values, return a long and bit-pack both values in the result.
For more than two values, have a pointer in the argument list that points to where you'd like the function to write its results.
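Both approaches can be sketched in C with a divide that yields a quotient and a remainder. This is only an illustration: the function names are invented, and `uint32_t` stands in for `long` so the width is explicit:

```c
#include <stdint.h>

/* Approach 1: pack two 16-bit results into one 32-bit return value.
   Quotient goes in the high half, remainder in the low half. */
uint32_t divmod_packed(uint16_t n, uint16_t d)
{
    return ((uint32_t)(n / d) << 16) | (uint16_t)(n % d);
}

uint16_t packed_hi(uint32_t p) { return (uint16_t)(p >> 16); }
uint16_t packed_lo(uint32_t p) { return (uint16_t)(p & 0xFFFFu); }

/* Approach 2: the caller passes pointers and the function writes through them.
   This scales to any number of results. */
void divmod_out(uint16_t n, uint16_t d, uint16_t *quot, uint16_t *rem)
{
    *quot = (uint16_t)(n / d);
    *rem  = (uint16_t)(n % d);
}
```

Packing only works while the combined results fit in the return type; the pointer version has no such limit but costs an indirect store per result.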
Useless, lumbering half-wits don't scare us.
Yeah, things like that can be cumbersome in C. But if I created a new language like this, I would probably keep the syntax as close to C as I could. It will mean the language is easy to pick up for anyone who has done a little programming. Noism doesn't even use return values as far as I can see? Sort of like only allowing for void functions in a C like world.
I do like this project, btw. Since I think assembly is very complicated to do large logic stuff in, this will probably make things easier.
I have been using CC65 a lot, and while it does what I want most of the time, there are some things that really make me want to try something else.
If noism will make it simpler than CC65 to handle bank switching, it can be really useful. I don't have much experience working with bank switching, but in CC65 you basically have to keep your code small enough to fit in one 16K bank, and use the other one for pure data.
If noism also generates more efficient code than CC65, that is also a great thing.
Keep up the good work.
Quote: I have been using CC65 a lot, and while it does what I want most of the time, there are some things that really make me want to try something else.
Could you tell us more?
I had this idea of porting sdcc for the 6502 not long ago, but it will be a large project and I'm not sure I can handle it.
Re: Spec for HLL targeting NES
strat wrote: Some issues that might crop-up (Or, things from C that didn't make it):
-No 24-bit or 32-bit vars: These will be implemented when the compiler targets 16-bit cpus. What do you call a 24-bit variable when 'int' is reserved for 32-bits and 'long' for 64-bits?
Most C platforms I know of with 16-bit int have 32-bit long. You could use the <stdint.h> names int16_t, uint16_t, int32_t, and uint32_t for variables that stay 16-bit or 32-bit regardless of platform. C doesn't define a 24-bit integer type, but int24_t and uint24_t would be least astonishing to programmers, and they can typedef it to something more convenient (like the s32, u32, s16, and u16 commonly seen in GBA code).
Quote: Reasons to use Noism:
[...]
-Create a portable code base for NES and GB (Others to be added later)
Yay! No more DRY violations that are characteristic of ports to some platforms. And if you provide a standard C back end, the result becomes more easily portable to Windows, Mac OS X, desktop Linux, iOS, and Android.
Dwedit wrote: What else is wrong with C and C++? Assignments in If expression. Infinite while loops because you accidentally put a semicolon before the open brace.
I believe several compilers have warnings against that. For example, one would get two warnings for code like this:
Code:
while (pointer = getNext()) ;
Code:
while ((pointer = getNext()) != NULL) { }
Quote: The postincrement operator having undefined meaning when there is more than one use of that variable.
Then why not put your increments on another line?
Bregalad wrote: To return two values, return a long and bit pack both values in the result.
Not if the values you want to return won't fit in a long. For example, a pointer typically takes up a whole long (sizeof(intptr_t) >= sizeof(long)). Passing a pointer to a struct is far more common in code that I've read, even for two results.
Nioreh wrote: If noism will make it simpler than CC65 to handle bank switching, it can be really useful. I don't have much experience working with bank switching, but in CC65 you basically have to keep your code small enough to fit in one 16K bank, and use the other one for pure data.
It should be fine to have code in switchable banks in CC65, just make sure the library routines (like stack manipulation) are in the fixed bank. This can be achieved by naming the fixed bank/segment "CODE". Of course you have to manually make sure the correct functions are mapped in the non-fixed bank whenever calling them.
Re: Spec for HLL targeting NES
Quote: Most C platforms I know of with 16-bit int have 32-bit long. You could use the <stdint.h> names int16_t, uint16_t, int32_t, and uint32_t for variables that stay 16-bit or 32-bit regardless of platform.
Thinking it over real quick, maybe it's best to adopt the GBA syntax as default: s8, ... s64. The 's' is supposed to stand for 'signed' but it might as well be 'storage', since cpus afaik don't distinguish between signed and unsigned, only the 'printf' function.
Quote: Yay! No more DRY violations that are characteristic of ports to some platforms. And if you provide a standard C back end, the result becomes more easily portable to Windows, Mac OS X, desktop Linux, iOS, and Android.
Hmmm... having read your XNA article, my best interpretation of this idea is that Noism compiles into C code. That's not really a bad idea. Then you'd have one code base that will create a real NES game and a retro-style game for highend systems. Too bad we didn't have this discussion while Megaman 9 was being developed.
Re: Spec for HLL targeting NES
strat wrote: Thinking it over real quick, maybe it's best to adopt the GBA [integer type names] as default: s8, ... s64. The 's' is supposed to stand for 'signed' but it might as well be 'storage', since cpus afaik don't distinguish between signed and unsigned, only the 'printf' function.
An 8*8=16 bit multiply, or 16*16=32, or 32*32=64 sure does. So do the operators /, <, and >.
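A small C sketch of the difference, using GBA-style typedefs on top of <stdint.h>. The function names are invented, and the `(s8)` casts assume the usual two's-complement behaviour, where the byte 0xFF reads back as -1. Same-width addition gives the same bits either way, while widening multiply, divide and comparison do not:

```c
#include <stdint.h>

typedef int8_t   s8;
typedef uint8_t  u8;
typedef int16_t  s16;
typedef uint16_t u16;

/* Same-width add: the resulting byte is identical whether the inputs
   are thought of as signed or unsigned. */
u8 add_bits(u8 a, u8 b) { return (u8)(a + b); }

/* Widening 8*8=16 multiply: the interpretation changes the result. */
s16 mul_signed(u8 a_bits, u8 b_bits)   { return (s16)((s8)a_bits * (s8)b_bits); }
u16 mul_unsigned(u8 a_bits, u8 b_bits) { return (u16)(a_bits * b_bits); }

/* Division and comparison also depend on the interpretation. */
s8 div_signed(u8 a_bits, u8 b_bits)    { return (s8)((s8)a_bits / (s8)b_bits); }
u8 div_unsigned(u8 a_bits, u8 b_bits)  { return (u8)(a_bits / b_bits); }
int less_signed(u8 a_bits, u8 b_bits)   { return (s8)a_bits < (s8)b_bits; }
int less_unsigned(u8 a_bits, u8 b_bits) { return a_bits < b_bits; }
```

With the bytes 0xFF and 0x02, for instance, the signed product is -2 while the unsigned product is 510, so a compiler has to know which one the programmer meant.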
Re: Spec for HLL targeting NES
I'm glad to see my previous work here is still being read
I'll be interested to see what you come up with! I did get mine producing assembly code, but did not pursue it much beyond that.
What I discovered in my toying around (and never really reported back on) was that the HLL I had designed simply could not produce machine code that was as efficient as (or even close to) the code I would write by hand. This was due to the fact that the HLL and the machine were engineered for different patterns.
So, if your goal is to produce machine code that is as efficient or very close to the assembly you would write by hand, you need to identify the patterns you are using while writing assembly and then base the requirements of the HLL on those patterns.
If you want a common code base for multiple platforms then your best bet is to use a small set of basic patterns to base your HLL on, then translate those into machine instructions for the target platform that may not necessarily be very efficient.
I think trying to achieve both is not terribly productive on these early microprocessor architectures. These things (the 65xx and Z80 series MC's) were specifically engineered to be programmed in their machine language. Other architectures (like the Intel 80 series and later Motorola 68K series) were designed with HLL's in mind, and efficiently implement some of these HLL patterns in hardware.
Re:
Dwedit wrote: Lack of forward declarations is the single most annoying part of C and C++. I write a function, then need to copy-paste the first line somewhere else just so I can call it in code that happens to be before the function. That is absolutely ridiculous.
Bregalad wrote: I agree with you on this one!
Me too! I hate that crap, it's so unneeded. I also hate how there's no real way to include data for the binary to use at runtime, you have to load it from a file and stick it in an array or something. I hate how modern languages work in general honestly. C isn't too bad, but still, could be much better.