makes it quite clear: you only fixed the bug. Brass 1 will never provide that functionality.
So, good luck with Brass 2
Code:
.define p+ $70B8
(1) is a point I should have addressed early on, but didn't get around to it. I'd rather have a list of characters which are not allowed than a list of characters that are allowed. Characters which would not be allowed would include any whitespace (tab or space), any character used in an operator, any character that is seen as punctuation ( ) [ ] , \, directive marker (# or .) and constant prefixes ($ % @). This is quickly off the top of my head, so if you can think of any further items that are definite no-nos, please tell me!
CoBB wrote:
Thank you. I definitely agree with treating . and # equally as a directive marker. They don’t conflict with anything, so why not? Quick questions:
1. What characters are allowed in names? I presume that the same rules apply to labels, directives, module names and so on.
2. Is case insensitivity unicode aware or only applied to English letters?
3. How do you distinguish between expression commands and assembler commands? How do you determine their borders?
4. Will you allow * to denote the program counter? I hope not. I don’t think I’ve ever seen it used by anyone, fortunately.
The tricky part is thinking in unicode again. You should exclude every kind of punctuation and white space, only allowing underscore and apostrophe (which is needed for shadow registers with your tokeniser; by the way, doesn’t that cause trouble with single-quoted literals?).
benryves wrote:
Characters which would not be allowed would include any whitespace (tab or space), any character used in an operator, any character that is seen as punctuation ( ) [ ] , \, directive marker (# or .) and constant prefixes ($ % @). This is quickly off the top of my head, so if you can think of any further items that are definite no-nos, please tell me!
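To make the blacklist idea concrete, here is a rough Python sketch of a name check along the lines discussed above. It is only an illustration, not Brass code: the function name `is_valid_name` and the exact operator set are assumptions, and the Unicode category test is one possible reading of "exclude every kind of punctuation and white space".

```python
import unicodedata

# Hypothetical blacklist built from the characters named in the posts above.
OPERATOR_CHARS = set("+-*/%&|^~!<>=?:")   # assumed operator set, not confirmed
PUNCTUATION = set("()[],\\")
DIRECTIVE_MARKERS = set("#.")
CONSTANT_PREFIXES = set("$%@")
FORBIDDEN = OPERATOR_CHARS | PUNCTUATION | DIRECTIVE_MARKERS | CONSTANT_PREFIXES

# Underscore and apostrophe are the two exceptions suggested in the thread.
ALLOWED_EXCEPTIONS = set("_'")


def is_valid_name(name: str) -> bool:
    """Return True if no character of `name` is on the blacklist."""
    if not name:
        return False
    for ch in name:
        if ch in ALLOWED_EXCEPTIONS:
            continue
        if ch in FORBIDDEN:
            return False
        # Unicode general categories: Z* = separators (spaces),
        # C* = control characters (tab, newline), P* = punctuation.
        if unicodedata.category(ch)[0] in "ZCP":
            return False
    return True
```

Under these assumptions a name such as `p+` is rejected because of the operator character, while `hl'` and `my_label` pass. Whether a name may start with a digit is not covered here.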
This is a must. Also, keep in mind that even case conversions can be handled differently depending on locale (check dotted and dotless i in Turkic languages). You should exclude that too.
benryves wrote:
This means that the radix point is '.' regardless of machine locale.
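One way to keep case insensitivity independent of locale is to fold only the 26 English letters and leave everything else alone. The Python sketch below shows that approach; `fold_ascii` is a made-up helper name, and nothing here is taken from the Brass source.

```python
def fold_ascii(name: str) -> str:
    """Lower-case only A-Z; every other character is left untouched.

    Because no locale or Unicode case table is consulted, the Turkish
    dotted/dotless i pairs can never be merged with plain 'I' or 'i'.
    """
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in name
    )


def same_name(a: str, b: str) -> bool:
    """Compare two names case-insensitively for English letters only."""
    return fold_ascii(a) == fold_ascii(b)
```

With this folding, `Label` and `LABEL` compare equal, but a name containing dotless ı (U+0131) stays distinct from one written with a plain `I`.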
I’m not sure if using . for module notation is the best solution, since it is already used for directives, and it’s probably not a good idea to allow them inside name literals. On the other hand, it’s the most appealing choice when it comes to visuals. Am I right to think that you’ll canonicalise label names internally as soon as they are parsed?
benryves wrote:
With regard to labels, I also need to sort out modules. Chances are I'll use the . syntax (module.module.label) again, but I don't think I'll use TASM-style _ local labels. Reusable (+, -) labels too need looking into. I don't know if I'll drop the need to put { } around reusable labels - your thoughts on whether it makes the code too ugly or not would be valued!
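As an illustration of what canonicalising a label at parse time might look like, here is a small Python sketch. It assumes the module.module.label syntax mentioned above, and it assumes that a label is simply prefixed with the enclosing module path; how Brass actually resolves names is not stated in this thread, and `canonicalise` is a hypothetical helper.

```python
import string

# ASCII-only case folding table, so the result does not depend on locale.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def canonicalise(module_path, label):
    """Build one dotted, case-folded key from a module path and a label.

    module_path: list of enclosing module names, outermost first (assumed).
    label: the label as written, possibly already dotted (e.g. "b.c").
    """
    parts = list(module_path) + label.split(".")
    if any(part == "" for part in parts):
        raise ValueError(f"empty name component in {label!r}")
    return ".".join(part.translate(_ASCII_LOWER) for part in parts)
```

For example, `canonicalise(["Gfx", "Sprites"], "Draw")` gives `gfx.sprites.draw`, which could then serve as the single key in the symbol table.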