Brass 3 (Under Development)

benryves
Maxcoderz Staff
Posts: 3074
Joined: Thu 16 Dec, 2004 10:06 pm
Location: Croydon, England

Post by benryves »

OK, so Brass 2 was a bit of a damp squib. No matter.

Brass 3 takes some of Brass 2's ideas (i.e. a user-extendable, plugin-driven compiler) yet simplifies them quite a lot. The Brass 2 parser was pretty sophisticated, which meant that whilst it was based on very clean code, trying to shoe-horn in additional functionality like a macro preprocessor (which modified the source) or various odd bits of syntax (like TASM's '.equ') was a nightmare and just didn't work.

The new parser is still light-years ahead of the original one in Brass (including much better operator support and functions), though!

There's a vast amount of stuff that needs to be implemented (modules, reusable labels, lots of missing directives) but at least the concept has stood up well thus far. :)

I've gone attribute-crazy (attributes are used in .NET to attach metadata to types/functions/&c) and so, for example, individual plugins can expose their documentation. From this you can get a nifty help viewer (that can be embedded into Latenite):


Of course, command-line apps don't usually look very interesting, but I couldn't resist syntax-highlighting error reports (the compiler knows what each token is - a label, an operator, an instruction, a comment, a directive - so syntax highlighting is effectively built-in):

Image

I post this to hopefully commit myself to the project, and also driesguldolf seems to want .while \ .loop functionality so at least this indicates that it's coming. :D

As far as plugins go, here's the source for the .incbin plugin:

Code:

using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using System.ComponentModel;

using Brass3;
using Brass3.Plugins;
using Brass3.Attributes;

namespace Core.Directives {

   [Syntax(".incbin \"file\"")]
   [Description("Insert all data from a binary file into the output at the current program counter position.")]
   [Remarks("Use this to import precompiled resources from other sources into your project.")]
   [CodeExample("MonsterSprite:\r\n.incbin \"Resources/Sprites/Monster.spr\"")]
   [Category("Data")]
   public class IncBin : IDirective {

      public string[] Names { get { return new string[] { "incbin" }; } }
      public string Name { get { return Names[0]; } }

      public void Invoke(Compiler compiler, TokenisedSource source, int index, string directive) {

         int[] Args = source.GetCommaDelimitedArguments(index + 1, 1);
         if (!source.ExpressionIsStringConstant(Args[0])) throw new DirectiveArgumentException(source, "Filename expected.");

         string Filename = compiler.ResolveFilename(source.GetExpressionStringConstant(Args[0], false));

         if (!File.Exists(Filename)) throw new DirectiveArgumentException(source.Tokens[index], "File '" + Filename + "' not found.");

         try {
            using (FileStream FS = File.OpenRead(Filename)) {
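               switch (compiler.CurrentPass) {
                  case AssemblyPass.Pass1:
                     // Pass 1: advance the program counter by the file's length without writing anything.
                     compiler.Labels.ProgramCounter.Value += FS.Length;
                     break;
                  case AssemblyPass.Pass2:
                     // Pass 2: copy the file into the output one byte at a time.
                     int Data = 0;
                     while ((Data = FS.ReadByte()) != -1) compiler.WriteOutput((byte)Data);
                     break;
               }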
            }
         } catch (Exception ex) {
            throw new CompilerExpection(source, ex.Message);
         }
      }

   }
}


You should be able to write plugins in any CLI-targeting (".NET") language. Another problem with Brass 2 was that plugins were static. Brass 3 creates instances of plugins and provides an easy way for them to access each other, and thus share data, by allowing them to query the compiler for other directives (e.g. the fclose() function can ask the compiler for the instance of the fopen() function plugin so it can access fopen()'s file handle table).
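
For illustration, here is a rough, self-contained sketch of that instance-sharing idea. None of the names come from the Brass 3 API: PluginRegistry, GetPlugin<T>(), FOpen and FClose are hypothetical stand-ins for "the compiler" and the two function plugins.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the compiler's table of plugin instances (not the Brass 3 API).
public class PluginRegistry {
   private readonly Dictionary<Type, object> instances = new Dictionary<Type, object>();

   public void Register(object plugin) { instances[plugin.GetType()] = plugin; }

   // Returns the single registered instance of the requested plugin type.
   public T GetPlugin<T>() where T : class {
      object plugin;
      if (!instances.TryGetValue(typeof(T), out plugin)) throw new InvalidOperationException("Plugin '" + typeof(T).Name + "' not loaded.");
      return (T)plugin;
   }
}

// Owns the file handle table as instance (not static) data.
public class FOpen {
   public readonly Dictionary<int, string> Handles = new Dictionary<int, string>();
   private int nextHandle = 1;

   public int Open(string filename) {
      int handle = nextHandle++;
      Handles[handle] = filename;
      return handle;
   }
}

// Asks the registry for the FOpen instance so it can reach its handle table.
public class FClose {
   public bool Close(PluginRegistry registry, int handle) {
      return registry.GetPlugin<FOpen>().Handles.Remove(handle);
   }
}

public static class Demo {
   public static void Main() {
      PluginRegistry registry = new PluginRegistry();
      registry.Register(new FOpen());
      registry.Register(new FClose());

      int handle = registry.GetPlugin<FOpen>().Open("level1.bin");
      Console.WriteLine(registry.GetPlugin<FClose>().Close(registry, handle)); // True
      Console.WriteLine(registry.GetPlugin<FOpen>().Handles.Count);            // 0
   }
}
```

Because the table lives on an instance, two compilers running side by side would each get their own set of handles, which static plugins could not offer.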
Last edited by benryves on Mon 01 Oct, 2007 11:39 am, edited 1 time in total.

Post by benryves »

I've added support for "function-like" macros (like our beloved bcall(label)), extended conditionals to support "ifdef" variants (as well as defined() and undefined() functions) and added module support to the label system.

I've also changed the way that projects are built. Rather than put everything into command-line arguments, the front-end can load a project file (.brassproj) directly. A project file is a simple XML format file, and attributes or elements that aren't recognised are ignored so that the format can be extended easily. Currently the thing looks rather bare:

Code:

<?xml version="1.0" encoding="utf-8" ?>
<brassproject version="3">
   <plugins>
      <collection source="core.dll">
         <exclude plugin="define" />
      </collection>
      <collection source="legacy.dll">
         <include plugin="define" />
      </collection>
      <collection source="z80.dll" />
   </plugins>
   <input source="test.asm" assembler="z80">
      <includedir path="ti/includes" />
   </input>
   <output destination="test.bin" writer="raw" />
</brassproject>


The "collection" elements can be empty (in which case all plugins from a collection are loaded) or can contain "exclude" elements to explicitly exclude plugins or "include" elements to explicitly include plugins (you cannot mix and match exclude and include, naturally).
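
A minimal sketch of those include/exclude rules, using System.Xml.Linq. PluginFilter.Select and the list of available plugin names are assumptions made for the example (the real loader would get the names from the DLL); only the element and attribute names come from the project file above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PluginFilter {
   // Decides which plugins to load from one <collection> element.
   // 'available' is the list of plugin names the collection exposes (supplied by the caller here).
   public static List<string> Select(XElement collection, IEnumerable<string> available) {
      // Only "include" and "exclude" children are queried, so unrecognised elements are simply ignored.
      List<string> include = collection.Elements("include").Select(e => (string)e.Attribute("plugin")).ToList();
      List<string> exclude = collection.Elements("exclude").Select(e => (string)e.Attribute("plugin")).ToList();

      if (include.Count > 0 && exclude.Count > 0) throw new FormatException("Cannot mix include and exclude elements.");

      if (include.Count > 0) return available.Where(p => include.Contains(p)).ToList();
      return available.Where(p => !exclude.Contains(p)).ToList(); // Empty collection: everything is loaded.
   }
}

public static class Demo {
   public static void Main() {
      string[] available = { "define", "incbin", "org" };

      XElement core = XElement.Parse("<collection source=\"core.dll\"><exclude plugin=\"define\" /><somethingnew /></collection>");
      XElement legacy = XElement.Parse("<collection source=\"legacy.dll\"><include plugin=\"define\" /></collection>");

      Console.WriteLine(string.Join(",", PluginFilter.Select(core, available)));   // incbin,org
      Console.WriteLine(string.Join(",", PluginFilter.Select(legacy, available))); // define
   }
}
```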

Ultimately I'd also like the projects to be made up of multiple configurations which could automatically define constants, rather than the current environment variables hack.
KermMartian
Calc Wizard
Posts: 549
Joined: Tue 05 Jul, 2005 11:28 pm

Post by KermMartian »

You can has epic win? I'm glad you're finally going through with this. :)
Timendus
Calc King
Posts: 1729
Joined: Sun 23 Jan, 2005 12:37 am
Location: Netherlands

Post by Timendus »

Looking good, Ben. I was wondering why you were being so quiet :mrgreen:
http://clap.timendus.com/ - The Calculator Link Alternative Protocol
http://api.timendus.com/ - Make your life easier, leave the coding to the API
http://vera.timendus.com/ - The calc lover's OS

Post by benryves »

There are a couple of "minor" issues to contend with that I can think of: label+page syntax and unsquished binaries.

Labels have a value and a page property. Normally, assignments and value access return the value property. I was thinking that if you specified a label name with a : prefix it would work with the page value instead. This mirrors the : suffix nicely:

Code:

$: = $9D93 ; ".org $9D93"
:$ = 2     ; ".page 2"
$ += 100   ; Implicitly accesses value rather than page.
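
To make the proposal concrete, here is a small sketch of how a label reference might be resolved to either property. Label and LabelTable are hypothetical names invented for the example; the only rule taken from the post is that a : prefix selects the page and a : suffix (or no colon at all) selects the value.

```csharp
using System;
using System.Collections.Generic;

public class Label {
   public double Value;
   public int Page;
}

public class LabelTable {
   private readonly Dictionary<string, Label> labels = new Dictionary<string, Label>();

   // Splits a reference such as ":name", "name:" or "name" into the bare name and a page flag.
   private static string Parse(string reference, out bool page) {
      page = reference.Length > 1 && reference[0] == ':';
      return reference.Trim(':');
   }

   // Looks a label up by bare name, creating it on first use.
   private Label Find(string name) {
      Label label;
      if (!labels.TryGetValue(name, out label)) labels[name] = label = new Label();
      return label;
   }

   public void Assign(string reference, double value) {
      bool page;
      Label label = Find(Parse(reference, out page));
      if (page) label.Page = (int)value; else label.Value = value;
   }

   public double Read(string reference) {
      bool page;
      Label label = Find(Parse(reference, out page));
      return page ? label.Page : label.Value;
   }
}

public static class Demo {
   public static void Main() {
      LabelTable labels = new LabelTable();
      labels.Assign("$:", 0x9D93);                // ".org $9D93"
      labels.Assign(":$", 2);                     // ".page 2"
      labels.Assign("$", labels.Read("$") + 100); // "$ += 100" touches the value only

      Console.WriteLine(labels.Read("$"));  // 40439
      Console.WriteLine(labels.Read(":$")); // 2
   }
}
```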


Unsquished binaries are a bit more of an issue. Currently plugins can output data by calling WriteOutput(byte)/WriteOutput(short)/WriteOutput(int) (short and int flip byte order for different-endian devices). The output plugin then gets a list of output data, each item having an address, a page and a byte value.

I'm thinking that maybe this output data should have an array of bytes, and another plugin can be loaded that expands each byte written (via WriteOutput()) into the unsquished format, thus keeping the program counter intact even though there's more than one byte of output data to each byte of written data. The output plugin can then reject output elements with more than one byte of data if it's inappropriate.
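
A quick sketch of what that might look like. OutputItem, OutputBuffer and the Modifier delegate are made-up names for the sake of the example, not the actual Brass 3 types; the point is only that the program counter moves by one per written byte while the stored data can be longer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class OutputItem {
   public int Address;
   public int Page;
   public byte[] Data;
}

public class OutputBuffer {
   public int ProgramCounter;
   public int Page;
   public readonly List<OutputItem> Items = new List<OutputItem>();

   // Expands one written byte into the bytes that are actually stored; identity by default.
   public Func<byte, byte[]> Modifier = delegate(byte b) { return new byte[] { b }; };

   public void WriteOutput(byte value) {
      Items.Add(new OutputItem { Address = ProgramCounter, Page = Page, Data = Modifier(value) });
      ProgramCounter++; // Always one address per written byte, however long Data is.
   }
}

public static class Demo {
   public static void Main() {
      OutputBuffer output = new OutputBuffer();
      output.ProgramCounter = 0x9D93;
      // "Unsquishing" modifier: each byte becomes two ASCII hex digits.
      output.Modifier = delegate(byte b) { return Encoding.ASCII.GetBytes(b.ToString("X2")); };

      output.WriteOutput(0xC9); // ret

      OutputItem item = output.Items[0];
      Console.WriteLine(Encoding.ASCII.GetString(item.Data));  // C9
      Console.WriteLine(item.Data.Length);                     // 2
      Console.WriteLine(output.ProgramCounter.ToString("X4")); // 9D94
   }
}
```

An output writer that can't cope with expanded data would then just reject any item whose Data.Length isn't 1.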

Post by benryves »

I went for :label for page access and label: for value access. I also went for an "output modifier" plugin that processes each byte written into a byte[], and have written an unsquisher for the TI-83:

Code:

using System;
using System.Text;
using System.ComponentModel;

using Brass3;
using Brass3.Plugins;
using Brass3.Attributes;

namespace TexasInstruments {

   [Description("Unsquishes output binary data into ASCII text.")]
   [Syntax(".squish")]
   [Syntax(".unsquish")]
   [Remarks("This is suitable for use with TI-83 and uncompiled TI-83 Plus programs to be run without a shell.")]
   public class Squish : IOutputModifier, IDirective {

      #region Name

      public string[] Names {
         get { return new string[] { "squish", "unsquish" }; }
      }

      public string Name {
         get { return this.Names[0]; }
      }

      #endregion

      private bool Squishing = true;

      public Squish(Compiler c) {
         c.PassBegun += new EventHandler(delegate(object sender, EventArgs e) { this.Squishing = true; });
      }

      public byte[] ModifyOutput(Compiler compiler, byte data) {
         if (this.Squishing) {
            return new byte[] { data };
         } else {
            return Encoding.ASCII.GetBytes(data.ToString("X2"));
         }
      }

      public void Invoke(Compiler compiler, TokenisedSource source, int index, string directive) {
         source.GetCommaDelimitedArguments(index + 1, 0);
         this.Squishing = directive == "squish";
      }
   }
}


After making a few minor adjustments/hacks (e.g. %binary is a bit of a problem as % is an operator, so there's a special case if % is followed by 0 or 1) it can now compile a few TASM programs. To my absolute horror it was taking over 20 seconds for a small source file until I remembered to try without the debugger being attached, which dropped it down to 800ms - about on par with Brass 1, which isn't good. I have a hunch it's down to me reading data from the source file character by character, which is supposedly very slow (if this is the case then I can just dump the entire file into a MemoryStream and read that instead). Just running the source file reader on its own (and not actually decoding any instructions or invoking any directives) takes about 700ms, which shows me where the bottleneck is!

Post by benryves »

Unsurprisingly, parsing a very large assembly file from a FileStream was slow (700ms) when you read it byte-by-byte. I also had encoding issues with using a BinaryReader.

I now use File.ReadAllText (handles encoding issues), convert it to UTF-16 (Encoding.Unicode.GetBytes()), stick that all in a MemoryStream and read from that. From ~700ms to ~25ms isn't bad. :)
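
Roughly, the loading step looks like the sketch below. SourceLoader is just a name made up for the example; the File, Encoding and MemoryStream calls are the standard .NET ones.

```csharp
using System;
using System.IO;
using System.Text;

public static class SourceLoader {
   // Reads the whole file in one go and exposes it as an in-memory UTF-16 stream.
   public static MemoryStream Load(string path) {
      string text = File.ReadAllText(path);           // Detects a byte order mark; assumes UTF-8 if there is none.
      byte[] utf16 = Encoding.Unicode.GetBytes(text); // UTF-16 little-endian, no byte order mark.
      return new MemoryStream(utf16, false);          // Read-only view over the buffer.
   }
}

public static class Demo {
   public static void Main() {
      string path = Path.GetTempFileName();
      File.WriteAllText(path, "ld a,1");

      using (MemoryStream stream = SourceLoader.Load(path))
      using (StreamReader reader = new StreamReader(stream, Encoding.Unicode)) {
         Console.WriteLine(stream.Length);      // 12 (6 characters, 2 bytes each)
         Console.WriteLine(reader.ReadToEnd()); // ld a,1
      }

      File.Delete(path);
   }
}
```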

I then put the macro processor directly into the AssemblyReader.ReadTokenisedSource method (AssemblyReader : StreamReader) which boosted performance again (~15ms) and removed the myriad hacks in the compiler for the awkward case where a macro turns one statement (such as bcall(xyz)) into two statements (rst $28 and .dw xyz).
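
As a toy illustration of that awkward case (this is not the AssemblyReader code; FunctionMacro and its fields are hypothetical), a one-parameter macro whose body holds two statements separated by \ might expand like this:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class FunctionMacro {
   public string Parameter;
   public string Body; // Replacement text; '\' separates statements.

   // Substitutes the argument for the parameter and returns one string per statement.
   public List<string> Expand(string argument) {
      List<string> statements = new List<string>();
      string pattern = @"\b" + Regex.Escape(Parameter) + @"\b";
      foreach (string part in Body.Split('\\')) {
         // A MatchEvaluator is used so a '$' in the argument is not read as a regex substitution.
         statements.Add(Regex.Replace(part.Trim(), pattern, delegate(Match m) { return argument; }));
      }
      return statements;
   }
}

public static class Demo {
   public static void Main() {
      FunctionMacro bcall = new FunctionMacro { Parameter = "label", Body = @"rst $28 \ .dw label" };
      foreach (string statement in bcall.Expand("xyz")) Console.WriteLine(statement);
      // rst $28
      // .dw xyz
   }
}
```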

The AssemblyReader also keeps track of line number and source file now, so error messages are faintly useful. :)

The next bug to resolve involves macros like this:

Code:

#define xyz (x+y)*2


As the parser rips out whitespace the #define directive sees that as a function so gets a bit stroppy. Some sort of "Token.IsTouching" method might be in order to avoid that condition, whilst keeping whitespace out.
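
One way such a check could work, sketched under the assumption that each token remembers where it started in the source line (this Token class and DefineParser are invented for the example and are not the Brass 3 types):

```csharp
using System;

public class Token {
   public string Text;
   public int Start; // Index of the token's first character in the original line.

   public Token(string line, string text) {
      Text = text;
      Start = line.IndexOf(text, StringComparison.Ordinal);
   }

   public int End { get { return Start + Text.Length; } }

   // True if 'next' begins exactly where this token ends, i.e. no whitespace was stripped between them.
   public bool IsTouching(Token next) { return End == next.Start; }
}

public static class DefineParser {
   // A #define is function-like only if '(' directly follows the macro name.
   public static bool IsFunctionLike(string line, string name) {
      return new Token(line, name).IsTouching(new Token(line, "("));
   }

   public static void Main() {
      Console.WriteLine(IsFunctionLike("#define xyz (x+y)*2", "xyz"));                         // False
      Console.WriteLine(IsFunctionLike("#define bcall(label) rst $28 \\ .dw label", "bcall")); // True
   }
}
```

The whitespace itself never becomes a token; only the positions are kept.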

Now, text encoding! Currently string literals can be declared with "" or ''; I think that maybe the best way to handle this would be via a plugin that would take a string and return an array of bytes (like the Encoding.GetBytes(string) method) and the plugin that is used is defined by a prefix on the string, like this:

Code:

.db utf8"This would be in UTF-8"
.db ti"Some [program]" ; TIOS specific


To further help things along it might be an idea to have an implicit string encoding plugin for those cases where no prefix is specified, which could be set globally.
driesguldolf
Extreme Poster
Posts: 395
Joined: Thu 17 May, 2007 4:49 pm
Location: $4080

Post by driesguldolf »

Great ideas Ben! I don't have a lot to add, mainly because what you type is like Chinese to me :mrgreen: though:
benryves wrote: I now use File.ReadAllText (handles encoding issues), convert it to UTF-16 (Encoding.Unicode.GetBytes()), stick that all in a MemoryStream and read from that. From ~700ms to ~25ms isn't bad.
I do understand ;)

More function-like macros will greatly increase their usage! Looking forward :)

Post by Timendus »

I'm not sure this would improve readability, Ben

Code:

.db utf8"This would be in UTF-8"
.db ti"Some [program]" ; TIOS specific

If I were you I'd do this the same way you handle most other things:

Code:

.encodeutf8
.db "This would be in UTF-8"
.encodeti
.db "This would be Ti-OS charset"

It seems more consistent with the other directives you use, and it doesn't clutter the strings (which would allow for more typos).
King Harold
Calc King
Posts: 1513
Joined: Sat 05 Aug, 2006 7:22 am

Post by King Harold »

I'm with Tim there, it seems more like 'the assembler way' to do it Tim's way :)
Or maybe you could have both? (if not too hard?)

Post by benryves »

Hm, I was probably thinking of the C++ or C# style of prefixes (L"Hello" or @"Some\File\Path").

With a slight modification:

Code:

.encode utf8
.db "This would be in UTF-8"
.encode ti
.db "This would be Ti-OS charset"


Of course, this would also allow for:

Code:

.encode asciimap


...and the provision of a text encoding plugin that mimics the old .asciimap directive. :)
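
A bare-bones sketch of how a named-encoding switch could hang together. StringEncoder and its methods are hypothetical, and the ASCII default is an assumption; the real encoding plugins would presumably implement a Brass 3 interface rather than being plain System.Text.Encoding objects.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class StringEncoder {
   private readonly Dictionary<string, Encoding> encodings = new Dictionary<string, Encoding>(StringComparer.OrdinalIgnoreCase);
   private Encoding current = Encoding.ASCII; // Assumed default when no .encode has been seen.

   public void Register(string name, Encoding encoding) { encodings[name] = encoding; }

   // Equivalent of ".encode name": switches the encoding used for later string literals.
   public void Encode(string name) {
      Encoding encoding;
      if (!encodings.TryGetValue(name, out encoding)) throw new ArgumentException("Unknown encoding '" + name + "'.");
      current = encoding;
   }

   public byte[] GetBytes(string literal) { return current.GetBytes(literal); }
}

public static class Demo {
   public static void Main() {
      StringEncoder encoder = new StringEncoder();
      encoder.Register("ascii", Encoding.ASCII);
      encoder.Register("utf8", Encoding.UTF8);

      encoder.Encode("utf8");
      Console.WriteLine(BitConverter.ToString(encoder.GetBytes("\u00E9"))); // C3-A9
      encoder.Encode("ascii");
      Console.WriteLine(BitConverter.ToString(encoder.GetBytes("\u00E9"))); // 3F
   }
}
```

A "ti" or "asciimap" encoding would then just be another entry in the table, backed by a custom Encoding subclass.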

Post by Timendus »

Looks good to me :)

Post by benryves »

I've continued updating/fixing/working on the parser, and started throwing together a GUI project editor (browse for source/destination file, pick assembler, choose which plugins are available, select output writer plugin and click Build).

It now builds all of the test projects I used for the original Brass without complaining, which is a good sign, but it still doesn't output 100% accurate binaries just yet, so still got some bug-squashing to do.

I also updated the help viewer a little; it now displays help for all plugins (not just functions and directives) and when it displays code examples it runs them through the parser first so they end up highlighted, and you can click on directives and functions to jump straight to their help file.

Image

If you run Brass without any command-line parameters it runs as a command-line calculator:

Code:

Usage: Brass ProjectFile
Running in interactive calculator mode. Type exit to quit.

> x=2**3
8
> abs(-x)<<1
16
> $FFFF+y
Value for label 'y' not found.
$FFFF+y
> y=-1
-1
> $FFFF+y
65534
> list
$       0
x       8
y       -1
> _

...which has been useful for testing the expression parser.

Now, more string-encoding related questions. :)

First up; how should string constants like this be handled:

Code:

    ld a,"!"
    ld hl,"?!"


Currently, it takes the string, encodes it using the current string encoding, then treats the resulting byte[] as a large integer value (so, with ASCII, "AB"->{$41, $42}->$4142). Is that sensible, or can you think of a better way to do it?
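
In other words, something along these lines (a sketch only; StringConstant.ToInteger is a made-up helper, not the assembler's actual code):

```csharp
using System;
using System.Text;

public static class StringConstant {
   // Encodes the literal, then folds the bytes into one integer, first character most significant.
   public static long ToInteger(string literal, Encoding encoding) {
      long value = 0;
      foreach (byte b in encoding.GetBytes(literal)) value = (value << 8) | b;
      return value;
   }
}

public static class Demo {
   public static void Main() {
      Console.WriteLine(StringConstant.ToInteger("!", Encoding.ASCII).ToString("X2"));  // 21
      Console.WriteLine(StringConstant.ToInteger("?!", Encoding.ASCII).ToString("X4")); // 3F21
      Console.WriteLine(StringConstant.ToInteger("AB", Encoding.ASCII).ToString("X4")); // 4142
   }
}
```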

The assembler has an endianness switch (can be set by the .big and .little directives); should it reverse the string ($4241) in big-endian mode?

Also, should .db "xyz" with UTF-16 output big-endian Unicode in big-endian mode? (UTF-16 and UTF-32 are currently fixed to little-endian output).

Furthermore; the assembler defaults to an ASCII encoding. I notice that the Venus header has some non-ASCII characters in it; which code page should I default to? I'd rather not default to the current system's code page, as that would break source portability.

As per the Encoding.GetBytes method, encoding plugins return a dud character for symbols they cannot reproduce (so "é" == "?" is true with the ASCII encoding).

Post by driesguldolf »

benryves wrote: Currently, it takes the string, encodes it using the current string encoding, then treats the resulting byte[] as a large integer value (so, with ASCII, "AB"->{$41, $42}->$4142).
This is the most logical way to do it.

benryves wrote: The assembler has an endianness switch (can be set by the .big and .little directives); should it reverse the string ($4241) in big-endian mode?
Definitely not, I'd keep characters in the order they're presented, but of course when you have 16 bits per letter, swap the bytes if in little-endian mode.

benryves wrote: Furthermore; the assembler defaults to an ASCII encoding. I notice that the Venus header has some non-ASCII characters in it
Let the developers write hex values?

benryves wrote: As per the Encoding.GetBytes method, encoding plugins return a dud character for symbols they cannot reproduce (so "é" == "?" is true with the ASCII encoding).
Yeah, but allow the dud char to be set with a directive (like .dudchar "?")

Post by Timendus »

Looking good Ben! :)

driesguldolf wrote:
benryves wrote: Furthermore; the assembler defaults to an ASCII encoding. I notice that the Venus header has some non-ASCII characters in it
Let the developers write hex values?

I was about to say the same thing. The Venus source uses those characters to represent binary values that it should have represented with .db $8A or whatever instead. In my opinion this is just bad programming practice, as it results in poor source compatibility, and supporting that just for the sake of backwards compatibility seems... pointless...