MaxCoderz

for your 1 bit pleasure!

All times are UTC




Posted: Mon 24 Sep, 2007 11:11 am
Maxcoderz Staff
Joined: Thu 16 Dec, 2004 10:06 pm
Posts: 3064
Location: Croydon, England
OK, so Brass 2 was a bit of a damp squib. No matter.

Brass 3 takes some of Brass 2's ideas (i.e. a user-extendable, plugin-driven compiler) yet simplifies things quite a lot. The Brass 2 parser was pretty sophisticated, which meant that whilst it was based on very clean code, trying to shoe-horn in additional functionality like a macro preprocessor (which modified the source) or various odd bits of syntax (like TASM's '.equ') was a nightmare and just didn't work.

The new parser is still light-years ahead of the original one in Brass (including much better operator support and functions), though!

There's a vast amount of stuff that needs to be implemented (modules, reusable labels, lots of missing directives) but at least the concept has stood up well thus far. :)

I've gone attribute-crazy (attributes are used in .NET to attach metadata to types/functions/&c) and so, for example, individual plugins can expose their documentation. From this you can get a nifty help viewer (that can be embedded into Latenite):


Of course, command-line apps don't usually look very interesting, but I couldn't resist syntax-highlighting error reports (the compiler knows what each token is - a label, an operator, an instruction, a comment, a directive - so syntax highlighting is effectively built-in):

Image

I post this to hopefully commit myself to the project, and also driesguldolf seems to want .while \ .loop functionality so at least this indicates that it's coming. :D

As far as plugins go, here's the source for the .incbin plugin:

Code:
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;
using System.ComponentModel;

using Brass3;
using Brass3.Plugins;
using Brass3.Attributes;

namespace Core.Directives {

   [Syntax(".incbin \"file\"")]
   [Description("Insert all data from a binary file into the output at the current program counter position.")]
   [Remarks("Use this to import precompiled resources from other sources into your project.")]
   [CodeExample("MonsterSprite:\r\n.incbin \"Resources/Sprites/Monster.spr\"")]
   [Category("Data")]
   public class IncBin : IDirective {

      public string[] Names { get { return new string[] { "incbin" }; } }
      public string Name { get { return Names[0]; } }

      public void Invoke(Compiler compiler, TokenisedSource source, int index, string directive) {

         // The directive takes a single argument: the filename, which must be a string constant.
         int[] Args = source.GetCommaDelimitedArguments(index + 1, 1);
         if (!source.ExpressionIsStringConstant(Args[0])) throw new DirectiveArgumentException(source, "Filename expected.");

         // Let the compiler resolve the filename before checking that the file exists.
         string Filename = compiler.ResolveFilename(source.GetExpressionStringConstant(Args[0], false));

         if (!File.Exists(Filename)) throw new DirectiveArgumentException(source.Tokens[index], "File '" + Filename + "' not found.");

         try {
            using (FileStream FS = File.OpenRead(Filename)) {
               switch (compiler.CurrentPass) {
                  case AssemblyPass.Pass1:
                     // First pass: only advance the program counter by the file's size.
                     compiler.Labels.ProgramCounter.Value += FS.Length;
                     break;
                  case AssemblyPass.Pass2:
                     // Second pass: copy the file into the output one byte at a time.
                     int Data = 0;
                     while ((Data = FS.ReadByte()) != -1) compiler.WriteOutput((byte)Data);
                     break;
               }
            }
         } catch (Exception ex) {
            // Report any failure while reading the file as a compiler error.
            throw new CompilerExpection(source, ex.Message);
         }
      }

   }
}


You should be able to write plugins in any CLI-targeting (".NET") language. Another problem with Brass 2 was that plugins were static. Brass 3 creates instances of plugins and provides an easy way for them to access each other, and thus share data, by allowing them to query the compiler for other directives (e.g. the fclose() function can ask the compiler for the instance of the fopen() function plugin so it can access fopen()'s file handle table).
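
As a rough sketch of that lookup idea (not the actual Brass 3 API): the names IPlugin, PluginRegistry, GetPlugin, FOpen and FClose below are all made up for illustration, and the registry stands in for whatever the compiler really exposes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; none of these names come from the real Brass 3 API.
public interface IPlugin {
    string Name { get; }
}

public sealed class PluginRegistry {
    private readonly Dictionary<Type, IPlugin> instances = new Dictionary<Type, IPlugin>();

    // One instance per plugin type, owned by this registry (i.e. by one compiler).
    public void Register(IPlugin plugin) {
        instances[plugin.GetType()] = plugin;
    }

    // Lets one plugin ask for the live instance of another; null if not loaded.
    public T GetPlugin<T>() where T : class, IPlugin {
        IPlugin found;
        return instances.TryGetValue(typeof(T), out found) ? (T)found : null;
    }
}

public sealed class FOpen : IPlugin {
    public string Name { get { return "fopen"; } }

    // Handle table that other file functions need to see.
    public readonly Dictionary<int, string> Handles = new Dictionary<int, string>();
    private int nextHandle = 1;

    public int Open(string filename) {
        Handles[nextHandle] = filename;
        return nextHandle++;
    }
}

public sealed class FClose : IPlugin {
    private readonly PluginRegistry registry;
    public FClose(PluginRegistry registry) { this.registry = registry; }

    public string Name { get { return "fclose"; } }

    // Looks up the fopen instance and removes the handle from its table.
    public bool Close(int handle) {
        FOpen opener = registry.GetPlugin<FOpen>();
        return opener != null && opener.Handles.Remove(handle);
    }
}
```

Because the table lives on an instance rather than in a static field, two compilers running side by side would each get their own set of file handles.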


Last edited by benryves on Mon 01 Oct, 2007 11:39 am, edited 1 time in total.

Posted: Tue 25 Sep, 2007 1:15 pm
Maxcoderz Staff
Joined: Thu 16 Dec, 2004 10:06 pm
Posts: 3064
Location: Croydon, England
I've added support for "function-like" macros (like our beloved bcall(label)), extended conditionals to support "ifdef" variants (as well as defined() and undefined() functions) and added module support to the label system.

I've also changed the way that projects are built. Rather than put everything into command-line arguments, the front-end can load a project file (.brassproj) directly. A project file is a simple XML format file, and attributes or elements that aren't recognised are ignored so that the format can be extended easily. Currently the thing looks rather bare:

Code:
<?xml version="1.0" encoding="utf-8" ?>
<brassproject version="3">
   <plugins>
      <collection source="core.dll">
         <exclude plugin="define" />
      </collection>
      <collection source="legacy.dll">
         <include plugin="define" />
      </collection>
      <collection source="z80.dll" />
   </plugins>
   <input source="test.asm" assembler="z80">
      <includedir path="ti/includes" />
   </input>
   <output destination="test.bin" writer="raw" />
</brassproject>


The "collection" elements can be empty (in which case all plugins from a collection are loaded) or can contain "exclude" elements to explicitly exclude plugins or "include" elements to explicitly include plugins (you cannot mix and match exclude and include, naturally).
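
The include/exclude rule could be sketched along these lines. This is only an illustration using System.Xml.Linq; PluginFilter and its "available" parameter (the full list of plugin names a collection exposes) are assumptions, not part of Brass.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; not the real Brass 3 project loader.
public static class PluginFilter {

    // Returns the plugin names to load from one <collection> element.
    // 'available' is every plugin name the collection's DLL exposes.
    public static List<string> Select(XElement collection, IEnumerable<string> available) {
        List<string> includes = Names(collection, "include");
        List<string> excludes = Names(collection, "exclude");

        if (includes.Count > 0 && excludes.Count > 0)
            throw new InvalidOperationException("Cannot mix include and exclude in one collection.");

        if (includes.Count > 0) return available.Where(p => includes.Contains(p)).ToList();
        if (excludes.Count > 0) return available.Where(p => !excludes.Contains(p)).ToList();
        return available.ToList(); // Empty element: load everything.
    }

    // Collects the "plugin" attribute of each child element with the given name.
    private static List<string> Names(XElement collection, string elementName) {
        return collection.Elements(elementName)
                         .Select(e => (string)e.Attribute("plugin"))
                         .Where(n => n != null)
                         .ToList();
    }
}
```

Unknown child elements are simply never asked for, which matches the "unrecognised elements are ignored" behaviour described above.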

Ultimately I'd also like the projects to be made up of multiple configurations which could automatically define constants, rather than the current environment variables hack.


Posted: Tue 25 Sep, 2007 2:06 pm
Calc Wizard
Joined: Tue 05 Jul, 2005 11:28 pm
Posts: 549
You can has epic win? I'm glad you're finally going through with this. :)

Posted: Tue 25 Sep, 2007 2:23 pm
Calc King
Joined: Sun 23 Jan, 2005 12:37 am
Posts: 1727
Location: Netherlands
Looking good, Ben. I was wondering why you were being so quiet :mrgreen:

_________________
http://clap.timendus.com/ - The Calculator Link Alternative Protocol
http://api.timendus.com/ - Make your life easier, leave the coding to the API
http://vera.timendus.com/ - The calc lover's OS


Posted: Tue 25 Sep, 2007 3:01 pm
Maxcoderz Staff
Joined: Thu 16 Dec, 2004 10:06 pm
Posts: 3064
Location: Croydon, England
There are a couple of "minor" issues to contend with that I can think of: label+page syntax and unsquished binaries.

Labels have a value and a page property. Normally, assignments and value access return the value property. I was thinking that if you specified a label name with a : prefix it would work with the page value instead. This mirrors the : suffix nicely:

Code:
$: = $9D93 ; ".org $9D93"
:$ = 2     ; ".page 2"
$ += 100   ; Implicitly accesses value rather than page.
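
In C# terms, the prefix/suffix idea might be modelled something like this. It is a sketch only: Label, LabelTable, Assign and Read are invented names, and the real label system will certainly differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a label with separate value and page properties.
public sealed class Label {
    public double Value;
    public int Page;
}

public sealed class LabelTable {
    private readonly Dictionary<string, Label> labels = new Dictionary<string, Label>();

    // Finds a label by bare name, creating it on first use.
    private Label Find(string name) {
        Label label;
        if (!labels.TryGetValue(name, out label)) {
            label = new Label();
            labels[name] = label;
        }
        return label;
    }

    private static bool IsPageAccess(string token) {
        return token.Length > 1 && token[0] == ':';
    }

    // ":name" targets the page; "name:" or plain "name" targets the value.
    public void Assign(string token, double number) {
        if (IsPageAccess(token)) Find(token.Substring(1)).Page = (int)number;
        else Find(token.TrimEnd(':')).Value = number;
    }

    public double Read(string token) {
        if (IsPageAccess(token)) return Find(token.Substring(1)).Page;
        return Find(token.TrimEnd(':')).Value;
    }
}
```

With this, the three lines above map to Assign("$:", 0x9D93), Assign(":$", 2) and Assign("$", Read("$") + 100).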


Unsquished binaries are a bit more of an issue. Currently plugins can output data by calling WriteOutput(byte)/WriteOutput(short)/WriteOutput(int) (short and int flip byte order for different-endian devices). The output plugin then gets a list of output data, each item having an address, a page and a byte value.

I'm thinking that maybe this output data should have an array of bytes, and another plugin can be loaded that expands each byte written (via WriteOutput()) into the unsquished format, thus keeping the program counter intact even though there's more than one byte of output data to each byte of written data. The output plugin can then reject output elements with more than one byte of data if it's inappropriate.
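
To make that concrete, here is one way the idea could look. Everything in it (OutputData, IByteModifier, HexModifier, OutputBuffer, RawWriter) is a hypothetical sketch under the assumptions above, not the real Brass 3 classes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical types for illustration only.
public sealed class OutputData {
    public int Address;
    public int Page;
    public byte[] Data; // one or more bytes produced for a single written byte
}

public interface IByteModifier {
    byte[] Modify(byte data);
}

// Expands each byte into two ASCII hex digits (the "unsquished" form).
public sealed class HexModifier : IByteModifier {
    public byte[] Modify(byte data) {
        return Encoding.ASCII.GetBytes(data.ToString("X2"));
    }
}

public sealed class OutputBuffer {
    public readonly List<OutputData> Items = new List<OutputData>();
    public IByteModifier Modifier; // null means bytes pass through unchanged
    public int ProgramCounter;
    public int Page;

    public void WriteOutput(byte data) {
        byte[] expanded = Modifier == null ? new byte[] { data } : Modifier.Modify(data);
        Items.Add(new OutputData { Address = ProgramCounter, Page = Page, Data = expanded });
        ProgramCounter++; // advances once per written byte, however many bytes come out
    }
}

public static class RawWriter {
    // A writer that cannot represent expanded data rejects it.
    public static byte[] Flatten(IEnumerable<OutputData> items) {
        List<byte> result = new List<byte>();
        foreach (OutputData item in items) {
            if (item.Data.Length != 1)
                throw new InvalidOperationException("Raw output cannot hold expanded data.");
            result.Add(item.Data[0]);
        }
        return result.ToArray();
    }
}
```

The point of the design is that label addresses stay correct: the program counter counts logical bytes, while the output item carries however many physical bytes the modifier produced.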


Posted: Wed 26 Sep, 2007 10:09 am
Maxcoderz Staff
Joined: Thu 16 Dec, 2004 10:06 pm
Posts: 3064
Location: Croydon, England
I went for the :label for page access and label: for value access. I also went for an "output modifier" plugin that processes each byte written into a byte[], and have written an unsquisher for the TI-83:

Code:
using System;
using System.Text;
using System.ComponentModel;

using Brass3;
using Brass3.Plugins;
using Brass3.Attributes;

namespace TexasInstruments {

   [Description("Unsquishes output binary data into ASCII text.")]
   [Syntax(".squish")]
   [Syntax(".unsquish")]
   [Remarks("This is suitable for use with TI-83 and uncompiled TI-83 Plus programs to be run without a shell.")]
   public class Squish : IOutputModifier, IDirective {

      #region Name

      public string[] Names {
         get { return new string[] { "squish", "unsquish" }; }
      }

      public string Name {
         get { return this.Names[0]; }
      }

      #endregion

      private bool Squishing = true;

      public Squish(Compiler c) {
         c.PassBegun += new EventHandler(delegate(object sender, EventArgs e) { this.Squishing = true; });
      }

      public byte[] ModifyOutput(Compiler compiler, byte data) {
         if (this.Squishing) {
            return new byte[] { data };
         } else {
            return Encoding.ASCII.GetBytes(data.ToString("X2"));
         }
      }

      public void Invoke(Compiler compiler, TokenisedSource source, int index, string directive) {
         source.GetCommaDelimitedArguments(index + 1, 0);
         this.Squishing = directive == "squish";
      }
   }
}


After making a few minor adjustments/hacks (e.g. %binary is a bit of a problem as % is an operator, so there's a special case if % is followed by 0 or 1) it can now compile a few TASM programs. To my absolute horror it was taking over 20 seconds for a small source file until I remembered to try without the debugger attached, which dropped it down to 800ms. That's about on par with Brass 1, which isn't good. I have a hunch it's down to me reading data from the source file character by character, which is supposedly very slow (if this is the case then I can just dump the entire file into a MemoryStream and read that instead). Just running the source file reader on its own (and not actually decoding any instructions or invoking any directives) takes about 700ms, which shows me where the bottleneck is!


Posted: Thu 27 Sep, 2007 12:13 pm
Maxcoderz Staff
Joined: Thu 16 Dec, 2004 10:06 pm
Posts: 3064
Location: Croydon, England
Unsurprisingly, parsing a very large assembly file from a FileStream was slow (700ms) when you read it byte-by-byte. I also had encoding issues with using a BinaryReader.

I now use File.ReadAllText (handles encoding issues), convert it to UTF-16 (Encoding.Unicode.GetBytes()), stick that all in a MemoryStream and read from that. From ~700ms to ~25ms isn't bad. :)
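
Stripped of the tokenising, the loading step is roughly the following. SourceLoader and ReadChar are illustrative names, not the real AssemblyReader; only File.ReadAllText, Encoding.Unicode and MemoryStream are the actual .NET calls mentioned above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical sketch of the "read once, parse from memory" approach.
public static class SourceLoader {

    // Reads the whole file in one go, then exposes it as UTF-16 bytes in memory.
    public static MemoryStream Load(string path) {
        string text = File.ReadAllText(path); // detects BOM-marked encodings, else UTF-8
        return new MemoryStream(Encoding.Unicode.GetBytes(text), false);
    }

    // Reads the next UTF-16 code unit (little-endian), or -1 at the end of the stream.
    public static int ReadChar(Stream stream) {
        int low = stream.ReadByte();
        int high = stream.ReadByte();
        if (low < 0 || high < 0) return -1;
        return low | (high << 8);
    }
}
```

Every character is then a fixed two bytes, so the reader never has to decode variable-length sequences or touch the disk while tokenising.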

I then put the macro processor directly into the AssemblyReader.ReadTokenisedSource method (AssemblyReader : StreamReader), which boosted performance again (~15ms) and removed the myriad hacks in the compiler for the awkward case where a macro turns one statement (such as bcall(xyz)) into two statements (rst $28 and .dw xyz).

The AssemblyReader also keeps track of line number and source file, now, so error messages are now faintly useful. :)

The next bug to resolve is macros like this:

Code:
#define xyz (x+y)*2


As the parser rips out whitespace the #define directive sees that as a function so gets a bit stroppy. Some sort of "Token.IsTouching" method might be in order to avoid that condition, whilst keeping whitespace out.
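
One possible shape for that check, assuming each token remembers where it started in the original line. Token.Start, Token.End and DefineParser are invented for this sketch.

```csharp
using System;

// Hypothetical token that records its position in the original source line.
public sealed class Token {
    public string Text;
    public int Start; // offset of the token's first character

    public int End { get { return Start + Text.Length; } }

    // True when 'next' begins exactly where this token ends (no whitespace between).
    public bool IsTouching(Token next) {
        return next != null && next.Start == End;
    }
}

public static class DefineParser {
    // A macro is function-like only if "(" directly follows its name.
    public static bool IsFunctionLike(Token name, Token following) {
        return following != null && following.Text == "(" && name.IsTouching(following);
    }
}
```

So "#define xyz (x+y)*2" stays a plain replacement because of the space, while "#define xyz(x,y)" is treated as a function-like macro, and the whitespace itself never has to be kept as a token.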

Now, text encoding! Currently string literals can be declared with "" or ''. I think that maybe the best way to handle this would be via a plugin that takes a string and returns an array of bytes (like the Encoding.GetBytes(string) method). The plugin to use would be selected by a prefix on the string, like this:

Code:
.db utf8"This would be in UTF-8"
.db ti"Some [program]" ; TIOS specific


To further help things along it might be an idea to have an implicit string encoding plugin for those cases where no prefix is specified, which could be set globally.


Posted: Thu 27 Sep, 2007 10:12 pm
Extreme Poster
Joined: Thu 17 May, 2007 4:49 pm
Posts: 395
Location: $4080
Great ideas Ben! I don't have a lot to add, mainly because what you type is like Chinese to me :mrgreen: though:
benryves wrote:
I now use File.ReadAllText (handles encoding issues), convert it to UTF-16 (Encoding.Unicode.GetBytes()), stick that all in a MemoryStream and read from that. From ~700ms to ~25ms isn't bad.
I do understand ;)

More function-like macros will greatly increase their usage! Looking forward :)


Posted: Fri 28 Sep, 2007 7:23 am
Calc King
Joined: Sun 23 Jan, 2005 12:37 am
Posts: 1727
Location: Netherlands
I'm not sure this would improve readability, Ben:
Code:
.db utf8"This would be in UTF-8"
.db ti"Some [program]" ; TIOS specific

If I were you I'd do this the same way you handle most other things:
Code:
.encodeutf8
.db "This would be in UTF-8"
.encodeti
.db "This would be Ti-OS charset"

It seems more consistent with the other directives you use, and it doesn't clutter the strings (which would allow for more typos).

Posted: Fri 28 Sep, 2007 7:58 am
Calc King
Joined: Sat 05 Aug, 2006 7:22 am
Posts: 1513
I'm with Tim there, it seems more like 'the assembler way' to do it Tim's way :)
Or maybe you could have both? (if not too hard?)


Posted: Fri 28 Sep, 2007 9:40 am
Maxcoderz Staff
Joined: Thu 16 Dec, 2004 10:06 pm
Posts: 3064
Location: Croydon, England
Hm, I was probably thinking of the C++ or C# style of prefixes (L"Hello" or @"Some\File\Path").

With a slight modification:

Code:
.encode utf8
.db "This would be in UTF-8"
.encode ti
.db "This would be Ti-OS charset"


Of course, this would also allow for:

Code:
.encode asciimap


...and the provision of a text encoding plugin that mimics the old .asciimap directive. :)
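
Behind the scenes that could be as simple as a table of named encoders plus a "current" one. The sketch below is an assumption about how it might hang together; IStringEncoder, DotNetEncoder and EncodingState are made-up names, and only the System.Text.Encoding calls are real.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical plugin interface: turns a string into bytes.
public interface IStringEncoder {
    string Name { get; }
    byte[] GetBytes(string text);
}

// Wraps one of the stock .NET encodings as an encoder plugin.
public sealed class DotNetEncoder : IStringEncoder {
    private readonly string name;
    private readonly Encoding encoding;

    public DotNetEncoder(string name, Encoding encoding) {
        this.name = name;
        this.encoding = encoding;
    }

    public string Name { get { return name; } }
    public byte[] GetBytes(string text) { return encoding.GetBytes(text); }
}

public sealed class EncodingState {
    private readonly Dictionary<string, IStringEncoder> encoders =
        new Dictionary<string, IStringEncoder>(StringComparer.OrdinalIgnoreCase);
    private IStringEncoder current;

    public EncodingState() {
        Register(new DotNetEncoder("ascii", Encoding.ASCII));
        Register(new DotNetEncoder("utf8", new UTF8Encoding(false)));
        current = encoders["ascii"]; // implicit default when no .encode has been seen
    }

    public void Register(IStringEncoder encoder) {
        encoders[encoder.Name] = encoder;
    }

    // Handles ".encode name".
    public void Encode(string name) {
        IStringEncoder encoder;
        if (!encoders.TryGetValue(name, out encoder))
            throw new ArgumentException("Unknown encoding '" + name + "'.");
        current = encoder;
    }

    // Handles the string part of ".db "text"".
    public byte[] DefineBytes(string text) {
        return current.GetBytes(text);
    }
}
```

An "asciimap" or "ti" encoder would then just be another IStringEncoder registered under that name.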


Posted: Mon 01 Oct, 2007 9:13 am
Calc King
Joined: Sun 23 Jan, 2005 12:37 am
Posts: 1727
Location: Netherlands
Looks good to me :)

Posted: Mon 01 Oct, 2007 12:15 pm
Maxcoderz Staff
Joined: Thu 16 Dec, 2004 10:06 pm
Posts: 3064
Location: Croydon, England
I've continued updating/fixing/working on the parser, and started throwing together a GUI project editor (browse for source/destination file, pick assembler, choose which plugins are available, select output writer plugin and click Build).

It now builds all of the test projects I used for the original Brass without complaining, which is a good sign, but it still doesn't output 100% accurate binaries just yet, so still got some bug-squashing to do.

I also updated the help viewer a little; it now displays help for all plugins (not just functions and directives) and when it displays code examples it runs them through the parser first so they end up highlighted, and you can click on directives and functions to jump straight to their help file.

Image

If you run Brass without any command-line parameters it runs as a command-line calculator:

Code:
Usage: Brass ProjectFile
Running in interactive calculator mode. Type exit to quit.

> x=2**3
8
> abs(-x)<<1
16
> $FFFF+y
Value for label 'y' not found.
$FFFF+y
> y=-1
-1
> $FFFF+y
65534
> list
$       0
x       8
y       -1
> _

...which has been useful for testing the expression parser.

Now, more string-encoding related questions. :)

First up: how should string constants like this be handled?

Code:
    ld a,"!"
    ld hl,"?!"


Currently, it takes the string, encodes it using the current string encoding, then treats the resulting byte[] as a large integer value (so, with ASCII, "ab"->{$61, $62}->$6162). Is that sensible, or can you think of a better way to do it?

The assembler has an endianness switch (can be set by the .big and .little directives); should it reverse the string ($6261) in big-endian mode?

Also, should .db "xyz" with UTF-16 output big-endian Unicode in big-endian mode? (UTF-16 and UTF-32 are currently fixed to little-endian output).

Furthermore; the assembler defaults to an ASCII encoding. I notice that the Venus header has some non-ASCII characters in it; which code page should I default to? I'd rather not default to the current system's code page, as that would break source portability.

As per the Encoding.GetBytes method, encoding plugins return a dud character for symbols they cannot reproduce (so "é" == "?" is true with the ASCII encoding).
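
For reference, the current behaviour amounts to something like this. StringConstant.ToInteger is a name invented for the sketch, and the 8-byte limit is an assumption made so the result fits in a long; Encoding.GetBytes and its "?" replacement for unencodable characters are the stock .NET behaviour.

```csharp
using System;
using System.Text;

// Hypothetical helper showing how a string constant becomes a number.
public static class StringConstant {

    // Encodes the text, then folds the bytes into one integer with the
    // first character ending up in the most significant position.
    public static long ToInteger(string text, Encoding encoding) {
        byte[] bytes = encoding.GetBytes(text);
        if (bytes.Length > 8)
            throw new ArgumentException("String constant too long for a 64-bit value.");

        long value = 0;
        foreach (byte b in bytes) value = (value << 8) | b;
        return value;
    }
}
```

With the ASCII encoding, "!" gives $21 and "?!" gives $3F21, and "é" gives the same result as "?" because the encoder substitutes the dud character.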


Posted: Mon 01 Oct, 2007 1:46 pm
Extreme Poster
Joined: Thu 17 May, 2007 4:49 pm
Posts: 395
Location: $4080
benryves wrote:
Currently, it takes the string, encodes it using the current string encoding, then treats the resulting byte[] as a large integer value (so, with ASCII, "ab"->{$61, $62}->$6162).
This is the most logical way to do it.

benryves wrote:
The assembler has an endianness switch (can be set by the .big and .little directives); should it reverse the string ($6261) in big-endian mode?

Definitely not, I'd keep characters in the order they're presented, but of course when you have 16 bits per letter, swap the bytes if in little-endian mode.

benryves wrote:
Furthermore; the assembler defaults to an ASCII encoding. I notice that the Venus header has some non-ASCII characters in it
Let the developers write hex values?

benryves wrote:
As per the Encoding.GetBytes method, encoding plugins return a dud character for symbols they cannot reproduce (so "é" == "?" is true with the ASCII encoding).
Yeah, but allow the dud char to be set with a directive (like .dudchar "?")


Posted: Mon 01 Oct, 2007 2:04 pm
Calc King
Joined: Sun 23 Jan, 2005 12:37 am
Posts: 1727
Location: Netherlands
Looking good Ben! :)

Quote:
benryves wrote:
Furthermore; the assembler defaults to an ASCII encoding. I notice that the Venus header has some non-ASCII characters in it
Let the developers write hex values?

I was about to say the same thing. The Venus source uses those characters to represent binary values that it should have represented with .db $8A or whatever instead. In my opinion this is just bad programming practice, as it results in poor source compatibility, and supporting that just for the sake of backwards compatibility seems... pointless...
