[TI ASM] Optimizations

Got questions? Got answers? Go here for both.

Moderator: MaxCoderz Staff

Halifax
Sir Posts-A-Lot
Posts: 225
Joined: Mon 01 Jan, 2007 10:39 am
Location: Pennsylvania, US

Post by Halifax »

hmm ok maybe I was wrong.
Halifax
Sir Posts-A-Lot
Posts: 225
Joined: Mon 01 Jan, 2007 10:39 am
Location: Pennsylvania, US

Post by Halifax »

Change

Code: Select all

ld a,KEY_GROUP
out (1),a
nop
nop
in a,(1)
to

Code: Select all

ld a,KEY_GROUP
out (1),a
ld a,(de)
in a,(1)
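This saves 1 byte and 1 T-state: the two nops take 2 bytes and 8 T-states, while ld a,(de) takes 1 byte and 7 T-states, and the value it loads into A is immediately overwritten by the in.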
Timendus
Calc King
Posts: 1729
Joined: Sun 23 Jan, 2005 12:37 am
Location: Netherlands

Post by Timendus »

CalcKing wrote:Expanding on Dwedit's trick, I created a macro system for Bot Attack.

Code: Select all

#define	curVPutS(curPos)	call	cur_vputs \ .dw	curPos

	...

	curVPutS(1+(256*9))			; Draw author's name... ;-)
	.db	"by Peter Wakefield",0

;-----> Optimized vputs routine
; inputs:   bytes following call: CoordLSB,CoordMSB,Null-terminated string
; output: String displayed, smaller than using redundant code
cur_vputs:
   pop   hl
   ld   e,(hl)
   inc   hl
   ld   d,(hl)
   inc   hl
   ld   (pencol),de
   bcall(_vputs)
   jp   (hl)
Why didn't you do it like this:

Code: Select all

#define	print(xcoord,ycoord,string)	call	cur_vputs \ .dw	xcoord+(256*ycoord) \ .db string,0

	...

	print(1,9,"by Peter Wakefield")
I guess it should work just as well (the API uses something like this somewhere I think, can't check now since my server is dead). It shouldn't make a difference in speed or size, but it greatly improves the readability of your code.

Oh, wait, if you do it like this, it'll also save you a few instructions (faster, but possibly bigger depending on how many times you use the macro):

Code: Select all

#define	print(xcoord,ycoord,string)	ld de,xcoord+(256*ycoord) \ call cur_vputs \ .db string,0

	...

	print(1,9,"by Peter Wakefield")

;-----> Optimized vputs routine
; inputs:   DE = coordinates (E = CoordLSB, D = CoordMSB); bytes following call: Null-terminated string
; output: String displayed, smaller than using redundant code
cur_vputs:
   ld   (pencol),de
   pop   hl
   bcall(_vputs)
   jp   (hl)
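To put rough numbers on "faster, but possibly bigger": a small Python sketch of the trade-off, using the standard Z80 sizes and timings for the instructions involved. The names here are made up for illustration and the figures ignore everything the two versions have in common.

```python
# Rough size/speed comparison of the two cur_vputs variants.
# Byte and T-state figures are standard Z80 values; all names are illustrative.

ROUTINE_BYTES_SAVED = 4      # ld e,(hl) / inc hl / ld d,(hl) / inc hl, 1 byte each
EXTRA_BYTES_PER_USE = 3 - 2  # ld de,nn is 3 bytes, the inline .dw was 2

TSTATES_REMOVED = 7 + 6 + 7 + 6  # the four instructions dropped from the routine
TSTATES_ADDED = 10               # ld de,nn at each call site


def net_bytes(uses: int) -> int:
    """Size change of the DE version versus the inline-.dw version (negative = smaller)."""
    return uses * EXTRA_BYTES_PER_USE - ROUTINE_BYTES_SAVED


if __name__ == "__main__":
    for uses in (1, 4, 5, 10):
        print(uses, "uses:", net_bytes(uses), "bytes")
    print("T-states saved per call:", TSTATES_REMOVED - TSTATES_ADDED)
```

By this count the DE version breaks even on size at four uses of the macro and costs one extra byte for every use beyond that.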
http://clap.timendus.com/ - The Calculator Link Alternative Protocol
http://api.timendus.com/ - Make your life easier, leave the coding to the API
http://vera.timendus.com/ - The calc lover's OS
junki
New Member
Posts: 9
Joined: Wed 07 May, 2008 10:45 am
Location: Finland

Post by junki »

sigma wrote:If you want a 16-bit loop counter, never, ever do this:

Code: Select all

- ; Loop body
  ;   .
  ;   .
  ;   .
  ld  a, d
  or  e
  jp  nz, -
That would be useful if each iteration of the loop must take a constant amount of time.

Just nitpicking about the "never, ever" :)
Juha
driesguldolf
Extreme Poster
Posts: 395
Joined: Thu 17 May, 2007 4:49 pm
Location: $4080

Post by driesguldolf »

The timing variation introduced by doing it the correct way shouldn't matter in most (if not all) cases. :P
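For a sense of scale, here is a rough Python model of the loop-control overhead. It rests on two assumptions that are not spelled out in this thread: that the quoted loop does a dec de before testing, and that the "correct way" is the usual nested djnz / dec d / jp nz idiom. Timings are the standard Z80 ones; the function names are only for illustration.

```python
# Loop-control overhead in T-states (standard Z80 timings).

# Assumption: the quoted loop is  dec de / ld a,d / or e / jp nz,-
NAIVE_PER_PASS = 6 + 4 + 4 + 10      # identical on every pass

# Assumption: the faster loop is  djnz - / dec d / jp nz,-
DJNZ_TAKEN = 13                      # the common case
DJNZ_FALLTHROUGH = 8 + 4 + 10        # djnz not taken, then dec d and jp nz


def naive_overhead(passes: int) -> int:
    """Total loop-control cost of the constant-time loop."""
    return NAIVE_PER_PASS * passes


def nested_overhead(passes: int) -> int:
    """Total loop-control cost of the nested idiom for 1..65536 passes."""
    outer = (passes + 255) // 256    # how often djnz falls through
    return DJNZ_TAKEN * (passes - outer) + DJNZ_FALLTHROUGH * outer


if __name__ == "__main__":
    for n in (1, 256, 1000, 65536):
        print(n, naive_overhead(n), nested_overhead(n))
```

Under those assumptions the nested loop spends 13 T-states on most passes and 22 on the pass where B wraps, so the jitter is 9 T-states roughly once every 256 passes, while the constant-time loop pays 24 every time.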
King Harold
Calc King
Posts: 1513
Joined: Sat 05 Aug, 2006 7:22 am

Re: [TI ASM] Optimizations

Post by King Harold »

Amazing that I never thought of this before, but in a multiplication you can actually stop after the operand that you are shifting out to test the bits becomes zero (not when, but after, very important difference) because you will never add anything to the result from that point onwards. It makes the loop slightly slower, but you will have an early exit in many cases - an early exit that often saves more cycles than it adds to the loop itself, and you won't need a loop counter (which is very cool when you're multiplying bigger things and you need all the registers you can get)

Proof of concept:

Code: Select all

DE_times_A:
   ld hl,0
   or a ;have to reset carry
_loop:
   rra
   jr nc,_skip
   add hl,de
_skip:
   sla e
   rl d
   or a   ;slightly slower than the usual djnz
      ;as a bonus it will reset the carry, needed for the rra
   jr nz,_loop  ;4+12 vs 13
   ret
Pro:
* early exit saves a lot of time for small values of A
* BC is not used
* intro is 3 cc's faster (well, that's nothing..)
Con:
* slightly slower loop (3 cc's per iteration more)
Neither:
* exactly the same code size as the usual algorithm

disclaimer: I haven't slept much for a while due to the high temperature, it could be that I'm completely out of my mind. Please notify me if that's the case..
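One way to check the idea without a calculator at hand is a small Python model of the routine. This is only a sketch: it mirrors the register operations by hand (the function name and the 16-bit masking are my own), and it checks that the early exit still gives DE × A modulo 65536.

```python
from typing import Tuple


def de_times_a(de: int, a: int) -> Tuple[int, int]:
    """Model of the early-exit routine. Returns (HL, number of loop iterations)."""
    hl = 0
    iterations = 0
    while True:
        iterations += 1
        carry = a & 1                  # rra with carry clear: bit 0 goes to carry
        a >>= 1
        if carry:
            hl = (hl + de) & 0xFFFF    # add hl,de, with 16-bit wraparound
        de = (de << 1) & 0xFFFF        # sla e / rl d
        if a == 0:                     # or a / jr nz,_loop falls through
            return hl, iterations


if __name__ == "__main__":
    for de in (0, 1, 300, 0x1234, 0xFFFF):
        for a in range(256):
            hl, n = de_times_a(de, a)
            assert hl == (de * a) & 0xFFFF
            assert n == max(1, a.bit_length())
    print("early-exit multiply matches DE * A for all tested inputs")
```

The iteration count equals the position of the highest set bit of A plus one, with A = 0 still taking one pass through the loop.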
King Harold
Calc King
Posts: 1513
Joined: Sat 05 Aug, 2006 7:22 am

Re: [TI ASM] Optimizations

Post by King Harold »

So, now for a little cc analysis.
* one iteration of the old version takes 45 or 51 cc's (unless it is the last, then it's 5 less)
* (3*X)-3 cc's are added in the new version where X is the number of iterations
* 45*(8-X) cc's are saved where X is the number of iterations
* in the worst case, X=8 and 21 cycles are added.
* in the best case, X=1 (note: it can't be 0) it's 315-0=315 cc's faster
* the second-worst case is X=7: 45-18=27 cc's faster
* all timing differences: -315, -267, -219, -171, -123, -75, -27, 21
* the average over all 256 values of A (that is, 0.5 * 21 + 0.25 * -27 etc., with A=0 and A=1 both giving X=1) is: about -26.6 cc's
I apologize in advance for all errors that I will have made.
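The figures can be cross-checked with a Python model that adds up standard Z80 timings for both routines (jr 12/7, djnz 13/8, add hl,de 11, sla/rl 8, rra/rrca/or 4, ld hl,nn 10, ld b,n 7, ret 10). It is a sketch only: the function names are mine, and the average assumes every value of A from 0 to 255 is equally likely.

```python
def old_cycles(a: int) -> int:
    """T-states of the fixed 8-iteration djnz version for a given A."""
    total = 10 + 7                          # ld hl,0 / ld b,8
    for i in range(8):
        bit = (a >> i) & 1
        total += 4                          # rrca
        total += (7 + 11) if bit else 12    # jr nc not taken + add hl,de, or jr nc taken
        total += 8 + 8                      # sla e / rl d
        total += 13 if i < 7 else 8         # djnz taken, or falling through on the last pass
    return total + 10                       # ret


def new_cycles(a: int) -> int:
    """T-states of the early-exit version for a given A."""
    total = 10 + 4                          # ld hl,0 / or a
    while True:
        bit = a & 1
        a >>= 1
        total += 4                          # rra
        total += (7 + 11) if bit else 12
        total += 8 + 8 + 4                  # sla e / rl d / or a
        if a == 0:
            return total + 7 + 10           # jr nz not taken, then ret
        total += 12                         # jr nz taken


if __name__ == "__main__":
    diffs = [new_cycles(a) - old_cycles(a) for a in range(256)]
    print(sorted(set(diffs)))
    print(sum(diffs) / 256)
```

In this model the difference depends only on the number of iterations X and follows 3*X - 3 - 45*(8-X), which reproduces the eight values listed. The uniform average comes out at -26.625 cc's because A=0 and A=1 both exit after one iteration; weighting X=1 by 1/256 instead of 2/256 gives about -25.4.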

Why weren't we all using this way before? Or were we, but just not me?
tr1p1ea
Maxcoderz Staff
Posts: 4141
Joined: Thu 16 Dec, 2004 10:06 pm
Location: I cant seem to get out of this cryogenic chamber!

Re: [TI ASM] Optimizations

Post by tr1p1ea »

I think most people use this one: http://map.grauw.nl/sources/external/z80bits.html#1.1

"My world is Black & White. But if I blink fast enough, I see it in Grayscale."
King Harold
Calc King
Posts: 1513
Joined: Sat 05 Aug, 2006 7:22 am

Re: [TI ASM] Optimizations

Post by King Harold »

Hm ok, I compared it to:

Code: Select all

DE_Times_A:          ; HL = DE × A
    LD     HL, 0      ; Use HL to store the product
    LD     B, 8       ; Eight bits to check
_loop:
    RRCA             ; Check least-significant bit of accumulator
    JR     NC, _skip  ; If zero, skip addition
    ADD    HL, DE
_skip:
    SLA    E         ; Shift DE one bit left
    RL     D
    DJNZ   _loop
    RET
(asm in 28 days, day 15)
benryves
Maxcoderz Staff
Posts: 3087
Joined: Thu 16 Dec, 2004 10:06 pm
Location: Croydon, England

Re: [TI ASM] Optimizations

Post by benryves »

tr1p1ea wrote:I think most people use this one: http://map.grauw.nl/sources/external/z80bits.html#1.1
*http://baze.au.com/misc/z80bits.html#1.1 ;) I don't know why Google ranks that outdated .nl one higher than the original.