[TI ASM] Optimizations

Posted: Mon 25 Apr, 2005 2:11 am
by DarkAuron
This topic is for all the optimizations you guys know in assembly, from simple opcode tricks to complicated self-modifying code. They should be geared toward situations that come up often, so keep that in mind. I don't see another topic like this, so I thought it'd be a good idea to start one. If each optimization is presented with a brief explanation, maybe we can all learn from them. One more thing: any off-topic posts will most likely be REMOVED. You have been warned.

Now, to start this off and give an example layout, I'll give the most obvious one that pretty much every Z80 programmer knows (or seriously needs to).

Optimizes for speed AND size (saves 3 clock cycles and 1 byte):

Code: Select all

ld a,0
to

Code: Select all

xor a
Good: Commonly used and is a quick, easy optimization.

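Bad: Unlike ld a,0, which leaves the flags alone, xor a changes them: it clears C and N (and also S and H) and sets Z and P/V. So if you really need the flags preserved, this optimization might not work out. Generally it'll be good for ya though.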

Posted: Mon 25 Apr, 2005 2:26 am
by Andy_J

Code: Select all

or a
Not

Code: Select all

cp 0
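Another one everyone needs to know. :) It saves 1 byte and 3 clock cycles. Z, S and C come out the same as with cp 0; only N and P/V can differ.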

Posted: Mon 25 Apr, 2005 2:33 am
by koolmansam375
Load values into register pairs. For example, instead of:

Code: Select all

ld b,$56
ld c,$47
do

Code: Select all

ld bc,$5647

Posted: Mon 25 Apr, 2005 3:05 am
by Andy_J
I keep getting my big/little-endian stuff confused... Would that work like that?

Posted: Mon 25 Apr, 2005 3:10 am
by tr1p1ea
Yes, the Z80 is little endian.

If you don't like hex, you can also do:

Code: Select all

ld bc,(b_num * 256) + c_num
b will equal b_num and c will equal c_num.

This is using TASM, mind you.
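A quick sketch of that expression with concrete numbers. The values 3 and 200 are made up for illustration, and the `.equ` lines assume TASM syntax:

```asm
b_num   .equ 3                  ; hypothetical value wanted in b
c_num   .equ 200                ; hypothetical value wanted in c

        ld bc,(b_num * 256) + c_num   ; 3*256 + 200 = 968 = $03C8
                                      ; so b = $03 and c = $C8 (200)
```

Both values have to fit in a byte (0 to 255) for this to split cleanly. The instruction bytes come out as $01,$C8,$03, low byte first.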

Posted: Mon 25 Apr, 2005 3:16 am
by DarkAuron
Speed optimization but hurts a bit in the space department: unrolling.

Unrolling means copying the same code over and over instead of using a loop. The speed gain comes from not having to jump back on every pass, but the repeated code takes up precious space. Do this only when speed is desperately needed.
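A small sketch of the trade-off, assuming a hypothetical 4-byte area called `buffer` that gets cleared. The routine names are made up too, and the cycle counts in the comments are standard Z80 T-states:

```asm
; Looped version: 6 bytes and 106 T-states, counting from ld b,4 to the last djnz
clear_loop:
        ld hl,buffer
        xor a
        ld b,4                  ; 7 T-states
clearnext:
        ld (hl),a               ; 7
        inc hl                  ; 6
        djnz clearnext          ; 13 when it jumps back, 8 on the last pass
        ret

; Unrolled version: 8 bytes and 52 T-states for the four stores
clear_unrolled:
        ld hl,buffer
        xor a
        ld (hl),a \ inc hl      ; 13 per pair, no counter, no jump
        ld (hl),a \ inc hl
        ld (hl),a \ inc hl
        ld (hl),a \ inc hl
        ret

buffer: .db 0,0,0,0             ; hypothetical 4-byte buffer
```

Here the unrolled form roughly halves the time for 2 extra bytes. With bigger loop bodies or more passes the size cost grows quickly.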

Posted: Mon 25 Apr, 2005 3:24 am
by Spencer

Code: Select all

call xxxx
ret
to

Code: Select all

jp xxxx

Posted: Mon 25 Apr, 2005 3:32 am
by Andy_J
tr1p: Yes, I know it's little-endian, I just meant I keep getting how the data is stored in the source confused.

Spencer: Actually, that's not identical. In the first one, you're branching to xxxx and then returning to whatever called your routine. In the second one, you're going to xxxx and won't go back to whatever called you.

Posted: Mon 25 Apr, 2005 3:42 am
by Spencer
Yes it will. Only loss is one push-pop of the stack.
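A sketch of why the return still works, using made-up routine names (`main`, `outer`, `inner`):

```asm
main:
        call outer              ; pushes the address of the next instruction
        ret                     ; execution resumes here after inner's ret

outer:
        ld a,1
        jp inner                ; nothing is pushed, so the top of the stack
                                ; is still the return address into main

inner:
        inc a
        ret                     ; pops main's return address, skipping outer
```

Compared with `call inner` followed by `ret`, the `jp` saves 1 byte and 17 clock cycles (27 for call plus ret, against 10 for jp). The only difference `inner` could notice is that the stack holds one less return address while it runs.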

Posted: Mon 25 Apr, 2005 3:44 am
by Jim e
What Spencer did should work for most purposes, though it is not exactly the same thing.
And as far as little endian goes, B will always hold the MSB, even though the value is stored in memory LSB first.


Oh, and here's an unrolling I don't see often.

Code: Select all

;instead of

	ld hl,gbuf
	ld de,pic
	ld bc,768
	ldir

;try

	ld hl,gbuf
	ld de,pic
	ld bc,768
copyfast:
	ldi \ ldi \ ldi \ ldi
	ldi \ ldi \ ldi \ ldi   ;bc must be divisible by 16
	ldi \ ldi \ ldi \ ldi
	ldi \ ldi \ ldi \ ldi
	jp pe,copyfast

Posted: Mon 25 Apr, 2005 3:51 am
by Andy_J
Err, yeah, I guess that will work... I think I should go to sleep, ehh?

Posted: Mon 25 Apr, 2005 4:58 am
by Dwedit
/me shuts his mouth...

Posted: Mon 25 Apr, 2005 5:48 am
by DarkAuron
LDIR takes 21 clock cycles per byte copied, while LDI itself is just 16 clock cycles. Thus there's a 5 cycle gain per LDI, or 5*16-10 (70) cycles gained per jump, for a total of about 3360 clock cycles over the 48 jumps. Unless I'm missing something here. And wait, shouldn't that be jp po,copyfast? And where'd that ret come from? :D

Posted: Mon 25 Apr, 2005 1:44 pm
by Jim e
hehe... The ret is my mistake. :oops:

Oh, but parity even (pe) means the P/V flag is set. After ldi, P/V is only reset once bc has reached zero, so jp pe keeps looping while there are bytes left.
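Annotating the loop with the flag behaviour and timings may help here. This is a sketch that reuses `gbuf` and `pic` from the earlier post and assumes they are defined elsewhere. The counts are standard Z80 T-states, and they leave out the three setup loads because both versions need them:

```asm
        ld hl,gbuf              ; source
        ld de,pic               ; destination
        ld bc,768               ; 768 = 48 * 16
copyfast:
        ldi \ ldi \ ldi \ ldi   ; each ldi: 16 T-states, copies (hl) to (de),
        ldi \ ldi \ ldi \ ldi   ; then hl+1, de+1, bc-1
        ldi \ ldi \ ldi \ ldi   ; P/V is set while bc <> 0 after the ldi,
        ldi \ ldi \ ldi \ ldi   ; and reset once bc reaches 0
        jp pe,copyfast          ; 10 T-states; pe = P/V set = bytes remain

; ldir:     767 * 21 + 16       = 16123 T-states
; unrolled: 768 * 16 + 48 * 10  = 12768 T-states
; saved:                           3355 T-states
```

The exact saving comes out 5 cycles under the 3360 estimate because ldir's final iteration only takes 16 cycles, not 21.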

Posted: Mon 25 Apr, 2005 2:04 pm
by tr1p1ea
I am speechless .... O_O.