Optimizing an Imaginary Sprite Part 2


Listing 1 – Unoptimized

This is the first, unoptimized version of the code. The sprite data sits at address $4000 and ends at the label sp_end. The byte values, in hex, are $00,$00,$A0,$0a,$a4,$aa. The number in the comment field is the cycle count, that is, the number of clock cycles the instruction takes to execute. There is a way to get these counts from lwasm, but it beats me how to do it on the command line. The background, backgnd, is at address $4100, and the destination, mixbuf, is at $4200. mixbuf is what you inspect once the program has finished.
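Before reading the assembly, it can help to have a reference for what mixbuf should end up holding. The following is only a sketch in Python, not part of the original program. It assumes, as the expected result in the listing suggests, that each byte holds two 4-bit pixels and that a zero nibble in the sprite is transparent. The names mix_byte and mix are hypothetical helpers introduced here for illustration.

```python
def mix_byte(sprite: int, background: int) -> int:
    """Blend one byte (two 4-bit pixels); a zero sprite nibble is treated as transparent."""
    mask = 0
    if sprite & 0xF0:          # high nibble of the sprite is set, so keep it
        mask |= 0xF0
    if sprite & 0x0F:          # low nibble of the sprite is set, so keep it
        mask |= 0x0F
    # Sprite where the mask is set, background everywhere else.
    return (sprite & mask) | (background & ~mask & 0xFF)


def mix(sprite: bytes, background: bytes) -> bytes:
    """Blend sprite byte i with background byte i."""
    return bytes(mix_byte(s, b) for s, b in zip(sprite, background))


# Same data as the fcb lines in sprite1.asm.
sprite = bytes([0x00, 0x00, 0xA0, 0x0A, 0xA4, 0xAA])
backgnd = bytes([0xCD, 0xDD, 0xDD, 0xAD, 0xBD, 0xCD])

mixbuf = mix(sprite, backgnd)
print(mixbuf.hex(","))  # cd,dd,ad,aa,a4,aa
```

One difference worth knowing: this model always pairs sprite byte i with background byte i, whereas the copy path in the listing has no ldb ,y+ and so does not appear to advance y for a fully opaque byte. With this test data both produce the same mixbuf.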

sprite1.asm

                org     $3f00

                ldu     #mixbuf         ;3 destination buffer (buffer 3)
                ldx     #sprite         ;3 pointer to sprite stream
                ldy     #backgnd        ;4 pointer to background

again           lda     ,x              ;4 get sprite byte
                bne     nxt             ;3 not 0, so we have to do something

rezero
                leax    1,x             ;5 was zero, inc to next byte
                ldb     ,y+             ;6 get bgnd and inc y
                stb     ,u+             ;6 store background and inc u

                lda     ,x              ;4 get next byte
                beq     rezero          ;3 still zero, keep copying background

nxt             bita    #$0f            ;2 is the low nibble empty?
                beq     poo             ;3 yes, mix with the background
                bita    #$f0            ;2 is the high nibble empty?
                beq     poo             ;3 yes, mix with the background
                bra     copy            ;3 both nibbles set, plain copy

poo             ldb     ,y+             ;6 get background byte + inc
                anda    #$0f            ;2 is the sprite's low nibble set?
                bne     high_set        ;3 yes, keep background's high nibble

                lda     ,x              ;4 reload sprite byte
                anda    #$f0            ;2 keep sprite's high nibble
                andb    #$0f            ;2 keep background's low nibble
                bra     low_set         ;3

high_set        andb    #$f0            ;2 keep background's high nibble
low_set         stb     dd+1            ;5 patch the operand of the ora below

dd              ora     #0              ;2 merge background nibble (self-modified)

copy            sta     ,u+             ;6 store result and inc u
                leax    1,x             ;5 next sprite byte
spend           cmpx    #sp_end         ;4 end of sprite?
                blo     again           ;3 no, keep going
render          rts                     ;5

                org     $4000
sprite          fcb     $00,$00,$A0,$0a,$a4,$aa
sp_end
                org     $4100
backgnd         fcb     $CD,$dd,$dd,$ad,$bd,$cd

                org     $4200
mixbuf          fcb     $00,$00,$00,$00,$00,$00
;expected result        $cd,$dd,$ad,$aa,$a4,$aa

The listing above includes the cycle counts, so let's get to optimizing!
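To have a baseline to compare later versions against, the comment-field cycle counts can be tallied for each kind of sprite byte. The Python sketch below is not from the original article. The instruction sequences in PATHS are my own reading of which branches each kind of byte takes, and CYCLES, PATHS and path_cycles are hypothetical names. Every path starts at again, except "transparent", which is one trip around the rezero loop.

```python
# Cycle counts as given in the comment field of the listing (6809).
CYCLES = {
    "lda ,x": 4, "ldb ,y+": 6, "stb ,u+": 6, "sta ,u+": 6, "stb dd+1": 5,
    "leax 1,x": 5, "cmpx #": 4, "bita #": 2, "anda #": 2, "andb #": 2,
    "ora #": 2, "bne": 3, "beq": 3, "bra": 3, "blo": 3,
}

# Shared tail of every non-transparent byte: copy .. blo again
TAIL = ["sta ,u+", "leax 1,x", "cmpx #", "blo"]
# Shared merge step: low_set .. dd
MERGE = ["stb dd+1", "ora #"]

# Assumed instruction sequence executed for each kind of sprite byte.
PATHS = {
    "transparent": ["leax 1,x", "ldb ,y+", "stb ,u+", "lda ,x", "beq"],
    "opaque": ["lda ,x", "bne", "bita #", "beq", "bita #", "beq", "bra"] + TAIL,
    "low nibble only": ["lda ,x", "bne", "bita #", "beq", "bita #", "beq",
                        "ldb ,y+", "anda #", "bne", "andb #"] + MERGE + TAIL,
    "high nibble only": ["lda ,x", "bne", "bita #", "beq",
                         "ldb ,y+", "anda #", "bne",
                         "lda ,x", "anda #", "andb #", "bra"] + MERGE + TAIL,
}


def path_cycles(name: str) -> int:
    """Sum the listed cycle counts along one path."""
    return sum(CYCLES[op] for op in PATHS[name])


for name in PATHS:
    print(f"{name:18s}{path_cycles(name):3d} cycles")
```

Under these assumptions a transparent byte inside the rezero loop costs 24 cycles, a fully opaque byte 38, a byte with only the low nibble set 55, and a byte with only the high nibble set 59. The first zero byte reached from again pays an extra 7 cycles for the lda and bne before entering the loop.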

Next, Version 2 – A little fix