SEGA Genesis: Structure

Assembly language being lower level than other programming languages makes it very structurally open. It's an interesting language for a computer scientist to work with, as it's open for implementation of any existing or new programming paradigm.

In the following, I will discuss approaches to variables and types on the Genesis, as well as subroutine calling conventions, and an example approach to implementing the State Pattern.

Variables

In assembly code, a variable can be represented by a register or a memory location. The Motorola 68000 has 15 normal registers D0-D7 and A0-A6 (A7 is used as the stack pointer), which will work nicely for local variables in subroutines:

    move.w    #$7,d0    ; int x = $7
    move.w    #$F000,d1 ; int y = $F000
    add.w     d0,d1     ; y += x
    lea       str,a0    ; char *s = str
    move.b    (a0),d2   ; char c = *s

Global variables are useful for game state, and is most easily implemented using a fixed memory location. Here is one way to declare a variable in memory:

; this doesn't work
my_value:
    ds.w    7 ; int my_value = 7

The variable will be located in memory next to the code, which should work fine on most architectures. However, we're making a SEGA Genesis ROM, and the code is located in read-only memory on the cartridge. So my_value cannot be changed and will thus be constant.

Contra: Hard Corps ROM

If you want a mutable variable, you need to place it in RAM instead. My first naive attempt was to place it in address 0:

; this doesn't work
my_value = $00000000
    move.w  #7,my_value  ; my_value = 7 ?

This would have worked, if address 0 was in RAM, but it isn't. From Charles MacDonald's 'Sega Genesis hardware notes':

$000000-$3FFFFF : ROM
$400000-$7FFFFF : Unused (1)
$800000-$9FFFFF : Unused (2)
$A00000-$A0FFFF : Z80 address space (3)
$A10000-$A1001F : I/O
$A10020-$BFFFFF : Internal registers and expansion (4)
$C00000-$DFFFFF : VDP (5)
$E00000-$FFFFFF : RAM (6)

Address 0 is in ROM, making our variable constant. Our work RAM is in the address space $E00000-$FFFFFF (that's 2 MB). The Genesis only has 64 KB RAM ($10000), but the memory is repeated over and over again in the address space. We can just limit ourselves to using $E00000-$E0FFFF.

So this works:

my_value = $E00000
    move.w  #7,my_value  ; my_value = 7

Using the RAM address space, we can build a memory map of global variables, accessible from any part of our program:

mode  = $E00000    ; 2B
xoff  = $E00002    ; 2B
yoff  = $E00004    ; 2B
str   = $E00006    ; 32B string
input = $E00026    ; 1B

Avoiding variables overlapping in memory is done manually.

In the above example, I reserved 32 B for the str variable, which can be interpreted as an array of 32 single characters. Arrays are discussed in further detail below.

Arrays and Structures

Given a global array of characters like this:

str   = $E00006    ; 32B string

We can access any character using a constant index:

lea     str,a0
move.b  0(a0),d0 ; d0 = str[0]
move.b  1(a0),d1 ; d1 = str[1]

Or using pointer arithmetic:

lea     str,a0
move.b  (a0),d0  ; d0 = *str (str[0])
add.l   #1,a0    ; str++
move.b  (a0),d0  ; d0 = *str (str[1])

The C language has the concept of a structure, a collection of variables that are closely related. The widely used concept of classes is an extension of the structure concept: a structure with an accompanying set of functions that all accept a pointer to a structure as their first argument.

The structure concept is useful in game programming, especially when dealing with large numbers of very similar objects, such as enemies or bullets.

Any set of related variables could be considered an instance of a structure. Let's use this enemy structure as an example:

; struct enemy
enemy_x  = $E00100 ; int16 x
enemy_y  = $E00102 ; int16 y
enemy_dx = $E00104 ; int16 x
enemy_dy = $E00106 ; int16 y

It's useful to have an array of structures, in the case of enemy, we can allocate a pool of enemy instances.

Two approaches to storing an array of enemies immediately come to mind - either we could store instances of enemy structures sequentially:

enemy 0    enemy 1    enemy 2         enemy N
base+$00   base+$08   base+$10        base+N*8
x,y,dx,dy  x,y,dx,dy  x,y,dx,dy  ...  x,y,dx,dy

Or alternatively, we could store arrays of each member of enemy:

enemy x: base+$00
x0,x1,x2, ... xN

enemy y: base+N*2
y0,y1,y2, ... yN

enemy dx: base+(N+1)*2
dx0,dx1,dx2, ... dxN

enemy dy: base+(N+2)*2
dy0,dy1,dy2, ... dyN

The Motorola 68000 does not have a cache, caching was only present in the Motorola processors 68010 and newer, so performance on the 68000 only really depends on how many clock cycles a given algorithm uses. Unlike most newer CPUs, memory access patterns are not a factor.

To investigate the structure sequence and member array approaches to, let's consider this simple update algorithm:

for each enemy E
    E.x += E.dx
    E.y += E.dy

Here is an implementation using the member array approach:

    lea    enemy_x,a0
    lea    enemy_y,a1
    lea    enemy_dx,a2
    lea    enemy_dy,a3
    move.w #enemy_count-1,d0
.loop
    add.w  (a2),a0
    add.w  (a3),a1
    add.l   #2,a0
    add.l   #2,a1
    add.l   #2,a2
    add.l   #2,a3
    dbra    d0,.loop

And here is an implementation using the structure sequence approach:

    move.w #enemy_count-1,d0
    lea    enemy_table,a0
.loop
    add.w  enemy_dx(a0),enemy_x(a0) ; this probably doesn't work
    add.w  enemy_dy(a0),enemy_y(a0) ; this probably doesn't work
    add.l  enemy_size,a0
    dbra   d0,.loop

COUNT CYCLES.

Function Calling Conventions

The other guys just don't stack up

In many programming languages, parsing parameters into and out of functions and methods is hidden by the language implementation. For example, the formal parameters may be pushed onto the stack by the calling code before giving executing to the function body itself. In assembly code, you have to make these decisions yourself.

The simplest implementation would be defining input and output registers for a function:

      result = Add(parameterA, parameterB)
        D0             D0          D1

This function could be implemented like this:

Add:
    add.w   D1,D0
    rts

And called like this:

    ; D0 = Add(5,7)
    move.w  #5,D0   ; parameterA = 5
    move.w  #7,D1   ; parameterA = 7
    jsr     Add

This works, as long as you're careful with which registers are overwritten by functions.

However, when you have multiple levels of function calls, it goes bad almost immediately:

Multiply:
    move.w  d0,d2    ; Multiply destroys D2
.loop:
    add.w   d2,d0
    dbra    d1,.loop
    rts
    ; result = Power( x, power )
    ;   D0            D0   D1

Power:
    move.w  d1,d2    ; But D2 is also used by Power... Uh oh
.loop:
    move.w  d2,d0
    jsr     Multiply ; When this is done, D2 is destroyed
    dbra    d2,.loop ; argh
    rts

To avoid problems, our function can save the state of the registers that will be overwritten onto the stack:

Multiply:

    move.l  d1,-(sp)  ; push D1 onto stack
    move.l  d2,-(sp)  ; push D2 onto stack

    move.w  d0,d2     ; destroys D2, but we already saved it
.loop:
    add.w   d2,d0
    dbra    d1,.loop  ; here we subtract from D1

    ; Restore D1 and D2 from the stack.
    ; Since we pushed D1 before D2, we must pop them
    ; in the opposite order:
    move.l  (sp)+,d2  ; restore D2 from the stack
    move.l  (sp)+,d1  ; restore D1 from the stack

    rts

We can use the 'movem' instruction to save and restore multiple registers:

    movem.l d1-d2,-(sp)     ; push d1, d2 to the stack
    movem.l (sp)+,d1-d2     ; pop d2, d1 from the stack

If you don't wan't to figure out which registers are overwritten, you can save and restore all of them:

    movem d0-d7/a0-a6,-(sp) ;    push all registers onto the stack
    movem (sp)+,d0-d7/a0-a6 ; restore all registers from the stack

It's obviously faster to only push the ones you actually overwrite.

State Pattern

My menu

I wanted to implement a State Pattern in 68000 assembly to make a menu structure. My first implementation is based on the way I would implement it in C#, like this:

  delegate void StateMethod();

  void Test() {
      StateMethod state = StateA;
      state();
  }
  void StateA() { }
  void StateB() { }

The 68000 assembly version could look like this:

 state = $E00000            ; function ptr location
    move.l  #StateA,state   ; state = StateA
 .loop;
    move.l  state,a0        ; a0 = *(state)
    jsr     (a0)            ; jumps to a0, call StateA
    bra     .loop
 StateA:
    rts
 StateA:
    rts

After looking at the Sonic the Hedgehog disassembly, I learned how to do an equivalent implementation based on a jump table:

     move.w  #0,d0             ; function index = 0 (StateA)
     lsl.w   #2,d0             ; d0 *= sizeof( function pointer ) = 4
     jsr     DoState(pc,d0.w)  ; invoke jump table

 DoState:              ; Jump table
     bra.w     StateA  ; offset 0
     bra.w     StateB  ; offset 4
     rts               ; this should never be reached
 StateA:
     rts
 StateB:
     rts

I ended up using the jump table, as adding new functions was trivial compared to assigning new function pointers.

References

http://meseec.ce.rit.edu/eecc250-winter99/250-final-review.pdf

https://computerarchive.org/files/comp/applications/amiga/manual/ProAsm%20-%20Manual-ENG.pdf Info about ProAsm ... has similar features to vasm and a way better manual.