MIPS Assembler

Introduction
Entering Assembler Mode
Basic Syntax
Register Names
Opcodes
Structured Conditionals
Local Labels
Directives

Introduction

Open Firmware includes an assembler for the MIPS instruction set. The same assembler is present in both the development compiler (builder) and the target system.

The Open Firmware assembler is quite different from conventional MIPS assemblers:

Its syntax is postfix - the operands are specified before the opcode, which makes it much easier to support assembler macros.
It supports the opcodes that are actually present in the MIPS hardware instruction set, plus a select few pseudo-ops that generate multiple instructions, instead of synthesizing a whole suite of "instructions" that are not really in the instruction set.
It supports structured conditional directives like if .. else .. then and begin .. until so you rarely need to invent labels for simple conditionals and loops.
It does not reorder the instruction stream to fill delay slots, nor does it attempt to optimize the instruction stream for any particular MIPS implementation's pipeline characteristics. What you write is what you get.
It does not generate or process standard object file formats containing linkage information. Instead, it is intended to be used either for incremental compilation of short "code words" that operate within the framework of Open Firmware's Forth interpreter or for self-contained "dropin modules" that execute during the Open Firmware early startup sequence (typically to do things like configuring the memory controller).
It has only limited support for the floating point (coprocessor) instruction set. The basic floating point instructions are supported, but many of the elaborations are not.

The net result is that the Open Firmware MIPS assembler is much smaller than conventional assemblers. The binary code is approximately 12K bytes, compared to over 2 Mbytes for some versions of gas. It is small enough to leave the assembler resident in the target system, thus making it possible to create short assembly language test sequences at any time, without needing access to a development host.

Entering Assembler Mode

There are several different ways to enter assembler mode, depending on how the assembled code will be used.

code newname assembly code ... c;	code creates (i.e. incrementally adds to the resident Forth dictionary) a new Forth word named newname and enters assembler mode so you can implement the new word in assembly language. c; assembles the appropriate instruction sequence to return control to the Forth execution engine and exits from assembler mode. You can then execute the new word by typing its name, as with any other Forth word.
label newname assembly code ... end-code	label creates a new Forth word named newname and enters assembler mode so you can assemble some machine instructions that will be used within the context of the current Forth dictionary, but that will not be called directly as a Forth word. end-code exits from assembler mode. Later execution of newname pushes on the Forth stack the beginning address of that sequence of machine instructions. label is typically used to create instruction sequences, for example interrupt handlers, that will be executed in a context other than from the Forth virtual machine. It is also possible to mix the constructs: code newname ... end-code creates a Forth word that can be executed by typing its name, but that will not return to Forth. label newname ... c; creates a code sequence that could be entered from elsewhere (e.g. from an exception handler), but which would enter Forth when it was done.
start-assembling assembly code ... end-assembling	start-assembling .. end-assembling is used to create a block of code that will later be written out to a file (often in the form of a "dropin module") for use in a target system environment. See Directives.

Basic Syntax

The assembler is really just a Forth vocabulary containing Forth words that put MIPS machine instructions in memory when they are executed.
The basic syntax of the Open Firmware MIPS assembler is postfix, just like ordinary Forth code. You first execute Forth words to push items on the stack - those items typically identify registers, numbers, or addresses, i.e. the operands. Then you execute a word that names the opcode, which assembles the machine instruction corresponding to that opcode with those operands.

The order of the operands is generally source first, then destination.

Example:

v0 20 t3 ld

This is equivalent to "ld $t3,20($v0)" in conventional syntax.
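
The postfix scheme can be modelled in a few lines of Python. This is only an illustrative sketch, not the Open Firmware implementation: the `assemble` function and the `REGS` table are hypothetical names, only the two registers and the one opcode from the example are handled, and numbers are parsed as decimal purely to match the conventional-syntax rendering above (the real assembler uses the current Forth number base).

```python
# Toy model of postfix assembly, not the Open Firmware implementation.
REGS = {"v0": 2, "t3": 11}   # standard MIPS register numbers
LD_OPCODE = 0x37             # major opcode field of the MIPS "ld" instruction

def assemble(source):
    stack, code = [], []
    for word in source.split():
        if word in REGS:
            stack.append(REGS[word])          # register word: push its number
        elif word == "ld":                    # opcode word: ( rs imm rt -- )
            rt, imm, rs = stack.pop(), stack.pop(), stack.pop()
            code.append((LD_OPCODE << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF))
        else:
            stack.append(int(word))           # number (decimal: an assumption)
    return code

print([hex(w) for w in assemble("v0 20 t3 ld")])   # ['0xdc4b0014']
```

Because each word acts as soon as it is read, no lookahead or operand parsing is needed, which is what makes macros easy to layer on top.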

When the assembler is active, the vocabulary containing the MIPS opcode and operand words is first in the search order, but the "forth" vocabulary containing all the other Forth tools is also in the search order, so you can execute ordinary Forth commands while assembling MIPS code. This is useful for things like including other files (by using the "fload" command) and calculating the values of numeric constants.

Register Names

Executing the name of a MIPS register pushes on the stack a number that identifies that register. The register names are:

Names	Description
$0 .. $31	Basic names for the 32 integer registers
$f0 .. $f31	Basic names for the 32 floating point registers
v0 .. v1, k0 .. k1, t0 .. t9, s0 .. s8, $a0 .. $a3, $sp, $gp, ra, $at, $kt0 .. $kt1	Aliases for certain integer registers, reflecting their conventional use by MIPS compilers.
sp, rp, up, ip, np, base, tos, w	Aliases for certain integer registers, reflecting their use within the Forth virtual machine: sp=$sp, rp=s6, up=s3, ip=s5, np=s1, base=s2, tos=s4, w=t0

The inconsistent naming (some alias names begin with $ and some don't) is a historical accident - in part it is due to the fact that the names a0, a1, a2, and a3 conflict with hex numbers.
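
As a rough illustration, the Forth virtual machine aliases listed above can be resolved to register numbers as follows. The `register_number` helper and both tables are hypothetical names for this sketch, and the numbers assume the standard MIPS convention (t0 = $8, s0 .. s7 = $16 .. $23, $sp = $29). The last line shows why a bare a0 is ambiguous when the number base is hex.

```python
# Conventional MIPS register numbers for the names used by the aliases.
CONVENTIONAL = {"t0": 8, "s1": 17, "s2": 18, "s3": 19,
                "s4": 20, "s5": 21, "s6": 22, "$sp": 29}

# Forth virtual machine aliases, as listed in the text above.
FORTH_VM = {"sp": "$sp", "rp": "s6", "up": "s3", "ip": "s5",
            "np": "s1", "base": "s2", "tos": "s4", "w": "t0"}

def register_number(name):
    """Resolve a Forth VM alias or a conventional name to a register number."""
    return CONVENTIONAL[FORTH_VM.get(name, name)]

print(register_number("tos"))   # 20
print(int("a0", 16))            # 160: "a0" is also a valid hex number
```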

Opcodes

rs - A register (in load and store instructions, this register is the one that supplies the base address)
rt - A register (in load and store instructions, this register is the one that receives or supplies the data)
imm - immediate operand

Opcodes	Stack Operands	Comments
ldl, ldr, lb, lh, lwl, lw, lbu, lhu, lwr, lwu, cache, ll, lwc1, lwc2, pref, lld, ldc1, ldc2, ld	( rs imm rt -- )	Load instructions - rt receives the data
sb, sh, swl, sw, sdl, sdr, swr, sc, swc1, swc2, scd, sdc1, sdc2, sd	( rt rs imm -- )	Store instructions - rt supplies the data
add, addu, sub, subu, and, or, xor, nor, slt, sltu, dadd, daddu, dsub, dsubu	( rs rt rd -- )	Two-operand instructions - rs OP rt -> rd
lui	( imm rt -- )	Load upper immediate - (imm << 16) -> rt
sethi	( imm1 rt -- )	Alternative form of lui - you specify the actual immediate value that you want to appear in the register. E.g. h# 0f450000 $a2 sethi
addi, addiu, slti, sltiu, daddi, daddiu, andi, ori, xori	( rs imm rt -- )	Two-operand immediate instructions - rs OP imm -> rt
j, jal	( adr -- )	Jump/Jump and link to target address adr
sll, srl, sra, dsll, dsrl, dsra, dsll32, dsrl32, dsra32	( rt imm rd -- )	Shift instructions - rt SHIFT imm -> rd
sllv, srlv, srav, dsllv, dsrlv, dsrav	( rt rs rd -- )	Variable shift instructions - rt SHIFT rs -> rd
mult, multu, div, divu, dmult, dmultu, ddiv, ddivu, tge, tgeu, tlt, tltu, teq, tne	( rs rt -- )	Two-operand instructions with implicit destination register
mfhi, mthi, mflo, mtlo	( rd -- )	Access to HI and LO registers. rd is the source or destination as appropriate.
beq, bne, beql, bnel	( adr rs rt -- )	Two-operand conditional branches - branch to address 'adr' if rs COND rt
blez, bgtz, blezl, bgtzl, bltz, bgez, bltzl, bgezl, bltzal, bgezal, bltzall, bgezall	( adr rs -- )	One operand conditional branches - branch to address 'adr' if COND(rs)
bra, bal	( adr -- )	Branch, branch and link (the former is a special case of a conditional branch)
bc0f, bc0fl, bc0t, bc0tl, bc1f, bc1fl, bc1t, bc1tl	( adr -- )	Coprocessor branches
tgei, tgeiu, tlti, tltiu, teqi, tnei	( rs imm -- )	Trap immediate - trap if rs COND imm
syscall, break, sync, dbreak	( -- )	Zero-operand instructions
mtdr, mfdr	( dbreg rt -- )	Access to debug registers
single, double, float	( -- )	Set the mode for subsequent floating point instructions
addf, subf, mulf, divf	( fs ft fd -- )	Two-operand floating point instructions - fs OP ft -> fd
sqrt, abs, movf, negf, round.w, trunc.w, ceil.w, floor.w, cvt.s, cvt.d, cvt.w, cxx	( fs fd -- )	One-operand floating point instructions - OP(fs) -> fd
mfc1, cfc1, mtc1, ctc1	( fs rt -- )	Floating-point coprocessor-register to integer-register instructions
mfc0, mtc0, dmfc0, dmtc0	( cpreg rt -- )	System coprocessor-register to integer-register instructions
tlbp, tlbr, tlbwi, tlbwr, eret	( -- )	TLB and exception instructions
.s, .d, .w	( -- )	Floating point formats
dset	( low high reg -- )	Assemble code to put the 64-bit number low, high into the register
li, set	( n reg -- )	Assemble code to put the 32-bit number n into the register (li and set are equivalent)
la	( adr dst -- )	Like li but sets a relocation bit for the Forth dictionary file.
brif	( adr cond -- )	Assemble a branch from a conditional constructor (see below)
=, <>	( rs rt -- cond )	Constructor for two-operand structured conditionals - rs COND rt
0<=, 0>, 0<, 0>=	( rs -- cond )	Constructor for one-operand structured conditionals - COND(rs)
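
The difference between lui and sethi in the table above can be sketched in Python. The function names mirror the opcodes but are otherwise hypothetical; rejecting a sethi value with nonzero low bits is an assumed policy, and the lui + ori expansion shown for li is only one common way to load a 32-bit constant, not necessarily the sequence this assembler emits.

```python
LUI = 0x0F   # major opcode field of lui
ORI = 0x0D   # major opcode field of ori

def lui(imm, rt):
    """( imm rt -- ) imm is the raw 16-bit field; the register gets imm << 16."""
    return (LUI << 26) | (rt << 16) | (imm & 0xFFFF)

def sethi(value, rt):
    """( imm1 rt -- ) value is the number that should appear in the register."""
    if value & 0xFFFF:
        raise ValueError("low 16 bits must be zero")   # assumed policy
    return lui(value >> 16, rt)

def li(n, rt):
    """Assumed expansion: lui for the upper half, then ori for the lower half."""
    n &= 0xFFFFFFFF
    return [lui(n >> 16, rt),
            (ORI << 26) | (rt << 21) | (rt << 16) | (n & 0xFFFF)]

A2 = 6   # $a2 is integer register 6
print(hex(sethi(0x0F450000, A2)))   # 0x3c060f45
print(hex(lui(0x0F45, A2)))         # 0x3c060f45, the same instruction
```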

Structured Conditionals

The MIPS assembler includes macros for constructing common flow control structures without having to make up label names. The structures shown below can be nested to arbitrary depth.
The easiest way to deal with branch delay slots is just to fill them with nop's.
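In the following examples, the delay-slot instructions are indented the same as the code preceding their branch instruction, to emphasize that the delay instruction is logically a part of the preceding code flow. When typing these examples, put a line break after each "\ Delay" comment, since a "\" comment extends to the end of the line.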

Usage Form	Example	Equivalent Code (in noreorder mode)	Comments
cond if .. then	`t3 t4 <> if nop \ Delay t3 1 t3 addiu then`	`beq $t3,$t4,1f nop addiu $t3,$t3,1 1:`	1) The conditional in this case is "t3 t4 <>", i.e. not equal, which takes two register operands. 2) Note the branch delay slot after the "if". 3) There is no delay slot after "then" because it just marks the target of a forward branch.
cond if .. else .. then	`t2 0> if $7 0 $4 lw \ Delay sp -4 sp addiu $a1 $a3 mult else $3 $4 $5 add \ Delay $7 4 $4 lw then`	`blez $t2,1f lw $4,0($7) addiu $sp,$sp,-4 mult $a1,$a3 b 2f add $5,$3,$4 1: lw $4,4($7) 2:`	1) The conditional in this case is "t2 0>", i.e. t2 greater than zero. 2) Note the branch delay slots after "if" and "else", because of the implicit branch around the else clause.
begin .. again	`begin again t3 0 t4 lw \ Delay`	`1: beq $0,$0,1b lw $t4,0($t3)`	1) In this infinite-loop example, there are no other instructions between "begin" and "again". The important work is done in the delay slot after the unconditional branch that "again" generates. It is of course possible to put instructions between "begin" and "again". 2) Note that there is no delay slot after "begin", because "begin" does not generate a branch instruction; it marks the target of a backward branch.
begin .. cond until	`begin s0 s1 s1 add t3 t4 t5 add t5 t6 = until nop \ Delay`	`1: add $s1,$s0,$s1 add $t5,$t3,$t4 bne $t5,$t6,1b nop`	1) "until" generates a conditional backward branch, so it has a delay slot 2) "\ Delay" is just a comment to remind you what is happening. It does not affect the code that is generated.
begin .. cond while ... repeat	`begin $a0 8 $a1 lw $a1 $0 <> while nop \ Delay $a0 4 $a0 lw repeat nop \ Delay`	`1: lw $a1,8($a0) beq $a1,$0,2f nop lw $a0,4($a0) beq $0,$0,1b nop 2:`	1) "while" generates a conditional forward branch past the "repeat", so it has a delay slot. 2) "repeat" generates an unconditional backward branch to "begin", so it has a delay slot. 3) Note that the delay instruction can be on the same line as the "repeat" if you wish. This is generally true; since the assembler syntax is postfix without any need for "lookahead", line boundaries are irrelevant. You can put multiple opcodes on one line if you wish.
ahead .. then	`ahead nop \ Delay 12345 , 6789 , 3 , then`	`beq $0,$0,1f nop .word 12345 .word 6789 .word 3 1:`	1) "ahead" generates an unconditional forward branch to "then". It has a delay slot. 2) In this example we are using it to skip some data that we have placed in-line. 3) The most common use of unconditional forward branches is subsumed by the "if .. else .. then" construct, so "ahead" is rarely used.
ahead .. but then	`ahead t0 0 t1 lw \ Delay begin t0 4 t2 lw t2 t3 t3 add but then t1 $0 = until t1 -1 t1 addiu \ Delay`	`beq $0,$0,1f lw $t1,0($t0) 2: lw $t2,4($t0) add $t3,$t2,$t3 1: bne $t1,$0,2b addiu $t1,$t1,-1`	This is an advanced usage in which two conditional constructs ("begin .. until" and "ahead .. then") are not properly nested. The intention here is to start the loop execution at the test condition at the end. "ahead .. then" is used to branch forward to the test condition. The "but" rearranges the assembler's control flow stack so that "then" can resolve "ahead" without being confused by the intervening "begin".
begin .. cond if .. then cond until	`begin t0 8 t1 lw t1 t2 <> if nop \ Delay t1 t3 t3 add then t1 -5 t1 addiu t1 0< until nop \ Delay`	`1: lw $t1,8($t0) beq $t1,$t2,2f nop add $t3,$t1,$t3 2: addiu $t1,$t1,-5 bgez $t1,1b nop`	This is a straightforward example of well-nested control structures. Note the use of indentation to show the scope of the control flow. Especially note how the delay slots are indented the same as the code preceding the branch.
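
One way to picture how these structures resolve branch targets without labels is a control-flow stack of instruction positions. The Python sketch below is a simplified model under stated assumptions, not the assembler's actual code: the `Asm` class and its method names are hypothetical, instructions are kept as text, and the caller passes the already-inverted branch mnemonic instead of a condition built by words like "<>". It reproduces the "ahead .. but then" row of the table.

```python
class Asm:
    def __init__(self):
        self.code = []   # one [text, target_index] entry per instruction
        self.cf = []     # control-flow stack of instruction indices

    def emit(self, text, target=None):
        self.code.append([text, target])

    # Forward branches: remember where the branch is so "then" can patch it.
    def if_(self, inverse_branch):
        self.cf.append(len(self.code))
        self.emit(inverse_branch)

    def ahead(self):
        self.if_("beq $0,$0")

    def then(self):
        self.code[self.cf.pop()][1] = len(self.code)

    # Backward branches: "begin" records the target, "until" consumes it.
    def begin(self):
        self.cf.append(len(self.code))

    def until(self, inverse_branch):
        self.emit(inverse_branch, self.cf.pop())

    def but(self):
        # Swap the top two entries so "then" sees the "ahead" entry first.
        self.cf[-1], self.cf[-2] = self.cf[-2], self.cf[-1]

    def offset(self, index):
        # MIPS branch offsets count words from the delay-slot instruction.
        return self.code[index][1] - (index + 1)

a = Asm()
a.ahead()
a.emit("lw $t1,0($t0)")        # delay slot of "ahead"
a.begin()
a.emit("lw $t2,4($t0)")
a.emit("add $t3,$t2,$t3")
a.but()
a.then()
a.until("bne $t1,$0")
a.emit("addiu $t1,$t1,-1")     # delay slot of "until"
print(a.offset(0), a.offset(4))   # 3 -3
```

The forward branch at index 0 lands on the loop test at index 4, and the test branches back to the loop body at index 2, matching the equivalent code shown in the table.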

Local Labels

Local (numbered) labels can be used in conjunction with branch instructions to create control flows that can't easily be represented with the structured conditionals. These are rarely needed, since the structured conditionals cover the most common cases, and your code will be easier to understand if you can express it in terms of those structures.

Command	Stack Effect	Description
l:	( label# -- )	(ell-colon) Create a local label with the given number
f:	( label# -- adr )	Return the address of the next occurrence of the given local label in the forward direction
b:	( label# -- adr )	Return the address of the next occurrence of the given local label in the backward direction
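
The lookup rule for f: and b: can be sketched as a nearest-definition search, which is what lets the same label number be reused several times in one code sequence. This Python model is hypothetical: the `resolve` function is not part of the assembler, it assumes all definitions are already known (a one-pass assembler would have to patch forward references later), and treating a label at the current address as a backward match is an assumption.

```python
def resolve(definitions, label, here, direction):
    """Pick the nearest definition of `label` before or after address `here`.

    definitions: (label_number, address) pairs; a number may be reused.
    direction: "f" for forward, "b" for backward.
    """
    addrs = [adr for num, adr in definitions if num == label]
    if direction == "f":
        candidates = [adr for adr in addrs if adr > here]
        pick = min
    else:
        candidates = [adr for adr in addrs if adr <= here]
        pick = max
    if not candidates:
        raise LookupError(f"no local label {label} in direction {direction!r}")
    return pick(candidates)

defs = [(1, 0x100), (2, 0x110), (1, 0x120)]   # label 1 is defined twice
print(hex(resolve(defs, 1, 0x108, "b")))      # 0x100
print(hex(resolve(defs, 1, 0x108, "f")))      # 0x120
```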

Directives

These assembler directives are primarily used to create assembly-language "dropin modules" that execute on target machines before the Forth execution environment has been set up.

Directive	Stack	Description
start-assembling	( -- )	Set asm-base to the current address within Forth data space, thus marking that address as the start of memory that will later be written to a file. Set asm-origin to 0. Arrange for the names of any subsequent labels to be stored in a separate memory space.
end-assembling	( -- )	Restore labels to their default behavior (i.e. label names will subsequently be stored in-line in Forth data space rather than in a separate area).
label newname	( -- )	Mark the current location in Forth data space so that later execution of newname will return that address. When executed within the context of start-assembling .. end-assembling, the label name will be stored in a separate memory area outside of the Forth data space (normally, in a Forth dictionary context, the Forth data space consists of interspersed names and data).
asm-origin	( -- adr )	A Forth value that contains the address within the target machine's address space that corresponds to the beginning of the memory marked by start-assembling. The default value, set by start-assembling, is 0, but it can be changed to an arbitrary value n by executing "n to asm-origin".
asm-base	( -- adr )	A Forth value that contains the start address within the host machine's address space of the beginning of the memory marked by start-assembling.
pad-to	( adr -- )	Add zero-valued bytes to memory until the current address within the target machine's address space is adr. Specifically, ( here - asm-base ) = ( adr - asm-origin ) when pad-to completes. Aborts if the current address is already above adr.
align-to	( n -- )	Add zero-valued bytes to memory until the current address within the target machine's address space is a multiple of n.
assemble-little-endian	( -- )	Configure the MIPS assembler to generate code in little-endian byte order. The default is the byte order of the host system.
assemble-big-endian	( -- )	Configure the MIPS assembler to generate code in big-endian byte order. The default is the byte order of the host system.
c$,	( adr len -- )	Place the bytes from the range adr len into memory, followed by a null terminator byte (which need not be present in the source range) and sufficient padding for four-byte alignment. This is primarily used within the "reset" module to place the names of other dropin modules in-line within the code.
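
The address arithmetic behind these directives can be sketched with a small Python model. Everything here is an assumption made for illustration: the `Image` class and its method names are hypothetical, the byte buffer stands in for the memory that starts at asm-base, and `origin` plays the role of asm-origin. It only shows how target addresses, padding, string placement and byte order relate.

```python
import struct

class Image:
    """Toy model of a start-assembling .. end-assembling region."""

    def __init__(self, origin=0, little_endian=True):
        self.origin = origin              # plays the role of asm-origin
        self.data = bytearray()           # bytes laid down since asm-base
        self.fmt = "<I" if little_endian else ">I"

    def target_here(self):
        # Target address = offset from the start of the region + origin.
        return len(self.data) + self.origin

    def word(self, value):
        # Emit one 32-bit instruction in the selected byte order.
        self.data += struct.pack(self.fmt, value)

    def pad_to(self, adr):
        if self.target_here() > adr:
            raise ValueError("current address is already above adr")
        self.data += bytes(adr - self.target_here())

    def align_to(self, n):
        self.data += bytes(-self.target_here() % n)

    def c_string(self, s):
        # Bytes, then a null terminator, then padding to a 4-byte boundary.
        self.data += s + b"\0"
        self.align_to(4)

img = Image(origin=0x1000, little_endian=False)
img.word(0xDC4B0014)
img.c_string(b"reset")
print(hex(img.target_here()))   # 0x100c
img.pad_to(0x1020)
print(len(img.data))            # 32
```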