Assembly

Assembly is the lowest-level human-readable language of a given computer architecture. Every architecture has its own instruction set, and therefore its own dialect of assembly (even virtual machines such as .NET and the JVM have their own instruction sets), so assembly code is not portable. Assembly used to be required for game development because graphics libraries weren't as common as they are now. It is still used for debugging and optimizing more advanced games, particularly console games: since you know exactly what hardware the console has, portability is less of an issue and optimization is more valuable.

Among the architectures in common use today, x86 has one of the largest instruction sets.

Pros and Cons

It is generally not recommended to write games entirely in assembly language unless you are targeting very old, weak, or obscure hardware like the NES or Commodore 64. However, it is still a good language to know and useful when you really need to take full advantage of the hardware you are using, or for debugging since you can get a deeper understanding of your program at a lower level.

Pros

  • Full advantage of the hardware
  • Minimal CPU overhead
  • Embeddable in some higher-level languages via inline assembly
  • Ability to target much older and weaker hardware, making it ideal for homebrew development and ROM hacking
  • Gives you a much better understanding of the inner workings of the hardware and your program
  • Useful for debugging and reverse engineering

Cons

  • Not beginner friendly
  • Requires more advanced knowledge about the hardware you are targeting
  • Requires much more preparation and planning before you can write anything; otherwise you will end up with spaghetti code
  • Not portable
  • Modern compilers often produce better assembly than humans, so higher-level code may have better optimization than hand-written assembly

Properties

  • Low-level
  • Unstructured - no blocks for if-statements, loops, etc.
  • No real variables; only registers and memory addresses

Resources

Tutorials

Instruction Set References

Assemblers

Basics

These code examples are going to use MIPS Assembly, but most instruction sets are similar to each other, so it is easy to learn another instruction set after learning one. Also, these examples assume that you have some basic C programming knowledge.

Registers

Assembly has no variables, at least not in the way that higher-level languages do. Instead, you have registers, which are small storage locations inside the CPU. MIPS has 32 general-purpose registers, of which 18 are normally used to hold integer values: 8 saved registers ($s0-$s7), which by convention keep their values across function calls, and 10 temporary registers ($t0-$t9). Registers are written as a dollar sign followed by a name that indicates the register's role. Some registers are special: $zero always reads as 0 and cannot be changed, $at is reserved for the assembler, and $k0-$k1 are reserved for the operating system kernel. All other data lives in RAM and must be loaded and stored explicitly. In MIPS, this is done with the sw (store word) and lw (load word) instructions:

sw $s0, 20($s1) # Store the value of $s0 at the memory address held in $s1 plus an offset of 20
lw $s0, 20($s1) # Load the value at the memory address held in $s1 plus an offset of 20 into $s0
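
For readers coming from C, base-plus-offset addressing can be pictured as indexing into a byte array. The following is only a minimal sketch under that assumption, not a model of real hardware: ram, RAM_SIZE, store_word, and load_word are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RAM_SIZE 64                 /* hypothetical tiny RAM, in bytes */
static uint8_t ram[RAM_SIZE];

/* Rough analogue of a store word: write value at address base + offset. */
void store_word(uint32_t base, int32_t offset, int32_t value) {
    uint32_t addr = base + (uint32_t)offset;
    /* A word is 4 bytes and must sit on a 4-byte boundary inside RAM. */
    assert(addr % 4 == 0 && addr + 4 <= RAM_SIZE);
    memcpy(&ram[addr], &value, sizeof value);
}

/* Rough analogue of a load word: read the word at address base + offset. */
int32_t load_word(uint32_t base, int32_t offset) {
    uint32_t addr = base + (uint32_t)offset;
    assert(addr % 4 == 0 && addr + 4 <= RAM_SIZE);
    int32_t value;
    memcpy(&value, &ram[addr], sizeof value);
    return value;
}
```

With this sketch, store_word(8, 20, 42) followed by load_word(8, 20) gives back 42, because both calls resolve to the same address, 28.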

Instructions

Instructions in assembly are the equivalent of statements in mid- to high-level languages. Each line of code holds one instruction, and an instruction cannot span multiple lines. An instruction's operands are separated by commas.

<instruction name> <destination register>, <register1>, <register2>  # Comments go after the instruction and use the number sign

Labels

Labels allow you to jump to an instruction without referring to its line number or address. This is useful because those positions change as your code changes. In most assembly languages, including MIPS, a label is declared by writing its identifier followed by a colon.

main:
add $s0, $s1, $s2

Arithmetic Expressions

Assembly can only do one mathematical operation per instruction, so a mathematical expression that requires multiple operations takes multiple instructions. Take this line of C code for example:

f = (a + b) - (c * d);

In MIPS Assembly, it would look something like this:

add $t0, $s0, $s1   # $s0 and $s1 are the registers for a and b respectively. Add them and store the result in $t0
mul $t1, $s2, $s3   # $s2 and $s3 are the registers for c and d respectively. Multiply them and store the result in $t1 (mult takes only two operands and writes to the HI/LO registers, so the three-operand mul is used here)
sub $s4, $t0, $t1   # Subtract $t1 from $t0 and store the result in $s4 (f)
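
The same one-operation-per-step breakdown can be written in C by giving every intermediate result its own temporary variable. This is just an illustrative sketch: compute_f and compute_f_example are hypothetical names, and t0 and t1 stand in for temporary registers.

```c
#include <assert.h>

/* Each statement performs exactly one arithmetic operation,
   the way a single assembly instruction would. */
int compute_f(int a, int b, int c, int d) {
    int t0 = a + b;    /* add a and b into a temporary */
    int t1 = c * d;    /* multiply c and d into a second temporary */
    int f  = t0 - t1;  /* subtract the second temporary from the first */
    return f;
}

/* Usage: (1 + 2) - (3 * 4) is 3 - 12, which is -9. */
void compute_f_example(void) {
    assert(compute_f(1, 2, 3, 4) == -9);
}
```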

Branch Instructions

Assembly is an unstructured language, so there are no code blocks for if-statements, loops, functions, etc. Branch instructions are used to conditionally jump to certain lines (or labels) of code. The commonly used branch instructions in MIPS assembly are bne (branch not equal), beq (branch equal), blt (branch less than), and bgt (branch greater than); the last two are pseudo-instructions that the assembler expands into real instructions. So an if-else statement in C like this:

if (a == b) {
    c = a + b;
}
else {
    c = a - b;
}

… would be written like this in MIPS assembly:

bne $s0, $s1, Else           # If $s0 (a) does not equal $s1 (b), goto Else
add $s2, $s0, $s1            # Add $s0 and $s1 and store the result in $s2 (c)
j End                        # Skip over the else part
Else:  sub $s2, $s0, $s1     # Subtract $s1 from $s0 and store the result in $s2
End:                         # Code after the if-else goes here
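
C's goto statement behaves much like a jump to a label, so the branching pattern can be mimicked in C itself. The sketch below is only an illustration (if_else_with_goto is a hypothetical name). Note the unconditional jump after the "then" part: without it, execution would fall straight through into the "else" part.

```c
#include <assert.h>

/* Same logic as: if (a == b) c = a + b; else c = a - b; */
int if_else_with_goto(int a, int b) {
    int c;
    if (a != b) goto Else;  /* branch past the "then" part when a != b */
    c = a + b;              /* "then" part */
    goto End;               /* skip the "else" part */
Else:
    c = a - b;              /* "else" part */
End:
    return c;
}

/* Usage: equal inputs are added, unequal inputs are subtracted. */
void if_else_example(void) {
    assert(if_else_with_goto(3, 3) == 6);
    assert(if_else_with_goto(5, 2) == 3);
}
```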

Loops

As stated earlier, assembly is an unstructured language, so there are no loop statements. Loops have to be built manually from branch and jump instructions. Take this while loop in C:

while (a < b) {
    a++;
}

In MIPS assembly, it looks something like this:

Loop: bge $s0, $s1, Exit    # If $s0 (a) is greater than or equal to $s1 (b), goto Exit
addi $s0, $s0, 1    # Add 1 to $s0 (addi is used instead of add when using a constant)
j Loop    # Goto Loop
Exit:    # Code after loop goes here
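
The same loop can be expressed in C using only an if and goto, which makes the inverted exit test easier to see: the loop continues while a < b, so the branch leaves when a >= b. This is a sketch for illustration only, and while_with_goto is a hypothetical name.

```c
#include <assert.h>

/* Same logic as: while (a < b) { a++; } return a; */
int while_with_goto(int a, int b) {
Loop:
    if (a >= b) goto Exit;  /* leave the loop once a < b no longer holds */
    a = a + 1;              /* loop body */
    goto Loop;              /* jump back and test the condition again */
Exit:
    return a;
}

/* Usage: a counts up to b; if a is already >= b, the body never runs. */
void while_example(void) {
    assert(while_with_goto(0, 5) == 5);
    assert(while_with_goto(7, 3) == 7);
}
```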

Subroutine and Function Calls

In MIPS assembly, a function (or subroutine) is called with the jal (jump and link) instruction, which saves the return address before jumping. The plain j (jump) instruction jumps without saving anything, so the called code has no way to return. Arguments are passed in registers $a0-$a3, and the return value is placed in $v0 (and $v1 if needed). Any further arguments, or a struct that is passed by value, typically need to be placed on the stack. Since assembly is an unstructured language, functions are not declared in any special way: a function is just a label that you jump to. Let's use a simple C function that returns the sum of two arguments:

int sum(int x, int y);  /* Declare sum before it is used in main */

int main() {
    int a = 3;
    int b = 2;
    int c;

    c = sum(a, b);

    return 0;
}

int sum(int x, int y)  {
    return x + y;
}

The C compiler would probably structure the MIPS assembly like this:

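main: li $s0, 3 # Set $s0 (a) to 3
li $s1, 2 # Set $s1 (b) to 2
move $a0, $s0 # Copy the value of $s0 into $a0
move $a1, $s1 # Copy the value of $s1 into $a1
jal sum # Jump and link to label "sum"
move $s2, $v0 # Execution resumes here after sum returns: copy the return value in $v0 into $s2 (c)
li $v0, 10 # Select the exit service (number 10, assuming a SPIM or MARS simulator)
syscall # Exit so that main does not fall through into sum

sum: add $v0, $a0, $a1 # Add $a0 and $a1 and store the result in $v0
jr $ra # Return to the address stored in $ra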

When jal runs, the address at which execution should resume after the call is stored in the register $ra (return address). The instruction jr $ra jumps back to that address. The instruction move copies a value from one register into another, while the instruction li sets a register to a constant value.
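
To make the role of the return address concrete, here is a small C sketch that steps through a call and return using a simulated program counter. It is a simplified model built on stated assumptions, not how a real MIPS CPU works: instructions are numbered 0, 1, 2, ... instead of having byte addresses, branch delay slots are ignored, and simulate_sum_call with all of its variables is hypothetical.

```c
#include <assert.h>

/* Steps through a tiny fixed "program": the caller occupies slots 0-4,
   the sum routine occupies slots 5-6. */
int simulate_sum_call(int a, int b) {
    int pc = 0;                  /* index of the next instruction to run */
    int ra = 0;                  /* models the return address register */
    int a0 = 0, a1 = 0, v0 = 0;  /* model argument and return value registers */
    int c = 0;

    for (;;) {
        switch (pc) {
        case 0: a0 = a; pc = 1; break;        /* pass the first argument */
        case 1: a1 = b; pc = 2; break;        /* pass the second argument */
        case 2: ra = pc + 1; pc = 5; break;   /* call: remember where to resume, then jump */
        case 3: c = v0; pc = 4; break;        /* back in the caller: copy the return value */
        case 4: return c;                     /* end of the caller */
        case 5: v0 = a0 + a1; pc = 6; break;  /* sum: compute the result */
        case 6: pc = ra; break;               /* return: jump to the saved position */
        default: assert(0 && "invalid pc"); return 0;
        }
    }
}
```

Calling simulate_sum_call(3, 2) visits slots 0, 1, 2, 5, 6, 3, 4 in that order and returns 5. The jump from slot 6 back to slot 3 works only because slot 2 saved that position before jumping.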
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License