
Assembly Hack to Binary Machine Language


How do I convert Assembly Hack to Binary Machine Language?

For example, given the Hack assembly below, how would I manually translate it into binary machine code? I just need a reference, or a pointer to where I can learn how to do this translation by hand.

// Computes R0 = 2 + 3

@2
D=A
@3
D=D+A
@0
M=D

Solution

  • There are only a few kinds of line in Hack assembly:

    1. A-type instruction
    2. C-type instruction
    3. Label
    4. Comment & blank line

    As you might imagine, labels and comments (3 and 4) don't generate any machine code instructions. Comments are simply ignored, while labels tell A-type instructions which address to use. Your sample has no labels, and a comment line would simply be skipped, so you don't need to worry about either.
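As a rough illustration of the four line forms, here is a minimal Python sketch that sorts each line of the question's sample into one of them. The function name `classify` and the returned strings are made up for this example, and it assumes comments start with `//` and labels are written as `(NAME)`, as in the standard Hack assembly syntax.

```python
def classify(line: str) -> str:
    """Return 'A', 'C', 'label', or 'skip' for one line of Hack assembly."""
    code = line.split("//", 1)[0].strip()  # drop any comment and surrounding whitespace
    if not code:
        return "skip"    # comment-only or blank line: generates no machine code
    if code.startswith("(") and code.endswith(")"):
        return "label"   # names an address, generates no machine code
    if code.startswith("@"):
        return "A"       # A-type instruction
    return "C"           # anything else is treated as a C-type instruction


sample = ["// Computes R0 = 2 + 3", "", "@2", "D=A", "@3", "D=D+A", "@0", "M=D"]
kinds = [classify(line) for line in sample]
print(kinds)
```

Only the lines classified as `A` or `C` end up as 16-bit words in the output.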

    A- & C-type instructions are each 16 bits wide.

    A-type instructions are very simple: the first bit (the top bit, or MSB, most significant bit) of the 16 is 0, which marks the instruction as A-type, and the other 15 bits hold the numeric value (e.g. in @2) or the address of a label (e.g. in @loop).

    So, @2 encodes as follows:

     +-- A type indicator, top bit is zero for A-type
     |
     v
     0000000000000010   <-- 16-bit machine code instruction
      |-------------|   range of immediate value field for A-type
    (0000000000111111)  
    (0123456789012345)  bit position (MSB at pos 0, LSB at 15)
    

    The top bit is 0 for A-type.  For the rest of this instruction (@2), the lower 15 bits encode the value "2".
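The A-type rule can be sketched in a few lines of Python. The helper name `encode_a` is hypothetical, and the sketch handles only numeric values, not labels.

```python
def encode_a(value: int) -> str:
    """Encode the numeric operand of an A-type instruction as a 16-bit string."""
    if not 0 <= value < 2**15:
        raise ValueError("A-type value must fit in 15 bits (0..32767)")
    # Zero-padding to 16 binary digits leaves the top bit 0, the A-type marker.
    return format(value, "016b")


print(encode_a(2))  # the @2 instruction from the example
```

Because the value is limited to 15 bits, the leading 0 comes for free from the zero padding.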

    The C-type instructions are also 16 bits wide, but their MSB is 1, which differentiates them from A-type instructions (whose MSB is 0).  C-type instructions have three fields of interest: comp, dest, jump.

    comp says what to compute; it is a 7-bit field

    dest says where to store the result of the computation; it is a 3-bit field

    jump says under what condition to alter the flow of control of the machine code program; it is a 3-bit field

    The C-type instruction is often written as X = Y, where the X is simply whatever is on the left hand side of = and the Y is similarly whatever is on the right hand side of =.  The X corresponds to dest and the Y corresponds to comp.
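To make the X = Y split concrete, here is a small Python sketch that pulls a C-type line apart into its dest, comp and jump text. The function name `split_c` is invented for this example. The optional `;jump` suffix is not shown in this answer; it is assumed from the general Hack form `dest=comp;jump`.

```python
def split_c(instr: str) -> tuple[str, str, str]:
    """Split a C-type instruction into (dest, comp, jump); missing parts are ''."""
    dest, comp, jump = "", instr, ""
    if "=" in comp:
        dest, comp = comp.split("=", 1)   # X = Y: X is dest, Y is comp
    if ";" in comp:
        comp, jump = comp.split(";", 1)   # assumed optional jump part after ';'
    return dest.strip(), comp.strip(), jump.strip()


print(split_c("D=A"))
print(split_c("D=D+A"))
```

An empty dest or jump string means that field is not used, which corresponds to the 000 encoding in the lookup tables.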

    See this for a picture of the C-type instruction and these fields, reproduced here:

    C-type instructions have the following encoding:

    size (in bits):     1    2     7      3      3
                     +-----+----+------+------+------+
    field            | A/C | ZZ | comp | dest | jump |
                     +-----+----+------+------+------+
    

    In some texts, the 7-bit comp field is further broken down into a (1 bit) and c (6 bits):

    size (in bits):     1    2    1    6       3      3
                     +-----+----+---+------+------+------+
    field            | A/C | ZZ | a |   c  | dest | jump |
                     +-----+----+---+------+------+------+
                                |   comp   |
                                     7 bits
    

    The ZZ bits are unused in C-type instructions, so they can be any values, but common texts tend to use 1's (I don't know why, I would have used 0's).

    To find the values for these fields, you look them up in tables.  The tables can be found in a video referenced by the above link, and also at https://zhongchuyun.gitbooks.io/nand2tetris/content/chapter_4.html

    For example, if a C-type instruction should keep normal flow of control (no jump), use the jump field encoding 000.

    (Normal flow of control is where instructions execute one after another, in the order they appear in memory at sequentially increasing addresses.  It is very common, as it often takes multiple instructions, one after another, to do anything significant.  Sometimes, however, we need to make the machine jump forwards in the machine code program (to do if-then/else), while other times we need to make it jump backwards (to do looping).)

    In D=A, a C-type instruction, the comp (Y in X = Y) has to compute simply A, so that instruction field is 0110000 by the tables.  The dest (X) has to target D, so that is a dest table value of 010.

    Thus, we have a C-type instruction (1), with a comp of 0110000, with a dest of 010, and a jump of 000.  (Note that the C-type instructions have two ignored bits, shown below as ZZ.  These Z's can either be 0's or 1's — as you like since it doesn't matter.  Some authors appear to choose 1's.)

    Shown together:

    A/C ZZ  comp   dest jump
     1  11 0110000  010  000
    

    -or- 1110110000010000 (base 2) = EC10 (base 16) = 60,432 (base 10)
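Putting the pieces together, the following Python sketch translates the whole sample from the question. It is only an illustration under stated assumptions: the `COMP`, `DEST` and `JUMP` dictionaries hold just the few table rows this program needs (taken from the standard Hack tables, not the full set), the ZZ bits are set to 11, and the names `encode_c` and `assemble_line` are made up here.

```python
# Partial lookup tables: only the rows needed for the sample program.
COMP = {"A": "0110000", "D": "0001100", "D+A": "0000010"}  # 7 bits: a + cccccc
DEST = {"": "000", "M": "001", "D": "010"}                 # 3 bits
JUMP = {"": "000"}                                         # 3 bits, 000 = no jump


def encode_c(dest: str, comp: str, jump: str = "") -> str:
    """Build a C-type word: 1, then ZZ (here 11), then comp, dest, jump."""
    return "1" + "11" + COMP[comp] + DEST[dest] + JUMP[jump]


def assemble_line(line: str) -> str:
    """Translate one instruction; handles only @number and dest=comp forms."""
    if line.startswith("@"):
        return format(int(line[1:]), "016b")  # A-type: top bit 0, 15-bit value
    dest, comp = line.split("=", 1)
    return encode_c(dest, comp)


program = ["@2", "D=A", "@3", "D=D+A", "@0", "M=D"]
words = [assemble_line(line) for line in program]
for source, word in zip(program, words):
    print(f"{source:<6} {word}  hex {int(word, 2):04X}")
```

The second word printed is the D=A encoding worked out above, which gives a quick check against the hex and decimal values.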


    Source: http://dragonwins.com/domains/getteched/csm/CSCI410/references/hack.htm