- Constants can be expressed in hex or in decimal. Hex constants consist
of an 'x' or 'X' followed by one or more hex digits. Decimal constants
consist of a '#' followed by one or more decimal digits. Negative constants
are identified by a minus sign immediately after the x or #. For example, #-10
is the negative of decimal 10 (i.e., -10), and x-10 is the negative of x10 (i.e., -16).
Since the sign is explicitly specified, the rest of the constant is treated as
an unsigned number. For example, x-FF is equivalent to -255. The 'x' tells us
the number is in hex, the '-' tells us it is a negative number, and "FF" is treated
as an unsigned hex number (i.e., 255). Putting it all together gives us -255.
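As an illustration of the constant format described above, here is a minimal sketch of a parser. The name parse_constant, its return convention, and the 16-bit overflow guard are assumptions made for this example; it is not the toNum() function provided with the lab.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not the course's toNum()).
   Parses constants such as "x3000", "x-FF", "#10", "#-10".
   Returns 0 and stores the value in *out on success.
   Returns 3 (the "invalid constant" error code) otherwise. */
int parse_constant(const char *s, int *out)
{
    int base;
    int negative = 0;
    int digits = 0;
    long value = 0;

    if (s == NULL || out == NULL)
        return 3;

    /* The prefix selects the base. */
    if (*s == 'x' || *s == 'X')
        base = 16;
    else if (*s == '#')
        base = 10;
    else
        return 3;
    s++;

    /* An optional minus sign comes immediately after the prefix. */
    if (*s == '-') {
        negative = 1;
        s++;
    }

    /* The remaining characters are treated as an unsigned number. */
    for (; *s != '\0'; s++) {
        int c = toupper((unsigned char)*s);
        int d;

        if (isdigit(c))
            d = c - '0';
        else if (base == 16 && c >= 'A' && c <= 'F')
            d = c - 'A' + 10;
        else
            return 3;               /* e.g. "#Invalid" or "xInvalid" */

        value = value * base + d;
        if (value > 0xFFFFL)
            return 3;               /* assumption: reject magnitudes beyond 16 bits */
        digits++;
    }

    if (digits == 0)
        return 3;                   /* "x", "#", "x-" and "#-" have no digits */

    *out = (int)(negative ? -value : value);
    return 0;
}
```

For example, parse_constant("x-FF", &v) stores -255 in v and returns 0, while parse_constant("#Invalid", &v) returns 3.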
- Your assembler does not have to check for multiple .ORIG pseudo-ops.
- Since the .END pseudo-op is used to designate the end of the assembly
language file, your assembler does not need to process anything that comes
after the .END.
- The trap vector for a TRAP instruction and the shift amount for SHF
instructions must be non-negative values. If either one is negative, you
should return error code 3.
- The same label should not appear in the symbol table more than once. During
pass 1 of the assembly process, you should check to make sure a label is not
already in the symbol table before adding it to the symbol table. If the label
is already in the symbol table, you should return error code 4.
- An invalid label (i.e., one that contains non-alphanumeric characters, or
one that starts with the letter 'x' or a number) is another example of error
code 4.
- The standard C function isalnum(), declared in ctype.h, can be used to check
if a character is alphanumeric.
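A minimal sketch of a label check built on isalnum() might look like the following. The function name is_valid_label is hypothetical, and the code covers only the two rules stated above; rejecting an uppercase 'X' as the first character is an assumption, so adjust it if your assembler treats case differently.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the label passes the checks
   described above, 0 otherwise (the caller would then report
   error code 4). */
int is_valid_label(const char *label)
{
    const char *p;

    if (label == NULL || label[0] == '\0')
        return 0;

    /* A label may not start with 'x' (it would look like a hex constant).
       Treating 'X' the same way is an assumption of this sketch. */
    if (label[0] == 'x' || label[0] == 'X')
        return 0;

    /* A label may not start with a number. */
    if (isdigit((unsigned char)label[0]))
        return 0;

    /* Every character must be alphanumeric. */
    for (p = label; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p))
            return 0;
    }
    return 1;
}
```

The cast to unsigned char avoids undefined behavior when a char with a negative value is passed to the ctype.h functions.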
- After you have gone through the input file for pass 1 of the assembler and
your file pointer is at the end of the file, there are two ways you can get the
file pointer back to the beginning. You can either close and reopen the file or
you can use the standard C I/O function rewind().
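The rewind() approach can be sketched as follows. The function names count_lines and run_two_passes are hypothetical, and counting lines stands in for the real work of each pass; the point is only where the rewind() call goes.

```c
#include <stdio.h>

/* Hypothetical stand-in for one pass: reads to the end of the file
   and returns how many lines it saw (assuming lines shorter than
   the buffer). */
int count_lines(FILE *fp)
{
    char line[256];
    int n = 0;

    while (fgets(line, sizeof line, fp) != NULL)
        n++;
    return n;
}

/* Runs the stand-in pass twice over the same open file. */
void run_two_passes(FILE *fp, int *pass1, int *pass2)
{
    *pass1 = count_lines(fp);   /* file pointer is now at end of file */

    rewind(fp);                 /* back to the beginning; also clears the
                                   end-of-file and error indicators */

    *pass2 = count_lines(fp);   /* second pass sees the whole file again */
}
```

Without the rewind() call, the second pass would read nothing, because the file pointer would still be at the end of the file.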
- The following definitions can be used to create your symbol table:

    #define MAX_LABEL_LEN 20
    #define MAX_SYMBOLS 255

    typedef struct {
        int address;
        char label[MAX_LABEL_LEN + 1]; /* Question for the reader: Why do we need to add 1? */
    } TableEntry;

    TableEntry symbolTable[MAX_SYMBOLS];

- To check if two strings are the same, you can use the standard C string function
strcmp(), which returns 0 when the strings are equal. To copy one string to
another, you can use the standard C string function strcpy(). Both are declared
in string.h.
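Putting the symbol table definitions together with strcmp() and strcpy(), the pass 1 duplicate check could be sketched as below. The names symbolCount, find_symbol, and add_symbol are hypothetical, and returning error code 4 for an overlong label or a full table is an assumption of this sketch rather than something the lab specifies.

```c
#include <string.h>

#define MAX_LABEL_LEN 20
#define MAX_SYMBOLS 255

typedef struct {
    int address;
    char label[MAX_LABEL_LEN + 1];
} TableEntry;

TableEntry symbolTable[MAX_SYMBOLS];
int symbolCount = 0;    /* hypothetical: number of entries in use */

/* Returns the index of label in the table, or -1 if it is absent. */
int find_symbol(const char *label)
{
    int i;

    for (i = 0; i < symbolCount; i++) {
        if (strcmp(symbolTable[i].label, label) == 0)
            return i;
    }
    return -1;
}

/* Adds label at the given address.
   Returns 0 on success, or 4 if the label is already in the table.
   Assumption: an overlong label or a full table also returns 4. */
int add_symbol(const char *label, int address)
{
    if (strlen(label) > MAX_LABEL_LEN)
        return 4;
    if (find_symbol(label) != -1)
        return 4;               /* duplicate label */
    if (symbolCount >= MAX_SYMBOLS)
        return 4;

    strcpy(symbolTable[symbolCount].label, label);  /* length checked above */
    symbolTable[symbolCount].address = address;
    symbolCount++;
    return 0;
}
```

In pass 2, find_symbol() can be reused to look up a label operand; a result of -1 there would correspond to an undefined label.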
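- If you decide to use any of the math functions in math.h, you also have to link
the math library with -lm, for example by using the command
"gcc -ansi -o assemble assembler.c -lm". Put -lm after the source file, since
some linkers resolve libraries in the order they appear on the command line.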
- The specification for the lab classifies the error "ADD R0, R0, 1" as
"invalid operand," error code 4. However, if you use the given toNum function
to detect this error while checking a possible immediate operand, it produces
error code 3 ("invalid constant"). Therefore, we will accept both error 3 and
error 4 for this case.
- When your assembler finds an error in the input assembly language program,
it is not required that you print out an error message to the screen. If you
choose to do this to make debugging easier, that is fine. What is required
is that you exit with the appropriate error code. This is what we will be
checking for when we grade your program; we will ignore anything that is
printed to the screen.
- A student pointed out a discrepancy between the lab description and the assembler we posted. It was
related to the following example:

    .ORIG x3000
    JSR ADD ; JSR is parsed as an opcode and then ADD is the undefined label
    .END

The lab description states that this should be reported as error code 1 (undefined label). The old
assembler we posted reported error code 4 instead. We have fixed the problem and have posted the new
assembler.
- A student pointed out that the toNum() function we provided did not detect errors for examples such as:

    ADD R1, R1, #Invalid
    .FILL xInvalid

We have posted a new and improved toNum() function on the useful code page. The use of this new
toNum() function is optional.