This lab will exercise your understanding of the E100's instruction set and introduce you to a datapath and finite-state machine that implements some of this instruction set. You will demonstrate your understanding of the E100 by showing the results of an E100 program and by tracing through the execution of another E100 program on the datapath and finite-state machine provided.
This section summarizes the E100 CPU that was covered in lecture.
The word size for the E100 is 16 bits. Numbers are stored in 2's complement representation (i.e., bit 15 has a place value of -32768). Words thus range in value from -32768 to 32767. For example, 16'h8000 represents -32768, 16'hffff represents -1, and 16'h7fff represents 32767. The calculator on the course web page converts between hexadecimal and signed numbers.
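The conversion between a 16-bit pattern and its signed value can also be checked with a few lines of Python. This is only an illustrative sketch; the function names `to_signed16` and `to_word16` are made up for this example and are not part of the lab files.

```python
def to_signed16(word):
    """Interpret the low 16 bits of `word` as a 2's complement number."""
    word &= 0xFFFF  # keep only bits 15..0
    # Bit 15 has a place value of -32768, so a set bit 15 means
    # the unsigned reading is 65536 too large.
    return word - 0x10000 if word & 0x8000 else word


def to_word16(value):
    """Return the 16-bit pattern that encodes a signed value."""
    return value & 0xFFFF


# The three examples from the text above.
for pattern in (0x8000, 0xFFFF, 0x7FFF):
    print(f"16'h{pattern:04x} = {to_signed16(pattern)}")
```

The loop prints the same three values listed above (-32768, -1, and 32767).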
The E100 has instructions for arithmetic, array access, branches, function calls, and I/O. Each E100 instruction is composed of 4 memory words. The value of the first word of the instruction is called the opcode and specifies the operation for the instruction. The values of the next three memory words of an instruction are called addr0, addr1, and addr2. The following table describes the effect of each instruction on the E100's program counter (PC) and memory.
| Instruction name | Opcode | Effect |
|---|---|---|
| halt | 0 | PC = PC+4; stop executing instructions |
| add | 1 | PC = PC+4; memory[addr0] = memory[addr1] + memory[addr2] |
| sub | 2 | PC = PC+4; memory[addr0] = memory[addr1] - memory[addr2] |
| mult | 3 | PC = PC+4; memory[addr0] = memory[addr1] * memory[addr2] |
| div | 4 | PC = PC+4; memory[addr0] = memory[addr1] / memory[addr2] |
| cp | 5 | PC = PC+4; memory[addr0] = memory[addr1] |
| and | 6 | PC = PC+4; memory[addr0] = memory[addr1] & memory[addr2] |
| or | 7 | PC = PC+4; memory[addr0] = memory[addr1] \| memory[addr2] |
| not | 8 | PC = PC+4; memory[addr0] = ~memory[addr1] |
| sl | 9 | PC = PC+4; memory[addr0] = memory[addr1] << memory[addr2] |
| sr | 10 | PC = PC+4; memory[addr0] = memory[addr1] >> memory[addr2] |
| cpfa | 11 | PC = PC+4; memory[addr0] = memory[addr1 + memory[addr2]] |
| cpta | 12 | PC = PC+4; memory[addr1 + memory[addr2]] = memory[addr0] |
| be | 13 | if (memory[addr1] == memory[addr2]) { PC = addr0 } else { PC = PC+4 } |
| bne | 14 | if (memory[addr1] != memory[addr2]) { PC = addr0 } else { PC = PC+4 } |
| blt | 15 | if (memory[addr1] < memory[addr2]) { PC = addr0 } else { PC = PC+4 }. Comparisons take into account the sign of the number. E.g., 16'hffff (-1) is less than 16'h0000 (0). |
| call | 16 | memory[addr1] = PC+4; PC = addr0 |
| ret | 17 | PC = memory[addr0] |
| in | 18 | PC = PC+4; memory[addr1] = data from I/O port addr0 |
| out | 19 | PC = PC+4; I/O port addr0 = memory[addr1] |
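To make the table concrete, here is a minimal Python sketch of an instruction-level interpreter for a subset of the opcodes (halt, add, sub, cp, cpfa, cpta, be, bne, blt, call, ret). It models only the effect of each instruction on PC and memory, not the cycle-by-cycle behavior of the datapath and finite-state machine. The names `run` and `wrap16` are hypothetical, and wrapping arithmetic results to 16 bits on overflow is an assumption rather than something stated in the table. The small counting program at the end is also invented for illustration.

```python
def wrap16(value):
    """Wrap an int to a signed 16-bit word (overflow wrap is an assumption)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run(memory, max_steps=1000):
    """Execute instructions in `memory` (a list of ints) starting at PC = 0.

    Returns the PC after halt. Only a subset of opcodes is modelled.
    """
    pc = 0
    for _ in range(max_steps):
        # Each instruction is four words: opcode, addr0, addr1, addr2.
        opcode, addr0, addr1, addr2 = memory[pc:pc + 4]
        next_pc = pc + 4
        if opcode == 0:      # halt
            return next_pc
        elif opcode == 1:    # add
            memory[addr0] = wrap16(memory[addr1] + memory[addr2])
        elif opcode == 2:    # sub
            memory[addr0] = wrap16(memory[addr1] - memory[addr2])
        elif opcode == 5:    # cp
            memory[addr0] = memory[addr1]
        elif opcode == 11:   # cpfa: read an array element
            memory[addr0] = memory[addr1 + memory[addr2]]
        elif opcode == 12:   # cpta: write an array element
            memory[addr1 + memory[addr2]] = memory[addr0]
        elif opcode == 13:   # be
            if memory[addr1] == memory[addr2]:
                next_pc = addr0
        elif opcode == 14:   # bne
            if memory[addr1] != memory[addr2]:
                next_pc = addr0
        elif opcode == 15:   # blt (signed comparison)
            if memory[addr1] < memory[addr2]:
                next_pc = addr0
        elif opcode == 16:   # call: save the return address, then jump
            memory[addr1] = next_pc
            next_pc = addr0
        elif opcode == 17:   # ret
            next_pc = memory[addr0]
        else:
            raise NotImplementedError(f"opcode {opcode} is not modelled")
        pc = next_pc
    raise RuntimeError("step limit reached without a halt")


# Illustrative program (not from the lab): count memory[12] up to memory[14].
program = [
    1, 12, 12, 13,   # 0: add  memory[12] = memory[12] + memory[13]
    15, 0, 12, 14,   # 4: blt  if memory[12] < memory[14], PC = 0
    0, 0, 0, 0,      # 8: halt
    0,               # 12: counter
    1,               # 13: constant one
    3,               # 14: limit
]
final_pc = run(program)
print(final_pc, program[12])
```

Memory values are held as signed Python integers here, so the `blt` comparison is signed without extra work. A model that stored raw 16-bit patterns would need to convert them before comparing.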
The following Verilog files implement an E100 datapath (except for hexdigit.v) and partial control unit for the E100. Create a new Quartus project with these files (most should look familiar). For your convenience, here is a ZIP file containing all the Verilog files (extract them with the unzip command).
Read top.v and control.v, and review the truth table for the portion of the control unit that we covered in lecture (fetch, decode, and execute for the add and be instructions). Familiarize yourself with the signals used to control the datapath, and trace through the execution of a simple instruction (e.g., add).
The E100's clock generation differs from Lab 4. As with clock.v from Lab 4, clocks.v generates the main clock signal for the circuit. clocks.v also generates four other clock signals of various speeds that will be used for the E100's I/O devices. clocks.v uses special circuitry (called a phase-locked loop) in the Cyclone II FPGA to keep these clocks synchronized. The phase-locked loop also produces a signal clock_valid, which is 1 when the clock signals are valid and 0 when they are invalid and should therefore be ignored. All components in the E100 ignore posedge clock events if clock_valid is 0.
Your pre-lab assignment is to answer questions about the E100 instruction set and implementation. Write your answers to a PDF file and submit it by the due date above. Note that this pre-lab assignment is due 1-2 days before your lab section.
mem[0] = 12 mem[1] = 30 mem[2] = 32 mem[3] = 31
mem[4] = 11 mem[5] = 38 mem[6] = 42 mem[7] = 45
mem[8] = 16 mem[9] = 20 mem[10] = 39 mem[11] = 0
mem[12] = 1 mem[13] = 41 mem[14] = 28 mem[15] = 29
mem[16] = 0 mem[17] = 0 mem[18] = 0 mem[19] = 0
mem[20] = 2 mem[21] = 40 mem[22] = 28 mem[23] = 29
mem[24] = 17 mem[25] = 39 mem[26] = 0 mem[27] = 0
mem[28] = -100 mem[29] = 197 mem[30] = 12000 mem[31] = 5
mem[32] = 600 mem[33] = 599 mem[34] = 598 mem[35] = 597
mem[36] = 596 mem[37] = 10000 mem[38] = 11000 mem[39] = 13000
mem[40] = 14000 mem[41] = 15000 mem[42] = 900 mem[43] = 800
mem[44] = 700 mem[45] = 2
memory[0] = 15 memory[1] = 8 memory[2] = 12 memory[3] = 13
memory[4] = 0 memory[5] = 0 memory[6] = 0 memory[7] = 0
memory[8] = 0 memory[9] = 0 memory[10] = 0 memory[11] = 0
memory[12] = 1000 memory[13] = 2000

Trace through how the datapath and finite-state machine provided would execute this program. At the beginning of each cycle, list:
You need only show values that differ from those in the previous cycle.
E.g. the datapath contents at the beginning of the first cycle would be:
There is no in-lab demonstration for this lab. Instead, you will discuss the questions above and practice writing assembly-language programs.