KEY EECS 370 Exam 1 KEY
Fall 2002, Prof. Mark Brehob
Name: ____________________________________ UM ID: _____________
Sign the honor code:
I have neither given nor received aid on this exam nor observed anyone else doing so.
___________________________________
Scores:
|
# |
Points |
|
1 |
/20 |
|
2 |
/40 |
|
3 |
/40 |
|
Total |
/100 |
NOTES:
Multiple
choice/matching
Multiple choice questions (4 points each): Circle the
letter of your answer. Only select one
answer. If we can’t tell which answer
you selected you will get no points.
1.
A 0-address architecture is also called:
a)
A stack architecture (ALU
operations take no argument as the stack is the source and dest. location.)
b)
An accumulator architecture
c)
A 2.5-D architecture
d)
A RISC architecture
e)
A MUSE architecture
2.
Executing which of the following LC assembly instructions clears
register 1 (that is sets register 1 equal to 0?)
a)
nand 1 1 1
b)
.fill 65536
c)
lw 0 1 0
d)
.fill 1 (This is an add 0 0
1)
e)
add 1 0 0
3.
If you were to write out the largest possible 32-bit floating
point number as a decimal integer, about how many digits would there be?
a)
256
b)
150
c)
39 (210=1024=~1000=103. So 2128=~10128/3=1042. So the answer is close to 42)
d)
13
e)
8
4.
If each activation record requires 256 bytes of storage, and the
maximum stack size possible on a machine is 3000 bytes, which of the following
recursive function calls will not be able to execute on the machine? Choose the
best answer.
int combination(int n,int r)
{
if ((r==0) || (n==r))
return(1);
else
return(combination(n-1,r)+combination(n-1,r-1));
}
a)
combination(16,0)
b)
combination(16,4)
c)
combination(8,4)
d)
combination(16,16)
e)
more than one of the above
a has a depth of 1. b has
a depth of 17. c of 9, d of 1.
So only b goes past 3000 bytes (17*256)
Matching (4 points)
In the blank provided write the letter which best defines the
associated term (1 point each)
Linker ___F___ a)
converts high-level language to assembly code
b)
converts assembly language to high-level language
Loader ___E___ c)
converts assembly language to machine code
d)
executes the instructions
Assembler ___C__ e) places machine code from hard disk to main
memory
f) computes
addresses of unresolved symbols in the object files
Compiler ___A___
1. Assembly language (15 points)
(Note: On this problem you will get full
credit for a working answer, half credit for a very close answer and no
credit for anything else!)
Consider the
following psuedo-code:
B=C+D
if(A==(B+1))
B=B-3;
Write a valid LC2K2 assembly program which does the same
thing. Let Reg1correspond to A, Reg2 to
B, Reg3 to C and Reg4 to D. Your
program should end with the same values in A,B,C and D as the above code.
add 3 4 2 // B=C+D
lw 0 5 pos1 // r5=1
add 2 5 5 // r5=B+1
beq 1 5 take // if
(A==(B+1) goto take
halt
take lw 0 5 neg3
//r5=-3
add 2 5 2 // B=B-3
halt
2. Addressing modes (10 points)
a)
Define the term "PC-relative addressing mode" (2 points)
When the value or
address is computed using the Program Counter’s current value. Usually PC + an offset.
b)
Define the term "immediate addressing mode" (2 points)
When a constant, kept in
the instruction encoding, is used as the operand. addi in MIPs is an example.
c)
Why do the MIPS branches use PC-relative addressing rather than
immediate addressing for their destination? (6 points)
There are two
reasons. First of all we might want to
branch to any word-aligned address in memory.
That would require 30 bit of the instruction to be dedicated to the
immediate value. That would use ¼ of
all encodings. So we need something
that uses less bits. Because we tend to
branch to locations which are nearby it makes sense to branch to a location
relative to the current instruction. So
we use PC-relative addressing.
3. Short design question (15 points)
For this problem, recall that in the
multi-cycle datapath (discussed in class and attached) adds and stores took 4
cycles, branches took 4 cycles if the branch was taken and 3 cycles if it
wasn’t taken, and loads took 5 cycles.
Suppose we wish to execute 100
instructions consisting of 25% loads, 20% stores, 30% branches (half of which
are taken), and 25% adds with the following propagation delays:
1. memory access: 2 ns
2. register read or write: 4 ns
3. ALU operations: 2 ns
4. all other devices 0ns
In that case cycle time of 4ns would have
to be used, since the maximum propagation delay is a register read/write at
4ns. If we modify the design to devote 2 cycles to each register access, we can
reduce the clock cycle time to 2ns.
Given a 2ns clock and a 2-cycle register access, what is the expected
execution time for these 100 instructions?
You must show your work.
Old cycle times:
LW = 5
ADD = 4
SW = 4
BR (taken) = 4
BR (not taken) = 3
New cycle times (given 2 cycle register access):
LW = 7 (one reg read, one reg write)
ADD = 6 (one reg read, one reg write)
SW = 5 (one reg read)
BR (taken) = 5 (one reg read)
BR (not taken) = 4 (one
reg read)
Note that while ADDs SWs and BRs all read two different registers,
reads
of the register file happen in parallel and are counted as one
cycle.
Given a 2ns clock, execution time is:
LW (7 cycles * 2
ns/cycle * 25 instructions) = 350 ns
ADD (6 cycles * 2
ns/cycle * 25 instructions) = 300 ns
SW (5 cycles * 2
ns/cycle * 20 instructions) = 200 ns
BR(T) (5 cycles * 2
ns/cycle * 15 instructions) = 150 ns
BR(NT) (4 cycles * 2 ns/cycle + 15 instructions) = 120 ns
350 + 300 + 200 + 150 + 120 = 1120 ns
In a new version of the LC2K2 ISA, the sw instruction has
been replaced with a push instruction.
It has the following specification:
push (I-type format) 011 Store RegB into memory. The memory
address is the contents of regA. After
this, regA is incremented by 1 if the offsetField is zero and is decremented by1 otherwise.
1.
Using the datapath used in project 2 (attached) complete the
execution for the push instruction. You
may not modify the fetch, decode, or decode2 states that are provided. Use the same specifications used in project
2 with the following exceptions:
·
Memory read accesses take zero time.
(That is, when the memoryAddress register is set, memoryData becomes available
in the next clock.)
·
Memory write accesses take zero time.
(That is, once the memoryData register is written to, the data is immediately
stored to the address specified by memoryAddress.)
·
Constants may be put on the bus. (For
example, the statement bus=4 is legal.)
(20 points)
So that you have enough room, we have provided the next sheet for
your answer! Use this page only for
scrap work.
push (I-type format) 011 Store RegB into memory. The memory
address is the contents of regA. After
this, regA is incremented by 1 if the offsetField is zero and is decremented by1 otherwise.
Put your answer to question 1 of section III here. We have started the problem for you.
fetch:
printState(&state, "fetch");
bus = state.pc;
state.memoryAddress = bus;
goto decode;
decode:
printState(&state,
"decode");
state.pc++;
bus = state.memoryData;
state.instrReg = bus;
goto decode2;
decode2:
printState(&state,
"decode2");
if((state.instrReg >> 22) == 0)
goto add;
……………
if((state.instrReg >> 22) == 3)
goto push1;
push1:
printState(&state, "push1");
(Note:
When grading this problem we did not look for wonderful C syntax. If the idea was plain and not wrong we took
it.)
(Let
RegA = ((state.instrReg >> 19) &7).
Let (RegB=(state.instrReg>>16) &7)
bus=state.reg[RegA];
state.memoryAddress=Bus;
goto push2;
push2:
printState(&state,”push2”);
bus=state.reg[RegB];
state.memoryData=bus;
goto push3;
push3:
printState(&state,”push3”);
bus=state.instrReg & 0xFFFF;
state.ALUoperand=bus;
goto push4;
push4:
printState(&state,”push4”);
bus=0;
state.ALUresult=bus+state.ALUOperand;
goto push5;
push5:
printState(&state,”push5”);
bus=state.reg[regA];
state.ALUOperand=bus;
if(state.ALUresult==0)
goto neg1;
else
goto pos1;
neg1:
bus=-1;
state.ALUresult=state.ALUoperand+bus;
goto done;
pos1:
bus=1;
state.ALUresult=state.ALUoperand+bus;
goto done;
done:
bus=state.ALUresult;
state.reg[RegA]=bus;
goto fetch;
Consider the push instruction from the previous question and the
multi-cycle datapath used in class (provided).
Assume you could add the necessary connections and MUXes needed. What would happen in each cycle of the
execution of a push instruction? You
are to use as few cycles as possible and may not add anything other than
wires and MUXes (no additional adders, registers, etc.) If a given state is unneeded, label it as unneeded,
otherwise give a brief description (in psuedo-code) of the events in each stage
(you may add more stages if needed). We
have provided the fetch and decode part and you may not modify them. (20
points)
Fetch:
InstReg=MEM[PC]
PC=PC+1
Goto Decode
Decode:
Readdata1=Register[InstReg bits 22-24]
Readdata2=Register[InstReg bits 19-21]
…………
If (push) Goto Push1
Push1:
// hold Readdatas from Decode
MEM[Readdata1]=Readdata2
if(signextend(InstReg bits 0-15) == 0)
then goto Push2
else goto Push4
Push2:
// hold Readdata1 from Push1
ALUout=Readdata1 + 1
Push3:
// hold Readdata1 and ALUout from Push2
Register[InstReg bits 19-24]=ALUout
goto Fetch
Push4:
// hold Readdata1 from Push1
ALUout=Readdata1 + -1
Push5:
// hold Readdata1 and ALUout from Push4
Register[InstReg bits 19-24]=ALUout
goto Fetch
Common
mistake:
·
missing
that we need to hold values on ALU and hence having one writeback state.
o
Not
putting states but paying lip service to holding values
o
Missing
WB or using eq as mux control, or doing both WB and EX in one cycle thereby
missing the problem.
Note:
We pretty much took it if it was right and
we could figure out what people were doing.