KEY EECS 370 Exam 1 KEY

Fall 2002,  Prof. Mark Brehob

 

Name: ____________________________________    UM ID: _____________

 

 

Sign the honor code:

                       

I have neither given nor received aid on this exam nor observed anyone else doing so.

 

 

___________________________________

 

 

 

 

Scores:

#

Points

1

       /20

2

       /40

3

       /40

Total

       /100

 

NOTES:


 Section I (20 points)

Multiple choice/matching

 

Multiple choice questions (4 points each): Circle the letter of your answer.  Only select one answer.  If we can’t tell which answer you selected you will get no points.

 

1.        A 0-address architecture is also called: 

a)       A stack architecture (ALU operations take no argument as the stack is the source and dest. location.)

b)       An accumulator architecture

c)       A 2.5-D architecture

d)       A RISC architecture

e)       A MUSE architecture

 

 

2.        Executing which of the following LC assembly instructions clears register 1 (that is sets register 1 equal to 0?)

a)       nand  1 1 1

b)       .fill 65536

c)       lw    0 1 0

d)       .fill 1 (This is an add 0 0 1)

e)       add   1 0 0

 

3.        If you were to write out the largest possible 32-bit floating point number as a decimal integer, about how many digits  would there be?

a)       256

b)       150

c)       39 (210=1024=~1000=103.  So 2128=~10128/3=1042.  So the answer is close to 42)

d)       13

e)       8

 

 

4.        If each activation record requires 256 bytes of storage, and the maximum stack size possible on a machine is 3000 bytes, which of the following recursive function calls will not be able to execute on the machine? Choose the best answer.

 

int combination(int n,int r)

{

if ((r==0) || (n==r))

return(1);

else

return(combination(n-1,r)+combination(n-1,r-1));

}

 

a)       combination(16,0)

b)       combination(16,4)

c)       combination(8,4)

d)       combination(16,16)

e)       more than one of the above

a has a depth of 1.  b has a depth of 17.  c of 9,  d of 1.  So only b goes past 3000 bytes (17*256)

 

 


Matching (4 points)

In the blank provided write the letter which best defines the associated term  (1 point each)

 

Linker        ___F___             a) converts high-level language to assembly code

                                                b) converts assembly language to high-level language

Loader       ___E___            c) converts assembly language to machine code

                                                d) executes the instructions

Assembler ___C__              e) places machine code from hard disk to main memory

                                                f) computes addresses of unresolved symbols in the object files

Compiler   ___A___

 

 

 

Section II: Short answer (40 points)

1. Assembly language (15 points)

(Note: On this problem you will get full credit for a working answer, half credit for a very close answer and no credit for anything else!)

               

                Consider the following psuedo-code:

 

                B=C+D

      if(A==(B+1))

          B=B-3;

               

Write a valid LC2K2 assembly program which does the same thing.  Let Reg1correspond to A, Reg2 to B, Reg3 to C and Reg4 to D.  Your program should end with the same values in A,B,C and D as the above code.

 

 

add 3 4 2     // B=C+D

lw 0 5 pos1   // r5=1

add 2 5 5     // r5=B+1

beq 1 5 take  // if (A==(B+1) goto take

halt

take  lw 0 5 neg3   //r5=-3

      add 2 5 2     // B=B-3

      halt

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

2. Addressing modes (10 points)

a)       Define the term "PC-relative addressing mode" (2 points)

 

When the value or address is computed using the Program Counter’s current value.  Usually PC + an offset.

 

 

b)       Define the term "immediate addressing mode" (2 points)

 

When a constant, kept in the instruction encoding, is used as the operand.  addi in MIPs is an example.

 

 

c)       Why do the MIPS branches use PC-relative addressing rather than immediate addressing for their destination? (6 points)

 

There are two reasons.  First of all we might want to branch to any word-aligned address in memory.  That would require 30 bit of the instruction to be dedicated to the immediate value.  That would use ¼ of all encodings.  So we need something that uses less bits.  Because we tend to branch to locations which are nearby it makes sense to branch to a location relative to the current instruction.  So we use PC-relative addressing.

 

 

 

 

 

 

3. Short design question (15 points)

For this problem, recall that in the multi-cycle datapath (discussed in class and attached) adds and stores took 4 cycles, branches took 4 cycles if the branch was taken and 3 cycles if it wasn’t taken, and loads took 5 cycles.

 

Suppose we wish to execute 100 instructions consisting of 25% loads, 20% stores, 30% branches (half of which are taken), and 25% adds with the following propagation delays: 

 

     1. memory access: 2 ns

     2. register read or write: 4 ns

     3. ALU operations: 2 ns

     4. all other devices 0ns

 

In that case cycle time of 4ns would have to be used, since the maximum propagation delay is a register read/write at 4ns. If we modify the design to devote 2 cycles to each register access, we can reduce the clock cycle time to 2ns.  Given a 2ns clock and a 2-cycle register access, what is the expected execution time for these 100 instructions?  You must show your work.

 

 

 

Old cycle times:

LW             = 5

ADD            = 4

SW             = 4

BR (taken)     = 4

BR (not taken) = 3

 

New cycle times (given 2 cycle register access):

LW             = 7   (one reg read, one reg write)

ADD            = 6   (one reg read, one reg write)

SW             = 5   (one reg read)

BR (taken)     = 5   (one reg read)

BR (not taken) = 4   (one reg read)

 

Note that while ADDs SWs and BRs all read two different registers, reads

of the register file happen in parallel and are counted as one cycle.

 

Given a 2ns clock, execution time is:

 

LW     (7 cycles * 2 ns/cycle * 25 instructions) = 350 ns

ADD    (6 cycles * 2 ns/cycle * 25 instructions) = 300 ns

SW     (5 cycles * 2 ns/cycle * 20 instructions) = 200 ns

BR(T)  (5 cycles * 2 ns/cycle * 15 instructions) = 150 ns

BR(NT) (4 cycles * 2 ns/cycle + 15 instructions) = 120 ns

 

350 + 300 + 200 + 150 + 120 = 1120 ns

 


Section III -- Design question (40 points)

 

In a new version of the LC2K2 ISA, the sw instruction has been replaced with a push instruction.  It has the following specification:

 

push (I-type format)   011            Store RegB into memory. The memory 
                                      address is the contents of regA.  After 
                                      this, regA is incremented by 1 if the 
                                      offsetField is zero and is decremented by
                                      1 otherwise.  

 

1.        Using the datapath used in project 2 (attached) complete the execution for the push instruction.  You may not modify the fetch, decode, or decode2 states that are provided.  Use the same specifications used in project 2 with the following exceptions:

·         Memory read accesses take zero time. (That is, when the memoryAddress register is set, memoryData becomes available in the next clock.)

·         Memory write accesses take zero time. (That is, once the memoryData register is written to, the data is immediately stored to the address specified by memoryAddress.)

·         Constants may be put on the bus. (For example, the statement bus=4 is legal.)

(20 points)

 

 

 

 

 

 

So that you have enough room, we have provided the next sheet for your answer!  Use this page only for scrap work.

 


push (I-type format)   011            Store RegB into memory. The memory 
                                      address is the contents of regA.  After 
                                      this, regA is incremented by 1 if the 
                                      offsetField is zero and is decremented by
                                      1 otherwise.  

 


Put your answer to question 1 of section III here.  We have started the problem for you. 

fetch:

      printState(&state, "fetch");

      bus = state.pc;

      state.memoryAddress = bus;

      goto decode;

             

decode:

      printState(&state, "decode");

      state.pc++;

      bus = state.memoryData;

      state.instrReg = bus;

      goto decode2;

 

decode2:

      printState(&state, "decode2");

if((state.instrReg >> 22) == 0)

            goto add;

      ……………

      if((state.instrReg >> 22) == 3)

            goto push1;

push1:

printState(&state, "push1");

(Note: When grading this problem we did not look for wonderful C syntax.  If the idea was plain and not wrong we took it.)

     

(Let RegA = ((state.instrReg >> 19) &7).  Let (RegB=(state.instrReg>>16) &7)

 

bus=state.reg[RegA];

state.memoryAddress=Bus;

goto push2;

 

push2:

printState(&state,”push2”);

bus=state.reg[RegB];

state.memoryData=bus;

goto push3;

 

push3:

printState(&state,”push3”);

bus=state.instrReg & 0xFFFF;

state.ALUoperand=bus;

goto push4;

push4:

printState(&state,”push4”);

bus=0;

state.ALUresult=bus+state.ALUOperand;

goto push5;

push5:

 printState(&state,”push5”);

 bus=state.reg[regA];

state.ALUOperand=bus;

if(state.ALUresult==0)

            goto neg1;

else

            goto pos1;

 

neg1:

      bus=-1;

      state.ALUresult=state.ALUoperand+bus;

      goto done;

pos1:

      bus=1;

      state.ALUresult=state.ALUoperand+bus;

      goto done;

 

done:

      bus=state.ALUresult;

      state.reg[RegA]=bus;

      goto fetch;

 

 


Consider the push instruction from the previous question and the multi-cycle datapath used in class (provided).  Assume you could add the necessary connections and MUXes needed.   What would happen in each cycle of the execution of a push instruction?  You are to use as few cycles as possible and may not add anything other than wires and MUXes (no additional adders, registers, etc.)  If a given state is unneeded, label it as unneeded, otherwise give a brief description (in psuedo-code) of the events in each stage (you may add more stages if needed).  We have provided the fetch and decode part and you may not modify them. (20 points)

 

 

 

Fetch:

      InstReg=MEM[PC]

      PC=PC+1

      Goto Decode

 

Decode:

      Readdata1=Register[InstReg bits 22-24]

      Readdata2=Register[InstReg bits 19-21]

      …………

If (push) Goto Push1

Push1:

  // hold Readdatas from Decode

  MEM[Readdata1]=Readdata2

  if(signextend(InstReg bits 0-15) == 0)

  then goto Push2

  else goto Push4

Push2:

  // hold Readdata1 from Push1

  ALUout=Readdata1 + 1

Push3:

  // hold Readdata1 and ALUout from Push2

  Register[InstReg bits 19-24]=ALUout

  goto Fetch

Push4:

  // hold Readdata1 from Push1

  ALUout=Readdata1 + -1

Push5:

  // hold Readdata1 and ALUout from Push4

  Register[InstReg bits 19-24]=ALUout

  goto Fetch

 

Common mistake:

·         missing that we need to hold values on ALU and hence having one writeback state.

o        Not putting states but paying lip service to holding values

o        Missing WB or using eq as mux control, or doing both WB and EX in one cycle thereby missing the problem.

Note:

      We pretty much took it if it was right and we could figure out what people were doing.