





| Components           | Baseline                                                                          | Our Design                                                      |
|----------------------|-----------------------------------------------------------------------------------|-----------------------------------------------------------------|
| Cache                | N-way write-back, write-allocate cache<br>Blocked until the transmission completes |                                                                 |
| Network<br>Interface | Store head, data and tail flits                                                   | Only store head and tail<br>flits                               |
| Router               | Communicate only with the network interface                                       | Communicate with both<br>the network interface<br>and the cache |
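
The data-path difference summarized in the table can be illustrated with a small model. This is only a sketch under stated assumptions: the names `Flit`, `packetize`, `baseline_ni_buffer`, and `proposed_paths` are hypothetical and do not come from the RTL, and the head and tail payloads are placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Flit:
    kind: str     # "head", "data", or "tail"
    payload: int  # placeholder content; real flits carry routing info or data


def packetize(words: List[int]) -> List[Flit]:
    """Wrap the data words of one transfer in a head flit and a tail flit."""
    return [Flit("head", 0)] + [Flit("data", w) for w in words] + [Flit("tail", 0)]


def baseline_ni_buffer(packet: List[Flit]) -> List[Flit]:
    """Baseline: the network interface stores head, data, and tail flits."""
    return list(packet)


def proposed_paths(packet: List[Flit]) -> Tuple[List[Flit], List[Flit]]:
    """Proposed design: the network interface keeps only head and tail flits,
    while data flits travel from the cache to the router directly."""
    ni_flits = [f for f in packet if f.kind in ("head", "tail")]
    direct_flits = [f for f in packet if f.kind == "data"]
    return ni_flits, direct_flits
```

In this toy model, a four-word transfer occupies six buffer entries in the baseline network interface but only two in the proposed one, which is the intuition behind the buffer savings.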





# **Experimental Evaluation**

### Area

| module            | baseline (µm<sup>2</sup>)   | our design (µm<sup>2</sup>) |
|-------------------|-----------------------------|-----------------------------|
| router            | 512,292                     | 518,037                     |
| network interface | 179,558                     | 31,518                      |

- We synthesized the RTL code with Synopsys Design Compiler
- The cache module is unmodified, so its area remains the same
- The router area increases by 1.1% due to the extra ports and control logic needed to communicate with the cache
- The network interface area shrinks by about 5.7× (roughly 6×, an 82% reduction) because its data-flit buffers are eliminated
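
The area figures quoted above follow directly from the table. A minimal sketch of the arithmetic, using the synthesized areas from the table and two hypothetical helper names (`relative_change`, `reduction_factor`):

```python
# Areas in square micrometres, copied from the table above.
baseline = {"router": 512_292, "network interface": 179_558}
our_design = {"router": 518_037, "network interface": 31_518}


def relative_change(before: float, after: float) -> float:
    """Fractional change relative to the baseline (positive means growth)."""
    return (after - before) / before


def reduction_factor(before: float, after: float) -> float:
    """How many times smaller the new area is than the baseline."""
    return before / after


if __name__ == "__main__":
    router_growth = relative_change(baseline["router"], our_design["router"])
    ni_factor = reduction_factor(
        baseline["network interface"], our_design["network interface"]
    )
    print(f"router area change: {router_growth:+.2%}")
    print(f"network interface reduction: {ni_factor:.2f}x")
```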

### Performance



- We obtained memory access traces by running test cases on the EECS 470 processor
- We measured execution cycles by injecting the traces into the caches
- Performance improvement ranges from 0.43% to 1.32%, with an average of 0.77%
- The improvement results from injecting data flits directly into the router, bypassing the network interface
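
The source does not define how the improvement is computed from the cycle counts. One plausible reading, stated here as an assumption, is the relative reduction in execution cycles per test case, averaged across test cases. The function names below are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple


def improvement_pct(baseline_cycles: int, design_cycles: int) -> float:
    """Percentage reduction in execution cycles relative to the baseline.

    Assumed definition; the source does not state the exact metric.
    """
    if baseline_cycles <= 0:
        raise ValueError("baseline_cycles must be positive")
    return 100.0 * (baseline_cycles - design_cycles) / baseline_cycles


def average_improvement_pct(pairs: Iterable[Tuple[int, int]]) -> float:
    """Arithmetic mean of per-test-case improvements.

    Each pair is (baseline_cycles, design_cycles) for one test case.
    """
    return mean(improvement_pct(b, d) for b, d in pairs)
```

Under this definition, a test case that takes 1,000 cycles on the baseline and 990 cycles on the new design (made-up numbers, not measured results) would count as a 1% improvement.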

## Conclusion

- Our design reduces the network interface area by roughly 6× (about 5.7×), while leaving the cache unchanged and adding only 1.1% area overhead to the router
- Our design does not degrade performance; it improves it by 0.77% on average



