ICS (Architecture)

Fudan University / 2020 Spring

Assignment 3 Part I (Cache)

In assignment 3, you are required to add a Cache and a Branch Predictor to your MIPS Pipeline Processor. Because of the amount of work and documentation involved, assignment 3 is divided into two parts. The first part covers the Cache, and the second covers the Branch Predictor.

The first part will make up 25% of the final score. It is due on May 11, and you must pass all the tests (cpu_tb.sv is to be delayed) before submitting your work. We have also recently made some changes to the checker; please refer to the checker guidelines for details (if you have already seen this in assignment 2, you can ignore it).

In Part I, you should recall what you learned about caches in ICS and follow this doc to implement both an instruction cache and a data cache. Before you start, you need to import the code in src/.

Description

[Figure: cache organization, from CSAPP]

In the above picture, the 32-bit address is divided into three parts: Tag, Set index and Block offset. The widths of these parts are defined as CACHE_T, CACHE_S and CACHE_B in cache.vh. cache.vh also defines CACHE_E, which is the number of lines in one set.

If CACHE_B is 4 and CACHE_S is 2, then each line stores 2^4 = 16 bytes (4 words) and the cache has 2^2 = 4 sets.
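The address split can be checked with a small software model before you write any Verilog. The sketch below is only an illustration in Python: the function name split_address is hypothetical, and CACHE_T is assumed to be 32 - CACHE_S - CACHE_B so that the three fields cover the whole address.

```python
# Field widths in bits, using the example values from the text.
CACHE_B = 4                          # block offset width -> 16 bytes per line
CACHE_S = 2                          # set index width    -> 4 sets
CACHE_T = 32 - CACHE_S - CACHE_B     # assumed: the tag takes the remaining bits


def split_address(addr):
    """Split a 32-bit address into (tag, set_index, block_offset)."""
    offset = addr & ((1 << CACHE_B) - 1)
    set_index = (addr >> CACHE_B) & ((1 << CACHE_S) - 1)
    tag = (addr >> (CACHE_B + CACHE_S)) & ((1 << CACHE_T) - 1)
    return tag, set_index, offset


if __name__ == "__main__":
    tag, set_index, offset = split_address(0x00000034)
    print(tag, set_index, offset)
```

In hardware the same split is just three bit slices of the address, so no shifter or mask logic is needed.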

When a 32-bit address is given, the cache first finds the set index i of that address, then checks whether set[i] contains a line whose tag matches the tag of the address. If so, the cache hits. Otherwise, the cache misses and we need to load a block of data from memory into a cache line.
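The hit/miss check described above can be sketched as a lookup over one set. This is a behavioral model in Python, not the required implementation: the names Line and lookup are hypothetical, and the sketch assumes (as in CSAPP) that a line only counts as a match when its valid bit is set.

```python
from dataclasses import dataclass


@dataclass
class Line:
    valid: bool = False
    tag: int = 0


def lookup(sets, tag, set_index):
    """Return the index of the hitting line in sets[set_index], or None on a miss."""
    for i, line in enumerate(sets[set_index]):
        # Assumption: an invalid line never hits, even if its stale tag matches.
        if line.valid and line.tag == tag:
            return i
    return None


if __name__ == "__main__":
    sets = [[Line() for _ in range(2)] for _ in range(4)]  # 4 sets, 2 lines each
    sets[3][1] = Line(valid=True, tag=5)
    print(lookup(sets, tag=5, set_index=3))  # hit in line 1
    print(lookup(sets, tag=5, set_index=2))  # miss
```

In hardware all lines of the selected set are compared in parallel, so the loop corresponds to one comparator per line.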

In the real world, memory on a 32-bit processor can provide at most 32 bits (one word) of data per clock cycle. So reading a block of data from memory takes multiple clock cycles: at least one per word, e.g. at least 4 cycles for a 16-byte block.

On a cache miss, if the selected set is full of valid lines and the line chosen for replacement is dirty, we must first write its data back to memory and then load the new data into this line.
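The order of operations on a miss (write back if dirty, then load) can be illustrated with the following Python sketch. All names here are hypothetical: memory is modeled as a dict from byte address to word, the line stores its block base address instead of a tag for simplicity, and one word is assumed to move per memory transfer.

```python
WORDS_PER_BLOCK = 4  # 16-byte block, as in the CACHE_B = 4 example


def replace_line(line, memory, new_base):
    """Replace the block held in `line` with the block at `new_base`.

    Returns the number of word transfers to or from memory.
    """
    transfers = 0
    # Step 1: a valid dirty line holds the only up-to-date copy, so write it back first.
    if line["valid"] and line["dirty"]:
        for i in range(WORDS_PER_BLOCK):
            memory[line["base"] + 4 * i] = line["data"][i]
            transfers += 1
    # Step 2: load the new block, one word per transfer.
    line["data"] = [memory.get(new_base + 4 * i, 0) for i in range(WORDS_PER_BLOCK)]
    transfers += WORDS_PER_BLOCK
    line.update(valid=True, dirty=False, base=new_base)
    return transfers


if __name__ == "__main__":
    memory = {0x40: 7}
    line = {"valid": True, "dirty": True, "base": 0x00, "data": [1, 2, 3, 4]}
    print(replace_line(line, memory, 0x40), line["data"])
```

Under these assumptions a dirty replacement costs twice as many memory transfers as a clean one, which is why the processor has to wait longer in that case.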

To handle all of the above, you are advised to use Finite State Machines (FSMs) in the cache.

Cache Module

Our cache module acts as a bridge between the MIPS processor and the memory. You need to implement the cache module defined below. In addition, you may need to stall your processor during a cache miss.

cache.v:

cache.vh:

You'd better review 6.4 Cache Memories in CSAPP first and understand how cache set and line modules should be organized (PS: you'd better also create a module for Line Replacement Strategy). You can refer to module declarations listed below.

Hint: the declarations below are for reference only. You may modify them (e.g. the exposed ports) to suit your own cache implementation. However, cache.v should not be changed: all of its ports are required, and we rely on them for testing.

Reference Implementation

Structure

Below is the structure of the cache module. The cache_controller, replacement_controller, set and line modules are instantiated directly or indirectly in cache.

In cache module, there are several sets and a cache_controller which is implemented with FSMs to control the cache.

cache_controller provides the control signals. You need to route them to the corresponding cache set (selected by the set index of the address from the processor). Within that set, the control signals needed by a line should be routed to the correct cache line (the one whose tag matches the tag of the address).

Every set instantiates a replacement_controller. If the cache misses and the set is full, the replacement_controller in that set selects one line to evict. You should also make sure that the replacement strategy in replacement_controller updates its state correctly.

Line

The line module behaves much like dmem, except that it has extra properties (registers) and corresponding control signals: valid, dirty, tag and set_valid, set_dirty, set_tag.
If w_en is 1, write_data is written into the line, and set_valid, set_dirty and set_tag are assigned to the valid, dirty and tag registers in the same clock cycle. If w_en is 0, nothing in the line changes.
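The write behavior of line can be modeled in a few lines of Python, which may help when you write a testbench. The signal names w_en, write_data, set_valid, set_dirty and set_tag come from the text above; the class name, the clock method and the word-indexed offset argument are assumptions made for this sketch.

```python
class Line:
    """Behavioral model of one cache line (not synthesizable code)."""

    def __init__(self, words=4):
        self.valid = False
        self.dirty = False
        self.tag = 0
        self.data = [0] * words

    def clock(self, w_en, offset, write_data, set_valid, set_dirty, set_tag):
        """Model one rising clock edge."""
        if w_en:
            # Data and all three property registers update in the same cycle.
            self.data[offset] = write_data
            self.valid = set_valid
            self.dirty = set_dirty
            self.tag = set_tag
        # With w_en == 0 nothing changes.


if __name__ == "__main__":
    line = Line()
    line.clock(w_en=1, offset=2, write_data=0xAB,
               set_valid=True, set_dirty=True, set_tag=9)
    print(line.valid, line.dirty, line.tag, line.data)
```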

Set

The set module instantiates several line modules. To conveniently instantiate an array of custom modules, follow this link for more details. You are also advised to use a Line Replacement Strategy module (replacement_controller) in the set module.

ctls are the control signals from cache_controller. Some of them should be forwarded to the cache line whose tag matches the tag of the address.

In ctls, strategy_en is the enable signal of the Line Replacement Strategy. We need it because a single cache read/write operation may span many clock cycles, and some replacement strategies (e.g. LFU) would be distorted if the strategy updated its state in every clock cycle.
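To see why strategy_en matters, consider the following Python sketch of an LFU-style controller. It is only an illustration: the class and method names are hypothetical, and your replacement_controller may use a different strategy or different bookkeeping.

```python
class LfuController:
    """Behavioral model of an LFU replacement controller for one set."""

    def __init__(self, num_lines):
        self.counts = [0] * num_lines

    def clock(self, strategy_en, accessed_line):
        """Model one clock edge; the counters only move when strategy_en is 1."""
        if strategy_en:
            self.counts[accessed_line] += 1

    def victim(self):
        """Return the least frequently used line (lowest index on ties)."""
        return self.counts.index(min(self.counts))


if __name__ == "__main__":
    ctrl = LfuController(2)
    # One access to line 0 that takes 5 clock cycles; strategy_en is high only once.
    for cycle in range(5):
        ctrl.clock(strategy_en=(cycle == 0), accessed_line=0)
    # Two single-cycle accesses to line 1.
    ctrl.clock(True, 1)
    ctrl.clock(True, 1)
    print(ctrl.counts, ctrl.victim())
```

Without the enable, the 5-cycle access would be counted five times, and line 1 would be chosen for eviction even though it was accessed more often.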

offset_sel is used to select from mread_data and write_data. The selected data will be written into line as write_data.

Cache Controller

Personally, I suggest you use two FSMs. One counts through the block offsets. The other records the action state of the cache: ReadMem (read a block from memory), WriteBack (write a block to memory), Initial (initial state), etc. Note that these two FSMs are connected. For example, WriteBack changes to ReadMem only when the offset reaches a certain value (e.g. the last word of the block).
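The interaction between the two FSMs can be sketched as follows. This Python model is one possible arrangement, not the reference design: the state names follow the text, but the inputs miss and victim_dirty, the 4-word block size and the exact transition timing are assumptions.

```python
from enum import Enum


class State(Enum):
    INITIAL = 0
    WRITE_BACK = 1
    READ_MEM = 2


WORDS_PER_BLOCK = 4  # assumed block size (CACHE_B = 4)


class CacheController:
    """Two coupled FSMs: an action state and a word-offset counter."""

    def __init__(self):
        self.state = State.INITIAL
        self.offset = 0

    def clock(self, miss, victim_dirty):
        """Advance both FSMs by one clock cycle."""
        if self.state is State.INITIAL:
            self.offset = 0
            if miss:
                self.state = State.WRITE_BACK if victim_dirty else State.READ_MEM
            return
        # In WRITE_BACK or READ_MEM one word is transferred per cycle.
        last = self.offset == WORDS_PER_BLOCK - 1
        self.offset = 0 if last else self.offset + 1
        if last:
            # The action state changes only when the offset counter wraps.
            if self.state is State.WRITE_BACK:
                self.state = State.READ_MEM
            else:
                self.state = State.INITIAL


if __name__ == "__main__":
    ctrl = CacheController()
    ctrl.clock(miss=True, victim_dirty=True)
    for _ in range(8):
        print(ctrl.state.name, ctrl.offset)
        ctrl.clock(miss=False, victim_dirty=False)
    print(ctrl.state.name, ctrl.offset)
```

In this sketch the processor would be stalled whenever the state is not Initial.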

Report Requirements