Overview of DRAMSim2’s Memory Structure

Several blog posts about DRAMSim2 already exist, but here I’d like to give a brief introduction to its memory structure (not to show how to run DRAMSim2). The description below is based on the source code, so if you suspect that something is incorrect, feel free to read the code to verify it.

DRAMSim2 is a very popular open-source JEDEC DDRx simulator which models the memory controller, memory channels, ranks, banks and all timing constraints.

[Figure: DRAM]

DRAMSim2 can be run standalone or bound to a CPU simulator to run as a full system. At the code level, the difference is the entry point: a standalone run starts in TraceBasedSim.cpp; otherwise, execution starts with the CPU simulator’s call into DRAMSim2. DRAMSim2’s real execution begins with MultiChannelMemorySystem.cpp. As the above figure shows, MultiChannelMemorySystem is the DRAM system, which consists of one or more channels (MemorySystem.cpp); each channel is composed of multiple ranks, and each rank has several banks. Further, each channel has a pendingTransactionQueue and one corresponding memory controller.
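To make the nesting concrete, here is a minimal sketch of the hierarchy described above. All type and function names (BankSketch, RankSketch, ChannelSketch, DramSystemSketch, makeSystem, totalBanks) are my own illustrative assumptions, not DRAMSim2's actual classes, which carry far more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified types: they only mirror the containment
// relationship (system -> channels -> ranks -> banks), nothing else.
struct BankSketch {};
struct RankSketch { std::vector<BankSketch> banks; };
struct ChannelSketch { std::vector<RankSketch> ranks; };           // role of MemorySystem
struct DramSystemSketch { std::vector<ChannelSketch> channels; };  // role of MultiChannelMemorySystem

// Build a system in which every channel and every rank has the same shape.
inline DramSystemSketch makeSystem(std::size_t numChans, std::size_t numRanks,
                                   std::size_t numBanks) {
    assert(numChans >= 1);  // the DRAM system has one or more channels
    DramSystemSketch sys;
    sys.channels.resize(numChans);
    for (ChannelSketch& ch : sys.channels) {
        ch.ranks.resize(numRanks);
        for (RankSketch& rank : ch.ranks) {
            rank.banks.resize(numBanks);
        }
    }
    return sys;
}

// Count every bank by walking the hierarchy top-down.
inline std::size_t totalBanks(const DramSystemSketch& sys) {
    std::size_t count = 0;
    for (const ChannelSketch& ch : sys.channels) {
        for (const RankSketch& rank : ch.ranks) {
            count += rank.banks.size();
        }
    }
    return count;
}
```

For instance, makeSystem(2, 2, 8) gives 2 channels with 2 ranks each and 8 banks per rank.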

Let’s have a closer look at the memory controller, which has four queues: TransQ, the transaction queue that receives and stores incoming transactions; CmdQ, which stores the commands translated from each transaction; PendRTQ, which holds a transaction once its read command has been dispatched to the memory, until the data is returned; and RtnQ, which stores the returned transactions.
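As a rough illustration of how a read moves through these four queues, here is a small sketch that follows the description above. The types and method names (ControllerQueues, receive, translateOne, dispatchOne, dataReturned) are hypothetical, and each transaction is translated into a single command here, whereas the real controller breaks a transaction into several commands and obeys timing constraints.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical, simplified types for illustration only.
enum class TransType { Read, Write };

struct Transaction {
    TransType type;
    std::uint64_t address;
};

struct Command {
    TransType type;
    std::uint64_t address;
};

struct ControllerQueues {
    std::deque<Transaction> transQ;   // incoming transactions
    std::deque<Command>     cmdQ;     // commands translated from transactions
    std::deque<Transaction> pendRTQ;  // reads waiting for their data
    std::deque<Transaction> rtnQ;     // reads whose data has returned

    void receive(const Transaction& t) { transQ.push_back(t); }

    // Translate the oldest transaction into one command (simplification).
    void translateOne() {
        if (transQ.empty()) return;
        const Transaction t = transQ.front();
        transQ.pop_front();
        cmdQ.push_back(Command{t.type, t.address});
    }

    // Dispatch the oldest command; a read is remembered in pendRTQ.
    void dispatchOne() {
        if (cmdQ.empty()) return;
        const Command c = cmdQ.front();
        cmdQ.pop_front();
        if (c.type == TransType::Read) {
            pendRTQ.push_back(Transaction{c.type, c.address});
        }
    }

    // Data for `address` came back: move the matching read to rtnQ.
    void dataReturned(std::uint64_t address) {
        for (auto it = pendRTQ.begin(); it != pendRTQ.end(); ++it) {
            if (it->address == address) {
                rtnQ.push_back(*it);
                pendRTQ.erase(it);
                return;
            }
        }
    }
};

// Usage: one read travels TransQ -> CmdQ -> PendRTQ -> RtnQ.
inline void readRoundTrip() {
    ControllerQueues mc;
    mc.receive(Transaction{TransType::Read, 0x40});
    mc.translateOne();
    mc.dispatchOne();
    assert(mc.pendRTQ.size() == 1);
    mc.dataReturned(0x40);
    assert(mc.pendRTQ.empty() && mc.rtnQ.size() == 1);
}
```

A write goes through TransQ and CmdQ only; it never enters PendRTQ or RtnQ in this sketch.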

With this basic understanding of the structure, let’s move on to the workflows:

Flow of adding transactions:

Each transaction is forwarded to the lower-level components:

  • MultiChannelMemorySystem->addTransaction(trans)
  • MemorySystem->addTransaction(trans): if the MC can accept it, add it directly to the MC’s TransQ; otherwise, add it to MemorySystem’s PendTQ
  • MemoryController->addTransaction(trans)

Flow of update:

  • MultiChannelMemorySystem->update()->actual_update()
  • MemorySystem->update(): update the state of each rank with (*ranks)[i]->update(); then MC->addTrans(PendTQ(0)), i.e. hand the first pending transaction to the MC
  • MemoryController->update()
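The two flows above (adding a transaction, then retrying parked ones on update) can be sketched as follows. This is only my simplified reading of the flow: the names Trans, McSketch, MemorySystemSketch and willAccept, and the fixed TransQ capacity, are assumptions for illustration rather than DRAMSim2's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical, simplified types; the capacity value is an assumption.
struct Trans { std::uint64_t address; };

struct McSketch {                        // role of MemoryController
    std::size_t capacity = 2;            // assumed fixed TransQ depth
    std::deque<Trans> transQ;

    bool willAccept() const { return transQ.size() < capacity; }

    void addTransaction(const Trans& t) {
        assert(willAccept());            // callers must check for room first
        transQ.push_back(t);
    }
};

struct MemorySystemSketch {              // one channel
    McSketch mc;
    std::deque<Trans> pendTQ;            // transactions the MC could not take yet

    // Add flow: go straight to the MC if it has room, otherwise park.
    void addTransaction(const Trans& t) {
        if (mc.willAccept()) {
            mc.addTransaction(t);
        } else {
            pendTQ.push_back(t);
        }
    }

    // Update flow: rank updates and the MC's own update would happen here;
    // this sketch only shows the retry of the first parked transaction.
    void update() {
        if (!pendTQ.empty() && mc.willAccept()) {
            mc.addTransaction(pendTQ.front());
            pendTQ.pop_front();
        }
    }
};
```

With a TransQ depth of 2, a third transaction is parked in pendTQ and only moves to the MC on an update after the MC has drained one entry.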

Flow of executing transactions (MC->update()):

  • Update bank states
  • Check for outgoing cmd/data packets and handle countdown; if packet is ready to be received by rank: (*ranks)[outgoingCmd/DataPacket->rank]->receiveFromBus(packet)
  • If CommandQueue.pop(&poppedBusPacket) yields a valid packet in poppedBusPacket: outgoingCmdPacket=poppedBusPacket
  • Update each bank’s state based on the command that was just popped out of cmdQ
  • Handle TransQ: pop top transaction; if there is room, break up the transaction into appropriate commands and add them to the CmdQ
  • Check for outstanding data to return to CPU

=================================================================================================

As a supplement, I’d like to talk a bit about the capacity calculation used by DRAMSim2:

  • Each column contains DEVICE_WIDTH bits.
  • A row contains NUM_COLS columns.
  • Each bank contains NUM_ROWS rows.

Therefore, the total storage per DRAM device is:

  • PER_DEVICE_STORAGE=NUM_ROWS*NUM_COLS*DEVICE_WIDTH*NUM_BANKS (in bits)

A rank must have a 64-bit output bus (JEDEC standard), so each rank must have:

  • NUM_DEVICES_PER_RANK = 64/DEVICE_WIDTH

(note: if you have multiple channels ganged together, the effective bus width is NUM_CHANS*64 bits, so the device count becomes NUM_CHANS*64/DEVICE_WIDTH)

If we multiply these two numbers to get the storage per rank (in bits), we get:

  • PER_RANK_STORAGE=PER_DEVICE_STORAGE*NUM_DEVICES_PER_RANK = NUM_ROWS*NUM_COLS*NUM_BANKS*64

Finally, to get TOTAL_STORAGE, we need to multiply by NUM_RANKS:

  • TOTAL_STORAGE=PER_RANK_STORAGE*NUM_RANKS
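The formulas above translate almost directly into code. Below is a small sketch; the function names are my own (hypothetical), while the parameters mirror the configuration names used above. All results are in bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; only the formulas come from the text above.
constexpr std::uint64_t kBusWidthBits = 64;  // per-rank data bus width

// PER_DEVICE_STORAGE = NUM_ROWS * NUM_COLS * DEVICE_WIDTH * NUM_BANKS (bits)
inline std::uint64_t perDeviceStorageBits(std::uint64_t numRows, std::uint64_t numCols,
                                          std::uint64_t deviceWidth, std::uint64_t numBanks) {
    return numRows * numCols * deviceWidth * numBanks;
}

// NUM_DEVICES_PER_RANK = 64 / DEVICE_WIDTH
inline std::uint64_t devicesPerRank(std::uint64_t deviceWidth) {
    assert(deviceWidth > 0 && kBusWidthBits % deviceWidth == 0);
    return kBusWidthBits / deviceWidth;
}

// PER_RANK_STORAGE = PER_DEVICE_STORAGE * NUM_DEVICES_PER_RANK (bits)
inline std::uint64_t perRankStorageBits(std::uint64_t numRows, std::uint64_t numCols,
                                        std::uint64_t deviceWidth, std::uint64_t numBanks) {
    return perDeviceStorageBits(numRows, numCols, deviceWidth, numBanks) *
           devicesPerRank(deviceWidth);
}

// TOTAL_STORAGE = PER_RANK_STORAGE * NUM_RANKS (bits)
inline std::uint64_t totalStorageBits(std::uint64_t numRows, std::uint64_t numCols,
                                      std::uint64_t deviceWidth, std::uint64_t numBanks,
                                      std::uint64_t numRanks) {
    return perRankStorageBits(numRows, numCols, deviceWidth, numBanks) * numRanks;
}
```

Note that DEVICE_WIDTH cancels out of the per-rank result: a narrower device stores less, but more of them are needed to fill the 64-bit bus.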

With the above knowledge, let’s analyze one simple example: the total storage is 8192 MB, with 2 channels, 8 banks, 32768 rows/bank, 1024 columns/row and a device width of 16. Below is how to get the number of ranks:

  1. TOTAL_STORAGE = megsOfMemory = 8192MB
  2. CHAN_STORAGE = megsOfMemory/NUM_CHANS = 8192MB/2 = 4096MB
  3. RANK_STORAGE = megsOfStoragePerRank = NUM_ROWS*NUM_COLS*NUM_BANKS*64 bits = (32768*1024*8*64)/(8*1024*1024) MB = 2048MB
  4. NUM_RANKS = CHAN_STORAGE/megsOfStoragePerRank = 4096MB/2048MB=2
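The four steps above can be checked with a few lines of code. This is a sketch of the arithmetic only, with hypothetical function names (rankStorageMB, ranksPerChannel); it assumes the channel storage is an exact multiple of the rank storage.

```cpp
#include <cassert>
#include <cstdint>

// Storage of one rank in MB: NUM_ROWS * NUM_COLS * NUM_BANKS * 64 bits,
// divided by 8 (bits per byte) and by 1024*1024 (bytes per MB).
inline std::uint64_t rankStorageMB(std::uint64_t numRows, std::uint64_t numCols,
                                   std::uint64_t numBanks) {
    const std::uint64_t bits = numRows * numCols * numBanks * 64;
    return bits / 8 / (1024 * 1024);
}

// Number of ranks per channel, derived from the total storage in MB.
inline std::uint64_t ranksPerChannel(std::uint64_t megsOfMemory, std::uint64_t numChans,
                                     std::uint64_t numRows, std::uint64_t numCols,
                                     std::uint64_t numBanks) {
    assert(numChans > 0);
    const std::uint64_t chanMB = megsOfMemory / numChans;                   // step 2
    const std::uint64_t rankMB = rankStorageMB(numRows, numCols, numBanks); // step 3
    assert(rankMB > 0 && chanMB % rankMB == 0);  // assumed: divides evenly
    return chanMB / rankMB;                                                 // step 4
}
```

For the example above, ranksPerChannel(8192, 2, 32768, 1024, 8) evaluates to 2; the device width of 16 does not appear because it cancels out of the per-rank storage.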
