Tags: bucaps/marss-riscv
Release 4.1a

- Added
  - Model an arbitrary fixed latency between the LLC and the memory controller
- Changed
  - For Ramulator and DRAMSim3, each memory access request is split into `MEM_BUS_WIDTH`-sized parts, and the latency for each part is queried
- Fixed
  - Recalculate the rounding mode (rm) before executing an FP instruction during simulation

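
The request splitting described in the 4.1a change can be sketched roughly as follows. This is a minimal illustration, not the simulator's actual code: the names `split_access_latency`, `part_latency_fn`, and `fixed_part_latency` are hypothetical, and summing the per-part latencies is an assumption, since the release note does not say how the parts are combined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: latency in cycles for one bus-width-sized part. */
typedef uint32_t (*part_latency_fn)(uint64_t paddr, uint32_t bytes);

/*
 * Split a request of `size` bytes into parts of at most `bus_width` bytes
 * and query the latency of each part.
 * Assumption: the per-part latencies are added up.
 */
static inline uint32_t split_access_latency(uint64_t paddr, uint32_t size,
                                            uint32_t bus_width,
                                            part_latency_fn query)
{
    assert(bus_width > 0 && query != NULL);

    uint32_t total = 0;
    for (uint32_t off = 0; off < size; off += bus_width) {
        /* The last part may be shorter than the bus width. */
        uint32_t remaining = size - off;
        uint32_t bytes = remaining < bus_width ? remaining : bus_width;
        total += query(paddr + off, bytes);
    }
    return total;
}

/* Stand-in for a DRAM model query: every part costs 10 cycles. */
static inline uint32_t fixed_part_latency(uint64_t paddr, uint32_t bytes)
{
    (void)paddr;
    (void)bytes;
    return 10;
}
```

With this stand-in, a 64-byte request over an 8-byte bus is queried as eight parts, and a 20-byte request as three parts (8, 8, and 4 bytes).
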
Release v4.0a

- Added
  - Comprehensive logging support
  - Command-line option `-sim-file-path` to specify a top-level directory to store statistics and log files
  - Command-line option `-sim-file-prefix` to specify the prefix appended to all simulator-generated files
  - Command-line option `-sim-emulate-after-icount` to specify the number of instructions to simulate after starting simulation mode
  - [DRAMsim3](https://github.com/umd-memsys/DRAMSim3) support
  - [Ramulator](https://github.com/CMU-SAFARI/ramulator) support
  - Sample MARSS-RISCV configuration files for a 64-bit RISC-V in-order and out-of-order SoC in the [configs](./configs) folder
  - More performance counters to count different types of load instructions (byte, half-word, word, double-word)
  - Timestamp on all statistics files generated by the simulator
  - Specify latency in CPU cycles for RISC-V `SYSTEM` class instructions in the config file
  - Counter to track the number of CPU pipeline flushes
  - Counters to track each type of software exception and hardware interrupt processed during simulation
  - Parallel build support for Makefile
  - During the simulation, `mtime` is calculated using simulation clock cycles
  - Specify frequency for CPU and RTC device via the config file
  - Add option `flush_on_context_switch` in the config file to enable/disable flushing of BPU on a context switch
  - Start fetching the target from the next cycle on branch misprediction
  - Loads of non-word quantities (byte and half-word) take one extra cycle on a cache hit
  - Add function to invalidate entries in `mem_request_queue` on the mis-speculated path
- Changed
  - Refactor and modularize the simulator code base
  - STORE type instructions submit a write request to the L1 data cache and exit the memory stage in a single cycle
  - Delay for reading/writing page-table entries is now simulated via the L1 data cache
  - Print IPC for all the RISC-V CPU modes to the console and log file after the simulation completes
  - In-order core doesn't support parallel execution in multiple functional units
  - Replace hot-cold LRU eviction policy with bit-PLRU eviction policy for BTB and caches
  - Improve the format of TinyEMU config file
  - Update [MARSS-RISCV Docs](https://marss-riscv-docs.readthedocs.io/en/latest/)
  - Update README.md
  - Page walk delays are simulated via L1 D$
  - Removed [DRAMsim2](https://github.com/umd-memsys/DRAMSim2) support
- Fixed
  - Memory leaks
  - Don't start simulating DRAM access delay until cache lookup delay is simulated
  - Branch entry is added to BTB only after the branch is resolved

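
For readers unfamiliar with the bit-PLRU eviction policy named in the v4.0a changes, here is a minimal sketch of the general technique for a single cache set. It is not taken from the simulator source: `NUM_WAYS`, `plru_set_t`, `plru_touch`, and `plru_victim` are hypothetical names, and the 4-way associativity is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_WAYS 4 /* hypothetical associativity, at most 8 for uint8_t */

/* One MRU bit per way; bit w set means way w was used recently. */
typedef struct {
    uint8_t mru_bits;
} plru_set_t;

/* Record an access to `way`. */
static inline void plru_touch(plru_set_t *s, int way)
{
    assert(way >= 0 && way < NUM_WAYS);

    s->mru_bits |= (uint8_t)(1u << way);

    /* When every bit is set, clear all of them except the one just used. */
    if (s->mru_bits == (1u << NUM_WAYS) - 1)
        s->mru_bits = (uint8_t)(1u << way);
}

/* Pick the lowest-numbered way whose MRU bit is clear. */
static inline int plru_victim(const plru_set_t *s)
{
    for (int w = 0; w < NUM_WAYS; w++) {
        if (!(s->mru_bits & (1u << w)))
            return w;
    }
    return 0; /* not reached if bits are only updated through plru_touch */
}
```

Compared with true LRU, this keeps only one bit of state per way, which is why it is a common choice for BTBs and caches.
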
Release v3.1a

- Added
  - Print TLB stats to the terminal after the simulation completes
  - Specify latency for each FPU ALU instruction (`fadd`, `fsub`, `fmul`, `fdiv`, `fmin`, `fmax`, `fcvt`, `cvt`, `fle`, `flt`, `feq`, `fsgnj`, `fsqrt`, `fmv`, `fclass`) via TinyEMU config file
  - Figure showing the high-level overview of MARSS-RISCV in README.md
- Changed
  - Simplify the base DRAM model
    - All memory accesses simulate a fixed latency `mem_access_latency`
    - Any subsequent access to the same physical page incurs a lower delay, roughly 60 percent of the fixed `mem_access_latency`
    - More info [here](https://marss-riscv-docs.readthedocs.io/en/latest/)
  - Parallel operation of functional units can be enabled or disabled in the in-order core via TinyEMU config file
  - Clean up exception handling code
  - Simulate page table entry read/write delays directly via memory controller using a configurable fixed latency `pte_rw_latency`
  - Don't stall the pipeline stage waiting for the write request to complete on the memory controller
  - Make FPU-ALU non-pipelined
  - Rename `dram_dispatch_queue` to `mem_request_queue`
  - Update [MARSS-RISCV Docs](https://marss-riscv-docs.readthedocs.io/en/latest/)
  - Update README.md
  - Update TinyEMU config file [here](https://cs.binghamton.edu/~marss-riscv/marss-riscv-images.tar.gz)
- Fixed
  - Memory leaks

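
The simplified base DRAM model from v3.1a can be illustrated with a short sketch. Only `mem_access_latency` and the roughly 60 percent figure come from the release note. The struct `base_dram_t`, the function `base_dram_latency`, the 4 KiB page size, and tracking just the most recently accessed page are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB physical pages */

typedef struct {
    uint32_t mem_access_latency; /* fixed latency in cycles */
    uint64_t last_page;          /* physical page of the previous access */
    int      has_last_page;      /* 0 until the first access is seen */
} base_dram_t;

/* Return the simulated latency for an access to physical address `paddr`. */
static inline uint32_t base_dram_latency(base_dram_t *d, uint64_t paddr)
{
    assert(d != NULL);

    uint64_t page = paddr >> PAGE_SHIFT;
    uint32_t latency = d->mem_access_latency;

    /* A repeat access to the same page gets roughly 60 percent of the delay. */
    if (d->has_last_page && page == d->last_page)
        latency = (latency * 60) / 100;

    d->last_page = page;
    d->has_last_page = 1;
    return latency;
}
```

With `mem_access_latency` set to 50, a first access to a page costs 50 cycles and an immediately following access to the same page costs 30.
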
Release v3.0a

- Added
  - Support for separate RISC-V BIOS and kernel
  - Command-line option `flush-sim-mem` to flush the simulator memory hierarchy on every fresh simulation run
  - Command-line option `sim-trace` to generate an instruction commit trace during simulation
  - Distinct configurable read-hit and write-hit latency for all the caches
  - Return address stack (RAS)
  - Branch prediction and speculative execution support for the out-of-order core
  - Print performance counters on the terminal when the simulation completes
  - More performance counters:
    - Instruction types
    - ecall
    - page walks for loads, stores and instructions
    - memory controller delay for data and instructions
    - hardware interrupts
- Changed
  - Port to TinyEMU version `2019-12-21`
  - For the bimodal branch predictor, store prediction bits in a separate branch history table (BHT)
  - For the in-order core, non-memory instructions can forward their result from the MEM stage in addition to the EX stage
  - For the in-order core, relaxed interlocking on WAW data hazards
  - Simplified out-of-order core design: ROB slots are now used as physical registers, along with a single rename table and a single global issue queue
- Fixed
  - Correctly calculate the rounding mode when decoding floating-point instructions
  - Converted `c.addiw` result buffer into `int32_t` on 64-bit simulation
  - Set the data type to `uint64_t` for 64-bit simulation, for the buffer which holds the memory address for atomic instructions
  - Issue #13 and #14 (thanks to Okhotnikov Grigory)

Release v1.1a

- Added 16550A UART support (thanks to Marc Gauthier)
- Reworked the DRAM latency parameters to match the SiFive HiFive U540 board
- Increased the DRAM dispatch queue size from 32 to 64
- Add a timestamp suffix to the stats file
- Bug fix: fixed the calculation of hardware page walk latency
- Bug fix: fixed the miscalculation in page fault counters
- Fixed Issue #2: memory leaks in copy_file
- Fixed Issue #3: 'log' instead of 'log2'