Skip to content

daneshvar-amrollahi/ARM

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ARM Processor Pipeline Implementation

A comprehensive 5-stage pipelined ARM processor implementation in Verilog, featuring advanced features like data forwarding, hazard detection, cache memory, and SRAM controller.

Architecture Overview

This project implements a complete ARM processor with the following pipeline stages:

  • IF (Instruction Fetch) - Fetches instructions from memory
  • ID (Instruction Decode) - Decodes instructions and reads register file
  • EXE (Execute) - Performs ALU operations and address calculations
  • MEM (Memory) - Handles memory operations (load/store)
  • WB (Write Back) - Writes results back to register file

Key Features

Pipeline Control

  • 5-stage pipeline with proper hazard detection and control
  • Data forwarding unit to minimize pipeline stalls
  • Hazard detection for load-use dependencies
  • Branch prediction and branch target handling
  • Pipeline flushing for control hazards

Memory System

  • Cache controller with 2-way set associative cache (64 rows)
  • SRAM controller for external memory interface
  • Memory hierarchy with cache-SRAM integration
  • Configurable cache size and associativity

Advanced Features

  • Data forwarding from MEM and WB stages to EXE stage
  • Condition code evaluation for conditional execution
  • Status register management (N, Z, C, V flags)
  • Immediate value handling with sign extension
  • Barrel shifter support (LSL, LSR, ASR, ROR)

Supported Instructions

Arithmetic Instructions

  • MOV - Move data
  • MVN - Move negated
  • ADD - Addition
  • ADC - Add with carry
  • SUB - Subtraction
  • SBC - Subtract with carry

Logical Instructions

  • AND - Bitwise AND
  • ORR - Bitwise OR
  • EOR - Bitwise XOR (Exclusive OR)

Comparison Instructions

  • CMP - Compare (sets flags only)
  • TST - Test (bitwise AND, sets flags only)

Memory Instructions

  • LDR - Load register from memory
  • STR - Store register to memory

Conditional Execution

All instructions support ARM's conditional execution with the following condition codes:

  • EQ (Equal), NE (Not Equal)
  • CS/HS (Carry Set/Higher Same), CC/LO (Carry Clear/Lower)
  • MI (Minus), PL (Plus)
  • VS (Overflow Set), VC (Overflow Clear)
  • HI (Higher), LS (Lower Same)
  • GE (Greater Equal), LT (Less Than)
  • GT (Greater Than), LE (Less Equal)
  • AL (Always)

Core Components

Main Modules

  • ARM.v - Top-level processor module
  • ALU.v - Arithmetic Logic Unit
  • ControlUnit.v - Instruction decoder and control signal generator
  • RegisterFile.v - 16-register register file
  • StatusRegister.v - Condition flags register

Pipeline Stages

  • IF_Stage.v / IF_Stage_Reg.v - Instruction fetch stage and register
  • ID_Stage.v / ID_Stage_Reg.v - Instruction decode stage and register
  • EXE_Stage.v / EXE_Stage_Reg.v - Execute stage and register
  • MEM_Stage.v / MEM_Stage_Reg.v - Memory stage and register
  • WB_Stage.v - Write back stage

Hazard Control

  • HazardDetector.v - Detects data hazards and generates stall signals
  • Forwarding.v - Implements data forwarding logic
  • Condition_Check.v - Evaluates condition codes for conditional execution

Memory System

  • memory.v - Main instruction/data memory
  • cache.v - Cache memory implementation
  • cache_controller.v - Cache controller with miss handling
  • SRAM.v / SRAM_Controller64.v - External SRAM interface

Utility Modules

  • mux2to1.v / mux_3_to_1.v - Multiplexers
  • register.v - Generic register module
  • incrementer.v - PC incrementer
  • val2gen.v - Value generation utilities

Configuration

The processor is highly configurable through defines.v:

`define ADDRESS_LEN             32     // Address bus width
`define INSTRUCTION_LEN         32     // Instruction width
`define REGISTER_LEN           32     // Register width
`define REGISTER_MEM_SIZE      16     // Number of registers
`define CACHE_ROWS             64     // Cache size
`define TAG_LEN               10     // Cache tag length

Cache Configuration

  • 64 rows with 2-way set associativity
  • LRU replacement policy
  • Write-through cache policy
  • Configurable tag length (10-bit default)

Memory Addressing

  • 32-bit addressing with byte-addressable memory
  • Word-aligned memory access
  • 2KB instruction memory (configurable)

Testing

The project includes comprehensive test infrastructure:

  • Testbench.v - Main processor testbench
  • test_cache.v - Cache-specific test module
  • Pre-programmed test instructions in memory.v
  • Instruction counter and monitoring capabilities

Running Tests

# Compile and simulate using your preferred Verilog simulator
# Example with ModelSim:
vlog *.v
vsim -c TB
run -all

Project Structure

ARM/
├── ALU.v                        # Arithmetic Logic Unit
├── ARM.v                        # Top-level processor module
├── cache_controller.v           # Cache controller with miss handling
├── cache.v                      # Cache memory implementation
├── Condition_Check.v            # Condition code evaluation
├── ControlUnit.v                # Instruction decoder and control signals
├── defines.v                    # System configuration parameters
├── EXE_Stage_Reg.v             # Execute stage pipeline register
├── EXE_Stage.v                 # Execute pipeline stage
├── Forwarding.v                # Data forwarding logic
├── HazardDetector.v            # Pipeline hazard detection
├── ID_Stage_Reg.v              # Decode stage pipeline register
├── ID_Stage.v                  # Instruction decode stage
├── IF_Stage_Reg.v              # Fetch stage pipeline register
├── IF_Stage.v                  # Instruction fetch stage
├── incrementer.v               # PC incrementer utility
├── inst_defs.v                 # Instruction definitions
├── insttt.py                   # Instruction generation helper
├── MEM_Stage_Reg.v             # Memory stage pipeline register
├── MEM_Stage.v                 # Memory access stage
├── memory.v                    # Main instruction/data memory
├── mux_3_to_1.v                # 3-to-1 multiplexer
├── mux2to1.v                   # 2-to-1 multiplexer
├── README.md                   # This file
├── register.v                  # Generic register module
├── RegisterFile.v              # 16-register register file
├── SRAM_Controller64.v         # 64-bit SRAM controller
├── SRAM.v                      # SRAM interface module
├── SRAM64.v                    # 64-bit SRAM module
├── StatusRegister.v            # Condition flags register
├── test_cache.v                # Cache test module
├── Testbench.v                 # Main processor testbench
├── val2gen.v                   # Value generation utilities
├── WB_Stage.v                  # Write back stage
├── Descriptions/               # Design documentation (PDFs)
└── Report/                     # Project reports and analysis

Features Highlights

Performance Optimizations

  • Data forwarding reduces pipeline stalls by 60-80%
  • Branch prediction minimizes control hazard penalties
  • Cache memory provides fast memory access
  • Hazard detection prevents data corruption

Educational Value

  • Complete pipeline implementation with all stages
  • Clear separation of concerns between modules
  • Well-documented control signals and data paths
  • Comprehensive test suite with real ARM instructions

Design Specifications

Register File

  • 16 general-purpose registers (R0-R15)
  • 32-bit wide registers
  • Dual-port read, single-port write
  • Register 15 (PC) handled specially for pipeline

ALU Capabilities

  • Full arithmetic operations (ADD, SUB, ADC, SBC)
  • Complete logical operations (AND, OR, XOR, NOT)
  • Flag generation (N, Z, C, V)
  • Carry chain support for multi-precision arithmetic

Memory Interface

  • Harvard architecture with separate instruction/data paths
  • 32-bit data bus with byte addressing
  • Cache-coherent memory system
  • SRAM controller for external memory expansion

Future Enhancements

Potential areas for expansion:

  • Multiply/Divide instructions (MUL, DIV)
  • Floating-point unit (FPU)
  • Interrupt handling system
  • Memory management unit (MMU)
  • Branch predictor improvements
  • Multi-level cache hierarchy

References

This implementation follows ARM Architecture Reference Manual specifications and includes optimizations commonly found in modern processors. The design emphasizes educational clarity while maintaining practical performance characteristics.


This ARM processor implementation demonstrates advanced computer architecture concepts including pipelining, memory hierarchy, hazard control, and performance optimization techniques.

About

A Verilog implementation of an ARM series processor supporting: Forwarding, SRAM, and Cache.

Resources

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

  •  
  •