This repository contains a complete personal collection of digital logic designs and architectural simulations for the "Computer Organization" course at Northeastern University (Software Engineering, Autumn 2025-2026).
The project progresses from fundamental combinational logic to complex sequential state machines, culminating in the implementation of a 32-bit Single-Cycle MIPS Processor. All designs are implemented using Logisim 2.15.0.
The project is divided into three main experimental modules:
The first module focuses on the construction of arithmetic units and display logic.
- 1-bit Full Adder: Basic building block using XOR, AND, and OR gates.
- 4-bit Ripple Carry Adder (Unsigned): Cascaded full adders to perform multi-bit addition.
- 4-bit Complement Adder (Signed): Implements two's-complement arithmetic with dedicated overflow detection logic (XOR of the carry into and the carry out of the most significant bit).
- 7-Segment Display Decoder: Logic circuit to convert 4-bit binary input into 7-segment display control signals (hexadecimal representation).
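The adder chain in this module can be modelled bit by bit in software. The following is a minimal sketch, not a translation of the Logisim circuits: the function names (`full_adder`, `ripple_carry_add`, `to_signed`) are illustrative assumptions, and the overflow flag follows the rule stated above, the XOR of the last two carries.

```python
def full_adder(a, b, cin):
    """1-bit full adder built from XOR, AND, and OR operations."""
    partial = a ^ b
    total = partial ^ cin
    cout = (a & b) | (partial & cin)
    return total, cout


def ripple_carry_add(a, b, width=4, cin=0):
    """Add two `width`-bit patterns by chaining full adders.

    Returns (sum, carry_out, overflow). The overflow flag is the XOR of the
    carry into and the carry out of the most significant bit, so it is only
    meaningful when the operands are read as two's-complement numbers.
    """
    carries = [cin]
    result = 0
    for i in range(width):
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carries[-1])
        result |= bit << i
        carries.append(carry)
    overflow = carries[-1] ^ carries[-2]
    return result, carries[-1], overflow


def to_signed(value, width=4):
    """Interpret a `width`-bit pattern as a two's-complement integer."""
    return value - (1 << width) if value & (1 << (width - 1)) else value
```

For example, `ripple_carry_add(0b0111, 0b0001)` sets the overflow flag because 7 + 1 does not fit in a signed 4-bit value, while `ripple_carry_add(0b1111, 0b0001)` (that is, -1 + 1) produces a carry-out but no overflow.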
The second module focuses on system modeling using memory elements and state transitions.
- Train Signal Control System:
  - Simulates a railway block signal system.
  - Logic determines signal colors (Red/Green/Yellow) based on train positions in adjacent blocks.
  - Features cascading modules for multi-block simulation.
- Vending Machine Controller:
  - Implements a Moore State Machine.
  - Inputs: 5 cents, 10 cents, Reset, Clock.
  - Functionality: Tracks current balance, determines when the price is met, triggers product release, and calculates change.
  - Implementation: Realized using registers for state storage and combinational logic for next-state calculation.
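The vending machine controller can be sketched as a Moore machine in a few lines. This is an illustrative model under stated assumptions, not the circuit itself: the price of 15 cents (`PRICE`), the behaviour of returning to a zero balance on the clock after a release, and all function names are hypothetical, since the source does not specify them.

```python
PRICE = 15  # hypothetical price in cents; the source does not state one


def outputs(balance):
    """Moore outputs: they depend on the current state (the balance) only."""
    release = balance >= PRICE
    change = balance - PRICE if release else 0
    return {"release": release, "change": change}


def next_state(balance, nickel=False, dime=False, reset=False):
    """Combinational next-state logic, evaluated once per clock edge."""
    if reset or balance >= PRICE:
        # Assumption: after a release the machine starts over at zero.
        return 0
    if dime:
        return balance + 10
    if nickel:
        return balance + 5
    return balance


def run(events):
    """Clock the machine once per event and record (state, outputs)."""
    balance = 0  # plays the role of the state register
    trace = []
    for event in events:
        balance = next_state(balance, **event)
        trace.append((balance, outputs(balance)))
    return trace
```

Calling `run([{"dime": True}, {"dime": True}])` walks the balance through 10 and 20 cents; in the second state the outputs signal a release with 5 cents of change.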
The third module is a fully functional processor core based on the MIPS32 architecture.
The CPU supports a subset of MIPS instructions, covering arithmetic, logic, data transfer, and control flow:
- R-Type: `addu`, `subu`, `slt` (Set on Less Than)
- I-Type: `ori`, `lui`, `lw` (Load Word), `sw` (Store Word), `beq` (Branch on Equal)
- J-Type: `j` (Jump), `jal` (Jump and Link)
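The three formats differ only in how the 32 instruction bits are divided into fields. As a rough sketch, the helpers below pack those fields using the standard MIPS32 opcode and funct values for the instructions listed above; the helper names (`encode_r`, `encode_i`, `encode_j`) and the two lookup tables are illustrative assumptions, not part of the repository.

```python
# Standard MIPS32 encodings for the supported subset.
FUNCT = {"addu": 0x21, "subu": 0x23, "slt": 0x2A}  # R-type, opcode 0
OPCODE = {"ori": 0x0D, "lui": 0x0F, "lw": 0x23, "sw": 0x2B,
          "beq": 0x04, "j": 0x02, "jal": 0x03}


def encode_r(name, rd, rs, rt, shamt=0):
    """R-type: opcode(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | FUNCT[name]


def encode_i(name, rt, rs, imm):
    """I-type: opcode(6) | rs(5) | rt(5) | immediate(16)."""
    return (OPCODE[name] << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(name, byte_address):
    """J-type: opcode(6) | instr_index(26), the target address divided by 4."""
    return (OPCODE[name] << 26) | ((byte_address >> 2) & 0x03FFFFFF)
```

With `$t0`, `$t1`, `$t2`, and `$sp` as registers 8, 9, 10, and 29, `encode_r("addu", 10, 8, 9)` yields `0x01095021` and `encode_i("lw", 8, 29, 4)` yields `0x8FA80004`.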
- Control Unit: Hardwired decoder logic that generates control signals (RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, Jump) based on Opcode and Funct fields.
- ALU (Arithmetic Logic Unit): Supports addition, subtraction, logical OR, and comparison operations.
- Data Path:
  - Includes a 32-bit Register File.
  - Implements Sign Extension and Zero Extension for immediate values.
  - Handles `PC + 4` calculation and Branch target address generation.
  - Special handling for `jal` instruction (writing return address `PC + 4` to register `$ra`).
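The interplay between the control signals and the address logic can be approximated in software. The sketch below is an assumption-laden model rather than a description of the actual circuit: the control table follows the common textbook single-cycle convention (don't-care values shown as 0), ALU operation selection from the Funct field is omitted, and the names `CONTROL`, `control`, `next_pc`, and `jal_link` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "Jump")

# Textbook-style values per opcode (an assumption; the real decoder may differ).
CONTROL = {
    0x00: (1, 0, 0, 1, 0, 0, 0, 0),  # R-type: addu, subu, slt
    0x0D: (0, 1, 0, 1, 0, 0, 0, 0),  # ori (zero-extended immediate)
    0x0F: (0, 1, 0, 1, 0, 0, 0, 0),  # lui
    0x23: (0, 1, 1, 1, 1, 0, 0, 0),  # lw
    0x2B: (0, 1, 0, 0, 0, 1, 0, 0),  # sw
    0x04: (0, 0, 0, 0, 0, 0, 1, 0),  # beq
    0x02: (0, 0, 0, 0, 0, 0, 0, 1),  # j
    0x03: (0, 0, 0, 1, 0, 0, 0, 1),  # jal (also writes $ra)
}


def control(opcode):
    """Return the control signals for an opcode as a name -> bit mapping."""
    return dict(zip(SIGNALS, CONTROL[opcode]))


def sign_extend16(imm):
    """Sign-extend the low 16 bits to a 32-bit pattern."""
    imm &= 0xFFFF
    return imm | 0xFFFF0000 if imm & 0x8000 else imm


def zero_extend16(imm):
    """Zero-extend the low 16 bits to a 32-bit pattern."""
    return imm & 0xFFFF


def next_pc(pc, instr, zero):
    """Select the next PC from PC + 4, the branch target, or the jump target."""
    signals = control(instr >> 26)
    pc_plus_4 = (pc + 4) & MASK32
    if signals["Jump"]:
        return (pc_plus_4 & 0xF0000000) | ((instr & 0x03FFFFFF) << 2)
    if signals["Branch"] and zero:
        return (pc_plus_4 + (sign_extend16(instr) << 2)) & MASK32
    return pc_plus_4


def jal_link(pc):
    """Register number and value written by jal: $ra (31) receives PC + 4."""
    return 31, (pc + 4) & MASK32
```

A `beq` with offset -1 (`0x1109FFFF`) at address `0x10` therefore branches back to `0x10` when the ALU zero flag is set and falls through to `0x14` otherwise.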
- Software: Logisim 2.15.0 (or compatible fork).
- Java Runtime Environment: Required to run Logisim.
For the CPU: Load a hex file into the Instruction Memory component to execute programs. For FSMs: Manually toggle input pins (clock, sensors, coins) to observe state transitions.
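Logisim memory components load plain-text images that begin with a `v2.0 raw` header followed by whitespace-separated hexadecimal words. The sketch below generates such a file from a list of instruction words; it assumes the Instruction Memory is 32 bits wide and word-addressed, and the function names, the three-instruction program, and the `program.hex` file name are illustrative only.

```python
def to_logisim_image(words, per_line=8):
    """Render 32-bit words as text in Logisim's 'v2.0 raw' image format."""
    lines = ["v2.0 raw"]
    for start in range(0, len(words), per_line):
        chunk = words[start:start + per_line]
        lines.append(" ".join(f"{word & 0xFFFFFFFF:08x}" for word in chunk))
    return "\n".join(lines) + "\n"


def write_image(path, words):
    """Write the image to disk so it can be loaded into a memory component."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(to_logisim_image(words))


# Hypothetical test program using only instructions from the supported subset.
program = [
    0x34080005,  # ori  $t0, $zero, 5
    0x34090007,  # ori  $t1, $zero, 7
    0x01095021,  # addu $t2, $t0, $t1
]
```

Calling `write_image("program.hex", program)` produces a file that can then be loaded into the memory component through its load-image option.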
The following multi-generational enhancements are proposed to take the current single-cycle prototype, which so far exists only as a Logisim simulation, toward a production-grade, FPGA-deployable system:
- Performance & Datapath Deepening
  - Replace the ripple-carry adder with a parallel-prefix carry network such as Kogge-Stone or Brent-Kung (log₂N stages for Kogge-Stone, 2·log₂N − 1 for Brent-Kung; fan-out ≤ 2) to collapse critical-path delay below 300 ps in 28 nm.
  - Integrate a fully-pipelined 32 × 32 → 64 Wallace-tree multiplier with 4:2 compressors and a hybrid carry-save/carry-propagate final adder; expose pipeline hazards to a forthcoming scoreboard.
  - Deploy dynamic operand isolation and clock gating on the ALU slice to cut switching power by 18 % under VCD-triggered workloads.
- Memory Hierarchy & Consistency
  - Instantiate a split L1 cache (4 KiB I-cache + 4 KiB D-cache, 32 B lines, 2-way set-associative) with Virtually-Indexed Physically-Tagged (VIPT) organization to retain single-cycle hit latency.
  - Implement a write-through D-cache with a 4-entry, fully-associative Write-Combine Buffer; add pseudo-LRU replacement and optional parity per sub-block.
  - Provide a parameterized L2 prefetcher (stride-based, depth-4) via a dedicated AXI4-Lite peripheral port.
- I/O & Memory-Mapped Subsystem
  - Define a 2 MiB MMIO region at 0x1FE0_0000–0x1FFF_FFFF; attach GPIO, 7-segment PWM, UART-16550, and a 32 × 32 LED matrix controller, each decoded through a centralized address-qualifier FSM.
  - Support atomic CSR access via a lightweight “lw-res / sw-cond” reservation protocol, enabling future RTOS synchronization primitives.
- Control & Pipeline Evolution
  - Migrate to a 5-stage, hazard-capable pipeline (IF / ID / EX / MEM / WB) with fully-bypassed forwarding networks; integrate a hybrid static/dynamic branch predictor (GShare 8-bit global history, 128-entry BHT).
  - Add a 16-entry return-address stack (RAS) to accelerate `jal` / `jr $ra` procedure returns.
  - Provide optional support for a larger MIPS32 subset: multiply-accumulate (`madd`, `msub`), bit-field extract and insert (`ext`, `ins`), and compressed 16-bit encodings (microMIPS) for code-density improvement. Because Release 6 removes `madd` and `msub`, a pre-Release-6 revision of the ISA is the target here.
- Verification, Formal & FPGA Flow
  - Re-implement golden models in SystemVerilog; create a UVM testbench with 100 % functional coverage (branch, boundary, hazard) and SVA-based formal proofs for deadlock freedom.
  - Perform STA on post-synthesis netlists (450 MHz target, -40 °C to 125 °C, SS/FF/TT corners) using Synopsys Design Compiler and PrimeTime; generate power-intent (UPF 3.0) for multi-voltage islands.
  - Deploy on Xilinx Artix-7 100T with 64-bit AXI4-Stream trace port; provide pre-built bitstream, open-source constraints (XDC), and an on-board logic analyzer (ILA) harness for cycle-accurate introspection.
- Software Ecosystem & Continuous Integration
  - Publish an LLVM 18-based cross-compiler (mipsel-unknown-elf) with linker scripts tailored to the new memory map.
  - Integrate Renode cycle-accurate co-simulation into GitHub Actions; enforce gate-level regression suites on every pull request, halting merges on timing-arc degradation > 3 %.
  - Deliver an interactive web-FPGA dashboard (WebSerial + WebAssembly) for remote, 0-install experimentation.
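As a quick check on the MMIO window proposed in the I/O item above, the bounds 0x1FE0_0000 and 0x1FFF_FFFF span 0x20_0000 bytes, which is 2 MiB. The sketch below shows that arithmetic together with a minimal range check; the function names are hypothetical, and no per-device offsets are modelled because the roadmap does not define them.

```python
MMIO_BASE = 0x1FE0_0000  # first address of the proposed window
MMIO_LAST = 0x1FFF_FFFF  # last address of the proposed window


def mmio_size_bytes():
    """Size of the window in bytes (both bounds are inclusive)."""
    return MMIO_LAST - MMIO_BASE + 1


def is_mmio(address):
    """True when a 32-bit address falls inside the MMIO window."""
    return MMIO_BASE <= address <= MMIO_LAST


def mmio_offset(address):
    """Offset of an MMIO address from the base of the window."""
    if not is_mmio(address):
        raise ValueError(f"{address:#010x} is outside the MMIO window")
    return address - MMIO_BASE
```

Since the window is 2 MiB and aligned to a 2 MiB boundary, a hardware decoder could qualify it by comparing only the upper 11 address bits.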
Collectively, these refinements transform the pedagogical single-cycle core into a timing-closed, cache-coherent, and verification-hardened MIPS-class microprocessor ready for advanced OS bring-up and real-world deployment.