This curriculum is my personal approach to learning topics in computer science, mathematics, and deep learning (+ quant finance) from first principles. This repository is not necessarily a full curriculum, but rather a reflection of my interests, needs, and gaps in my knowledge. However, my goal is to cover the whole compute stack from transistors and assembly to the math underpinning transformer models. As such, the notes and exercises compiled below should form a solid computer science, math, and deep learning curriculum.
- Hardware / Processors
- Compilers / Operating Systems
- Data Structures / Algorithms
- Computer Languages
- Mathematics
- Statistics
- Machine Learning
- Deep Learning
- Quantitative Finance
- Additional Topics
- Assorted
- Unix Cheatsheet
- Git Cheatsheet
- Python Cheatsheet
- R Cheatsheet
- R Plots and Tables Cheatsheet
- Java Cheatsheet
- SQL Cheatsheet
- Mathematics Notes
- Calculus Notes
- Linear Algebra Notes
- Statistics Notes
- Machine Learning Notes
- Deep Learning Notes
- Quantitative Finance Notes
/Assorted/atoms2bits.pdf
- This preface (atoms2bits) gives an overview of the complete compute stack: the refining of sand into silicon ingots, the doping of silicon to create differences in valence shell electrons, how doped silicon is used to build transistors, how transistors form logic gates, CPU architecture and operation, memory caches, instruction set architecture, operating systems, and higher level languages.
- Implement AND, NAND, XOR, and other logic gates from scratch, use these logic gates to build functional units such as a 1-bit adder, multiplexer, sequential logic, SRAM, etc. Use logic gates to create simple implementations of various memory units: L1 cache, DRAM, etc. The functional units built in this section will be used later when building a model instruction set.
- Encode logic gates in a LUT (lookup table).
- Build a simple ARM7 CPU in Verilog or with another infrastructure.
- Start by building a pipeline with simple start, fetch, decode, and execute commands. Then build a simple register/memory unit to push and pull data, a simple ALU (arithmetic logic unit) that can perform basic arithmetic and logic operations, and a simple CU (control unit) for finding instructions and directing operations. These should be built on top of the functional units constructed in the hardware lesson.
- Additional: Write basic arithmetic instructions, branch instructions, and memory instructions. Allow for out-of-order instruction execution and basic parallelism, using dependency graphs for instructions. Set up a memory hierarchy with registers, L1 cache, L2 cache, L3 cache, and DRAM.
- Extra-curricular: Build a UART in Verilog, GPU basics.
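As a starting point for the logic gate exercise above, here is a minimal Python sketch (not Verilog) that derives the other gates from NAND and composes them into a 1-bit full adder, a 2-to-1 multiplexer, and a lookup-table encoding of XOR. All function names are illustrative assumptions, not part of the repository.

```python
# Every gate below is built only from NAND, which is functionally complete.
# Inputs and outputs are the integers 0 and 1.

def nand(a: int, b: int) -> int:
    return 1 - (a & b)

def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

def xor(a: int, b: int) -> int:
    shared = nand(a, b)
    return nand(nand(a, shared), nand(b, shared))

def full_adder(a: int, b: int, carry_in: int) -> tuple:
    """1-bit full adder: returns (sum_bit, carry_out)."""
    partial = xor(a, b)
    sum_bit = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return sum_bit, carry_out

def mux(a: int, b: int, sel: int) -> int:
    """2-to-1 multiplexer: returns a when sel is 0, b when sel is 1."""
    return or_(and_(a, not_(sel)), and_(b, sel))

# A lookup table (LUT) stores the gate's full truth table keyed by its inputs.
XOR_LUT = {(a, b): xor(a, b) for a in (0, 1) for b in (0, 1)}
```

Chaining `full_adder` calls bit by bit, passing each `carry_out` into the next `carry_in`, gives a ripple-carry adder, which is one possible next step toward the ALU.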
/CompilersOperatingSystems
- Cheatsheets:
- Unix Cheatsheet: Covers useful Unix commands.
- Create a set of general notes on Linux/Unix and combine them with the notes in the local unix.pdf. Add notes on the basic Linux filesystem format from here.
- Build a C compiler in Haskell. Consider the basics of compiler design, write a parser, output ARM assembly which can then be run through the simple processor designed in the processors lesson.
- Build functions for converting binary to numbers and ASCII characters.
- Learn RISC-V architecture, contrast with x86, ARM, and other instruction set architectures.
- Build a UNIX-ish operating system in C or C++ with simple abilities like open, read, write, close, init, cat, ls, rm.
- Build a filesystem like FAT in C or C++.
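For the binary conversion exercise above, a minimal Python sketch might look like the following. It assumes bit strings made of `0` and `1` characters and 8 bits per ASCII character; the function names are hypothetical.

```python
def bits_to_int(bits: str) -> int:
    """Convert a string of 0s and 1s to an unsigned integer, most significant bit first."""
    value = 0
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a binary digit: {bit!r}")
        value = value * 2 + int(bit)  # shift left by one place, then add the new bit
    return value

def bits_to_ascii(bits: str) -> str:
    """Decode a bit string into text, reading 8 bits per character."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    chunks = (bits[i:i + 8] for i in range(0, len(bits), 8))
    return "".join(chr(bits_to_int(chunk)) for chunk in chunks)
```

For example, `bits_to_int("1010")` gives 10 and `bits_to_ascii("0100100001101001")` gives `"Hi"`.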
/DataStructuresAlgorithms
- Cheatsheets:
- Data Structures and Time Complexity Cheatsheet: Covers dependencies, primitive types, basic types, fundamental data structures, sorting/searching algorithms, and time complexities of common algorithms.
- Binary Tree Cheatsheet: Covers binary tree search, recursion, and implementation.
- Notes on:
- DSA Fundamentals: Data structures and algorithms fundamentals.
- Linked Lists: Linked lists, doubly linked lists, and circular linked lists.
- Sorting and Searching: Sorting and searching algorithms.
- Stacks Queues Sets: Stacks, queues, and sets.
- Hash Tables: Hash tables and hash algorithms.
- Trees: Tree structures, properties, and algorithms.
- Heaps Priority Queues: Heaps and priority queues.
- Graphs: Graph structures.
- Graph Algorithms: Graph algorithms.
- Exercises:
- Comparison of the speed of recursion in Python, R, Java, C, and C++.
- Build a simple grid search algorithm, then build a Bayesian grid search algorithm, demo using Bayesian grid search for hyperparameter tuning.
- Learn the low-level data structures behind the base data types in Python, such as list, array, tuple, et cetera. Build versions of these data types in C++ and then port them to Python with a package like `pybind11`.
- Add a lesson on the properties of different number systems: the hexadecimal system, 32-bit numbers, 128-bit numbers, how an unsigned 16-bit number covers 0-65535 while an unsigned 32-bit number covers 0-4294967295, etc.
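For the number systems lesson, the range of values an n-bit integer can hold follows directly from the bit width. The sketch below is a small illustration with hypothetical helper names; the signed range assumes two's complement representation.

```python
def unsigned_range(bits: int) -> tuple:
    """Smallest and largest value of an unsigned integer with the given bit width."""
    return 0, 2 ** bits - 1

def signed_range(bits: int) -> tuple:
    """Smallest and largest value of a two's complement integer with the given bit width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def to_hex(value: int) -> str:
    """Hexadecimal digits of a non-negative integer, without the 0x prefix."""
    return format(value, "x")
```

With these helpers, `unsigned_range(16)` is `(0, 65535)`, `unsigned_range(32)` is `(0, 4294967295)`, and `to_hex(65535)` is `"ffff"`: each hexadecimal digit encodes exactly four bits.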
/ComputerLanguages
- Cheatsheets:
- Programming Paradigms Cheatsheet: Covers compiled vs. interpreted languages, imperative, declarative, functional, and object-oriented programming paradigms.
- Notes on:
- Python Notes: Python
- Java Notes: Java
- R Notes: R
- Julia Notes: Julia
- SQL Cheatsheet: SQL
- Add notes on C, C++, Go, Rust.
- Replicate lower-level Python functions and structures such as len(), dictionary, hash table, a sorting algorithm, slicing, indexing, set inclusion, et cetera in C++ to learn how they work. Then port them to Python with `pybind11`. Interface them with the data structures built from scratch in the data structures lesson.
- Overview of object-oriented programming (OOP).
- Build a dictionary method using C++ and an R dictionary interface. Then publish an R dictionary class to CRAN.
/Math
- Notes on:
- Linear Algebra Notes: General concepts of linear algebra.
- Calculus Notes: General concepts of calculus.
- Math Notes: Assorted topics in mathematics.
- Math Symbols Cheatsheet: Commonly used math symbols.
/Statistics
- Notes on:
- Statistics Notes: General topics in statistics.
- Regression Notes: Notes on regression, methods of estimation, and linear regression interpretations.
- SVD and PCA Notes: Notes on Singular Value Decomposition and Principal Component Analysis.
- Exercises:
- Implementation of Maximum Likelihood Estimation (MLE) from scratch that benchmarks against a canonical optimization function.
- Implementation of the Metropolis-Hastings method from scratch.
- Poker exercises: implement a card deck class with operations such as draw X cards, calculate probabilities of different hands at different stages of a game, implement riffle shuffle, et cetera.
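As an illustration of the Metropolis-Hastings exercise above, here is a minimal random-walk sampler in pure Python. It is a sketch under stated assumptions, not the repository's implementation: the proposal is a symmetric Gaussian step, the target is an unnormalized standard normal, and the function names, step size, and seed are all hypothetical choices.

```python
import math
import random

def metropolis_hastings(log_target, n_samples, x0=0.0, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a one-dimensional target.

    log_target returns the log of the (possibly unnormalized) target density.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so the acceptance ratio reduces to
        # target(proposal) / target(current), compared here in log space.
        # 1.0 - rng.random() lies in (0, 1], which keeps math.log finite.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)  # a rejected proposal repeats the current state
    return samples

def standard_normal_log_density(x):
    return -0.5 * x * x  # normalizing constant omitted on purpose

samples = metropolis_hastings(standard_normal_log_density, n_samples=20000)
kept = samples[2000:]  # discard an initial burn-in period
sample_mean = sum(kept) / len(kept)
sample_var = sum((s - sample_mean) ** 2 for s in kept) / len(kept)
```

For this target, `sample_mean` should land near 0 and `sample_var` near 1; how close depends on the step size and chain length.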
/MachineLearning
- Notes on:
- Machine Learning Principles: Machine learning principles.
- Machine Learning Code: Machine learning code.
/NeuralNetworks
- Notes on:
- Deep Learning Principles: Deep learning principles.
- Deep Learning Code: Deep learning code.
- Exercises:
- Implementation of the following neural network architectures from scratch using NumPy. Neural networks are then trained and tested out-of-sample.
- Feedforward Neural Network
- Recurrent Neural Network
- Solving simple neural network problems relating to finding optima, gradient descent, et cetera.
- Build the following models from scratch:
- Feedforward Neural Network > Recurrent Neural Network > Convolutional Neural Network > ResNet > Transformer Model > Diffusion Model
- Verify the correct order of the models so that all paradigmatic deep learning models are built in-order.
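For the gradient descent problems mentioned in the exercises above, the core update rule fits in a few lines. This is a minimal one-dimensional sketch in pure Python; the quadratic objective, learning rate, and step count are illustrative assumptions.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

def objective(x):
    return (x - 3.0) ** 2  # minimized at x = 3

def objective_grad(x):
    return 2.0 * (x - 3.0)  # derivative of the objective

x_min = gradient_descent(objective_grad, x0=0.0)
```

With this learning rate, each step shrinks the distance to the minimum by a factor of 0.8, so `x_min` ends up very close to 3. The same loop, applied to a loss gradient with respect to the weights, is what a from-scratch network uses for training.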
/QuantitativeFinance
- Notes on:
- Quantitative Finance Notes: Notes on assorted topics in quantitative finance.
- Valuation Measures
- Sensitivity Measures (Interest Rate Risk)
- Credit Analysis (Credit Risk)
- Yield Curves
- Fixed Income Security Types
- Bond Measure Definitions
- General Definitions
- Assorted
- Finance Papers
/NetworkingInternet
- Notes on:
- Network / Internet Notes: Notes on computer networking and how the internet protocol stack works.
- Manually create and send TCP/IP packets: tutorial
- Write a hashing function from scratch. Use the hashing function to build a hash table function.
- Upload pre-existing cryptography notes.
- Write a script to generate public/private key pairs (for example with RSA or an elliptic-curve scheme), using SHA256 or another hashing function for the hashing steps.
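For the hashing exercise above, the sketch below shows one way to pair a hand-written hash function with a hash table. To keep it short it uses 32-bit FNV-1a, a simple non-cryptographic hash, rather than SHA256; the `HashTable` class, its method names, and the fixed bucket count are illustrative assumptions.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime

def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR in each byte, then multiply by the FNV prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h

class HashTable:
    """String-keyed hash table using separate chaining (no resizing)."""

    def __init__(self, n_buckets: int = 8):
        self._buckets = [[] for _ in range(n_buckets)]
        self._size = 0

    def _bucket(self, key: str) -> list:
        index = fnv1a_32(key.encode("utf-8")) % len(self._buckets)
        return self._buckets[index]

    def put(self, key: str, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))
        self._size += 1

    def get(self, key: str, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

    def __len__(self) -> int:
        return self._size
```

Collisions are handled by keeping a small list per bucket. A fuller version would grow the bucket array once the load factor gets high.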
- Cheatsheets:
- Git Cheatsheet: Covers useful commands for working with Git and GitHub.
- LaTeX Cheatsheet: Covers useful commands and packages for working in LaTeX.
- Add notes to `Statistics` on Poisson processes and the Cox process (doubly stochastic Poisson process).
- Add notes to `QuantitativeFinance` on the Cox-Ingersoll-Ross process, which is commonly used in term-structure modelling, and on Affine Term Structure Models.
- Add notes on the difference between the risk-neutral measure Q and the physical measure P.
- Reference ReducedFormModel.pdf for help.
- Work on `/Math` calculus fundamentals notes and `/Statistics` statistics fundamentals notes.
- Work on `/Math` notes on differential equations, partial differential equations.
- Use 3B1B videos.
- Add `/QuantitativeFinance` notes on Merton (1974), Nelson-Siegel, Black-Scholes, and a Normal Jump Diffusion Model (GBM that allows for Poisson jumps in asset prices, see Merton 1976).
- Neural Network work.
- Look at Karpathy lectures.
- Look at tinygrad tutorial, play with tinygrad.
- Build paradigmatic models in sequential order up to the Transformer model.
- Read Attention Is All You Need paper.
- Add a quick study of BLAS and LAPACK routines.
- Work on Fluent Python notes.
- Add LaTeX equations to pre-existing math/stats/neural network exercises and `/Assorted/EquationsCheetsheet.md` based off of this tutorial.
- Work on poker exercises.
- Build a hashing function (SHA256) from scratch, then use it to build a hash table class from scratch.
