Add Llama 3.2 1B #6
base: main
Conversation
No description provided.
Pull request overview
This pull request adds support for training the Llama 3.2 1B model and includes a significant refactoring of the julax framework:
- Major layer class renames (Repeated→Repeat, SkipConnection→Residual)
- New visualization capabilities with rich console output
- A new RMSNorm layer (sketched below) and improved embedding layers
- New test infrastructure for input preprocessing
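Since RMSNorm is new to the framework here, a minimal JAX sketch of the standard RMSNorm computation may help reviewers. This follows the usual formula and is not the julax layers.py implementation; the rms_norm name and the weight/eps parameters are assumptions:

```python
import jax.numpy as jnp

def rms_norm(x, weight, eps: float = 1e-6):
    """Standard RMSNorm: y = x / sqrt(mean(x^2) + eps) * weight."""
    rms = jnp.sqrt(jnp.mean(jnp.square(x), axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

x = jnp.ones((2, 4))
w = jnp.full((4,), 2.0)
print(rms_norm(x, w))  # every entry is ~2.0: the RMS of an all-ones row is ~1.0
```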
Reviewed changes
Copilot reviewed 10 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| uv.lock | Updates dependencies including new packages for Llama support (certifi, requests, tiktoken, safetensors, tensorflow-datasets) |
| tests/test_inputs.py | Adds comprehensive test cases for text input preprocessing with packed sequences |
| src/julax/pprint.py | Implements rich console pretty-printing for model structure visualization |
| src/julax/layers.py | Major refactoring: renames Repeated→Repeat, SkipConnection→Residual; adds RMSNorm, Select layers; improves param/state length tracking |
| src/julax/inputs.py | Removes placeholder implementation (file deleted) |
| src/julax/experiment.py | Adds cached mesh property for improved performance |
| src/julax/core.py | Removes common dtype/sharding fields from LayerBase; adds numel() method for parameter counting (sketched after this table); adds IPython display support |
| src/julax/base.py | Adds `__len__` method to PyTree wrapper |
| pyproject.toml | Updates version to 0.0.4-dev; moves dev dependencies to dependency-groups; adds new dev dependencies |
| experiments/03_Llama_3.2_1B.py | Implements complete Llama 3.2 1B architecture with HuggingFace weight loading capability |
| experiments/02_mini_transformer.py | Updates to use refactored API (Repeat, Residual) |
| experiments/01_mnist.py | New MNIST training example demonstrating framework usage |
| .gitignore | Adds models/ directory to ignore list |
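To illustrate the numel() idea mentioned for src/julax/core.py above, here is a generic pytree parameter counter; the function name matches the PR's description, but the body is a sketch for clarity, not the actual julax code:

```python
import jax
import jax.numpy as jnp

def numel(params) -> int:
    """Total number of scalar parameters across all array leaves of a pytree."""
    return sum(int(leaf.size) for leaf in jax.tree_util.tree_leaves(params))

params = {"embed": jnp.zeros((1000, 64)),
          "head": {"w": jnp.zeros((64, 10)), "b": jnp.zeros((10,))}}
print(numel(params))  # 64000 + 640 + 10 = 64650
```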
Comments suppressed due to low confidence (2)
experiments/02_mini_transformer.py:84
- The parameter name changed from "layer" to "processor" in the Residual class. However, in the Residual implementation in layers.py, the constructor takes "processor" as its first parameter but stores it internally in a "skip_through" field along with another field. This API inconsistency should be reviewed to ensure the class works as expected.
experiments/02_mini_transformer.py:33
- Alongside the Repeated→Repeat rename in the layers module, the import here still references the old "Unembedding" name. Looking at the diff, the Unembedding class definition was removed from layers.py (lines 305-307 in the old version), so this import will fail at runtime.
pyproject.toml:
```diff
 [project]
 name = "julax"
-version = "0.0.3"
+version = "0.0.4-dev"
```
Copilot AI (Dec 19, 2025):
The version string "0.0.4-dev" uses a hyphen separator, which is not the canonical PEP 440 form. PEP 440 specifies ".devN" notation for development releases (e.g., "0.0.4.dev0") rather than "-dev". Non-canonical version strings may cause issues with package managers and version comparisons.
| version = "0.0.4-dev" | |
| version = "0.0.4.dev0" |
src/julax/layers.py:
```diff
 class Parallel(NamedLayers):
+class Residual(Branch):
+    def __init__(self, processor, *, skip_through=identity, reduce: Callable = jnp.add):
```
Copilot AI (Dec 19, 2025):
There's an inconsistency in the Residual class constructor. It accepts "skip_through" and "reduce" keyword parameters, with "skip_through" defaulting to the identity function, while the Branch parent class takes a "reduce" parameter of its own. Residual then passes both processor and skip_through to Branch's super().__init__() as positional arguments, even though Branch expects named layers. This design may cause confusion about which parameters go where.
Suggested change:
```diff
-    def __init__(self, processor, *, skip_through=identity, reduce: Callable = jnp.add):
+    def __init__(self, processor, *, skip_through: LayerLike | None = None, reduce: Callable = jnp.add):
+        if skip_through is None:
+            skip_through = identity
```
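For reference, here is a self-contained sketch of the residual pattern this constructor encodes. It mirrors the reviewed signature but deliberately avoids subclassing julax's Branch; the identity default and the field names are assumptions:

```python
from typing import Callable
import jax.numpy as jnp

def identity(x):
    return x

class Residual:
    """out = reduce(processor(x), skip_through(x)); defaults give x + processor(x)."""

    def __init__(self, processor, *, skip_through=identity, reduce: Callable = jnp.add):
        self.processor = processor
        self.skip_through = skip_through
        self.reduce = reduce

    def __call__(self, x):
        return self.reduce(self.processor(x), self.skip_through(x))

block = Residual(lambda x: 2 * x)
print(block(jnp.ones(3)))  # [3. 3. 3.]
```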
```python
def verify():
    m = create_transformer()
    p, s = m.init()
```
Copilot AI (Dec 19, 2025):
This assignment to 'p' is unnecessary as it is redefined before this value is used.
Suggested change:
```diff
-    p, s = m.init()
```
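Generically, the flagged dead-store pattern looks like the toy example below, where init and restore are hypothetical stand-ins for whatever produces and then overwrites the values:

```python
def init():
    return "fresh_params", "fresh_state"

def restore(ckpt):
    return "loaded_params", "loaded_state"

p, s = init()           # dead store: this result is never read
p, s = restore("ckpt")  # both names are rebound here before any use
print(p, s)             # loaded_params loaded_state
```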
```python
def verify():
    m = create_transformer()
    p, s = m.init()
```
Copilot AI (Dec 19, 2025):
This assignment to 's' is unnecessary as it is redefined before this value is used.
Suggested change:
```diff
-    p, s = m.init()
```