
Conversation

Contributor

Copilot AI commented Oct 28, 2025

Provides a high-level API for out-of-core matrix operations that wraps existing NumPy/SciPy operations rather than reimplementing mathematical kernels. The focus is orchestration: lazy evaluation, block-wise processing, buffer management, and streaming.

New API

OOCMatrix - NumPy-like wrapper for out-of-core operations:

  • Block-wise operations: blockwise_apply(), blockwise_reduce(), iterate_blocks()
  • Matrix operations: matmul() with custom operation support, lazy operators (+, *, @)
  • Statistics: sum(), mean(), std(), min(), max() computed block-wise
  • Integrates with existing Plan infrastructure for lazy evaluation and operator fusion

Example

from paper import OOCMatrix
import numpy as np

# Load large matrices (doesn't load full data into RAM)
A = OOCMatrix('fileA.bin', shape=(10_000_000, 1000))
B = OOCMatrix('fileB.bin', shape=(1000, 1000))

# Matrix multiply using NumPy dot for in-block ops
C = A.matmul(B, op=np.dot)

# Stream results block-by-block
for block, (r, c) in C.iterate_blocks():
    process(block)  # each block is a plain NumPy array

# Statistics computed without full load
mean = A.mean()
std = A.std()

# Apply transformations using existing NumPy ops
A_norm = A.blockwise_apply(lambda x: (x - mean) / std)

# Lazy evaluation with automatic fusion
result = (A + B) * 2  # Builds plan, fuses ops
materialized = result.compute('output.bin')  # Executes
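
Since the statistics above are built-ins, here is a hedged sketch of the same kind of reduction written directly against iterate_blocks(), using only calls documented above; this variant accumulates exact partials even when edge blocks are smaller:

# Hypothetical alternative to A.mean(): accumulate per-block partials.
total, count = 0.0, 0
for block, (r, c) in A.iterate_blocks():
    total += block.sum()   # in-block math stays plain NumPy
    count += block.size
exact_mean = total / count  # exact regardless of block sizes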

Implementation

  • paper/operators.py: OOCMatrix class (380 lines)
  • tests/test_operators.py: 16 tests covering API surface
  • examples_oocmatrix.py: 6 examples demonstrating usage patterns
  • OPERATORS.md: API reference and design rationale
  • README.md: Quick start guide

All mathematical operations delegate to NumPy/SciPy; the framework handles only block orchestration, lazy DAG building, buffer management, and scheduling. The sketch below illustrates that split.
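
A minimal sketch of a block-wise apply loop, assuming the input is any NumPy-sliceable on-disk array such as a memmap (this is not the actual paper/operators.py code):

import numpy as np

def blockwise_apply_sketch(arr, fn, block_rows=4096):
    # The framework's job: slice, load, and stream blocks.
    # The caller's fn does all of the mathematics.
    for start in range(0, arr.shape[0], block_rows):
        block = np.asarray(arr[start:start + block_rows])  # one block in RAM
        yield fn(block)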

Original prompt

This section contains the original issue this pull request resolves.

<issue_title>operators support to reach full ML algorithm capabilities - scaffold</issue_title>
<issue_description>We should not reimplement every mathematical operation from scratch to make the OOC framework useful and production-ready. The real gap is how operations are orchestrated; out-of-core frameworks primarily need to:

Wrap existing NumPy/SciPy/Pandas operations with lazy evaluation, block-wise processing, and streaming constructs.

Chunk large datasets and use existing, optimized libraries for the computation inside each chunk/block (a minimal sketch follows this list).

Intercept user code at the API level to provide out-of-core support transparently; for example, provide a NumPy-like API that does not load the full matrix/array into RAM, but instead dynamically loads blocks and applies operations as needed.
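
A minimal, self-contained illustration of the chunking idea (file name, dtype, and block size are assumptions made for this sketch):

import numpy as np

# Stream a column sum over a large on-disk matrix: only one row block
# is resident in RAM at a time, and NumPy does all the in-block math.
mm = np.memmap('fileA.bin', dtype=np.float32, mode='r',
               shape=(10_000_000, 1000))
col_sums = np.zeros(1000, dtype=np.float64)
for start in range(0, mm.shape[0], 100_000):
    block = np.asarray(mm[start:start + 100_000])
    col_sums += block.sum(axis=0)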

How an Ideal Implementation Looks

Matrix multiplication (@), element-wise operations, reductions (sum, mean), slicing, filtering, and custom function applications can utilize existing backends (NumPy, SciPy, CuPy).

The OOC framework focuses on providing smart block loading, buffer management, iteration control, compression, eviction, and prefetching.
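
As one concrete piece of that, a hedged sketch of buffer management as an LRU block cache (load_block is a hypothetical loader callback, not part of any existing API):

from collections import OrderedDict

class BlockCache:
    """Keep at most `capacity` blocks in RAM; evict the least
    recently used block when a new one must be loaded."""
    def __init__(self, load_block, capacity=8):
        self._load = load_block
        self._cap = capacity
        self._blocks = OrderedDict()

    def get(self, idx):
        if idx in self._blocks:
            self._blocks.move_to_end(idx)         # mark as recently used
        else:
            if len(self._blocks) >= self._cap:
                self._blocks.popitem(last=False)  # evict LRU block
            self._blocks[idx] = self._load(idx)
        return self._blocks[idx]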

Only a small amount of custom logic must be written, for block-wise orchestration, lazy DAG building, and scheduling, while the math kernels stay fully optimized in existing libraries.

Example: Block-wise Wrapper (No Kernel Rewrite Needed)

from ooc import OOCMatrix  # Your wrapper
import numpy as np

# User applies existing NumPy/SciPy operations on each block automatically
A = OOCMatrix('fileA.h5', shape=(10_000_000, 1000))
B = OOCMatrix('fileB.h5', shape=(1000, 1000))

# This triggers block-wise multiplication, but each block operation is plain NumPy
def matmul_blocks(A_block, B_block):
    return np.dot(A_block, B_block)

# API exposes big operations -- no kernel rewrite required!
C = A.matmul(B, op=matmul_blocks)

for block, idx in C.iterate_blocks():
    # downstream systems can consume each result block
    process(block)

# Other ops, e.g., sum, mean, normalization:
mean = A.blockwise_reduce(np.mean)  # mean of per-block means; exact only for equal-sized blocks
A_normalized = A.blockwise_apply(lambda x: (x - mean) / np.std(x))  # note: std here is per block

Where custom logic is sometimes needed: advanced DAG optimization (operator fusion, prescient eviction, smart scheduling) does require custom Python code, but basic matrix, linear-algebra, and statistical computations do not.
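
A hedged sketch of the fusion idea (names are illustrative, not taken from the implementation): collapse a chain of element-wise steps into one function so each block is read from disk once rather than once per step:

def fuse(ops):
    # Collapse [f, g, h] into a single per-block function.
    def fused(block):
        for op in ops:
            block = op(block)
        return block
    return fused

# e.g., normalize then scale in one pass over each block:
# fused = fuse([lambda x: (x - mean) / std, lambda x: 2 * x])
# for block, idx in A.iterate_blocks():
#     consume(fused(block))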

Proven Practice

Successful frameworks (Dask, Vaex, RAPIDS) leverage existing libraries for in-block ops and focus on orchestration, DAG construction, scheduling, and buffer management.

Conclusion

We do not need to reimplement fundamental matrix and array ops if we architect the OOC framework to reuse mature libraries for in-core computations. The main engineering lift is building the block orchestration, I/O scheduling, and memory management, which leverages and extends, rather than replaces, existing scientific Python code.</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits October 28, 2025 02:37
Co-authored-by: j143 <53068787+j143@users.noreply.github.com>
Copilot AI changed the title [WIP] Add operators support for ML algorithm capabilities Add OOCMatrix API for out-of-core operations with NumPy/SciPy backend orchestration Oct 28, 2025
Copilot AI requested a review from j143 October 28, 2025 02:45
@j143 j143 closed this Nov 20, 2025
