96 changes: 94 additions & 2 deletions README.md
@@ -1,2 +1,94 @@
# Research_Legacy_Codes
This repository stores code used in my previous research projects. The scripts are archived for reference and are not actively maintained, except when needed for review or specific requests.
# ResearchCodes

ResearchCodes is a personal research code vault. Everything lives under `Projects/` as self-contained project folders (example data, Python modules, notebooks, notes). The goal is to make it easy to clone the repo anywhere, reproduce the environment, and reuse the code later.

## Quick start

### 1) Clone

```bash
git clone git@github.com:Yong2Sheng/ResearchCodes.git
cd ResearchCodes
```

### 2) Create the conda environment from `environment.yml`

Create a fresh environment:

```bash
conda env create -f environment.yml
conda activate researchcode
```
Notes:
- `environment.yml` is the source of truth for dependencies.
- If you add a new import in any project module, update `environment.yml` accordingly.
- Keep the environment reproducible on a clean machine: CI builds it from `environment.yml` and runs the smoke tests against it.

During development, you can update an existing environment to match `environment.yml` (and remove packages that are no longer listed):

```bash
conda env update -f environment.yml --prune
conda activate researchcode
```

### 3) Install pre-push checks (optional but recommended for pushing changes)

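This repo is configured to run checks on `git push`, not on every commit. Install the hooks once per clone (`-f` overwrites any hook script that is already installed). If `.pre-commit-config.yaml` does not set `default_install_hook_types`, add `--hook-type pre-push` to the command.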

```bash
pre-commit install -f
```

Manually run the same checks that would run on push:

```bash
pre-commit run --all-files --hook-stage pre-push
```

### 4) Run smoke tests locally

Smoke tests are intentionally lightweight. They mainly validate that key modules can be imported in the current environment.

```bash
pytest -q -m smoke -x -ra
```

To list which smoke tests would be collected without running them:

```bash
pytest -m smoke --collect-only
```
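A smoke test of this kind can be very small. The sketch below is illustrative and is not the repo's actual test file: `KEY_MODULES`, `import_failures`, and the test name are hypothetical, and the two standard-library modules are placeholders for whatever your projects import.

```python
import importlib

# Hypothetical list: replace with the modules your projects rely on.
KEY_MODULES = ["json", "math"]


def import_failures(module_names):
    """Try to import each module and return {name: error} for those that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # some packages fail with errors other than ImportError
            failures[name] = repr(exc)
    return failures


# In a real test file, add `import pytest` and decorate this function with
# `@pytest.mark.smoke` so that `pytest -m smoke` selects it.
def test_key_modules_import():
    assert import_failures(KEY_MODULES) == {}
```

Collecting all failures in one dictionary means a single run reports every missing module instead of stopping at the first one.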

## Repository layout

- `Projects/`
  - Each subfolder is an independent project workspace.
  - Typical contents: `*.py`, notebooks, configs, small example data.
- `tests/`
  - Smoke tests (import checks and minimal sanity checks).
- `environment.yml`
  - Conda environment specification (primary dependency source of truth).
- `pyproject.toml`
  - Pytest configuration (markers, test discovery).
- `.pre-commit-config.yaml`
  - Pre-push hooks (lint, hygiene checks, smoke test).
- `.github/workflows/ci.yml`
  - CI runs pre-commit (full repo) and smoke tests on a clean environment.

## CI on GitHub

GitHub Actions runs two jobs on every push and pull request:
- A `pre-commit` job that runs repository hooks against the full repository.
- A `smoke` job that builds the conda environment from `environment.yml` and runs `pytest -m smoke`.

CI is useful because it runs in a clean environment. If something imports locally only because your machine already had an extra dependency installed, CI will catch it.
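One way to spot this kind of drift before pushing is to compare what a module imports with what you have declared. The following is a rough sketch and not part of this repo: the function names are hypothetical, and the declared names are passed in by hand instead of being parsed from `environment.yml`. Conda package names do not always match import names (`scikit-learn` is imported as `sklearn`), and your own project modules will also show up as undeclared, so treat the result as a hint only.

```python
import ast
import sys


def top_level_imports(source):
    """Return the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from . import helpers`
            names.add(node.module.split(".")[0])
    return names


def undeclared_imports(source, declared):
    """Imports that are neither declared nor part of the standard library."""
    # sys.stdlib_module_names exists on Python 3.10+; older versions fall back to empty.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return top_level_imports(source) - set(declared) - set(stdlib)


example = (
    "import os\n"
    "import numpy as np\n"
    "from scipy import signal\n"
    "from . import helpers\n"
)
print(sorted(undeclared_imports(example, declared={"numpy"})))
```

On Python 3.10 or newer this reports only `scipy`, the one third-party import missing from the declared set.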

## Important notes and conventions

### Auto-fix hooks are restricted

Some hooks can automatically edit files (for example, by trimming trailing whitespace). This repo restricts auto-fix hooks to code and configuration file extensions to prevent accidental edits to scientific data files.
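pre-commit restricts a hook with a `files` pattern, which is a Python regular expression searched against each repo-relative path. The sketch below shows how such a pattern behaves. The extension list and the example paths are assumptions for illustration; the actual pattern is whatever `.pre-commit-config.yaml` defines.

```python
import re

# Hypothetical allow-list of code and configuration extensions.
AUTOFIX_FILES = re.compile(r"\.(py|toml|ya?ml|cfg|ini)$")


def autofix_applies(path):
    """Return True if an auto-fix hook limited by AUTOFIX_FILES would touch this path."""
    return AUTOFIX_FILES.search(path) is not None


# Hypothetical paths: code and config files match, data files do not.
for path in [
    "Projects/demo/analysis.py",
    "environment.yml",
    "Projects/demo/data/trace.csv",
    "Projects/demo/data/spectrum.dat",
]:
    print(f"{path}: {autofix_applies(path)}")
```

An allow-list is the safer direction here: a data format nobody thought to exclude is left alone by default.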

### Large data

Avoid committing large raw datasets into git. Prefer small example inputs that are sufficient to reproduce a workflow, and store large data elsewhere.
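To check a working tree for oversized files before committing, something like the following can help. It is a minimal sketch under stated assumptions: `find_large_files` is a hypothetical helper, and the 5 MB default threshold is arbitrary, not a limit this repo defines.

```python
from pathlib import Path


def find_large_files(root, max_bytes=5 * 1024 * 1024):
    """Return (path, size) pairs for files under root larger than max_bytes, largest first."""
    large = []
    for path in Path(root).rglob("*"):
        if ".git" in path.parts:
            continue  # skip Git's own object store
        if path.is_file():
            size = path.stat().st_size
            if size > max_bytes:
                large.append((path, size))
    return sorted(large, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Scan the current directory and report anything above the threshold.
    for path, size in find_large_files("."):
        print(f"{size / 1e6:8.1f} MB  {path}")
```

Run it from the repository root; any file it lists is a candidate for replacing with a smaller example input.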