Merged
6 changes: 0 additions & 6 deletions README.md
@@ -37,11 +37,5 @@ The repository structure is organized as follows:
- Inclusion of over 25 new slots.
- 5 new enumerations: EnumClinicalDataSourceType, EnumDataCategory, EnumGuidType, EnumParticipantLifespanStage, EnumResearchDomain.

### CLI Enhancements:

- **Validation**: Streamlines data cleaning and validation via the command line (CLI), allowing users to specify the data type and file path. The CLI reads, cleans, and validates data using LinkML-defined models for robust validation. For more details, use:

```bash
validate-data --help
```

61 changes: 61 additions & 0 deletions src/data_validation/README.md
@@ -0,0 +1,61 @@
# LinkML Schema Linting and Validation

This project uses [LinkML](https://linkml.io/) to define schemas and validate tabular data files. Below are the commands used to lint and validate schemas and data using `linkml-lint` and `linkml-validate`.

---

## 🔍 Schema Linting

We use `linkml-lint` to check for syntax and structure issues in the schema.

### 1. Full Linting Check

```bash
linkml-lint src/linkml/include_schema.yaml
```
- **More Info**: [linkml-lint CLI](https://linkml.io/linkml/cli/lint.html)

## ✅ Data Validation

We use `linkml-validate` to check whether data files conform to their schema definitions.

### 1. Validate `study.csv` against `Study` class
```bash
linkml-validate -s src/linkml/include_schema.yaml -C Study src/data/input/study.csv
```
### 2. Validate `participant.csv` against `Participant` class
```bash
linkml-validate -s src/linkml/include_schema.yaml -C Participant src/data/input/participant.csv
```
### 3. Validate `condition.csv` against `Condition` class
```bash
linkml-validate -s src/linkml/include_schema.yaml -C Condition src/data/input/condition.csv
```
### 4. Validate `biospecimen.csv` against `Biospecimen` class
```bash
linkml-validate -s src/linkml/include_schema.yaml -C Biospecimen src/data/input/biospecimen.csv
```
### 5. Validate `datafile.csv` against `DataFile` class
```bash
linkml-validate -s src/linkml/include_schema.yaml -C DataFile src/data/input/datafile.csv
```

- **More Info**: [linkml-validate CLI](https://linkml.io/linkml/cli/validate.html)
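
The five commands above differ only in the class name and the CSV file, so they can be generated from one table. The following is a minimal sketch, not part of this repository: the helper names `build_command` and `validate_all` are hypothetical, it assumes `linkml-validate` is on the `PATH`, and it assumes the tool exits with a non-zero status when validation fails. It uses only the `-s` and `-C` options shown above.

```python
import subprocess

SCHEMA = "src/linkml/include_schema.yaml"
INPUT_DIR = "src/data/input"

# Class name -> CSV file, mirroring the five commands listed above.
TARGETS = {
    "Study": "study.csv",
    "Participant": "participant.csv",
    "Condition": "condition.csv",
    "Biospecimen": "biospecimen.csv",
    "DataFile": "datafile.csv",
}


def build_command(class_name, csv_name, schema=SCHEMA, input_dir=INPUT_DIR):
    """Return the linkml-validate argument list for one class/file pair."""
    return ["linkml-validate", "-s", schema, "-C", class_name, f"{input_dir}/{csv_name}"]


def validate_all(targets=TARGETS):
    """Run each command in turn and return the class names whose run exited non-zero."""
    failed = []
    for class_name, csv_name in targets.items():
        result = subprocess.run(build_command(class_name, csv_name))
        if result.returncode != 0:  # assumption: non-zero means validation problems
            failed.append(class_name)
    return failed
```

Calling `validate_all()` from the repository root runs all five checks and returns the list of classes that did not pass, which keeps one failing file from hiding the results for the others.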

### 📤 Saving Validation Logs

To save validation output to a file (e.g., for documentation or reporting), redirect the output of `linkml-validate`:

```bash
linkml-validate -s src/linkml/include_schema.yaml -C Study src/data/input/study.csv > src/data/output/validation-report.md
```
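The redirect saves the output exactly as `linkml-validate` prints it. Changing the file extension (for example to `.txt`) only renames the file; it does not convert the content to another format.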
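
A shell redirect with `>` captures standard output only. As a hedged alternative, the sketch below captures both output streams and returns the exit code, so a script can save the report and still react to a failure. The helper name `run_and_save` is hypothetical, and which stream `linkml-validate` writes its messages to may depend on the installed version, which is why both are saved.

```python
import subprocess
from pathlib import Path


def run_and_save(command, report_path):
    """Run `command`, write its stdout and stderr to `report_path`, return the exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    report = Path(report_path)
    # Create the output directory if it does not exist yet.
    report.parent.mkdir(parents=True, exist_ok=True)
    report.write_text(result.stdout + result.stderr, encoding="utf-8")
    return result.returncode


# Example call (assumes linkml-validate is installed and on the PATH):
# run_and_save(
#     ["linkml-validate", "-s", "src/linkml/include_schema.yaml",
#      "-C", "Study", "src/data/input/study.csv"],
#     "src/data/output/validation-report.md",
# )
```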

## 📌 Notes

- Ensure all required fields are present in your CSV files.

- Column names in CSV files must match the schema slot names.

- The schema file (`src/linkml/include_schema.yaml`) must define all referenced classes (`Study`, `Participant`, etc.).
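
The column-name requirement in the notes above can be checked quickly before running the full validation. The sketch below is a header-only pre-check using the standard library, not a substitute for `linkml-validate`. The helper name `check_columns` is hypothetical, and the expected slot names must be supplied by the caller (for example, copied from the class definition in the schema).

```python
import csv


def check_columns(csv_path, slot_names):
    """Compare a CSV header row with the expected slot names.

    Returns (missing, unexpected): slots absent from the header, and
    header columns that are not in `slot_names`.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])  # empty file -> empty header
    columns = [name.strip() for name in header]
    missing = [slot for slot in slot_names if slot not in columns]
    unexpected = [col for col in columns if col not in slot_names]
    return missing, unexpected
```

An empty `missing` list does not guarantee that validation passes, since required values can still be blank in individual rows; it only rules out header mismatches.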
Empty file removed src/data_validation/__init__.py
Empty file.
43 changes: 0 additions & 43 deletions src/data_validation/cli.py

This file was deleted.

37 changes: 0 additions & 37 deletions src/data_validation/validate_biospecimen.py

This file was deleted.

34 changes: 0 additions & 34 deletions src/data_validation/validate_condition.py

This file was deleted.

33 changes: 0 additions & 33 deletions src/data_validation/validate_datafile.py

This file was deleted.

34 changes: 0 additions & 34 deletions src/data_validation/validate_dataset.py

This file was deleted.

20 changes: 0 additions & 20 deletions src/data_validation/validate_datasetmanifest.py

This file was deleted.

32 changes: 0 additions & 32 deletions src/data_validation/validate_participant.py

This file was deleted.

40 changes: 0 additions & 40 deletions src/data_validation/validate_study.py

This file was deleted.
