diff --git a/README.md b/README.md
index c2e0a5f..991d347 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,3 @@
-
-
 [![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://moleculehub.github.io/OpenBabel.jl/dev/)
 [![Code Style: Blue](https://img.shields.io/badge/code%20style-blue-4495d1.svg)](https://github.com/JuliaDiff/BlueStyle)
 [![Aqua QA](https://raw.githubusercontent.com/JuliaTesting/Aqua.jl/master/badge.svg)](https://github.com/JuliaTesting/Aqua.jl)
diff --git a/assets/logo.png b/assets/logo.png
deleted file mode 100644
index d1b706c..0000000
Binary files a/assets/logo.png and /dev/null differ
diff --git a/docs/src/api/energy.md b/docs/src/api/energy.md
index 23e65ba..3ae25f6 100644
--- a/docs/src/api/energy.md
+++ b/docs/src/api/energy.md
@@ -2,26 +2,55 @@
 
 These macros perform energy calculations, geometry optimizations, and advanced molecular computations.
 
-## Energy Calculations
+## Calculate Energy
 
 ```@docs
 @calculate_energy
-@minimize_energy
-@add_partial_charges
 ```
 
-## Structure Generation
+### Usage Examples
 
-```@docs
-@canonicalize
-@generate_conformers
+Calculate energy using different force fields:
+
+```julia
+# MMFF94 forcefield (default)
+@chain begin
+    @read_file("molecules.smi", "smi")
+    @gen_3D_coords("fast")
+    @calculate_energy("MMFF94")
+    @output_as("energies.sdf", "sdf")
+    @execute
+end
+
+# UFF forcefield
+@chain begin
+    @read_file("molecules.smi", "smi")
+    @gen_3D_coords("med")
+    @calculate_energy("UFF")
+    @output_as("uff_energies.sdf", "sdf")
+    @execute
+end
 ```
 
-## Usage Examples
+Energy ranking workflow:
 
-### Energy Minimization Workflow
+```julia
+@chain begin
+    @read_file("conformers.sdf", "sdf")
+    @calculate_energy("MMFF94")
+    @sort_by("Energy") # Sort by ascending energy (most stable first)
+    @output_as("energy_ranked.sdf", "sdf")
+    @execute
+end
+```
+
+## Minimize Energy
 
-Optimize molecular geometries using different forcefields:
+```@docs
+@minimize_energy
+```
+
+### Usage Examples
 
 ```julia
 # MMFF94 forcefield
(default)
@@ -29,7 +58,6 @@ Optimize molecular geometries using different forcefields:
     @read_file("molecules.smi", "smi")
     @gen_3D_coords("fast")
     @minimize_energy("MMFF94")
-    @calculate_energy("MMFF94")
     @output_as("optimized.sdf", "sdf")
     @execute
 end
@@ -44,19 +72,26 @@ end
 end
 ```
 
-### Available Forcefields
+Complete energy minimization workflow:
 
-| Forcefield | Description | Best for |
-|------------|-------------|----------|
-| `MMFF94` | Merck Molecular Force Field | General organic molecules |
-| `MMFF94s` | MMFF94 static | Static conformations |
-| `UFF` | Universal Force Field | Broad chemical space |
-| `GAFF` | General AMBER Force Field | Drug-like molecules |
-| `Ghemical` | Ghemical Force Field | Quick calculations |
+```julia
+@chain begin
+    @read_file("molecules.smi", "smi")
+    @gen_3D_coords("fast")
+    @minimize_energy("MMFF94")
+    @calculate_energy("MMFF94")
+    @output_as("optimized.sdf", "sdf")
+    @execute
+end
+```
 
-### Partial Charge Calculation
+## Add Partial Charges
 
-Add partial charges using different methods:
+```@docs
+@add_partial_charges
+```
+
+### Usage Examples
 
 ```julia
 # Gasteiger charges (default)
@@ -78,7 +113,7 @@ end
 end
 ```
 
-### Available Charge Methods
+Available charge methods:
 
 | Method | Description | Accuracy | Speed |
 |--------|-------------|----------|-------|
@@ -88,7 +123,32 @@ end
 | `eqeq` | EQEq charges | Medium | Fast |
 | `eem` | Electronegativity Equalization | Medium | Medium |
 
-### Conformer Generation
+## Canonicalize
+
+```@docs
+@canonicalize
+```
+
+### Usage Examples
+
+Standardize molecular representation:
+
+```julia
+@chain begin
+    @read_file("compounds.smi", "smi")
+    @canonicalize()
+    @output_as("canonical_structures.smi", "smi")
+    @execute
+end
+```
+
+## Generate Conformers
+
+```@docs
+@generate_conformers
+```
+
+### Usage Examples
 
 Generate multiple conformations:
 
@@ -103,9 +163,19 @@ Generate multiple conformations:
 end
 ```
 
-### Complete Energy Workflow
+## Available Forcefields
 
-Comprehensive energy
calculation pipeline:
+| Forcefield | Description | Best for |
+|------------|-------------|----------|
+| `MMFF94` | Merck Molecular Force Field | General organic molecules |
+| `MMFF94s` | MMFF94 static | Static conformations |
+| `UFF` | Universal Force Field | Broad chemical space |
+| `GAFF` | General AMBER Force Field | Drug-like molecules |
+| `Ghemical` | Ghemical Force Field | Quick calculations |
+
+## Complete Workflows
+
+### Comprehensive Energy Calculation Pipeline
 
 ```julia
 @chain begin
@@ -124,8 +194,6 @@ end
 
 ### Drug Discovery Pipeline
 
-Typical computational chemistry workflow for drug discovery:
-
 ```julia
 @chain begin
     @read_file("drug_candidates.smi", "smi")
diff --git a/docs/src/api/filtering.md b/docs/src/api/filtering.md
index e5ff651..ec6798f 100644
--- a/docs/src/api/filtering.md
+++ b/docs/src/api/filtering.md
@@ -2,103 +2,229 @@
 
 These macros filter, sort, and manipulate molecular datasets based on structural patterns and properties.
 
-## Pattern Matching
+## Match SMARTS String
 
 ```@docs
 @match_smarts_string
+```
+
+### Usage Examples
+
+Filter molecules containing benzene rings:
+
+```julia
+@chain begin
+    @read_file("database.smi", "smi")
+    @match_smarts_string("c1ccccc1") # Aromatic benzene ring
+    @output_as("benzene_compounds.smi", "smi")
+    @execute
+end
+```
+
+Common SMARTS patterns:
+
+| Pattern | Description |
+|---------|-------------|
+| `c1ccccc1` | Benzene ring |
+| `[OH]` | Hydroxyl group |
+| `[NH2]` | Primary amine |
+| `C=O` | Carbonyl group |
+| `[#6]=[#8]` | Carbon double bonded to oxygen |
+| `[R]` | Any atom in a ring |
+
+## Don't Match SMARTS String
+
+```@docs
 @dont_match_smarts_string
 ```
 
-## Sorting and Deduplication
+### Usage Examples
+
+Exclude molecules with specific functional groups:
+
+```julia
+@chain begin
+    @read_file("compounds.smi", "smi")
+    @dont_match_smarts_string("[OH]") # Remove alcohols
+    @output_as("no_alcohols.smi", "smi")
+    @execute
+end
+```
+
+## Sort By
 
 ```@docs
 @sort_by
+```
+
+### Usage
Examples
+
+```julia
+@chain begin
+    @read_file("molecules.sdf", "sdf")
+    @add_properties(["MW"])
+    @sort_by("MW") # Ascending order
+    @output_as("sorted_by_mw.sdf", "sdf")
+    @execute
+end
+```
+
+## Sort By Reverse
+
+```@docs
 @sort_by_reverse
+```
+
+### Usage Examples
+
+Sort by logP in descending order:
+
+```julia
+@chain begin
+    @read_file("molecules.sdf", "sdf")
+    @add_properties(["logP"])
+    @sort_by_reverse("logP") # Descending order
+    @output_as("high_logp_first.sdf", "sdf")
+    @execute
+end
+```
+
+## Remove Duplicate Molecules
+
+```@docs
 @remove_duplicate_mols
 ```
 
-## Structural Modifications
+### Usage Examples
+
+Remove duplicate structures:
+
+```julia
+@chain begin
+    @read_file("raw_data.smi", "smi")
+    @remove_duplicate_mols()
+    @output_as("unique_molecules.smi", "smi")
+    @execute
+end
+```
+
+## Convert Dative Bonds
 
 ```@docs
 @convert_dative_bonds
-@remove_hydrogens
-@set_atom_order_canonical
-@separate_fragments
 ```
 
-## Data Processing
+### Usage Examples
+
+Convert dative bonds to standard representation:
+
+```julia
+@chain begin
+    @read_file("complexes.sdf", "sdf")
+    @convert_dative_bonds()
+    @output_as("converted_bonds.sdf", "sdf")
+    @execute
+end
+```
+
+## Remove Hydrogens
 
 ```@docs
-@ignore_bad_molecules
-@start_with_index
+@remove_hydrogens
 ```
 
-## Usage Examples
+### Usage Examples
 
-### SMARTS Pattern Filtering
+Remove explicit hydrogens:
 
-Filter molecules containing benzene rings:
+```julia
+@chain begin
+    @read_file("molecules.sdf", "sdf")
+    @remove_hydrogens()
+    @output_as("implicit_h.sdf", "sdf")
+    @execute
+end
+```
+
+## Set Atom Order Canonical
+
+```@docs
+@set_atom_order_canonical
+```
+
+### Usage Examples
+
+Standardize atom ordering:
 
 ```julia
 @chain begin
-    @read_file("database.smi", "smi")
-    @match_smarts_string("c1ccccc1") # Aromatic benzene ring
-    @output_as("benzene_compounds.smi", "smi")
+    @read_file("molecules.sdf", "sdf")
+    @set_atom_order_canonical()
+    @output_as("canonical_order.sdf", "sdf")
     @execute
 end
 ```
 
-Exclude
molecules with specific functional groups:
+## Separate Fragments
+
+```@docs
+@separate_fragments
+```
+
+### Usage Examples
+
+Split salts and complexes:
 
 ```julia
 @chain begin
-    @read_file("compounds.smi", "smi")
-    @dont_match_smarts_string("[OH]") # Remove alcohols
-    @output_as("no_alcohols.smi", "smi")
+    @read_file("salts.sdf", "sdf")
+    @separate_fragments()
+    @output_as("fragments.sdf", "sdf")
     @execute
 end
 ```
 
-### Common SMARTS Patterns
+## Ignore Bad Molecules
 
-| Pattern | Description |
-|---------|-------------|
-| `c1ccccc1` | Benzene ring |
-| `[OH]` | Hydroxyl group |
-| `[NH2]` | Primary amine |
-| `C=O` | Carbonyl group |
-| `[#6]=[#8]` | Carbon double bonded to oxygen |
-| `[R]` | Any atom in a ring |
+```@docs
+@ignore_bad_molecules
+```
 
-### Sorting by Properties
+### Usage Examples
 
-Sort molecules by molecular weight:
+Skip invalid molecular structures:
 
 ```julia
 @chain begin
-    @read_file("molecules.sdf", "sdf")
-    @add_properties(["MW"])
-    @sort_by("MW") # Ascending order
-    @output_as("sorted_by_mw.sdf", "sdf")
+    @read_file("raw_data.smi", "smi")
+    @ignore_bad_molecules()
+    @output_as("valid_molecules.smi", "smi")
     @execute
 end
 ```
 
-Sort by logP in descending order:
+## Start With Index
+
+```@docs
+@start_with_index
+```
+
+### Usage Examples
 
 ```julia
 @chain begin
-    @read_file("molecules.sdf", "sdf")
-    @add_properties(["logP"])
-    @sort_by_reverse("logP") # Descending order
-    @output_as("high_logp_first.sdf", "sdf")
+    @read_file("huge_database.sdf", "sdf")
+    @start_with_index(1000) # Begin from molecule 1000
+    @add_properties(["MW", "logP"])
+    @sort_by("MW")
+    @output_as("subset_processed.sdf", "sdf")
     @execute
 end
 ```
 
-### Data Cleaning Workflow
+## Complete Workflows
 
-Complete data cleaning and processing pipeline:
+### Data Cleaning Pipeline
 
 ```julia
 @chain begin
@@ -114,17 +240,16 @@ Complete data cleaning and processing pipeline:
 end
 ```
 
-### Processing Large Datasets
-
-Start processing from a specific molecule index:
+### Structure-Based
Filtering
 
 ```julia
 @chain begin
-    @read_file("huge_database.sdf", "sdf")
-    @start_with_index(1000) # Begin from molecule 1000
+    @read_file("compounds.smi", "smi")
+    @match_smarts_string("c1ccccc1") # Keep aromatic compounds
+    @dont_match_smarts_string("[OH]") # Remove alcohols
     @add_properties(["MW", "logP"])
-    @sort_by("MW")
-    @output_as("subset_processed.sdf", "sdf")
+    @sort_by_reverse("logP")
+    @output_as("filtered_aromatics.sdf", "sdf")
     @execute
 end
 ```
\ No newline at end of file
diff --git a/docs/src/api/io.md b/docs/src/api/io.md
index 2bc1c4c..a1fc35b 100644
--- a/docs/src/api/io.md
+++ b/docs/src/api/io.md
@@ -2,42 +2,58 @@
 
 These macros handle the fundamental input and output operations for molecular data files.
 
-## File Input
+## Read File
 
 ```@docs
 @read_file
 ```
 
-## File Output
+### Usage Examples
 
-```@docs
-@output_as
-@write_multiple_files
+Basic file reading with format specification:
+
+```julia
+@chain begin
+    @read_file("molecules.smi", "smi")
+    @output_as("molecules.mol", "mol")
+    @execute
+end
 ```
 
-## Pipeline Execution
+## Output As
 
 ```@docs
-@execute
+@output_as
 ```
 
-## Usage Examples
-
-### Basic File Conversion
+### Usage Examples
 
-Convert SMILES to MOL format:
+Convert and save to different formats:
 
 ```julia
+# SMILES to SDF conversion
 @chain begin
-    @read_file("molecules.smi", "smi")
+    @read_file("compounds.smi", "smi")
+    @gen_3D_coords("fast")
+    @output_as("compounds.sdf", "sdf")
+    @execute
+end
+
+# Multiple format outputs
+@chain begin
+    @read_file("molecules.sdf", "sdf")
     @output_as("molecules.mol", "mol")
     @execute
 end
 ```
 
-### Multiple Output Files
+## Write Multiple Files
+
+```@docs
+@write_multiple_files
+```
 
-Create separate files for each molecule:
+### Usage Examples
 
 ```julia
 @chain begin
@@ -48,7 +64,26 @@ Create separate files for each molecule:
 end
 ```
 
-### Supported File Formats
+## Execute
+
+```@docs
+@execute
+```
+
+### Usage Examples
+
+Execute the processing pipeline:
+
+```julia
+@chain begin
+
@read_file("molecules.smi", "smi")
+    @add_properties(["MW", "logP"])
+    @output_as("processed.sdf", "sdf")
+    @execute # This executes the entire pipeline
+end
+```
+
+## Supported File Formats
 
 The library supports all Open Babel file formats:
 
@@ -60,4 +95,27 @@ The library supports all Open Babel file formats:
 | XYZ | `.xyz` | XYZ coordinate format |
 | PDB | `.pdb` | Protein Data Bank format |
 
-For a complete list, refer to the [Open Babel documentation](https://openbabel.org/docs/FileFormats/Overview.html).
\ No newline at end of file
+For a complete list, refer to the [Open Babel documentation](https://openbabel.org/docs/FileFormats/Overview.html).
+
+## Complete Workflows
+
+### Basic File Conversion
+
+```julia
+@chain begin
+    @read_file("molecules.smi", "smi")
+    @output_as("molecules.mol", "mol")
+    @execute
+end
+```
+
+### Batch Processing with Multiple Outputs
+
+```julia
+@chain begin
+    @read_file("database.sdf", "sdf")
+    @write_multiple_files()
+    @output_as("molecule", "mol")
+    @execute
+end
+```
\ No newline at end of file
diff --git a/docs/src/api/properties.md b/docs/src/api/properties.md
index 6139fc7..42ef321 100644
--- a/docs/src/api/properties.md
+++ b/docs/src/api/properties.md
@@ -2,35 +2,13 @@
 
 These macros add molecular properties, generate coordinates, and modify molecular metadata.
 
-## Property Calculation
+## Add Properties
 
 ```@docs
 @add_properties
 ```
 
-### Available Descriptors
-
-The `get_available_descriptors()` function returns a list of all molecular descriptors that can be calculated. If this function is not yet implemented, refer to the Open Babel documentation for available descriptors.
-
-## Coordinate Generation
-
-```@docs
-@gen_3D_coords
-@gen_2D_coords
-@center_coords_at_zero
-```
-
-## Metadata Addition
-
-```@docs
-@add_filename
-@add_index
-@add_title
-```
-
-## Usage Examples
-
-### Adding Molecular Properties
+### Usage Examples
 
 Calculate common molecular descriptors:
 
@@ -43,16 +21,7 @@ Calculate common molecular descriptors:
 end
 ```
 
-### Available Properties
-
-Get a list of all available descriptors:
-
-```julia
-descriptors = get_available_descriptors()
-println(descriptors)
-```
-
-Common properties include:
+Available properties include:
 - `MW` - Molecular weight
 - `logP` - Partition coefficient
 - `TPSA` - Topological polar surface area
@@ -62,9 +31,22 @@ Common properties include:
 - `atoms` - Number of atoms
 - `bonds` - Number of bonds
 
-### 3D Coordinate Generation
+### Available Descriptors
+
+The `get_available_descriptors()` function returns a list of all molecular descriptors that can be calculated:
+
+```julia
+descriptors = get_available_descriptors()
+println(descriptors)
+```
+
+## Generate 3D Coordinates
+
+```@docs
+@gen_3D_coords
+```
 
-Generate 3D structures with different quality levels:
+### Usage Examples
 
 ```julia
 # Fast generation (good for large datasets)
@@ -76,6 +58,44 @@ Generate 3D structures with different quality levels:
 end
 
 # Higher quality (slower)
+@chain begin
+    @read_file("input.smi", "smi")
+    @gen_3D_coords("slow")
+    @output_as("high_quality_3d.mol", "mol")
+    @execute
+end
+```
+
+## Generate 2D Coordinates
+
+```@docs
+@gen_2D_coords
+```
+
+### Usage Examples
+
+Generate 2D coordinates for visualization:
+
+```julia
+@chain begin
+    @read_file("molecules.smi", "smi")
+    @gen_2D_coords()
+    @output_as("molecules_2d.sdf", "sdf")
+    @execute
+end
+```
+
+## Center Coordinates at Zero
+
+```@docs
+@center_coords_at_zero
+```
+
+### Usage Examples
+
+Center molecular coordinates at the origin:
+
+```julia
 @chain begin
     @read_file("input.smi", "smi")
     @gen_3D_coords("slow")
@@ -85,17 +105,88 @@ end
 end
 ```
 
-###
Adding Metadata
+## Add Filename
+
+```@docs
+@add_filename
+```
+
+### Usage Examples
 
-Add indices and filenames to track molecule origins:
+Add the source filename as metadata:
 
 ```julia
 @chain begin
     @read_file("database.sdf", "sdf")
-    @add_index()
     @add_filename()
-    @add_title("Processed Database")
+    @output_as("molecules_with_filename.sdf", "sdf")
+    @execute
+end
+```
+
+## Add Index
+
+```@docs
+@add_index
+```
+
+### Usage Examples
+
+Add molecule indices for tracking:
+
+```julia
+@chain begin
+    @read_file("molecules.sdf", "sdf")
+    @add_index()
     @output_as("indexed_molecules.sdf", "sdf")
     @execute
 end
+```
+
+## Add Title
+
+```@docs
+@add_title
+```
+
+### Usage Examples
+
+```julia
+@chain begin
+    @read_file("molecules.sdf", "sdf")
+    @add_title("Processed Database")
+    @output_as("titled_molecules.sdf", "sdf")
+    @execute
+end
+```
+
+## Complete Workflows
+
+### Property Calculation Pipeline
+
+```julia
+@chain begin
+    @read_file("molecules.smi", "smi")
+    @gen_3D_coords("med")
+    @add_properties(["MW", "logP", "TPSA", "HBD", "HBA"])
+    @add_index()
+    @add_filename()
+    @output_as("full_analysis.sdf", "sdf")
+    @execute
+end
+```
+
+### Coordinate Generation with Metadata
+
+```julia
+@chain begin
+    @read_file("database.sdf", "sdf")
+    @gen_3D_coords("slow")
+    @center_coords_at_zero()
+    @add_index()
+    @add_filename()
+    @add_title("High Quality 3D Structures")
+    @output_as("processed_3d.sdf", "sdf")
+    @execute
+end
 ```
\ No newline at end of file
diff --git a/docs/src/assets/logo.png b/docs/src/assets/logo.png
new file mode 100644
index 0000000..0c2e6c3
Binary files /dev/null and b/docs/src/assets/logo.png differ
diff --git a/docs/src/index.md b/docs/src/index.md
index 7e4035c..fd7e4d2 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -1,6 +1,6 @@
 # OpenBabel.jl
 
-A Julia library for reading, writing, and transforming chemical data, powered by [Open Babel](https://github.com/openbabel/openbabel).
+Julia bindings for [Open Babel](https://github.com/openbabel/openbabel) library.
 
 ## Installation