Written by David Chalmers and Ben Roberts
Silico is a set of Perl libraries and scripts to assist with molecular modelling tasks. Capabilities include molecule and model system construction, format interconversion, molecular rotations and translations, atom and substructure naming, functional group identification and acting as a wrapper for other software.
Download Silico from GitHub
git clone git@github.com:dkchalmers/Silico.git
Silico is organised as a single directory tree:
bin Containing the executable scripts
data Atom types, boilerplate text describing silico flags, etc.
doc Documentation
lib Library files containing silico subroutines
test Directory with test files
Silico needs the environment variable $SILICO_HOME to point to the top level Silico directory. Add the $SILICO_HOME/bin directory to your path.
csh
setenv SILICO_HOME /path/to/Silico
setenv PATH $PATH\:$SILICO_HOME/bin
bash
export SILICO_HOME=/path/to/Silico
export PATH="$PATH:$SILICO_HOME/bin"
Silico scripts use the general form:
scriptname filename1 filename2 ....
Flags start with the character '-'. For example, the command
write_mol -o mol2 $SILICO_HOME/test/molecules/tricycle_3D.sdf
will read the file 'tricycle_3D.sdf' with the -o flag (output format) set to mol2. The result is a new output file, 'tricycle_3D_new.mol2'.
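The naming pattern in this example (input base name, then '_new', then the output format as the extension) can be sketched in a few lines of Perl. This is only an illustration: new_output_name is a hypothetical helper, not a Silico routine, and it makes no claim about which directory Silico writes to or how it names files in other cases.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper, not part of Silico: build an output filename from
# an input filename and the value given to the -o flag.
sub new_output_name {
    my ($infile, $format) = @_;

    # Split the path into base name, directory and final extension
    my ($base, $dir, $ext) = fileparse($infile, qr/\.[^.]*/);

    # Assumption: only the file name is returned, without the directory
    return "${base}_new.$format";
}

print new_output_name('test/molecules/tricycle_3D.sdf', 'mol2'), "\n";
```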
While most flags are specific to a script, several flags are handled in the silico_io module and operate in all Silico scripts. These are:
-help Print out the documentation present in the script file
-m Prompt for all flags
These flags are available in many Silico scripts:
-debug Print out debugging information
-nosort Do not change atom order
-noconnect Do not generate atom connectivities
-o Output file format. For most formats this is the file extension. E.g. pdb, mol2, sdf.
-O Output filename
Some Unix shells limit the number of arguments (files) that can be included on the command line. Silico supports very large lists of input files by using Perl regular expressions to carry out its own wildcard matching. To use this feature, enclose the file argument in quotes so that the shell does not expand it.
Example: write_mol2 '*.pdb'
This command enables conversion of a very long list of pdb files to mol2 format.
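A minimal sketch of the idea, assuming a simple translation of '*' and '?' into regular expression syntax. The helper names wildcard_to_regex and match_files are hypothetical and this is not the code Silico itself uses; in a real script the candidate names would come from readdir rather than a fixed list.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a shell-style wildcard into an anchored regex
sub wildcard_to_regex {
    my ($pattern) = @_;
    my $re = quotemeta $pattern;    # escape everything, including * and ?
    $re =~ s/\\\*/.*/g;             # escaped * becomes "any run of characters"
    $re =~ s/\\\?/./g;              # escaped ? becomes "any single character"
    return qr/^$re$/;
}

# Return the names that match the wildcard, in sorted order
sub match_files {
    my ($pattern, @names) = @_;
    my $re   = wildcard_to_regex($pattern);
    my @hits = sort grep { $_ =~ $re } @names;
    return @hits;
}

# Stand-in for a directory listing
my @names = ('a.pdb', 'b.pdb', 'notes.txt', 'c.pdb.bak');
print join(' ', match_files('*.pdb', @names)), "\n";
```

Because the matching happens inside Perl, the shell only ever sees the single quoted argument, however many files match.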
Silico scripts and libraries are documented within the source code. The documentation is marked up using a simple markup scheme (see silico_doc.pm) that can be converted to formatted ASCII text, markdown or HTML. Conversion routines are found in silico_doc.pm. The formatdoc script will perform the conversion for Silico files. A complete set of documentation is generated by running the script 'doc/makedoc'.
Internally, molecular structures are stored as ensembles, molecules and atoms.
Each atom is a Perl hash which can contain any desired fields. The most important are:
$atom->{NUM} Atom number
$atom->{NAME} Atom name
$atom->{SUBNAME} Residue name
$atom->{SUBID} Residue number
$atom->{X} X coordinate
$atom->{Y} Y coordinate
$atom->{Z} Z coordinate
$atom->{ELEMENT} Atom element
$atom->{ELEMENT_NUM} Atom element number
$atom->{CONNECT} Pointer to array containing atom numbers of connected atoms
$atom->{BORDERS} Pointer to array containing orders of bonds to connected atoms
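As an illustration, the atoms of a water molecule could be written out by hand as below. This is a sketch in plain Perl that does not use the Silico libraries. It assumes that the entries in CONNECT are values of NUM and that BORDERS runs parallel to CONNECT; describe_bonds is a hypothetical helper and the coordinates are approximate.

```perl
use strict;
use warnings;

# Three atom hashes using the fields listed above.
# Assumption: CONNECT holds NUM values; BORDERS holds the matching bond orders.
my @atoms = (
    { NUM => 1, NAME => 'OW',  SUBNAME => 'HOH', SUBID => 1, ELEMENT => 'O',
      X =>  0.000, Y => 0.000, Z => 0.000, CONNECT => [2, 3], BORDERS => [1, 1] },
    { NUM => 2, NAME => 'HW1', SUBNAME => 'HOH', SUBID => 1, ELEMENT => 'H',
      X =>  0.957, Y => 0.000, Z => 0.000, CONNECT => [1],    BORDERS => [1] },
    { NUM => 3, NAME => 'HW2', SUBNAME => 'HOH', SUBID => 1, ELEMENT => 'H',
      X => -0.240, Y => 0.927, Z => 0.000, CONNECT => [1],    BORDERS => [1] },
);

# Hypothetical helper: list the bonds made by one atom
sub describe_bonds {
    my ($atom, $atoms) = @_;

    # Index the atoms by NUM so connected atoms can be looked up
    my %by_num = map { ( $_->{NUM} => $_ ) } @$atoms;

    my @out;
    for my $i (0 .. $#{ $atom->{CONNECT} }) {
        my $other = $by_num{ $atom->{CONNECT}[$i] };
        push @out, sprintf '%s-%s (order %d)',
            $atom->{NAME}, $other->{NAME}, $atom->{BORDERS}[$i];
    }
    return @out;
}

print "$_\n" for describe_bonds($atoms[0], \@atoms);
```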
Molecules are Perl hashes that hold attributes such as the molecule name, the number of atoms and the collection of atom records. Any desired field can be added to the hash, but the most important are:
$mol->{ATOMS} Pointer to an array of atoms
$mol->{NAME} Molecule name
$mol->{NUMATOMS} Number of atoms. Note: indexed from 1 rather than 0.
$mol->{NUMBONDS} Number of bonds. Note: indexed from 1 rather than 0.
Ensembles are collections of molecules. They are stored simply as Perl arrays of molecules and, unlike molecules or atoms, cannot contain any additional attributes.
Ensembles are read from input files or written to output files as multiple molecule files (mol2, sdf, mmod) or as separate MODELS in a pdb file.
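The nesting of atoms inside molecules inside an ensemble can be sketched as follows. The example is plain Perl and does not call the Silico libraries: make_molecule and centroid are hypothetical helpers, only a few of the fields are filled in, and the ensemble is held as an array reference for convenience.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Silico routine): wrap atom hashes in a molecule hash
sub make_molecule {
    my ($name, @atoms) = @_;
    my %mol = (
        NAME     => $name,
        ATOMS    => [@atoms],
        NUMATOMS => scalar @atoms,
    );
    return \%mol;
}

# Mean position of all atoms in a molecule (assumes at least one atom)
sub centroid {
    my ($mol) = @_;
    my @sum = (0, 0, 0);
    for my $atom (@{ $mol->{ATOMS} }) {
        $sum[0] += $atom->{X};
        $sum[1] += $atom->{Y};
        $sum[2] += $atom->{Z};
    }
    return map { $_ / $mol->{NUMATOMS} } @sum;
}

my $h2 = make_molecule('hydrogen',
    { NUM => 1, ELEMENT => 'H', X => 0.00, Y => 0, Z => 0 },
    { NUM => 2, ELEMENT => 'H', X => 0.74, Y => 0, Z => 0 },
);
my $he = make_molecule('helium',
    { NUM => 1, ELEMENT => 'He', X => 1, Y => 2, Z => 3 },
);

# An ensemble is just an array of molecules (held here by reference)
my $ensemble = [ $h2, $he ];

for my $mol (@$ensemble) {
    printf "%-10s %d atoms, centroid %.2f %.2f %.2f\n",
        $mol->{NAME}, $mol->{NUMATOMS}, centroid($mol);
}
```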
The bin directory contains a variety of scripts that have been written in the course of various projects.
Silico can read and write files in several molecular modelling formats. The input file type is (mostly) recognised by filename extension. Some exceptions are made for formats that use a .out suffix, etc.
Silico can be used to make wrappers to run other programs from the command line, particularly when running a series of programs. An example is the script mmod_min.
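Recognition by extension can be pictured as a simple lookup. The sketch below is illustrative only: guess_format and its table are hypothetical, the table lists just the three extensions named earlier in this document, and Silico's own handling of special cases such as .out files is not reproduced.

```perl
use strict;
use warnings;

# Hypothetical lookup table: extension => format name
my %FORMAT_FOR_EXT = (
    pdb  => 'pdb',
    mol2 => 'mol2',
    sdf  => 'sdf',
);

# Return the format name for a filename, or undef if it is not recognised
sub guess_format {
    my ($filename) = @_;
    my ($ext) = $filename =~ /\.([^.\/]+)$/;    # text after the final dot
    return undef unless defined $ext;
    return $FORMAT_FOR_EXT{ lc $ext };
}

for my $file ('tricycle_3D.sdf', 'protein.PDB', 'job.out') {
    my $format = guess_format($file);
    printf "%-16s %s\n", $file, defined $format ? $format : 'needs another test';
}
```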
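The general shape of such a wrapper, running several programs in order and stopping at the first failure, might look like the sketch below. It is not taken from mmod_min: run_step is a hypothetical helper, and the two steps simply call the Perl interpreter itself ($^X) so that the example runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: run one external command and stop on failure
sub run_step {
    my (@command) = @_;
    print "Running: @command\n";
    my $status = system(@command);    # list form avoids shell quoting problems
    die "Command failed: @command\n" if $status != 0;
    return 1;
}

# Stand-in pipeline; a real wrapper would call modelling programs here
my @pipeline = (
    [ $^X, '-e', 'print "step 1\n"' ],
    [ $^X, '-e', 'print "step 2\n"' ],
);

run_step(@$_) for @pipeline;
```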
Basic solvation scripts include mol_solvate (to embed a molecule in a solvent box), mol_solvate_bilayer_plane (to place a molecule in a periodic bilayer plane) and mol_solvate_bilayer_rod (to place a periodic bilayer rod).
Silico certainly contains bugs, definitions of things that may not be the best way to define things, inefficient algorithms, poor memory usage, and other things that can easily be fixed with the application of a lot of time and effort.
Older iterations of silico were published on Sourceforge: sourceforge.net/projects/silico/
Please cite as: Silico: A Perl Molecular Toolkit. DK Chalmers and BP Roberts. https://github.com/dkchalmers/Silico
You can find papers that have used the Silico molecular toolkit here: https://scholar.google.com/scholar?cluster=10206653385737204589