hintlogp <name of control file> <name of molecular structure file> <name of output file>
for example:
hintlogp demo_control.dat demo.sdf demo.sdf.out
CONTROL FILE
The control file contains keywords which define the molecular structure file format, the algorithms used, the information output to the output file, etc. The keywords are described in the table below. An example of this file is shown here.
DETAIL moleculelevel
INPUTFORMAT oelibsmiles
WARNINGS on
ERRORS skip
INDEX on
OUTPUT record
OUTPUT name
OUTPUT formula
OUTPUT weight
OUTPUT logp
DELIMITER space
GO
Each control file must end with "GO", after all other options. In this control file we only want to view the logP data for the whole molecule (rather than at the atom level), along with the record number, molecule name, molecular formula, and molecular weight. These fields are separated by spaces. INDEX writes the record number of each processed molecule to stderr. Our molecular structure file is in the SMILES format, but it is read by the OElib code. The four options for INPUTFORMAT are "daylightsmiles", "oelibsmiles", "oelibsdf" and "oelibmol2". WARNINGS are turned on (sending information to stderr), and ERRORS is set to skip, so a serious error causes the program to skip the calculation for that specific structure and proceed to the next.
Table 3: Control File KEYWORDS
| keyword | options | function |
|---|---|---|
| DETAIL | moleculelevel | print the logP only for the whole molecule |
| | atomlevel | print hydropathy information for atom components of the molecule |
| OUTPUT | record or norecord | print the record number in the outfile (or not) |
| | name or noname | print the molecule name in the outfile (or not) |
| | formula or noformula | print the molecular formula in the outfile (or not) |
| | weight or noweight | print the molecular weight in the outfile (or not) |
| | logp or nologp | print the value of the logP in the outfile (or not) |
| INPUTFORMAT | daylightsmiles | use (licensed) Daylight toolkit to interpret molecule structure |
| | oelibsmiles | use OElib SMILES routine to interpret molecule structure |
| | oelibsdf | use OElib SDF (MDL) routine to interpret molecule structure |
| | oelibmol2 | use OElib MOL2 (Sybyl) routine to interpret molecule structure |
| DELIMITER | space | information in each record is delimited by a space |
| | comma | information in each record is delimited by a comma |
| WARNINGS | on* | write non-serious warning messages to stderr |
| | off | do not write non-serious warning messages to stderr |
| ERRORS | exit | exit on serious error |
| | skip* | skip current calculation, move to next molecule on serious error |
| | continue | continue current calculation, even with compromised input data |
| INDEX | on* | write index (or record) number for each processed molecule to stderr |
| | off | do not write index number for each processed molecule to stderr |
| GO | n/a | end of options input, begin calculations |
* - default settings for program parameters.
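The keywords in Table 3 lend themselves to a small generator that rejects misspelled options before hintlogp ever sees them. The following is a minimal sketch, not part of the hintlogp distribution: the function name build_control_file is hypothetical, and it assumes the control file holds one "KEYWORD option" pair per line, with lowercase options as printed in the table.

```python
# Hypothetical helper, not part of the hintlogp distribution.
# Option lists are copied from Table 3.
SINGLE_VALUE_KEYWORDS = {
    "DETAIL": ("moleculelevel", "atomlevel"),
    "INPUTFORMAT": ("daylightsmiles", "oelibsmiles", "oelibsdf", "oelibmol2"),
    "DELIMITER": ("space", "comma"),
    "WARNINGS": ("on", "off"),
    "ERRORS": ("exit", "skip", "continue"),
    "INDEX": ("on", "off"),
}
OUTPUT_FIELDS = ("record", "name", "formula", "weight", "logp")


def build_control_file(settings, outputs):
    """Return control-file text for the given keyword settings.

    settings: dict mapping a keyword to one option, e.g. {"DETAIL": "atomlevel"}
    outputs:  iterable of OUTPUT options, e.g. ["record", "logp", "noname"]
    Assumption: one "KEYWORD option" pair per line.
    """
    valid_outputs = set(OUTPUT_FIELDS) | {"no" + f for f in OUTPUT_FIELDS}
    lines = []
    for keyword, option in settings.items():
        if keyword not in SINGLE_VALUE_KEYWORDS:
            raise ValueError(f"unknown keyword: {keyword}")
        if option not in SINGLE_VALUE_KEYWORDS[keyword]:
            raise ValueError(f"invalid option for {keyword}: {option}")
        lines.append(f"{keyword} {option}")
    for option in outputs:
        if option not in valid_outputs:
            raise ValueError(f"invalid OUTPUT option: {option}")
        lines.append(f"OUTPUT {option}")
    lines.append("GO")  # GO must come after all other options
    return "\n".join(lines) + "\n"
```

Calling build_control_file({"DETAIL": "moleculelevel", "INPUTFORMAT": "oelibsdf"}, ["record", "logp"]) returns text that can be written to a file such as demo_control.dat; keywords left out of settings fall back to the program defaults marked with * in the table.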
MOLECULAR STRUCTURE FILE
The program expects an input molecular structure file which can be in one of three formats. The formats are described in more detail in Input File Formats.
SMILES format, read either by the Daylight SMILES Toolkit (this requires a run-time "smiles" license from Daylight) or by the OElib SMILES routine (INPUTFORMAT oelibsmiles).
MDL SDFile format read by OElib code from OpenEye (this does not require any additional code or license).
Tripos Sybyl/MOL2 format read by OElib code from OpenEye (this does not require any additional code or license).
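Because the INPUTFORMAT keyword has to match the structure file, it can help to derive one from the other. The sketch below is illustrative only: the function guess_inputformat is hypothetical, and the extension mapping is inferred from the demo file names (demo1.smi, demo3.sdf, demo5.mol2) rather than from any rule enforced by hintlogp.

```python
from pathlib import Path

# Mapping inferred from the demo file names; it is a naming convention,
# not something hintlogp itself checks (assumption).
EXTENSION_TO_INPUTFORMAT = {
    ".smi": "oelibsmiles",
    ".sdf": "oelibsdf",
    ".mol2": "oelibmol2",
}


def guess_inputformat(structure_file, use_daylight=False):
    """Suggest an INPUTFORMAT option from the structure file's extension.

    use_daylight selects "daylightsmiles" for SMILES files, which needs
    the licensed Daylight toolkit.
    """
    suffix = Path(structure_file).suffix.lower()
    if suffix not in EXTENSION_TO_INPUTFORMAT:
        raise ValueError(f"no known INPUTFORMAT for extension {suffix!r}")
    if suffix == ".smi" and use_daylight:
        return "daylightsmiles"
    return EXTENSION_TO_INPUTFORMAT[suffix]
```

For example, guess_inputformat("demo3.sdf") suggests oelibsdf, and guess_inputformat("demo1.smi", use_daylight=True) suggests daylightsmiles.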
OUTPUT FILE
The structure of the output file depends on the keywords used in the control file. For example, the keyword DETAIL moleculelevel provides the most concise logP output, while DETAIL atomlevel provides extensive details on hydropathy components in both atoms and fragments of the molecule. Be careful when using the atomlevel option with large databases, as it can produce a very large output file.
When a large database is used and INDEX is set to "on", it may be desirable to save the indexing information to a file so that you can determine which molecules in the database have problems that may need correcting. This information is written to stderr, and therefore it can be collected in a separate file using one of the following commands:
UNIX/LINUX: $HINT_RUN/hintlogp control.dat database1.smi database1.s >& hint.log
Windows 2000/XP: HINTLOGP control.dat database1.smi database1.s 2> hint.log
Note that LICENSE errors (which are the most common) are also written to stderr, so any time your output file is missing information, you should check the terminal or the stderr log file.
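The same stderr capture can be done from a script, which is convenient when several databases are processed in a batch. This is a minimal sketch using Python's standard subprocess module; the function name run_with_stderr_log is hypothetical, and only stderr is redirected, so anything the program prints to stdout still reaches the terminal.

```python
import subprocess


def run_with_stderr_log(command, log_path):
    """Run a command, write its stderr to log_path, and return the exit code.

    command is a list such as
    ["hintlogp", "<control file>", "<structure file>", "<output file>"].
    """
    with open(log_path, "w") as log:
        completed = subprocess.run(command, stderr=log)
    return completed.returncode


# Usage (assumes hintlogp is on the PATH; file names are placeholders):
# run_with_stderr_log(
#     ["hintlogp", "control.dat", "database1.smi", "my_output_file"],
#     "hint.log",
# )
```

After the run, hint.log holds the INDEX record numbers together with any warnings and LICENSE errors, in the order the program wrote them.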
hintlogp <name of control file> <name of molecular structure file> <name of output file> [>& hint.log]
First, copy the files in the hintlogp3.06_
DEMO1
The file demo1.smi is simply benzene. To run (the $HINT_RUN/ prefix is not necessary if this directory is in your $PATH):
$HINT_RUN/hintlogp demo_control.dat demo1.smi demo1.out
Compare demo1.out (new) and demo1.OUT (archival) for differences.
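The comparison between a new output file and its archival copy can be scripted for all five demos. The sketch below uses Python's standard difflib module; the function name diff_outputs is hypothetical, and it performs an exact line-by-line text comparison, so harmless formatting differences would also be reported.

```python
import difflib
from pathlib import Path


def diff_outputs(new_path, archival_path):
    """Return unified-diff lines between the archival and the new output.

    An empty list means the two files have identical lines.
    """
    new_lines = Path(new_path).read_text().splitlines()
    old_lines = Path(archival_path).read_text().splitlines()
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=str(archival_path),
            tofile=str(new_path),
            lineterm="",
        )
    )
```

For DEMO1 this would be called as diff_outputs("demo1.out", "demo1.OUT"); printing the returned lines shows archival records prefixed with "-" and new records prefixed with "+".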
DEMO2
The file demo2.smi contains 100 molecules of varying complexity. To run:
$HINT_RUN/hintlogp demo_control.dat demo2.smi demo2.out
Compare demo2.out with demo2.OUT.
DEMO3
Edit demo_control.dat and change the INPUTFORMAT keyword from oelibsmiles to oelibsdf.
The file demo3.sdf contains 12 simple molecules. To run:
$HINT_RUN/hintlogp demo_control.dat demo3.sdf demo3.out
Compare demo3.out with demo3.OUT.
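Switching INPUTFORMAT between demos can also be done programmatically instead of in a text editor. This is a minimal sketch under the assumption that the control file holds one "KEYWORD option" pair per line; the function name set_inputformat is hypothetical and is not part of hintlogp.

```python
INPUTFORMAT_OPTIONS = ("daylightsmiles", "oelibsmiles", "oelibsdf", "oelibmol2")


def set_inputformat(control_text, new_format):
    """Return control-file text with the INPUTFORMAT option replaced.

    Assumes one "KEYWORD option" pair per line (an assumption about the
    file layout) and leaves every other line unchanged.
    """
    if new_format not in INPUTFORMAT_OPTIONS:
        raise ValueError(f"invalid INPUTFORMAT option: {new_format}")
    updated = []
    replaced = False
    for line in control_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "INPUTFORMAT":
            updated.append(f"INPUTFORMAT {new_format}")
            replaced = True
        else:
            updated.append(line)
    if not replaced:
        raise ValueError("no INPUTFORMAT line found in control file")
    return "\n".join(updated) + "\n"
```

Reading demo_control.dat, passing its text through set_inputformat(text, "oelibsdf"), and writing the result back reproduces the manual edit described for DEMO3; the same call with "oelibmol2" covers DEMO5.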
DEMO4
The file demo4.sdf contains 50 molecules of moderate complexity. To run:
$HINT_RUN/hintlogp demo_control.dat demo4.sdf demo4.out
Compare demo4.out with demo4.OUT.
DEMO5
Edit demo_control.dat and change the INPUTFORMAT keyword to oelibmol2.
The file demo5.mol2 contains 10 fairly simple molecules. To run:
$HINT_RUN/hintlogp demo_control.dat demo5.mol2 demo5.out
Compare demo5.out with demo5.OUT.