Using HintLogP

General Program Description

HintLogP is a file-oriented program with a terminal command-line interface. It requires two input files (a control file and a molecular structure file) and produces an output file whose structure is defined in the control file. The program is run from the command line of a character terminal (a DOS window in the Windows version) with the following syntax:
hintlogp <name of control file> <name of molecular structure file> <name of output file>
for example:
hintlogp demo_control.dat demo.sdf demo.sdf.out

The control file contains keywords which define the molecular structure file format, the algorithms used, the information output to the output file, etc. The keywords are described in the table below. An example of this file is shown here.

DETAIL moleculelevel 
INPUTFORMAT oelibsmiles 
ERRORS skip 
OUTPUT record 
OUTPUT name 
OUTPUT formula 
OUTPUT weight 
OUTPUT logp 
Each control file must end with "GO", after all other options. In this control file we only want to view the logP data for the whole molecule (rather than at the atom level), along with the record number, molecule name, molecular formula, and molecular weight. This information is separated by spaces. INDEX writes the record number for each processed molecule to stderr. Our molecular structure file will be in the SMILES format, but read by the OElib code. The four options for INPUTFORMAT are "daylightsmiles", "oelibsmiles", "oelibsdf" and "oelibmol2". WARNINGS are turned on (sending information to stderr), and ERRORS will cause the code to skip the calculations on that specific structure and proceed to the next.
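A control file of this kind could also be generated from a script. The following is a minimal sketch, not part of HintLogP: the helper name write_control_file and its parameters are hypothetical, and it only uses keywords and values that appear in this section. GO is always written last.

```python
from pathlib import Path

# INPUTFORMAT values listed in this manual.
ALLOWED_FORMATS = {"daylightsmiles", "oelibsmiles", "oelibsdf", "oelibmol2"}


def write_control_file(path, input_format, outputs,
                       detail="moleculelevel", errors="skip"):
    """Write a control file (hypothetical helper, not part of HintLogP).

    Returns the list of lines that were written, without newlines.
    """
    if input_format not in ALLOWED_FORMATS:
        raise ValueError(f"unknown INPUTFORMAT: {input_format}")

    lines = [
        f"DETAIL {detail}",
        f"INPUTFORMAT {input_format}",
        f"ERRORS {errors}",
    ]
    # One OUTPUT line per requested field, e.g. "record" or "logp".
    lines.extend(f"OUTPUT {field}" for field in outputs)
    # GO must come after all other options.
    lines.append("GO")

    Path(path).write_text("\n".join(lines) + "\n")
    return lines
```

For example, write_control_file("demo_control.dat", "oelibsmiles", ["record", "name", "formula", "weight", "logp"]) would write the keywords shown in the example above, followed by GO.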

Table 3: Control File KEYWORDS

Keyword       Option                Description

DETAIL        moleculelevel         print the logP only for the whole molecule
              atomlevel             print hydropathy information for atom components of the molecule

OUTPUT        record or norecord    print the record number in the outfile (or not)
              name or noname        print the molecule name in the outfile (or not)
              formula or noformula  print the molecular formula in the outfile (or not)
              weight or noweight    print the molecular weight in the outfile (or not)
              logp or nologp        print the value of the logP in the outfile (or not)

INPUTFORMAT   daylightsmiles        use (licensed) Daylight toolkit to interpret molecule structure
              oelibsmiles           use OElib SMILES routine to interpret molecule structure
              oelibsdf              use OElib SDF (MDL) routine to interpret molecule structure
              oelibmol2             use OElib MOL2 (Sybyl) routine to interpret molecule structure

                                    information in each record is delimited by a space
                                    information in each record is delimited by a comma

WARNINGS                            write non-serious warning messages to stderr
                                    do not write non-serious warning messages to stderr

ERRORS                              exit on serious error
              skip                  skip current calculation, move to next molecule on serious error
                                    continue current calculation, even with compromised input data

INDEX         on                    write index (or record) number for each processed molecule to stderr
                                    do not write index number for each processed molecule to stderr

GO                                  end of options input, begin calculations

* - default settings for program parameters.


The program expects an input molecular structure file, which can be in one of four formats. The formats are described in more detail in Input File Formats.

    Daylight SMILES format read by OElib code from OpenEye (this does not require any additional code or license).

    Daylight SMILES format read by Daylight SMILES Toolkit (this requires a run-time "smiles" license from Daylight).

    MDL SDFile format read by OElib code from OpenEye (this does not require any additional code or license).

    Tripos Sybyl/MOL2 format read by OElib code from OpenEye (this does not require any additional code or license).

In a typical application the user would include in the molecular structure file all the molecules that are part of an investigation. Thus, the input molecular structure file can contain one or many structures. Other molecules may, of course, be added later or processed separately. It is critical that the keyword INPUTFORMAT match the actual format of the molecular structure file given as an argument. That is, if INPUTFORMAT is set to "oelibsdf", then the file must be in SDF format, no matter what it is named.
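Because the program relies on INPUTFORMAT rather than on the file name, a quick consistency check before a long run can catch simple mix-ups. The sketch below is only a convenience under stated assumptions: the extension conventions (.smi, .sdf, .mol2) are taken from the demo file names in this manual, the helper names are hypothetical, and the keyword is assumed to be written in upper case as in the example control file. It does not inspect the file contents.

```python
from pathlib import Path

# Assumption: extension conventions follow the demo files in this manual.
# The program itself relies on INPUTFORMAT, not on the file name.
FORMATS_BY_EXTENSION = {
    ".smi": {"oelibsmiles", "daylightsmiles"},
    ".sdf": {"oelibsdf"},
    ".mol2": {"oelibmol2"},
}


def read_input_format(control_text):
    """Return the INPUTFORMAT value found in control-file text, or None."""
    for line in control_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "INPUTFORMAT":
            return parts[1]
    return None


def extension_agrees(control_text, structure_file):
    """True when the file extension is one conventionally used for the declared format."""
    declared = read_input_format(control_text)
    suffix = Path(structure_file).suffix.lower()
    expected = FORMATS_BY_EXTENSION.get(suffix, set())
    return declared in expected
```

A False result only means the name and the declared format look inconsistent; a correctly named file can still contain data in the wrong format.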


The structure of the output file depends on the keywords used in the control file. For example, the keyword DETAIL MOLECULELEVEL provides the most concise logP output, while DETAIL ATOMLEVEL provides extensive details on hydropathy components in both atoms and fragments of the molecule. Be careful when using the ATOMLEVEL keyword with large databases, as this can produce a very large output file.

When a large database is used and INDEX is set to "on", it may be desirable to save the indexing information to a file so that you can determine which molecules in the database have problems that need correcting. This information is printed to stderr and can therefore be collected in a separate file using the following command:

UNIX/LINUX: $HINT_RUN/hintlogp control.dat database1.smi database1.s >& hint.log

Windows 2000/XP: HINTLOGP control.dat database1.smi database1.s 2> hint.log

Note that LICENSE errors (which are the most common) are also printed to stderr, so whenever your output file is missing information, you should check the terminal or the stderr log file.
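The same stderr capture can be done from a script instead of the shell. This is a minimal sketch using Python's standard subprocess module; the function name run_hintlogp and the default executable name are assumptions, and the meaning of the program's exit status is not documented here, so the return code is passed back without interpretation.

```python
import subprocess


def run_hintlogp(control, structures, output, log_path, executable="hintlogp"):
    """Run the program and redirect everything it writes to stderr into log_path.

    Hypothetical wrapper: 'executable' is assumed to be on the PATH
    (or given as a full path, e.g. under $HINT_RUN).
    """
    # Argument order follows the documented syntax:
    # hintlogp <control file> <molecular structure file> <output file>
    command = [executable, control, structures, output]
    with open(log_path, "w") as log:
        result = subprocess.run(command, stderr=log)
    return result.returncode
```

After a run, the log file holds the index lines, warnings, and any license or other error messages, which can then be searched for problem records.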

Typical HintLogP Session

The following steps are generally followed in using HintLogP:

Demo HintLogP Sessions

Using the demo files described in "Getting Started With HintLogP", we can test the output and function of HintLogP.

First, copy the files in the hintlogp3.06_/demo directory into a working directory on your computer. (Note: for the Windows version, the file names for the test will need to be different since Windows does not distinguish between demo1.out and demo1.OUT.)


The file demo1.smi contains a single molecule, benzene. To run (the $HINT_RUN prefix is not necessary if this directory is in your $PATH):

$HINT_RUN/hintlogp demo_control.dat demo1.smi demo1.out

Compare demo1.out (new) and demo1.OUT (archival) for differences.


The file demo2.smi contains 100 molecules of varying complexity.  To run:

$HINT_RUN/hintlogp demo_control.dat demo2.smi demo2.out

Compare demo2.out with demo2.OUT.


Edit demo_control.dat and change the INPUTFORMAT keyword from oelibsmiles to oelibsdf.

The file demo3.sdf contains 12 simple molecules.  To run:

$HINT_RUN/hintlogp demo_control.dat demo3.sdf demo3.out

Compare demo3.out with demo3.OUT.


The file demo4.sdf contains 50 molecules of moderate complexity.  To run:

$HINT_RUN/hintlogp demo_control.dat demo4.sdf demo4.out

Compare demo4.out with demo4.OUT.


Edit demo_control.dat and change the INPUTFORMAT keyword to oelibmol2.

The file demo5.mol2 contains 10 fairly simple molecules.  To run:

$HINT_RUN/hintlogp demo_control.dat demo5.mol2 demo5.out

Compare demo5.out with demo5.OUT.
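The comparison step used in each demo above can be sketched with Python's standard difflib module. The function name compare_outputs is hypothetical; the sketch does a plain line-by-line text comparison and makes no assumptions about the layout of the output records.

```python
import difflib
from pathlib import Path


def compare_outputs(new_path, archival_path):
    """Return unified-diff lines between an archival output file and a new one.

    An empty list means the two files have identical lines.
    """
    old_lines = Path(archival_path).read_text().splitlines()
    new_lines = Path(new_path).read_text().splitlines()
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=str(archival_path),
            tofile=str(new_path),
            lineterm="",
        )
    )
```

On a file system that does not distinguish upper and lower case in file names, the new and archival files need different names (or directories) before they can be compared this way.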