This section outlines the data bases and how they are used to calculate hydrophobic atom constants. The "small molecule" data bases are used for a structure/connection-based calculation of LogP and are especially appropriate for molecules that are neither proteins nor nucleic acids. The "protein" data base is used for a dictionary-based calculation of LogP and is appropriate only for macromolecules constructed of well-established and previously registered substructures or monomers, such as amino acid residues.

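| Constant type | Group | Value |
|---|---|---|
| Atomic constant | C(=)(-)(-) | 0.155 |
| Atomic constant | O(=) | -1.915 |
| Atomic constant | OH(-) | -1.640 |
| Fragment constant | CO(-)(-) | -1.900 |
| Fragment constant | COOH(-) | -1.110 |

| Constant type | Group | Value |
|---|---|---|
| Atomic constant | C(=)(-)(-) | 0.015 |
| Atomic constant | O(=) | -1.915 |
| Atomic constant | OH(-) | -1.640 |

| Constant type | Group | Value |
|---|---|---|
| Atomic constant | C(=)(-)(-) | 2.455 |
| Atomic constant | O(=) | -1.915 |
| Atomic constant | OH(-) | -1.640 |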

| Factor | Value | Applied if atom is |
|---|---|---|
| **HYD_CJG** | 0.045 | ConJuGated |
| **HYD_RFC** | 0.100 | a Ring Fusion Carbon |
| **HYD_RFH** | 0.315 | a Ring Fusion Heteroatom |
| **HYD_CHN** | -0.120 | in a CHaiN of n>2 |
| **HYD_RNG** | -0.090 | in a RiNG |
| **HYD_BCH** | -0.130 | at an aliphatic BranCH |
| **HYD_BGH** | -0.220 | at a polar BrancH |
| **HYD_VIC** | 0.140 | attached to VICinal halogens |
| **HYD_ENH** | -0.080 | in a branched long chain |
| **HYD_CDN1** | -0.330 | next to a ChargeD Nitrogen |
| **HYD_CDN2** | -0.140 | 2 atoms away from ChargeD N |
| **HYD_CDN3** | -0.070 | 3 atoms away from ChargeD N |
| **HYD_CDN4** | -0.035 | 4 atoms away from ChargeD N |

| Type | | | | | | |
|---|---|---|---|---|---|---|
| HYD_PPP (normal) | -0.380 | -0.320 | -0.260 | -0.100 | 0.000 | 0.000 |
| HYD_PPO (hydroxyl) | -0.580 | -0.420 | -0.260 | -0.100 | 0.000 | 0.000 |
| HYD_PPN (charged N) | -0.580 | -0.420 | -0.270 | -0.240 | -0.220 | -0.200 |
| HYD_PPR (aliph. ring) | -0.440 | -0.320 | -0.200 | 0.000 | 0.000 | 0.000 |
| HYD_PPA (arom. ring) | -0.240 | -0.160 | -0.080 | 0.000 | 0.000 | 0.000 |

The atomic hydrophobic parameters in "*ff*_aa_hydro_protein.bin" were
previously calculated by small molecule-type partition calculations
for the acetyl amide analogs (AAA) for each amino acid residue (or
modified AAA for C-terminal or N-terminal residues), common protein
cofactors such as heme, and appropriately capped nucleic acid bases.
Use of the AAA simulates the effects of the proximate polar groups of
the adjacent backbone amide linkages on the hydrophobicity of the
residue. The advantages of the dictionary are consistency in atom
partitioning for macromolecules, the ability to quickly explore the
effects of changing solvent conditions, and no reliance on potential
type assignment for proteins. The dictionary data base relies on atom
names only (host modeling system convention) to make hydrophobic parameter assignments.

$$B = \sum_{i} \sum_{j} b_{ij},$$

$$b_{ij} = S_{i} \, a_{i} \, S_{j} \, a_{j} \, R_{ij} \, T_{ij},$$

where b_{ij} is a MicroInteraction constant representing the
attraction/interaction between atoms i and j, S_{i} is the solvent
accessible surface area for i, a_{i} is the hydrophobic atom constant for
i, and R_{ij} is the functional distance behavior for the interaction of
i and j. For InterMolecular calculations i and j are atoms on the two
molecules; for IntraMolecular calculations i and j are two distinct
indices on the same molecule where i is not equal to, or covalently
bonded to, or involved in a 1-3 interaction with j.
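The double sum above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hint_score`, the tuple layout `(a, S, position)`, and the numeric values are hypothetical, and both the distance function R and the sign function T are passed in by the caller rather than taken from HINT itself. The example uses a plain exponential for R and a constant +1 for T purely as placeholders.

```python
import math

def hint_score(atoms_1, atoms_2, distance_fn, sign_fn):
    """Sum b_ij = S_i a_i S_j a_j R_ij T_ij over all intermolecular pairs.

    Each atom is a tuple (a, S, (x, y, z)): hydrophobic atom constant,
    solvent accessible surface area, and Cartesian position.
    """
    total = 0.0
    for atom_i in atoms_1:
        a_i, s_i, pos_i = atom_i
        for atom_j in atoms_2:
            a_j, s_j, pos_j = atom_j
            r = math.dist(pos_i, pos_j)  # interatomic distance
            total += s_i * a_i * s_j * a_j * distance_fn(r) * sign_fn(atom_i, atom_j)
    return total

# Illustrative values only (not taken from any HINT data base).
ligand = [(0.5, 10.0, (0.0, 0.0, 0.0))]
site = [(0.2, 5.0, (1.0, 0.0, 0.0))]
score = hint_score(ligand, site,
                   distance_fn=lambda r: math.exp(-r),  # assumed form of R_ij
                   sign_fn=lambda i, j: 1.0)            # placeholder for T_ij
```

For an IntraMolecular calculation the same loop would run over pairs within one molecule, skipping pairs that are identical, covalently bonded, or in a 1-3 relationship.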

T_{ij} is a discriminant function designed to keep the signs of interactions
consistent with the HINT convention that favorable interactions are positive and unfavorable
interactions are negative. (Much of this is related to the issue of Polar and
Hydrophobic Protons discussed above.) The other point is that
while LogP (and a_{i}) carries magnitude information for polar atoms, it carries no
"sign" information; that is, the sign and effect of a charge on a polar (atom) species must
be added in by HINT.

The following is an interaction matrix which details some of the conventions used by HINT to
calculate T_{ij}.

| Atom Type | H (apolar) | H (polar) | C (apolar) | Polar (N,O,etc.) |
|---|---|---|---|---|
| H (apolar) | +1^{1} | -1^{2} | +1^{1} | -1^{2} |
| H (polar) | -1^{2} | -1^{3} | -1^{2} | +1^{4} |
| C (apolar) | +1^{1} | -1^{2} | +1^{1} | -1^{2} |
| Polar (N,O,etc.) | -1^{2} | +1^{4} | -1^{2} | -1^{5} |

Notes:

*1*: hydrophobic-hydrophobic

*2*: hydrophobic-polar

*3*: acid-acid (two polar hydrogens)

*4*: acid-base or hydrogen bond

*5*: may depend on charge, but probably base-base and unfavorable (T_{ij} = -1)
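The matrix above maps naturally onto a small lookup table. The following is a minimal sketch, assuming four hypothetical class labels (`H_apolar`, `H_polar`, `C_apolar`, `polar`) chosen here for illustration; it encodes only the values shown in the matrix and treats the polar-polar case as -1, per note 5.

```python
# Atom classes in the same order as the rows and columns of the matrix.
CLASSES = ["H_apolar", "H_polar", "C_apolar", "polar"]

# T_ij values copied row by row from the interaction matrix.
T_VALUES = [
    [+1, -1, +1, -1],  # H (apolar)
    [-1, -1, -1, +1],  # H (polar)
    [+1, -1, +1, -1],  # C (apolar)
    [-1, +1, -1, -1],  # Polar (N, O, etc.)
]

def t_ij(class_i, class_j):
    """Return the sign factor T_ij for two atom classes."""
    return T_VALUES[CLASSES.index(class_i)][CLASSES.index(class_j)]
```

For example, `t_ij("H_polar", "polar")` returns +1 (acid-base or hydrogen bond), while `t_ij("C_apolar", "polar")` returns -1 (hydrophobic-polar).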

However, neither exponential nor power functions exact a penalty
for too-close atom-atom interactions. This is the province of the
Lennard-Jones (van der Waals) potential, which
has no electrostatic (and, by inference, hydropathic) contribution.
E_{ij} values for the adaptation of this function used in HINT are
from the literature (Levitt, M. *Journal of Molecular Biology*
**1983**, *168*, 595-620; Levitt, M.; Perutz, M.F.
*Journal of Molecular Biology* **1988**, *201*,
751-754). The table below sets out the e parameters for atoms of
interest where e_{i} * e_{j} = E_{ij}. HINT distance functions including
both exponential hydropathic and Lennard-Jones steric contributions
appear to give the best results for Interaction calculations.
It must be emphasized that HINT is an empirical, phenomenological
model that relies on intuitive principle rather than a rigorous
theoretical treatment to produce understanding of molecular
interactions. This approach suggests the equations, distance
functions, and parameterization used in the HINT model.

| Atom | e_{i} (kcal/mol)^{0.5} |
|---|---|
| H | 0.1949 |
| C, S, most others | 0.2717 |
| C(sp^{2}) | 0.1940 |
| N | 0.6428 |
| O | 0.4299 |
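As a small worked example of the relation e_{i} * e_{j} = E_{ij}, the sketch below looks up the tabulated e values and multiplies them. The dictionary keys and function names are hypothetical labels chosen for illustration; the fallback value corresponds to the "C, S, most others" row. The Lennard-Jones functional form itself is not reproduced here because the text does not give it.

```python
# e_i values in (kcal/mol)^0.5, copied from the table above.
E_PARAM = {
    "H": 0.1949,
    "C_sp2": 0.1940,
    "N": 0.6428,
    "O": 0.4299,
}
DEFAULT_E = 0.2717  # "C, S, most others" row

def e_param(atom):
    """Return e_i for an atom label, falling back to the default row."""
    return E_PARAM.get(atom, DEFAULT_E)

def pair_energy_param(atom_i, atom_j):
    """Return E_ij = e_i * e_j in kcal/mol."""
    return e_param(atom_i) * e_param(atom_j)
```

With these values, an N-O pair gives E_{ij} = 0.6428 * 0.4299, or roughly 0.276 kcal/mol.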

b_{ij} = b_{ij} * electron_count * exp ( -ang * focus ),

that gives the largest score for antiparallel (i.e., ang = 0 degrees) vectors. Electron_count is simply the number of electrons in the orbital(s) associated with the direction vector. This is nominally 2.0 electrons for lone pairs and 0.5 electrons for pi orbitals. Electrons present but not assigned to specific vectors are assumed to be spherically distributed at each atom. The vector "focus" parameter simply forces a smaller and tighter cone of interactions to be scored favorably (see Figure 4). Note that the trivial case of vector focus zero is identical to the spherical model for all atoms.
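The directional scaling can be written as a one-line function. This is a minimal sketch of the expression above, not HINT's own code: the function name is hypothetical, and `ang` is assumed to be a non-negative deviation from the ideal antiparallel alignment, in whatever angular unit `focus` is calibrated for.

```python
import math

def directional_score(b_ij, electron_count, ang, focus):
    """Scale a MicroInteraction constant by vector alignment.

    ang = 0 corresponds to perfectly antiparallel vectors and gives the
    largest magnitude; focus = 0 removes all angular dependence.
    """
    return b_ij * electron_count * math.exp(-ang * focus)

# A lone pair (2.0 electrons) versus a pi orbital (0.5 electrons),
# both perfectly aligned:
lone_pair = directional_score(1.0, 2.0, ang=0.0, focus=1.0)
pi_orbital = directional_score(1.0, 0.5, ang=0.0, focus=1.0)
```

Setting `focus` to zero makes the exponential equal to 1 for every angle, which is the spherical limit mentioned above.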

| Case | Geometry | Known attach. | Required vectors | Vector weighting | Example atom |
|---|---|---|---|---|---|
| 413 | tetrahedral | 1 | 3 lone pairs | 2.0 | chloride |
| 422 | tetrahedral | 2 | 2 lone pairs | 2.0 | ether O |
| 431 | tetrahedral | 3 | 1 lone pair | 2.0 | sp^{3} N |
| 514 | trigonal bipy. | 1 | 2 lone pairs; 2 pi orbitals | 2.0; 0.5 | carbonyl O |
| 523 | trigonal bipy. | 2 | 1 lone pair; 2 pi orbitals | 2.0; 0.5 | sp^{2} N |
| 532 | trigonal bipy. | 3 | 2 pi orbitals | 0.5 | sp^{2} C |
| 615 | octahedral | 1 | 1 lone pair; 4 pi orbitals | 2.0; 0.5 | sp N |
| 624 | octahedral | 2 | 4 pi orbitals | 0.5 | sp C |

$$A_{t} = \sum_{i} a_{i} S_{i} R_{it},$$

where R_{it} is a function of the distance between each atom in
the system (i), and the grid point (t).
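A grid map of this kind could be computed as sketched below. The sketch assumes that A_{t} accumulates the contribution a_{i} S_{i} R_{it} from every atom i, and it uses a plain exponential as a stand-in for the distance function; the function name, the tuple layout, and the numeric values are hypothetical.

```python
import math

def hydropathic_map(atoms, grid_points, distance_fn=lambda r: math.exp(-r)):
    """Return A_t for each grid point t.

    Each atom is a tuple (a, S, (x, y, z)). distance_fn stands in for
    R_it; the exponential default is an assumption of this sketch.
    """
    values = []
    for t in grid_points:
        a_t = sum(a * s * distance_fn(math.dist(pos, t)) for a, s, pos in atoms)
        values.append(a_t)
    return values

# Illustrative values only: one atom, two grid points.
atoms = [(0.5, 10.0, (0.0, 0.0, 0.0))]
grid = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
field = hydropathic_map(atoms, grid)
```

The grid point that coincides with the atom receives the full value a_{i} S_{i}, and the value decays with distance from there.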

Interaction HintMaps can be envisioned as a calculation where
the test (grid) points are acting as observers to the interactions
at their locations. The test points measure the effects from
atoms i and j and then reconcile the two effects into a
localized MicroInteraction constant $\beta_{it} \beta_{jt}$,
which can be summed for all atom-atom pairs interacting at the grid point,

$$C_{t} = \sum_{i} \sum_{j} \beta_{it} \beta_{jt},$$

where $\beta_{it}$ and $\beta_{jt}$ are atom-test
point interactions for atoms i and j, respectively, and C_{t} is the interaction grid point value.

In an attempt to increase the utility of HINT we have added a modest function to optimize the HINT score between a small ligand (like water) and the surrounding molecular structure. The algorithm is described, briefly, in this section. First, the site is created from partitioned molecules or fragments within the cutoff range from the ligand center. Next, the ligand is systematically moved and rotated within a sphere of radius equal to the translation limit. Each new orientation is "scored" intermolecularly between the ligand and site. The sphere is reduced in size each iteration in response to the highest scoring orientation/position. Convergence is reached when the size of the sphere is smaller than the convergence limit.

For each iteration, the ligand is moved to the center of the sphere and to a number of
points on it, and is then rotated through a full set of orientations at each of those
positions. The actual number of orientations and positions is adjusted throughout the
process to balance speed and accuracy. A parameter called **level** sets the densities of
translational positions and rotational orientations.

| level | angle | number of points | number of orientations |
|---|---|---|---|
| 1 | π/2 | 5 | 64 |
| 2 | π/4 | 27 | 512 |
| 3 | π/8 | 115 | 4096 |
| 4 | π/16 | 483 | 32768 |
| 5 | π/32 | 1987 | 262144 |
| 6 | π/64 | 8067 | 2097152 |
| 7 | π/128 | 32515 | 16777216 |
| 8 | π/256 | 130563 | 134217728 |
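The angle and orientation columns follow a simple pattern that can be checked numerically. The sketch below assumes the angle column is π/2^level (in radians) and that the orientation count comes from stepping through a full turn at that increment about each of three rotation axes. That reading reproduces every tabulated orientation count, but it is an interpretation of the table rather than a statement of how HINT enumerates orientations. The function names are hypothetical, and the points column is not modeled.

```python
import math

def rotation_step(level):
    """Angular increment in radians, assumed to be pi / 2**level."""
    return math.pi / 2 ** level

def orientation_count(level):
    """Number of orientations: (steps per full turn) cubed."""
    steps_per_turn = 2 ** (level + 1)  # equals 2*pi / rotation_step(level)
    return steps_per_turn ** 3

counts = [orientation_count(level) for level in range(1, 9)]
```

Each increase in level halves the angular step and multiplies the number of orientations by eight, which is why the higher levels are so much more expensive.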

A single variable, termed **ispeed**, controls a variety of functions that impact speed and
accuracy as described in the table below.

| ispeed | shrink ratio | drop frequency | cast level | look ahead | level limit |
|---|---|---|---|---|---|
| 1 | 0.90 | 1 | 3 | 3 | 7 |
| 2 | 0.85 | 2 | 2 | 2 | 6 |
| 3 | 0.80 | 3 | 2 | 2 | 6 |
| 4 | 0.75 | 4 | 1 | 1 | 6 |
| 5 | 0.70 | 5 | 1 | 1 | 5 |
| 6 | 0.65 | 6 | 1 | 1 | 5 |
| 7 | 0.60 | 7 | 0 | 0 | 4 |
| 8 | 0.55 | 8 | 0 | 0 | 4 |
| 9 | 0.50 | 9 | 0 | 0 | 4 |
| 10 | 0.45 | 10 | 0 | 0 | 4 |

**ispeed** can be selected by the user in the range of 1 to 6.
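The shrinking-sphere search can be illustrated with a deliberately simplified, translation-only sketch. Several things here are assumptions made for illustration and are not HINT's implementation: the shrink ratio is read as the factor by which the sphere radius is multiplied each iteration, the sphere is sampled at only six axis-aligned points plus its center, rotations and the level, drop frequency, cast level and look ahead controls are omitted, and the scoring function is a toy supplied by the caller. All names are hypothetical.

```python
import math

# Shrink ratios for the user-selectable ispeed values (from the table).
SHRINK_RATIO = {1: 0.90, 2: 0.85, 3: 0.80, 4: 0.75, 5: 0.70, 6: 0.65}

def sphere_points(center, radius):
    """Return the center plus six axis-aligned points on the sphere."""
    cx, cy, cz = center
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    return [center] + [(cx + dx * radius, cy + dy * radius, cz + dz * radius)
                       for dx, dy, dz in offsets]

def optimize_position(score, start, translation_limit, convergence_limit, ispeed=1):
    """Move to the best-scoring sample, shrink the sphere, and repeat."""
    if convergence_limit <= 0:
        raise ValueError("convergence_limit must be positive")
    ratio = SHRINK_RATIO[ispeed]
    center, radius = start, translation_limit
    while radius >= convergence_limit:
        # Keep the highest-scoring position among the sampled points.
        center = max(sphere_points(center, radius), key=score)
        radius *= ratio  # reduce the sphere for the next iteration
    return center

# Toy score: highest at a hypothetical optimum position.
target = (0.3, -0.2, 0.1)
best = optimize_position(lambda p: -math.dist(p, target),
                         start=(0.0, 0.0, 0.0),
                         translation_limit=1.0,
                         convergence_limit=1e-3)
```

Because the current center is always among the candidates, the score never decreases from one iteration to the next. A smaller shrink ratio finishes in fewer iterations but gives the search less opportunity to recover from a poor step, which is the speed versus accuracy trade-off that ispeed controls.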

The QSAR is calculated as a least squares fit between the biological activity of each molecule in the set and a matching function that relates that molecule's individual map to the LockSmith map. The LockSmith map itself graphically displays the 3D hydropathic structure of a target molecule for design.

The LockSmith method requires a carefully laid out and chemically meaningful superimposition of the molecules in the input (learning) set. This implies a predetermined pharmacophore model and that all molecules that are included in the set are presumed to have a similar action and binding mode at a common receptor.