LESSON 4: Using Molconn-Z with SYBYL 3D QSAR

    This lesson demonstrates the use of Molconn-Z with the SYBYL QSAR module. Before starting, make sure that no molecules or backgrounds from previous lessons are active in SYBYL. If you are entering the Molconn-Z Tutorial at this point, follow the instructions in Step 1 of Lesson 1.

    1. Open a SYBYL Molecular Spreadsheet and Database
    2. From the File pulldown on the menubar select Molecular Spreadsheet and New.... The rows will represent Molecules. In the DATABASE_FILE dialog box, enter $TA_DEMO in the "Database containing molecules" text field and press Search Directory. This will allow us to select one of the already prepared molecular databases in the SYBYL Demo directory. Choose jacs.mdb for this lesson to retrieve the standard steroid CoMFA data set. After the spreadsheet is initialized and appears, we will import the biological data for column 1. From the File menu on the Molecular Spreadsheet choose the Import... item. On the resulting Import dialog choose Format: Tripos and enter $TA_DEMO/cbg.tripos in the File: text area. Press Import to load the first column of the MSS with the corticosteroid binding globulin data.

    3. Fill the second Spreadsheet column with the EState field
    4. From the spreadsheet menubar select the AutoFill button (and choose a new Column). Select MCONNCOMFA as the New column type. Note: if this column type is not listed, cancel the New column type Option dialog and type "mss!reset_eslc" at the Sybyl> prompt before returning to the AutoFill button on the MSS. The Add Column (EStateCoMFA) dialog box, which should now appear, contains options to tailor the Molconn-Z EState field that will be entered into the QSAR table. For this first run, we will choose mostly the default settings (Information = EState, Smoothing = None, Distance Function = 1/r**n, n = 3, Inside Mol Cut Off = on, High Limit = 0, Low Limit = 0, Van der Waals Limit = 1.0). Set the Region to Calculate Automatically..., which opens the Calculate CoMFA Region Automatically dialog box; there, all Spacings should be 2 Angstroms and all Margins should be 4 Angstroms. Use jacs.rgn as the CoMFA Region File name. Press OK to calculate the region, then press OK in the Add Column (EStateCoMFA) dialog box and accept mconn2 as the Column name. This AutoFill operation will take about 2-5 minutes. As the calculation proceeds through each row of the table, SYBYL reports the statement "Field files should be checked for molecule X" in the text window. This is normal behavior for external fields imported into SYBYL. When the calculation is complete, column 2 will be filled with zeros. That is also normal SYBYL behavior: what appears in the column on the spreadsheet is the field value for one corner of the grid box, which is almost certainly zero.
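    The region and distance-function settings above can be pictured with a small sketch. This is not Molconn-Z's or SYBYL's actual code: the function names make_region and field_value are hypothetical, the per-atom values stand in for EState indices, and the handling of the inside-molecule cutoff (setting the field to zero within a fixed distance of any atom) is only an assumed reading of the Inside Mol Cut Off and Van der Waals Limit options.

```python
import math
from itertools import product


def make_region(coords, spacing=2.0, margin=4.0):
    """Grid points of a box enclosing all atoms plus `margin` on every side.

    Hypothetical helper: mirrors the Spacing = 2 and Margin = 4 Angstrom
    settings, not the actual region algorithm used by SYBYL.
    """
    axes = []
    for k in range(3):
        lo = min(c[k] for c in coords) - margin
        hi = max(c[k] for c in coords) + margin
        n_steps = math.ceil((hi - lo) / spacing)
        axes.append([lo + i * spacing for i in range(n_steps + 1)])
    return list(product(*axes))


def field_value(point, coords, atom_values, n=3, min_dist=1.0):
    """Sum of atom_value / r**n over all atoms at one grid point.

    Assumption: a point closer than `min_dist` to any atom counts as
    "inside the molecule" and is assigned 0.0.
    """
    total = 0.0
    for xyz, value in zip(coords, atom_values):
        r = math.dist(point, xyz)
        if r < min_dist:
            return 0.0
        total += value / r ** n
    return total


# Two made-up atoms with made-up per-atom values (illustration only).
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
values = [1.0, 2.0]
grid = make_region(atoms)
field = [field_value(p, atoms, values) for p in grid]
```

    Each molecule in the table would get one such list of field values, all computed on the same grid, which is why the region file is shared between columns.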

    5. Run a PLS analysis on CBG as a function of the EState field
    6. Choose the columns for the PLS study: use Select Cols, enter 1, 2 in the Expression text field, and press Add and Done. From the QSAR pulldown, select Partial Least Squares... to call the Partial Least Squares Analysis dialog box. The Dependent Column is 1. Select Leave-1-Out Validation, 5 Components, CoMFA Std Scaling, and Use SAMPLS on. This run will take a few seconds; when it finishes, edit or list the report. If you run it interactively, be sure to choose Yes for "Keep this analysis?" In this run the optimum number of components is 3 and the cross-validated r2 is 0.713.
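    For readers unfamiliar with the statistic, the leave-one-out cross-validated r2 reported above is conventionally computed as 1 - PRESS/SS, where PRESS is the sum of squared prediction errors for each left-out compound and SS is the sum of squared deviations of the observed values from their mean. The sketch below illustrates only that bookkeeping. It swaps in a one-descriptor least-squares line for the PLS/SAMPLS model that SYBYL fits, and the names fit_line and loo_q2 are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept).

    Stand-in model only: SYBYL fits a PLS model, not a single-descriptor line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated r2, computed as 1 - PRESS / SS."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss


# Made-up descriptor and activity values (illustration only).
q2 = loo_q2([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

    Because every prediction is made for a compound the model has not seen, this value is normally lower than the conventional (fitted) r2 and can even be negative for a poor model.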

    7. Improve the model by adding the HEState field
    8. Some CoMFA models can be improved by adding the HEState field. Unfortunately, a long-standing bug in SYBYL requires you to save and close the MSS and exit SYBYL before adding a second (externally generated) CoMFA column to the table. After doing that, restart SYBYL and reopen the MSS with File, Molecular Spreadsheet, Open.... Add a new MCONNCOMFA column with AutoFill as before, but select the HEState (Protons) field type. Use the same Distance Function and other parameters as before, except that the CoMFA Region File is now Use Pre-existing: jacs.rgn. Select the same parameters as before for the PLS run. This 2-field PLS run should yield a cross-validated r2 of 0.794 with 3 components. We have seen (with this data set) that with a smaller grid spacing, or a different grid orientation/starting point (e.g., using a grid margin of 5, thus forcing a different grid), the HEState field can add information to CoMFA models.

    9. Using the EState fields with the standard CoMFA fields
    10. The EState field can be used in combination with the SYBYL steric and/or electrostatic fields for 2- or 3-field CoMFA studies. Note that the region must be the same for every field! This means that you can't let both SYBYL and Molconn-Z create the region automatically. Instead, create the region once and use the "pre-defined region (file)" option for columns created later.

      There is a graphical command in the Molconn-Z software to aid in graphing multifield CoMFA results. From the eslc pulldown on the main SYBYL menubar select the MolconnZ, MolconnZ QSAR, Graph MolconnZ QSAR... command. This brings up the Retrieve MolconnZ QSAR dialog box that guides you through retrieving and graphing the CoMFA field contours. Choose which field types you wish to graph and their Columns. Important: This dialog does not work when there is only one CoMFA type column in the analysis!