Amber: LEAP to prepare force field and initial coordinate input files
I. Concepts
In this note, we discuss how to use LEAP to generate force field (prmtop) and initial coordinate (inpcrd) input files for Amber MD program sander. We will have to convert files to Amber-suited PDB/mol2 and frcmod files. (Note: Leap commands are not case sensitive, like loadamberparams and loadAmberParams are all OK; but object variables are case sensitive, like WAT, wat, Wat are not the same thing.)
TABLE 1. The information contained in different files.
Information | PDB | mol2 | lib/prep | frcmod/dat | prmtop | inpcrd |
---|---|---|---|---|---|---|
Residue/Unit Name | Y | Y | Y | Y | ||
Atom Name | Y | Y | Y | Y | ||
Atom Type | Y | Y | Y | Y | ||
Charges | Y | Y | Y | |||
Connectivity | D | Y | Y | Y | ||
Coordinates | Y | Y | D | Y | ||
Atom Mass | Y | Y | ||||
Bonded param | Y | Y | ||||
Non-bonded param | Y | Y |
Here, Y means yes, and D means will be discarded when converted to force field files.
The general idea of converting files are outlines below.
Standard residue (PDB + lib -> prmtop + inpcrd)
For standard residue in protein and nucleic acids PDB files, LEAP has lib standard force field files, so leap will merge information in PDB and lib by matching residue name and atom name. Also, leap will merge information in lib and frcmod/dat by matching atom type. Then leap generates topology (prmtop) and coordinate (inpcrd) files.
Non-standard ligands or molecules (PDB/mol2 + frcmod -> prmtop + inpcrd)
For non-standard ligands or molecules, we need to create force field library files by ourselves.
First, we use antechamber to convert PDB into mol2/prep files (note: connectivity in PDB is discarded. Coordinates in prep will be discarded).
Second, we use parmchk to construct new additional frcmod file for the non-standard molecules with the input of mol2/prep.
Third, leap will merge information in PDB and mol2/prep by matching residue name and atom name. Also, leap will merge information in mol2/prep, standard frcmod/dat, and the new additional frcmod by matching atom type. Finally, leap will generate prmtop and inpcrd files.
II. General procedure
(1) Single molecular structure
Draw molecule in ChemDraw or iQmol and save as .mol2 or .pdb format (Spartan can open .mol created by ChemDraw and save as .mol2 or .pdb files)
(2) Charge methods
charge method | abbre. | index | charge method | abbre. | index |
---|---|---|---|---|---|
RESP | resp | 1 | AM1-BCC | bcc | 2 |
CM1 | cm1 | 3 | CM2 | cm2 | 4 |
ESP (Kollman) | esp | 5 | Mulliken | mul | 6 |
Gasteiger | gas | 7 | Read in charge | rc | 8 |
Write out charge | wc | 9 | Delete Charge | dc | 10 |
Option 1: Gaussian charge method like RESP
$ antechamber -i xx.ini.mol2 -fi mol2 -o xx.mol2 -fo mol2 -rn xx (refresh mol2 format for antechamber)
$ antechamber -i xx.mol2 -fi mol2 -o xx.gcrt -fo gcrt (generate g09 input file)
$ g09 < xx.gcrt > xx.gout (Gaussian calculation of charges)
$ antechamber -i xx.gout -fi gout -o xx.mol2 -fo mol2 -c resp -rn xx (assign charge resp)
(note that .mol2 file can be changed manually to match literature model values)
Option 2: semi-empirical method like bcc (AM1-BCC method)
$ antechamber -i xx.ini.mol2 -fi mol2 -o xx.mol2 -fo mol2 -c bcc -rn xx
(3) Use parmchk to convert mol2 (only) into frcmod
$ parmchk -i xx.mol2 -f mol2 -o xx.frcmod -a Y
$ vmd xx.mol2 (view graphics of molecule)
(4) Use LEAP to prepare inpcrd and prmtop files
A. Load molecules: xx.frcmod (the force field file) and xx.mol2 (the molecular geometry file)
copy both files to your leap folder, then $ xleap
> source leaprc.gaff (the general Amber force field, no topology)
other force fields options:
source leaprc.ff02pol
/export/apps/Amber/12/dat/leap/parm/frcmod.ff02pol.r1 (polarizable force field) ff03.r1
source leaprc.ff03.r1
/export/apps/Amber/12/dat/leap/parm/frcmod.ff03 (the non-polarizable force field)
Then create molecule called xx, yy,… and load force field parameters for xx,yy,…
> xx = loadmol2 xx.mol2
> loadamberparams xx.frcmod
> yy = loadmol2 yy.mol2
> loadamberparams yy.frcmod
> list
B. Solvate your molecule with solvents and form a simulation box
Option 1:
> loadoff solvents.lib (the default solvent models)
> list
CHCL3BOX DC4 MEOHBOX NMABOX PL3 POL3BOX QSPCFWBOX SPC
SPCBOX SPCFWBOX SPF SPG T4E TIP3PBOX TIP3PFBOX TIP4PBOX
TIP4PEWBOX TIP5PBOX TP3 TP4 TP5 TPF
Option 2:
Use solvateBox
> solvateBox xx yy 15
> saveAmberParms xx xx.prmtop xx.inpcrd
(here, xx is the solute and xx.prmtop and xx.inpcrd can be renamed afterwards)
> quit
Two input files for sander .prmtop and .inpcrd .prmtop is the force field file including both parameters and topology information for the whole solute/solvent system
.inpcrd is the initial configuration file
Option 3: use a script that loads all the input files except for the solvation step
e.g. $ xleap
> source script
the script file is as follows:
source leaprc.ff03.r1 OR source leaprc.ff02pol.r1
loadoff solvents.lib
hod = loadmol2 hod.mol2
leadAmberParams hod.frcmod
WAT = SPC
loadAmberParams frcmod.spce
list
then you type to finish the solvation step in xleap:
> solvateBox hod SPC 12.5
OR to use the default solvent box:
> solvateBox SPC MEOHBOX 12.0
> addIons CYN K+ 9
> saveAmberParams SPC (the solute) spc.prmtop spc.inpcrd
> quit
$ vmd New Molecule … –> load xx.prmtop using format AMBER7 Parm , then load xx.inpcrd using formate AMBER7 Restart to view the 3D structure of the system.
III. Example: Preparation of the PCBM_DMA in Chrorobenzene solution
Take the example of PCBM_DMA (residue name: PCB) as solute and Chlorobenzene (residue name: CBZ) as solvent. Three letter acronyms for residue name is preferred. We have PCBM_DMA.pdb CBZ.pdb data files.
(1) Convert pdb to mol2 using antechamber
For example, we use AM1-BCC empirical charge method, such that the missing charge information will be added to pdb file and form mol2 file.
$ antechamber -i CBZ.pdb -fi pdb -o CBZ.mol2 -fo mol2 -c bcc -s 2
In CBZ_SINGLE.pdb, “ATOM” is recode type, followed by atom serial number, name, residue name, x, y, z, occupancy, temperature factor. It is important to note “TER” marks the end of the residue that is required in Amber. “END” marks the end of the file. “ATOM” and “HEATOM” (for non-standard groups) are both ok.
ATOM 1 H CBZ 1 0.364 -2.147 0.000 1.00 0.00
ATOM 2 C CBZ 1 -0.185 -1.212 0.000 1.00 0.00
ATOM 3 C1 CBZ 1 -1.577 -1.204 0.000 1.00 0.00
ATOM 4 H1 CBZ 1 -2.111 -2.149 0.000 1.00 0.00
ATOM 5 C2 CBZ 1 -2.275 0.000 0.000 1.00 0.00
ATOM 6 H2 CBZ 1 -3.361 0.000 0.000 1.00 0.00
ATOM 7 C3 CBZ 1 -1.577 1.204 0.000 1.00 0.00
ATOM 8 H3 CBZ 1 -2.111 2.149 0.000 1.00 0.00
ATOM 9 C4 CBZ 1 -0.185 1.212 0.000 1.00 0.00
ATOM 10 H4 CBZ 1 0.364 2.147 0.000 1.00 0.00
ATOM 11 C5 CBZ 1 0.495 0.000 0.000 1.00 0.00
ATOM 12 Cl CBZ 1 2.247 0.000 0.000 1.00 0.00
TER
END
PDB file record format:
COLUMNS DATATYPE FIELD DEFINITION
-------------------------------------------------------------------------------------
1 - 6 Record name "ATOM "
7 - 11 Integer serial Atom serial number. (index=serial-1)
13 - 16 Atom name Atom name. (usually start from column 14!)
17 Character altLoc Alternate location indicator.
18 - 20 Residue name resName Residue name.
22 Character chainID Chain identifier.
23 - 26 Integer resSeq Residue sequence number.
27 AChar iCode Code for insertion of residues.
31 - 38 Real(8.3) x Orthogonal coordinates for X in Angstroms.
39 - 46 Real(8.3) y Orthogonal coordinates for Y in Angstroms.
47 - 54 Real(8.3) z Orthogonal coordinates for Z in Angstroms.
55 - 60 Real(6.2) occupancy Occupancy.
61 - 66 Real(6.2) tempFactor Temperature factor.
77 - 78 LString(2) element Element symbol, right-justified.
79 - 80 LString(2) charge Charge on the atom. (usually discarded)
The resulting CBZ.mol2 contains the 3 dimensional structure of our CBZ molecule as well as the charge on each atom, final column, the atom number (column 1), its name (column 2) and it’s atom type (column 6). It also specifies the bonding at the end of the file. This file does not, however, contain any parameters. The GAFF parameters are all defined in $AMBERHOME/dat/leap/parm/gaff.dat. The other thing you should notice here is that all of the GAFF atom types are in lower case. This is the mechanism by which the GAFF force field is kept independent of the macromolecular AMBER force fields. All of the traditional AMBER force fields use uppercase atom types. In this way the GAFF and traditional force fields can be mixed in the same calculation.
@<TRIPOS>MOLECULE
CBZ
12 12 1 0 0
SMALL
bcc
@<TRIPOS>ATOM
1 H 0.3640 -2.1470 0.0000 ha 1 CBZ 0.192810
2 C -0.1850 -1.2120 0.0000 ca 1 CBZ -0.158699
3 C1 -1.5770 -1.2040 0.0000 ca 1 CBZ -0.164546
4 H1 -2.1110 -2.1490 0.0000 ha 1 CBZ 0.178559
5 C2 -2.2750 0.0000 0.0000 ca 1 CBZ -0.163617
6 H2 -3.3610 0.0000 0.0000 ha 1 CBZ 0.174616
7 C3 -1.5770 1.2040 0.0000 ca 1 CBZ -0.164546
8 H3 -2.1110 2.1490 0.0000 ha 1 CBZ 0.178559
9 C4 -0.1850 1.2120 0.0000 ca 1 CBZ -0.158699
10 H4 0.3640 2.1470 0.0000 ha 1 CBZ 0.192810
11 C5 0.4950 0.0000 0.0000 ca 1 CBZ -0.090750
12 Cl 2.2470 0.0000 0.0000 cl 1 CBZ -0.016500
@<TRIPOS>BOND
1 1 2 1
2 2 3 ar
3 2 11 ar
4 3 4 1
5 3 5 ar
6 5 6 1
7 5 7 ar
8 7 8 1
9 7 9 ar
10 9 10 1
11 9 11 ar
12 11 12 1
@<TRIPOS>SUBSTRUCTURE
1 CBZ 1 TEMP 0 **** **** 0 ROOT
Do the same to the solute, and obtain PCBM_DMA.mol2 file.
(2) Convert mol2 to frcmod using parmchk
$ parmchk -i CBZ.mol2 -f mol2 -o CBZ.frcmod
The resulting CBZ.frcmod (force field modification type) file contains all the missing fore field parameters. If antechamber can’t empirically calculate a value or has no analogy it will either add a default value that it thinks is reasonable or alternatively insert a place holder (with zeros everywhere) and the comment “ATTN: needs revision”. In this case you will have to manually change parameters in this frcmod file. The CBZ.frcmod shows that there were 1 missing improper dihedral parameters.
remark goes here
MASS
BOND
ANGLE
DIHE
IMPROPER
ca-ca-ca-ha 1.1 180.0 2.0 General improper torsional angle (2 general atom types)
NONBON
Do the same to the solute, and obtain PCBM_DMA.frcmod file.
(3) Packmol : the program to create the simulation box by solvating the solute
$ packmol < setup.packm.PCBM
The packmol script setup.packm.PCBM is as follows.
# A mixture of PCBM and CBZ
# All the atoms from different molecules will be separated at least 2.0 Angstroms
tolerance 2.0
# The file type of input and output files is PDB
filetype pdb
# The name of the output file
output PCBM_CBZ_BULK.pdb
# 1 PCBM molecule and 6000 CBZ molecules will be put in a box
# defined by the minimum coordinates x, y and z = 0. 0. 0. and maximum
# coordinates 90. 90. 90. That is, they will be put in a cube of side
# 40. (the keyword "inside cube 0. 0. 0. 90.") could be used as well.
structure PCBM_DMA.pdb
number 1
inside box 35. 35. 35. 55. 55. 55.
end structure
structure CBZ.pdb
number 6000
inside box 0. 0. 0. 90. 90. 90.
end structure
(i) Use VMD to check visualize the box. In representation, Resname PCB and Resname CBZ to change rendering. OR use Jmol.sh to see structure.
(ii) Make sure net charge = 0.000000000 using dipole_read.f
(iii) Use UCSF Chimera to compare quantum .cube surface with the partial charges.
(4) LEAP to prmtop and inpcrd
A. Use leap to correct the BULK system PDB file into Amber acceptable PDB format.
One of the difference is the new PDB file has “TER” after every molecule/residue.
$ tleap -s -f setup.amber.INIT
setup.amber.INIT
source leaprc.gaff
CBZ = loadmol2 CBZ.mol2
loadamberparams CBZ.frcmod
PCB = loadmol2 PCBM_DMA.mol2
loadamberparams PCBM_DMA.frcmod
BULK = loadpdb PCBM_CBZ_BULK.pdb
savepdb BULK PCBM_CBZ_BULK_NEW.pdb
quit
B. Use leap to construct prmtop and inpcrd from the input of the NEW PDB of the whole BULK system, the mol2 of each species, and the additional force field parameter frcmod files.
We run the following leap script via tleap or via graphic interface xleap:
$ tleap -s -f setup.amber.BULK
setup.amber.BULK
source leaprc.gaff
CBZ = loadmol2 CBZ.mol2
loadamberparams CBZ.frcmod
PCB = loadmol2 PCBM_DMA.mol2
loadamberparams PCBM_DMA.frcmod
BULK = loadpdb PCBM_CBZ_BULK_NEW.pdb
setbox BULK centers
saveAmberParm BULK PCBM_CBZ_6001.prmtop PCBM_CBZ_6001.inpcrd
quit
Leave a Comment