Add restraints to I-TASSER modeling


If you already know something about the structure of the target protein, you can upload that information to the I-TASSER server. Such information can substantially improve the quality of the structure and function predictions. The I-TASSER server currently accepts two types of user-specified restraints: (1) inter-residue contact and distance restraints; (2) template structures and target-template alignments.

The server provides four options for assigning the restraints:

  • Assign contact/distance restraints: If you know which atom pairs should be in contact or at a given distance, use this option to upload a text file listing the contact and/or distance information for those atom pairs. Here is an example of the restraint file.
  • Specify template without alignment: If you want I-TASSER to use a specific PDB structure as a template, use this option to specify that structure. You only need to type in PDBID:ChainID, e.g. 1wor:A, without specifying the target-template alignment. I-TASSER will first fetch the structure from the PDB library and then generate the target-template alignment with our in-house alignment tool, MUSTER.
  • Specify template without alignment: You can use any 3D structure as the template, even one that does not exist in the PDB library. In this case, use this option to upload the 3D structure. The structure file must be in the standard PDB format. You do not need to input the target-template alignment; I-TASSER will generate it with our in-house alignment tool, MUSTER.
  • Specify template with alignment: This option allows you (usually advanced users) to specify both the template structure and the target-template alignment. The accepted file formats are the 3D format and the FASTA format (see below for a detailed explanation).
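
Before submitting a template in the PDBID:ChainID form, it can help to check the string locally. The following is only a sketch: the function name parse_template_spec is hypothetical, the pattern assumes the classic four-character PDB ID (a digit followed by three letters or digits), and the single-character chain ID is an assumption drawn from the 1wor:A example rather than a documented server rule.

```python
import re

# Assumption: classic 4-character PDB ID and a single-character chain ID,
# as in the "1wor:A" example. The server's own validation may differ.
TEMPLATE_SPEC = re.compile(r"^([0-9][A-Za-z0-9]{3}):([A-Za-z0-9])$")


def parse_template_spec(spec):
    """Split 'PDBID:ChainID' into (pdb_id, chain_id); raise ValueError if malformed."""
    match = TEMPLATE_SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"expected PDBID:ChainID, got {spec!r}")
    pdb_id, chain_id = match.groups()
    # Lower-case the PDB ID for consistency; the chain ID is kept as typed.
    return pdb_id.lower(), chain_id


print(parse_template_spec("1wor:A"))
```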

How to add inter-residue distance & contact restraints?

    The distance and contact information can be added by simply uploading a restraint file (see example).

    This file can specify atom-based distances between the ith and jth residues in the format:
    "DIST   Res_No.i   Atom_type_i    Res_No.j   Atom_type_j   Distance_in_Angstroms"
    e.g.

    DIST   12   HG21  50   HB1   8.1
    DIST   14   HA    57   1HE   6.2
    DIST   21   HB2   43   HD11  4.0
    DIST   124  CA    84   CA   17.4
    DIST   36   UNK   120  CA   17.4
    
    ('UNK' indicates that the user does not know which atom of the 36th residue is involved in the distance restraint)

    The file can also specify residue-based contacts between the ith and jth residues in the format:
    "CONTACT   Res_No.i   Res_No.j"
    e.g.
    CONTACT   33    6
    CONTACT   60    29
    CONTACT   37    345
    CONTACT   109   42
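
A restraint file can be checked locally before upload. The sketch below reads the two line types described above; the function name parse_restraints and the strict field counts (six fields for DIST, three for CONTACT) are assumptions based on the examples shown, not the server's actual parser.

```python
def parse_restraints(text):
    """Return (distances, contacts) parsed from restraint-file text."""
    distances, contacts = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "DIST" and len(fields) == 6:
            res_i, atom_i, res_j, atom_j, dist = fields[1:]
            # Atom names (including 'UNK') are kept as plain strings.
            distances.append((int(res_i), atom_i, int(res_j), atom_j, float(dist)))
        elif fields[0] == "CONTACT" and len(fields) == 3:
            contacts.append((int(fields[1]), int(fields[2])))
        else:
            raise ValueError(f"line {line_no}: unrecognized restraint: {line!r}")
    return distances, contacts


example = """\
DIST   124  CA    84   CA   17.4
DIST   36   UNK   120  CA   17.4
CONTACT   33    6
"""
distances, contacts = parse_restraints(example)
print(distances)
print(contacts)
```

Residue numbers that fall outside the target sequence are not detected here; that check would need the sequence length as an extra input.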
    
How to specify a template protein which I-TASSER modeling will be based on?

    When I-TASSER uses a known protein as a template, it needs both the 3D structure of the template protein and the alignment between the target and template sequences. There are two ways to assign a template to I-TASSER.

    1. The simplest way is to specify the PDB ID of the template protein and the chain identifier in the format PDBID:Chain. The server will try to generate the best alignment between the target and template sequences using the MUSTER program, an algorithm that aligns proteins using multiple sources of information, including secondary structure, sequence profile, solvent accessibility, and structural fragment profiles.

    2. If you know the target-template alignment, you are welcome to provide both the structural and the alignment information. The I-TASSER server accepts two alignment formats: the 3D format and the FASTA format. Examples of the 3D format and the FASTA format are provided.

      The 3D format is similar to the standard PDB format, but with two extra columns taken from the template, e.g.

      ATOM   2001  CA  MET     1      41.116 -30.727   6.866  129 THR
      ATOM   2002  CA  ALA     2      39.261 -27.408   6.496  130 ARG
      ATOM   2003  CA  ALA     3      35.665 -27.370   7.726  131 THR
      ATOM   2004  CA  ARG     4      32.662 -25.111   7.172  132 ARG
      ATOM   2005  CA  GLY     5      29.121 -25.194   8.602  133 ARG
      
      Columns 1-30:  Atom and residue records of the query sequence.
      Columns 31-54: Coordinates of the query atoms, copied from the corresponding atoms in the template.
      Columns 55-59: Corresponding residue number in the template, based on the alignment.
      Columns 60-64: Corresponding residue name in the template.
      
      The FASTA format is similar to the standard FASTA format, except that the 3D structure is attached after the sequence alignment, e.g.
      >query
      --------------------------------------------------------------------------
      ------------------------------------------------------MAARGRRAEPQGREAPGPAG
      GGGGGSRWAESGSGTSPESGDEEVSGAGSSPVSGGVNLFANDGSFLELFKRKMEEEQRQRQEEPPPGPQRPDQS
      AAAAGPGDPKRKGGPGSTLS---------FVGKRRGGNKLALKTGIVAKKQKTEDEVL------------TSKG
      DAWAKYMAEVKKYKAHQCGDDDKTRPLVK---------------------------------------------
      --------------------------------------------------------------------------
      >1w0r:A
      DPVLCFTQYEESSGKCKGLLGGGVSVEDCCLNTAFAYQKRSGGLCQPCRSPRWSLWSTWAPCSVTCSEGSQLRY
      RRCVGWNGQCSGKVAPGTLEWQLQACEDQQCCPEMGGWSGWGPWEPCSVTCSKGTRTRRRACNHPAPKCGGHCP
      GQAQESEACDTQQVCPTHGAWATWGPWTPCSASCHGG--PHEPKETRSRKCSAPEPSQKPPGKPCPGLAYEQRR
      CTGLPPCPVAGGWGPWGPVSPCPVTCGLGQTMEQRTCNHPVPQHGGPFCAGDATRTHICNTAVPCPVDGEWDSW
      GEWSPCIRRNMKSISCQEIPGQQSRGRTCRGRKFDGHRCAGQQQDIRHCYSIQHCPLKGSWSEWSTWGLCMPPC
      GPNPTRARQRLCTPLLPKYPPTVSMVEGQGEKNVTFWGRPLPRCEELQGQKLVVEEKRPCLHVPACKDPEEEEL
      
      REMARK The following is the PDB file of 1w0r:A
      ATOM      1  N   ASP     1     -61.352  10.686 -21.622
      ATOM      2  CA  ASP     1     -61.577   9.382 -21.306
      ATOM      3  C   ASP     1     -60.494   8.357 -21.572
      ATOM      4  O   ASP     1     -59.461   8.661 -22.046
      ATOM      5  CB  ASP     1     -62.869   9.947 -21.978
      ATOM      6  CG  ASP     1     -64.050   9.936 -21.019
      ATOM      7  OD1 ASP     1     -64.163   9.015 -20.186
      ATOM      8  OD2 ASP     1     -64.907  10.837 -21.154
      ATOM      9  N   PRO     2     -60.719   7.053 -21.256
      ...
      ATOM   3368  OE2 GLU   441       3.538 -11.561 -19.634
      ATOM   3369  N   LEU   442       5.760 -10.509 -22.452
      ATOM   3370  CA  LEU   442       4.343 -10.088 -22.296
      ATOM   3371  C   LEU   442       3.130 -10.010 -23.201
      ATOM   3372  O   LEU   442       3.217  -9.645 -24.316
      ATOM   3373  CB  LEU   442       2.901 -10.133 -22.751
      ATOM   3374  CG  LEU   442       1.899  -9.162 -22.168
      ATOM   3375  CD1 LEU   442       0.483  -9.471 -22.688
      ATOM   3376  CD2 LEU   442       1.922  -9.267 -20.645
      ATOM   3377  OXT LEU   442       2.654  -9.424 -25.326
      END
      
      Note: To avoid possible errors in reproducing the template structure and alignment, users of the FASTA format must attach the original PDB structure of the template to the FASTA file rather than specifying a PDB ID; see the example.
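
A FASTA-format file can be assembled from an alignment and the template's PDB text. The sketch below is illustrative only: the function name build_fasta_template and its parameters are hypothetical, the line width is an arbitrary choice, and the equal-length check is an assumption about what a consistent alignment requires, not a documented server rule.

```python
def build_fasta_template(query_aln, template_aln, template_id, pdb_text, width=60):
    """Join an aligned query/template pair and the template PDB text into one file."""
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned query and template must have the same length")

    def wrap(seq):
        # Plain slicing; textwrap is avoided because it treats '-' as a break point.
        return [seq[i:i + width] for i in range(0, len(seq), width)]

    lines = [">query", *wrap(query_aln), f">{template_id}", *wrap(template_aln)]
    # A blank line separates the alignment from the attached PDB structure.
    lines += ["", pdb_text.rstrip("\n")]
    return "\n".join(lines) + "\n"


pdb_text = "REMARK template coordinates go here\nEND\n"
print(build_fasta_template("--MAAR", "DPVLCF", "1w0r:A", pdb_text, width=3))
```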

 


zhanglab@ku.edu | (785) 864-1948 | 2030 Becker Dr, Lawrence, KS 66047