Difference between revisions of "Structural Alignments (Phenylketonuria)"

Revision as of 16:56, 2 June 2013

Summary

Structural alignments are used to determine the functional and evolutionary relationships between protein structures. <ref name="struc_align"> Walter Pirovano, K Anton Feenstra and Jaap Heringa (2008): "The meaning of alignment: lessons from structural diversity". BMC Bioinformatics Vol.9:556. doi:10.1186/1471-2105-9-556 </ref> In this task, we first assembled a dataset of structures that are related or unrelated to our protein (PAH). We then quantified the structural similarity between these structures using several alignment methods and scores. Finally, we generated structural alignments to evaluate some of the sequence-based alignments from Task 2. The results and the accompanying discussion are shown below.

Explore structural alignments

Lab journal

Dataset generation

Our protein (PAH) has the CATH code 1.10.800.10 (Phenylalanine Hydroxylase). To build the dataset, we selected structures that are similar and dissimilar to this protein and added the following entries:

reference structure of PAH: 2PAH (96.41% identity)
identical sequence with filled binding site: 1LRM (100% identity; the 3D structure of the PDB entry shows two filled binding sites with the ligands FE and HBI)
identical sequence with unfilled binding site: none found
low sequence identity: 3LUY (32.2%; no PDB entry below 30% was found)
high sequence identity: 2PHM (89.7%)
CAT: 1J8U (CATH Code: 1.10.800.10); no category other than this one exists for CAT
CA: 2B5U (CATH Code: 1.10.287.620)
C: 3BQO (CATH Code: 1.25.40.210)
other CATH category: 1V8H (CATH Code: 2.60.40.10)
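The CATH codes in the list above encode how closely each structure is related to PAH: the more leading levels (Class, Architecture, Topology, Homologous superfamily) two codes share, the closer the classification. The following is a minimal sketch of that comparison, using only the codes given in the list; the function and variable names are illustrative and not part of any CATH tool.

```python
# CATH codes copied from the dataset list; 2PAH (1.10.800.10) is the reference.
REFERENCE_CATH = "1.10.800.10"
DATASET_CATH = {
    "1J8U": "1.10.800.10",
    "2B5U": "1.10.287.620",
    "3BQO": "1.25.40.210",
    "1V8H": "2.60.40.10",
}

# Index = number of leading levels two codes have in common.
LEVEL_NAMES = ["none", "C", "CA", "CAT", "CATH"]


def shared_cath_level(code_a, code_b):
    """Return the deepest CATH level at which two codes still agree."""
    shared = 0
    for part_a, part_b in zip(code_a.split("."), code_b.split(".")):
        if part_a != part_b:
            break
        shared += 1
    return LEVEL_NAMES[shared]


shared_levels = {
    pdb_id: shared_cath_level(REFERENCE_CATH, code)
    for pdb_id, code in DATASET_CATH.items()
}
for pdb_id, level in shared_levels.items():
    print(pdb_id, level)
```

Note that 1J8U agrees with the reference on all four levels, which matches the remark that no separate category was available for the CAT level.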

Next, we apply different structural alignment methods to this dataset. Each structure only has to be superimposed on the reference structure, not on the other structures.

Pymol

Pymol is a Python-enhanced, open-source molecular visualization tool. It is particularly suited to 3D visualization of proteins and small molecules as well as their densities, surfaces and trajectories. It also supports molecular editing, such as aligning or superimposing two molecules. <ref> http://sourceforge.net/projects/pymol/ short Pymol summary, retrieved June 02, 2013 </ref>
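As a rough illustration of how the superimpositions could be scripted, the sketch below only generates a Pymol command script as text; it does not run Pymol itself. It assumes that chain A of every entry is used and that Pymol's `fetch` and `align` commands are available in the installed version. The helper name `build_pymol_script` is hypothetical.

```python
REFERENCE = "2pah"
TARGETS = ["1lrm", "3luy", "2phm", "1j8u", "2b5u", "3bqo", "1v8h"]


def build_pymol_script(reference, targets, chain="A"):
    """Return Pymol commands that superimpose every target on the reference."""
    lines = [f"fetch {reference}"]
    for target in targets:
        lines.append(f"fetch {target}")
        # 'align' expects the mobile selection first, then the fixed one.
        lines.append(
            f"align {target} and chain {chain}, {reference} and chain {chain}"
        )
    return "\n".join(lines)


script = build_pymol_script(REFERENCE, TARGETS)
print(script)
```

The printed commands could be saved to a `.pml` file and loaded in Pymol. For the structures with low sequence identity, Pymol's `super` command, which does not rely on a sequence alignment, may be the more suitable choice; whether it gives better results here has not been checked.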

LGA

LGA ...

SSAP / CATHEDRAL (used by CATH)

SSAP ...
uses Cβ

TopMatch

TopMatch ...
uses Cα
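According to the notes above, SSAP compares Cβ atoms while TopMatch compares Cα atoms. The sketch below shows, under simplifying assumptions, how the two atom sets could be pulled from PDB `ATOM` records using the fixed column positions of the PDB format. The example records are made up and the helper names are illustrative; this is not how either tool is implemented.

```python
def atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format a simplified PDB ATOM record with the standard fixed columns."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def select_atoms(pdb_lines, atom_name, chain="A"):
    """Return {residue number: (x, y, z)} for one atom name in one chain."""
    selected = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != atom_name or line[21] != chain:
            continue
        res_seq = int(line[22:26])
        selected[res_seq] = (
            float(line[30:38]),
            float(line[38:46]),
            float(line[46:54]),
        )
    return selected


# Made-up two-residue fragment; glycine has no C-beta atom.
example = [
    atom_line(1, "CA", "ALA", "A", 1, 1.0, 2.0, 3.0),
    atom_line(2, "CB", "ALA", "A", 1, 1.5, 2.5, 3.5),
    atom_line(3, "CA", "GLY", "A", 2, 4.0, 5.0, 6.0),
]
ca_atoms = select_atoms(example, "CA")
cb_atoms = select_atoms(example, "CB")
print(sorted(ca_atoms), sorted(cb_atoms))
```

Because glycine residues lack a Cβ atom, a Cβ-based comparison covers a slightly different set of positions than a Cα-based one, which is one reason scores from the two kinds of methods are not directly comparable.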

SAP or CE

SAP produced an error, so CE was used instead.
CE-PDB ...

Modelling scores

To compare the different models, we compare their RMSDs (root-mean-square deviations). TopMatch uses the same formula but calls it Er (root-mean-square error). The RMSD is the square root of the mean squared distance between corresponding positions of two superimposed proteins, given in Ångström. The results are shown in <xr id="rmsd"/>. <figtable id="rmsd">

RMSD results
Method	1lrm	3luy	2phm	1j8u	2b5u	3bqo	1v8h
LGA-RMSD	0.81	3.30	0.88	0.73	3.07	3.59	3.42
SSAP-RMSD	0.99	18.77	1.24	1.02	39.16	22.39	7.27
TopMatch-Er	0.60	1.98	0.81	0.63	1.21	1.12	3.25
CE-RMSD	0.65	5.13	0.95	0.68	4.06	4.68	5.92

Root-mean-square deviation/error in Ångström for the four structural alignment methods LGA, SSAP, TopMatch and CE.

</figtable>
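For reference, the RMSD is the square root of the mean squared distance between paired positions. The following minimal sketch computes it for two coordinate sets that are assumed to be already superimposed and already paired position by position; the coordinates are made up for illustration, and the alignment tools listed in the table determine the pairing and the superposition themselves.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of superimposed (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared_sum / len(coords_a))


# Made-up coordinates (in Ångström): every point is shifted by 1 Å along z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
example_rmsd = rmsd(structure_a, structure_b)
print(round(example_rmsd, 3))
```

Since the value depends on which positions are paired, RMSDs from methods that align different numbers of residues, or different atom types, should be compared with care.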

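TopMatch yields the lowest RMSDs for every structure.
Neither LGA nor CE is consistently better than the other: CE gives lower values for the near-identical structures 1lrm and 1j8u, whereas LGA gives lower values for all remaining structures.
SSAP yields the highest RMSDs for every structure, possibly because the Cβ atoms it compares lie further apart than the Cα atoms.

Note: in all alignments, chain A of 2pah (2pah.A) is used as query and chain A of the other structure (xx.A) as target.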
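The observations above can be checked directly against the RMSD table. The sketch below copies the values from the table (the last column is labelled with the ID from the dataset list, 1V8H) and reports, per structure, which method gives the lowest and the highest value; the variable and function names are illustrative.

```python
STRUCTURES = ["1lrm", "3luy", "2phm", "1j8u", "2b5u", "3bqo", "1v8h"]

# Values copied from the RMSD table, one list per method, in column order.
RMSD_TABLE = {
    "LGA": [0.81, 3.30, 0.88, 0.73, 3.07, 3.59, 3.42],
    "SSAP": [0.99, 18.77, 1.24, 1.02, 39.16, 22.39, 7.27],
    "TopMatch": [0.60, 1.98, 0.81, 0.63, 1.21, 1.12, 3.25],
    "CE": [0.65, 5.13, 0.95, 0.68, 4.06, 4.68, 5.92],
}


def best_and_worst(table, structures):
    """Map each structure to (method with lowest value, method with highest)."""
    result = {}
    for index, structure in enumerate(structures):
        values = {method: row[index] for method, row in table.items()}
        result[structure] = (
            min(values, key=values.get),
            max(values, key=values.get),
        )
    return result


ranking = best_and_worst(RMSD_TABLE, STRUCTURES)

# Structures for which LGA reports a lower RMSD than CE.
lga_beats_ce = [
    structure
    for index, structure in enumerate(STRUCTURES)
    if RMSD_TABLE["LGA"][index] < RMSD_TABLE["CE"][index]
]

for structure, (best, worst) in ranking.items():
    print(structure, "lowest:", best, "highest:", worst)
print("LGA lower than CE for:", lga_beats_ce)
```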

...

Evaluate sequence alignments

Lab journal
<figtable id="model_rmsd">

LGA and hhsearch results
	LGA				hhsearch
pdb	RMSD	LGA_S	LGA_Q	seq_id	probability	e-value	identities(%)
1phz	0.83	90.65	32.44	99.34	100.00	6.9e-165	92
1j8u	0.73	90.29	35.83	99.67	100.00	3.1e-135	100
2v27	1.70	62.77	12.55	96.02	100.00	3.6e-74	32
2qmx	3.18	7.46	1.25	4.88	98.20	1.1e-09	36
3luy	2.82	7.17	1.24	13.89	98.07	3.3e-09	22
1qey	0.64	3.65	1.63	0.00	54.00	3.4	67
1wyp	2.67	8.43	1.37	0.00	29.42	15	19
1a6s	3.15	6.93	1.08	11.43	20.59	29	36

...

</figtable>

The last two listed hits, 1wyp and 1a6s, have very low hhsearch probabilities (29.42 and 20.59).
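One way to make this remark reproducible is to flag hits below a probability cutoff. The sketch below uses the rows shown in the table; the cutoff of 50 is a hypothetical threshold chosen for illustration only and is not taken from the hhsearch documentation.

```python
# (pdb, LGA RMSD, LGA_S, hhsearch probability) copied from the table rows shown.
HITS = [
    ("1phz", 0.83, 90.65, 100.00),
    ("1j8u", 0.73, 90.29, 100.00),
    ("2v27", 1.70, 62.77, 100.00),
    ("2qmx", 3.18, 7.46, 98.20),
    ("3luy", 2.82, 7.17, 98.07),
    ("1qey", 0.64, 3.65, 54.00),
    ("1wyp", 2.67, 8.43, 29.42),
    ("1a6s", 3.15, 6.93, 20.59),
]

PROBABILITY_CUTOFF = 50.0  # hypothetical threshold, for illustration only

low_confidence = [
    pdb for pdb, _rmsd, _lga_s, probability in HITS
    if probability < PROBABILITY_CUTOFF
]

for pdb, hit_rmsd, lga_s, probability in HITS:
    flag = "low probability" if pdb in low_confidence else ""
    print(pdb, hit_rmsd, lga_s, probability, flag)
```

The table also shows that a low RMSD alone is not informative: 1qey has the lowest RMSD (0.64) but an LGA_S of only 3.65, so the RMSD should be read together with LGA_S.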

References

@@ Line 6: / Line 6: @@
 === Dataset generation ===
-Our protein has the CATH Code: [http://www.cathdb.info/version/3.5.0/superfamily/1.10.800.10 1.10.800.10] (Phenylalanine Hydroxylase)
+Our protein (PAH) has the CATH Code [http://www.cathdb.info/version/3.5.0/superfamily/1.10.800.10 1.10.800.10] (Phenylalanine Hydroxylase). We used, for the generation of the dataset, similar and dissimilar structures to this protein. Thus, we added the following structures into it:
 * reference structure of PAH: [http://www.rcsb.org/pdb/explore/explore.do?structureId=2PAH 2PAH] (96,41% identity)
-* identical sequence with filled binding site: [http://www.rcsb.org/pdb/explore/explore.do?structureId=1LRM 1LRM] (--> pdb entry: looked at 3D structure and saw a filled binding site, two ligands: FE and HBI)
+* identical sequence with filled binding site: [http://www.rcsb.org/pdb/explore/explore.do?structureId=1LRM 1LRM] (100% identity --> pdb entry: looked at 3D structure and saw two filled binding site with the ligands: FE and HBI)
-* identical sequence with unfilled binding site:  not found anyone
+* identical sequence with unfilled binding site: not found anyone
 * low sequence identity: [http://www.rcsb.org/pdb/explore/explore.do?structureId=3LUY 3LUY] (32,2% - no pdb ID under 30%)
 * high sequence identity: pdb ID: [http://www.rcsb.org/pdb/explore/explore.do?structureId=2PHM 2PHM] (89,7%)
-* CAT: [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] (CATH Code: 1.10.800.10) - there is no other category than 1.10.800.10 for CAT
+* CAT: [http://www.rcsb.org/pdb/explore/explore.do?structureId=1J8U 1J8U] (CATH Code: 1.10.800.10) - there is no other category than this for CAT
 * CA: [http://www.rcsb.org/pdb/explore/explore.do?structureId=2B5U 2B5U] (CATH Code: 1.10.287.620)
 * C: [http://www.rcsb.org/pdb/explore/explore.do?structureId=3BQO 3BQO] (CATH Code: 1.25.40.210)
 * other CATH category: [http://www.rcsb.org/pdb/explore/explore.do?structureId=1V8H 1V8H] (CATH Code: 2.60.40.10)
+Now we want to apply different structural alignment methods with this dataset. In this case, each structure has only to be superimposed on the reference structure and not on the other structures too.
 === Pymol ===
+[http://pymol.org/ Pymol] is a python-enhanced and open source molecular visualization tool. It is particularly suitable for 3D visualization of proteins and small molecules as well as their density, surfaces and trajectories. It also includes molecular editing like aligning or superimposition of two molecules. <ref> http://sourceforge.net/projects/pymol/ short Pymol summary, retrieved June 02, 2013 <ref/>
-...
 === LGA ===
