Lab Journal - Task 8 (PAH)
From Bioinformatikpedia
Mutation dataset
The mutation dataset was built from the following five SNPs taken from the HGMD database (25th June 2013):
<figtable id="mutds_hgmd">
Missense mutations (SNPs) from HGMD | ||||||
---|---|---|---|---|---|---|
Accession Number | Codon change | Sequence position | Amino acid change | Codon number | Disease | Reference |
CM000542 | CAG⇒CTG | 59 | Gln(Q)-Leu(L) | 20 | Hyperphenylalaninaemia | Hennermann (2000) Hum Mutat 15, 254 |
CM045080 | GGT⇒AGT | 307 | Gly(G)-Ser(S) | 103 | Phenylketonuria | Lee (2004) J Hum Genet 49, 617 |
CM910286 | GCC⇒GTC | 776 | Ala(A)-Val(V) | 259 | Phenylketonuria | Labrune (1991) Am J Hum Genet 48, 1115 |
CM010981 | AAG⇒ACG | 1022 | Lys(K)-Thr(T) | 341 | Phenylketonuria | Tyfield (1997) Am J Hum Genet 60, 388 |
CM090791 | CCA⇒CAA | 1247 | Pro(P)-Gln(Q) | 416 | Hyperphenylalaninaemia | Dobrowolski (2009) J Inherit Metab Dis 32, 10 |
</figtable>
Furthermore, the following five mutations from dbSNP were added (25th June 2013):
<figtable id="mutds_dbSNP">
Missense mutations (SNPs) from dbSNP | |||||
---|---|---|---|---|---|
Reference SNP | Codon change | Sequence position | Amino acid change | Codon number | Disease |
rs199475569 | CAC⇒AAC | 190 | His(H)-Asn(N) | 64 | ? |
rs199475681 | AGA⇒ATA | 368 | Arg(R)-Ile(I) | 123 | ? |
rs62508752 | ACA⇒CCA | 796 | Thr(T)-Ala(A) | 266 | Phenylketonuria |
rs199475695 | TTT⇒TCT | 1175 | Phe(F)-Ser(S) | 392 | ? |
rs199475696 | ATT⇒ACT | 1262 | Ile(I)-Thr(T) | 421 | ? |
</figtable>
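The codon number in both tables follows from the sequence position (three nucleotides per codon), and the amino acid change follows from translating the two codons. Below is a minimal Python sketch for cross-checking a table row, assuming the standard genetic code and 1-based positions in the coding sequence. The helper names `codon_number` and `amino_acid_change` are illustrative and do not belong to any tool used in this journal.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_number(cds_position):
    """Return the 1-based codon index for a 1-based coding-sequence position."""
    return (cds_position + 2) // 3


def amino_acid_change(ref_codon, alt_codon):
    """Translate the reference and the mutated codon to one-letter residues."""
    return CODON_TABLE[ref_codon.upper()], CODON_TABLE[alt_codon.upper()]


# Row CM000542 from the HGMD table: CAG => CTG at sequence position 59.
print(codon_number(59), amino_acid_change("CAG", "CTG"))  # 20 ('Q', 'L')
```

Running the same two calls for every row gives a quick consistency check between the "Codon change", "Sequence position", "Amino acid change", and "Codon number" columns.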
Analyze SNPs
PSSM
We created the PSSM with PSI-BLAST, using the default parameters and five iterations.
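Last position-specific scoring matrix computed (only the rows for the mutated positions are shown).
Position-specific scores | |||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Position | Residue | A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V |
20 | Q | -2 | 1 | 2 | 3 | -4 | 2 | 2 | -1 | 2 | -3 | -1 | 2 | -2 | -3 | -3 | -1 | -1 | -4 | -3 | -1 |
64 | H | -5 | 2 | -1 | -3 | -7 | 0 | 0 | -6 | 9 | -7 | -6 | 2 | -3 | -2 | -6 | -2 | -5 | -6 | 1 | -7 |
103 | G | 0 | 0 | 0 | 0 | 2 | 0 | 0 | -1 | 1 | 0 | -1 | 0 | 1 | 0 | 0 | 0 | 0 | -2 | 1 | 0 |
123 | R | -3 | 6 | -1 | -3 | -3 | 0 | -2 | -3 | 0 | 0 | -1 | 3 | -1 | -4 | -3 | -1 | 1 | -4 | -3 | -3 |
259 | A | 6 | -2 | -2 | -3 | -3 | -1 | -2 | -1 | -3 | -2 | -2 | -2 | -1 | -4 | -3 | 1 | -2 | -4 | -2 | -2 |
266 | T | 3 | -1 | -1 | -3 | -3 | -3 | -3 | 0 | -3 | -3 | -3 | 0 | -3 | -4 | -3 | 0 | 5 | -4 | 0 | -2 |
341 | K | -2 | 4 | -1 | -2 | -1 | 1 | 1 | -3 | -1 | -3 | -1 | 4 | -1 | 0 | -2 | -1 | -1 | -1 | -2 | -2 |
392 | F | -4 | -5 | -5 | -5 | 0 | -5 | -5 | -3 | -3 | 3 | 2 | -5 | 1 | 6 | -5 | -4 | -3 | -2 | 2 | 1 |
416 | P | 0 | -4 | 1 | -3 | -5 | -3 | -2 | -3 | -2 | -2 | -5 | -2 | -4 | -1 | 7 | -1 | -3 | -6 | -4 | -4 |
421 | I | -1 | -4 | -4 | -4 | -3 | -1 | -4 | -1 | -4 | 4 | 1 | -3 | 0 | -2 | -4 | -3 | -2 | -4 | -3 | 4 |
Weighted observed percentages rounded down, information per position, and relative weight of gapless real matches to pseudocounts | |||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Position | Residue | A | R | N | D | C | Q | E | G | H | I | L | K | M | F | P | S | T | W | Y | V | Information | Relative weight |
20 | Q | 0 | 7 | 13 | 17 | 0 | 14 | 11 | 3 | 5 | 0 | 6 | 16 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 6 | 0.32 | inf |
64 | H | 0 | 10 | 2 | 1 | 0 | 3 | 6 | 0 | 58 | 0 | 0 | 11 | 1 | 1 | 0 | 2 | 0 | 0 | 5 | 0 | 1.89 | inf |
103 | G | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 0 | 5 | 5 | 0.03 | inf |
123 | R | 1 | 49 | 1 | 0 | 1 | 3 | 0 | 1 | 2 | 7 | 4 | 16 | 1 | 0 | 0 | 2 | 10 | 0 | 0 | 0 | 0.72 | inf |
259 | A | 70 | 2 | 1 | 0 | 0 | 3 | 1 | 3 | 0 | 2 | 4 | 1 | 1 | 0 | 0 | 9 | 1 | 0 | 1 | 1 | 0.93 | inf |
266 | T | 29 | 3 | 1 | 0 | 0 | 0 | 0 | 6 | 0 | 0 | 1 | 6 | 0 | 0 | 0 | 5 | 43 | 0 | 3 | 1 | 0.70 | inf |
341 | K | 0 | 28 | 0 | 0 | 2 | 2 | 10 | 0 | 0 | 0 | 8 | 39 | 1 | 7 | 0 | 1 | 1 | 1 | 0 | 1 | 0.44 | inf |
392 | F | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 0 | 16 | 17 | 0 | 2 | 49 | 0 | 0 | 0 | 0 | 6 | 7 | 0.98 | inf |
416 | P | 9 | 0 | 9 | 0 | 0 | 0 | 2 | 1 | 1 | 3 | 0 | 2 | 0 | 4 | 66 | 4 | 0 | 0 | 0 | 0 | 1.60 | inf |
421 | I | 5 | 0 | 0 | 0 | 0 | 4 | 0 | 6 | 0 | 32 | 9 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 41 | 0.62 | inf |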
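To judge a substitution against the PSSM, the score and the observed percentage of the wild-type and the mutant residue at that position are the values of interest. The following Python sketch reads them from one row of the matrix. It assumes the ASCII PSSM layout shown above (position, residue, 20 scores, 20 percentages, information, relative weight) with the column order A R N D C Q E G H I L K M F P S T W Y V; the function name `parse_pssm_row` is hypothetical.

```python
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"  # column order of the PSSM shown above


def parse_pssm_row(line):
    """Split one whitespace-separated PSSM row into labelled values."""
    fields = line.split()
    scores = [int(x) for x in fields[2:22]]      # 20 log-odds scores
    percents = [int(x) for x in fields[22:42]]   # 20 weighted observed percentages
    return {
        "position": int(fields[0]),
        "residue": fields[1],
        "score": dict(zip(AA_ORDER, scores)),
        "percent": dict(zip(AA_ORDER, percents)),
        "information": float(fields[42]),
    }


# Row for position 20 (mutation Q20L), copied from the matrix above.
row_20 = (
    "20 Q -2 1 2 3 -4 2 2 -1 2 -3 -1 2 -2 -3 -3 -1 -1 -4 -3 -1 "
    "0 7 13 17 0 14 11 3 5 0 6 16 0 0 0 0 2 0 0 6 0.32 inf"
)
row = parse_pssm_row(row_20)
print(row["score"]["Q"], row["score"]["L"], row["percent"]["L"])  # 2 -1 6
```

For Q20L this shows a wild-type score of 2 against a mutant score of -1, with leucine observed in 6% of the aligned sequences at this position.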
SIFT
- Q20L
Substitution at pos 20 from Q to L is predicted to be TOLERATED with a score of 0.16. Median sequence conservation: 3.20. Sequences represented at this position: 42
- H64N
Substitution at pos 64 from H to N is predicted to AFFECT PROTEIN FUNCTION with a score of 0.00. Median sequence conservation: 3.01. Sequences represented at this position: 77
- G103S
Substitution at pos 103 from G to S is predicted to be TOLERATED with a score of 0.90. Median sequence conservation: 3.02. Sequences represented at this position: 72
- R123I
Substitution at pos 123 from R to I is predicted to be TOLERATED with a score of 0.05. Median sequence conservation: 3.01. Sequences represented at this position: 76
- A259V
Substitution at pos 259 from A to V is predicted to AFFECT PROTEIN FUNCTION with a score of 0.00. Median sequence conservation: 3.01. Sequences represented at this position: 77
- T266A
Substitution at pos 266 from T to A is predicted to AFFECT PROTEIN FUNCTION with a score of 0.00. Median sequence conservation: 3.01. Sequences represented at this position: 77
- K341T
Substitution at pos 341 from K to T is predicted to AFFECT PROTEIN FUNCTION with a score of 0.00. Median sequence conservation: 3.01. Sequences represented at this position: 77
- F392S
Substitution at pos 392 from F to S is predicted to AFFECT PROTEIN FUNCTION with a score of 0.00. Median sequence conservation: 3.01. Sequences represented at this position: 77
- P416Q
Substitution at pos 416 from P to Q is predicted to AFFECT PROTEIN FUNCTION with a score of 0.00. Median sequence conservation: 3.00. Sequences represented at this position: 74
- I421T
Substitution at pos 421 from I to T is predicted to AFFECT PROTEIN FUNCTION with a score of 0.00. Median sequence conservation: 3.00. Sequences represented at this position: 74
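Each SIFT result line states the position, both residues, the call, and the score, so the lines can be turned into records for later comparison with the other predictors. The sketch below is one possible way to do this in Python. The regular expression is written for the wording of the result lines shown here and is an assumption, not an official SIFT output specification; `parse_sift` is a hypothetical helper name.

```python
import re

# Pattern for the result sentence as it appears in this journal (assumed wording).
SIFT_LINE = re.compile(
    r"Substitution at pos (?P<pos>\d+) from (?P<ref>[A-Z]) to (?P<alt>[A-Z]) "
    r"is predicted to (?P<call>be TOLERATED|AFFECT PROTEIN FUNCTION) "
    r"with a score of (?P<score>\d+\.\d+)"
)


def parse_sift(line):
    """Extract mutation name, call, and score from one SIFT result line."""
    match = SIFT_LINE.search(line)
    if match is None:
        raise ValueError(f"not a SIFT result line: {line!r}")
    return {
        "mutation": f"{match['ref']}{match['pos']}{match['alt']}",
        "tolerated": match["call"] == "be TOLERATED",
        "score": float(match["score"]),
    }


example = (
    "Substitution at pos 20 from Q to L is predicted to be TOLERATED "
    "with a score of 0.16."
)
print(parse_sift(example))
# {'mutation': 'Q20L', 'tolerated': True, 'score': 0.16}
```

Deriving the mutation name from the sentence itself avoids relying on a separately typed label for each entry.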
PolyPhen2
- A259V
HumDiv: This mutation is predicted to be probably damaging with a score of 1.000 (sensitivity: 0.00; specificity: 1.00).
HumVar: This mutation is predicted to be probably damaging with a score of 1.000 (sensitivity: 0.00; specificity: 1.00).
- R123I
HumDiv: This mutation is predicted to be possibly damaging with a score of 0.807 (sensitivity: 0.84; specificity: 0.93).
HumVar: This mutation is predicted to be possibly damaging with a score of 0.582 (sensitivity: 0.81; specificity: 0.83).
- Q20L
HumDiv: This mutation is predicted to be benign with a score of 0.000 (sensitivity: 1.00; specificity: 0.00).
HumVar: This mutation is predicted to be benign with a score of 0.000 (sensitivity: 1.00; specificity: 0.00).
- G103S
HumDiv: This mutation is predicted to be benign with a score of 0.003 (sensitivity: 0.98; specificity: 0.44).
HumVar: This mutation is predicted to be benign with a score of 0.006 (sensitivity: 0.97; specificity: 0.45).
- H64N
HumDiv: This mutation is predicted to be probably damaging with a score of 0.993 (sensitivity: 0.70; specificity: 0.97).
HumVar: This mutation is predicted to be probably damaging with a score of 0.962 (sensitivity: 0.62; specificity: 0.92).
- I421T
HumDiv: This mutation is predicted to be possibly damaging with a score of 0.667 (sensitivity: 0.86; specificity: 0.91).
HumVar: This mutation is predicted to be probably damaging with a score of 0.913 (sensitivity: 0.69; specificity: 0.90).
- K341T
HumDiv: This mutation is predicted to be probably damaging with a score of 1.000 (sensitivity: 0.00; specificity: 1.00).
HumVar: This mutation is predicted to be probably damaging with a score of 0.996 (sensitivity: 0.36; specificity: 0.97).
- F392S
HumDiv: This mutation is predicted to be probably damaging with a score of 1.000 (sensitivity: 0.00; specificity: 1.00).
HumVar: This mutation is predicted to be probably damaging with a score of 1.000 (sensitivity: 0.00; specificity: 1.00).
- P416Q
HumDiv: This mutation is predicted to be probably damaging with a score of 0.996 (sensitivity: 0.55; specificity: 0.98).
HumVar: This mutation is predicted to be probably damaging with a score of 0.985 (sensitivity: 0.55; specificity: 0.94).
- T266A
HumDiv: This mutation is predicted to be probably damaging with a score of 1.000 (sensitivity: 0.00; specificity: 1.00).
HumVar: This mutation is predicted to be probably damaging with a score of 1.000 (sensitivity: 0.00; specificity: 1.00).
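With two predictors available, a natural next step is to list the mutations on which they disagree. The Python sketch below does this for SIFT and the PolyPhen2 HumDiv model. The calls are transcribed by hand from the results above, with the SIFT calls keyed by the substitution stated in each SIFT result line. Treating "possibly damaging" and "probably damaging" as one "damaging" class is a simplification made for this comparison, and `disagreements` is a hypothetical helper name.

```python
# True means SIFT predicted the substitution to be TOLERATED.
SIFT_TOLERATED = {
    "Q20L": True, "H64N": False, "G103S": True, "R123I": True, "A259V": False,
    "T266A": False, "K341T": False, "F392S": False, "P416Q": False, "I421T": False,
}

# PolyPhen2 HumDiv class per mutation.
POLYPHEN2_HUMDIV = {
    "A259V": "probably damaging", "R123I": "possibly damaging", "Q20L": "benign",
    "G103S": "benign", "H64N": "probably damaging", "I421T": "possibly damaging",
    "K341T": "probably damaging", "F392S": "probably damaging",
    "P416Q": "probably damaging", "T266A": "probably damaging",
}


def disagreements(sift_tolerated, polyphen_class):
    """Return mutations where exactly one of the two tools predicts an effect."""
    conflicting = []
    for mutation, tolerated in sift_tolerated.items():
        damaging = polyphen_class[mutation] != "benign"
        # Tolerated and damaging together, or neither, means the tools disagree.
        if tolerated == damaging:
            conflicting.append(mutation)
    return conflicting


print(disagreements(SIFT_TOLERATED, POLYPHEN2_HUMDIV))  # ['R123I']
```

R123I is the only conflict under this simplification: SIFT reports it as tolerated with a score of 0.05, while PolyPhen2 HumDiv rates it as possibly damaging.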
SNAP
- A259V
- R123I
- Q20L
- G103S
- H64N
- I421T
- K341T
- F392S
- P416Q
- T266A
MutationTaster
Gene: ENSG00000171759
Transcript: ENST00000553106
Of the affected protein features, only those annotated as "lost" are reported; features annotated as "might get lost" are omitted.
- A259V - C776T
Prediction: disease causing, Model: simple_aae, prob: ~1 (classification due to ClinVar) Summary • amino acid sequence changed • known disease mutation at this position (HGMD CM910286) • known disease mutation: rs118203921 (pathogenic) • protein features (might be) affected (HELIX: 251-259, lost) • splice site changes (Acc marginally increased)
- R123I - G368T
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • listed as SNP • protein features (might be) affected • splice site changes (Acc increased)
- Q20L - A59T
Prediction: disease causing, Model: simple_aae, prob: ~0.95 Summary • amino acid sequence changed • known disease mutation at this position (HGMD CM000542) • protein features (might be) affected • splice site changes (Donor lost, acc marginally increased)
- G103S - G307A
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • known disease mutation at this position (HGMD CM045080) • protein features (might be) affected (Domain ACT: lost) • splice site changes (Donor gained)
- H64N - C190A
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • known disease mutation at this position (HGMD CD993066) • protein features (might be) affected (Domain ACT: lost)
- I421T - T1262C
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • listed as SNP • protein features (might be) affected (STRAND: 420-424, lost)
- K341T - A1022C
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • known disease mutation at this position (HGMD CM010980) • known disease mutation at this position (HGMD CM010981) • protein features (might be) affected (STRAND: 339-342, lost)
- F392S - T1175C
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • listed as SNP • protein features (might be) affected (HELIX: 392-403, lost) • splice site changes (Donor increased)
- P416Q - C1247A
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • known disease mutation at this position (HGMD CM090791) • protein features (might be) affected (TURN: 416-419, lost) • splice site changes (Acc marginally increased, donor increased, donor gained)
- T266A - A796C
Prediction: disease causing, Model: simple_aae, prob: ~1 Summary • amino acid sequence changed • listed as SNP • protein features (might be) affected • splice site changes (Acc increased, acc gained, donor increased)