Difference between revisions of "Canavan Disease"

From Bioinformatikpedia
Line 1: Line 1:
  +
== Secondary Structure ==
[[Image:ASPA.jpg|thumb|450px|Crystal structure of aspartoacylase (source: PDB)]]
 
  +
To decide which approach to follow, we examined the proposed run combinations for ReProf, comparing prediction from the FASTA sequence alone with prediction from a PSSM generated by PSI-Blast. The PSSM-based ReProf predictions were further divided into those using a PSSM built against big_80 and those using a PSSM built against SwissProt. For further comparison we also ran a secondary structure prediction with PSI-Pred and a secondary structure assignment with DSSP. As DSSP assigns the secondary structure from the atom coordinates stored in the PDB, we assume that the DSSP assignment can serve as the "true secondary structure" and evaluate the prediction methods against it as reference. The evaluation raised some problems that we had to deal with. First, the PDB entry of ACY2 treats the protein as a homodimer, although it only exists in that form when crystallized. To compare the prediction methods with DSSP, the DSSP output therefore had to be checked and only the part of the assignment covering one monomer could be used. In addition, the beginning and the end of the DSSP assignment had to be padded with "no secondary structure assigned" symbols to stretch the assignment to the full length of the protein. The final statistics for the secondary structure prediction of aspartoacylase (P45381|ACY2_HUMAN) are displayed in <xr id="ACY2_statistics"> Table </xr>.
'''Canavan Disease''' ([http://apps.who.int/classifications/icd10/browse/2010/en#/E75.2 ICD-10 E75.2]) is an autosomal recessive disorder in which a dysfunctional enzyme causes severe brain damage. It is also known under a variety of other names describing the chemical basis or phenotype of the disease. Examples are "Spongy Degeneration Of Central Nervous System", "Aspartoacylase (ASPA) Deficiency", or "Aminoacylase 2 (ACY2) Deficiency"[[http://omim.org/entry/271900]]. The trivial name, Canavan Disease, stems from the name of Myrtelle Canavan (1879 – 1953)[[http://en.wikipedia.org/wiki/Myrtelle_Canavan]], an American physician who first described the disease in 1931.
 
There is no cure and almost all patients die within the first decade of their life. The mild / juvenile type is less severe. Treatment is supportive and based on the symptoms.
 
   
  +
<figtable id="ACY2_statistics">
== Inheritance ==
 
  +
{| border="1" cellpadding="5" cellspacing="0" align="center"
Canavan Disease is an autosomal recessive genetic defect of the ASPA (aspartoacylase) gene on chromosome 17. With this pattern of inheritance, a newborn of a couple where both parents are carriers of the defective gene has a 25% chance of being neither affected by Canavan Disease nor a carrier. For some time children of Ashkenazi Jewish ancestry had a higher prevalence of Canavan Disease, while in recent years this prevalence has been decreasing due to ongoing prenatal screening programs. Other groups in which Canavan Disease has a higher prevalence are, for example, populations of Saudi Arabian ancestry. <br>
 
  +
|-
According to [http://ghr.nlm.nih.gov/condition/canavan-disease ''Genetics Home Reference''], about one in 6400 to 13500 people of Ashkenazi Jewish ancestry is affected. We found no further information about the prevalence in other populations. However, the populations also differ in the frequencies of the underlying mutations. For further information see section [https://i12r-studfilesrv.informatik.tu-muenchen.de/wiki/index.php/Canavan_Disease#Disease_Causing_Mutations ''Disease Causing Mutations''].
 
  +
! colspan="13" style="background:#87CEFA;" | Secondary Structure Prediction Statistics for ACY2
  +
|-
  +
! style="background:#BFBFBF;" align="center" |
  +
! colspan="3" style="background:#BFBFBF;" | Precision
  +
! colspan="3" style="background:#BFBFBF;" | Recall
  +
! colspan="3" style="background:#BFBFBF;" | F-Measure
  +
|-
  +
! style="background:#E5E5E5;" align="center" | Type
  +
! style="background:#E5E5E5;" align="center" | H
  +
! style="background:#E5E5E5;" align="center" | E
  +
! style="background:#E5E5E5;" align="center" | L
  +
! style="background:#E5E5E5;" align="center" | H
  +
! style="background:#E5E5E5;" align="center" | E
  +
! style="background:#E5E5E5;" align="center" | L
  +
! style="background:#E5E5E5;" align="center" | H
  +
! style="background:#E5E5E5;" align="center" | E
  +
! style="background:#E5E5E5;" align="center" | L
  +
|-
  +
| ReProf (FASTA) ||0.773||0.822||0.562||0.829||0.446||0.808||0.800||0.578||0.663
  +
|-
  +
| ReProf (big_80) ||0.878||0.889||0.644||0.793||0.675||0.890||0.833||0.767||0.747
  +
|-
  +
| ReProf (SwissProt) ||0.853||0.937||0.620||0.780||0.711||0.849||0.815||0.809||0.717
  +
|-
  +
| Psi-Pred ||0.914||0.970||0.647||0.780||0.771||0.904||0.842||0.859||0.754
  +
|-
  +
|}
  +
<center><small><caption> Statistical overview of Precision, Recall and F-Measure for the prediction tools used, with DSSP as reference. H = Helix, E = Beta-Strand, L = Loop. Psi-Pred shows the best performance for ACY2. ReProf with a PSSM created by Psi-Blast using big_80 as database performs second best but greatly outperforms Psi-Pred in terms of speed (not shown; ReProf run locally, Psi-Pred run on the official webserver). </caption></small></center>
  +
</figtable>
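The per-state scores in the table can be reproduced with a few lines of Python. The following is a minimal sketch, not the script actually used for this task: it assumes that the DSSP reference and the prediction are already available as equal-length strings over the three symbols H, E and L, and the function name <code>per_state_scores</code> is hypothetical.

```python
from collections import Counter

STATES = ("H", "E", "L")  # helix, strand, loop (three-state alphabet assumed)


def per_state_scores(reference: str, predicted: str) -> dict:
    """Return {state: (precision, recall, f_measure)} with DSSP as reference."""
    if len(reference) != len(predicted):
        raise ValueError("reference and prediction must have equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for ref, pred in zip(reference, predicted):
        if ref == pred:
            tp[ref] += 1   # residue predicted in its reference state
        else:
            fp[pred] += 1  # state predicted where the reference disagrees
            fn[ref] += 1   # reference state that was missed
    scores = {}
    for state in STATES:
        p_den = tp[state] + fp[state]
        r_den = tp[state] + fn[state]
        precision = tp[state] / p_den if p_den else 0.0
        recall = tp[state] / r_den if r_den else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[state] = (precision, recall, f_measure)
    return scores


# Toy strings, not real ACY2 data.
example = per_state_scores("HHHHEELL", "HHHLEELL")
```

The F-Measure here is the harmonic mean of Precision and Recall, which is consistent with the table: for ReProf (FASTA) helices, 2 · 0.773 · 0.829 / (0.773 + 0.829) ≈ 0.800.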
   
  +
As Psi-Pred predictions run via the official webserver take much more time than running ReProf locally in the student lab, we decided to continue with ReProf. More specifically, ReProf with a position-specific scoring matrix derived from big_80 was chosen (PSSM created with Psi-Blast, cut-off e-10 and 3 iterations). However, out of curiosity, PSI-Pred predictions for the remaining proteins were run in addition to the ReProf predictions.
== Phenotype ==
 
   
  +
During the mapping of UniProt IDs to PDB IDs some complications arose, as not all structures found contained the full sequence of the translated gene. The structures used for the DSSP assignment were chosen manually to ensure that the whole sequence is contained within the protein, at least as part of the whole PDB entry. Again, some modifications had to be made to ensure that the DSSP assignment has the same length as the predictions by ReProf and PSI-Pred. For example, Q08209 maps to 1AUI chain A, which covers most of the translated gene; however, parts of 1AUI could not be crystallized and their atom coordinates are missing from the PDB file (374 - 468). As a result those positions are absent from the DSSP assignment as well and had to be filled with "no structure" symbols. After dealing with these complications, Precision, Recall and F-measure were calculated in the same manner as for the choice of the preferred prediction method. An overview of the prediction statistics with the DSSP assignment as reference can be seen in <xr id="additional_statistics"> Table </xr>.
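The gap filling described above can be sketched as follows. This is an illustration under assumptions, not the procedure actually used: the helper name <code>pad_assignment</code> and the fill symbol <code>-</code> are hypothetical, and the residue numbers of the DSSP assignment are assumed to match the positions of the full-length sequence.

```python
def pad_assignment(observed: dict, length: int, fill: str = "-") -> str:
    """Expand a sparse DSSP assignment to the full sequence length.

    observed maps 1-based residue numbers to state symbols; positions
    without coordinates (and therefore without a DSSP state) get `fill`.
    """
    return "".join(observed.get(pos, fill) for pos in range(1, length + 1))


# Toy example: residues 1, 4 and 6 have no coordinates.
padded = pad_assignment({2: "H", 3: "H", 5: "E"}, length=6)
```

The padded string has the same length as the ReProf and PSI-Pred predictions and can be compared to them position by position.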
Canavan Disease has a variety of different phenotypes all over the body.
 
Here is a short overview:
 
* Head
 
** macrocephaly (increased head circumference)
 
** mental retardation and impairment (losing mental skills)
 
** losing ability to move head
 
* Eyes
 
** becoming blind
 
** nystagmus (Greek: νυσταζω ''nystazo'' "sleep, nod"; German: "Augenzittern")
 
* Ears
 
** becoming deaf
 
* Mouth
 
** problems with swallowing
 
** losing communicational abilities (cannot talk, stay quiet)
 
* Body
 
** paralysis
 
** seizures
 
** problems moving the muscles
 
   
  +
<figtable id="additional_statistics">
Children suffering from Canavan Disease usually die within the first decade.
 
  +
{| border="1" cellpadding="5" cellspacing="0" align="center"
In the mild/juvenile form of Canavan Disease, the children usually have some developmental delay and some speech problems.
 
  +
|-
  +
! colspan="14" style="background:#87CEFA;" | Secondary Structure Prediction Statistics for P10775, Q08209, Q9X0E6
  +
|-
  +
! colspan="2" style="background:#BFBFBF;" align="center" |
  +
! colspan="3" style="background:#BFBFBF;" | Precision
  +
! colspan="3" style="background:#BFBFBF;" | Recall
  +
! colspan="3" style="background:#BFBFBF;" | F-Measure
  +
|-
  +
! style="background:#E5E5E5;" align="center" | Protein
  +
! style="background:#E5E5E5;" align="center" | Type
  +
! style="background:#E5E5E5;" align="center" | H
  +
! style="background:#E5E5E5;" align="center" | E
  +
! style="background:#E5E5E5;" align="center" | L
  +
! style="background:#E5E5E5;" align="center" | H
  +
! style="background:#E5E5E5;" align="center" | E
  +
! style="background:#E5E5E5;" align="center" | L
  +
! style="background:#E5E5E5;" align="center" | H
  +
! style="background:#E5E5E5;" align="center" | E
  +
! style="background:#E5E5E5;" align="center" | L
  +
|-
  +
! rowspan="2" | P10775 (1DFJ_I)
  +
| ReProf ||0.974||0.959||0.793||0.945||0.855||0.912||0.959||0.904||0.848
  +
|-
  +
| Psi-Pred ||0.976||0.980||0.630||0.814||0.873||0.938||0.888||0.923||0.754
  +
|-
  +
! rowspan="2" | Q08209 (1AUI_A)
  +
| ReProf ||0.957||0.842||0.658||0.780||0.787||0.878||0.859||0.814||0.752
  +
|-
  +
| Psi-Pred ||0.895||0.971||0.594||0.723||0.557||0.944||0.800||0.708||0.729
  +
|-
  +
! rowspan="2" | Q9X0E6 (1O5J)
  +
| ReProf ||0.973||0.971||0.526||0.947||0.829||0.833||0.960||0.894||0.645
  +
|-
  +
| Psi-Pred ||1.000||1.000||0.600||0.947||0.854||1.000||0.973||0.921||0.750
  +
|-
  +
|}
  +
<center><small><caption> Statistical overview of Precision, Recall and F-Measure for the prediction tools used, with DSSP as reference. H = Helix, E = Beta-Strand, L = Loop. For P10775 (1DFJ chain I) and Q08209 (1AUI chain A) ReProf clearly shows the better performance. Psi-Pred shows better performance for Q9X0E6.</caption></small></center>
  +
</figtable>
   
   
== Disease mechanism ==
+
== Disorder ==
   
[[Image:Canavan disease pathway KEGG.png|thumb|750px|Alanine, Aspartate and Glutamate Metabolism (source: KEGG) highlighting disease associated enzymes]]
 
Canavan Disease belongs to the group of leukodystrophies. The name comes from Greek: λευκος ''leukos'' "white", δυς ''dys'' "bad, wrong", τροφη ''trophae'' "feeding, growth". A leukodystrophy is a genetically induced metabolic disorder that affects the white matter of the nervous system. If the white matter does not grow properly, the myelin, which surrounds the nerve cells for protection, is degraded. This is especially true for Canavan Disease. The visible phenotypes are the result of a genetic defect that negatively affects the growth of the myelin sheath covering the nerve fibres. An improperly built myelin sheath results in a reduced ability to transmit the electric signal along the nerve fibres, eventually in the complete loss of that ability, and finally in the degradation of whole nerve cells. <br>
 
The cause of the malfunctioning myelin sheath growth is a genetic defect of the aspartoacylase (ASPA) gene. The product of the gene, the enzyme aspartoacylase, is crucial for the degradation of N-acetyl-L-aspartate (NAA), which is present at much higher levels than normal in patients suffering from Canavan Disease. Normally ASPA degrades NAA into smaller fragments that are prerequisites for the production of the myelin sheath. Therefore the missing / defective ASPA is the reason for the defective generation of myelin. As a consequence of the degradation of the nerve cells / white brain matter, empty spaces arise and fill with brain fluid, leading to even more degradation of nerve cells and further signal transduction problems.
 
   
== Diagnosis ==
+
== Transmembrane Helices ==
   
  +
As specified in the task, the transmembrane helices and topology of the three given proteins plus ACY2 were predicted with PolyPhobius and MEMSAT-SVM. As running the prediction with MEMSAT-SVM automatically returned the prediction results of MEMSAT-3 too, these data were included in the comparison of the results as well.
There are several ways and points in time at which an affected patient can be diagnosed with Canavan Disease. The time points are prenatal, postnatal, and when a mild or juvenile form of Canavan Disease is already present. Nevertheless, one of the most important things to know beforehand is whether both parents carry one copy of the disease-causing gene. This can be determined by simple DNA testing.
 
   
==== Prenatal Diagnosis ====
+
===P45381===
   
  +
ACY2 (P45381) is located in the cytoplasm and is not bound to the cell membrane, so it should be safe to expect that none of the prediction methods predicts a transmembrane helix. However, PolyPhobius was the only method that predicted none. MEMSAT-3 predicted a helix from amino acid position 60 to 78, even though the score is negative. MEMSAT-SVM predicted a helix ranging from amino acid 114 to 129, again with a negative score. As MEMSAT seems to test all possible numbers of helices in the protein from 1 to n, without testing the possibility of 0, it could be hypothesized that MEMSAT always returns a transmembrane helix prediction even if the score is negative.
There are several types of prenatal testing possibilities depending whether the carrier status of both parents is known or not. For couples where it is only known that one of the parents is a carrier and the remaining parent’s status is not known, normally testing is done by measuring the concentration of N-acetyl-L-aspartic acid (NAA) in the amniotic fluid within the time between the 16th and 18th week of pregnancy.
 
Another possibility is molecular genetic testing, in which DNA extracted from fetal cells is analysed. These fetal cells are obtained either between the 10th and 12th week of pregnancy by chorionic villus sampling (chorionic villi are “proto-”placental tissue with the same genetic material as the fetus) or between the 15th and 18th week by amniocentesis, also known as amniotic fluid testing (AFT). However, for molecular genetic testing the disease-causing mutations of both parents have to be identified first.
 
   
  +
===P35462===
==== Neonatal / Infantile Diagnosis ====
 
  +
P35462 (PDB:3PBL), a human dopamine receptor, is a 7-helical transmembrane protein. The transmembrane helices were predicted with MEMSAT-SVM, MEMSAT-3 and PolyPhobius. Interestingly, MEMSAT-SVM did not predict the correct number of helices, stopping after the sixth one. MEMSAT-3 correctly predicted seven helices despite being claimed to have less predictive power. PolyPhobius achieved the best prediction for this protein, correctly predicting all 7 helices and placing the helix borders more precisely than MEMSAT. The exact numbers can be found in <xr id="P35462_tmhs"> Table </xr>.
   
  +
<figtable id="P35462_tmhs">
Postnatal testing for Canavan Disease can as well be done in several ways. One possibility is to test for a raised N-acetyl-L-aspartic acid (NAA) concentration in urine, blood and cerebrospinal fluid (CSF) (comparable to prenatal testing with the carrier status of one parent not known).
 
  +
{| border="1" cellpadding="5" cellspacing="0" align="center"
Other possibilities are cultivating skin fibroblasts and testing them for reduced aspartoacylase activity, performing neuroimaging of the brain to look for spongy degeneration, or testing the gene itself for a defect in the newborn child. However, it takes between three and nine months after birth until most of the symptoms become apparent.
 
  +
|-
  +
! colspan="8" style="background:#87CEFA;" | Predicted Transmembrane Helices for P35462
  +
|-
  +
! colspan="1" style="background:#BFBFBF;" align="center" |
  +
! colspan="7" style="background:#BFBFBF;" | Helix Positions
  +
|-
  +
! style="background:#E5E5E5;" align="center" | Method
  +
! style="background:#E5E5E5;" align="center" | #1
  +
! style="background:#E5E5E5;" align="center" | #2
  +
! style="background:#E5E5E5;" align="center" | #3
  +
! style="background:#E5E5E5;" align="center" | #4
  +
! style="background:#E5E5E5;" align="center" | #5
  +
! style="background:#E5E5E5;" align="center" | #6
  +
! style="background:#E5E5E5;" align="center" | #7
  +
|-
  +
| UniProt ||33-55||66-88||105-126||150-170||188-212||330-351||367-388
  +
|-
  +
| PolyPhobius ||30-55||66-88||105-126||150-170||188-212||329-352||367-386
  +
|-
  +
| MEMSAT-SVM ||32-55||65-88||101-129||151-169||188-209||331-354||no prediction
  +
|-
  +
| MEMSAT-3 ||31-55||67-91||102-126||148-167||189-213||327-350||365-383
  +
|-
  +
|}
  +
<center><small><caption> Overview of the predicted transmembrane helices for P35462 compared to the annotation in UniProt </caption></small></center>
  +
</figtable>
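How far the predicted helix borders deviate from the annotation can be quantified directly from the table. The sketch below is only an illustration: the helper names are hypothetical, and helices are paired in table order, which is an assumption that only holds when no helix is skipped or fused (as for PolyPhobius on P35462).

```python
def parse_ranges(cells):
    """Turn table cells like '33-55' into (start, end) integer tuples."""
    return [tuple(int(x) for x in cell.split("-")) for cell in cells]


def boundary_offsets(reference, predicted):
    """Pair helices in order and return (start_diff, end_diff) per helix."""
    return [(p[0] - r[0], p[1] - r[1]) for r, p in zip(reference, predicted)]


# Values copied from the UniProt and PolyPhobius rows of the table above.
uniprot = parse_ranges(
    ["33-55", "66-88", "105-126", "150-170", "188-212", "330-351", "367-388"]
)
polyphobius = parse_ranges(
    ["30-55", "66-88", "105-126", "150-170", "188-212", "329-352", "367-386"]
)

offsets = boundary_offsets(uniprot, polyphobius)
# Mean absolute deviation per helix border, in residues.
mean_abs = sum(abs(s) + abs(e) for s, e in offsets) / (2 * len(offsets))
```

For PolyPhobius this gives a mean deviation of 0.5 residues per border, with four of the seven helices matching the UniProt annotation exactly.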
   
  +
'''Additional information:'''
==== Mild / Juvenile Diagnosis ====
 
   
  +
* UniProt entry: [http://www.uniprot.org/uniprot/P35462 P35462]
Diagnosing a patient who suffers from a mild or juvenile form of Canavan Disease is more challenging, as the postnatal diagnosis methods, except testing the gene itself, may not yield a satisfactory result or may even overlook the disease completely. The concentration of NAA may be elevated only slightly, not significantly enough for a proper diagnosis. The same is true for the results of neuroimaging and for the mild developmental delay caused by Canavan Disease, which can simply go unrecognized.
 
  +
* OPM entry: [http://opm.phar.umich.edu/protein.php?search=3PBL 3PBL]
  +
* PDBTM entry: [http://pdbtm.enzim.hu/?_=/pdbtm/3pbl 3PBL]
   
  +
===Q9YDF8===
  +
Q9YDF8 (PDB:1ORQ/1ORS/2A0L/2KYH), a crucial component of potassium channels, is a 7-helical transmembrane protein. The transmembrane helices were predicted with MEMSAT-SVM, MEMSAT-3 and PolyPhobius. In this case only PolyPhobius correctly predicted the number of helices; both MEMSAT-3 and MEMSAT-SVM predicted only six. Additionally, all three tools had great problems predicting the right borders. PolyPhobius seems to have skipped the third helix annotated in SwissProt, completely mispredicted the borders of the fifth helix (the fourth helix predicted), and predicted a (sixth) helix where the actual protein has an intramembrane element at amino acid positions 196 to 208. MEMSAT-SVM and MEMSAT-3, although falsely predicting six transmembrane helices, are closer to the SwissProt annotation in terms of helix borders, except for the third helix, where MEMSAT seems to have fused the third and fourth annotated helices. The exact numbers can be found in <xr id="Q9YDF8_tmhs"> Table </xr>.
   
  +
<figtable id="Q9YDF8_tmhs">
== Treatment ==
 
  +
{| border="1" cellpadding="5" cellspacing="0" align="center"
  +
|-
  +
! colspan="8" style="background:#87CEFA;" | Predicted Transmembrane Helices for Q9YDF8
  +
|-
  +
! colspan="1" style="background:#BFBFBF;" align="center" |
  +
! colspan="7" style="background:#BFBFBF;" | Helix Positions
  +
|-
  +
! style="background:#E5E5E5;" align="center" | Method
  +
! style="background:#E5E5E5;" align="center" | #1
  +
! style="background:#E5E5E5;" align="center" | #2
  +
! style="background:#E5E5E5;" align="center" | #3
  +
! style="background:#E5E5E5;" align="center" | #4
  +
! style="background:#E5E5E5;" align="center" | #5
  +
! style="background:#E5E5E5;" align="center" | #6
  +
! style="background:#E5E5E5;" align="center" | #7
  +
|-
  +
| UniProt ||39-63||68-92||97-105||109-125||129-145||160-184||222-253
  +
|-
  +
| PolyPhobius ||42-60||68-88||108-129||137-157||163-184||196-213||224-244
  +
|-
  +
| MEMSAT-SVM ||43-59||72-90||101-118||128-143||163-184||221-245||no prediction
  +
|-
  +
| MEMSAT-3 ||38-60||66-90||100-119||122-141||161-184||218-242||no prediction
  +
|-
  +
|}
  +
<center><small><caption> Overview of the predicted transmembrane helices for Q9YDF8 compared to the annotation in UniProt </caption></small></center>
  +
</figtable>
   
  +
'''Additional information:'''
Right now there is no cure for Canavan Disease, but there are treatments depending on the symptoms, which work in a supportive manner.
 
   
  +
* UniProt entry: [http://www.uniprot.org/uniprot/Q9YDF8 Q9YDF8]
==== Prenatal Treatment ====
 
  +
* OPM entry: not clear
  +
* PDBTM entry: see OPM
   
  +
===P47863===
There is a possibility of prenatal screening to check whether or not you are a carrier of the disease (as described in the section before). Other prenatal treatments are under investigation and depend on animal models.
 
  +
P47863 (PDB:2D57), a rat aquaporin, is a 6-helical transmembrane protein. The transmembrane helices were predicted with MEMSAT-SVM, MEMSAT-3 and PolyPhobius. In this case every prediction tool correctly predicted the number of helices. PolyPhobius and MEMSAT-SVM were slightly off in predicting the borders of the helices, whereas the claimed inferiority of MEMSAT-3 compared to MEMSAT-SVM can clearly be seen here in its less precise border prediction. The exact numbers can be found in <xr id="P47863_tmhs"> Table </xr>.
   
  +
<figtable id="P47863_tmhs">
==== Neonatal / Infantile Treatment ====
 
  +
{| border="1" cellpadding="5" cellspacing="0" align="center"
  +
|-
  +
! colspan="7" style="background:#87CEFA;" | Predicted Transmembrane Helices for P47863
  +
|-
  +
! colspan="1" style="background:#BFBFBF;" align="center" |
  +
! colspan="6" style="background:#BFBFBF;" | Helix Positions
  +
|-
  +
! style="background:#E5E5E5;" align="center" | Method
  +
! style="background:#E5E5E5;" align="center" | #1
  +
! style="background:#E5E5E5;" align="center" | #2
  +
! style="background:#E5E5E5;" align="center" | #3
  +
! style="background:#E5E5E5;" align="center" | #4
  +
! style="background:#E5E5E5;" align="center" | #5
  +
! style="background:#E5E5E5;" align="center" | #6
  +
|-
  +
| UniProt ||37-57||65-85||116-136||156-176||185-205||232-252
  +
|-
  +
| PolyPhobius ||34-58||70-91||115-136||156-177||188-208||231-252
  +
|-
  +
| MEMSAT-SVM ||35-56||71-89||113-136||157-178||190-205||232-252
  +
|-
  +
| MEMSAT-3 ||35-59||71-95||117-141||157-180||187-206||240-264
  +
|-
  +
|}
  +
<center><small><caption> Overview of the predicted transmembrane helices for P47863 compared to the annotation in UniProt </caption></small></center>
  +
</figtable>
   
  +
'''Additional information:'''
Since Canavan Disease also affects the metabolism, nutrition and hydration need to be controlled. This includes specialized food to make up for missing metabolites and nutrients as well as different ways of feeding the child to prevent problems arising from swallowing difficulties and other physical disabilities. Physical therapy is recommended to improve those physical disabilities and muscle problems. Additionally, antiepileptic drugs are available against seizures and spasticity.
 
   
  +
* UniProt entry: [http://www.uniprot.org/uniprot/P47863 P47863]
==== Mild / Juvenile Treatment ====
 
  +
* OPM entry: [http://opm.phar.umich.edu/protein.php?search=2D57 2D57]
  +
* PDBTM entry: [http://pdbtm.enzim.hu/?_=/pdbtm/2d57 2D57]
   
  +
== Signal Peptides ==
Since mild and juvenile Canavan patients only show some delay in development and speech, speech therapy may be useful. More extensive medical care is not necessary.
 
   
  +
For the prediction of signal peptides SignalP version 4.1 (webserver) was used.
== Future Work ==
 
   
  +
===P02768===
Clinical trials and animal models are under investigation to find a cure for Canavan Disease.
 
   
  +
Serum albumin (P02768) is one of the main components of blood plasma. As it has to be secreted into the blood vessels, it can be expected that P02768 has motifs that are crucial for delivery along the secretory pathway and therefore contains a signal peptide sequence. This is exactly what the signal peptide prediction by SignalP shows. SignalP predicts that P02768 has a signal peptide sequence and that a cleavage site exists between amino acid positions 18 and 19. In the plot created by SignalP v4.1 (see <xr id="P02768_signalp"> Figure</xr>) this clear signal at position 19 (0.710) can be observed.
==== Gene Therapy ====
 
   
  +
<figure id="P02768_signalp">
There have been several gene therapy studies using viral and nonviral vectors to transfer genes that were thought to improve the course of the disease into the patients. However, none of the children showed an improvement and the disease developed similarly to that of an untreated patient.
 
  +
[[Image:P02768 signalp.png|centre]]
  +
<center><small><caption> Plot displaying the scores (C = cleavage, S = signal peptide, Y = combined) predicted for each amino acid by SignalP v4.1 for P02768. A clear spike for the cleavage site at position 19 can be seen, as well as high signal peptide scores for the first 18 amino acids. (Image Source: Maple Sirup Urine Disease Group to prevent file duplicates in the wiki)</caption></small></center>
  +
</figure>
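A cleavage site "between positions 18 and 19" means that residues 1 to 18 form the signal peptide and the mature chain starts at residue 19. Because SignalP reports 1-based positions while Python slices are 0-based, the split is easy to get wrong by one. The following sketch uses a hypothetical helper and a made-up toy sequence (not the albumin sequence) to illustrate the indexing:

```python
def split_at_cleavage(sequence: str, last_signal_residue: int):
    """Split a protein sequence after the given 1-based position.

    Returns (signal_peptide, mature_chain). For a cleavage site between
    positions 18 and 19, pass last_signal_residue=18.
    """
    return sequence[:last_signal_residue], sequence[last_signal_residue:]


# Toy sequence with an assumed cleavage site between positions 5 and 6.
signal, mature = split_at_cleavage("MKKLLAAAGGSTDEF", 5)
```

Here <code>signal</code> holds the first five residues and <code>mature</code> starts with the sixth.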
   
  +
'''Additional information:'''
==== Lithium Citrate as Pharmaceutical ====
 
   
  +
* UniProt entry: [http://www.uniprot.org/uniprot/P02768 P02768]
Since an elevated level of N-acetyl-L-aspartate (NAA) is an important factor in the biochemical background of Canavan Disease, lithium citrate, which may be able to reduce the NAA concentration, is a candidate drug. Rat models have shown that treatment with lithium citrate results in a reduced level of NAA. The same effect can be observed when the drug is administered to a human, with a return to elevated NAA concentrations once the lithium citrate has been washed out of the body after roughly 2 weeks. So far no larger controlled clinical studies have been conducted, but lithium citrate is a potential treatment that is worth pursuing.
 
  +
* Signal Peptide Database Entry: [http://www.signalpeptide.de/index.php?sess=&m=myprotein&s=details&id=22229&listname= ALBU_HUMAN]
   
  +
===P47863===
==== Animal Models ====
 
   
  +
As we know from the transmembrane helix prediction task, P47863 is an aquaporin that is located within the membrane. The prediction by SignalP shows that neither a signal peptide sequence nor a cleavage site can be detected. Detailed graphical output can be seen in <xr id="P47863_signalp">Figure</xr>.
Several gene models in knockout mice and rats have been studied, with lithium citrate and an enzyme replacement therapy showing the best result so far and therefore being the most promising at the moment.
 
   
  +
<figure id="P47863_signalp">
  +
[[Image:P47863 signalp.png|centre]]
  +
<center><small><caption> Plot displaying the scores (C = cleavage, S = signal peptide, Y = combined) predicted for each amino acid by SignalP v4.1 for P47863. Neither a spike in the C-score nor high S-scores can be seen, therefore no signal peptide sequence and no cleavage site are predicted by SignalP. (Image Source: Maple Sirup Urine Disease Group to prevent file duplicates in the wiki)</caption></small></center>
  +
</figure>
   
  +
'''Additional information:'''
== Aspartoacylase (ASPA) ==
 
[[Image:NAA hydrolyzation.gif|thumb|450px|The hydrolysis of N-acetyl-L-aspartate (C01042), catalyzed by ASPA, to acetate (C00033) and aspartate (C00049) (source: KEGG)]]
 
==== Summary ====
 
Aspartoacylase is the enzyme that hydrolyses N-acetyl-L-aspartate into acetate and L-aspartate, which are essential for the build-up of the myelin sheath. Crystallized ASPA exists as a homodimer; however, it is assumed that the in-vivo form works only as a monomer. The active site of ASPA contains a zinc atom, which acts as a catalyst in the hydrolysis and is only accessible through a channel-like surface fold of the protein. This channel-like structure serves two purposes. On the one hand it prevents polypeptides from entering and binding at the active site, so ASPA does not function as a protease. On the other hand, and more importantly, it is assumed that the positive electrostatic potential present on the channel serves as a transport mechanism that carries the negatively charged substrate (NAA) to the hydrolysing site. Furthermore, the binding pocket is highly specific for N-acetyl-L-aspartate, with a far lower hydrolysing activity towards other N-acetyl amino acids such as N-acetylglutamate.
 
   
  +
* UniProt entry: [http://www.uniprot.org/uniprot/P47863 P47863]
==== Gene, Mutations ====
 
   
  +
===P11279===
The ASPA gene sits on chromosome 17 on the p-arm (upper part, short arm) band 1 subband 3 subsubband 2 (short 17p13.2).
 
[[Image:ASPA gene location.png|thumb|centre|750px|Chromosome 17 with highlighted position of ASPA-gene (source: http://www.genecards.org/cgi-bin/carddisp.pl?gene=ASPA)]]
 
   
  +
LAMP-1 (Lysosome-associated membrane glycoprotein 1 | P11279) is a membrane protein. It plays an important role in autophagy and is associated with tumor metastasis. It has one transmembrane helix, which could serve as some sort of protein anchor. The signal peptide prediction by SignalP reveals that LAMP-1 has a predicted signal peptide sequence and a cleavage site between amino acids 28 and 29. This is consistent with the information stored in the Signal Peptide Database [http://www.signalpeptide.de/index.php?sess=&m=myprotein&s=details&id=17551&listname=]. A detailed graphical output of the SignalP prediction is displayed in <xr id="P11279_signalp"> Figure</xr>.
===== Reference sequence =====
 
*[[ASPA#Genomic Sequence|Reference sequence (genomic) of ASPA]]
 
*[[ASPA#Protein Sequence|Reference sequence (protein) of ASPA]]
 
===== Neutral Mutations =====
 
===== Disease Causing Mutations =====
 
The disease causing mutations can be found in the image below. Also very interesting is the frequency across the different populations.
 
[[Image:ASPAGeneMutations.png|thumb|centre|750px|Disease causing Mutations in Canavan Disease (source: http://www.ncbi.nlm.nih.gov/books/NBK1234/)]]
 
[[Image:AllelicVariantsASPA.png|thumb|centre|500px|Disease causing Mutations in Canavan Disease (source: http://www.ncbi.nlm.nih.gov/books/NBK1234/)]]
 
   
  +
<figure id="P11279_signalp">
== Tasks ==
 
  +
[[Image:P11279 signalp.png|centre]]
* Link to Task 02: [[Canavan_Disease:_Task_02_-_Alignments|Alignments]]
 
  +
<center><small><caption> Plot displaying the scores (C = cleavage, S = signal peptide, Y = combined) predicted for each amino acid by SignalP v4.1 for P11279. A clear spike for the cleavage site at position 29 can be seen, as well as high signal peptide scores for the first 28 amino acids. (Image Source: Maple Sirup Urine Disease Group to prevent file duplicates in the wiki)</caption></small></center>
* Link to Task 03: [[Canavan_Disease:_Task_03_-_Sequence-based_Predictions|Sequence-based Predictions]]
 
  +
</figure>
   
  +
'''Additional information:'''
== References ==
 
  +
The written text is based on a summary of different sources: <br>
 
  +
* UniProt entry: [http://www.uniprot.org/uniprot/P11279 P11279]
http://en.wikipedia.org/wiki/Myrtelle_Canavan <br>
 
  +
* Signal Peptide Database Entry: [http://www.signalpeptide.de/index.php?sess=&m=myprotein&s=details&id=17551&listname= LAMP1_HUMAN]
http://ghr.nlm.nih.gov/condition/canavan-disease <br>
 
  +
http://www.pnas.org/content/104/2/456.short <br>
 
  +
== GO-Terms ==
https://www.counsyl.com/diseases/canavan-disease/ <br>
 
  +
http://omim.org/entry/608034 <br>
 
  +
===GO-Pet===
http://omim.org/entry/271900 <br>
 
  +
The GO-Term prediction for aspartoacylase by GO-Pet (see <xr id="P45381_gopet"> Table</xr>) reflects the known biological processes.
http://www.uniprot.org/uniprot/P45381 <br>
 
  +
http://www.canavanfoundation.org <br>
 
  +
<figtable id="P45381_gopet">
http://www.canavandisease.net <br>
 
  +
{| border="1" cellpadding="5" cellspacing="0" align="center"
http://www.ncbi.nlm.nih.gov/books/NBK1234/ <br>
 
  +
|-
http://ghr.nlm.nih.gov/condition/canavan-disease <br>
 
  +
! colspan="7" style="background:#87CEFA;" | Predicted GOTerms for P45381 by GO-Pet
http://www.nlm.nih.gov/medlineplus/ency/article/001586.htm <br>
 
  +
|-
http://www.ninds.nih.gov/disorders/canavan/canavan.htm <br>
 
  +
! style="background:#BFBFBF;" align="center" | GO-ID
http://www.genome.jp/dbget-bin/www_bget?ds:H00074 <br>
 
  +
! style="background:#BFBFBF;" align="center" | GO-Term / Description
http://www.kegg.jp/kegg-bin/get_htext?htext=br08402.keg&query=canavan <br>
 
  +
! style="background:#BFBFBF;" align="center" | Confidence
http://rarediseases.info.nih.gov/gard/5984/canavan-disease/resources/1
 
  +
|-
  +
| GO:0016787 ||hydrolase activity||96%
  +
|-
  +
| GO:0004046 ||aminoacylase activity||82%
  +
|-
  +
| GO:0019807 ||aspartoacylase activity||82%
  +
|-
  +
| GO:0016788 ||hydrolase activity acting on ester bonds||81%
  +
|-
  +
|}
  +
<center><small><caption> Overview of the predicted GO-Terms for P45381 </caption></small></center>
  +
</figtable>

Revision as of 14:03, 2 June 2013

Secondary Structure

To decide which approach to follow, we examined the proposed run combinations for ReProf, comparing prediction from the FASTA sequence alone with prediction from a PSSM generated by PSI-Blast. The PSSM-based ReProf predictions were further divided into those using a PSSM built against big_80 and those using a PSSM built against SwissProt. For further comparison, we also ran a secondary structure prediction with PSI-Pred and obtained a secondary structure assignment from DSSP. Because DSSP assigns the secondary structure from the atom coordinates stored in the PDB, we treat the DSSP assignment as the "true secondary structure" and use it as the reference when comparing the performance of the prediction methods. During the evaluation we ran into some problems. First, the PDB entry of ACY2 describes the protein as a homodimer, although it only exists in that form when crystallized. To compute statistics between the prediction methods and DSSP, the DSSP output therefore had to be double-checked, and only one part of the assignment (the monomer) could be used. In addition, the beginning and the end of the DSSP assignment had to be padded with "no secondary structure assigned" symbols to stretch the DSSP assignment to the full length of the protein. The final statistics for the secondary structure prediction of Aspartoacylase (P45381|ACY2_HUMAN) are displayed in <xr id="ACY2_statistics"> Table </xr>.

<figtable id="ACY2_statistics">

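{| border="1" cellpadding="5" cellspacing="0" align="center"
|-
! colspan="10" style="background:#87CEFA;" | Secondary Structure Prediction Statistics for ACY2
|-
! rowspan="2" style="background:#BFBFBF;" align="center" | Type
! colspan="3" style="background:#BFBFBF;" align="center" | Precision
! colspan="3" style="background:#BFBFBF;" align="center" | Recall
! colspan="3" style="background:#BFBFBF;" align="center" | F-Measure
|-
! style="background:#BFBFBF;" align="center" | H
! style="background:#BFBFBF;" align="center" | E
! style="background:#BFBFBF;" align="center" | L
! style="background:#BFBFBF;" align="center" | H
! style="background:#BFBFBF;" align="center" | E
! style="background:#BFBFBF;" align="center" | L
! style="background:#BFBFBF;" align="center" | H
! style="background:#BFBFBF;" align="center" | E
! style="background:#BFBFBF;" align="center" | L
|-
| ReProf (FASTA) || 0.773 || 0.822 || 0.562 || 0.829 || 0.446 || 0.808 || 0.800 || 0.578 || 0.663
|-
| ReProf (big_80) || 0.878 || 0.889 || 0.644 || 0.793 || 0.675 || 0.890 || 0.833 || 0.767 || 0.747
|-
| ReProf (SwissProt) || 0.853 || 0.937 || 0.620 || 0.780 || 0.711 || 0.849 || 0.815 || 0.809 || 0.717
|-
| PSI-Pred || 0.914 || 0.970 || 0.647 || 0.780 || 0.771 || 0.904 || 0.842 || 0.859 || 0.754
|-
|}
<center><small><caption> Statistical overview of Precision, Recall and F-Measure for the prediction tools used, with DSSP as reference. H = Helix, E = Beta-Strand, L = Loop. PSI-Pred shows the best performance for ACY2. ReProf with a PSSM created by PSI-Blast using big_80 as database performs second best but greatly outperforms PSI-Pred in terms of speed (timings not shown; ReProf was run locally, PSI-Pred on the official webserver). </caption></small></center>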

</figtable>
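The per-class statistics in the table can be computed with a few lines of Python. The following is only a minimal sketch, assuming that the DSSP reference and the prediction are available as two equally long strings over the states H, E and L; the function name <code>per_class_scores</code> and the toy strings are illustrative and not taken from the original analysis.

```python
from typing import Dict


def per_class_scores(reference: str, predicted: str,
                     classes: str = "HEL") -> Dict[str, Dict[str, float]]:
    """Precision, recall and F-measure per state, with `reference` as truth."""
    if len(reference) != len(predicted):
        raise ValueError("reference and prediction must have the same length")
    scores = {}
    for c in classes:
        pairs = list(zip(reference, predicted))
        tp = sum(1 for r, p in pairs if r == c and p == c)  # correctly predicted c
        fp = sum(1 for r, p in pairs if r != c and p == c)  # predicted c, reference differs
        fn = sum(1 for r, p in pairs if r == c and p != c)  # reference c, missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        scores[c] = {"precision": precision, "recall": recall, "f": f_measure}
    return scores


# Toy example (made-up strings, not real DSSP or ReProf output).
toy = per_class_scores("HHHHEEELLL", "HHHLEEELLH")
print(toy["H"])
```

The F-measure used here is the harmonic mean of precision and recall, which is consistent with the values in the table (for example 2 · 0.773 · 0.829 / (0.773 + 0.829) ≈ 0.800 for ReProf (FASTA), state H).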

As PSI-Pred predictions run via the official webserver take much more time than running ReProf locally in the student lab, we decided to continue with ReProf. More specifically, ReProf with a position-specific scoring matrix derived from big_80 was chosen (PSSM created with PSI-Blast, cut-off e-10 and 3 iterations). However, out of curiosity, PSI-Pred predictions for the remaining proteins were run in addition to the ReProf predictions.

During the mapping of UniProt IDs to PDB IDs some complications arose, as not all structures that were found contained the full sequence of the translated gene. The structures used for the DSSP assignment were chosen manually to ensure that the whole sequence is contained within the protein, at least as part of the whole PDB entry. Additionally, some modifications had to be made again to ensure that the DSSP assignment has the same length as the predictions by ReProf and PSI-Pred. For example, Q08209 mapped to 1AUI chain A, which covers most of the translated gene; however, parts of 1AUI could not be crystallized and their atom coordinates are missing from the PDB file (374 - 468). As a result those positions are absent from the DSSP assignment as well and had to be filled with "no predicted structure". After dealing with these complications, Precision, Recall and F-Measure were calculated in the same manner as for the decision on the preferred prediction method. An overview of the prediction statistics with the DSSP assignment as reference can be seen in <xr id="additional_statistics"> Table </xr>.

<figtable id="additional_statistics">

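{| border="1" cellpadding="5" cellspacing="0" align="center"
|-
! colspan="11" style="background:#87CEFA;" | Secondary Structure Prediction Statistics for P10775, Q08209, Q9X0E6
|-
! rowspan="2" style="background:#BFBFBF;" align="center" | Protein
! rowspan="2" style="background:#BFBFBF;" align="center" | Type
! colspan="3" style="background:#BFBFBF;" align="center" | Precision
! colspan="3" style="background:#BFBFBF;" align="center" | Recall
! colspan="3" style="background:#BFBFBF;" align="center" | F-Measure
|-
! style="background:#BFBFBF;" align="center" | H
! style="background:#BFBFBF;" align="center" | E
! style="background:#BFBFBF;" align="center" | L
! style="background:#BFBFBF;" align="center" | H
! style="background:#BFBFBF;" align="center" | E
! style="background:#BFBFBF;" align="center" | L
! style="background:#BFBFBF;" align="center" | H
! style="background:#BFBFBF;" align="center" | E
! style="background:#BFBFBF;" align="center" | L
|-
| rowspan="2" | P10775 (1DFJ_I) || ReProf || 0.974 || 0.959 || 0.793 || 0.945 || 0.855 || 0.912 || 0.959 || 0.904 || 0.848
|-
| PSI-Pred || 0.976 || 0.980 || 0.630 || 0.814 || 0.873 || 0.938 || 0.888 || 0.923 || 0.754
|-
| rowspan="2" | Q08209 (1AUI_A) || ReProf || 0.957 || 0.842 || 0.658 || 0.780 || 0.787 || 0.878 || 0.859 || 0.814 || 0.752
|-
| PSI-Pred || 0.895 || 0.971 || 0.594 || 0.723 || 0.557 || 0.944 || 0.800 || 0.708 || 0.729
|-
| rowspan="2" | Q9X0E6 (1O5J) || ReProf || 0.973 || 0.971 || 0.526 || 0.947 || 0.829 || 0.833 || 0.960 || 0.894 || 0.645
|-
| PSI-Pred || 1.000 || 1.000 || 0.600 || 0.947 || 0.854 || 1.000 || 0.973 || 0.921 || 0.750
|-
|}
<center><small><caption> Statistical overview of Precision, Recall and F-Measure for the prediction tools used, with DSSP as reference. H = Helix, E = Beta-Strand, L = Loop. For P10775 (1DFJ chain I) and Q08209 (1AUI chain A) ReProf clearly shows the better performance. PSI-Pred shows better performance for Q9X0E6. </caption></small></center>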

</figtable>
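The gap filling described for Q08209 (residues 374 - 468 missing from 1AUI) can be sketched as follows. This is a hedged illustration, not the script used for the analysis: it assumes the DSSP output has already been parsed into a mapping from sequence position to DSSP state, it assumes a common eight-to-three state reduction (H, G, I to H; E, B to E; everything else to L) that the text does not confirm, and the fill symbol <code>-</code> is likewise an assumption. The names <code>reduce_state</code> and <code>full_length_assignment</code> are hypothetical.

```python
from typing import Dict

# Assumed eight-to-three state reduction (a common convention,
# not confirmed by the text): helices -> H, strands/bridges -> E.
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_state(dssp_code: str) -> str:
    """Map a DSSP state to H, E or L; unknown or blank codes become L."""
    return DSSP_TO_3.get(dssp_code, "L")


def full_length_assignment(assigned: Dict[int, str], length: int,
                           fill: str = "-") -> str:
    """Build a string of `length` states for positions 1..length.

    Positions missing from `assigned` (e.g. residues without atom
    coordinates) receive the `fill` symbol (an assumed placeholder).
    """
    return "".join(
        reduce_state(assigned[pos]) if pos in assigned else fill
        for pos in range(1, length + 1)
    )


# Toy example: positions 1, 5 and 7 have no DSSP assignment.
print(full_length_assignment({2: "H", 3: "G", 4: "T", 6: "B"}, 7))
```

The resulting string has the same length as the ReProf and PSI-Pred predictions, so the per-class statistics can be computed position by position.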


Disorder

Transmembrane Helices

As required by the task, the transmembrane helices and topology of the three given proteins plus ACY2 were predicted with PolyPhobius and MEMSAT-SVM. As running the prediction with MEMSAT-SVM automatically returned the results for MEMSAT-3 too, these data were included in the comparison as well.

P45381

ACY2 (P45381) is a protein that is located in the cytoplasm and not bound to the cell membrane, so it should be safe to expect that none of the prediction methods predicts a transmembrane helix. However, PolyPhobius was the only tool that indeed predicted none. MEMSAT-3 predicted a helix from amino acid position 60 to 78, even though the score is negative. MEMSAT-SVM predicted a helix ranging from amino acid 114 to 129, again with a negative score. As MEMSAT seems to test all possible numbers of helices in the protein from 1 to n, but not 0, it could be hypothesized that MEMSAT always returns a prediction for a transmembrane helix, even if the score is negative.

P35462

P35462 (PDB:3PBL), a human dopamine receptor, is a 7-helical-transmembrane protein. The transmembrane helices were predicted with MEMSAT (SVM and 3) and PolyPhobius. Interestingly, MEMSAT-SVM did not predict the correct number of helices, stopping after the sixth one. MEMSAT-3 correctly predicted seven helices despite being claimed to have less predictive power. PolyPhobius achieved the best prediction for this protein, having correctly predicted all 7 helices and having predicted the helix borders more precisely than MEMSAT. The exact numbers can be found in <xr id="P35462_tmhs"> Table </xr>.

<figtable id="P35462_tmhs">

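{| border="1" cellpadding="5" cellspacing="0" align="center"
|-
! colspan="8" style="background:#87CEFA;" | Predicted Transmembrane Helices for P35462
|-
! rowspan="2" style="background:#BFBFBF;" align="center" | Method
! colspan="7" style="background:#BFBFBF;" align="center" | Helix Positions
|-
! style="background:#BFBFBF;" align="center" | #1
! style="background:#BFBFBF;" align="center" | #2
! style="background:#BFBFBF;" align="center" | #3
! style="background:#BFBFBF;" align="center" | #4
! style="background:#BFBFBF;" align="center" | #5
! style="background:#BFBFBF;" align="center" | #6
! style="background:#BFBFBF;" align="center" | #7
|-
| UniProt || 33-55 || 66-88 || 105-126 || 150-170 || 188-212 || 330-351 || 367-388
|-
| PolyPhobius || 30-55 || 66-88 || 105-126 || 150-170 || 188-212 || 329-352 || 367-386
|-
| MEMSAT-SVM || 32-55 || 65-88 || 101-129 || 151-169 || 188-209 || 331-354 || no prediction
|-
| MEMSAT-3 || 31-55 || 67-91 || 102-126 || 148-167 || 189-213 || 327-350 || 365-383
|-
|}
<center><small><caption> Overview of the predicted transmembrane helices for P35462 compared to the annotation in UniProt </caption></small></center>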

</figtable>

Additional information:

Q9YDF8

Q9YDF8 (PDB:1ORQ/1ORS/2A0L/2KYH), a crucial component of potassium channels, is a 7-helical-transmembrane protein. The transmembrane helices were predicted with MEMSAT (SVM and 3) and PolyPhobius. In this case only PolyPhobius correctly predicted the number of helices; both MEMSAT-3 and MEMSAT-SVM predicted only six. Additionally, all three tools had great problems predicting the right borders. PolyPhobius seems to have skipped the third helix annotated in SwissProt, completely mispredicted the borders of the fifth helix (its fourth predicted helix) and predicted a sixth helix where the actual protein has an intramembrane element at amino acid positions 196 to 208. MEMSAT-SVM and MEMSAT-3, although falsely predicting six transmembrane helices, are closer to the SwissProt annotation in terms of helix borders, except for the third helix, where MEMSAT seems to have fused the third and fourth annotated helices. The exact numbers can be found in <xr id="Q9YDF8_tmhs"> Table </xr>.

<figtable id="Q9YDF8_tmhs">

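{| border="1" cellpadding="5" cellspacing="0" align="center"
|-
! colspan="8" style="background:#87CEFA;" | Predicted Transmembrane Helices for Q9YDF8
|-
! rowspan="2" style="background:#BFBFBF;" align="center" | Method
! colspan="7" style="background:#BFBFBF;" align="center" | Helix Positions
|-
! style="background:#BFBFBF;" align="center" | #1
! style="background:#BFBFBF;" align="center" | #2
! style="background:#BFBFBF;" align="center" | #3
! style="background:#BFBFBF;" align="center" | #4
! style="background:#BFBFBF;" align="center" | #5
! style="background:#BFBFBF;" align="center" | #6
! style="background:#BFBFBF;" align="center" | #7
|-
| UniProt || 39-63 || 68-92 || 97-105 || 109-125 || 129-145 || 160-184 || 222-253
|-
| PolyPhobius || 42-60 || 68-88 || 108-129 || 137-157 || 163-184 || 196-213 || 224-244
|-
| MEMSAT-SVM || 43-59 || 72-90 || 101-118 || 128-143 || 163-184 || 221-245 || no prediction
|-
| MEMSAT-3 || 38-60 || 66-90 || 100-119 || 122-141 || 161-184 || 218-242 || no prediction
|-
|}
<center><small><caption> Overview of the predicted transmembrane helices for Q9YDF8 compared to the annotation in UniProt </caption></small></center>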

</figtable>
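The observations about skipped and additional helices for Q9YDF8 can be checked by matching predicted and annotated helices through their residue overlap. The snippet below is a minimal sketch under the assumption that two helices "match" as soon as they share at least one residue; the threshold <code>min_overlap</code> and the function names are illustrative choices, not part of the original evaluation. The coordinates are the UniProt and PolyPhobius rows of the table.

```python
from typing import List, Tuple

Helix = Tuple[int, int]  # (first residue, last residue), both inclusive


def overlap(a: Helix, b: Helix) -> int:
    """Number of residues shared by two helices (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def unmatched(helices: List[Helix], others: List[Helix],
              min_overlap: int = 1) -> List[Helix]:
    """Helices that overlap none of `others` by at least `min_overlap` residues."""
    return [h for h in helices
            if all(overlap(h, o) < min_overlap for o in others)]


uniprot = [(39, 63), (68, 92), (97, 105), (109, 125),
           (129, 145), (160, 184), (222, 253)]
polyphobius = [(42, 60), (68, 88), (108, 129), (137, 157),
               (163, 184), (196, 213), (224, 244)]

missed = unmatched(uniprot, polyphobius)  # annotated, but not predicted
extra = unmatched(polyphobius, uniprot)   # predicted, but not annotated
print(missed, extra)
```

With these inputs the third annotated helix (97-105) has no overlapping prediction, and the predicted helix 196-213 overlaps no annotated transmembrane helix, which agrees with the description above.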

Additional information:

  • UniProt entry: Q9YDF8
  • OMP entry: not clear
  • PDBTM entry: see OMP

P47863

P47863 (PDB:2D57), an aquaporin in rat, is a 6-helical-transmembrane protein. The transmembrane helices were predicted with MEMSAT (SVM and 3) and PolyPhobius. In this case every prediction tool correctly predicted the number of helices. PolyPhobius and MEMSAT-SVM were slightly off in predicting the helix borders, whereas the claimed inferiority of MEMSAT-3 compared to MEMSAT-SVM is clearly visible here in its less precise border predictions. The exact numbers can be found in <xr id="P47863_tmhs"> Table </xr>.

<figtable id="P47863_tmhs">

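{| border="1" cellpadding="5" cellspacing="0" align="center"
|-
! colspan="7" style="background:#87CEFA;" | Predicted Transmembrane Helices for P47863
|-
! rowspan="2" style="background:#BFBFBF;" align="center" | Method
! colspan="6" style="background:#BFBFBF;" align="center" | Helix Positions
|-
! style="background:#BFBFBF;" align="center" | #1
! style="background:#BFBFBF;" align="center" | #2
! style="background:#BFBFBF;" align="center" | #3
! style="background:#BFBFBF;" align="center" | #4
! style="background:#BFBFBF;" align="center" | #5
! style="background:#BFBFBF;" align="center" | #6
|-
| UniProt || 37-57 || 65-85 || 116-136 || 156-176 || 185-205 || 232-252
|-
| PolyPhobius || 34-58 || 70-91 || 115-136 || 156-177 || 188-208 || 231-252
|-
| MEMSAT-SVM || 35-56 || 71-89 || 113-136 || 157-178 || 190-205 || 232-252
|-
| MEMSAT-3 || 35-59 || 71-95 || 117-141 || 157-180 || 187-206 || 240-264
|-
|}
<center><small><caption> Overview of the predicted transmembrane helices for P47863 compared to the annotation in UniProt </caption></small></center>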

</figtable>
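One way to put a number on "less precise border prediction" is to sum the absolute deviations of the predicted start and end positions from the UniProt annotation. The following is a small sketch of that idea; the measure itself and the name <code>border_deviation</code> are assumptions made for illustration and were not used in the original comparison. Pairing helices by index only works here because every tool predicted six helices for P47863.

```python
from typing import Dict, List, Tuple

Helix = Tuple[int, int]  # (first residue, last residue)


def border_deviation(annotated: List[Helix], predicted: List[Helix]) -> int:
    """Sum of |start difference| + |end difference| over index-paired helices."""
    if len(annotated) != len(predicted):
        raise ValueError("index pairing requires the same number of helices")
    return sum(abs(a[0] - p[0]) + abs(a[1] - p[1])
               for a, p in zip(annotated, predicted))


uniprot = [(37, 57), (65, 85), (116, 136), (156, 176), (185, 205), (232, 252)]
predictions: Dict[str, List[Helix]] = {
    "PolyPhobius": [(34, 58), (70, 91), (115, 136), (156, 177), (188, 208), (231, 252)],
    "MEMSAT-SVM": [(35, 56), (71, 89), (113, 136), (157, 178), (190, 205), (232, 252)],
    "MEMSAT-3": [(35, 59), (71, 95), (117, 141), (157, 180), (187, 206), (240, 264)],
}

deviations = {tool: border_deviation(uniprot, helices)
              for tool, helices in predictions.items()}
print(deviations)
```

For the values in the table this gives a total deviation of 24 residues for both PolyPhobius and MEMSAT-SVM and 54 residues for MEMSAT-3, in line with the qualitative statement above.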

Additional information:

Signal Peptides

For the prediction of signal peptides SignalP version 4.1 (webserver) was used.

P02768

Serum albumin (P02768) is one of the main components of blood plasma. As it has to be secreted into the blood vessels, P02768 can be expected to carry motifs that are crucial for delivery down the secretory pathway and therefore to contain a signal peptide sequence. This is exactly what the signal peptide prediction with SignalP shows. SignalP predicts that P02768 has a signal peptide sequence and that a cleavage site exists between amino acid positions 18 and 19. In the plot created by SignalP v4.1 (see <xr id="P02768_signalp">Figure</xr>) this clear signal at position 19 (0.710) can be observed.

<figure id="P02768_signalp">

P02768 signalp.png
Plot displaying the scores (C = cleavage, S = signal peptide, Y = combined) predicted for each amino acid by SignalP v4.1 for P02768. A clear spike for the cleavage site at position 19 can be seen, as well as high signal peptide scores for the first 18 amino acids. (Image source: Maple Syrup Urine Disease group, reused to prevent file duplicates in the wiki)

</figure>

Additional information:

P47863

As we know from the transmembrane helix prediction task, P47863 is an aquaporin that is located within the membrane. The prediction by SignalP shows that neither a signal peptide sequence nor a cleavage site can be detected. Detailed graphical output can be seen in <xr id="P47863_signalp">Figure</xr>.

<figure id="P47863_signalp">

P47863 signalp.png
Plot displaying the scores (C = cleavage, S = signal peptide, Y = combined) predicted for each amino acid by SignalP v4.1 for P47863. Neither a spike in the C-score nor high S-scores can be seen; therefore no signal peptide sequence and no cleavage site are predicted by SignalP. (Image source: Maple Syrup Urine Disease group, reused to prevent file duplicates in the wiki)

</figure>

Additional information:

P11279

LAMP-1 (Lysosome-associated membrane glycoprotein 1 | P11279) is a membrane protein. It plays an important role in the autophagy process and is associated with tumor metastasis. It has one transmembrane helix, which could be some sort of protein anchor. The signal peptide prediction by SignalP reveals that LAMP-1 has a predicted signal peptide sequence and a cleavage site between amino acids 28 and 29. This is consistent with the information stored in the Signal Peptide Database [1]. A detailed graphical output of the SignalP prediction is displayed in <xr id="P11279_signalp">Figure</xr>.

<figure id="P11279_signalp">

P11279 signalp.png
Plot displaying the scores (C = cleavage, S = signal peptide, Y = combined) predicted for each amino acid by SignalP v4.1 for P11279. A clear spike for the cleavage site at position 29 can be seen, as well as high signal peptide scores for the first 28 amino acids. (Image source: Maple Syrup Urine Disease group, reused to prevent file duplicates in the wiki)

</figure>

Additional information:

  • UniProt entry: P11279
  • Signal Peptide Database entry: LAMP1_HUMAN

GO-Terms

GO-Pet

The GO-Term prediction for Aspartoacylase executed by GO-Pet (see <xr id="P45381_gopet"> Table</xr>) reflects the known biological processes.

<figtable id="P45381_gopet">

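{| border="1" cellpadding="5" cellspacing="0" align="center"
|-
! colspan="3" style="background:#87CEFA;" | Predicted GOTerms for P45381 by GO-Pet
|-
! style="background:#BFBFBF;" align="center" | GO-ID
! style="background:#BFBFBF;" align="center" | GO-Term / Description
! style="background:#BFBFBF;" align="center" | Confidence
|-
| GO:0016787 ||hydrolase activity||96%
|-
| GO:0004046 ||aminoacylase activity||82%
|-
| GO:0019807 ||aspartoacylase activity||82%
|-
| GO:0016788 ||hydrolase activity acting on ester bonds||81%
|-
|}
<center><small><caption> Overview of the predicted GO-Terms for P45381 </caption></small></center>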

</figtable>