Prediction from protein sequence

From Protein Prediction 1 Summer Semester 2016 For Informaticians

This page is organized as follows:

Keywords are terms that you need to understand to follow the lecture. To test your knowledge, try to define and explain these keywords (in a few sentences). If you cannot think of anything to say about a keyword, read up on that topic.

Under Sources you will find the literature we suggest (textbooks, web pages, articles) to help you understand the topic. Use it to fill gaps in your knowledge of the keywords, to answer the questions, and to solve the tasks. You are not required to read or study any of these sources, but they cover the topic in more detail and complement the lecture well. You are, of course, free to use any other source you like.

In the section Exercise we provide Questions and Hands-on tasks that allow you to test and further your knowledge of the given topic. During the exercise session you can ask questions pertaining to the topic (keywords and exercises).

Keywords


  • Transmembrane helices
  • Biological membranes
  • Cells, cell compartments
  • Disorder, (solvent accessibility)
  • Protein function, GO terms
  • Subcellular localization
  • Hidden Markov models, neural networks, support vector machines

Questions




  • Which aspects of protein structure and function can be predicted from protein sequence?
  • What is machine learning?
  • List several machine learning algorithms.

Hands-on tasks

  • Implement a simple method to predict transmembrane helices from an amino acid sequence. Assume that every helix is 21 residues long and that a helix is predicted in a 21-residue window when the sum of the hydrophobicity values in that window exceeds 4.0. You can use one of the hydrophobicity scales here.
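
One possible starting point for the task above is the minimal sketch below. It is not a reference solution and rests on several assumptions: it uses the Kyte-Doolittle hydropathy scale (the task leaves the choice of scale open), it scores residues that are missing from the scale as 0.0, and it resolves overlapping hits greedily by accepting the first window that passes the threshold and then continuing after that window. The names KYTE_DOOLITTLE and predict_tm_helices are illustrative only.

```python
# Kyte-Doolittle hydropathy values (assumption: the task allows any scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def predict_tm_helices(sequence, window=21, threshold=4.0, scale=KYTE_DOOLITTLE):
    """Return predicted helices as (start, end) pairs, 0-based, end exclusive.

    A window is predicted as a helix when the sum of its hydrophobicity
    values is above the threshold. Overlaps are resolved greedily
    (assumption): after a hit, scanning resumes behind the predicted helix.
    """
    # Residues not covered by the scale contribute 0.0 (assumption).
    values = [scale.get(residue, 0.0) for residue in sequence.upper()]

    helices = []
    start = 0
    while start + window <= len(values):
        window_sum = sum(values[start:start + window])
        if window_sum > threshold:
            helices.append((start, start + window))
            start += window  # every helix is exactly `window` residues long
        else:
            start += 1
    return helices


if __name__ == "__main__":
    # Toy sequence: a hydrophobic stretch flanked by charged residues.
    toy_sequence = "DDDDDDDDDD" + "LLLLLLLLLLLLLLLLLLLLL" + "DDDDDDDDDD"
    for start, end in predict_tm_helices(toy_sequence):
        print(f"helix at residues {start + 1}-{end}")
```

Because the window sum is recomputed for every position, the running time grows with sequence length times window size; for typical protein lengths this is negligible, and a running sum could replace it if needed. Note that the greedy rule reports the first window that passes, which is not necessarily the most hydrophobic one.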