Protein structure

From Protein Prediction 1 Summer Semester 2016 For Informaticians

This page is organized as follows:

Keywords are terms that you need to understand to follow the lecture. To test your knowledge, try to define and explain these keywords (in a few sentences). If you cannot think of anything to say about a keyword, read up on that topic.

Under Sources you find literature we suggest (textbooks, web pages, articles) that will help you to understand the topic. You can use this as a resource to complete your knowledge of the keywords and to help you answer the questions and solve the tasks. You are not required to read and study any of these, but they provide more detailed knowledge on the topic and are a good complement to the lecture. Of course you can feel free to use any other source you like.

In the section Exercise we provide Questions and Hands-on tasks that allow you to test and further your knowledge of the given topic. During the exercise session you can ask questions pertaining to the topic (keywords and exercises).

Keywords


  • Amino acids, side chains, residues
  • Protein sequence
  • Secondary, tertiary, quaternary protein structure
  • Hydrogen bond
  • Alpha helix, beta sheet, loop, random coil, disordered region
  • Protein features: hydrophobicity, solvent accessibility, active site, binding site
  • Ramachandran plot
  • Protein domain
  • Protein Data Bank (PDB)
  • Root mean square deviation (RMSD)
  • Protein function
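
To make the RMSD keyword concrete: for two structures with N matched atoms at positions a_i and b_i, RMSD = sqrt((1/N) * sum_i |a_i - b_i|^2). The following is a minimal Python sketch of that formula. The function name rmsd and the toy coordinates are made up for illustration. The sketch assumes that the two structures are already superimposed and that atoms are paired by their position in the lists; in practice an optimal superposition is usually computed first, which is not shown here.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumption: both coordinate sets are already superimposed and
    atom i in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))

# Toy coordinates in Angstrom (invented, not taken from any PDB entry):
# the second set is the first one shifted by 1 Angstrom along y.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(model, shifted))  # every atom is displaced by 1 Angstrom, so this prints 1.0
```

Because every atom pair contributes its squared distance, a few badly placed atoms can dominate the value, which is worth keeping in mind when you interpret RMSD.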

Questions

  • What are the building blocks of proteins?
  • Define protein backbone and amino acid side chain in 1 or 2 sentences for each term.
  • How many amino acids appear in proteins? How can they be classified?
  • Name the atom types involved in a hydrogen bond. Do S-H groups form hydrogen bonds? Why (not)?
  • How are alpha helices held together?
  • What is similar and what is different in the hydrogen bonding of the alpha helix and the beta sheet?
  • Why do we find "forbidden" areas in a Ramachandran plot?
  • What is a protein domain?
  • How many amino acids are typically found in a domain? Why is there a minimum/maximum size?

Hands-on tasks

  • Go to the PDB website
    • Check the current release/date
    • How many structures are stored in the PDB? How many of those are protein structures?
    • Look at the statistics: Which experimental methods are (mainly) used to determine the structures?
    • How many human ("Homo sapiens") protein structures are in the PDB? Which three experimental methods are most frequently used to determine these structures? Choose to "show only representatives (protein structures) at 100%/95%/30% sequence identity". How many protein structures do you find then? Why does the number of protein structures already decrease when filtering at 100% sequence identity?
    • Search for PDB ID 2W72. Which protein is this, and from which organism? Which experimental method was used to solve the structure, and at what resolution? Look at this protein structure, for example using the 3D View (Jmol) of the PDB. Look at the PDB file (you can find it under "Display Files").
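
If you want to inspect a PDB file programmatically rather than by eye, the following minimal Python sketch may help. It assumes the legacy fixed-width PDB text format, in which the atom name, residue name, chain, residue number and x/y/z coordinates occupy fixed columns of each ATOM/HETATM record. The Atom type, the function parse_atom_line and the two sample records are hypothetical illustrations; the records are invented and not copied from 2W72.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    name: str      # atom name, e.g. "CA"
    res_name: str  # three-letter residue name, e.g. "VAL"
    chain: str     # chain identifier
    res_seq: int   # residue sequence number
    x: float
    y: float
    z: float

def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-width ATOM/HETATM record.

    The slices are the 0-based equivalents of the 1-based column
    ranges of the legacy PDB format (e.g. columns 31-38 hold x).
    """
    return Atom(
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )

# Two invented records in legacy PDB layout (for illustration only).
sample = """\
ATOM      1  N   VAL A   1      19.323  29.727  42.781  1.00 49.05           N
ATOM      2  CA  VAL A   1      20.141  30.469  43.755  1.00 43.14           C
"""

atoms = [
    parse_atom_line(line)
    for line in sample.splitlines()
    if line.startswith(("ATOM", "HETATM"))
]
for atom in atoms:
    print(atom)
```

Slicing by column instead of splitting on whitespace is a deliberate choice here: in the fixed-width format neighbouring fields can touch each other without a separating space.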

Hint: Molecular visualization (looking at 3D protein structures)

  • Looking at protein structures: Short introduction to molecular visualization
  • You have several options:
    • Use Aquaria. You can enter a PDB ID or protein name under "Specify a protein", and the structure will be shown. If you open Aquaria in Chrome, you see a slightly simplified view of the structure. To see structures in other browsers, e.g. Firefox, you have to install Java3D (Aquaria will ask you).
    • Use the 3D view (Jmol) of the PDB (link on the right-hand side under the structure picture). A standalone version of Jmol is also available.
    • Install a protein structure viewer on your computer, for example Chimera.

  • Take a 3D protein structure (any from the PDB), look at it, for example using the 3D View (Jmol) of the PDB, and decide where you would define regions of secondary structure. Compare your choices to the assignments you find in the PDB file. Have a look at PDB IDs 2BNH and 1KR4.
  • Pick an alpha helix in a 3D structure and think about which residues you would expect to be connected by H-bonds. Try to locate these H-bonds in the structure.
  • Pick a beta sheet in a 3D structure and think about which residues you would expect to be connected by H-bonds. Try to locate these H-bonds in the structure.
  • Which parameters would you need to know to draw a helix or a beta sheet on paper? Look them up.
  • Imagine you want to write a program that assigns secondary structure based on tertiary structure. Which criteria would you apply?
  • Sketch a program (high-level, pseudo-code) that would draw a Ramachandran plot. Which parameters do you need to calculate?
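
As a starting point for the Ramachandran task, here is a minimal Python sketch of its numerical core: the backbone dihedral angles phi, defined by the atoms C(i-1), N(i), CA(i), C(i), and psi, defined by N(i), CA(i), C(i), N(i+1). The helper names (sub, dot, cross, dihedral, phi_psi) and the input layout (one dict with 'N', 'CA' and 'C' coordinates per residue) are assumptions made for this illustration, and the toy coordinates are invented and not physically realistic. Reading the coordinates from a file, detecting chain breaks and drawing the plot itself are left out.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees (range -180..180) for four points."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(backbone):
    """Return a list of (phi, psi) per residue.

    Assumption: backbone is a list of dicts with 'N', 'CA', 'C'
    coordinates for consecutive residues of a single chain.
    phi is None for the first residue, psi is None for the last one.
    """
    angles = []
    for i, res in enumerate(backbone):
        phi = psi = None
        if i > 0:
            phi = dihedral(backbone[i - 1]["C"], res["N"], res["CA"], res["C"])
        if i + 1 < len(backbone):
            psi = dihedral(res["N"], res["CA"], res["C"], backbone[i + 1]["N"])
        angles.append((phi, psi))
    return angles

# Invented toy coordinates, chosen only so that the angles are easy to check.
toy_backbone = [
    {"N": (1, 1, 1), "CA": (1, 0, 1), "C": (1, 0, 0)},
    {"N": (0, 0, 0), "CA": (0, 0, 1), "C": (0, 1, 1)},
]
for phi, psi in phi_psi(toy_backbone):
    print(phi, psi)
```

The Ramachandran plot is then a scatter plot of all (phi, psi) pairs for which both values are defined, with both axes running from -180 to 180 degrees.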