Multiple sequence alignments

From Protein Prediction 1 Summer Semester 2016 For Informaticians

This page is organized as follows:

Keywords are terms that you need to understand to follow the lecture. To test your knowledge, try to define and explain these keywords (in a few sentences). If you cannot think of anything to say about a keyword, read up on that topic.

Under Sources you will find suggested literature (textbooks, web pages, articles) that will help you understand the topic. You can use it to fill gaps in your knowledge of the keywords, to answer the questions and to solve the tasks. You are not required to read or study any of these sources, but they cover the topic in more detail and complement the lecture well. You are of course free to use any other source you like.

In the section Exercise we provide Questions and Hands-on tasks that allow you to test and further your knowledge of the given topic. During the exercise session you can ask questions pertaining to the topic (keywords and exercises).


Keywords

  • Multiple Sequence Alignments (MSAs)
  • Sequence profile, iterative profile creation
  • BLAST, PSI-BLAST
  • Sequence identity, E-value
  • Sequence databases: UniProt, Swiss-Prot
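
Two of the keywords above, sequence identity and sequence profile, can be illustrated on a toy alignment. The following is only a minimal sketch: the three sequences are made up, the function names are hypothetical, and identity is computed over columns without gaps, which is just one of several definitions in use. Real profiles (e.g. the position-specific scoring matrices built by PSI-BLAST) additionally use sequence weighting, pseudocounts and log-odds scores, none of which are shown here.

```python
from collections import Counter

# Toy alignment (hypothetical sequences, '-' marks a gap); all rows have equal length.
msa = [
    "MKV-LA",
    "MRVALA",
    "MKI-LG",
]

def sequence_identity(a, b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

def column_profile(alignment):
    """Relative residue frequencies for each alignment column, ignoring gaps."""
    profile = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile

identity = sequence_identity(msa[0], msa[1])
profile = column_profile(msa)
```

Each entry of `profile` describes one column of the alignment, which is the kind of per-position information that a single pairwise alignment does not provide.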

Sources


Exercise

Questions

  • What is a multiple sequence alignment?
    • What kind of sequences are likely to be used for an MSA? How are they related to each other?
    • Why would you want to align multiple sequences? What kind of information is contained in MSAs but not directly in e.g. all-against-all pairwise alignments?
    • Given your knowledge of the algorithms for pairwise alignments, how could you calculate an MSA?
      • Is that a feasible approach? Why?
  • You have a sequence that you would like to find in a database. Which search method and which E-value cutoff would you use
    • if you know your sequence is in the database and only want to find that entry?
    • if you would like to find homologs?
  • What is the difference between BLAST and PSI-BLAST?
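
For the E-value question, it may help to recall how the E-value is commonly related to the alignment score in BLAST statistics. The relation below is the standard Karlin-Altschul form; the symbols are not defined on this page and are introduced here only for illustration: m is the query length, n the total length of the database, S the raw alignment score, and K and λ are parameters that depend on the scoring system.

```latex
E = K \, m \, n \, e^{-\lambda S}
```

E is the number of hits with a score of at least S that one expects to see by chance, so it grows with the database size and shrinks exponentially with the score.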


Hands-on tasks

  • Retrieve the sequence of papain. Search the Swiss-Prot database using BLAST and PSI-BLAST. How do the results differ? Do the alignments differ? You can, for example, run BLAST and PSI-BLAST on the servers provided by the EBI. Adjust the parameters, e.g. choose Swiss-Prot as the search database; "More options" lets you adjust further parameters. Run 3 iterations of PSI-BLAST (the results page of the first run allows you to start the next iteration).
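
The first step of the task, retrieving the papain sequence, can also be scripted. The sketch below is one possible approach, not part of the exercise: it assumes that P00784 is the Swiss-Prot accession of papain from Carica papaya and that the UniProt REST service serves FASTA records under the URL pattern shown. Please verify both on the UniProt website. The helper names are hypothetical.

```python
from urllib.request import urlopen

# Assumption: P00784 is the Swiss-Prot accession for papain (Carica papaya);
# check the entry on the UniProt website before relying on it.
PAPAIN_ACCESSION = "P00784"

def fasta_url(accession):
    """Build the UniProt REST URL for a FASTA record (assumed URL layout)."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"

def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA string."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    header = lines[0][1:]
    seq_lines = []
    for line in lines[1:]:
        if line.startswith(">"):  # stop at the start of a second record
            break
        seq_lines.append(line)
    return header, "".join(seq_lines)

def fetch_fasta(accession):
    """Download one FASTA record and parse it (requires network access)."""
    with urlopen(fasta_url(accession), timeout=30) as response:
        return parse_fasta(response.read().decode("utf-8"))

# Usage (needs network access):
# header, sequence = fetch_fasta(PAPAIN_ACCESSION)
```

The returned sequence can then be pasted into the BLAST and PSI-BLAST web forms.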