Due before class, Tuesday, September 26, 2006
Part A (5 pts) GCG
In this part, you will use p0aa25.uniprot_sprot, test1.pep and test2.pep.
test1.pep and test2.pep are on the class website in FASTA format. Convert them to GCG format before you start, if you have not done so already.
- Align p0aa25.uniprot_sprot and test1.pep using the modules bestfit and
gap.
Write a short description of the differences between the two alignments,
if any. Which alignment is "better"? Why? WARNING: GCG uses the same
default output filename for both these modules. CHANGE one or both names
(How?) or your second result will overwrite your first.
- Use the same two modules to align p0aa25.uniprot_sprot and test2.pep. Write a
short description of the differences between the two alignments, if any.
Which alignment is "better"? Why?
- Use pileup to align all three proteins: p0aa25.uniprot_sprot, test1.pep
and test2.pep. Create a .png format file with the dendrogram. Based on
this dendrogram, can you determine if test2.pep is more similar to
test1.pep or to p0aa25.uniprot_sprot? Why or why not?
- Use blast to compare test2.pep to the UNIPROT
protein database. If this finds an identical sequence, give its name.
What is the strongest hit with known function?
- Find protein sequences of several (10-20) members of the protein family with
that known function (How?), and use pileup to create a
multisequence alignment with test2.pep and all these sequences.
If you were studying this "for real" you would find every possible member of the protein family, but for this homework assignment, only use 10-20 sequences in the family. Is
test2.pep a member of that protein family? Why or why not?
- Is test2.pep a putative protein with known function, a conserved hypothetical, or an unknown protein?
Submit: Printed copies of all of the alignments and dendrograms, and a
copy of the first part of the blast output, through the first hit with
known function. Write your short descriptions or answers to questions
directly on the GCG output pages that relate to them.
Part B (5 pts) Perl
Based on the Pseudocode for HW3B, adapt the program you wrote in homework 2 to become a Perl program,
with comments, that:
- creates three hashes, all with keys = 1 letter amino acid codes, one
with values = amino acid names, the second with values = three letter
codes, the third with amino acid property codes;
- creates one hash with keys = amino acid property codes and
values = full names of amino acid properties;
- randomly chooses an amino acid;
- randomly prints its 1 letter code, name, or three letter code;
- requests the user to input the other two items;
- requests the user to input the full name of one property of that
amino acid;
- requests the user to draw the chemical structure of
the R group of the amino acid and prints four blank lines so the user
has space to do so.
- The program should repeatedly do this until the user requests it to
end. It should produce different random values each time it is run.
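The hash setup and random selection described above could be sketched roughly as follows. This is only an illustration, not the Pseudocode for HW3B: it covers three amino acids instead of all twenty, the one-letter property codes (h, n, p) are made up for the example, and the subroutine names random_amino_acid and random_prompt are hypothetical. Perl seeds its random number generator automatically the first time rand is called, so each run gives different choices without an explicit srand.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative subset only: a full program needs all 20 amino acids.
my %name  = (A => 'alanine', D => 'aspartate', K => 'lysine');
my %three = (A => 'Ala',     D => 'Asp',       K => 'Lys');

# Property codes below are invented for this sketch (assumption).
my %property      = (A => 'h', D => 'n', K => 'p');
my %property_name = (
    h => 'hydrophobic',
    n => 'negatively charged',
    p => 'positively charged',
);

# Return a randomly chosen 1 letter code (a key shared by the three hashes).
sub random_amino_acid {
    my @codes = sort keys %name;
    return $codes[ int(rand(@codes)) ];
}

# Return, at random, the 1 letter code, the name, or the three letter code.
sub random_prompt {
    my ($code) = @_;
    my @forms = ($code, $name{$code}, $three{$code});
    return $forms[ int(rand(@forms)) ];
}

my $aa = random_amino_acid();
print "Amino acid: ", random_prompt($aa), "\n";
print "Draw the R group below:\n", "\n" x 4;   # four blank lines for the drawing
```

Keeping the same 1 letter keys in every hash means one random key is enough to look up the name, the three letter code, and the property for the chosen amino acid.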
Submit: a printed copy of your program and the output it produces
when it is run three times, with three or more loops each time. On the
printed output, manually draw 2-D structures of the requested R groups.
Also, email your program to Flora Fan at fanx0038@umn.edu
Part C (10 pts) Perl
Based on the Pseudocode for HW3C, add to the Perl program in Part B, so it:
- checks the accuracy of the
information input by the user, except for the R group structure;
- praises correct answers and corrects incorrect answers;
- counts the correct and the total number of questions; and,
- when the user requests it to end, reports the percent correct and the
date and time.
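One possible shape for the checking and scoring steps is sketched below. It is not the Pseudocode for HW3C: the subroutine names check_answer and percent_correct are hypothetical, the comparison is assumed to be case-insensitive with surrounding whitespace ignored, and the two calls at the end use hard-coded strings in place of real user input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($correct, $total) = (0, 0);   # running counts of correct answers and questions

# Compare the user's answer with the expected one, ignoring case and
# surrounding whitespace (an assumption), then praise or correct the user.
sub check_answer {
    my ($given, $expected) = @_;
    $given = '' unless defined $given;
    $given =~ s/^\s+|\s+$//g;     # also strips the newline left by <STDIN>
    $total++;
    if (lc($given) eq lc($expected)) {
        $correct++;
        print "Correct, well done!\n";
        return 1;
    }
    print "Not quite. The correct answer is $expected.\n";
    return 0;
}

# Percent of questions answered correctly; 0 if nothing was asked.
sub percent_correct {
    return $total ? 100 * $correct / $total : 0;
}

# Stand-ins for answers that would normally be read with <STDIN>.
check_answer("Ala\n",   'Ala');
check_answer('glycine', 'alanine');

printf "Score: %d of %d correct (%.1f%%)\n", $correct, $total, percent_correct();
print "Finished: ", scalar(localtime), "\n";   # current date and time
```

Guarding the division in percent_correct avoids a divide-by-zero error if the user quits before answering any question.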
Submit: a printed copy of your program and the output it produces
when run three times as described below, with three or more loops each
time. One run should have all answers correct; one run all answers
incorrect; and one run a mixture of correct and incorrect answers.
Again, draw correct R groups on the printed output.
Also, email your program to Flora Fan at fanx0038@umn.edu