MICa 8006
Protein Sequence Analysis

Homework 5


Due before class, Tuesday, October 10, 2006

You must sign on to cgls2.msi.umn.edu for Part A and to cgls1.msi.umn.edu for Part C. Part B can be done on either cgls1 or cgls2. The sign-on information (user name and password) is the same for both systems, and files stored on one are available on both.

Part A (10 pts) GCG

Sign on to cgls2.msi.umn.edu, as you have before. In this part, you will use test3.pep, which is on the class website in fasta format. Convert it to GCG format before you start, if you have not done so already.

This exercise is meant to help you think about your personal strategy for analyzing an unknown protein sequence. Use as many GCG modules as are appropriate, in a reasonable order. Write a report on this sequence, as you did in HW4A, for a reader who is not familiar with protein sequence analysis. State which modules you used and in what order, and explain what each module is intended to do, so that the reader can understand the results.

Submit: a short report on all you learned about this protein, with the GCG output that supports that report. Do not include negative GCG output. Instead, state in the report that you ran the module and the results were negative, and still describe what the module is intended to do.

Part B (10 pts) Perl - subroutines

This part of the assignment can be done on cgls1 or cgls2. Create a subroutine that, given a string containing an amino acid sequence in one-letter code (all upper case, no blank spaces), calculates the amino acid composition of the sequence and prints it out. Create a main program that reads from the terminal the name of a file containing one or more proteins in fasta format (use the file tests.pep), converts each protein into the format needed by the composition subroutine, and calls that subroutine. You may modify the program you used for Homework 4C or write a new program.

Submit: a printed copy of your program and the output it produces when it is run. Also, email your program to Flora Fan at fanx0038@umn.edu
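
For Part B, the following is a minimal sketch of one possible structure, not a required solution. It assumes that "composition" means a count and a percentage for each residue, and that fasta records are split on lines beginning with ">". The subroutine names (count_residues, print_composition, read_fasta) and the two demonstration sequences are hypothetical. To keep the sketch self-contained, it reads from an in-memory string instead of a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count each residue in an upper-case, one-letter-code sequence with no spaces.
# Returns a hash reference: residue => count.
sub count_residues {
    my ($seq) = @_;
    my %count;
    $count{$_}++ for split //, $seq;
    return \%count;
}

# Print the composition as a count and a percentage, one residue per line.
sub print_composition {
    my ($name, $seq) = @_;
    my $count = count_residues($seq);
    my $len   = length $seq;
    print "$name (length $len)\n";
    return unless $len;    # nothing to report for an empty sequence
    for my $aa (sort keys %$count) {
        printf "  %s %5d %6.2f%%\n", $aa, $count->{$aa},
            100 * $count->{$aa} / $len;
    }
}

# Read fasta records from an open filehandle.
# Returns a list of [name, sequence] pairs; each sequence is joined into
# one upper-case string with all whitespace removed.
sub read_fasta {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                  # tolerate DOS line endings
        if ($line =~ /^>\s*(\S*)/) {
            push @records, [$1, ''];       # header line starts a new record
        } elsif (@records) {
            $line =~ s/\s+//g;
            $records[-1][1] .= uc $line;   # append to the current record
        }
    }
    return @records;
}

# Demonstration on an in-memory fasta string (hypothetical sequences).
my $demo = ">seq1 example\nMKTA\nyvk\n>seq2\nGG AC\n";
open my $demo_fh, '<', \$demo or die "Cannot open in-memory file: $!";
print_composition(@$_) for read_fasta($demo_fh);
close $demo_fh;
```

To match the assignment, the demonstration at the end would be replaced by code that reads the file name from the terminal, for example `chomp(my $file = <STDIN>);`, and then opens that file with `open my $fh, '<', $file or die "Cannot open $file: $!";` before calling read_fasta.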

Part C (10 pts) Perl - BioPerl

Sign on to cgls1.msi.umn.edu to do this part of the assignment. Use the BioPerl SeqIO module to convert one or more protein sequences in fasta format into GCG format, and create a file containing each sequence in the new format. Test your program using the file tests.pep.
The main BioPerl page: http://www.bioperl.org/
The BioPerl Tutorial: http://www.bioperl.org/wiki/Bptutorial.pl

Submit: a printed copy of your program and the output it produces when it is run. Also, email your program to Flora Fan at fanx0038@umn.edu
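
For Part C, the conversion can be sketched with Bio::SeqIO as shown below. This is an illustration under stated assumptions, not a required solution: it assumes BioPerl is installed (as on cgls1), writes one GCG file per sequence because a GCG-format file holds a single sequence, and names each output file after the sequence ID. The subroutine name fasta_to_gcg and the ".gcg" naming convention are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;

# Convert every record in a fasta file to its own GCG-format file.
# Returns the list of file names written.
sub fasta_to_gcg {
    my ($fasta_file) = @_;
    my $in = Bio::SeqIO->new(-file => $fasta_file, -format => 'fasta');
    my @written;
    while (my $seq = $in->next_seq) {
        # Output name derived from the sequence ID (an assumed convention);
        # characters unsuitable for a file name are replaced with "_".
        (my $base = $seq->display_id) =~ s/[^\w.-]/_/g;
        my $out_file = "$base.gcg";
        my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'gcg');
        $out->write_seq($seq);
        $out->close;
        push @written, $out_file;
    }
    return @written;
}

# When a file name is given on the command line, convert it and report.
if (@ARGV) {
    print "Wrote $_\n" for fasta_to_gcg($ARGV[0]);
}
```

If the sketch is saved under a hypothetical name such as fasta2gcg.pl, running `perl fasta2gcg.pl tests.pep` would write one .gcg file per sequence into the current directory.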


February 21, 2008 Lynda Ellis

© 2009, University of Minnesota.
All rights reserved.
