Homework 1
Due before class, Tuesday, September 12, 2006

Introduction

Sign on to the cgls2.msi.umn.edu server using an SSH terminal client, such as PuTTY (Windows) or Fugu (Mac). Use your username and password for this computer system. Change your password to one known only to you.
Part A (5 pts)

Create a directory named "gcg" (mkdir gcg).
Change to this directory (chdir gcg or cd gcg).
Verify which directory you are in (pwd).
Start gcg (gcg).
Read the genhelp document for the gcg command "fetch" (genhelp fetch, or via the MSI GenHelp website).
Print this document (easiest from the website, or copy and paste from the command-line genhelp).
Use the fetch command to obtain the protein sequence with the UniProt (UNI) ID P0AA25.
Find the full name of the file that was fetched (ls).
View that file (more filename).
What is the number of amino acid residues in P0AA25?
Move the file from the cgls2 server to your local computer (WS_FTP or other FTP client), and print it from your local computer.

Submit: a printed copy of the genhelp document for "fetch", a printed copy of the file P0AA25, and a sentence stating the number of amino acid residues in P0AA25.
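The directory steps at the start of Part A can be tried out with a sketch like the one below. It assumes a POSIX-style shell and works in a scratch directory created with mktemp (an assumption made here so the example is safe to run anywhere; on cgls2 you would start from your home directory instead). The GCG commands (gcg, genhelp, fetch) are left out because they exist only on the server.

```shell
# Work in a throwaway directory so the sketch does not touch a real home directory.
# (workdir is a name chosen for this example only.)
workdir=$(mktemp -d)
cd "$workdir"

mkdir gcg   # create the directory
cd gcg      # change to it (chdir gcg does the same in csh-style shells)
pwd         # verify the current directory; the path should end in /gcg
ls          # list files; empty until something has been fetched
```

Once a file has been fetched on the server, ls in this directory shows its full name, which is the name to pass to more.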
Part B (5 pts)

Change to your top directory (cd).
Create a new directory named "gcg" (mkdir gcg).
Find where Perl is (whereis perl).
Create your own program to print an English phrase, using pico, #!, print, \n, double-quoted strings, and exit. While the phrase "Hello, world!" is traditionally used for this first program, you can use any English phrase you want.
Save your program.
List your program (more filename).
Run your program (perl filename).

Submit: a printed copy of your program and the output it produces when it is run.
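The reason for locating Perl before writing the program is that the #! line at the top of the program names the interpreter's full path. A minimal sketch of that connection follows; it uses command -v rather than whereis because command -v prints only the executable's path, and the variable names and the placeholder path are assumptions made for this example.

```shell
# Look up perl on the PATH; fall back to a placeholder if it is not installed.
perl_path=$(command -v perl || echo "/path/to/perl")

# Build the first line of the program. The '#!' is single-quoted so that an
# interactive shell does not treat the "!" as a history reference.
shebang='#!'"$perl_path"
echo "$shebang"
```

whereis perl may list several paths on one line (for example the executable and its manual page); the one to use after #! is the executable.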
Part C (10 pts)

Protein structure prediction programs are validated by using them to predict the structures of proteins whose structures are already known. The calculation used to compare the predicted structure to the known structure is called the root mean square deviation (RMS), often shortened to R. By constraining a protein sequence of fixed length to pack randomly into a sphere, Cohen and Sternberg (1980) empirically calculated that the 'random' R, in Angstroms, is

R = 0.0468N + 9.25

where N is the number of amino acid residues in the protein sequence. Many early tertiary structure predictions were little better than random.

Based on the Pseudocode for HW1C, create a Perl program, with comments, that asks the user for the number of residues in a protein sequence and outputs an English sentence that reproduces (echoes) the input and gives the random R for that protein, rounded to two digits after the decimal point, and, on a separate line, the date and time. Test it with a protein with 100 residues (random R should be 13.93 Angstroms) and with a protein the size of P0AA25.

Submit: a printed copy of your program and the output it produces when run with the two required test inputs.
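Before writing the Perl program, the arithmetic can be checked independently at the shell prompt. The sketch below is not the assignment's program and does not follow the HW1C pseudocode; it only evaluates the formula with awk and rounds to two decimal places. The function name random_r is a hypothetical helper introduced for this example.

```shell
# random_r N: print R = 0.0468 * N + 9.25, rounded to two decimal places.
# (random_r is a made-up helper name, not part of the assignment.)
random_r() {
    awk -v n="$1" 'BEGIN { printf "%.2f\n", 0.0468 * n + 9.25 }'
}

random_r 100   # the assignment states this should be 13.93
date           # current date and time, printed on a separate line
```

For N = 100 the formula gives 0.0468 × 100 + 9.25 = 4.68 + 9.25 = 13.93, which matches the expected value in the assignment. The same call with the residue count of P0AA25 gives a value to compare against your Perl program's output.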
February 21, 2008 Lynda Ellis
© 2009, University of Minnesota.
All rights reserved.