FINAL EXAM: PYTHON FOR GENOMIC DATA SCIENCE
From Johns Hopkins University on Coursera
If you haven't yet read the instructions, please exit the exam and read the Final Exam Instructions.
Please run the program(s) that you have written on the following data set: dna2.fasta
If you created your program(s) correctly, you will be able to answer the questions below.
1.
How many records are in the multi-FASTA file?
2.
What is the length of the longest sequence in the file?
3.
What is the length of the shortest sequence in the file?
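The first three questions only need the records parsed and their lengths compared. A minimal sketch of one way to do that follows; the helper name `read_fasta` and the toy input are made up for illustration, and the record identifier is assumed to be the first whitespace-delimited token after the `>` on each header line. To run it on the exam data, the `io.StringIO` object would be replaced by `open("dna2.fasta")`.

```python
import io

def read_fasta(handle):
    """Return a dict mapping each record identifier to its full sequence."""
    parts = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Assumption: the identifier is the first token of the header.
            current = line[1:].split()[0]
            parts[current] = []
        elif current is not None:
            parts[current].append(line.upper())
    return {ident: "".join(chunks) for ident, chunks in parts.items()}

# Toy multi-FASTA input (not taken from dna2.fasta).
demo = io.StringIO(
    ">seq1 first toy record\nATGCCC\nTAG\n>seq2 second toy record\nATGA\n"
)
records = read_fasta(demo)
lengths = {ident: len(seq) for ident, seq in records.items()}

num_records = len(records)        # question 1
longest = max(lengths.values())   # question 2
shortest = min(lengths.values())  # question 3
print(num_records, longest, shortest)  # prints: 2 9 4
```

Keeping the lengths in a dict keyed by identifier also makes it easy to report which record is the longest or shortest, not just its length.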
4.
What is the length of the longest ORF appearing in reading frame 2 of any of the sequences?
5.
What is the starting position of the longest ORF in reading frame 3 in any of the sequences? The position should indicate the character number where the ORF begins. For instance, the following ORF:
> sequence1
ATGCCCTAG
starts at position 1.
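The ORF questions can be approached by walking one reading frame codon by codon. The sketch below assumes that an ORF runs from an ATG start codon to the first in-frame stop codon (TAA, TAG, or TGA), that its length includes the stop codon, and that positions are 1-based as in the example above. The function name `find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, frame):
    """Return (start, length) pairs for ORFs in forward reading frame 1, 2, or 3.

    start is the 1-based position of the A in ATG; length includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    start = None  # 0-based index of the current open ATG, if any
    for i in range(frame - 1, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None:
            if codon == "ATG":
                start = i
        elif codon in STOP_CODONS:
            orfs.append((start + 1, i + 3 - start))
            start = None
    return orfs

# The example ORF from the question, read in frame 1.
print(find_orfs("ATGCCCTAG", 1))  # prints: [(1, 9)]

# A toy sequence whose only ORF sits in frame 2.
frame2_orfs = find_orfs("CATGCCCTAGG", 2)
longest_in_frame2 = max(frame2_orfs, key=lambda orf: orf[1])
print(longest_in_frame2)  # prints: (2, 9)
```

An ATG that appears inside an already open ORF is treated as an ordinary codon, so each reported ORF starts at the earliest ATG before its stop codon, which is the longest choice.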
6.
What is the length of the longest ORF appearing in any sequence and in any forward reading frame?
7.
What is the length of the longest forward ORF that appears in the sequence with the identifier gi|142022655|gb|EQ086233.1|16?
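When the reading frame does not matter, all three forward frames can be scanned in one pass. The sketch below swaps in a regular expression with a lookahead so that overlapping candidates in every frame are found; it is an alternative technique, not necessarily the one the course expects. It keeps the same assumptions (ATG to the first in-frame stop codon, stop codon counted in the length), and the identifiers `toy|1` and `toy|2` are made up.

```python
import re

# Lookahead so that matches may overlap; the lazy codon group stops at the
# first in-frame stop codon after each ATG.
ORF_PATTERN = re.compile(r"(?=(ATG(?:[A-Z]{3})*?(?:TAA|TAG|TGA)))")

def longest_forward_orf(seq):
    """Return (length, start, frame) of the longest forward ORF, or None.

    start is 1-based and frame is 1, 2, or 3.
    """
    best = None
    for match in ORF_PATTERN.finditer(seq.upper()):
        offset = match.start(1)
        candidate = (len(match.group(1)), offset + 1, offset % 3 + 1)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best

# Toy records keyed by identifier (not taken from dna2.fasta).
records = {"toy|1": "CATGCCCTAGG", "toy|2": "ATGAAACCCGGGTAA"}

# Longest forward ORF in one named record.
print(longest_forward_orf(records["toy|1"]))  # prints: (9, 2, 2)

# Longest forward ORF over all records and all frames.
found = [orf for orf in map(longest_forward_orf, records.values()) if orf]
print(max(found))  # prints: (15, 1, 1)
```

Looking a record up by its identifier only works if the dictionary keys match the identifier exactly as it appears in the header line, so it helps to print a few keys before querying.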
8.
Find the most frequently occurring repeat of length 6 in all sequences. How many times does it occur in all?
9.
Find all repeats of length 12 in the input file. Let's use Max to specify the number of copies of the most frequent repeat of length 12. How many different 12-base sequences occur Max times?
10.
Which one of the following repeats of length 7 has a maximum number of occurrences?
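The three repeat questions all reduce to counting every substring of length n across all sequences. A minimal sketch follows, assuming that a repeat is a substring occurring more than once in total and that overlapping occurrences are counted; the function names and toy sequences are hypothetical.

```python
from collections import Counter

def count_kmers(sequences, n):
    """Count every length-n substring (overlaps included) over all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts

def most_frequent_repeats(sequences, n):
    """Return (max_count, sorted repeats that occur max_count times)."""
    counts = count_kmers(sequences, n)
    repeats = {kmer: c for kmer, c in counts.items() if c > 1}
    if not repeats:
        return 0, []
    max_count = max(repeats.values())
    winners = sorted(kmer for kmer, c in repeats.items() if c == max_count)
    return max_count, winners

# Toy sequences (not taken from dna2.fasta).
toy = ["ACACA", "TACAT"]
max_count, winners = most_frequent_repeats(toy, 3)
print(max_count, winners)  # prints: 3 ['ACA']
```

Here `max_count` plays the role of Max, `len(winners)` is the number of different sequences that occur Max times, and a list of candidate repeats can be compared by looking each one up in the `Counter` returned by `count_kmers`.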