About this Course
In previous courses in the Specialization, we have discussed how to sequence and compare genomes. This course will cover advanced topics in finding mutations lurking within DNA and proteins.
In the first half of the course, we ask how an individual’s genome differs from the “reference genome” of the species. Our goal is to take small fragments of DNA from the individual and “map” them to the reference genome. We will see that the combinatorial pattern matching algorithms that solve this problem are elegant and extremely efficient, requiring surprisingly little runtime and memory.
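To make the idea of “mapping” concrete, here is a minimal sketch in Python of exact read mapping with a suffix array. It is only an illustration under stated assumptions, not the method taught in the course: the function names `build_suffix_array` and `map_read` and the toy sequences are hypothetical, the construction is deliberately naive, and real read mappers also have to tolerate mismatches.

```python
def build_suffix_array(reference):
    """Return the start positions of all suffixes of `reference`, sorted
    lexicographically. Naive construction, suitable only for toy inputs."""
    return sorted(range(len(reference)), key=lambda i: reference[i:])


def map_read(read, reference, suffix_array):
    """Return every position where `read` occurs exactly in `reference`.

    Suffixes that begin with `read` form one contiguous block in the
    suffix array, so two binary searches find the block's boundaries.
    """
    k = len(read)

    def prefix(rank):
        # First k characters of the suffix at this rank.
        start = suffix_array[rank]
        return reference[start:start + k]

    # Lower bound: first suffix whose k-prefix is >= read.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < read:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose k-prefix is > read.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= read:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


# Hypothetical toy data: a short "reference genome" and a few reads.
reference = "ACGTACGTGACG"
suffix_array = build_suffix_array(reference)
for read in ["ACG", "GTG", "TTT"]:
    print(read, map_read(read, reference, suffix_array))
```

Once the index is built, each read is located with a number of comparisons that grows only logarithmically with the length of the reference, which hints at why index-based pattern matching scales to many reads.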
In the second half of the course, we will learn how to identify the function of a protein even if it has been bombarded by so many mutations compared to similar proteins with known functions that it has become barely recognizable. This is the case, for example, in HIV studies, since the virus often mutates so quickly that researchers can struggle to study it. The approach we will use is based on a powerful machine learning tool called a hidden Markov model.
Finally, you will learn how to use popular bioinformatics software tools based on hidden Markov models to compare a protein against a related family of proteins.
Syllabus
Week 1
4 hours to complete
Week 1: Introduction to Read Mapping
Week 2
4 hours to complete
Week 2: The Burrows-Wheeler Transform
Week 3
4 hours to complete
Week 3: Speeding Up Burrows-Wheeler Read Mapping
Week 4
1 hour to complete
Week 4: Introduction to Hidden Markov Models
Week 5
1 hour to complete
Week 5: Profile HMMs for Sequence Alignment
Week 6
4 hours to complete
Week 6: Bioinformatics Application Challenge