Second-year MS student trying to finish my Master's project. BS in Computer Science from UCSD. Long walks on the beach, leather-bound books, etc.
Traditional mismatch-detecting algorithms cannot deal with insertions and deletions (indels). When a read is mapped to a reference genome, an indel shifts every base after it with respect to the reference, which usually produces a large number of mismatches. A mismatch-detecting algorithm usually handles only one to three mismatches, so reads that span an indel typically fail to map. (More specific information to follow.)
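To make the shift concrete, here is a minimal sketch that compares a read position by position against the reference, once for an exact read and once for a read whose source sequence lost a single base. The reference string, the offsets, and the helper name `count_mismatches` are all made up for illustration; this is not the mapper itself.

```python
def count_mismatches(read, reference, offset):
    """Count position-by-position mismatches of `read` against
    `reference`, starting at `offset` (no gaps allowed)."""
    window = reference[offset:offset + len(read)]
    return sum(1 for a, b in zip(read, window) if a != b)

# Hypothetical 50-base reference, written by hand for this example.
reference = "ACGTTGCAAGCTTAGGCATCGATCCGTAAGCTAGGCTTAACGGATCCATG"

start, length = 10, 30

# A read copied straight from the reference.
exact_read = reference[start:start + length]

# A read of the same length from a target that is missing reference
# base 15: five unshifted bases, then everything after the deletion.
del_read = reference[start:start + 5] + reference[start + 6:start + length + 1]

print("exact read mismatches:   ", count_mismatches(exact_read, reference, start))
print("deletion read mismatches:", count_mismatches(del_read, reference, start))

# Re-aligning the part after the deletion one base to the right
# removes the mismatches, which is what an indel-aware mapper exploits.
print("suffix after re-aligning:", count_mismatches(del_read[5:], reference, start + 6))
```

The bases before the deletion still match, but almost every base after it is compared against its neighbor, so the count ends up well above the one to three mismatches a mismatch-only algorithm allows.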
Week Four (4/19-4/25)
- Create this page (yay!)
- First goal of project written out
To Do Next
- Better describe the (first?) goal of the project
- Plan out first goal implementation/theory before any coding begins
First Goal: Develop a mapper that takes many short reads and a reference genome of length L and discovers the target sequence the reads came from. The mapper should be able to identify indels of length one in the target sequence. To test it, we will produce a random reference sequence, create single-base indels in it, feed the resulting target sequence to a read simulator, and then pass the reads and the original reference sequence to the mapper.
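The test setup described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the function names, the 50/50 split between insertions and deletions, the read length of 36, and a read simulator that draws error-free reads uniformly from the forward strand. The mapper is not shown, since it has not been designed yet.

```python
import random

BASES = "ACGT"

def random_reference(length, rng):
    """Random reference sequence of the given length (the L above)."""
    return "".join(rng.choice(BASES) for _ in range(length))

def add_single_base_indels(reference, n_indels, rng):
    """Return (target, events). Each event is (ref_pos, kind, base),
    with positions in reference coordinates, so the mapper's calls
    can later be checked against the truth."""
    positions = sorted(rng.sample(range(len(reference)), n_indels))
    pieces, events, prev = [], [], 0
    for pos in positions:
        pieces.append(reference[prev:pos])
        if rng.random() < 0.5:
            base = rng.choice(BASES)
            pieces.append(base)        # insert a base before reference[pos]
            events.append((pos, "ins", base))
            prev = pos
        else:
            events.append((pos, "del", reference[pos]))
            prev = pos + 1             # skip reference[pos]
    pieces.append(reference[prev:])
    return "".join(pieces), events

def simulate_reads(target, read_length, n_reads, rng):
    """Naive read simulator: error-free reads from uniform positions."""
    last_start = len(target) - read_length
    starts = [rng.randrange(last_start + 1) for _ in range(n_reads)]
    return [target[s:s + read_length] for s in starts]

rng = random.Random(42)                # fixed seed so runs are repeatable
reference = random_reference(1000, rng)
target, events = add_single_base_indels(reference, 5, rng)
reads = simulate_reads(target, 36, 200, rng)

# `reads` and `reference` are what would be passed to the mapper;
# `events` is the answer key to score its indel calls against.
print(len(reference), len(target), len(reads), events)
```

Keeping the indel positions in reference coordinates means the mapper's output can be compared directly with `events` without translating between the two sequences.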
Grade: A-/B+. Finally started doing things, but not nearly as much as I would have liked.