Mapping With Indels/Mismatches

Konstantinos Niktas



Second-year MS graduate student trying to finish a Master's project. BS in Computer Science from UCSD. Long walks on the beach, leather-bound books, etc.

Project Information

Traditional mismatch-detecting algorithms cannot deal with insertions and deletions (indels). An indel shifts the bases that follow it with respect to the reference genome, so trying to map a read that contains one will usually produce a large number of mismatches. Mismatch-detecting algorithms usually handle only one to three mismatches. (More specific information to follow.)
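To make the shift concrete, here is a minimal sketch (not the project's mapper) that counts position-by-position mismatches between a read and the reference. The function name `count_mismatches` and the toy sequences are made up for illustration. A read copied straight from the reference has zero mismatches, while the same read with one base deleted from its middle picks up far more mismatches than a one-to-three-mismatch budget allows.

```python
def count_mismatches(read, reference, pos):
    """Count bases that differ when read is laid on reference at pos (no gaps allowed)."""
    window = reference[pos:pos + len(read)]
    return sum(1 for a, b in zip(read, window) if a != b)


# Toy reference, chosen by hand for this example.
reference = "ACGTTGCATGCCAGTACGGATCAAGT"

# A 16-base read copied exactly from position 4.
read_exact = reference[4:20]

# The same region with one base deleted from the middle. One extra base is
# taken from the reference so the read is still 16 bases long.
source = reference[4:21]
read_del = source[:6] + source[7:]

print(count_mismatches(read_exact, reference, 4))  # exact copy
print(count_mismatches(read_del, reference, 4))    # one deletion, many mismatches
```

Everything before the deleted base still lines up; everything after it is compared against the wrong reference base, which is why a mismatch-only mapper would reject this read.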

Related papers



Week Four (4/19-4/25)


  • Create this page (yay!)
  • First goal of project written out

To Do Next

  • Better describe the (first?) goal of the project
  • Plan out first goal implementation/theory before any coding begins

First Goal: Develop a mapper that takes many short reads and a reference genome of length L and discovers the target sequence the reads came from. The mapper should be able to identify indels of length one in the target sequence. To test it, we will produce a random reference sequence, create single-base indels in it to form the target sequence, feed this target sequence to a read simulator, and then pass the resulting reads and the original reference sequence into the mapper.
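The test setup described above could be sketched roughly as follows. This is only an outline under assumptions: the function names (`random_reference`, `add_single_base_indels`, `simulate_reads`), the 50/50 split between insertions and deletions, the error-free uniformly sampled reads, and all sizes are placeholders picked for illustration. The mapper itself is not shown, since it has not been designed yet.

```python
import random

BASES = "ACGT"


def random_reference(length, rng):
    """Random reference sequence of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def add_single_base_indels(reference, n_indels, rng):
    """Return (target, events). Each event is (ref_pos, kind, base) in reference coordinates."""
    positions = sorted(rng.sample(range(len(reference)), n_indels))
    pieces, events, prev = [], [], 0
    for pos in positions:
        pieces.append(reference[prev:pos])
        if rng.random() < 0.5:
            base = rng.choice(BASES)
            pieces.append(base)              # insert a base before reference[pos]
            events.append((pos, "ins", base))
            prev = pos
        else:
            events.append((pos, "del", reference[pos]))
            prev = pos + 1                   # skip reference[pos]
    pieces.append(reference[prev:])
    return "".join(pieces), events


def simulate_reads(target, read_length, n_reads, rng):
    """Sample error-free reads uniformly at random from the target."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(target) - read_length + 1)
        reads.append(target[start:start + read_length])
    return reads


rng = random.Random(0)                       # fixed seed so runs are repeatable
reference = random_reference(200, rng)
target, events = add_single_base_indels(reference, 5, rng)
reads = simulate_reads(target, 30, 50, rng)
# The reads and the original reference would then go to the mapper, and the
# indels it reports would be compared against events.
```

Keeping the list of planted indels in reference coordinates gives a ground truth to score the mapper against once it exists.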

Grade: A-/B+. Finally started doing things but not nearly as much as I would have liked to.
