Imputation

Description of the project

Imputation predicts the alleles of single nucleotide polymorphisms (SNPs) that were not genotyped. Imputation methods rely on the linkage disequilibrium (LD) structure between SNPs. The HapMap, in which a large number of SNPs are genotyped, is usually used as the reference data set for inferring the correlation structure between SNPs. Given the observed SNPs and their correlation with the hidden SNPs, the alleles of the hidden SNPs can then be predicted. A variety of imputation methods based on haplotype proxies or Hidden Markov Models (HMMs) have recently been proposed. In this project, I will survey these imputation methods and design a simple imputation algorithm. I will then implement that algorithm using the HapMap as the reference data set.
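To make the idea concrete, here is a minimal Python sketch of one possible proxy-based approach. It is not the project's actual code (the project files are in R), and every name in it (`r_squared`, `impute_single_proxy`, the toy `reference` panel) is hypothetical. It assumes biallelic SNPs coded 0/1, phased reference haplotypes, and r² as the LD measure: for a hidden SNP, it picks the observed SNP in strongest LD and predicts the allele most often seen alongside the observed proxy allele in the reference.

```python
from collections import Counter


def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs coded 0/1 across haplotypes."""
    n = len(a)
    pa = sum(a) / n                      # frequency of allele 1 at the first SNP
    pb = sum(b) / n                      # frequency of allele 1 at the second SNP
    pab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:                       # a monomorphic SNP carries no LD information
        return 0.0
    d = pab - pa * pb                    # LD coefficient D
    return d * d / denom


def impute_single_proxy(reference, observed, hidden_idx):
    """Predict the allele at hidden_idx for one sample haplotype.

    reference  -- list of reference haplotypes, each a list of 0/1 alleles
    observed   -- dict {snp_index: allele} of SNPs typed in the sample
    hidden_idx -- index of the untyped SNP to impute
    """
    hidden_col = [h[hidden_idx] for h in reference]
    # Choose the observed SNP in strongest LD with the hidden SNP.
    proxy = max(observed, key=lambda j: r_squared([h[j] for h in reference], hidden_col))
    # Among reference haplotypes that share the sample's proxy allele,
    # take the most common allele at the hidden SNP.
    matches = [h[hidden_idx] for h in reference if h[proxy] == observed[proxy]]
    if not matches:                      # proxy allele never seen in the reference
        matches = hidden_col
    return Counter(matches).most_common(1)[0][0]


# Toy reference panel (made up for illustration, not HapMap data):
# SNP 0 and SNP 1 are perfectly correlated, SNP 2 is uncorrelated with both.
reference = [[0, 0, 1], [0, 0, 0], [1, 1, 0], [1, 1, 1]]
print(impute_single_proxy(reference, {0: 1, 2: 0}, hidden_idx=1))
```

A single proxy is the simplest design choice; methods that combine several observed SNPs, or that model haplotypes with an HMM, use more of the LD structure at the cost of more complexity.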

About me

I am a first-year PhD student in computer science.

Goal for end of quarter

To design and implement simple software for imputation.

Weekly schedule

Week 10

  1. Weekly progress
    • Prepared for the presentation.
    • Gave the presentation.
    • Uploaded my ppt file and R code.
  2. Next week plan
  3. Grade for week: A
  4. Problems that came up
  5. Problems solved this week

Week 9

  1. Weekly progress
    • Completed the second version of the software, the Multi-SNP method.
    • Analyzed the Single-SNP and Multi-SNP imputation methods.
  2. Next week plan
    • Prepare for the presentation.
  3. Grade for week: A
  4. Problems that came up
  5. Problems solved this week
Figures by missingness rate: 1% (impute_1per.jpeg), 5% (impute_5per.jpeg), 10% (impute_10per.jpeg)
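The page does not describe how the 1%, 5%, and 10% missingness experiments were run. One common way to set up such a comparison, sketched here in Python as an assumption, is to hide a random fraction of known alleles, impute them, and score the fraction recovered correctly. All names (`mask_genotypes`, `accuracy`, `impute_major_allele`) are hypothetical, the data are synthetic, and the major-allele imputer is only a placeholder baseline, not the Single-SNP or Multi-SNP method.

```python
import random


def mask_genotypes(alleles, rate, rng):
    """Hide a fraction `rate` of entries; return (masked copy, sorted hidden indices)."""
    n_hide = round(rate * len(alleles))
    hidden = sorted(rng.sample(range(len(alleles)), n_hide))
    masked = list(alleles)
    for i in hidden:
        masked[i] = None                 # None marks a missing allele
    return masked, hidden


def accuracy(truth, imputed, hidden):
    """Fraction of hidden positions where the imputed allele equals the true allele."""
    if not hidden:
        return float("nan")
    return sum(truth[i] == imputed[i] for i in hidden) / len(hidden)


def impute_major_allele(masked):
    """Placeholder imputer: fill each missing entry with the most common observed allele."""
    observed = [a for a in masked if a is not None]
    major = max(set(observed), key=observed.count)
    return [major if a is None else a for a in masked]


# Synthetic alleles for illustration only (not HapMap data).
rng = random.Random(0)
truth = [rng.choice([0, 1]) for _ in range(1000)]
for rate in (0.01, 0.05, 0.10):
    masked, hidden = mask_genotypes(truth, rate, rng)
    imputed = impute_major_allele(masked)
    print(f"{rate:.0%} missing: accuracy {accuracy(truth, imputed, hidden):.3f}")
```

Swapping `impute_major_allele` for a real imputation function would let the same loop compare methods at each missingness rate.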

Week 8

  1. Weekly progress
    • Submitted the preliminary report.
    • Completed the first version of the software, the Single-SNP method.
  2. Next week plan
    • Test and improve the software.
  3. Grade for week: A
  4. Problems that came up
  5. Problems solved this week

Week 7

  1. Weekly progress
    • Finished the parsing module.
    • Started the correlation module.
  2. Next week plan
    • Write a preliminary report.
    • Finish the first version of the software.
  3. Grade for week: A
  4. Problems that came up
  5. Problems solved this week

Week 6

  1. Weekly progress
    • Analyzed the HapMap data (phasing data).
    • Designed the data structure and started to write the module for parsing the input file.
  2. Next week plan
    • Finish the parsing module.
    • Design the correlation module.
  3. Grade for week: A
  4. Problems that came up
  5. Problems solved this week

Week 5

  1. Weekly progress
    • Searched and reviewed papers related to imputation.
    • Downloaded the HapMap data.
  2. Next week plan
    • Define the input format.
    • Start to design a simple algorithm.
  3. Grade for week: A
  4. Problems that came up
  5. Problems solved this week

Week 4

  1. Weekly progress
    • Chose a project topic.
    • Wrote a project proposal.
    • Created a project page.
  2. Next week plan
    • Review the literature.
    • Get the HapMap data.
  3. Grade for week: A
  4. Problems that came up
  5. Problems solved this week