Genotype Calling


Ignacio Zendejas. I'm a second-year MS student finishing up this quarter. My areas of interest are machine learning, data mining, and information retrieval. I'm currently a member of Professor Cardenas' Multimedia and Information System Technology Lab. I obtained my BS here at UCLA and have completed five internships with Hewlett-Packard in software testing, engineering, and research. I am currently a Research Associate in the Business Optimization Lab (BOL) at HP Labs, where I'm in the chameleon group working on personalization algorithms. I'll be finishing up the work there and will be joining a startup in the LA area.

Project Description

I'm working on Project 28 - Genotype Calling because it falls squarely within my areas of interest. I'll focus on completing the medium-level portion, which involves identifying clusters in the genotype plots and using the distance to the cluster centroids to make genotype predictions. I hope to
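The cluster-and-centroid idea described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the project's implementation: it assumes each sample is a pair of (allele A intensity, allele B intensity), that there are three genotype clusters labeled AA, AB, and BB, and that plain k-means with hand-picked starting centroids is an acceptable way to find them. The function names and the sample values are made up for the example.

```python
import math

def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means starting from the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: put every point in the group of its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def call_genotype(point, centroids, labels=("AA", "AB", "BB")):
    """Predict a genotype as the label of the nearest cluster centroid."""
    return labels[nearest(point, centroids)]

# Made-up (allele A intensity, allele B intensity) pairs, for illustration only.
samples = [(0.9, 0.1), (1.0, 0.15), (0.85, 0.05),
           (0.5, 0.5), (0.55, 0.45), (0.45, 0.55),
           (0.1, 0.9), (0.05, 1.0), (0.15, 0.85)]

# Assumed starting guesses: one centroid per expected genotype cluster.
initial = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]

centroids = kmeans(samples, initial)
calls = [call_genotype(s, centroids) for s in samples]
```

With real data the starting centroids, the number of clusters, and the distance measure would all need to be chosen from the actual genotype plots; the fixed order of the labels here only works because the starting centroids are given in the same order.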

Project Goals

My goal is to get my implementation to perform nearly as well as the existing state-of-the-art implementations. I want to try information-theoretic co-clustering because I think this algorithm, combined with some feature selection, has the potential to do better. As of 4/23, though, it's not clear whether this will work.
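For context, information-theoretic co-clustering groups the rows and columns of a joint probability table at the same time, preferring the grouping that loses the least mutual information. The sketch below only computes that quantity for a given row and column assignment; it does not search for the best assignment. The table, the cluster assignments, and the function names are hypothetical, and how genotype data would be turned into such a table is left open here.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits for a joint probability table given as a list of rows."""
    row_tot = [sum(r) for r in joint]
    col_tot = [sum(c) for c in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero cells contribute nothing
                mi += p * math.log2(p / (row_tot[i] * col_tot[j]))
    return mi

def coarsen(joint, row_clusters, col_clusters):
    """Sum the joint table over the given row and column cluster assignments."""
    k = max(row_clusters) + 1
    l = max(col_clusters) + 1
    out = [[0.0] * l for _ in range(k)]
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            out[row_clusters[i]][col_clusters[j]] += p
    return out

def information_loss(joint, row_clusters, col_clusters):
    """Mutual information lost by replacing the table with its co-clustered version."""
    coarse = coarsen(joint, row_clusters, col_clusters)
    return mutual_information(joint) - mutual_information(coarse)

# Made-up 4x4 joint distribution with two obvious blocks (entries sum to 1).
joint = [[0.125, 0.125, 0.0,   0.0],
         [0.125, 0.125, 0.0,   0.0],
         [0.0,   0.0,   0.125, 0.125],
         [0.0,   0.0,   0.125, 0.125]]

good_loss = information_loss(joint, [0, 0, 1, 1], [0, 0, 1, 1])  # follows the blocks
bad_loss = information_loss(joint, [0, 1, 0, 1], [0, 1, 0, 1])   # cuts across them
```

A co-clustering that follows the block structure keeps all of the mutual information, while one that cuts across the blocks loses it, which is the signal a full co-clustering algorithm would use to choose between assignments.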

Tentative Schedule

By end of Week 4

  • Download data [X]
  • Analyze data []
  • Review literature in the area []

Grade so far: C+. I've been very busy the last couple of weeks, so I'm a bit behind here. I should finish most of it by tomorrow.

By end of Week 5

  • Pre-process data
  • Visualize data and run simple statistics to find useful features

By end of Week 6

  • Modify co-clustering algorithm (or better algorithm)

By end of Week 7

  • Run algorithm
  • Evaluate performance, adjust

By end of Week 8

  • Improve algorithm

By end of Week 9

  • Prepare presentation/report