personalized medicine

Welcome to the Personalized Medicine Project page. I am Rui, a first-year graduate student in the ACCESS program, an umbrella program in the biological sciences.

Project Description

This project aims to help doctors decide which SNPs are worth measuring for a given type of disease.

Two websites related to this project provide useful background:

https://www.23andme.com/

http://www.navigenics.com/campaigns/prevention?s_kwcid=navigenics.%7C3883443815

Goal for the quarter

Build a model that helps doctors choose which SNPs are worth measuring, based on the prevalence and risks of the diseases.

Weekly Schedule

Week1 4/23

background learning

Week2 4/30

I calculated Var(r), the variance of the risk increase due to the SNP, for a single SNP, based on the prevalence F and the probability of carrying the SNP.

The problem I met was that I first tried a case-study approach to find the distribution of r, which was difficult. When I switched to deriving the variance directly, it became much easier. But a new problem arose: the formula for Var(r) contains r itself. To apply this formula to real data, I would either need to know r in advance or estimate r from samples, which again resembles a case study.

(1)
$$D=F_{0} \quad \text{with probability } 1-p_{A}$$
(2)
\begin{align} D=F_{0}\gamma \quad \text{with probability } p_{A} \end{align}
(3)
\begin{align} Var(D)=E(D^{2})-(E(D))^{2}=F^{2}_{0}p_{A}(1-p_{A})(\gamma-1)^{2} \end{align}
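
As a quick numerical check of equation (3), the sketch below computes the mean and variance of D directly from the two-point distribution in (1) and (2) and compares the variance with the closed form. The function names and the example values of F_0, p_A and gamma are illustrative assumptions, not values from this project.

```python
def single_snp_moments(f0, p_a, gamma):
    """Mean and variance of D, where D = f0 with probability 1 - p_a
    and D = f0 * gamma with probability p_a."""
    outcomes = [(f0, 1.0 - p_a), (f0 * gamma, p_a)]
    mean = sum(d * prob for d, prob in outcomes)
    second_moment = sum(d * d * prob for d, prob in outcomes)
    return mean, second_moment - mean ** 2


def single_snp_variance_closed_form(f0, p_a, gamma):
    """Closed form from equation (3)."""
    return f0 ** 2 * p_a * (1.0 - p_a) * (gamma - 1.0) ** 2


if __name__ == "__main__":
    # Hypothetical inputs: 5% baseline prevalence, 20% carrier
    # probability, relative risk 1.5.
    f0, p_a, gamma = 0.05, 0.2, 1.5
    mean, var = single_snp_moments(f0, p_a, gamma)
    print("E(D) =", mean)
    print("Var(D) from the distribution =", var)
    print("Var(D) from equation (3) =", single_snp_variance_closed_form(f0, p_a, gamma))
```

The two variance values should agree up to floating-point rounding for any choice of inputs.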

The goal for the next step is to calculate Var(r) for multiple SNPs, then code the calculation in R.

I grade myself A this week.

Week3 5/7

I tried to derive the formula for multiple SNPs.
Assume there are 3 SNPs related to the disease; then there are 8 possible cases for the prevalence.
Call the SNPs A, B, and C.
Their relative risks are $\gamma_{A},\gamma_{B},\gamma_{C}$.
The disease prevalence in each case, followed by the probability of that case, is as follows:

(4)
$$F_{0} \quad \text{with probability } (1-p_{A})(1-p_{B})(1-p_{C})$$
(5)
\begin{align} F_{0}\gamma_{A} \quad \text{with probability } p_{A}(1-p_{B})(1-p_{C}) \end{align}
(6)
\begin{align} F_{0}\gamma_{B} \quad \text{with probability } p_{B}(1-p_{A})(1-p_{C}) \end{align}
(7)
\begin{align} F_{0}\gamma_{C} \quad \text{with probability } p_{C}(1-p_{A})(1-p_{B}) \end{align}
(8)
\begin{align} F_{0}\gamma_{A}\gamma_{B} \quad \text{with probability } p_{A}p_{B}(1-p_{C}) \end{align}
(9)
\begin{align} F_{0}\gamma_{A}\gamma_{C} \quad \text{with probability } p_{A}p_{C}(1-p_{B}) \end{align}
(10)
\begin{align} F_{0}\gamma_{B}\gamma_{C} \quad \text{with probability } p_{B}p_{C}(1-p_{A}) \end{align}
(11)
\begin{align} F_{0}\gamma_{A}\gamma_{B}\gamma_{C} \quad \text{with probability } p_{A}p_{B}p_{C} \end{align}
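
The eight cases above can also be enumerated mechanically, which avoids writing the variance out by hand. The following is a minimal sketch, assuming the SNPs are independent and the relative risks multiply, as the products in (4) to (11) suggest. The function names and the numeric inputs are hypothetical.

```python
from itertools import product


def enumerate_cases(f0, carrier_probs, relative_risks):
    """Return (prevalence, probability) for every carrier/non-carrier combination.

    Assumes independent SNPs and multiplicative relative risks.
    """
    cases = []
    for carriers in product([0, 1], repeat=len(carrier_probs)):
        prevalence, prob = f0, 1.0
        for has_snp, p, gamma in zip(carriers, carrier_probs, relative_risks):
            if has_snp:
                prevalence *= gamma
                prob *= p
            else:
                prob *= 1.0 - p
        cases.append((prevalence, prob))
    return cases


def moments(cases):
    """Mean and variance of the prevalence over the enumerated cases."""
    mean = sum(d * prob for d, prob in cases)
    second_moment = sum(d * d * prob for d, prob in cases)
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    # Hypothetical values for SNPs A, B, C.
    cases = enumerate_cases(0.05, [0.2, 0.1, 0.3], [1.5, 2.0, 1.2])
    for prevalence, prob in cases:
        print(prevalence, prob)
    print("E(D), Var(D) =", moments(cases))
```

Enumeration grows as 2^N, so it is only practical for a small number of SNPs.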

With three SNPs, the formula for the variance is already very complex.
The plan for the next step is to move to R coding for the model, and to find a way to summarize the variance for N SNPs.

I grade myself A this week.

Week4 5/14

After the discussion in class, I got the formula for multiple SNPs. I found a way to use different cases to explain the formula.
I simulated the data, created the SNP matrix in R, and calculated the expectation and variance.
I gave the presentation on the first Friday.
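
The simulation step described above was done in R; the sketch below shows the same idea in Python, using only the standard library. It draws a random SNP matrix, computes each person's risk, and compares the sample mean and variance with a closed form. That closed form is an assumption here: it follows from independent SNPs with multiplicative relative risks, giving E(D) = F_0 * prod(1 - p_i + p_i * gamma_i) and E(D^2) = F_0^2 * prod(1 - p_i + p_i * gamma_i^2), and it is not necessarily the formula obtained in class. All names and numeric inputs are hypothetical.

```python
import random


def simulate_snp_matrix(n_people, carrier_probs, rng):
    """Rows are people, columns are SNPs; 1 means the person carries the SNP."""
    return [[1 if rng.random() < p else 0 for p in carrier_probs]
            for _ in range(n_people)]


def individual_risks(snp_matrix, f0, relative_risks):
    """Risk per person: f0 times the relative risk of every SNP carried."""
    risks = []
    for row in snp_matrix:
        risk = f0
        for has_snp, gamma in zip(row, relative_risks):
            if has_snp:
                risk *= gamma
        risks.append(risk)
    return risks


def sample_mean_and_variance(values):
    """Sample mean and (population-style) variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return mean, sum((v - mean) ** 2 for v in values) / n


def closed_form_moments(f0, carrier_probs, relative_risks):
    """Assumed closed form: independent SNPs, multiplicative relative risks."""
    mean_factor, second_factor = 1.0, 1.0
    for p, gamma in zip(carrier_probs, relative_risks):
        mean_factor *= 1.0 - p + p * gamma
        second_factor *= 1.0 - p + p * gamma ** 2
    mean = f0 * mean_factor
    return mean, f0 ** 2 * second_factor - mean ** 2


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    probs, risks = [0.2, 0.1, 0.3], [1.5, 2.0, 1.2]  # hypothetical inputs
    matrix = simulate_snp_matrix(50000, probs, rng)
    print("simulated:", sample_mean_and_variance(individual_risks(matrix, 0.05, risks)))
    print("closed form:", closed_form_moments(0.05, probs, risks))
```

With enough simulated people, the simulated mean and variance should approach the closed-form values.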

I grade myself A- this week.

Week5 5/22

I prepared the presentation report for this project.
I also thought of another way to measure the variation in disease prevalence.
Besides using the criteria, one can use the following intervals:

(12)
\begin{align} E(D)\pm 2\sigma \end{align}
(13)
\begin{align} F_{0}\pm 2\sigma \end{align}
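
The two intervals in (12) and (13) differ only in their center. A minimal sketch, assuming sigma is the square root of Var(D); the function name and the example numbers are hypothetical:

```python
import math


def two_sigma_interval(center, variance):
    """Return (center - 2*sigma, center + 2*sigma), with sigma = sqrt(variance)."""
    sigma = math.sqrt(variance)
    return center - 2.0 * sigma, center + 2.0 * sigma


if __name__ == "__main__":
    # Hypothetical single-SNP values: F_0 = 0.05, E(D) = 0.055, Var(D) = 0.0001.
    print("E(D) +/- 2 sigma:", two_sigma_interval(0.055, 0.0001))
    print("F_0  +/- 2 sigma:", two_sigma_interval(0.05, 0.0001))
```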

I grade myself A+ this week.

page revision: 11, last edited: 12 Jun 2009 05:03