# Project Description

Each parent transmits one chromosome of each pair to each child. As a result, full siblings share approximately 50% of their DNA. The relatedness between two or more people can be estimated from this fact.

# About us

KangWon Lee: Second-year M.S. student in Computer Science.

Alfred Heu: Second-year M.S. student in Computer Science.

# Project Goals

1. Given the genotypes of two individuals, how can we tell if they are siblings or not?

2. SNPs with which minor allele frequencies (MAFs) are informative?

3. How many SNPs should be taken into consideration?

- Simulation on randomly generated data based on the probability matrix

# Probability Matrix

**1. Unrelated**

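Consider a SNP with major allele A and minor allele G, and a minor allele frequency (MAF) of 0.1. Assuming Hardy-Weinberg proportions, the genotype frequencies are:

$P(AA) = (0.9)^2 = 0.81$

$P(AG) = 2 \times (0.9) \times (0.1) = 0.18$

$P(GG) = (0.1)^2 = 0.01$

$P(GG)$ is the probability of carrying the minor allele on both chromosomes. The factor of 2 in $P(AG)$ counts both ways of being heterozygous (A from one parent and G from the other, or the reverse).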

Let individual 1 and individual 2 both have genotype GG, and assume that they are not related. Their genotypes are then independent:

$P_u(GG, GG) = P(GG) \times P(GG) = (0.1)^2 \times (0.1)^2 = 0.0001$
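The unrelated case can be checked numerically. The following is a minimal sketch, assuming Hardy-Weinberg proportions and a single biallelic SNP with alleles A (major) and G (minor); the function names `genotype_freqs` and `unrelated_joint` are illustrative and not taken from the project code.

```python
def genotype_freqs(maf):
    """Hardy-Weinberg genotype frequencies for major allele A and minor allele G."""
    p, q = 1.0 - maf, maf
    # The heterozygote gets a factor of 2 because AG and GA are the same genotype.
    return {"AA": p * p, "AG": 2 * p * q, "GG": q * q}


def unrelated_joint(maf):
    """3x3 matrix P_u(g1, g2): product of the two marginal genotype frequencies."""
    freqs = genotype_freqs(maf)
    return {(g1, g2): freqs[g1] * freqs[g2] for g1 in freqs for g2 in freqs}


pu = unrelated_joint(0.1)
print(round(pu[("GG", "GG")], 6))  # 0.0001
```

The nine entries of the matrix sum to 1, which is a quick sanity check on the genotype frequencies.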

**2. Related (Full siblings)**

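Let individual 1 and individual 2 both have genotype GG, and assume that they are full siblings.

Consider the parents of these two siblings. Both parents must carry at least one G allele, so four ordered parental genotype pairs can produce two GG children: AG × AG, AG × GG, GG × AG and GG × GG. An AG parent transmits G with probability 0.5 and a GG parent with probability 1, so each child is GG with probability 0.25, 0.5, 0.5 and 1 respectively.

Using $P(AG) = 2 \times 0.9 \times 0.1 = 0.18$ and $P(GG) = 0.01$ for the parents:

$P_r(GG, GG) = P_u(AG, AG) \times 0.25 \times 0.25 + P_u(AG, GG) \times 0.5 \times 0.5 \times 2 + P_u(GG, GG) \times 1 \times 1 = 0.002025 + 0.0009 + 0.0001 = 0.003025$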
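The same sum over parental genotypes can be carried out for every pair of sibling genotypes. The sketch below is one possible way to do it, assuming unrelated parents in Hardy-Weinberg proportions and Mendelian transmission; `child_probs` and `sibling_joint` are hypothetical helper names introduced here for illustration.

```python
from itertools import product


def genotype_freqs(maf):
    """Hardy-Weinberg genotype frequencies for major allele A and minor allele G."""
    p, q = 1.0 - maf, maf
    return {"AA": p * p, "AG": 2 * p * q, "GG": q * q}


def child_probs(mother, father):
    """Mendelian distribution of a child's genotype given both parental genotypes."""
    out = {"AA": 0.0, "AG": 0.0, "GG": 0.0}
    # Each parent passes on one of its two alleles with probability 1/2.
    for a, b in product(mother, father):
        out["".join(sorted(a + b))] += 0.25
    return out


def sibling_joint(maf):
    """3x3 matrix P_r(g1, g2) for full siblings, summed over all parental pairs."""
    freqs = genotype_freqs(maf)
    joint = {(g1, g2): 0.0 for g1 in freqs for g2 in freqs}
    for mother, father in product(freqs, repeat=2):
        child = child_probs(mother, father)
        for g1, g2 in joint:
            # Given the parents, the two children are drawn independently.
            joint[(g1, g2)] += freqs[mother] * freqs[father] * child[g1] * child[g2]
    return joint


pr = sibling_joint(0.1)
print(round(pr[("GG", "GG")], 6))  # 0.003025
```

Only the four parental pairs listed above contribute to the GG/GG entry, because every other pair has zero probability of producing a GG child.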

**3. Example**

Probability matrices for four cases:

- MAF = 0.1, unrelated
- MAF = 0.1, related
- MAF = 0.4, unrelated
- MAF = 0.4, related

# Simulations & Results

**We randomly generated 10,000 pairs of unrelated individuals and 10,000 pairs of siblings, sampling their genotypes from the probability matrix.**

Example: MAF = 0.1, unrelated, 5 SNPs
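Sampling from the probability matrix can be sketched as follows. This is an illustration rather than the project's actual generator: it assumes that all SNPs share the same MAF and are independent of each other, and `sample_pair` is a hypothetical helper name. The unrelated matrix at MAF = 0.1 with 5 SNPs is used to match the example above.

```python
import random


def unrelated_joint(maf):
    """Joint genotype probabilities for two unrelated individuals (Hardy-Weinberg)."""
    p, q = 1.0 - maf, maf
    freqs = {"AA": p * p, "AG": 2 * p * q, "GG": q * q}
    return {(g1, g2): freqs[g1] * freqs[g2] for g1 in freqs for g2 in freqs}


def sample_pair(joint, n_snps, rng):
    """Draw one pair of individuals: n_snps independent cells of the matrix."""
    cells = list(joint)
    weights = [joint[cell] for cell in cells]
    draws = rng.choices(cells, weights=weights, k=n_snps)
    ind1 = [g1 for g1, _ in draws]
    ind2 = [g2 for _, g2 in draws]
    return ind1, ind2


rng = random.Random(0)  # fixed seed so the run is repeatable
matrix = unrelated_joint(0.1)
pairs = [sample_pair(matrix, n_snps=5, rng=rng) for _ in range(10_000)]
print(pairs[0])
```

Passing a sibling probability matrix instead of `matrix` would produce sibling pairs with the same function, since only the cell weights change.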

**Estimation Method**

**Result**

- 10,000 pairs of siblings and 10,000 pairs of unrelated individuals

- Number of SNPs: 1, 2, 5, 10, 20, 30, 50, 100

- MAFs tested: 0.05, 0.1, 0.2, 0.3, 0.4

**MAF = 0.4 gave the best result.**
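The estimation method itself is not spelled out on this page, so the sketch below substitutes a plain log-likelihood-ratio rule as an assumption: a pair is called siblings when the summed log of $P_r / P_u$ over its SNPs is positive. It also assumes independent SNPs with a common MAF. The function `error_rate` and its parameters are hypothetical, and the numbers it prints are not the project's reported results.

```python
import math
import random
from itertools import product


def genotype_freqs(maf):
    p, q = 1.0 - maf, maf
    return {"AA": p * p, "AG": 2 * p * q, "GG": q * q}


def unrelated_joint(maf):
    f = genotype_freqs(maf)
    return {(a, b): f[a] * f[b] for a in f for b in f}


def sibling_joint(maf):
    f = genotype_freqs(maf)
    joint = {(a, b): 0.0 for a in f for b in f}
    for mother, father in product(f, repeat=2):
        child = {g: 0.0 for g in f}
        for x, y in product(mother, father):  # one allele from each parent
            child["".join(sorted(x + y))] += 0.25
        for a, b in joint:
            joint[(a, b)] += f[mother] * f[father] * child[a] * child[b]
    return joint


def error_rate(maf, n_snps, n_pairs, rng):
    """Fraction of misclassified pairs among n_pairs siblings + n_pairs unrelated."""
    pu, pr = unrelated_joint(maf), sibling_joint(maf)
    # Assumed decision rule (not from the source): sign of the summed log ratio.
    llr = {cell: math.log(pr[cell] / pu[cell]) for cell in pu}
    cells = list(pu)
    errors = 0
    for truth, joint in (("sibling", pr), ("unrelated", pu)):
        weights = [joint[cell] for cell in cells]
        for _ in range(n_pairs):
            draws = rng.choices(cells, weights=weights, k=n_snps)
            score = sum(llr[cell] for cell in draws)
            call = "sibling" if score > 0 else "unrelated"
            errors += call != truth
    return errors / (2 * n_pairs)


rng = random.Random(0)
for maf in (0.05, 0.1, 0.2, 0.3, 0.4):
    print(maf, error_rate(maf, n_snps=50, n_pairs=1000, rng=rng))
```

Varying `n_snps` over the values listed above shows how the error rate changes with the number of SNPs under this particular rule.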

# Conclusion

**With 40 SNPs (MAF = 0.2 to 0.4), the error rate is below 0.05.**

Among the MAFs tested, 0.4 gave the best result.

**Notable point**

Why do lower MAFs give a higher error rate?

If we know that a sample carries the minor allele, a lower MAF definitely helps, because sharing a rare allele is strong evidence of relatedness.

But at a low MAF, most pairs have genotypes AA and AA whether they are related or not, which makes the result more ambiguous.
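This effect can be made concrete by computing how often both individuals are AA under each hypothesis. The sketch below assumes Hardy-Weinberg proportions with major allele frequency $p = 1 - \mathrm{MAF}$. For unrelated pairs the probability is $p^4$. For full siblings it is $p^2 \left(\frac{1+p}{2}\right)^2$, which is the major-allele counterpart of the GG/GG sibling calculation. The function name is illustrative.

```python
def p_both_major_homozygous(maf):
    """P(both individuals are AA) for unrelated pairs and for full siblings."""
    p = 1.0 - maf
    unrelated = p ** 4
    # First sibling is AA with probability p^2; given that, each parent passes
    # A to the second sibling with probability (1 + p) / 2.
    siblings = p ** 2 * ((1 + p) / 2) ** 2
    return unrelated, siblings


for maf in (0.05, 0.1, 0.2, 0.3, 0.4):
    u, s = p_both_major_homozygous(maf)
    print(f"MAF={maf}: unrelated={u:.4f} siblings={s:.4f} ratio={s / u:.3f}")
```

At MAF = 0.05 more than 80% of pairs are AA/AA under either hypothesis and the ratio between the two probabilities is close to 1, so such a SNP rarely separates siblings from unrelated pairs. At MAF = 0.4 the AA/AA cell is much less common and the two probabilities differ more clearly.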

**Week 10**

**Progress**

Completed coding.

Completed analysis.

Completed presentation preparation.

**Plans**

Presentation this week.

**Grade: A**

**Week 9**

**Progress**

Completed coding, completed simulation.

**Plans**

Analyse and prepare for the presentation.

**Grade: A**

**Problems that arose this week**

Presentation data.

**Problems that were solved this week**

Analysis of the output data.

**Week 8**

**Progress**

Finished the code that calculates relatedness for full siblings and unrelated individuals.

Discussed how to create data for siblings and unrelated individuals.

**Plans**

Complete the coding and run the simulations for analysis.

**Grade: A**

**Problems that arose this week**

How many SNPs should we take into account?

Simulation-specific questions arose.

**Problems that were solved this week**

Decided how to create simulation data.

**Week 7**

**Progress**

Asked the professor about the method and worked out a simple method to estimate relatedness.

Started coding.

**Plans**

Write the code for the probability matrix.

**Grade: A**

**Problems that arose this week**

How should we generate the random simulation data?

**Problems that were solved this week**

Finally decided on the method for estimating relatedness.

**Week 6**

**Progress**

Read published papers.

**Plans**

Make specific plans for coding and simulations.

**Grade: A**

**Problems that arose this week**

The published papers were hard to read. Although they described many ways to apply the method, most of them were too hard to apply within the project term.

**Problems that were solved this week**

Probability matrix.

**Week 5**

**Progress**

Solved the basic problems on our slides and understood the effect of differences in MAF and relationship.

We understood how MAF could help to find relatedness between two individuals.

**Plans**

Read published papers (Estimation of Pairwise Relatedness With Molecular Markers, etc.).

**Grade: A**

**Problems that arose this week**

A method to estimate relatedness from multiple SNPs.

**Problems that were solved this week**

Basic method for estimating relatedness from a SNP.

**Week 4**

**Progress**

We decided on the project topic and made outline plans.

Made the project page on the wiki.

**Plans**

Background research.

**Grade: A**

**Problems that arose this week**

N/A.

**Problems that were solved this week**

N/A.