Statistics Genomics Module #3
1. Load the example SNP data with the following code:
Fit a linear model and a logistic regression model to the data for the 3rd SNP. What are the coefficients for the SNP variable? How are they interpreted? (Hint: Don't forget to recode the 0 values to NA for the SNP data)
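As a rough illustration of the two fits, here is a minimal sketch on simulated data, not the quiz data. The names `snp` and `status` are placeholders, and the genotype coding (1/2/3 with 0 meaning missing) is an assumption based on the hint.

```r
# Simulated stand-ins for the quiz data.
# Assumption: genotypes are coded 1/2/3 and 0 marks a missing call.
set.seed(1)
n <- 500
snp <- sample(0:3, n, replace = TRUE, prob = c(0.05, 0.50, 0.35, 0.10))
status <- rbinom(n, 1, 0.5)   # 0 = control, 1 = case

# Recode missing genotypes before modelling; lm() and glm() drop NA rows by default.
snp[snp == 0] <- NA

lm_fit  <- lm(status ~ snp)                        # linear model
glm_fit <- glm(status ~ snp, family = "binomial")  # logistic regression

coef(lm_fit)["snp"]   # change in P(case) per one-unit increase in genotype
coef(glm_fit)["snp"]  # change in log odds of being a case per one-unit increase
```

The linear coefficient is on the probability scale, while the logistic coefficient is on the log odds scale, so the two numbers are not directly comparable.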
2. In the previous question, why might logistic regression be a better choice than linear regression?
3. Load the example SNP data with the following code:
Fit logistic regression models on a recessive scale (two copies of the minor allele are needed to confer risk) and on an additive scale for the 10th SNP. Make a table of the fitted values versus the case/control status. Does one model fit better than the other?
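One way the two codings could be set up is sketched below on simulated data. It assumes genotype 3 is the homozygous minor genotype, which may not match the real data; `snp`, `snp_rec` and `status` are illustrative names.

```r
# Simulated genotypes (1/2/3) and case/control status; not the quiz data.
set.seed(2)
n <- 500
snp <- sample(1:3, n, replace = TRUE, prob = c(0.50, 0.35, 0.15))
status <- rbinom(n, 1, 0.5)

# Recessive coding: 1 only when two copies of the minor allele are present.
# Assumption: genotype 3 is the homozygous minor genotype.
snp_rec <- as.numeric(snp == 3)

fit_add <- glm(status ~ snp, family = "binomial")      # additive scale
fit_rec <- glm(status ~ snp_rec, family = "binomial")  # recessive scale

# Fitted probabilities (rounded so they group cleanly) versus status.
table(round(fitted(fit_add), 3), status)
table(round(fitted(fit_rec), 3), status)
```

The additive model yields one fitted probability per genotype, while the recessive model yields only two, which makes the tables easy to compare by eye.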
4. Load the example SNP data with the following code:
Fit an additive logistic regression model to each SNP. What is the average effect size? What is the maximum? What is the minimum?
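A per-SNP loop might look like the sketch below. It runs on a simulated genotype matrix, so the numbers it prints say nothing about the quiz data. The object names and the 0-means-missing coding are assumptions, and "effect size" is taken here to mean the fitted coefficient.

```r
# Simulated genotype matrix: rows are subjects, columns are SNPs (not the quiz data).
set.seed(3)
n <- 300
m <- 50
snpdata <- matrix(
  sample(0:3, n * m, replace = TRUE, prob = c(0.05, 0.45, 0.35, 0.15)),
  nrow = n
)
status <- rbinom(n, 1, 0.5)

# Fit one additive logistic model per column and keep the genotype coefficient.
effects <- apply(snpdata, 2, function(g) {
  g <- as.numeric(g)
  g[g == 0] <- NA   # assumption: 0 marks a missing genotype
  unname(coef(glm(status ~ g, family = "binomial"))["g"])
})

mean(effects)
max(effects)
min(effects)
```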
5. Load the example SNP data with the following code:
Fit an additive logistic regression model to each SNP and square the coefficients. What is the correlation with the results from using and ? Why does this make sense?
6. Load the Montgomery and Pickrell eSet:
Do the log2(data + 1) transform and calculate F-statistics for the difference between studies/populations using genefilter::rowFtests and using genefilter::rowttests. Do you get the same statistic? Do you get the same p-value?
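The relationship between the two tests in a two-group comparison can be checked on a single simulated gene. This sketch uses base R (`t.test` with equal variances and `anova` on a linear model) rather than the genefilter functions named in the question, so it illustrates the statistics only, not those functions' output.

```r
# One simulated gene measured in two groups; not the Montgomery and Pickrell data.
set.seed(4)
y <- log2(rpois(20, lambda = 50) + 1)          # log2(count + 1) transform
group <- factor(rep(c("A", "B"), each = 10))

t_res <- t.test(y ~ group, var.equal = TRUE)   # two-sample t-test, pooled variance
f_res <- anova(lm(y ~ group))                  # one-way F-test on the same data

t_stat <- unname(t_res$statistic)
f_stat <- f_res[["F value"]][1]

c(t_squared = t_stat^2, F = f_stat)            # compare the statistics
c(p_t = t_res$p.value, p_F = f_res[["Pr(>F)"]][1])  # compare the p-values
```

With two groups and a pooled variance, the F-statistic is the square of the t-statistic, so the statistics differ while the p-values agree.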
7. Load the Montgomery and Pickrell eSet:
First test for differences between the studies using the package using the function. Then do the log2(data + 1) transform and do the test for differences between studies using the package and the , and functions. What is the correlation in the statistics between the two analyses? Are there more differences for the large statistics or the small statistics (hint: Make an MA-plot).
8. Apply the Benjamini-Hochberg correction to the P-values from the two previous analyses. How many results are statistically significant at an FDR of 0.05 in each analysis?
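The correction itself is available in base R through `p.adjust`. The sketch below applies it to simulated P-values, so the count it prints is unrelated to the quiz answer; the vector `p` is a placeholder for the P-values from either analysis.

```r
# Simulated P-values: 900 from the null and 100 skewed toward zero (not quiz data).
set.seed(5)
p <- c(runif(900), rbeta(100, 0.5, 20))

# Benjamini-Hochberg adjusted P-values.
p_bh <- p.adjust(p, method = "BH")

# Number of results significant at an FDR of 0.05.
sum(p_bh < 0.05)
```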
9. Is the number of significant differences surprising for the analysis comparing studies from Question 8? Why or why not?
10. Suppose you observed the following P-values from the comparison of differences between studies. Why might you be suspicious of the analysis?