Teaching Statistical Methods For The Health Sciences In The Post-Genomic Era

Proceedings of the sixth international conference on teaching statistics, Developing a statistically literate society
Aires, N. & Thelle, D.
Phillips, B.
International Statistical Institute

In many complex diseases, researchers have observed that neither genetic nor environmental factors alone determine the disease. This observation suggests the hypothesis that human disease is caused by genetic and environmental factors acting together, which leads to the concept of multifactorial causes of disease. At the same time, the recent compilation of the draft human genome sequence has made it possible to detect candidate genes for complex diseases and to study them in relation to environmental factors. Gene-environment interaction may not be easy to analyze because of the complex structure the factors involved may have. These factors differ in nature and should be treated at different stages of the study. Particular attention should be paid to study size and design. Epidemiological studies that aim to identify candidate genes contributing to complex diseases, or to detect gene-gene or gene-environment interactions, require large sample sizes because many variables are studied simultaneously. Larger patient populations ensure that individual subgroups retain adequate power to detect significant results with narrow confidence intervals. In this paper we focus on the advantages and disadvantages of classic multifactorial statistical methods applied to the health sciences and to genome scans.
