This presentation is a part of a series of lessons on the Analysis of Categorical Data. This lecture covers the following: linear probability model, non-constant variance, logistic model, logit transformation, and probit link.
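As a rough illustration of the topics this lecture lists, the sketch below contrasts a linear probability model with a logistic model and shows the logit and probit links along with the non-constant variance of a binary response. It is not taken from the lecture itself: the coefficients `B0` and `B1` and all function names are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def logit(p):
    """Logit link: the log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse logit, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def probit(p):
    """Probit link: the standard normal quantile of p in (0, 1)."""
    return _STD_NORMAL.inv_cdf(p)


def inv_probit(eta):
    """Inverse probit: the standard normal CDF."""
    return _STD_NORMAL.cdf(eta)


def bernoulli_variance(p):
    """Variance of a binary response with success probability p.

    It depends on p, which is the non-constant variance problem
    for ordinary least squares on a 0/1 outcome.
    """
    return p * (1.0 - p)


# Hypothetical coefficients for the linear predictor B0 + B1 * x.
B0, B1 = 0.1, 0.25


def linear_probability(x):
    """Linear probability model: can leave the interval [0, 1]."""
    return B0 + B1 * x


def logistic_probability(x):
    """Logistic model: the same linear predictor, mapped into (0, 1)."""
    return inv_logit(B0 + B1 * x)
```

With these made-up coefficients, `linear_probability(4)` exceeds 1, while `logistic_probability(4)` stays strictly between 0 and 1.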
This presentation discusses modeling cluster correlation explicitly through random effects, yielding a generalized linear mixed effects model (GLMM). Part II contains many examples of applications to different studies.
This presentation discusses modeling cluster correlation explicitly through random effects, yielding a generalized linear mixed effects model (GLMM).
Includes detailed PowerPoint slides for 20 lectures on topics including generalized linear models, logistic regression, and random effects models.
This course covers methodology, major software tools, and applications in data mining. By introducing the principal ideas of statistical learning, the course helps students understand the conceptual underpinnings of data mining methods. It focuses more on using existing software packages (mainly in R) than on having students develop the algorithms themselves. The topics include statistical learning; resampling methods; linear regression; variable selection; regression shrinkage; dimension reduction; non-linear methods; logistic regression; discriminant analysis; nearest-neighbors; decision trees; bagging; boosting; support vector machines; principal components analysis; clustering. Perfect for students and teachers wanting to learn or acquire materials on this topic.
This graduate-level course offers an introduction to regression analysis. A researcher is often interested in using sample data to investigate relationships, with the ultimate goal of creating a model to predict a future value of some dependent variable. The process of finding the mathematical model that best fits the data involves regression analysis. STAT 501 is an applied linear regression course that emphasizes data analysis and interpretation, and it is well suited to both students and teachers of statistics courses.
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.
This page introduces contingency tables with an example on fruit trees and fire blight. Two calculators are provided that allow users to enter their own contingency table and test for treatment effects. The first calculator performs Fisher's Exact Test on a 2x2 table. The second performs a chi-square test on tables of up to 9x9.
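For readers who want to see what the two calculators compute, here is a minimal standard-library sketch. It is not the page's own code: the function names are hypothetical, the two-sided Fisher p-value uses the common convention of summing all tables no more probable than the observed one, and the chi-square function returns only the Pearson statistic and degrees of freedom (turning that into a p-value needs a chi-square distribution function, for example `scipy.stats.chi2.sf`).

```python
from math import comb


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table [[a, b], [c, d]].

    With all margins held fixed, the top-left cell follows a
    hypergeometric distribution. The p-value sums the probabilities of
    every table that is no more probable than the observed one.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    p_obs = prob(a)
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df
```

For the made-up table `[[3, 1], [1, 3]]`, the Fisher p-value is 34/70 and the chi-square statistic is 2.0 on 1 degree of freedom.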
This page introduces the Kolmogorov-Smirnov test, gives background and procedures for the test, and provides a calculation page which allows users to enter their own data and perform the test.
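As a hedged sketch of the kind of calculation such a page performs, the code below computes the one-sample Kolmogorov-Smirnov statistic against a user-supplied CDF and an approximate p-value from the asymptotic Kolmogorov distribution. The function names are hypothetical, and the asymptotic p-value is only a rough guide for small samples; the page's own procedure may differ.

```python
import math


def ks_statistic(sample, cdf):
    """One-sample KS statistic D: the largest gap between the empirical
    distribution function of `sample` and the hypothesized `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so check the gap on both sides of the jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_asymptotic_pvalue(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution:
    Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2),
    with lam = sqrt(n) * d. This is a large-sample approximation."""
    lam = math.sqrt(n) * d
    if lam == 0.0:
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    # Clamp to [0, 1] because the truncated series can stray slightly.
    return min(1.0, max(0.0, 2.0 * total))


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used as an example null."""
    return min(1.0, max(0.0, x))
```

For example, `ks_statistic([0.1, 0.4, 0.7], uniform_cdf)` measures how far three made-up observations sit from a uniform distribution on [0, 1], and `ks_asymptotic_pvalue` converts that distance into an approximate p-value.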