Results of USPROC 2007 Competition

First Award (Includes cash prize of $750)
Title: Numbers Don't Lie
Abstract: This study shows the difference between a Major League Baseball homerun hitter who uses steroids and one who does not. The sample comprises players who reached 500 career homeruns without steroids and the top homerun hitters who have been accused of using steroids or proven to have used them. It is not entirely clear who used steroids and when they began, but based on league testing and congressional investigations these are players who can be labeled as steroid users with a large degree of certainty. The players who used steroids were able to prolong the productivity of their careers. Based on the data from players who have hit 500 or more career homeruns without the assistance of steroids, it is apparent that most Major League players peak in their homerun production between their 6th and 10th seasons. Players who use steroids peak much later in their careers, around their 11th through 17th seasons. Even though they are able to increase their productivity later in their careers, there is no statistical evidence that steroid users are able to sustain this level of productivity over an extended period of time. Players who have not used steroids have a higher single-season homerun average over their entire careers as well as over their five best seasons. Steroids seem to give players an advantage for one season. The top six single-season homerun totals all belong to steroid users, and the probability of a steroid user breaking the record for most homeruns in one year is much greater than that of a non-user. The statistical analysis does not provide conclusive evidence for all Major League Baseball players, but rather for those who have made a career out of hitting homeruns.
Download Presentation (PDF)

Second Award (Includes cash prize of $500)
Title: Bayesian Applications for Obsidian Artifact Dating
Abstract: The goal of this research is to improve upon current obsidian hydration dating by implementing a Bayesian approach. The benefit of such an approach is that we are able to combine both empirical data and scientific knowledge of physical diffusion models to arrive at an adequate model for estimating the age of obsidian artifacts. We hope to build upon current models by taking into consideration many known sources of error that are presently not accounted for. In the end we propose a model that will aid archaeologists in understanding our rich history.
Download Presentation (PDF)

Third Award (Includes cash prize of $250)
Title: Forecasting Hotel Occupancy Rates for JHM Hotels, Inc.
Abstract: In the fall of 2006, my team and I worked on an ongoing project with representatives from JHM Hotels, Inc. ("JHM"). JHM is a multi-brand hotel management company founded in 1981. JHM owns 29 hotel properties with over 4,000 rooms and 1,000 associates in the Southwest United States. They operate under franchise flags such as Hilton, Marriott, and Holiday Inn. Our project focused on developing forecasting models of daily occupancy rates for hotels owned and managed by JHM. This paper will describe our team's effort to develop master datasets for three particular hotels and explain some of the interesting forecasting models developed using the data.
Download Presentation (PDF)

Honorable Mention
Title: Dental Health and Socioeconomic Status in Southern India
Abstract: A dental health questionnaire and an oral examination were administered to 110 and 213 people, respectively, in rural Chennakuppam, India. Data were collected by surveying every fifth house in the village.
We model dental caries in the 110 people given the questionnaire using a zero-inflated Poisson model because a large number of people have no dental caries, but the distribution is otherwise Poisson. We model missing teeth in the 213 people given an oral examination. Because people can have no more than 32 missing teeth and no fewer than zero, there is inflation of observations at zero and 32 in the distribution of missing teeth. We model missing teeth using a random intercepts Tobit model with censored observations at zero and 32. It is a mixed model to account for correlation by caste in the dataset. We found that increasing age and being a member of the forward or scheduled caste are positively associated with dental caries. Sugar consumption, brushing teeth using a toothbrush, and brushing teeth using a stick are negatively associated with dental caries. Age was found to be positively associated with missing teeth, and some correlation was found within castes.

Honorable Mention
Title: Who is Baseball's Best Batter?
Abstract: A definition of "the best baseball batter" is established as a combination of my own opinions and logic and those of Michael Shell. This definition is then applied by first using Bill James' runs created statistic. An explanation of the choice, construction, and function of the statistic is given. The initial sample space studied was all non-pitching players inducted into the Baseball Hall of Fame, two players currently banned from entering the Hall, and one current player, giving a total of 160 players. After evaluating each player's runs created statistic, the sample was narrowed down to ten and then to three after a "runs created above MLB league average" measure was applied. The top three players were further examined on their batting abilities by accounting for the home ballparks they played in over their careers. An in-depth justification for the ballpark factors is given.
After computing the adjustment factors and applying them to the top three, the best batter was named. Extensions, problems, and the reasoning behind the conclusion are discussed.

Honorable Mention
Title: Exploring College Football Outcomes
Abstract: In this project, we explore various methods of modeling the results of the 2006 Division I-A college football season. A logistic regression modeling wins and losses is presented, followed by two linear models which use a team's actual score as the response variable. The weaknesses of the three basic models are shown for anyone hoping to simultaneously rank a set of teams and predict future wins and losses as well as individual team scores. A final model is then proposed, which uses teams' scores as the response variable, but rewards teams more for winning and penalizes teams more for losing by establishing a function which allows the winning team to "steal" points from the losing team. This final model is shown to manipulate final scores by approximately 4 points per team per game, though the standard error in predicting individual teams' scores increases by less than 0.5 points. Additionally, this new model correctly predicts significantly more games than the initial models. A plausible objection to this new model is that USC, rather than Florida, is shown to be the "best" team. Evidence is given to argue for the possibility of USC's dominance, though we readily admit that none of the top teams are significantly better than the others, and that a definitive "champion" cannot be crowned with so little data.

Honorable Mention
Title: Forecasting Frozen Concentrated Orange Juice Futures Contract Prices
Abstract: Our project aimed to develop a model to forecast monthly frozen concentrated orange juice (FCOJ) futures contract prices as quoted by the New York Board of Trade.
FCOJ futures are a financial instrument used by producers, purveyors of fruit juice, and speculators to hedge against price changes by locking in a set price for future delivery. The FCOJ contract is for 15,000 pounds of frozen orange juice concentrate; the contract stipulates requirements for color, flavor, and absence of defects to standardize the commodity for widespread use. The specific futures contract we analyzed, FCOJ-A, mandates that concentrate be rendered from oranges grown in either Florida or Brazil. Historically, nearly 70% of oranges used to fulfill these contracts were grown in close proximity to Orlando, Florida. Our data set consists of monthly price data for FCOJ from January 1990 to November 2006. We attempted to forecast FCOJ futures prices for June through October 2006 to measure the predictive power of our model.

The 2007 USPROC Competition Committee
Felix Famoye (famoy1kf@cmich.edu)

The panel of judges
John Daniels (john.daniels@cmich.edu)