Hello SBI listserv participants and SBI blog readers,
Hope you are enjoying your Saturday morning!
First, thank you for your discussions on/contributions to the listserv - it is great to hear about all the things that statistics teachers are doing in their classes!
Second, we have several new articles on the Simulation-based Inference blog (https://www.causeweb.org/sbi/) that have been recently posted:
1) We have two new posts on "How to use real data" by Kevin Ross and Nathan Tintle.
2) Erin Blankenship, Karen McGaughey, and Kathryn Dobeck have written about their experiences and what they thought was "The hardest thing about getting started with simulation-based curricula."
3) For readers interested in "How to implement simulation-based methods in high school classrooms/AP Statistics classes" - we have articles from Bob Peterson, Catherine Case, and Josh Tabor, all AP Statistics teachers, writing about their experiences.
On behalf of the ISI team, I'd like to thank all our blog contributors for writing these pieces for us.
I hope you enjoy reading these articles, and others posted on the blog, as much as I do!
Have a nice weekend!
- Soma
-----------------------
Soma Roy
Associate Professor
Statistics
California Polytechnic State University
San Luis Obispo CA 93407
Phone no.: (805)-756-5250
"… for whenever you learn something new, the whole world becomes that much richer." - Norton Juster, The Phantom Tollbooth
Thanks everyone for your great questions!
I've attached one that I've been using for a while. It gets at a situation
where simulation is OK, but the normal approximation (theory-based)
approach is not. It's a bit too specific to its context, and to the
applets our group has developed, to use 'as is' in your course, but
adapting it should be fairly straightforward. The "note" was added after
the first time I included the question, when a number of students
attributed the difference to chance variation in the simulated
distribution.
I like this question because it really gets at whether students can apply
the idea of 'validity conditions' in context, and the difference in p-values
is noticeable enough to matter.
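To make the contrast concrete, here is a minimal sketch in Python. It is not the attached question: the counts (4 successes in 12 trials, null proportion 0.10) and the function names are made up for illustration. With so few expected successes, the usual rule of thumb for the normal approximation (roughly 10 or more successes and failures) is not met, and the two p-values disagree by far more than simulation noise would explain.

```python
import math
import random

def simulated_p_value(successes, n, p_null, reps=20000, seed=1):
    """Upper-tail p-value estimated by simulating counts under the null."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        count = sum(rng.random() < p_null for _ in range(n))
        if count >= successes:
            hits += 1
    return hits / reps

def normal_p_value(successes, n, p_null):
    """Upper-tail p-value from the normal approximation (no continuity correction)."""
    p_hat = successes / n
    se = math.sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / se
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical data: 4 successes in 12 trials, null proportion 0.10
sim = simulated_p_value(4, 12, 0.10)
approx = normal_p_value(4, 12, 0.10)
print(f"simulation: {sim:.4f}   normal approximation: {approx:.4f}")
```

Rerunning with a different seed moves the simulated p-value only slightly, which is one way to show students that chance variation in the simulation is not the explanation for the gap.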
Have a great weekend!
Nathan
On Wed, Mar 25, 2015 at 2:53 PM, Eric Reyes <reyesem(a)rose-hulman.edu> wrote:
> I loved Allan's questions (and in fact even included one on an exam this
> week). Given that this reduced my exam writing time, I thought I would
> pass along a few of my favorites as well. The full questions are attached.
>
> The first pair of questions makes use of a side-by-side boxplot. I like
> the first pair of questions because it tests the idea that the strength of
> evidence is dependent on both the value of the statistic as well as the
> variability in the data. In addition, it relies on their ability to read
> and interpret graphics. The follow-up question then considers whether
> assumptions for a particular analysis are reasonable.
>
> I believe I stole the second pair of questions from Roger Woodard at
> NCSU when I was a graduate student? It has been my go-to for testing the
> difference between the distribution of a random sample and a sampling
> distribution. Presenting the population in terms of a boxplot also
> requires students to interpret graphics by comparing boxplots to
> histograms. While many students get one of the two questions right, we
> often find students wanting to give the same answer for both questions.
>
> Eric
>
> *Eric M. Reyes | **Assistant Professor*
> *Department of Mathematics*
> *ROSE-HULMAN INSTITUTE OF TECHNOLOGY*
>
> 5500 Wabash Ave | Terre Haute, IN 47803-3999
> Phone: 812.877.8287 | Fax: 812.877.8883
> www.rose-hulman.edu
--
Nathan Tintle, Ph.D.
Associate Professor of Statistics and Dept. Chair
Director for Research and Scholarship
Dordt College
Sioux Center, IA 51250
nathan.tintle(a)dordt.edu
Phone: (712) 722-6264
Office: SB1612
On Sat, Mar 21, 2015 at 2:02 PM, Allan Rossman <arossman(a)calpoly.edu> wrote:
Hello Simulation-Based Inference (SBI) group,
You might recall that I wrote to you on Groundhog Day last month, so I
thought I would check in again on this, the first full day of spring.
Now with the benefit of nearly seven weeks of hindsight, what do you
think of Punxsutawney Phil's prediction?
My colleagues and I thought this might be a good time of year to write
about favorite assessment/exam questions for introductory statistics.
We've just finished final exams for the Winter quarter at Cal Poly, and
those of you on a semester calendar will need to give final exams in
another 4-6 weeks or so. But the best reason for writing now is that
this provides me with a good excuse to procrastinate on grading my exams!
I am going to identify and comment on my all-time favorite assessment
question, but first I'll mention two "honorable mention" questions that
I also like. I'll count down my items from #3 to #1, along with some
commentary on each. Think of #3 and #2, which are quite short, as
opening acts for the main event, which is fairly long. Oh, and in case
you don't make it to the end of this message, let me now invite all of
you to respond to the SBI list with one of your own favorite assessment
items. Here we go ...
3. I ask students: Suppose that 60% of graduate students at Cal Poly
have an iPad and that 20% of undergraduate students at Cal Poly have an
iPad. Does it necessarily follow that 40% of all students at Cal Poly
have an iPad? Explain your answer.
I like this question because I think it gets at a basic skill of
quantitative literacy. I certainly don't intend that students cite the
Law of Total Probability in answering the question, and I would hope
that many students could answer this well even before they take my
class. But if they leave my class believing that the answer is yes,
then I feel that they're leaving without some basic quantitative
awareness. Almost all of my students realize that Cal Poly has far more
undergraduate students than graduate students, so the overall percentage
with an iPad would be much closer to 20% than 60% in my made-up
scenario. I don't expect students to say this part about the overall
percentage being closer to 20% than to 60%, but I'm glad when they do.
I'm pleased that very few of my students fall into the trap of answering
yes (but wait for my next question).
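The reasoning here is just a weighted average, which can be sketched in a few lines of Python. The enrollment figures below are hypothetical, chosen only to show the effect of unequal group sizes; they are not Cal Poly's actual numbers, and the function name is made up.

```python
def overall_proportion(grad_n, undergrad_n, grad_rate=0.60, undergrad_rate=0.20):
    """Overall proportion as an enrollment-weighted average of the two group rates."""
    total = grad_n + undergrad_n
    return (grad_n * grad_rate + undergrad_n * undergrad_rate) / total

# Hypothetical enrollments (not actual Cal Poly figures)
mostly_undergrads = overall_proportion(grad_n=1000, undergrad_n=19000)
equal_groups = overall_proportion(grad_n=10000, undergrad_n=10000)

print(f"1,000 grads and 19,000 undergrads: {mostly_undergrads:.1%}")
print(f"equal group sizes:                 {equal_groups:.1%}")
```

Only the equal-sizes case lands on 40%; with far more undergraduates the overall figure sits close to 20%.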
If you want to modify this question for your local situation, you could
replace undergrad/grad student with any binary variable that is likely
not to be equally represented in your population, perhaps male/female or
math/other major or dog/cat person. (Most well-educated people are cat
people, right? Just kidding!)
2. I ask students: Suppose that you take a random sample of 100 houses
currently for sale in California. Does the Central Limit Theorem (CLT)
suggest that a histogram of the house prices in the sample will display
an approximately normal distribution? Explain.
I suspect that you know how this one turns out. To my dismay, most
students answer yes, pointing out for their explanation that the sample
size is larger than 30. I even put "house prices" in bold in an attempt
to draw their attention to the variable that I'm asking about. Perhaps
I should add a note to point out that there's no mention of words such
as "mean" or "average" anywhere in the question.
So why do I like this question? I think it's more informative of what
students have learned (or not) about the CLT than asking them to perform
calculations and hoping that they'll remember to use sigma/sqrt(n)
rather than just sigma in the denominator of the z-score. If my
students leave the course thinking that the CLT says that sample data
for all variables display a normal distribution when the sample size is
large, then they've fundamentally not understood what the CLT or a
sampling distribution is all about. I suspect that many of my students
realize, or could reason out for themselves, that the distribution of
house prices is typically skewed to the right. But my question
deliberately puts the CLT in their minds, which leads their thinking
toward a common misconception about that result.
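A short simulation can make the distinction visible. This is only a sketch under an assumed population: house prices are drawn from a lognormal distribution whose parameters are made up to give a right-skewed shape. It compares the skewness of one sample of 100 prices with the skewness of many sample means, each from a sample of 100.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness: the average cubed z-score."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(2015)

def draw_prices(n):
    # Assumed right-skewed population of prices (lognormal, made-up parameters)
    return [rng.lognormvariate(13.0, 0.75) for _ in range(n)]

one_sample = draw_prices(100)
sample_means = [statistics.fmean(draw_prices(100)) for _ in range(2000)]

print(f"skewness of one sample of 100 prices:  {skewness(one_sample):.2f}")
print(f"skewness of 2000 sample means (n=100): {skewness(sample_means):.2f}")
```

The single sample should look about as skewed as the population it came from, while the collection of sample means should be much closer to symmetric. The CLT speaks to the second distribution, not the first.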
1. My all-time, no-doubt, bar-none favorite exam question is ... (drum
roll please) ... the investigative task (question #6) from the 2009 AP
Statistics exam. I want to say at the outset that I had nothing to do
with writing this question, so I take no credit whatsoever. Before you
read my comments about it, please go and read this question (remember to
scroll down to #6) at:
http://apcentral.collegeboard.com/apc/public/repository/ap09_frq_statistics.pdf
Now let me talk through the four parts of this question. I admit that
the context here is not very exciting, and I'll also say that part (a)
is pretty straightforward and perhaps even boring. Nevertheless, I
still think it's a good question because students do not find it easy to
identify the parameter in a given situation. And it's certainly
important to understand what the parameter is before trying to make
inferences about the value of that parameter. But part (a) does not make
this my all-time favorite question.
I think part (b) is a very nice question, because it asks students to
apply something they know to a situation they've probably never thought
of. Students should know a lot about means and medians, but I bet it's
never occurred to them to calculate the ratio of the mean to the
median. This question tests whether they realize that they can use
their knowledge that the mean is typically larger than the median with a
right-skewed distribution to conclude that this new ratio statistic will
be larger than one with a right-skewed distribution.
I grant that parts (a) and (b) do not yet qualify for my short list of
all-time favorite questions, but there's more ...
Part (c) epitomizes the logic of simulation-based inference. It
assesses whether students have a firm enough understanding to be able to
apply what they know to a situation they've never encountered: testing
whether sample data provide convincing evidence that a population
distribution is skewed to the right. Unfortunately, many students focus
on the symmetric shape of the distribution of simulated ratio
statistics, and others focus on this null distribution being centered
around 1. But students who truly understand how simulation-based
inference works know to look for where the observed sample value of the
ratio statistic falls in the distribution of simulated ratio statistics.
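The logic of part (c) can be sketched as follows. Everything specific here is an assumption for illustration: the sample values are made up (they are not the exam's data), and the null model is taken to be a normal population with the same mean and standard deviation as the sample. The point is the last step, which asks where the observed ratio falls among the simulated ratios.

```python
import random
import statistics

def mean_median_ratio(values):
    """The ratio statistic: sample mean divided by sample median."""
    return statistics.fmean(values) / statistics.median(values)

def simulate_null_ratios(n, mu, sigma, reps=1000, seed=6):
    """Ratios from samples of size n drawn from a normal (symmetric) population."""
    rng = random.Random(seed)
    return [
        mean_median_ratio([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

# Hypothetical sample, made up for illustration (not the AP exam's data)
sample = [21.0, 22.5, 23.0, 23.5, 24.0, 24.5, 25.0, 27.0, 31.0, 38.0]
observed = mean_median_ratio(sample)

null_ratios = simulate_null_ratios(
    n=len(sample),
    mu=statistics.fmean(sample),
    sigma=statistics.stdev(sample),
)

# Proportion of simulated ratios at least as large as the observed one
p_value = sum(r >= observed for r in null_ratios) / len(null_ratios)
print(f"observed ratio: {observed:.3f}   approximate p-value: {p_value:.3f}")
```

The simulated ratios pile up symmetrically around 1, which is exactly the feature that distracts many students. The evidence comes from how far out in the right tail the observed ratio sits.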
If this question stopped there, it might rank as my all-time favorite
question, but I'm not sure about that. And you may recall that I used
the phrases "no-doubt" and "bar-none" above. So, what gives? Well,
part (d) clinches the title like Secretariat racing down the stretch at
the Belmont Stakes.
This part of the question invites students to realize that they possess
the intellectual power to create their own statistic to measure
skewness. Half an hour previously, it might not have occurred to
students that using a statistic to measure skewness was even in the
realm of possibility. But now students are asked to devise their own
statistic to do just that. They're given just enough of a hint to give
them a foundation on which to build, as they are told to use components
of the five-number summary. Students produced many good statistics for
this question on the AP exam. You can find some sample student
responses and also scoring guidelines at:
http://apcentral.collegeboard.com/apc/members/exam/exam_information/8357.html
You can also find an investigation of the power of some of these skewness
statistics in Josh Tabor's JSE article:
http://www.amstat.org/publications/jse/v18n2/tabor.pdf
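For readers who want to try part (d) themselves, here is a sketch of two candidate statistics built from the five-number summary. These are illustrations of the kind of statistic a student might propose, not statistics taken from the scoring guidelines, and the function names and data sets are made up. Both compare the spread above the median with the spread below it, so values above 1 suggest a longer right tail.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); quartiles use the 'inclusive' method."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

def tail_ratio(values):
    """(max - median) / (median - min)."""
    lo, _, med, _, hi = five_number_summary(values)
    return (hi - med) / (med - lo)

def quartile_ratio(values):
    """(Q3 - median) / (median - Q1); less sensitive to a single extreme value."""
    _, q1, med, q3, _ = five_number_summary(values)
    return (q3 - med) / (med - q1)

right_skewed = [1, 2, 2, 3, 3, 4, 5, 7, 10, 18]  # made-up data
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]          # made-up data

for name, data in [("right-skewed", right_skewed), ("symmetric", symmetric)]:
    print(f"{name}: tail ratio {tail_ratio(data):.2f}, "
          f"quartile ratio {quartile_ratio(data):.2f}")
```

Either ratio is undefined when the median equals the minimum (or Q1), so a real implementation would need to handle that case. Other textbooks and software compute quartiles slightly differently, which can change the values a little.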
Thanks very much for reading all of this! I'm afraid that I must now
heed the beck and call of my own students' final exams that need
grading. Please do reply to this SBI list with one (or more) of your
own favorite assessment/exam questions. Rest assured that there's no
need for you to write as lengthy a message as I have, unless you also
have a desire to procrastinate from grading of your own.
-- Allan
--
Allan J. Rossman
Professor and Chair
Statistics Department
Cal Poly
San Luis Obispo, CA 93407
arossman(a)calpoly.edu
http://statweb.calpoly.edu/arossman/
> 1. My all-time, no-doubt, bar-none favorite exam question is ... (drum roll please) ... the investigative task (question #6) from the 2009 AP Statistics exam. I want to say at the outset that I had nothing to do with writing this question, so I take no credit whatsoever. Before you read my comments about it, please go and read this question (remember to scroll down to #6) at:
>
> http://apcentral.collegeboard.com/apc/public/repository/ap09_frq_statistics.pdf
Oooh, I DO love the question Allan referred to. Me, I’m in semester-land here so have just given a midterm mini-project. It will probably never be anyone’s all-time, bar-none, hands-down, suitably-hyphenated favorite, but it’s interesting, so I thought I’d share.
Our class has spent some time studying inhabitants of “The Island,” an ingenious place invented by Michael Bulmer and his colleagues in Australia. If you have never been there, you should visit! (details below)
In their travels and explorations, some students noticed that, among married couples, it seems that it’s "more likely that either both people smoke or neither smokes." Of course what they MEANT by that is not exactly what that statement says. In class discussion, it came out that they meant something more complex and harder to say, namely, some conditional-probability version of the statement, such as, “if you smoke, you’re more likely to be in a relationship with someone who smokes than you would be if you didn’t smoke.”
All this means that they have some challenges ahead. I told them that part of the point of the assignment was to use some technique for comparing two proportions. This means that they had to think about the situation and decide what two proportions to compare. In some online discussion in advance of the due date, lots of problems emerged, such as comparing the proportion of couples who both smoke to the proportion where neither smokes. Only about 20% of Islanders smoke overall, so showing that this difference is not equal to zero is not very interesting. More importantly, doing so says nothing about any association between choice of partner and smoking status.
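A quick calculation shows why the "both smoke" versus "neither smokes" comparison is uninformative. It assumes, for illustration, that each partner smokes with probability 0.20 and that partners' smoking statuses are independent of each other.

```python
p_smoke = 0.20  # approximate overall smoking rate mentioned above

# Assumption: the two partners' smoking statuses are independent
p_both = p_smoke * p_smoke
p_neither = (1 - p_smoke) * (1 - p_smoke)
p_exactly_one = 1 - p_both - p_neither

print(f"both smoke:     {p_both:.2f}")
print(f"neither smokes: {p_neither:.2f}")
print(f"exactly one:    {p_exactly_one:.2f}")
```

Even with no association at all, "neither smokes" is far more common than "both smoke", so a nonzero difference between those two proportions only reflects the low overall smoking rate.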
But among some students, the task may have helped some light bulbs go on about association, randomization, scrambling attribute values, and all those things we like about SBI. It was good for them to grapple with a problem where they had to think hard about the measure they were looking at in order to decide what exactly it measured; then construct and see its sampling distribution; and connect that to their observed values.
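One way to set up the comparison is sketched below. The couples are hypothetical counts invented for this example (not our class's Island data), and labeling one member of each couple as the "person" and the other as the "partner" is itself an assumption that students would need to justify. The statistic is the difference between two conditional proportions, and scrambling the partners' smoking statuses simulates the null of no association.

```python
import random

def diff_in_proportions(person, partner):
    """P(partner smokes | person smokes) - P(partner smokes | person does not)."""
    with_smoker = [q for p, q in zip(person, partner) if p]
    with_nonsmoker = [q for p, q in zip(person, partner) if not p]
    return (sum(with_smoker) / len(with_smoker)
            - sum(with_nonsmoker) / len(with_nonsmoker))

def scramble_test(person, partner, reps=5000, seed=3):
    """Shuffle partners' statuses to break any association, then compare."""
    rng = random.Random(seed)
    observed = diff_in_proportions(person, partner)
    shuffled = list(partner)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(shuffled)
        if diff_in_proportions(person, shuffled) >= observed:
            extreme += 1
    return observed, extreme / reps

# Hypothetical couples (True = smokes), NOT the class's Island data:
# 12 both smoke, 8 only the person, 8 only the partner, 72 neither
person = [True] * 20 + [False] * 80
partner = [True] * 12 + [False] * 8 + [True] * 8 + [False] * 72

observed, p_value = scramble_test(person, partner)
print(f"observed difference: {observed:.2f}   approximate p-value: {p_value:.4f}")
```

With these made-up counts the observed difference is far out in the tail of the scrambled differences, which is the pattern an association between partners' smoking statuses would produce.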
Working on the Island was interesting too. We put together a Google form so that each student in the class went out to a village or two, randomly sampled ten households, and recorded the smoking status (and some other information) about the inhabitants. It was a valuable change for the students NOT to have a big data set all clean and prepped for them, but rather to have to cope with collecting and entering the data themselves. Doing 10 each was not too hard, so thanks, Google, for making it possible for us to leverage the whole class’s efforts to get a decent-sized data set.
Cheers to all, happy grading…
Tim
The Island: You could contact Michael Bulmer in Australia (m dot bulmer at uq dot edu dot au) and ask, or you can contact me; I’ve set up what I HOPE would be a “class” for SBI members, but I need to submit your email addresses for you to get an invite link. There is probably a page to request admission, but I don’t know where it is! Email me off-list: eepsmedia at gmail dot com. Michael has written about his invention, for example, in this TISE article:
https://escholarship.org/uc/item/2q0740hv
Our Data: Or you might just want to look at the results of our data collection, and do some analysis yourself:
https://docs.google.com/spreadsheets/d/10BhVD7CoAMeSG8b-Hss93gAdsyXlhozf_7H…