Jeff said:
My answer is that it's not the only opinion I'm interested in. It's like leaving it up to Florida to elect a president.
And that's what I've been trying to say all along. What you're interested in is not what the poll measures.
This is a Florida poll and you're crying "BS" because it doesn't represent the entire US. The poll isn't wrong, your expectations are.
matt. said:
The best way to look at these things, given the myriad problems with statistical power and selection bias, is to think about which coasters are generally "better" than others. Those top 10 coasters are probably, roughly, pretty darn good. The next 10, still awesome, but not quite so much. And on and on. 10 is arbitrary, make it 12 or 15.2 if you'd like. If you really care about what is #3 this year vs. what is #4 you have completely missed the point.
Exactly! And in that aspect, I'm sure this poll represents a pretty damn accurate look at which roller coasters people tend to enjoy more than others.
I use that wording carefully because I don't believe it does measure which coasters are "better" based on some general opinion, it measures the general trends of which coasters enthusiasts who care enough to rank coasters and vote in this kind of poll tend to enjoy most.
rollergator said:
The short answer is you're not doing a statistical analysis because you didn't take a sample, you have the entire census data at your disposal. You run statistics on samples taken FROM a larger population.
So how large (or small) does the larger population have to be before I can take a sample and run stats and have them be valid? 10? 100? 1000? 10,000? Where is that line? This is the answer I'm trying to coax out of someone.
You take sample statistics because it's assumed we have neither the resources nor the time to measure everybody (or everything) in the population.
If you consider the population to be all enthusiasts who participated, then all of this is moot; you get what you get. However, you still run into problems because your entire population is self-selected, so you just have to be aware of that.
However, if your population is ALL enthusiasts then it depends on how comfortable you are with whatever degree of error.
For example, when I ran the numbers for Voyage vs. T-Express, I found that at a 95% confidence level there was nearly a 50-percentage-point potential swing in what percentage of enthusiasts preferred Voyage or T-Express. To me, that is frigging huge. You'd have to really bump up the number of people polled who rode both to get that 50% swing smaller, or make your confidence level something like 90% or lower.
Then again, this is just roller coasters. What you find acceptable and not acceptable is completely subjective as long as you understand the sampling method. When predicting the probability of Voyage being preferred over El Toro, I'd probably be ok with a relatively large room for error (but not nearly 50 percentage points at 95% confidence). When predicting the probability that a positive HIV test actually indicates that a patient has HIV, I'd be much more discerning.
I know this is a roundabout way of answering Gonch's question. You run a sample and create statistics as soon as you don't have the resources to do otherwise, but you have to be clear about what exactly you mean your "population" to be.
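The nearly 50-point swing mentioned above is easy to reproduce with a back-of-the-envelope calculation. Here's a minimal Python sketch using a simple Wald interval; the specific numbers (10 of 20 mutual riders preferring Voyage) are hypothetical, not from the actual poll data:

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Wald confidence interval for a proportion (z=1.96 gives ~95%)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical: suppose 10 of 20 mutual riders preferred Voyage.
lo, hi = proportion_ci(10, 20)
print(f"95% CI: {lo:.2f} to {hi:.2f}")  # 95% CI: 0.28 to 0.72
```

With only 20 mutual riders, the interval spans roughly 44 percentage points, so "somewhere between 28% and 72% prefer Voyage" is about all you can say.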
matt. said:
You run a sample and create statistics as soon as you don't have the resources to do otherwise, but you have to be clear about what exactly you mean your "population" to be.
I think this is where I'm hitting the wall with Jeff. We have different ideas of what this poll is meant to represent. Or even better, what it does represent.
Without the quoting and such, your population should be at least 10x the sample size, and the sample size really should be at least 30 before you can deem it "representative". Of course, you need to know more about the population and its characteristics before you can really tell whether survey respondents ("sample") are indeed representative of the larger population about which you wish to make some sort of claim...
Of course, I did meh in Stats as an undergrad, so even though I just did well in my Graduate class last semester, I am FAR from a statistics expert. On the other other hand, at least my educational experience was very recent... ;)
Ok, so what you're saying Gator further confirms my thought process, I think.
If we take any given rare coaster pair (meaning that a pretty low number of people have ridden both coasters) and sample 1/10 of them - we have pretty good insight into what people who've ridden those two coasters think of them. (assuming the sample size is large enough - and you're putting it at 30)
Correct?
Somewhat more complicated, involving the concept that at a sample of 30 you can be reasonably sure the sampling distribution of the mean is approximately normal (that old bell curve). From there, you can be reasonably sure that your sample mean/proportion is relatively in line with the population mean/proportion. Of course, with proportions/percentages, the rule to be followed is that n*P >= 10 (where P is your hypothesized proportion) and that n*(1-P) is also >= 10.
Since we're not testing, say, a claim that your car gets 25mpg and deciding if that claim is reasonable...but rather comparing thousands of "coaster X is better than coaster Y" records, the analysis would get rather involved. But yeah, you seem to have the general gist.
Larger samples give better certainty that the sample selected (provided it's <= 10% of the population) will provide statistics (measures of sample characteristics) that are more in line with population characteristics (which are almost never known, but inferred from repeated sampling).
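The rules of thumb scattered through the last few posts (sample of at least 30, sample at most 10% of the population, n*P and n*(1-P) both at least 10) can be bundled into a quick sanity check. A minimal sketch; the function names are mine, not from any statistics library:

```python
def normal_approx_ok(n, p):
    """Rule of thumb: the normal approximation for a proportion is
    reasonable when n*p >= 10 and n*(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def sample_ok(n, population):
    """Rules of thumb from the thread: n >= 30, and the sample is
    no more than 10% of the population."""
    return n >= 30 and n <= population / 10

print(normal_approx_ok(30, 0.5))  # True  (15 and 15, both >= 10)
print(normal_approx_ok(30, 0.2))  # False (6 < 10)
print(sample_ok(30, 500))         # True  (30 >= 30 and 30 <= 50)
```

Note that a 50/50 split is the friendliest case; the more lopsided the hypothesized proportion, the bigger the sample you need before the approximation holds.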
You still have Zoidberg.... You ALL have Zoidberg! (V) (;,,;) (V)
After going through the last two new pages, I've figured out what the main question is:
Does Jeff like avocados? Or more accurately, what percentage of Jeff likes avocados? Of course, we'll have to figure out which other vegetables and/or fruit the various percentages of Jeff have eaten, and then give them a weighted bias accordingly.
This has nothing to do with the more urgent pancakes/bacon frame debate. ;)
My author website: mgrantroberts.com
Look at the detailed spreadsheet and compare the following 3 coasters: Prowler #8, Tremors #16, and Evel Knievel #17.
Head to head, Prowler and Tremors were actually tied among mutual riders, 6-6. Prowler "defeated" EK 35-18, and Tremors "defeated" EK 7-6. The reason Prowler ends up ranked higher than Tremors is that more people preferred Prowler to the coasters ranked 9-15, including Phoenix, Thunderhead and Hades. Fewer voters preferred Tremors and EK to Phoenix, Thunderhead and Hades in their individual matchups.
Basically, what this poll tells us is that when you take the people who rode Prowler and compare it to every other coaster they rode, they preferred Prowler over 160 other coasters, liked 5 other coasters better, and tied it with 3 others. In comparison, Tremors' record was 161-13-1, and EK's was 155-15-3. Mitch takes those records, assigns a winning percentage to each coaster, and ranks those percentages.
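That record-to-percentage step can be sketched in a few lines of Python. The records are the ones quoted above; since the exact tie-handling formula isn't spelled out here, counting a tie as half a win is my assumption, not necessarily Mitch's:

```python
# (wins, losses, ties) records from the post above.
records = {
    "Prowler":      (160, 5, 3),
    "Tremors":      (161, 13, 1),
    "Evel Knievel": (155, 15, 3),
}

def win_pct(wins, losses, ties):
    """Winning percentage, with a tie counted as half a win (assumption)."""
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games

ranked = sorted(records, key=lambda c: win_pct(*records[c]), reverse=True)
for coaster in ranked:
    print(coaster, round(win_pct(*records[coaster]), 3))
# Prowler 0.961, Tremors 0.923, Evel Knievel 0.905
```

Even under that assumed tie rule, the percentages come out in the same order the poll produced: Prowler ahead of Tremors ahead of EK, despite Prowler and Tremors being tied head to head.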
So if we can't compare T-Express because it only had 20 riders, what about comparisons between Prowler and Tremors that had only 12 common riders? Both of those coasters had less than 100 riders in the poll. So do we just assume they must suck because fewer people went to Missouri and Idaho as opposed to Pennsylvania and Ohio? How else could you compare those coasters, and any others that had so few common riders? If you start cherry picking what to include, then you don't have much of a poll at all.
The poll does its best to quantify what is really a very subjective topic: how individual people feel about individual coasters. I think we'd have a unanimous BCS champion before someone could come up with a poll that would unquestionably determine what the best wooden or steel coasters are.
ApolloAndy said:
Haven't read all the comments yet, but 2 points:
1) Absolutely the people who have ridden more coasters should count for more. If a person has ridden 500 coasters in all corners of the world, their opinion on what's good is much more important than someone who has only ridden at Cedar Point.
It depends upon what the poll is for. And since the only people who care about it are enthusiasts, I would tend to agree with you. A few summers ago I went to SFGADV with some of my Massachusetts friends. They were anything but enthusiasts. Most of them had only been to IOA and SFNE in the past. Afterwards, 4 of the 5 said that Kingda Ka was their favorite steel coaster after just one front-seat ride! So if they were filling out this poll, they would be ranking SROS/Bizarro below Kingda Ka on their ballots. That obviously doesn't represent the enthusiast population very well, but it would help represent your casual amusement park goer.
1.SV 2.El Toro 3.MF 4.I-305 5.Kumba
6.STR@SFNE 7.Voyage 8.X2 9.Storm Chaser 10. Wicked Cyclone
Lord Gonchar said:
And that's what I've been trying to say all along. What you're interested in is not what the poll measures.
What is that even supposed to mean? My point is that the poll doesn't measure anything. And...
This is a Florida poll and you're crying "BS" because it doesn't represent the entire US. The poll isn't wrong, your expectations are.
Not at all. The "B.S." comes from the representation of being "sophisticated." His words, not mine. It's little more than a clever comparison, but it doesn't measure anything of value. For example...
RatherGoodBear said:
So if we can't compare T-Express because it only had 20 riders, what about comparisons between Prowler and Tremors that had only 12 common riders? Both of those coasters had less than 100 riders in the poll. So do we just assume they must suck because fewer people went to Missouri and Idaho as opposed to Pennsylvania and Ohio? How else could you compare those coasters, and any others that had so few common riders? If you start cherry picking what to include, then you don't have much of a poll at all.
That's a good example that illustrates my point about how worthless the measurement is beyond being anything more than something of entertainment value. So for Hawker to put down the AT poll is pretty ridiculous. His is no better except that he's not scamming a magazine full of masturbatory advertisements.
Jeff - Editor - CoasterBuzz.com - My Blog
All seems like perfume on a pig to me. You can try to make the process seem more objective but in the end you are trying to measure something that is totally subjective. Different folks (even different enthusiasts) will have different views on what they like/don't like in a coaster. So what do the results tell you? Just like arguing who is the best team, player, actor, etc. and limiting it to folks who watched any given number of teams play (though not against each other), or players play or actors act, etc. Provides some entertainment and maybe sells some advertising. Nothing else.
I'd argue that it IS a better measurement...given the rather large supposition that completely different experiences can be rated at all. Is pistachio ice cream really better than a nice thick slice of well-cured bacon? Well, that's pretty subjective, isn't it?
My point, though, is that a ride shouldn't rank higher simply because a larger number of voters considers it "top ten," but because a higher *percentage* of its riders considers it better. "Better"...yeah, I'm rolling my eyes too...our tastes vary from person to person. Shoot, even the ride itself varies from seat to seat, day to day, day-to-night, train to train, etc.
But under any circumstances, a voter may not have ridden Prowler or Tremors, but does have T-Express and Boulder Dash, and rates B-Dash higher. Under Mitch's algorithm, that counts as one rider-ride comparison, and the methodology means you maximize the information gleaned by making that very same comparison against every other ride. Instead of simply asking me for my personal top ten (which for me means only those rides I've personally experienced), I get to have a say on where every ride belongs in the rankings. So my vote for, say, Roar West at near 110 in my ranking of my wooden coasters helps drag the ride down further in Mitch's poll than simply omitting it from my list of top ten rides, as I would in virtually every other ranking system I've seen. It also means my ranking of Kentucky Rumbler at around 15th or so makes it more likely to make Mitch's top ten than Roar West...whereas with a "regular" ranking system you get no information on Roar West or Kentucky Rumbler, and all you could say is "they're not in the top ten." Everyone gets an enhanced "say" on every ride, and those with more rides in their track record hold more sway merely by virtue of being able to make more comparisons.
What makes a ride "good" varies from person to person SO wildly that it makes ranking kind of silly in the first place...whether you have 3 riders or 100, that still holds true. But if you're willing to accept that it's just a ranking system and that we're here discussing it when an infinite number of more important things are going on in the world around us....then yes, I'll argue for Mitch's method over all those polls that claim Hulk or Beast is number one because it's had the most riders (most voters in those polls have little to compare to).
I think there are a couple of things going on that are getting mixed up:
1) Polling methodology (how you ask the people you ask - in this case clearly not representative of the worldwide GP, but maybe pretty close to the US enthusiast community. This has its own problems when it comes to foreign coasters in terms of objective comparison, as mentioned above, but presumably reflects what a US enthusiast would experience on a foreign ride.)
2) Sample size (clearly not enough for many of the conclusions drawn, not the least of which are the coasters with #'s)
3) Evaluation of data methodology (to me this is significantly better than any other poll out there. The problem is that because of the huge requirement of data for the evaluation methodology, sample size suffers greatly)
Hobbes: "What's the point of attaching a number to everything you do?"
Calvin: "If your numbers go up, it means you're having more fun."
I wrote:
RatherGoodBear brings up political polls of likely voters; the problem with that comparison is that in a political poll of likely voters, all of the participants are presented with the same options, and give their opinions on the same options as the hundreds of thousands, if not millions, of likely voters the sample is intended to represent. In conducting a coaster opinion poll, there is an added complication: the participants do not all have the same experiences...
Jeff replied:
Ah, but that's exactly like an electoral poll. Reputable pollsters don't just ask who you voted for, they get a complete picture of who you are in terms of race, income, gender and other well-defined demographics. They then adjust the results based on statistically significant samples to match the aggregate data of the voting public. Based on whatever shortcomings they have in their data (depending on their methodology), they then indicate what the margin of error for the poll is. They further ask for data that they theorize may affect the reasoning for their vote. For example, if they think a candidate has an edge because of his or her stance on gun control, they may ask if the voter owns guns, has been a victim of crime with a gun, etc. Thus, wildly different experiences are a significant part of the polling process.
True, but I wasn't talking about normalizing the demographics, which is what the political pollsters try to do. In a typical political poll, respondents are asked to pick one from a limited slate. In the coaster poll, respondents are being asked to rank in order of preference all coasters they have ridden from a slate of nearly 200 rides. On average, each respondent has ridden only about 30 rides, and so only ranks 30 of the 200. But every respondent has a different selection of rides to rank. The whole idea of the head to head comparison is to interpolate those missing results. It's an interesting trick--
Let's say I have ridden rides A, B and C. You have ridden rides A, B and D. For simplicity, I have ranked the rides in order, A, B, C. You, meanwhile, rank your rides A, D, B. What that means is that even though you haven't ridden C, and I haven't ridden D, if we assume that there is some consistency to rankings, we can rank the four rides A, D, B, C. Because you liked A better than D, and D better than B; I didn't ride D, but since we agree on the relative ranking of A and B, then logically D would be somewhere between them.
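That A/B/C/D example can be sketched as a tiny transitivity exercise. To be clear, this is not Mitch's actual algorithm (which works from head-to-head win percentages); it's just a toy illustration of merging two partial rankings when opinions are assumed consistent:

```python
from itertools import combinations
from graphlib import TopologicalSorter  # Python 3.9+

# Two voters' partial rankings, best ride first, as in the example:
# I rode A, B, C; you rode A, B, D.
ballots = [["A", "B", "C"], ["A", "D", "B"]]

# Each ballot contributes "X beats Y" for every pair it ranks.
beaten_by = {}
for ballot in ballots:
    for better, worse in combinations(ballot, 2):
        beaten_by.setdefault(worse, set()).add(better)

# Topological sort: every ride appears after all rides preferred to it.
# This assumes opinions are consistent; a preference cycle (A > B on one
# ballot, B > A on another) would raise graphlib.CycleError.
order = list(TopologicalSorter(beaten_by).static_order())
print(order)  # ['A', 'D', 'B', 'C']
```

Real ballots contradict each other constantly, of course, which is exactly why a strict merge like this breaks down and why the poll falls back to aggregate win percentages instead.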
That makes some pretty serious assumptions about consistency of opinions, which means that the formula is trying to hammer out some kind of numerical consensus about what makes a good ride, a consensus that may be shown numerically, but that may not really exist. Which means that trying to use overlapping experiences to generate a comprehensive ranking is kind of like trying to fit parts with a sledge hammer, but it does at least yield a result.
--Dave Althoff, Jr.
/X\ _ *** Respect rides. They do not respect you. ***
/XXX\ /X\ /X\_ _ /X\__ _ _ _____
/XXXXX\ /XXX\ /XXXX\_ /X\ /XXXXX\ /X\ /X\ /XXXXX
_/XXXXXXX\__/XXXXX\/XXXXXXXX\_/XXX\_/XXXXXXX\__/XXX\_/XXX\_/\_/XXXXXX
Jeff said:
That's a good example that illustrates my point about how worthless the measurement is beyond being anything more than something of entertainment value.
I guess I fail to see how a roller coaster poll of any kind could be anything else.
I think what Jeff is saying is that it's not an accurate reflection of the opinions of the community (or it wouldn't be if we all managed to ride all the rides somehow).
In general, though, I've planned trips based on the polls and I've not been disappointed (though how Thunderhead stayed so high so long, I don't know...but I also don't like GCI's or twisters in general).
Hobbes: "What's the point of attaching a number to everything you do?"
Calvin: "If your numbers go up, it means you're having more fun."
I think it was fall '08...I got one front-seat ride on Thunderhead that was beyond insane. I mean, it was on the level of Raven after a downpour. Fast, smooth, monster airtime...
The very next day, I sat in the same seat under similar conditions (temperature, humidity, etc) and the ride stunk.
I learned later I wasn't sitting in the same seat...they had switched trains overnight. I never did get to recreate that one ride...
--Dave Althoff, Jr.
/X\ _ *** Respect rides. They do not respect you. ***
/XXX\ /X\ /X\_ _ /X\__ _ _ _____
/XXXXX\ /XXX\ /XXXX\_ /X\ /XXXXX\ /X\ /X\ /XXXXX
_/XXXXXXX\__/XXXXX\/XXXXXXXX\_/XXX\_/XXXXXXX\__/XXX\_/XXX\_/\_/XXXXXX