Mitch's best woodie poll

a_hoffman50's avatar

I think the difference is in the way you look at the poll. If you look at it as a poll made for the masses, Jeff is right. If you look at it as a poll for coaster enthusiasts by coaster enthusiasts, Gonch is right.

I tend to lean towards the latter scenario because in all honesty, who cares about these polls, 'cept for us rollie coaster enthusiasts?

And even some of us, not so much. ;)


My author website: mgrantroberts.com

Lord Gonchar's avatar

Jeff said:
The population isn't 20 people. It's thousands. You're missing the point... 20 people who went on some trip together to ride some obscure coaster don't make a significant enough sample to reach any conclusion.

If more people don't ride the coaster, the pool of opinions is too shallow to draw any conclusions.

So asking those 20 people what they think is too small of a sample to conclude what those 20 people think?

That's what I'm getting at (and what I believe the poll does) - the samples don't speak for the masses and I don't believe they intend to.

We're not after what the thousands think, we want to know what these 20 think.

That they compare one ride to another doesn't matter, it's still 20 opinions, period.

Absolutely it does, because those 20 samples do become statistically insignificant in the bigger scheme of the hundreds of samples that he's comparing them to...and you say as much yourself:

Like the cancer analogy, the 20 people cured are too small of a sample to account for a thousand different variables that better describe the population at large.

Compared to the population at large, those results are meaningless. Of and by themselves, they're very relevant if you're interested in the only 20 people to have this specific rare type of cancer.

If you're interested in the big picture, then those 20 mean dick. If you're interested in those 20 in particular, then sampling those 20 is as good as it gets. I apply that same logic to the poll.

For example how many people in the world do you think have ridden T Express in South Korea and Troy in the Netherlands? It can't be many at all...and it's certainly only going to be a certain type of enthusiast that travels to ride.

If we want an informed comparison based on experience with both of those rides, we have a very small population to even sample from in the first place. Mitch's sample most likely represents a decent portion of the population we're interested in (those who've ridden both T Express and Troy).

In fact, I'd argue that Mitch's smaller samples for those rarer populations are more valid than his largest ones for much more common populations.

Sticking to the T Express/Troy population, Mitch has 11 samples. How many people in the world have ridden both of those coasters? It can't be many. Dozens? A hundred or two at most if I'm being super-unrealistic-generous?

But what about a much more common population? How about the Voyage/Raven population? Mitch has 248 samples. How many people in the world have ridden Voyage and Raven? Only everybody that visits HW and rides roller coasters. Tens of thousands? Hundreds of thousands?

I guess what I'm saying is that I don't see how you can use sample size alone to determine validity without considering the population size.
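For anyone who wants to put rough numbers on that, textbook sampling theory has a finite population correction that shrinks the margin of error as the sample approaches the whole population. The sketch below is only an illustration: the population figures (150 and 100,000) are guesses in the spirit of the estimates above, the 50/50 split is a placeholder, the function name is made up, and the correction assumes a simple random sample, which a self-selected poll is not.

```python
import math

def margin_of_error(p, n, population=None, z=1.96):
    """Margin of error for a sample proportion at roughly 95% confidence.

    If `population` is given, apply the finite population correction,
    which assumes simple random sampling without replacement.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # The correction approaches 0 as n approaches the population size.
        se *= math.sqrt((population - n) / (population - 1))
    return z * se

# Hypothetical populations; only the sample sizes (11 and 248) come from the thread.
print(margin_of_error(0.5, 11))                      # no correction
print(margin_of_error(0.5, 11, population=150))      # small guessed population
print(margin_of_error(0.5, 248, population=100_000)) # large guessed population
```

With a large population the correction barely changes anything, which is why it is usually ignored; it only starts to matter when the sample is a sizable share of everyone who could have been asked.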

Given the sample size and what I estimate the population to be in this case, I'm ok with saying that more people who've ridden both T Express and Troy prefer T Express based on the info.

Where I think it falls apart is saying T Express is the third best coaster in the world based on the opinion of 20 people in a poll of 730 meant to represent thousands or more...in other words once he starts listing and ranking them it falls apart.

---

Is T Express the 3rd best coaster on Earth? Much too little info to even begin to determine that.

Is T Express generally considered a better coaster than Troy by people who've ridden both? Absolutely.

That's what Mitch's poll does...and generally does pretty well much of the time.


rollergator's avatar

"Most" wooden coasters will fall not only because better rides are built after them, but also because they have a tendency to deteriorate over time. Phoenix, just as an example, almost defies both of those rules by virtue of its exceptional design combined with exceptional care.

Oh, and just because....statistical power is increased not only through a larger sample size but also by simply having a larger difference between the true value and the hypothesized value (in this case, the ranking of the "other" coaster). So if coaster A is ranked 1st and B is ranked 35th, that says more, statistically, than A at 1st and B at 3rd. "Statistically significant" depends on the level of confidence chosen for the comparison - you need better evidence to make stronger claims.
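A quick sketch of that point, for the curious. It recasts the ranking gap as a head-to-head preference share, which is an assumption made for illustration, and uses the standard normal approximation for a two-sided test of "is the share different from 50%?". The function names and the example shares (55% and 90%) are invented, not taken from the poll.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def approx_power(true_p, n, null_p=0.5, z_crit=1.96):
    """Approximate power of a two-sided z-test that a share differs from null_p.

    true_p must be strictly between 0 and 1.
    """
    se_null = math.sqrt(null_p * (1 - null_p) / n)
    se_true = math.sqrt(true_p * (1 - true_p) / n)
    # Probability the sample share lands outside the acceptance region.
    upper = (null_p + z_crit * se_null - true_p) / se_true
    lower = (null_p - z_crit * se_null - true_p) / se_true
    return (1 - normal_cdf(upper)) + normal_cdf(lower)

# Same sample size, different true gaps.
print(approx_power(0.55, 20))  # small gap
print(approx_power(0.90, 20))  # large gap
# Same small gap, larger sample.
print(approx_power(0.55, 500))
```

Both a bigger gap and a bigger sample push the power up, which is the trade-off described above.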


You still have Zoidberg.... You ALL have Zoidberg! (V) (;,,;) (V)

Jeff's avatar

Lord Gonchar said:
I guess what I'm saying is that I don't see how you can use sample size alone to determine validity without considering the population size.

Because that's what statistics are. I didn't make the rules. I am considering the population size... there are how many thousands of enthusiasts on the planet? That not enough have been on the ride is a different problem.


Jeff - Editor - CoasterBuzz.com - My Blog

Jeff, by your criteria, none of the extant coaster rankings are statistically significant. Perhaps you are asking too much. Particularly since there are other limitations that ought to be resolved before we should even worry about statistical significance. A limitation for all existing rankings is that they are (mostly) self-selected samples. Other potential criticisms abound.

In the case of Mitch's poll (for the purpose of ranking the "best" coasters), I have at least the following criticisms.
1.) Not all ballots have equal weight. Ballots with a large number of rides have a disproportionate effect on the outcome.
2.) Mitch's poll violates the property of "Independence of Irrelevant Alternatives"; otherwise, why does he have to run the algorithm both with and without favorite steel (wood)? It ought not to matter.
3.) Mitch's "total rider" threshold for including a coaster in the overall ranking is too low (in this case, I agree with Jeff, but I do not clamor for statistical significance).
4.) Mitch's threshold for "valid" pairwise comparisons (3 mutual riders) is too low (this exacerbates criticism 1).

Last edited by Paul Miner,
Jeff's avatar

As I said over and over, the concept of statistical significance is not "my" criteria.

I've been toying around with the idea of a poll, going after a different way of measuring opinion (i.e., one so simple a caveman could do it), that would take into consideration both popularity and experience, through some kind of magical formula. I for one don't have an issue with popularity. I mean, even in the case of Cedar Point, I'd guess that the bulk of even casual enthusiasts have been on both Magnum and Mean Streak, making them both popular, but for different reasons.

My problem of course is that I'm not going to have a very large sample either. I mean, assuming I could get every single club member to participate, I'd still only have a few hundred people rating rides.


Jeff - Editor - CoasterBuzz.com - My Blog

Lord Gonchar's avatar

Still seems to me that everyone has a problem with the way Mitch interprets and presents the data - the ranking list. The data behind it (the mutual rider spreadsheet) is full of interesting and valid comparisons of head-to-head coaster pairs...even at low sample numbers.

Jeff said:
My problem of course is that I'm not going to have a very large sample either. I mean, assuming I could get every single club member to participate, I'd still only have a few hundred people rating rides.

I still think you're trying to do something that can't realistically be done and probably wouldn't be too far off from what your little subset would come up with anyway - gauge the opinion of everyone who rides coasters.

What you would end up with though is a very statistically valid picture of what CoasterBuzz Club members thought of the roller coasters of the world.

Why does it have to be anything more than that? Why does it have to represent some greater (arguably unachievable) segment. Why can't it just be a look into what the people who use this website think? Why does it have to try to represent what everybody thinks?

That's what I'm not getting.

Why can't Mitch's poll be little comparisons of what people who've ridden any two given rides thought of those two rides? Why does it have to represent thousands and thousands of roller coaster enthusiasts?

As I said over and over, the concept of statistical significance is not "my" criteria.

No, but what you're trying to quantify under the terms of 'statistical significance' is.

If you gather data for 'A' you can't call it invalid because it doesn't measure 'B' - sure it's not significant for 'B', it wasn't supposed to be.

I don't think being statistically significant is asking too much either, but I think that taking data for a specific purpose and then complaining that it doesn't meet the criteria for being statistically significant for a much broader purpose is asking too much. You're expecting the collected data to do something it isn't intended to do.

If only a couple of dozen people have ever ridden any particular pair of coasters and the poll surveys most of those people, you can't call the data insignificant because it doesn't consider thousands of other people who've never ridden those two coasters, but have ridden plenty of others (or other specific pairs, as the case may be) because they don't matter as far as this particular pair is concerned.

I get that 20 out of thousands is insignificant, but that's not what's being done here in most cases...until we get to trying to create some arbitrary ranking list using all of those little individual pairs as one larger data set, at least.


Jeff's avatar

No, it's not my criteria. Am I not explaining it right? Statistical significance, selection bias, sample size, etc., are all important when measuring anything, especially something as arbitrary and subjective as opinions. These aren't fields of study I invented. I used to be married to a scientist, you know, and anything you publish is subject to scrutiny as to how it stands up to accepted data gathering processes and basic statistics. While a novel approach, this survey does not stand up to those standards.

Lord Gonchar said:
If only a couple of dozen people have ever ridden any particular pair of coasters and the poll surveys most of those people, you can't call the data insignificant because it doesn't consider thousands of other people who've never ridden those two coasters, but have ridden plenty of others (or other specific pairs, as the case may be) because they don't matter as far as this particular pair is concerned.

Well sure you can call it insignificant. That's how insignificant is defined.

If you can't take a hundred random enthusiasts, put them all on the same coaster and get their feedback, then you can't convince me that any of those three rides in Asia is "better" than any other ride in the survey. I forget who it was last year (probably Brian Noble, since he's the only "real" academic here), but someone mentioned that the sheer excitement of going to China or whatever to ride a coaster with similarly minded people is selection bias in its worst form.

My continuing feeling is that replacing popularity bias with selection bias (colored with a tiny sample size) makes this particular survey that much less representative of anything in the real world. The poll suggests that it uses more sophisticated methods, but they're only more complex, not sophisticated. Anyone who studies and analyzes polling data would likely throw this out completely.

If you want to call this an entertaining annual distraction, I'm OK with that. But this just ain't sophisticated.


Jeff - Editor - CoasterBuzz.com - My Blog

Lord Gonchar's avatar

Jeff said:
Am I not explaining it right?

I'm starting to think I'm not. :)

What if my goal were to find out what people named Jeff Putz who also run roller coaster websites thought of avocados?

After I asked you, would my results be statistically insignificant because my sample size was only one (too small) or would they be entirely accurate because my sample rate was 100% (the entire population of people named Jeff Putz that also run roller coaster websites)?


Jeff's avatar

Your comparison isn't valid. The purpose of a poll is to measure an average or gain some consensus. I can't think of many things less important than what I alone think.


Jeff - Editor - CoasterBuzz.com - My Blog

I really don't see anything wrong with the methodology used in Mitch's poll. I prefer the head to head comparison between any pair of coasters to any kind of popularity of individual coasters. Someone mentioned that those who rode more coasters skewed the results. How? I'd say the opposite. Why should someone who's never been more than 5 miles away from Lake Erie decide the best coaster?

Every year, we're subjected to endless polls for every election telling us which candidate is ahead. Just about every one of them is based on a sampling of 500 or so people contacted by telephone. That's 200 fewer than participated in this poll. Yet the networks, news agencies and pollsters have no problem telling us this is an accurate reflection of how the electorate feels. Even though that electorate may contain tens of thousands or tens of millions of voters.

Jeff, so what if this is an entertaining distraction? Do we really need an ultra-sophisticated poll to determine some indisputable ranking?

Jeff's avatar

I wasn't arguing against that, I was saying that's what it is. I'm debating the claims of the poll itself.

If it makes you feel better, I think the Amusement Today poll is also terrible, and a scam to attract advertising on top of that.


Jeff - Editor - CoasterBuzz.com - My Blog

Lord Gonchar's avatar

Jeff said:
Your comparison isn't valid. The purpose of a poll is to measure an average or gain some consensus. I can't think of many things less important than what I alone think.

I can. Ranking roller coasters. ;)

But, seriously, I was oversimplifying on purpose.

I still feel like I mustn't be wording what I'm trying to say correctly.

Why is it not valid to measure an average or gain a consensus on a small group if that group's opinion is the only opinion you're interested in?

(stop reading here if you have an answer and don't want to suffer through another lame example attempt)

---

You're essentially trying to say that if I'm interested in what my son's 2nd grade class thinks of Spongebob that asking them and tallying the results is not valid because there's only 22 of them and that's not representative of the entire school...or the entire school district...or all the schools in SW Ohio.

But the point is I don't care about the entire school or the entire district or all the schools in my corner of the state. I just care about what my son's class thinks. That's all I intend to measure and that's all I'm interested in. Which makes my polling of his class entirely valid for my purposes - finding out Room 25's general opinion of Spongebob. How is that not valid?

Last edited by Lord Gonchar,

I think the biggest issue is that when you examine the results of any kind of survey, it is important to understand *what is being measured*.

RatherGoodBear brings up political polls of likely voters; the problem with that comparison is that in a political poll of likely voters, all of the participants in the poll are presented with the same options, and give their opinions on the same options as hundreds of thousands if not millions of likely voters that sample is intended to represent.

In conducting a coaster opinion poll, there is an added complication that all of the participants do not have the same experiences. Every participant has a different set of coasters ridden, and a simple popularity poll, just for reasons of distribution, will offer a bias which has nothing to do with the quality of the ride. I mean, if you could get every American who has ever ridden a roller coaster to vote for the best coaster, I'd be shocked if Space Mountain wasn't the runaway favorite, just because of the sheer number of people who have ridden the darned thing.

That's the bias that Mitch tries to address. Consider the Top 10 Wood Coasters in the United States as presented in both Mitch's Poll and in the Golden Ticket poll--


       GOLDEN TICKET AWARDS    INTERNET COASTER POLL
#01:   The Voyage              The Voyage
#02:   Boulder Dash            El Toro
#03:   El Toro                 Boulder Dash
#04:   Phoenix                 Ravine Flyer II
#05:   Thunderhead             Prowler
#06:   Ravine Flyer II         Phoenix
#07:   The Beast               Thunderhead
#08:   Prowler                 Hades
#09:   Hades                   Tremors
#10:   Shivering Timbers       Evel Knievel

#16:                           Shivering Timbers
#36:                           The Beast

If you break it down that way, the sample size bias in the Internet poll is reduced a bit, and you can see that within the top 10, there really isn't a whole lot of variation. We see that riders who rode both El Toro and Boulder Dash apparently prefer El Toro. Ravine Flyer and Prowler are preferred to the Phoenix, and Tremors and Evel Knievel both scored better than either Shivering Timbers or The Beast.

The key here is that the Golden Ticket poll attempts to identify the most popular wood coaster. The Internet coaster poll attempts to objectively rank coasters based on the combined opinions of all the poll participants, interpolating exact head-to-head comparisons between rides. Nowhere is that outcome more obvious than in the ranking for The Beast. The Beast is probably the second most famous wood roller coaster in the world, let alone in the US, ridden by millions, and well regarded in memory by people who haven't been on it in years. But in the smaller population of people who have ridden The Beast *and* any or all of the top 35 coasters in the Internet poll, most of those riders actually prefer the other 35 coasters to The Beast.

That is what is being measured in the Internet poll. It introduces a different kind of bias into the results, but the advantage is that you will probably find that the results of the Internet poll tend to more accurately reflect the opinion of any randomly selected participant in that poll, even though they don't reflect the outcome of a straight popularity contest. And that is what the poll is trying to accomplish. It is attempting to arrange the rides into something that represents a consensus order of preference, rather than an ordering by popularity. The key here, then, is that the absolute ranking of any ride in the Internet poll is of far less importance than the relative ranking of any two rides in the Internet poll. Because the poll is all about relative ranking, and the absolute position is derived from the relative ranking. In a straight popularity contest, the results of the poll directly determine the absolute ranking.
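To make the relative-ranking idea concrete, here is a toy sketch of a head-to-head tally that only counts voters who have ridden both coasters in a pair. This is not Mitch's actual algorithm; the ballots are invented, the function name is hypothetical, and the cutoff of 3 mutual riders is borrowed from the threshold mentioned earlier in the thread.

```python
from itertools import combinations

def head_to_head(ballots, min_mutual=3):
    """Tally pairwise preferences using only voters who ranked both rides.

    Each ballot is a list of rides the voter has ridden, ordered best first,
    with no ride listed twice.
    """
    tallies = {}
    for ballot in ballots:
        # combinations() keeps ballot order, so `better` precedes `worse`.
        for better, worse in combinations(ballot, 2):
            key = tuple(sorted((better, worse)))
            wins = tallies.setdefault(key, {key[0]: 0, key[1]: 0})
            wins[better] += 1
    # Drop pairs with too few mutual riders to compare.
    return {pair: wins for pair, wins in tallies.items()
            if sum(wins.values()) >= min_mutual}

# Invented ballots, for illustration only.
ballots = [
    ["El Toro", "Boulder Dash", "The Beast"],
    ["Boulder Dash", "El Toro"],
    ["El Toro", "The Beast"],
    ["El Toro", "Boulder Dash"],
]
print(head_to_head(ballots))
```

Notice that a voter who has not ridden one of the two coasters contributes nothing to that pair, no matter how many other rides are on the ballot.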

I hope that makes some kind of sense. I'm supposed to be sleeping now. :)

It is worth noting, though, that the same small sample bias that makes good, obscure coasters rise to the top of the Internet coaster poll is also what causes the CDC and CPSC to estimate that garden hoses injure more people annually than amusement rides, so perhaps it is reasonable to apply the same kind of skepticism to the poll numbers...

--Dave Althoff, Jr.


    /X\        _      *** Respect rides. They do not respect you. ***
/XXX\ /X\ /X\_ _ /X\__ _ _ _____
/XXXXX\ /XXX\ /XXXX\_ /X\ /XXXXX\ /X\ /X\ /XXXXX
_/XXXXXXX\__/XXXXX\/XXXXXXXX\_/XXX\_/XXXXXXX\__/XXX\_/XXX\_/\_/XXXXXX

Jeff's avatar

Lord Gonchar said:
Why is it not valid to measure an average or gain a consensus on a small group if that group's opinion is the only opinion you're interested in?

That's the right question, and I did not read on. :) My answer is that it's not the only opinion I'm interested in. It's like leaving it up to Florida to elect a president.

RideMan said:
RatherGoodBear brings up political polls of likely voters; the problem with that comparison is that in a political poll of likely voters, all of the participants in the poll are presented with the same options, and give their opinions on the same options as hundreds of thousands if not millions of likely voters that sample is intended to represent.

In conducting a coaster opinion poll, there is an added complication that all of the participants do not have the same experiences...

Ah, but that's exactly like an electoral poll. Reputable pollsters don't just ask who you voted for, they get a complete picture of who you are in terms of race, income, gender and other well-defined demographics. They then adjust the results based on statistically significant samples to match the aggregate data of the voting public. Based on whatever shortcomings they have in their data (depending on their methodology) they then indicate what the margin of error for the poll is. They further ask for data that they theorize may affect the reasoning for their vote. For example, if they think a candidate has an edge because of his or her stance on gun control, they may ask if the voter owns guns, has been a victim of crime with a gun, etc. Thus, wildly different experiences are a significant part of the polling process.

This coaster poll has none of that context, but adjusts, arbitrarily I feel, based on experience, weighting the opinions of those who have been on more rides. I say that's arbitrary because who's to say my opinion about Millennium Force is any less valid just because I've been on fewer coasters and more people have been on it? Again, it replaces one bias with another.

The truth is, I'd probably blow this poll off the same way I do the AT poll, but because it claims to have "sophisticated methods," I call B.S.

Last edited by Jeff,

Jeff - Editor - CoasterBuzz.com - My Blog

matt.'s avatar

Jeff said:
(probably Brian Noble, since he's the only "real" academic here)

I'm getting closer by the day. :) Unfortunately I think Brian leans heavily toward the quantitative side of being a brainiac, and I decidedly do not, so I believe he would still remain the expert on data collection and survey research.


Basically, Jeff is right on all points here. The size of the population really is irrelevant for most of the arguments being made here*, and across the board the comparisons made in Mitch's poll are too small to be taken, you know, scientifically.


That being said, and after skimming the last couple of pages, I think some people just don't get the math but what really matters is we're all on the same page with how to interpret the results - with a huge, huge grain of salt. Jeff is saying "These samples are too small to take the results seriously" but in the same breath agreeing that it's still the best poll we have. I don't really see anyone disputing that, and that's really at the core of it. The best way to look at these things, with the myriad problems with statistical power and selection bias, is to think about which coasters are generally "better" than others. Those top 10 coasters are probably, roughly, pretty darn good. The next 10, still awesome, but not quite so much. And on and on. 10 is arbitrary, make it 12 or 15.2 if you'd like. If you really care about what is #3 this year vs. what is #4 you have completely missed the point.


What's really missing from this poll (and maybe someone who is more of a stats head could work on this) are confidence intervals. The results are stated as a list of coasters from highest to lowest, but you could easily state things like


"We are X% confident that this percentage of enthusiasts prefers El Toro over Mean Streak"

Or something like that. That would easily satisfy what Jeff is saying because then you'd know, based on the sample size, exactly how likely one coaster really is preferred over another across the population. For El Toro vs. Mean Streak that would be quite high, but I'm guessing for the top 5 coasters in either poll the confidence would be quite low.

*For more go take a Stats 1 class at your local community college. Your life will be better for it, seriously.

Last edited by matt.,
matt.'s avatar

Ok, if anyone is more of a stats expert here, correct me if I do any of this wrong.

p (the point estimate) = X/n, where n = the sample size and X = the number of voters in the sample who meet the condition we're looking for, in my first case a preference for Voyage over El Toro. According to the spreadsheet:

X/n = 77/144 = .535

Z in this case we'll call 1.96 for a confidence level of 95%

sqrt(.535 * .465 / 144) = .04156..., and that multiplied by 1.96 = .08147

So our range is .453 to .616.

Our sample statistic was 53.5% of voters preferred Voyage over El Toro. We are 95% confident that the value for the ENTIRE POPULATION is between 45.3% and 61.6%.

I don't know about you, but for me that is quite a range. At the low end El Toro beats Voyage quite handily, and that's only at 95% confidence. Also notice the population size doesn't come into play at all.

Moral - take it with a grain of salt. All of it.

Now let's do Voyage vs. T-Express.

X/n = 10/15 = .6667

(.3333)(.6667) / 15 = .0148, square root of that times 1.96 = .2386

Excuse me if I've done any of this wrong but we are 95% confident that between 42.8% and 90.5% of coaster enthusiasts prefer Voyage to T-Express.

That's like nearly a 50 point spread. And again, the size of the population has no bearing.
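If anyone wants to check the arithmetic or try other pairs, the same calculation is easy to script. This is just a sketch of the normal-approximation interval used above (the function name is made up), and with a sample as small as 15 that approximation is shaky, so treat the second result as rough.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Voyage vs. El Toro: 77 of 144 mutual riders preferred Voyage.
print(wald_interval(77, 144))
# Voyage vs. T-Express: 10 of 15 mutual riders preferred Voyage.
print(wald_interval(10, 15))
```

Swap in any other pair's counts from the spreadsheet to see how quickly the range widens as the number of mutual riders drops.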

ApolloAndy's avatar

Haven't read all the comments yet, but 2 points:
1) Absolutely the people who have ridden more coasters should count for more. If a person has ridden 500 coasters in all corners of the world, their opinion on what's good is much more important than someone who has only ridden at Cedar Point.

2) The real issue with the sample size, even if it is 15 out of 20 mutual riders, is that there are too many independent and unrelated variables at that level to draw a conclusive result. Specifically, you don't know if 8 of those 15 happen to have been on the ride on the same day or hate Korea or have eaten KimChee the night before or whatever. There just aren't enough people to eliminate variables NOT related to coaster riding to draw a conclusion about coaster riding.

Edit: Not to mention the fact that you do know that they all traveled and all paid quite a bit of money to get there. That may balance out in the T-express vs. Troy comparison, but it definitely will not balance out in the ExpGeForce v. S:RoS.

Last edited by ApolloAndy,

Hobbes: "What's the point of attaching a number to everything you do?"
Calvin: "If your numbers go up, it means you're having more fun."

rollergator's avatar

Lord Gonchar said:

What if my goal were to find out what people named Jeff Putz who also run roller coaster websites thought of avocados?

After I asked you, would my results be statistically insignificant because my sample size was only one (too small) or would they be entirely accurate because my sample rate was 100% (the entire population of people named Jeff Putz that also run roller coaster websites)?

The short answer is you're not doing a statistical analysis because you didn't take a sample, you have the entire census data at your disposal. You run statistics on samples taken FROM a larger population.

matt. said:

Ok, if anyone is more of a stats expert here, correct me if I do any of this wrong.

(snip) We are 95% confident that the value for the ENTIRE POPULATION is between 45.3% and 61.6%.

Your stats look square, but my understanding is that the interpretation of a confidence interval is supposed to read more like this: "In repeated sampling using the methodology selected, 95% of samples will result in a sample mean between 45.3% and 61.6%." The methodology we used will result in a sample mean OUTSIDE that range 5% of the time. Took me a while to really grasp the CI concept...but I think I have it now.
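For anyone who wants to poke at the repeated-sampling idea, here is a rough simulation sketch. It assumes a "true" preference share (53.5% is just borrowed from the sample above, not a known population value), draws many samples, builds a 95% interval from each one, and counts how often the interval contains the assumed true share. That coverage rate is the textbook reading of a confidence interval, which is slightly different from saying that 95% of sample means land inside one particular interval. The function name and the number of trials are arbitrary.

```python
import math
import random

def coverage(true_p, n, trials=2000, z=1.96, seed=1):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    hits = 0
    for _ in range(trials):
        # Simulate n voters, each preferring ride A with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

print(coverage(0.535, 144))  # sample size from the Voyage vs. El Toro pair
print(coverage(0.667, 15))   # sample size from the Voyage vs. T-Express pair
```

With the larger sample the coverage should come out close to the nominal 95%; with only 15 voters the normal approximation tends to fall short of it.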


You still have Zoidberg.... You ALL have Zoidberg! (V) (;,,;) (V)
