Forums > Discussion and Debate > Physical & Life Sciences > Is global warming just another End-of-the-World delusion?
<blockquote data-quote="Lucy Stulz" data-source="post: 63316850" data-attributes="member: 328376"><p>-sigh-</p><p></p><p>Let's do some real-world work here:</p><p></p><p>In <a href="http://cran.us.r-project.org/" target="_blank">R</a> I have generated two POPULATIONS, A and B. (You are familiar with R, no doubt, since you are a computer scientist with training in statistics.)</p><p></p><p>I used the following commands to generate the two populations:</p><p></p><p><span style="font-family: 'Fixedsys'">A<-rnorm(1000,2,4)</span> which gives a normal distribution with mean 2 and sd = 4</p><p></p><p><span style="font-family: 'Fixedsys'">B<-rnorm(1000,7,4)</span> which gives mean 7 and sd = 4</p><p></p><p>Then I had R draw a few samples we'll call "a1", "b1", "a2", "b2", "a3", "b3", etc.</p><p></p><p>Each is 10 items drawn from the POPULATION.</p><p></p><p>I used the following command:</p><p></p><p><span style="font-family: 'Fixedsys'">a<-sample(A,10,replace=FALSE, prob=NULL)</span></p><p></p><p>Now, you can run this sort of test yourself and you'll get different values, but remember: the TRUE MEAN of A is 2 and the TRUE MEAN of B is 7.</p><p></p><p>On repeated SAMPLING I get the following means:</p><p></p><p>a: 1.6, 0.9, 3.2, 3.5, 3.3</p><p>b: 5.3, 8.2, 5.4, 8.6, 7.7</p><p></p><p>And for each of these sample pairs I ran a t-test, which gives the following p-values:</p><p>0.12, 0.001, 0.324, 0.016, 0.03</p><p></p><p>MEANING that in 3 out of 5 runs I was able to see a significant difference (p < 0.05) between the two SAMPLES.</p><p></p><p>In reality, the t.test(A,B) command on the full populations returns a p-value < 2.2e-16 (the smallest value R reports by default). That's pretty significant.</p><p></p><p>It is a large data set, so it can easily show the differences.</p><p></p><p>The sample means are, NECESSARILY, off from the TRUE POPULATION means.</p><p></p><p>Was my sampling "bad"? Well, it was "random" sampling, so it's about as good as I could get.</p><p></p><p>And in the end, that is what Anderegg et al. were doing: SAMPLING A DATABASE. And they clearly outlined the criterion, which is the BEST that can be done with a given database. That's it!</p><p></p><p>Your critiques of the "problems" with the "hit count" for any given researcher have two problems:</p><p></p><p>1. Will that difference in "hit count" result in a REAL difference in the result?</p><p></p><p>2. Given the nature of the database, this is properly called NOISE (error), and since it is not UNDER THE CONTROL OF ANDEREGG ET AL., it is not an induced bias.</p><p></p><p>Only, unlike MY example, the DIFFERENCES in the samples ARE SO LARGE that the p-values from the Mann-Whitney U test make it really hard to see how those numbers could be shifted significantly. (Not to say they can't!)</p></blockquote><p></p>
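The experiment described in the post can be reproduced outside of R. Below is a minimal sketch in Python using NumPy and SciPy in place of R's rnorm, sample, and t.test; the seed and variable names are illustrative choices, not part of the original post, and the individual small-sample p-values will differ from the ones quoted above because the draws are random.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Two populations, analogous to A <- rnorm(1000, 2, 4) and B <- rnorm(1000, 7, 4)
A = rng.normal(loc=2, scale=4, size=1000)
B = rng.normal(loc=7, scale=4, size=1000)

# Repeated small samples of 10 without replacement,
# analogous to a <- sample(A, 10, replace=FALSE), with a t-test per pair
p_values = []
for _ in range(5):
    a = rng.choice(A, size=10, replace=False)
    b = rng.choice(B, size=10, replace=False)
    p_values.append(stats.ttest_ind(a, b).pvalue)

# The full-population comparison, analogous to t.test(A, B)
p_full = stats.ttest_ind(A, B).pvalue

print("small-sample p-values:", p_values)
print("full-population p-value:", p_full)
```

With n = 1000 per group and true means 5 standard errors' worth apart, the full-population p-value is vanishingly small, while the n = 10 samples only sometimes clear p < 0.05: the same point the post makes about sampling noise versus population-level differences.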