Global warming--the Data, and serious debate

Chalnoth · Nov 6, 2008

thaumaturgy said:
There's no "offset" in the data here. But there's still a massive peak at zero.

You keep saying this, but it's wrong. The value at zero means nothing at all. The zero frequency mode is the average value of the graph. It's not the value at zero that is any interest here, but instead the behavior near zero. You really should remove the zero mode from all such plots.
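
The point about the zero mode can be checked numerically. The following is a minimal sketch, not taken from any poster's analysis: the series, its offset of 5.0, and the helper name `dft_bin` are all made up for illustration. It shows that the zero-frequency coefficient of a discrete Fourier transform is just the number of samples times the mean, and that subtracting the mean removes it without touching the other bins.

```python
import cmath
import math


def dft_bin(x, k):
    """Return the k-th discrete Fourier coefficient of the sequence x."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))


# Hypothetical series: a constant offset of 5.0 plus three full sine cycles.
n = 120
series = [5.0 + math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
mean = sum(series) / n

zero_mode = abs(dft_bin(series, 0))             # equals n * mean
centered = [v - mean for v in series]
zero_mode_centered = abs(dft_bin(centered, 0))  # ~0 once the mean is removed
cycle_before = abs(dft_bin(series, 3))          # the 3-cycle peak ...
cycle_after = abs(dft_bin(centered, 3))         # ... is unchanged by centering

print(zero_mode / n, mean)
print(zero_mode_centered, cycle_before, cycle_after)
```

Any information about a trend therefore has to come from the bins next to zero, not from the zero bin itself.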

grmorton · Nov 6, 2008

Tomk80 said:
Depends on what the original purpose of the thermometers was, does it not? "As a scientist" I often have to deal with data that is far from perfect, for example because it is collected by volunteers or because the data was never originally intended for the thing you want to use it for. This doesn't mean this data is useless, but it does mean you need to take into account it isn't perfect. From what I can gather, the data of the National Weather Services wasn't originally intended to specifically deal with long-term climate trends, nor do climate scientists have influence on how the data is gathered. If they don't own the stations, they can at best ask for suggestions to be taken into account. From what I can gather, the network is largely run by volunteers and the personnel involved is too low-staffed to check up on all sites. Next to that, a site may have been good to start with but deteriorated by local factors. Perhaps the thermometer was not placed next to the air vent but rather vice versa, for example?

The US Climate Reference Network has been specifically designed to measure long-term climate trends, but this network has only been operational since 2001. I don't know whether its data have already been used for comparison purposes.

This does not make the data a priori useless; it just means that you need to take into account the error they can give, for example by giving less weight to data or by correcting the data of the worse stations with the help of the better stations before aggregating the data. As far as I understand, the latter is being done with the national weather stations. If you can estimate the error and direction of a measurement, you can use the measurement if you take this into account. So rather than rejecting the stations out of hand, you need to study the whole path from observation, via correction, to eventual conclusions. In everything discussed so far, the method of correcting for differences between stations has been virtually ignored. Which to me, "as an epidemiologist", is "disgusting".

"As a scientist", I'm wondering what kind of scientist you are. Not meant denigrating, but I've often noticed differences between different "kinds" of scientists, often stemming from a lack of insight in the different problems different fields have. Watch a discussion between an epidemiologist and a toxicologist for hilarious effect. The difference between an epidemiologist who often has to use an estimate of exposure and effect over large groups to make statements for whole populations to a toxicologist who knows exactly how much of a certain substance and which effect he measured on his very low number of (human or animal) guinea pigs is extremely interesting.

Much of what you say is true. The data is collected by volunteers but that doesn't mean that the data is therefore useful for determining the climate of the past. It may be precisely because it was collected by volunteers that it is not useful.

grmorton · Nov 6, 2008

Split Rock said:
What is wrong with this data? The linear regression shows a 3 degree increase in mean annual temperature over the time period measured. The cyclic nature of the data is apparent, but no one is arguing with that. We are talking about a Global Warming trend, not a warming for one station, or one state, or even one country. I applaud your efforts at rooting out bad data, Glenn, but perhaps you are not seeing the forest because you are looking too hard at the individual trees.

The problem with the data is that in 1989 the annual average temperature was 48.09 deg F. It had been hovering around that temperature for about 40 years. Then suddenly in 1990, the annual average temperature took a step change of 10 degrees. The 1990 annual average temperature was now 58.04 deg F. The three degree regression is due totally to this 10 deg F step function in temperature.
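
A minimal sketch of how a single step can produce a regression slope, using a synthetic record rather than the actual station data: the years, the flat 48.0 and 58.0 values, and the helper `ols_slope` are assumptions chosen only to mimic the shape described above.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic record: perfectly flat at 48.0 until 1989, then flat at 58.0.
years = list(range(1950, 2008))
temps = [48.0 if y < 1990 else 58.0 for y in years]

overall = ols_slope(years, temps)           # positive, driven only by the step
before = ols_slope(years[:40], temps[:40])  # 1950-1989: no trend at all
after = ols_slope(years[40:], temps[40:])   # 1990-2007: no trend at all

print(overall, before, after)
```

The fit over the whole record reports warming even though neither segment has any trend of its own, which is why a step has to be identified before a regression line is interpreted.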

Now, I don't know where you live, but wouldn't you find it odd and huge news if the temperature in your city suddenly averaged 10 degrees hotter? Here in Houston, it would mean 110 degree days in the summer. Such temperatures don't come here because the ocean is near.

Here is the original analysis. I took all the good stations. I derived an average for the first 11 years for which ALL stations had temperature readings. That is 1912-1922. I then subtracted that average from each respective station. I did this so that one could have an anomaly trend for each station. One can't compare the temperatures directly because Berkeley is on the coast and California has a bit of elevation, so absolute temperature differences are problematic. But by making each station have its own baseline, I could see how each station has trended over the past 100 years. That is the first chart. What we see is 20 degrees of temperature anomaly spread between all stations. If we are supposed to see an average upward trend in this data, I can't figure out how.
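
The baseline step described above can be sketched as follows. The two stations, their temperatures, and the short 1912-1914 baseline window are invented for illustration (the post uses 1912-1922), and `anomalies` is a hypothetical helper name.

```python
def anomalies(record, base_start, base_end):
    """Subtract a station's own baseline-period mean from every year."""
    base = [t for year, t in record.items() if base_start <= year <= base_end]
    baseline = sum(base) / len(base)
    return {year: t - baseline for year, t in record.items()}


# Hypothetical annual means (deg F) for a warm coastal and a cool mountain site.
coastal = {1912: 56.0, 1913: 57.0, 1914: 55.0, 1950: 57.5}
mountain = {1912: 44.0, 1913: 45.0, 1914: 46.0, 1950: 44.5}

coastal_anom = anomalies(coastal, 1912, 1914)
mountain_anom = anomalies(mountain, 1912, 1914)

# The absolute temperatures differ by more than 10 deg F, but the anomalies
# are directly comparable: each is measured from that station's own baseline.
print(coastal_anom[1950], mountain_anom[1950])
```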

Theoretically the 11 years 1912-1922 should be colder than the present years due to CO2 heating. That heating should have come about gradually over the past 100 years. But what we see is something quite different, as you said (using a term I first heard when living in Scotland) it is a dog's breakfast--for those who don't know the meaning of this term, when a dog vomits, he eats it up. That is a dog's breakfast.

The second chart is the same thing only with 1977-1987 as baseline average. I did this one because someone might complain that the old thermometers are no good. But it seems that the new thermometers are no better. At max we still have a 20 deg F temperature spread among the anomalies. This means that the noise is 18 times the signal, which is 1.1 deg F of warming. For those who have never done signal to noise analysis, one can hardly see the signal if the noise is 2-3x the signal. At least that is the way it is in seismic time series analysis.

Finally I want to post the Electra temperature trend. It too has some real stupidities, yet it is supposed to be a class two, meaning a well-sited station. For 80 years the temperature varied between 57 and 62 deg F. But in 1985 the annual average temperature suddenly jumped by 16 degrees. It is this jump which forces the regression upward. The interesting thing to me is that no one could seriously believe that overnight every day in 1985 was 16 deg F hotter than every day in 1984. That is what it must be on average. That is a ridiculous claim, but that is what the weather data claims happened. What a laughable situation.

Does anyone want to defend that 16 deg jump in temperature? Does anyone want to actually defend the 10 deg jump in Susanville California in 1990? Thaumaturgy, you seem to like defending things that are problematical. Want to do this one? Or are you willing to acknowledge that this data is crap?

grmorton · Nov 6, 2008

thaumaturgy said:
I need to know:

Glenn, presented with the above data set, you can clearly see a linear trend increasing that is coupled with a cyclical trend.

I need to know how would you prove or disprove the obvious linear trend in the data that is unrelated to the cyclical data?

Go ahead, show me the "phase" diagram, just anything, I need to know.

Because I'm seeing a LOT of time series analyses and they run them and factor in or factor out the linear trends.

If you can show me you would prove or disprove a linear trend in data that has a cyclicity we can then revisit this issue in more detail as to whether the earlier "Global" temp data even has a linear trend in it.

Time Series analyses are extremely important to this conversation. Since Time Series are run for data all the time all across the globe, please tell me how you would describe the above data sets.

I'd be very interested to learn how you verify or falsify a linear trend in data that has a cyclic component in it. (Because such things do exist.)

Maybe that will help me prove or disprove my earlier contention on the Global Temperature data we were discussing many pages ago (that started all this).

(Please illustrate with a data set that shows both forms with and without trend and how you differentiate the data in a repeatable and robust manner).

No, Thaumaturgy, I am becoming embarrassed by continuing to have to correct your amateurish mistakes. Let's review, and this will be my last reply to you on the Fourier issue. It is clear that you have a belief a priori to the data, so that you twist everything to fit your belief system. THERE MUST be a secular trend in the satellite data so you twist things to make it so. This FFT discussion started way back in post 7.

Post 7

grmorton said:
The satellite data from Huntsville measures the temperature of the lower troposphere. As you can see the chart goes up and down but over 30 years, the tropospheric temperature is just about what it was 30 years ago, only a tiny tiny bit of net warming

In post 13 you binned the data into yearly bins and then said that the temperature rise seen in the satellite data was statistically significant.
In post 24 I said

grmorton said:
You didn't seem to notice that the ending temperature, in 2007 was about the same as the starting temperature in 1979. That is my point. I have no doubt that the temperature goes up and down, but a trend? not necessarily because today's low is not significantly higher than that of 30 years ago. The fact is that we have more CO2 in the atmosphere today than 30 years ago and we don't have a higher tropospheric temperature.

my bolding

In post 40 I uploaded the satellite measurement of the Lower Troposphere temperature which showed cyclicity.

grmorton said:
I will again upload the satellite data. From 1979 to 1997, there is no rise at all. Indeed, the zero line almost perfectly bifurcates the cyclical data. Then there is a bump with the very active solar cycles of the early 90s and early 2000s, then the temperature goes back to about where it was in 1979. I stand by this. This is NOT a linear phenomenon. The ups and down are NOT randomly distributed but cyclically distributed.

Please look at the data, not at your bias.

I would note that the bump I speak of is a box function kind of shape. That means it will have a high low frequency component.
You then came back with the Kappa function, run on yearly summed data and concluded that there is a secular change.

You said two things of interest in Post 72

Thaumaturgy said:
I will make the huge caveat that I have never done a "time series" analysis in the usual stats program I utilize.

That doesn’t make you wrong necessarily but it does mean that you shouldn’t be standing on your hind legs over the issue.

Thaumaturgy said:
Now, again, I am wholly new to the Fisher's kappa function but here's what JMP says about this function

{GRM--after quoting the handbook on the function Thaumaturgy then said}

This indicates to me that at 95% confidence (in fact at 98% confidence) I am reasonable in rejecting the hypothesis that this is more likely a "periodic" function within this time domain

In post 72 you said I had merely looked at the data going up and down and concluded it was a cyclical and that most likely that was a yearly variation.

In post 90 you rejected cyclicity in the satellite temperature data.

thaumaturgy said:
I strenuously disagree. My linear regression has a p-value showing significance. YOU are under a burden to prove that the cyclicity is a better model.

And in that same post you wrote

Thaumaturgy said:
Ask your stats friend to explain the Fishers' Kappa function. I would love to learn more about it. But from what I can tell, Fisher's Kappa indicates no such "cyclicity" among the noise to a 99% level of assurity.

My bolding. But of course, as we shall see, you admitted this was wrong.
In post 97, I had to point out to you that you screwed it up by using yearly averages in your Kappa calculation and in your FFT. Until then, you didn’t know of this problem.

grmorton said:
First off, the satellite temperature data you have is a yearly time series, not a monthly. I used monthly. Thus, you miss out on the monthly periodicity. You still get a periodicity of 4 years rather than 64 months but I think that is because you are not using the monthly data but are clearly using annual data. That reduces the fidelity of the signal and impoverishes the frequency content. Beyond that, since I didn't do your analysis, I can't explain it.

In post 105, I again had to point out that your kappa function was based upon your yearly data, which was a huge error on your part.
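
A rough sketch of what yearly binning does to the frequency content, using a synthetic monthly series rather than the satellite data. The 12-month and 60-month cycles are assumptions (60 months stands in for the 64 months mentioned above so that the peak falls on an exact bin), and the helper names are hypothetical.

```python
import cmath
import math


def dft_power(x):
    """One-sided power spectrum |X_k|^2 for k = 0 .. len(x) // 2."""
    n = len(x)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))) ** 2
        for k in range(n // 2 + 1)
    ]


def yearly_means(x):
    """Average consecutive blocks of 12 monthly values."""
    return [sum(x[i:i + 12]) / 12 for i in range(0, len(x), 12)]


months = 360  # 30 years of synthetic monthly data
annual_cycle = [math.sin(2 * math.pi * i / 12) for i in range(months)]
slow_cycle = [math.sin(2 * math.pi * i / 60) for i in range(months)]
monthly = [a + s for a, s in zip(annual_cycle, slow_cycle)]

monthly_power = dft_power(monthly)              # peaks at bins 6 and 30
annual_power = dft_power(yearly_means(monthly)) # only 16 bins survive

# In the yearly series the shortest resolvable period is 2 years, so the
# 12-month cycle (bin 30 of the monthly spectrum) is averaged away entirely.
peak_bin = max(range(1, len(annual_power)), key=annual_power.__getitem__)
print(len(monthly_power), len(annual_power), peak_bin)
```

Here `peak_bin` corresponds to a period of 30 / 6 = 5 years: the slow cycle survives the binning, while everything faster than the two-year limit of the yearly series is lost.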

Finally in post 129, after standing on your kappa function, accusing me of not discussing statistics, and of not proving that the satellite data was cyclical, astoundingly you write:

Post 129

Thaumaturgy said:
I think I was mistaken about the Kappa function. It does show a statistical significance for cyclicity when it is low on the p-value.
No problem. We see from the graph that, as Glenn has pointed out, there is, indeed, cyclicity. AND it has a multi-year period. The residuals bear this out.

HOWEVER, from what I can tell the large peak at or near "zero" on the FREQUENCY graph, as well as the raw data graph itself, show a secular trend.

The way JMP models time-series is to assume the larger secular trend is actually just an extremely long-wavelength cyclicity I believe. Hence the "periodogram" in the lower left with an extremely long period peak of importance.

So we are back at square one. Indeed there is a cyclicity that is on a longer time scale than merely a seasonal as would be expected.

Bolding mine.

You were so sure I was wrong and then you had to admit that you didn’t know what you were talking about. It is clear from this that you are quite willing to take hard stands on things you are inexpert at.

Then you changed the issue to try to say that the zero frequency was a secular trend. This is to try to maintain your belief that there is a strong secular trend in the satellite tropospheric temperature data in spite of it only going up at 1/3 the rate of your global temperature trend.

What rubbish that is. And you are still saying it. Even the low frequency component doesn’t ensure a secular trend, as I demonstrated with the box function, yet you continue to claim that Fourier analysis with high amp low frequencies requires a secular trend. It doesn’t. You clearly don’t know what the heck you are talking about and you have had to back track numerous times, including once when you mis-read annual maximum temperature for being an annual temperature.

You acknowledged that you hadn’t done time series analysis, but yet you act as if you know what you are doing. It is all bluster with you isn’t it?

You have made so many mistakes and errors, that I am beginning to get embarrassed to keep pounding you on them. It took a lot of pounding to get you to see that air conditioners would bias a temperature record. You stood on the p-test that there was no cyclicity in the satellite data and then had to back track. You stood on the Kappa function and then had to back track, admitting that you were wrong. You said that the California station surveys were biased to big cities, but I pointed out that the stations included a large percentage of towns with populations beneath 2000. You didn't bother checking up on that before you made the assertion. Your credibility is about zero with me, Thaumaturgy. And tonight I see that Chalnoth is still telling you to get rid of the zero in the fft. He is correct; you are again wrong.

Thaumaturgy said:
There's no "offset" in the data here. But there's still a massive peak at zero.

Chalnoth is quite correct. You are utterly in error on this point. No doubt it won’t stop you from continuing to assert nonsense.

Your claim was, that the satellite data didn’t show cyclicity. I said it did.

You now agree with me as documented in post 129

Post 129

Thaumaturgy said:
I think I was mistaken about the Kappa function. It does show a statistical significance for cyclicity when it is low on the p-value.
No problem. We see from the graph that, as Glenn has pointed out, there is, indeed, cyclicity. AND it has a multi-year period. The residuals bear this out.

My bolding.

You then claimed that the low frequency component in the satellite data means there is a secular trend. You are now in the process of changing that to a claim that it can mean a secular trend. It can, but your logic is highly flawed. While secular trends will have some low frequency content, you can’t do what you did and look at an FFT and conclude that there is a secular trend. It is a one way gate. Most secular trends require low frequency (unless the trend is very steep) but not all low frequencies indicate a secular trend. Below is a power spectrum from seismic data. It has no secular trend, yet it has a huge amp at 1 Hz. Thus, you can't look at the satellite data and claim that the FFT proves there is a secular trend any more than you can say one exists in a box function or on seismic data. Flawed logic leads to flawed conclusions.
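
The box-function argument can be sketched numerically. This is an illustration only: the 360-sample length, the placement of the box, and the helper names `dft_mag` and `ols_slope` are assumptions, not the data or code used earlier in the thread.

```python
import cmath
import math


def dft_mag(x, k):
    """Magnitude of the k-th discrete Fourier coefficient of x."""
    n = len(x)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x)))


def ols_slope(ys):
    """Least-squares slope of ys against the sample index 0, 1, 2, ..."""
    n = len(ys)
    mx = (n - 1) / 2
    my = sum(ys) / n
    sxx = sum((i - mx) ** 2 for i in range(n))
    sxy = sum((i - mx) * (y - my) for i, y in enumerate(ys))
    return sxy / sxx


# A box (bump) centered in the window: it starts and ends at the same level.
n = 360
box = [1.0 if 120 <= i < 240 else 0.0 for i in range(n)]
mean = sum(box) / n
centered = [v - mean for v in box]

power = [dft_mag(centered, k) ** 2 for k in range(1, n // 2 + 1)]
low_fraction = sum(power[:2]) / sum(power)  # share of power in the 2 lowest bins
slope = ols_slope(box)                      # fitted linear trend

print(low_fraction, slope)
```

Most of the non-constant power sits in the two lowest-frequency bins, yet the fitted slope is zero, so strong low-frequency content by itself does not establish a secular trend.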

Since our original disagreement which got us into the FFT was about the cyclicity of the satellite data and you now agree with me, I see no reason to spend any further time educating the unwilling on this issue. Believe what you want about the secular trend. What you currently believe is wrong, but that is so tangential to the discussion of the validity of the data for determining global warming, I won’t discuss it further with you. It is not worth my time. Say what you will about this, but you may have the last word on FFT (this time it really is over, since you clearly agree that the satellite data is periodic, which is contra your original claim).

When you have said that I should deal in things I am familiar with, I am doing that—by your own admission in post 72 you were new to time series analysis and the kappa function and you have screwed both up. And you even had to withdraw your statement that there was no cyclicity based on the p-test (see above). And you say I don’t know what I am talking about or that I don't understand statistics. LOL.

You may have the last word on this topic. I am tired of beating up on you.

Tomk80 · Nov 7, 2008

grmorton said:
Much of what you say is true. The data is collected by volunteers but that doesn't mean that the data is therefore useful for determining the climate of the past. It may be precisely because it was collected by volunteers that it is not useful.

No, it doesn't mean it is useful. It doesn't mean it is useless either, however. And I haven't seen you actually engage in that discussion yet. You reject the methods of making the data useful out of hand, without actually discussing them.

thaumaturgy · Nov 7, 2008

Chalnoth said:
You keep saying this, but it's wrong. The value at zero means nothing at all.

Chalnoth, please re-read the various postings I've made from both the SAS Institute and a PhD statistician.

I will defer to experts on statistical time series analyses.

(You will note the same "peak at or near zero" shows up whether I have an offset or not in the data, as I showed in numerous previous postings.)

An extremely (as in extreeeemely) long wavelength cycle will result in an extreeeemely low frequency peak.

If you deny that, then I am unsure what kind of math you are doing.

Again, please refer to the numerous citations available on statistical time series analyses.

Chalnoth said:
The zero frequency mode is the average value of the graph. It's not the value at zero that is any interest here, but instead the behavior near zero. You really should remove the zero mode from all such plots.

NOTE:

from the SAS Institute:

The data are displayed as a thick black line in the top left plot. The periodogram of the data is shown as dots in the top right panel. Note the exceptionally high periodogram values at low frequencies. This comes from the trend in the data. Because periodogram analysis explains everything in terms of waves, an upward trend shows up as a very long (low frequency) wave. (SOURCE)

(emphasis added)

From a PhD Statistician who is also a 6 Sigma Master Blackbelt and Statistics instructor in response to my question:

Thaumaturgy said:
In the earlier e-mail you stated that you parsed out the "linear" trend as a zero peak in the periodogram of the RESIDUALS of the linear fit to the data.

I'm a bit confused. To help me understand this a bit more I ginned up a fake data set that has one cyclical component (a sine wave) with and without an overlaid linear trend (Y trend), which was generated by taking the sine function and then adding on an (X-mean(X)) factor to give it a nice linear trend.

I ran a time series on both and saw that nice big spike at zero for the time series data on Y-Trend.

Am I correct in the statement:

Linear trends in time-series data are often represented by a peak in the frequency periodogram at zero

Or am I missing something altogether here?

(Also, the residuals of the linear fit of the Y-Trend data shown here plot with the same sine wave frequency as the original data set, which is what I'd expect).

To which he responded thusly:

PhD Statistician said:

Yes, that is correct.

If you have a problem with these points I highly recommend you work on re-writing the mass of information around statistical time series analysis.

Indeed, as I have shown repeatedly, when you construct a data set with ONLY one cyclical component and a demonstrable linear trend, and plot it with and without the linear trend term, the peak in the periodogram at zero disappears once the trend term is removed.

Honestly, I hope you and Glenn are not claiming that a "secular trend" could be represented as an extreeeemely long period wavelength cycle which, in the shortened time scale of the graph would show up as a secular trend.

Further, I'm no expert on FT but I'm not a complete idiot:

LONG WAVELENGTH FUNCTIONS HAVE LOW FREQUENCIES.

That's really all the periodogram is saying. It can't tell the difference between a secular trend and an extremely long period wavelength function. The peak at or near zero is merely a very low frequency wave which could be a secular trend.
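
The fake data set described in the e-mail above (one sine wave, with and without an added X-mean(X) term) can be sketched like this. The length of 240 samples, the 12 cycles, and the 0.01 trend coefficient are assumptions; the original values are not given in the thread.

```python
import cmath
import math


def dft_mag(x, k):
    """Magnitude of the k-th discrete Fourier coefficient of x."""
    n = len(x)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x)))


n = 240
x_mean = (n - 1) / 2
y = [math.sin(2 * math.pi * 12 * i / n) for i in range(n)]         # sine only
y_trend = [v + 0.01 * (i - x_mean) for i, v in enumerate(y)]       # sine + trend

zero_flat, zero_trend = dft_mag(y, 0), dft_mag(y_trend, 0)  # exact zero bin
low_flat, low_trend = dft_mag(y, 1), dft_mag(y_trend, 1)    # lowest nonzero bin

print(zero_flat, zero_trend)
print(low_flat, low_trend)
```

In this construction the centered trend term has zero mean, so the bin at exactly zero frequency is essentially zero in both series. The trend shows up in the lowest nonzero bins, which is where the two series differ.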

grmorton · Nov 7, 2008

Tomk80 said:
No, it doesn't mean it is useful. It doesn't mean it is useless either, however. And I haven't seen you actually engage in that discussion yet. You reject the methods of making the data useful out of hand, without actually discussing them.

I thought I had explained why it is useless. Search on the term signal to noise. Search on the term standard deviation in my posts.

Let's start with signal to noise. The first picture is a contour map of a pyramid with a hole in the center. It is viewed from 50 deg above the horizon. You can see the pyramid sticking up. The sunlight is coming from the upper right. In this picture there is no noise so the signal to noise ratio is infinite. Signal/0 = infinity.

Now, let's add some noise. The second picture shows the pyramid with a 9 to 1 signal to noise ratio. This is calculated based upon the average energy of the signal and noise separately. With a S/N of 9 one can easily see the pyramid.

Now the next picture has a S/N of 2. One can barely make out the pyramid.

I won't show it but even at a S/N of 1, you can barely make out the pyramid. But, when noise is twice the signal, find the pyramid. That is the last picture.
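
A sketch of how noise can be scaled to a chosen signal to noise ratio, assuming "average energy" means the mean of the squared samples. The one-dimensional triangle standing in for the pyramid, the Gaussian noise, and the helper names are all assumptions for illustration.

```python
import math
import random


def mean_energy(x):
    """Average energy of a sequence: the mean of its squared samples."""
    return sum(v * v for v in x) / len(x)


def snr(signal, noise):
    """Signal to noise ratio as a ratio of average energies."""
    return mean_energy(signal) / mean_energy(noise)


def scale_noise(signal, noise, target_snr):
    """Rescale noise so that snr(signal, scaled_noise) equals target_snr."""
    factor = math.sqrt(mean_energy(signal) / (target_snr * mean_energy(noise)))
    return [factor * v for v in noise]


random.seed(0)
pyramid = [1.0 - abs(i - 50) / 50 for i in range(101)]  # 1-D stand-in
raw_noise = [random.gauss(0.0, 1.0) for _ in pyramid]

# S/N of 9, 2, and 0.5 (noise energy twice the signal energy).
for target in (9.0, 2.0, 0.5):
    noise = scale_noise(pyramid, raw_noise, target)
    noisy = [s + e for s, e in zip(pyramid, noise)]
    print(target, snr(pyramid, noise), max(noisy))
```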

How does this relate to global warming? Well, the signal we are trying to get out of the data is 1.1 deg F. Yet we regularly see spikes of 10 deg F or more. Closely spaced cities vary in annual temperatures by up to 6 deg over 20 miles in areas (like Hallettsville and Flatonia, Texas) which should not differ that much in temperature. Anyone who has been to that part of Texas knows it all looks the same.

Another part of the noise is the 16 degree temperature steps in places like Electra, CA or the 12 deg temperature step at Susanville, or the 10 degree jump at Wenatchee, Washington. There are lots of these noise steps as well. And they must be considered to be noise. If a temperature step of 10 deg can occur in the record, the question must be asked, would we recognize a step of half a degree? Maybe, maybe not.

Now, I also earlier discussed the standard deviation of temperature records. I pointed out that it is ridiculous to use a ruler to measure your table and come up with the data that says the table is 5 feet long plus or minus 10 feet! Such a measurement is useless.

Now to standard deviation. I took the area of eastern Colorado and studied the temperature records of 5 towns mentioned in an article that Thaumaturgy pointed me to. This area is about 70 x 40 miles wide. Thus the climate should be somewhat similar. It is in far eastern Colorado so there aren't local mountains to get in the way.

I did a de-spike, meaning I got rid of 5 deg sudden changes in temperature. Then I did an adiabatic correction on the data, which datumized the temperatures to a constant elevation. Then I de-biased the temperatures by making all their averages equal. Then I scanned each year for the maximum and minimum annual temperature. The average Max-Min temperature was 2 deg F for all the years. It was not the case that one town was always hotter and one town always cooler. The rankings changed with each year. That is noise. It seems to me that when the average variance of temperature across a small area is 2 deg F, then to claim that you can measure a 1.1 deg F change is doubtful.
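
The de-bias and max-min steps can be sketched as below. This covers only those two steps (no de-spiking and no elevation correction), and the three towns and their temperatures are invented, not the Colorado records.

```python
def debias(stations):
    """Shift each station so that all station means equal the grand mean."""
    means = {name: sum(v) / len(v) for name, v in stations.items()}
    grand = sum(means.values()) / len(means)
    return {name: [t - means[name] + grand for t in v] for name, v in stations.items()}


def yearly_spread(stations):
    """Max minus min across stations, one value per year."""
    return [max(year) - min(year) for year in zip(*stations.values())]


# Hypothetical annual mean temperatures (deg F), same four years per town.
towns = {
    "town_a": [50.0, 51.0, 49.0, 50.0],
    "town_b": [52.0, 51.0, 53.0, 52.0],
    "town_c": [48.0, 49.0, 48.0, 47.0],
}

spread = yearly_spread(debias(towns))
average_spread = sum(spread) / len(spread)
print(spread, average_spread)
```

After de-biasing, any remaining spread within a year reflects year-to-year disagreement between the towns rather than a fixed offset between them.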

Then I took a standard deviation of the data and it was 1.5 deg F. Now, clearly stacking lots of stations would help, IF those stations actually are measuring what we think they are. Does it make more sense to say that your table is 5 feet long plus or minus 7 feet or that the temperature has risen 1.1 deg plus or minus 1.5? It makes no sense to say either. Did you see the Chinese data I posted? I have lots of nonsense like that. Towns just a few miles away, one freezes all year and one is balmy. I can show that to you with Walden NY vs West Point NY. In 1965 Walden's annual average temperature was 38 deg F. 16 miles or so away, West Point had a balmy 51 deg F annual average temperature. Don't think of these as daily temperatures which can swing wildly. Yearly averages shouldn't swing like this.

So, yes, I have discussed the unusability of the data. You must have missed it.

Off to the ranch

thaumaturgy · Nov 7, 2008

grmorton said:
No, Thaumaturgy, I am becoming embarrassed by continuing to have to correct your amateurish mistakes. Let's review, and this will be my last reply to you on the Fourier issue. It is clear that you have a belief a priori to the data, so that you twist everything to fit your belief system. THERE MUST be a secular trend in the satellite data so you twist things to make it so. This FFT discussion started way back in post

OK, just so we know you are unable to address the professional time series experts on this topic. That way I know where you stand.

Mr. Glenn Morton takes exception to the SAS Institute and a PhD statistician and he can't explain why they are wrong and he is right.

Got it!

grmorton said:
In post 13 you binned the data into yearly bins and then said that the temperature rise seen in the satellite data was statistically significant.
In post 24 I said

And indeed, when you "unbin" the data the same linear trend shows up with a p-value of 0.0001.

Now, look Glenn, you are clearly an amateur in statistics and I hate to have to keep correcting your freshman mistakes, but a p-value like that on a fit means there's a 99.99% chance that it is non-zero (ie a trend).

I even went and did the time series analysis on the data (even after I confessed to over-filtering it, I went back, removed that filter and re-ran it). I found cycles with 45- and 90-month periods and a significant linear trend.

In fact, I went and showed that data to my PhD 6-Sigma Master Blackbelt friend who said this:

Statistician said:
I think there is a reason to believe in a linear trend--it's called global warming. The opposite view is that the trend is just part of a longer cycle. The periodogram cannot distinguish between these two hypotheses. It can only distinguish cycles that repeat within the time span of data series.

Whatever the big picture really is, there is no doubt there is some kind of trend within the time span of your data.

(emphasis added).

Now you are stuck with your own confirmation bias which you are maintaining by selectively filtering out the statistics.

I can't help you there. Only you can learn statistics for yourself.

grmorton said:
In post 90 you rejected cyclicity in the satellite temperature data.

No, what I said was there is a measurable, statistically significant linear trend, and that it was up to you to prove the data was dominated by a cyclic function. That was before I educated myself on time-series analysis.

I have since readily granted there are cyclical components in the data, but you have yet to acquiesce to the solid statistical reasoning and the WORD OF STATISTICIANS WHO ARE PROFESSIONALS that there is also a "secular" trend that is evident in the data.

I was incorrect in eliminating the cyclicity initially, but the cyclicity changes nothing in the original claim that there is a secular (or at least a very, very long-period trend).

grmorton said:
Finally in post 129, after standing on your kappa function, accusing me of not discussing statistics,

You will note that at all points in the kappa discussion I allowed that I might be in error.

You see, for me honesty in the discussion is important.

That is why I repeatedly asked you to verify or deny it.

YOU WILL NOTE THAT THE PERSON WHO CORRECTED ME WAS......ME.

(It always is. I'm the one who finds my errors and exposes them 99% of the time. That is called "honesty").

grmorton said:
Then you changed the issue to try to say that the zero frequency was a secular trend. This is to try to maintain your belief that there is a strong secular trend in the satellite tropospheric temperature data in spite of it only going up at 1/3 the rate of your global temperature trend.

What rubbish that is. And you are still saying it.

The SAS Institute is saying it.

A PhD Statistician who is a 6-Sigma Master Blackbelt is saying it.

grmorton said:
Even the low frequency component doesn’t ensure a secular trend

Are you...gasp...BACKPEDALING???? You told me it couldn't be a secular trend! HA! Seems you were caught in a classic logic error!

Good on ya! Glad you could "confess it".

grmorton said:
... as I demonstrated with the box function, yet you continue to claim that Fourier analysis with high amp low frequencies requires a secular trend.

The key is the low frequency.

grmorton said:
It doesn’t. You clearly don’t know what the heck you are talking about

ROFLMAO! Please, then, MR. Morton, TAKE IT UP WITH THE SAS INSTITUTE AND PhD STATISTICIANS!!!!!

You are too funny. You aren't even arguing with ME anymore...YOU ARE NOW ARGUING WITH YOUR OWN MATH!

grmorton said:
and you have had to back track numerous times, including once when you mis-read annual maximum temperature for being an annual temperature

Sorry if "honesty" offends you.

grmorton said:
You acknowledged that you hadn’t done time series analysis, but yet you act as if you know what you are doing. It is all bluster with you isn’t it?

I have now.

grmorton said:
You have made so many mistakes and errors, that I am beginning to get embarrassed to keep pounding you on them

THEN ADDRESS THE POINTS BY THE SAS INSTITUTE AND A PhD STATISTICIAN.

Pound on them for a while.

A few questions for you:

1. Do you or do you not think that long wavelengths (low frequencies) will show up as a low-frequency spike on a periodogram?

2. Do you or do you not think an extreeeeemely low frequency cyclic trend in the data can show up as a steady increase when sampled over a shorter time frame (i.e., much, much, much shorter than its wavelength)?

3. Do you or do you not believe that such a trend, if sampled at a small enough fraction of its wavelength, could appear to be a linear trend?

I suggest you answer these questions.
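Questions 2 and 3 can be checked numerically. The following is a minimal sketch, not anything posted in the thread: it assumes a hypothetical cycle with a period of 3000 samples observed through a 256-sample window, fits a straight line, and reports how well the line fits. The period and window length are illustrative choices only.

```python
import numpy as np

# Hypothetical numbers: a 3000-sample cycle seen through a 256-sample window.
period = 3000.0
n_samples = 256
t = np.arange(n_samples)
y = np.sin(2 * np.pi * t / period)

# Least-squares straight line through the short window.
slope, intercept = np.polyfit(t, y, 1)
fit = slope * t + intercept

# Coefficient of determination of the straight-line fit.
ss_res = np.sum((y - fit) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"slope = {slope:.6f} per sample, R^2 = {r_squared:.5f}")
```

Under these assumptions the window covers less than a tenth of the cycle, so the straight line fits almost perfectly. The window alone cannot tell the slow cycle apart from a linear trend.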

Then we'll get back to your "understanding" of the difference between confidence intervals and standard deviations.

grmorton · Nov 7, 2008

Before I go, I thought I would post the temperature difference between two Chinese stations, 132 minus 38. These are 75 miles apart in a relatively flat area of China and they should have similar climates. But they differ by 12 deg C in 2004-2005. That is about 20 deg F! And notice that it changed for two years and then the temperature difference reversed by 16 C by 2007. Clearly this didn't happen. I lived in China in those years about 300 miles north of there. That simply DIDN'T HAPPEN. So, that means that the magnitude of the error in the temperature record here in China is as much as 16 deg C and we think we can measure .065 deg C warming. What an exercise in utter self-delusion.

thaumaturgy · Nov 7, 2008

Chalnoth · Nov 7, 2008

thaumaturgy said:
Chalnoth, please re-read the various postings I've made from both the SAS Institute and a PhD statistician.

I will defer to experts on statistical time series analyses.

(You will note the same "peak at or near zero" shows up whether I have an offset or not in the data, as I showed in numerous previous postings).

The problem isn't near zero. You keep saying at zero. Yes, a trend results in strong low-frequency components. But the zero mode itself has nothing to do with the trend at all: the zero mode is just the average value of the data taken, since the zero-frequency mode is just a constant.

Also, it's worth noting that the amplitude of the frequency spectrum alone doesn't say whether or not there is a trend. The phase is also important, so it's best to look at the data in the time domain to see whether or not there is a trend.
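The point about the zero mode can be checked directly. This is a small sketch on made-up data (a hypothetical offset of 5, a slope of 0.01 per sample, and some noise): the zero-frequency bin of the FFT equals the number of samples times the mean, and subtracting the mean changes that bin and nothing else.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
x = np.arange(n)
# Synthetic series: constant offset + linear trend + noise (all values hypothetical).
y = 5.0 + 0.01 * x + rng.normal(scale=0.2, size=n)

spectrum = np.fft.rfft(y)
spectrum_zero_mean = np.fft.rfft(y - y.mean())

# The zero-frequency bin is just the sum of the samples, i.e. n times the mean.
print("bin 0:", spectrum[0].real, " n * mean:", n * y.mean())

# Removing the mean zeroes bin 0 and leaves every other bin untouched.
print("bin 0 after removing mean:", abs(spectrum_zero_mean[0]))
print("other bins unchanged:", np.allclose(spectrum[1:], spectrum_zero_mean[1:]))
```

So any information about a trend has to live in the bins near zero, not in the zero bin itself.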

thaumaturgy · Nov 7, 2008

Glenn is under no obligation to reply to this, but I'd like to hear Chalnoth's response:

I would dearly love some of our local experts to explain the following:

I ginned up some data in Excel to prove a point to myself. Then I ran a time-series analysis in JMP. If you can't explain this I will respect that, but I am going the extra mile to check my own thinking as well as yours.

DATASET:
A = arbitrary amplitude (100)
f = arbitrary frequency (0.85)
X = 1 to 256

Y = A*SIN(X*f) {NO-TREND CYCLIC DATA SET}
Here's the data and the spectral density plot:

Y = A*SIN(X*f)+(X+MEAN(X))-MEAN(A*SIN(X*f)+(X+MEAN(X))) {to ensure the y-mean = 0, i.e. no offset}

Y with a box function (generated by taking X-values 65-192 and adding in an arbitrary offset = 200). I then took this data and removed any offset such that the y-value mean = 0.

What this says:

A single pulse offset in the data will result in a spike on the spectral analysis at or near zero.
A monotonic secular trend in the data will also result in a spike on the spectral analysis at or near zero

If you have any comments on this, you now have the formulae I used and the graphics. The y-values (except for the Y=A*sin(X*f)) have a mean value of ZERO.

This means further:

You (Glenn and Chalnoth) are correct
I am correct

What have I done wrong in my reasoning here? If you like I can send you a tab delimited copy of the data so you can run your various analyses on it to prove that I am somehow in error.

You have not convinced me. Nor have you proven that SAS and a PhD statistician are somehow equally in error.

NOTE: I am NOT saying you are incorrect in attributing a spike at or near zero to some arbitrary data offset, however, I AM saying that a monotonic secular trend will equally show up as a spike at or near zero in the spectral analysis of the data.

If you wish to declare me wrong, prove it to me. Use this data set. Show me the error. Show me the offset. Show me why the spike at or near zero DISAPPEARS when the secular trend is removed.

That is all you have to do.
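For anyone who wants to rerun this without Excel or JMP, here is a rough NumPy sketch of the three data sets as described in the post (A = 100, f = 0.85, X = 1 to 256). It is an assumption that the box offset is added on top of the sine, and the "low-frequency power" summary (the first three non-zero FFT bins) is an illustrative stand-in for reading the spike off a spectral density plot, not what JMP computes.

```python
import numpy as np

A = 100.0                  # arbitrary amplitude, as in the post
f = 0.85                   # arbitrary frequency (radians per sample), as in the post
X = np.arange(1, 257)      # X = 1 to 256

# 1) No-trend cyclic data set.
cyclic = A * np.sin(X * f)

# 2) Same cycle plus a linear trend, shifted so the y-mean is zero.
trended = cyclic + (X - X.mean())
trended -= trended.mean()

# 3) Same cycle plus a box (offset of 200 for X = 65..192), y-mean removed.
boxed = cyclic.copy()
boxed[(X >= 65) & (X <= 192)] += 200.0
boxed -= boxed.mean()

def low_frequency_power(y, n_bins=3):
    """Sum of |FFT|^2 over the first few non-zero frequency bins."""
    spectrum = np.fft.rfft(y)
    return float(np.sum(np.abs(spectrum[1:n_bins + 1]) ** 2))

for name, series in [("cyclic", cyclic), ("trended", trended), ("boxed", boxed)]:
    print(f"{name:8s} mean = {series.mean():10.3e}  "
          f"low-frequency power = {low_frequency_power(series):.3e}")
```

Both zero-mean series, the trended one and the boxed one, carry far more power in the lowest non-zero bins than the plain sine does, which matches the "spike at or near zero" seen in both plots.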

Chalnoth · Nov 7, 2008

My beef was mainly just the particulars of the language you used. If you look closely at your zero-mean trend in frequency space, you will see that the zero mode has an amplitude of zero.

The main point, however, is that the amplitude of the frequency spectrum doesn't encapsulate all of the data, so while a trend will always look like what you're seeing there, just seeing the strong spike in the spectrum near zero does not necessarily indicate an overall trend. There is no question that there is a trend, but that comes primarily from looking at the time series data. A really easy way to see this is to filter the data by removing all of the high-frequency information.

To do this you could:
1. Take the data with the trend (be it the fake data, or the real temperature data).
2. Perform an FFT.
3. Multiply the Fourier transformed values by the function e^(-(f/F)^2), where F is the cutoff frequency (you could, of course, just set all values above the cutoff to zero, but the Gaussian filter tends to produce cleaner results). You should set the cutoff to between where the near-zero spike falls off and the high-frequency information starts to come in.
4. Once the filtering has been done, transform the data back to the time domain to see what it looks like.

What you should see after doing this is a very smooth increasing trend. All that you've done is get rid of the short-term variation, to highlight what the data is doing on long time scales. There may be some ringing near the beginning/end of the plot that appears because it's not a periodic stream. It's safe to ignore that.
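The four steps above can be sketched as follows. This is only an illustration on synthetic data: the slope of 0.01 per sample, the sine amplitude, the noise level, and the cutoff of 0.03 cycles per sample are all assumed values, and the 40-sample margin used to skip the ringing at the ends is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 256
x = np.arange(n)

# Step 1: data with a trend (synthetic: linear trend + fast cycle + noise).
trend = 0.01 * x
y = trend + 0.5 * np.sin(0.85 * x) + rng.normal(scale=0.05, size=n)

# Step 2: FFT of the real-valued series.
spectrum = np.fft.rfft(y)
freqs = np.fft.rfftfreq(n, d=1.0)   # frequencies in cycles per sample

# Step 3: multiply by the Gaussian low-pass filter e^(-(f/F)^2).
cutoff = 0.03                        # assumed cutoff frequency F
filtered = spectrum * np.exp(-(freqs / cutoff) ** 2)

# Step 4: back to the time domain.
smooth = np.fft.irfft(filtered, n=n)

# Compare with the known trend away from the ends, where ringing appears.
edge = 40
interior_error = np.max(np.abs(smooth[edge:-edge] - trend[edge:-edge]))
print(f"largest interior deviation from the true trend: {interior_error:.4f}")
```

Away from the ends, the filtered series follows the underlying straight line closely. The first and last few dozen points are distorted because the FFT treats the series as periodic.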

thaumaturgy · Nov 7, 2008

Chalnoth said:
so while a trend will always look like what you're seeing there, just seeing the strong spike in the spectrum near zero does not necessarily indicate an overall trend.

This is the "all dogs are animals, not all animals are dogs" part of the debate:

A spike at or near zero can indicate a secular trend (not just according to me, but according to SAS and a PhD statistician). Indeed I have proven this in my data set by having and removing a secular trend in the data and showing the Fourier transform spectral analyses of those.

One has the spike, one doesn't. The only thing that changes is the inclusion of a monotonically increasing secular trend.

There is no question that there is a trend, but that comes primarily from looking at the time series data.

Agreed, but when looking at exceptionally noisy data in order to parse out the possibility of a secular trend in the data we have to rely on looking at the spectral analysis of the time-series data.

SAS and statisticians tell me, explicitly, that such a trend can show up as a spike at or near zero.

This is not under debate. Unless one is to debate with statisticians. That is not my beef.

A really easy way to see this is to filter the data by removing all of the high-frequency information.

That can be quite easily done without even running a filter on it. I can tell you the equation for that line is exactly:

Y=(X-MEAN(X))-MEAN{X-MEAN(X)}

That is a perfect line with mean Y value = 0. The spectral analysis of that plot comes out with only the spike at or near zero (as one would obviously expect).

What I am arguing here (and Glenn seems to be missing) is that a secular trend can and does show up as a spike at or near zero. This is proven not only by the data which I have presented but also by the comments by both SAS and a PhD statistician (6 Sigma Master Blackbelt).

I honestly don't see what the debate here is all about. Glenn seems to be of the impression, initially, that the spike at or near zero cannot be attributable to a secular trend. It surely can.

NOW, to be fair to Glenn it could also be some gigantic long period wave, but neither of us can say one way or the other. We are all three of us technically correct.

There is a visible spike at or near zero and it can mean a secular trend. Not that it necessarily does so.

As you state, the analysis of the time-domain data shows that. A brute-force approach of a simple linear least-squares regression shows that the data has a statistically significant non-zero trend at 99.99% confidence.

Chalnoth · Nov 7, 2008

thaumaturgy said:
This is the "all dogs are animals, not all animals are dogs" part of the debate:

A spike at or near zero can indicate a secular trend (not just according to me, but according to SAS and a PhD statistician). Indeed I have proven this in my data set by having and removing a secular trend in the data and showing the Fourier transform spectral analyses of those.

One has the spike, one doesn't. The only thing that changes is the inclusion of a monotonically increasing secular trend.

Well, no. You can take as an example a data series that is the sum of two linear trends, one increasing and one decreasing with equal slope. Overall there would be no trend, but the power spectrum would look nearly identical to the monotonically increasing trend.

Basically what the spike near zero frequency means is that the data is drifting on long time scales. This drift won't necessarily be all in the same direction, however.
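A quick way to see this, reading the example as a series that rises for the first half and falls back with equal slope in the second half (that reading is an assumption): compare a pure ramp with a triangle of the same length. The five-bin window used to summarize "power near zero" is an arbitrary illustrative choice.

```python
import numpy as np

n = 256
x = np.arange(n)

# Monotonic ramp, mean removed.
ramp = x - x.mean()

# Triangle: rises for the first half, falls with equal slope in the second half.
triangle = np.minimum(x, n - 1 - x).astype(float)
triangle -= triangle.mean()

def low_freq_fraction(y, n_bins=5):
    """Fraction of the non-zero-frequency power in the lowest few bins."""
    power = np.abs(np.fft.rfft(y)) ** 2
    power[0] = 0.0                     # ignore the zero mode (the mean)
    return power[1:n_bins + 1].sum() / power.sum()

# Overall least-squares slope of each series.
ramp_slope = np.polyfit(x, ramp, 1)[0]
triangle_slope = np.polyfit(x, triangle, 1)[0]

print(f"ramp:     slope = {ramp_slope:.3f}, low-frequency fraction = {low_freq_fraction(ramp):.3f}")
print(f"triangle: slope = {triangle_slope:.3f}, low-frequency fraction = {low_freq_fraction(triangle):.3f}")
```

Both series put most of their power into the lowest few bins, yet only the ramp has a non-zero overall slope. The spike says the data drifts on long time scales. It takes the time domain to say in which direction.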

thaumaturgy said:
Agreed, but when looking at exceptionally noisy data in order to parse out the possibility of a secular trend in the data we have to rely on looking at the spectral analysis of the time-series data.

This is where the filtering that I mentioned comes in. By filtering out the high-frequency noisy variation, the overall trend becomes more visible. It's essentially a somewhat more sophisticated way of binning the data. So, say, if you cut out the frequency scales with periods shorter than 3 years, the result is very similar to, at every point in the graph, taking the 3-year band surrounding that point, averaging it, and plotting the result.

thaumaturgy said:
SAS and statisticians tell me, explicitly, that such a trend can show up as a spike at or near zero.

Well, more specifically, a trend will show up as a spike near zero (but not at zero: that's the average). The question is whether or not the spike indicates a trend or just long-wavelength variation.

thaumaturgy said:
That can be quite easily done without even running a filter on it. I can tell you the equation for that line is exactly:

Y=(X-MEAN(X))-MEAN{X-MEAN(X)}

That is a perfect line with mean Y value = 0. The spectral analysis of that plot comes out with only the spike at or near zero (as one would obviously expect).

Well, not quite. There will be some ringing near the ends of the graph (a linear trend actually has significant components out all the way to the Nyquist frequency). But you're right, the result would be simple. It would be more interesting to perform on the real temperature data.

thaumaturgy said:
As you state, the analysis of the time-domain data shows that. A brute-force approach of a simple linear least-squares regression shows that the data has a statistically significant non-zero trend at 99.99% confidence.

Right, and what I'm saying is that if you're trying to demonstrate a secular trend, this is the sort of thing you do. The spectral analysis is interesting for picking out periodic variations, but not the trend. The trend shows up, but that feature that shows up can indicate things other than a trend as well.

thaumaturgy · Nov 7, 2008

Back to the Data

Using the data Glenn supplied earlier, here's a summation of the data so far:

ORIGINAL DATA:
Taking the original data I developed a "timestamp" that is of the following form:

Year+(month#/12)
This generated a "timestamp" in which Jan 1979 is represented as 1979.08333, Feb 1979 is 1979.1667, June 1979 is 1979.5, etc., so as to evenly space the monthly data out for treatment by the statistics program.

The first thing a statistician will do is check to see if there's an overall linear trend:

There is. It is not a good fit (adjusted R^2 around 27%), but the overall trend in the data is statistically significantly non-zero. (The little red area is the 95% confidence interval of the fit.)

No way to argue with that. Unless one wishes to deconstruct all known statistics from the last 2 centuries.

BUT, there are clearly some other things going on here. As Glenn noted, there is some cyclic appearance to the data.

In order to parse that out it becomes necessary to treat the data using time series analysis.

There are clearly at least one or two larger cycles in the data.

What happens if we take the original GLOBAL data and SUBTRACT an amount out that is equal to the "linear fit" from the earlier portion of the study?

This is what the data looks like:

NOW it looks a LOT like the ORIGINAL data (within rounding errors and approximations) except it no longer has a linear term.

It is much easier to see the longer period stuff going on closer in to zero as we have effectively eliminated the secular trend.

If you were to run the statistics of a linear least-squares regression on this data you will find it has a p-value of 0.9981 indicating that it is effectively a "flat line" (ie no trend).

ALL that was done is that I took the ORIGINAL GLOBAL DATA and subtracted the following factor:

(0.0128133*GLOBAL-25.45458)

which was calculated from the Linear Least Squares fit from the earlier data set.

Hence eliminating the linear trend as well as I can.
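The detrending step can be reproduced along these lines. The real satellite data is not included in the thread, so this sketch uses a synthetic stand-in (an assumed slope of 0.013 per year, one made-up cycle, and noise), and it assumes the fitted line is a function of the timestamp. None of the numbers it prints are results from the actual data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Monthly timestamps of the form Year + month/12, for 1979 through 2007.
years = np.repeat(np.arange(1979, 2008), 12)
months = np.tile(np.arange(1, 13), 2008 - 1979)
timestamp = years + months / 12.0

# Synthetic stand-in for the anomaly series (all parameters hypothetical).
anomaly = (0.013 * (timestamp - 1979.0)
           + 0.1 * np.sin(2 * np.pi * timestamp / 3.7)
           + rng.normal(scale=0.2, size=timestamp.size))

# Linear least-squares fit: anomaly ~ slope * timestamp + intercept.
slope, intercept = np.polyfit(timestamp, anomaly, 1)

# Subtract the fitted line to remove the linear trend.
detrended = anomaly - (slope * timestamp + intercept)

# Refit: the slope of the detrended series should be numerically zero.
resid_slope, resid_intercept = np.polyfit(timestamp, detrended, 1)

print(f"fitted slope    = {slope:.5f} per year")
print(f"slope after fit = {resid_slope:.2e} per year")
```

Subtracting the fitted line removes only the straight-line component. The cycle and the noise are still in the detrended series, so a spectral analysis of it will still show the shorter-period peaks.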

The point of this exercise is that there is obviously an additional trend in the data (in this case the linear trend) that can be filtered out, leaving a spectral analysis that still shows the shorter-wavelength cycles.

Conclusion:

While it is impossible to say if the "secular trend" modeled as a linear trend is or isn't some hugely long period cycle, it is possible to say that within the window of this data, there is, along with shorter period cyclicities, an overall increase in the temperature.

QED

shernren · Nov 7, 2008

Niggling question. What're the error bars on those little dots?

gheorgie · Nov 7, 2008

It is extremely difficult to find an intelligent discussion of the facts concerning global warming. Usually the 'discussions' consist of one side or the other throwing postures and spin, with a final altar call of "therefore thou shalt believe this."

I'm definitely in the undecided camp in this particular election. For instance, I found the Al Gore show (along with many other supposed 'proofs' for GW) to be long on scare and short on facts. Which is actually disturbing, considering he and others are asking (nay, demanding) us to change our lives to conform to their fears. Such a potentially important topic needs more open discussion of the observable facts and less hype.

Likewise I find the "cons" to be typically dismissive and parochial. For instance, I've grown wary of oil industry wanks running another game on us. I'm not referring specifically to Mr. Morton, but I'm sure that he's seen it.

With that in mind, I'd like to applaud both Morton and Thaumaturgy for about the best discussion I've ever seen on the facts involved. (or at least some of them) I have a faint idea of what this match has cost both of them in terms of time and especially, emotional energy. I know how long it took me just to wade thru the arcana of these 20 pages. But I'd like to think that it's for my benefit, as just another American looking to make informed decisions.

So by all means guys, do keep sparring. Try to limit the ad hominem, and keep making the main thing, the main thing: IS GLOBAL WARMING ACTUALLY HAPPENING? And its corollary: IS IT ANTHROPOGENIC? Inquiring people do want to know why you believe as you do. A lot might depend on it.

Chalnoth · Nov 7, 2008

I recommend checking out this website for up-to-date information on the science of global warming:
http://www.realclimate.org/index.php/archives/2007/05/start-here/

Besides just listing out the information, they also list a series of websites that go through the various arguments against anthropogenic global warming. I particularly liked this one:
http://www.skepticalscience.com/argument.php

thaumaturgy · Nov 8, 2008

shernren said:
Niggling question. What're the error bars on those little dots?

The data set provided has no error associated with each data point, presumably because it is simply a single measurement (month and year). But I cannot say for sure.

Certainly each individual data point has the potential for error and it would be good to analyze that.

Hence the need for gauge studies on the various temperature measurement systems.

That's why it isn't a bad thing for groups like surfacestations.org and others to focus on making sure to find bad gauges, but, again, the key is that the science of global warming is not underlain solely by a handful of surface temperature sites.

Glenn is focusing on anecdotal data around a group of particularly bad sites, which tells little about the overall validity of the data which is supported from a number of independent sources.

This is why the statistics is important as is the overall appreciation of the data from the various sources.
