Is Jerry Taylor Doing What I Think He’s Doing?
In researching another point, I came across an interesting argument from a 2015 blog post by Niskanen Center’s Jerry Taylor. (It summarized his contribution in a debate on carbon taxes.) It seems that JT is making a pretty bad inference from a chart, but since I’m not predisposed to agree with him, I’m seeking feedback from you folks.
Here’s Taylor:
We also hear quite a bit from the Right about how the computer models have wildly over-predicted warming and thus should not be informing our policy going forward. Again, courtesy of Berkeley Earth, let’s see how the computer models used in the fourth IPCC report (released in 2007) perform when run against Berkeley Earth’s historical temperature record.
The multi-colored lines represent runs from the climate models featured in the fourth IPCC report. The heavy black line represents the Berkeley Earth land temperature record. The heavy red line represents the average of the various model runs. It would appear that the climate models used by the IPCC are now pretty good at replicating temperatures and are not, on balance, running hot.
So my question: If the models were published in 2007, I’m assuming that means they were calibrated up to 2007 (or very recent) observations, right? If so, then the goodness of fit before 2007 isn’t really relevant. What matters is how the models performed out of sample, i.e. from 2007 forward.
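To spell the distinction out: here is a minimal sketch of in-sample versus out-of-sample error, with invented numbers (nothing below comes from any real climate dataset; the split year, trend, and noise level are made up purely for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data: a gentle warming trend plus noise.
years = np.arange(1960, 2017)
temps = 0.01 * (years - 1960) + rng.normal(0, 0.1, years.size)

# Calibrate on "pre-2007" data only, as a published-in-2007 model would be.
train = years < 2007
coeffs = np.polyfit(years[train], temps[train], deg=2)
pred = np.polyval(coeffs, years)

# In-sample error is flattered by the fitting itself; only the
# out-of-sample error tests the model against data it never saw.
in_sample_err = np.mean((pred[train] - temps[train]) ** 2)
out_sample_err = np.mean((pred[~train] - temps[~train]) ** 2)
print(in_sample_err, out_sample_err)
```

The point is not which number happens to be bigger in this toy run; it is that only the second number says anything about predictive skill.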
And as Taylor’s own chart shows, the models predicted much more warming after 2007 than actually occurred.
So doesn’t this chart prove the exact opposite of JT’s point?
Yes.
Yes. This is fundamental. The curve is fitted to the given data, so its alignment with that data is not evidence that you have the ‘right’ curve. That must be tested on new data. (There ARE techniques for using the old data both to find and to test the curve, and this can help cull the set of candidate curves that fit the old data, as a guard against overfitting, but it is never a substitute for testing predictions.)
An extreme example: for any finite set of data I can find a polynomial whose zeroes are that data. And I can do that for polynomials in x and y whose zeroes are the x, y pairs of the climate measurements. So I can construct a perfect retrodicter, using nice continuous, differentiable, and integrable polynomials. Any bets that my perfectly retrofitted function has any predictive value at all?
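The polynomial point is easy to demonstrate; here is a sketch with made-up numbers, fitting an interpolating polynomial through roughly linear "observations":

```python
import numpy as np

rng = np.random.default_rng(1)

# Eight made-up, roughly linear "measurements".
x = np.arange(8.0)
y = 0.5 * x + rng.normal(0, 0.2, 8)

# A degree-7 polynomial through 8 points: a perfect retrodicter.
coeffs = np.polyfit(x, y, deg=len(x) - 1)
fitted = np.polyval(coeffs, x)
assert np.allclose(fitted, y, atol=1e-5)   # hits every past point

# One step beyond the data, the same polynomial typically swings
# wildly, far from the underlying ~0.5-per-step trend.
print(np.polyval(coeffs, 9.0))
```

Perfect fit to the past, by construction; zero guarantee about the future.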
It would appear that you are correct in pointing out that he is using old, irrelevant data to advance his narrative, seeing as the data he ought to be using won’t accomplish that end.
Taylor and the rest of the folks at Niskanen are just a bunch of hacks. They couldn’t get their left-wing agenda through at Cato (which is saying something), so Taylor started his own hack platform masquerading as a policy think tank. It’s really pathetic that they sully Niskanen’s fantastic reputation by naming their hack center after him.
I believe that the operative words are “wildly over-predicted”. Are the climate models “wildly over-predicting” temperature? The model seems to be off by about 0.1, or 10%; is that a lot? It doesn’t seem to be much more than in some other years, e.g. the 1920s. In other words: the model doesn’t over-predict out of sample any more than it under-predicted in sample.
Let’s say that this is the right model, then is this discrepancy so large that we’d want to reject this model? Or alternatively, does this model perform worse than a simple model such as a moving average?
Of course, because of the discrepancy between the models and the surface average temperature, the difference between modeled and observed ocean-only readings is large, and post-1998 it only matches the models during the recent El Nino (too bad this reply section will not accept an illustration).
There’s the further problem that Taylor tries to sweep away: The satellite/radiosonde comparisons with the IPCC model average show a huge error in the vertical in the tropics. Given that the vertical stratification is what determines tropical precipitation, that means the modeled rainfall is systematically wrong. Given that the presence of surface water dramatically alters the partitioning of incoming radiation (less sensible heating of wet surfaces), that means the daily thermal regime is also mis-specified, which will further screw up the rainfall etc…
At any rate, as shown by Hourdin et al. in the latest Bulletin of the American Meteorological Society, the models are tuned to match the 20th-century surface history, often with physically unrealistic adjustments, possibly a cause of the huge vertical error.
Massive misdirection citing LAND temperatures, there, followed by smoothing the absolute hell out of the data.
Good catch on this, Doc. I think you’re right that this is Taylor essentially saying,
“The models are bad at predicting the future, are they? Well, look how great they are at predicting the past!” [Shows graph of, in most cases, the models trending alongside a dataset they were literally and specifically programmed and calibrated to replicate closely.] “Victory!”
The picture looks odd in that most of the spaghetti lines seem to stop at about 2002. However, the fact is that the current surface data is sitting pretty close to the projections. Given that 2016 was another warmest year on record, the measured temperature continues its upward trajectory from the 2015 graph. This is true if you use global, not just land, data. See here:
http://www.realclimate.org/index.php/climate-model-projections-compared-to-observations/
Satellite temperatures are still lower than surface. However, the projections are all for surface temperatures. The temperatures do not only match during the recent El Nino, as you can clearly see, but continue to match past that event. Conditions have generally been towards La Nina recently.
It seems a bit pointless to spend too much time arguing over a graph from 2015 when we have more recent data available.
Harold, I’m not being sarcastic when I say, I find it fascinating that we can all look at the same graph and reach opposite conclusions.
Bob, let me be clearer. It is possible that Taylor is presenting the data in such a way as to put what he sees as a good gloss on it. I don’t know exactly. My point is that it doesn’t matter too much now whether he was or was not, because the data we have today shows a good match to the projections. So I did not so much reach an opposite conclusion about that particular graph as focus on the central message, and the story he is telling seems to be accurate, whether or not that graph is a good illustration.
I don’t know enough about that graph to make detailed comment. I did comment on the apparent truncation of lots of the squiggly lines. There seem to be about 20 up to 2002, then lots of them simply stop and only about 5 continue. What is going on? How were the lines calculated, and why do so many simply stop? I don’t know.
I could possibly do quite a bit of digging to find out more about the origin of the graph, whether land gives a different picture than surface, etc. However, that seems to be unnecessary since we have more recent data available, presented more clearly and explicitly attributed to 2004 and 2011. So as to whether Taylor is pulling a fast one – I say possibly, but unproven. On the more important (to me) point of whether the measured temperature is in line with the models, we can conclude that it is, whatever Taylor was or was not doing in 2015.
Further clarification: when I said “However, the fact is that the current surface data is sitting pretty close to the projections,” I was not referring to the graph illustrated.
It’s pretty hard to visualize ordinary least squares regression lines, so charts of levels rather than trends are a pretty good way to obfuscate the issue, which the environmental political advocacy group that funds “realclimate” is pretty good at doing. In particular, using charts of levels lets the chart-maker take advantage of the models’ unrealistically large year-to-year variability to make the model range look wider than it really is.
Beware chartsmanship trickery.
Andrew, I have no idea what your comment means. Charts of levels rather than trends? What else is a chart but a depiction of levels of something? Sounds like obfuscation to me. Use this as a good check for confirmation bias. Just look at the numbers. What is the middle of the projection range for now, what is the current anomaly. They are very similar.
If you look at this data and try to avoid the conclusion that the projections are currently accurate, you have a big problem.
There may be objections to concluding that the projections have any skill. Maybe it is pure chance that they happen to coincide at the moment. However to use deceptive charts as a way of avoiding the conclusion is very poor reasoning. You have to actually come up with an argument and not simply claim that the presenters of the data are biased.
These people are professional climate scientists. Bob has quite rightly pulled me up in the past when I have suggested that he as a professional economist has made an elementary mistake. So show some respect for other professionals when they comment about their field.
If Gavin Schmidt wants to weigh in on the tax interaction effect, I would have my suspicions as to his expertise, but we should respect their expertise in their own subjects. That does not mean we must agree, but we should not simply dismiss everything they say as biased either.
“I have no idea what your comment means.”
You’re not off to a great start.
“Charts of levels rather than trends? What else is a chart but a depiction of levels of something? Sounds like obfuscation to me.”
A chart of levels would be, you know, a chart of levels, i.e. you just plot up the anomalies and compare. A chart of trends would either involve actually plotting up the trend lines or, more comprehensively, doing a histogram of the slopes sampled from the models and comparing the real-world slope to that.
“Just look at the numbers. What is the middle of the projection range for now, what is the current anomaly. They are very similar. ”
That one data point happens by chance to land near the middle of the model distribution of anomalies does not mean that the trend is near the middle of the model distribution of trends.
People who don’t understand the concept of weather noise aren’t going to get why that matters so of course you don’t.
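The single-point-versus-trend distinction can be illustrated with deliberately simple, invented numbers: a “model mean” warming at 0.02/yr, “observations” warming at only 0.01/yr, and one warm spike (think: an El Nino year) tacked onto the final year.

```python
import numpy as np

years = np.arange(2000, 2017)
model_mean = 0.02 * (years - 2000)   # model-mean anomaly
obs = 0.01 * (years - 2000)          # slower observed warming
obs[-1] += 0.16                      # one-off spike in the final year

# The latest single observation sits exactly on the model mean...
assert np.isclose(obs[-1], model_mean[-1])

# ...yet the fitted observed trend is still well below 0.02/yr.
obs_slope = np.polyfit(years, obs, 1)[0]
print(obs_slope)                     # about 0.013, well under 0.02
```

One warm year can put the latest point on the model mean without the trend being anywhere near the model trend.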
“If you look at this data and try to avoid the conclusion that the projections are currently accurate, you have a big problem.”
If I have a better understanding of what the chart makers are trying to hide than you do, *I* am the problem, got it.
“Maybe it is pure chance that they happen to coincide at the moment.”
Um yeah, it is, because that’s how weather noise works.
“However to use deceptive charts as a way of avoiding the conclusion is very poor reasoning.”
I agree, which is why it’s odd that you’re doing exactly that.
“You have to actually come up with an argument and not simply claim that the presenters of the data are biased.”
The argument is that actually doing a regression line is examining more information and is more robust than being impressed by one year of weather noise.
“These people are professional climate scientists.”
They are also left wing political activists with a history of distorting their own research. You have no personal experience with this, as far as I know.
“If Gavin Schmidt wants to weigh in on the tax interaction effect, I would have my suspicions as to his expertise, but we should respect their expertise in their own subjects.”
I’ve forgotten more climate research than you have ever known in your life Harold, and I don’t have to respect anything Schmidt says when I know enough to know he’s lying.
How do you know the year to year variability is unrealistically large? It’s only about half a degree on a global basis, and if your HVAC this winter was half a degree warmer or cooler than it was last winter you wouldn’t even notice.
With a normal mercury and glass thermometer you need to work pretty hard to even get a measurement accurate to half a degree… and these are coming from all over the world, then averaged together, give or take a bunch of adjustments, weighting factors, and other jiggery pokery. The surface thermometers don’t even have good global coverage, and they are not evenly spaced. Most of the stations have moved at some time or another, plus the measurement technology changed.
I think you would be lucky to have an answer within the realm of three or four degrees accuracy there. Mind you, at least presuming they do roughly the same process every year, it might not vary too much, but a few big storms one year, a polar vortex another year… it takes a lot to average all that out. I’m suspicious of such little variation in the measured value, to be honest.
Tel, If you are suspicious you should go to the sources to see how it is done. You should not make assumptions that because you can’t do it using the thermometer in your garden there is no scientific basis for the figures.
As an example, if you had a measure accurate to the nearest foot and you measured a highly selected group of 10 people with heights of 5′, 5.1′, 5.2′…5.9′, then took the average, you would be accurate to about 0.5″.
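That arithmetic can be checked directly; a quick sketch of the same ten heights, assuming ordinary round-to-nearest readings:

```python
import numpy as np

# Ten heights, read with an instrument only good to the nearest foot.
heights = np.array([5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9])
rounded = np.round(heights)        # each reading off by up to 6 inches

true_mean = heights.mean()         # 5.45 ft
rounded_mean = rounded.mean()      # 5.5 ft

# The individual rounding errors largely cancel in the average.
error_inches = abs(rounded_mean - true_mean) * 12
print(error_inches)                # about 0.6 inches
```

The cancellation relies on the rounding errors not all leaning the same way; a systematic bias would not average out, which is part of why anomalies are used.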
There are reasons why they use anomalies also. You are then looking at the precision rather than the accuracy. It does not matter if the thermometer is not accurate as long as it accurately reflects changes.
You are entitled to abandon science and go with your own intuition, but you should not expect the rest of us to follow.
Using anomalies rather than absolute temperatures helps you get changes. You know what helps you get changes even better? Trend lines. Gosh I wonder why RC didn’t plot those on the chart.
Trend lines have their place but are not a panacea. What trend line do you think they should fit? Linear? Exponential? Over the whole range? Maybe plot a linear trend for the last 3 years? Wow – the data shows a spectacular trend much higher than the models.
They do indeed publish rates of change, histograms and trend lines, but not in those particular graphs. It would not be appropriate to do so.
The data points are plotted so you can compare them to the calculated model output. You can quickly see how well the data matches the model output. These charts do that effectively, but of course they are not a full statistical analysis. Yes, you could do histograms, and they do histograms, but they are not as accessible to most people.
You refer to fitting a regression line as more robust than one year of data, but we are not talking about one year of data, are we? There is not just one point. Very nearly all the data points are within the envelope. It was at the low end for a bit, now it is above the mean and has been close to the mean for a few years. You suggest that all those recent points are there by chance, but all the low points were representative? Bad argument.
If you have forgotten so much climate science you should go back and remind yourself of it. Your defence is that you know everyone else is lying, which is a bad argument also.
AGW “skeptics” have been publishing this sort of graph for years and apparently were very satisfied with them when the data was below the models. Now the data shows what they don’t like and suddenly these are the wrong type of graphs to show. Well, good try but it will not work.
Holy crap Harold this comment is beneath even you.
“Trend lines have their place but are not a panacea. What trend line do you think they should fit? Linear? Exponential? Over the whole range? Maybe plot a linear trend for the last 3 years? Wow – the data shows a spectacular trend much higher than the models.”
This is all misdirection. Obviously you do a linear trend over the longest period possible over which a straight line can be expected to approximate the expected qualitative path of temperatures. This is how you avoid being misled by a single year of weather noise.
“They do indeed publish rates of change, histograms and trend lines, but not in those particular graphs. It would not be appropriate to do so.”
Oh, it wouldn’t be appropriate you say! Wouldn’t want anyone getting the wrong idea.
“The data points are plotted so you can compare them to the calculated model output. You can quickly see how well the data matches the model output. These charts do that effectively, but of course they are not a full statistical analysis. Yes, you could do histograms, and they do histograms, but they are not as accessible to most people.”
You’re more concerned about getting a message out to people efficiently than conveying as much information as accurately as possible, because what you actually care about is that it isn’t safe to show a chart to “most people” from which they might draw a conclusion you don’t approve of.
“You refer to fitting a regression line as more robust than one year of data, but we are not talking about one year of data, are we? There is not just one point. Very nearly all the data points are within the envelope. It was at the low end for a bit, now it is above the mean and has been close to the mean for a few years. You suggest that all those recent points are there by chance, but all the low points were representative? Bad argument.”
Actually I don’t argue that. I argue that you need to do a trend line because individual data points high or low are not representative.
If the highest points are just near or slightly above the mean, and the lower or middling points are below the mean, the trend is probably, you know, under the mean. This is how weather noise works.
“If you have forgotten so much climate science you should go back and remind yourself of it. Your defence is that you know everyone else is lying, which is a bad argument also.”
I know that people with a history of lying are lying, and also I know what the idiom “I’ve forgotten more than you’ll ever know about [x]” means.
“AGW “skeptics” have been publishing this sort of graph for years and apparently were very satisfied with them when the data was below the models. Now the data shows what they don’t like and suddenly these are the wrong type of graphs to show. Well, good try but it will not work.”
Actually, it is irrelevant what “skeptics” have “been publishing for years”, but my preferred kind of chart has been done by “skeptics” like Michaels since at least 2013.
https://arxiv.org/abs/1309.5164
Andrew, the models are clearly not a straight line. They are curves that have been generated by a model. Why would you then stick a straight line through that data?
I cannot see the charts you prefer. And I have repeatedly said that I am not against the sort of analysis you like, just that you should not include it in every chart, because it results in distraction. I have seen a lot more lying using trend lines than by missing them out. Showing the data without additional analysis is not lying, and only a liar would suggest it to be. The lying comes in when the additional analysis is misleading.
I will come back to you when I have had a chance to read the paper.
OK, read it. There are no charts of temperature in the paper. The only charts are probabilities of trends of particular lengths. If this is your preferred chart, it is totally useless for doing anything other than analysing the probabilities of trend lengths. You will really have to provide a much better example if you want to communicate better.
The model weather noise is of greater magnitude than the actual year-to-year variability of the temperature data; that’s how I know it is unrealistically large. The rest of your comment is confusing unrelated issues.
You’re kidding right? There’s a sharp spike upward from the 1960s onward. The year over year variability is huge.
Toby-I’m not kidding and you don’t understand what I’m saying.
How about you compute it and we see who is right?
I don’t see the point seeing as I am talking about year to year variability and you are talking about the 56 year trend.
what happens to var(y-o-y) with a change in trend?
The mean shifts, and the variance does not.
So the variance of the y-o-y is the same from start to 1960 as it is from the start to 2007?
How weird. I mean you could interpret his statement to mean there’s a good fit to data but he doesn’t seem to appreciate that that’s not a tall order for anyone competent.
What’s strange is that at the end, where we might be getting into some projections (?), some of the models just stop. Are we looking at any projections here or is it all just calibration?
Good point. I hadn’t noticed that most of the models stopped before the predictions. Hard to see that on my phone.
You’re absolutely correct. Adding any old term to a model (e.g. a regression model) will improve the goodness of fit on the historical data. In this case adding something like the Dow Jones industrial average along with the median house price will improve the fit. It’s completely meaningless.
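Steve’s point is a standard property of least squares, and it can be sketched with synthetic data (the “junk” column below is pure noise, standing in for the Dow Jones example; all numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)   # y truly depends only on x
junk = rng.normal(size=n)          # an irrelevant extra regressor

def r_squared(cols, y):
    """In-sample R^2 of an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

r2_small = r_squared([x], y)
r2_big = r_squared([x, junk], y)

# Adding a column can never worsen the in-sample fit.
assert r2_big >= r2_small - 1e-12
print(r2_small, r2_big)
```

The in-sample fit always “improves”; whether the extra term means anything only shows up out of sample.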
I rummaged around and found this reference:
http://www.ipcc.ch/pdf/assessment-report/ar4/wg1/ar4-wg1-chapter8.pdf
Which is the chapter on modeling from the fourth IPCC report (presumably the relevant reference, although Jerry Taylor is a little vague on the details). It was indeed published in 2007, so we are in the right ballpark. Then scroll to page 600 (which is p12 inside the PDF, since numbering starts at the beginning of the whole report, but the link is merely chapter 8). You will see “FAQ 8.1, Figure 1.”, and there are model runs with a thick red line representing the average of many model runs (similar basic layout to Taylor’s graph).
Interestingly, in the IPCC report, all the model runs stop right at 2007, the same year the report was published, so this is 100% past data, no prediction at all. Also, the red line is quite a different shape from the red line in the graph we see linked from Jerry Taylor. For starters, the IPCC has included volcanic effects (dust clouds, etc.), and also most of the lines from Jerry Taylor finish well BEFORE 2007 but a few lines go a bit after, so they are not the same modeling runs as the IPCC graph shows. Taylor’s red line finishes up at an anomaly of approx 1.10 while in the IPCC report it ends up at approx 0.75, and from an eyeball comparison it doesn’t even look like the zero is in the same place.
So in a nutshell, I have no idea what Taylor is doing but unlikely to be what he thinks he is doing.
And hey, it’s entirely possible that somewhere in the same IPCC report there’s a different graph with completely different modeling runs giving different answers… and maybe there’s even a good reason for why nothing matches up. You know what? I’m not going to search for it too hard.
People seem to be thinking about this wrong. Daniel suggests it is not a tall order to construct a climate model that fits past data. Steve says that adding in any old term will make them fit. These are wrong. These models are not curve fitting, but based on physical laws – built from the bottom up, so to speak.
Say I had a model of the economy. It is a very simple one and only outputs unemployment. My model says that for every extra 10% on the minimum wage unemployment falls by 10% and that is the only factor. I test this model against some data from the past 10 years. I use data on real unemployment and real minimum wage to set my start point, then run the model to produce my output. Let us say that over this period minimum wage has steadily gone up while unemployment has also steadily gone up.
My chart has two lines – the model output sloping down and the real unemployment data sloping up. The measured result is the opposite of my model output. Oh dear, I have a bad model. This model will never fit past data.
However, if I have a different model and the output from this model is close to the actual unemployment we start to think it might be a good model. This new model contains many more inputs than just minimum wage.
We might test this new model by projecting future unemployment. We don’t know what the future minimum wage rate will be, nor the value of the other inputs, so we cannot make firm predictions. We can make projections based on likely values of these inputs. So we assume, for example different minimum wage rates and run our model. This produces a variety of lines going into the future dependent on what the input values actually turn out to be. In climate these are often called scenarios. It turns out that the models are actually good at matching the past and predicting the future.
The models are definitely NOT just curve fitting. They are derived from physical laws. A bad model will never fit past data. Tel observes that the IPCC chart shows volcanic eruptions. These are a good test of the model. Can it cope with a sudden shock to the system? However we would not test a model by noting that it failed to predict an eruption.
The IPCC chart says anomalies are relative to the 1901 to 1950 mean. Taylor’s chart does not say. If using an anomaly you should always state what the reference is, to allow comparisons. Without knowing this we cannot say anything about the absolute magnitude of the anomaly. Hansen 1984 uses an anomaly wrt the 1950-1981 average. CMIP3 and CMIP5 use 1980-1999. That is why the Hansen plot shows a GISTEMP anomaly of about 1C and the CMIP plots show an anomaly of about 0.7C for the same temperature series.
The fitting consists of the choice of input values, which, believe it or not Harold, are not actually known over the 20th century, because it’s very hard to figure out how effective all that old air pollution was at cooling the planet.
Yes, like in my example the actual minimum wage was inputted. Even so a bad model will not fit the data. It is indeed very hard to figure out how effective air pollution was at cooling the planet, which was pretty much my point.
Actually Harold, a bad model can fit the data better. It’s called “overfitting”. It’s a common problem in model development.
It’s also fishy that most models truncate in 2000 when they stop in 2007 in the IPCC report.
Ken P, yes, you are right, bad models can fit the data. Models could be made to fit past data almost exactly, by curve fitting. However, these will rapidly deviate from future data, unless the curve fitting has stumbled upon some actual physical process.
For example, curve fitting to a genuine cyclic event such as the sun rising could be a pretty good predictor of future sun rises.
This is unlikely in very complicated and chaotic systems like the climate, would you agree?
I also commented on the truncated models. It made me wonder about the origin of that particular graph, but rather than dig into that as a detail I looked up what the recent data said.
And they are indeed deviating from current data. That is Bob’s point. Curve fitting does not require the addition of extraneous variables. It can be as simple as changing the coefficient on existing physical variables.
Actually no, if you have 20+ models that differ a great deal from one another some will be better than others, and the least good ones will be bad. But they all have been fit to the data.
You don’t seem to understand that “how effective air pollution was at cooling the planet,” is, in your example, equivalent to not knowing what the past minimum wage was.
I see what you are getting at, but I don’t think the equivalency is accurate. Knowing how effective air pollution was at cooling the planet is one of the main outputs from the model, whereas the minimum wage is only an input. It may be that the analogy breaks down at that level of scrutiny. Analogies, like models, are not perfect.
This is a good explanation and a good reason why the model should be trusted.
“People seem to be thinking about this wrong. Daniel suggests it is not a tall order to construct a climate model that fits past data. Steve says that adding in any old term will make them fit. These are wrong. These models are not curve fitting, but based on physical laws – built from the bottom up, so to speak.”

Absolutely NOT true. Michael Mann’s hockey stick was built off of Tree Ring Proxies; what physical law is “Tree Ring Proxies”? All these models are built off “historical” data. There is no way that those curves can be generated off of physical laws alone. There are simply too many unknowns. (Volcanic eruptions, killing of buffalo, deforestation, meteor strikes, etc.) All models need “initial conditions”, which are of course by definition “Historical Data”.
Betty, these charts are not based on Michael Mann’s hockey stick. That was a temperature reconstruction – an entirely different thing.