Finding unprecedented high resolution values in a low resolution dataset

In a previous post, I discussed a graph suggesting that the CO₂ and CH₄ levels in the atmosphere are unprecedented in the last 800,000 years, and I argued that it is misleading to compare high resolution data with low resolution data. After publishing that post, I wondered whether I could illustrate this with an example. It should be possible if I had a detailed dataset: I could graph it at full detail, see what that looks like, then sample the dataset in the same way as a proxy dataset and graph it again. Comparing both graphs should make the effect clear.

Initially, I considered using white noise, but after a while I thought it would be better to use real-world, high resolution data with some trend in it. I remembered that the Dutch meteorological service publishes its measurements for different stations, for example the daily temperature data from De Bilt. This is detailed information and I know what it looks like, which helps when I start comparing.

The data is provided in tenths of a degree Celsius (for example, 150 represents 15 °C). In what follows, I will keep it in that format.
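
As a small aid for reading the numbers that follow, here is a minimal sketch of the conversion between the stored format and degrees Celsius. The function names are my own, chosen for illustration; they are not part of the source data or any library.

```python
def tenths_to_celsius(value_in_tenths: int) -> float:
    """Convert a value stored in tenths of a degree Celsius to degrees Celsius."""
    return value_in_tenths / 10.0


def celsius_to_tenths(value_in_celsius: float) -> int:
    """Convert degrees Celsius to the stored format (tenths of a degree)."""
    return round(value_in_celsius * 10)


print(tenths_to_celsius(150))   # the example from the text: 150 stands for 15 °C
print(celsius_to_tenths(23.5))  # 23.5 °C expressed in the stored format
```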

I then needed the resolution of a proxy. At first, I wanted to sample in a regular pattern, but when looking at some proxy datasets, I noticed that they are not sampled that way. They mostly have intervals of 100+ years, but not in a regular pattern. I then settled on the Dome C proxy dataset. This dataset is for CO₂, but that is not important here because I only want to use its sampling pattern in order to see what it does to the known example dataset.

The Dome C proxy dataset spans 12,609 years (from 9,067 to 21,676 years) and has an average sampling interval of about 175 years. To make it a bit easier for myself, I took the last 12,609 data points of my daily temperature dataset (from July 25, 1983 until the end of 2017), so that one day in the temperature record stands in for one year in the proxy record. This is the result:

The datapoints range from -132 to +271 tenths of a degree. There is some trend in the data. At the beginning it seems a bit cooler, then temperatures go up around datapoint 2,000, then back up around 5,000 and again between 10,000 and 11,000. Overall, temperatures go up slightly.

I then took the intervals from the example proxy dataset and sampled the temperature values accordingly. This is the result:

Look at the data range. It now runs from -10 to +239 tenths of a degree. The range has narrowed, which is quite logical: it would be quite a coincidence if all the extremes happened to be sampled.
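
The sampling step described above can be sketched as follows. This is only an illustration under stated assumptions: the daily series is synthetic (a seasonal sine wave plus noise), not the De Bilt record, and the irregular sample positions are randomly generated around a mean gap of 175 points rather than taken from the Dome C dataset. All function names are hypothetical.

```python
import math
import random


def make_synthetic_daily_series(n_points: int, seed: int = 0) -> list[float]:
    """Synthetic stand-in for a daily record, in tenths of a degree Celsius."""
    rng = random.Random(seed)
    return [
        100 + 80 * math.sin(2 * math.pi * day / 365.25) + rng.gauss(0, 40)
        for day in range(n_points)
    ]


def make_irregular_positions(n_points: int, mean_gap: int = 175, seed: int = 1) -> list[int]:
    """Hypothetical irregular sample positions with gaps scattered around mean_gap."""
    rng = random.Random(seed)
    positions = []
    position = 0
    while position < n_points:
        positions.append(position)
        position += rng.randint(mean_gap // 2, mean_gap * 3 // 2)
    return positions


def point_sample(series: list[float], positions: list[int]) -> list[float]:
    """Keep only the values found at the given positions."""
    return [series[p] for p in positions]


series = make_synthetic_daily_series(12609)
positions = make_irregular_positions(len(series))
sampled = point_sample(series, positions)

print("full range:   ", min(series), max(series))
print("sampled range:", min(sampled), max(sampled))
```

Because the sampled values are a subset of the full series, the sampled range can never be wider than the original one, and with only a few dozen samples it will usually be narrower.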

Okay, but this is not exactly what happens with proxy data. The proxy samples are not from a specific year, but from a range of years. If I average the values in those ranges, then I get this:

Again, look at what happens with the range of the data. Now we are left with a range of +12 to +170 tenths of a degree (compared to -132 to +271 tenths of a degree in the original data). This is also logical: averaging the values in those ranges basically smooths the graph. It also removes the subtle trend from the original data. The first lower trend around datapoint 2,000 is now gone, while the second one is still somewhat visible. The third is not visible, but there is another dip before 10,000 that wasn’t in the original data. It was created by the short sampling interval at that point, which meant that only a few low values were averaged, resulting in a low value for that range.
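
The averaging step can be sketched in the same way. Again, this is an illustration and not the exact procedure used for the graphs: the series is synthetic, the range boundaries are randomly generated, and `interval_means` is a hypothetical helper name.

```python
import math
import random
from statistics import fmean


def interval_means(series, boundaries):
    """Average the series over each half-open range [boundaries[i], boundaries[i + 1])."""
    return [
        fmean(series[start:end])
        for start, end in zip(boundaries, boundaries[1:])
    ]


rng = random.Random(0)

# Synthetic stand-in for the daily record, in tenths of a degree Celsius.
series = [
    100 + 80 * math.sin(2 * math.pi * day / 365.25) + rng.gauss(0, 40)
    for day in range(12609)
]

# Hypothetical irregular range boundaries, ending exactly at the end of the series.
boundaries = [0]
while boundaries[-1] < len(series):
    boundaries.append(min(len(series), boundaries[-1] + rng.randint(87, 262)))

averaged = interval_means(series, boundaries)

print("full range:    ", min(series), max(series))
print("averaged range:", min(averaged), max(averaged))
```

An average over a range always lies between the lowest and highest value in that range, so the averaged series cannot reach the extremes of the daily data unless a range happens to contain only extreme values. A short range is more likely to do so, which is the mechanism behind the extra dip mentioned above.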

What happens when I delete the 2017 data (3 data points) and replace it with the actual value of one of those datapoints? I take the value of 23.5 °C (to show the issue with the low resolution data) and get this:

Now the big question: does this somehow mean that a daily temperature of 23.5 °C is unprecedented in our dataset of more than 34 years?

Not necessarily. It might be unprecedented in the smoothed, low resolution data (there was no range in the daily temperature dataset with an average of at least 23.5 °C), but it is not unprecedented in the underlying higher resolution data. There are 102 measurements of at least 23.5 °C in the daily temperature record since 1983. Looking at the original data, there are 1,873 measurements that exceed the highest value of the averaged data (more than 170) and 2,554 measurements that fall outside the range of the averaged data (less than 12 or more than 170).
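
The kind of counting done above can be sketched with a toy record. The eight values below are made up for illustration and are not the De Bilt data; the helper names are hypothetical as well.

```python
from statistics import fmean


def count_at_least(values, threshold):
    """Count values greater than or equal to the threshold."""
    return sum(1 for value in values if value >= threshold)


def count_outside(values, low, high):
    """Count values below low or above high."""
    return sum(1 for value in values if value < low or value > high)


# Hypothetical toy record in tenths of a degree Celsius.
daily = [120, 250, 90, 240, 60, 110, 235, 100]

# Average over two ranges of four values each, mimicking the low resolution series.
averaged = [fmean(daily[0:4]), fmean(daily[4:8])]

# A single daily value (23.5 °C) compared against both series.
recent_value = 235

exceeds_averaged = recent_value > max(averaged)
matches_in_daily = count_at_least(daily, recent_value)
outside_averaged_range = count_outside(daily, min(averaged), max(averaged))

print("higher than every averaged value:", exceeds_averaged)
print("daily values at least as high:   ", matches_in_daily)
print("daily values outside avg. range: ", outside_averaged_range)
```

In this toy case the single value is higher than anything in the averaged series, yet several values in the daily record equal or exceed it, which is the same pattern as in the real example.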

Why couldn’t we see all this in the previous graph? Well, these values were hidden in the lower resolution data. Their absence was an artifact of the low resolution sampling of the dataset.

Okay, this example doesn’t say anything about whether the current CO₂ and CH₄ levels are unprecedented in the last 800,000 years. It is just a practical example to show that it is not possible to prove that a certain value from a high resolution dataset is unprecedented using a low resolution dataset. The same applies to the proxy datasets, which have a very low resolution of 170 years on average and never show a period with an average of 386 ppm CO₂ or 1,790 ppb CH₄. Even if a 60-year period with such values had existed, it would not show up as such in this low resolution dataset, so it is misleading to suggest that our current (high resolution) values are unprecedented on the basis of such low resolution data. There is no way to tell how many similar warmings in the past may have been hidden in the lower resolution proxies.
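
To put a rough number on the 60-year case, here is a minimal sketch of how a short excursion is diluted when it is averaged over a longer window. It assumes a simple arithmetic mean over the window, which is a simplification and not a model of how a real proxy record smooths its signal. The baseline and peak values are arbitrary units chosen for illustration, and `diluted_mean` is a hypothetical helper.

```python
def diluted_mean(baseline: float, peak: float, peak_years: float, window_years: float) -> float:
    """Mean over a window that sits at `peak` for peak_years and at `baseline` otherwise."""
    if not 0 <= peak_years <= window_years:
        raise ValueError("peak_years must be between 0 and window_years")
    baseline_years = window_years - peak_years
    return (peak * peak_years + baseline * baseline_years) / window_years


# Arbitrary units: a 60-year excursion of height 100 above a baseline of 0,
# averaged over a single 170-year window.
mean_value = diluted_mean(baseline=0.0, peak=100.0, peak_years=60, window_years=170)
print(f"averaged value: {mean_value:.1f} (peak was 100.0)")
```

Under these assumptions the averaged value retains only 60/170 of the excursion's height, a bit more than a third, so the peak itself never appears in the low resolution series.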

via Trust, yet verify

https://ift.tt/2qx2xWv

April 14, 2018 at 04:51PM