Dave Burton
Have you noticed that, in most fields, technology tends to advance in fits and starts? For long periods, technology just creeps along, then there’s a sudden surge of progress. Well, it’s happening again… I think.
For the last sixty years, computer scientists have been trying, with little success, to make programming computers more productive. The creation of high-level languages in the 1950s and early 1960s yielded big productivity gains, but then progress slowed to a crawl. Despite astonishing increases in computing power (5–6 orders of magnitude in speed, RAM, disk storage, and display resolution, in my own home computers, over five decades!), and despite improved tools and a series of fads in programming methodology, not much really changed, and software developers have remained about as (un)productive as we were in 1975, when Fred Brooks wrote The Mythical Man-Month.
Newer programming languages, like Python, JavaScript, and Rust, have helped a little, but programmers in those languages are only slightly more productive than programmers were in Pascal and C a half-century ago. Stack Overflow is more convenient than a shelf full of paper books, but that mostly just trims the learning curve a bit. Newer tools, like the VS Code editor, are very nice (certainly nicer than punched cards!), but they don’t make computer programmers wildly more productive.
But now that has apparently, finally, changed. The long-anticipated sudden surge has arrived. Cutting-edge programmers are now using “AI Assistants” like Microsoft/GitHub Copilot to help them write code, and they are reporting dramatic improvements in productivity, and perhaps even improvements in code quality.
It does not shock me that tools created by computer programmers are helpful to computer programmers. But, surprisingly, this technology is also very, very good at parsing, responding to, and writing “natural language” (e.g., English).
Here’s an eye-opening example. I asked ChatGPT 3.5 this question, which I’m confident the vast majority of professional “climate scientists” would not answer correctly:
“With reference to the agronomy literature, quantify how elevated CO2 helps plants use water more efficiently, and increases crops’ drought resilience, by reducing stomatal conductance and water loss through transpiration.”
ChatGPT’s answer was remarkably good:
“In agronomy, the effects of elevated CO2 on plant water use efficiency and drought resilience are extensively studied. One of the key mechanisms through which elevated CO2 levels improve water use efficiency is by reducing stomatal conductance and, consequently, water loss through transpiration.
“Stomata are small pores on the surface of plant leaves that regulate gas exchange, including the uptake of CO2 for photosynthesis and the release of water vapor through transpiration. When CO2 levels are elevated, plants can maintain the same or higher rate of photosynthesis while reducing stomatal conductance. This reduction in stomatal conductance leads to a decrease in water loss through transpiration without significantly affecting CO2 uptake, resulting in improved water use efficiency.”
Wow! I was very impressed. That answer is exactly correct, and the prose is perfect. I’ve been explaining that to people for years, and I think ChatGPT’s explanation is better than mine.
But all is not rainbows and roses, because that doesn’t always happen. I then asked ChatGPT 3.5 what I thought was a much simpler question. It requires a trivial bit of physical and mathematical reasoning, but I think even most climate scientists could probably answer it correctly:
“How much is 5.7 PgC in ppmv of CO2?”
I expected that AI tools would have no problem answering that simple question. But they all failed. In fact, they gave me ten different wrong answers.
Conventionally, “1 ppmv CO2” is shorthand for 1 µmol of CO2 per mole of dry air (the “dry molar fraction”). 1 ppmv CO2 = 7.8024 Gt CO2 = 2.12940 PgC, so the correct answer to the question I asked is 5.7 / 2.1294 = 2.67681 ppmv. (Water vapor adds about 0.25% to the mass of the atmosphere, about 0.41% by molar fraction, so the answer is very slightly smaller if water vapor is not omitted.)
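The conversion the chatbots fumbled is a one-line calculation. Here is a minimal sketch in Python, using the factor stated above (1 ppmv CO2 = 2.12940 PgC, dry-air molar-fraction convention); the function name is just for illustration:

```python
# 1 ppmv CO2 (dry molar fraction) corresponds to 2.12940 PgC of carbon.
PGC_PER_PPMV = 2.12940

def pgc_to_ppmv(pgc: float) -> float:
    """Convert a carbon mass in petagrams (PgC) to the equivalent
    atmospheric CO2 dry molar fraction, in ppmv."""
    return pgc / PGC_PER_PPMV

print(f"{pgc_to_ppmv(5.7):.5f} ppmv")  # prints 2.67681 ppmv
```

That’s the entire computation: a single division by a well-known constant, which is exactly why the AI tools’ ten different wrong answers are so striking.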
ChatGPT 3.5 completely fell on its face. It said:
“5.7 PgC is approximately equivalent to 61.9 ppmv of CO2 in the atmosphere.”
ChatGPT 4.0 isn’t free on the OpenAI website, but Popai has an option to use ChatGPT 4.0 as its engine, so I tried that. It gave a different wrong answer:
“5.7 PgC is approximately equivalent to 4.07 ppmv of CO2 in the atmosphere.”
I asked Mistral AI, and it also gave a wrong answer:
“5.7 PgC is approximately equivalent to 569 ppmv of CO2 when evenly distributed throughout the entire atmosphere.”
I asked Microsoft Copilot, and it also gave a wrong answer:
“5.7 PgC is approximately equivalent to 12.198 ppmv of CO₂.”
Copilot is also integrated into the Microsoft Edge browser. Oddly, that version gave a different wrong answer:
“Unfortunately, without knowing the specific atmospheric temperature and other factors, we cannot provide an exact ppmv value.”
I asked Llama 2, and it also gave a wrong answer:
“5.7 PgC is equivalent to 1220 ppmv of CO2.”
I asked Claude, and it also gave a wrong answer:
“5.7 PgC (petagrams of carbon) is approximately equivalent to 9.86 ppmv (parts per million by volume) of CO2 in the atmosphere.”
I asked Scite, and it also gave a wrong answer:
“5.7 PgC is equivalent to approximately 25.2 ppmv of CO2 in the atmosphere”
I asked Aria (in the Opera browser), and it also gave a wrong answer:
“5.7 PgC * 2.12 ppmv/PgC = 12.084 ppmv”
I also asked Google Gemini, and it gave the most hilariously wrong answer of all:
“5.7 PgC is equivalent to 5700000000000000000.00 ppmv of CO2.”
Two fervent advocates of the position that AI is transformative technology are Matt Welsh and Matthew Berman (both on YouTube, of course). Here’s a contrary view (profanity warning). IMO, the jury is still out, but…
👉 My first conclusion is that the AI tools are like the kids who pass their science classes by “memorizing the formulas.” They cannot “do science.”
👉 My second conclusion is that they can nevertheless be quite useful. Just always remember President Reagan’s advice for dealing with the Soviets: “Trust, but verify.”
👉 My third conclusion is that when using these tools it helps to lead them along by the nose. Be very, very specific with your questions. For instance, if you’d rather that they rely on the (robust) agronomy literature, rather than the (hopelessly politicized) climate science literature, then say so. Here’s a video with some suggestions about how to make best use of ChatGPT.
Dave Burton
via Watts Up With That?
March 14, 2024 at 04:06AM
