I believe this to be an important and interesting topic, but this paper is so sloppily done it could have been written by sociologists.
The paper starts by asking the question: ``Can we compare the energy efficiency of software languages?'' Of course anyone who understands the difference between a language and an implementation will quickly tell you that no, you cannot, yet the authors claim that they did so. The actual language implementations are listed on the companion website, where you can learn that Python means CPython 3.5.2, Lisp means SBCL 1.3.3.debian, and for OCaml they are using the optimizing (native) compiler (version 4.05.0). They were running all this on an Intel i5-4660, and measured the energy consumption using some Intel tool. Yet they claim that their results are ``externally valid'' and general, despite not measuring alternative implementations or running on other architectures, or even just two different microarchitectures.
For the measurements, they use a subset of the Computer Language Benchmarks Game. These programs are very computation-intensive and are characterized by a single, long burst of near-uniform performance. The paper does not concern itself with whether these are representative of typical load, in which cases they might be, or why it might not make sense to measure something else.
Having measured the energy consumption and execution times, they ask whether the two are linearly proportional. Unsurprisingly, they find that while the fastest program consumes the least energy, there are also cases where a faster program consumes more energy than a slower one. They do not investigate what could have caused this. Furthermore, they do not try to explain what ``energy efficiency'' actually means, and seem to rely on total energy consumption instead.
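To see why a faster program consuming more energy is unsurprising, recall that energy is average power multiplied by time. The sketch below uses made-up numbers (none are taken from the paper) and assumes, purely for illustration, that the faster run draws more power because it keeps more cores busy; that is only one possible cause.

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
# Nothing here is measured data from the paper.

def energy_joules(avg_power_watts: float, seconds: float) -> float:
    """Energy (J) is average power (W) multiplied by elapsed time (s)."""
    return avg_power_watts * seconds

# Assumed scenario: a run that finishes sooner but draws more power.
fast = energy_joules(avg_power_watts=40.0, seconds=2.0)

# Assumed scenario: a run that takes twice as long at lower power.
slow = energy_joules(avg_power_watts=15.0, seconds=4.0)

# True when the faster run nevertheless consumed more energy.
faster_uses_more_energy = fast > slow

print(f"fast run: {fast} J, slow run: {slow} J")
```

With these assumed figures the faster run uses 80 J against 60 J for the slower one, so time alone does not determine the energy ranking.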
Next, they want to investigate whether memory usage correlates with energy consumption. Sadly they are only measuring peak memory usage (using the time utility!), which turns out to be completely useless.
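One possible reason peak usage says so little: two programs can share the same peak while holding very different amounts of memory over their lifetime. A minimal sketch, with invented traces and hypothetical helper names (`peak`, `mib_seconds`), not anything from the paper:

```python
# Hypothetical memory traces in MiB, one sample per second.
# Both are invented for illustration only.
brief_spike = [10, 10, 500, 10, 10, 10]
sustained = [500, 500, 500, 500, 500, 500]

def peak(trace):
    """Peak usage: the single largest sample in the trace."""
    return max(trace)

def mib_seconds(trace, interval_s=1.0):
    """Memory held over time: sum of the samples times the sampling interval."""
    return sum(trace) * interval_s

for name, trace in (("brief_spike", brief_spike), ("sustained", sustained)):
    print(name, "peak:", peak(trace), "MiB; held:", mib_seconds(trace), "MiB*s")
```

Both traces report a peak of 500 MiB, yet the second holds several times more memory over the same interval, a difference that a peak-only measurement cannot show.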
Finally, they take their results and try to recommend languages based on them, concluding that if you want to save both time and energy, you should write everything in C, but if you also want to save memory, you can try Pascal too.
In general, this seems to be a filler paper. They took existing tools and programs, started the benchmarks, leaned back, drank some coffee, and presented the results, all without even once questioning whether any of it made sense.
By far the most interesting result is the fact(?) that the change from SBCL 1.3.3 to 1.4.3 made Lisp somewhat slower everywhere, except for `fasta', which is up to four times as efficient in all three measures with the new compiler.
https://sites.google.com/view/energy-efficiency-languages/updated-functional-results-2020
https://sites.google.com/view/energy-efficiency-languages/results