Benchmarks like this are quite pointless, because they don't measure anything close to how software behaves in the real world. Take a look at the GitHub measurements:
C = 0.77
Go = 2.07
Node = 0.79
Bun = 0.83
Deno = 1.13
PyPy = 1.61
Java = 0.64
Node and Bun, two runtimes for an interpreted language, are nearly as fast as C, and Java, a VM-based language, is faster than C? Well, what the hell is going on here?
Simply put, it's JIT compilation. In interpreted languages, when the runtime notices that the same operations are executed over and over again, it stops interpreting them and compiles them into highly optimized machine code.
This benchmark executes the exact same 3 lines of code in the exact same context, which is the optimal scenario for a JIT.
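To make that concrete, here's a hypothetical micro-benchmark of roughly that shape (in TypeScript, not the actual benchmark code, just an illustration). A tiny, monomorphic hot loop like this is exactly what a JIT such as V8 in Node/Bun handles best:

    // A tiny hot loop: same operation, same number types, every iteration.
    // After a few thousand passes the JIT compiles the loop body to
    // near-native machine code and the interpreter is no longer involved.
    function sumTo(n: number): number {
      let total = 0;
      for (let i = 0; i < n; i++) {
        total += i; // monomorphic, branch-free work the optimizer loves
      }
      return total;
    }

    console.log(sumTo(1_000_000_000));

After warm-up, what you're timing is that compiled loop body and nothing else, which is why the interpreted runtimes land right next to C here.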
But any real codebase consists of more than 3 lines of code called in different contexts with different constraints.
This benchmark is just plain useless. There are no good benchmarks, because no benchmark can capture the nuances of different programming languages, different paradigms, and different idioms. But this is by far the worst benchmark I have seen so far. It's literally just 3 lines of code that are benchmarked.