Ludwig

On reading about optimizing compilers

I was asked (on x.com) to dump some of my reading from the last two days on optimizing compilers -- here they are. For context, I never actually wrote a compiler backend (I'm an LLVM baby), so while I'm specifically interested in optimizing compilers because of their allure. I have no hands-on tangible experiences with writing one.

What Every C Programmer Should Know About Undefined Behavior

Re-read the full wonderful series from Chris Lattner, to refresh my mind on UB. Give this one a read if you haven't!

Future Directions for Optimizing Compilers

This served as a nice (if indirect) modern primer on optimizing compilers. Their main case studies were: Declarative Peephole Optimizations for LLVM, Superoptimizing LLVM

The references led me to:

It also made me realize I needed a refresher (or to actually learn) what SSA actually was. If you're interested in optimizing compilers, these are some of the nicest, shorter resources I found on SSA:

Learning to Make Compiler Optimizations More Effective

Very interesting read on a RNN-based model and pipeline for loop optimizations. It got me thinking about where/how do these ML-based techniques fit in existing or new compiler pipelines -- held off on researching the current SOTA for this.

Some interesting quotes:

Given the source code of a loop, LoopLearner suggests a semantically invariant transformation that will likely allow the compiler to produce more efficient code.

Following its recommendations prior to compilation results in an average speedup of 1.14x. Almost three quarters (73%) of the suggested transformations yield positive speedups. Trying the top-3 recommendations and choosing the best one raises the average speedup even to 1.29x.

(to be used) as a pre-processor run before or as part of the compiler

What every compiler writer should know about programmers or “Optimization” based on undefined behaviour hurts performance

Tbh, the entire paper is a banger with lots of interesting takes (2.1, 3 especially go hard). It's also a really funny read. Below are some quotes I thought were interesting.

This one stood out and captures C very well to me

Some of the facets of the spirit of C can be summarized in phrases like:
– Trust the programmer.
– Don’t prevent the programmer from doing what needs to be done. – Keep the language small and simple.
– Provide only one way to do an operation.
– Make it fast, even if it is not guaranteed to be portable.
The last proverb needs a little explanation. The potential for efficient code generation is one of the most important strengths of C. To help ensure that no code explosion occurs for what appears to be a very simple operation, many operations are defined to be how the target machine’s hardware does it23 rather than by a general abstract rule.

Bold take on UB-based heuristics and its consequences

“Optimizations” based on assuming that undefined behavior does not happen buy little performance even for SPECint benchmarks (1.7% speedup with Clang3.1, 1.1% with GCC-4.7), and incur large costs: Most production programs perform undefined behaviors and removing them all requires a lot of effort, and may cause worse code to be produced for the program. Moreover, a number of security checks have been “optimized” away, leaving the affected programs vulnerable!

Screenshot 2024-07-14 at 11

"[...] programs that work on a version of your compiler are conforming C programs and they are not buggy just because they perform undefined behavior"

Here's one more (but really, the paper is full of these) I posted on x.com here.

Reading this I also realized I needeed a refresh on GCC optimization flags and went through a few rabbit holes using GCC's 3.11 Options That Control Optimization

Stashed for later

Stashed for later can mean a lot of things. If it is directly relevant to my goals, that usually means tomorrow/once I'm rested. If it isn't, it could be anytime from "next time I'm reading on that domain" to "in 9 months when I'm attempting to write an optimizing compiler"