Profile-Guided Optimisation

Profile-Guided Optimisation (PGO) is a two-phase process. In the GEN phase you build an instrumented binary and run it with a representative workload to collect profile data; in the USE phase you rebuild with that profile data to produce an optimised binary. LIBRA automates the compiler flags for both phases.

For the variable reference, see LIBRA_PGO in Variable reference.

1. Add PGO presets

{
  "configurePresets": [
    {
      "name": "pgo-gen",
      "inherits": "base",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release",
        "LIBRA_PGO": "GEN"
      }
    },
    {
      "name": "pgo-use",
      "inherits": "base",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release",
        "LIBRA_PGO": "USE"
      }
    }
  ],
  "buildPresets": [
    { "name": "pgo-gen", "configurePreset": "pgo-gen" },
    { "name": "pgo-use", "configurePreset": "pgo-use" }
  ]
}

2. GEN phase: build and profile

Build the instrumented binary and run it with a representative workload. The workload should cover the hot paths you want the compiler to optimise — typically your benchmarks or a realistic subset of your test suite.

# With clibra
clibra build --preset pgo-gen
./build/pgo-gen/my_application --representative-workload

# Or with plain CMake
cmake --preset pgo-gen
cmake --build --preset pgo-gen
./build/pgo-gen/my_application --representative-workload
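
For Clang builds, where the raw profile lands depends on how the binary was instrumented and on the LLVM_PROFILE_FILE environment variable, which overrides the output path at run time. The sketch below assumes a Clang-instrumented binary and the build/pgo-gen layout used above, and sets that variable so each run writes into the GEN build directory. The helper names profile_pattern and run_profiled are hypothetical, and LIBRA may already arrange the output location for you, in which case the wrapper is unnecessary.

```shell
#!/usr/bin/env bash
# Sketch: run an instrumented binary so Clang's raw profiles land in a chosen directory.
# Assumption: the binary was built with Clang instrumentation (GCC ignores this variable).

# Print an LLVM_PROFILE_FILE pattern for the given directory.
# %p expands to the process ID, so repeated or parallel runs do not overwrite each other.
profile_pattern() {
  printf '%s/default-%%p.profraw\n' "$1"
}

# Run a command with LLVM_PROFILE_FILE pointing into the given directory.
run_profiled() {
  local dir="$1"
  shift
  LLVM_PROFILE_FILE="$(profile_pattern "$dir")" "$@"
}

# Usage (not executed here):
#   run_profiled build/pgo-gen ./build/pgo-gen/my_application --representative-workload
```

The pattern keeps the default prefix, so a glob such as build/pgo-gen/default*.profraw still matches the files it produces.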

3. Merge profile data (Clang only)

GCC writes .gcda files directly in a form the USE phase can read. Clang writes .profraw files that must be merged first:

# Clang / Intel LLVM only
llvm-profdata merge \
  -output=build/pgo-gen/default.profdata \
  build/pgo-gen/default*.profraw

If you have multiple .profraw files from different runs, merge them all to produce a single .profdata:

llvm-profdata merge \
  -output=build/pgo-gen/merged.profdata \
  build/pgo-gen/*.profraw
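
If the GEN run wrote no raw profiles, the glob above matches nothing and the merge fails with a less obvious error. A minimal sketch of a guard follows, assuming llvm-profdata is on PATH; the function names count_profraw and merge_profiles are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: refuse to merge when the GEN run produced no raw profiles.

# Count the .profraw files directly inside a directory.
count_profraw() {
  find "$1" -maxdepth 1 -type f -name '*.profraw' | wc -l | tr -d ' '
}

# Merge every .profraw in gen_dir into out_file, or fail with a clear message.
merge_profiles() {
  local gen_dir="$1" out_file="$2"
  if [ "$(count_profraw "$gen_dir")" -eq 0 ]; then
    echo "no .profraw files in $gen_dir; run the instrumented binary first" >&2
    return 1
  fi
  llvm-profdata merge -output="$out_file" "$gen_dir"/*.profraw
}

# Usage (not executed here):
#   merge_profiles build/pgo-gen build/pgo-gen/default.profdata
```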

4. USE phase: build the optimised binary

# With clibra
clibra build --preset pgo-use

# Or with plain CMake
cmake --preset pgo-use
cmake --build --preset pgo-use

The compiler reads the profile data from the GEN build directory automatically (LIBRA passes the correct -fprofile-use= path). The resulting binary in build/pgo-use/ is tuned to the workload you ran in the GEN phase.

Note

For Clang, LIBRA passes -fprofile-use=build/pgo-gen/default.profdata by default. If you merged to a different path, set LIBRA_PGO_PROFILE_PATH in your pgo-use preset’s cacheVariables.

5. Verify the improvement

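Compare the instrumented and optimised binaries with your benchmark. Because the instrumented binary carries profiling overhead, this comparison overstates the gain from PGO; a regular Release build without PGO is the fairer baseline if you have one.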

# Instrumented binary (GEN phase — slower due to instrumentation)
time ./build/pgo-gen/my_application --benchmark

# Optimised binary (USE phase)
time ./build/pgo-use/my_application --benchmark
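
A single time measurement is noisy, so it can help to repeat each run and compare medians. The sketch below assumes bash (it relies on the time keyword and TIMEFORMAT); the function names median and bench_median are hypothetical, and the benchmarked command's own output is discarded.

```shell
#!/usr/bin/env bash
# Sketch: time a command several times and report the median wall-clock seconds.

# Print the median of numbers read from stdin (one per line).
# For an even count this returns the lower of the two middle values.
median() {
  sort -n | awk '{ v[NR] = $1 } END { if (NR) print v[int((NR + 1) / 2)] }'
}

# Run a command `runs` times and print the median elapsed time in seconds.
bench_median() {
  local runs="$1"
  shift
  local i=0
  local TIMEFORMAT=%R   # make bash's `time` print only elapsed seconds
  while [ "$i" -lt "$runs" ]; do
    { time "$@" >/dev/null 2>&1; } 2>&1
    i=$((i + 1))
  done | median
}

# Usage (not executed here):
#   bench_median 5 ./build/pgo-gen/my_application --benchmark
#   bench_median 5 ./build/pgo-use/my_application --benchmark
```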

Typical improvements are 5–20% for CPU-bound workloads, though results vary with the codebase and with how representative the GEN workload is. Memory-bound workloads tend to see smaller gains.

Common issues

“Profile data not found” during USE phase

The compiler looks for profile data relative to the build directory. Make sure the GEN binary wrote its profile data to the expected location; where it writes can depend on the directory it was run from. For Clang, verify that the .profdata file exists at build/pgo-gen/default.profdata.
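
A quick check before starting the USE build can catch this early. The sketch below uses a hypothetical helper, has_profile_data. The Clang path is the default quoted in this guide. Looking for .gcda files anywhere under the GEN build directory is an assumption about where GCC writes them.

```shell
#!/usr/bin/env bash
# Sketch: check that profile data exists before configuring the USE build.

# Return 0 if profile data usable by the USE phase exists under gen_dir.
#   clang: expects a non-empty merged file at gen_dir/default.profdata
#   gcc:   expects at least one .gcda file somewhere under gen_dir (assumed location)
has_profile_data() {
  local compiler="$1" gen_dir="$2"
  case "$compiler" in
    clang) [ -s "$gen_dir/default.profdata" ] ;;
    gcc)   [ -n "$(find "$gen_dir" -type f -name '*.gcda' 2>/dev/null | head -n 1)" ] ;;
    *)     echo "unknown compiler: $compiler" >&2; return 2 ;;
  esac
}

# Usage (not executed here):
#   has_profile_data clang build/pgo-gen || echo "run the GEN phase first" >&2
```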

“Profile data out of date” warnings

The source changed between the GEN and USE builds. The compiler falls back to non-PGO optimisation for affected functions. Re-run the GEN phase with the current source before rebuilding with USE.
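
To spot likely stale profiles before the compiler warns, you can list source files modified after the profile was written. This is only a timestamp heuristic, not how the compiler detects mismatches. The helper name newer_sources and the list of file extensions are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list C/C++ sources that changed after the profile data was written.

# Print every source file under src_dir whose mtime is newer than the profile file.
newer_sources() {
  local profile="$1" src_dir="$2"
  find "$src_dir" -type f \
    \( -name '*.c' -o -name '*.cc' -o -name '*.cpp' -o -name '*.h' -o -name '*.hpp' \) \
    -newer "$profile"
}

# Usage (not executed here); any output suggests re-running the GEN phase:
#   newer_sources build/pgo-gen/default.profdata src
```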

Low workload coverage

If the workload only exercises 20% of the code, PGO only helps that 20%. Profile data from test suites tends to cover more code paths than a single benchmark run — consider running the full test suite as the GEN workload if individual benchmarks are insufficient.
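
One way to broaden coverage is to run the instrumented binary once per workload before the USE build. The sketch below uses a hypothetical helper, run_workloads, and assumes each workload is selected by a single command-line flag, as in the examples in this guide. With Clang, whether successive runs overwrite or accumulate raw profiles depends on the profile file name pattern in use, so check that every run left its data behind before merging.

```shell
#!/usr/bin/env bash
# Sketch: run the instrumented binary once per workload flag.

# Run `bin` once for each remaining argument; stop at the first failing run.
run_workloads() {
  local bin="$1"
  shift
  local flag
  for flag in "$@"; do
    echo "profiling workload: $flag" >&2
    "$bin" "$flag" || return 1
  done
}

# Usage (not executed here):
#   run_workloads ./build/pgo-gen/my_application --representative-workload --benchmark
```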