Clang optimization levels
@Antoine's answer (and the other question linked) accurately describe the LLVM optimizations that are enabled, but there are a few other Clang-specific options (i.e., those that affect lowering to the AST) that are affected by the -O[0|1|2|3|fast] flags.
You can take a look at these with:
echo 'int;' | clang -xc -O0 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -O1 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -O2 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -O3 - -o /dev/null -\#\#\#
echo 'int;' | clang -xc -Ofast - -o /dev/null -\#\#\#
For example, -O0 enables -mrelax-all, -O1 enables -vectorize-loops and -vectorize-slp, and -Ofast enables -menable-no-infs, -menable-no-nans, -menable-unsafe-fp-math, -ffp-contract=fast and -ffast-math.
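To compare levels, it can help to reduce the -### output to one flag per line first. The following is a minimal sketch, not part of the original answer: cc1_flags is a hypothetical helper name, and it assumes the cc1 invocation is the line containing "-cc1" with every argument double-quoted. Values that do not start with a dash (paths, triples) are dropped on purpose.

```shell
# Read the output of `clang ... -###` on stdin and print the -cc1 flags,
# one per line, without the surrounding double quotes.
# Assumption: the cc1 command is the line containing "-cc1" and each
# argument is wrapped in double quotes.
cc1_flags() {
    grep -e '"-cc1"' \
        | tr ' ' '\n' \
        | sed -e 's/^"//' -e 's/"$//' \
        | grep -e '^-' \
        | LC_ALL=C sort -u
}

# Usage (requires clang on PATH; -### prints to stderr, hence 2>&1):
#   echo 'int;' | clang -xc -O2 - -o /dev/null -\#\#\# 2>&1 | cc1_flags > O2.flags
```

Saving one such file per level gives sorted lists that can be compared with diff or comm.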
@Techogrebo:
Yes, you don't necessarily need the other LLVM tools. Try:
echo 'int;' | clang -xc - -o /dev/null -mllvm -print-all-options
Also, there are a lot more detailed options you can examine/modify with Clang alone... you just need to know how to get to them!
Try a few of:
clang -help
clang -cc1 -help
clang -cc1 -mllvm -help
clang -cc1 -mllvm -help-list-hidden
clang -cc1as -help
Starting with clang / LLVM 13.0.0, the legacy pass manager has been deprecated and the new pass manager is used by default.
This means that the previous solution for printing the optimization passes used for the different optimization levels in opt will only work if the legacy pass manager is explicitly enabled with -enable-new-pm=0.
So as long as the legacy pass manager is around (expected until LLVM 14), one can use the following command:
llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments -enable-new-pm=0
Alternatively, the execution order of the optimization passes with the new pass manager can be extracted with --debug-pass-manager (instead of -debug-pass=Arguments).
Unfortunately, the output is very verbose, and some processing is needed to reconstruct the behavior manually with -passes=.
If only transformation passes are of interest, one can use the option -debug-pass-manager=quiet to skip information about analyses.
There is a user guide on how to use the new pass manager with opt on the LLVM website.
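Part of that processing can be sketched as a small filter. This is only an illustration under an assumption: it expects executed passes to be logged as "Running pass: <Name> on <unit>", which may differ between LLVM releases. running_passes is a hypothetical helper name, and the printed class names are not necessarily the spellings that -passes= accepts.

```shell
# Reduce --debug-pass-manager output (read on stdin) to the names of the
# passes that ran, in execution order.
# Assumption: each executed pass is logged as
#   Running pass: <Name> on <unit>
# Lines about analyses or invalidation do not match and are dropped.
running_passes() {
    sed -n 's/^Running pass: \(.*\) on .*$/\1/p'
}

# Usage (requires llvm-as and opt on PATH; the log is assumed to go to stderr):
#   llvm-as < /dev/null | opt -passes='default<O3>' -disable-output \
#       --debug-pass-manager 2>&1 | running_passes
```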
I found this related question.
To sum it up, to find out about compiler optimization passes:
llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments
As pointed out in Geoff Nixon's answer (+1), clang additionally runs some higher-level optimizations, which we can retrieve with:
echo 'int;' | clang -xc -O3 - -o /dev/null -\#\#\#
Documentation of individual passes is available here.
You can compare the effect of changing high-level flags such as -O like this:
diff -wy --suppress-common-lines \
<(echo 'int;' | clang -xc - -o /dev/null -\#\#\# 2>&1 | tr " " "\n" | grep -v /tmp) \
<(echo 'int;' | clang -xc -O0 - -o /dev/null -\#\#\# 2>&1 | tr " " "\n" | grep -v /tmp)
# will tell you that -O0 is indeed the default.
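The "adds" and "drops" lists further down are set differences between two option lists. A minimal sketch of that step, assuming each list is available as a single space-separated string (for instance the line printed by -debug-pass=Arguments); added_options is a hypothetical helper name:

```shell
# Print the options that occur in the second list but not in the first.
# Both arguments are space-separated option strings.
added_options() {
    old=$(mktemp) && new=$(mktemp) || return 1
    # One option per line, sorted and de-duplicated, as comm requires.
    printf '%s\n' "$1" | tr -s ' ' '\n' | LC_ALL=C sort -u > "$old"
    printf '%s\n' "$2" | tr -s ' ' '\n' | LC_ALL=C sort -u > "$new"
    # -13 keeps only the lines that are unique to the second file.
    LC_ALL=C comm -13 "$old" "$new"
    rm -f "$old" "$new"
}

# Usage, with the lists for two levels stored in shell variables:
#   added_options "$O1_PASSES" "$O2_PASSES"   # what -O2 adds
#   added_options "$O2_PASSES" "$O1_PASSES"   # what -O2 drops
```

Because the lists are sorted before comparison, the result shows which options differ, not where they sit in the pipeline.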
With version 6.0 the passes are as follows:
baseline (-O0):
- opt sets: -tti -verify -ee-instrument -targetlibinfo -assumption-cache-tracker -profile-summary-info -forceattrs -basiccg -always-inline -barrier
- clang adds: -mdisable-fp-elim -mrelax-all
-O1 is based on -O0
- opt adds: -targetlibinfo -tti -tbaa -scoped-noalias -assumption-cache-tracker -profile-summary-info -forceattrs -inferattrs -ipsccp -called-value-propagation -globalopt -domtree -mem2reg -deadargelim -basicaa -aa -loops -lazy-branch-prob -lazy-block-freq -opt-remark-emitter -instcombine -simplifycfg -basiccg -globals-aa -prune-eh -always-inline -functionattrs -sroa -memoryssa -early-cse-memssa -speculative-execution -lazy-value-info -jump-threading -correlated-propagation -libcalls-shrinkwrap -branch-prob -block-freq -pgo-memop-opt -tailcallelim -reassociate -loop-simplify -lcssa-verification -lcssa -scalar-evolution -loop-rotate -licm -loop-unswitch -indvars -loop-idiom -loop-deletion -loop-unroll -memdep -memcpyopt -sccp -demanded-bits -bdce -dse -postdomtree -adce -barrier -rpo-functionattrs -globaldce -float2int -loop-accesses -loop-distribute -loop-vectorize -loop-load-elim -alignment-from-assumptions -strip-dead-prototypes -loop-sink -instsimplify -div-rem-pairs -verify -ee-instrument -early-cse -lower-expect
- clang adds: -momit-leaf-frame-pointer
- clang drops: -mdisable-fp-elim -mrelax-all
-O2 is based on -O1
- opt adds: -inline -mldst-motion -gvn -elim-avail-extern -slp-vectorizer -constmerge
- opt drops: -always-inline
- clang adds: -vectorize-loops -vectorize-slp
-O3 is based on -O2
- opt adds: -callsite-splitting -argpromotion
-Ofast is based on -O3, valid in clang but not in opt
- clang adds: -fno-signed-zeros -freciprocal-math -ffp-contract=fast -menable-unsafe-fp-math -menable-no-nans -menable-no-infs -mreassociate -fno-trapping-math -ffast-math -ffinite-math-only
-Os is similar to -O2
- opt drops: -libcalls-shrinkwrap and -pgo-memop-opt
-Oz is based on -Os
- opt drops: -slp-vectorizer
With version 3.8 the passes are as follows:
baseline (-O0):
- opt sets: -targetlibinfo -tti -verify
- clang adds: -mdisable-fp-elim -mrelax-all
-O1 is based on -O0
- opt adds: -globalopt -demanded-bits -branch-prob -inferattrs -ipsccp -dse -loop-simplify -scoped-noalias -barrier -adce -deadargelim -memdep -licm -globals-aa -rpo-functionattrs -basiccg -loop-idiom -forceattrs -mem2reg -simplifycfg -early-cse -instcombine -sccp -loop-unswitch -loop-vectorize -tailcallelim -functionattrs -loop-accesses -memcpyopt -loop-deletion -reassociate -strip-dead-prototypes -loops -basicaa -correlated-propagation -lcssa -domtree -always-inline -aa -block-freq -float2int -lower-expect -sroa -loop-unroll -alignment-from-assumptions -lazy-value-info -prune-eh -jump-threading -loop-rotate -indvars -bdce -scalar-evolution -tbaa -assumption-cache-tracker
- clang adds: -momit-leaf-frame-pointer
- clang drops: -mdisable-fp-elim -mrelax-all
-O2 is based on -O1
- opt adds: -elim-avail-extern -mldst-motion -slp-vectorizer -gvn -inline -globaldce -constmerge
- opt drops: -always-inline
- clang adds: -vectorize-loops -vectorize-slp
-O3 is based on -O2
- opt adds: -argpromotion
-Ofast is based on -O3, valid in clang but not in opt
- clang adds: -fno-signed-zeros -freciprocal-math -ffp-contract=fast -menable-unsafe-fp-math -menable-no-nans -menable-no-infs
-Os is the same as -O2
-Oz is based on -Os
- opt drops: -slp-vectorizer
- clang drops: -vectorize-loops
----------
With version 3.7 the passes are as follows (parsed output of the command above):
default (-O0): -targetlibinfo -verify -tti
-O1 is based on -O0
- adds: -sccp -loop-simplify -float2int -lazy-value-info -correlated-propagation -bdce -lcssa -deadargelim -loop-unroll -loop-vectorize -barrier -memcpyopt -loop-accesses -assumption-cache-tracker -reassociate -loop-deletion -branch-prob -jump-threading -domtree -dse -loop-rotate -ipsccp -instcombine -scoped-noalias -licm -prune-eh -loop-unswitch -alignment-from-assumptions -early-cse -inline-cost -simplifycfg -strip-dead-prototypes -tbaa -sroa -no-aa -adce -functionattrs -lower-expect -basiccg -loops -loop-idiom -tailcallelim -basicaa -indvars -globalopt -block-freq -scalar-evolution -memdep -always-inline
-O2 is based on -O1
- adds: -elim-avail-extern -globaldce -inline -constmerge -mldst-motion -gvn -slp-vectorizer
- removes: -always-inline
-O3 is based on -O2
- adds: -argpromotion -verif
-Os is identical to -O2
-Oz is based on -Os
- removes: -slp-vectorizer
----------
For version 3.6 the passes are as documented in GYUNGMIN KIM's post.
----------
With version 3.5 the passes are as follows (parsed output of the command above):
default (-O0): -targetlibinfo -verify -verify-di
-O1 is based on -O0
- adds: -correlated-propagation -basiccg -simplifycfg -no-aa -jump-threading -sroa -loop-unswitch -ipsccp -instcombine -memdep -memcpyopt -barrier -block-freq -loop-simplify -loop-vectorize -inline-cost -branch-prob -early-cse -lazy-value-info -loop-rotate -strip-dead-prototypes -loop-deletion -tbaa -prune-eh -indvars -loop-unroll -reassociate -loops -sccp -always-inline -basicaa -dse -globalopt -tailcallelim -functionattrs -deadargelim -notti -scalar-evolution -lower-expect -licm -loop-idiom -adce -domtree -lcssa
-O2 is based on -O1
- adds: -gvn -constmerge -globaldce -slp-vectorizer -mldst-motion -inline
- removes: -always-inline
-O3 is based on -O2
- adds: -argpromotion
-Os is identical to -O2
-Oz is based on -Os
- removes: -slp-vectorizer
----------
With version 3.4 the passes are as follows (parsed output of the command above):
-O0: -targetlibinfo -preverify -domtree -verify
-O1 is based on -O0
- adds: -adce -always-inline -basicaa -basiccg -correlated-propagation -deadargelim -dse -early-cse -functionattrs -globalopt -indvars -inline-cost -instcombine -ipsccp -jump-threading -lazy-value-info -lcssa -licm -loop-deletion -loop-idiom -loop-rotate -loop-simplify -loop-unroll -loop-unswitch -loops -lower-expect -memcpyopt -memdep -no-aa -notti -prune-eh -reassociate -scalar-evolution -sccp -simplifycfg -sroa -strip-dead-prototypes -tailcallelim -tbaa
-O2 is based on -O1
- adds: -barrier -constmerge -domtree -globaldce -gvn -inline -loop-vectorize -preverify -slp-vectorizer -targetlibinfo -verify
- removes: -always-inline
-O3 is based on -O2
- adds: -argpromotion
-Os is identical to -O2
-Oz is based on -O2
- removes: -barrier -loop-vectorize -slp-vectorizer
----------
With version 3.2 the passes are as follows (parsed output of the command above):
-O0: -targetlibinfo -preverify -domtree -verify
-O1 is based on -O0
- adds: -sroa -early-cse -lower-expect -no-aa -tbaa -basicaa -globalopt -ipsccp -deadargelim -instcombine -simplifycfg -basiccg -prune-eh -always-inline -functionattrs -simplify-libcalls -lazy-value-info -jump-threading -correlated-propagation -tailcallelim -reassociate -loops -loop-simplify -lcssa -loop-rotate -licm -loop-unswitch -scalar-evolution -indvars -loop-idiom -loop-deletion -loop-unroll -memdep -memcpyopt -sccp -dse -adce -strip-dead-prototypes
-O2 is based on -O1
- adds: -inline -globaldce -constmerge
- removes: -always-inline
-O3 is based on -O2
- adds: -argpromotion
-Os is identical to -O2
-Oz is identical to -Os
-------------
Edit [march 2014] removed duplicates from lists.
Edit [april 2014] added documentation link + options for 3.4
Edit [september 2014] added options for 3.5
Edit [december 2015] added options for 3.7 and mention existing answer for 3.6
Edit [may 2016] added options for 3.8, for both opt and clang and mention existing answer for clang (versus opt)
Edit [nov 2018] add options for 6.0