A Concurrent WLLVM in Go
TL;DR: A drop-in replacement for wllvm that builds the bitcode in parallel and is faster. A comparison between the two tools can be gleaned from building the Linux kernel.
Quick Start Comparison Table
wllvm command/env variable | gllvm command/env variable |
---|---|
wllvm | gclang |
wllvm++ | gclang++ |
extract-bc | get-bc |
wllvm-sanity-checker | gsanity-check |
LLVM_COMPILER_PATH | LLVM_COMPILER_PATH |
LLVM_CC_NAME ... | LLVM_CC_NAME ... |
WLLVM_CONFIGURE_ONLY | WLLVM_CONFIGURE_ONLY |
WLLVM_OUTPUT_LEVEL | WLLVM_OUTPUT_LEVEL |
WLLVM_OUTPUT_FILE | WLLVM_OUTPUT_FILE |
LLVM_COMPILER | not supported (clang only) |
LLVM_GCC_PREFIX | not supported (clang only) |
LLVM_DRAGONEGG_PLUGIN | not supported (clang only) |
This project, gllvm, provides tools for building whole-program (or
whole-library) LLVM bitcode files from an unmodified C or C++
source package. It currently runs on *nix platforms such as Linux,
FreeBSD, and Mac OS X. It is a Go port of wllvm.
gllvm provides compiler wrappers that work in two
phases. The wrappers first invoke the compiler as normal. Then, for
each object file, they call a bitcode compiler to produce LLVM
bitcode. The wrappers then store the location of the generated bitcode
file in a dedicated section of the object file. When object files are
linked together, the contents of the dedicated sections are
concatenated (so we don't lose the locations of any of the constituent
bitcode files). After the build completes, one can use a gllvm
utility to read the contents of the dedicated section and link all of
the bitcode into a single whole-program bitcode file. This utility
works for both executables and native libraries.
For more details see wllvm.
Prerequisites
To install gllvm you need the Go language tools.
To use gllvm you need clang/clang++ and the LLVM tools llvm-link and llvm-ar.
gllvm is agnostic to the actual LLVM version. gllvm also relies on standard build
tools such as objcopy and ld.
Installation
To install, simply do
go get github.com/SRI-CSL/gllvm/cmd/...
This should install four binaries: gclang, gclang++, get-bc, and gsanity-check
in the $GOPATH/bin directory.
Usage
gclang and gclang++ are the wrappers used to compile C and C++. get-bc is used for
extracting the bitcode from a build product (either an object file, executable, library
or archive). gsanity-check can be used for detecting configuration errors.
Here is a simple example. Assuming that clang is in your PATH, you can build
bitcode for pkg-config as follows:
tar xf pkg-config-0.26.tar.gz
cd pkg-config-0.26
CC=gclang ./configure
make
This should produce the executable pkg-config. To extract the bitcode:
get-bc pkg-config
which will produce the bitcode module pkg-config.bc.
If clang and the LLVM tools are not in your PATH, you will need to set some
environment variables.
- LLVM_COMPILER_PATH can be set to the absolute path of the directory that contains the compiler and the other LLVM tools to be used.
- LLVM_CC_NAME can be set if your clang compiler is not called clang but something like clang-3.7. Similarly, LLVM_CXX_NAME can be used to describe what the C++ compiler is called. We also pay attention to the environment variables LLVM_LINK_NAME and LLVM_AR_NAME in an analogous way.
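
As an illustration of how these variables combine, here is a minimal Go sketch. It is not gllvm's actual code: the helper resolveTool is hypothetical, and the default tool names (clang, clang++, llvm-link, llvm-ar) are assumptions based on the tools listed under Prerequisites.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveTool is a hypothetical helper (not part of gllvm's API). It uses
// the override name if one is set, otherwise the default name, and prefixes
// it with dir when dir is non-empty. With an empty dir the bare name is
// returned, which leaves the lookup to PATH.
func resolveTool(dir, override, defaultName string) string {
	name := defaultName
	if override != "" {
		name = override
	}
	if dir == "" {
		return name
	}
	return filepath.Join(dir, name)
}

func main() {
	dir := os.Getenv("LLVM_COMPILER_PATH")
	// The default names are assumptions made for this sketch.
	tools := []struct{ envVar, defaultName string }{
		{"LLVM_CC_NAME", "clang"},
		{"LLVM_CXX_NAME", "clang++"},
		{"LLVM_LINK_NAME", "llvm-link"},
		{"LLVM_AR_NAME", "llvm-ar"},
	}
	for _, t := range tools {
		fmt.Printf("%-15s -> %s\n", t.envVar, resolveTool(dir, os.Getenv(t.envVar), t.defaultName))
	}
}
```

Running it with LLVM_COMPILER_PATH and LLVM_CC_NAME set shows which compiler path such a scheme would pick; gsanity-check remains the authoritative way to check a real configuration.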
Another useful environment variable is WLLVM_CONFIGURE_ONLY. Its use is explained in the
README of wllvm.
gllvm does not support the dragonegg plugin. All other features of wllvm,
such as logging and the bitcode store, are supported in exactly the same fashion.
Under the hood
Both the wllvm and gllvm toolsets do much the same thing, but the way they do it is
slightly different. The gllvm toolset's code base is written in Go, and is largely
derived from wllvm's Python code base.
Both generate object files and bitcode files using the compiler. wllvm
can use gcc and dragonegg, while gllvm can only use clang. The gllvm toolset
does these two tasks in parallel, while wllvm does them sequentially.
This, together with the slowness of Python's fork/exec and its
interpreted nature, accounts for the large efficiency gap between the
two toolsets.
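
The parallel step can be pictured with a short Go sketch. This is an illustration under stated assumptions, not the code gllvm uses: runBoth is a hypothetical name, and the two functions passed in main are stand-ins for the real compiler invocations.

```go
package main

import (
	"fmt"
	"sync"
)

// runBoth starts the two build steps concurrently and waits for both to
// finish, returning each step's error separately.
func runBoth(buildObject, buildBitcode func() error) (objErr, bcErr error) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		objErr = buildObject()
	}()
	go func() {
		defer wg.Done()
		bcErr = buildBitcode()
	}()
	wg.Wait()
	return objErr, bcErr
}

func main() {
	// Stand-ins for the two compiler invocations; a real wrapper would
	// run the compiler here (for example via os/exec).
	objErr, bcErr := runBoth(
		func() error { fmt.Println("building object file"); return nil },
		func() error { fmt.Println("building bitcode file"); return nil },
	)
	fmt.Println("object error:", objErr, "bitcode error:", bcErr)
}
```

Keeping the two errors separate lets a wrapper treat a failed object build as fatal while handling a failed bitcode build differently; whether gllvm draws that distinction is not stated here.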
Both inject the path of the bitcode version of the .o file into a
dedicated segment of the .o file itself. This segment is the same across
toolsets, so extracting the bitcode can be done by the appropriate
tool in either toolset. On *nix both toolsets use objcopy to add the
segment, while on OS X they use ld.
When the object files are linked into the resulting library or
executable, the bitcode path segments are appended, so the resulting
binary contains the paths of all the bitcode files that constitute the
binary. To extract the sections the gllvm toolset uses the Go
packages "debug/elf" and "debug/macho", while the wllvm toolset uses
objdump on *nix, and otool on OS X.
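
For the ELF case, reading such a section with "debug/elf" might look like the sketch below. The section name .llvm_bc and the one-path-per-line layout are assumptions made for this example, and the helper names are hypothetical; consult the gllvm sources for what the toolset actually does.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// sectionName is an assumption made for this sketch; check the toolset's
// sources for the name it really uses.
const sectionName = ".llvm_bc"

// splitPaths turns raw section contents into a list of paths, assuming one
// path per line. Blank entries and NUL padding are dropped.
func splitPaths(data []byte) []string {
	var paths []string
	for _, line := range strings.Split(string(data), "\n") {
		line = strings.Trim(line, "\x00 \t\r")
		if line != "" {
			paths = append(paths, line)
		}
	}
	return paths
}

// readBitcodePaths opens an ELF file and returns the paths stored in the
// named section.
func readBitcodePaths(binary, section string) ([]string, error) {
	f, err := elf.Open(binary)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	sec := f.Section(section)
	if sec == nil {
		return nil, fmt.Errorf("%s: no section named %s", binary, section)
	}
	data, err := sec.Data()
	if err != nil {
		return nil, err
	}
	return splitPaths(data), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: readpaths <elf-file>")
		return
	}
	paths, err := readBitcodePaths(os.Args[1], sectionName)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Reading the binary in-process like this avoids spawning a helper such as objdump, which is one plausible reason for the design difference described above.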
Both tools then use llvm-link or llvm-ar to combine the bitcode files
into the desired form.
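
The final linking step can be sketched as follows, for the llvm-link case only. The helper linkArgs is hypothetical, and the tool name is hard-coded here as a simplification; the sketch only relies on llvm-link accepting -o followed by the input bitcode files.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// linkArgs builds the argument list for llvm-link: the output module
// followed by every input bitcode file, in order.
func linkArgs(output string, bitcodeFiles []string) []string {
	args := []string{"-o", output}
	return append(args, bitcodeFiles...)
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: linkbc <output.bc> <input.bc>...")
		return
	}
	// Hard-coded tool name: a simplification made for this sketch.
	cmd := exec.Command("llvm-link", linkArgs(os.Args[1], os.Args[2:])...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "llvm-link failed:", err)
		os.Exit(1)
	}
}
```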
License
gllvm is released under a BSD license. See the file LICENSE for details.
This material is based upon work supported by the National Science Foundation under Grant ACI-1440800. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.