# Go Whole Program LLVM

[![Build Status](https://travis-ci.org/SRI-CSL/gllvm.svg?branch=master)](https://travis-ci.org/SRI-CSL/gllvm)
[![Go Report Card](https://goreportcard.com/badge/github.com/SRI-CSL/gllvm)](https://goreportcard.com/report/github.com/SRI-CSL/gllvm)

## Overview

This project, gllvm, provides tools for building whole-program (or
whole-library) LLVM bitcode files from an unmodified C or C++
source package. It currently runs on `*nix` platforms such as Linux,
FreeBSD, and Mac OS X. It is a Go port of [wllvm](https://github.com/SRI-CSL/whole-program-llvm).

gllvm provides compiler wrappers that work in two
steps. The wrappers first invoke the compiler as normal. Then, for
each object file, they call a bitcode compiler to produce LLVM
bitcode. The wrappers also store the location of the generated bitcode
file in a dedicated section of the object file. When object files are
linked together, the contents of the dedicated sections are
concatenated (so we don't lose the locations of any of the constituent
bitcode files). After the build completes, one can use a gllvm
utility to read the contents of the dedicated section and link all of
the bitcode into a single whole-program bitcode file. This utility
works for both executables and native libraries.

This two-phase build process is necessary to be a drop-in replacement
for gcc or g++ in any build system. Using the LTO framework in gcc
and the gold linker plugin works in many cases, but fails in the
presence of static libraries in builds. gllvm's approach has the
distinct advantage of generating working binaries, in case some part
of a build process requires that.

gllvm currently works with clang.

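The two compiler invocations described in the overview can be reproduced by hand for a single file. This is only a sketch, not what the wrappers literally run: `hello.c` and the output names are made up for the illustration, and the step that records the bitcode location in a section of the object file is left out.

```shell
# Illustration only: hello.c and the output file names are hypothetical.
workdir="$(mktemp -d)"
printf 'int main(void) { return 0; }\n' > "$workdir/hello.c"

if command -v clang >/dev/null 2>&1; then
  # Step 1: the ordinary compile that the build system asked for.
  clang -c "$workdir/hello.c" -o "$workdir/hello.o"
  # Step 2: an extra compile of the same source file to LLVM bitcode.
  clang -emit-llvm -c "$workdir/hello.c" -o "$workdir/hello.bc"
  ls "$workdir"
else
  echo "clang not found; skipping the sketch" >&2
fi
```

The wrappers make this happen for every object file without any change to the build system.
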
## Installation

#### Requirements

You need the Go compiler to compile gllvm, and both the clang/clang++
executables and the LLVM tools `llvm-link` and `llvm-ar` to use gllvm. To install Go, follow
the instructions at https://golang.org/doc/install.

gllvm also depends on platform tools for inspecting build products, namely
`objdump` and `otool` (the latter on Mac OS X).

In what follows, `$GOPATH` denotes your Go workspace path, which you can obtain by
typing `go env GOPATH` in a terminal session; it is usually `$HOME/go`
by default. Note that a standard Go installation installs
the binaries generated for the project under `$GOPATH/bin`. Make sure that you
have added the `$GOPATH/bin` directory to your `$PATH` variable.

#### Build

FIXME: this section needs to be rewritten to use commands like these:
```
go get github.com/SRI-CSL/gllvm/cmd/gclang
go get github.com/SRI-CSL/gllvm/cmd/gclang++
go get github.com/SRI-CSL/gllvm/cmd/get-bc
```

First, check out the project under the directory `$GOPATH/src`:
```
cd $GOPATH/src
git clone https://github.com/SRI-CSL/gllvm
```

To build and install gllvm on your system, type the following from inside the cloned `gllvm` directory:
```
make install
```

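After installing, it can help to confirm that the required tools and the gllvm binaries are reachable through `PATH`. The sketch below uses a hypothetical helper, `check_tools`, which is not part of gllvm; it only reports which of the named commands the shell can find.

```shell
# check_tools is a hypothetical helper for this README, not a gllvm command.
# It prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

check_tools go clang clang++ llvm-link llvm-ar gclang gclang++ get-bc \
  || echo "some tools are missing; check your installation and PATH"
```

If your distribution installs the LLVM tools under suffixed names, a tool reported as missing here may still be usable through the environment variables described under Usage.
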
## Usage

gllvm includes three symlinks to the program's binary: `gclang` and
`gclang++` to compile C and C++, and an auxiliary tool `get-bc` for
extracting the bitcode from a build product (object file, executable, library
or archive).

Some useful environment variables are listed here:

* `GLLVM_CC_NAME` can be set if your clang compiler is not called `clang` but
  something like `clang-3.7`. Similarly, `GLLVM_CXX_NAME` can be used to
  describe what the C++ compiler is called. We also pay attention to the
  environment variables `GLLVM_LINK_NAME` and `GLLVM_AR_NAME` in an
  analogous way, since `llvm-link` and `llvm-ar` too get adorned with suffixes
  in various Linux distributions.

* `GLLVM_TOOLS_PATH` can be set to the absolute path to the folder that
  contains the compiler and other LLVM tools such as `llvm-link` to be used.
  This prevents searching for the compiler in your PATH environment variable.
  This can be useful if you have different versions of clang on your system
  and you want to easily switch compilers without tinkering with your PATH
  variable.
  Example: `GLLVM_TOOLS_PATH=/home/user/llvm_and_clang/Debug+Asserts/bin`.

* `GLLVM_CONFIGURE_ONLY` can be set to anything. If it is set, `gclang`
  and `gclang++` behave like a normal C or C++ compiler. They do not
  produce bitcode. Setting `GLLVM_CONFIGURE_ONLY` may prevent configuration
  errors caused by the unexpected production of hidden bitcode files. It is
  sometimes required when configuring a build.

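As a sketch of the naming variables in use, the following exports them for a system whose tools carry a version suffix. Only `clang-3.7` comes from the text above; the other suffixed names are assumptions and should be replaced by whatever your distribution actually installs.

```shell
# Assumed tool names: only clang-3.7 is taken from the README text,
# the other three are illustrative guesses at how a distribution names them.
export GLLVM_CC_NAME=clang-3.7
export GLLVM_CXX_NAME=clang++-3.7
export GLLVM_LINK_NAME=llvm-link-3.7
export GLLVM_AR_NAME=llvm-ar-3.7

# Show what the wrappers will see in their environment.
env | grep '^GLLVM_' | sort
```

Putting these lines in a shell startup file or a small script to source keeps the settings out of individual build commands.
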
## Examples

### Building a bitcode module with clang

```
tar xf pkg-config-0.26.tar.gz
cd pkg-config-0.26
CC=gclang ./configure
make
```

This should produce the executable `pkg-config`. To extract the bitcode:
```
get-bc pkg-config
```

which will produce the bitcode module `pkg-config.bc`.

### Building a bitcode archive

```
tar -xvf bullet-2.81-rev2613.tgz
mkdir bullet-bin
cd bullet-bin
CC=gclang CXX=gclang++ cmake ../bullet-2.81-rev2613/
make

# Produces src/LinearMath/libLinearMath.bca
get-bc src/LinearMath/libLinearMath.a
```

Note that by default extracting bitcode from an archive produces an archive of
bitcode. You can also extract the bitcode directly into a module:
```
get-bc -b src/LinearMath/libLinearMath.a
```
This produces `src/LinearMath/libLinearMath.a.bc`.

### Configuring without building bitcode

Sometimes it is necessary to disable the production of bitcode. Typically this
is during configuration, where the production of unexpected files can confuse
the configure script. For this, set the environment variable `GLLVM_CONFIGURE_ONLY`
for the configure step only, as follows:
```
GLLVM_CONFIGURE_ONLY=1 CC=gclang ./configure
CC=gclang make
```

### Building a bitcode archive then extracting the bitcode

```
tar xvfz jansson-2.7.tar.gz
cd jansson-2.7
CC=gclang ./configure
make
mkdir bitcode
cp src/.libs/libjansson.a bitcode
cd bitcode
get-bc libjansson.a
llvm-ar x libjansson.bca
ls -la
```

## Miscellaneous Features

### Preserving bitcode files in a store

Sometimes it can be useful to preserve the bitcode files produced in a
build, either to prevent deletion or to retrieve them later. If the
environment variable `GLLVM_BC_STORE` is set to the absolute path of
an existing directory, then gllvm will copy the produced bitcode files
into that directory. The name of a copied bitcode file is the hash of the path
to the original bitcode file. For convenience, when using both the manifest
feature of `get-bc` and the store, the manifest will contain both the
original path, and the store path.

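Since the store must be an existing directory named by an absolute path, it has to be created before the build starts. A minimal sketch, using a temporary directory purely for illustration (any existing directory you control would do):

```shell
# mktemp -d creates a fresh directory and prints its path, which is
# absolute under the usual TMPDIR settings. The choice of a temporary
# directory is an assumption of this sketch, not a gllvm requirement.
GLLVM_BC_STORE="$(mktemp -d)"
export GLLVM_BC_STORE
echo "bitcode store: $GLLVM_BC_STORE"
```

Builds started from this shell with `gclang` or `gclang++` then see the variable in their environment.
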
### Debugging

The gllvm tools can show various levels of output to aid with debugging.
To show this output, set the `GLLVM_OUTPUT_LEVEL` environment
variable to one of the following levels:

* `ERROR`
* `WARNING`
* `INFO`
* `DEBUG`

For example:
```
export GLLVM_OUTPUT_LEVEL=DEBUG
```
Output will be directed to the standard error stream, unless you specify the
path of a logfile via the `GLLVM_OUTPUT_FILE` environment variable.

For example:
```
export GLLVM_OUTPUT_FILE=/tmp/gllvm.log
```

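A mistyped level name is easy to miss, so a small guard around the export can be convenient. The helper below, `set_gllvm_output_level`, is hypothetical and not part of gllvm; it accepts only the four level names listed above, spelled exactly as shown.

```shell
# set_gllvm_output_level is a hypothetical convenience function.
# It exports GLLVM_OUTPUT_LEVEL only for one of the documented names
# and leaves the variable untouched otherwise.
set_gllvm_output_level() {
  case "$1" in
    ERROR|WARNING|INFO|DEBUG)
      export GLLVM_OUTPUT_LEVEL="$1"
      ;;
    *)
      echo "unknown output level: $1" >&2
      return 1
      ;;
  esac
}

set_gllvm_output_level INFO
export GLLVM_OUTPUT_FILE=/tmp/gllvm.log
```
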