# A Concurrent WLLVM in Go
[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Build Status](https://travis-ci.org/SRI-CSL/gllvm.svg?branch=master)](https://travis-ci.org/SRI-CSL/gllvm)
[![Go Report Card](https://goreportcard.com/badge/github.com/SRI-CSL/gllvm)](https://goreportcard.com/report/github.com/SRI-CSL/gllvm)

**TL;DR:** A drop-in replacement for [wllvm](https://github.com/SRI-CSL/whole-program-llvm) that builds the
bitcode in parallel and is faster. For a comparison of the two tools, see the [Linux kernel example](https://github.com/SRI-CSL/gllvm/tree/master/examples/linux-kernel).

## Quick Start Comparison Table
| wllvm command/env variable | gllvm command/env variable |
|-----------------------------|-----------------------------|
| wllvm | gclang |
| wllvm++ | gclang++ |
| extract-bc | get-bc |
| wllvm-sanity-checker | gsanity-check |
| LLVM_COMPILER_PATH | LLVM_COMPILER_PATH |
| LLVM_CC_NAME ... | LLVM_CC_NAME ... |
| WLLVM_CONFIGURE_ONLY | WLLVM_CONFIGURE_ONLY |
| WLLVM_OUTPUT_LEVEL | WLLVM_OUTPUT_LEVEL |
| WLLVM_OUTPUT_FILE | WLLVM_OUTPUT_FILE |
| LLVM_COMPILER | *not supported* (clang only)|
| LLVM_GCC_PREFIX | *not supported* (clang only)|
| LLVM_DRAGONEGG_PLUGIN | *not supported* (clang only)|

This project, `gllvm`, provides tools for building whole-program (or
whole-library) LLVM bitcode files from an unmodified C or C++
source package. It currently runs on `*nix` platforms such as Linux,
FreeBSD, and Mac OS X. It is a Go port of [wllvm](https://github.com/SRI-CSL/whole-program-llvm).

`gllvm` provides compiler wrappers that work in two
phases. The wrappers first invoke the compiler as normal. Then, for
each object file, they call a bitcode compiler to produce LLVM
bitcode. The wrappers then store the location of the generated bitcode
file in a dedicated section of the object file. When object files are
linked together, the contents of the dedicated sections are
concatenated (so we don't lose the locations of any of the constituent
bitcode files). After the build completes, one can use a `gllvm`
utility to read the contents of the dedicated section and link all of
the bitcode into a single whole-program bitcode file. This utility
works for both executables and native libraries.

For more details, see [wllvm](https://github.com/SRI-CSL/whole-program-llvm).

## Prerequisites

To install `gllvm` you need the [Go language tools](https://golang.org/doc/install).

To use `gllvm` you need `clang`/`clang++` and the LLVM tools `llvm-link` and `llvm-ar`.
`gllvm` is agnostic to the actual LLVM version. It also relies on standard build
tools such as `objcopy` and `ld`.

## Installation

To install, simply run:
```
go get github.com/SRI-CSL/gllvm/cmd/...
```
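This should install four binaries, `gclang`, `gclang++`, `get-bc`, and `gsanity-check`,
in the `$GOPATH/bin` directory. Go 1.18 and later no longer install binaries with `go get`;
with those releases, use `go install github.com/SRI-CSL/gllvm/cmd/...@latest` instead.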
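
If the shell cannot find the installed binaries, `$GOPATH/bin` may need to be added to your `PATH`. The following is a minimal sketch, assuming `GOPATH` names a single directory and falling back to Go's default of `$HOME/go` when it is unset. The variable name `gobin` is only illustrative.

```shell
# Fall back to Go's default GOPATH ($HOME/go) if GOPATH is not set.
gobin="${GOPATH:-$HOME/go}/bin"
export PATH="$PATH:$gobin"

# Report which of the four gllvm tools are now visible on the PATH.
for tool in gclang gclang++ get-bc gsanity-check; do
    command -v "$tool" || echo "$tool not found in PATH"
done
```

To make the change permanent, the `export` line would typically go in your shell's startup file.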
## Usage

`gclang` and
`gclang++` are the wrappers used to compile C and C++. `get-bc` is used for
extracting the bitcode from a build product (an object file, executable, library,
or archive). `gsanity-check` can be used for detecting configuration errors.

Here is a simple example. Assuming that clang is in your `PATH`, you can build
bitcode for `pkg-config` as follows:
```
tar xf pkg-config-0.26.tar.gz
cd pkg-config-0.26
CC=gclang ./configure
make
```
This should produce the executable `pkg-config`. To extract the bitcode:
```
get-bc pkg-config
```
which will produce the bitcode module `pkg-config.bc`.

If clang and the LLVM tools are not in your `PATH`, you will need to set some
environment variables.

* `LLVM_COMPILER_PATH` can be set to the absolute path of the directory that
  contains the compiler and the other LLVM tools to be used.

* `LLVM_CC_NAME` can be set if your clang compiler is not called `clang` but
  something like `clang-3.7`. Similarly, `LLVM_CXX_NAME` can be used to
  describe what the C++ compiler is called. The environment variables
  `LLVM_LINK_NAME` and `LLVM_AR_NAME` are honored in an analogous way.

Another useful environment variable is `WLLVM_CONFIGURE_ONLY`. Its use is explained in the
README of [wllvm](https://github.com/SRI-CSL/whole-program-llvm).

`gllvm` does not support the dragonegg plugin. All other features of `wllvm`, such as logging and the bitcode store,
are supported in exactly the same fashion.
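
As a rough sketch, the environment variables described above might be combined as follows for a toolchain that is not on the `PATH`. The directory `/opt/llvm-3.7/bin` and the versioned tool names are hypothetical; substitute whatever your installation actually uses.

```shell
# Hypothetical install location and tool names; adjust to your system.
export LLVM_COMPILER_PATH=/opt/llvm-3.7/bin
export LLVM_CC_NAME=clang-3.7
export LLVM_CXX_NAME=clang++-3.7
export LLVM_LINK_NAME=llvm-link-3.7
export LLVM_AR_NAME=llvm-ar-3.7

# Check the configuration, but only if the gllvm tools are installed.
if command -v gsanity-check >/dev/null 2>&1; then
    gsanity-check
fi
```

Running `gsanity-check` after changing any of these variables is a quick way to catch a misspelled name or a wrong directory before starting a long build.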
## Under the Hood

Both the `wllvm` and `gllvm` toolsets do much the same thing, but the way
they do it is slightly different. The `gllvm` toolset's code base is
written in Go, and is largely derived from `wllvm`'s Python
code base.

Both generate object files and bitcode files using the
compiler. `wllvm` can use `gcc` and `dragonegg`, while `gllvm` can only use
`clang`. The `gllvm` toolset does these two tasks in parallel, while
`wllvm` does them sequentially. This, together with the slowness of
Python's `fork`/`exec` calls and its interpreted nature, accounts for the
large efficiency gap between the two toolsets.
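
The difference between the two strategies can be illustrated with a small shell sketch. This is not how either toolset is implemented (`gllvm` does this in Go); the function names `build_sequential` and `build_parallel` are made up for illustration, and the sketch assumes the compiler is `clang` unless `LLVM_CC_NAME` is set.

```shell
# Compiler to use; falls back to plain clang (an assumption of this sketch).
cc="${LLVM_CC_NAME:-clang}"

# Sequential: produce the object file, then the bitcode file.
build_sequential() {
    src=$1
    "$cc" -c "$src" -o "${src%.c}.o" &&
    "$cc" -emit-llvm -c "$src" -o "${src%.c}.bc"
}

# Parallel: start both compilations as background jobs, then wait for each.
build_parallel() {
    src=$1
    "$cc" -c "$src" -o "${src%.c}.o" &
    obj_pid=$!
    "$cc" -emit-llvm -c "$src" -o "${src%.c}.bc" &
    bc_pid=$!
    obj_status=0
    bc_status=0
    wait "$obj_pid" || obj_status=$?
    wait "$bc_pid" || bc_status=$?
    # Succeed only if both compilations succeeded.
    [ "$obj_status" -eq 0 ] && [ "$bc_status" -eq 0 ]
}
```

Calling `build_parallel foo.c` would leave `foo.o` and `foo.bc` next to the source file, with the two compiler invocations overlapping in time.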

Both inject the path of the bitcode version of the `.o` file into a
dedicated section of the `.o` file itself. This section is the same
across toolsets, so extracting the bitcode can be done by the
appropriate tool in either toolset. On OS X both toolsets use `ld`
to add the section, while on other `*nix` platforms they use `objcopy`.
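
On an ELF platform, the injection step can be approximated by hand with `objcopy`. The sketch below is an illustration, not the toolsets' actual code: the section name `.llvm_bc`, the helper names `embed_bc_path` and `read_bc_paths`, and the one-path-per-line layout are all assumptions made here.

```shell
# Assumed section name; check the wllvm/gllvm sources for the real one.
section=.llvm_bc

# Store the absolute path of a bitcode file in a new section of an object file.
embed_bc_path() {
    obj=$1
    bc=$2
    pathfile=$(mktemp) || return 1
    # Resolve the bitcode file's directory so an absolute path is recorded.
    printf '%s\n' "$(cd "$(dirname "$bc")" && pwd)/$(basename "$bc")" > "$pathfile"
    objcopy --add-section "$section=$pathfile" "$obj"
    status=$?
    rm -f "$pathfile"
    return "$status"
}

# Dump the strings stored in that section of an object file or linked binary.
read_bc_paths() {
    readelf -p "$section" "$1"
}
```

For example, `embed_bc_path foo.o foo.bc` followed by `read_bc_paths foo.o` should show the recorded path of `foo.bc`.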

When the object files are linked into the resulting library or
executable, the bitcode path sections are concatenated, so the resulting
binary contains the paths of all the bitcode files that constitute the
binary. To extract the sections, the `gllvm` toolset uses the Go
packages `"debug/elf"` and `"debug/macho"`, while the `wllvm` toolset
uses `objdump` on `*nix` and `otool` on OS X.

Both tools then use `llvm-link` or `llvm-ar` to combine the bitcode
files into the desired form.

## License

`gllvm` is released under a BSD 3-Clause license. See the file [LICENSE](LICENSE) for details.

---

This material is based upon work supported by the National Science
Foundation under Grant
[ACI-1440800](http://www.nsf.gov/awardsearch/showAward?AWD_ID=1440800). Any
opinions, findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily
reflect the views of the National Science Foundation.