gllvm/README.md
Ian A Mason 40683f0ab7
Update README.md
`tha` should be `the`
2019-10-23 09:41:13 -07:00

6.7 KiB

A Concurrent WLLVM in Go

License Build Status Go Report Card

TL; DR: A drop-in replacement for wllvm, that builds the bitcode in parallel, and is faster. A comparison between the two tools can be gleaned from building the Linux kernel.

Quick Start Comparison Table

wllvm command/env variable gllvm command/env variable
wllvm gclang
wllvm++ gclang++
extract-bc get-bc
wllvm-sanity-checker gsanity-check
LLVM_COMPILER_PATH LLVM_COMPILER_PATH
LLVM_CC_NAME ... LLVM_CC_NAME ...
WLLVM_CONFIGURE_ONLY WLLVM_CONFIGURE_ONLY
WLLVM_OUTPUT_LEVEL WLLVM_OUTPUT_LEVEL
WLLVM_OUTPUT_FILE WLLVM_OUTPUT_FILE
LLVM_COMPILER not supported (clang only)
LLVM_GCC_PREFIX not supported (clang only)
LLVM_DRAGONEGG_PLUGIN not supported (clang only)

This project, gllvm, provides tools for building whole-program (or whole-library) LLVM bitcode files from an unmodified C or C++ source package. It currently runs on *nix platforms such as Linux, FreeBSD, and Mac OS X. It is a Go port of wllvm.

gllvm provides compiler wrappers that work in two phases. The wrappers first invoke the compiler as normal. Then, for each object file, they call a bitcode compiler to produce LLVM bitcode. The wrappers then store the location of the generated bitcode file in a dedicated section of the object file. When object files are linked together, the contents of the dedicated sections are concatenated (so we don't lose the locations of any of the constituent bitcode files). After the build completes, one can use a gllvm utility to read the contents of the dedicated section and link all of the bitcode into a single whole-program bitcode file. This utility works for both executable and native libraries.

For more details see wllvm.

Prerequisites

To install gllvm you need the go language tool.

To use gllvm you need clang/clang++ and the llvm tools llvm-link and llvm-ar. gllvm is agnostic to the actual llvm version. gllvm also relies on standard build tools such as objcopy and ld.

Installation

To install, simply do

go get github.com/SRI-CSL/gllvm/cmd/...

This should install four binaries: gclang, gclang++, get-bc, and gsanity-check in the $GOPATH/bin directory.

Usage

gclang and gclang++ are the wrappers used to compile C and C++. get-bc is used for extracting the bitcode from a build product (either an object file, executable, library or archive). gsanity-check can be used for detecting configuration errors.

Here is a simple example. Assuming that clang is in your PATH, you can build bitcode for pkg-config as follows:

tar xf pkg-config-0.26.tar.gz
cd pkg-config-0.26
CC=gclang ./configure
make

This should produce the executable pkg-config. To extract the bitcode:

get-bc pkg-config

which will produce the bitcode module pkg-config.bc.

If clang and the llvm tools are not in your PATH, you will need to set some environment variables.

  • LLVM_COMPILER_PATH can be set to the absolute path of the directory that contains the compiler and the other LLVM tools to be used.

  • LLVM_CC_NAME can be set if your clang compiler is not called clang but something like clang-3.7. Similarly LLVM_CXX_NAME can be used to describe what the C++ compiler is called. We also pay attention to the environment variables LLVM_LINK_NAME and LLVM_AR_NAME in an analogous way.

Another useful environment variable is WLLVM_CONFIGURE_ONLY. Its use is explained in the README of wllvm.

gllvm does not support the dragonegg plugin. All other features of wllvm, such as logging, and the bitcode store, are supported in exactly the same fashion as documented here.

Under the hoods

Both wllvm and gllvm toolsets do much the same thing, but the way they do it is slightly different. The gllvm toolset's code base is written in golang, and is largely derived from the wllvm's python codebase.

Both generate object files and bitcode files using the compiler. wllvm can use gcc and dragonegg, gllvm can only use clang. The gllvm toolset does these two tasks in parallel, while wllvm does them sequentially. This together with the slowness of python's fork exec-ing, and it's interpreted nature accounts for the large efficiency gap between the two toolsets.

Both inject the path of the bitcode version of the .o file into a dedicated segment of the .o file itself. This segment is the same across toolsets, so extracting the bitcode can be done by the appropriate tool in either toolset. On *nix both toolsets use objcopy to add the segment, while on OS X they use ld.

When the object files are linked into the resulting library or executable, the bitcode path segments are appended, so the resulting binary contains the paths of all the bitcode files that constitute the binary. To extract the sections the gllvm toolset uses the golang packages "debug/elf" and "debug/macho", while the wllvm toolset uses objdump on *nix, and otool on OS X.

Both tools then use llvm-link or llvm-ar to combine the bitcode files into the desired form.

Customization under the hood.

You can specify the exact version of objcopy and ld that gllvm uses to manipulate the artifacts by setting the GLLVM_OBJCOPY and GLLVM_LD environment variables.

License

gllvm is released under a BSD license. See the file LICENSE for details.


This material is based upon work supported by the National Science Foundation under Grant ACI-1440800. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.