New Kaleidoscope chapter: Creating object files

This new chapter describes compiling LLVM IR to object files.

The new chapter is chapter 8, so later chapters have been renumbered.
Since this brings us to 10 chapters total, I've also renamed
the other chapters to use two-digit numbering.

Differential Revision: http://reviews.llvm.org/D18070

llvm-svn: 274441
Wilfred Hughes 2016-07-02 17:01:59 +00:00
parent f2db01c626
commit 945f43e94b
16 changed files with 1803 additions and 358 deletions

View File

@ -42,45 +42,48 @@ in the various pieces. The structure of the tutorial is:
to implement everything in C++ instead of using lexer and parser
generators. LLVM obviously works just fine with such tools, feel free
to use one if you prefer.
- `Chapter #2 <LangImpl2.html>`_: Implementing a Parser and AST -
- `Chapter #2 <LangImpl02.html>`_: Implementing a Parser and AST -
With the lexer in place, we can talk about parsing techniques and
basic AST construction. This tutorial describes recursive descent
parsing and operator precedence parsing. Nothing in Chapters 1 or 2
is LLVM-specific, the code doesn't even link in LLVM at this point.
:)
- `Chapter #3 <LangImpl3.html>`_: Code generation to LLVM IR - With
- `Chapter #3 <LangImpl03.html>`_: Code generation to LLVM IR - With
the AST ready, we can show off how easy generation of LLVM IR really
is.
- `Chapter #4 <LangImpl4.html>`_: Adding JIT and Optimizer Support
- `Chapter #4 <LangImpl04.html>`_: Adding JIT and Optimizer Support
- Because a lot of people are interested in using LLVM as a JIT,
we'll dive right into it and show you the 3 lines it takes to add JIT
support. LLVM is also useful in many other ways, but this is one
simple and "sexy" way to show off its power. :)
- `Chapter #5 <LangImpl5.html>`_: Extending the Language: Control
- `Chapter #5 <LangImpl05.html>`_: Extending the Language: Control
Flow - With the language up and running, we show how to extend it
with control flow operations (if/then/else and a 'for' loop). This
gives us a chance to talk about simple SSA construction and control
flow.
- `Chapter #6 <LangImpl6.html>`_: Extending the Language:
- `Chapter #6 <LangImpl06.html>`_: Extending the Language:
User-defined Operators - This is a silly but fun chapter that talks
about extending the language to let the user program define their own
arbitrary unary and binary operators (with assignable precedence!).
This lets us build a significant piece of the "language" as library
routines.
- `Chapter #7 <LangImpl7.html>`_: Extending the Language: Mutable
- `Chapter #7 <LangImpl07.html>`_: Extending the Language: Mutable
Variables - This chapter talks about adding user-defined local
variables along with an assignment operator. The interesting part
about this is how easy and trivial it is to construct SSA form in
LLVM: no, LLVM does *not* require your front-end to construct SSA
form!
- `Chapter #8 <LangImpl8.html>`_: Extending the Language: Debug
- `Chapter #8 <LangImpl08.html>`_: Compiling to Object Files - This
chapter explains how to take LLVM IR and compile it down to object
files.
- `Chapter #9 <LangImpl09.html>`_: Extending the Language: Debug
Information - Having built a decent little programming language with
control flow, functions and mutable variables, we consider what it
takes to add debug information to standalone executables. This debug
information will allow you to set breakpoints in Kaleidoscope
functions, print out argument variables, and call functions - all
from within the debugger!
- `Chapter #9 <LangImpl9.html>`_: Conclusion and other useful LLVM
- `Chapter #10 <LangImpl10.html>`_: Conclusion and other useful LLVM
tidbits - This chapter wraps up the series by talking about
potential ways to extend the language, but also includes a bunch of
pointers to info about "special topics" like adding garbage
@ -146,7 +149,7 @@ useful for mutually recursive functions). For example:
A more interesting example is included in Chapter 6 where we write a
little Kaleidoscope application that `displays a Mandelbrot
Set <LangImpl6.html#kicking-the-tires>`_ at various levels of magnification.
Set <LangImpl06.html#kicking-the-tires>`_ at various levels of magnification.
Let's dive into the implementation of this language!
@ -280,11 +283,11 @@ file. These are handled with this code:
}
With this, we have the complete lexer for the basic Kaleidoscope
language (the `full code listing <LangImpl2.html#full-code-listing>`_ for the Lexer
is available in the `next chapter <LangImpl2.html>`_ of the tutorial).
language (the `full code listing <LangImpl02.html#full-code-listing>`_ for the Lexer
is available in the `next chapter <LangImpl02.html>`_ of the tutorial).
Next we'll `build a simple parser that uses this to build an Abstract
Syntax Tree <LangImpl2.html>`_. When we have that, we'll include a
Syntax Tree <LangImpl02.html>`_. When we have that, we'll include a
driver so that you can use the lexer and parser together.
`Next: Implementing a Parser and AST <LangImpl2.html>`_
`Next: Implementing a Parser and AST <LangImpl02.html>`_

View File

@ -731,5 +731,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter2/toy.cpp
:language: c++
`Next: Implementing Code Generation to LLVM IR <LangImpl3.html>`_
`Next: Implementing Code Generation to LLVM IR <LangImpl03.html>`_

View File

@ -563,5 +563,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter3/toy.cpp
:language: c++
`Next: Adding JIT and Optimizer Support <LangImpl4.html>`_
`Next: Adding JIT and Optimizer Support <LangImpl04.html>`_

View File

@ -606,5 +606,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter4/toy.cpp
:language: c++
`Next: Extending the language: control flow <LangImpl5.html>`_
`Next: Extending the language: control flow <LangImpl05.html>`_

View File

@ -217,7 +217,7 @@ IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
see this graph:
.. figure:: LangImpl5-cfg.png
.. figure:: LangImpl05-cfg.png
:align: center
:alt: Example CFG
@ -786,5 +786,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter5/toy.cpp
:language: c++
`Next: Extending the language: user-defined operators <LangImpl6.html>`_
`Next: Extending the language: user-defined operators <LangImpl06.html>`_

View File

@ -764,5 +764,5 @@ Here is the code:
:language: c++
`Next: Extending the language: mutable variables / SSA
construction <LangImpl7.html>`_
construction <LangImpl07.html>`_

View File

@ -877,5 +877,5 @@ Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter7/toy.cpp
:language: c++
`Next: Adding Debug Information <LangImpl8.html>`_
`Next: Compiling to Object Code <LangImpl08.html>`_

View File

@ -0,0 +1,218 @@
========================================
Kaleidoscope: Compiling to Object Code
========================================
.. contents::
:local:
Chapter 8 Introduction
======================
Welcome to Chapter 8 of the "`Implementing a language with LLVM
<index.html>`_" tutorial. This chapter describes how to compile our
language down to object files.
Choosing a target
=================
LLVM has native support for cross-compilation. You can compile to the
architecture of your current machine, or just as easily compile for
other architectures. In this tutorial, we'll target the current
machine.
To specify the architecture that you want to target, we use a string
called a "target triple". This takes the form
``<arch><sub>-<vendor>-<sys>-<abi>`` (see the `cross compilation docs
<http://clang.llvm.org/docs/CrossCompilation.html#target-triple>`_).
As an example, we can see what clang thinks is our current target
triple:
::
$ clang --version | grep Target
Target: x86_64-unknown-linux-gnu
Running this command may show something different on your machine, as
you might be using a different architecture or operating system than I am.
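To make the structure of a triple concrete, here is a minimal standalone sketch that splits one into its dash-separated fields. ``splitTriple`` is a hypothetical helper written for illustration and is not part of LLVM; real code would more likely use LLVM's own ``Triple`` class from ``llvm/ADT/Triple.h``. The sketch assumes the fields themselves contain no dashes.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): split a target triple such as
// "x86_64-unknown-linux-gnu" into its dash-separated fields.
// Assumption: no field contains an embedded '-'.
std::vector<std::string> splitTriple(const std::string &Triple) {
  std::vector<std::string> Parts;
  std::stringstream SS(Triple);
  std::string Part;
  // std::getline with a custom delimiter reads one field per call.
  while (std::getline(SS, Part, '-'))
    Parts.push_back(Part);
  return Parts;
}
```

Calling ``splitTriple("x86_64-unknown-linux-gnu")`` yields the four fields ``x86_64``, ``unknown``, ``linux`` and ``gnu``, in the order arch, vendor, sys, abi. Triples with fewer fields, such as ``x86_64-apple-darwin``, simply yield fewer entries.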
Fortunately, we don't need to hard-code a target triple to target the
current machine. LLVM provides ``sys::getDefaultTargetTriple``, which
returns the target triple of the current machine.
.. code-block:: c++
auto TargetTriple = sys::getDefaultTargetTriple();
LLVM doesn't require us to link in all the target
functionality. For example, if we're just using the JIT, we don't need
the assembly printers. Similarly, if we're only targeting certain
architectures, we can link in only the functionality for those
architectures.
For this example, we'll initialize all the targets for emitting object
code.
.. code-block:: c++
InitializeAllTargetInfos();
InitializeAllTargets();
InitializeAllTargetMCs();
InitializeAllAsmParsers();
InitializeAllAsmPrinters();
We can now use our target triple to get a ``Target``:
.. code-block:: c++
std::string Error;
auto Target = TargetRegistry::lookupTarget(TargetTriple, Error);
// Print an error and exit if we couldn't find the requested target.
// This generally occurs if we've forgotten to initialise the
// TargetRegistry or we have a bogus target triple.
if (!Target) {
errs() << Error;
return 1;
}
Target Machine
==============
We will also need a ``TargetMachine``. This class provides a complete
machine description of the machine we're targeting. If we want to
target a specific feature (such as SSE) or a specific CPU (such as
Intel's Sandylake), we do so now.
To see which features and CPUs LLVM knows about, we can use
``llc``. For example, let's look at x86:
::
$ llvm-as < /dev/null | llc -march=x86 -mattr=help
Available CPUs for this target:
amdfam10 - Select the amdfam10 processor.
athlon - Select the athlon processor.
athlon-4 - Select the athlon-4 processor.
...
Available features for this target:
16bit-mode - 16-bit mode (i8086).
32bit-mode - 32-bit mode (80386).
3dnow - Enable 3DNow! instructions.
3dnowa - Enable 3DNow! Athlon instructions.
...
For our example, we'll use the generic CPU without any additional
features, options or relocation model.
.. code-block:: c++
auto CPU = "generic";
auto Features = "";
TargetOptions opt;
auto RM = Optional<Reloc::Model>();
auto TargetMachine = Target->createTargetMachine(TargetTriple, CPU, Features, opt, RM);
Configuring the Module
======================
We're now ready to configure our module, to specify the target and
data layout. This isn't strictly necessary, but the `frontend
performance guide <../Frontend/PerformanceTips.html>`_ recommends
this. Optimizations benefit from knowing about the target and data
layout.
.. code-block:: c++
TheModule->setDataLayout(TargetMachine->createDataLayout());
TheModule->setTargetTriple(TargetTriple);
Emit Object Code
================
We're ready to emit object code! Let's define where we want to write
our file to:
.. code-block:: c++
auto Filename = "output.o";
std::error_code EC;
raw_fd_ostream dest(Filename, EC, sys::fs::F_None);
if (EC) {
errs() << "Could not open file: " << EC.message();
return 1;
}
Finally, we define a pass that emits object code, then we run that
pass:
.. code-block:: c++
legacy::PassManager pass;
auto FileType = TargetMachine::CGFT_ObjectFile;
if (TargetMachine->addPassesToEmitFile(pass, dest, FileType)) {
errs() << "TargetMachine can't emit a file of this type";
return 1;
}
pass.run(*TheModule);
dest.flush();
Putting It All Together
=======================
Does it work? Let's give it a try. We need to compile our code, but
note that the arguments to ``llvm-config`` differ from those in the previous chapters.
::
$ clang++ -g -O3 toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs all` -o toy
Let's run it, and define a simple ``average`` function. Press Ctrl-D
when you're done.
::
$ ./toy
ready> def average(x y) (x + y) * 0.5;
^D
Wrote output.o
We have an object file! To test it, let's write a simple program and
link it with our output. Here's the source code:
.. code-block:: c++
#include <iostream>
extern "C" {
double average(double, double);
}
int main() {
std::cout << "average of 3.0 and 4.0: " << average(3.0, 4.0) << std::endl;
}
We link our program to output.o and check the result is what we
expected:
::
$ clang++ main.cpp output.o -o main
$ ./main
average of 3.0 and 4.0: 3.5
Full Code Listing
=================
.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp
:language: c++
`Next: Adding Debug Information <LangImpl09.html>`_

View File

@ -5,11 +5,11 @@ Kaleidoscope: Adding Debug Information
.. contents::
:local:
Chapter 8 Introduction
Chapter 9 Introduction
======================
Welcome to Chapter 8 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. In chapters 1 through 7, we've built a
Welcome to Chapter 9 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. In chapters 1 through 8, we've built a
decent little programming language with functions and variables.
What happens if something goes wrong though, how do you debug your
program?
@ -149,7 +149,7 @@ command line:
.. code-block:: bash
Kaleidoscope-Ch8 < fib.ks | & clang -x ir -
Kaleidoscope-Ch9 < fib.ks | & clang -x ir -
which gives an a.out/a.exe in the current working directory.
@ -455,8 +455,8 @@ debug information. To build this example, use:
Here is the code:
.. literalinclude:: ../../examples/Kaleidoscope/Chapter8/toy.cpp
.. literalinclude:: ../../examples/Kaleidoscope/Chapter9/toy.cpp
:language: c++
`Next: Conclusion and other useful LLVM tidbits <LangImpl9.html>`_
`Next: Conclusion and other useful LLVM tidbits <LangImpl10.html>`_

View File

@ -178,7 +178,7 @@ IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
window will pop up <../ProgrammersManual.html#viewing-graphs-while-debugging-code>`_ and you'll
see this graph:
.. figure:: LangImpl5-cfg.png
.. figure:: LangImpl05-cfg.png
:align: center
:alt: Example CFG

View File

@ -1,9 +1,5 @@
set(LLVM_LINK_COMPONENTS
Core
ExecutionEngine
Object
Support
native
all
)
add_kaleidoscope_chapter(Kaleidoscope-Ch8

View File

@ -1,28 +1,20 @@
#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Triple.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/DIBuilder.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/Passes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Metadata.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/Host.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Target/TargetMachine.h"
#include "../include/KaleidoscopeJIT.h"
#include <cassert>
#include "llvm/Target/TargetOptions.h"
#include "llvm/Transforms/Scalar.h"
#include <cctype>
#include <cstdio>
#include <cstdlib>
@ -33,7 +25,7 @@
#include <vector>
using namespace llvm;
using namespace llvm::orc;
using namespace llvm::sys;
//===----------------------------------------------------------------------===//
// Lexer
@ -67,71 +59,6 @@ enum Token {
tok_var = -13
};
std::string getTokName(int Tok) {
switch (Tok) {
case tok_eof:
return "eof";
case tok_def:
return "def";
case tok_extern:
return "extern";
case tok_identifier:
return "identifier";
case tok_number:
return "number";
case tok_if:
return "if";
case tok_then:
return "then";
case tok_else:
return "else";
case tok_for:
return "for";
case tok_in:
return "in";
case tok_binary:
return "binary";
case tok_unary:
return "unary";
case tok_var:
return "var";
}
return std::string(1, (char)Tok);
}
namespace {
class ExprAST;
} // end anonymous namespace
static LLVMContext TheContext;
static IRBuilder<> Builder(TheContext);
struct DebugInfo {
DICompileUnit *TheCU;
DIType *DblTy;
std::vector<DIScope *> LexicalBlocks;
void emitLocation(ExprAST *AST);
DIType *getDoubleTy();
} KSDbgInfo;
struct SourceLocation {
int Line;
int Col;
};
static SourceLocation CurLoc;
static SourceLocation LexLoc = {1, 0};
static int advance() {
int LastChar = getchar();
if (LastChar == '\n' || LastChar == '\r') {
LexLoc.Line++;
LexLoc.Col = 0;
} else
LexLoc.Col++;
return LastChar;
}
static std::string IdentifierStr; // Filled in if tok_identifier
static double NumVal; // Filled in if tok_number
@ -141,13 +68,11 @@ static int gettok() {
// Skip any whitespace.
while (isspace(LastChar))
LastChar = advance();
CurLoc = LexLoc;
LastChar = getchar();
if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
IdentifierStr = LastChar;
while (isalnum((LastChar = advance())))
while (isalnum((LastChar = getchar())))
IdentifierStr += LastChar;
if (IdentifierStr == "def")
@ -177,7 +102,7 @@ static int gettok() {
std::string NumStr;
do {
NumStr += LastChar;
LastChar = advance();
LastChar = getchar();
} while (isdigit(LastChar) || LastChar == '.');
NumVal = strtod(NumStr.c_str(), nullptr);
@ -187,7 +112,7 @@ static int gettok() {
if (LastChar == '#') {
// Comment until end of line.
do
LastChar = advance();
LastChar = getchar();
while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
if (LastChar != EOF)
@ -200,7 +125,7 @@ static int gettok() {
// Otherwise, just return the character as its ascii value.
int ThisChar = LastChar;
LastChar = advance();
LastChar = getchar();
return ThisChar;
}
@ -208,25 +133,11 @@ static int gettok() {
// Abstract Syntax Tree (aka Parse Tree)
//===----------------------------------------------------------------------===//
namespace {
raw_ostream &indent(raw_ostream &O, int size) {
return O << std::string(size, ' ');
}
/// ExprAST - Base class for all expression nodes.
class ExprAST {
SourceLocation Loc;
public:
ExprAST(SourceLocation Loc = CurLoc) : Loc(Loc) {}
virtual ~ExprAST() {}
virtual Value *codegen() = 0;
int getLine() const { return Loc.Line; }
int getCol() const { return Loc.Col; }
virtual raw_ostream &dump(raw_ostream &out, int ind) {
return out << ':' << getLine() << ':' << getCol() << '\n';
}
};
/// NumberExprAST - Expression class for numeric literals like "1.0".
@ -236,10 +147,6 @@ class NumberExprAST : public ExprAST {
public:
NumberExprAST(double Val) : Val(Val) {}
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
return ExprAST::dump(out << Val, ind);
}
};
/// VariableExprAST - Expression class for referencing a variable, like "a".
@ -247,14 +154,9 @@ class VariableExprAST : public ExprAST {
std::string Name;
public:
VariableExprAST(SourceLocation Loc, const std::string &Name)
: ExprAST(Loc), Name(Name) {}
VariableExprAST(const std::string &Name) : Name(Name) {}
const std::string &getName() const { return Name; }
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
return ExprAST::dump(out << Name, ind);
}
};
/// UnaryExprAST - Expression class for a unary operator.
@ -266,12 +168,6 @@ public:
UnaryExprAST(char Opcode, std::unique_ptr<ExprAST> Operand)
: Opcode(Opcode), Operand(std::move(Operand)) {}
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
ExprAST::dump(out << "unary" << Opcode, ind);
Operand->dump(out, ind + 1);
return out;
}
};
/// BinaryExprAST - Expression class for a binary operator.
@ -280,17 +176,10 @@ class BinaryExprAST : public ExprAST {
std::unique_ptr<ExprAST> LHS, RHS;
public:
BinaryExprAST(SourceLocation Loc, char Op, std::unique_ptr<ExprAST> LHS,
BinaryExprAST(char Op, std::unique_ptr<ExprAST> LHS,
std::unique_ptr<ExprAST> RHS)
: ExprAST(Loc), Op(Op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}
: Op(Op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
ExprAST::dump(out << "binary" << Op, ind);
LHS->dump(indent(out, ind) << "LHS:", ind + 1);
RHS->dump(indent(out, ind) << "RHS:", ind + 1);
return out;
}
};
/// CallExprAST - Expression class for function calls.
@ -299,17 +188,10 @@ class CallExprAST : public ExprAST {
std::vector<std::unique_ptr<ExprAST>> Args;
public:
CallExprAST(SourceLocation Loc, const std::string &Callee,
CallExprAST(const std::string &Callee,
std::vector<std::unique_ptr<ExprAST>> Args)
: ExprAST(Loc), Callee(Callee), Args(std::move(Args)) {}
: Callee(Callee), Args(std::move(Args)) {}
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
ExprAST::dump(out << "call " << Callee, ind);
for (const auto &Arg : Args)
Arg->dump(indent(out, ind + 1), ind + 1);
return out;
}
};
/// IfExprAST - Expression class for if/then/else.
@ -317,19 +199,10 @@ class IfExprAST : public ExprAST {
std::unique_ptr<ExprAST> Cond, Then, Else;
public:
IfExprAST(SourceLocation Loc, std::unique_ptr<ExprAST> Cond,
std::unique_ptr<ExprAST> Then, std::unique_ptr<ExprAST> Else)
: ExprAST(Loc), Cond(std::move(Cond)), Then(std::move(Then)),
Else(std::move(Else)) {}
IfExprAST(std::unique_ptr<ExprAST> Cond, std::unique_ptr<ExprAST> Then,
std::unique_ptr<ExprAST> Else)
: Cond(std::move(Cond)), Then(std::move(Then)), Else(std::move(Else)) {}
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
ExprAST::dump(out << "if", ind);
Cond->dump(indent(out, ind) << "Cond:", ind + 1);
Then->dump(indent(out, ind) << "Then:", ind + 1);
Else->dump(indent(out, ind) << "Else:", ind + 1);
return out;
}
};
/// ForExprAST - Expression class for for/in.
@ -344,15 +217,6 @@ public:
: VarName(VarName), Start(std::move(Start)), End(std::move(End)),
Step(std::move(Step)), Body(std::move(Body)) {}
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
ExprAST::dump(out << "for", ind);
Start->dump(indent(out, ind) << "Cond:", ind + 1);
End->dump(indent(out, ind) << "End:", ind + 1);
Step->dump(indent(out, ind) << "Step:", ind + 1);
Body->dump(indent(out, ind) << "Body:", ind + 1);
return out;
}
};
/// VarExprAST - Expression class for var/in
@ -366,14 +230,6 @@ public:
std::unique_ptr<ExprAST> Body)
: VarNames(std::move(VarNames)), Body(std::move(Body)) {}
Value *codegen() override;
raw_ostream &dump(raw_ostream &out, int ind) override {
ExprAST::dump(out << "var", ind);
for (const auto &NamedVar : VarNames)
NamedVar.second->dump(indent(out, ind) << NamedVar.first << ':', ind + 1);
Body->dump(indent(out, ind) << "Body:", ind + 1);
return out;
}
};
/// PrototypeAST - This class represents the "prototype" for a function,
@ -384,14 +240,12 @@ class PrototypeAST {
std::vector<std::string> Args;
bool IsOperator;
unsigned Precedence; // Precedence if a binary op.
int Line;
public:
PrototypeAST(SourceLocation Loc, const std::string &Name,
std::vector<std::string> Args, bool IsOperator = false,
unsigned Prec = 0)
PrototypeAST(const std::string &Name, std::vector<std::string> Args,
bool IsOperator = false, unsigned Prec = 0)
: Name(Name), Args(std::move(Args)), IsOperator(IsOperator),
Precedence(Prec), Line(Loc.Line) {}
Precedence(Prec) {}
Function *codegen();
const std::string &getName() const { return Name; }
@ -404,7 +258,6 @@ public:
}
unsigned getBinaryPrecedence() const { return Precedence; }
int getLine() const { return Line; }
};
/// FunctionAST - This class represents a function definition itself.
@ -417,13 +270,6 @@ public:
std::unique_ptr<ExprAST> Body)
: Proto(std::move(Proto)), Body(std::move(Body)) {}
Function *codegen();
raw_ostream &dump(raw_ostream &out, int ind) {
indent(out, ind) << "FunctionAST\n";
++ind;
indent(out, ind) << "Body:";
return Body ? Body->dump(out, ind) : out << "null\n";
}
};
} // end anonymous namespace
@ -492,12 +338,10 @@ static std::unique_ptr<ExprAST> ParseParenExpr() {
static std::unique_ptr<ExprAST> ParseIdentifierExpr() {
std::string IdName = IdentifierStr;
SourceLocation LitLoc = CurLoc;
getNextToken(); // eat identifier.
if (CurTok != '(') // Simple variable ref.
return llvm::make_unique<VariableExprAST>(LitLoc, IdName);
return llvm::make_unique<VariableExprAST>(IdName);
// Call.
getNextToken(); // eat (
@ -521,13 +365,11 @@ static std::unique_ptr<ExprAST> ParseIdentifierExpr() {
// Eat the ')'.
getNextToken();
return llvm::make_unique<CallExprAST>(LitLoc, IdName, std::move(Args));
return llvm::make_unique<CallExprAST>(IdName, std::move(Args));
}
/// ifexpr ::= 'if' expression 'then' expression 'else' expression
static std::unique_ptr<ExprAST> ParseIfExpr() {
SourceLocation IfLoc = CurLoc;
getNextToken(); // eat the if.
// condition.
@ -552,7 +394,7 @@ static std::unique_ptr<ExprAST> ParseIfExpr() {
if (!Else)
return nullptr;
return llvm::make_unique<IfExprAST>(IfLoc, std::move(Cond), std::move(Then),
return llvm::make_unique<IfExprAST>(std::move(Cond), std::move(Then),
std::move(Else));
}
@ -707,7 +549,6 @@ static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
// Okay, we know this is a binop.
int BinOp = CurTok;
SourceLocation BinLoc = CurLoc;
getNextToken(); // eat binop
// Parse the unary expression after the binary operator.
@ -725,8 +566,8 @@ static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec,
}
// Merge LHS/RHS.
LHS = llvm::make_unique<BinaryExprAST>(BinLoc, BinOp, std::move(LHS),
std::move(RHS));
LHS =
llvm::make_unique<BinaryExprAST>(BinOp, std::move(LHS), std::move(RHS));
}
}
@ -748,8 +589,6 @@ static std::unique_ptr<ExprAST> ParseExpression() {
static std::unique_ptr<PrototypeAST> ParsePrototype() {
std::string FnName;
SourceLocation FnLoc = CurLoc;
unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
unsigned BinaryPrecedence = 30;
@ -805,7 +644,7 @@ static std::unique_ptr<PrototypeAST> ParsePrototype() {
if (Kind && ArgNames.size() != Kind)
return LogErrorP("Invalid number of operands for operator");
return llvm::make_unique<PrototypeAST>(FnLoc, FnName, ArgNames, Kind != 0,
return llvm::make_unique<PrototypeAST>(FnName, ArgNames, Kind != 0,
BinaryPrecedence);
}
@ -823,10 +662,9 @@ static std::unique_ptr<FunctionAST> ParseDefinition() {
/// toplevelexpr ::= expression
static std::unique_ptr<FunctionAST> ParseTopLevelExpr() {
SourceLocation FnLoc = CurLoc;
if (auto E = ParseExpression()) {
// Make an anonymous proto.
auto Proto = llvm::make_unique<PrototypeAST>(FnLoc, "__anon_expr",
auto Proto = llvm::make_unique<PrototypeAST>("__anon_expr",
std::vector<std::string>());
return llvm::make_unique<FunctionAST>(std::move(Proto), std::move(E));
}
@ -839,52 +677,14 @@ static std::unique_ptr<PrototypeAST> ParseExtern() {
return ParsePrototype();
}
//===----------------------------------------------------------------------===//
// Debug Info Support
//===----------------------------------------------------------------------===//
static std::unique_ptr<DIBuilder> DBuilder;
DIType *DebugInfo::getDoubleTy() {
if (DblTy)
return DblTy;
DblTy = DBuilder->createBasicType("double", 64, 64, dwarf::DW_ATE_float);
return DblTy;
}
void DebugInfo::emitLocation(ExprAST *AST) {
if (!AST)
return Builder.SetCurrentDebugLocation(DebugLoc());
DIScope *Scope;
if (LexicalBlocks.empty())
Scope = TheCU;
else
Scope = LexicalBlocks.back();
Builder.SetCurrentDebugLocation(
DebugLoc::get(AST->getLine(), AST->getCol(), Scope));
}
static DISubroutineType *CreateFunctionType(unsigned NumArgs, DIFile *Unit) {
SmallVector<Metadata *, 8> EltTys;
DIType *DblTy = KSDbgInfo.getDoubleTy();
// Add the result type.
EltTys.push_back(DblTy);
for (unsigned i = 0, e = NumArgs; i != e; ++i)
EltTys.push_back(DblTy);
return DBuilder->createSubroutineType(DBuilder->getOrCreateTypeArray(EltTys));
}
//===----------------------------------------------------------------------===//
// Code Generation
//===----------------------------------------------------------------------===//
static LLVMContext TheContext;
static IRBuilder<> Builder(TheContext);
static std::unique_ptr<Module> TheModule;
static std::map<std::string, AllocaInst *> NamedValues;
static std::unique_ptr<KaleidoscopeJIT> TheJIT;
static std::map<std::string, std::unique_ptr<PrototypeAST>> FunctionProtos;
Value *LogErrorV(const char *Str) {
@ -917,7 +717,6 @@ static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
}
Value *NumberExprAST::codegen() {
KSDbgInfo.emitLocation(this);
return ConstantFP::get(TheContext, APFloat(Val));
}
@ -927,7 +726,6 @@ Value *VariableExprAST::codegen() {
if (!V)
return LogErrorV("Unknown variable name");
KSDbgInfo.emitLocation(this);
// Load the value.
return Builder.CreateLoad(V, Name.c_str());
}
@ -941,13 +739,10 @@ Value *UnaryExprAST::codegen() {
if (!F)
return LogErrorV("Unknown unary operator");
KSDbgInfo.emitLocation(this);
return Builder.CreateCall(F, OperandV, "unop");
}
Value *BinaryExprAST::codegen() {
KSDbgInfo.emitLocation(this);
// Special case '=' because we don't want to emit the LHS as an expression.
if (Op == '=') {
// Assignment requires the LHS to be an identifier.
@ -1001,8 +796,6 @@ Value *BinaryExprAST::codegen() {
}
Value *CallExprAST::codegen() {
KSDbgInfo.emitLocation(this);
// Look up the name in the global module table.
Function *CalleeF = getFunction(Callee);
if (!CalleeF)
@ -1023,8 +816,6 @@ Value *CallExprAST::codegen() {
}
Value *IfExprAST::codegen() {
KSDbgInfo.emitLocation(this);
Value *CondV = Cond->codegen();
if (!CondV)
return nullptr;
@ -1101,8 +892,6 @@ Value *ForExprAST::codegen() {
// Create an alloca for the variable in the entry block.
AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
KSDbgInfo.emitLocation(this);
// Emit the start code first, without 'variable' in scope.
Value *StartVal = Start->codegen();
if (!StartVal)
@ -1213,8 +1002,6 @@ Value *VarExprAST::codegen() {
NamedValues[VarName] = Alloca;
}
KSDbgInfo.emitLocation(this);
// Codegen the body, now that all vars are in scope.
Value *BodyVal = Body->codegen();
if (!BodyVal)
@ -1262,43 +1049,12 @@ Function *FunctionAST::codegen() {
BasicBlock *BB = BasicBlock::Create(TheContext, "entry", TheFunction);
Builder.SetInsertPoint(BB);
// Create a subprogram DIE for this function.
DIFile *Unit = DBuilder->createFile(KSDbgInfo.TheCU->getFilename(),
KSDbgInfo.TheCU->getDirectory());
DIScope *FContext = Unit;
unsigned LineNo = P.getLine();
unsigned ScopeLine = LineNo;
DISubprogram *SP = DBuilder->createFunction(
FContext, P.getName(), StringRef(), Unit, LineNo,
CreateFunctionType(TheFunction->arg_size(), Unit),
false /* internal linkage */, true /* definition */, ScopeLine,
DINode::FlagPrototyped, false);
TheFunction->setSubprogram(SP);
// Push the current scope.
KSDbgInfo.LexicalBlocks.push_back(SP);
// Unset the location for the prologue emission (leading instructions with no
// location in a function are considered part of the prologue and the debugger
// will run past them when breaking on a function)
KSDbgInfo.emitLocation(nullptr);
// Record the function arguments in the NamedValues map.
NamedValues.clear();
unsigned ArgIdx = 0;
for (auto &Arg : TheFunction->args()) {
// Create an alloca for this variable.
AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());
// Create a debug descriptor for the variable.
DILocalVariable *D = DBuilder->createParameterVariable(
SP, Arg.getName(), ++ArgIdx, Unit, LineNo, KSDbgInfo.getDoubleTy(),
true);
DBuilder->insertDeclare(Alloca, D, DBuilder->createExpression(),
DebugLoc::get(LineNo, 0, SP),
Builder.GetInsertBlock());
// Store the initial value into the alloca.
Builder.CreateStore(&Arg, Alloca);
@ -1306,15 +1062,10 @@ Function *FunctionAST::codegen() {
NamedValues[Arg.getName()] = Alloca;
}
KSDbgInfo.emitLocation(Body.get());
if (Value *RetVal = Body->codegen()) {
// Finish off the function.
Builder.CreateRet(RetVal);
// Pop off the lexical block for the function.
KSDbgInfo.LexicalBlocks.pop_back();
// Validate the generated code, checking for consistency.
verifyFunction(*TheFunction);
@ -1326,11 +1077,6 @@ Function *FunctionAST::codegen() {
if (P.isBinaryOp())
BinopPrecedence.erase(Proto->getOperatorName());
// Pop off the lexical block for the function since we added it
// unconditionally.
KSDbgInfo.LexicalBlocks.pop_back();
return nullptr;
}
@ -1338,16 +1084,17 @@ Function *FunctionAST::codegen() {
// Top-Level parsing and JIT Driver
//===----------------------------------------------------------------------===//
static void InitializeModule() {
static void InitializeModuleAndPassManager() {
// Open a new module.
TheModule = llvm::make_unique<Module>("my cool jit", TheContext);
TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());
}
static void HandleDefinition() {
if (auto FnAST = ParseDefinition()) {
if (!FnAST->codegen())
fprintf(stderr, "Error reading function definition:");
if (auto *FnIR = FnAST->codegen()) {
fprintf(stderr, "Read function definition:");
FnIR->dump();
}
} else {
// Skip token for error recovery.
getNextToken();
@ -1356,10 +1103,11 @@ static void HandleDefinition() {
static void HandleExtern() {
if (auto ProtoAST = ParseExtern()) {
if (!ProtoAST->codegen())
fprintf(stderr, "Error reading extern");
else
if (auto *FnIR = ProtoAST->codegen()) {
fprintf(stderr, "Read extern: ");
FnIR->dump();
FunctionProtos[ProtoAST->getName()] = std::move(ProtoAST);
}
} else {
// Skip token for error recovery.
getNextToken();
@ -1369,9 +1117,7 @@ static void HandleExtern() {
static void HandleTopLevelExpression() {
// Evaluate a top-level expression into an anonymous function.
if (auto FnAST = ParseTopLevelExpr()) {
if (!FnAST->codegen()) {
fprintf(stderr, "Error generating code for top level expr");
}
FnAST->codegen();
} else {
// Skip token for error recovery.
getNextToken();
@ -1421,50 +1167,74 @@ extern "C" double printd(double X) {
//===----------------------------------------------------------------------===//
int main() {
InitializeNativeTarget();
InitializeNativeTargetAsmPrinter();
InitializeNativeTargetAsmParser();
// Install standard binary operators.
// 1 is lowest precedence.
BinopPrecedence['='] = 2;
BinopPrecedence['<'] = 10;
BinopPrecedence['+'] = 20;
BinopPrecedence['-'] = 20;
BinopPrecedence['*'] = 40; // highest.
// Prime the first token.
fprintf(stderr, "ready> ");
getNextToken();
TheJIT = llvm::make_unique<KaleidoscopeJIT>();
InitializeModule();
// Add the current debug info version into the module.
TheModule->addModuleFlag(Module::Warning, "Debug Info Version",
DEBUG_METADATA_VERSION);
// Darwin only supports dwarf2.
if (Triple(sys::getProcessTriple()).isOSDarwin())
TheModule->addModuleFlag(llvm::Module::Warning, "Dwarf Version", 2);
// Construct the DIBuilder, we do this here because we need the module.
DBuilder = llvm::make_unique<DIBuilder>(*TheModule);
// Create the compile unit for the module.
// Currently down as "fib.ks" as a filename since we're redirecting stdin
// but we'd like actual source locations.
KSDbgInfo.TheCU = DBuilder->createCompileUnit(
dwarf::DW_LANG_C, "fib.ks", ".", "Kaleidoscope Compiler", false, "", 0);
InitializeModuleAndPassManager();
// Run the main "interpreter loop" now.
MainLoop();
// Finalize the debug info.
DBuilder->finalize();
// Initialize the target registry etc.
InitializeAllTargetInfos();
InitializeAllTargets();
InitializeAllTargetMCs();
InitializeAllAsmParsers();
InitializeAllAsmPrinters();
// Print out all of the generated code.
TheModule->dump();
auto TargetTriple = sys::getDefaultTargetTriple();
TheModule->setTargetTriple(TargetTriple);
std::string Error;
auto Target = TargetRegistry::lookupTarget(TargetTriple, Error);
// Print an error and exit if we couldn't find the requested target.
// This generally occurs if we've forgotten to initialise the
// TargetRegistry or we have a bogus target triple.
if (!Target) {
errs() << Error;
return 1;
}
auto CPU = "generic";
auto Features = "";
TargetOptions opt;
auto RM = Optional<Reloc::Model>();
auto TheTargetMachine =
Target->createTargetMachine(TargetTriple, CPU, Features, opt, RM);
TheModule->setDataLayout(TheTargetMachine->createDataLayout());
auto Filename = "output.o";
std::error_code EC;
raw_fd_ostream dest(Filename, EC, sys::fs::F_None);
if (EC) {
errs() << "Could not open file: " << EC.message();
return 1;
}
legacy::PassManager pass;
auto FileType = TargetMachine::CGFT_ObjectFile;
if (TheTargetMachine->addPassesToEmitFile(pass, dest, FileType)) {
errs() << "TheTargetMachine can't emit a file of this type";
return 1;
}
pass.run(*TheModule);
dest.flush();
outs() << "Wrote " << Filename << "\n";
return 0;
}
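
For readers following the operator table installed at the top of `main()` above, here is a minimal sketch of how such a table could be consulted during parsing. It is not part of this commit: the helper name `getTokPrecedence` and its signature are assumptions made for illustration, and the map is a stand-alone copy of the entries shown in the diff.

```cpp
#include <map>

// Stand-alone copy of the operator table from main(); 1 would be the lowest
// precedence and '*' binds tightest. This copy is illustrative only.
std::map<char, int> BinopPrecedence = {
    {'=', 2}, {'<', 10}, {'+', 20}, {'-', 20}, {'*', 40}};

// Hypothetical helper: returns the precedence of a binary operator token,
// or -1 if the token is not a registered binary operator.
int getTokPrecedence(int Tok) {
  // Tokens outside the ASCII range cannot be single-character operators.
  if (Tok < 0 || Tok > 127)
    return -1;

  auto It = BinopPrecedence.find(static_cast<char>(Tok));
  if (It == BinopPrecedence.end() || It->second <= 0)
    return -1;
  return It->second;
}
```

With these entries, `a + b * c` would group as `a + (b * c)`, because `*` (40) has a higher precedence than `+` (20), while an unregistered character such as `/` yields -1.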


@ -0,0 +1,13 @@
set(LLVM_LINK_COMPONENTS
Core
ExecutionEngine
Object
Support
native
)
add_kaleidoscope_chapter(Kaleidoscope-Ch9
toy.cpp
)
export_executable_symbols(Kaleidoscope-Ch9)

File diff suppressed because it is too large