PHP Parser
==========

This is a PHP parser written in PHP. Its purpose is to simplify static code analysis and
manipulation.

***Note: This project is highly experimental. It may not always function correctly.***

Components
==========

This package currently bundles several components:

* The `Parser` itself
* A `NodeDumper` to dump the nodes to a human-readable string representation
* A `PrettyPrinter` to translate the node tree back to PHP

Autoloader
----------

To include the required files automatically, use `PHPParser_Autoloader`:

    require_once 'path/to/phpparser/lib/PHPParser/Autoloader.php';
    PHPParser_Autoloader::register();

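As a rough illustration of what such an autoloader typically does, the sketch below maps an underscore-separated class name to a file path and registers a loader with `spl_autoload_register()`. The function name `sketchClassToPath` and the `lib/` root are assumptions made for this example. This is not the code of `PHPParser_Autoloader`.

```php
<?php
// Hypothetical sketch, NOT the library's actual autoloader.
// Maps an underscore-separated class name to a relative file path.
function sketchClassToPath($class)
{
    // Only handle classes that carry the PHPParser_ prefix.
    if (strpos($class, 'PHPParser_') !== 0) {
        return null;
    }
    // Each underscore becomes a directory separator.
    return str_replace('_', '/', $class) . '.php';
}

spl_autoload_register(function ($class) {
    $path = sketchClassToPath($class);
    if ($path === null) {
        return; // not one of ours; let other autoloaders try
    }
    // Assumed library root (hypothetical); adjust to the real location.
    $file = __DIR__ . '/lib/' . $path;
    if (is_file($file)) {
        require $file;
    }
});

echo sketchClassToPath('PHPParser_Node_Expr'), "\n"; // PHPParser/Node/Expr.php
```

Under this naming scheme, `PHPParser_Autoloader` itself would resolve to `PHPParser/Autoloader.php`, which matches the path used in the `require_once` line above.
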
Parser and ParserDebug
----------------------

Parsing is performed using `PHPParser_Parser->parse()`. This method accepts a `PHPParser_Lexer`
as its only parameter and returns an array of statement nodes. If an error occurs, it throws a
`PHPParser_Error`.

    $code = '<?php // some code';

    try {
        $parser = new PHPParser_Parser;
        $stmts = $parser->parse(new PHPParser_Lexer($code));
    } catch (PHPParser_Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

The `PHPParser_ParserDebug` class also parses PHP code, but outputs a debug trace while doing so.

Node Tree
---------

The output of the parser is an array of statement nodes. All nodes are instances of
`PHPParser_NodeAbstract`. Nodes are divided into three categories:

* `PHPParser_Node_Stmt`: A statement
* `PHPParser_Node_Expr`: An expression
* `PHPParser_Node_Scalar`: A scalar (a string, a number, etc.)
  `PHPParser_Node_Scalar` inherits from `PHPParser_Node_Expr`.

Each node may have subnodes. For example, `PHPParser_Node_Expr_Plus` has two subnodes, namely `left`
and `right`, which represent the left-hand side and right-hand side expressions of the plus operation.
Subnodes are accessed as normal properties:

    $node->left

The subnodes which a certain node can have are documented as `@property` doc comments in the
respective files.

Additionally, all nodes have two methods, `getLine()` and `getDocComment()`.
`getLine()` returns the line on which a node starts.
`getDocComment()` returns the doc comment preceding the node, or `null` if there is none.

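To make the idea of property-style subnode access concrete, here is a minimal sketch that uses stand-in classes. `SketchNode`, `SketchNumber`, `SketchPlus` and `sketchEvaluate` are hypothetical names invented for this example, and the use of `__get()` is only one way such access could be implemented. The library's real node classes may work differently.

```php
<?php
// Hypothetical stand-ins; these are NOT the library's node classes.
class SketchNode
{
    protected $subNodes;
    protected $line;

    public function __construct(array $subNodes, $line = -1)
    {
        $this->subNodes = $subNodes;
        $this->line = $line;
    }

    // Expose subnodes as ordinary properties, e.g. $node->left.
    public function __get($name)
    {
        return $this->subNodes[$name];
    }

    // Line on which the node starts.
    public function getLine()
    {
        return $this->line;
    }
}

class SketchNumber extends SketchNode {}
class SketchPlus extends SketchNode {}

// Recursively evaluate a tree of plus operations over numbers.
function sketchEvaluate(SketchNode $node)
{
    if ($node instanceof SketchPlus) {
        return sketchEvaluate($node->left) + sketchEvaluate($node->right);
    }
    return $node->value;
}

// Represents the expression 1 + (2 + 3), all on line 1.
$tree = new SketchPlus(array(
    'left'  => new SketchNumber(array('value' => 1), 1),
    'right' => new SketchPlus(array(
        'left'  => new SketchNumber(array('value' => 2), 1),
        'right' => new SketchNumber(array('value' => 3), 1),
    ), 1),
), 1);

echo sketchEvaluate($tree), "\n"; // 6
```

The same recursive pattern (inspect the node type, then descend into its named subnodes) applies to most tree traversals, whatever the concrete node classes are.
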
NodeDumper
----------

Nodes can be dumped into a string representation using the `PHPParser_NodeDumper->dump()` method:

    $code = <<<'CODE'
    <?php
    function printLine($msg) {
        echo $msg, "\n";
    }

    printLine('Hallo World!!!');
    CODE;

    try {
        $parser = new PHPParser_Parser;
        $stmts = $parser->parse(new PHPParser_Lexer($code));

        $nodeDumper = new PHPParser_NodeDumper;
        echo '<pre>' . htmlspecialchars($nodeDumper->dump($stmts)) . '</pre>';
    } catch (PHPParser_Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }

This script will produce output similar to the following:

    array(
        0: Stmt_Func(
            byRef: false
            name: printLine
            params: array(
                0: Stmt_FuncParam(
                    type: null
                    name: msg
                    byRef: false
                    default: null
                )
            )
            stmts: array(
                0: Stmt_Echo(
                    exprs: array(
                        0: Variable(
                            name: msg
                        )
                        1: Scalar_String(
                            value:

                            isBinary: false
                            type: 1
                        )
                    )
                )
            )
        )
        1: Expr_FuncCall(
            func: Name(
                parts: array(
                    0: printLine
                )
            )
            args: array(
                0: Expr_FuncCallArg(
                    value: Scalar_String(
                        value: Hallo World!!!
                        isBinary: false
                        type: 0
                    )
                    byRef: false
                )
            )
        )
    )

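The nested format above can be produced by a short recursive function. The following is a sketch only: `SketchDumpNode` and `sketchDump` are hypothetical names, and the code is not taken from `PHPParser_NodeDumper`. It assumes a node is just a type name plus an array of named subnodes.

```php
<?php
// Hypothetical stand-in node: a type name plus named subnodes.
class SketchDumpNode
{
    public $type;
    public $subNodes;

    public function __construct($type, array $subNodes)
    {
        $this->type = $type;
        $this->subNodes = $subNodes;
    }
}

// Recursively render nodes, arrays and scalars as indented text.
function sketchDump($value)
{
    if ($value instanceof SketchDumpNode) {
        $label = $value->type;
        $children = $value->subNodes;
    } elseif (is_array($value)) {
        $label = 'array';
        $children = $value;
    } elseif ($value === null) {
        return 'null';
    } elseif (is_bool($value)) {
        return $value ? 'true' : 'false';
    } else {
        return (string) $value;
    }

    $out = $label . '(';
    foreach ($children as $key => $child) {
        // Indent every line of the nested dump by four spaces.
        $out .= "\n    " . $key . ': '
            . str_replace("\n", "\n    ", sketchDump($child));
    }
    return $out . "\n)";
}

$args = array(
    new SketchDumpNode('Expr_FuncCallArg', array(
        'value' => new SketchDumpNode('Scalar_String', array(
            'value' => 'Hallo World!!!',
        )),
        'byRef' => false,
    )),
);

echo sketchDump($args), "\n";
```

Re-indenting the child's output with `str_replace()` keeps the function free of an explicit depth parameter, at the cost of rescanning nested strings. For large trees, passing the depth down would be cheaper.
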
PrettyPrinter
-------------

The pretty printer compiles nodes back to PHP code. "Pretty printing" here is just the formal
name of the process and does not mean that the output is in any way pretty.

    $prettyPrinter = new PHPParser_PrettyPrinter_Zend;
    echo '<pre>' . htmlspecialchars($prettyPrinter->prettyPrint($stmts)) . '</pre>';

For the code from the NodeDumper section above, this should produce the following output:

    function printLine($msg)
    {
        echo $msg, "\n";
    }
    printLine('Hallo World!!!');
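
The general idea of turning a node tree back into source code can be sketched as a function that dispatches on the node type and prints the subnodes recursively. The example below is a toy under stated assumptions: nodes are plain arrays with a `type` key, and `sketchPrint` together with its four node types are hypothetical. It does not reflect how `PHPParser_PrettyPrinter_Zend` is implemented.

```php
<?php
// Hypothetical toy nodes as plain arrays: array('type' => ..., ...).
function sketchPrint(array $node)
{
    switch ($node['type']) {
        case 'String':
            // Single-quoted literal; escape backslashes and quotes.
            return "'" . addcslashes($node['value'], "'\\") . "'";
        case 'Variable':
            return '$' . $node['name'];
        case 'FuncCall':
            // Print each argument recursively, then join them.
            $args = array_map('sketchPrint', $node['args']);
            return $node['name'] . '(' . implode(', ', $args) . ')';
        case 'Echo':
            $exprs = array_map('sketchPrint', $node['exprs']);
            return 'echo ' . implode(', ', $exprs) . ';';
        default:
            throw new InvalidArgumentException(
                'Unknown node type: ' . $node['type']
            );
    }
}

$call = array(
    'type' => 'FuncCall',
    'name' => 'printLine',
    'args' => array(
        array('type' => 'String', 'value' => 'Hallo World!!!'),
    ),
);

echo sketchPrint($call), ";\n"; // printLine('Hallo World!!!');
```

A real printer also has to handle operator precedence, indentation and every statement type, which is why the output is correct PHP but not necessarily pretty.
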