Small docs touchups and typo fixes

2024-11-30 04:19:30 +01:00 · 2014-09-12 00:20:22 +02:00 · e65fd664d1
commit e65fd664d1
parent 7a3789f1a9
5 changed files with 84 additions and 73 deletions
--- a/doc/0_Introduction.markdown
+++ b/doc/0_Introduction.markdown
@@ -1,16 +1,16 @@
 Introduction
 ============

-This project is a PHP 5.5 (and older) parser **written in PHP itself**.
+This project is a PHP 5.2 to PHP 5.6 parser **written in PHP itself**.

 What is this for?
 -----------------

-A parser is useful for [static analysis][0] and manipulation of code and basically any other
+A parser is useful for [static analysis][0], manipulation of code and basically any other
 application dealing with code programmatically. A parser constructs an [Abstract Syntax Tree][1]
 (AST) of the code and thus allows dealing with it in an abstract and robust way.

-There are other ways of dealing with source code. One that PHP supports natively is using the
+There are other ways of processing source code. One that PHP supports natively is using the
 token stream generated by [`token_get_all`][2]. The token stream is much more low level than
 the AST and thus has different applications: It allows to also analyze the exact formatting of
 a file. On the other hand the token stream is much harder to deal with for more complex analysis.
@@ -26,13 +26,13 @@ programmatic PHP code analysis are incidentally PHP developers, not C developers
 What can it parse?
 ------------------

-The parser uses a PHP 5.5 compliant grammar, which is backwards compatible with at least PHP 5.4, PHP 5.3
-and PHP 5.2 (and maybe older).
+The parser uses a PHP 5.6 compliant grammar, which is backwards compatible with all PHP versions from PHP 5.2
+upwards (and maybe older).

 As the parser is based on the tokens returned by `token_get_all` (which is only able to lex the PHP
-version it runs on), additionally a wrapper for emulating new tokens from 5.3, 5.4 and 5.5 is provided. This
-allows to parse PHP 5.5 source code running on PHP 5.2, for example. This emulation is very hacky and not
-yet perfect, but it should work well on any sane code.
+version it runs on), additionally a wrapper for emulating new tokens from 5.3, 5.4, 5.5 and 5.6 is provided.
+This allows parsing PHP 5.6 source code while running on PHP 5.3, for example. This emulation is very hacky and not
+perfect, but it should work well on any sane code.

 What output does it produce?
 ----------------------------
@@ -56,7 +56,7 @@ array(
 )
 ```

-This matches the semantics the program had: An echo statement, which takes two strings as expressions,
+This matches the structure of the code: An echo statement, which takes two strings as expressions,
 with the values `Hi` and `World!`.

 You can also see that the AST does not contain any whitespace information (but most comments are saved).
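As a rough illustration of the contrast drawn in the introduction, the token stream for the same `echo` statement can be listed with PHP's built-in `token_get_all` and `token_name`. This is only a sketch, not part of the changed files; the variable names are made up for the example. Unlike the AST, the stream still carries whitespace tokens.

```php
<?php
// The same statement the introduction uses as its AST example.
$code = "<?php echo 'Hi', 'World!';";

$names = array();
foreach (token_get_all($code) as $token) {
    if (is_array($token)) {
        // Array tokens have the form array(id, text, line).
        $names[] = token_name($token[0]);
    } else {
        // Single-character tokens such as ',' and ';' are plain strings.
        $names[] = $token;
    }
}

echo implode(' ', $names), "\n";
```

The flat list makes it visible why formatting analysis is easy on tokens, while anything structural (such as "which expressions belong to this echo") has to be reconstructed by hand.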
--- a/doc/1_Installation.markdown
+++ b/doc/1_Installation.markdown
@@ -3,11 +3,6 @@ Installation

 There are multiple ways to include the PHP parser into your project:

-Installing from the Zip- or Tarball
-----------------------------------
-
-Download the latest version from [the download page][2], unpack it and move the files somewhere into your project.
-
 Installing via Composer
 -----------------------

@@ -34,6 +29,10 @@ Run the following command to install the parser into the `vendor/PHP-Parser` fol

    git submodule add git://github.com/nikic/PHP-Parser.git vendor/PHP-Parser

+Installing from the Zip- or Tarball
+-----------------------------------
+
+Download the latest version from [the download page][2], unpack it and move the files somewhere into your project.


 [1]: http://getcomposer.org/composer.phar
--- a/doc/2_Usage_of_basic_components.markdown
+++ b/doc/2_Usage_of_basic_components.markdown
@@ -26,31 +26,38 @@ This ensures that there will be no errors when traversing highly nested node tre
 Parsing
 -------

-In order to parse some source code you first have to create a `PhpParser\Parser` object (which
-needs to be passed a `PhpParser\Lexer` instance) and then pass the code (including `<?php` opening
-tags) to the `parse` method. If a syntax error is encountered `PhpParser\Error` is thrown, so this
-exception should be `catch`ed.
+In order to parse some source code you first have to create a `PhpParser\Parser` object, which
+needs to be passed a `PhpParser\Lexer` instance:
+
+```php
+<?php
+
+$parser = new PhpParser\Parser(new PhpParser\Lexer);
+// or
+$parser = new PhpParser\Parser(new PhpParser\Lexer\Emulative);
+```
+
+Use of the emulative lexer is required if you want to parse PHP code from newer versions than the one
+you're running on. For example it will allow you to parse PHP 5.6 code while running on PHP 5.3.
+
+Subsequently you can pass PHP code (including the opening `<?php` tag) to the `parse` method in order to
+create a syntax tree. If a syntax error is encountered, a `PhpParser\Error` exception will be thrown:

 ```php
 <?php
 $code = '<?php // some code';

-$parser = new PhpParser\Parser(new PhpParser\Lexer);
+$parser = new PhpParser\Parser(new PhpParser\Lexer\Emulative);

 try {
    $stmts = $parser->parse($code);
+    // $stmts is an array of statement nodes
 } catch (PhpParser\Error $e) {
    echo 'Parse Error: ', $e->getMessage();
 }
 ```

-The `parse` method will return an array of statement nodes (`$stmts`).
-
-### Emulative lexer
-
-Instead of `PhpParser\Lexer` one can also use `PhpParser\Lexer\Emulative`. This class will emulate tokens
-of newer PHP versions and as such allow parsing PHP 5.5 on PHP 5.2, for example. So if you want to parse
-PHP code of newer versions than the one you are running, you should use the emulative lexer.
+A parser instance can be reused to parse multiple files.

 Node tree
 ---------
@@ -104,7 +111,7 @@ with a PHP keyword.

 Every node has a (possibly zero) number of subnodes. You can access subnodes by writing
 `$node->subNodeName`. The `Stmt\Echo_` node has only one subnode `exprs`. So in order to access it
-in the above example you would write `$stmts[0]->exprs`. If you wanted to access name of the function
+in the above example you would write `$stmts[0]->exprs`. If you wanted to access the name of the function
 call, you would write `$stmts[0]->exprs[1]->name`.

 All nodes also define a `getType()` method that returns the node type. The type is the class name
@@ -143,10 +150,10 @@ try {
          ->exprs     // sub expressions
          [0]         // the first of them (the string node)
          ->value     // it's value, i.e. 'Hi '
-          = 'Hallo '; // change to 'Hallo '
+          = 'Hello '; // change to 'Hello '

    // pretty print
-    $code = '<?php ' . $prettyPrinter->prettyPrint($stmts);
+    $code = $prettyPrinter->prettyPrint($stmts);

    echo $code;
 } catch (PhpParser\Error $e) {
@@ -156,7 +163,7 @@ try {

 The above code will output:

-    <?php echo 'Hallo ', hi\getTarget();
+    <?php echo 'Hello ', hi\getTarget();

 As you can see the source code was first parsed using `PhpParser\Parser->parse()`, then changed and then
 again converted to code using `PhpParser\PrettyPrinter\Standard->prettyPrint()`.
@@ -164,8 +171,8 @@ again converted to code using `PhpParser\PrettyPrinter\Standard->prettyPrint()`.
 The `prettyPrint()` method pretty prints a statements array. It is also possible to pretty print only a
 single expression using `prettyPrintExpr()`.

-The `prettyPrintFile()` method can be used to print an entire file. This will include the opening `<?php` tag and handle
-inline HTML as the first/last sentence more gracefully.
+The `prettyPrintFile()` method can be used to print an entire file. This will include the opening `<?php` tag
+and handle inline HTML as the first/last statement more gracefully.

 Node traversation
 -----------------
@@ -180,9 +187,8 @@ structure of a program using this `PhpParser\NodeTraverser` looks like this:

 ```php
 <?php
-$code = "<?php // some code";

-$parser        = new PhpParser\Parser(new PhpParser\Lexer);
+$parser        = new PhpParser\Parser(new PhpParser\Lexer\Emulative);
 $traverser     = new PhpParser\NodeTraverser;
 $prettyPrinter = new PhpParser\PrettyPrinter\Standard;

@@ -190,6 +196,8 @@ $prettyPrinter = new PhpParser\PrettyPrinter\Standard;
 $traverser->addVisitor(new MyNodeVisitor);

 try {
+    $code = file_get_contents($fileName);
+
    // parse
    $stmts = $parser->parse($code);

@@ -197,7 +205,7 @@ try {
    $stmts = $traverser->traverse($stmts);

    // pretty print
-    $code = '<?php ' . $prettyPrinter->prettyPrint($stmts);
+    $code = $prettyPrinter->prettyPrintFile($stmts);

    echo $code;
 } catch (PhpParser\Error $e) {
@@ -205,14 +213,16 @@ try {
 }
 ```

-A same node visitor for this code might look like this:
+The corresponding node visitor might look like this:

 ```php
 <?php
+use PhpParser\Node;
+
 class MyNodeVisitor extends PhpParser\NodeVisitorAbstract
 {
-    public function leaveNode(PhpParser\Node $node) {
-        if ($node instanceof PhpParser\Node\Scalar\String) {
+    public function leaveNode(Node $node) {
+        if ($node instanceof Node\Scalar\String) {
            $node->value = 'foo';
        }
    }
@@ -221,7 +231,7 @@ class MyNodeVisitor extends PhpParser\NodeVisitorAbstract

 The above node visitor would change all string literals in the program to `'foo'`.

-All visitors must implement the `PhpParser\NodeVisitor` interface, which defined the following four
+All visitors must implement the `PhpParser\NodeVisitor` interface, which defines the following four
 methods:

    public function beforeTraverse(array $nodes);
@@ -240,11 +250,12 @@ The `enterNode` and `leaveNode` methods are called on every node, the former whe
 i.e. before its subnodes are traversed, the latter when it is left.

 All four methods can either return the changed node or not return at all (i.e. `null`) in which
-case the current node is not changed. The `leaveNode` method can furthermore return two special
-values: If `false` is returned the current node will be removed from the parent array. If an `array`
-is returned the current node will be merged into the parent array at the offset of the current node.
-I.e. if in `array(A, B, C)` the node `B` should be replaced with `array(X, Y, Z)` the result will be
-`array(A, X, Y, Z, C)`.
+case the current node is not changed. The `leaveNode` method can additionally return two special
+values:
+
+If `false` is returned the current node will be removed from the parent array. If an array is returned
+it will be merged into the parent array at the offset of the current node. I.e. if in `array(A, B, C)`
+the node `B` should be replaced with `array(X, Y, Z)` the result will be `array(A, X, Y, Z, C)`.

 Instead of manually implementing the `NodeVisitor` interface you can also extend the `NodeVisitorAbstract`
 class, which will define empty default implementations for all the above methods.
@@ -283,10 +294,9 @@ We start off with the following base code:

 ```php
 <?php
-const IN_DIR  = '/some/path';
-const OUT_DIR = '/some/other/path';
+$inDir  = '/some/path';
+$outDir = '/some/other/path';

-// use the emulative lexer here, as we are running PHP 5.2 but want to parse PHP 5.3
 $parser        = new PhpParser\Parser(new PhpParser\Lexer\Emulative);
 $traverser     = new PhpParser\NodeTraverser;
 $prettyPrinter = new PhpParser\PrettyPrinter\Standard;
@@ -295,7 +305,7 @@ $traverser->addVisitor(new PhpParser\NodeVisitor\NameResolver); // we will need
 $traverser->addVisitor(new NodeVisitor\NamespaceConverter);     // our own node visitor

 // iterate over all .php files in the directory
-$files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator(IN_DIR));
+$files = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($inDir));
 $files = new RegexIterator($files, '/\.php$/');

 foreach ($files as $file) {
@@ -310,11 +320,11 @@ foreach ($files as $file) {
        $stmts = $traverser->traverse($stmts);

        // pretty print
-        $code = '<?php ' . $prettyPrinter->prettyPrint($stmts);
+        $code = $prettyPrinter->prettyPrintFile($stmts);

        // write the converted file to the target directory
        file_put_contents(
-            substr_replace($file->getPathname(), OUT_DIR, 0, strlen(IN_DIR)),
+            substr_replace($file->getPathname(), $outDir, 0, strlen($inDir)),
            $code
        );
    } catch (PhpParser\Error $e) {
@@ -323,7 +333,7 @@ foreach ($files as $file) {
 }
 ```

-Now lets start with the main code, the `NodeVisitor_NamespaceConverter`. One thing it needs to do
+Now let's start with the main code, the `NodeVisitor\NamespaceConverter`. One thing it needs to do
 is convert `A\\B` style names to `A_B` style ones.

 ```php
@@ -340,14 +350,14 @@ class NodeVisitor_NamespaceConverter extends PhpParser\NodeVisitorAbstract
 ```

 The above code profits from the fact that the `NameResolver` already resolved all names as far as
-possible, so we don't need to do that. All the need to create a string with the name parts separated
+possible, so we don't need to do that. We only need to create a string with the name parts separated
 by underscores instead of backslashes. This is what `$node->toString('_')` does. (If you want to
 create a name with backslashes either write `$node->toString()` or `(string) $node`.) Then we create
 a new name from the string and return it. Returning a new node replaces the old node.

 Another thing we need to do is change the class/function/const declarations. Currently they contain
-only the shortname (i.e. the last part of the name), but they need to contain the complete class
-name:
+only the shortname (i.e. the last part of the name), but they need to contain the complete name including
+the namespace prefix:

 ```php
 <?php
--- a/doc/3_Other_node_tree_representations.markdown
+++ b/doc/3_Other_node_tree_representations.markdown
@@ -1,7 +1,7 @@
 Other node tree representations
 ===============================

-It is possible to convert the AST in several textual representations, which serve different uses.
+It is possible to convert the AST into several textual representations, which serve different uses.

 Simple serialization
 --------------------
@@ -13,18 +13,19 @@ but PHP, but it is compact and generates fast. The main application thus is in c
 Human readable dumping
 ----------------------

-Furthermore it is possible to dump nodes into a human readable form using the `dump` method of
+Furthermore it is possible to dump nodes into a human readable format using the `dump` method of
 `PhpParser\NodeDumper`. This can be used for debugging.

 ```php
 <?php
 $code = <<<'CODE'
 <?php
+
 function printLine($msg) {
    echo $msg, "\n";
 }

-    printLine('Hallo World!!!');
+printLine('Hello World!!!');
 CODE;

 $parser = new PhpParser\Parser(new PhpParser\Lexer);
@@ -33,13 +34,13 @@ $nodeDumper = new PhpParser\NodeDumper;
 try {
    $stmts = $parser->parse($code);

-    echo '<pre>' . htmlspecialchars($nodeDumper->dump($stmts)) . '</pre>';
+    echo $nodeDumper->dump($stmts), "\n";
 } catch (PhpParser\Error $e) {
    echo 'Parse Error: ', $e->getMessage();
 }
 ```

-The above output will have an output looking roughly like this:
+The above script will have an output looking roughly like this:

 ```
 array(
@@ -77,7 +78,7 @@ array(
        args: array(
            0: Arg(
                value: Scalar_String(
-                    value: Hallo World!!!
+                    value: Hello World!!!
                )
                byRef: false
            )
@@ -97,11 +98,12 @@ interfacing with other languages and applications or for doing transformation us
 <?php
 $code = <<<'CODE'
 <?php
+
 function printLine($msg) {
    echo $msg, "\n";
 }

-    printLine('Hallo World!!!');
+printLine('Hello World!!!');
 CODE;

 $parser = new PhpParser\Parser(new PhpParser\Lexer);
@@ -110,7 +112,7 @@ $serializer = new PhpParser\Serializer\XML;
 try {
    $stmts = $parser->parse($code);

-    echo '<pre>' . htmlspecialchars($serializer->serialize($stmts)) . '</pre>';
+    echo $serializer->serialize($stmts);
 } catch (PhpParser\Error $e) {
    echo 'Parse Error: ', $e->getMessage();
 }
@@ -185,7 +187,7 @@ Produces:
      <subNode:value>
       <node:Scalar_String line="6">
        <subNode:value>
-         <scalar:string>Hallo World!!!</scalar:string>
+         <scalar:string>Hello World!!!</scalar:string>
        </subNode:value>
       </node:Scalar_String>
      </subNode:value>
--- a/doc/component/Lexer.markdown
+++ b/doc/component/Lexer.markdown
@@ -42,7 +42,7 @@ getNextToken
 ------------

 `getNextToken` returns the ID of the next token and sets some additional information in the three variables which it
-accepts by-ref. If no more tokens are available it has to return `0`, which is the ID of the `EOF` token.
+accepts by-ref. If no more tokens are available it must return `0`, which is the ID of the `EOF` token.

 The first by-ref variable `$value` should contain the textual content of the token. It is what will be available as `$1`
 etc in the parser.
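The `getNextToken` contract described in this hunk (return the token ID, fill `$value` by reference, return `0` once no tokens remain) can be sketched with a toy class. This is a hypothetical stand-in that does not extend `PhpParser\Lexer`; the class name, the token IDs `1` and `2`, and the names of the second and third by-ref parameters are assumptions made for illustration only.

```php
<?php
// Hypothetical token source; it only mimics the getNextToken contract.
class ArrayTokenSource
{
    private $tokens;
    private $pos = 0;

    // $tokens is a list of array(id, text) pairs with made-up IDs.
    public function __construct(array $tokens) {
        $this->tokens = $tokens;
    }

    public function getNextToken(&$value = null, &$startAttributes = null, &$endAttributes = null) {
        if ($this->pos >= count($this->tokens)) {
            return 0; // no more tokens: 0 is the ID of the EOF token
        }

        list($id, $value) = $this->tokens[$this->pos++];

        // This sketch carries no attribute information.
        $startAttributes = array();
        $endAttributes   = array();

        return $id;
    }
}

$source = new ArrayTokenSource(array(
    array(1, 'echo'),
    array(2, "'Hi'"),
));

// Consume tokens until the EOF ID is returned.
$seen = array();
while (0 !== ($id = $source->getNextToken($value))) {
    $seen[] = $value;
}
```

A consumer can loop until it sees `0`, without needing a separate "has more tokens" check.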