Update docs

2024-11-26 20:04:48 +01:00 · 2016-10-29 13:37:47 +02:00 · 2016-10-29 13:37:47 +02:00 · 71438559ae
commit 71438559ae
parent c0f0edf044
6 changed files with 163 additions and 28 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -27,6 +27,8 @@ This release primarily improves our support for error recovery.
 * Due to the error handling changes, the `Parser` interface and `Lexer` API have changed.
 * The emulative lexer now directly postprocesses tokens, instead of using `~__EMU__~` sequences.
  This changes the protected API of the lexer.
+* The `Name::slice()` method now returns `null` for empty slices; previously, `new Name([])` was
+  returned. `Name::concat()` now also supports concatenation with `null`.

 ### Removed

--- a/README.md
+++ b/README.md
@@ -6,7 +6,9 @@ PHP Parser
 This is a PHP 5.2 to PHP 7.1 parser written in PHP. Its purpose is to simplify static code analysis and
 manipulation.

-[**Documentation for version 2.x**][doc_master] (stable; for running on PHP >= 5.4; for parsing PHP 5.2 to PHP 7.0).
+[Documentation for version 3.x][doc_master] (beta; for running on PHP >= 5.5; for parsing PHP 5.2 to PHP 7.1).
+
+[**Documentation for version 2.x**][doc_2_x] (stable; for running on PHP >= 5.4; for parsing PHP 5.2 to PHP 7.0).

 [Documentation for version 1.x][doc_1_x] (unsupported; for running on PHP >= 5.3; for parsing PHP 5.2 to PHP 5.6).

@@ -89,7 +91,7 @@ Documentation

 Component documentation:

- 1. [Error](doc/component/Error.markdown)
+ 1. [Error handling](doc/component/Error_handling.markdown)
 2. [Lexer](doc/component/Lexer.markdown)

 [doc_1_x]: https://github.com/nikic/PHP-Parser/tree/1.x/doc
--- a/UPGRADE-3.0.md
+++ b/UPGRADE-3.0.md
@@ -152,3 +152,5 @@ The following methods, arguments or options have been removed:
 * The constants on `NameTraverserInterface` have been moved into the `NameTraverser` class.
 * The emulative lexer now directly postprocesses tokens, instead of using `~__EMU__~` sequences.
   This changes the protected API of the emulative lexer.
+ * The `Name::slice()` method now returns `null` for empty slices; previously, `new Name([])` was
+   returned. `Name::concat()` now also supports concatenation with `null`.
--- a/doc/3_Other_node_tree_representations.markdown
+++ b/doc/3_Other_node_tree_representations.markdown
@@ -8,7 +8,7 @@ Simple serialization

 It is possible to serialize the node tree using `serialize()` and also unserialize it using
 `unserialize()`. The output is not human readable and not easily processable from anything
-but PHP, but it is compact and generates fast. The main application thus is in caching.
+but PHP, but it is compact and generates quickly. The main application thus is in caching.

 Human readable dumping
 ----------------------
@@ -86,6 +86,134 @@ array(
 )
 ```

+JSON encoding
+-------------
+
+Nodes (and comments) implement the `JsonSerializable` interface. As such, it is possible to JSON
+encode the AST directly using `json_encode()`:
+
+```php
+$code = <<<'CODE'
+<?php
+
+function printLine($msg) {
+    echo $msg, "\n";
+}
+
+printLine('Hello World!!!');
+CODE;
+
+$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7);
+$nodeDumper = new PhpParser\NodeDumper;
+
+try {
+    $stmts = $parser->parse($code);
+
+    echo json_encode($stmts, JSON_PRETTY_PRINT), "\n";
+} catch (PhpParser\Error $e) {
+    echo 'Parse Error: ', $e->getMessage();
+}
+```
+
+This will result in the following output (which includes attributes):
+
+```json
+[
+    {
+        "nodeType": "Stmt_Function",
+        "byRef": false,
+        "name": "printLine",
+        "params": [
+            {
+                "nodeType": "Param",
+                "type": null,
+                "byRef": false,
+                "variadic": false,
+                "name": "msg",
+                "default": null,
+                "attributes": {
+                    "startLine": 3,
+                    "endLine": 3
+                }
+            }
+        ],
+        "returnType": null,
+        "stmts": [
+            {
+                "nodeType": "Stmt_Echo",
+                "exprs": [
+                    {
+                        "nodeType": "Expr_Variable",
+                        "name": "msg",
+                        "attributes": {
+                            "startLine": 4,
+                            "endLine": 4
+                        }
+                    },
+                    {
+                        "nodeType": "Scalar_String",
+                        "value": "\n",
+                        "attributes": {
+                            "startLine": 4,
+                            "endLine": 4,
+                            "kind": 2
+                        }
+                    }
+                ],
+                "attributes": {
+                    "startLine": 4,
+                    "endLine": 4
+                }
+            }
+        ],
+        "attributes": {
+            "startLine": 3,
+            "endLine": 5
+        }
+    },
+    {
+        "nodeType": "Expr_FuncCall",
+        "name": {
+            "nodeType": "Name",
+            "parts": [
+                "printLine"
+            ],
+            "attributes": {
+                "startLine": 7,
+                "endLine": 7
+            }
+        },
+        "args": [
+            {
+                "nodeType": "Arg",
+                "value": {
+                    "nodeType": "Scalar_String",
+                    "value": "Hello World!!!",
+                    "attributes": {
+                        "startLine": 7,
+                        "endLine": 7,
+                        "kind": 1
+                    }
+                },
+                "byRef": false,
+                "unpack": false,
+                "attributes": {
+                    "startLine": 7,
+                    "endLine": 7
+                }
+            }
+        ],
+        "attributes": {
+            "startLine": 7,
+            "endLine": 7
+        }
+    }
+]
+```
+
+There is currently no mechanism to convert JSON back into a node tree. Furthermore, not all ASTs
+can be JSON encoded. In particular, JSON only supports UTF-8 strings.
+
 Serialization to XML
 --------------------

--- a/doc/component/Error_handling.markdown
+++ b/doc/component/Error_handling.markdown
@@ -35,6 +35,8 @@ the source code of the parsed file. An example for printing an error:
 if ($e->hasColumnInfo()) {
    echo $e->getRawMessage() . ' from ' . $e->getStartLine() . ':' . $e->getStartColumn($code)
        . ' to ' . $e->getEndLine() . ':' . $e->getEndColumn($code);
+    // or:
+    echo $e->getMessageWithColumnInfo();
 } else {
    echo $e->getMessage();
 }
@@ -46,27 +48,23 @@ file.
 Error recovery
 --------------

-> **EXPERIMENTAL**
+The error behavior of the parser (and other components) is controlled by an `ErrorHandler`. Whenever an error is
+encountered, `ErrorHandler::handleError()` is invoked. The default error handling strategy is `ErrorHandler\Throwing`,
+which will immediately throw when an error is encountered.

-By default the parser will throw an exception upon encountering the first error during parsing. An alternative mode is
-also supported, in which the parser will remember the error, but try to continue parsing the rest of the source code.
-
-To enable this mode the `throwOnError` parser option needs to be disabled. Any errors that occurred during parsing can
-then be retrieved using `$parser->getErrors()`. The `$parser->parse()` method will either return a partial syntax tree
-or `null` if recovery fails.
-
-A usage example:
+To instead collect all encountered errors into an array, while trying to continue parsing the rest of the source code,
+an instance of `ErrorHandler\Collecting` can be passed to the `Parser::parse()` method. A usage example:

 ```php
-$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7, null, array(
-    'throwOnError' => false,
-));
+$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::ONLY_PHP7);
+$errorHandler = new PhpParser\ErrorHandler\Collecting;

-$stmts = $parser->parse($code);
-$errors = $parser->getErrors();
+$stmts = $parser->parse($code, $errorHandler);

-foreach ($errors as $error) {
-    // $error is an ordinary PhpParser\Error
+if ($errorHandler->hasErrors()) {
+    foreach ($errorHandler->getErrors() as $error) {
+        // $error is an ordinary PhpParser\Error
+    }
 }

 if (null !== $stmts) {
@@ -74,4 +72,4 @@ if (null !== $stmts) {
 }
 ```

-The error recovery implementation is experimental -- it currently won't be able to recover from many types of errors.
+The `NameResolver` visitor also accepts an `ErrorHandler` as a constructor argument.
--- a/doc/component/Lexer.markdown
+++ b/doc/component/Lexer.markdown
@@ -95,13 +95,14 @@ Lexer extension

 A lexer has to define the following public interface:

-    void startLexing(string $code);
+    void startLexing(string $code, ErrorHandler $errorHandler = null);
    array getTokens();
    string handleHaltCompiler();
    int getNextToken(string &$value = null, array &$startAttributes = null, array &$endAttributes = null);

 The `startLexing()` method is invoked with the source code that is to be lexed (including the opening tag) whenever the
-`parse()` method of the parser is called. It can be used to reset state or preprocess the source code or tokens.
+`parse()` method of the parser is called. It can be used to reset state or preprocess the source code or tokens. The
+passed `ErrorHandler` should be used to report lexing errors.

 The `getTokens()` method returns the current token array, in the usual `token_get_all()` format. This method is not
 used by the parser (which uses `getNextToken()`), but is useful in combination with the token position attributes.
@@ -122,9 +123,10 @@ node and the `$endAttributes` from the last token that is part of the node.
 E.g. if the tokens `T_FUNCTION T_STRING ... '{' ... '}'` constitute a node, then the `$startAttributes` from the
 `T_FUNCTION` token will be taken and the `$endAttributes` from the `'}'` token.

-An application of custom attributes is storing the original formatting of literals: The parser does not retain
-information about the formatting of integers (like decimal vs. hexadecimal) or strings (like used quote type or used
-escape sequences). This can be remedied by storing the original value in an attribute:
+An application of custom attributes is storing the exact original formatting of literals: While the parser does retain
+some information about the formatting of integers (like decimal vs. hexadecimal) or strings (like used quote type), it
+does not preserve the exact original formatting (e.g. leading zeros for integers or escape sequences in strings). This
+can be remedied by storing the original value in an attribute:

 ```php
 use PhpParser\Lexer;
@@ -135,9 +137,10 @@ class KeepOriginalValueLexer extends Lexer // or Lexer\Emulative
    public function getNextToken(&$value = null, &$startAttributes = null, &$endAttributes = null) {
        $tokenId = parent::getNextToken($value, $startAttributes, $endAttributes);

-        if ($tokenId == Tokens::T_CONSTANT_ENCAPSED_STRING // non-interpolated string
-            || $tokenId == Tokens::T_LNUMBER               // integer
-            || $tokenId == Tokens::T_DNUMBER               // floating point number
+        if ($tokenId == Tokens::T_CONSTANT_ENCAPSED_STRING   // non-interpolated string
+            || $tokenId == Tokens::T_ENCAPSED_AND_WHITESPACE // interpolated string
+            || $tokenId == Tokens::T_LNUMBER                 // integer
+            || $tokenId == Tokens::T_DNUMBER                 // floating point number
        ) {
            // could also use $startAttributes, doesn't really matter here
            $endAttributes['originalValue'] = $value;