The parameter case is a bit weird, because the subnode is called
"name" here, rather than "var". Nothing we can do about that in
this version though.
The two parser options might be merged. I've kept it separate,
because I think this variable representation should become the
default (or even only representation) in the future, while I'm
less sure about the Identifier thing.
In this mode non-namespaced names that are currently represented
using strings will be represented using Identifier nodes instead.
Identifier nodes have a string $name subnode and coerce to string.
This allows preserving attributes and in particular location
information on identifiers.
Nearly all special errors are now handled gracefully, i.e. the
parser will be able to continue after encountering them. In some
cases the associated error range has been improved using the new
end attribute stack.
To achieve this the error handling code has been moved out of the
node constructors and into special methods in the parser.
Lexer::startLexing() no longer throws, instead errors can be fetched
using Lexer::getErrors().
Lexer errors now also contain full line and position information.
It's likely that an error after -> will trigger another one due to
missing semicolon without shifting a single token. We prevent an
immediate failure in this case by manually setting errorState to 2,
which will suppress the duplicate error message, but allow error
recovery.
Expr\List will now contain ArrayItems instead of plain variables.
I'm reusing ArrayItem, because code handling list() must also handle
arrays, and this allows both to go through the same code path.
This also renames Expr\List->vars to ->items.
TODO: Should Expr\List be dropped in favor of Expr\Array with an
extra flag?
Scalar\String_ and Scalar\Encapsed now have an additional "kind"
attribute, which may be one of:
* String_::KIND_SINGLE_QUOTED
* String_::KIND_DOUBLE_QUOTED
* String_::KIND_NOWDOC
* String_::KIND_HEREDOC
Additionally, if the string kind is one of the latter two, an
attribute "docLabel" is provided, which contains the doc string
label (STR in <<<STR) that was originally used.
The pretty printer will try to take the original kind of the string,
as well as the used doc string label into account.
To distinguish array() and [] syntax. The pretty printer respects
this attribute. The shortArraySyntax pretty printer option acts as
a default in case the attribute is not specified.
Kind specifies whether the number was formatted as decimal, octal,
binary or hex. The pretty printer reproduces the number kind (but
not necessarily the exact formatting).
A Nop statement will be inserted into statement lists if there are
any trailing comments in the list (which would otherwise not be
associated with any node).
The pretty printer output currently still contains a superfluous
newline.
Adding this as an option to avoid breaking people's tests.
Some of the test results show pretty clearly that we are incorrectly
assigning the same comment multiple times for nested nodes (mentioned
in #36).
We can't strip the <?php at the end of a __halt_compiler() segment
in file mode.
Fixed by being a bit more explicit in prettyPrintFile() about what
we want to do...
Magic constant names have been added after the PHP 7 release.
We do not support and likely will not support __halt_compiler here
due to lexer limitations.
This should be enough for all cases, because: A double has 53 bits
of mantissa (including the implicit 1 bit), which is 53*ln(2)/ln(10)
= 15.95 decimal digits. However the leading decimal digit may encode
less than the usual 3.32 bits, which will push this over the edge to
requiring 17 decimal digits.
Adding only a single recovery rule for now.
The API is now:
* throwOnError parser option must be disabled.
* List of Errors is available through $parser->getErrors(). This
method is available either way.
* If no recovery is possible $parser->parse() will return null.
(Obviously only if throwOnError is disabled).
This adds an additional "returnType" subnode to Stmt\Function_,
Stmt\ClassMethod and Expr\Closure, as well as the corresponding
support in the name resolver and pretty printer.
When parsing on PHP 7 we will no longer be able to deal with
code that contains invalid octal literals. Currently we'll fatal,
after engine exceptions land we'll throw an exception instead.
Previously the pretty printer added unnecessary and odd-looking parentheses
when several operators with the same precedence were chained:
'a' . 'b' . 'c' . 'd' . 'e'
// was printed as
'a' . ('b' . ('c' . ('d' . 'e')))
Another issue reported as part of #39 was that assignments inside closures
were wrapped in parentheses:
function() {
$a = $b;
}
// was printed as
function() {
($a = $b);
}
This was caused by the automatic precedence handling, which just regarded
the closure as an ordinal nested expression.
With the new system the $predenceMap of PrettyPrinterAbstract contains both
precedence and associativity and there is a new method pPrec() which prints
a node taking precedence and associativity into account.
For simpler usage there are additional function pInfixOp(), pPrefixOp() and
pPostfixOp().
Prints not going through pPrec() do not have any precedence handling (fixing
the closure issue).
* nested list()s will now create nested List nodes (instead of just
nested arrays)
* yield $k => $v was parsed with key and value swapped. This is now fixed
* the pretty printer now works with the newly added language constructs
Example: foreach ($coords as list($x, $y)) { ... }
This change slightly breaks backwards compatability, as it changes the
node structure for the previously existing `list(...) = $foo` assignments.
Those no longer have a dedicated `AssignList` node; instead they are
parsed as a normal `Assign` node with a `List` as `var`. Similarly the
use in `foreach` will generate a `List` for `valueVar`.
Travis 5.2 seems to have changed the float output precision, so a test was
failing. Now the numbers in the expected output are also provided by PHP,
so they should be the same.
The new dereferencing syntaxes (new Foo)->bar and (new Foo)['bar'] were
causing a shift/reduce conflict with the '(' expr ')' rule. When
(new Foo) was encountered (without dereference operators following) the
parser thus threw a parse error.
The fix simply adds a special '(' new_expr ')' rule to expr. This does not
remove the shift/reduce conflict itself, but makes it irrelevant.
This fixes issue #20.
* codeGeneration:
Add docs for templates
Add a filesystem template loader.
Add simple templating support.
Add usage example for builders to docs
Add function builder
Add ability to specify arrays as default values
Add property builder
Add parameter builder
Add method builder
Add class builder
The parser didn't account for the additional newline after the content of doc strings, which is left there by the tokenizer for some reason. Additoinally esacape sequences were parsed in nowdoc strings.
Additionally this contains some minor changes to the grammar: Some _list nonterminals were refactored to have the possible single elements in a reparate rule and only assemble those single elements. (This reduces duplication and gives better assignment of line number context.)