Move doc string parsing logic from rebuildParsers.php and
String_::parseDocString() into ParserAbstract. This stuff is
going to get complicated now.
For now only implement the validation of the indentation on the
end label.
The parser will now always generate Identifier nodes (for
non-namespaced identifiers). This obsoletes the useIdentifierNodes
parser option.
Node constructors still accepts strings and will implicitly create
an Identifier wrapper. Identifier implement __toString(), so that
outside of strict-mode many things continue to work without changes.
Use class_name production and emit the same error as for
"extends self" and "extends parent". It's weird that "extends
static" gives a different result than those two.
The dispatch using $this->{'reduceRule' . $rule}() is very expensive
because it involves
* One allocation when converting $rule to a string
* Another allocation when concatenating the two strings
* Lowercasing during the method call
* Various uncached method checks, e.g. visibility.
Using an array of closures for semantic action dispatch is
significantly more efficient.
We're accessing $this->stackPos a lot, and property accesses are
much more expensive than local variable acesses. Its cheaper to
use a local var and pass it as an argument to semantic actions.
Instead assign attributes on Nop nodes and in the pretty printer
specially handle end<start offsets. It's a somewhat weird case,
but not wrong per se given the meaning the offsets have.
For representing Identifiers that have an implicit leading $.
With this done, maybe go one step further?
* Rename VarLikeIdentifier -> VarIdentifier / VarName
* Use VarIdentifier / VarName also as an inner node in Variable.
Not sure if this adds any real value.
The parameter case is a bit weird, because the subnode is called
"name" here, rather than "var". Nothing we can do about that in
this version though.
The two parser options might be merged. I've kept it separate,
because I think this variable representation should become the
default (or even only representation) in the future, while I'm
less sure about the Identifier thing.
In this mode non-namespaced names that are currently represented
using strings will be represented using Identifier nodes instead.
Identifier nodes have a string $name subnode and coerce to string.
This allows preserving attributes and in particular location
information on identifiers.
Nearly all special errors are now handled gracefully, i.e. the
parser will be able to continue after encountering them. In some
cases the associated error range has been improved using the new
end attribute stack.
To achieve this the error handling code has been moved out of the
node constructors and into special methods in the parser.
It's likely that an error after -> will trigger another one due to
missing semicolon without shifting a single token. We prevent an
immediate failure in this case by manually setting errorState to 2,
which will suppress the duplicate error message, but allow error
recovery.
Expr\List will now contain ArrayItems instead of plain variables.
I'm reusing ArrayItem, because code handling list() must also handle
arrays, and this allows both to go through the same code path.
This also renames Expr\List->vars to ->items.
TODO: Should Expr\List be dropped in favor of Expr\Array with an
extra flag?
Scalar\String_ and Scalar\Encapsed now have an additional "kind"
attribute, which may be one of:
* String_::KIND_SINGLE_QUOTED
* String_::KIND_DOUBLE_QUOTED
* String_::KIND_NOWDOC
* String_::KIND_HEREDOC
Additionally, if the string kind is one of the latter two, an
attribute "docLabel" is provided, which contains the doc string
label (STR in <<<STR) that was originally used.
The pretty printer will try to take the original kind of the string,
as well as the used doc string label into account.
To distinguish array() and [] syntax. The pretty printer respects
this attribute. The shortArraySyntax pretty printer option acts as
a default in case the attribute is not specified.
A Nop statement will be inserted into statement lists if there are
any trailing comments in the list (which would otherwise not be
associated with any node).
The pretty printer output currently still contains a superfluous
newline.
Magic constant names have been added after the PHP 7 release.
We do not support and likely will not support __halt_compiler here
due to lexer limitations.
As these are shared between Php5 and Php7 parsers they should be
in some common place, otherwise we'd have to always reference either
one or the other.