Expr\List will now contain ArrayItems instead of plain variables.
I'm reusing ArrayItem, because code handling list() must also handle
arrays, and this allows both to go through the same code path.
This also renames Expr\List->vars to ->items.
TODO: Should Expr\List be dropped in favor of Expr\Array with an
extra flag?
Scalar\String_ and Scalar\Encapsed now have an additional "kind"
attribute, which may be one of:
* String_::KIND_SINGLE_QUOTED
* String_::KIND_DOUBLE_QUOTED
* String_::KIND_NOWDOC
* String_::KIND_HEREDOC
Additionally, if the string kind is one of the latter two, an
attribute "docLabel" is provided, which contains the doc string
label (STR in <<<STR) that was originally used.
The pretty printer will try to take the original kind of the string,
as well as the used doc string label into account.
An attribute is added to distinguish array() from [] syntax. The pretty printer respects
this attribute. The shortArraySyntax pretty printer option acts as
a default in case the attribute is not specified.
Kind specifies whether the number was formatted as decimal, octal,
binary or hex. The pretty printer reproduces the number kind (but
not necessarily the exact formatting).
A Nop statement will be inserted into statement lists if there are
any trailing comments in the list (which would otherwise not be
associated with any node).
The pretty printer output currently still contains a superfluous
newline.
Adding this as an option to avoid breaking people's tests.
Some of the test results show pretty clearly that we are incorrectly
assigning the same comment multiple times for nested nodes (mentioned
in #36).
We can't strip the <?php at the end of a __halt_compiler() segment
in file mode.
Fixed by being a bit more explicit in prettyPrintFile() about what
we want to do...
Magic constant names have been added after the PHP 7 release.
We do not support and likely will not support __halt_compiler here
due to lexer limitations.
As these are shared between Php5 and Php7 parsers they should be
in some common place, otherwise we'd have to always reference either
one or the other.
This should be enough for all cases, because: A double has 53 bits
of mantissa (including the implicit 1 bit), which is 53*ln(2)/ln(10)
= 15.95 decimal digits. However the leading decimal digit may encode
less than the usual 3.32 bits, which will push this over the edge to
requiring 17 decimal digits.
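The digit counts above can be checked empirically. What follows is a minimal sketch, assuming that sprintf() and float casts round correctly; the helper name roundTrips() is hypothetical and not part of the library.

```php
<?php
// Hypothetical helper: formats $value with $digits significant decimal
// digits and checks whether parsing the result yields the same double.
function roundTrips(float $value, int $digits): bool
{
    // %e prints one digit before the decimal point, so the precision
    // passed to sprintf() is the number of significant digits minus one.
    $formatted = sprintf('%.' . ($digits - 1) . 'e', $value);
    return (float) $formatted === $value;
}

// 0.1 + 0.2 is not the same double as the one nearest to 0.3.
$sum = 0.1 + 0.2;

var_dump(roundTrips($sum, 15)); // 15 digits lose information for this value
var_dump(roundTrips($sum, 17)); // 17 digits are enough
```

Values such as 0.5 survive with far fewer digits; 17 is the count that works for every double.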
Adding only a single recovery rule for now.
The API is now:
* throwOnError parser option must be disabled.
* List of Errors is available through $parser->getErrors(). This
method is available either way.
* If no recovery is possible $parser->parse() will return null.
(Obviously only if throwOnError is disabled).
* Don't assign to attribute stack on reduce - why was that there
in the first place?
* Assign attributes to the position in the stack where the first
token of the production is, instead of one position earlier.
* Add a comment to clarify why we also assign attributes on read,
instead of just on shift.
Minor performance improvement for parsing; this also allows accessing
attributes with higher granularity in the parser, though this is not
currently done.
* #n can now be used to access the stack position of a token. $n
is the same as $this->semStack[#n]. (Post-translate $n will
actually be the stack position.)
* $attributeStack is now $this->startAttributeStack and
$endAttributes is now $this->endAttributes.
* Attributes for a node are now computed inside the individual
reduction methods, instead of being passed as a parameter.
Accessible through the attributes() macro.
This fixes the case where the old name is used before the new one
is ever used, e.g. when manually constructing nodes, as opposed to
parsing them.
The previous approach would try to register the alias from OLD to
NEW. This would trigger autoloading of NEW, which would itself
register the alias from OLD to NEW. The alias registration that
originally triggered the autoload would then run, thus redeclaring
the class.
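One generic way to avoid this kind of double registration is to check whether OLD already exists before aliasing. This is only a sketch under that assumption, not necessarily what this commit does; all class names are hypothetical.

```php
<?php
// All class names here are hypothetical stand-ins for a NEW class and its
// OLD (deprecated) name.
class NewNodeName
{
}

spl_autoload_register(function ($class) {
    if ($class !== 'OldNodeName') {
        return;
    }
    // Make sure NEW is loaded first; loading it may already register the alias.
    class_exists('NewNodeName');
    // Only register the alias if nothing has declared OLD in the meantime.
    if (!class_exists('OldNodeName', false)) {
        class_alias('NewNodeName', 'OldNodeName');
    }
});

// OLD is used before NEW was ever referenced, e.g. when constructing
// nodes manually.
$node = new OldNodeName();
var_dump(get_class($node));
```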
TL;DR aliases suck, closes #192.
Were this library to be fully annotated with scalar types and
return types where possible and were strict types to be enabled
for all files, the test suite would now pass.
Running a .phar or regular PHP executable that requires and includes its own
version of php-parser will cause a "cannot redeclare class" error if said
executable also includes the autoloader of the current working directory.
This adds an additional "returnType" subnode to Stmt\Function_,
Stmt\ClassMethod and Expr\Closure, as well as the corresponding
support in the name resolver and pretty printer.
Instead of storing subnodes in a subNodes dictionary, they are
now stored as simple properties. This requires declaring the
properties, assigning them in the constructor, overriding
the getSubNodeNames() method and passing NULL to the first argument
of the NodeAbstract constructor.
[Deprecated: It's still possible to use the old mode of operation
for custom nodes by passing an array of subnodes to the constructor.]
The only behavior difference this should cause is that getSubNodeNames()
will always return the original subnode names and skip any additional
properties that were dynamically added. E.g. this means that the
"namespacedName" node added by the NameResolver visitor is not treated
as a subnode, but as a dynamic property instead.
This change improves performance and memory usage.
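The steps listed above for a custom node can be sketched as follows. The base class here is a reduced stand-in written for this example, not the library's NodeAbstract, and the node class and its subnode names are hypothetical.

```php
<?php
// Stand-in for the library's NodeAbstract, reduced to what the text above
// mentions. This is NOT the real class; its shape is an assumption.
abstract class NodeAbstractSketch
{
    protected $attributes;

    public function __construct($subNodes = null, array $attributes = [])
    {
        // New mode of operation: null is passed and the subnodes live in
        // declared properties of the concrete class.
        $this->attributes = $attributes;
    }

    abstract public function getSubNodeNames();
}

// Hypothetical custom node with two subnodes. The attribute on the next
// line only silences the dynamic property deprecation on newer PHP
// versions; older versions treat the line as a comment.
#[\AllowDynamicProperties]
class PairSketch extends NodeAbstractSketch
{
    public $left;
    public $right;

    public function __construct($left, $right, array $attributes = [])
    {
        parent::__construct(null, $attributes);
        $this->left = $left;
        $this->right = $right;
    }

    public function getSubNodeNames()
    {
        return ['left', 'right'];
    }
}

$node = new PairSketch('a', 'b');
// A dynamically added property is not reported as a subnode.
$node->namespacedName = 'Foo\Bar';
var_dump($node->getSubNodeNames());
```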
While arrays (with name components) could technically be allowed (as
they are supported by the Name node itself), more likely than not
an array would be due to incorrect usage of the API (e.g. array instead
of variadics).
Also change endFilePos semantics to refer to the last character that
is *included* in the token, rather than one past the last character.
This ensures that all end* attributes have the same semantics.
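With inclusive end offsets, the length of a token is endFilePos - startFilePos + 1. The sketch below illustrates this using PHP's built-in token_get_all() rather than the library's lexer; the function name tokenOffsets() and the array layout are assumptions made for the example.

```php
<?php
// Hypothetical helper: computes start/end offsets for each token, with the
// end offset pointing at the last character that is part of the token.
function tokenOffsets(string $code): array
{
    $pos = 0;
    $result = [];
    foreach (token_get_all($code) as $token) {
        // Tokens are either [id, text, line] arrays or single characters.
        $text = is_array($token) ? $token[1] : $token;
        $result[] = [
            'text' => $text,
            'startFilePos' => $pos,
            'endFilePos' => $pos + strlen($text) - 1,
        ];
        $pos += strlen($text);
    }
    return $result;
}

$code = '<?php echo $a + 1;';
foreach (tokenOffsets($code) as $t) {
    $length = $t['endFilePos'] - $t['startFilePos'] + 1;
    // Slicing the source with the offsets gives back the token text.
    var_dump(substr($code, $t['startFilePos'], $length) === $t['text']);
}
```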
The lexer can now optionally add startFilePos and endFilePos
attributes, which are offsets into the lexed code string.
The end offset currently points one past the last character of
the token - this is pending further discussion.
The attributes are not added by default and have to be enabled
using the new 'usedAttributes' lexer option:
$lexer = new Lexer([
    'usedAttributes' => [
        'comments', 'startLine', 'endLine',
        'startFilePos', 'endFilePos'
    ]
]);
And improve the code a tad bit in general.
I left YY2TBLSTATES and YYNLSTATES around, because I don't fully
understand their role in the action double indexing.
The uniqid function is *very* slow on unix systems. The code has no
particular unique-ness requirements, so the much faster mt_rand()
function is used instead.
Closes PR #65.
The end attributes previously were always assigned from the last read token,
which does not necessarily correspond to the last token in the reduced rule.
In particular this occurs if the parser read a new token and based on that
lookahead decided to reduce a rule. The behavior was only correct if the
newly read token was first shifted and then the rule was reduced.
This is fixed by buffering the endAttributes of the new token in a temporary
variable and only assigning them once the token is shifted.
Previously the pretty printer added unnecessary and odd-looking parentheses
when several operators with the same precedence were chained:
'a' . 'b' . 'c' . 'd' . 'e'
// was printed as
'a' . ('b' . ('c' . ('d' . 'e')))
Another issue reported as part of #39 was that assignments inside closures
were wrapped in parentheses:
function() {
    $a = $b;
}
// was printed as
function() {
    ($a = $b);
}
This was caused by the automatic precedence handling, which just regarded
the closure as an ordinary nested expression.
With the new system the $precedenceMap of PrettyPrinterAbstract contains both
precedence and associativity and there is a new method pPrec() which prints
a node taking precedence and associativity into account.
For simpler usage there are additional methods pInfixOp(), pPrefixOp() and
pPostfixOp().
Prints not going through pPrec() do not have any precedence handling (fixing
the closure issue).
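The idea of deciding on parentheses from precedence and associativity can be sketched as follows. This is a simplified standalone illustration, not the library's pPrec(); the function name printExpr(), the array-based expression format and the small precedence table are all assumptions.

```php
<?php
// Illustrative table, not PHP's full operator table:
// operator => [precedence (lower binds tighter), associativity]
// Associativity: -1 = left, 1 = right.
const PRECEDENCE = [
    '*' => [1, -1],
    '+' => [2, -1],
    '.' => [2, -1],
    '=' => [3, 1],
];

// A node is either a string (leaf) or [operator, left, right].
// $childPos is -1 if the node is the left operand of its parent, 1 if right.
function printExpr($node, int $parentPrec = PHP_INT_MAX, int $parentAssoc = 0, int $childPos = 0): string
{
    if (is_string($node)) {
        return $node;
    }
    [$op, $left, $right] = $node;
    [$prec, $assoc] = PRECEDENCE[$op];
    $text = printExpr($left, $prec, $assoc, -1)
        . ' ' . $op . ' '
        . printExpr($right, $prec, $assoc, 1);
    // Parentheses are needed if the child binds less tightly than the parent,
    // or binds equally but sits on the side the parent does not associate to.
    $needsParens = $prec > $parentPrec
        || ($prec === $parentPrec && $parentAssoc !== $childPos);
    return $needsParens ? '(' . $text . ')' : $text;
}

// A left-nested chain of a left-associative operator needs no parentheses.
echo printExpr(['.', ['.', "'a'", "'b'"], "'c'"]), "\n";
// A lower-precedence child is wrapped.
echo printExpr(['*', ['+', '1', '2'], '3']), "\n";
```

Anything printed without going through such a check, like the body of a closure, gets no parentheses at all.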
Directly creating the node isn't necessary anymore, the token only needs
to be parsed. This makes it consistent with the other scalar parsing
methods and removes the need to pass $arguments around.
* nested list()s will now create nested List nodes (instead of just
nested arrays)
* yield $k => $v was parsed with key and value swapped. This is now fixed
* the pretty printer now works with the newly added language constructs
Example: foreach ($coords as list($x, $y)) { ... }
This change slightly breaks backwards compatibility, as it changes the
node structure for the previously existing `list(...) = $foo` assignments.
Those no longer have a dedicated `AssignList` node; instead they are
parsed as a normal `Assign` node with a `List` as `var`. Similarly the
use in `foreach` will generate a `List` for `valueVar`.
The new dereferencing syntaxes (new Foo)->bar and (new Foo)['bar'] were
causing a shift/reduce conflict with the '(' expr ')' rule. When
(new Foo) was encountered (without dereference operators following) the
parser thus threw a parse error.
The fix simply adds a special '(' new_expr ')' rule to expr. This does not
remove the shift/reduce conflict itself, but makes it irrelevant.
This fixes issue #20.
getDocComment() now returns the last comment (given that it is a doc
comment). setDocComment() no longer exists, as it doesn't make sense
with the comment objects anymore. getAttribute() now returns by reference,
so it also works in reference contexts.
Now two arrays are fetched from the lexer: $startAttributes and
$endAttributes. When constructing the attributes for a node, the
$startAttributes from the first token of the node and the $endAttributes
of the last token of the node are merged.
Now the end line is saved in the endLine attribute.
The yacc parser skeleton with all those odd $yy short names is quite
non-obvious. This commit starts to refactor it a bit, to use more
obvious names and logic.
Now the lexer is injected only once when creating the parser. Instead of
$parser = new PHPParser_Parser;
$parser->parse(new PHPParser_Lexer($code));
$parser->parse(new PHPParser_Lexer($code2));
you write:
$parser = new PHPParser_Parser(new PHPParser_Lexer);
$parser->parse($code);
$parser->parse($code2);
lcfirst() isn't defined on PHP 5.2, so I added a fallback function, which
is defined in the bootstrap.php. Not sure whether that's the right place
to put it.
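A fallback of this kind could look roughly like the sketch below. The helper name lcfirstFallback() is hypothetical and exists only so the logic can be exercised on PHP versions that already provide lcfirst(); the actual fallback in bootstrap.php may differ.

```php
<?php
// Hypothetical helper carrying the fallback logic: lowercases the first
// character and leaves the rest of the string untouched.
function lcfirstFallback($string)
{
    if ($string === '') {
        return '';
    }
    return strtolower($string[0]) . substr($string, 1);
}

// Only define lcfirst() where the PHP version does not provide it.
if (!function_exists('lcfirst')) {
    function lcfirst($string)
    {
        return lcfirstFallback($string);
    }
}

var_dump(lcfirstFallback('FooBar'));
```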
* codeGeneration:
Add docs for templates
Add a filesystem template loader.
Add simple templating support.
Add usage example for builders to docs
Add function builder
Add ability to specify arrays as default values
Add property builder
Add parameter builder
Add method builder
Add class builder
The subNodes array was not initialized, so for empty nodes it would just
be null. Due to the addition of attributes for nodes those have to be
initialized too.
The template loader loads templates from a base directory (and can
optionally use a suffix). For example:
$templateLoader = new PHPParser_TemplateLoader(
    $parser, './templates', '.php'
);
// loads ./templates/TestTemplate.php
$templateLoader->load('TestTemplate');
Again the implementation is not optimal. The loader probably shouldn't
instantiate the Template itself, but instead should accept a
TemplateFactory. This seemed like overkill to me, so I left it out.
Templates use __name__ placeholders. A variant of the placeholder with a
capitalized first letter can be accessed using __Name__ (this is useful
for camel case identifiers, e.g. get__Name__).
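The placeholder scheme can be illustrated with plain string replacement. This sketch works on strings only, whereas the library's Template class operates through the lexer and parser; the function name fillPlaceholders() is hypothetical.

```php
<?php
// Hypothetical helper: replaces __name__ with the value and __Name__ with
// the value's first letter capitalized.
function fillPlaceholders(string $template, array $placeholders): string
{
    $map = [];
    foreach ($placeholders as $name => $value) {
        $map['__' . $name . '__'] = $value;
        $map['__' . ucfirst($name) . '__'] = ucfirst($value);
    }
    // strtr() with an array replaces each placeholder in a single pass.
    return strtr($template, $map);
}

$template = 'function get__Name__() { return $this->__name__; }';
echo fillPlaceholders($template, ['name' => 'title']), "\n";
```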
Currently the implementation is not particularly clean, because the Template
instantiates a Lexer itself. Fixing this requires a major refactoring of
the lexer/parser interface.
If a NodeVisitor returns an array of nodes to merge these will no longer be traversed by all other visitors. That "feature" turned out to be a real pain in the ass on some occasions ;)
The parser didn't account for the additional newline after the content of doc strings, which is left there by the tokenizer for some reason. Additionally, escape sequences were parsed in nowdoc strings.
Additionally this contains some minor changes to the grammar: Some _list nonterminals were refactored to have the possible single elements in a separate rule and only assemble those single elements. (This reduces duplication and gives better assignment of line number context.)
a) ->traverseNode() now operates on a clone of the node, otherwise the original node will be modified too
b) before nodes were passed to the following visitor unchanged, even though they were already changed in the tree
(new A)->b(), (new A)->b, (new A)[0]. The feature is not fully compliant (it is implemented as a `variable`, not an `expr_without_variable`). Awaiting input on that on internals@.
Instead manually implement IteratorAggregate and define the required magic methods. The reasoning behind this is:
a) Extending ArrayObject is always risky, because a lot of magic which is known to be buggy is involved
b) This allows changing the implementation of the nodes altogether later on, for example it could be changed to using real public fields instead of a $subNodes array.
This time properly. Only remaining problem is that floats like 1e1000 are printed as INF. This may or may not be acceptable. The value will be the same, but the tests will signal a diff failure.
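The INF behavior can be seen in plain PHP, independent of the pretty printer: a literal that exceeds the range of a double evaluates to INF, so the original spelling is lost once the value is a float.

```php
<?php
// 1e1000 is far outside the range of a double and evaluates to infinity.
$value = 1e1000;

var_dump(is_infinite($value));
var_dump($value === INF);
// The string form no longer resembles the original literal.
var_dump((string) $value);
```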
The NameResolver visitor tries to resolve all names to fully qualified names. It will resolve all non-dynamic names, apart from unqualified function and constant names. The latter can not be resolved properly without running the code.