php-parser/lib/PhpParser/Node/Scalar/String_.php

<?php declare(strict_types=1);

namespace PhpParser\Node\Scalar;

use PhpParser\Error;
use PhpParser\Node\Scalar;

class String_ extends Scalar
{
    /* For use in "kind" attribute */
    const KIND_SINGLE_QUOTED = 1;
    const KIND_DOUBLE_QUOTED = 2;
    const KIND_HEREDOC = 3;
    const KIND_NOWDOC = 4;

    /** @var string String value */
    public $value;

    protected static $replacements = [
        '\\' => '\\',
        '$'  => '$',
        'n'  => "\n",
        'r'  => "\r",
        't'  => "\t",
        'f'  => "\f",
        'v'  => "\v",
        'e'  => "\x1B",
    ];

    /**
     * Constructs a string scalar node.
     *
     * @param string $value      Value of the string
     * @param array  $attributes Additional attributes
     */
    public function __construct(string $value, array $attributes = []) {
        $this->attributes = $attributes;
        $this->value = $value;
    }

    public function getSubNodeNames() : array {
        return ['value'];
    }

    /**
     * @internal
     *
     * Parses a string token.
     *
     * @param string $str String token content
     * @param bool $parseUnicodeEscape Whether to parse PHP 7 \u escapes
     *
     * @return string The parsed string
     */
    public static function parse(string $str, bool $parseUnicodeEscape = true) : string {
        // Skip an optional binary-string prefix (b'...' or B"...").
        $bLength = 0;
        if ('b' === $str[0] || 'B' === $str[0]) {
            $bLength = 1;
        }

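        // Single-quoted strings only recognise the \\ and \' escapes;
        // every other string type goes through parseEscapeSequences().
        if ('\'' === $str[$bLength]) {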
            return str_replace(
                ['\\\\', '\\\''],
                ['\\', '\''],
                substr($str, $bLength + 1, -1)
            );
        } else {
            return self::parseEscapeSequences(
                substr($str, $bLength + 1, -1), '"', $parseUnicodeEscape
            );
        }
    }

    /**
     * @internal
     *
     * Parses escape sequences in strings (all string types apart from single quoted).
     *
     * @param string      $str   String without quotes
     * @param null|string $quote Quote type
     * @param bool $parseUnicodeEscape Whether to parse PHP 7 \u escapes
     *
     * @return string String with escape sequences parsed
     */
    public static function parseEscapeSequences(string $str, $quote, bool $parseUnicodeEscape = true) : string {
        if (null !== $quote) {
            $str = str_replace('\\' . $quote, $quote, $str);
        }

        $extra = '';
        if ($parseUnicodeEscape) {
            $extra = '|u\{([0-9a-fA-F]+)\}';
        }

        return preg_replace_callback(
            '~\\\\([\\\\$nrtfve]|[xX][0-9a-fA-F]{1,2}|[0-7]{1,3}' . $extra . ')~',
            function($matches) {
                $str = $matches[1];

                if (isset(self::$replacements[$str])) {
                    return self::$replacements[$str];
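                } elseif ('x' === $str[0] || 'X' === $str[0]) {
                    // Drop the leading x/X: hexdec() only ignores non-hex
                    // characters, and doing so is deprecated as of PHP 7.4.
                    return chr(hexdec(substr($str, 1)));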
                } elseif ('u' === $str[0]) {
                    return self::codePointToUtf8(hexdec($matches[2]));
                } else {
                    return chr(octdec($str));
                }
            },
            $str
        );
    }

    /**
     * Converts a Unicode code point to its UTF-8 encoded representation.
     *
     * @param int $num Code point
     *
     * @return string UTF-8 representation of code point
     */
    private static function codePointToUtf8(int $num) : string {
        if ($num <= 0x7F) {
            return chr($num);
        }
        if ($num <= 0x7FF) {
            return chr(($num>>6) + 0xC0) . chr(($num&0x3F) + 0x80);
        }
        if ($num <= 0xFFFF) {
            return chr(($num>>12) + 0xE0) . chr((($num>>6)&0x3F) + 0x80) . chr(($num&0x3F) + 0x80);
        }
        if ($num <= 0x1FFFFF) {
            return chr(($num>>18) + 0xF0) . chr((($num>>12)&0x3F) + 0x80)
                 . chr((($num>>6)&0x3F) + 0x80) . chr(($num&0x3F) + 0x80);
        }
        throw new Error('Invalid UTF-8 codepoint escape sequence: Codepoint too large');
    }
    
    public function getType() : string {
        return 'Scalar_String';
    }
}