ext-php-rs/.gitignore
ju1ius d52a878e7b
feat: allows ZendStr to contain null bytes (#202)
Closes https://github.com/davidcole1340/ext-php-rs/issues/200

## Rationale
In PHP zend_strings are binary strings with no encoding information. They can contain any byte at any position.
The current implementation use `CString` to transfer zend_strings between Rust and PHP, which prevents zend_strings containing null-bytes to roundtrip through the ffi layer. Moreover, `ZendStr::new()` accepts only a `&str`, which is incorrect since a zend_string is not required to be valid UTF8.

When reading the PHP source code, it is apparent that  most functions marked with `ZEND_API` that accept a `const *char` are convenience wrappers that convert the `const *char` to a zend_string and delegate to another function. For example [zend_throw_exception()](eb83e0206c/Zend/zend_exceptions.c (L823)) takes a `const *char message`, and just converts it to a zend_string before delegating to [zend_throw_exception_zstr()](eb83e0206c/Zend/zend_exceptions.c (L795)).

I kept this PR focused around `ZendStr` and it's usages in the library, but it should be seen as the first step of a more global effort to remove usages of `CString` everywhere possible.

Also, I didn't change the return type of the string related methods of `Zval` (e.g. I could have made `Zval::set_string()` 
 accept an `impl AsRef<[u8]>` instead of `&str` and return `()` instead of `Result<()>`). If I get feedback that it should be done in this PR, I'll do it.

## Summary of the changes:
### ZendStr
* [BC break]: `ZendStr::new()` and `ZendStr::new_interned()` now accept an `impl AsRef<[u8]>` instead of just `&str`, and are therefore infaillible (outside of the cases where we panic, e.g. when allocation fails). This is a BC break, but it's impact shouldn't be huge (users will most likely just have to remove a bunch of `?` or add a few `Ok()`).
* [BC break]: Conversely, `ZendStr::as_c_str()` now returns a `Result<&CStr>` since it can fail on strings containing null bytes.
* [BC break]: `ZensStr::as_str()` now returns a `Result<&str>` instead of an `Option<&str>` since we have to return an error in case of invalid UTF8.
* adds method `ZendStr::as_bytes()` to return the underlying byte slice.
* adds convenience methods `ZendStr::as_ptr()` and `ZendStr::as_mut_ptr()` to return raw pointers to the zend_string.

 ### ZendStr conversion traits
* adds `impl AsRef<[u8]> for ZendStr`
* [BC break]: replaces `impl TryFrom<String> for ZBox<ZendStr>` by `impl From<String> for ZBox<ZendStr>`.
* [BC break]: replaces `impl TryFrom<&str> for ZBox<ZendStr>` by `impl From<&str> for ZBox<ZendStr>`.
* [BC break]: replaces `impl From<&ZendStr> for &CStr` by `impl TryFrom<&ZendStr> for &CStr`.

### Error
* adds new enum member `Error::InvalidUtf8` used when converting a `ZendStr` to `String` or `&str`
2022-12-09 10:54:17 +01:00

6 lines
49 B
Plaintext

/target
Cargo.lock
/.vscode
/.idea
/tmp
expand.rs