1
0
mirror of https://github.com/danog/MadelineProto.git synced 2024-11-30 09:38:57 +01:00
MadelineProto/CHANGELOG.md
2019-06-06 00:33:11 +02:00

27 KiB
Raw Blame History

Changelog

4.0.0 Full async

Fully asynchronous MadelineProto!

MadelineProto now features async, for incredible speed improvements, and parallel processing. Powered by amphp, MadelineProto wraps the AMPHP APIs to provide a simpler generator-based async API.

What exactly is async, you may ask, and how is it better than threading or multiprocessing?
Async is a relatively new programming pattern that allows you to easily write non-blocking code as if you were using standard blocking functions, all without the need for complex message exchange systems and synchronization handling for threaded programs, that only add overhead and complexity to your programs, making everything slower and error-prone.

More simply put: with MadelineProto 4.0, each update is handled in parallel using a separate thread, and everything is done in parallel (even on restricted webhosts, perfect for creating file downloader bots!).

To enable async, you have to do two simple things:

  1. Load the latest version of MadelineProto
  2. yield your method calls.

That's it!
No need to set up thread pools (that don't even work in PHP), use synchronization primitives, and so on... Just yield $MadelineProto->messages->sendMessage instead of $MadelineProto->messages->sendMessage.


And now, on to the API changes:

  • First of all, we've got several bucketloads of telegram API changes, that can be viewed in the first posts.
  • Very important, I wrote documentation on what to do if your account gets banned. * Dropped support for PHP 5 and PHP 7.0: these versions of PHP have officially reached their EOL, so MadelineProto will not support them anymore.
    You should use MadelineProto with PHP 7.3 (or PHP 7.2; PHP 7.1 is supported but not recommended).
  • Dropped support for get_updates: it won't work properly on async, and I really recommend you stop using it
  • You can now use the @support username in sendMessage and other methods to send messages to the support user!
  • Now MadelineProto will automatically try to get the access hash of users not present in the internal peer database (this should reduce errors)!
  • If any file cannot be downloaded to due issues with the tg media server that is hosting it, it will be automatically sent to the @support user (settings).
  • Documented the MyTelegramOrgWrapper API, that can be used to login programmaticaly to the [my.telegram.org](https://my.telegram.org management page).
  • Added an update_2fa method to update the login password
  • Added a get_full_dialogs method to get a full list of all chats youre member of, including dialog info (such as the pinned/last message ID, unread count, tag count, notification settings and message drafts).
  • Added support for automatic file uploads by name in secret chats (as with normal chats); you can also now send secret chat messages using the sendMessage method as if it were a normal chat
  • Added a resetUpdateState method to reset the update state and fetch ALL updates from the beginning
  • Improved chat message splitting algorithm (if the message you're trying to send is too long): performance improvements, and it will now notify you via the logs if there are too many entities in the logs, or if the entities are too long.
  • Improved the get_self method.
  • Added a __magic_sleep substitute for __sleep in the CombinedEventHandler
  • Removed all dependencies to curl: now all HTTP requests are made asynchronously using a custom version of artax (more on that later).
  • Updated php-libtgvoip and introduced a common API for changing phone call settings
  • Improved the madeline.php loader
  • Removed the old serialization APIs: now serialization is done automatically by MadelineProto!
  • Increased the default flood wait limit to 10 minutes, since with async waiting for the flood wait isn't blocking anymore
  • When running from web, MadelineProto will also automatically enable logging of PHP errors (not MadelineProto logs) to MadelineProto.log, located in the same directory as the script that loaded MadelineProto.

Naturally, async is not the only feature present in MadelineProto 4.0: to implement async in MadelineProto, I rewrote the entire codebase two times over the course of the last six months, as shown in the diff:

~$ git diff --stat old master
 187 files changed, 28692 insertions(+), 12288 deletions(-)
  • I completely refactored the connection stack: I threw out of the window my custom OOP wrapper for vanilla PHP sockets and created a brand new OOP connection stack.
    The new connection stack uses a brand new incremental buffered async socket wrapper Stream API, that greatly simplifies work with layered protocols like MTProto.

Each MTProto packet is passed through a Stream layer, each one taking care of one part of the MTProto envelope, finally sending off everything to telegram using a simple AMPHP socket.

The simplified async buffered implementation of AMPHP sockets I created allowed me to easily add support for ALL MTProto protocols, including a few I haven't implemented before like the padded protocol.

Obfuscation is now handled via a separate setting, and can be enabled for all transports, not just tcp_abridged (it can be used to prevent ISP blocks in combination with the padded protocol, for example).

I also added support for different MTProto transports, including the brand new websocket transport and secure websocket transport (HTTPS), until now only implemented in tdlib!
If you want all communcations with telegram to be triple-encrypted using HTTPS+MTProto+obfuscation, you can enable the wss transport (the third layer of encryption w/ obfuscation will be enabled automatically).
Note: the websocket HTTPS MTProto transport is different from the HTTPS MTProto protocol (both are supported by MadelineProto).
The websocket HTTPS MTProto transport is more reliable and faster than the HTTPS MTProto protocol, since MadelineProto does not have to handle HTTP long polls.

(the websocket transport may or may not also be used in future to implement MadelineProto in the browser with webassembly ;)

The new proxy API is also based on the new stream API, and now supports MTProxies!
You can still use the socks5 and HTTP transports if you want.
Writing your custom proxies now is now a LOT easier, thanks to the neat structure of the stream APIs and the abundant PHPDOC comments.


Streams and proxies aren't the only things that have been completely rewritten in this release:
I have written a unique socket message handler API based on MadelineProto's loop API, and guarantees maximum reliability while communicating with telegram.
By the way, the new async background loop API can also be used by you! It is perfect for repeating tasks in a cron-like manner, running multiple resumable event loops and much more! More on that later.

The new message handler loops run in the background, and guarantee that every single method call you make gets delivered to telegram.
Two signal loops running in two separate green threads take care of writing and reading data from the socket asynchronously.
A third signal loop uses state request messages to make sure that the method calls were received by telegram, and queries replies/resend method calls if something's wrong.
A fourth signal loop takes care of HTTP long polling.

This guarantees maximum stability even if telegram's having server issues.
The write loop also greatly reduces overhead, increasing performances by automatically wrapping in containers multiple method calls: this is especially useful when making multiple method calls simultaneously with async (more on that later).

The update state is now stored using a custom UpdatesState API, that will simplify backup to a DBMS backend later on :).


Possibly the most exciting thing to work on in this version of MadelineProto was the new update management system: I whipped it up in merely two days a few weeks ago, and it has absolutely improved the overall reliability of MadelineProto.
Huge thanks to Aliaksei Levin, the developer of tdlib, for explaining to me how exactly does the MTProto update API work: he saved me a lot of time, and was really nice <3<3<3.

While thinking of an easy way I could implement the new update system, I had an inspiration:

I created a unique, reliable, fast and extremely simple update handling system based on MadelineProto's loop API, not present in any other MTProto client, not even tdlib.

Instead of messing around with various synchronization locks, checks and cluttered update receivers, I implemented three simple feed loops.
Updates are fed to the first update loop, where some simple duplicate/hole checks are done. Updates are then fed to a second update loop, where secondary duplicate/hole checks are done.
A third loop type takes care of eventual holes and periodically fetches updates for all supergroups, channels.

All of this is implemented without any kind of additional synchronization or locking due to the nature of the MadelineProto loop APIs, with 1% of the complexity of the official clients (which means less bugs and more pony time for me).


Even if most of MadelineProto's logic is now concentrated in the new loop and stream APIs, some parts like the TL parser are still there, albeit with many changes.
For example: now the TL parser is fully asynchronous (that may sound a bit weird to some of you, but for dynamic TL parsers, this greatly increases performances and allows parallelism).

The TL deserializer now uses yet another well-structured API called the TLCallback API to automatically populate internal databases directly on deserialization: again, this paradigm greatly reduces complexity, increases performance and is absolutely unique to MadelineProto; no other MTProto client has it, not even official clients.

One of the completely new modules that I created for MadelineProto async is the file reference database: a very complex module, required to handle files in the newer versions of the Telegram API.
It makes use of the TLCallback API to map files to their origins, to be able to refetch them at any given time when the file reference expires.

Another new module I've implemented is the PasswordCalculator, that is used to calculate the new password hashes for the 2FA login, really cool.

I've also rewritten the APIFactory, the abstraction layer that stands between you and MadelineProto when you do $MadelineProto->method(): it is now fully async, and MUCH faster thanks to a new cached method mapping system.
The same cached method mapping system is also used for the event handler, which means that now the event handler is the fastest update management method.


And now, let's elaborate on async:
With MadelineProto 4.0, each update is handled in parallel using a separate thread, and everything is done in parallel (even on restricted webhosts, perfect for creating file downloader bots!).

When I say thread, I actually mean green thread (wikipedia), often called strand.
Strands are behave exactly like normal threads, except that they're implemented in user-space, and they're much faster, more reliable, and do not suffer from synchronization issues present with normal threads.

Each update you receive using the event handler or the callback handler is managed in parallel in separate green threads: the only thing you have to do to enable async with green threads is add a yield before calling MadelineProto methods.

Full async documentation with examples.

If your code still relies on the old synchronous behaviour, don't worry, there is backward compatibility.
However, old synchronous behaviour is deprecated, and I highly recommend you switch to async, due to the huge performance and parallelism benefits.


To implement async, I wrote loads of new async APIs in MadelineProto, as you may have seen above.
I used the awesome amphp async framework as base, on which to build the new MadelineProto APIs.
I heavily modified amphp coroutines and wrapped all of the AMPHP event loop methods to add native support for yielding generators.

I have also wrapped multiple amphp async libraries for ease of use and compatiblity with MadelineProto settings:

  • MadelineProto artax HTTP client:
    I wrapped the amphp artax HTTP library for greater security, and to add support for socks5 and HTTP proxies: the proxy settings are automatically extracted from MadelineProto settings.
    Soon, MadelineProto's artax will support DNS over HTTPS by default.

I also provided a wrapper version of file_get_contents:

$file = yield $MadelineProto->file_get_contents('https://url');

You can use this library to asynchronously download files from the web.

  • ALL internal MadelineProto methods were converted to async automatically using an automatic script: with some changes and conversion to use php-parser, it can be used to automatically asyncify any script (will implement in a future version of MadelineProto).
  • I created a custom async API to asynchronously construct objects:
    This allows you to create multiple instances of MadelineProto asynchronously, because each instantiation will be done asynchronously.
$com = new \danog\MadelineProto\CombinedAPI('combined_session.madeline', ['bot.madeline' => $settings, 'user.madeline' => $settings, 'user2.madeline' => $settings]);
$com->async(true);
$com->loop(function () use ($com) {
    $res = [];
    foreach (['bot.madeline', 'user.madeline', 'user2.madeline'] as $session) {
        $res []= $com->instances[$session]->start();
    }
    yield $com->all($res);
    yield $com->setEventHandler('\EventHandler');
}
$com->loop();

Internally, the combined event handler does new \danog\MadelineProto\API, but it isn't blocking:
this means that later, when I combine all the async start()s into one array and yield it using the async combinator function, initialization of sessions is done in parallel, and not one after the other.

  • I wrapped amphp's helper functions, and created some more:

The all function you saw above is one of the many combinator functions that can be used to execute multiple async operations simultaneously and wait for the result of all of them.
Each method has different error handling techniques, see the amphp docs.
Note that if you just take the result of these methods without yielding it, you can use it as a normal promise/generator.

Note: this is not the recommended method to make multiple method calls on the same instance of MadelineProto; use this only for non-API methods like start(); for API methods, use multiple async.

$promise1 = $MadelineProto->messages->sendMessage(...);
$promise2 = $MadelineProto->messages->sendMessage(...);
// $promise3 = ...;

// Equivalent to Amp\Promise\all(), but works with generators, too
$results = yield $MadelineProto->all([$promise1, $promise2, $generator3]);

// Equivalent to Amp\Promise\first(), but works with generators, too
$results = yield $MadelineProto->first([$promise1, $promise2, $generator3]);

// Equivalent to Amp\Promise\any(), but works with generators, too
$results = yield $MadelineProto->any([$promise1, $promise2, $generator3]);

// Equivalent to Amp\Promise\some(), but works with generators, too
$results = yield $MadelineProto->some([$promise1, $promise2, $generator3]);
  • Handling timeouts

These methods can be used to wait for a certain amount of time for a result, and then throw an Amp\TimeoutException or simply continue execution if no result was obtained.

// Waits for the result for 2 seconds and then throws an \Amp\TimeoutException
$result = yield $MadelineProto->timeout($promise, 2)

// Waits for the result for 2 seconds, returns the result or null (which is the result of sleep())
$result = yield $MadelineProto->first([$promise, $MadelineProto->sleep(2)]);
  • Async forking (does async green thread forking)

Useful if you need to start a process in the background and you want throwed exceptions to surface up.
These exceptions will exit the event loop, turning off the script unless you wrap $MadelineProto->loop() with a try-catch.
Use it when you do not need the result of a method (see ignored async), but you want eventual errors to crash the script.
Otherwise, just use the method without yield.

// Exceptions will surface out of the event loop()
$MadelineProto->callFork($MadelineProto->messages->sendMessage([...]));
// Exceptions will be ignored
$MadelineProto->messages->sendMessage([...]);

// Like the first one, but the call will be deferred to the next event loop tick
$MadelineProto->callForkDefer($MadelineProto->messages->sendMessage([...]));

Ignoring exceptions is usually not good practice, so it's best to wrap the method you're calling in a closure with a try-catch with some error handling code inside of it, calling it right after that and passing it to callFork:

$MadelineProto->callFork((function () use ($MadelineProto) {
    try {
        $MadelineProto->messages->sendMessage([...])
    } catch (\Exception $e) {
        // Handle by logging and stuff
    }
})());
  • I also created some wrapper functions to work asynchronously with console/browser output

Async sleep:

yield $MadelineProto->sleep(3);

Async readline:

$res = yield $MadelineProto->readLine('Optional prompt');
  • Logging in MadelineProto is now completely asynchronous and easier:
$MadelineProto->logger("Message");

No need to yield here, because the logging must be done in background.

  • Simultaneous method calls
yield $MadelineProto->messages->sendMessage([
    'multiple' => true,
    ['peer' => '@danogentili', 'message' => 'hi'],
    ['peer' => '@apony', 'message' => 'hi']
]);

This is the preferred way of combining multiple method calls: this way, the MadelineProto async WriteLoop will combine all method calls in one container, making everything WAY faster.
The result of this will be an array of results, whose type is determined by the original return type of the method (see API docs).

The order of method calls can be guaranteed (server-side, not by MadelineProto) by using call queues.

  • Exceptions:

NOTE: Due to the async nature of MadelineProto 4.0, sometimes the exception that is thrown and logged may not be the actual exception that caused the crash of the script.
To let me properly debug the issue, when reporting issues you also have to provide full logs.

MadelineProto provides some very useful async loop APIs, for executing operations periodically or on demand.

They are the perfect solution for implementing async cron loops, signal threads and much more!
I'll just link you all to the documentation: it has full examples for each and every async API (you can also check out the code, it's full of PHPDOC comments).


Writing MadelineProto async, I really enjoyed working with the AMPHP framework: it is very fast, has multiple packages to work asynchronously with HTTP requests, websockets, databases (MySQL, redis, postgres, DNS, sockets and much more)!

I chose AMPHP instead of the more famous ReactPHP due to its speed, rich set of libraries and extreme ease of use.

Working with its devs is also nice; I already contributed to the amphp's libraries with some improvements and bugfixes (I will soon also implement a DNS over HTTPS client for AMPHP, to implement in MadelineProto's artax), and I invite you to do the same!

Even if you can't contribute to AMPHP, you can still use it: as I mentioned above, there are MANY libraries to work asynchronously with files, databases, DNS, HTTP; there's even an async windows registry client, used by the DNS client to fetch default DNS servers on windows!

When you use MadelineProto async, you have to also use an amphp async database client, artax instead of curl and guzzle, and so on: otherwise, the speed of MadelineProto async may be reduced by blocking behaviour of your code.


In case you missed it, quick reminder that MadelineProto now supports MTProxy!


MadelineProto started as a hobby project, back when I knew nothing about cryptography, telegram's APIs, async or programming standards:

before, MadelineProto was created piece by piece as a single monolothic class composed of multiple traits; now, MadelineProto is composed of multiple modular APIs that are well-structured, heavily commented, wrappable, extendable in any possible and immaginable way.

I am really happy with how MadelineProto async turned out.
I had an absolute blast working on this update, and implementing async really opened a whole sea of possible innovations and features I can implement in MadelineProto now:

  • Async file upload by url (1.5gb)
  • Get direct download url of any file (1.5gb)
  • TON (this is actually going to be a lot of fun)
  • group calls (the php-libtgvoip APIs are actually ready, I just need to wrap them in php-libtgvoip)
  • video calls (~)
  • native calls: With MadelineProto async, I can finally properly implement native async phone calls in PHP:
    this will allow handling phone calls on webhosts! I already have some code I created a year ago for this backed up in a private gitlab repo :)
  • async iterators: I've been thinking of using AMPHP's async iterator API (after some modding obviously) to create async iterators for easily iterating over the messages of a group, and for doing other operations that would normally require using offsets:
foreach ($MadelineProto->getMessages('@group') as $message) {

}

This shouldn't be too hard to implement, and with a proper (maybe separate OOP) API, it's going to be fun to make and use.

  • snake_case => CamelCase conversion for all API methods:
    Previously, MadelineProto's custom API methods (get_info, download_to_dir) used snake_case, which contrasted with the Telegram API methods (sendMessage), and is against PHP coding standards.
    Soon, I plan to update MadelineProto's docs to only use CamelCase for method names.

The old method name will still be available after that; right now, you can already use both naming conventions for all MadelineProto methods:

$MadelineProto->get_pwr_chat('user'); // OK!
$MadelineProto->getPwrChat('user'); // OK!

However, I recommend you now use the CamelCase version of methods.

  • ArrayAccess on promises (to be able to do yield $method()['result'] instead of (yield $method)['result'])
  • An openChat method, inspired by tdlib, to enable fetching updates from groups you aren't a member of
  • Add support for Telegram passport in 2FA and write some wrapper APIs
  • Write some simplified APIs for takeout (can be implemented using async iterators)
  • #MadelineProtoForNode async and lua async (the second can already be done now, the first is also pretty easy now that async is here :)))))
  • DNS over HTTPS everywhere
  • Parallelize some methods like the download method, or getPwrChat (upload is already fully parallelized)
  • Get sponsor of MTProxies
  • Optional max_id and min_id params in methods
  • #phase1 🇮🇷🇷🇺

To use MadelineProto 4.0 w/ async, you have to load the latest version of MadelineProto from the master branch by loading it through composer (dev-master) or with madeline.php:

<?php

if (!file_exists('madeline.php')) {
    copy('https://phar.madelineproto.xyz/madeline.php', 'madeline.php');
}
define('MADELINE_BRANCH', '');
include 'madeline.php';

In a few weeks I will set MadelineProto 4.0 as default with madeline.php: in the meantime, I do not provide support for the old version.


  • Write some docs for the useful get mime funcs
  • Figure out web file proxying (might be interesting)
  • Re-enable gzip in the write loop
  • no defer logs
  • startedLoop docs

tell about restart tell about madeline.php loading in the same dire

remind about using the define

Handle auth_restart splitting