Python: waiting for a function to finish

Waiting in asyncio

One of the main appeals of using Python’s asyncio is being able to fire off many coroutines and run them concurrently. How many ways do you know to wait for their results?

There are quite a few of them! The different ways have different properties and all of them deserve their place, but I regularly have to look them up to find the right one.

Before we start, a few definitions that I will use throughout this post:

  • coroutine: A running asynchronous function. So if you define a function as async def f(): ... and call it as f(), you get back a coroutine in the sense that the term is used throughout this post.
  • awaitable: anything that works with await: coroutines, asyncio.Futures, asyncio.Tasks, and objects that have an __await__ method.
  • I will be using two async functions f and g for my examples. It’s not important what they do, only that they are defined as async def f(): ... and async def g(): ... and that they terminate eventually.

await

The simplest case is to await your coroutines one after the other: first await f(), then await g(). This has two drawbacks:

  1. The coroutines do not run concurrently. g only starts executing after f has finished.
  2. You can’t cancel them once you have started awaiting them.
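To make the first drawback visible, here is a minimal sketch of sequential awaiting. The bodies of f and g are stand-ins I made up for illustration (they only sleep briefly and log what they do), and the log list is likewise just a helper for this example.

```python
import asyncio

log = []  # records the order of events so the lack of overlap is visible


async def f():
    # Hypothetical body: any terminating coroutine would do.
    log.append("f started")
    await asyncio.sleep(0.01)
    log.append("f finished")
    return "result of f"


async def g():
    log.append("g started")
    await asyncio.sleep(0.01)
    log.append("g finished")
    return "result of g"


async def main():
    log.clear()
    result_f = await f()  # g has not even been called yet
    result_g = await g()  # starts only after f has finished
    return result_f, result_g


if __name__ == "__main__":
    print(asyncio.run(main()))
    print(log)
```

The log shows f starting and finishing before g gets a chance to start.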

A naïve approach to the first problem might be to create both coroutine objects up front, coro_f = f() and coro_g = g(), and only then await them one after the other.

But the execution of g / coro_g doesn’t start before it is awaited, making this identical to the first example. For both problems you need to wrap your coroutines in tasks.
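A small sketch of that naïve attempt, again with made-up bodies for f and g and an illustrative log list. It suggests that calling f() and g() merely creates coroutine objects and that nothing runs until each one is awaited.

```python
import asyncio

log = []  # illustrative helper, not part of asyncio


async def f():
    log.append("f started")
    await asyncio.sleep(0.01)
    log.append("f finished")


async def g():
    log.append("g started")
    await asyncio.sleep(0.01)
    log.append("g finished")


async def main():
    log.clear()
    coro_f = f()  # creates a coroutine object; no code of f runs here
    coro_g = g()
    log.append("coroutine objects created")
    await coro_f
    await coro_g  # g starts only now
    return list(log)


if __name__ == "__main__":
    print(asyncio.run(main()))
```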

Tasks

asyncio.Tasks wrap your coroutines and get independently scheduled for execution by the event loop whenever you yield control to it. You can create them using asyncio.create_task(), for example task_f = asyncio.create_task(f()) and task_g = asyncio.create_task(g()), and await the tasks afterwards.

Your tasks now run concurrently, and if you decide that you don’t want to wait for task_f or task_g to finish, you can cancel them using task_f.cancel() or task_g.cancel() respectively. Please note that you must create both tasks before you await the first one; otherwise you gain nothing. The awaits themselves are only needed to collect the results and to clean up resources (asyncio will complain if you don’t consume all your results and exceptions).
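Here is a sketch of both points: tasks that overlap, and a task that gets canceled. The sleep durations, the log list, and the helper names run_concurrently and cancel_one are my own choices for illustration.

```python
import asyncio

log = []  # illustrative helper to show the interleaving


async def f():
    log.append("f started")
    await asyncio.sleep(0.02)
    log.append("f finished")
    return "result of f"


async def g():
    log.append("g started")
    await asyncio.sleep(0.01)
    log.append("g finished")
    return "result of g"


async def run_concurrently():
    log.clear()
    # Create BOTH tasks before awaiting either of them.
    task_f = asyncio.create_task(f())
    task_g = asyncio.create_task(g())
    result_f = await task_f  # g runs in the meantime
    result_g = await task_g
    return result_f, result_g, list(log)


async def cancel_one():
    task_f = asyncio.create_task(f())
    await asyncio.sleep(0)  # yield to the event loop once so f can start
    task_f.cancel()
    try:
        await task_f  # still awaited, to consume the outcome
    except asyncio.CancelledError:
        return task_f.cancelled()


if __name__ == "__main__":
    print(asyncio.run(run_concurrently()))
    print(asyncio.run(cancel_one()))
```

In the first helper, g starts while f is still sleeping, which is exactly what plain awaiting could not give us.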

But waiting for each of them like this is not very practical. In real-life code you often enough don’t even know how many awaitables you will need to wrangle. What we need is to gather the results of multiple awaitables.

asyncio.gather()

asyncio.gather() takes one or more awaitables as *args, wraps them in tasks if necessary, and waits for all of them to finish. Then it returns the results of all awaitables in the same order as you passed them in, so await asyncio.gather(f(), g()) gives you the result of f first and the result of g second.
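A minimal sketch of the ordering guarantee. The bodies of f and g are invented for the example; g is deliberately the faster one.

```python
import asyncio


async def f():
    await asyncio.sleep(0.02)  # f is the slower one on purpose
    return "result of f"


async def g():
    await asyncio.sleep(0.01)
    return "result of g"


async def main():
    # g finishes first, but the results follow the argument order.
    result_f, result_g = await asyncio.gather(f(), g())
    return result_f, result_g


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If you don’t know the number of awaitables in advance, you can unpack a list of them into the call with the usual * syntax.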

If f() or g() raises an exception, gather() will raise it immediately, but the other tasks are not affected. However, if gather() itself is canceled, all of the awaitables that it’s gathering, and that have not completed yet, are also canceled.

You can also pass return_exceptions=True. Exceptions are then returned like normal results, and you have to check yourself whether or not each awaitable was successful (e.g. using isinstance(result, BaseException)).
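Both behaviors can be sketched as follows. The failing f, the slow g, and the two helper names are assumptions made for this example only.

```python
import asyncio


async def f():
    # Hypothetical failure, raised as soon as f runs.
    raise ValueError("f failed")


async def g():
    await asyncio.sleep(0.05)
    return "result of g"


async def default_behaviour():
    task_g = asyncio.create_task(g())
    g_still_running = False
    try:
        await asyncio.gather(f(), task_g)
    except ValueError:
        # gather() raised right away, but it did not cancel g.
        g_still_running = not task_g.done()
    return g_still_running, await task_g


async def with_return_exceptions():
    results = await asyncio.gather(f(), g(), return_exceptions=True)
    failed = [r for r in results if isinstance(r, BaseException)]
    succeeded = [r for r in results if not isinstance(r, BaseException)]
    return failed, succeeded


if __name__ == "__main__":
    print(asyncio.run(default_behaviour()))
    print(asyncio.run(with_return_exceptions()))
```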

Summary

  • Takes many awaitables as *args.
  • Wraps each awaitable in a task if necessary.
  • Returns the list of results in the same order.
    • Allows errors to be returned as results (by passing return_exceptions=True).
    • Otherwise, if one of the awaitables raises an exception, gather() propagates it immediately to the caller. But the remaining tasks keep running.

Now we can wait for many awaitables at once! However, well-behaved distributed systems need timeouts. Since gather() has no option for that, we need the next helper.

asyncio.wait_for()

asyncio.wait_for() takes two arguments: one awaitable and a timeout in seconds. If the awaitable is a coroutine, it will automatically be wrapped by a task. So it is quite common to wrap a whole gather() call, as in await asyncio.wait_for(asyncio.gather(f(), g()), timeout=5.0), where the five seconds are an arbitrary example value, and to catch asyncio.TimeoutError around it.

If the timeout expires, the inner task gets canceled. For gather(), that means that all tasks that it is gathering are canceled too: in this case f() and g().
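A sketch of that cancellation chain, assuming deliberately slow f and g and a very short timeout. The canceled list exists only to make the effect observable.

```python
import asyncio

canceled = []  # illustrative helper: who noticed the cancellation


async def f():
    try:
        await asyncio.sleep(10)  # far longer than the timeout below
    except asyncio.CancelledError:
        canceled.append("f")
        raise  # always re-raise so the task really ends up canceled


async def g():
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        canceled.append("g")
        raise


async def main():
    canceled.clear()
    try:
        await asyncio.wait_for(asyncio.gather(f(), g()), timeout=0.05)
    except asyncio.TimeoutError:
        return "timed out", sorted(canceled)
    return "finished", sorted(canceled)


if __name__ == "__main__":
    print(asyncio.run(main()))
```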

Please note that just replacing create_task() by wait_for() and calling it a day does not work. create_task() is a regular function that returns a task; wait_for() is an async function that returns a coroutine. That means it does not start executing until you await it, so two wait_for() calls that are created first and awaited one after the other still run sequentially.
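The difference can be sketched like this. The bodies of f and g, the log list, and the one-second timeouts are placeholders of mine. Wrapping each wait_for() call in create_task(), as in the second helper, is one possible way to get concurrency plus per-awaitable timeouts.

```python
import asyncio

log = []  # illustrative helper


async def f():
    log.append("f started")
    await asyncio.sleep(0.01)
    log.append("f finished")


async def g():
    log.append("g started")
    await asyncio.sleep(0.02)
    log.append("g finished")


async def not_concurrent():
    log.clear()
    # These are plain coroutine objects; nothing has started yet.
    cf = asyncio.wait_for(f(), timeout=1)
    cg = asyncio.wait_for(g(), timeout=1)
    await cf
    await cg
    return list(log)


async def concurrent():
    log.clear()
    # Wrapping each wait_for() in a task schedules it right away.
    task_f = asyncio.create_task(asyncio.wait_for(f(), timeout=1))
    task_g = asyncio.create_task(asyncio.wait_for(g(), timeout=1))
    await task_f
    await task_g
    return list(log)


if __name__ == "__main__":
    print(asyncio.run(not_concurrent()))
    print(asyncio.run(concurrent()))
```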

If you now think that there would be no need for wait_for() if gather() had a timeout option, we’re thinking the same thing.

Summary

  • Takes one awaitable.
  • Wraps the awaitable in a task if necessary.
  • Takes a timeout that cancels the task if it expires.
  • Unlike create_task(), is a coroutine itself that doesn’t execute until awaited.

Interlude: async-timeout

A more elegant approach to timeouts is the async-timeout package on PyPI. It gives you an asynchronous context manager that allows you to apply a total timeout even if you need to execute the coroutines sequentially: you await them one after the other inside a single async with block, and the timeout covers the whole block.

As of Python 3.11, the standard library also has both asyncio.timeout() and asyncio.timeout_at().

Sometimes, you don’t want to wait until all awaitables are done. Maybe you want to process them as they finish and report some kind of progress to the user.

asyncio.as_completed()

asyncio.as_completed() takes an iterable of awaitables and returns an iterator that yields asyncio.Futures in the order the awaitables are done. You loop over that iterator and await each item to get the next available result.

There’s no way to find out which awaitable you’re awaiting though.
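A minimal sketch of processing results as they arrive, with invented bodies for f and g where g is the faster one.

```python
import asyncio


async def f():
    await asyncio.sleep(0.05)
    return "result of f"


async def g():
    await asyncio.sleep(0.01)
    return "result of g"


async def main():
    results = []
    for fut in asyncio.as_completed([f(), g()]):
        # Each item must be awaited; it resolves to the next finished result.
        results.append(await fut)
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The results come back in completion order, not in the order the awaitables were passed in, so the result itself has to tell you where it came from.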

Summary

  • Takes many awaitables in an iterable.
  • Yields Futures that you have to await as soon as something is done.
  • Does not guarantee to return the original awaitables that you passed in.
  • Does wrap the awaitables in tasks (it actually calls asyncio.ensure_future() on them).
  • Takes an optional timeout.

Finally, you may want more control over waiting and that takes us to the final waiting primitive.

asyncio.wait()

asyncio.wait() is the most unwieldy of the APIs but also the most powerful one. It is a little reminiscent of the venerable select() system call.

Like as_completed(), it takes awaitables in an iterable. It will return two sets: the awaitables that are done and those that are still pending. It’s up to you to await them and to determine which result belongs to what, for example by keeping references to tasks such as task_f and task_g and checking whether each one is in the done set.

Such a check would not work if you passed in a coroutine and wait() wrapped it in a task, because the returned awaitable would be different from the one that you passed in and the identity check would always fail. Older Python versions wrapped coroutines anyway and emitted a deprecation warning because it’s probably a bug; since Python 3.11, passing coroutines directly to wait() raises a TypeError, so wrap them in tasks yourself.
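A sketch of the membership check with explicitly created tasks. The bodies of f and g and the results dictionary are illustrative choices.

```python
import asyncio


async def f():
    await asyncio.sleep(0.01)
    return "result of f"


async def g():
    await asyncio.sleep(0.01)
    return "result of g"


async def main():
    # Create the tasks ourselves so we hold the very objects wait() returns.
    task_f = asyncio.create_task(f())
    task_g = asyncio.create_task(g())
    done, pending = await asyncio.wait({task_f, task_g})

    results = {}
    if task_f in done:
        results["f"] = await task_f
    if task_g in done:
        results["g"] = await task_g
    return results, len(pending)


if __name__ == "__main__":
    print(asyncio.run(main()))
```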

How can an awaitable be still pending when wait() returns? There are two possibilities:

  1. You can pass a timeout after which wait() will return. Unlike with wait_for(), nothing is done to the awaitables when that timeout expires. The function just returns and sorts the tasks into the done and pending buckets.
  2. You can tell wait() to not wait until all awaitables are done using the return_when argument. By default it’s set to asyncio.ALL_COMPLETED, which does exactly what it sounds like. But you can also set it to asyncio.FIRST_EXCEPTION, which also waits for all awaitables to finish unless one of them raises an exception, in which case wait() returns immediately. Finally, asyncio.FIRST_COMPLETED returns the moment any of the awaitables finishes.
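The first possibility can be sketched as follows, assuming a fast f, a very slow g, and an arbitrary 0.2-second timeout. The cleanup at the end is one way to handle the leftovers, not the only one.

```python
import asyncio


async def f():
    await asyncio.sleep(0.01)
    return "result of f"


async def g():
    await asyncio.sleep(10)  # still running when the timeout expires
    return "result of g"


async def main():
    task_f = asyncio.create_task(f())
    task_g = asyncio.create_task(g())
    done, pending = await asyncio.wait({task_f, task_g}, timeout=0.2)

    # wait() has returned, but it did not cancel the unfinished task.
    g_untouched = not task_g.done()

    # Cleaning up is our job: cancel what is left and consume the outcome.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return task_f in done, task_g in pending, g_untouched


if __name__ == "__main__":
    print(asyncio.run(main()))
```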

All of this together is a bit complicated but allows you to build powerful dispatcher functions, often using a while loop that runs until all awaitables are done.
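Such a loop might look roughly like this. The work() coroutine, its names, and its delays are hypothetical stand-ins for whatever you actually need to dispatch.

```python
import asyncio


async def work(name, delay):
    # Hypothetical unit of work: sleeps, then reports its own name.
    await asyncio.sleep(delay)
    return name


async def main():
    pending = {
        asyncio.create_task(work("a", 0.03)),
        asyncio.create_task(work("b", 0.01)),
        asyncio.create_task(work("c", 0.02)),
    }
    finished = []
    while pending:
        # Return as soon as at least one task is done; the rest stay pending.
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            finished.append(await task)  # handle each finished task here
    return finished


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the loop reassigns pending on every iteration, you could also add new tasks to the set inside the loop, which is what makes this pattern useful for dispatchers.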

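Summary

  • Takes many awaitables in an iterable.
  • Returns two sets: the awaitables that are done and those that are still pending.
  • Leaves it to you to await the results and to work out which result belongs to which awaitable.
  • Takes an optional timeout, after which it returns without canceling anything.
  • Takes return_when to control when it returns: ALL_COMPLETED (the default), FIRST_EXCEPTION, or FIRST_COMPLETED.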
