Python concurrent futures future result

Python futures (concurrency)

Pretty simple, huh? any iterates over the generator and evaluates each call(user_email) it yields until one of them returns True. This is known as early return: you save time and resources by not performing unnecessary calls. Some folks gave nice feedback, suggesting that I make it concurrent, i.e., make all the calls at once and return early as soon as any call returns True. "That's a good idea", I thought. I'm glad there are smarter people than myself out there.

In case it is not clear why I would want to do that: suppose has_facebook_account takes a long time to run (as often happens with I/O and network operations because of latency) and has_github_account is pretty fast (maybe it's cached, for example). I would always have to wait for has_facebook_account to return before calling has_github_account, since the generator's items are evaluated in order. That does not sound fun.
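For reference, here is a minimal sketch of the sequential, early-returning version described above. The three check functions are hypothetical stand-ins (they only record that they were called and return a fixed answer), so the order of evaluation is easy to see:

```python
# Hypothetical stand-ins for the real platform checks; each one records its call.
called = []

def has_facebook_account(user_email):
    called.append("facebook")
    return False

def has_github_account(user_email):
    called.append("github")
    return True

def has_twitter_account(user_email):
    called.append("twitter")
    return True

def has_social_account(user_email):
    checks = (has_facebook_account, has_github_account, has_twitter_account)
    # The generator is lazy: any() pulls one call at a time, in order,
    # and stops as soon as one of them returns True.
    return any(check(user_email) for check in checks)

found = has_social_account("user@email.com")
print(found, called)
```

The Twitter check never runs because the GitHub check already returned True, but the GitHub check itself had to wait for the Facebook one.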

Make it concurrent!

I am using Python's concurrent.futures module (available since version 3.2). This module consists basically of two entities: the Executor and the Future object. You should read the documentation; it is really short and straight to the point. The Executor abstract class is responsible for scheduling a task (or callable) to be executed asynchronously (or concurrently). Scheduling a task returns a Future object, which is a reference to the task and represents its state: pending, finished, or cancelled. If you have ever worked with JavaScript Promises, a Future is very similar to a Promise: you know its execution will eventually be done, but you cannot know when. The nice thing is that submitting a task is non-blocking: submit() returns the Future right away, so the Python interpreter does not have to wait for the scheduled task to finish before running the next line of code. Thus, in our scenario, we can schedule three tasks, one for querying each platform (Facebook, GitHub, and Twitter) for a user email address. Once any of these tasks returns a value, I can return early if the value is True, since all we want to know is whether the user has an account on any of these platforms.
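As a small sketch of that Executor/Future relationship (slow_square is a made-up task used only for illustration), submit() hands back a Future immediately, and result() is the call that actually waits:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):
    # Hypothetical task: pretend to wait on I/O, then compute a value.
    time.sleep(0.2)
    return x * x

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(slow_square, 4)  # returns right away with a Future
    done_right_after_submit = future.done()   # most likely False: the task is still sleeping
    value = future.result()                   # blocks until the task has finished

print(done_right_after_submit, value, future.done())
```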

Talk is cheap. Show me the code.

import time  # I will use it to simulate latency with time.sleep
from concurrent.futures import ThreadPoolExecutor, as_completed


def has_facebook_account(user_email):
    time.sleep(5)  # 5 seconds! That is bad.
    print("Finished facebook after 5 seconds!")
    return True


def has_github_account(user_email):
    time.sleep(1)  # 1 second. Phew!
    print("Finished github after 1 second!")
    return True


def has_twitter_account(user_email):
    time.sleep(3)  # Well.
    print("Finished twitter after 3 seconds!")
    return False


# Main method that answers if a user has an account in any of the platforms
def has_social_account(user_email):
    # ThreadPoolExecutor is a subclass of Executor that uses threads.
    # max_workers is the max number of threads that will be used.
    # Since we are scheduling only 3 tasks, it does not make sense to have
    # more than 3 threads, otherwise we would be wasting resources.
    executor = ThreadPoolExecutor(max_workers=3)

    # Schedule (submit) 3 tasks (one for each social account check)
    # .submit method returns a Future object
    facebook_future = executor.submit(has_facebook_account, user_email)
    twitter_future = executor.submit(has_twitter_account, user_email)
    github_future = executor.submit(has_github_account, user_email)
    future_list = [facebook_future, github_future, twitter_future]

    # as_completed receives an iterable of Future objects
    # and yields each future once it has been completed.
    for future in as_completed(future_list):
        # .result() returns the value returned by the task
        if future.result() is True:
            # I can early return once any result is True
            return True

    # No platform reported an account
    return False


user_email = "user@email.com"

if __name__ == '__main__':
    if has_social_account(user_email):
        print("User has social account.")

Output:

Finished github after 1 second!
User has social account.
# The created threads will still run until completion
Finished twitter after 3 seconds!
Finished facebook after 5 seconds!

Notice that even though facebook_future takes longer than the other two scheduled tasks to finish, it does not block the execution: it keeps working on its own thread. And although github_future is the last scheduled task, it is the first to finish.


Quick summary

  • Future is an object that represents a scheduled task that will eventually finish.
  • Executor is the scheduler of tasks (once a task is scheduled, it returns a Future object).
    • It can be a ThreadPoolExecutor or a ProcessPoolExecutor (using threads vs processes).

    When would I not want to use concurrency then?

    As software engineers, our job is not only knowing how to use a tool, but also when to use it. Network operations (and I/O bound operations in general) are usually a good place to use concurrent code due to their latency. But there is always a trade-off.

    In the example above we bought performance at the cost of resource usage. How so? When using generators, only the worst-case scenario would end up consuming all 3 services, one call for each has_*_account function. That's because we could return True early as soon as any service returned True.

    In our new example using concurrency, we always consume all 3 services, since every call is submitted up front and runs regardless of what the others return.
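A rough way to see this cost is to count the calls. In the sketch below, make_check and the called list are illustrative helpers that are not part of the original example. The with block waits for every worker before returning, which is done here only to make the count deterministic; it also delays the early return compared with the version above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

called = []  # names of the (hypothetical) services that were actually invoked

def make_check(name, answer):
    # Build a stand-in check that records its call and returns a fixed answer.
    def check(user_email):
        called.append(name)
        return answer
    return check

checks = [
    make_check("facebook", True),
    make_check("github", True),
    make_check("twitter", False),
]

def has_social_account(user_email):
    # Leaving the with block waits for all submitted tasks to finish.
    with ThreadPoolExecutor(max_workers=3) as executor:
        futures = [executor.submit(check, user_email) for check in checks]
        for future in as_completed(futures):
            if future.result():
                return True
    return False

found = has_social_account("user@email.com")
services_called = sorted(called)
print(found, services_called)
```

Even though the first completed result is already True, all three services end up being called.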

    "Ah, but that could still save us lots of time!", you say. It depends on the services you're consuming. In the example above I artificially made has_facebook_account really slow, 5 times slower than the fastest alternative. But if all the services had similar response times and saving resources was important (suppose that calling each service triggered a really heavy database query, for instance), synchronous code could be a better approach.

    For the sake of data: Facebook has over 2.7 billion monthly active users, while Twitter has around 330 million and GitHub has merely 40 million users. So it is highly likely that calling has_facebook_account first would be enough in the vast majority of cases, since it would return True far more often than the other services, saving lots of unnecessary calls.

    Conclusion

    Know how to write concurrent code, which is pretty easy with Python Futures. But more importantly, know when to do so. There are cases where the performance gain is not worth the extra resource usage.


    Futures

    Future objects are used to bridge low-level callback-based code with high-level async/await code.

    Future Functions

    asyncio.isfuture(obj)

    Return True if obj is either of:

    • an instance of asyncio.Future ,
    • an instance of asyncio.Task ,
    • a Future-like object with a _asyncio_future_blocking attribute.

    asyncio.ensure_future(obj, *, loop=None)

    Return:

    • obj argument as is, if obj is a Future, a Task, or a Future-like object (isfuture() is used for the test.)
    • a Task object wrapping obj, if obj is a coroutine (iscoroutine() is used for the test); in this case the coroutine will be scheduled by ensure_future().
    • a Task object that would await on obj, if obj is an awaitable (inspect.isawaitable() is used for the test.)

    If obj is none of the above, a TypeError is raised.

    See also the create_task() function which is the preferred way for creating new Tasks.

    Save a reference to the result of this function, to avoid a task disappearing mid-execution.
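A short sketch of how asyncio.isfuture() and asyncio.ensure_future() behave on a coroutine and on the Task it produces (compute is a made-up coroutine used only for illustration):

```python
import asyncio

async def compute():
    # Hypothetical coroutine standing in for real asynchronous work.
    await asyncio.sleep(0)
    return 42

async def main():
    coro = compute()
    coro_is_future = asyncio.isfuture(coro)      # a bare coroutine is not Future-like
    task = asyncio.ensure_future(coro)           # wraps the coroutine in a Task; keep the reference
    task_is_future = asyncio.isfuture(task)      # a Task is a Future subclass
    returned_as_is = asyncio.ensure_future(task) is task  # Futures and Tasks pass through unchanged
    result = await task
    return coro_is_future, task_is_future, returned_as_is, result

outcome = asyncio.run(main())
print(outcome)
```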

    Changed in version 3.5.1: The function accepts any awaitable object.

    Deprecated since version 3.10: Deprecation warning is emitted if obj is not a Future-like object and loop is not specified and there is no running event loop.

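    asyncio.wrap_future(future, *, loop=None)

    Wrap a concurrent.futures.Future object in an asyncio.Future object.

    Deprecated since version 3.10: Deprecation warning is emitted if future is not a Future-like object and loop is not specified and there is no running event loop.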
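As an illustration of asyncio.wrap_future(), the sketch below submits a blocking function to a ThreadPoolExecutor and awaits the resulting concurrent.futures.Future from a coroutine. The blocking_lookup function and its logic are assumptions made up for the example:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def blocking_lookup(user_email):
    # Hypothetical blocking call; a real one would hit the network.
    return user_email.endswith("@email.com")

async def main():
    with ThreadPoolExecutor(max_workers=1) as executor:
        cf_future = executor.submit(blocking_lookup, "user@email.com")
        # A concurrent.futures.Future cannot be awaited directly,
        # so wrap it in an asyncio.Future bound to the running loop.
        aio_future = asyncio.wrap_future(cf_future)
        return await aio_future

wrapped_result = asyncio.run(main())
print(wrapped_result)
```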

    Future Object

    class asyncio.Future(*, loop=None)

    A Future represents an eventual result of an asynchronous operation. Not thread-safe.

    Future is an awaitable object. Coroutines can await on Future objects until they either have a result or an exception set, or until they are cancelled. A Future can be awaited multiple times and the result is the same.

    Typically Futures are used to enable low-level callback-based code (e.g. in protocols implemented using asyncio transports ) to interoperate with high-level async/await code.

    The rule of thumb is to never expose Future objects in user-facing APIs, and the recommended way to create a Future object is to call loop.create_future() . This way alternative event loop implementations can inject their own optimized implementations of a Future object.

    Changed in version 3.7: Added support for the contextvars module.

    Deprecated since version 3.10: Deprecation warning is emitted if loop is not specified and there is no running event loop.

    result()

    Return the result of the Future.

    If the Future is done and has a result set by the set_result() method, the result value is returned.

    If the Future is done and has an exception set by the set_exception() method, this method raises the exception.

    If the Future has been cancelled, this method raises a CancelledError exception.

    If the Future's result isn't yet available, this method raises an InvalidStateError exception.

    set_result(result)

    Mark the Future as done and set its result.

    Raises an InvalidStateError error if the Future is already done.

    set_exception(exception)

    Mark the Future as done and set an exception.

    Raises an InvalidStateError error if the Future is already done.

    done()

    Return True if the Future is done.

    A Future is done if it was cancelled or if it has a result or an exception set with set_result() or set_exception() calls.
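The state rules for result(), set_result(), set_exception(), and done() can be exercised with a small sketch like the one below. The observations dictionary is just a convenient way to collect what happened, not part of any API:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    observations = {}

    fut = loop.create_future()
    observations["done_before"] = fut.done()
    try:
        fut.result()  # no result yet
    except asyncio.InvalidStateError:
        observations["early_result"] = "InvalidStateError"

    fut.set_result("ready")
    observations["done_after"] = fut.done()
    observations["result"] = fut.result()
    try:
        fut.set_result("again")  # the Future is already done
    except asyncio.InvalidStateError:
        observations["second_set"] = "InvalidStateError"

    failed = loop.create_future()
    failed.set_exception(ValueError("boom"))
    try:
        failed.result()  # re-raises the stored exception
    except ValueError as exc:
        observations["raised"] = str(exc)

    return observations

observations = asyncio.run(main())
print(observations)
```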

    cancelled()

    Return True if the Future was cancelled.

    The method is usually used to check if a Future is not cancelled before setting a result or an exception for it:

    if not fut.cancelled():
        fut.set_result(42)

    add_done_callback(callback, *, context=None)

    Add a callback to be run when the Future is done.

    The callback is called with the Future object as its only argument.

    If the Future is already done when this method is called, the callback is scheduled with loop.call_soon() .

    An optional keyword-only context argument allows specifying a custom contextvars.Context for the callback to run in. The current context is used when no context is provided.

    functools.partial() can be used to pass parameters to the callback, e.g.:

    # Call 'print("Future:", fut)' when "fut" is done.
    fut.add_done_callback(
        functools.partial(print, "Future:"))

    Changed in version 3.7: The context keyword-only parameter was added. See PEP 567 for more details.

    remove_done_callback(callback)

    Remove callback from the callbacks list.

    Returns the number of callbacks removed, which is typically 1, unless a callback was added more than once.
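A minimal sketch of the callback behaviour, assuming a hypothetical on_done callback that appends to a module-level list. It shows that callbacks are scheduled on the loop rather than run inside set_result(), and that a removed callback never fires:

```python
import asyncio
import functools

events = []

def on_done(label, fut):
    # Hypothetical callback: the Future is always passed as the last argument.
    events.append((label, fut.result()))

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()

    fut.add_done_callback(functools.partial(on_done, "first"))
    unwanted = functools.partial(on_done, "unwanted")
    fut.add_done_callback(unwanted)
    removed = fut.remove_done_callback(unwanted)  # number of callbacks removed

    fut.set_result("done")
    ran_immediately = bool(events)  # callbacks are only scheduled at this point
    await asyncio.sleep(0)          # yield to the loop so scheduled callbacks can run
    return removed, ran_immediately

removed, ran_immediately = asyncio.run(main())
print(removed, ran_immediately, events)
```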

    cancel(msg=None)

    Cancel the Future and schedule callbacks.

    If the Future is already done or cancelled, return False. Otherwise, change the Future's state to cancelled, schedule the callbacks, and return True.

    Changed in version 3.9: Added the msg parameter.

    exception()

    Return the exception that was set on this Future.

    The exception (or None if no exception was set) is returned only if the Future is done.

    If the Future has been cancelled, this method raises a CancelledError exception.

    If the Future isn't done yet, this method raises an InvalidStateError exception.
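The cancellation and exception() rules can be checked with a short sketch along these lines. The notes dictionary is only there to record the observed behaviour:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    notes = {}

    fut = loop.create_future()
    notes["first_cancel"] = fut.cancel()   # pending Future: cancellation succeeds
    notes["second_cancel"] = fut.cancel()  # already cancelled: returns False
    if not fut.cancelled():
        fut.set_result(42)                 # skipped, the Future was cancelled
    try:
        fut.exception()
    except asyncio.CancelledError:
        notes["exception_on_cancelled"] = "CancelledError"

    pending = loop.create_future()
    try:
        pending.exception()                # not done yet
    except asyncio.InvalidStateError:
        notes["exception_on_pending"] = "InvalidStateError"
    pending.set_result(None)
    notes["exception_on_success"] = pending.exception()  # done, no exception was set

    return notes

notes = asyncio.run(main())
print(notes)
```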

    get_loop()

    Return the event loop the Future object is bound to.

    This example creates a Future object, creates and schedules an asynchronous Task to set result for the Future, and waits until the Future has a result:

    import asyncio

    async def set_after(fut, delay, value):
        # Sleep for *delay* seconds.
        await asyncio.sleep(delay)

        # Set *value* as a result of *fut* Future.
        fut.set_result(value)

    async def main():
        # Get the current event loop.
        loop = asyncio.get_running_loop()

        # Create a new Future object.
        fut = loop.create_future()

        # Run "set_after()" coroutine in a parallel Task.
        # We are using the low-level "loop.create_task()" API here because
        # we already have a reference to the event loop at hand.
        # Otherwise we could have just used "asyncio.create_task()".
        loop.create_task(
            set_after(fut, 1, '... world'))

        print('hello ...')

        # Wait until *fut* has a result (1 second) and print it.
        print(await fut)

    asyncio.run(main())

    The Future object was designed to mimic concurrent.futures.Future . Key differences include:

    • Unlike asyncio Futures, concurrent.futures.Future instances cannot be awaited.
    • asyncio.Future.result() and asyncio.Future.exception() do not accept the timeout argument.
    • asyncio.Future.result() and asyncio.Future.exception() raise an InvalidStateError exception when the Future is not done.
    • Callbacks registered with asyncio.Future.add_done_callback() are not called immediately. They are scheduled with loop.call_soon() instead.
    • asyncio Future is not compatible with the concurrent.futures.wait() and concurrent.futures.as_completed() functions.
    • asyncio.Future.cancel() accepts an optional msg argument, but concurrent.futures.Future.cancel() does not.
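Two of these differences are easy to observe directly. The sketch below creates a concurrent.futures.Future by hand, which is normally only done for testing and is used here purely for illustration, and compares it with an asyncio Future:

```python
import asyncio
import concurrent.futures

async def main():
    notes = {}

    # Created directly only for demonstration; executors normally create these.
    cf_future = concurrent.futures.Future()
    cf_future.set_result("cf")
    notes["cf_result"] = cf_future.result(timeout=1)  # timeout is accepted here
    try:
        await cf_future                               # not awaitable
    except TypeError:
        notes["cf_awaitable"] = False

    aio_future = asyncio.get_running_loop().create_future()
    aio_future.set_result("aio")
    try:
        aio_future.result(timeout=1)                  # no timeout argument in asyncio
    except TypeError:
        notes["aio_timeout_accepted"] = False
    notes["aio_result"] = await aio_future            # awaitable, returns the result

    return notes

differences = asyncio.run(main())
print(differences)
```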

