Implementing Async Features in Python - A Step-by-step Guide

Asynchronous programming is a feature of modern programming languages that allows an application to start multiple operations without blocking on any single one of them. Asynchronicity is one of the big reasons for the popularity of Node.js.

We have discussed Python’s asynchronous features as part of our previous post: an introduction to asynchronous programming in Python. This blog is a natural progression on the same topic. We are going to discuss async features in Python in detail and look at some hands-on examples.

Consider a traditional web scraping application that needs to open thousands of network connections. We could open one network connection, fetch the result, and then move on to the next one iteratively. This approach increases the latency of the program: most of its time is spent opening a connection and waiting for the response before it can do anything else.

Async, on the other hand, lets you open thousands of connections at once and switch among them as each one finishes and returns its result. Essentially, it sends a request on one connection and moves on to the next instead of waiting for the previous response, and it continues like this until all the connections have returned their output.
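
As a rough, hypothetical sketch of that idea (not taken from the benchmark later in this post), the snippet below simulates a thousand "connections", with asyncio.sleep standing in for network latency. Because every task yields control while it waits, the whole batch finishes in roughly one second instead of a thousand.

import asyncio
import time

async def fetch(conn_id):
    # Stand-in for network I/O: awaiting asyncio.sleep yields control back
    # to the event loop, just like waiting on a real socket would.
    await asyncio.sleep(1)
    return f"response from connection {conn_id}"

async def main():
    # Fire off all "connections" at once and switch among them as they finish.
    results = await asyncio.gather(*(fetch(i) for i in range(1000)))
    print(f"got {len(results)} responses")

start = time.time()
asyncio.run(main())
print(f"took {time.time() - start:.2f} seconds")  # roughly 1 second, not 1000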

Fig:- Synchronous vs. asynchronous execution of four tasks (Source: phpmind)

From the above chart, we can see that running four tasks synchronously took 45 seconds to complete, while running the same four tasks asynchronously took only 20 seconds.

Where Does Asynchronous Programming Fit in the Real World?

Asynchronous programming is best suited for scenarios where:

1. The program takes too long to execute.

2. The delay comes from waiting for input or output operations rather than computation.

3. Many input or output operations need to be performed at once.

In terms of applications, typical use cases include:

  • Web Scraping
  • Network Services

Difference Between Parallelism, Concurrency, Threading, and Async IO

Because we discussed this comparison in detail in our previous post, we will just quickly go through the concepts here, as they will help with our hands-on examples later.

Parallelism involves performing multiple operations at the same time. Multiprocessing is one example, and it is well suited for CPU-bound tasks.

Concurrency is a slightly broader term than parallelism. It means multiple tasks make progress in an overlapping manner, though not necessarily at the same instant.

Threading – a thread is a separate flow of execution. One process can contain multiple threads, and each thread runs independently. It is ideal for I/O-bound tasks.

Async IO is a single-threaded, single-process design that uses cooperative multitasking. In simple words, async IO gives a feeling of concurrency despite using a single thread in a single process.


Fig:- A comparison of concurrency and parallelism
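
To make the distinction concrete, here is a small illustrative sketch (not part of the original comparison) that spreads a CPU-bound function across processes and overlaps I/O-bound waits with asyncio; cpu_bound and io_bound are made-up names for this example.

import asyncio
import math
from concurrent.futures import ProcessPoolExecutor

def cpu_bound(n):
    # CPU-bound work: heavy computation benefits from parallelism across processes.
    return sum(math.sqrt(i) for i in range(n))

async def io_bound(i):
    # I/O-bound work: the wait can be overlapped cooperatively on a single thread.
    await asyncio.sleep(1)
    return i

async def io_main():
    # All four waits overlap, so this takes about 1 second, not 4.
    return await asyncio.gather(*(io_bound(i) for i in range(4)))

if __name__ == '__main__':
    # Parallelism (multiprocessing): run CPU-bound work in separate processes.
    with ProcessPoolExecutor() as executor:
        print(list(executor.map(cpu_bound, [1_000_000] * 4)))

    # Async IO (cooperative multitasking): overlap I/O-bound waits in one thread.
    print(asyncio.run(io_main()))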


Components of Async IO Programming

Let’s explore the various components of Async IO in depth. We will also look at some example code to help us understand the implementation.

1. Coroutines

Coroutines are generalized forms of subroutines. They are typically used for cooperative multitasking and behave much like Python generators.

A function defined with async def is a coroutine. When a coroutine hits an await expression, it releases the flow of control back to the event loop.

To run a coroutine, we need to schedule it on the event loop. Once scheduled, a coroutine is wrapped in a Task, which is a kind of Future object.

Example:

In the snippet below, we call async_func from the main function. We have to add the await keyword when calling an async function. As you can see, calling async_func without await does nothing: the coroutine object is created but never run.

import asyncio

async def async_func():
    print('Velotio ...')
    await asyncio.sleep(1)
    print('... Blog!')

async def main():
    async_func()  # this does nothing: the coroutine object is created but never awaited
    await async_func()

asyncio.run(main())

Output

RuntimeWarning: coroutine 'async_func' was never awaited
async_func()  # this does nothing: the coroutine object is created but never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Velotio ...
... Blog!

2. Tasks

Tasks are used to schedule coroutines concurrently.

When you submit a coroutine to an event loop for processing, you get back a Task object, which provides a way to control the coroutine’s behavior from the outside.

Example:

In the snippet below, we create a task using create_task (a built-in function of the asyncio library) and then await it.

import asyncio

async def async_func():
    print('Velotio ...')
    await asyncio.sleep(1)
    print('... Blog!')

async def main():
    task = asyncio.create_task(async_func())
    await task

asyncio.run(main())

Output

Velotio ...
... Blog!
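
The Task handle is also what lets you control the coroutine from outside, for example by cancelling it or checking whether it has finished. The following is a small hypothetical sketch (not from the original post) illustrating that.

import asyncio

async def long_running():
    await asyncio.sleep(10)

async def main():
    task = asyncio.create_task(long_running())
    await asyncio.sleep(1)
    task.cancel()  # control the coroutine from outside: request cancellation
    try:
        await task
    except asyncio.CancelledError:
        print('task was cancelled after 1 second')
    print('done?', task.done())  # True: the task finished (by cancellation)

asyncio.run(main())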

3. Event Loops

This mechanism runs coroutines until they complete. You can think of it as a while(True) loop that monitors coroutines, keeps track of which ones are idle, and looks around for things that can be executed in the meantime.

It can wake up an idle coroutine when whatever that coroutine is waiting on becomes available.

Only one event loop can run at a time in a given thread.

Example:

In the snippet below, we create three tasks, collect them in a list, and run them all concurrently using get_event_loop and create_task from the asyncio library, together with the await keyword.

import asyncio

async def async_func(task_no):
    print(f'{task_no} :Velotio ...')
    await asyncio.sleep(1)
    print(f'{task_no}... Blog!')

async def main():
    taskA = loop.create_task(async_func('taskA'))
    taskB = loop.create_task(async_func('taskB'))
    taskC = loop.create_task(async_func('taskC'))
    await asyncio.wait([taskA, taskB, taskC])

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

Output

taskA :Velotio ...
taskB :Velotio ...
taskC :Velotio ...
taskA... Blog!
taskB... Blog!
taskC... Blog!

4. Futures

A Future is a special low-level awaitable object that represents the eventual result of an asynchronous operation.

When a Future object is awaited, the coroutine waits until the Future is resolved somewhere else.

We will look into the sample code for Future objects in the next section.
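
Before that, here is a minimal stand-alone sketch (an adapted illustration, not code from the original post) in which a bare Future is created on the running loop and resolved from another coroutine.

import asyncio

async def set_after(fut, delay, value):
    # Resolve the Future "somewhere else" after a delay.
    await asyncio.sleep(delay)
    fut.set_result(value)

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()  # a bare, low-level Future object
    asyncio.create_task(set_after(fut, 1, 'Velotio ... Blog!'))
    print(await fut)  # waits here until the Future is resolved

asyncio.run(main())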

A Comparison Between Multithreading and Async IO

Let's first use multithreading as a benchmark and then compare it with Async IO to see which approach is more efficient.

For this benchmark, we will fetch data from a sample URL (the Velotio Careers webpage) a varying number of times: once, 10 times, 50 times, 100 times, and 500 times.

We will then compare the time taken by both of these approaches to fetch the required data.

Implementation

Multithreading code:

import requests
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_url_data(pg_url):
    try:
        resp = requests.get(pg_url)
    except Exception as e:
        print(f"Error occurred while fetching data from URL {pg_url}: {e}")
    else:
        return resp.content

def get_all_url_data(url_list):
    with ThreadPoolExecutor() as executor:
        resp = executor.map(fetch_url_data, url_list)
    return resp

if __name__ == '__main__':
    url = "https://www.velotio.com/careers"
    for ntimes in [1, 10, 50, 100, 500]:
        start_time = time.time()
        responses = get_all_url_data([url] * ntimes)
        print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')

Output

Fetch total 1 urls and process takes 1.8822264671325684 seconds
Fetch total 10 urls and process takes 2.3358211517333984 seconds
Fetch total 50 urls and process takes 8.05638575553894 seconds
Fetch total 100 urls and process takes 14.43302869796753 seconds
Fetch total 500 urls and process takes 65.25404500961304 seconds

ThreadPoolExecutor is a class in Python's concurrent.futures module that implements the Executor interface using a pool of threads. fetch_url_data is a function that fetches the data from the given URL using the requests package, and get_all_url_data maps fetch_url_data over the list of URLs.

Async IO Programming Example:

import asyncio
import time
from aiohttp import ClientSession, ClientResponseError

async def fetch_url_data(session, url):
    try:
        async with session.get(url, timeout=60) as response:
            resp = await response.read()
    except Exception as e:
        print(e)
    else:
        return resp

async def fetch_async(loop, r):
    url = "https://www.velotio.com/careers"
    tasks = []
    async with ClientSession() as session:
        for i in range(r):
            task = asyncio.ensure_future(fetch_url_data(session, url))
            tasks.append(task)
        responses = await asyncio.gather(*tasks)
    return responses

if __name__ == '__main__':
    for ntimes in [1, 10, 50, 100, 500]:
        start_time = time.time()
        loop = asyncio.get_event_loop()
        future = asyncio.ensure_future(fetch_async(loop, ntimes))
        loop.run_until_complete(future)  # runs until the future finishes or raises an error
        responses = future.result()
        print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')

Output

Fetch total 1 urls and process takes 1.3974951362609863 seconds
Fetch total 10 urls and process takes 1.4191942596435547 seconds
Fetch total 50 urls and process takes 2.6497368812561035 seconds
Fetch total 100 urls and process takes 4.391665458679199 seconds
Fetch total 500 urls and process takes 4.960426330566406 seconds

We use the get_event_loop function to obtain the event loop and add tasks to it. To fetch more than one URL, we use the ensure_future and gather functions.

The fetch_async function adds the tasks to the event loop object, and the fetch_url_data function reads the data from the URL using the aiohttp session. The future.result() method returns the responses of all the tasks.
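
As a side note, on Python 3.7 and newer the same driver loop can be written with asyncio.run, which creates and closes the event loop for you. The sketch below reuses fetch_async and the imports from the snippet above; it is only an illustration, not the code used to produce the timings.

# Alternative driver using asyncio.run (Python 3.7+); a sketch, not the
# version used to produce the timings above.
if __name__ == '__main__':
    for ntimes in [1, 10, 50, 100, 500]:
        start_time = time.time()
        # asyncio.run creates the event loop, runs the coroutine, and closes the loop.
        responses = asyncio.run(fetch_async(None, ntimes))  # the loop argument is unused here
        print(f'Fetch total {ntimes} urls and process takes {time.time() - start_time} seconds')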

Results:

As you can see from the plot, async programming is much more efficient than multi-threading for the program above. 

The curve for the multithreading program looks roughly linear, while the asyncio curve flattens out, closer to logarithmic growth.


Fig:- AsyncIO vs. multithreading program

Conclusion

As we saw in the experiment above, Async IO performed better than multithreading thanks to its efficient use of concurrency.

Async IO can be beneficial in applications that can exploit concurrency. That said, whether Async IO is the right choice over other approaches depends on the kind of application you are dealing with.

We hope this article helped further your understanding of Python's async features and gave you some quick hands-on experience through the code examples shared above.


