Unzipping a file in python

Python: How to unzip a file | Extract Single, multiple or all files from a ZIP archive

In this article we will discuss different ways to unzip or extract single, multiple or all files from zip archive to current or different directory.

In Python’s zipfile module, ZipFile class provides a member function to extract all the contents from a ZIP archive,

ZipFile.extractall(path=None, members=None, pwd=None)

It accepts following arguments :

  • path : location where zip file need to be extracted, if not provided it will extract the contents in current directory.
  • members : list of files to be extracted. It will extract all the files in zip if this argument is not provided.
  • pwd : If zip file is encrypted then pass password in this argument default is None.

Frequently Asked:

from zipfile import ZipFile

Let’s use this to extract all the contents from zip files.

Extract all files from a zip file to current directory

Suppose we have a zip file ‘sample.zip’. in our current directory, let’s see how to extract all files from it.
To unzip it first create a ZipFile object by opening the zip file in read mode and then call extractall() on that object i.e.

# Create a ZipFile Object and load sample.zip in it with ZipFile('sampleDir.zip', 'r') as zipObj: # Extract all the contents of zip file in current directory zipObj.extractall()

It will extract all the files in zip at current Directory. If files with same name are already present at extraction location then it will overwrite those files.

Extract all files from a zip file to different directory

To extract all the files from zip file to a different directory, we can pass the destination location as argument in extractall(). Path can be relative or absolute.

# Create a ZipFile Object and load sample.zip in it with ZipFile('sampleDir.zip', 'r') as zipObj: # Extract all the contents of zip file in different directory zipObj.extractall('temp')

It will extract all the files in ‘sample.zip’ in temp folder.

Extract few files from a large zip file based on condition

Suppose we have a very large zip file and we need a few files from thousand of files in the archive. Unzipping all files from large zip can take minutes. But if are interested in few of the archived files only, then instead of unzipping the whole file we can extract a single file too from the zip file.

In Python’s zipfile module, ZipFile class provides a member function to extract a single from a ZIP File,

ZipFile.extract(member, path=None, pwd=None)

It accepts following arguments :

  • member : Full name of file to be extracted. It should one from the list of archived files names returned by ZipFile.namelist()
  • path : location where zip file need to be extracted, if not provided it will extract the file in current directory.
  • pwd : If zip file is encrypted then pass password in this argument default is None.
Читайте также:  File splitting in python

Let’s use this to extract only csv files from a zip file i.e.

# Create a ZipFile Object and load sample.zip in it with ZipFile('sampleDir.zip', 'r') as zipObj: # Get a list of all archived file names from the zip listOfFileNames = zipObj.namelist() # Iterate over the file names for fileName in listOfFileNames: # Check filename endswith csv if fileName.endswith('.csv'): # Extract a single file from zip zipObj.extract(fileName, 'temp_csv')

It will extract only csv files from given zip archive.

Complete example is as follows,

from zipfile import ZipFile def main(): print('Extract all files in ZIP to current directory') # Create a ZipFile Object and load sample.zip in it with ZipFile('sampleDir.zip', 'r') as zipObj: # Extract all the contents of zip file in current directory zipObj.extractall() print('Extract all files in ZIP to different directory') # Create a ZipFile Object and load sample.zip in it with ZipFile('sampleDir.zip', 'r') as zipObj: # Extract all the contents of zip file in different directory zipObj.extractall('temp') print('Extract single file from ZIP') # Create a ZipFile Object and load sample.zip in it with ZipFile('sampleDir.zip', 'r') as zipObj: # Get a list of all archived file names from the zip listOfFileNames = zipObj.namelist() # Iterate over the file names for fileName in listOfFileNames: # Check filename endswith csv if fileName.endswith('.csv'): # Extract a single file from zip zipObj.extract(fileName, 'temp_csv') if __name__ == '__main__': main()

Источник

Unzip a File in Python: 5 Scenarios You Should Know

python unzip file

If you are in the IT world for a while, you must know about the unzipping or extraction of a file. More than 70% of the files present on the internet are in the form of a lossless compressed file (ex. zip, jpeg, tar, rar). There are various apps available in the market to unzip a file. One of the most popular software to unzip or extract is Winrar. But in today’s article, we will learn how we can unzip a file in Python.

In this particular tutorial, we will learn how to deal with 5 different scenarios, when we want to extract or unzip a file in Python. For all those who don’t know what extraction is let me briefly explain it to you. Unzip is a term used to describe the process of decompressing and moving one or more files in a compressed zipped file to an alternate location.

Let’s see various conditions in which we can extract a file using Python.

5 Situations in Which You Can Extract a File Using Python

  • Extracting only one file
  • Unzip all / multiple files from a zip file to the current directory
  • Extracting all the Files into another directory
  • Unzipping only some specific files based on different conditions
  • Unzipping Password Protected zip file using extractall()

Module Used to Unzip File in Python

To extract a file using Python, we will use the zipfile module in python. The zipfile module is used to access functionalities that would help us create, read, write, extract and list a ZIP file in Python.

Читайте также:  Php оптимизация mysql запросов

Syntax

ZipFile.extractall(path=None, members=None, pwd=None)

Parameters

  • path: This path parameter stores a path to the directory where the zip files need to be unzipped. If it is not specified, the file is extracted in the current working directory.
  • members: This parameter is used to add the files’ list to be extracted. If no argument is provided, it will extract all the files.
  • pwd: This parameter is used to extract an encrypted file with a password.

1. Extracting only one file

Sometimes, we only require a specific file from the zip file to do our task. In such cases, extracting the whole zip files will consume time as well as the memory of your computer. To avoid this, there is an option to extract the specific file/folder from the zip and save it on your computer. Following code demonstrates it –

Currently, our zip file contains three files – “a.txt”, “p.txt”, and “pool.txt”

import zipfile zip_file = "a.zip" file_to_extract = "a.txt" try: with zipfile.ZipFile(zip_file) as z: with open(file_to_extract, 'wb') as f: f.write(z.read(file_to_extract)) print("Extracted", file_to_extract) except: print("Invalid file")

Explanation –

Firstly, we import the module zipfile in the first line. After that, we need to declare some constant variables which can be used later in the code. In this specific example, we’ve extracted “a.txt” from the zip file “a.zip”. With the help of the ZipFile.read() function, you can store the binary value of the file in a variable and this variable can be dumped on the local file to extract it.

2. Unzip all / multiple files from a zip file to the current directory in Python

Extracting all the files using python is one of the best features of python. If you have python installed on your computer, you don’t even need software like Winrar to extract zip files. Moreover, you can not only extract zip files but also, ‘.tar.gz’, ‘.apk’, ‘.rar’ and other formats too. What’s more exciting is that you don’t need to declare any type of zipping. The module automatically identifies whether it’s zip, rar, or other formats.

import zipfile zip_file = "a.zip" try: with zipfile.ZipFile(zip_file) as z: z.extractall() print("Extracted all") except: print("Invalid file")

Explanation –

As a default, we import the zipfile module in the first file. Then we use the context manager to use the ZipFile class. With the help of this method, you don’t need to close the zipfile after opening it. By using the extractall() method, you can extract all the files in the current directory of your python code.

3. Extracting all the Files into another directory in Python

It’s very important to learn file management in coding. You have to create a better space for your files to avoid any hassle in the same folder. By using the extractall method with the parameter you can extract the zip to any folder you want. If the directory is not present, the module will automatically create a new empty directory.

import zipfile zip_file = "a.zip" try: with zipfile.ZipFile(zip_file) as z: z.extractall("temp") print("Extracted all") except: print("Invalid file")

Explanation –

We import the zipfile module and open the zipfile by using ZipFile class. With the help of extractall(path), you can extract all the contents inside the zip file in the specifically mentioned path. If the directory is already present, all the files will be overwritten. If the directory is absent, a new empty folder will be created.

Читайте также:  Java символ строки по индексу

4. Unzipping only some specific files based on different conditions in Python

It’s very convenient to extract specific types of files from the zip file. Suppose, you have a zip file that contains images, videos, and other types of files, and you need to extract only images. Then conditional extracting will help you. You can use namelist() method with conditions to extract them.

Currently, our zip file contains three files – “a.txt”, “p.txt”, “pool.txt”, “a.jpeg”, “b.jpeg” and “a.mp4”

import zipfile zip_file = "a.zip" endswith = ".jpeg" try: with zipfile.ZipFile(zip_file) as z: for file in z.namelist(): if file.endswith(endswith): z.extract(file) print("Extracted all ", endswith) except: print("Invalid file")

Explanation –

We begin with importing the zipfile module. Then we declare some constants and others the zip file as a context manager. All the file names from the zip file can be obtained by the namelist() method. This method returns the list of all file objects. Nextly we check if the file ends with a specific file format (In our case we used ‘.jpeg’ format).

If the file ends with a specific extension, then we can extract the file. All the files will be extracted in the same folder. If you want to extract them to a specific folder, use the path parameter in the extractall() method.

5. Unzipping Password Protected Zip Files using extractall() in Python

One of the features of zip files is that you can apply a password to them. Moreover, you can also encrypt the file names to make it more difficult. In this section, we’ll only discuss the basic password manager for the zip files and how to use them.

Note – Every software uses a different encryption technique and it may happen that you cannot extract it using python. If you are aware of proper encryption for your zip file, use pyzipper module to create a proper zip object.

import zipfile zip_file = "a.zip" try: with zipfile.ZipFile(zip_file) as z: z.setpassword(bytes("pythonpool","utf-8")) z.extractall() print("Extracted all ") except Exception as e: print("Invalid file", e)

Explanation –

We import the zipfile module and open the zip file using the context manager. The setpassword method allows you to initialize a password for your zip file. The only condition is that it takes bytes object as input. We’ve used the bytes() function to convert our string into bytes. Finally, using the extractall() method, you can unzip all the files from the zip.

If the zip file is compressed by software like WinRar, you need to use my zipper module which has advanced methods to determine the algorithms.

Must Read

Final Words

Python has proved to be one of the best computer programming languages with the support of thousands of modules. With easy-to-write syntaxes and APIs for every system module, you can create beautiful applications to ease your tasks. Python unzip file is one of the features that allow you to automate your extraction tasks.

Share this post with your friends and let them know about this awesome module!

Источник

Оцените статью