Python check all files in directory

Python List Files in a Directory

This article shows how to list all files of a directory in Python. There are several ways to do it; we start with the following four functions and finish with the pathlib module.

  • os.listdir('dir_path') : Returns the list of files and directories present in the specified directory path.
  • os.walk('dir_path') : Recursively gets the list of all files in a directory and its subdirectories.
  • os.scandir('path') : Returns directory entries along with file attribute information.
  • glob.glob('pattern') : Uses the glob module to list files and folders whose names follow a specific pattern.

How to List All Files of a Directory

Getting a list of files of a directory is easy as pie! Use the listdir() and isfile() functions of the os module to list all files of a directory. Here are the steps.

  1. Import the os module: It provides functions for interacting with the operating system and for working with operating-system-dependent functionality.
  2. Use the os.listdir() function: os.listdir('path') returns a list containing the names of the files and directories present in the directory given by path.
  3. Iterate over the result: Use a for loop to visit each name returned by listdir().
  4. Use the isfile() function: In each iteration, call os.path.isfile('path') to check whether the current path is a file or a directory. It returns True if the given path is a file and False otherwise. If it is a file, add it to a list.

Example to List Files of a Directory

Let's see how to list the files of an 'account' folder. listdir() lists only the entries directly inside the given directory and does not descend into subdirectories.

Example 1: List only files from a directory

    import os

    # folder path
    dir_path = r'E:\account'

    # list to store files
    res = []

    # iterate over the directory entries
    for path in os.listdir(dir_path):
        # keep the entry only if it is a file
        if os.path.isfile(os.path.join(dir_path, path)):
            res.append(path)
    print(res)

Here we got three file names.

['profit.txt', 'sales.txt', 'sample.txt']

If you are familiar with generators, you can make the code shorter and simpler by using a generator function, as shown below.

Generator function:

    import os

    def get_files(path):
        # yield the names of the files (not folders) inside path, one at a time
        for file in os.listdir(path):
            if os.path.isfile(os.path.join(path, file)):
                yield file

Then simply call it whenever required.

    for file in get_files(r'E:\account'):
        print(file)

Example 2: List both files and directories.

Directly call the listdir(‘path’) function to get the content of a directory.

    import os

    # folder path
    dir_path = r'E:\account'

    # list files and directories
    res = os.listdir(dir_path)
    print(res)

As you can see in the output, ‘reports_2021’ is a directory.

['profit.txt', 'reports_2021', 'sales.txt', 'sample.txt']
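Since listdir() returns plain names without saying which ones are folders, it can help to sort the entries into two lists. The following is a minimal sketch, not part of the original examples: split_entries is a hypothetical helper name, and the demo tree is created in a temporary folder so that the code runs on any machine.

```python
import os
import tempfile


def split_entries(dir_path):
    """Return (files, directories) found directly inside dir_path, each sorted."""
    files, dirs = [], []
    for name in os.listdir(dir_path):
        # listdir() returns bare names, so build the full path before testing it
        full_path = os.path.join(dir_path, name)
        if os.path.isdir(full_path):
            dirs.append(name)
        elif os.path.isfile(full_path):
            files.append(name)
    return sorted(files), sorted(dirs)


# Hypothetical demo tree, built in a temporary folder
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, "reports_2021"))
    for name in ("profit.txt", "sales.txt"):
        open(os.path.join(demo_dir, name), "w").close()

    files, dirs = split_entries(demo_dir)
    print(files)
    print(dirs)
```

The results are sorted because listdir() returns names in arbitrary order.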

os.walk() to list all files in directory and subdirectories

The os.walk() function returns a generator that yields a tuple of values (current_path, directories in current_path, files in current_path) for every directory it visits.

Note: Using the os.walk() function we can list all directories, subdirectories, and files in a given directory.

It walks the directory tree recursively: each time the generator is advanced, it moves on to the next directory, descending into subdirectories until none remain below the initial directory.

For example, os.walk('path') yields one tuple for each directory it visits. Besides the current path, the tuple contains two lists: the first holds the names of the subdirectories, and the second holds the names of the files.

Let’s see the example to list all files in directory and subdirectories.

    from os import walk

    # folder path
    dir_path = r'E:\account'

    # list to store file names
    res = []
    # a separate loop variable (current_dir) avoids overwriting dir_path
    for (current_dir, dir_names, file_names) in walk(dir_path):
        res.extend(file_names)
    print(res)

['profit.txt', 'sales.txt', 'sample.txt', 'december_2021.txt']
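The example above collects bare file names, so it is no longer clear which folder each file came from. A possible variation, sketched here under the assumption that full paths are wanted, joins each name with the directory that os.walk() is currently visiting. The helper name list_files_recursively and the temporary demo tree are illustrative only.

```python
import os
import tempfile


def list_files_recursively(top):
    """Return the full path of every file under top, including subdirectories."""
    found = []
    for current_dir, dir_names, file_names in os.walk(top):
        for name in file_names:
            # current_dir is the folder being visited; join it with the bare name
            found.append(os.path.join(current_dir, name))
    return sorted(found)


# Hypothetical demo tree, built in a temporary folder
with tempfile.TemporaryDirectory() as demo_dir:
    sub_dir = os.path.join(demo_dir, "reports_2021")
    os.mkdir(sub_dir)
    open(os.path.join(demo_dir, "sales.txt"), "w").close()
    open(os.path.join(sub_dir, "december_2021.txt"), "w").close()

    # show the paths relative to the demo folder to keep the output short
    relative = [os.path.relpath(p, demo_dir) for p in list_files_recursively(demo_dir)]
    print(relative)
```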

Note: Add break inside a loop to stop looking for files recursively inside subdirectories.

    from os import walk

    # folder path
    dir_path = r'E:\account'

    res = []
    for (current_dir, dir_names, file_names) in walk(dir_path):
        res.extend(file_names)
        # don't look inside any subdirectory
        break
    print(res)
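Using break stops the walk entirely after the top-level folder. When only some subdirectories should be skipped, os.walk() in its default top-down mode lets you edit the list of directory names in place, and it will then visit only what is left. The sketch below assumes a hypothetical folder named archive that should be ignored; walk_skipping is an illustrative helper name.

```python
import os
import tempfile


def walk_skipping(top, skip_names):
    """Collect file names under top without descending into folders named in skip_names."""
    found = []
    for current_dir, dir_names, file_names in os.walk(top):
        # In-place assignment changes the list that os.walk() uses to decide
        # which subdirectories to visit next (top-down mode only).
        dir_names[:] = [d for d in dir_names if d not in skip_names]
        found.extend(file_names)
    return sorted(found)


# Hypothetical demo tree, built in a temporary folder
with tempfile.TemporaryDirectory() as demo_dir:
    for folder in ("reports_2021", "archive"):
        os.mkdir(os.path.join(demo_dir, folder))
    open(os.path.join(demo_dir, "sales.txt"), "w").close()
    open(os.path.join(demo_dir, "reports_2021", "december_2021.txt"), "w").close()
    open(os.path.join(demo_dir, "archive", "old.txt"), "w").close()

    kept = walk_skipping(demo_dir, {"archive"})
    print(kept)
```

Note that rebinding the name (dir_names = [...]) would have no effect; the slice assignment is what modifies the list os.walk() holds.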

os.scandir() to get files of a directory

The scandir() function returns directory entries along with file attribute information, giving better performance for many common use cases.

It returns an iterator of os.DirEntry objects; each object exposes the entry's name and path, along with methods such as is_file().

    import os

    # get all files inside a specific folder
    dir_path = r'E:\account'
    for entry in os.scandir(dir_path):
        if entry.is_file():
            print(entry.name)

profit.txt
sales.txt
sample.txt
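To illustrate the "file attribute information" that scandir() entries carry, the following sketch reads each file's size through DirEntry.stat(). It is only an illustration: file_sizes is a hypothetical helper, and the with statement around os.scandir() assumes Python 3.6 or later.

```python
import os
import tempfile


def file_sizes(dir_path):
    """Map each file name directly inside dir_path to its size in bytes."""
    sizes = {}
    # The with statement closes the scandir iterator as soon as the loop is done
    with os.scandir(dir_path) as entries:
        for entry in entries:
            if entry.is_file():
                sizes[entry.name] = entry.stat().st_size
    return sizes


# Hypothetical demo tree, built in a temporary folder
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "sales.txt"), "w") as f:
        f.write("12345")
    os.mkdir(os.path.join(demo_dir, "reports_2021"))

    sizes = file_sizes(demo_dir)
    print(sizes)
```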

Glob Module to list Files of a Directory

The Python glob module, part of the Python Standard Library, is used to find the files and folders whose names follow a specific pattern.

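For example, to get all files of a directory, we will use the dir_path/*.* pattern. Here, *.* matches any name that contains a dot, which in practice means files with an extension. Files without an extension are skipped, and a folder whose name contains a dot would match as well.

Let's see how to list files from a directory using the glob module.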

    import glob

    # search all files inside a specific folder
    # *.* matches any name that contains a dot
    dir_path = r'E:\account\*.*'
    res = glob.glob(dir_path)
    print(res)

['E:\\account\\profit.txt', 'E:\\account\\sales.txt', 'E:\\account\\sample.txt']

Note: If you want to list files from subdirectories, set the recursive argument to True and use the ** pattern, which matches any number of nested directories.

    import glob

    # search the folder and all of its subfolders
    # ** matches any number of nested directories when recursive=True
    dir_path = r'E:\account\**\*.*'
    for file in glob.glob(dir_path, recursive=True):
        print(file)

E:\account\profit.txt
E:\account\sales.txt
E:\account\sample.txt
E:\account\reports_2021\december_2021.txt
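The difference between the *.* and * patterns is easy to miss, so here is a small sketch that compares them on a temporary demo tree. The file names (including README, a file without an extension) are made up for the illustration, and glob.escape() is used only so that special characters in the folder path are not treated as wildcards.

```python
import glob
import os
import tempfile

# Hypothetical demo tree, built in a temporary folder
with tempfile.TemporaryDirectory() as demo_dir:
    sub_dir = os.path.join(demo_dir, "reports_2021")
    os.mkdir(sub_dir)
    for path in (
        os.path.join(demo_dir, "sales.txt"),
        os.path.join(demo_dir, "README"),  # no extension
        os.path.join(sub_dir, "december_2021.txt"),
    ):
        open(path, "w").close()

    base = glob.escape(demo_dir)

    # *.* only matches names that contain a dot, in this folder only
    dotted = glob.glob(os.path.join(base, "*.*"))
    dotted_names = sorted(os.path.basename(p) for p in dotted)

    # **/* with recursive=True matches every entry at any depth;
    # folders are filtered out afterwards with os.path.isfile()
    everything = glob.glob(os.path.join(base, "**", "*"), recursive=True)
    all_files = sorted(os.path.basename(p) for p in everything if os.path.isfile(p))

    print(dotted_names)
    print(all_files)
```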

Pathlib Module to list files of a directory

From Python 3.4 onwards, we can use the pathlib module, which provides an object-oriented interface to filesystem paths and wraps most of the path-related OS functions.

  • Import the pathlib module: it offers classes and methods to handle filesystem paths and get data related to files on different operating systems.
  • Next, use pathlib.Path('path') to construct the directory path
  • Next, use iterdir() to iterate over all entries of the directory
  • Finally, check whether the current entry is a file using the Path.is_file() method

    import pathlib

    # folder path
    dir_path = r'E:\account'

    # to store file paths
    res = []

    # construct path object
    d = pathlib.Path(dir_path)

    # iterate directory
    for entry in d.iterdir():
        # check if it is a file
        if entry.is_file():
            res.append(entry)
    print(res)
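iterdir() looks only at the given folder. For a recursive search, pathlib also offers Path.rglob(), which applies a pattern to the folder and everything below it. The sketch below is an illustration on a temporary demo tree, with made-up file names, rather than part of the original example.

```python
import pathlib
import tempfile

# Hypothetical demo tree, built in a temporary folder
with tempfile.TemporaryDirectory() as demo_dir:
    root = pathlib.Path(demo_dir)
    (root / "reports_2021").mkdir()
    (root / "sales.txt").touch()
    (root / "reports_2021" / "december_2021.txt").touch()

    # files directly inside the folder
    top_level = sorted(p.name for p in root.iterdir() if p.is_file())

    # .txt files at any depth, shown relative to the folder with forward slashes
    txt_files = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))

    print(top_level)
    print(txt_files)
```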

About Vishal

I'm Vishal Hule, Founder of PYnative.com. I am a Python developer, and I love to write articles to help students, developers, and learners.

Get a List of All Files in a Directory with Python (with code)

File handling is an essential part of your coding experience. Without file handling, you will not be able to create programs that reach their full potential. The first task you usually learn in file handling is how to open a file.

But in the real world, it is rare that you will know the exact path to a file. More times than not, you know the directory in which the file is stored. So, here we will learn how to get all files in a directory in Python. Before that, let’s revise files and directories to understand the concepts first.

What are files?

A file is a named location on a disk that stores data. You can use Python to read from and write to files, which allows you to save data in a persistent storage location and read it back later.

To work with files in Python, you use the built-in open function to open a file, typically inside a with statement, which ensures that the file is properly closed when you are done with it.
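As a quick illustration of open() combined with a with statement, the sketch below writes a small file and reads it back. The file name notes.txt is made up, and the file lives in a temporary folder so that nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    file_path = os.path.join(demo_dir, "notes.txt")  # hypothetical file name

    # write: the file is closed automatically when the with block ends
    with open(file_path, "w", encoding="utf-8") as handle:
        handle.write("hello\n")

    # read the same file back
    with open(file_path, encoding="utf-8") as handle:
        content = handle.read()

    print(handle.closed)  # the handle is already closed at this point
    print(content)
```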

What is a directory?

A directory is a location on a computer’s file system that can contain other directories and files. It is also sometimes referred to as a folder. Directories allow you to organize your files and keep them separate from one another.

For example, you might have a directory for documents, another for pictures, and another for music. Each directory can contain multiple files and subdirectories, which allows you to create a hierarchical structure for your files.

What connects files and directories?

In a computer’s file system, files and directories are connected through the use of paths. A path is a string that specifies the location of a file or directory in the file system. There are two types of paths: absolute paths and relative paths.

An absolute path is a complete path to a file or directory that begins at the root of the file system. It specifies the exact location of the file or directory, regardless of the current working directory.

On the other hand, a relative path is a path to a file or directory that is relative to the current working directory. It specifies the location of a file or directory in relation to the current directory.
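The os.path module can convert between the two kinds of path. The following is a minimal sketch; the path account/sales.txt is only an example and does not need to exist, because these functions work on the path string itself.

```python
import os

# a relative path: interpreted relative to the current working directory
relative_path = os.path.join("account", "sales.txt")

# the matching absolute path, built from the current working directory
absolute_path = os.path.abspath(relative_path)

print(os.path.isabs(relative_path))
print(os.path.isabs(absolute_path))

# going back from the absolute path to one relative to the working directory
round_trip = os.path.relpath(absolute_path, os.getcwd())
print(round_trip)
```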

In short, files are collections of information. A collection of files can be stored under a common name, called a folder or directory. There are various situations where one might need to know the contents of a directory. For example, when we do not know the file's full name but know its directory, we can list the directory to search for the file.

How to list all the files in a directory in Python?

Python provides various modules that you can use to access and list all the files in any given directory. Broadly, the functions covered here come from two modules: the os module and the glob module.

1) OS module

The os module provides several functions that can be used to list the contents of a given directory in Python. os.listdir() is the most common way to list everything that is present in a given directory. It is easy to use.

When the path to a directory is passed as an argument to the os.listdir() function, it returns a list that contains the names of all the files and subdirectories in the directory. A code example of this is:

    import os

    path = "D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator"
    # print the names of all entries (files and folders) in the directory
    print(os.listdir(path=path))

The os.walk() function does not return a list. Instead, it returns a generator that yields a tuple (root, dirs, files) for each directory it visits, including subdirectories. It can be used when you want to iterate one by one over all the files in the directory and in the folders below it.

    import os

    path = "D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator"
    # files holds the file names found in the folder currently being visited
    for (root, dirs, files) in os.walk(path):
        for x in files:
            print(x)

os.scandir() is a function in the os module that is available in Python 3.5 and later. Rather than returning a list of names, it returns an iterator of objects of the os.DirEntry type.

Each object represents one entry of the directory given to it. We can use the is_file() method to check whether the entry is a file.

    import os

    path = "D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator"
    entries = os.scandir(path)
    for val in entries:
        # print the name only if the entry is a file
        if val.is_file():
            print(val.name)

2) Glob Module

We can retrieve the files in a directory that match a certain pattern. This can be done using the glob module. It offers two functions for finding files that match a given pattern: glob() and iglob().

The glob function returns a list of file and directory names that match a given pattern. The pattern can contain wildcards, such as * to match any sequence of characters, or ? to match any single character.

The iglob function is similar to glob, but instead of returning a list of matching files and directories, it returns an iterator that yields the matches one at a time. This can be more efficient when working with large numbers of files, as it allows you to process the files as they are found, rather than waiting for the entire list to be generated.

    import glob

    path = "D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator/"

    print("Using glob")
    # glob() returns the complete list of matches
    names = glob.glob(path + '*.py')
    print(names)

    print("Using iglob")
    # iglob() returns an iterator that yields the matches one at a time
    cnames = glob.iglob(path + '*.py')
    for name in cnames:
        print(name)

Using glob
['D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator\\input.py', 'D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator\\main.py', 'D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator\\whisper.py']
Using iglob
D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator\input.py
D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator\main.py
D:/ABRAR/UNIVERSITY/Advanced ML/Website Generator\whisper.py
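Because iglob() hands back an iterator, it can be combined with itertools.islice() to take only the first few matches instead of keeping the whole result list around. The sketch below is an illustration with made-up script names in a temporary folder; how much work is actually saved depends on the pattern and on how the files are spread across folders.

```python
import glob
import itertools
import os
import tempfile

# Hypothetical demo tree, built in a temporary folder
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(5):
        open(os.path.join(demo_dir, f"script_{i}.py"), "w").close()

    pattern = os.path.join(glob.escape(demo_dir), "*.py")

    # glob() builds the complete list of matches
    all_matches = glob.glob(pattern)

    # iglob() yields matches one at a time; islice() stops after two of them
    first_two = list(itertools.islice(glob.iglob(pattern), 2))

    print(len(all_matches))
    print(len(first_two))
```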

Conclusion

File access is an important tool in the arsenal of a coder. Without the knowledge of handling files, one cannot do much in computer systems. And now you know how to get a list of files in a directory in Python. There are various ways to list the files in a directory using the os and glob modules.