How to list all files of a directory in Python?

Query:

How can I list all files of a directory in Python and add them to a list?

How to list all files of a directory in Python? Method #1:

os.listdir() will get you everything that’s in a directory – files and directories.

If you want just files, you could either filter this down using os.path:

from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

or you could use os.walk() which will yield two lists for each directory it visits – splitting into files and dirs for you. If you only want the top directory you can break the first time it yields

from os import walk

f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break

or, shorter:

from os import walk

filenames = next(walk(mypath), (None, None, []))[2]  # [] if no file

List all files of a directory in Python- Method #2:

I prefer using the glob module, as it does pattern matching and expansion.

import glob
print(glob.glob("/home/adam/*"))

It does pattern matching intuitively

import glob
# All files ending with .txt
print(glob.glob("/home/adam/*.txt")) 
# All files ending with .txt with depth of 2 folder
print(glob.glob("/home/adam/*/*.txt")) 

It will return a list with the queried files:

['/home/adam/file1.txt', '/home/adam/file2.txt', .... ]

How to list all files of a directory in python?- Answer #3:

list in the current directory

With listdir in os module you get the files and the folders in the current dir

 import os

 arr = os.listdir()

Looking in a directory

arr = os.listdir('c:\\files')

with glob you can specify a type of file to list like this

import glob

txtfiles = []
for file in glob.glob("*.txt"):
    txtfiles.append(file)

or

mylist = [f for f in glob.glob("*.txt")]

get the full path of only files in the current directory

import os
from os import listdir
from os.path import isfile, join

cwd = os.getcwd()
onlyfiles = [os.path.join(cwd, f) for f in os.listdir(cwd) if 
os.path.isfile(os.path.join(cwd, f))]
print(onlyfiles) 

['G:\\getfilesname\\getfilesname.py', 'G:\\getfilesname\\example.txt']

Getting the full path name with os.path.abspath

You get the full path in return

 import os
 files_path = [os.path.abspath(x) for x in os.listdir()]
 print(files_path)
 
 ['F:\\documenti\applications.txt', 'F:\\documenti\collections.txt']

Walk: going through sub directories

os.walk returns the root, the directories list and the files list, that is why I unpacked them in r, d, f in the for loop; it, then, looks for other files and directories in the subfolders of the root and so on until there are no subfolders.

import os

# Getting the current work directory (cwd)
thisdir = os.getcwd()

# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
    for file in f:
        if file.endswith(".docx"):
            print(os.path.join(r, file))

To go up in the directory tree

# Method 1
x = os.listdir('..')

# Method 2
x= os.listdir('/')

Get files of a particular subdirectory with os.listdir()

import os

x = os.listdir("./content")

os.walk(‘.’) – current directory

 import os
 arr = next(os.walk('.'))[2]
 print(arr)
 
 >>> ['5bs_Turismo1.pdf', '5bs_Turismo1.pptx', 'esperienza.txt']

next(os.walk(‘.’)) and os.path.join(‘dir’, ‘file’)

 import os
 arr = []
 for d,r,f in next(os.walk("F:\\_python")):
     for file in f:
         arr.append(os.path.join(r,file))

 for f in arr:
     print(files)

>>> F:\\_python\\dict_class.py
>>> F:\\_python\\programmi.txt

next… walk

 [os.path.join(r,file) for r,d,f in next(os.walk("F:\\_python")) for file in f]
 
 >>> ['F:\\_python\\dict_class.py', 'F:\\_python\\programmi.txt']

os.walk

x = [os.path.join(r,file) for r,d,f in os.walk("F:\\_python") for file in f]
print(x)

>>> ['F:\\_python\\dict.py', 'F:\\_python\\progr.txt', 'F:\\_python\\readl.py']

os.listdir() – get only txt files

 arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
 

Using glob to get the full path of the files

from path import path
from glob import glob

x = [path(f).abspath() for f in glob("F:\\*.txt")]

Using os.path.isfile to avoid directories in the list

import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]

Using pathlib from Python 3.4

import pathlib

flist = []
for p in pathlib.Path('.').iterdir():
    if p.is_file():
        print(p)
        flist.append(p)

With list comprehension:

flist = [p for p in pathlib.Path('.').iterdir() if p.is_file()]

Use glob method in pathlib.Path()

import pathlib

py = pathlib.Path().glob("*.py")

Get all and only files with os.walk: checks only in the third element returned, i.e. the list of the files

import os
x = [i[2] for i in os.walk('.')]
y=[]
for t in x:
    for f in t:
        y.append(f)

Get only files with next in a directory: returns only the file in the root folder

 import os
 x = next(os.walk('F://python'))[2]

Get only directories with next and walk in a directory, because in the [1] element there are the folders only

 import os
 next(os.walk('F://python'))[1] # for the current dir use ('.')
 
 >>> ['python3','others']

Get all the subdir names with walk

for r,d,f in os.walk("F:\\_python"):
    for dirs in d:
        print(dirs)

os.scandir() from Python 3.5 and greater

import os
x = [f.name for f in os.scandir() if f.is_file()]

# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.

import os
with os.scandir() as i:
    for entry in i:
        if entry.is_file():
            print(entry.name)

Answer #4:

import os
os.listdir("somedirectory")

will return a list of all files and directories in “somedirectory”.

Answer #5:

A one-line solution to get only list of files (no subdirectories):

filenames = next(os.walk(path))[2]

or absolute pathnames:

paths = [os.path.join(path, fn) for fn in next(os.walk(path))[2]]

Answer #6:

Getting Full File Paths From a Directory and All Its Subdirectories

import os

def get_filepaths(directory):
    """
    This function will generate the file names in a directory 
    tree by walking the tree either top-down or bottom-up. For each 
    directory in the tree rooted at directory top (including top itself), 
    it yields a 3-tuple (dirpath, dirnames, filenames).
    """
    file_paths = []  # List which will store all of the full filepaths.

    # Walk the tree.
    for root, directories, files in os.walk(directory):
        for filename in files:
            # Join the two strings in order to form the full filepath.
            filepath = os.path.join(root, filename)
            file_paths.append(filepath)  # Add it to the list.

    return file_paths  # Self-explanatory.

# Run the above function and store its results in a variable.   
full_file_paths = get_filepaths("/Users/johnny/Desktop/TEST")

  • The path I provided in the above function contained 3 files— two of them in the root directory, and another in a subfolder called “SUBFOLDER.” You can now do things like:
  • print full_file_paths which will print the list:
    • ['/Users/johnny/Desktop/TEST/file1.txt', '/Users/johnny/Desktop/TEST/file2.txt', '/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat']

If you’d like, you can open and read the contents, or focus only on files with the extension “.dat” like in the code below:

for f in full_file_paths:
  if f.endswith(".dat"):
    print f

/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat

Python programs to list all files of a directory- Answer #7:

I really liked the second answer, suggesting that you use glob(), from the module of the same name. This allows you to have pattern matching with *s.

But as other people pointed out in the comments, glob() can get tripped up over inconsistent slash directions. To help with that, I suggest you use the join() and expanduser() functions in the os.path module, and perhaps the getcwd() function in the os module, as well.

As examples:

from glob import glob

# Return everything under C:\Users\admin that contains a folder called wlp.
glob('C:\Users\admin\*\wlp')

The above is terrible – the path has been hardcoded and will only ever work on Windows between the drive name and the \s being hardcoded into the path.

from glob    import glob
from os.path import join

# Return everything under Users, admin, that contains a folder called wlp.
glob(join('Users', 'admin', '*', 'wlp'))

The above works better, but it relies on the folder name Users which is often found on Windows and not so often found on other OSs. It also relies on the user having a specific name, admin.

from glob    import glob
from os.path import expanduser, join

# Return everything under the user directory that contains a folder called wlp.
glob(join(expanduser('~'), '*', 'wlp'))

This works perfectly across all platforms.

Another great example that works perfectly across platforms and does something a bit different:

from glob    import glob
from os      import getcwd
from os.path import join

# Return everything under the current directory that contains a folder called wlp.
glob(join(getcwd(), '*', 'wlp'))

Hope these examples help you see the power of a few of the functions you can find in the standard Python library modules.

Hope you learned something from this post.

Follow Programming Articles for more!

About ᴾᴿᴼᵍʳᵃᵐᵐᵉʳ

Linux and Python enthusiast, in love with open source since 2014, Writer at programming-articles.com, India.

View all posts by ᴾᴿᴼᵍʳᵃᵐᵐᵉʳ →