This code will read the entire file into memory and remove all whitespace characters (newlines and spaces) from the end of each line:
with open(filename) as file:
    lines = file.readlines()
    lines = [line.rstrip() for line in lines]
If you’re working with a large file, then you should instead read and process it line-by-line:
with open(filename) as file:
    for line in file:
        print(line.rstrip())
In Python 3.8 and up you can use a while loop with the walrus operator like so:
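with open(filename) as file:
    # Strip inside the loop body: stripping in the condition would turn a
    # blank line into '' and end the loop before the end of the file.
    while (line := file.readline()):
        print(line.rstrip())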
Depending on what you plan to do with your file and how it was encoded, you may also want to manually set the access mode and character encoding:
with open(filename, 'r', encoding='UTF-8') as file:
    # readline() returns '' only at end of file, so blank lines are kept.
    while (line := file.readline()):
        print(line.rstrip())
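One caveat with the walrus form is worth a quick sketch. If `rstrip()` is called inside the loop condition, a blank line becomes an empty string, which is falsy, so the loop ends early. The file contents and variable names below are made up for illustration.

```python
import os
import tempfile

# Hypothetical sample file with a blank line in the middle.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\n\nthird\n")
    sample_path = tmp.name

stripped_in_condition = []
with open(sample_path, encoding="utf-8") as file:
    # Stripping here turns the blank line into '', which is falsy.
    while (line := file.readline().rstrip()):
        stripped_in_condition.append(line)

stripped_in_body = []
with open(sample_path, encoding="utf-8") as file:
    # readline() returns '' only at end of file, so blank lines survive.
    while (line := file.readline()):
        stripped_in_body.append(line.rstrip())

os.remove(sample_path)

print(stripped_in_condition)  # ['first']
print(stripped_in_body)       # ['first', '', 'third']
```

Stripping in the loop body keeps every line, including empty ones.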
See Input and Output:
with open('filename') as f:
    lines = f.readlines()
or with stripping the newline character:
with open('filename') as f:
    lines = [line.rstrip() for line in f]
How to read a file line-by-line into a list in Python?
According to Python’s Methods of File Objects, the simplest way to convert a text file into a list is:
with open('file.txt') as f:
    my_list = list(f)
    # my_list = [x.rstrip() for x in f]  # remove line breaks
If you just need to iterate over the text file lines, you can use:
with open('file.txt') as f:
    for line in f:
        ...
Alternatively, using with and readlines():
with open('file.txt') as f:
    lines = f.readlines()
If you don’t care about closing the file, this one-liner will work:
lines = open('file.txt').readlines()
The traditional way:
f = open('file.txt')           # Open the file in read mode
lines = f.read().splitlines()  # List with line breaks stripped
f.close()                      # Close the file
This is more explicit than necessary but does what you want.
with open("file.txt") as file_in:
    lines = []
    for line in file_in:
        lines.append(line)
Introduced in Python 3.4, pathlib has a really convenient method for reading in text from files, as follows:
from pathlib import Path

p = Path('my_text_file')
lines = p.read_text().splitlines()
(The splitlines call is what turns it from a string containing the whole contents of the file into a list of the lines in the file.)
pathlib has a lot of handy conveniences in it.
read_text is nice and concise, and you don’t have to worry about opening and closing the file. If all you need to do with the file is read it all in in one go, it’s a good choice.
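As a small runnable sketch of the same idea, the example below writes a throwaway file into a temporary directory and reads it back. The file name and contents are assumptions for the demo, and the explicit `encoding="utf-8"` is an optional extra rather than something the text above requires.

```python
import tempfile
from pathlib import Path

# Hypothetical file created in a temporary directory just for this demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    p = Path(tmp_dir) / "my_text_file.txt"
    p.write_text("line 1\nline 2\nline 3\n", encoding="utf-8")

    # read_text opens, reads, and closes the file in a single call.
    lines = p.read_text(encoding="utf-8").splitlines()

print(lines)  # ['line 1', 'line 2', 'line 3']
```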
This will yield an “array” of lines from the file.
lines = tuple(open(filename, 'r'))
open returns a file that can be iterated over. When you iterate over a file, you get the lines from that file.
tuple can take an iterator and instantiate a tuple instance for you from the iterator that you give it.
lines is a tuple created from the lines of the file.
How to read a file line-by-line into a list using NumPy?
Another option is numpy.genfromtxt, for example:
import numpy as np

data = np.genfromtxt("yourfile.dat", delimiter="\n")
This will make data a NumPy array with as many rows as are in your file.
If you want the \n included:
with open(fname) as f:
    content = f.readlines()
If you do not want the \n included:
with open(fname) as f:
    content = f.read().splitlines()
Given a text file with this content:
line 1
line 2
line 3
We can use any of these Python snippets from the same directory as the text file:
>>> with open("myfile.txt", encoding="utf-8") as file:
...     x = [l.rstrip("\n") for l in file]
...
>>> x
['line 1', 'line 2', 'line 3']
x = []
with open("myfile.txt") as file:
    for l in file:
        x.append(l.strip())
>>> x = open("myfile.txt").read().splitlines()
>>> x
['line 1', 'line 2', 'line 3']
>>> x = open("myfile.txt").readlines()
>>> x
['line 1\n', 'line 2\n', 'line 3\n']
def print_output(lines_in_textfile):
    print("lines_in_textfile =", lines_in_textfile)

y = [x.rstrip() for x in open("001.txt")]
print_output(y)

with open('001.txt', 'r', encoding='utf-8') as file:
    lines = file.read().splitlines()
    print_output(lines)

with open('001.txt', 'r', encoding='utf-8') as file:
    lines = [x.rstrip("\n") for x in file]
    print_output(lines)
lines_in_textfile = ['line 1', 'line 2', 'line 3']
lines_in_textfile = ['line 1', 'line 2', 'line 3']
lines_in_textfile = ['line 1', 'line 2', 'line 3']
Clean and Pythonic Way of Reading the Lines of a File Into a List
First and foremost, you should focus on opening your file and reading its contents in an efficient and pythonic way. Here is an example of the way I personally DO NOT prefer:
infile = open('my_file.txt', 'r')  # Open the file for reading.
data = infile.read()               # Read the contents of the file.
infile.close()                     # Close the file since we're done using it.
Instead, I prefer the below method of opening files for both reading and writing as it is very clean, and does not require an extra step of closing the file once you are done using it. In the statement below, we’re opening the file for reading, and assigning it to the variable ‘infile.’ Once the code within this statement has finished running, the file will be automatically closed.
# Open the file for reading.
with open('my_file.txt', 'r') as infile:
    data = infile.read()  # Read the contents of the file into memory.
Now we need to focus on bringing this data into a Python List because they are iterable, efficient, and flexible. In your case, the desired goal is to bring each line of the text file into a separate element. To accomplish this, we will use the splitlines() method as follows:
# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()
The Final Product:
# Open the file for reading.
with open('my_file.txt', 'r') as infile:
    data = infile.read()  # Read the contents of the file into memory.

# Return a list of the lines, breaking at line boundaries.
my_list = data.splitlines()
Testing Our Code:
- Contents of the text file:
A fost odatã ca-n povesti,
A fost ca niciodatã,
Din rude mãri împãrãtesti,
O prea frumoasã fatã.
- Print statements for testing purposes:
# Python 2 syntax (print statement).
print my_list  # Print the list.

# Print each line in the list.
for line in my_list:
    print line

# Print the fourth element in this list.
print my_list[3]
- Output (different-looking because of unicode characters):
['A fost odat\xc3\xa3 ca-n povesti,', 'A fost ca niciodat\xc3\xa3,', 'Din rude m\xc3\xa3ri \xc3\xaemp\xc3\xa3r\xc3\xa3testi,', 'O prea frumoas\xc3\xa3 fat\xc3\xa3.']
A fost odatã ca-n povesti,
A fost ca niciodatã,
Din rude mãri împãrãtesti,
O prea frumoasã fatã.
O prea frumoasã fatã.
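The print statements and byte escapes above are Python 2 behavior. A rough Python 3 equivalent of the same test is sketched below. It assumes the file is UTF-8 encoded, and it writes the sample text to a temporary file so that it runs on its own (the path and variable names are illustrative).

```python
import os
import tempfile

poem = (
    "A fost odatã ca-n povesti,\n"
    "A fost ca niciodatã,\n"
    "Din rude mãri împãrãtesti,\n"
    "O prea frumoasã fatã.\n"
)

# Write the sample text to a throwaway UTF-8 file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(poem)
    poem_path = tmp.name

# Passing the encoding explicitly makes each element a decoded str.
with open(poem_path, "r", encoding="utf-8") as infile:
    my_list = infile.read().splitlines()

os.remove(poem_path)

print(len(my_list))                            # 4
print(my_list[3] == "O prea frumoasã fatã.")   # True
```

Because the elements are text rather than raw bytes, each accented letter counts as a single character.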
This is how we read a file line-by-line into a list in Python.
You could simply do the following, as has been suggested:
with open('/your/path/file') as f:
    my_lines = f.readlines()
Note that this approach has 2 downsides:
1) You store all the lines in memory. In the general case, this is a very bad idea. The file could be very large, and you could run out of memory. Even if it’s not large, it is simply a waste of memory.
2) This does not allow processing of each line as you read them. So if you process your lines after this, it is not efficient (requires two passes rather than one).
A better approach for the general case would be the following:
with open('/your/path/file') as f:
    for line in f:
        process(line)
Where you define your process function any way you want. For example:
def process(line):
    if 'save the world' in line.lower():
        superman.save_the_world()
(The implementation of the Superman class is left as an exercise for you.)
This will work nicely for any file size and you go through your file in just 1 pass. This is typically how generic parsers will work.
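Here is a minimal runnable sketch of that single-pass pattern. The function name `count_matching_lines` is made up for illustration, and `io.StringIO` stands in for a real file so the example is self-contained.

```python
import io


def count_matching_lines(lines, phrase):
    """Count the lines that contain phrase (case-insensitive) in one pass."""
    needle = phrase.lower()
    count = 0
    for line in lines:  # only the current line needs to be in memory
        if needle in line.lower():
            count += 1
    return count


# io.StringIO behaves like an open text file; a file object returned by
# open(...) could be passed in exactly the same way.
fake_file = io.StringIO(
    "Save the world\nnothing here\nplease SAVE THE WORLD\n"
)
matches = count_matching_lines(fake_file, "save the world")
print(matches)  # 2
```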
Read a file line-by-line into a list in Python
To read a file into a list you need to do three things:
- Open the file
- Read the file
- Store the contents as list
Fortunately, Python makes it very easy to do these things so the shortest way to read a file into a list is:
lst = list(open(filename))
However, I’ll add some more explanation.
Opening the file
I assume that you want to open a specific file and you don’t deal directly with a file-handle (or a file-like-handle). The most commonly used function to open a file in Python is open. It takes one mandatory argument and two optional ones in Python 2.7:
- Filename
- Mode
- Buffering (I’ll ignore this argument in this answer)
The filename should be a string that represents the path to the file. For example:
open('afile')                 # opens the file named afile in the current working directory
open('adir/afile')            # relative path (relative to the current working directory)
open('C:/users/aname/afile')  # absolute path (Windows)
open('/usr/local/afile')      # absolute path (Linux)
Note that the file extension needs to be specified. This is especially important for Windows users because file extensions such as .doc are hidden by default when viewed in Explorer.
The second argument is the mode. It is r by default, which means “read-only”. That’s exactly what you need in your case.
But in case you actually want to create a file and/or write to a file you’ll need a different argument here.
For reading a file you can omit the mode or pass it in explicitly:
open(filename)
open(filename, 'r')
Both will open the file in read-only mode. In case you want to read in a binary file on Windows you need to use the mode rb:
open(filename, 'rb')
In Python 2 on other platforms the 'b' (binary mode) is simply ignored. In Python 3, binary mode returns bytes instead of str on every platform.
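In Python 3 the mode also changes what you get back, not just how newlines are handled. The sketch below uses a temporary file with Windows-style line endings (contents and names are illustrative) to show the difference between text mode and binary mode.

```python
import os
import tempfile

# Hypothetical file written with Windows-style \r\n line endings.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"line 1\r\nline 2\r\n")
    bin_path = tmp.name

# Text mode: decodes to str and translates \r\n to \n by default.
with open(bin_path, "r", encoding="utf-8") as f:
    text_lines = f.readlines()

# Binary mode: returns the raw bytes, untouched.
with open(bin_path, "rb") as f:
    byte_lines = f.readlines()

os.remove(bin_path)

print(text_lines)  # ['line 1\n', 'line 2\n']
print(byte_lines)  # [b'line 1\r\n', b'line 2\r\n']
```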
Now that I’ve shown you how to open the file, let’s talk about the fact that you always need to close it again. Otherwise, it will keep an open file handle to the file until the process exits (or Python garbage-collects the file handle).
While you could use:
f = open(filename)
# ... do stuff with f
f.close()
That will fail to close the file when something between open and close throws an exception. You could avoid that by using try and finally:
f = open(filename)
# nothing in between!
try:
    # do stuff with f
finally:
    f.close()
However, Python provides context managers that have a prettier syntax (but for open it’s almost identical to the try and finally above):
with open(filename) as f:
    # do stuff with f
# The file is always closed after the with-scope ends.
The last approach is the recommended approach to open a file in Python!
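To see that the context manager really does close the file when something goes wrong, here is a small sketch. It assumes a throwaway temporary file and a deliberately raised exception, and it checks the file object's `closed` attribute afterwards.

```python
import os
import tempfile

# Hypothetical file created just for this demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("some text\n")
    demo_path = tmp.name

try:
    with open(demo_path) as f:
        # Simulate an error while working with the file.
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

# f is still bound after the with block, but the file itself is closed.
closed_after_error = f.closed
os.remove(demo_path)

print(closed_after_error)  # True
```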
Reading the file
Okay, you’ve opened the file, now how to read it?
The open function returns a file object, which supports Python’s iteration protocol. Each iteration will give you a line:
with open(filename) as f:
    for line in f:
        print(line)
This will print each line of the file. Note however that each line will contain a newline character \n at the end (you might want to check if your Python is built with universal newlines support; otherwise you could also have \r\n on Windows or \r on Mac as newlines). If you don’t want that you can simply remove the last character (or the last two characters on Windows):
with open(filename) as f:
    for line in f:
        print(line[:-1])
But the last line doesn’t necessarily have a trailing newline, so one shouldn’t use that. One could check if it ends with a trailing newline and if so remove it:
with open(filename) as f:
    for line in f:
        if line.endswith('\n'):
            line = line[:-1]
        print(line)
But you could simply remove all whitespace (including the \n character) from the end of the string. This will also remove all other trailing whitespace, so you have to be careful if that is important:
with open(filename) as f:
    for line in f:
        print(line.rstrip())
However, if the lines end with \r\n (Windows “newlines”), .rstrip() will also take care of the \r!
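The difference between the stripping variants is easy to check on a single string. In this sketch the sample value is made up, and `repr` is used so that the remaining whitespace is visible.

```python
raw = "value with trailing spaces  \r\n"

# No argument: removes all trailing whitespace, including \r and \n.
all_whitespace = raw.rstrip()

# Only line terminators: trailing spaces are preserved.
terminators_only = raw.rstrip("\r\n")

# Only \n: a stray \r from Windows line endings is left behind.
newline_only = raw.rstrip("\n")

print(repr(all_whitespace))    # 'value with trailing spaces'
print(repr(terminators_only))  # 'value with trailing spaces  '
print(repr(newline_only))      # 'value with trailing spaces  \r'
```

If trailing spaces or tabs carry meaning in your data, passing the characters to strip explicitly is the safer choice.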
Store the contents as list
Now that you know how to open the file and read it, it’s time to store the contents in a list. The simplest option would be to use the list function:
with open(filename) as f:
    lst = list(f)
In case you want to strip the trailing newlines you could use a list comprehension instead:
with open(filename) as f:
    lst = [line.rstrip() for line in f]
Or even simpler: the .readlines() method of the file object by default returns a list of the lines:
with open(filename) as f:
    lst = f.readlines()
This will also include the trailing newline characters. If you don’t want them, I would recommend the [line.rstrip() for line in f] approach because it avoids keeping two lists containing all the lines in memory.
There’s an additional option to get the desired output, however it’s rather “suboptimal”: read the complete file into a string and then split on newlines:
with open(filename) as f:
    lst = f.read().split('\n')
or:
with open(filename) as f:
    lst = f.read().splitlines()
These drop the newline characters automatically because the split character isn’t included in the results (although split('\n') leaves a trailing empty string when the file ends with a newline). However, they are not ideal because you keep the file both as a string and as a list of lines in memory!
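The two splitting methods are not quite interchangeable. A short sketch on a made-up string that ends with a newline, as most text files do:

```python
text = "line 1\nline 2\n"

# split('\n') treats the final newline as a separator, so an empty
# string appears at the end of the result.
with_split = text.split("\n")

# splitlines() treats it as a line terminator instead.
with_splitlines = text.splitlines()

# keepends=True keeps the terminators, like readlines() does.
with_keepends = text.splitlines(keepends=True)

print(with_split)       # ['line 1', 'line 2', '']
print(with_splitlines)  # ['line 1', 'line 2']
print(with_keepends)    # ['line 1\n', 'line 2\n']
```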
- Use with open(...) as f when opening files, because you don’t need to take care of closing the file yourself and it closes the file even if an exception happens.
- file objects support the iteration protocol, so reading a file line-by-line is as simple as for line in the_file_object:.
- Always browse the documentation for the available functions/classes. Most of the time there’s a perfect match for the task or at least one or two good ones. The obvious choice, in this case, would be readlines(), but if you want to process the lines before storing them in the list I would recommend a simple list comprehension.
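These points can be combined into one small helper. This is only a sketch: `read_lines` and its `strip_newlines` parameter are hypothetical names rather than anything from the standard library, and the UTF-8 default is an assumption about your files.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8", strip_newlines=True):
    """Return the lines of a text file as a list (illustrative helper)."""
    with open(path, "r", encoding=encoding) as f:
        if strip_newlines:
            # Iterate the file directly and strip only the line terminator.
            return [line.rstrip("\n") for line in f]
        return list(f)


# Demo on a throwaway file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("line 1\nline 2\nline 3\n")
    helper_path = tmp.name

stripped = read_lines(helper_path)
unstripped = read_lines(helper_path, strip_newlines=False)
os.remove(helper_path)

print(stripped)    # ['line 1', 'line 2', 'line 3']
print(unstripped)  # ['line 1\n', 'line 2\n', 'line 3\n']
```

Using `rstrip("\n")` rather than a bare `rstrip()` keeps any meaningful trailing spaces. Swap it for `rstrip()` if you want all trailing whitespace removed.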
In this post, we learned how to read a file line-by-line into a list in Python using multiple methods.