My file... is too big! My RAM is too small!



How To Read Humongous .txt Files With Python

Sometimes we have to work with .txt files that have millions of lines, and calling read() or readlines() on them would swallow all of our RAM in one go. How do we go about reading them without killing our machine?

We blow chunks obviously...

def readInChunks(fileObj, chunkSize=2048):
    """Lazily yield a file's contents in fixed-size chunks."""
    while True:
        data = fileObj.read(chunkSize)
        if not data:
            break
        yield data

with open('file.txt') as f:
    leftover = ''
    for chunk in readInChunks(f):
        # Stitch the leftover tail of the previous chunk onto this one,
        # otherwise a line that straddles a chunk boundary gets cut in half.
        lines = (leftover + chunk).split('\n')
        leftover = lines.pop()  # the last piece may be an incomplete line
        for line in lines:
            if line:
                print(line)
    if leftover:
        print(leftover)  # final line if the file has no trailing newline

You can change the chunkSize if you like. The default of 2048 bytes is tiny and will run on almost anything; a bigger value (say 64 KB or 1 MB) means fewer read() calls at the cost of a little more memory per chunk. Have fun reading your massive files.
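
Here's a minimal sketch of that in action, assuming the readInChunks generator above and a hypothetical file called big_log.txt, counting lines while reading 1 MB at a time:

# Hypothetical example: count lines in 'big_log.txt' while reading
# 1 MB at a time instead of the 2048-byte default.
lineCount = 0
with open('big_log.txt') as f:
    leftover = ''
    for chunk in readInChunks(f, chunkSize=1024 * 1024):
        pieces = (leftover + chunk).split('\n')
        leftover = pieces.pop()  # possibly incomplete last line
        lineCount += len(pieces)
if leftover:
    lineCount += 1  # file didn't end with a newline
print(lineCount, 'lines')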

Thanks for reading. x
