How To Read Humongous .txt Files With Python
Sometimes we have to work with .txt files that have millions of entries on each line. How do we go about reading them without killing our machine?
We blow chunks obviously...
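def readInChunks(fileObj, chunkSize=2048):
    """Yield successive chunks of up to chunkSize characters from fileObj."""
    while True:
        data = fileObj.read(chunkSize)
        if not data:
            break
        yield data

with open('file.txt') as f:
    leftover = ''  # partial line carried over from the previous chunk
    for chunk in readInChunks(f):
        lines = (leftover + chunk).split('\n')
        leftover = lines.pop()  # last piece may be cut off mid-line
        for line in lines:
            if line:
                print(line)
    if leftover:
        print(leftover)  # final line when the file has no trailing newline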
You can change chunkSize if you like, but the default of 2048 characters per read keeps memory use small enough to run on almost anything. Have fun reading your massive files.
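If the entries you care about are separated by something other than a newline (a comma, say), the same chunked read can be wrapped in a generator that hands back one entry at a time. This is only a rough sketch: iterEntries and its delimiter parameter are names made up for illustration, and it assumes the entries are plain text split by a fixed separator.

```python
import io

def iterEntries(fileObj, delimiter=',', chunkSize=2048):
    """Yield delimiter-separated entries from fileObj, one chunk at a time."""
    leftover = ''  # partial entry carried over from the previous chunk
    while True:
        data = fileObj.read(chunkSize)
        if not data:
            break
        pieces = (leftover + data).split(delimiter)
        leftover = pieces.pop()  # last piece may be cut off mid-entry
        for piece in pieces:
            if piece:  # skip empty entries, e.g. from doubled delimiters
                yield piece
    if leftover:
        yield leftover  # whatever remains after the final delimiter

# Tiny in-memory demo; a real file from open('file.txt') works the same way.
sample = io.StringIO('alpha,beta,gamma,delta')
entries = list(iterEntries(sample, chunkSize=4))
print(entries)
```

Because only one chunk plus one unfinished entry is held in memory at a time, this should behave the same on a few bytes or a few gigabytes, as long as no single entry is itself enormous.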
Thanks for reading. x
Resources
- Python: https://python.org