One of the things I've always enjoyed about Python is how easy it is to write simple text transform programs that can get more sophisticated over time, and we're going to go down that path today.
Now that we can take some arguments in our Python scripts, let's get to work.
There are many ways to read a text file in Python, but I'm going to talk about just two: line-by-line, or all-in-one.
My default choice is line-by-line when I only need to look at one line at a time. It does less work up front and keeps memory usage low, because the whole file is never held in memory at once.
import argparse


def main_with_args(args):
    line_count = 0
    with open(args.infile) as f:
        for line in f:
            # Count only the lines that start with a comment marker.
            if line.startswith("#"):
                line_count = line_count + 1
    print("Found {} commented lines in {}".format(line_count, args.infile))


def main():
    parser = argparse.ArgumentParser(description="My program")
    parser.add_argument("infile")
    args = parser.parse_args()
    main_with_args(args)


if __name__ == '__main__':
    main()
Note the following in this snippet:
- with, to make sure the file is closed even if an exception is thrown.
- for ... in, to enumerate lines in the file in a streaming (one-by-one) way.

Now, if we needed to work with all the lines at once, perhaps because we want to look ahead or behind when processing a line, this is how main_with_args would change.
def main_with_args(args):
    with open(args.infile) as f:
        # Read the whole file into a list of lines in one go.
        lines = f.readlines()
    print("Found {} total lines in {}".format(len(lines), args.infile))
I didn't show the filtering in this case, but note how I have all the lines in my lines list.
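To make the look-behind idea concrete, here is a minimal sketch of one kind of filtering you could do once everything is in a list. The helper name count_comment_block_starts and the rule it applies (a comment line counts only if the line before it is not a comment) are my own assumptions for illustration, not something the script above needs.

```python
def count_comment_block_starts(lines):
    """Count comment lines that begin a new run of comments.

    Hypothetical helper: a line counts when it starts with "#" and the
    previous line does not (or when it is the first line of the file).
    """
    count = 0
    for i, line in enumerate(lines):
        if not line.startswith("#"):
            continue
        # Look behind: with the whole list in memory we can index line i - 1.
        if i == 0 or not lines[i - 1].startswith("#"):
            count += 1
    return count
```

You would call it with the list returned by f.readlines(). The streaming loop could do the same thing by remembering the previous line, but indexing into a list stays simpler as the look-ahead or look-behind rules grow.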
The second method is also very handy when you want to overwrite the original file, as you need to close it before you can open it again for writing out.
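As a rough sketch of that overwrite pattern, assuming we want to drop comment lines from a file in place: strip_comments_in_place is a hypothetical helper name, and the filter is just an example.

```python
def strip_comments_in_place(path):
    """Rewrite the file at path without its comment lines (illustrative helper)."""
    # Read everything first; leaving the with block closes the file.
    with open(path) as f:
        lines = f.readlines()

    # Reopening the same path in "w" mode truncates it before writing.
    with open(path, "w") as f:
        for line in lines:
            if not line.startswith("#"):
                f.write(line)
```

One thing to keep in mind with this approach is that the original content is gone as soon as the file is reopened for writing, so if the script fails partway through the second block you can end up with a partial file.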
Happy text reading!