Playing with Snakes

Chapter 12: Files: Long Term Storage

Hofstadter's Law:
It always takes longer than you expect, even when you take into account Hofstadter's Law.

So if you'll recall, a few chapters back we built a program to keep track of all the zombies. We ended up with data like this:

{'braaain':{'speed':5,'swim':False,'height':70,'dance':"not bad"},
'grawg':{'speed':3,'swim':True,'height':70,'dance':"Excellent"},
'Kelly':{'speed':6,'swim':False,'height':65,'dance':'Mediocre'}}

That tells us what each of our zombies is like. We also had a list that told us what supplies we had:

['food','water','zombie repellent','tent','chainsaw']

That's great and all, but let's admit one terrifying truth right now: Eventually they're going to overrun us, and we're going to need to move. In that case, what happens when we power our zombie simulator down in order to take it with us? All our zombie information is going to disappear! In order to save information when the computer turns off, we need to save information to a file.

Reading and Writing files

The simplest way to open a file is through the open function. open takes two arguments: the location of the file, and the mode we want to open the file in. The location is pretty easy to understand. It's just a string that describes the file's location on the computer, like "C:\My Documents\zombies.txt" or "/home/me/zombies.txt". (In real Python code, write a Windows path as a raw string, like r"C:\My Documents\zombies.txt", so the backslashes aren't treated as escape characters.)

Modes are trickier. First of all, they're optional. By default, Python will assume you want to open your file in read only mode. So by default, Python uses 'r' as the mode. This is good. Opening a file in read mode prevents you from accidentally overwriting data.

But if you want to do something else, like writing or appending, you have other options. If we open the file in append mode, using 'a', then we can write to the end of the file, but we can't modify any other part (and 'a' will create the file if it's missing). 'w' overwrites an existing file (or creates a new one if there isn't one) and only lets you write. 'r+' opens an existing file for both reading and writing.

I assume that since the above was a lot of text, and you're in a hurry to get away from the zombies, you'd prefer it in a table. Here you go:

Character  Mode            Description
r          read            can only read data
w          write           can only write data, destroys existing file
a          append          can only add on to end of file
r+         read and write  can read or write to any part of the file
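If you'd like to see the table in action, here's a small sketch. The file name modes_demo.txt and the temporary folder are made up for this example so that nothing important gets destroyed; they aren't part of our zombie program.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder, so no real data is at risk
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

f = open(path, 'w')      # 'w' creates the file (or wipes an existing one)
f.write("first line\n")
f.close()

f = open(path, 'a')      # 'a' adds to the end and keeps what is already there
f.write("second line\n")
f.close()

f = open(path)           # no mode given, so Python uses 'r'
contents = f.read()
f.close()

f = open(path, 'w')      # opening with 'w' again destroys the old contents
f.close()

f = open(path, 'r')
after_rewrite = f.read()
f.close()

print(contents)
print(repr(after_rewrite))
```

The last read comes back empty, which is exactly why read mode is the safe default.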

So, since we're moving locations here, let's open a new file for writing. I'll assume you're on a Unix-like system, since that's the type of computer system they use in Jurassic Park and the one most likely to be useful for corralling dinosaurs.

    f = open('/home/us/zombieDB.txt','w')

Writing Data to a File

Before you read into this too far, understand that I'm going to go through an in-depth way to store our zombies, and then provide a really easy way to do it at the bottom. So, skim this section, as it's important for reading regular text files, but understand that there's a super easy way to store dictionaries, lists, and other objects that I'll explain at the bottom.

Now we've got our file open, so let's write our data. Python will happily write any string we hand it, but there's a limitation: a text file only holds strings, so anything else has to be converted with str first, and everything we read back later will come in as a string. That means we need to settle on a standard way of writing data to our file to make it easy to read. Here's my plan:

Current Date
Start Equipment
Object 1
Object 2
etc
End Equipment
Start Zombies
Name:Zombie Name
Speed:Zombie Speed
etc
End Zombies

We're putting a couple of things in there that will make it easy to read this file later. First, we're putting the words "Start Equipment" and "End Equipment" to note where our equipment is. We're also doing the same for the Zombies. Second, we're pairing Zombie stats with the type of stat they are, making it easy to add these to a dictionary.

We'll write everything using the write method of the file object we created earlier.

    import datetime

    zombies = {'braaain':{'speed':5,'swim':False,'height':70,'dance':"not bad"},
    'grawg':{'speed':3,'swim':True,'height':70,'dance':"Excellent"},
    'Kelly':{'speed':6,'swim':False,'height':65,'dance':'Mediocre'}}

    equipment = ['food','water','zombie repellent','tent','chainsaw']

    #I left off most of the file path, so Python will store it in our
    #current working directory
    f = open('zombieDB.txt','w')

    #This is needlessly precise, but when you're dealing with zombies
    #you can't be too careful
    f.write(str(datetime.datetime.now()) + "\n")
    f.write("Start Equipment\n")
    for i in equipment:
        f.write(i + "\n")
    f.write("End Equipment\n")

    f.write("Start Zombies\n")
    #items() works on both Python 2 and Python 3 (iteritems() is Python 2 only)
    for name, zombie in zombies.items():
        f.write("Name:" + name + "\n")
        for stat, value in zombie.items():
            #write only accepts strings, so convert numbers and booleans
            f.write(stat + ":" + str(value) + "\n")
    f.write("End Zombies\n")
    f.close()

While you're here, also check out the datetime library, which we can use for date and time functions. It has a lot of options so you can format the time or the date however you'd like.

close closes the file for us. This means it can't be used any more. While strictly speaking it wasn't necessary in our program (Python closed the file automatically when our program ended), it's usually polite to close what you opened.
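If you're worried you'll forget to be polite, Python's with statement will close the file for you as soon as the indented block ends. This is a minimal sketch, not a change to our zombie program; the file name supplies.txt and the temporary folder are made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file, just for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'supplies.txt')

with open(path, 'w') as f:
    f.write("chainsaw\n")

# Leaving the with block closed the file, even though we never called close
print(f.closed)
```

The file also gets closed if something goes wrong partway through the block, which is handy when zombies interrupt you.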

Anyway, enough of this, it's time to run to our new, secret, zombie secure facility.

Reading a file back in

Ok, good. We've made it to our new, secret, zombie secure facility. Time to read the file back in.

For this, we'll open our file in read mode, and then loop through the file, one line at a time. Because we put markers in, like "Start Zombies" it will be easy to tell where each part of our file is.

But we actually have three options. If we use the readlines method, we can read the entire file into a list all at once. On the other hand, we can use a loop to go through the file one line at a time. For this example, it's best to loop, as we have more than one type of data in the file. But if the file held just one kind of data, like nothing but the equipment list, you could just use readlines.
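Here's a rough sketch of the readlines option, assuming a file that holds nothing but equipment. The file name equipment.txt is made up for the example, and the first few lines only exist to create something to read.

```python
import os
import tempfile

# Hypothetical equipment-only file, created here so the example can run
path = os.path.join(tempfile.mkdtemp(), 'equipment.txt')
f = open(path, 'w')
f.write("food\nwater\ntent\n")
f.close()

f = open(path)
lines = f.readlines()   # the whole file at once, one list entry per line
f.close()

# Each entry still has its "\n" on the end
print(lines)
```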

But I promised three options, and I've only given two! There are actually two ways to loop. You can loop with a for loop, like this:

    #open is a plain function that returns the file object
    f = open('file.txt')
    for line in f:
        print(line)

Or you can use a while loop to loop through and read each line:

    f = open('file.txt')
    line = f.readline()
    #readline returns an empty string once we hit the end of the file
    while line != '':
        print(line)
        line = f.readline()

Of these two types we'll be using the while loop, since we'll be using nested loops.

Anyway, our program to read the data:

    f = open('zombieDB.txt','r')

    equipment = []
    zombies = {}

    line = f.readline().rstrip()
    while line != '':
        if line == "Start Equipment":
            #We use this next readline so we don't add "Start Equipment" to our list
            line = f.readline().rstrip()
            while line != "End Equipment":
                equipment.append(line)
                line = f.readline().rstrip()
        elif line == "Start Zombies":
            #We use this next line to skip the line that says "Start Zombies"
            line = f.readline().rstrip()
            while line != "End Zombies":
                #We stored zombie stats with a : separating the trait and the stat
                #we'll use split to break that string up into the pair
                statPair = line.split(":")
                if statPair[0] == "Name":
                    name = statPair[1]
                    zombies[name] = {}
                else:
                    zombies[name][statPair[0]] = statPair[1]
                line = f.readline().rstrip()
        else:
            line = f.readline().rstrip()
    f.close()

    print(equipment)
    print(zombies)

Most people would consider this code to be really ugly. They'd be right. But you should pay attention to the rstrip method on each of our readline calls. Look at one of those lines again:

    line = f.readline().rstrip()

Python evaluates a chain of method calls like this from left to right, much like how you read. So first it'll do the f.readline(), and give that a value. For us, that value might be something like "End Zombies\n". But wait, where'd that \n come from? Remember when we wrote our file, we put a \n on the end of every line so that we could have new lines. Now we want to get rid of all the \n's. That's a pretty common operation. So we call rstrip on that value, which strips whitespace (including the \n) off the right end of the string. We end up with "End Zombies", which we store in line.
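You can try rstrip on its own without any file at all. This little sketch uses made-up strings to show what it removes and what it leaves alone.

```python
raw = "End Zombies\n"
line = raw.rstrip()

print(repr(raw))    # repr shows the hidden "\n"
print(repr(line))

# With no argument, rstrip removes every kind of trailing whitespace
padded = "tent  \t\n".rstrip()

# Pass "\n" if you only want to drop newlines and keep trailing spaces
newline_only = "tent  \n".rstrip("\n")

print(repr(padded))
print(repr(newline_only))
```

Notice that rstrip only touches the right end of the string, so the space in the middle of "End Zombies" is safe.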

This data-reading program that I wrote is also bad for another reason: when you read in data, it comes in as a string. So the booleans we set on our data earlier will be read as "True" and "False" instead of True or False, and the numbers will come back as "5" and "70" instead of 5 and 70. This might seem subtle, but subtle bugs are usually the hardest to track down.
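If you were stuck with the text format, one way to patch this up would be to convert each value by hand after reading it. This is only a sketch: parse_value is a helper I'm making up here, and it assumes the only types we care about are booleans, whole numbers, and plain strings.

```python
def parse_value(text):
    """Turn a string from the file back into a bool or int where possible."""
    if text == "True":
        return True
    if text == "False":
        return False
    try:
        return int(text)
    except ValueError:
        # Not a number, so leave it as the string it already is
        return text

# What one zombie looks like straight out of the text file
raw_zombie = {'speed': '5', 'swim': 'False', 'height': '70', 'dance': 'not bad'}

zombie = dict((stat, parse_value(value)) for stat, value in raw_zombie.items())
print(zombie)
```

It works, but it's one more thing to keep in sync every time a zombie learns a new trick.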

So, how about a better method?

Pickling

Wouldn't it be super if we could just save our list of equipment and our dictionary of zombies directly, and then read them back? It would. Just agree. This is a common operation, and it's called pickling. Pickling lets you easily store almost any Python object in a file, and then read it back.

Here's our new program for saving zombie information:

    import datetime
    import pickle

    zombies = {'braaain':{'speed':5,'swim':False,'height':70,'dance':"not bad"},
    'grawg':{'speed':3,'swim':True,'height':70,'dance':"Excellent"},
    'Kelly':{'speed':6,'swim':False,'height':65,'dance':'Mediocre'}}

    equipment = ['food','water','zombie repellent','tent','chainsaw']

    #Pickles are bytes, not text, so open in binary mode ('wb').
    #Python 3 requires this, and it's harmless on Python 2.
    f = open('zombieDB.txt','wb')
    pickle.dump(datetime.datetime.now(),f)
    pickle.dump(equipment,f)
    pickle.dump(zombies,f)
    f.close()

And here's the new code for reading back the data:

    import pickle

    #Binary mode again ('rb'), to match how the file was written
    f = open('zombieDB.txt','rb')

    #Load the objects in the same order they were dumped
    date = pickle.load(f)
    equipment = pickle.load(f)
    zombies = pickle.load(f)
    f.close()

    print(date)
    print(equipment)
    print(zombies)

Notice that the order data is stored and read in is important. But also notice that equipment was read directly into a list, and zombies was read in directly as a dictionary. That saves us a lot of time.
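If you want to convince yourself that the types really do survive the trip, here's a quick sketch that pickles to a string of bytes in memory instead of a file, using pickle.dumps and pickle.loads. The single zombie here is trimmed down for the example.

```python
import pickle

zombie = {'speed': 5, 'swim': False, 'dance': 'not bad'}

data = pickle.dumps(zombie)     # the object, turned into bytes
restored = pickle.loads(data)   # and turned back into an object

print(restored == zombie)
print(type(restored['swim']))   # a real bool, not the string "False"
print(type(restored['speed']))  # a real int, not the string "5"
```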

A Friendly Notice

Remember how I started this chapter with a sidebar that said to remember exceptions? I didn't include any exception handling in any of the sample code in this chapter because I didn't want to make the examples longer than they already were. But you should be a better programmer than I was here.
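To give you a head start on being that better programmer, here's a minimal sketch of what the exception handling might look like when a file can't be opened. The function name load_lines and the missing file are made up for the example, and returning an empty list is just one reasonable choice among several.

```python
import os
import tempfile

def load_lines(path):
    """Return the lines of a file, or an empty list if it can't be opened."""
    try:
        f = open(path)
    except IOError as err:
        # The file is missing or unreadable, so say so and carry on
        print("Could not open " + path + ": " + str(err))
        return []
    try:
        return f.readlines()
    finally:
        # Runs whether or not readlines succeeded
        f.close()

# A path that deliberately doesn't exist (hypothetical name)
missing = os.path.join(tempfile.mkdtemp(), 'no_such_file.txt')
result = load_lines(missing)
print(result)
```

That way a missing database means an empty zombie list instead of a crashed program, which is a much better state to be in when you're on the run.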

This website will be taken offline before the end of 2011