Saving structured data with json
Strings can easily be written to and read from a file. Numbers take a bit more effort, since the
read() method only returns strings, which will have to be passed to a function like
int(), which takes a string like
'123' and returns its numeric value 123. When you want to save more complex data types like nested lists and dictionaries, parsing and serializing by hand becomes complicated.
json can take Python data hierarchies, and convert them to string representations; this process is called serializing. Reconstructing the data from the string representation is called deserializing. Between serializing and deserializing, the string representing the object may have been stored in a file or data, or sent over a network connection to some distant machine.
The JSON format is commonly used by modern applications to allow for data exchange. Many programmers are already familiar with it, which makes it a good choice for interoperability.
If you have an object
x, you can view its JSON string representation with a simple line of code:
>>> json.dumps([1, 'simple', 'list'])
'[1, "simple", "list"]'
To decode the object again, if
f is a text file object which has been opened for reading:
x = json.load(f)
This simple serialization technique can handle lists and dictionaries, but serializing arbitrary class instances in JSON requires a bit of extra effort. The reference for the
json module contains an explanation of this.
pickle - the pickle module
Contrary to JSON, pickle is a protocol which allows the serialization of arbitrarily complex Python objects. As such, it is specific to Python and cannot be used to communicate with applications written in other languages. It is also insecure by default: deserializing pickle data coming from an untrusted source can execute arbitrary code, if the data was crafted by a skilled attacker.