Python Object Serialization: YAML & JSON
What is YAML?
Let's see what it looks like (example from Wikipedia):
---
receipt:     Oz-Ware Purchase Invoice
date:        2012-08-06
customer:
    given:   Dorothy
    family:  Gale

items:
    - part_no:   A4786
      descrip:   Water Bucket (Filled)
      price:     1.47
      quantity:  4

    - part_no:   E1628
      descrip:   High Heeled "Ruby" Slippers
      size:      8
      price:     100.27
      quantity:  1

bill-to:  &id001
    street: |
            123 Tornado Alley
            Suite 16
    city:   East Centerville
    state:  KS

ship-to:  *id001

specialDelivery:  >
    Follow the Yellow Brick Road to the Emerald City.
    Pay no attention to the man behind the curtain.
...
- Strings do not require quotation marks.
- The specific number of spaces in the indentation is unimportant as long as parallel elements have the same left justification and the hierarchically nested elements are indented further.
- The sample above defines:
  - An associative array with 7 top-level keys.
  - The "items" key contains a 2-element array (or "list"), each element of which is itself an associative array with differing keys.
- Relational data and redundancy removal are displayed:
  - The "ship-to" associative array content is copied from the "bill-to" associative array's content, as indicated by the anchor (&) and reference (*) labels.
- Optional blank lines can be added for readability.
- Multiple documents can exist in a single file/stream and are separated by "---".
- An optional "..." can be used at the end of a file (useful for signaling an end in streamed communications without closing the pipe).
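The points in the list above can be checked by loading the sample with PyYAML. This is only a sketch: the YAML text is a hand re-indented copy of the sample, the two-document stream at the end is made up for illustration, and `yaml.safe_load` / `yaml.safe_load_all` are used here as one reasonable way to parse it.

```python
import yaml  # PyYAML, the third-party package used later in this article

# Hand re-indented copy of the invoice sample
doc = """\
---
receipt: Oz-Ware Purchase Invoice
date: 2012-08-06
customer:
    given: Dorothy
    family: Gale
items:
    - part_no: A4786
      descrip: Water Bucket (Filled)
      price: 1.47
      quantity: 4
    - part_no: E1628
      descrip: High Heeled "Ruby" Slippers
      size: 8
      price: 100.27
      quantity: 1
bill-to: &id001
    street: |
        123 Tornado Alley
        Suite 16
    city: East Centerville
    state: KS
ship-to: *id001
specialDelivery: >
    Follow the Yellow Brick Road to the Emerald City.
    Pay no attention to the man behind the curtain.
...
"""

invoice = yaml.safe_load(doc)

print(len(invoice))                               # number of top-level keys
print(len(invoice["items"]))                      # number of entries under "items"
print(invoice["ship-to"] == invoice["bill-to"])   # alias resolves to the anchored mapping
print(repr(invoice["bill-to"]["street"]))         # "|" keeps the line break

# Illustrative stream (not from the sample): two documents separated by "---"
stream = "---\nname: first\n---\nname: second\n...\n"
docs = list(yaml.safe_load_all(stream))
print(len(docs))
```

Note that the unquoted date is loaded as a `datetime.date` object rather than a string, since PyYAML resolves plain scalars to native types where it can.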
The following is excerpted from an answer to the question "What is the difference between YAML and JSON?":
Technically YAML is a superset of JSON. This means that, in theory at least, a YAML parser can understand JSON, but not necessarily the other way around.
See the official specs, in the section entitled "YAML: Relation to JSON".
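As a small illustration of the superset claim, the sketch below feeds the same JSON text to both `json.loads` and `yaml.safe_load`, then tries the reverse direction. The JSON and YAML strings are made up for this example. One caveat: PyYAML targets YAML 1.1, while the strict superset relationship is stated for YAML 1.2, so unusual JSON inputs may behave differently from this simple case.

```python
import json
import yaml

# Illustrative JSON text (an assumption for this sketch)
json_text = '{"foo": "bar", "baz": ["qux", "quxx"], "corge": null, "grault": 1, "garply": true}'

from_json = json.loads(json_text)
from_yaml = yaml.safe_load(json_text)   # a YAML parser reading JSON
print(from_json == from_yaml)

# The other direction: block-style YAML is not valid JSON
yaml_text = "foo: bar\nbaz:\n- qux\n- quxx\n"
try:
    json.loads(yaml_text)
    json_accepts_yaml = True
except json.JSONDecodeError:
    json_accepts_yaml = False
print(json_accepts_yaml)
```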
In general, there are certain things available from YAML that are not available from JSON.
- YAML is visually easier to look at. In fact the YAML homepage is itself valid YAML, yet it is easy for a human to read.
- YAML has the ability to reference other items within a YAML file using "anchors." Thus it can handle relational information as one might find in a MySQL database.
- YAML is more robust about embedding other serialization formats such as JSON or XML within a YAML file.
In practice neither of these last two points will likely matter for things that we do, but in the long term, YAML may be a more robust and viable data serialization format.
We can use YAML Lint to validate a *.yml file.
It rejects the "..." on the last line of the sample; otherwise the sample document passes validation.
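A rough local alternative is to let PyYAML try to parse the text and report whether it succeeds. The helper name `is_valid_yaml` is hypothetical, and this is only a syntax check under PyYAML's rules, not a reimplementation of what YAML Lint does. In particular, PyYAML accepts the trailing "..." marker.

```python
import yaml

def is_valid_yaml(text):
    """Return True if PyYAML can parse text (hypothetical helper, syntax check only)."""
    try:
        # Consume every document so that later documents in the stream are parsed too
        list(yaml.safe_load_all(text))
        return True
    except yaml.YAMLError:
        return False

print(is_valid_yaml("receipt: Oz-Ware Purchase Invoice\n...\n"))  # well-formed, with end marker
print(is_valid_yaml("items: [A4786, E1628"))                      # unclosed flow sequence
```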
Let's convert the following JSON to YAML:
{
  "foo": "bar",
  "baz": [
    "qux",
    "quxx"
  ],
  "corge": null,
  "grault": 1,
  "garply": true,
  "waldo": "false",
  "fred": "undefined",
  "emptyArray": [],
  "emptyObject": {},
  "emptyString": ""
}
Python code:
import json
import yaml

sample = {
    "foo": "bar",
    "baz": ["qux", "quxx"],
    "corge": None,
    "grault": 1,
    "garply": True,
    "waldo": "false",
    "fred": "undefined",
    "emptyArray": [],
    "emptyObject": {},
    "emptyString": "",
}

# Serialize the dict to a JSON string
json_obj = json.dumps(sample)
print('json_obj =', json_obj)

# Write the same dict to a YAML file in block style (text mode on Python 3)
with open('data.yml', 'w') as ff:
    yaml.dump(sample, ff, default_flow_style=False)

# Without a stream argument, yaml.dump returns the YAML as a string
ydump = yaml.dump(sample, default_flow_style=False)
print('ydump=', ydump)
Output (the key order of the JSON line depends on the Python version):
json_obj = {"emptyObject": {}, "emptyString": "", "emptyArray": [], "corge": null, "waldo": "false", "grault": 1, "garply": true, "foo": "bar", "baz": ["qux", "quxx"], "fred": "undefined"}
ydump= baz:
- qux
- quxx
corge: null
emptyArray: []
emptyObject: {}
emptyString: ''
foo: bar
fred: undefined
garply: true
grault: 1
waldo: 'false'
If we open data.yml:
baz:
- qux
- quxx
corge: null
emptyArray: []
emptyObject: {}
emptyString: ''
foo: bar
fred: undefined
garply: true
grault: 1
waldo: 'false'
We can check that our conversion is correct with YAML Lint.
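The conversion can also be checked locally by loading the YAML text back and looking at the resulting Python types. The sketch below copies the data.yml content into a string (an assumption made so the example runs without the file) and shows why the quoting matters: the quoted 'false' stays a string, while the unquoted true and null become a bool and None.

```python
import yaml

# The YAML text written to data.yml, copied into a string for this sketch
text = """\
baz:
- qux
- quxx
corge: null
emptyArray: []
emptyObject: {}
emptyString: ''
foo: bar
fred: undefined
garply: true
grault: 1
waldo: 'false'
"""

loaded = yaml.safe_load(text)

# Quoted scalars stay strings; unquoted true/null map to Python bool/None
print(type(loaded["waldo"]).__name__)    # str
print(type(loaded["garply"]).__name__)   # bool
print(loaded["corge"])                   # None
```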
We can read in the YAML and write it out as JSON:
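import json
import yaml

# safe_load only builds plain Python objects, so it is the safer choice for data files
with open('data.yml', 'r') as stream:
    yml_loaded = yaml.safe_load(stream)

with open('data.json', 'w') as f:
    json.dump(yml_loaded, f)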
The data.json looks like this (the key order depends on the Python version):
{"emptyObject": {}, "emptyArray": [], "waldo": "false", "baz": ["qux", "quxx"], "emptyString": "", "corge": null, "grault": 1, "garply": true, "foo": "bar", "fred": "undefined"}
We can also check the conversion with one of the online conversion tools.
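A local check is possible too: run the data through YAML and then JSON in memory and compare the final result with the starting dict. This is a minimal sketch that works on strings rather than on the data.yml and data.json files, so it can run on its own.

```python
import json
import yaml

sample = {
    "foo": "bar",
    "baz": ["qux", "quxx"],
    "corge": None,
    "grault": 1,
    "garply": True,
    "waldo": "false",
    "fred": "undefined",
    "emptyArray": [],
    "emptyObject": {},
    "emptyString": "",
}

# dict -> YAML text -> dict -> JSON text -> dict
yaml_text = yaml.dump(sample, default_flow_style=False)
json_text = json.dumps(yaml.safe_load(yaml_text))
round_tripped = json.loads(json_text)

# Dict comparison ignores key order, so only the content has to match
print(round_tripped == sample)
```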