I'd like to get PyYAML's loader to load mappings (and ordered mappings) into the Python 2.7+ OrderedDict type, instead of the vanilla dict and the list of pairs it currently uses.

What's the best way to do that?

Update: In Python 3.6+, you probably don't need OrderedDict at all, thanks to the new dict implementation that has been in use in PyPy for some time (although it is considered a CPython implementation detail for now).

Update: In Python 3.7+, the insertion-order preservation of dict objects has been declared an official part of the Python language spec; see What's New In Python 3.7.
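
A small standard-library sketch of what that guarantee means in practice, assuming Python 3.7 or later. The helper name key_order is made up for illustration. Note that a plain dict keeps insertion order when iterating, but unlike OrderedDict its equality comparison ignores order:

```python
from collections import OrderedDict

# Assumes Python 3.7+, where insertion order is part of the language spec.
plain = {}
plain["b"] = 1
plain["a"] = 2
plain["c"] = 3


def key_order(mapping):
    """Return the keys of a mapping in iteration order (illustrative helper)."""
    return list(mapping)


# A plain dict iterates in insertion order.
print(key_order(plain))

# Equality differs: dict ignores order, OrderedDict does not.
same_items_other_order = {"a": 2, "b": 1, "c": 3}
print(plain == same_items_other_order)
print(OrderedDict(plain) == OrderedDict(same_items_other_order))
```

So if code relies on order-sensitive comparison, OrderedDict may still be the better fit.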

I like @James' solution for its simplicity. However, it changes the default global yaml.Loader class, which can lead to troublesome side effects. This is an especially bad idea when writing library code. Also, it doesn't directly work with yaml.safe_load().

Fortunately, the solution can be improved without much effort:

import yaml
from collections import OrderedDict

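def ordered_load(stream, Loader=yaml.Loader, object_pairs_hook=OrderedDict):
    class OrderedLoader(Loader):
        pass
    def construct_mapping(loader, node):
        loader.flatten_mapping(node)  # expand "<<" merge keys first
        return object_pairs_hook(loader.construct_pairs(node))
    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, construct_mapping)
    return yaml.load(stream, OrderedLoader)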

# usage example:
ordered_load(stream, yaml.SafeLoader)

For serialization, I don't know an obvious generalization, but at least this shouldn't have any side effects:

def ordered_dump(data, stream=None, Dumper=yaml.Dumper, **kwds):
    class OrderedDumper(Dumper):
        pass
    def _dict_representer(dumper, data):
        return dumper.represent_mapping(
            yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, data.items())
    OrderedDumper.add_representer(OrderedDict, _dict_representer)
    return yaml.dump(data, stream, OrderedDumper, **kwds)

# usage:
ordered_dump(data, Dumper=yaml.SafeDumper)

The yaml module allows you to specify custom 'representers' to convert Python objects to text and 'constructors' to reverse the process.

import collections
import yaml

_mapping_tag = yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG

def dict_representer(dumper, data):
    # iteritems() is Python 2 only; use data.items() on Python 3
    return dumper.represent_dict(data.iteritems())

def dict_constructor(loader, node):
    return collections.OrderedDict(loader.construct_pairs(node))

yaml.add_representer(collections.OrderedDict, dict_representer)
yaml.add_constructor(_mapping_tag, dict_constructor)
    • Or even better from six import iteritems and then change it to iteritems(data) so that it works equally well in Python 2 & 3.
    • This seems to be using undocumented features of PyYAML (represent_dict and DEFAULT_MAPPING_TAG). Is this because the documentation is incomplete, or are these features unsupported and subject to change without notice?
    • Note that for dict_constructor you'll need to call loader.flatten_mapping(node) or you won't be able to load <<: *... (merge syntax)
    • @brice-m-dempsey can you add any example how to use your code? It does not seem to work in my case (Python 3.7)
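
To illustrate the comments above, here is a minimal usage sketch that registers the constructor on a loader subclass rather than globally and calls flatten_mapping so that merge syntax loads. The names OrderedSafeLoader and construct_ordered_mapping are made up for this example, and it assumes Python 3 with PyYAML installed:

```python
import collections

import yaml


class OrderedSafeLoader(yaml.SafeLoader):
    """Hypothetical loader subclass; yaml.SafeLoader itself stays untouched."""


def construct_ordered_mapping(loader, node):
    # Expand "<<" merge keys into the node before reading its pairs.
    loader.flatten_mapping(node)
    return collections.OrderedDict(loader.construct_pairs(node))


# Register the constructor on the subclass only, not on the global loader.
OrderedSafeLoader.add_constructor(
    yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, construct_ordered_mapping)

document = """
base: &base
  host: localhost
  port: 8080
service:
  <<: *base
  name: api
"""

data = yaml.load(document, Loader=OrderedSafeLoader)
print(type(data).__name__)
print(list(data["service"].items()))
```

The merged keys from the anchor end up ahead of the mapping's own keys, because flatten_mapping places them first.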

2018 option:

oyaml is a drop-in replacement for PyYAML which preserves dict ordering. Both Python 2 and Python 3 are supported. Just pip install oyaml, and import as shown below:

import oyaml as yaml

You'll no longer be annoyed by screwed-up mappings when dumping/loading.

Note: I'm the author of oyaml.

    • Thank you for this! For some reason, even with Python 3.8 the order was not respected with PyYaml. oyaml solved this for me immediately.
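
One likely explanation for the Python 3.8 experience in the comment above is that PyYAML's dump functions sort mapping keys by default, so the order is lost on output even though the loaded dict keeps it. A minimal sketch, assuming Python 3.7+ and PyYAML 5.1 or later (the release that added the sort_keys argument):

```python
import yaml

document = "zebra: 1\napple: 2\nmango: 3\n"

# On Python 3.7+, the plain dict built by safe_load keeps document order.
data = yaml.safe_load(document)
print(list(data))

# By default, dumping sorts the keys alphabetically.
sorted_output = yaml.safe_dump(data)
print(sorted_output)

# sort_keys=False (PyYAML 5.1+) keeps the dict's own order instead.
ordered_output = yaml.safe_dump(data, sort_keys=False)
print(ordered_output)
```

This only covers plain dict round trips; it does not preserve comments or anchors.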

2015 (and later) option:

ruamel.yaml is a drop-in replacement for PyYAML (disclaimer: I am the author of that package). Preserving the order of the mappings was one of the things added in the first version (0.1) back in 2015. Not only does it preserve the order of your dictionaries, it also preserves comments, anchor names, and tags, and it supports the YAML 1.2 specification (released 2009).

The specification says that the ordering is not guaranteed, but of course there is ordering in the YAML file, and the appropriate parser can just hold on to that and transparently generate an object that keeps the ordering. You just need to choose the right parser, loader and dumper:

import sys
from ruamel.yaml import YAML

yaml_str = """\
3: abc
conf:
    10: def
    3: gij     # h is missing
more:
- what
- else
"""

yaml = YAML()
data = yaml.load(yaml_str)
data['conf'][10] = 'klm'
data['conf'][3] = 'jig'
yaml.dump(data, sys.stdout)

will give you:

3: abc
conf:
  10: klm
  3: jig       # h is missing
more:
- what
- else

data is of type CommentedMap, which functions like a dict but has extra information that is kept around until it is dumped (including the preserved comment!).

    • That's pretty nice if you already have a YAML file, but how do you do that using a Python structure? I tried using CommentedMap directly but it does not work, and OrderedDict puts !!omap everywhere which is not very user-friendly.
    • I am not sure why CommentedMap did not work for you. Can you post a question with your (minimalized) code and tag it ruamel.yaml? That way I will be notified and answer.
    • Sorry, I think it's because I tried to save the CommentedMap with safe=True in YAML, which did not work (using safe=False works). I also had an issue with CommentedMap not being modifiable, but I cannot reproduce it now... I'll open a new question if I encounter this issue again.
    • You should be using yaml = YAML(). That gives you the round-trip parser/dumper, which is a derivative of the safe parser/dumper that knows about CommentedMap/Seq etc.

Note: there is a library, based on the following answer, which also implements the CLoader and CDumper variants: Phynix/yamlloader

I doubt very much that this is the best way to do it, but this is the way I came up with, and it does work. Also available as a gist.

import yaml
import yaml.constructor

try:
    # included in standard lib from Python 2.7
    from collections import OrderedDict
except ImportError:
    # try importing the backported drop-in replacement
    # it's available on PyPI
    from ordereddict import OrderedDict

class OrderedDictYAMLLoader(yaml.Loader):
    """
    A YAML loader that loads mappings into ordered dictionaries.
    """

    def __init__(self, *args, **kwargs):
        yaml.Loader.__init__(self, *args, **kwargs)

        self.add_constructor(u'tag:yaml.org,2002:map', type(self).construct_yaml_map)
        self.add_constructor(u'tag:yaml.org,2002:omap', type(self).construct_yaml_map)

    def construct_yaml_map(self, node):
        # yield the empty mapping first so recursive references can resolve
        data = OrderedDict()
        yield data
        value = self.construct_mapping(node)
        data.update(value)

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            self.flatten_mapping(node)
        else:
            raise yaml.constructor.ConstructorError(None, None,
                'expected a mapping node, but found %s' % node.id, node.start_mark)

        mapping = OrderedDict()
        for key_node, value_node in node.value:
            key = self.construct_object(key_node, deep=deep)
            try:
                hash(key)  # keys must be hashable
            except TypeError as exc:
                raise yaml.constructor.ConstructorError('while constructing a mapping',
                    node.start_mark, 'found unacceptable key (%s)' % exc, key_node.start_mark)
            value = self.construct_object(value_node, deep=deep)
            mapping[key] = value
        return mapping
    • If you want to include the key_node.start_mark attribute in your error message, I don't see any obvious way to simplify your central construction loop. If you try to make use of the fact that the OrderedDict constructor will accept an iterable of key, value pairs, you lose access to that detail when generating the error message.
    • Example usage: ordered_dict = yaml.load("b: 1\na: 2", Loader=OrderedDictYAMLLoader) gives OrderedDict([('b', 1), ('a', 2)]). Unfortunately my edit to the post was rejected, so please excuse the lack of formatting.

Update: the library was deprecated in favor of yamlloader (which is based on yamlordereddictloader).

I've just found a Python library (https://pypi.python.org/pypi/yamlordereddictloader/0.1.1) which was created based on answers to this question and is quite simple to use:

import yaml
import yamlordereddictloader

datas = yaml.load(open('myfile.yml'), Loader=yamlordereddictloader.Loader)

In my PyYAML installation for Python 2.7, I updated __init__.py, constructor.py, and loader.py. The load commands now support an object_pairs_hook option. A diff of the changes I made is below.


$ diff __init__.py Original
< def load(stream, Loader=Loader, **kwds):
> def load(stream, Loader=Loader):
<     loader = Loader(stream, **kwds)
>     loader = Loader(stream)
< def load_all(stream, Loader=Loader, **kwds):
> def load_all(stream, Loader=Loader):
<     loader = Loader(stream, **kwds)
>     loader = Loader(stream)


$ diff constructor.py Original
<     def __init__(self, object_pairs_hook=dict):
<         self.object_pairs_hook = object_pairs_hook
>     def __init__(self):
<     def create_object_hook(self):
<         return self.object_pairs_hook()
<         self.constructed_objects = self.create_object_hook()
<         self.recursive_objects = self.create_object_hook()
>         self.constructed_objects = {}
>         self.recursive_objects = {}
<         mapping = self.create_object_hook()
>         mapping = {}
<         data = self.create_object_hook()
>         data = {}
<             dictitems = self.create_object_hook()
>             dictitems = {}
<             dictitems = value.get('dictitems', self.create_object_hook())
>             dictitems = value.get('dictitems', {})


$ diff loader.py Original
<     def __init__(self, stream, **constructKwds):
>     def __init__(self, stream):
<         BaseConstructor.__init__(self, **constructKwds)
>         BaseConstructor.__init__(self)
<     def __init__(self, stream, **constructKwds):
>     def __init__(self, stream):
<         SafeConstructor.__init__(self, **constructKwds)
>         SafeConstructor.__init__(self)
<     def __init__(self, stream, **constructKwds):
>     def __init__(self, stream):
<         Constructor.__init__(self, **constructKwds)
>         Constructor.__init__(self)
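
The option name in this patch matches the object_pairs_hook parameter of json.loads in the standard library. Whether the patch was modeled on it is an assumption; the answer does not say. For comparison, a short sketch of how that hook behaves in json:

```python
import json
from collections import OrderedDict

text = '{"b": 1, "a": {"y": 2, "x": 3}}'

# The hook receives each JSON object's (key, value) pairs in document order,
# and whatever it returns is used in place of the default dict.
data = json.loads(text, object_pairs_hook=OrderedDict)

print(type(data).__name__)
print(list(data))
print(list(data["a"].items()))
```

Nested objects go through the same hook, so the inner mapping is an OrderedDict as well.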

Here's a simple solution that also checks for duplicated top-level keys in your map.

import yaml
import re
from collections import OrderedDict

def yaml_load_od(fname):
    "load a yaml file as an OrderedDict"
    # detects any duped keys (fail on this) and preserves order of top level keys
    with open(fname, 'r') as f:
        lines = f.read().splitlines()
        top_keys = []
        duped_keys = []
        for line in lines:
            m = re.search(r'^([A-Za-z0-9_]+) *:', line)
            if m:
                if m.group(1) in top_keys:
                    duped_keys.append(m.group(1))
                else:
                    top_keys.append(m.group(1))
        if duped_keys:
            raise Exception('ERROR: duplicate keys: {}'.format(duped_keys))
    # 2nd pass to set up the OrderedDict
    with open(fname, 'r') as f:
        # safe_load avoids the Loader argument that newer PyYAML requires for load()
        d_tmp = yaml.safe_load(f)
    return OrderedDict([(key, d_tmp[key]) for key in top_keys])
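
As an alternative to scanning the file with a regular expression, duplicate keys can also be caught inside the loader itself. This is a different technique from the answer above, shown only as a sketch: UniqueKeyLoader is a made-up name, it assumes Python 3 with PyYAML installed, and it checks scalar keys at every nesting level rather than only at the top level.

```python
import yaml
from yaml.constructor import ConstructorError

MERGE_TAG = "tag:yaml.org,2002:merge"


class UniqueKeyLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that rejects duplicate mapping keys."""

    def construct_mapping(self, node, deep=False):
        seen = set()
        for key_node, _ in node.value:
            # Leave "<<" merge keys and non-scalar keys to the parent class.
            if key_node.tag == MERGE_TAG or not isinstance(key_node, yaml.ScalarNode):
                continue
            key = self.construct_object(key_node)
            if key in seen:
                raise ConstructorError(
                    "while constructing a mapping", node.start_mark,
                    "found duplicate key (%r)" % (key,), key_node.start_mark)
            seen.add(key)
        return super().construct_mapping(node, deep=deep)


good = yaml.load("a: 1\nb: 2\n", Loader=UniqueKeyLoader)
print(good)

try:
    yaml.load("a: 1\nb: 2\na: 3\n", Loader=UniqueKeyLoader)
except ConstructorError as exc:
    print("rejected:", exc.problem)
```

On Python 3.7+ the returned dict also keeps the document's key order, so no second pass is needed.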