:date: 2013-03-08 21:24
.. index:: tech, puppet, hiera, database, yaml
Hiera
=====
When you start configuring Puppet, you'll quickly end up defining all your parameters per environment directly in your manifests. And as you know, that makes these manifests quite inflexible. Every small difference on a host will `require a different manifest `__.
That's not what we like, is it?
So `Puppetlabs `__ introduced `Hiera `__, a hierarchical database.
IMHO, using :program:`Hiera` for anything but hierarchical data seems pointless. But for :program:`Puppet` and for testing manifests it's wonderful. After working around a few issues, it's even pretty simple to use.
Installation
------------
I'm already running `Octopress `__ on my system, so Ruby was already installed and the installation was a simple matter of using *gem*.

.. code:: bash

   $ gem install hiera

Alternatively, you can use :ref:`Git <2012-git_introduction>` to clone the current repository to your disk and run the :program:`Hiera` binary from the :file:`bin` sub-folder.

.. code:: bash

   $ git clone https://github.com/puppetlabs/hiera.git github.com.hiera.git
   $ cd github.com.hiera.git
   $ bin/hiera --help

To make it work with Puppet, there's another module (:program:`hiera-puppet`) that provides the connection between :program:`Hiera` and :program:`Puppet`. That's not covered in this post; I'm mainly focused on how to use :program:`Hiera` from the CLI here.
--------
Configuration
-------------
Hiera needs a configuration file (:file:`/etc/hiera.yaml`)
* in order to know which backend to use,
* to know which hierarchy to use, and
* to know where to look for the data.
These are the three elements you basically need to worry about to get :program:`Hiera` working.
hiera.yaml
""""""""""
This is a basic `YAML `__ file. The elements and syntax are documented `at Puppetlabs `__. The file for the CLI is expected to be at :file:`/etc/hiera.yaml`.
The file must be a valid *YAML* file; it holds only the configuration, not the data itself.
My file contains basically only this:

.. code:: yaml

   ---
   :backends:
     - yaml
   :logger: console
   :hierarchy:
     - '%{environment}/nodes/%{fqdn}'
     - '%{environment}/locations/%{location}'
     - common
   :yaml:
     :datadir: /data/hiera

The backend is set to :samp:`yaml`, and errors are logged to the :samp:`console`.
Now it gets interesting. The hierarchy is set up like this:
#. Check the YAML file :file:`$environment/nodes/$fqdn.yaml`.
#. Check the YAML file :file:`$environment/locations/$location.yaml`.
#. Check the YAML file :file:`common.yaml`.
The underlying folder structure is like this:

.. code::

   .
   ├── production
   │   ├── locations
   │   │   ├── datacenter1-prod.yaml
   │   │   └── datacenter2-prod.yaml
   │   └── nodes
   │       └── client01.yaml
   └── testing
       ├── locations
       │   ├── datacenter1-testing.yaml
       │   └── datacenter2-testing.yaml
       └── nodes
           └── client01.yaml

The variable :samp:`%{environment}` is one I have to pass manually from the command line to :program:`Hiera` so that the right hierarchy folder can be chosen. The same is required for the variables :samp:`%{fqdn}` and :samp:`%{location}`. This is just an example to show how it works.
The last part is the :samp:`:datadir:`. We could also put the variable :samp:`%{environment}` in here. It wouldn't make a difference, I guess.
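
To make the variable substitution a bit more tangible, here is a small Python sketch of how the hierarchy entries could be turned into file paths. It is only an illustration and not Hiera's actual code: the function name :samp:`candidate_files` is made up, and skipping a level whose variable was not passed is an assumption of this sketch.

.. code:: python

   import re
   from pathlib import PurePosixPath

   # Hierarchy entries as in the hiera.yaml shown above.
   HIERARCHY = [
       "%{environment}/nodes/%{fqdn}",
       "%{environment}/locations/%{location}",
       "common",
   ]

   _VAR = re.compile(r"%\{([^}]+)\}")


   def candidate_files(hierarchy, scope, datadir="/data/hiera"):
       """Return the YAML paths to check, in hierarchy order (hypothetical helper)."""
       paths = []
       for entry in hierarchy:
           names = _VAR.findall(entry)
           if any(name not in scope for name in names):
               # Assumption of this sketch: a level with an unset variable is skipped.
               continue
           resolved = _VAR.sub(lambda m: scope[m.group(1)], entry)
           paths.append(str(PurePosixPath(datadir) / f"{resolved}.yaml"))
       return paths


   paths = candidate_files(HIERARCHY, {"environment": "testing", "fqdn": "client01"})
   print(paths)

With only :samp:`environment` and :samp:`fqdn` given, the locations level drops out of the list, because :samp:`location` was never passed.
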
The Datadir
"""""""""""
The parameter :samp:`:datadir:` specifies where the YAML files are located. It makes sense to put them into a version-controlled directory so that you can review changes to them and roll back if necessary.
The YAML files follow common YAML structure. Nothing special about it. The example file :file:`testing/nodes/client01.yaml` contains the following lines:

.. code:: yaml

   ---
   location: datacenter1-testing
   ipaddress1: 192.168.15.100
   ipaddress2: 192.168.15.101

A file like :file:`testing/locations/datacenter1-testing.yaml` has a similar setup:

.. code:: yaml

   ---
   street: Hans-Møller Gasmannsvei 9b
   city: Oslo
   ntp1:
     - 192.168.15.1
     - 192.168.15.2

Make sure that the keys (like :samp:`city`) are spelled exactly the way you'll use them in the lookup. If you write a key in uppercase in the YAML file and ask for it in lowercase, you won't get any return value (:samp:`City` != :samp:`city`).
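
The effect is the same as with any case-sensitive key lookup. A tiny Python sketch, using a plain dictionary as a stand-in for the loaded location file (this is an analogy, not Hiera itself):

.. code:: python

   # Stand-in for the parsed datacenter1-testing.yaml from above.
   location_data = {"street": "Hans-Møller Gasmannsvei 9b", "city": "Oslo"}

   exact = location_data.get("city")      # key spelled as in the file
   wrong_case = location_data.get("City")  # different case, so no match

   print(exact)
   print(wrong_case)
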
The Hierarchy
"""""""""""""
The hierarchy defined in the :program:`Hiera` configuration works top to bottom. The first match for a variable counts and will deliver the value back to you.
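
The top-to-bottom, first-match behaviour can be sketched in a few lines of Python. Again, this is just an illustration under my own assumptions and not how Hiera is implemented: the YAML files are replaced by in-memory dictionaries, the function :samp:`lookup` is a made-up name, and the :samp:`common` entry with its value is hypothetical (my example datadir doesn't contain one).

.. code:: python

   # In-memory stand-ins for the YAML files, keyed by hierarchy source.
   DATA = {
       "testing/nodes/client01": {
           "location": "datacenter1-testing",
           "ipaddress1": "192.168.15.100",
           "ipaddress2": "192.168.15.101",
       },
       "testing/locations/datacenter1-testing": {
           "city": "Oslo",
           "ntp1": ["192.168.15.1", "192.168.15.2"],
       },
       # Hypothetical fallback data, not part of the example datadir.
       "common": {"ntp1": ["10.0.0.1"]},
   }


   def lookup(key, sources, data):
       """Return the value from the first source that defines key, else None."""
       for source in sources:
           values = data.get(source, {})
           if key in values:
               return values[key]
       return None


   full = ["testing/nodes/client01", "testing/locations/datacenter1-testing", "common"]
   without_location = ["testing/nodes/client01", "common"]

   from_location = lookup("ntp1", full, DATA)
   from_common = lookup("ntp1", without_location, DATA)
   not_found = lookup("gateway", full, DATA)
   print(from_location, from_common, not_found)

The location file wins over :samp:`common` simply because it comes first in the list; once the location level is missing, the lookup falls through to :samp:`common`.
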
When you specify the hierarchy folder structure and include variables, make sure to surround the paths with quotation marks. Otherwise :program:`Hiera` will return errors and complain about invalid syntax and characters in the configuration file.
Usage
-----
.. code:: bash
$ hiera ntp1 fqdn=client01 environment=testing
This will search the :program:`Hiera` datadir for a value for the variable :samp:`ntp1`. Two variables are pre-defined:
#. :samp:`fqdn` is set to :samp:`client01`
#. :samp:`environment` is set to :samp:`testing`
The first choice is actually made by the second parameter, :samp:`environment`. As defined in :file:`/etc/hiera.yaml`, it selects the hierarchy folder within the :samp:`datadir`: either :samp:`testing` or :samp:`production`. Other values are possible, but only make sense if the directories actually exist (and they don't in this example).
The parameter :samp:`fqdn` is used in :file:`/etc/hiera.yaml` as well. :program:`Hiera` will find a node's YAML file only if its name matches the value provided for :samp:`fqdn`.
The command above sets the environment to :samp:`testing` and searches the file :file:`testing/nodes/client01.yaml` for a value for :samp:`ntp1`. In this example it will return :samp:`nil`.
The file :file:`client01.yaml` doesn't contain any value for :samp:`ntp1`. We could put a value into common, but that's boring.
The NTP information is in the location files. Setting the parameters right will return the IP addresses of the NTP servers (in this example) as an array (as defined in the YAML file).

.. code:: bash

   $ hiera ntp1 location=datacenter1-testing environment=testing
   ["192.168.15.1", "192.168.15.2"]

Comments
--------
Besides minor typos and initial configuration problems, there was one thing I think is worth mentioning:
My expectation was that :program:`Hiera` works like a basic database. I thought that with a command like
.. code:: bash
$ hiera ntp1 fqdn=client01
the system would check the YAML file of :file:`client01` for, e.g., the location and then return the :samp:`ntp1` information from that one. Well, it doesn't. There are no cross-references within the YAML files to other files. :program:`Hiera` is a hierarchical database and the important point is: hierarchical. It will follow its hierarchy, nothing more, nothing less.
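
If you want that behaviour anyway, you have to do it in two steps yourself: first ask for the location of the node, then pass that location into a second lookup. The following Python sketch simulates this with in-memory data instead of real YAML files. It is not Hiera's implementation; the function :samp:`lookup` is a made-up name, and skipping hierarchy levels with unset variables is an assumption of the sketch.

.. code:: python

   import re

   HIERARCHY = [
       "%{environment}/nodes/%{fqdn}",
       "%{environment}/locations/%{location}",
       "common",
   ]

   # In-memory stand-ins for the YAML files in the datadir.
   DATA = {
       "testing/nodes/client01": {"location": "datacenter1-testing"},
       "testing/locations/datacenter1-testing": {
           "ntp1": ["192.168.15.1", "192.168.15.2"],
       },
   }


   def lookup(key, scope):
       """Walk the hierarchy top to bottom and return the first match, else None."""
       for entry in HIERARCHY:
           try:
               source = re.sub(r"%\{([^}]+)\}", lambda m: scope[m.group(1)], entry)
           except KeyError:
               continue  # assumption: a level with an unset variable is skipped
           values = DATA.get(source, {})
           if key in values:
               return values[key]
       return None


   scope = {"environment": "testing", "fqdn": "client01"}

   # One step: no location in the scope, so the locations level is never read.
   direct = lookup("ntp1", scope)

   # Two steps: resolve the location first, then feed it into the second lookup.
   location = lookup("location", scope)
   ntp = lookup("ntp1", {**scope, "location": location})

   print(direct, location, ntp)

On the command line this corresponds to running :program:`hiera` twice and passing the result of the first call as :samp:`location=...` to the second one.
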