About nomenklatura


nomenklatura is a reference data reconciliation server. It allows users to manage a list of canonical values (e.g. person or organisation names) and aliases that each connect to one of the canonical values. This helps to clean up messy data in which a single entity may be referred to by many names.

The key elements of the service are the dataset, which is a segmented unit of reference data; a set of values, i.e. the canonical forms; and a set of links, each of which connects a non-standard form to a value.


Using the API


The API is spread throughout the application: JSON representations of most pages can be retrieved by setting an Accept header. Similarly, all forms can be submitted as JSON data by setting the request's Content-Type header.
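As a minimal sketch of the content negotiation described above, the following Python 3 snippet builds (but does not send) a JSON read request and a JSON form submission using only the standard library. The base URL, the application/json media type and the helper names json_get and json_post are assumptions made for illustration; they are not part of the documented API.

```python
import json
from urllib.request import Request

# Hypothetical base URL; replace with the address of your nomenklatura instance.
BASE_URL = "http://nomenklatura.example.org"


def json_get(path):
    """Build a GET request that asks for the JSON representation of a page."""
    return Request(BASE_URL + path, headers={"Accept": "application/json"})


def json_post(path, form):
    """Build a POST request that submits form fields as a JSON body."""
    body = json.dumps(form).encode("utf-8")
    headers = {
        "Accept": "application/json",
        # Assumed media type for JSON form submissions.
        "Content-Type": "application/json",
    }
    return Request(BASE_URL + path, data=body, headers=headers, method="POST")


# Requests are only constructed here; pass them to urllib.request.urlopen to send.
read_req = json_get("/my-dataset")
write_req = json_post("/my-dataset/values", {"value": "Example Org"})
```

Building the request objects separately from sending them makes it easy to inspect the headers before anything goes over the network.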

To authenticate against the API, look up your API key on the user page after you have signed in. The API key can be sent either as the content of an Authorization header or as a query parameter called api_key.
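The two ways of passing the API key might look roughly like this in Python 3. This is a sketch under stated assumptions: the base URL, the placeholder key and the helper names with_auth_header and with_auth_param are hypothetical, and the requests are built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values for illustration only.
BASE_URL = "http://nomenklatura.example.org"
API_KEY = "your-api-key"


def with_auth_header(path, api_key):
    """Send the API key as the content of the Authorization header."""
    headers = {"Accept": "application/json", "Authorization": api_key}
    return Request(BASE_URL + path, headers=headers)


def with_auth_param(path, api_key):
    """Send the API key as the api_key query parameter instead."""
    query = urlencode({"api_key": api_key})
    return Request(BASE_URL + path + "?" + query,
                   headers={"Accept": "application/json"})


header_req = with_auth_header("/my-dataset/values", API_KEY)
param_req = with_auth_param("/my-dataset/values", API_KEY)
```

The header variant keeps the key out of URLs, which tend to end up in server logs and browser history; the query parameter is convenient for quick manual tests.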

When a new key is looked up via the API, a new link object is created, which will be added to the queue and presented to users for manual linkage. Once a link exists, the API will return the corresponding value.

  • /{dataset} - retrieve basic dataset metadata, including the reconciliation algorithm parameters.
  • /{dataset}/lookup - look up a link matching the given key. By default, if no judgement exists, a new link element is created and queued for reconciliation. This does not happen if the user is not authenticated or the readonly query parameter is set.
  • /{dataset}/values - retrieve a listing of values. A new value can be created with a POST request to this location, with a single field called value and an optional data dictionary.
  • /{dataset}/values/{id} - retrieve an individual value, or update that value with a POST request to this location.
  • /{dataset}/links - retrieve a listing of links.
  • /{dataset}/links/{id} - retrieve an individual link.
  • /{dataset}/links/{id}/match - get a list of options for the link or POST a link decision as the choice parameter.
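To illustrate how the endpoints listed above fit together, here is a small Python 3 sketch that assembles their URLs. The base URL and the helper names endpoint and lookup_url are hypothetical. The name of the query parameter carrying the key (key) and the value "true" for readonly are assumptions, since the list above does not spell them out; check them against your instance.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; replace with the address of your nomenklatura instance.
BASE_URL = "http://nomenklatura.example.org"


def endpoint(dataset, *parts):
    """Join a dataset name and further path segments into an endpoint URL."""
    segments = [dataset] + [str(p) for p in parts]
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def lookup_url(dataset, key, readonly=False, api_key=None):
    """Build the URL for looking up a key in a dataset.

    Assumptions: the key travels in a query parameter named `key`, and
    readonly is switched on by sending readonly=true.
    """
    params = {"key": key}
    if readonly:
        params["readonly"] = "true"
    if api_key is not None:
        params["api_key"] = api_key
    return endpoint(dataset, "lookup") + "?" + urlencode(params)


values_url = endpoint("my-dataset", "values", 42)
match_url = endpoint("my-dataset", "links", 7, "match")
readonly_lookup = lookup_url("my-dataset", "This needs to be reconciled",
                             readonly=True)
```

A readonly lookup like the last one is useful for checking whether a key is already known without adding anything to the reconciliation queue.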

nomenklatura-client for Python


To facilitate the use of the API in Python, a client library is available: https://github.com/okfn/nomenklatura-client

To install the library from the command line, try this command:

pip install pynomenklatura

The library can then be used like this:

from nomenklatura import Dataset

dataset = Dataset('my-dataset', api_key='..')

# Create a reference value (normally done via the UI):
in_value = dataset.ensure_value('This is a reference value')

try:
  key = 'This needs to be reconciled'
  out_value = dataset.lookup(key, readonly=False)
  print(out_value.value)

except dataset.NoMatch as nm:
  # no match exists, the new key has been queued
  # for reconciliation.
  # link it to the value we created before (this 
  # is normally done via the UI):
  dataset.match(nm.id, in_value.id)

except dataset.Invalid as inv:
  # the key is known but not a valid value (i.e.
  # data error)
  pass

License


By submitting databases to this service you agree to license them under the terms of the Open Database License.