

Datakit is a pluggable command-line tool for managing the life cycle of data projects.

The Associated Press Data Team uses Datakit to auto-generate project skeletons, archive and share data on Amazon S3, and perform other routine tasks.

Datakit is a thin wrapper around the Cliff command-line framework and is intended for use with a growing ecosystem of plugins.

Feel free to use our plugins on GitHub, or fork and modify them to suit your needs.

If you’re comfortable programming in Python, you can create your own plugins (see Creating plugins).


For a system-wide install, from the command line:

$ sudo pip install datakit-core



After you install one or more plugins, Datakit can invoke the commands those plugins provide.
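Cliff discovers commands through Python entry points, so you can also check from Python which plugin commands are registered in your environment. The snippet below is a minimal sketch using only the standard library. The group name `datakit.plugins` is an assumption here, not something this page confirms, so check a plugin's `setup.py` for the namespace it actually registers under. `list_plugin_commands` is a hypothetical helper name.

```python
from importlib.metadata import entry_points


def list_plugin_commands(group="datakit.plugins"):
    """Return the sorted names of entry points registered under `group`.

    The default group name is an assumption; pass the namespace your
    plugins actually use.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points can be filtered with select()
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


if __name__ == "__main__":
    for name in list_plugin_commands():
        print(name)
```

If no plugins are installed, or the group name does not match, the function returns an empty list.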

Let’s say you installed the datakit-project plugin, which helps create project skeletons.

To see which commands the plugin provides, try the --help flag:

$ datakit --help

The plugin provides a project create command, which sounds like it fits the bill.

To see which flags are available or required, try using the --help flag again:

$ datakit project create --help

It appears you need to specify a Cookiecutter template to use this command. Let’s try it:

$ datakit project create --template

That’s the basic recipe for working with plugins: install, explore, and invoke! [1]
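If you want to script the "explore" step, for example to capture help text in a setup check, something like the following could work. It is a rough sketch, not part of Datakit itself: `datakit_help` is a hypothetical helper that shells out to the `datakit` executable and returns `None` when that executable is not on your PATH.

```python
import shutil
import subprocess


def datakit_help(*command):
    """Return the --help text for a datakit command, or None if datakit is missing.

    `command` is the sequence of subcommand words, e.g. ("project", "create").
    With no arguments, this runs the top-level `datakit --help`.
    """
    if shutil.which("datakit") is None:
        # datakit is not installed, or not on the PATH
        return None
    result = subprocess.run(
        ["datakit", *command, "--help"],
        capture_output=True,
        text=True,
        check=False,  # return whatever was printed, even on a non-zero exit
    )
    return result.stdout


if __name__ == "__main__":
    text = datakit_help("project", "create")
    print(text if text is not None else "datakit is not installed")
```

Calling `datakit_help()` with no arguments mirrors `datakit --help`, and `datakit_help("project", "create")` mirrors `datakit project create --help`.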


To use Datakit in a project:

import datakit


This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

[1] Plugins may also provide more robust docs, so don’t forget to check those out when available.