What is in ProvToolbox 0.7.3?

Today, I released ProvToolbox 0.7.3. The  principal changes in this new version of ProvToolbox are concerned with prov-template, the templating system for provenance. The new release also contains few minor bug fixes and changes.

1. Template System

A reminder: a PROV-template is a PROV document, in which some variables are placeholders for values. A PROV-template is a declarative specification of the provenance intended to be generated by an application.   A set of bindings contains associations between variables and values. The PROV-template  expansion algorithm, when provided with a template and a set of bindings, generates a provenance document, in which all variables have been replaced by values.

PROV-template is a new approach to creating a provenance-enabled application. Templates are designed and embedded in the application’s code, the application logs values (in the form of bindings), and provenance is automatically generated by template expansion.

A tutorial for templates is available on this blog:

In ProvToolbox 0.7.3, we have adopted a more compact and user-friendly representation for sets of bindings. Instead of representing them as PROV, we can now represent them as JSON.  At the same time, we also handle variables in a more uniform manner, allowing variables occurring in mandatory position, to be also used in attribution position. I won’t go into the technical details, but these two changes make the design of templates and the construction of bindings  much simpler!

A further change is that we have implemented a simple “bindings bean” compiler: it takes a template definition and creates a java class, which allows sets of bindings to be created directly from Java, and serialized easily.  The aim of this compiler is to simplify the implementation of applications generating provenance.

The GitHub source code repository contains code for two further tutorials (Tutorial5 and Tutorial6). I will write up the text for these tutorials in the New Year.

2. Qualified Pattern for All PROV Relations

At the recent PROV: Three Years Later Workshop, I made the case for the Qualified Pattern  to be used for all PROV relations. My key motivation for this extension to PROV is my provenance summarisation algorithm, which generates a “summary provenance graph“, in which nodes and edges are annotated with weights indicating how frequently these kinds of nodes and edges  can be found in the original graph. To allow for such annotations to be added to specialization, alternate, and membership relations, they need to support the Qualified Pattern.

At this stage, it is the data model that is modified. Serialization to xml and provn is work in progress, and not supported in prov-json and prov-sql yet. Furthermore, there is no parsing yet. Three new interfaces have been defined in the package org.openprovenance.prov.model.extension.

3. Release Log

For full details of the changes, see the release log at https://github.com/lucmoreau/ProvToolbox/wiki/Releases#073.

4. Conclusion

We keep on using ProvToolbox in various applications to generate provenance with templates and to undertake some analytics using the summarisation algorithm. This new release was critical to support these two use cases of ProvToolbox. Shortly, I will release two further blogs with new tutorials for prov-template.

As always, all relevant links can be found at http://lucmoreau.github.io/ProvToolbox/, including binary installers for linux (rpm and debian) and macosx.

Seasonal greetings!