What is in ProvToolbox 0.5.0?

Release 0.5.0 is the second Christmas release of ProvToolbox. A year ago, I released ProvToolbox 0.1.1. At the time, the Provenance Working Group had just published its candidate recommendations and was in the implementation phase of PROV. Since then, PROV has become a recommendation. ProvToolbox has also changed dramatically, having been released no fewer than 9 times since last Christmas.

This blog post highlights the key new features found in ProvToolbox 0.5.0.

1. Artefact architecture

Benefiting from the stable nature of PROV, ProvToolbox underwent significant refactoring. PROV-DM is essentially specified by a set of interfaces. They are implemented by POJOs, offering a Java representation of the PROV model in memory. This Java representation can be marshalled to various formats, using two different kinds of marshallers: POJO-based marshallers and external marshallers, which I describe in turn.

POJO-based marshallers include PROV-XML and a very(!) preliminary mapping to SQL. The design is extensible and other serializations could be defined. For instance, Spring Data could be used for serializing to NoSQL databases (any taker?). POJO-based marshallers typically use Java annotations to specify marshalling: JAXB annotations are used for marshalling to XML and JPA annotations for mapping to SQL.

External marshallers take care of the conversion to RDF, JSON, and Graphviz representations. These marshallers rely only on accessors to read the properties of the objects represented in memory.
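To illustrate what "relying only on accessors" means, here is a minimal sketch of an external marshaller that writes a Graphviz edge for each generation statement. The names GenerationView, SimpleGeneration and DotMarshaller are hypothetical and invented for this example; they are not the ProvToolbox API, and the real converters are considerably richer.

```java
import java.util.List;

// Hypothetical, simplified stand-in for a PROV statement interface
// (not the actual prov-model interface): only getters are exposed.
interface GenerationView {
    String getEntity();
    String getActivity();
}

// A plain in-memory object implementing the interface.
final class SimpleGeneration implements GenerationView {
    private final String entity;
    private final String activity;

    SimpleGeneration(String entity, String activity) {
        this.entity = entity;
        this.activity = activity;
    }

    @Override public String getEntity() { return entity; }
    @Override public String getActivity() { return activity; }
}

// An "external" marshaller: it knows nothing about the implementing
// class and reads each statement through its accessors only.
final class DotMarshaller {
    static String toDot(List<? extends GenerationView> statements) {
        StringBuilder sb = new StringBuilder("digraph prov {\n");
        for (GenerationView g : statements) {
            sb.append("  \"").append(g.getEntity())
              .append("\" -> \"").append(g.getActivity())
              .append("\" [label=\"wasGeneratedBy\"];\n");
        }
        return sb.append("}\n").toString();
    }
}
```

Because the marshaller depends only on the interface, any implementation of the model (annotated or not) could be passed to it, which is what keeps such converters independent of the other artefacts.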

A significant contribution of this release is the refactoring of the Maven artefacts to minimise cross-dependencies. For instance, the converter to RDF, prov-rdf, depends only on prov-model and is independent of all other artefacts.

The following figure summarizes the component architecture of ProvToolbox.

Key Components of ProvToolbox

2. Qualified Names

PROV uses qualified names to denote resources. Qualified names can be converted into URIs ensuring compatibility with the web architecture. PROV qualified names have a syntax that is more permissive than XML QNames.
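As a rough illustration of the conversion to URIs, the sketch below concatenates the namespace URI bound to the prefix with the local part, which is the mapping described in PROV-DM. The class SimpleQualifiedName is a hypothetical stand-in written for this post, not the ProvToolbox QualifiedName interface, and it assumes the local part contains only characters that are legal in a URI.

```java
import java.net.URI;

// Hypothetical, simplified qualified name (not the ProvToolbox interface).
final class SimpleQualifiedName {
    private final String namespaceUri;
    private final String localPart;
    private final String prefix;

    SimpleQualifiedName(String namespaceUri, String localPart, String prefix) {
        this.namespaceUri = namespaceUri;
        this.localPart = localPart;
        this.prefix = prefix;
    }

    // Maps the qualified name to a URI by concatenating the namespace
    // URI and the local part. URI.create throws IllegalArgumentException
    // if the result is not a syntactically valid URI.
    URI toUri() {
        return URI.create(namespaceUri + localPart);
    }

    // Compact prefix:local form.
    @Override
    public String toString() {
        return prefix + ":" + localPart;
    }
}
```

For example, new SimpleQualifiedName("http://example.org/", "2012-12-25", "ex") has a local part beginning with a digit, which is not allowed in an XML QName but is accepted by PROV, and it maps to a URI under http://example.org/.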

PROV POJOs are now specified in terms of QualifiedNames. QualifiedName replaces javax.xml.namespace.QName, which has essentially been phased out of the 0.5.0 code, since QName instances are expected to be compatible with XML QNames and therefore cannot represent every PROV qualified name.

For instance, in PROV, one is concerned with the generation of entities by activities. This is modelled by the following interface, with getters and setters for the entity and the activity, each identified by a QualifiedName.

public interface WasGeneratedBy extends  .... {
  void setEntity(QualifiedName entity);
  void setActivity(QualifiedName activity);
  QualifiedName getEntity();
  QualifiedName getActivity();
}

Full details about this interface can be found here.

3. Documentation

It was time to provide some documentation for ProvToolbox. The focus has been on Javadoc that cross-references the PROV specifications. It can be found at http://openprovenance.org/java/site/0_5_0/apidocs/.

4.  Miscellaneous Improvements

A series of improvements has been made in ProvToolbox 0.5.0:

  • A new “visitor” interface for PROV statements has been defined. It makes it very easy to define functionality that is statement-specific (see StatementAction). A variant of this visitor also allows for values to be returned (see StatementActionValue).
  • In 0.4.0, I introduced the class Namespace to help manage prefix-namespace mappings. PROV-N allows for bundles to inherit prefixes from the enclosing document. To implement this mechanism, Namespace can now be chained, and prefixes can be looked up along that chain.
  • Extensive testing of the Key construct for PROV-Dictionary.
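The chained prefix lookup mentioned above can be sketched as follows. ChainedNamespace is a hypothetical class written for this post; the method names and the choice of returning null for an unknown prefix are assumptions, and the actual Namespace class in ProvToolbox may be organised differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a chained prefix table (not the real Namespace class).
final class ChainedNamespace {
    private final Map<String, String> prefixes = new HashMap<>();
    private final ChainedNamespace parent; // null for the top-level document

    ChainedNamespace(ChainedNamespace parent) {
        this.parent = parent;
    }

    // Declares a prefix in this scope only.
    void register(String prefix, String namespaceUri) {
        prefixes.put(prefix, namespaceUri);
    }

    // Looks the prefix up locally first, then walks up the chain of
    // enclosing scopes. Returns null if the prefix is declared nowhere.
    String lookup(String prefix) {
        for (ChainedNamespace ns = this; ns != null; ns = ns.parent) {
            String uri = ns.prefixes.get(prefix);
            if (uri != null) {
                return uri;
            }
        }
        return null;
    }
}
```

A bundle would be created with its enclosing document as parent: prefixes declared in the document are then visible from the bundle, a declaration in the bundle shadows the document's one, and prefixes declared only in the bundle remain invisible to the document.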

5. Use cases: ProvValidator and ProvTranslator

I don’t develop ProvToolbox just for the sake of it. It is used as a core component of ProvValidator and ProvTranslator.

ProvValidator implements the prov-constraints specification over prov-model. The validator is available as a service at https://provenance.ecs.soton.ac.uk/validator/view/validator.html.

ProvTranslator is also an online service offering conversion of PROV into various representations. It is essentially a service wrapping up ProvToolbox. It is available from https://provenance.ecs.soton.ac.uk/validator/view/translator.html.

Conclusion

In the new year, the focus will be on bug fixing, tackling outstanding issues in the tracker, and refactoring code, with a view to releasing ProvToolbox 1.0.

Useful Pointers

GitHub repository: https://github.com/lucmoreau/ProvToolbox/

Javadoc: http://openprovenance.org/java/site/0_5_0/apidocs/

Maven repository: http://openprovenance.org/java/maven-releases/
