Our previous post “SWORDv2 Compliance: what is it and why is it good?” introduced some reasons to make your scholarly systems compliant with SWORDv2, and what that really means. This post covers an approach to achieving SWORDv2 compliance for your particular use case(s), using DataStage and DataBank as examples.
First, if you are implementing a SWORD v2 solution, it’s worth having a passing acquaintance with the specification, at least so you are aware of the standard protocol operations.
An approach that we’ve found works well for designing how to fit SWORD v2 to your workflow is as follows:
1. Diagram your deposit workflow
Draw a diagram of your deposit workflow, with all the systems and interactions required. Don’t mention SWORD v2 anywhere at this point; the aim is for SWORD v2 to meet your workflow requirements, not for your workflow to be reshaped to fit the protocol.
For example, here’s a basic diagram showing how the DataStage to DataBank deposit looks:
2. Re-draw the diagram around SWORDv2
Re-draw the diagram in the following form:
By referring to the spec throughout, you can work out which SWORD v2 operations, with which HTTP headers and what deposit content, are required to achieve your workflow. For example, the diagram for DataFlow, which integrates DataStage and DataBank, looks basically like this:
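If it helps to keep the re-drawn diagram next to the code, the same information can be written down as data. The sketch below is only an illustration: the four steps are a hypothetical workflow, not the actual DataFlow diagram, and the `SwordStep` class, the `describe` helper and the file name `dataset.zip` are names made up for this example. The IRI labels (SD-IRI, Col-IRI, SE-IRI) and header names are the ones used in the SWORD v2 specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SwordStep:
    """One arrow in the re-drawn diagram: a workflow step mapped to an HTTP request."""
    workflow_step: str   # what the step means in your own workflow
    method: str          # HTTP verb
    target: str          # which kind of SWORD v2 IRI the request is sent to
    headers: dict = field(default_factory=dict)
    body: str = "(none)"


# Hypothetical workflow, for illustration only (not the real DataFlow diagram).
WORKFLOW = [
    SwordStep("Discover where deposits can go", "GET", "SD-IRI"),
    SwordStep(
        "Create the dataset record", "POST", "Col-IRI",
        {"Content-Type": "application/atom+xml;type=entry",
         "In-Progress": "true"},
        "Atom Entry carrying the metadata",
    ),
    SwordStep(
        "Add the packaged files", "POST", "SE-IRI",
        {"Content-Type": "application/zip",
         "Content-Disposition": "attachment; filename=dataset.zip",
         "Packaging": "http://purl.org/net/sword/package/SimpleZip",
         "In-Progress": "true"},
        "zip file",
    ),
    SwordStep(
        "Signal that the deposit is complete", "POST", "SE-IRI",
        {"In-Progress": "false", "Content-Length": "0"},
    ),
]


def describe(steps):
    """Return one summary line per step, listing the headers each request needs."""
    lines = []
    for number, step in enumerate(steps, start=1):
        header_names = ", ".join(sorted(step.headers)) or "no special headers"
        lines.append(
            f"{number}. {step.workflow_step}: "
            f"{step.method} {step.target} [{header_names}]"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(describe(WORKFLOW)))
```

A list like this gives you a checklist to carry into the next step: each entry should correspond to one method call in whichever client library you choose.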
3. Utilise the pre-existing code libraries
Take one of the code libraries (which exist in Python, Ruby, Java and PHP for the client-side, and Java and Python for the server-side), and find the methods that implement the SWORD operations your diagram calls for.
For example, consider some (simplified) Python code required by DataStage to deposit to DataBank:
    from sword2 import Connection, Entry

    # create a Connection object
    conn = Connection(service_document_url)

    # obtain a silo to deposit to (this just gets the first Silo in the list)
    conn.get_service_document()
    silo = conn.sd.workspaces[0][1][0]

    # construct an Atom Entry document containing the metadata
    e = Entry(id=dataset_identifier, title=dataset_title, dcterms_abstract=dataset_description)

    # issue a create request
    receipt = conn.create(col_iri=silo.href, metadata_entry=e)
It should then be a relatively straightforward task for a programmer to integrate this code into your application workflow.
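To connect the library call back to the diagram, it can be useful to see roughly what a metadata-only create request amounts to on the wire. The sketch below uses only the Python standard library to build (but not send) such a request. It is an approximation for illustration, not the client library's actual internals; the function names, the example URL and the sample metadata are all made up, and authentication is left out.

```python
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
DCTERMS_NS = "http://purl.org/dc/terms/"

# Use readable prefixes when the entry is serialised.
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def build_entry(identifier, title, abstract):
    """Serialise a minimal Atom Entry with id, title and dcterms:abstract (bytes)."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = identifier
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{DCTERMS_NS}}}abstract").text = abstract
    return ET.tostring(entry, encoding="utf-8")


def build_create_request(col_iri, entry_bytes, in_progress=True):
    """Build a POST of an Atom Entry to a collection IRI, without sending it."""
    return urllib.request.Request(
        col_iri,
        data=entry_bytes,
        method="POST",
        headers={
            "Content-Type": "application/atom+xml;type=entry",
            # In-Progress: true tells the server more content will follow.
            "In-Progress": "true" if in_progress else "false",
        },
    )


if __name__ == "__main__":
    # Hypothetical collection IRI and metadata, for illustration only.
    body = build_entry("example-dataset-1", "Example dataset", "A short description.")
    req = build_create_request("https://databank.example.org/swordv2/silo/sandbox", body)
    print(req.get_method(), req.full_url)
    print(req.header_items())
    print(req.data.decode("utf-8"))
```

Comparing this output with the corresponding arrow in your diagram is a quick way to check that the operation, headers and content you planned are the ones the code will actually produce.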
(This is the second part of a two-part article about SWORDv2 compliance. You can read the first part here: SWORDv2 Compliance: what is it and why is it good?)