Categories
Clients News Repositories SWORD v2

SWORD v2 development effort starts

Following a meeting of the SWORD v2 developers earlier this week, development work to implement the proposed SWORD v2 standard has now started.  Our aim is to have things to show by the time of the Open Repositories 2011 conference in June this year.

The SWORD v2 standard is not yet finalised; however, it is hoped that lessons learned during the implementations will allow the final wrinkles to be ironed out and agreed upon.

Thanks to generous funding from JISC, the project is able to fund multiple repository and client implementations.  Each of these will be made openly available, and they are listed below.  Once development locations or code repositories for these become available, links will be added to this post.

In addition, a Python Simple SWORD Server (http://sword-app.svn.sourceforge.net/viewvc/sword-app/sss/trunk/) has been developed to aid initial testing, and a further automated validation test suite will be developed.

Categories
News SWORD v2

Next draft of the v2 specification

A new copy of the proposed SWORD v2 specification has been posted based on recent feedback from the Technical Advisory Panel:

The main changes that were made are as follows:

1/ Changed all relevant instances of URI to IRI.

2/ Provided a better introduction with text from the business case/technical approach document.

3/ Corrected the identifiers to which 6.6.2 and 6.6.3 referred.

4/ Added 2 new sections covering overwriting metadata or overwriting with Atom Multipart (sections 6.5.2 and 6.5.3), which were missing from the previous version.

5/ Added a new section (6.10) on the use of SWORD on arbitrary IRIs in order to clarify how it should be used alongside other standards like CMIS, GData, OData, or just plain old REST.

6/ Removed the use of 202 (Accepted) as an HTTP response code to any request.

7/ Better introduction to the Metadata Handling section.

8/ Removed usage of In-Progress header on IRIs which do not represent the container (i.e. the Atom Entry).

9/ Better introduction to the Continued Deposit section, and the addition of a section on how to complete an In-Progress deposit.

10/ Added a new IRI type called the SE-IRI (SWORD Edit IRI), which is identified by the @rel value “http://purl.net/org/sword/terms/add” and identifies the IRI against which an HTTP POST can be made to add content to a container.  I updated all the references for HTTP POST operations to refer to this IRI, and added it and an explanation of its relation to Edit-IRI near the top of the document.  Note this doesn’t strictly add a new IRI to be maintained; the atom:link can still point to the Edit-IRI, but their usage is distinguished as per the previous discussions on this list.

11/ Changed the rules for default packaging so that in the event that a server does not announce any packaging, the client will assume that none is supported.  I believe this means that SWORD now fully simplifies to plain old AtomPub without confusion.

12/ Changed the old IRI which represented the zip package to be http://purl.net/org/sword/package/SimpleZip, which is more descriptive.
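As a rough illustration of item 10, a client could locate the SE-IRI by scanning the atom:link elements of an Atom Entry for the @rel value quoted above. This is only a sketch: the sample entry and its example.org IRIs are made up, and the @rel string is copied from this post, so check it against the published specification before relying on it.

```python
import xml.etree.ElementTree as ET

ATOM_LINK = "{http://www.w3.org/2005/Atom}link"
# rel value copied from item 10 of this post; confirm against the spec.
SE_IRI_REL = "http://purl.net/org/sword/terms/add"


def find_link_hrefs(entry_xml, rel):
    """Return the href of every atom:link in the entry whose rel matches."""
    entry = ET.fromstring(entry_xml)
    return [
        link.get("href")
        for link in entry.iter(ATOM_LINK)
        if link.get("rel") == rel
    ]


# Hypothetical Atom Entry; the example.org IRIs are placeholders.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="edit" href="http://example.org/container/1"/>
  <link rel="http://purl.net/org/sword/terms/add"
        href="http://example.org/container/1"/>
</entry>"""

se_iris = find_link_hrefs(SAMPLE_ENTRY, SE_IRI_REL)
edit_iris = find_link_hrefs(SAMPLE_ENTRY, "edit")
print("SE-IRI:", se_iris)
print("Edit-IRI:", edit_iris)
```

In this sample both links point at the same IRI, which matches the note in item 10 that the SE-IRI does not have to be a separately maintained resource.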

There are still some comments on the list which haven’t been addressed.  These will be looked at next in tandem with an effort to start the development of the clients and servers, as we feel that there’s not a lot more we can get out without first having a go at the implementations.

There are just a couple of questions regarding the latest changes:

a/ Are there any obstacles in the spec to using another APP based protocol in parallel?  We are thinking CMIS and GData, obviously, but also perhaps others like OData.  It’s important that SWORD not prove an obstacle to them.

b/ Are there any other @rel values that we need to create to accurately describe SWORD specific operations which aren’t purely APP operations?  Looking over the profile when writing this version of the spec, it was only the SE-IRI that we picked up.  Have we missed anything?

Categories
News SWORD v2

Discussing the scope of SWORDv2

As part of the SWORD v2 developments, the Technical Advisory Panel have been busy discussing many aspects of the proposed new version of the standard.  This has been a lively and engaging process.  If you would like to read these discussions and contribute any feedback, you would be very welcome!

One particularly interesting thread came from the project’s technical lead.  The message concerns the scope of SWORD v2 (which areas it should contribute to, and which it should not):

Hi Folks,

There’s been some great discussion on the list this past week or two, and I thought it might be time for a summary of what looks to me to be a key sticking point: the scope of sword.

There are two distinct sides to this argument as it’s been articulated on this list:

a) That we should adopt the approach of a content management API like CMIS or more likely GData

b) That SWORD should not say anything about what happens to the content once it is sent to the server.

In general, I am against (a) for a number of reasons.  First, I am concerned that the idioms that are associated with GData are not /necessarily/ appropriate.  The hierarchical file system is a common idiom but an idiom nonetheless, and it wouldn’t be SWORD’s place to therefore build itself over the top of it.  CMIS I have a harder time refuting or accepting, so am open to persuasion either way.  Secondly, I don’t see a reason to re-create a content management standard, since they already exist.  SWORD should, instead, provide support for the things that these standards don’t provide for our sector/use cases, while not preventing the use of them.

From a purist’s perspective of (b) the main thing that SWORD offers, then, is support for Packaging (with a capital P).  This is a valuable addition to the community since it is both common in our sector and expressly not covered at least by GData and I believe not by CMIS (though again, open to correction).  The support for packaging, though, needs to extend to a full CRUD implementation of AtomPub, which is a large part of what the profile attempts to do.  I think we have had some good technical discussion which will allow the next draft of the profile to do better at that.

In the meantime, there are some grey area parts of the profile, particularly In Progress and Suppress Metadata, which are more content management than they are deposit.  I, personally, think these are important; they are light touch, the profile doesn’t mandate the server to obey them, and they help fulfill known use cases.  Likewise the Statement could be viewed as more content management than not, although we have tried to pitch that as more an informational resource rather than an operational one (i.e. read but not write).

What I’m going to suggest for the next draft is as follows:  we’ll put some more time into analysing the appropriate ways of updating and overwriting deposit packages using the feedback on this list.  And we will extend the profile to cover how you would use the SWORD headers in content management operations /if that’s what your implementation wants/ (e.g. how you might use Suppress Metadata or In Progress with GData).  There will, obviously, be plenty of time for comment.

In conclusion: we must constrain the scope of sword to something which doesn’t tread on anyone’s toes and is of value to the community.  Too far one way or the other and we’ll either be superseded or of no value.

Cheers,

Richard

Categories
SWORD v1

Retirement of old demonstration servers

Since the very first SWORD project, there have been two demonstration SWORD servers provided.  These were funded by the original SWORD project.  They have been in existence for a number of years, and have served as excellent demonstration servers for SWORD clients. However, they have now had to be retired.

http://dspace.swordapp.org/ used to provide a demonstration DSpace system. Instead, it is possible (thanks to Duraspace for making this available) to use the http://demo.dspace.org/ server.  http://fedora.swordapp.org/ has not been replaced, but if anyone knows of a public test instance of Fedora that could be used, we’d be happy to update this page.

Categories
SWORD v2

Video introduction to SWORD v2

The team over at CottageLabs.com have created a nice 2 1/2 minute technical introductory video to the SWORD v2 standard.  You can view it via their blog:

The narrator is Richard Jones, the Technical Lead for the SWORD v2 initiative.  The video provides a great high-level introduction to the main technical concepts of the standard, and how these fit into the deposit lifecycle.

Categories
SWORD v2

Decisions regarding the challenges of SWORDv2

Following some great recent discussions by the SWORD Technical Advisory Panel, we’re pleased to announce a few decisions that have been made regarding some of the details for the new version 2 of SWORD.  The full email announcing the decisions is shown below, or can be seen in the list archives of the technical advisory group: http://www.mail-archive.com/sword-app-techadvisorypanel@lists.sourceforge.net/msg00105.html

The decisions came about from discussions within the group over the past few weeks.  They relate to the following questions:

  1. Whether the Statement should be embedded in the Deposit Receipt or be a separate document referenced in an atom:link element: In order to allow SWORD v2 to move from a fire-and-forget methodology to one where a SWORD client can interact with the deposit through what we’re calling the ‘deposit lifecycle’, some form of feedback is required where the client can ask the server for details of what has happened to the deposited item(s).  The proposal is to support this via the provision of a ‘statement’.  Think of it a bit like a bank account statement: you can see what has gone into the bank account (deposits), what might have happened to the deposit (e.g. interest being added), and full details of the item.  The question here was whether a copy of the statement should be given to the SWORD client when it makes the deposit(s), or if the client should ask for a copy of the statement whenever it wants it.
  2. Whether to use OAI-ORE for the Statement format or an Atom Feed (as per CMIS and GData): There is a decision to be made as to how the statement should be formatted.  Should it be formatted as an OAI-ORE resource map, or as an Atom Feed?  There are pros and cons for each method.
  3. How the client and server should negotiate over the format of the content returned by the edit-media link (EM-URI): If multiple formats of statement are allowed, how should the client and server come to an agreement as to which is the best format to send, based upon a combination of the server’s capabilities and the client’s preferences?  This problem is known as content negotiation.

The full email below outlines these problems, and the decisions made.  The next job is to now attempt the implementation of the standard, and based on the experiences of the developers and initial users, the standard will likely become refined further.

Dear All,

Thanks for your extensive feedback on the various issues that we have been discussing on this list, it has been really valuable for the project team to get this input.  We have, we think, identified 3 particular issues of contention:

1/ Whether the Statement should be embedded in the Deposit Receipt or be a separate document referenced in an atom:link element

2/ Whether to use OAI-ORE for the Statement format or an Atom Feed (as per CMIS and GData)

3/ How the client and server should negotiate over the format of the content returned by the edit-media link (EM-URI)

The project team has gone through each of these issues carefully, and attempted to extract the simplest solutions but with a view to keeping the SWORD 2.0 specification quite open at this stage, so that community best practices can actually inform the standard itself in the long run.

Therefore, we’re proposing the following approaches to these issues:

1/ Whether the Statement should be embedded in the Deposit Receipt or be a separate document referenced in an atom:link element

If the Statement is to be embedded in the Deposit Receipt, then it needs really to be in OAI-ORE form, for the purposes of being clear foreign markup.  Nonetheless, bearing in mind that there is a question as to whether the Statement should be an Atom Feed, it is clear that this solution will not be adequate by itself.  We therefore propose that the standard provided to the project’s funded developers to code against says that an OAI-ORE serialisation MAY be embedded in the Deposit Receipt (the Deposit Receipt will not be required to meet the OAI-ORE spec for being a resource map itself).

Alongside this, or instead of it, there MAY be one or more atom:link elements in the Deposit Receipt which link to an external Statement. These atom:link elements can specify their type attribute to say whether they are an application/rdf+xml or application/atom+xml;type=feed.  It will be a requirement of the spec that there MUST be an embedded Statement or at least one separate Statement.

Therefore, you may see a Deposit Receipt like:

<atom:entry>
  <atom:link rel="http://purl.org/net/sword/terms/statement" type="application/rdf+xml" href="http://....."/>
  <rdf:RDF>
    <!-- ORE statement goes here -->
  </rdf:RDF>
</atom:entry>
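To make the MUST above concrete, here is a minimal sketch of how a client or validator might check a Deposit Receipt for either an embedded rdf:RDF block or at least one Statement link. The namespace declarations and the example.org IRIs in the sample receipts are assumptions added so that the XML parses; the rel value is the one shown in the example above.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
STATEMENT_REL = "http://purl.org/net/sword/terms/statement"


def has_statement(receipt_xml):
    """True if the receipt embeds an rdf:RDF block or links to a Statement."""
    entry = ET.fromstring(receipt_xml)
    embedded = entry.find(RDF + "RDF") is not None
    linked = any(
        link.get("rel") == STATEMENT_REL
        for link in entry.findall(ATOM + "link")
    )
    return embedded or linked


# Hypothetical receipts; the hrefs are placeholders.
NAMESPACES = (
    'xmlns:atom="http://www.w3.org/2005/Atom" '
    'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
)
LINK_ONLY = (
    "<atom:entry " + NAMESPACES + ">"
    '<atom:link rel="' + STATEMENT_REL + '" type="application/rdf+xml" '
    'href="http://example.org/statement/1"/>'
    "</atom:entry>"
)
EMBEDDED_ONLY = "<atom:entry " + NAMESPACES + "><rdf:RDF/></atom:entry>"
NEITHER = "<atom:entry " + NAMESPACES + "/>"

for name, xml in [("link", LINK_ONLY), ("embedded", EMBEDDED_ONLY), ("neither", NEITHER)]:
    print(name, has_statement(xml))
```

The check only looks at direct children of the entry, which matches the layout of the example receipt; a real validator would probably also want to inspect the contents of the embedded RDF.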

2/ Whether to use OAI-ORE for the Statement format or an Atom Feed (as per CMIS and GData)

Another good reason for the approach in (1) is that this means we can provide different Statement URIs with different type attributes.  We plan to ask developers to produce an ORE and an Atom Feed Statement format under the project funding.  So you may see a Deposit Receipt like:

<atom:entry>
  <atom:link rel="http://purl.org/net/sword/terms/statement" type="application/rdf+xml" href="http://....."/>
  <atom:link rel="http://purl.org/net/sword/terms/statement" type="application/atom+xml;type=feed" href="http://....."/>
  <rdf:RDF>
    <!-- ORE statement goes here -->
  </rdf:RDF>
</atom:entry>
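With more than one Statement link on offer, a client would need to pick the one whose type it can handle. The sketch below shows one possible way to do that, assuming the client keeps an ordered list of preferred media types. The namespace declaration, the sample receipt and its example.org IRIs are invented for illustration, and the type strings are compared literally, whereas a real client might want to normalise media type parameters first.

```python
import xml.etree.ElementTree as ET

ATOM_LINK = "{http://www.w3.org/2005/Atom}link"
STATEMENT_REL = "http://purl.org/net/sword/terms/statement"


def statement_links(receipt_xml):
    """Map each Statement link's type attribute to its href."""
    entry = ET.fromstring(receipt_xml)
    return {
        link.get("type"): link.get("href")
        for link in entry.iter(ATOM_LINK)
        if link.get("rel") == STATEMENT_REL
    }


def pick_statement(receipt_xml, preferred_types):
    """Return (type, href) for the first preferred type on offer, or None."""
    offered = statement_links(receipt_xml)
    for media_type in preferred_types:
        if media_type in offered:
            return media_type, offered[media_type]
    return None


# Hypothetical receipt; the hrefs are placeholders.
RECEIPT = """<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:link rel="http://purl.org/net/sword/terms/statement"
             type="application/rdf+xml"
             href="http://example.org/statement/1.rdf"/>
  <atom:link rel="http://purl.org/net/sword/terms/statement"
             type="application/atom+xml;type=feed"
             href="http://example.org/statement/1.atom"/>
</atom:entry>"""

choice = pick_statement(
    RECEIPT, ["application/atom+xml;type=feed", "application/rdf+xml"]
)
print(choice)
```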

The combination of approaches in (1) and (2) may seem woolly or indecisive, but we believe that we can’t determine in advance which of these approaches is better, and that it should be up to the community of users and implementers to decide which approach works best based on actual usage of the developed software.  Therefore, while the burden of implementation is placed on the funded portion of the project, we expect community driven implementations/usages to favour one approach over another (possibly taking into account things like compatibility with GData and CMIS, or preferring the more semantic web approach of ORE). We can then use this information later in deriving a SWORD spec which is based on best practices.

3/ How the client and server should negotiate over the format of the content returned by the edit-media link (EM-URI)

The Content Negotiation issue arises from the fact that AtomPub requires at most one edit-media URI with a given type to be available in the Atom Entry (Deposit Receipt).  Since the SWORD server may contain multiple files rather than the one file that AtomPub assumes, what this EM-URI returns under GET is unclear.  We initially considered 2 approaches:

a/    A separate HTTP header like Accept-Packaging to allow content negotiation on a package format
b/    A separate HTTP header like Accept-Media-Features to allow general content negotiation on feature sets

As we discussed, both of these have pros and cons, and none of the approaches to doing this are marked by any best practices, which makes the project team unwilling to commit to anything too complex or substantial, at a risk to the simplicity and overall success of SWORD. Instead we are suggesting adopting a much simpler approach:

The Deposit Receipt can already contain a sword:package element (as per SWORD 1.3), and SWORD 2 plans to allow an arbitrary number of such elements.  These elements will describe the packaging formats supported by the server, so the client will know in advance what the capabilities of the server are.  Therefore, instead of engaging in a content negotiation process, the client will just specify a separate HTTP header indicating what package format should be returned.  Whether this header re-uses the Packaging header used during deposit or specifies a new header has yet to be decided.
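On the client side, this simpler approach might look something like the sketch below. Because the header name is still undecided, `X-Example-Packaging` is purely a placeholder, and both the function name and the second package IRI are invented for illustration. The list of announced formats is assumed to have been read from the Deposit Receipt already.

```python
# Placeholder only: the email says the real header name is not yet decided.
PLACEHOLDER_HEADER = "X-Example-Packaging"


def retrieval_headers(announced_formats, wanted, header_name=PLACEHOLDER_HEADER):
    """Build the request headers asking the server for one package format.

    The client does not negotiate; it simply asks for a format the server
    has already announced, and refuses to ask for anything else.
    """
    if wanted not in announced_formats:
        raise ValueError("server did not announce package format: " + wanted)
    return {header_name: wanted}


# Formats as a client might have read them from the Deposit Receipt.
# The second IRI is hypothetical.
announced = [
    "http://purl.net/org/sword/package/SimpleZip",
    "http://example.org/package/Other",
]

headers = retrieval_headers(announced, "http://purl.net/org/sword/package/SimpleZip")
print(headers)
```

If the server announces nothing, the list is empty and every request for a package format is rejected before it is sent.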

Hopefully these approaches make sense to the group.  We are interested in how you think these will go down both during the project and beyond in the community, and if there are any obvious problems with what we’re proposing here as the way forward for SWORD.

All the best,

Richard
(On-Behalf-Of the SWORD project team)

Categories
News Repositories SWORD v1 SWORD v2

SWORD Wikipedia entry

SWORD now has a Wikipedia entry!

http://en.wikipedia.org/wiki/SWORD_(Protocol)

The page does not have much detail at the moment, so if you have a minute or two, please take a look and see if you could add to, edit, correct, or otherwise improve it.  If you know of any other entries that should link to or reference the SWORD entry, please could you add those in too.

If you have any SWORD-related content or links that you would like to be added to this site (links to implementations, code, documentation, blog entries, papers) please pass them on to info@swordapp.org and we’ll get them added.

Categories
SWORD v2

SWORD Technical Advisory Panel

As part of the development of SWORD v2 a Technical Advisory Panel has been formed.  This panel consists of experts from across the SWORD and general Digital Repository domains, along with experts in related fields. The purpose of the panel is to ensure that the standard develops in a way that meets the needs of its user community, that it exhibits best-practice in the area of Internet standards, that developers are able to work with it, and that it tries to be generic enough to allow interoperability with other types of systems whilst maintaining its focus on repository resource deposit.  The panel consists of people from universities, national libraries, research funders, commercial companies, developers, repository domain experts, and repository managers.

The following people have generously donated their time and expertise to be on this panel:

  • Julie Allinson (The University of York)
  • Tim Brody (University of Southampton)
  • Pablo de Castro (SONEX / Universidad Carlos III de Madrid)
  • Charles Duncan (Intrallect)
  • Reinhard Engels (Harvard University Library)
  • David Flanders (JISC)
  • John Fearns (Symplectic)
  • Kathi Fletcher (Shuttleworth Foundation Fellow)
  • Steve Hitchcock (University of Southampton)
  • Jason Hoyt (Mendeley)
  • Bill Ingram (University of Illinois at Urbana-Champaign)
  • Richard Jones (SWORD Technical Lead)
  • Graham Klyne (University of Oxford)
  • Stuart Lewis (SWORD Community Manager / The University of Auckland Library)
  • Mark MacGillivray (Developer)
  • Andrea Marchitelli (CILEA)
  • Alistair Miles (The Wellcome Trust Centre for Human Genetics)
  • Ben O’Steen (Developer)
  • Glen Robson (National Library of Wales)
  • Richard Rodgers (MIT)
  • Robert Sanderson (LANL)
  • Peter Sefton (Australian Digital Futures Institute, University of Southern Queensland)
  • Nick Sheppard (UKCoRR / Leeds Metropolitan)
  • Eddie Shin (MediaShelf)
  • Alec Smecher (Public Knowledge Project)
  • Adrian Stevenson (UKOLN)
  • Ian Stuart (Repository Junction / EDINA)
  • Ed Summers (Library of Congress)
  • David Tarrant (University of Southampton)
  • Robin Taylor (The University of Edinburgh)
  • Graham Triggs (BioMed Central)
  • Alex Wade (Microsoft External Research)
  • Paul Walk (UKOLN)
  • Simeon Warner (arXiv)
  • Scott Wilson (CETIS)
  • Nathan Yergler (Creative Commons)

In the interests of openness, the group discussions are being archived in an open mail archive: http://www.mail-archive.com/sword-app-techadvisorypanel@lists.sourceforge.net/

Categories
SWORD v2

SWORDv2 Project Plan – Timeline

This is the third in a series of blog posts outlining the SWORDv2 project plan.  This blog post details the proposed timeline of when each work package will take place.

The project is split into two phases, an initial research and specification phase, followed by a development and test phase.  During this time a second stream of advocacy and general community management will be undertaken.  There will also be a stream of work for project support.

Phase One: November 2010 – January 2011

  • Technical
    • Work package 1: Compile use cases
    • Work package 2: Analysis of white paper
    • Work package 4: Creation of prototype SWORD v2 specification
  • Community
    • Work package 3: Initial community formation
  • Project support
    • Work package 10: Project dissemination
    • Work package 11: Project administration

Phase Two: February 2011 – April 2011

  • Technical
    • Work package 5: Server implementations
    • Work package 6: Client implementations
  • Community
    • Work package 7: Instructional documentation and ongoing community management
    • Work package 9: Support the JISCDepo projects
  • Project support
    • Work package 8: Develop a sustainability plan
    • Work package 10: Project dissemination
    • Work package 11: Project administration

Categories
SWORD v2

SWORDv2 Project Plan – Workplans

This is the second in a series of blog posts that is outlining the project plan for the SWORDv2 project.  This post will describe the work packages that will be undertaken over the next 6 months to complete the project.

Work package 1: Compile use cases

  • The project will work with the SONEX group to gather, document, and publicise relevant use cases that fit with the overarching principles of SWORD v2. These use cases will be used to ensure that the SWORD v2 standard meets the requirements of the repository community.  In addition the community manager will ensure that the repository manager community is made aware of the project, and is encouraged to participate in the collection of use cases.

Work package 2: Analysis of white paper

  • As a concluding part to the SWORD 3 project, the Technical Lead wrote a white paper that outlined the potential requirements of the SWORD v2 standard. This white paper was given wide publicity at the Open Repositories 2010 conference and was placed online on the JISCPress site (http://sword2depositlifecycle.jiscpress.org/) for comment by the community. The feedback received from the community needs to be analysed, and once combined with the use cases from the SONEX group it will be written-up to create a document that will define the use cases, requirements, and outputs of the SWORD v2 project. The document will be used to describe to the community what the standard will do and why it will do it, and will give rise to the standard itself in work package 4.

Work package 3: Initial community formation

  • While the analysis of the white paper is being undertaken the Community Manager will need to perform work to start to form an initial community around the SWORD v2 work. This first phase of community building will be centered around publicity of the project including a project web site and blog (to supplement the current swordapp.org site), an elevator pitch to explain the aims of the project to the community, and the creation of an effective communications channel to allow dissemination, discussion and feedback to easily take place. A technical advisory panel will be created that consists of project staff members, and developers and repository managers from the worldwide repository community. They will come from a cross section of the different repository platforms and user types. Whilst all aspects of the project will be open for comment by any interested party, the technical advisory panel will be consulted closely at all times to ensure the project is meeting the requirements of the repository community.

Work package 4: Creation of prototype SWORD v2 specification

  • Following the analysis of the white paper, the creation of the report and the formation of the initial community, a prototype SWORD v2 specification will be written. The specification will detail the SWORD v2 protocol to a level where it can be discussed, evaluated, and implemented. The specification will be considered a draft, as it is envisaged that during the later implementation phase the specification will change in light of user experience and evaluation.

Work package 5: Server implementations

  • Once the prototype specification has been written, server implementations will be required. These will be used in conjunction with the client implementations (work package 6) to test the specification and evaluate it against the use cases and requirements as specified earlier in the project.  The server implementations should attempt to provide some mechanism by which client requests can be validated.  All resultant code will be released with a suitable open source licence. Three server implementations will be created, one each for DSpace, EPrints, and Fedora.  If appropriate and possible, other server implementations will be encouraged with technical support from the project.

Work package 6: Client implementations

  • Once the prototype specification has been written, client implementations will be required. These will be used in conjunction with the server implementations (work package 5) to test the specification and evaluate it against the use cases and requirements as specified earlier in the project.  The client implementations should attempt to provide some mechanism by which they can be used to validate a SWORD endpoint. There will be 4 client implementations created, to offer a highly heterogeneous environment within which to test the use cases and requirements.  They are also designed to offer a series of multi-language software libraries for easy use within other systems not dealt with here. All resultant code will be released with a suitable open source licence.  The clients will include Java, PHP, Ruby and Python code.

Work package 7: Instructional documentation and ongoing community management

  • In order to be empowered to interact with the SWORD v2 standard the community will require instructional documentation to make use of the proposed new standard. This will take the form of code examples, training materials, and support. The community will need facilitation to ensure it interacts with the standard and the demonstration implementations. This will be achieved through the provision of support, advocacy and publicity of the standard and the implementations. As well as ensuring the user and developer communities adopt SWORD v2 and feel ownership of it, the work of advocating, educating and promoting SWORD v1 will continue. This will be achieved by maintaining the SWORD v1 website, and by seeking further opportunities to teach ‘The SWORD Course’ at suitable events.

Work package 8: Develop a sustainability plan

  • A viable sustainability plan is required to ensure the ongoing development and maintenance of the SWORD protocol standard, the implementations, and the advocacy.

Work package 9: Support the JISCDepo projects

  • The JISCDepo programme (#jiscdepo) is a suite of projects that work in the area of repository deposit. This work package will support the JISCDepo projects to use SWORD v2 where applicable, and ensure the SWORD v2 standard meets any suitable requirements that they have. The work will be undertaken by UKOLN through the DevCSI infrastructure. In addition to supporting the JISCDepo programme, they will support the use of SWORD in the wider repository programme of projects funded by JISC. If appropriate, the involvement of DevCSI will be used to run a community development event to bring together developers to experiment and develop with the SWORD v2 standard.

Work package 10: Project dissemination

  • In addition to the advocacy and community development of SWORD v2 through the use of the project website, blog, and other activities, the project will also disseminate its findings and the developments it has created, and encourage community interaction, by attending major repository events.  Additionally the project will seek to work with existing support networks such as the RSP to deliver training and advocacy through their programmes of events.

Work package 11: Project administration

  • The project will require standard JISC project management activities to take place. These include formal project plans, reports, budgets, and attendance at relevant programme meetings. Overheads and administration costs will be required for UKOLN to support these activities and the general running of the project.