Tuesday, 2 October 2012

Digital Motor Archive available free for the month of October in return for…...


It came to my notice last week that a publisher in the UK had digitised both the current and back issues of its magazine, ‘Commercial Motor’, which in its early life was a newspaper. In the cut-throat world of publishing there are few publishers left that are still publishing the same title they were 100 years ago, and that also have a complete set of back copies. Those that fall into this bracket are in the unique position of being able to digitise the content themselves for their readers (usually at a loss); offer it to a library to digitise (at no cost); or sell it to a commercial e-vendor to package with other products for academia (and make a profit). Unfortunately most choose the last option, which makes this type of content only really accessible to academics and students via academic libraries. E-vendors normally charge high subscription rates for digitised magazines and newspapers and package them up with other content, making purchase viable only for large universities and national libraries, and therefore severely restricting readership of the content.

Not many publishers are digitising their own content because, generally speaking, the cost of preparation, digitisation and OCR, and of building a good website to deliver the content, outweighs the amount of money they would ever recoup from reader subscriptions. Normally a subscription to the current issue would be packaged up with old issues. And that’s where the model fails, because often current readers have no interest in the old stuff. People who do have an interest in the old stuff are generally a different group of people: historians, researchers etc. The exception to this rule appears to be anything to do with hobbies such as knitting, cooking, railways, cars, and stamps.

The Commercial Motor Archive http://archive.commercialmotor.com/ came to my notice because it is available for free for the month of October, and I wondered why. It is a rich archive running from 1905 to the present day, covering more than a century and two world wars, well illustrated and with everything you ever wanted to know about commercial vehicles.

The search and browse mechanism is very impressive and works well. For example, the articles on each page have been zoned, so you can search for an article and find it easily within a page. It has many similarities to the hugely popular Australian Newspapers http://trove.nla.gov.au/newspaper. The page image displays alongside the OCR text to make it easier to read. You can browse covers, browse by date, and zoom in on pages. Results can be filtered. Users can add comments and tags. The quality of the OCR text, and therefore of the search, is very good.

It has one thing that was never implemented on Australian Newspapers (though often asked for by users): a little box on each page called ‘Report an error’, and this is the reason for its free access. The site owner is hoping that as people use and read the pages in the archive they will report errors as they see them, and for this they get free access to the content. The known errors that need to be identified are incomplete articles (where the zoning has gone wrong); OCR errors in the headlines of articles; and OCR errors in the text. However, readers can only report them, not actually fix them. The site states:

“the archive is beta because it isn’t perfect at the moment and there are a few glitches to be ironed out. Every article page has a 'Noticed an error?' button you can use to report a problem. Please don’t expect an immediate change to the error - we gather all the reports together and prioritise them, fixing the most pressing errors first.”

It sounds like they don’t know how many errors there are, how many people will report errors, or who will fix the errors and how. Interesting.
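Purely as an illustration of what "gather all the reports together and prioritise them" might look like, here is a minimal Python sketch. Everything in it is my own assumption, not something the site describes: the `ErrorReport` fields, the severity weights for the three known error types, and the rule that a problem reported by more readers ranks higher.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical severity weights for the three known error types.
SEVERITY = {"incomplete_article": 3, "headline_ocr": 2, "text_ocr": 1}


@dataclass(frozen=True)
class ErrorReport:
    article_id: str  # hypothetical identifier for a zoned article
    kind: str        # one of the keys in SEVERITY


def prioritise(reports):
    """Merge duplicate reports and rank them, most pressing first."""
    # Count how many readers reported the same problem on the same article.
    counts = Counter((r.article_id, r.kind) for r in reports)
    # Assumed scoring rule: severity weight times number of reports.
    scored = [
        (SEVERITY[kind] * n, article_id, kind)
        for (article_id, kind), n in counts.items()
    ]
    # Highest score first; ties broken by article id and error type.
    return sorted(scored, key=lambda item: (-item[0], item[1], item[2]))


reports = [
    ErrorReport("a1", "text_ocr"),
    ErrorReport("a2", "incomplete_article"),
    ErrorReport("a1", "text_ocr"),
    ErrorReport("a3", "headline_ocr"),
    ErrorReport("a2", "incomplete_article"),
    ErrorReport("a1", "text_ocr"),
]

for score, article_id, kind in prioritise(reports):
    print(score, article_id, kind)
```

With these made-up reports the broken article zoning on `a2` comes out on top, ahead of the more frequently reported but less serious text error on `a1`. A real queue would also need someone to decide who works through it, which is exactly the open question.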

Although the ‘report an error’ button was never implemented on Australian Newspapers (where it was mainly needed to report upside-down and duplicate pages), it had already been decided that a ‘super user’ would be the person to review these reports and take action. Some volunteer text correctors wanted to take on extra responsibility and have special roles, much like the hierarchy in the Wikipedia editor community, so this would have been a good task for trusted volunteers.

The Commercial Motor Archive has impressed me because they are clearly striving for perfection, they understand that the fewer mistakes there are the better the search will be, and they have taken a brave step by asking the public to help in return for free access. This is indeed unusual for a commercial publisher; it belongs more in the realm of libraries and archives, where it is referred to as crowdsourcing……