The Basics: Large scale Digital Project Management Part 1

I took a class last year on digital project management. I was impressed by the technical part of the class, and by the part where we designed a website to host the digital material, but I was surprised to find that the class had very little to do with large scale projects (collections of more than 100 items). It also completely ignored book digitization. This is odd, because 90% of my job is large scale book digitization.

There are a few things that you need to consider when doing a large scale digitization effort (especially if it’s books).

What is the quantity of items to be digitized?

A collection of 200 items is going to be treated differently than a collection of 14,000 items.

How much variety is there in the group of items?

Are they all books? All pictures? Are they mixed documents? Are they bound? If you find that the group has many subgroups, go ahead and divide them out and make each format a different phase of the project. Formats that you don’t currently have equipment for, or items that are difficult to handle, can be put at the end.

Who wants the items digitized and why?

The answer to this question can greatly change how much effort you put into a project. If the project is just a passing idea from someone who isn’t invested in it, then they probably won’t care how long it takes to get on-line, or what the quality is.

When do they want the scanning to be finished?

If they expect a 14,000 item collection to be done in three months, this might be an indication that the person has unrealistic expectations. Then again, if they say they don’t care when it’s done, assume they mean a year or two.

When do they want the items returned?

Do they want the items returned all at once, or in batches? Keep in mind that if they want the items returned all at once, you have to store the items somewhere in the meantime. It’s easier to work in batches: have one batch delivered, scan it, and send it back before the next one arrives.

When do they expect the items to be on-line and accessible?

Some people assume that after something is scanned, it’s immediately available on-line. Make sure they are aware that it may take twice as long to post-process an item as it took to scan it.
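To make the timing conversation concrete, here is a rough Python sketch of the arithmetic. The 14,000 item collection and the "post-processing may take twice as long as scanning" rule of thumb come from the points above. Everything else is an assumption for illustration only: the function names, the 65 working days used to stand in for three months, the 50 items per day scanning rate, and the idea that post-processing happens one item after another rather than in parallel with scanning.

```python
import math


def required_items_per_day(item_count, working_days):
    """Items that must be scanned every working day to hit a deadline."""
    if working_days <= 0:
        raise ValueError("working_days must be positive")
    return math.ceil(item_count / working_days)


def total_working_days(item_count, items_scanned_per_day, post_processing_factor=2.0):
    """Working days until items are on-line, not just scanned.

    post_processing_factor is how many times longer post-processing takes
    than scanning (2.0 = twice as long). This assumes the two steps are
    done back to back by the same staff, which is a simplification.
    """
    if items_scanned_per_day <= 0:
        raise ValueError("items_scanned_per_day must be positive")
    scan_days = item_count / items_scanned_per_day
    return scan_days * (1 + post_processing_factor)


if __name__ == "__main__":
    # Hypothetical inputs: roughly 65 working days in three months,
    # and a scanning rate of 50 items per day.
    print(required_items_per_day(14_000, 65))
    print(total_working_days(14_000, 50))
```

Swap in your own scanning rate from a test batch. If the first number is far beyond what your equipment and staff can do in a day, that is a good sign the deadline needs to be renegotiated.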

Are there any copyright restrictions that need to be taken into consideration?

A digital collection is almost useless if you don’t have the rights to make it available on-line. Make sure that you know what restrictions apply, and how you can work within them so that the collection reaches the people who need it.

Do we have storage space for all the files for 10 years?

When you’re dealing with a book project, remember that each book has about 300 pages, and each page is an image. Putting them into a PDF helps with the size, but you still have the problem of storing the original files for archival purposes. Do a few test documents and find out how much space you’ll need if every item in the collection were done the same way with the same specifications.
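The test-document approach can be turned into a quick estimate. The Python sketch below is one way to do it, not a standard tool: you measure the file size of a few test page images, average them, and multiply out across the collection. The 300 pages per book figure comes from the paragraph above. The function names, the sample page sizes, and the `copies` parameter (for keeping more than one copy of each file) are assumptions made up for this example.

```python
def estimate_collection_bytes(item_count, pages_per_item, sample_page_bytes, copies=1):
    """Extrapolate total storage from the sizes of a few measured test pages."""
    if not sample_page_bytes:
        raise ValueError("need at least one measured test page")
    average_page = sum(sample_page_bytes) / len(sample_page_bytes)
    return item_count * pages_per_item * average_page * copies


def human_readable(num_bytes):
    """Format a byte count using 1024-based units."""
    for unit in ("B", "KB", "MB", "GB", "TB", "PB"):
        if num_bytes < 1024 or unit == "PB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024


if __name__ == "__main__":
    # Hypothetical test pages of 10 MB and 20 MB each, two copies kept.
    total = estimate_collection_bytes(14_000, 300, [10_000_000, 20_000_000], copies=2)
    print(human_readable(total))
```

Run it once with the size of your archival master images and once with the size of your access copies (such as PDFs), since the two can differ a great deal.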

Do we have a content management system that can handle that many items?

A traditional website can only handle so many items. If you want a large scale digitization project to be useful, you might consider getting a content management system created for large collections that will support the correct metadata and searching.

Stay tuned for Part 2: Project planning
