
The Basics: Software

Image editing software

Which software you use depends on what you are digitizing. If you are digitizing images, Photoshop or something similar is vital. The GIMP is a similar program that is free.

Whenever possible, you want to avoid having to edit the images at all. You want your scan to have good lighting, good white balance, and a good crop box before you touch it with a program. Getting the scan right up front saves time and saves the image from unnecessary tampering.

Each image editing program implements its functions with its own algorithms. This is why the same function can give different results in different programs. The more expensive and popular software (like Photoshop) has good algorithms that may produce better results.
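To make that concrete, here is a small Python sketch of one "function" (convert a color pixel to gray) done two different ways: a plain average of the channels, and a weighted sum using the Rec. 601 luma weights. The function names are made up for illustration, and this is not a claim about what any particular program does internally.

```python
def gray_average(r, g, b):
    """Convert an RGB pixel to gray by averaging the three channels."""
    return round((r + g + b) / 3)


def gray_luma(r, g, b):
    """Convert an RGB pixel to gray using the Rec. 601 luma weights."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


pixel = (200, 120, 40)  # a made-up orange-brown sample value
print(gray_average(*pixel))  # 120
print(gray_luma(*pixel))     # 135
```

Same pixel, same "convert to gray" button, two different answers. Neither is wrong; they just weigh the channels differently.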

The basic functions you want in image editing software:
· color correction
· cropping
· brightness/contrast correction
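If you are curious what two of these functions do under the hood, here is a minimal Python sketch of cropping and brightness/contrast correction on a tiny grayscale image stored as a list of rows. The function names and the formula (scale around mid-gray, then shift) are assumptions for illustration; real programs use their own, often more sophisticated, methods.

```python
def clamp(value):
    """Keep a pixel value inside the valid 0-255 range."""
    return max(0, min(255, round(value)))


def adjust(pixels, brightness=0, contrast=1.0):
    """Scale each pixel around mid-gray (128), then shift it by brightness."""
    return [[clamp((p - 128) * contrast + 128 + brightness) for p in row]
            for row in pixels]


def crop(pixels, left, top, right, bottom):
    """Keep rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in pixels[top:bottom]]


# A made-up 4x4 scan: a dark border around a 2x2 area of interest.
scan = [
    [10, 10, 10, 10],
    [10, 100, 150, 10],
    [10, 120, 200, 10],
    [10, 10, 10, 10],
]
page = crop(scan, 1, 1, 3, 3)
print(page)                                       # [[100, 150], [120, 200]]
print(adjust(page, brightness=20, contrast=1.5))  # [[106, 181], [136, 255]]
```

Notice that the brightest pixel gets clipped to 255. That lost detail is one reason to get the lighting right in the scan rather than fixing it afterward.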


Text image processing software

If you are scanning mostly text, then you need a totally different kind of program. If you are doing text, then you are likely doing books or newspapers (material that has a lot of pages). You will probably also want OCR (Optical Character Recognition), so you can make the text searchable.

In order to make text from a scan readable, the scan has to be as clear, as level, and as clean as possible. This means that when you are looking at text image processing software, you want these basic functions:

· Batch processing (many images in a batch without human interaction)
· Crop
· Conversion to bi-tonal (black and white) or grayscale
· De-skew (leveling the image based on lines of text). There is actually a de-skew plugin for GIMP.
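Here is a rough Python sketch of what three of these functions boil down to, using tiny grayscale pages stored as lists of rows. Everything in it is an assumption for illustration: the function names, the fixed cutoff of 128 (real tools often pick the threshold per image), and the idea that something else has already found the two endpoints of a line of text. Finding that line is the hard part of de-skewing and is not shown.

```python
import math


def to_bitonal(pixels, threshold=128):
    """Map each grayscale pixel to 0 (black) or 255 (white)."""
    return [[255 if p >= threshold else 0 for p in row] for row in pixels]


def skew_angle(left, right):
    """Angle in degrees between a text baseline and the horizontal,
    given the baseline's two (x, y) endpoints."""
    (x1, y1), (x2, y2) = left, right
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def process_batch(pages, threshold=128):
    """Apply the same conversion to every page, with no manual steps."""
    return {name: to_bitonal(pixels, threshold)
            for name, pixels in pages.items()}


# Two made-up 2x2 "pages" standing in for a folder of scans.
pages = {
    "page_001": [[250, 30], [240, 245]],
    "page_002": [[20, 200], [127, 128]],
}
for name, result in process_batch(pages).items():
    print(name, result)

# A baseline that drops 17 pixels over 1000 pixels is only slightly tilted.
print(skew_angle((0, 0), (1000, 17)))
```

A de-skew tool would then rotate the page by the opposite of that angle. The point of the batch function is simply that the settings are chosen once and applied to every page the same way.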


Many companies that sell digitization equipment will have some piece of software that takes care of these issues. Check around, and ask.

In addition to a program that can do the above, you need a separate program for OCR. The program I’ve heard used most often is ABBYY FineReader. It creates a PDF with searchable text.

That's the basic software.
