Here’s an admission for you: I have a hard time comprehending all of the details around what Google Books is up to. I gather they’re scanning everything ever published (though not necessarily well; more on that in a sec), posting the out-of-copyright stuff, and linking to snippets of the other stuff. Some folks are objecting to that, so there’s a lawsuit, and a settlement in the works. This article, The Audacity of the Google Book Search Settlement, caught my eye today and laid out some of the legal hoop-jumping, so that helped. And I know that the American Library Association has been having their say, as they generally do (not a fan, but that’s another post). I’m just not sure what I think of it all.
I digitized a book myself last year: A History of the Town of Amherst, New York, 1818-1865, which is long out of print and which the Town of Amherst holds the copyright to, so no worries there. It was a painful, arduous process, and even though I had a grant to do it, oy, I’m not sure I’d gear up for that again. But there’s Google Books, with tons of resources, just scanning and posting away, and doing all the heavy lifting, so that’s good, right? For the same reason I scanned my one book (it’s useful to local researchers and there are only a few copies out there, so this way anyone can access it), Google Books is scanning, well, all the books.
But I’ve been playing around with the Barnes & Noble ereader app for the iPhone, and used it to download a handful of free ebooks from Google Books. The results have all been poor. Emma had a few words chopped off at the beginning and all kinds of bad characters from poor OCR. Ditto for Persuasion. Virgil’s Aeneid wasn’t so bad. But Anna Karenina was missing the first four chapters. Completely missing them. I was appalled. The sloppy OCR I can try to get past (though…) but leaving out four chapters?!