Technology and Books for All - Part 6

Jean-Paul added in June 2000: "Surfing the web is like radiating in all directions (I am interested in something and I click on all the links on a home page) or like jumping around (from one click to another, as the links appear). You can do this in the print media, of course. But the difference is striking. So the internet didn't change my life, but it did change how I write. You don't write the same way for a website as you do for a script or a play.

But it wasn't exactly the internet that changed my writing, it was the first model of the Mac. I discovered it when I was teaching myself Hypercard. I still remember how astonished I was during my month of learning about buttons and links and about surfing by association, objects and images. Being able, by just clicking on part of the screen, to open piles of cards, with each card offering new buttons and each button opening onto a new series of them. In short, learning everything about the web that today seems really routine was a revelation for me.

I hear Steve Jobs and his team had the same kind of shock when they discovered the forerunner of the Mac in the labs of Rank Xerox.

Since then I have been writing directly on the screen. I use a paper print-out only occasionally, to help me fix up an article, or to give somebody who doesn't like screens a rough idea, something immediate. It is only an approximation, because print forces us into a linear relationship: the words scroll out page by page most of the time. But when you have links, you have a different relationship to time and space in your imagination. And for me, it is a great opportunity to use this reading/writing interplay, whereas leafing through a book gives only a suggestion of it -- a vague one because a book is not meant for that."

2000: YOURDICTIONARY.COM

[Overview]

After founding A Web of Online Dictionaries (WOD) in 1995, Robert Beard included it in a larger project, yourDictionary.com, that he cofounded in early 2000. He wrote in January 2000: "The new website is an index of 1,200+ dictionaries in more than 200 languages. Besides the WOD, the new website includes a word-of-the-day-feature, word games, a language chat room, the old Web of On-line Grammars (now expanded to include additional language resources), the Web of Linguistic Fun, multilingual dictionaries; specialized English dictionaries; thesauri and other vocabulary aids; language identifiers and guessers, and other features; dictionary indices. yourDictionary.com will hopefully be the premiere language portal and the largest language resource site on the web. It is now actively acquiring dictionaries and grammars of all languages with a particular focus on endangered languages. It is overseen by a blue ribbon panel of linguistic experts from all over the world."

[In Depth (published in 2001)]

After creating A Web of Online Dictionaries in 1995, Robert Beard cofounded yourDictionary.com in early 2000. He wrote in January 2000: "A Web of Online Dictionaries (WOD) is now a part of yourDictionary.com (as of February 15, 2000). The new website is an index of 1,200+ dictionaries in more than 200 languages. Besides the WOD, the new website includes a word-of-the-day-feature, word games, a language chat room, the old Web of On-line Grammars (now expanded to include additional language resources), the Web of Linguistic Fun, multilingual dictionaries; specialized English dictionaries; thesauri and other vocabulary aids; language identifiers and guessers, and other features; dictionary indices. YourDictionary.com will hopefully be the premiere language portal and the largest language resource site on the web. It is now actively acquiring dictionaries and grammars of all languages with a particular focus on endangered languages. It is overseen by a blue ribbon panel of linguistic experts from all over the world."

Answering my question about multilingualism, Robert Beard added in January 2000: "While English still dominates the web, the growth of monolingual non-English websites is gaining strength with the various solutions to the font problems. Languages that are endangered are primarily languages without writing systems at all (only 1/3 of the world's 6,000+ languages have writing systems). I still do not see the web contributing to the loss of language identity and still suspect it may, in the long run, contribute to strengthening it. More and more Native Americans, for example, are contacting linguists, asking them to write grammars of their language and help them put up dictionaries. For these people, the web is an affordable boon for cultural expression."

Answering the same question, Caoimhin o Donnaile wrote in May 2001: "I would emphasize the point that as regards the future of endangered languages, the internet speeds everything up. If people don't care about preserving languages, the internet and accompanying globalization will greatly speed their demise. If people do care about preserving them, the internet will be a tremendous help."

Caoimhin o Donnaile teaches computing - through the Gaelic language - at the Institute Sabhal Mor Ostaig, located on the Isle of Skye, in Scotland. He maintains the college website, which is the main site worldwide with information on Scottish Gaelic, as well as European Minority Languages, a list of minority languages in alphabetical order and by language family. He wrote in May 2001: "There has been a great expansion in the use of information technology at the Gaelic-medium college here. Far more computers, more computing staff, flat screens. Students do everything by computer, use Gaelic spell-checking, Gaelic online terminology database. More hits on our web site. More use of sound. Gaelic radio (both Scottish and Irish) now available continuously worldwide via the internet. Major project has been translation of the Opera web-browser into Gaelic - the first software of any size available in Gaelic."

Published by SIL International (SIL: Summer Institute of Linguistics), The Ethnologue: Languages of the World is a catalogue of more than 6,700 languages. A paper version and a CD-ROM are also available.

Barbara Grimes was the editor of the 8th to 14th editions, 1971-2000.

She wrote in January 2000: "It is a catalog of the languages of the world, with information about where they are spoken, an estimate of the number of speakers, what language family they are in, alternate names, names of dialects, other sociolinguistic and demographic information, dates of published Bibles, a name index, a language family index, and language maps."

2000: ONLINE BIBLE OF GUTENBERG

[Overview]

The Gutenberg Bible went online in November 2000, on the website of the British Library. The Gutenberg Bible is considered the first printed book. Gutenberg printed it in 1455 in Germany, perhaps in 180 copies, of which 48 were still in existence in 2000. Three copies - two full ones and one partial one - belong to the British Library. The two full copies - a little different from each other - were digitized in March 2000 by experts from the Keio University of Tokyo and NTT (Nippon Telegraph and Telephone Communications).

2000: DISTRIBUTED PROOFREADERS

[Overview]

Conceived in October 2000 by Charles Franks, Distributed Proofreaders was launched online in March 2001 to help in the digitization of public domain books. The method is to break up the tedious work of checking eBooks for errors into small, manageable chunks. Originally meant to assist Project Gutenberg in the handling of shared proofreading, Distributed Proofreaders has become the main source of Project Gutenberg eBooks. In 2002, Distributed Proofreaders became an official Project Gutenberg site. The number of books processed through Distributed Proofreaders has grown fast. In 2003, about 250-300 people were working each day all over the world producing a daily total of 2,500-3,000 pages, the equivalent of two pages a minute. In 2004, the average was 300-400 proofreaders participating each day and finishing 4,000-7,000 pages per day, the equivalent of four pages a minute.
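The "pages a minute" figures above can be checked with a short calculation. This is only a sketch: it assumes the daily totals are spread evenly over a 24-hour day, and the function name is made up for the illustration.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes, assuming work is spread over the whole day


def pages_per_minute(pages_per_day: float) -> float:
    """Convert a daily page total into an average number of pages per minute."""
    return pages_per_day / MINUTES_PER_DAY


# Daily page totals quoted in the text (low and high end of each range).
daily_totals = {2003: (2500, 3000), 2004: (4000, 7000)}

for year, (low, high) in daily_totals.items():
    print(f"{year}: {pages_per_minute(low):.1f} to {pages_per_minute(high):.1f} pages per minute")
# 2003: 1.7 to 2.1 pages per minute
# 2004: 2.8 to 4.9 pages per minute
```

Under that assumption, "two pages a minute" matches the upper end of the 2003 range, and "four pages a minute" falls inside the 2004 range.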

Distributed Proofreaders had processed a total of 3,000 books by February 2004, 5,000 books by October 2004, 7,000 books by May 2005, 8,000 books by February 2006 and 10,000 books by March 2007, with the help of 36,000 volunteers.

[In Depth (published in 2005, updated in 2008)]

The main "leap forward" of Project Gutenberg since 2000 is due to Distributed Proofreaders. In 2002, Distributed Proofreaders became an official Project Gutenberg site. In May 2006, Distributed Proofreaders became a separate entity and continues to maintain a strong relationship with Project Gutenberg.

Volunteers don't have a quota to fill, but it is recommended they do a page a day if possible. It doesn't seem much, but with hundreds of volunteers it really adds up. In December 2007, five books were produced per day by thousands of volunteers.

From the website one can access a program that allows several proofreaders to be working on the same book at the same time, each proofreading different pages. This significantly speeds up the proofreading process. Volunteers register and receive detailed instructions. For example, words in bold, italic or underlined, or footnotes are always treated the same way for any book. A discussion forum allows them to ask questions or seek help at any time. A project manager oversees the progress of a particular book through its different steps on the website.
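The idea of several proofreaders working on different pages of the same book can be pictured with a small model. This is a hypothetical sketch, not the actual Distributed Proofreaders software: the class, method names and volunteer names are invented for the illustration.

```python
from collections import deque


class BookProject:
    """Hypothetical model of one book split into pages for shared proofreading."""

    def __init__(self, title, page_count):
        self.title = title
        self.available = deque(range(1, page_count + 1))  # pages nobody has taken yet
        self.checked_out = {}  # page number -> volunteer currently working on it
        self.done = set()      # pages that have been saved

    def check_out(self, volunteer):
        """Hand the next free page to a volunteer, or None if no page is left."""
        if not self.available:
            return None
        page = self.available.popleft()
        self.checked_out[page] = volunteer
        return page

    def save(self, page):
        """Mark a checked-out page as completed."""
        del self.checked_out[page]
        self.done.add(page)


book = BookProject("Example Book", page_count=5)
first = book.check_out("alice")   # alice receives page 1
second = book.check_out("bob")    # bob receives page 2 at the same time
book.save(first)                  # alice finishes; bob is still working
```

Because each page is handed out only once, two volunteers never work on the same page, which is what lets the work proceed in parallel.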

The website gives a full list of the books that are: (a) completed, i.e. processed through the site and posted to Project Gutenberg; (b) in progress, i.e. processed through the site but not yet posted, because they are currently going through their final proofreading and assembly; (c) being proofread, i.e. currently being processed. On August 3, 2005, 7,639 books were completed, 1,250 books were in progress and 831 books were being proofread. On May 1, 2008, 13,039 books were completed, 1,840 books were in progress and 1,000 books were being proofread.

Each time a volunteer (proofreader) goes to the website, s/he chooses a book, any book. Then one page of the book appears in two forms side by side: the scanned image of one page and the text from that image (as produced by OCR software). The proofreader can easily compare both versions, note the differences and fix them. OCR is usually 99% accurate, which makes for about 10 corrections a page. The proofreader saves each page as it is completed and can then either stop work or do another. The books are proofread twice, and the second time only by experienced proofreaders. All the pages of the book are then formatted, combined and assembled by post-processors to make an eBook. The eBook is now ready to be posted with an index entry (title, subtitle, author, eBook number and character set) for the database. Indexers go on with the cataloging process (author's dates of birth and death, Library of Congress classification, etc.) after the release.
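The link between "99% accurate" and "about 10 corrections a page" can be made explicit. The text does not say how accuracy is measured or how long a page is, so the figure of 1,000 units (characters or words) per page below is an assumption chosen because it matches the numbers quoted; the function name is also hypothetical.

```python
def expected_corrections(units_per_page: int, ocr_accuracy: float) -> float:
    """Estimate how many units on a page the OCR software gets wrong."""
    return units_per_page * (1.0 - ocr_accuracy)


# Assumption: roughly 1,000 units per page (not stated in the text).
print(round(expected_corrections(1000, 0.99)))  # 10
```

A page twice as long, or OCR at 98% accuracy, would double the estimate, which is why a second proofreading round remains useful.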

Volunteers can also work independently, after contacting Project Gutenberg directly, by keying in a book they particularly like using any text editor or word processor. They can also scan it and convert it into text using OCR software, and then make corrections by comparing it with the original. In each case, someone else will proofread it. They can use ASCII and any other format. Everybody is welcome, whatever the method and whatever the format.

New volunteers are most welcome too at Distributed Proofreaders (DP), Distributed Proofreaders Europe (DP Europe) and Distributed Proofreaders Canada (DPC). Any volunteer anywhere is welcome, for any language. There is a lot to do. As stated on both websites, "Remember that there is no commitment expected on this site. Proofread as often or as seldom as you like, and as many or as few pages as you like. We encourage people to do 'a page a day', but it's entirely up to you! We hope you will join us in our mission of 'preserving the literary history of the world in a freely available form for everyone to use'."

2000: PUBLIC LIBRARY OF SCIENCE

[Overview]

The Public Library of Science (PLoS) was founded in October 2000 by biomedical scientists Harold Varmus, Patrick Brown and Michael Eisen, from Stanford University, Palo Alto, and University of California, Berkeley. Headquartered in San Francisco, PLoS is a non-profit organization whose mission is to make the world's scientific and medical literature a public resource. In early 2003, PLoS created a non-profit scientific and medical publishing venture to provide scientists and physicians with high-quality, high-profile journals in which to publish their most important work: PLoS Biology (launched in 2003), PLoS Medicine (2004), PLoS Genetics (2005), PLoS Computational Biology (2005), PLoS Pathogens (2005), PLoS Clinical Trials (2006), PLoS Neglected Tropical Diseases (2007). All PLoS articles are freely available online, and deposited in the free public archive PubMed Central. They can be freely redistributed and reused, including for translations, as long as the author(s) and source are cited. PLoS also hopes to encourage other publishers to adopt the open access model, or to convert their existing journals to an open access model.

2001: WIKIPEDIA

[Overview]

Launched in January 2001 by Jimmy Wales and Larry Sanger (Larry resigned later on), Wikipedia has quickly grown into the largest reference website on the internet. Its multilingual content is free and written collaboratively by people worldwide. Its website is a wiki, which means that anyone can edit, correct and improve information throughout the encyclopedia. The articles stay the property of their authors, and can be freely used according to the GFDL (GNU Free Documentation License). Wikipedia is hosted by the Wikimedia Foundation, which runs a number of other projects, for example Wiktionary - launched in December 2002 - followed by Wikibooks, Wikiversity, Wikinews and Wikiquote. In December 2004, Wikipedia had 1.3 million articles from 13,000 contributors in 100 languages. Two years later, in December 2006, it had 6 million articles in 250 languages.

2001: CREATIVE COMMONS

[Overview]

Creative Commons (CC) was founded in 2001 by Lawrence Lessig, a professor at Stanford Law School, California. As stated on its website, "Creative Commons is a nonprofit corporation dedicated to making it easier for people to share and build upon the work of others, consistent with the rules of copyright. We provide free licenses and other legal tools to mark creative work with the freedom the creator wants it to carry, so others can share, remix, use commercially, or any combination thereof." There were one million Creative Commons licensed works in 2003, 4.7 million licensed works in 2004, 20 million licensed works in 2005, 50 million licensed works in 2006, 90 million licensed works in 2007, and 130 million licensed works in 2008. Science Commons was founded in 2005 to "design strategies and tools for faster, more efficient web-enabled scientific research." ccLearn was founded in 2007 as "a division of Creative Commons dedicated to realizing the full potential of the internet to support open learning and open educational resources."

2002: MIT OPENCOURSEWARE