Technology and Books for All - Part 7

[Overview]

MIT OpenCourseWare (MIT OCW) is a large-scale, web-based electronic publishing initiative launched by MIT (Massachusetts Institute of Technology) to promote the open dissemination of knowledge and information.

A pilot version of MIT OpenCourseWare (MIT OCW) went online in September 2002, with 32 course materials from MIT. In September 2003, the site was officially launched with several hundred course materials. In March 2004, 500 course materials were available in 33 different topics. In May 2006, 1,400 course materials were offered by 34 departments belonging to the five schools of MIT. In November 2007, all 1,800 course materials were available, with 200 new and updated courses per year. In November 2005, MIT launched the OpenCourseWare Consortium (OCW Consortium) as a collaboration of educational institutions creating a broad body of open educational content using a shared model. One year later, the OCW Consortium included the courses of 100 universities worldwide.

2004: PROJECT GUTENBERG EUROPE

[Overview]

In January 2004, Project Gutenberg spread across the Atlantic with the launch of Project Gutenberg Europe (PG Europe) and Distributed Proofreaders Europe (DP Europe) by Project Rastko, a non-governmental cultural and educational project located in Belgrade, Serbia. DP Europe uses the software of the original Distributed Proofreaders. DP Europe is a multilingual website, with its main pages translated into several European languages by volunteer translators. In April 2004, DP Europe was available in 12 languages. The long-term goal is 60 languages and 60 linguistic teams representing all European languages. DP Europe supports Unicode so that eBooks can be proofread in numerous languages.

Unicode is an encoding system that assigns a unique number to every character in any language. DP Europe finished processing its 100th book in May 2005 and its 500th book in October 2008. DP Europe operates under "life +50" copyright laws. When it gets up to speed, DP Europe will provide eBooks for several national and/or linguistic digital libraries.

[In Depth (published in 2005, updated in 2008)]

In 2004, multilingualism became one of the priorities of Project Gutenberg, along with internationalization. Michael Hart went off to Europe, with stops in Paris, Brussels and Belgrade. In Belgrade, he met with the team of Project Rastko, to support the creation of Distributed Proofreaders Europe (launched in December 2003) and Project Gutenberg Europe (launched in January 2004).

The launching of Distributed Proofreaders Europe (DP Europe) by Project Rastko was indeed a very important step. DP Europe uses the software of the original Distributed Proofreaders and is dedicated to the proofreading of books for Project Gutenberg Europe. Since the very beginning, DP Europe has been a multilingual website, with its main pages translated into several European languages by volunteer translators. DP Europe was available in 12 languages in April 2004 and 22 languages in May 2008.

The long-term goal is 60 languages and 60 linguistic teams representing all the European languages. When it gets up to speed, DP Europe will provide books for several national and/or linguistic digital libraries.

The goal is for every country to have its own digital library (within the limits of that country's copyright law), as part of a continental network (for France, the European network) and a global network (for the whole planet).

A few lines now on Project Rastko, which launched such a difficult and exciting project for Europe, and catalyzed volunteers' energy in both Eastern and Western Europe (and anywhere else: as the internet has no boundaries, there is no need to live in Europe to register). Founded in 1997, Project Rastko is a non-governmental cultural and educational project. One of its goals is the online publishing of Serbian culture.

It is part of the Balkans Cultural Network Initiative, a regional cultural network for the Balkan peninsula in south-eastern Europe.

In May 2005, Distributed Proofreaders Europe finished processing its 100th book. In June 2005, Project Gutenberg Europe was launched with these first 100 books. DP Europe supports Unicode so that books can be proofread in numerous languages. Created in 1991 and widely used since 1998, Unicode is an encoding system that assigns a unique number to every character in any language, unlike the much older ASCII, which was meant only for English and a few European languages.
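The idea of "a unique number for every character" can be illustrated with a short Python sketch. This is only an illustration, not part of the DP Europe software; the helper names code_points and is_ascii are hypothetical. It prints the Unicode number (code point) of each character and checks whether a text fits within the 128 characters of ASCII.

```python
def code_points(text):
    """Return (character, 'U+XXXX') pairs, one per character in text."""
    return [(ch, f"U+{ord(ch):04X}") for ch in text]


def is_ascii(text):
    """True if every character falls within the 128 ASCII code points."""
    return all(ord(ch) < 128 for ch in text)


if __name__ == "__main__":
    # The same name in Latin and in Serbian Cyrillic script.
    for sample in ("Rastko", "Растко"):
        print(sample, code_points(sample), "ASCII only:", is_ascii(sample))
```

The Latin spelling fits in ASCII, while the Cyrillic spelling does not, although each of its letters still has its own Unicode number.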

On August 3, 2005, 137 books were completed (processed through the site and posted to Project Gutenberg Europe), 418 books were in progress (processed through the site but not yet posted, because they were still going through their final proofreading and assembly), and 125 books were being proofread (currently being processed). On May 10, 2008, 496 books were completed, 653 books were in progress and 91 books were being proofread.

2004: GOOGLE BOOKS

[Overview]

In October 2004, Google launched the first part of Google Print as a project aimed at publishers, for internet users to be able to see excerpts from their books and order them online. In December 2004, Google launched the second part of Google Print as a project intended for libraries, to build up a world digital library by digitizing the collections of main partner libraries. The beta version of Google Print went live in May 2005. In August 2005, Google Print was stopped until further notice because of lawsuits filed by associations of authors and publishers for copyright infringement. The program resumed in August 2006 under the new name of Google Books. Google Books has offered books digitized in the participating libraries (Harvard, Stanford, Michigan, Oxford, California, Virginia, Wisconsin-Madison, Complutense of Madrid and New York Public Library), with either the full text for public domain books or excerpts for copyrighted books. The lawsuit with associations of authors and publishers was settled in October 2008.

[In Depth (published in 2008)]

In October 2004, Google launched the first part of Google Print as a project aimed at publishers, for users to be able to see snippets of their books and order them online. The beta version of Google Print went online in May 2005. In December 2004, Google launched the second part of Google Print as a project intended for libraries, to build up a digital library of 15 million books by scanning and digitizing the collections of major libraries, beginning with the Universities of Michigan (7 million books), Harvard, Stanford and Oxford, and the New York Public Library. The planned cost was an average of US $10 per book, and $150 to $200 million over ten years. In August 2005, Google Print was stopped until further notice because of lawsuits filed by publishers for copyright infringement. The program resumed in August 2006 under the new name of Google Books.

Google Books was launched in August 2006 to replace the controversial Google Print, stopped in August 2005 because of major copyright concerns. Google Books offers excerpts of books digitized by Google in the participating libraries (Harvard, Stanford, Michigan, Oxford, California, Virginia, Wisconsin-Madison, Complutense of Madrid and New York Public Library). Google scans 3,000 books a day, including copyrighted books. The inclusion of copyrighted books is widely criticized by authors and publishers worldwide. In the US, lawsuits were filed by the Authors Guild and the Association of American Publishers (AAP) for alleged copyright infringement. The assumption is that the full scanning and digitizing of copyrighted books infringes copyright laws, even if only snippets are made freely available on the search engine. To counteract copyright concerns and the problems of a closed platform, the Internet Archive launched the Open Content Alliance (OCA) with the goal of digitizing only public domain books and making them searchable and downloadable through any search engine.

2005: OPEN CONTENT ALLIANCE

[Overview]

The Open Content Alliance (OCA) was conceived by the Internet Archive in early 2005 to offer broad, public access to world culture. It was launched in October 2005 as a group of cultural, technology, nonprofit and governmental organizations willing to build a permanent archive of multilingual digitized text and multimedia content. The project aims to digitize public domain books around the world and make them searchable through any web search engine and downloadable for free. Unlike the Google Print project, the OCA scans and digitizes only public domain books, except when the copyright holder has expressly given permission. The first contributors to OCA were the University of California, the University of Toronto, the European Archive, the National Archives in the United Kingdom, O'Reilly Media and Prelinger Archives. The digitized collections are freely available in the Text Archive of the Internet Archive. In December 2006, they reached a milestone of 100,000 digitized books publicly available, with 12,000 new books added per month. Two years later, in December 2008, one million books were "posted under OCA principles or otherwise public domain hosted by the Internet Archive."

2006: MICROSOFT LIVE SEARCH BOOKS

[Overview]

Microsoft has also participated in the Open Content Alliance (OCA), launched by the Internet Archive in October 2005. In December 2006, Microsoft released the beta version of Live Search Books. The book search engine performs keyword searches for non-copyrighted books digitized by Microsoft from the collections of the British Library, University of California, and University of Toronto, followed in January 2007 by the New York Public Library and Cornell University.

Books offer full-text views and can be downloaded as PDF files. In the future, Microsoft intends to add copyrighted works with the permission of their publishers. In May 2007, Microsoft announced agreements with several major publishers, including Cambridge University Press and McGraw Hill. After digitizing 750,000 books and indexing 80 million journal articles, Microsoft ended the Live Search Books program in May 2008 and closed the website.

2006: FREE WORLDCAT

[Overview]

WorldCat was created in 1971 by the non-profit OCLC (Online Computer Library Center) as the union catalog of the university libraries in the State of Ohio. Over the years, OCLC became a national and worldwide library cooperative, and WorldCat the largest library catalog in the world. In 2005, WorldCat had 61 million bibliographic records in 400 languages from 9,000 member libraries (paid subscription) in 112 countries. In 2006, 73 million bibliographic records were linking to 1 billion documents available in these libraries. In August 2006, WorldCat began to migrate to the web through the beta version of the new website WorldCat.org. Member libraries now provide free access to their catalogs and electronic resources: books, audio books, abstracts and full-text articles, photos, music CDs and videos. Another pioneer site was RedLightGreen, launched in Spring 2004 (with a beta version in Fall 2003) as the web version of the RLG Union Catalog, another major union catalog created in 1980 by the Research Libraries Group (RLG).

RedLightGreen ended its service in November 2006, after a successful 3-year run, and RLG joined OCLC.

[In Depth (published in 1999)]

In 1998, two organizations - OCLC (Online Computer Library Center) and RLIN (Research Libraries Information Network) - were running international bibliographical databases through the internet.

The OCLC Online Computer Library Center is a non-profit, membership, library computer service and research organization dedicated to furthering access to the world's information and reducing information costs. More than 27,000 libraries in 65 countries were using OCLC services to manage their collections and to provide online reference services. The website was available in English, Chinese, French, German, Portuguese, and Spanish.

OCLC services included: access services; collections and technical services; reference services; resource sharing; Dewey Decimal Classification (published by OCLC Forest Press); and preservation resources. From its headquarters in Dublin, Ohio, OCLC operated one of the world's largest library information networks. Libraries in the US joined OCLC through their OCLC-affiliated regional networks. Libraries outside the US received OCLC services through OCLC Asia Pacific, OCLC Canada, OCLC Europe, OCLC Latin America and the Caribbean, or via international distributors.

OCLC was also running WorldCat - the name of the OCLC Online Union Catalog - which is a merged electronic catalog of library catalogs around the world, and the world's largest bibliographic database with its 38 million records (in early 1998) in 400 languages (with transliteration for non-Roman languages), and an annual increase of 2 million records.

WorldCat stemmed from a concept which is the same for all union catalogs: save time by avoiding the cataloging of the same document by many catalogers worldwide. When they are about to catalog a publication, the catalogers of the member libraries search the OCLC catalog. If they find the record, they copy it into their own catalog and add some local information. If they don't find the record, they create it in the OCLC catalog, and this new record is immediately available to all the catalogers of the member libraries worldwide.
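The cataloging workflow described above (search, then copy or create) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not OCLC's actual system: the UnionCatalog class, the catalog_publication function, the document identifiers and the record fields are all hypothetical, and real records would be in MARC format rather than plain dictionaries.

```python
class UnionCatalog:
    """A toy shared catalog holding one record per document (assumption)."""

    def __init__(self):
        self._records = {}

    def search(self, doc_id):
        """Return a copy of the shared record, or None if it does not exist."""
        record = self._records.get(doc_id)
        return dict(record) if record is not None else None

    def create(self, doc_id, record):
        """Add a new shared record; duplicates are rejected."""
        if doc_id in self._records:
            raise ValueError(f"record already exists for {doc_id}")
        self._records[doc_id] = dict(record)

    def __len__(self):
        return len(self._records)


def catalog_publication(union, local_catalog, doc_id, new_record, local_info):
    """Catalog one publication; return True if a new shared record was created."""
    shared = union.search(doc_id)
    created = shared is None
    if created:
        # Not found: create the record once, for every member library.
        union.create(doc_id, new_record)
        shared = union.search(doc_id)
    # Found (or just created): copy it locally and add local information.
    local_catalog[doc_id] = {**shared, **local_info}
    return created
```

In this sketch, the first library to catalog a document does the cataloging work, and every later library only copies the shared record and adds its own details, such as a shelf mark.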

Unlike RLIN, another major union catalog that accepts several records for the same document (please see below), the OCLC Online Union Catalog accepts only one record per document, and asks its members not to create duplicate records for documents that were already cataloged. The records are created in USMARC format (MARC: Machine-Readable Cataloging) according to the Anglo-American Cataloguing Rules, 2nd edition (AACR2).

What is the history of OCLC? "In 1967, the presidents of the colleges and universities in the state of Ohio founded the Ohio College Library Center (OCLC) to develop a computerized system in which the libraries of Ohio academic institutions could share resources and reduce costs.

OCLC's first offices were in the Main Library on the campus of the Ohio State University (OSU), and its first computer room was housed in the OSU Research Center. It was from these academic roots that Frederick G. Kilgour, OCLC's first president, oversaw the growth of OCLC from a regional computer system for 54 Ohio colleges into an international network. In 1977, the Ohio members of OCLC adopted changes in the governance structure that enabled libraries outside Ohio to become members and participate in the election of the Board of Trustees; the Ohio College Library Center became OCLC, Inc. In 1981, the legal name of the corporation became OCLC Online Computer Library Center, Inc.

Today, OCLC serves more than 27,000 libraries of all types in the US and 64 other countries and territories." (excerpt from the 1998 website)

In early 1998, WorldCat had 38 million records - with one record per document. RLIN (Research Libraries Information Network) had 88 million records - with several records per document.

RLIN was run by the Research Libraries Group (RLG). The central RLIN database was a union catalog of 88 million items held in major libraries belonging to RLG member institutions, including research and specialized libraries, like law, technical, and corporate libraries.