The New Frontier In Research And Library Science

Arthur Crivella

Co-Founder & CEO of Crivella West Incorporated

Presentation at the 9th International Conference on the Book, University of Toronto - October 15, 2011

Libraries have existed for at least 4,000 years, with few changes to the physical structure. With the invention of the printing press, hand-scribed books became obsolete: they were too expensive to produce and too heavy to carry around. Most significantly, thousands of authors were unleashed to create new works and millions of people had access to a previously unimaginable quantity of information, education, and entertainment.

About a hundred years ago, the number of libraries increased dramatically. As a result, many more people were able to learn and study. In modern times, professional librarians have made advances in the cataloging and archiving of books and documents, so that libraries everywhere could be utilized in a consistent manner. In addition, catalogues are now electronic and can be used from remote locations.

However, the concept of storing books and documents has remained the same. The primary skill of researchers has continued to be reading and thinking about what could be found in the books, particularly primary source materials. Digital texts literally change everything. The use of books and other research materials is no longer limited by the number of physical copies the library owns or constrained by a limited number of shelves in the library.

For the past 20 years, my partner Wayne West and I have been studying how knowledge and information are found and used. We have been interacting with scholars from a variety of fields and universities, including wonderful people at the University of Toronto and at the National Institute for Newman Studies in Pittsburgh. We have read hundreds of articles predicting the digital future, with such titles as “Education Needs a Digital-Age Upgrade”, “Libraries of the Future”, “The Role of the Library in 21st-Century Scholarship”, and “The Future of the Library in the Research University”.

Most of the ideas proffered thus far are still tied to the analog model of a traditional book.

Digital text is an untapped goldmine. It can be used traditionally for reading, or it can be manipulated and cultivated to create new content and textual metadata that open new worlds of discovery for researchers. We stand at a major inflection point in how books are published, marketed, distributed, read and studied. Today there is an explosion of new content that can be created through the application of new technologies to electronic text, metadata and multimedia. We call this new world of unconstrained content the “New Media.” This New Media is interactive and available from any internet connection worldwide, and because it is digital it can be quantified and measured, providing researchers and librarians with a wealth of new analytical capabilities that were never available before.

We call these new analytical capabilities “New Media Analytics”. Crivella West’s answer to leveraging these new analytical capabilities for research is called the Knowledge Kiosk. It is not a “library” in the traditional sense, because that word conjures up images of a 4,000-year-old analog model. It is not a “digital library”, as those words are generally used, because most electronic books and electronic collections are now created, used, stored and catalogued in ways that are similar to the treatment of analog books.

What researchers can do in the Knowledge Kiosk is as important as the quality of the content itself. Of course, having all the works of and about an author in one place is both efficient and crucial to avoid missing valuable information. However, if a researcher cannot isolate specific content of interest within vast collections of material, having everything in one place is of little value. For example, a search for “John Henry Cardinal Newman” in the online catalogues of three major universities and the Internet Archive found:

  • University 1 - 98 works;
  • University 2 - 605 works;
  • University 3 - 382 works;
  • Internet Archive - 253 works.

In comparison, the Knowledge Kiosk contains 1,560 different works authored by Cardinal Newman, plus some 5,000 works by other authors writing about Cardinal Newman or his major themes. This collection is organized into 27 overlapping topical sub-collections. This is an impressive collection, but unless you can isolate material specific to your own research goals you have gained little.

In the Knowledge Kiosk, analytics create “Special Collections” by merging all the information from:

  • Library Catalog Metadata
  • Electronic Text
  • Multimedia
  • Internet Resources
  • Linguistic Analysis

Using all of this information to gather relevant content in one place for researchers to study creates special collections with impact. These kinds of collections are now available to researchers in the Knowledge Kiosk. For example:

    What if you are researching an author’s ideas on education? Wouldn’t you want to know who influenced his thinking? What if you had a special collection on educational topics that contained the works and the specific content quoted or cited by your author?

    Wouldn’t you also want to know all the subsequent writers who quoted him and which of his theories influenced the next generation of great thinkers?
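As a rough illustration of the two questions above, the sketch below treats quotation and citation links as a small directed graph and looks up both directions: whom an author drew on, and who later drew on that author. The link data, the placeholder author labels and the function names are all hypothetical; they are not taken from the Knowledge Kiosk, and in practice the links would have to be extracted from the digitized text.

```python
from collections import defaultdict

# Hypothetical (citing author, cited author) pairs, invented for illustration.
CITATIONS = [
    ("Author B", "Author A"),
    ("Author C", "Author B"),
    ("Author D", "Author B"),
    ("Author B", "Author E"),
]

def build_indexes(citations):
    """Build two lookups: whom each author cites, and who cites each author."""
    cites = defaultdict(set)
    cited_by = defaultdict(set)
    for citing, cited in citations:
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    return cites, cited_by

def sources_of(author, cites):
    """Authors quoted or cited by `author` (possible influences on him)."""
    return sorted(cites.get(author, set()))

def followers_of(author, cited_by):
    """Later authors who quote or cite `author`."""
    return sorted(cited_by.get(author, set()))

cites, cited_by = build_indexes(CITATIONS)
print(sources_of("Author B", cites))
print(followers_of("Author B", cited_by))
```

A special collection of the kind described could then be assembled from the works behind either list, restricted to the passages that are actually quoted.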

Subsequent analytics controlled by the individual researcher can isolate major themes and find contextual similarities across the special collections. Simple tools, coupled with a scholar’s domain expertise, can perform complex tasks on large collections to home in on the content relevant to the specific inquiry. This process ensures that researchers have all the relevant content needed for in-depth reading and analysis. As researchers conduct in-depth reading, they are provided with more tools that help them record their thoughts while the system automatically builds an editable outline for their thesis. Everything they view, touch, edit, analyze and record is saved, enabling them to defend their process and results when others review their work product.

However, these new processes, methods and technologies that are available to the research community in the digital age require learning new skills. Modern researchers must learn how to use the technology efficiently for best results.

Searching is no longer restricted to simple word finds or metadata searches. Knowing how to search and what to search for is different in a digitized world because the usable content is not restricted to documents. Therefore, how researchers are educated and trained must change. Understanding and effectively using this new technology must be an integral part of the educational process. A recent study of student research habits at several Illinois libraries, conducted by the ERIAL Project, shows what educators have to deal with:

“The majority of students — of all levels — …tended to overuse Google and misuse scholarly databases. They preferred simple database searches to other methods of discovery …They were basically clueless about the logic underlying how the search engine organizes and displays its results. Consequently, the students did not know how to build a search that would return good sources...”

This poses a dilemma. First, we must educate students to think beyond “Google”, which organizes results by merchantability rather than relevance, and introduce them to new concepts in “Search” that embrace the world of new media. Next, we must provide them with effective tools that enable the learning and discovery process.

For example, if a scholar wanted to study a Cardinal Newman theme, the Crivella West Knowledge Kiosk contains about 6,600 documents written by Newman and twenty-two (22) other authors writing about Newman themes. The technology would create special collections for analysis and identify and prioritize specific relevant passages for review, none of which would have been found by using MARC record metadata. MARC records categorize at the level of the document, not the text. The end result would be that the scholar could focus on the most relevant materials, which are typically two to four percent (2-4%) of the total, and know where within the collected material the information pertaining to his specific area of interest was located.

The researcher could then further refine his results through the application of Crivella West’s Domain Expert Analytics, which enable researchers to do very complex searches quickly and easily. Most advanced library searching today is done using a few key words and standard Boolean logic. Crivella West arms researchers with Pre-Constructed Analytics that can be applied to their searches to help them refine their concepts and language markers. These analytics allow researchers to easily add terms to, or exclude terms from, their analysis, and the frequency and proximity results provide content for them to consider in refining that analysis. The metadata produced from the results of their analysis (frequency counts and proximity) helps detect and highlight language anomalies found in the result set, which opens doorways for further investigation.
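To make the idea of frequency and proximity results concrete, here is a minimal sketch of both measures over a single passage. It is an assumption-laden illustration and not Crivella West’s implementation: the tokenizer, the function names, the `window` parameter and the sample passage are all invented for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_frequencies(text, terms):
    """Count how often each term of interest occurs in the text."""
    counts = Counter(tokenize(text))
    return {term: counts[term] for term in terms}

def proximity_hits(text, term_a, term_b, window=5):
    """Return (position of term_a, position of term_b) pairs that lie
    within `window` tokens of each other."""
    tokens = tokenize(text)
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b]
    return [(i, j) for i in pos_a for j in pos_b if abs(i - j) <= window]

# Invented sample passage, not a quotation from any author.
passage = (
    "Faith and reason are not opposed. Reason examines what faith "
    "receives, and a university must give room to both."
)

print(term_frequencies(passage, ["faith", "reason", "university"]))
print(proximity_hits(passage, "faith", "reason", window=3))
```

Adding or excluding a term amounts to changing the `terms` list, and widening or narrowing `window` changes how strictly two terms must co-occur before a passage is flagged for review.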

Approximately one hundred (100) pre-established expert algorithms can find concepts or types of language, such as revelations, epiphanies, fears, beliefs, emotions, etc. Each algorithm typically contains hundreds or thousands of rules. Advanced users can use these algorithms to build customized inquiries easily and quickly, enabling researchers to look deeply into a collection.
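The internal form of these algorithms is not described here, so the following is only a toy sketch of what a rule-based concept detector might look like, assuming each rule is a regular expression. The concept names, the handful of patterns and the sample text are hypothetical; a real algorithm of the kind described would hold hundreds or thousands of rules.

```python
import re

# Hypothetical rule sets: each concept maps to a few regular-expression rules.
CONCEPT_RULES = {
    "fear": [r"\bafraid\b", r"\bfear(s|ed|ful)?\b", r"\bdread(s|ed)?\b"],
    "belief": [r"\bbeliev(e|es|ed)\b", r"\bconvinced\b", r"\bfaith\b"],
}

def compile_rules(rules):
    """Compile every pattern once, ignoring case."""
    return {
        concept: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
        for concept, patterns in rules.items()
    }

def tag_sentences(text, compiled):
    """Return (sentence, [concepts]) for each sentence matching any rule."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tagged = []
    for sentence in sentences:
        concepts = sorted(
            concept
            for concept, patterns in compiled.items()
            if any(p.search(sentence) for p in patterns)
        )
        if concepts:
            tagged.append((sentence, concepts))
    return tagged

# Invented sample text for illustration only.
sample = (
    "I was afraid the lecture would fail. "
    "I believed the argument was sound. The hall was full."
)

for sentence, concepts in tag_sentences(sample, compile_rules(CONCEPT_RULES)):
    print(concepts, "->", sentence)
```

A customized inquiry could be built by combining concepts, for example keeping only the sentences tagged with both "fear" and "belief".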

A researcher can explore the evolution of a concept, its origins and its changes over time; can find antecedents to an author’s concept in the work of another author; can compare an author to other authors; and can compare different editions or different books by the same author in order to uncover the evolution of opinions and language.
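Comparing editions is the easiest of these tasks to illustrate. The sketch below uses Python’s standard `difflib` module to list word-level changes between two versions of a sentence. It is a generic technique offered under stated assumptions, not the Knowledge Kiosk’s method, and the function name and both sample sentences are invented.

```python
import difflib

def edition_changes(old_text, new_text):
    """List word-level differences between two versions of a passage as
    (operation, old words, new words) tuples."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replaced, inserted or deleted spans
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes

# Invented sentences standing in for an earlier and a later edition.
first_edition = "The library is a place for storing printed books"
later_edition = "The library is a place for storing and studying digital books"

for operation, old, new in edition_changes(first_edition, later_edition):
    print(operation, repr(old), "->", repr(new))
```

Run over whole chapters rather than single sentences, the same comparison would point a scholar to the places where an author’s wording shifted between editions.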

The Crivella West Knowledge Kiosk facilitates collaboration. Notes can be written and linked to sections of the text. Other researchers can respond, thereby building on each other’s thoughts and insights. Future researchers interested in similar concepts can build on the work of previous scholars and establish their own intellectual legacy for future generations. Professors guiding students in their research can peer into work in progress to evaluate it and advise or direct next steps. Entire classrooms or departments can work together on research or interdisciplinary projects.

In summary, the key to the new frontier is not just content or where it is stored, but how it is used and can be used. A digital library should not be viewed or used as a traditional library. The technologies, processes and methodologies that can capitalize on the new media are available today for scholars and librarians.

In collaboration with the University of Toronto, Crivella West populated the Knowledge Kiosk with over twenty thousand (20,000) books and other materials relating to all aspects of Catholic theology, culture, history and literature. Anyone doing research related to any aspect of Catholicism can visit the Knowledge Kiosk. Researchers can also make additions of Catholic-related materials to the Knowledge Kiosk, dynamically enhancing the value of the Kiosk for everyone.