General Catalog and Time Schedule archives

Two-sentence Description

The UW’s history extends back to the 1890s, and so do its General Catalogs and Time Schedules. Working with an intern, I brought those documents online, with no budget!

Background

Each biennium, the UW produces a General Catalog, and each academic quarter it produces a Time Schedule. The GenCat, as it’s affectionately known, contains program descriptions, University academic policies, faculty lists, and many other sorts of information. Time Schedules (TSs) list each class offered in a given quarter, along with its meeting times, location, and instructor. Together, they form a significant portion of the historical record of the University.

Like many other universities, the UW stopped printing these documents to save money and respond to students’ preferences; online versions took their place. And while more recent printed versions could be converted into PDFs, more than a century’s worth of GenCats and TSs had no electronic version and were therefore “invisible” unless you were at the University archives or in the Registrar’s office, which had a near-complete set. As the history of the state’s largest and oldest public institution, this information deserved to be transferred online for all to review, and to be safeguarded against loss should there be a major natural disaster. But how to get it there? With slashed budgets and little manpower, the task was daunting.

Skills and Technologies

With a bit of scraped-together cash to hire an intern, I worked with the University Registrar to craft a job description for an “archivist intern.” After interviewing many applicants, we picked a talented graduate student whose work I oversaw. Our combined recommendation: leverage existing campus resources (the Digital Initiatives group within the UW Libraries) to disbind the GenCats and TSs, scan them, run optical character recognition (OCR) on the scans, index the results, and create PDF files. This work, performed over three academic quarters, yielded two archive pages: one for General Catalogs and another for Time Schedules. (I crafted some slightly more whimsical CSS to present the TSs, which due to their quarterly nature are often presented in these colors.)

The collaboration with the Libraries allowed them to include the scanned material in their own content-management system, ContentDM. While this was a boon to the University, the software left a lot to be desired in terms of actual use. We wanted a simpler way to read and search the materials, so we created our own archive of searchable PDF files. PDF is a ubiquitous, familiar format that better suited our needs.

This work demanded attention to detail from both the intern and me. We tested various scan resolutions to find a good balance between image quality (which the OCR software depends on) and file size; the finished files range from 8 to 20 megabytes.
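To illustrate why resolution mattered so much, here is a minimal sketch of how quickly raw scan size grows with resolution. The page dimensions (8.5 by 11 inches), the 8-bit grayscale depth, and the function name are assumptions chosen for illustration, not the settings we actually used, and the final PDFs were far smaller than these figures because of compression.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel=8):
    """Uncompressed size of one scanned page, in bytes.

    All parameters are illustrative assumptions, not the project's settings.
    """
    width_px = round(width_in * dpi)    # pixels across the page
    height_px = round(height_in * dpi)  # pixels down the page
    return width_px * height_px * bits_per_pixel // 8


# Compare a few common scanning resolutions for a letter-size page.
for dpi in (150, 300, 600):
    megabytes = raw_page_bytes(8.5, 11, dpi) / 1_000_000
    print(f"{dpi} dpi: {megabytes:.1f} MB per page before compression")
```

Doubling the resolution quadruples the number of pixels, so a setting that is only slightly sharper than the OCR software needs can make a multi-hundred-page catalog much larger than it has to be.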

Takeaways

Although this project required almost no coding, I consider it one of the most important projects I’ve worked on at the UW. Preserving the history of the University is paramount, and I was able to facilitate a solution. The project showcased a truth about projects in general: collaboration with partners can make things happen when resources like money are scarce. It also demonstrated the importance of finding the right person for a job. The intern we chose was exceptional in her knowledge of archiving and information science, and the project would have been much less successful with the wrong person performing the work.