Transcription & Text Encoding

The tools below are listed alphabetically and represent a selection of the resources available to digital humanists. Many of these tools are actively updated, so please contact the DH@Bucknell Web Team if you find outdated information or would like to suggest additional tools or software.

Bucknell University holds site licenses for a number of these tools and provides faculty, staff, and students with access and support; those tools have “BU access” listed under Pricing.


ABBYY FineReader

ABBYY FineReader is Optical Character Recognition (OCR) software that converts paper documents, scans, and PDFs to Word, Excel, searchable PDF, and other formats.

Details

Website: https://www.abbyy.com/en-us/finereader/
Open Source Software (OSS) or Proprietary? Proprietary
Pricing: Pricing tiers


Altova XMLSpy

Altova XMLSpy provides an XML editor, a graphical XML Schema editor, and SmartFix validation with automatic error correction, among other features.

Details

Website: https://www.altova.com/download-xml-editor-b
Open Source Software (OSS) or Proprietary? Proprietary
Pricing: Pricing tiers


Annotation Studio

Annotation Studio is a suite of collaborative web-based annotation tools currently under development at MIT. The most significant difference between Annotation Studio and other digital annotation projects is its emphasis on student-centered design and pedagogy. Most other annotation tools assume user familiarity with TEI and a well-developed understanding of the relationships between literary sources, manuscripts, editions, and adaptations. Annotation Studio makes sophisticated yet easy-to-use commenting tools immediately accessible to students who have no prior experience with close textual analysis or TEI.

Details

Website: https://www.annotationstudio.org/
Open Source Software (OSS) or Proprietary? OSS
Pricing: Free


FromThePage

FromThePage is software for transcribing documents and collaborating on transcriptions with others. Use FromThePage for everything from simple, plain-text transcription projects to bilingual digital scholarly editions with annotations. You can collaborate with others and index a document by tagging people, places, and other subjects of interest. FromThePage enables you to import documents to transcribe from any system that supports the International Image Interoperability Framework (IIIF) and share your transcribed documents with IIIF systems.

Details

Website: https://fromthepage.com/
Open Source Software (OSS) or Proprietary? Proprietary
Pricing: Pricing tiers


Juxta Editions

Juxta Editions is a professional editing suite for the creation of digital scholarly editions. It provides assistance during the entire process of preparing a digital edition, from transcribing texts to editing and annotating them, to publishing online.

Details

Website: http://www.juxtaeditions.com/
Open Source Software (OSS) or Proprietary? Proprietary
Pricing: Free (limited) account; pricing tiers


Optical Character Recognition (OCR) for Phones and Hand-Held Devices

There are competing Optical Character Recognition (OCR) apps for both Android and iOS systems. These apps enable scholars and other individuals to scan a document in special collections directly to their device, automatically converting the image to searchable text.

Details

Website: For Android, see https://techwiser.com/5-best-ocr-app-for-android/ and for iOS, see https://mashtips.com/ocr-scanner-ios-apps/
Open Source Software (OSS) or Proprietary? Varies
Pricing: Varies


oTranscribe

oTranscribe is a free web app that assists with the transcription of recorded interviews.

Details

Website: https://otranscribe.com/
Open Source Software (OSS) or Proprietary? OSS
Pricing: Free


Oxygen XML Editor

Oxygen XML Editor provides a comprehensive suite of XML authoring and development tools. It is designed to accommodate a wide range of users, from beginners to XML experts. It runs on all major operating systems and is available as a standalone application or as an Eclipse plug-in.

Details

Website: https://www.oxygenxml.com/
Open Source Software (OSS) or Proprietary? Proprietary
Pricing: Pricing tiers


Text Encoding Initiative (TEI)

The Text Encoding Initiative (TEI) is a consortium which collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines which specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics. Since 1994, the TEI Guidelines have been widely used by libraries, museums, publishers, and individual scholars to present texts for online research, teaching, and preservation.

Details

Website: https://tei-c.org/
Open Source Software (OSS) or Proprietary? N/A
Pricing: Free tools and documentation online; membership available for a subscription fee
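
To give a sense of what TEI encoding looks like in practice, the following is a minimal sketch in Python, using only the standard library, that reads a small TEI-style document and lists the people and places tagged in it. The sample letter, its title, and the names in it are invented for illustration, and the helper function extract_tagged_names is hypothetical; real TEI documents are usually much richer and are validated against a TEI schema.

```python
import xml.etree.ElementTree as ET

# The TEI namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# An invented sample document: a header with a title, and a body with
# one paragraph in which a person and a place have been tagged.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Letter</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>I arrived in <placeName>Lewisburg</placeName> on Tuesday
         and met <persName>Mary Smith</persName>.</p>
    </body>
  </text>
</TEI>"""


def extract_tagged_names(tei_xml):
    """Return the title plus every tagged person and place (hypothetical helper)."""
    root = ET.fromstring(tei_xml)
    title = root.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)
    people = [el.text for el in root.iterfind(".//tei:persName", TEI_NS)]
    places = [el.text for el in root.iterfind(".//tei:placeName", TEI_NS)]
    return {"title": title, "people": people, "places": places}


if __name__ == "__main__":
    print(extract_tagged_names(SAMPLE_TEI))
```

Because the names are marked up as elements and not left as plain text, a script like this can index them without guessing where a name begins or ends, which is the kind of machine-readability the Guidelines aim for.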


Transkribus

Transkribus is a tool that helps scholars and archivists transcribe handwritten and printed documents into a machine-readable format. Transkribus offers automated tools, such as Handwritten Text Recognition (HTR) and Layout Analysis, and allows you to transcribe text in any language and with any character set (you can load and use your own virtual keyboard). Additionally, if you have at least 100 images in any given handwritten script, Transkribus will train an HTR engine from the Computational Intelligence Technology Lab (CITlab) of the University of Rostock on your documents, thereby enabling you to transcribe further pages of your documents with the support of automatically produced text.

Details

Website: https://transkribus.eu/Transkribus/
Open Source Software (OSS) or Proprietary? OSS
Pricing: Free


VideoAnt

VideoAnt is a web-based video annotation tool for mobile and desktop devices. Use VideoAnt to add annotations, or comments, to web-hosted videos. 

Details

Website: https://ant.umn.edu/
Open Source Software (OSS) or Proprietary? Proprietary
Pricing: Pricing tiers