Coptic SCRIPTORIUM (Sahidic Corpus Research: Internet Platform for Interdisciplinary multilayer Methods) is a collaborative, digital project created by Caroline T. Schroeder (University of the Pacific) and Amir Zeldes (Humboldt University, Berlin). The team is constantly growing.
Coptic SCRIPTORIUM provides a platform for interdisciplinary and computational research on texts in the Coptic language, particularly the Sahidic dialect. As an open-source, open-access initiative, the SCRIPTORIUM technologies and corpus function as a collaborative environment for digital research by any scholar working in Coptic. It provides:
We hope SCRIPTORIUM will serve as a model for future digital humanities projects utilizing historical corpora or corpora in languages outside of the Indo-European and Semitic language families.
The latest release notes and news about the project are on C. Schroeder's blog. A video introduction to the project, including how to use ANNIS, is also available.
Please read our Frequently Asked Questions for more information on the project, methodologies, and terminology.
We hosted a workshop on digital research and scholarship in Coptic at Humboldt University on May 14, 2013. The program and presentations are available.
The corpora below offer some examples of mark-up for diplomatic transcription and normalization. Most data is available in TEI XML, PAULA XML and relANNIS for use with the ANNIS corpus search software. Links are provided to search the corpus online in ANNIS. Individual documents can also be viewed in HTML for reading purposes in either diplomatic or normalized transcriptions with English translations. [For more information on TEI, PAULA, and ANNIS, check out our FAQ.]
All corpus data generated by the SCRIPTORIUM project is licensed under the Creative Commons Attribution 3.0 Unported License unless otherwise indicated.
The search and visualization tool ANNIS is the most powerful way to use the texts for research purposes. We've provided sample queries below to demonstrate the kinds of searches you can construct. ANNIS queries use either regular expressions or the ANNIS query language. If you are familiar with ANNIS or regular expressions, jump right in. If not, you may wish to try some of the sample queries and then substitute terms or search parameters to adapt them to your needs and learn the system. After clicking on the magnifying glass next to a sample query, you will be taken to a new page with the ANNIS query and results. The query appears in the box on the upper left, the corpus or corpora you are searching are selected on the lower left, and your search results appear in the panel on the right.
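For readers new to regular expressions, the short Python sketch below illustrates the general idea of pattern matching over word forms. It is not ANNIS itself and does not use any ANNIS interface. The word list, the function name match_forms, and the choice to require a match over the whole form are assumptions made for illustration only. An equivalent search in ANNIS would be written in its own query syntax, for example something along the lines of norm=/ⲣ.*/, where the annotation name norm is likewise an assumption about how a corpus labels its normalized forms.

```python
import re

# Illustrative word forms only; this list is an assumption, not corpus data.
SAMPLE_FORMS = ["ⲛⲟⲩⲧⲉ", "ⲣⲱⲙⲉ", "ⲣⲙⲛⲕⲏⲙⲉ", "ⲥⲱⲧⲙ", "ⲣⲁⲛ"]


def match_forms(pattern, forms):
    """Return the forms that the pattern matches in full.

    re.fullmatch is used so that the pattern must cover the entire
    form, not just part of it (an assumption chosen for this sketch).
    """
    compiled = re.compile(pattern)
    return [form for form in forms if compiled.fullmatch(form)]


# Forms that begin with the letter ⲣ, followed by anything.
starts_with_r = match_forms("ⲣ.*", SAMPLE_FORMS)

# The single letter on its own matches no complete form in the list.
only_r = match_forms("ⲣ", SAMPLE_FORMS)

print(starts_with_r)
print(only_r)
```

Changing the pattern (for instance, ".*ⲉ" for forms ending in ⲉ) is a low-risk way to get a feel for regular expressions before adapting the sample queries.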
Note: This corpus is derived from the Sahidica New Testament, which was released by Warren Wells and made available for free electronic distribution for academic use only. It is not licensed CC-BY; see the Sahidica licensing information.
Some of the tools below use a Sahidic Coptic lexicon based on data kindly provided by Prof. Tito Orlandi and the CMCL project. When using the part-of-speech tagging models or the tokenization script and its lexicon, please make sure to refer back to the CMCL project.
The project is supported by the National Endowment for the Humanities Office of Digital Humanities and Division of Preservation and Access, the University of the Pacific, and Humboldt University.
Page last updated 7 July 2014