Digital Papyrology

Thursday, December 1, 2011

papyri.info updates

This just posted by sosin to the papylist:

Colleagues:

I write with some IDP updates.

(1) By now you will have noticed that we have rolled out the new search interface. This is an entirely new way of doing search and will take some getting used to.

A crude test scenario:

Go to http://papyri.info/search . The search screen is divided into two parts. Search results appear on the right and search filters appear on the left.

The primary difference between the old and the new is that rather than coming up with a single search query that aims to give you exactly what you want, it is now possible to begin with a more open-ended search and successively narrow and expand it until you reach a desired end.

So, let's say your class is interested in literacy. Run an initial search for #αγραμματ: enter #αγραμματ in the search box or select "Convert from betacode as you type" and enter #agrammat (by the way, this is a good option if you want to run searches from your smartphone, which probably does not have Unicode Greek); click 'Search'. This will give you words beginning with αγραμματ (# indicates a boundary; you will not find διαγραμματ).
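For readers curious about what "Convert from betacode as you type" amounts to, here is a minimal sketch in Python. It is not the code papyri.info runs; it assumes only the standard Beta Code letter equivalences for unaccented lowercase input, and the function name and the final-sigma rule are illustrative choices.

```python
import re

# Standard Beta Code letter equivalences (lowercase, no diacritics).
BETA_TO_GREEK = dict(zip("abgdezhqiklmncoprstufxyw",
                         "αβγδεζηθικλμνξοπρστυφχψω"))

def beta_to_greek(text):
    """Convert unaccented Beta Code to Greek letters (hypothetical helper).

    Characters outside the table, such as '#' and spaces, pass through.
    """
    greek = "".join(BETA_TO_GREEK.get(ch, ch) for ch in text.lower())
    # Assumption: a sigma at the end of a word is written as final sigma.
    return re.sub(r"σ\b", "ς", greek)

print(beta_to_greek("#agrammat"))
print(beta_to_greek("uper auths"))
```

Accents, breathings, and capitals, which Beta Code writes with extra symbols, are deliberately left out of this sketch.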

You get 446 hits--too many to show your class. You select (toward the bottom) "Show only records with images from:" / "Papyri.info" and click 'Go'. This will narrow your found set down to the 16 texts that contain a word beginning with αγραμματ and are known (via APIS) to Papyri.info. But there must be more: HGV knows about a great many links to other sites (external to papyri.info). So, you select "Other sites" and 'Go'. Now you have 221 records for which some digital image is known, whether via APIS or HGV. Add "Print publications" and you will see that papyri.info knows of a total of 337 texts that have *some* image associated with them, whether digital or print.

Your class is working on texts from the first 2 centuries CE. So, you set "Date on or after" at "1 CE" and "Date on or before" to "200 CE" and click Go. That's 45 hits. The classroom where you are teaching has digital projection, but no access to books. So, you *remove* the "Print images exist" filter by clicking it away from the top of the right side of your screen. That's 28 hits; you can actually look at several of these in class.

A student asks how many of these were written on behalf of a man or a woman. Well, it wouldn't be a foolproof test, but you can now add another Greek search as an additional filter. So, enter "υπερ αυτης" (as "Word/Phrase search", which assumes that the strings that you enter are complete words) or "υπερ# #αυτης" (as "Substring search", which assumes that the strings you enter are fragments, unless you indicate a boundary with #). Now we have 9 hits.
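The difference between the two search modes can be modelled with regular expressions. The sketch below is only an illustration of the semantics described above (complete words versus fragments with # as a boundary marker); it says nothing about how the papyri.info search engine is actually implemented, and both function names are hypothetical.

```python
import re

def word_phrase_pattern(phrase):
    """'Word/Phrase search': every string entered is a complete word."""
    words = (r"\b" + re.escape(w) + r"\b" for w in phrase.split())
    return re.compile(r"\s+".join(words))

def substring_pattern(term):
    """'Substring search': fragments match anywhere; '#' marks a boundary."""
    pieces = (re.escape(p) for p in term.split("#"))
    return re.compile(r"\b".join(pieces))

line = "εγραψα υπερ αυτης αγραμματου"
print(bool(word_phrase_pattern("υπερ αυτης").search(line)))
print(bool(substring_pattern("υπερ# #αυτης").search(line)))
print(bool(substring_pattern("#αγραμματ").search("διαγραμματα")))
```

In this model the third search fails because no word boundary precedes αγραμματ inside διαγραμματα, which is the behaviour the # marker is meant to give.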

Not all of the students in the class have strong Greek; so, you also set "Translation language" to "English" and you are down to 3 hits. Note: these filters are defined by what is in the system, so that if 5 of these texts had been translated into, say, Italian, you would have found that as one of your filter options. At the moment, most of the translations known to papyri.info come from APIS and so are in English.

You may remove any of these filters by deselecting them (click the X next to each at the top of the window). Remember to remove all such queries before starting a completely *new* search (or click on the “Reset all” button).

This is a somewhat silly example, but it should give you the basic idea of how this new approach works and how it differs from the old. It takes some getting used to, but once all of the functionality is in place you will find that it is a much more powerful and flexible manner of searching.

Also, if you would like to track our progress toward implementing more complicated search functionality, you may feel free to look at the following Google Doc:
https://docs.google.com/spreadsheet/ccc?key=0Ajkz6D9lOd20dGtwczJSUXVhWUx6ZEJCUEhvR1lIYVE&hl=en_US#gid=0
It is meant for our own internal use and so may not make perfect sense to you, but you should probably get the idea. Y = we expect this to work now. N = we expect this not to work now. dev = our test / development server. prodo = the production version of papyri.info, which you use. If your favorite search pattern is not there, and if you cannot achieve the same result with a combination of filters, we encourage you, please, to *send us an email*. If you don’t report an error or omission, there is a good chance we won’t know to correct it!

(2) You will have noted also that tens of thousands of ‘original’ readings now have diacriticals. So, where the scribe wrote αναδεχομε, we put the normalized “ἀναδέχομαι” in the text and now print “ἀναδέχομε papyrus” in the apparatus. Before the year is out we shall invert this practice, printing “ἀναδέχομε” in the text and “read ἀναδέχομαι” in the apparatus. This first step has been a bigger task than we had imagined, and we owe a great deal to the brilliant and devoted work of Faith Lawrence at the Department of Digital Humanities, King’s College London. A few thousand words still lack accents; please feel free to add such and submit them to us. Some small number of errors have been introduced along the way (outright mistakes, which will be clearly apparent as nonsense); we are hacking away at those, fixing as many of them as we can; but if you notice any please feel free to correct and submit, or else (if it seems to be a systematic kind of error) please just alert us by email.

(3) Note also that the process of adding accents and reversing the display-regime of regularizations was more difficult in the case of regularized words that wrapped from the end of one line to the beginning of the next. Many thousands of these we handled automatically; about 6000 or so still need to be fixed. We think we are in position to have 80-90% fixed in the next couple weeks. In the meantime, all lines affected by this are flagged with a bright red line number--you may have noticed this already. In all such lines you will see that chunks of words appear oddly at the end or beginning of an affected line; the apparatus entries for all such will look fine.

(4) The line-by-line commentary feature has a bug, which will create a dozen or so clones of some commentary entries...this is very frustrating. It should be fixed before the new year. Just don’t enter any commentary in the next few weeks!

(5) Frequent Papyrological Editor users will notice that we have radically overhauled our apparatus criticus capabilities.

* for BL corrections: <:αἱ τοῦ=BL 9.17|ed|Θίτου:>
* for corrections proposed in journals/books: <:(διαγρ(άφου))=N. Gonis, ZPE 143 (2003) 150|ed|(διαγρ(αφῆς)):>
* for corrections proposed direct to DDbDP: <:τοῦ=PN G. Claytor (CPR VI plate 35)|ed|:>
* we now support multiple regularizations: <:ἀνοίγεται (?)|ἀνοίεται (?)||reg||ἀ̣νύεται:>
* we now support ‘regularizations’ by language (useful in multi-lingual texts for example): <:ἄρακος=grc|reg|ⲁⲣⲁⲕ:>
* we now support combinations of all of the above, as the following *fictional* example illustrates:

275a. <:στρ[ατηγὸς]=BL 15.2||ed||

στρ[ατηλάτης]=J. Cowey, ZPE 150 (2020) 321-323|

στρ[ατιώτης]=R. Ast, CdE 100 (2018) 13-15 (BL 14.5)|

Συρ[ίων]=Original Edition:>

* And we support extremely complicated combinations (including nesting of virtually every type of apparatus tag), as the following *fictional* example illustrates:

275. <:<:στρ[ατηγὸς]|subst|<:σ.2[.?]|alt|γ.3[.?]:>:>=BL 19.2||ed||
<:<:στρ[ατηλάτης]|reg|ξ̣τ̣ρ[ατηλάτης]:>|alt|.1γρ[.?]:>=J. Cowey, ZPE 200 (2020) 321-323|
<:<:στρ[ατιώτης]|alt|στρ[ατηγία]:>|reg|στυ̣ρ[ατ][.?]:>=R. Ast, CdE 100 (2018) 13-15 (BL 14.5)|
<:Συρ[ίων](?)|reg|<:<:Σο̣υ̣ρ[ίων]||alt||Συ̣υ̣ρ[ίων]|Σω̣υ̣ρ[ίων]:>|subst|Σ.2ρ[ίων]:>:>=Original Edition:>

This means:
(i) at line 275 the DDbDP prints στρ[ατηγὸς], which the scribe himself corrected from either "σ . . [ca.?]" or "γ . . . [ca.?]", and which is recorded in BL vol.19 p.2

(ii) previously, Cowey had argued (in ZPE 200) for correcting the text to either στρ[ατηλάτης], which is a modern regularization of ξ̣τ̣ρ[ατηλάτης], or to ". γρ[ca.?]"

(iii) before Cowey, Ast had suggested (in CdÉ 100) that the papyrus reads στυ̣ρ[ατ- ca.?], which should be regularized either to στρ[ατιώτης] or to στρ[ατηγία]; this was subsequently picked up by BL 14.5

(iv) The original editors of the papyrus thought that the scribe had originally written "Σ . . ρ[ίων]", and then corrected it to either Σο̣υ̣ρ[ίων] or Συ̣υ̣ρ[ίων] or Σω̣υ̣ρ[ίων], any one of which should perhaps be regularized to Συρ[ίων]

The PN will display στρ[ατηγὸς] in the text and in the app: 275. corr. ex σ ̣ ̣[ -ca.?- ] (or γ ̣ ̣ ̣[ -ca.?- ]) BL 19.2 : ξ̣τ̣ρ[ατηλάτης] (l. στρ[ατηλάτης]) (or ̣γρ[ -ca.?- ]) J. Cowey, ZPE 200 (2020) 321-323 : στυ̣ρ[ατ -ca.?- ] (l. στρ[ατιώτης (or στρ[ατηγία])) R. Ast, CdE 100 (2018) 13-15 (BL 14.5) : Σο̣υ̣ρ[ίων] (or Συ̣υ̣ρ[ίων] or Σω̣υ̣ρ[ίων]) (corr. ex Σ ̣ ̣ρ[ίων]) (l. Συρ[ίων]) Original Edition. This is not *precisely* what you expect to find in a print publication, but it is full, clear, and quite unambiguous.

This is, we think, a huge achievement and a great good; we have Gabby Bodard at King’s College and Jon Fox at the University of Kentucky to thank!

The Leiden+ Documentation page should now reflect most/all of these improvements: (http://papyri.info/editor/documentation?docotype=text); please let us know if we have missed something. Also, please note that the Apparatus “Helpers” (for BL, Editorial, and SoSOL) are now buggy as a result of the changes but should be fixed before the new year. In the meantime, enter apparatus entries by hand.
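For anyone who wants to process such entries outside the editor, the simple, single-level editorial corrections shown in the first three bullets under (5) can be picked apart with a short script. This is a rough sketch inferred only from those examples; it is not the Papyrological Editor's parser, the function name is hypothetical, and it deliberately refuses nested or combined entries, which need a real grammar.

```python
import re

# Shape inferred from the examples: <:reading=source|ed|original:>
FLAT_ED = re.compile(r"<:(.*?)=(.*?)\|ed\|(.*):>")

def parse_flat_correction(tag):
    """Split a flat editorial correction into (reading, source, original).

    Returns None for nested tags, for the double-bar combined form,
    and for anything else that does not fit the simple shape.
    """
    if tag.count("<:") != 1 or "||" in tag:
        return None
    match = FLAT_ED.fullmatch(tag)
    return match.groups() if match else None

print(parse_flat_correction("<:αἱ τοῦ=BL 9.17|ed|Θίτου:>"))
print(parse_flat_correction("<:τοῦ=PN G. Claytor (CPR VI plate 35)|ed|:>"))
```

An empty third element, as in the second call, corresponds to an entry that records no previous reading.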

(6) I shall also mention just briefly that thanks to the extraordinary generosity and collegiality of our colleagues in Brussels, Alain Martin, Paul Heilporn, and Alain Delattre, we have begun a process of surfacing Bibliographie Papyrologique data via the PN. Our Heidelberg colleagues James Cowey and Carmen Lanz did an amazing amount of work to make this happen. You will see that there are still many bugs in the conversion of the BP records to structured XML, but we are getting there one step at a time. From the navigation bar at the top of the PN select Search / Bibliography. This will take you to a very *simple* search screen, where you may enter, for example, “Hombert” (no quotation marks) (http://papyri.info/bibliosearch?q=Hombert); or you may constrain the search by BP fields. So, if you enter “author:Hombert; title:bibliographie; date:1932” (no quotation marks) you will get 2 records (http://papyri.info/bibliosearch?q=author%3AHombert%3B+title%3Abibliographie%3B+date%3A1932+). This works also for BP subject codes; so, for example, “index:146” will find 1106 records concerning Archives. The FileMaker version of the BP remains much more flexible, powerful, and manipulable, for those users who are so inclined. In the next few weeks we shall be able to link from BP records to DDbDP texts and from DDbDP texts to BP records. This is only a start; we expect this service to improve dramatically over the coming months, soon with the ability to add/correct records, and submit them for review by the BP editorial team.
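If you would rather build such bibliography links in a script than type them, the following Python sketch reproduces the two URL shapes quoted above. The helper name and its arguments are made up for illustration; only the base address and the "field:value; field:value" query form come from the examples in this message.

```python
from urllib.parse import urlencode

BIBLIO_SEARCH = "http://papyri.info/bibliosearch"

def biblio_url(free_text=None, **fields):
    """Build a bibliography search URL (hypothetical helper).

    Pass either a free-text term or BP fields such as author, title, date.
    """
    if free_text:
        query = free_text
    else:
        query = "; ".join(f"{name}:{value}" for name, value in fields.items())
    # urlencode percent-encodes ':' and ';' and turns spaces into '+'.
    return f"{BIBLIO_SEARCH}?{urlencode({'q': query})}"

print(biblio_url("Hombert"))
print(biblio_url(author="Hombert", title="bibliographie", date=1932))
```

The second URL matches the one quoted above except for its trailing "+", which is just an encoded space at the end of the typed query.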

This message is already far too long. So, enough for now.

As always, please feel free to send questions, comments, complaints to Josh Sosin jds15@duke.edu, Rodney Ast ast@uni-heidelberg.de, and James Cowey james.cowey@urz.uni-heidelberg.de; likewise, with regard to PN performance, to Hugh Cayless.

All best,
Rodney Ast
James Cowey
Josh Sosin

Wednesday, November 16, 2011

stable identifiers

Trismegistos now has stable identifiers for the following items:

- Texts: e.g. www.trismegistos.org/text/4563
- Archives: e.g. www.trismegistos.org/archive/364
- Collections: e.g. www.trismegistos.org/collection/234
- Places: e.g. www.trismegistos.org/place/264
- Names: e.g. www.trismegistos.org/name/1
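Since the identifiers all follow one pattern, they are easy to generate in a script. A minimal sketch, assuming only the five patterns listed above (the function name is hypothetical):

```python
TM_HOST = "www.trismegistos.org"
TM_KINDS = {"text", "archive", "collection", "place", "name"}

def tm_identifier(kind, number):
    """Return the stable identifier for a Trismegistos item."""
    if kind not in TM_KINDS:
        raise ValueError(f"no stable identifier pattern known for {kind!r}")
    return f"{TM_HOST}/{kind}/{int(number)}"

print(tm_identifier("text", 4563))
print(tm_identifier("place", 264))
```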

We are still developing TM People and a stable identifier for individuals will be implemented.

For TM,

Mark

Sunday, September 25, 2011

Trismegistos People

Just posted to the PAPY-list:

Dear colleagues,

News travels fast in these days of Facebook and Twitter ...
Although we had hoped to develop the system for proposing corrections and adding names before sending this email, it seems better to announce the launch of a beta version of Trismegistos People now.

Trismegistos People consists of a complex set of prosopographical and onomastic databases, listing personal names of non-royal individuals in Trismegistos Texts (currently some 458,000 attestations).
It is very much a work in progress and needs to be perfected in many ways, as users will notice. For some of these we will develop, as said above, a system enabling you to help us.
In the meantime, I hope the tool proves useful and you enjoy its search facilities, limited as they currently are.

Please check out the website at http://www.trismegistos.org/ref/about.php.

For Trismegistos,

Mark Depauw

Friday, September 2, 2011

Updates to the DDbDP and PN (papyri.info)

Just posted to papylist:

Dear Colleagues:

I write with some updates concerning the DDbDP and PN.

First, we are most pleased to welcome Mark Depauw to the Editorial Boards of the DDbDP and HGV; and it is also our great pleasure to welcome W. Graham Claytor as Assistant Editor of the DDbDP. Both will be great assets to the team.

The latest release features a number of improvements to the PN search and display, many of which you will have seen already; these include line numbers in search returns and enhanced tabular display of search/browse hits. We look forward to releasing vastly improved and significantly redesigned search functionality in the coming weeks. You may also have noticed that we are starting to change the look of the search interface; we are in process of tightening the integration of the navigator and editor and bits of the new appearance are already showing.

Frequent users of SoSOL/Papyrological Editor will be happy to know that we now offer syntax for multiple alternate readings, e.g. <:στρατ[ηγὸς]||alt||στρατ[ηλάτης]|στρατ[ηγήσαντα]:>. The editor also offers enhanced display controls for introductory and line-by-line commentary: bold, italics, underline, footnotes, embedded links to PN (DDbDP/HGV/APIS) and any other website; in the next couple months we shall also support controlled linking to bibliography.
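For those who generate Leiden+ from their own files, the alternate-reading syntax is easy to assemble in a script. The sketch below is based only on the forms that appear in these announcements: double bars around "alt" when several alternatives follow, and the single-bar form seen in the nested fictional example from the December update. The function name is hypothetical, and the Leiden+ documentation remains the authority on what the editor accepts.

```python
def alt_readings(preferred, *alternatives):
    """Assemble a Leiden+ alternate-readings tag (illustrative helper)."""
    if not alternatives:
        raise ValueError("at least one alternative reading is required")
    if len(alternatives) == 1:
        # Single alternative: <:preferred|alt|other:>
        return f"<:{preferred}|alt|{alternatives[0]}:>"
    # Several alternatives: <:preferred||alt||other1|other2:>
    return f"<:{preferred}||alt||{'|'.join(alternatives)}:>"

print(alt_readings("στρατ[ηγὸς]", "στρατ[ηλάτης]", "στρατ[ηγήσαντα]"))
```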

We have also very nearly finished entering CPR 25, CPR 30, P.Heid. 9, O. Stras. 2, and O.Abu Mina; the bugs that we encountered with P.Köln have now been fixed, so that we hope to move quickly to finish those volumes as well. We have also begun entering P.Count and continue (slowly) to enter SB 26. There is much to do, but since rolling out the new system we have entered well over 2000 texts. Joyous thanks to the many and good-spirited contributors who do this work without any pay and in service to our field.

I also want to alert the community to a development currently under way, whose fruits you will start to see in the DDbDP. It has long been DDbDP practice to place orthographically/grammatically normalized forms, and even outright *corrected* forms, in the text and the ancient reading in the apparatus, e.g. TEXT: ἀνωμολογήσαντο, APP: ανομολογησαντο. This is of course not in keeping with papyrological convention and we have heard (loud and clear) over the years that colleagues would like to see this practice ‘flipped’.

We have begun that process. This means putting accents, breathing marks, and capitalization (and in many many cases Leiden mark-up, including line-breaks) in the appropriate places in the original reading; on 70,000 unique strings (more than 90,000 cases). This is, as you might imagine, a big job and we can neither convert every single case nor verify autoptically every case that we can convert. This cannot be done by hand on any reasonable timeframe or budget. But thanks to generous assistance from the TLG and some excellent programming work done at the Department of Digital Humanities King’s College London, we now can convert over 75% of these cases to a fairly high degree of accuracy. For example, we can take the original reading as it currently stands, αροστοι, collate it against the regularized reading, ἄρρωστοι, and produce ἄροστοι, which we will soon display in the text (and “read ἄρρωστοι” in the app).

The process is driven by a large table of generalizable equivalencies; the script does not know rules for accenting non-standard forms; rather, it sees that an original form is γιτωνεις, sees also that the regularized form is γείτονες, and knows also that ει/ι, ω/ο, and ε/ει are often in exchange, and so it collates the two forms and moves the accent over from the εί of the regularized form to the ι of the original, giving γίτωνεις. It works very well, but we know that we will be introducing some forms whose accents may/will be incorrect; intensive checking on a sample of 10,000 unique strings suggests that such will be quite few, but there will be mistakes. Expect some nouns to have incorrect accents where the regularization involved change of case; expect to see a few iota subscripts where they do not belong or omitted where they do. Some cases cannot be correctly converted automatically; where γυναικος is normalized to γυναῖκες the computer does not ‘know’ whether the normalization is phonetic (in which case → γυναῖκος) or morphological (in which case → γυναικὸς); some such cases may be converted where you would prefer them not to be and some left unconverted where you would prefer otherwise. There will also be a relatively small number of infelicitous conversions, e.g. αιλαεινα → αἰλάεϊνα (regularized: ἐλάϊνα). To avoid all of these errors would mean not (correctly) converting many many more clear cases. So, we tolerate some mess.
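To make the collation idea concrete, here is a toy reconstruction in Python. It is emphatically not the script written at King's College London: the equivalence table is a tiny hypothetical sample chosen to cover the examples in this message, the input is assumed to be lowercase and unaccented, and the alignment is a naive backtracking search.

```python
import unicodedata

# Hypothetical sample of (original spelling, regularized spelling) exchanges.
EQUIVALENCES = [("ι", "ει"), ("ει", "ι"), ("ει", "ε"), ("ε", "ει"),
                ("ω", "ο"), ("ο", "ω"), ("ρ", "ρρ"), ("αι", "ε"), ("γ", "κ")]
VOWELS = set("αεηιουω")

def split_marks(word):
    """Decompose a word into [base letter, combining marks] pairs."""
    units = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and units:
            units[-1][1] += ch
        else:
            units.append([ch, ""])
    return units

def align(orig, reg, i=0, j=0):
    """Pair up chunks of the two spellings; None if the table cannot."""
    if i == len(orig) and j == len(reg):
        return []
    steps = []
    if i < len(orig) and j < len(reg) and orig[i] == reg[j]:
        steps.append((1, 1))
    for o, r in EQUIVALENCES:
        if orig.startswith(o, i) and reg.startswith(r, j):
            steps.append((len(o), len(r)))
    for len_o, len_r in steps:
        rest = align(orig, reg, i + len_o, j + len_r)
        if rest is not None:
            return [(len_o, len_r)] + rest
    return None

def transfer_accents(original, regularized):
    """Copy diacritics from the regularized form onto the original spelling."""
    reg_units = split_marks(regularized)
    steps = align(original, "".join(base for base, _ in reg_units))
    if steps is None:
        return None  # leave the string for some other method
    out, i, j = [], 0, 0
    for len_o, len_r in steps:
        chunk = list(original[i:i + len_o])
        marks = "".join(m for _, m in reg_units[j:j + len_r])
        vowels = [k for k, ch in enumerate(chunk) if ch in VOWELS]
        if marks and vowels:
            chunk[vowels[-1]] += marks  # last vowel, as on a diphthong
        out.extend(chunk)
        i, j = i + len_o, j + len_r
    return unicodedata.normalize("NFC", "".join(out))

print(transfer_accents("αροστοι", "ἄρρωστοι"))
print(transfer_accents("γιτωνεις", "γείτονες"))
```

With this sample table the two calls should give ἄροστοι and γίτωνεις. Like the real process, the sketch knows no accentuation rules, so it would also happily reproduce the ambiguous and infelicitous cases described above.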

Forms that we cannot convert by this collation process, we will pass through the TLG morphological engine; this will catch many but not all of the remaining examples. Names, for example, are less than fully controlled by TLG, so that in many cases we can only partially convert a string. So, where we have corrected to Ἐσμῖνι from εσμινιος, we cannot simply move the accent over (→ Ἐσμῖνιος); and since our system does not know accentuation rules, we have no automatic way to generate the form ᾿Εσμίνιος; but we do know that the regularized form begins with a capital letter and smooth breathing, so that we can output ᾿Εσμινιος, which at least indicates the class of noun and is better than nothing!
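The partial conversion described here, carrying over only the capital letter and the breathing, might be sketched as follows. Again this is an illustration and not the production code; the function name is hypothetical, and it assumes the regularized form uses an ordinary precomposed or combining breathing on its first letter.

```python
import unicodedata

BREATHINGS = {"\u0313", "\u0314"}  # combining smooth and rough breathing

def partial_convert(original, regularized):
    """Copy initial capitalization and breathing only, leaving accents off."""
    decomposed = unicodedata.normalize("NFD", regularized)
    marks = ""
    for ch in decomposed[1:]:
        if not unicodedata.combining(ch):
            break  # reached the second letter
        if ch in BREATHINGS:
            marks += ch
    first = original[0].upper() if decomposed[0].isupper() else original[0]
    return unicodedata.normalize("NFC", first + marks + original[1:])

print(partial_convert("εσμινιος", "Ἐσμῖνι"))
```

The result is a capitalized form with smooth breathing and no accent, which is the kind of "better than nothing" output described above.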

Errors, infelicities, and partial conversions can of course be corrected via SoSOL by anyone who wishes! Some degree of mess is simply the cost of delivering the type of text and apparatus that the community wants. And we will clean up the mess over time and, I hope, together. In any case, know that we have not changed the *spelling* of such original readings--only the accentuation; so, in this crucial way the integrity of the data remains intact. And some of the successes are pretty spectacular, thanks to the hard work of the team: for instance, the original επειγιμαινον, which was normalized to ἐπικείμενον, is ‘correctly’ converted to ἐπειγίμαινον!

In the next month or two we should have apparatus criticus syntax that is both much better than the status quo and deploys the new handling of such regularizations.

As always, please feel free to contact Rodney Ast (ast@uni-heidelberg.de), James Cowey (james.cowey@urz.uni-heidelberg.de), and myself with questions, comments, complaints about the DDbDP, and Hugh Cayless (hugh.cayless@nyu.edu) with the same regarding PN functionality.

Sincerely,
Josh Sosin

Friday, August 26, 2011

Papyri.info outage

Update: new start time.

The papyri.info site will be taken offline before 10:00am EDT on Saturday, August 27th in advance of Hurricane Irene's impact on New York. The datacenter that hosts it is being shut down as a precaution.

We do not have an estimated uptime at this point.

Friday, July 22, 2011

PE (SoSOL) training session: one down one to go!

This from the airport, between two DDbDP/HGV/SoSOL training sessions. About two dozen papyrologists, epigraphists, and ancient historians are now winding down a week-long Workshop on Digital Tools in Papyrology, sponsored by the Austrian National Library, the Austrian Academy of Science and Vienna University. Included among the exciting presentations and discussions were several Papyrological Editor (SoSOL) training sessions, where a wide range of participants -- junior and senior, papyrologists and epigraphists, Hellenists, Demotists, Coptologists, and Arabists -- entered many texts from O.Abu Mina, all of CPR 25 (most of it is now online already), and almost all of CPR 30 (which we shall finalize soon). We tested the new line-by-line commentary feature, documented search-interface desiderata, and even entered -- for testing purposes -- the Arabic portion of a Greek-Arabic bilingual document (and it worked!). What a fantastic and collegial group! Many thanks to Bernhard Palme and all of our colleagues in Vienna for conceiving, organizing, and hosting such a wonderful and productive event!

Now on to London, where Adam Bülow-Jacobsen, Hélène Cuvigny, Holger Essler, Maria Rosaria Falivene, Micaela Langelloti, Nikos Litinas, Myrto Malouta, Anna Monte, Alberto Nodar, Marco Perale, Nadine Quenouille, and Joanne Stolk will join Rodney Ast, Gabriel Bodard, Chad Crouch, Todd Hickey, Josh Sosin, and Charlotte Tupman in the next training session, starting Monday 25 July!

Friday, July 1, 2011

Workshop on Digital Tools in Papyrology

Posted for Bernhard Palme, via Papy-L:

This workshop, organized jointly by the Austrian National Library, the Austrian Academy of Science and Vienna University, will provide an introduction to the most important digital tools in papyrology. The program will offer a mixture of classes (in English), in which the students will get an overview of the manifold electronic resources in the field, and training sessions on the new editing platform for DDbDP, HGV, and APIS.

The workshop will also include visits to the Papyrus Collection and the Papyrus Museum of the Austrian National Library. The main teachers will be James Cowey (Universität Heidelberg), Mark Depauw (Katholieke Universiteit Leuven), Sandra Hodecek (Österreichische Nationalbibliothek), Thomas Kruse (Österreichische Akademie der Wissenschaften), Bernhard Palme (Österreichische Nationalbibliothek/Universität Wien), Lucian Reinfandt (Universität Wien), Joshua D. Sosin (Duke University), Johannes Thomann (Universität Zürich).

The workshop will begin on Monday, 18th July with registration in the morning and courses in the afternoon, and will end on Friday, 22nd July in the evening. On the morning of Saturday, 23rd July, there will be a guided tour of the Ephesos Museum.

There is no fee for the course, but there is a charge of 125 Euros for accommodation in a university Hall of Residence. The number of participants is restricted to 20. Advanced students with an interest in papyrology and solid knowledge of Ancient Greek and English are invited to participate, whether or not they already have experience in the subject.

Applications, including a curriculum vitae, should be sent before July 12 to

Bernhard Palme
bernhard.palme@univie.ac.at