Thursday, December 1, 2011

papyri.info updates

This just posted by sosin to the papylist:

Colleagues:

I write with some IDP updates.

(1) By now you will have noticed that we have rolled out the new search interface. This is an entirely new way of doing search and will take some getting used to.

A crude test scenario:

Go to http://papyri.info/search . The search screen is divided into two parts. Search results appear on the right and search filters appear on the left.

The primary difference between the old and the new is that rather than coming up with a single search query that aims to give you exactly what you want, it is now possible to begin with a more open-ended search and successively narrow and expand it until you reach a desired end.

So, let's say your class is interested in literacy. Run an initial search for #αγραμματ: enter #αγραμματ in the search box, or select "Convert from betacode as you type" and enter #agrammat (by the way, this is a good option if you want to run searches from your smartphone, which probably does not have Unicode Greek); then click 'Search'. This will give you words beginning with αγραμματ (# indicates a word boundary; you will not find διαγραμματ).
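For readers unfamiliar with betacode, here is a minimal sketch of what "Convert from betacode as you type" does to an unaccented query such as #agrammat. It is not the code behind papyri.info: the function name `betacode_to_greek` and the final-sigma rule are assumptions made for illustration, and accents, breathings, and capitals are ignored entirely.

```python
# Standard Beta Code letter equivalents (lowercase, no diacritics).
BETA_TO_GREEK = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "c": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "u": "υ", "f": "φ", "x": "χ", "y": "ψ", "w": "ω",
}


def betacode_to_greek(text: str) -> str:
    """Convert unaccented betacode to Greek; other characters (e.g. '#') pass through."""
    out = [BETA_TO_GREEK.get(ch, ch) for ch in text.lower()]
    for i, ch in enumerate(out):
        if ch == "σ":
            following = out[i + 1] if i + 1 < len(out) else ""
            # Assumption: a sigma not followed by another letter is word-final.
            if not following.isalpha():
                out[i] = "ς"
    return "".join(out)


print(betacode_to_greek("#agrammat"))     # #αγραμματ
print(betacode_to_greek("uper# #auths"))  # υπερ# #αυτης
```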

You get 446 hits--too many to show your class. You select (toward the bottom) "Show only records with images from:" / "Papyri.info" and click 'Go'. This will narrow your found set down to the 16 texts that contain a word beginning with αγραμματ and have images known (via APIS) to Papyri.info. But there must be more: HGV knows about a great many links to other sites (external to papyri.info). So, you select "Other sites" and 'Go'. Now you have 221 records for which some digital image is known, whether via APIS or HGV. Add "Print publications" and you will see that papyri.info knows of a total of 337 texts that have *some* image associated with them, whether digital or print.

Your class is working on texts from the first two centuries CE. So, you set "Date on or after" to "1 CE" and "Date on or before" to "200 CE" and click 'Go'. That's 45 hits. The classroom where you are teaching has digital projection, but no access to books. So, you *remove* the "Print images exist" filter by clicking it away from the top of the right side of your screen. That's 28 hits; you can actually look at several of these in class.

A student asks how many of these were written on behalf of a man or a woman. Well, it wouldn't be a foolproof test, but you can now add another Greek search as an additional filter. So, enter "υπερ αυτης" (as "Word/Phrase search", which assumes that the strings that you enter are complete words) or "υπερ# #αυτης" (as "Substring search", which assumes that the strings you enter are fragments, unless you indicate a boundary with #). Now we have 9 hits.
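The difference between the two modes, and the role of #, can be illustrated with a small sketch. This is only a model of the behaviour described above, not the search engine behind papyri.info. The helper names (`fold`, `substring_pattern`, `matches`) are hypothetical, and it assumes that accents and breathings are ignored when matching and that a space in the query stands for whitespace in the text.

```python
import re
import unicodedata


def fold(text: str) -> str:
    """Lowercase, strip accents and breathings, and unify final sigma."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return bare.replace("ς", "σ")


def substring_pattern(query: str) -> "re.Pattern[str]":
    """Substring mode: strings are fragments unless # marks a word boundary."""
    parts = []
    for token in fold(query).split():
        start = r"(?<!\w)" if token.startswith("#") else ""
        end = r"(?!\w)" if token.endswith("#") else ""
        parts.append(start + re.escape(token.strip("#")) + end)
    return re.compile(r"\s+".join(parts))


def matches(query: str, text: str, whole_words: bool = False) -> bool:
    """Word/Phrase mode (whole_words=True) treats every string as '#string#'."""
    if whole_words:
        query = " ".join("#" + w.strip("#") + "#" for w in query.split())
    return substring_pattern(query).search(fold(text)) is not None


print(matches("#αγραμματ", "ἀγράμματος"))    # True: word begins with αγραμματ
print(matches("#αγραμματ", "διαγράμματος"))  # False: no boundary before αγραμματ
print(matches("υπερ# #αυτης", "ἔγραψα ὑπὲρ αὐτῆς ἀγραμμάτου"))  # True
```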

Not all of the students in the class have strong Greek; so, you also set "Translation language" to "English" and you are down to 3 hits. Note: these filters are defined by what is in the system, so that if 5 of these texts had been translated into, say, Italian, you would have found that as one of your filter options. At the moment, most of the translations known to papyri.info come from APIS and so are in English.

You may remove any of these filters by deselecting them (click the X next to each at the top of the window). Remember to remove all such filters before starting a completely *new* search (or click on the “Reset all” button).
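The narrow-and-expand logic of the walk-through can be modelled in a few lines. What follows is a rough sketch, not a description of how papyri.info is built: the `Record` and `FilteredSearch` classes and the five records are invented for illustration. It assumes that the image-source choices widen one another (a record passes if it has an image from any selected source), while the other filters each narrow the set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Record:
    ident: str
    year: int                     # CE
    images: FrozenSet[str]        # sources for which an image is known
    translations: FrozenSet[str]  # languages with a translation


class FilteredSearch:
    def __init__(self, records: List[Record]) -> None:
        self.records = list(records)
        self.image_sources: Set[str] = set()                     # OR-ed together
        self.filters: Dict[str, Callable[[Record], bool]] = {}   # AND-ed together

    def results(self) -> List[Record]:
        hits = self.records
        if self.image_sources:
            hits = [r for r in hits if r.images & self.image_sources]
        return [r for r in hits if all(keep(r) for keep in self.filters.values())]

    def reset(self) -> None:
        """The equivalent of the 'Reset all' button."""
        self.image_sources.clear()
        self.filters.clear()


# Invented records, for illustration only.
RECORDS = [
    Record("a", 150, frozenset({"papyri.info"}), frozenset({"English"})),
    Record("b", 120, frozenset({"other"}), frozenset()),
    Record("c", 180, frozenset({"print"}), frozenset({"English"})),
    Record("d", 350, frozenset({"papyri.info"}), frozenset({"English"})),
    Record("e", 50, frozenset(), frozenset()),
]

search = FilteredSearch(RECORDS)
counts = []
for source in ("papyri.info", "other", "print"):   # each addition widens the set
    search.image_sources.add(source)
    counts.append(len(search.results()))
search.filters["date"] = lambda r: 1 <= r.year <= 200
counts.append(len(search.results()))
search.image_sources.discard("print")              # remove the print filter
counts.append(len(search.results()))
search.filters["translation"] = lambda r: "English" in r.translations
counts.append(len(search.results()))
print(counts)  # [2, 3, 4, 3, 2, 1]
```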

This is a somewhat silly example, but it should give you the basic idea of how this new approach works and how it differs from the old. It takes some getting used to, but once all of the functionality is in place you will find that it is a much more powerful and flexible manner of searching.

Also, if you would like to track our progress toward implementing more complicated search functionality, you may feel free to look at the following Google Doc:
https://docs.google.com/spreadsheet/ccc?key=0Ajkz6D9lOd20dGtwczJSUXVhWUx6ZEJCUEhvR1lIYVE&hl=en_US#gid=0
It is meant for our own internal use and so may not make perfect sense to you, but you should probably get the idea. Y = we expect this to work now. N = we expect this not to work now. dev = our test / development server. prodo = the production version of papyri.info, which you use. If your favorite search pattern is not there, and if you cannot achieve the same result with a combination of filters, we encourage you, please, to *send us an email*. If you don’t report an error or omission, there is a good chance we won’t know to correct it!

(2) You will have noted also that tens of thousands of ‘original’ readings now have diacriticals. So, where the scribe wrote αναδεχομε, we put the normalized “ἀναδέχομαι” in the text and now print “ἀναδέχομε papyrus” in the apparatus. Before the year is out we shall invert this practice, printing “ἀναδέχομε” in the text and “read ἀναδέχομαι” in the apparatus. This first step has been a bigger task than we had imagined, and we owe a great deal to the brilliant and devoted work of Faith Lawrence at the Department of Digital Humanities, King’s College London. A few thousand words still lack accents; please feel free to add such and submit them to us. A small number of errors have been introduced along the way (outright mistakes, which will be clearly apparent as nonsense); we are hacking away at those, fixing as many of them as we can; but if you notice any please feel free to correct and submit, or else (if it seems to be a systematic kind of error) please just alert us by email.

(3) Note also that the process of adding accents and reversing the display-regime of regularizations was more difficult in the case of regularized words that wrapped from the end of one line to the beginning of the next. Many thousands of these we handled automatically; about 6000 still need to be fixed. We think we are in a position to have 80-90% fixed in the next couple of weeks. In the meantime, all lines affected by this are flagged with a bright red line number; you may have noticed this already. In all such lines you will see that chunks of words appear oddly at the end or beginning of the line; the apparatus entries for all such will look fine.

(4) The line-by-line commentary feature has a bug, which will create a dozen or so clones of some commentary entries...this is very frustrating. It should be fixed before the new year. Just don’t enter any commentary in the next few weeks!

(5) Frequent Papyrological Editor users will notice that we have radically overhauled our apparatus criticus capabilities.

* for BL corrections: <:αἱ τοῦ=BL 9.17|ed|Θίτου:>
* for corrections proposed in journals/books: <:(διαγρ(άφου))=N. Gonis, ZPE 143 (2003) 150|ed|(διαγρ(αφῆς)):>
* for corrections proposed direct to DDbDP: <:τοῦ=PN G. Claytor (CPR VI plate 35)|ed|:>
* we now support multiple regularizations: <:ἀνοίγεται (?)|ἀνοίεται (?)||reg||ἀ̣νύεται:>
* we now support ‘regularizations’ by language (useful in multi-lingual texts for example): <:ἄρακος=grc|reg|ⲁⲣⲁⲕ:>
* we now support combinations of all of the above, as the following *fictional* example illustrates:

275a. <:στρ[ατηγὸς]=BL 15.2||ed||
στρ[ατηλάτης]=J. Cowey, ZPE 150 (2020) 321-323|
στρ[ατιώτης]=R. Ast, CdE 100 (2018) 13-15 (BL 14.5)|
Συρ[ίων]=Original Edition:>

* And we support extremely complicated combinations (including nesting of virtually every type of apparatus tag), as the following *fictional* example illustrates:

275. <:<:στρ[ατηγὸς]|subst|<:σ.2[.?]|alt|γ.3[.?]:>:>=BL 19.2||ed||
<:<:στρ[ατηλάτης]|reg|ξ̣τ̣ρ[ατηλάτης]:>|alt|.1γρ[.?]:>=J. Cowey, ZPE 200 (2020) 321-323|
<:<:στρ[ατιώτης]|alt|στρ[ατηγία]:>|reg|στυ̣ρ[ατ][.?]:>=R. Ast, CdE 100 (2018) 13-15 (BL 14.5)|
<:Συρ[ίων](?)|reg|<:<:Σο̣υ̣ρ[ίων]||alt||Συ̣υ̣ρ[ίων]|Σω̣υ̣ρ[ίων]:>|subst|Σ.2ρ[ίων]:>:>=Original Edition:>

This means:
(i) at line 275 the DDbDP prints στρ[ατηγὸς], which the scribe himself corrected from either "σ . . [ca.?]" or "γ . . . [ca.?]", and which is recorded in BL vol.19 p.2

(ii) previously, Cowey had argued (in ZPE 200) for correcting the text to either στρ[ατηλάτης], which is a modern regularization of ξ̣τ̣ρ[ατηλάτης], or to ". γρ[ca.?]"

(iii) before Cowey, Ast had suggested (in CdÉ 100) that the papyrus reads στυ̣ρ[ατ- ca.?], which should be regularized either to στρ[ατιώτης] or to στρ[ατηγία]; this was subsequently picked up by BL 14.5

(iv) The original editors of the papyrus thought that the scribe had originally written "Σ . . ρ[ίων]", and then corrected it to either Σο̣υ̣ρ[ίων] or Συ̣υ̣ρ[ίων] or Σω̣υ̣ρ[ίων], any one of which should perhaps be regularized to Συρ[ίων]

The PN will display στρ[ατηγὸς] in the text and in the app: 275. corr. ex σ ̣ ̣[ -ca.?- ] (or γ ̣ ̣ ̣[ -ca.?- ]) BL 19.2 : ξ̣τ̣ρ[ατηλάτης] (l. στρ[ατηλάτης]) (or ̣γρ[ -ca.?- ]) J. Cowey, ZPE 200 (2020) 321-323 : στυ̣ρ[ατ -ca.?- ] (l. στρ[ατιώτης (or στρ[ατηγία])) R. Ast, CdE 100 (2018) 13-15 (BL 14.5) : Σο̣υ̣ρ[ίων] (or Συ̣υ̣ρ[ίων] or Σω̣υ̣ρ[ίων]) (corr. ex Σ ̣ ̣ρ[ίων]) (l. Συρ[ίων]) Original Edition. This is not *precisely* what you expect to find in a print publication, but it is full, clear, and quite unambiguous.
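To make the basic shape of an editorial-correction tag concrete, here is a minimal sketch that takes apart only the simplest, flat form shown in the first three bullets above, i.e. <:reading=source|ed|previous:>. It is not the Papyrological Editor's parser, which has to handle the multi-part and nested forms as well. The names `EditorialCorrection` and `parse_flat_correction` are hypothetical, and the labelling of the three parts follows the explanation of the fictional example rather than any formal grammar.

```python
import re
from typing import NamedTuple, Optional


class EditorialCorrection(NamedTuple):
    reading: str   # text now printed
    source: str    # who proposed it, or where it is recorded
    previous: str  # superseded reading (may be empty)


# Flat form only: no nested <: :> tags and no '|' inside any of the three parts.
FLAT_ED = re.compile(
    r"^<:(?P<reading>[^<>|]*?)=(?P<source>[^<>|]*)\|ed\|(?P<previous>[^<>|]*):>$"
)


def parse_flat_correction(tag: str) -> Optional[EditorialCorrection]:
    """Return the three parts of a flat |ed| tag, or None for anything more complex."""
    match = FLAT_ED.match(tag.strip())
    if match is None:
        return None
    return EditorialCorrection(
        match["reading"], match["source"].strip(), match["previous"]
    )


print(parse_flat_correction("<:αἱ τοῦ=BL 9.17|ed|Θίτου:>"))
print(parse_flat_correction("<:τοῦ=PN G. Claytor (CPR VI plate 35)|ed|:>"))
```

Anything with nesting or with the ||ed|| multi-part form returns None here by design; handling those needs a real recursive parser rather than a single regular expression.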

This is, we think, a huge achievement and a great good; we have Gabby Bodard at King’s College and Jon Fox at the University of Kentucky to thank!

The Leiden+ Documentation page should now reflect most/all of these improvements: (http://papyri.info/editor/documentation?docotype=text); please let us know if we have missed something. Also, please note that the Apparatus “Helpers” (for BL, Editorial, and SoSOL) are now buggy as a result of the changes but should be fixed before the new year. In the meantime, enter apparatus entries by hand.

(6) I shall also mention just briefly that thanks to the extraordinary generosity and collegiality of our colleagues in Brussels, Alain Martin, Paul Heilporn, and Alain Delattre, we have begun a process of surfacing Bibliographie Papyrologique data via the PN. Our Heidelberg colleagues James Cowey and Carmen Lanz did an amazing amount of work to make this happen. You will see that there are still many bugs in the conversion of the BP records to structured XML, but we are getting there one step at a time. From the navigation bar at the top of the PN select Search / Bibliography. This will take you to a very *simple* search screen, where you may enter, for example, “Hombert” (no quotation marks) (http://papyri.info/bibliosearch?q=Hombert); or you may constrain the search by BP fields. So, if you enter “author:Hombert; title:bibliographie; date:1932” (no quotation marks) you will get 2 records (http://papyri.info/bibliosearch?q=author%3AHombert%3B+title%3Abibliographie%3B+date%3A1932+). This works also for BP subject codes; so, for example, “index:146” will find 1106 records concerning Archives. The FileMaker version of the BP remains much more flexible, powerful, and manipulable, for those users who are so inclined. In the next few weeks we shall be able to link from BP records to DDbDP texts and from DDbDP texts to BP records. This is only a start; we expect this service to improve dramatically over the coming months, soon with the ability to add/correct records, and submit them for review by the BP editorial team.
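For anyone who wants to build such bibliography links programmatically, the two example URLs above can be reproduced with a short sketch. The function names `fielded_query` and `bibliosearch_url` are made up for illustration; the only things taken from the examples are the base address, the `q` parameter, and the "field:value; field:value" pattern. Note that the second URL above ends in a trailing "+" (an encoded trailing space), which this sketch omits.

```python
from urllib.parse import urlencode

BIBLIOSEARCH = "http://papyri.info/bibliosearch"


def fielded_query(**fields: str) -> str:
    """Join BP field constraints as 'author:Hombert; title:bibliographie'."""
    return "; ".join(f"{name}:{value}" for name, value in fields.items())


def bibliosearch_url(query: str) -> str:
    """Percent-encode the query and attach it as the 'q' parameter."""
    return f"{BIBLIOSEARCH}?{urlencode({'q': query})}"


print(bibliosearch_url("Hombert"))
print(bibliosearch_url(fielded_query(author="Hombert", title="bibliographie", date="1932")))
print(bibliosearch_url(fielded_query(index="146")))
```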

This message is already far too long. So, enough for now.

As always, please feel free to send questions, comments, and complaints to Josh Sosin jds15@duke.edu, Rodney Ast ast@uni-heidelberg.de, and James Cowey james.cowey@urz.uni-heidelberg.de; likewise, with regard to PN performance, to Hugh Cayless.

All best,
Rodney Ast
James Cowey
Josh Sosin