Friday, April 20, 2012

PN Search Updates


Just posted to papy-list:

Dear Colleagues:

We write with recent updates to the PN.

(1) Search facets now feature auto-complete combo-boxes instead of simple drop-down menus. So, let’s say you search for #οικονομ (strings that start with οικονομ); 913 hits. You are really just interested in the Hermopolite and so you click into the ‘Nome’ facet. As you start to type ‘he’ you see your options shrinking...1 Heliopolite, 1 Heptakomias, 80 Herakleopolite, and 24 Hermopolite. Select Hermopolite and see your 24 hits. The same sort of thing should work with the other facets as well.
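The narrowing behaviour described in (1) can be sketched as a simple prefix filter over facet values. This is only an illustration, not PN's actual code: the function name `narrow_facet` is hypothetical, and we assume the combo-box matches on the start of the value, ignoring case. The counts are the ones reported above for the query #οικονομ.

```python
# Facet values and hit counts reported above for #οικονομ,
# limited to the four nomes whose names begin with "he".
NOME_FACET = {
    "Heliopolite": 1,
    "Heptakomias": 1,
    "Herakleopolite": 80,
    "Hermopolite": 24,
}


def narrow_facet(facet, typed):
    """Return (value, count) pairs whose value starts with the typed text.

    Hypothetical helper: assumes case-insensitive prefix matching.
    """
    prefix = typed.casefold()
    return [
        (value, count)
        for value, count in sorted(facet.items())
        if value.casefold().startswith(prefix)
    ]


# Each extra keystroke shrinks the list of options.
for typed in ("he", "her", "herm"):
    print(typed, narrow_facet(NOME_FACET, typed))
```

Typing further characters simply re-runs the same filter with a longer prefix, which is why the list of options keeps shrinking.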

(2) APIS Collection and Publication Series facets now play nicely together. Let’s say you remember that a Berkeley papyrus was published in P.Coll.Youtie, but you cannot remember which and you do not have the book to hand. Select Berkeley from the ‘Collection’ facet; 4316 hits. Select ‘DDBDP: p.coll.youtie (2)’ from the facet box, or else start to type, p.coll; 2 hits. Find P.Coll.Youtie 12.

(3) Improved indexing of regularizations. Let’s say you are interested in documents whose first word after the greeting χαίρειν is βούλομαι vel sim. You search for "χαιρειν# #βουλ" (with the quotes; a string equal to or ending in χαιρειν followed immediately by a new-word string starting βουλ); 11 hits. But you will miss Chrest.Wilck. 323, since lin.7-8 read χέρειν(*). βουλό|μεθα.

Why? Because χαίρειν is in this case not, strictly speaking, adjacent to βουλ; χερειν is. This is an important fact to master.

In this particular case, it means that, knowing that χαίρειν is often rendered as χέρειν, you must search carefully. You can either

(a) enter search query "χαιρειν# #βουλ", then click OR and in the new search box enter "χερειν# #βουλ"; 12 hits
or
(b) search for REGEX χ(αι|ε)ρειν\sβουλ; 12 hits. The regular expression for ‘αι or ε’ is (αι|ε) and the regular expression for space is \s.

Note that under both of these searches PN *finds* Chrest.Wilck. 323 but does not properly *highlight* the query for that text. This is a known bug; we are working on it.
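For anyone who wants to test the regular expression in (b) before running it, here is a minimal sketch in Python. It assumes that the searched text is lowercased and stripped of accents and breathings before matching (the helper `strip_diacritics` is our own, not part of PN), and the three sample strings are made up for illustration.

```python
import re
import unicodedata


def strip_diacritics(text):
    """Lowercase the text and drop accents and breathings (combining marks)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# The pattern from (b): χ, then either αι or ε, then ρειν,
# one whitespace character, then βουλ.
PATTERN = re.compile(r"χ(αι|ε)ρειν\sβουλ")


def matches(text):
    """True if the normalised text contains the pattern."""
    return PATTERN.search(strip_diacritics(text)) is not None


# Hypothetical sample strings, not quotations from real documents.
for sample in ("χαίρειν βούλομαι", "χέρειν βουλόμεθα", "χαίρειν καὶ βούλομαι"):
    print(sample, matches(sample))
```

The first two samples match and the third does not, because a word intervenes between χαίρειν and the string beginning βουλ.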

(4) We now support abbreviation-aware searching. Say you are looking for abbreviations beginning πρ. Search for #πρ° (enter πρ and then click the abbr button; ° indicates opening parenthesis); 862 hits. NOTE: this new feature has some known bugs:

(a) highlighting: if you search for #πρ° in your first hit you will see highlighting on both Πρ(ώτων), which you expect, and on Πρώτων, which you do not. We are working on it.

(b) highlighting: if you search for τιθ° you will see that in your 6th hit (BGU 9.1891) none of the highlighted hits is τιθ( , which is what you expect to see. But if you click to view the text, you will see (lin.169, 194, 220 etc.) Τιθ(οείους), which is what you expect to see. This isn’t very helpful; you want the right hit on the search results page. We are working on it.

(c) false positives: if you search for αρ° you will find in your second hit (BGU 1.4) that none of the highlighted hits includes αρ( , which is what you expect to find. And (ἑκατοντάρ)χ(ῃ) and χ(ιλιά)ρ(χῃ) (both in lin.1) even seem misleading: in neither case is αρ on the papyrus at all. We are working on it; this particular fix may even be live by Monday, and in any case quite soon.

(d) XML error bug: if you search for °ετους, expecting many many many hits, you find only ten! There is a bug in the search, whose fix will be in soon (along with (c) above); but there are also odd bugs in the XML, which we’ll fix.
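The intended behaviour of the markers in (4) can be sketched by translating them into an ordinary regular expression. This is an illustration under our own assumptions, not a description of how PN implements the feature: the function `pn_query_to_regex` is hypothetical, # is read as a word boundary, and ° is read as a literal opening parenthesis. The sample strings are the words quoted above, with accents removed and lowercased by hand.

```python
import re


def pn_query_to_regex(query):
    """Translate # and ° into regex syntax (hypothetical, for illustration)."""
    parts = []
    for ch in query:
        if ch == "#":
            parts.append(r"\b")   # word boundary
        elif ch == "°":
            parts.append(r"\(")   # opening parenthesis of an abbreviation
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


abbr_pr = pn_query_to_regex("#πρ°")
print(bool(abbr_pr.search("πρ(ωτων)")))   # abbreviated form: expected hit
print(bool(abbr_pr.search("πρωτων")))     # written out in full: no hit expected

abbr_ar = pn_query_to_regex("αρ°")
print(bool(abbr_ar.search("(εκατονταρ)χ(η)")))  # αρ is not followed by "("
```

Under this reading, Πρ(ώτων) is a hit for #πρ° and Πρώτων is not, and neither of the two forms in BGU 1.4 lin.1 is a hit for αρ°, which is the behaviour that (a) and (c) say you should expect.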

As always, if you have questions, please feel free to contact us. Generally best to write to ast@uni-heidelberg.de, hugh.cayless@nyu.edu, james.cowey@urz.uni-heidelberg.de, and joshua.sosin@duke.edu. Or just write the papy-list.

All best,
josh sosin

Friday, April 6, 2012

Aegyptus [1920–2006] at JSTOR

Aegyptus 1920-2006 (Anni 1-86) (Previous Title: Studi della Scuola papirologica
[1915–1920])
ISSN: 0001-9046
Aegyptus, the Italian Egyptology and Papyrology journal, was founded in 1920
by Aristide Calderini and directed by him until his death (1968). The direction
was then entrusted to Orsolina Montevecchi until year 80 (2000). Since year
81 (2001) the Director has been Rosario Pintaudi. Editing is in the care of the
Papyrology School of the Catholic University of Milan. A general index of the first
50 years (1920-1970) can be found in “Studia Amstelodamensia” II (1974) edited
by S.M.E. van Lith. This specialized journal publishes articles on Egyptology and
Greek and Coptic Papyrology written by Italian and foreign scholars, in Italian,
French, English, Spanish and German.

Aegyptus is now available in the Arts & Sciences IX collection at JSTOR to institutions with
subscriptions.

You will find a link to it in AWOL.

And see also The Ancient World in JSTOR: AWOL's full list of journals in JSTOR with substantial representation of the Ancient World.

Thursday, March 1, 2012

IDP Updates

This just sent to papylist:
= = =
Colleagues:

I write with news of recent IDP updates. Frequent users will have noticed already that we have just released significant enhancements to PN search capabilities. Thanks go to our colleagues Rodney Ast and James Cowey, and especially to our fantastic programmers Hugh Cayless and Tim Hill.

Some of the changes are intuitive, others less so. If you visit our search checklist:
https://docs.google.com/spreadsheet/ccc?key=0Ajkz6D9lOd20dGtwczJSUXVhWUx6ZEJCUEhvR1lIYVE#gid=1
and click on the SearchDocumentation tab at the bottom, you will see a more clearly emerging set of search instructions. This will be a useful page to bookmark.

Note that Boolean operators are now deployable with buttons. You will note also that sometimes when you click on a particular operator PN will automatically open another search box for you. For example, if you search for πρωτ* THEN αρσιν* within 5 words, and then click ‘and’, PN will open a new box for you and insert the operator ‘AND’. Note also that PN greys out some operator buttons some of the time. IMPORTANT: both of these features are designed to help you avoid running searches that are impossible or so computationally intensive that they will cause disruptions for the wider community. You can type the operators into the search box and you can construct complicated searches in a single search box, but you run the very real risk of a timeout or failure. Please try to help us keep things running smoothly for everyone by using the buttons and not ‘hacking’ the search box!

We now support proximity searches in two forms:

(1) NEAR = word/string X within n words or characters of word/string Y in either direction
(2) THEN = word/string X followed by word/string Y, within n words or characters

Note: NEAR and AND are not the same thing. X AND Y finds documents containing X and Y anywhere within them. A search for X NEAR Y must specify within n characters/words and will only find documents in which X and Y co-occur within the specified distance.
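The difference between AND, NEAR and THEN can be sketched over a list of words. This is a simplified model under our own assumptions, not PN's implementation: the function names are hypothetical, only whole words are compared, and "within n words" is taken to mean that the two word positions differ by at most n.

```python
def positions(tokens, word):
    """Indexes at which word occurs in the token list."""
    return [i for i, tok in enumerate(tokens) if tok == word]


def and_match(tokens, x, y):
    """X AND Y: both words occur anywhere in the document."""
    return x in tokens and y in tokens


def near_match(tokens, x, y, n):
    """X NEAR Y: the words occur within n positions, in either order."""
    return any(
        0 < abs(i - j) <= n
        for i in positions(tokens, x)
        for j in positions(tokens, y)
    )


def then_match(tokens, x, y, n):
    """X THEN Y: Y occurs after X, at most n positions later."""
    return any(
        0 < j - i <= n
        for i in positions(tokens, x)
        for j in positions(tokens, y)
    )


# Placeholder documents; x and y stand for the two search terms.
close_in_order = "x a y".split()
close_reversed = "y a x".split()
far_apart = "x a b c d e y".split()

for doc in (close_in_order, close_reversed, far_apart):
    print(doc, and_match(doc, "x", "y"),
          near_match(doc, "x", "y", 2), then_match(doc, "x", "y", 2))
```

All three placeholder documents satisfy AND. Only the first two satisfy NEAR within 2 words, and only the first satisfies THEN within 2 words.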

So, if you want to find documents containing a string beginning σπεν- followed immediately by a string containing -χεσθ-, enter σπεν* THEN *χεσθ* within 1 word. Note that * is a wildcard (it marks the absence of a word boundary). You will find one hit, P.Lond. VII 2193.9: σ̣π̣έν̣δοντ̣ες εὐχέσθωισαν. Note that highlighting and hit-quotation do not work on this search. This is a *known bug* and we are working to fix it.

Note that the PN treats strings and words very differently, and this can affect the way you must run certain searches. Take the previous example. A search for σπεν THEN χεσθ within 10 characters will return one hit, P.Lond. VII 2193.9: σ̣π̣έν̣δοντ̣ες εὐχέσθωισαν (with the highlighting bug). However, a search for σπεν THEN χεσθ within 1 word WILL NOT WORK at all. This is because a proximity search constrained by words looks for whole words, not strings. Thus, if you constrain a proximity search by words, you must use asterisks to indicate that you have provided only part of the word (σπεν* THEN *χεσθ*). If you constrain the search by characters, you are in effect running a substring search and do not need asterisks, because PN automatically treats the queried terms as strings (so πεν finds σπενδ).
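The words-versus-characters distinction can be made concrete with a small sketch. It rests on assumptions of ours, not on PN's code: both function names are hypothetical, the text is the P.Lond. VII 2193.9 phrase with accents and underdots removed by hand, and the character distance is measured from the end of the first string to the start of the second.

```python
from fnmatch import fnmatchcase

# P.Lond. VII 2193.9 as quoted above, normalised by hand.
TEXT = "σπενδοντες ευχεσθωισαν"


def then_within_words(text, x, y, n):
    """Word mode: x and y must match whole words; * is a wildcard."""
    tokens = text.split()
    last = len(tokens) - 1
    return any(
        fnmatchcase(tokens[i], x) and fnmatchcase(tokens[j], y)
        for i in range(len(tokens))
        for j in range(i + 1, min(i + n, last) + 1)
    )


def then_within_chars(text, x, y, n):
    """Character mode: x and y are plain substrings, at most n characters apart."""
    start = text.find(x)
    while start != -1:
        end_x = start + len(x)
        next_y = text.find(y, end_x)
        if next_y != -1 and next_y - end_x <= n:
            return True
        start = text.find(x, start + 1)
    return False


print(then_within_chars(TEXT, "σπεν", "χεσθ", 10))    # substring search
print(then_within_words(TEXT, "σπεν", "χεσθ", 1))     # no asterisks
print(then_within_words(TEXT, "σπεν*", "*χεσθ*", 1))  # with asterisks
```

In this model the character-constrained search succeeds, the word-constrained search without asterisks fails because neither σπεν nor χεσθ is a whole word in the text, and the word-constrained search with asterisks succeeds.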

Sometimes you can reach the same or similar results via different search strategies. For example, say you want to find documents containing the precise phrase "ομολογουμεν πεπρακεναι" and the precise phrase "του νυν". You can do any of the following:

(1) enter "ομολογουμεν πεπρακεναι" AND "του νυν" in a single box;
(2) enter "ομολογουμεν πεπρακεναι", click AND, and in the second box enter "του νυν" after the AND, which PN will enter for you;
(3) enter "ομολογουμεν πεπρακεναι", then click search; then refine the search with "του νυν".
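Why these strategies return the same documents can be sketched with sets. This is a toy model under our own assumptions: the function names and the three-document corpus are hypothetical, a phrase matches only as a run of consecutive whole words, AND is modelled as set intersection, and refining is modelled as searching within the previous results.

```python
def contains_phrase(text, phrase):
    """True if the words of phrase occur consecutively, as whole words, in text."""
    words, target = text.split(), phrase.split()
    return any(
        words[i:i + len(target)] == target
        for i in range(len(words) - len(target) + 1)
    )


def search(docs, phrase):
    """Identifiers of the documents that contain the phrase."""
    return {doc_id for doc_id, text in docs.items() if contains_phrase(text, phrase)}


# Hypothetical mini-corpus, not real PN texts.
DOCS = {
    "doc1": "ομολογουμεν πεπρακεναι σοι απο του νυν",
    "doc2": "ομολογουμεν πεπρακεναι σοι",
    "doc3": "απο του νυν",
}
P1 = "ομολογουμεν πεπρακεναι"
P2 = "του νυν"

# Strategies (1) and (2): both phrases joined by AND.
both = search(DOCS, P1) & search(DOCS, P2)

# Strategy (3): search for the first phrase, then refine with the second.
first = search(DOCS, P1)
refined = search({doc_id: DOCS[doc_id] for doc_id in first}, P2)

print(both, refined)
```

Both routes end with the same single document, because intersecting two result sets and filtering one result set by the second phrase are the same operation.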

Lexical searching can be combined with string searching. Let’s say you want to find documents containing any form of the word ἀνήκω within 5 words of a string beginning υπηρετ-. Enter LEX ἀνήκω NEAR υπηρετ* within 5 words (you will get 2 hits).

You can use the radio buttons to search Text, Metadata (=HGV and APIS), or Translations. But you can also restrict searches to either HGV or APIS. So a search for HGV:witnesses will return 5 hits; a search for APIS:witnesses will return 87 hits; a search for witnesses alone, with the ‘Metadata’ radio button selected will return 92 hits.
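The relationship between the three counts (5 + 87 = 92) can be sketched as a filter on the source of each metadata record. This is a toy illustration with made-up records: the field names, the function `search_metadata`, and the way the HGV: and APIS: prefixes are parsed are all assumptions of ours.

```python
# Hypothetical metadata records; real HGV and APIS records are far richer.
RECORDS = [
    {"id": "rec1", "source": "HGV", "text": "list of witnesses"},
    {"id": "rec2", "source": "APIS", "text": "contract with witnesses"},
    {"id": "rec3", "source": "APIS", "text": "private letter"},
]


def search_metadata(records, query):
    """Search all metadata, or only one source if the query has a prefix."""
    source, sep, term = query.partition(":")
    if not sep:                      # no HGV: or APIS: prefix given
        source, term = None, query
    return [
        rec["id"]
        for rec in records
        if term in rec["text"].split()
        and (source is None or rec["source"] == source)
    ]


print(search_metadata(RECORDS, "HGV:witnesses"))
print(search_metadata(RECORDS, "APIS:witnesses"))
print(search_metadata(RECORDS, "witnesses"))
```

In this model the unprefixed search returns exactly the HGV hits plus the APIS hits, which is consistent with the counts reported above.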

Phrases of more than 2 words are now searchable: enter "οἱ ἐκ τῆς"

For those of you who are comfortable with regular expressions, we now offer support for regex searches. If you want to find documents in which a string ending in αυτο is followed within 20 characters by the complete word και, but only where that και is not followed within 20 words by a word beginning αδριανου, enter: REGEX αυτο\b.{1,20}\bκαι\s(?!(\S+\s){1,20}αδριανου)
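The pieces of this expression can be tried out in Python, whose syntax for \b, \s, \S and the negative lookahead (?!...) is the standard one. This is a sketch only: the engine behind PN may differ in details, and the three sample strings are made up (unaccented, lowercase) to exercise each part of the pattern.

```python
import re

# The expression quoted above, unchanged.
PATTERN = re.compile(r"αυτο\b.{1,20}\bκαι\s(?!(\S+\s){1,20}αδριανου)")


def hit(text):
    """True if the pattern matches somewhere in the text."""
    return PATTERN.search(text) is not None


# Hypothetical sample strings, not quotations from real documents.
samples = [
    "του αυτο μηνος και ετους δεκατου",                   # no αδριανου after και
    "του αυτο μηνος και ετους δεκατου αδριανου καισαρος",  # αδριανου two words after και
    "αυτοκρατορος και τιτου",                             # αυτο does not end its word
]
for sample in samples:
    print(sample, hit(sample))
```

In Python only the first sample matches. The second is excluded by the negative lookahead, and the third because αυτο\b requires a word boundary directly after αυτο.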

Our best advice is to play around with the new interface and its capabilities and to try searches from the SearchDocumentation list. Everything that we have marked as working should work. If something appears not to, let us know. If a particularly important combination of searches does not appear on the list, let us know; we can add it. Let us know also whether that particular combination works or does not, so that we can test and confirm. If anything is especially weird or puzzling please be sure to give us as much information as possible (the precise steps you followed, the browser you are using, etc.); even attach screenshots if you like. The more information we have, the better we can address problems that you encounter.

Once you get used to the changes, we hope and think you will be as pleased with them as we are.

Best,
Josh Sosin