April 2008

I’ve been working on some citation issues in Open Context. One thing on the immediate horizon is implementing WebCite on Open Context. If all goes well, the Open Context citation button will send a copy of the cited web page to WebCite for archiving, and thereby gain a truly stable URL that will keep Open Context content retrievable even if the site disappears. It’s not ideal, since it only archives the XHTML version of the cited page, but perhaps some extra markup can be added to convey more structure.

Also, Stuart Campbell (University of Manchester) sent some useful suggestions on how to cite sub-selections of data from a larger corpus. Together with co-director Elizabeth Carter (UCLA), he’s already contributed a substantial portion of his work at Domuztepe (a Halaf period site in Turkey) to Open Context. The Domuztepe crew is in the midst of a big publication push and plans to add much more data. Thus the citation issues are becoming more pressing.

In practice, researchers will rarely want to cite an entire 10-year excavation dataset generated by a large team of specialist researchers. They’ll want to cite parts of such datasets, ranging from an individual specialist analysis to a selection of items that may span different specialist datasets but still not encompass an entire project dataset. People may also want to cite subsets of specially selected data from several different projects.

All this makes citation issues very complicated. Who do you credit and how? Project directors should be credited, but so should individual specialists, and even lowly trench supervisors who make observations in the field. You can quickly gain a very long list of people who need some form of citation credit.

In addition, some uses of other people’s data can be quite sophisticated and deserve recognition too. If you make the effort to comb systematically through other people’s datasets, and attempt to interpret and select items among them, you can be actively doing significant research. Your selection of data should be credited (or blamed) to you, since this activity highlights items of interest and interpretive value and can clearly contribute to knowledge creation. People using and selecting sets of data from different projects and collections should also be cited.
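To make the bookkeeping problem a bit more concrete, here is a minimal sketch in Python of how credit for a hand-picked selection might be tallied. Everything in it is an assumption for illustration: the `Item` and `Selection` classes, their fields, and the role names are hypothetical and do not describe how Open Context actually stores or cites data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """One record in a project dataset (all fields are hypothetical)."""
    item_id: str
    project: str
    contributors: tuple  # (name, role) pairs, e.g. ("A. Person", "specialist")


@dataclass
class Selection:
    """A hand-picked subset of items, possibly spanning several projects."""
    selected_by: str
    items: list = field(default_factory=list)

    def credits(self):
        """Map each role to the sorted names holding it across the selected items."""
        by_role = {}
        for item in self.items:
            for name, role in item.contributors:
                by_role.setdefault(role, set()).add(name)
        result = {role: sorted(names) for role, names in by_role.items()}
        # The act of selecting is itself credited (or blamed).
        result["selection"] = [self.selected_by]
        return result
```

Even in this toy form, a selection that draws on a few projects quickly produces a long list of directors, specialists, and supervisors, plus the person who made the selection.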

All of this is hard to convey in typical citation conventions. I think it’s time to get some conversation going about these issues, since my poor brain hurts thinking about this.

Tom Elliott (Pleiades Project) sent me a link to a pretty hilarious discussion attempting to place archaeologists into a taxonomy based on their data sharing habits.

Tom self-identifies as a “cranky space monkey”, and points to Bill Caraher, who thinks of himself as a squirrel. This was all touched off by Charles Watkinson, who said that “grey panthers” (tenured people at the top of their field) are far more likely to experiment with total data transparency than struggling junior faculty or graduate students.

Of course, Watkinson has a good point, and has some more good thoughts about ways to link data publication with narrative publication. Sebastian Heath added some interesting discussion about back-and-forth linking between primary data and published narratives. I’ve been thinking about these issues too, and am working with my colleague Erik Wilde on a (hopefully) elegant approach to the issue based on his work on Linkbases. We’ll try to have something to publicly demo in the next few months.

Back to the taxa. In general, I also think that “grey panthers” are more likely to publish data than junior scholars, because junior researchers have more reason to be risk averse. That said, like most things, there are plenty of exceptions. Some senior people may have excellent publication records but shoddy field documentation, and don’t like the idea of transparency. Some junior people act very openly with their material. Open Context has a mixture of datasets contributed by very prominent “grey panthers” (see Petra) and by junior researchers who like this opportunity to advertise the quality of their research (see Justin Lev-Tov’s zooarch analysis of Hazor material).

As for my own taxonomic self-identification, that’s a hard question. Open Context has been my main project for some time now, and its main aim thus far has been to validate a common data model with lots of eclectic stuff (though we’re transitioning over to doing more thematic collection building). I’ve been eclectic and opportunistic in building Open Context content (and refining schema mapping processes etc.) with whatever people want to provide.

So I guess that makes me something like an Eastern Bluebird, since they build nests out of whatever is handy.

Hi Everyone.

This has nothing to do with archaeology, but I couldn’t help but note this interesting April 1st development. It’s something of a follow-up to my earlier posts on Google and its ambitions here and here. Please take a look at this short video by Google’s founders, Larry Page and Sergey Brin:


That’s right. They claim to be teaming up with Virgin Galactic to colonize Mars. Here’s Richard Branson on the “project”:


They’re calling it an “Open Source Planet”. The funny thing about this April Fools’ joke is that it comes from a wildly ambitious and seemingly unstoppable firm (however, note that even Google seems to be constrained by market forces). Given their other goals, colonizing Mars almost seems like business as usual for Google.