April 2007

I’ve been in the midst of preparing grant proposals for enhancing Open Context by enabling users to create custom templates for the user interface (UI) and presentation of content. Thus far, most development effort has gone into making it work (data integration, management, etc.) and it is woefully lacking in standards compliance and flexibility on the UI-end.

One of the most important considerations in UI design should be a close, hard look at the recommendations of the W3C Web Accessibility Initiative, followed by actually implementing them. This really is an essential guide to making web resources accessible to the disabled community. Again, my own efforts with Open Context really fail in this regard, and I’m very eager to rectify this situation (hint: yes! collaboration help is desired!). The need to make sure our tools and resources meet the needs of the disabled community was driven home by comments made at the recent Hewlett Open Educational Resources conference. Open Context won’t deserve “Open Access” status without it!

We’re working on a plan to support custom style-sheets for presentation and interface tools. One of the neglected advantages of working toward disabled access is that it improves web design more generally (reducing sloppiness, adhering to web standards). It also means developing systems that can be more easily tailored to meet the needs of other communities (such as portals for sub-discipline communities, language / cultural localization, etc.).

As a parent of two small children, I’m often struck by how the Americans with Disabilities Act has made my life easier. Getting into buildings while pushing a baby-stroller is so much easier because of ramps intended for wheelchairs. So, I’m (belatedly) trying to internalize these lessons and revise my own work accordingly. Doing so really opens many doors for building a far more robust and flexible system.

While I know these issues are widely known among web designers and university staff, they may not be that widely known among individual researchers. So, it is definitely worth discussion for the DDIG community, as many of our members have much more experience with archaeology, data sets, and technology than with accessibility issues. I hope that other members of the DDIG community working on digital dissemination projects take these important considerations into account in their design decisions.

While at the SAA conference in Austin, I went online to check news. The Free GeoTools blog posted an irresistible link to a new Federal service, “EarthNow! Landsat Image Viewer”. It’s nearly real-time imagery from Landsat satellites. Flying over the Earth at 20,000 MPH or so is oddly entertaining, and sucked away some time I should have devoted to preparing for my “Discussant” role (something new for me).

It’s nice to see my tax-dollars going into something that just reduced my productivity a notch or two. Of course, by linking to it, I’m endangering your productivity too.

Then again, it is nice to take some time out once in a while and just enjoy the neat technology now available.

Something of an Open Access breakthrough is happening in the world of archaeology.

Open Access News reports that the APA/AIA have recently issued a report that takes into account open access recommendations made by the American Council of Learned Societies (ACLS) (blogged about here).

The extended excerpt reported by Open Access News shows that the APA/AIA report makes some important moves toward recognizing the value of open access. The report also notes increased barriers to scholarship created by the recent expansion of copyright restrictions. While this is an important observation, the APA/AIA report should also have recommended Creative Commons licenses as a strategy for unlocking scholarly materials from overly restrictive copyright. Given that the ACLS report suggested use of Creative Commons licenses (page 45), I am a little bit puzzled about this omission in the APA/AIA report.

Beyond licensing, the APA/AIA report has some good language about interoperability (including a discussion of the OpenURL standard, related to the COinS standard discussed here). Additional discussion about the need for data sharing and longevity is also in the report (page 4).

The discussion about the “Digital Monograph Series” (pages 9-10) is also of interest to DDIG members. It noted that there is still a great deal of skepticism about publishing with digital media. I wonder, though, how quickly this may be changing given the growing amount of activity seen in the world of digital scholarship. Nevertheless, the report recognizes the need for dissemination mechanisms and calls for a new publication series “for works that would be improved through the digital medium”. Databases and the like would be included in this.

And there is much more, plus a discussion of this report (while it was in draft form) over at the Stoa Consortium. Don’t miss the comments by Greg Crane (Editor in Chief of the Perseus Digital Library), which (I believe) rightly emphasize how the “…center of gravity for intellectual life in academia and society as a whole has already shifted decisively to a digital environment.”

Nevertheless, the seriousness and interest in Open Access frameworks shown in this report is significant news, especially because it comes from a scholarly society. What a marked point of contrast from the very counter-productive approach taken by AAA! The APA/AIA report is well worth reading by archaeologists, including the leadership of the Society for American Archaeology (SAA).

I just finished installing COinS metadata into parts of Open Context. COinS is a lightweight, relatively easy to implement standard for expressing Dublin Core metadata (or “information about information”, as in a library catalog). Dublin Core is a very widely used set of metadata. It’s found in RSS feeds and it is the standard used by the pioneering Archaeology Data Service (UK).
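For readers curious what “easy to implement” means in practice, here is a minimal sketch of how a page might generate a COinS tag in Python. It is not the code used in Open Context. The function name `coins_span` and the sample field values are made up for illustration; the `Z3988` class name and the `ctx_ver` / `rft_val_fmt` keys follow the COinS convention for Dublin Core as I understand it.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(fields):
    """Return an empty COinS <span> whose title attribute carries Dublin Core fields.

    `fields` is a list of (dublin_core_element, value) pairs, e.g. ("title", "...").
    This helper is hypothetical and only illustrates the general shape of COinS.
    """
    pairs = [
        ("ctx_ver", "Z39.88-2004"),                  # OpenURL ContextObject version
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),  # says the fields are Dublin Core
    ]
    # Each Dublin Core element becomes an "rft." key in the ContextObject.
    pairs += [(f"rft.{name}", value) for name, value in fields]
    # URL-encode the key/value pairs, then HTML-escape the result for the attribute.
    title = escape(urlencode(pairs), quote=True)
    return f'<span class="Z3988" title="{title}"></span>'


# Sample values are invented for illustration only.
example = coins_span([
    ("title", "Stratum B at Site X"),
    ("creator", "A. Researcher"),
    ("date", "2007"),
    ("type", "dataset"),
])
print(example)
```

The resulting span is invisible to human readers, so it can sit anywhere in a page template without changing the layout.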

Much discussion about metadata centers on interoperability of services and making information easier to find. To these ends, we’re also working on making Open Context compliant with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).

Besides being important for back-end interoperability, metadata also has much more user-centered applications. RSS really popularized Dublin Core. It made it much more than a librarian issue, and turned virtually everyone with a weblog into a Dublin Core metadata author.

Zotero, a break-through project out of George Mason University, promises to make digital metadata much more a part of the daily lives of scholars. Zotero is a free, open source, citation tool that plugs into the Firefox browser. It scans every webpage you view, ranging from weblog posts to articles in JSTOR, and looks for metadata. It uses this metadata to automatically capture bibliographic reference information. That saves researchers a great deal of tedium and reduces annoying typographic errors in building up their reference databases.
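To give a rough feel for what “looking for metadata” in a webpage involves, here is a small Python sketch that pulls COinS records out of a chunk of HTML. This is emphatically not how Zotero itself is written; it is only an illustration under my own assumptions. The names `CoinsFinder` and `find_coins` and the sample page are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl


class CoinsFinder(HTMLParser):
    """Collect the Dublin Core fields from every COinS span in a page (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        # COinS spans are marked with the class name "Z3988".
        if tag == "span" and "Z3988" in classes:
            record = {}
            # The title attribute holds URL-encoded key/value pairs.
            for key, value in parse_qsl(attrs.get("title") or ""):
                if key.startswith("rft."):
                    # Fields may repeat (e.g. several creators), so keep lists.
                    record.setdefault(key[4:], []).append(value)
            self.records.append(record)


def find_coins(html_text):
    """Return a list of dicts, one per COinS span found in html_text."""
    finder = CoinsFinder()
    finder.feed(html_text)
    return finder.records


# A made-up page fragment for demonstration.
page = (
    '<p>Some post text.</p>'
    '<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;'
    'rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;'
    'rft.title=Example+Post&amp;rft.creator=A.+Author"></span>'
)
records = find_coins(page)
print(records)
```

A real citation tool does far more (site-specific translators, many metadata formats), but the basic idea of reading structured fields that the publisher embedded in the page is the same.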

COinS is one of the standards for expressing Dublin Core supported by Zotero, and that’s why we use it in Open Context. And we’re not the only ones to realize the significance of Zotero’s automatic bibliographic tools. The Pleiades Project (an NEH funded open access initiative developing scholarly resources and community around ancient geography) is also compliant with Zotero.

These types of tools will do much to bootstrap digital dissemination of research. Easy capture of bibliographic information makes Web resources very convenient. It’s also amazing how some of the simple features (COinS is very easy to implement) make such a difference in ease of use and relevance for scholarship.

It is very exciting to see these developments come together!

A few recent open access developments circulated by Peter Suber highlight why we need to do more to raise awareness.

Regular readers of this blog may think I’m beating a dead horse on this Open Access issue. However, some issues need continued attention. A recent report out of the UK illustrates cost increases in scholarly publishing very clearly. Social Science journal costs rose some 39% between 2000 and 2006, which is more than twice the general rate of inflation (page IV, see link for the full report below).

Ultimately, this hurts faculty and students. If universities have to pay more and more to publishers, there is less money for new faculty hires, graduate student support, and undergraduate education. Bloated subscription fees therefore make life much more difficult for archaeologists in the university setting. And this impacts CRM archaeology too, since CRM organizations also depend on universities for training new people and for participation in professional publication. Archaeology always seems to struggle financially, and we could certainly do without the financial squeeze that comes from escalating publication costs.

Much still needs to be done to spread word of the many advantages of Open Access. Suber linked to a recent report by the Research Information Network showing that only 1 in 10 university faculty members is aware of the open access debate and issues. This is not too surprising given the many demands faculty face, and given that library systems may still largely insulate faculty from directly facing dramatically escalating subscription costs.

DDIG members! Open Access is an important issue for the future of our discipline. Please do what you can to get acquainted with the issues, and raise awareness with our colleagues. Here’s an excellent background primer to learn more. Please also come to the Open Archaeology Reception at the SAA conference in Austin. You’ll get to learn more about the issues, get a free t-shirt (while supplies last), and munch on some sushi!

As noted earlier, ArchaeoInformatics.org is making their lecture series available through streaming video. Another group, including the Society for Applied Anthropology, the University of North Texas Dept of Anthropology, and the University of North Texas Center for Distributed Learning, all coordinated by Jen Cardew, is doing similar work. This group is busy putting up podcasts of presentations given at the 67th annual meeting of the Society for Applied Anthropology (SfAA). The podcasts are going up at: http://sfaapodcasts.net/

This is a great idea for conferences, and I’ve already participated in a few meetings that make a point of podcasting presentations (along with associated powerpoints). David Wiley, a leading figure in the world of open educational resources, organizes an annual conference where the vast majority of presentations go online.

Conferences are often very busy, and one often misses papers because of schedule conflicts and concurrent sessions. Having later access to these papers would be very helpful. Also, many students can’t afford to travel to conferences, but can gain valuable information about current research (bearing in mind that many results presented at conferences only make it into publication after a few years, if at all). I also think the IP (intellectual property) incentives should be remembered. It should make people more comfortable sharing research and results if a public record of their contribution is made.

On the flip side, conference papers are often a bit less formal, and people sometimes want the freedom to speak “off the record” and be a bit more provocative than they would be if their every word were archived. This raises some interesting issues. Because memory and bandwidth are getting increasingly cheap, it is more and more feasible to record absolutely everything in a conference. Still, since most people realize conference communication is less formal than peer-reviewed publication, I doubt this kind of thing is too much of a concern.

Nevertheless, in the future, every slip of the tongue will be out there, and indexed by Google.

The Museums and the Web 2007 conference recently concluded in San Francisco. The conference sounded great (though expensive! $575-675 registration!), and I would have loved to have attended personally, aside from a bad case of conference-fatigue. Judging from the presentations, demonstrations, and other parts of the conference line-up, it seems there are very exciting things happening in the museum world. Obviously, these developments should be monitored by DDIG members, since the museum world is developing a great and growing pool of talent and experience in harnessing the Internet.

Also, Brewster Kahle, head of the Internet Archive, addressed the Museums and the Web 2007 conference. His organization has a great portfolio of fascinating projects ranging from scanning, OCR, and public access of all English language public-domain literature, to ongoing efforts to archive as much of the Web as possible (think petabytes of storage). Geoff Crane made a post on the “Questacon” Weblog about Kahle’s presentation (thanks to Peter Suber for the link). A key highlight is his call for Open Access to museum (digital) collections. The Internet Archive is prepared to back open collections with storage and bandwidth. That’s a very exciting opportunity for organizations with interesting collections but little exposure. There must be a wealth of material that can be made so much more valuable if known to scholars and the public.

Please join us for an event at the Society for American Archaeology Conference in Austin, Texas on Friday, April 27 (7 – 8 PM, Room 410, Austin Hilton). We will be celebrating “Open Archaeology” with free sushi, friendly conversation, and a chance to network with other researchers working to reform and enhance communications in our discipline.

DDIG members! If you are working on an Open Access project, this is a great chance to let an interested community know about your efforts. Please contact me (ekansa-at-alexandriaarchive.org), and send me a few (1-4) PowerPoint slides about your efforts. I’ll incorporate them into a presentation that will be looping in the background while we munch on some sushi!

The NEH funded Pleiades discussion list recently picked up on my last post about copyright and scientific data. Several contributors to that list had important points and resources to add, especially about geospatial data. These include:

  • Here’s an interesting post by Chris Holmes, “Promoting freely available geodata”. It touches on many of these themes, and also notes that Creative Commons and Science Commons are reluctant to develop licensing mechanisms around factual data. He also explores some of the policy implications of “copyleft”-type contracts that are not based on copyright law.
  • Another contributor to the Pleiades discussion list rightly pointed out that geospatial data is subject to very different legal and regulatory frameworks internationally. I should also add that the EU has greater copyright protection for database content than the US. James Boyle (who’s on the Board of Creative Commons) wrote an interesting piece in the Financial Times about how the EU database protection laws have not helped the European database industry. This perspective helps explain why Creative Commons and Science Commons are very reluctant to get involved in licensing factual data. “Protecting” such content with licenses (even with “some rights reserved” licenses) may do more damage than good.

Aside from the fact that it seems we all need some good lawyers, these discussions help illustrate the importance of community social norms. Scholars are already (largely) a self-regulating community. Inviting in lawyers to craft custom licenses and contracts may not make the most sense, unless the law directly impedes our work (as is the case with standard “all rights reserved” copyright, where Creative Commons licenses are a vast improvement). Developing positive social norms is something of an art, but there are many examples of successful online communities. Hopefully we can learn from these examples and adapt them to help make open research in everyone’s enlightened self-interest.

Additional Note:

Before someone else points out my error, I was remiss in not linking to the original blog post over at the Open Knowledge Foundation that started all this discussion. Jamie Boyle’s article is already well discussed in this first post! It clearly pays to thoroughly read one’s primary sources before posting to a weblog. My apologies!

Peter Suber, an essential source of scholarly open access news, recently posted a discussion about the copyright status of “data”, and whether Creative Commons licenses are appropriate for such content. Copyright law makes a distinction between “facts” (and/or ideas) and “expressions”. Original expressions are protected by copyright, but the ideas and facts being communicated by these expressions are public. If I write “Stratum B at site X dates to between 7500 – 7000 BP”, this specific sentence is an original expression and is copyright protected. However, you are free to “abstract” the ideas and facts out of my sentence and put them into a new expression such as the following table:

Site | Phase | Est. Dates (BP)
Site X | Stratum B | 7500-7000

Because the ideas and facts in my original sentence are not copyright protected, no permissions need to be asked to re-express them in a new way, like the table above. Legally, citation isn’t even required, though citation is a very important social norm for the scholarly community, even when it involves crediting non-copyrightable facts.

The legal distinctions between “facts” and “expressions” are important to consider when we develop online data-sharing systems. Creative Commons licenses are wonderful tools for the research community to share expressive (copyright protected) content. Each Creative Commons license requires attribution for all uses of a licensed work. Attributing researchers for their contributions is very important, since it helps them build their reputation.

However, Creative Commons licenses are copyright licenses. They only work with copyrightable material. Many scientific databases lack enough original expression and are too factual to be copyrightable. Their contents are therefore public domain and can’t be licensed with Creative Commons licenses. Here’s a great paper (“Geographic Information Legal Issues”) by Harlan Onsrud that explores these issues. He noted a legal case involving the copyright status of an alphabetically organized phonebook, where a court decided that the content (names and phone numbers) lacked sufficient originality of expression to make it copyrightable. Peter Suber also links to the Science Commons FAQ about databases and copyright, which is also an excellent resource.

So what’s the threshold for original expression to make content copyrightable? The answer is ambiguous. For archaeology, which so often sees documentation expressed in free-form notes and drawings, copyright will probably often apply. In such cases, Creative Commons licenses can and should be used. However, some areas of archaeology capture much less expressive and more “factual” kinds of data (archaeometry, zooarchaeology, some studies involving GIS, etc.). In these cases Creative Commons licenses shouldn’t be used.

The public domain nature of factual data raises an incentive problem. Factual data can be legally copied and used without attribution. Again, even traditionally published factual data can be legally used without attribution. However, putting such resources up in open online archives would make such legal appropriation very easy. Without some reasonable expectation of attribution, why would any researcher share their hard-earned data?

Therefore, developing online archives of factual data requires developing social norms to regulate their use. Just as we expect citation even when we publish “facts” in traditional paper media, we should expect citation in online publication of our data. Professional ethical codes should be updated to reflect these needs, and journal editors and reviewers should be aware of these issues to help prevent cheating.

In addition, data archives may want to consider “terms and conditions of use” contracts that require end-users to attribute sources of factual data. Such contracts need not be based on copyright (as are Creative Commons licenses), but are made as a condition for using a data archive. While these should be explored, we should be very careful about such legal “solutions”. There may be hidden costs and unwanted problems associated with such end-user agreements. Nevertheless, I welcome such discussion, since, as a developer of tools for open access data archives, I’m keenly interested in incentives!