In The Battle of the Surrogates: Social Cards Probably Win

On Tuesday, we released our latest pre-print, "Social Cards Probably Provide Better Understanding of Web Archive Collections". My work builds on AlNoamany’s use of social media storytelling to provide a visualization that summarizes web archive collections. In previous b...

Google+ Is Being Shuttered, Have We Preserved Enough of It?

Google+ will be shut down on April 2, 2019. In this blog post I cover how much of Google+ is archived and how to archive its pages.

Improving Collection Understanding in Web Archives

by Shawn M. Jones

Ever since the Internet Archive started large-scale web archiving in 1996, historians, sociologists, and journalists have found web archives to be an important source of information for their work. Archive-It, a service focused on creating collections, allows curators to gene...

The Off-Topic Memento Toolkit

by Shawn M. Jones, Michelle C. Weigle, and Michael L. Nelson

Web archive collections are created with a particular purpose in mind. A curator selects seeds, or original resources, which are then captured by an archiving system and stored as archived web pages, or mementos. The systems that build web archive collections are often configu...

The Many Shapes of Archive-It

by Shawn M. Jones, Alexander Nwala, Michelle C. Weigle, and Michael L. Nelson

Web archives, a key area of digital preservation, meet the needs of journalists, social scientists, historians, and government organizations. The use cases for these groups often require that they guide the archiving process themselves, selecting their own original resources...

The Off-Topic Memento Toolkit

I presented my conference paper, "The Off-Topic Memento Toolkit", in which I describe a software package that can detect off-topic mementos in a web archive collection.

The Many Shapes of Archive-It

I presented my conference paper, "The Many Shapes of Archive-It", in which I describe the structural features that can be used to understand web archive collections.

A Preview of MementoEmbed: Embeddable Surrogates for Archived Web Pages

With the death of Storify, I’ve been examining alternatives for summarizing web archive collections. Key to these summaries are surrogates. I have discovered that there exist services that provide users with embeds. These embeds allow an author to insert a surrogate into the H...

How well are the National Guideline Clearinghouse and the National Quality Measures Clearinghouse Archived?

Two US government websites are in danger: the National Guideline Clearinghouse and the National Quality Measures Clearinghouse. Both store medical guidelines. Both will “not be available after July 16, 2018”....

Extracting Metadata from Archive-It Collections with Archive-It Utilities

At iPres 2018, I will be presenting “The Many Shapes of Archive-It”, a paper that focuses on some structural features inherent in Archive-It collections. The paper is now available as a preprint on arXiv. As part of the data gathering for “The Many Shapes of Archive-It”, and a...
