What goes into an email archive?

If we are going to create an email archive using Cardbox, the first thing to decide is what goes in it. Should an archive be an exact copy of everything that originally arrived by email?

The trouble is that reproducing in an archive exactly what you would see on opening the original email is complex and often impossible. It would mean preserving all attachments, which we frequently delete very early on. If the email was something other than plain text, it would mean not only keeping all the formatting but also rendering it exactly as the original email program did, and Eudora, for example, often displays HTML emails very differently from a browser.

So in this project, the archive will store just the text of each email: not its font sizes or text colours, and not its attachments. This makes sense because the principal purpose of an archive is to let you find what was said, not necessarily to reproduce exactly how it was laid out. Compare a museum catalogue or a photograph library: the Cardbox record is not a substitute for the thing you are archiving, only a representation of it.
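
To make the "text only" decision concrete, here is a minimal sketch in Python, using only the standard library's email package. It is not part of Cardbox or of this project's actual import process; the function name extract_text is hypothetical, and the tag stripping for HTML-only messages is deliberately crude. It assumes the raw message is already available as a string.

```python
import html
import re
from email import message_from_string, policy


def extract_text(raw_message: str) -> str:
    """Return only the readable text of an email (hypothetical helper).

    Attachments are ignored, and formatting in HTML-only messages is
    discarded by stripping tags rather than rendering them.
    """
    msg = message_from_string(raw_message, policy=policy.default)

    # Prefer a plain-text body; fall back to HTML if that is all there is.
    # get_body() skips parts that are marked as attachments.
    body = msg.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""

    text = body.get_content()
    if body.get_content_type() == "text/html":
        # Crude assumption: dropping tags is good enough for searching.
        text = re.sub(r"<[^>]+>", "", text)
        text = html.unescape(text)
    return text.strip()
```

The result is a representation of the email that is good enough for finding what was said, which is all the archive record is meant to be.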

