The topic of history immediately draws to mind a dusty classroom in which professors tell stories of war, royalty and civilizations lost to the sands of time.
While traditional history is expressed as a vibrant tapestry of events, dates, people and places, we often forget that the web has its own rich history and a legacy to leave future generations that needs both preservation and recognition.
By examining current problems in how we preserve our digital heritage and through a significant change in our attitude towards web content, we can hope to leave future Internet users with something tangible and useful.
History doesn’t exist separately from our actions; it is built up over time in what we write and record, allowing those in the future to analyze and improve upon our work. Addressing our current view of web content as "disposable data" is critical at this time.
Evolution of Knowledge-Sharing
The passing on and recording of knowledge is a time-honored tradition. This practice has spanned generations, ranging from mankind’s early cave paintings to the industrious storage of information, such as could be found in the Library of Alexandria in Egypt.
Yet, while so much emphasis is placed on preserving our rich analog history, our digital past seems to be rapidly disappearing around us; and like the great Egyptian library that was lost to fire, we are now left with large gaps in our understanding of how much of our present web culture came to be.
The average website exists in a single form for a period of time before being reinvented in a redesign, but if we value published content, then preserving it through those changes should be a priority.
Today, many argue that if a page does not appear in search engines or if there’s no clear link to it on the web, then it no longer exists. By law, the famous Library of Alexandria preserved any scroll it acquired, yet on the web, we throw out useful old content if visitor numbers are not high enough or because certain copyright laws prohibit us from preserving knowledge soon to be lost.
Digital Curators and Librarians
It would be unfair not to mention some present-day schemes to protect valuable web content from being discarded, such as the Internet Archive and, to a point, Wikipedia and Google.
However, the shutdown of Geocities shows the damage that a web service’s disappearance can cause to our web culture. With the bookmarking service Delicious being threatened with extinction, we might see its users’ valuable bookmarks lost as well.
The knowledge and imprint of our history serve to teach others about the development of the web, and we must accept our role as curators and librarians of this modern digital world.
As a web professional, seeing web content disappear makes me sad, even if its relevancy and accuracy change over time.
As we produce more and more content, finding true treasures on the web is becoming increasingly difficult. While the average blog may only have fragments of gold, it still reflects the diverse world we inhabit. The purpose of owning a website is to increase visibility, but so many still leave their creations unattended, creating dead, broken links, orphan pages and poor navigation and archival systems.
We should do more to improve our websites and showcase the good content in our archives, even if only to revisit an article or subject from a present-day perspective.
The Danger of Disposable Data
We have bred a society in which information and opinion aren’t valued over the long term. As a result, we are left with no infrastructure to ensure the sustainability of that content.
For example, little exists from websites created in the 1990s, and what can be found is often disjointed and scattered; imagine the state of our web content 20 years from now!
Quite paradoxically, we value our own work so highly, often spending hours upon hours creating beautiful websites and amazing content, and yet we forget them after their 15 minutes of fame.
To wake up from this mentality of waste, we have to be critical in our evaluation of modern web history. We must recognize that as the web changes, formats will shift and media consumption will alter, and we will need to maintain some level of control over the effects of popular trends.
Missing Links and Lost Empires
As a web professional or website owner, you can do plenty to reduce the disappearance of your content on your website. Certain practices have profound benefits and could give users additional reasons to return to your website. If we can expose a website’s history, we provide a more enriching experience full of quality content.
The first thing to do in promoting a healthy archive is to weed out the dead links that accumulate over time. This waste is easy to spot with tools like Xenu’s Link Sleuth (freeware), which crawls a website and reports the links that no longer resolve.
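If you would rather script a basic check yourself, the idea can be sketched in a few lines of Python. This is only a minimal sketch, not a replacement for a real crawler: it assumes you already have the HTML of each page in memory (the `pages` dictionary is a hypothetical input), and it only flags internal links that point to a page you do not have. The function and class names are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return absolute URLs (fragments removed) linked from one page."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def find_dead_internal_links(pages):
    """Find internal links that point to pages we do not have.

    `pages` (an assumed input) maps each page URL to its HTML source.
    Returns (source_page, target_url) pairs where the target is on the
    same host as the source but is not a key of `pages`.
    """
    dead = []
    for url, html in pages.items():
        host = urlparse(url).netloc
        for target in extract_links(html, url):
            if urlparse(target).netloc == host and target not in pages:
                dead.append((url, target))
    return dead
```

External links are deliberately ignored here; checking them would mean making network requests, which a dedicated link checker handles far better.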
Next, we can ensure the survival of our content by connecting every page to the rest of the website and listing them all on a site map (as well as creating a robot-readable Sitemaps XML file). Disconnected pages — pages that have no remaining active links to them — are orphans. Orphan pages rarely index well, which negatively impacts the findability of your content and its usefulness to future generations.
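Both halves of this advice can be sketched in Python. The snippet below is an illustration under stated assumptions: `link_graph` is a hypothetical dictionary that maps each page URL to the URLs it links to, and `entry_points` names pages (such as the home page) that are allowed to have no inbound links. The generated file uses the `urlset`, `url` and `loc` elements of the Sitemaps protocol and leaves out optional fields such as `lastmod`.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def find_orphans(link_graph, entry_points=()):
    """Return a sorted list of pages that no other page links to.

    `link_graph` (an assumed input) maps each page URL to the URLs it
    links to. A page linking to itself does not count as an inbound link.
    """
    linked = set()
    for source, targets in link_graph.items():
        linked.update(t for t in targets if t != source)
    return sorted(
        page for page in link_graph
        if page not in linked and page not in entry_points
    )


def build_sitemap(urls):
    """Return a minimal Sitemaps XML document listing the given URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(urls):
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Listing an orphan in the sitemap helps robots find it, but human visitors still need a real link from somewhere on the website.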
If you’re the kind of person who shudders at the thought of their earlier work, consider making active revisions of your old content visible on your website; do, however, preserve the original using some form of version control system.
If you redesign your website regularly and change content often, keep older pages available for nostalgic users or those interested in seeing previous versions. Exposing revisions (like a version history file) can also be very beneficial in tracking the progress and evolution of websites.
Donating published content is another practice to consider. While republishing old material ad infinitum is not sensible (because it would create duplicate content for search engines to index and perhaps be regarded as spam), there may come a time when you close down, change direction or overhaul your website.
In such cases, consider donating your posts to other websites (think of it as content acquisition); you could make a bit of money from it and free up useful data.
While donating outdated or useless content could help you clean up your web presence, the websites or services that receive this old material might not be able to manage it effectively, especially if it comes in at an unsustainable rate; this could, in turn, lead to dead image links, stale content and an increase in 404 errors. Instead, try to use your archived material in other ways, perhaps by promoting earlier articles, as a retrospective of sorts.
Finally, the most important precaution for ensuring the survival of your content is to back it up. To this day, many websites still have no substantial process for archiving and backing up their content. What happens if your website is hacked? What if your computer crashes? If the history of computing has shown us anything, it’s that data disappears both when computers fail and when websites or web technologies age.
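Even a very simple routine is better than none. As a rough sketch, assuming your website’s files live in a single local folder, the Python function below packs that folder into a timestamped archive. The name `backup_site` and its parameters are hypothetical, and a real backup process would also cover databases and copy the archive to a second location.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def backup_site(site_dir, backup_dir, now=None):
    """Pack site_dir into a timestamped .tar.gz inside backup_dir.

    `now` is optional and exists mainly so the file name can be fixed
    when testing; by default the current UTC time is used.
    Returns the path of the archive that was written.
    """
    site_dir = Path(site_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    archive = backup_dir / f"{site_dir.name}-{stamp}.tar.gz"

    # Store files under the folder's own name so that the archive
    # unpacks into a single directory.
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(site_dir), arcname=site_dir.name)
    return archive
```

Keep `backup_dir` outside `site_dir`, otherwise each new archive would swallow the previous ones. Because every run writes a new file, older snapshots double as a crude version history of the website.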
Historiography for the People
The web is ever-changing, and its history is being erased, written over and lost in poorly maintained archives.
High-quality content — even an individual’s personal reflection on the world on his or her blog — never loses value. Of course, drowning out the spam and fluff helps, but if we value a healthy digital ecosystem, then we will focus on producing things that contribute to our evolving worldwide virtual library.
The content we produce will give future generations a fascinating look at how the web has evolved over time and how web professionals and ordinary folks have carried out their daily tasks.
We should not build websites solely for the here and now, forgetting the mistakes and successes of those who have come before us. By preserving the past and documenting the development of the web, we are immortalizing ourselves, ensuring that we don’t become yet another people who simply fade into oblivion.