Information Archiving Best Practices


by Faulkner Staff

Docid: 00011149

Publication Date: 2208

Report Type: TUTORIAL


Given the vast amount of information that they handle, organizations often
find themselves at a crossroads when it comes to managing large quantities
of data in a safe, regulation-compliant manner. Information archiving, in
its simplest definition, encompasses methods of storing data so that it is
safeguarded from loss or misuse. Although many executives treat
information archiving as an option that they would like to have, the truth
is that this practice can be a lifesaver in many ways. Putting a cohesive
plan in place helps protect data for long periods of time. This tutorial
looks at many of the considerations involved in archiving massive amounts
of information.

Executive Summary

The essential process of protecting enterprise information involves: (1)
preventing its loss or theft; (2) securing it against unauthorized
modification or destruction; and (3) safeguarding it from misappropriation
or other unauthorized use.

Faulkner Reports: Enterprise Content Management Software Market Trends

One of the key strategies for protecting information, especially from
loss, is information archiving – copying and
transferring vital, important, and useful information to a geographically
separate location.

According to Code 232 of the National Fire Protection Association (NFPA)
“Standard for the Protection of Records,” information is considered:

  • Vital if it is irreplaceable and would cause a
    serious legal problem or business interruption if it were unavailable,
    even temporarily.
  • Important if it is reproducible, but only at
    considerable expense or delay.
  • Useful if its loss might cause temporary
    inconvenience but not a serious business disadvantage.1
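
The three tiers above can be expressed as a simple decision rule. The following is a minimal sketch, not an NFPA-defined procedure: the `Record` fields, the `LOW_VALUE` tier for records that meet none of the three definitions, and the shortcut of treating every irreplaceable record as vital are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class RecordClass(Enum):
    VITAL = "vital"
    IMPORTANT = "important"
    USEFUL = "useful"
    LOW_VALUE = "low-value"  # assumed catch-all tier, not part of the NFPA list


@dataclass
class Record:
    # Hypothetical attributes a records manager might capture per record.
    name: str
    replaceable: bool          # can the record be reproduced at all?
    costly_to_reproduce: bool  # reproducible only at considerable expense or delay?
    loss_inconvenient: bool    # would its loss cause temporary inconvenience?


def classify(record: Record) -> RecordClass:
    """Map a record onto the vital / important / useful tiers."""
    if not record.replaceable:
        return RecordClass.VITAL      # simplification: irreplaceable means vital
    if record.costly_to_reproduce:
        return RecordClass.IMPORTANT
    if record.loss_inconvenient:
        return RecordClass.USEFUL
    return RecordClass.LOW_VALUE


def should_archive(record: Record) -> bool:
    """Only records in one of the three named tiers are archive candidates."""
    return classify(record) is not RecordClass.LOW_VALUE
```

In practice the yes/no attributes would come from a records inventory or from interviews with the business owners of the data.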

Threats range from natural disasters like floods and tornadoes to human
actions like accidental deletion or intentional data breaches. It is
therefore critical for businesses to employ information archiving in one
form or another to protect data for long periods.

A prime example is the making of the movie Toy Story 2. An animator at
Pixar accidentally deleted a root folder connected to the film, and the
company watched in horror as approximately 90 percent of the film
disappeared completely – nearly two years’ worth of work. Pixar attempted
to retrieve the data from backup drives, but that ultimately failed. The
company was left with very limited options and a rapidly approaching
deadline. A technical director who was working remotely to take care of
her newborn notified Pixar that she had saved copies of the film on her
home computer. Without realizing it at the time, Galyn Susman saved Toy
Story 2 – all because she’d copied it to her home hard drive to work on
while keeping an eye on her new baby! The movie went on to gross over
$245 million in revenues.2

Figure 1. Classic Paper Archives (like this one) Are Being Replaced By Digital Archives

Source: Wikimedia Commons

Information Archiving Best Practices

To implement an effective information archiving program, enterprise
planners should:

  1. Adopt a comprehensive information classification scheme that
    delineates between vital, important, and useful information (that is,
    information that should be archived) and low-value or
    readily-reproducible information that does not have to be archived.
  2. Adapt information archival processes to record only archive-eligible
    information.
  3. Store archived information at facilities that the enterprise owns,
    operates, or controls.
  4. Perform periodic retrieve and restore exercises to test the integrity
    of archived information.
  5. Convert old information archives stored on degradation-prone media, like
    magnetic tape, to new, longer-lasting, and currently-supported devices.
    Importantly, exploit this opportunity to “weed out” expired or
    unnecessary information.
  6. Whenever possible, record archived information on write-once media to
    avoid inadvertent – or intentional – data corruption.
  7. Encrypt all vital, important, and useful information prior to
    archiving. Carefully preserve data encryption keys to ensure
    information restorability.
  8. Review information archiving protocols whenever new or revised record
    retention regulations are issued. Modify processes to satisfy any
    new legal requirements.
  9. Establish an in-house document imaging capability, and render all
    vital paper or hard copy records into electronic form for archiving.
  10. Develop procedures for capturing and archiving vital mobile and
    wireless information. This can be achieved by uploading smartphone,
    laptop, and other electronic data to an enterprise’s central servers,
    which are archived.
  11. Whenever possible, employ data compression technologies to minimize
    archive storage requirements.
  12. Archive essential information to enterprise-class storage arrays.
  13. Develop an e-mail archiving policy to ensure electronic business
    communications are readily retrievable (i.e., discoverable) as per all
    relevant laws and regulations.
  14. Develop a closely-aligned social media archiving policy to preserve
    vital Twitter tweets and Facebook postings.
  15. Leverage business collaboration platforms to aggregate archive-worthy
    data. In this way, vital data pertaining to a particular project or
    initiative can be stored in – and, ultimately, archived from – a single
    platform.
  16. Ensure that all removable media – especially tapes, CDs, DVDs, and
    USB flash drives – are properly labeled. The label should contain a
    brief description of the device’s content plus the creation date.
  17. Engage a third-party firm to perform periodic archiving
    audits. Promptly act on any audit recommendations.
  18. On an on-going basis, determine the impact of emerging technologies,
    such as the Internet of Things and edge computing, and update enterprise
    information archiving policies and practices as appropriate.
  19. Attend archiving seminars, conferences, and workshops and assimilate
    any lessons learned, adjusting enterprise information archiving (EIA)
    policies and practices as appropriate.
  20. Appoint an Information Archiving Manager, an individual responsible
    for all enterprise information archiving operations.
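
The periodic retrieve-and-restore exercise in practice 4 can be partly automated by comparing checksums recorded at archive time with checksums of the restored files. The following is a minimal sketch under stated assumptions: the function names are hypothetical, archives are plain directories of files, and SHA-256 is chosen here only as a widely available hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large archives do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir: Path) -> dict[str, str]:
    """Record a checksum for every file at archive time, keyed by relative path."""
    return {
        p.relative_to(archive_dir).as_posix(): sha256_of(p)
        for p in sorted(archive_dir.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict[str, str], restored_dir: Path) -> list[str]:
    """Return the relative paths that are missing or whose contents changed."""
    failures = []
    for rel_path, expected in manifest.items():
        candidate = restored_dir / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failures.append(rel_path)
    return failures
```

An empty list from `verify_restore` means every archived file was restored intact. The manifest itself should be stored separately from the archive it describes.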

Archiving vs. Backup

Information archiving is often confused with information (or data)
backup. The two differ in purpose:

  • Backup provides short-term protection of data,
    usually digital data, and offers insurance against data corruption,
    accidental or deliberate erasures, or media failure. It is typically
    stored locally or at another location where it can be easily
    retrieved.
  • Archiving provides long-term protection of digital
    and non-digital data, and preserves data that must be retained for long
    periods due to business or regulatory requirements.

Importantly, archiving can help reduce backup operations; once records are
archived, there is no necessity to back them up on a regular basis. Regulating
backup – and archiving – operations is critical.

Archiving Types

Not long ago, the options for information archiving were rather limited
and generally involved trucking paper files or computer backup tapes to a
commercial storage facility like Iron Mountain. While this process is
still in practice, particularly for paper-based assets, technological
advances like the Internet and broadband communications have enabled the
real-time storage of electronic records through electronic vaulting, data
replication, and other means.

Electronic Vaulting

Electronic vaulting, also known as data vaulting or online backup and
recovery, is the process of storing electronic information at an offsite
location via a direct data line or Internet connection. Electronic vaulting
can occur on a scheduled basis or in real time, triggered automatically
by the creation or modification of an electronic file. Encryption is used
to secure data for transmission to offsite locations. To help serve
the consumer and small business market, a number of vendors are offering
small-scale electronic vaulting services designed to store multiple
gigabytes of customer information.
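
The scheduled variant described above amounts to copying whatever has changed since the previous run. The following is a minimal sketch under stated assumptions: a local directory stands in for the offsite vault, the encryption and network transfer that a real vaulting service would use are omitted, and the function name and parameters are hypothetical.

```python
import shutil
from pathlib import Path


def vault_changed_files(source_dir: Path, vault_dir: Path, last_run: float) -> list[str]:
    """Copy files modified after `last_run` (a POSIX timestamp) into vault_dir.

    Returns the relative paths that were copied, in sorted order.
    """
    copied = []
    for path in sorted(source_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime > last_run:
            relative = path.relative_to(source_dir)
            target = vault_dir / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves file timestamps
            copied.append(relative.as_posix())
    return copied
```

A scheduler would call this function at each interval and persist the time of the run as the next `last_run`. A real-time variant would react to file-system change events instead of polling.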

Data Replication

Data replication is the process of creating duplicate databases to allow
local user groups to operate with their own copy. While not technically an
offsite storage option, data replication has the effect of producing an
offsite image, thus offering some measure of information protection in the
event the source database is destroyed or otherwise compromised.

Commercial Storage

Commercial storage centers provide long-term storage space for the
preservation of critical customer records, both paper (hard copy) records
and electronic records (usually in the form of backup tapes and CDs).
Commercial vaults are:

  • Climate-controlled, observing the perfect balance of temperature and
    humidity.
  • Equipped with sophisticated fire detection and suppression systems to
    guard against the number one threat to storage media.
  • Armed with the latest access control systems to prevent information
    theft and/or tampering.

Intra-Enterprise Storage

In addition to third-party storage sites, an enterprise with two or more
locations can leverage its geographical diversity by providing its own
offsite storage. Site “A” can store information at Site “B”, and Site “B”,
if necessary, can store information at Site “A”. This arrangement may be:

  • More cost effective than engaging a commercial provider (depending on
    the provider’s fee structure).
  • More secure since information control is never ceded to an outside
    party.

There may, however, be a one-time cost associated with buying or building
an appropriate information storage facility (or facilities).

Mobile Storage

One ad hoc solution for achieving offsite storage – still employed by
many small business owners – involves the end-of-day (or end-of-week)
dumping of PC or server data to mobile information devices, such as tapes,
CDs, DVDs, USB flash drives, external hard drives, smartphones, and/or
laptops. The owners then take these devices home, rendering their
place of residence a de facto offsite storage
facility. This process is both imprecise and insecure when compared
to conventional offsite storage methods, but may be sufficient in
situations where vital records are relatively nonvolatile.

E-mail Storage

One backdoor method for storing electronic files offsite is to attach the
files to e-mail messages, turning the e-mail repository into a quasi-hard
drive. Commercial Internet providers like Microsoft virtually
invite the use – or abuse – of their e-mail archives by providing each
user with a large, often excessive amount of free space.

Cloud Archiving

Cloud archiving, including “archiving-as-a-service,” is quite popular.
Cloud archiving offers the same basic benefits as other cloud services,
including:

  • Reduced capital expenditure
  • Reduced operational support
  • Predictable costs
  • Ready scalability

As with any outsourced information service, cloud archiving clients must
ensure that providers are exercising due diligence in protecting
enterprise data. Creating a detailed information archiving service
level agreement (SLA) is a good place to start.

Archiving Issues

Information archiving presents certain information lifecycle management
(ILM) and security issues. Consider the following:

  • With rapid increases in hard drive and server capacity, electronic
    information is accumulating at an unprecedented pace.
  • While new and stricter retention regulations, particularly for
    electronic mail, are driving the archiving process, the indiscriminate
    production and distribution of electronic documents, spreadsheets, and
    other data (ostensibly, business-oriented) are imposing a burden on
    limited archiving assets.
  • Since archival storage is typically outsourced to third-party storage
    providers, the integrity of archived data – particularly long-term – is
    highly suspect.
  • Today’s archival storage protocols are biased toward protecting data
    forever, leaving little recourse for locating and extracting obsolete
    data, and thus reducing data stores to a manageable size.
  • Finally, securing archival data is a multi-dimensional process –
    protecting vital records against loss, destruction, corruption, and
    misuse.
To offer perspective on information archive security issues, researchers
Mark Storer, Kevin Greenan, and Ethan Miller at the University of
California at Santa Cruz studied the “Long-Term Threats to Secure
Archives.”3 Table 1 offers some of their observations.

Table 1. Long-Term Archiving Threats

Threat                                  Description
--------------------------------------  ------------------------------------------------
Preservation of Encryption              Encryption keys offer a single point of failure.

“It’s Here Somewhere”                   Archival storage is not designed to serve as a
                                        primary data storage solution.

The Owner Is Dead, Long Live the Owner  Archival storage needs to be able to authenticate
                                        new users and establish their relationship to the
                                        archived data.

Degraded Data                           Data begins to degrade as soon as it is placed on
                                        media.

Attack Time                             Timeframes for data lifetimes extend from a
                                        short-term scale (months and years) to a
                                        longer-term scale (decades), thus giving potential
                                        attackers a larger window to compromise system
                                        integrity.

Impermanent Archives                    Long-term storage systems will witness events that
                                        require data to be moved between archives.

Out of Sight – Out of Control           Migrating intellectual capital to a third party
                                        for long-term storage carries risk.

What About Paper Records?               Archiving paper or hard copy records can be
                                        problematic as many vital records are not
                                        archived, and records tend to be difficult to
                                        store and search.

Waste Management                        Bloated archives include expired documents – which
                                        could and should be eliminated – and non-business
                                        data – which should be excluded.

Lost in Transit                         Archive volumes can be lost or stolen while in
                                        transit.
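
The “Waste Management” threat suggests a periodic sweep for items that no longer belong in the archive. The following is a minimal sketch under stated assumptions: each item carries a hypothetical `retention_years` value and a `business_related` flag, and an item counts as expired on the anniversary date itself. Actual retention periods come from the applicable regulations and the enterprise's own policy.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedItem:
    # Hypothetical catalog entry; field names are illustrative only.
    name: str
    archived_on: date
    retention_years: int
    business_related: bool


def retention_end(item: ArchivedItem) -> date:
    """Date on which the item's retention period ends."""
    target_year = item.archived_on.year + item.retention_years
    try:
        return item.archived_on.replace(year=target_year)
    except ValueError:
        # Archived on 29 February and the target year is not a leap year.
        return item.archived_on.replace(year=target_year, day=28)


def purge_candidates(items: list[ArchivedItem], today: date) -> list[str]:
    """Names of expired or non-business items, for human review before deletion."""
    return [
        item.name
        for item in items
        if not item.business_related or retention_end(item) <= today
    ]
```

The function only lists candidates. Deletion would still need sign-off, since some expired records may be subject to a legal hold.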

Archiving Compliance

In response to the unprecedented growth in digital data over the past 30
years – Big Data, e-mail, texts, social media posts, etc. – lawmakers and
other government officials in the US, the European Union, and elsewhere
have crafted a variety of laws and regulations designed to govern the
collection, protection, administration, and preservation of enterprise
information. As a result, cautious and conscientious enterprise officials
have become heavily invested in complying with all legally relevant
information management rules, regulations, standards, and guidelines,
including, importantly, statutes related to information archiving.
Satisfying Regulatory Bodies

Among the major regulatory bodies and individual regulations requiring
enterprise attention are:

  • FINRA – the Financial Industry Regulatory Authority
  • SOX – the Sarbanes-Oxley Act of 2002
  • HIPAA – the Health Insurance Portability and Accountability Act
  • FDA – the Food and Drug Administration
  • GDPR – the EU General Data Protection Regulation
  • CCPA – the California Consumer Privacy Act

FINRA, for example, requires that a regulated broker-dealer’s financial
“books and records” be stored:

  • In paper form;
  • On micrographic media (microfilm, microfiche, or any similar medium); or
  • On electronic storage media (ESM).

With respect to electronic storage media, any selected ESM must:

  • Preserve records exclusively in a non-rewriteable, non-erasable
    format.
  • Automatically verify the quality and accuracy of the storage media
    recording process.
  • Have the capacity to readily download stored records and indexes.

The broker-dealer must have an audit system that identifies when original
and duplicate records are input onto the storage medium and when any
changes to existing records are made.
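
The non-rewriteable storage and audit requirements above can be illustrated in a few lines. The following is a minimal in-memory sketch, not a compliant implementation: the class and its method names are hypothetical, and real systems rely on write-once media or storage services that enforce immutability below the application layer.

```python
from datetime import datetime, timezone


class WriteOnceStore:
    """In-memory stand-in for non-rewriteable, non-erasable storage."""

    def __init__(self) -> None:
        self._records: dict[str, bytes] = {}
        # Each audit entry is a (UTC timestamp, action, record_id) tuple.
        self.audit_log: list[tuple[datetime, str, str]] = []

    def _log(self, action: str, record_id: str) -> None:
        self.audit_log.append((datetime.now(timezone.utc), action, record_id))

    def put(self, record_id: str, content: bytes) -> None:
        """Store a new record; refuse, but still log, any attempt to overwrite."""
        if record_id in self._records:
            self._log("rejected-overwrite", record_id)
            raise PermissionError(f"record {record_id!r} cannot be rewritten")
        self._records[record_id] = bytes(content)
        self._log("input", record_id)

    def get(self, record_id: str) -> bytes:
        """Read a stored record; there is deliberately no delete method."""
        return self._records[record_id]
```

The audit log records when each record was input and when a change was attempted, which is the kind of trail the audit requirement describes.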

Additionally, the broker-dealer is required to retain, keep current, and
surrender upon request by the US Securities and Exchange Commission (SEC)
staff all the information needed to access and download stored records and
indexes.
Information Archiving Solutions

To help facilitate archiving compliance, many, if not most, enterprises
have implemented a comprehensive enterprise information archiving
solution. The leading EIA solution providers, according to Gartner, are:

  • Microsoft
  • Smarsh
  • Proofpoint
  • Mimecast
  • Global Relay
  • Veritas 4

The Microsoft Compliance suite, for example, is not just one solution; it
“offers an integrated set of solutions to address … information risk and
archiving challenges.” These include:

  • Seamless risk management. “[The] Microsoft 365 Advanced
    eDiscovery solution … can automatically collect linked content with
    the original message in Microsoft Teams, Yammer, and Outlook.”
  • Leveraging machine learning to manage data at scale. “With the
    large volume of data created every day, it is impossible to manually
    manage an organization’s content.” Managing data “at scale” is more
    effective and efficient. Utilizing integrated machine learning,
    Microsoft’s “trainable classifiers categorize content for retention,
    deletion, and protection policies.”
  • New data types and multi-cloud compliance. “The rise of text
    messages, asynchronous communication, and other communication modes
    creates a variety of formats to manage risk and compliance. [Recently],
    Microsoft introduced 65 plus new connectors built by Microsoft and
    partners” to manage imported data.5


Appoint a Content Curator

The term “curator” is normally associated with an individual, like a
museum curator, who collects, cultivates, and presents artistic works
according to a theme or strategy. Like museums, modern enterprises should appoint a curator, someone who
has responsibility for recommending – if not deciding – which information
elements are archive-worthy. Advances in information storage and
retrieval allow enterprises to be less discriminating in terms of the
information they retain – even encouraging some enterprises to adopt a
“keep it all” philosophy. Volume, however, does not equal value and,
thus, some individual or team should be charged with identifying which
information elements are vital and which elements are not. Only by
knowing which information elements are critical can enterprises extract
the maximum value from their information assets.

Utilize Information Archiving Solutions

Specifically, favor EIA solutions that are powered by artificial
intelligence and machine learning algorithms. “Use of AI/ML enables
organizations to train policies that automate and improve results of data
classification and assess user actions to streamline e-discovery and
supervisory review.”6

Monitor Information Archiving Cases

The enterprise legal department should endorse the enterprise approach to
information archiving. In addition, enterprise lawyers should provide
feedback on any pending judicial cases that involve archiving. Such
analysis is critical to ensure that enterprise procedures do not violate
crucial archiving-related laws and regulations.

