Information Archiving
Best Practices
Copyright 2022, Faulkner Information Services. All Rights Reserved.
Docid: 00011149
Publication Date: 2208
Report Type: TUTORIAL
Preview
Organizations that handle vast amounts of information often find themselves
at a crossroads when it comes to managing that data in a safe,
regulatory-compliant manner. Information archiving, in its simplest
definition, comprises the methods used to store data so that it is
safeguarded from loss or misuse. Although many executives treat
information archiving as optional, the practice can be a lifesaver when
data is lost or compromised. Putting a cohesive plan in place helps
protect data over long periods of time. This tutorial examines the
considerations involved in archiving massive amounts of information.
Report Contents:
- Executive Summary
- Archiving Types
- Archiving Issues
- Archiving Compliance
- Recommendations
- Web Links
- Related Reports
Executive Summary
The essential process of protecting enterprise information involves: (1)
preventing its loss or theft; (2) securing it against unauthorized
modification or destruction; and (3) safeguarding it from misappropriation
or other unauthorized use.
Related Faulkner Reports: Enterprise Content Management Software Market Trends
One of the key strategies for protecting information, especially from
loss, is information archiving – copying and
transferring vital, important, and useful information to a geographically
separate location.
According to Code 232 of the National Fire Protection Association (NFPA)
“Standard for the Protection of Records,” information is considered:
- Vital if it is irreplaceable and would cause a
serious legal problem or business interruption if it were unavailable,
even temporarily.
- Important if it is reproducible, but only at
considerable expense or delay.
- Useful if its loss might cause temporary
inconvenience but not a serious business disadvantage.1
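The three categories above lend themselves to a simple decision rule. The following Python sketch is illustrative only: the `RecordClass` enum, the `classify` function, and its four yes/no inputs are hypothetical names introduced here, not part of NFPA 232, and the `UNCLASSIFIED` fallback is an added assumption for information that meets none of the three definitions.

```python
from enum import Enum


class RecordClass(Enum):
    VITAL = "vital"
    IMPORTANT = "important"
    USEFUL = "useful"
    UNCLASSIFIED = "unclassified"  # assumption: fallback not named in the standard


def classify(*, irreplaceable: bool, serious_impact_if_unavailable: bool,
             costly_to_reproduce: bool, inconvenient_if_lost: bool) -> RecordClass:
    """Apply the vital / important / useful definitions in order of severity."""
    if irreplaceable and serious_impact_if_unavailable:
        return RecordClass.VITAL
    if costly_to_reproduce:
        return RecordClass.IMPORTANT
    if inconvenient_if_lost:
        return RecordClass.USEFUL
    return RecordClass.UNCLASSIFIED
```

In practice the four inputs would come from a records inventory completed by the business owner of each information set; the rule itself only encodes the order in which the definitions are tested.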
From natural disasters like floods and tornadoes to human errors like
accidental deletion or intentional data breaches, it is critical for
businesses to employ information archiving in one form or another to
protect data for long periods.
A prime example pertains to the making of the movie Toy Story 2.
An animator with Pixar accidentally deleted a root folder connected to the
film and the movie company watched in horror as approximately 90 percent
of the film disappeared completely – nearly two years’ worth of work. Pixar
attempted to retrieve data from backup drives, but that ultimately failed. The
company was left with very limited options and a rapidly approaching
deadline. A technical director who was working remotely to take care of
her newborn notified Pixar that she had saved copies of the film on her
home computer. Without realizing it at the time, Galyn Susman saved Toy
Story 2 – all because she’d copied it to her home hard drive to
work on while keeping an eye on her new baby! The movie went on to gross
over $245 million in revenues.2
Figure 1. Classic Paper Archives (like this one) Are Being Replaced By Digital Archives
Source: Wikimedia Commons
Information Archiving Best Practices
To implement an effective information archiving program, enterprise
planners should:
- Adopt a comprehensive information classification scheme that
delineates between vital, important, and useful information (that is,
information that should be archived) and low-value or
readily-reproducible information that does not have to be archived.
- Adapt information archival processes to record only archive-eligible
material.
- Store archived information at enterprise owned, operated, or
controlled facilities.
- Perform periodic retrieve and restore exercises to test the integrity
of archived information.
- Convert old information archives stored on volatile media, like
magnetic tape, to new, longer-lasting, and currently-supported devices.
Importantly, exploit this opportunity to “weed out” expired or
unnecessary information.
- Whenever possible, record archived information on write-once media to
avoid inadvertent – or intentional – data corruption.
- Encrypt all vital, important, and useful information prior to
archiving. Carefully preserve data encryption keys to ensure
information restorability.
- Review information archiving protocols whenever new or revised record
retention regulations are issued. Modify processes to satisfy any
new legal requirements.
- Establish an in-house document imaging capability, and render all
vital paper or hard copy records into electronic form for archiving.
- Develop procedures for capturing and archiving vital mobile and
wireless information. This can be achieved by uploading smartphone,
laptop, and other electronic data to an enterprise’s central servers,
which are archived.
- Whenever possible, employ data compression technologies to minimize
archive storage requirements.
- Archive essential information to enterprise class storage arrays.
- Develop an e-mail archiving policy to ensure electronic business
communications are readily retrievable (i.e., discoverable) as per all
relevant laws and regulations.
- Develop a closely-aligned social media archiving policy to preserve
vital Twitter tweets and Facebook postings.
- Leverage business collaboration platforms to aggregate archive-worthy
data. In this way, vital data pertaining to a particular project or
initiative can be stored in – and, ultimately, archived from – a single
repository.
- Ensure that all removable media – especially tapes, CDs, DVDs, and
USB flash drives – are properly labeled. The label should contain a
brief description of the device’s content plus the creation date.
- Engage a third-party firm to perform periodic archiving
audits. Promptly act on any audit recommendations.
- On an on-going basis, determine the impact of emerging technologies,
such as the Internet of Things and edge computing, and update enterprise
information archiving policies and practices as appropriate.
- Attend archiving seminars, conferences, and workshops and assimilate
any lessons learned, adjusting enterprise information archiving policies
and practices as appropriate.
- Appoint an Information Archiving Manager, an individual responsible
for all enterprise information archiving operations.
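One way to make the periodic "retrieve and restore" exercise concrete is to record a checksum for every file at archive time and compare those checksums after a test restore. The sketch below assumes file-based archives and uses SHA-256 from the Python standard library; the function names (`sha256_of`, `build_manifest`, `verify_restore`) are hypothetical and introduced only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archive files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir: Path) -> dict[str, str]:
    """Record a checksum for every file at the time it is archived."""
    return {
        p.relative_to(archive_dir).as_posix(): sha256_of(p)
        for p in sorted(archive_dir.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict[str, str], restored_dir: Path) -> list[str]:
    """Compare a test restore against the manifest and list any problems."""
    problems = []
    for relative_name, expected in manifest.items():
        candidate = restored_dir / relative_name
        if not candidate.is_file():
            problems.append(f"missing: {relative_name}")
        elif sha256_of(candidate) != expected:
            problems.append(f"corrupted: {relative_name}")
    return problems
```

An empty list from `verify_restore` indicates that every archived file came back intact. The manifest itself should be stored separately from the archive volume so that a single failure cannot damage both.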
Archiving vs. Backup
Information archiving is often confused with information (or data)
backup.
- Backup provides short-term protection of data,
usually digital data, and offers insurance against data corruption,
accidental or deliberate erasures, or media failure. It is typically
stored locally or at another location where it can be easily
accessed.
- Archiving provides long-term protection of digital
and non-digital data, and preserves data that must be retained for long
periods due to business or regulatory requirements.
Importantly, archiving can help reduce backup operations; once records are
archived, there is no need to back them up on a regular basis. Keeping
both backup and archiving operations under policy control is critical.
Archiving Types
Not long ago, the options for information archiving were rather limited
and generally involved trucking paper files or computer backup tapes to a
commercial storage facility like Iron Mountain. While this process is
still in practice, particularly for paper-based assets, technological
advances like the Internet and broadband communications have enabled the
real-time storage of electronic records through electronic vaulting, data
replication, and other means.
Electronic Vaulting
Electronic vaulting, also known as data vaulting or online backup and
recovery, is the process of storing electronic information at an offsite
location via a direct data line or Internet connection. Electronic vaulting
can occur on a scheduled basis or in real time, triggered automatically
by the creation or modification of an electronic file. Encryption is used
to secure data for transmission to offsite locations. To serve
the consumer and small business market, a number of vendors offer
small-scale electronic vaulting services designed to store several
gigabytes of customer information.
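For scheduled vaulting, the core decision is which files have been created or modified since the previous run. The Python sketch below shows one simple approach based on last-modified timestamps. The function names are hypothetical, and encryption and transmission to the offsite location are deliberately left out; a production vaulting service would handle both.

```python
from pathlib import Path


def snapshot_mtimes(root: Path) -> dict[str, float]:
    """Map each file under root (by relative path) to its last-modified time."""
    return {
        p.relative_to(root).as_posix(): p.stat().st_mtime
        for p in root.rglob("*")
        if p.is_file()
    }


def files_to_vault(current: dict[str, float], vaulted: dict[str, float]) -> list[str]:
    """Return files that are new, or modified since they were last vaulted."""
    return sorted(
        name
        for name, mtime in current.items()
        if name not in vaulted or mtime > vaulted[name]
    )
```

A scheduled job could call `snapshot_mtimes` on the source directory, pass the result to `files_to_vault` along with the snapshot saved by the previous run, transfer the returned files, and then save the new snapshot.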
Data Replication
Data replication is the process of creating duplicate databases to allow
local user groups to operate with their own copy. While not technically an
offsite storage option, data replication has the effect of producing an
offsite image, thus offering some measure of information protection in the
event the source database is destroyed or otherwise compromised.
Commercial Storage
Commercial storage centers provide long-term storage space for the
preservation of critical customer records, both paper (hard copy)
records and electronic records (usually in the form of backup tapes
and CDs). Commercial vaults are:
- Climate-controlled, maintaining an appropriate balance of temperature
and humidity.
- Equipped with sophisticated fire detection and suppression systems to
guard against fire, the number one threat to storage media.
- Armed with the latest access control systems to prevent information
theft and/or tampering.
Intra-Enterprise Storage
In addition to third-party storage sites, an enterprise with two or more
locations can leverage its geographical diversity by providing its own
offsite storage. Site “A” can store information at Site “B”, and Site “B”,
if necessary, can store information at Site “A”. This arrangement may be:
- More cost effective than engaging a commercial provider (depending on
the provider’s fee structure).
- More secure since information control is never ceded to an outside
firm.
There may, however, be a one-time cost associated with buying or building
an appropriate information storage facility (or facilities).
Mobile Storage
One ad hoc solution for achieving offsite storage – still employed by
many small business owners – involves the end-of-day (or end-of-week)
dumping of PC or server data to mobile information devices, such as tapes,
CDs, DVDs, USB flash drives, external hard drives, smartphones, and/or
laptops. The owners then take these devices home, rendering their
place of residence a de facto offsite storage
facility. This process is both imprecise and insecure when compared
to conventional offsite storage methods, but may be sufficient in
situations where vital records are relatively nonvolatile.
E-mail Storage
One backdoor method for storing electronic files offsite is to attach the
files to e-mail messages, turning the e-mail repository into a quasi
hard drive. Commercial Internet providers like Microsoft virtually
invite the use – or abuse – of their e-mail archives by providing each
user with a large, often excessive, amount of free space.
Cloud Archiving
Cloud archiving, including “archiving-as-a-service,” is quite popular.
Cloud archiving offers the same basic benefits as other cloud services,
notably:
- Reduced capital expenditure
- Reduced operational support
- Predictable costs
- Ready scalability
As with any outsourced information service, cloud archiving clients must
ensure that providers are exercising due diligence in protecting
enterprise data. Creating a detailed information archiving service
level agreement (SLA) is a good place to start.
Archiving Issues
Information archiving presents certain information lifecycle management
(ILM) and security issues. Consider the following:
- With rapid increases in hard drive and server capacity, electronic
information is accumulating at an unprecedented pace.
- While new and stricter retention regulations, particularly for
electronic mail, are driving the archiving process, the indiscriminate
production and distribution of electronic documents, spreadsheets, and
other data (ostensibly, business-oriented) are imposing a burden on
limited archiving assets.
- Since archival storage is typically outsourced to third-party storage
providers, the integrity of archived data – particularly long-term – is
highly suspect.
- Today’s archival storage protocols are biased toward protecting data
forever, leaving little recourse for locating and extracting obsolete
data, and thus reducing data stores to a manageable size.
- Finally, securing archival data is a multi-dimensional process –
protecting vital records against loss, destruction, corruption, and
misappropriation.
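The difficulty of locating obsolete data can be reduced if every archived record carries its own retention period. The following Python sketch is a minimal illustration under that assumption; the `ArchivedRecord` fields and the function names are hypothetical, and the retention periods shown in any real deployment would have to come from the applicable regulations and legal counsel, not from code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivedRecord:
    record_id: str
    archived_on: date
    retention_years: int  # assumption: retention is tracked in whole years


def retention_end(record: ArchivedRecord) -> date:
    """Date on which the record's retention period ends."""
    target_year = record.archived_on.year + record.retention_years
    try:
        return record.archived_on.replace(year=target_year)
    except ValueError:
        # 29 February archived date landing in a non-leap year
        return record.archived_on.replace(year=target_year, day=28)


def expired_records(records: list[ArchivedRecord], today: date) -> list[ArchivedRecord]:
    """Records whose retention period has ended and that can be reviewed for disposal."""
    return [r for r in records if today >= retention_end(r)]
```

The function returns candidates for review and does not delete anything; records subject to a legal hold, for example, would need to be excluded before disposal.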
To offer perspective on information archive security issues, researchers
Mark Storer, Kevin Greenan, and Ethan Miller at the University of
California at Santa Cruz studied the “Long-Term Threats to Secure
Archives.”3 Table 1 offers some of their observations.
Table 1. Long-Term Threats to Secure Archives
Threat | Description |
---|---|
Preservation of Encryption Keys | Encryption keys offer a single point of failure. |
“It’s Here Somewhere” Data | Archival storage is not designed to serve as a primary data storage solution. |
The Owner Is Dead, Long Live the Owner | Archival storage needs to be able to authenticate new users and establish their relationship to resources. |
Degraded Data | Data begins to degrade as soon as it is placed on media. |
Unlimited Attack Time | Timeframes for data lifetimes extend from a short-term scale (months and years) to a longer-term scale (decades), thus giving potential attackers a larger window to compromise system integrity. |
Impermanent Archives | Long-term storage systems will witness events that require data to be moved between archives. |
Out of Sight – Out of Control | Migrating intellectual capital to a third party for long-term storage carries risk. |
What About Paper Records? | Archiving paper or hard copy records can be problematic as many vital records are not archived, and records tend to be difficult to store and search. |
Ineffective Waste Management | Bloated archives include expired documents – which could and should be eliminated – and non-business data – which should be excluded. |
Lost in Transit | Archive volumes can be lost or stolen while in transit. |
Archiving Compliance
The past 30 years have seen unprecedented growth in digital data: Big
Data, e-mail, texts, social media posts, and more. In response, lawmakers
and other governmental officials in the US, European Union, and elsewhere
have crafted a variety of laws and regulations designed to govern the
collection, protection, administration, and preservation of enterprise
information. As a result, cautious and conscientious enterprise
officials have become heavily invested in complying with all legally
relevant information management rules, regulations, standards,
and guidelines, including, importantly, statutes related to information
archiving.
Satisfying Regulatory Bodies
Among the major regulatory bodies and individual regulations requiring
enterprise attention are:
- FINRA – the Financial Industry Regulatory Authority
- SOX – the Sarbanes-Oxley Act of 2002
- HIPAA – the Health Insurance Portability and Accountability Act
- FDA – the Food and Drug Administration
- GDPR – the EU General Data Protection Regulation
- CCPA – the California Consumer Privacy Act
FINRA, for example, requires that a regulated broker-dealer’s financial
“books and records” be stored in one of the following forms:
- Paper;
- Micrographic media (microfilm, microfiche, or any similar medium); or
- Electronic storage media.
With respect to electronic storage media (ESM), any selected ESM must:
- Preserve records exclusively in a non-rewriteable, non-erasable
format.
- Automatically verify the quality and accuracy of the storage media
recording process.
- Have the capacity to readily download stored records and indexes.
The broker-dealer must have an audit system that identifies when original
and duplicate records are input onto the storage medium and when any
changes to existing records are made.
Additionally, the broker-dealer is required to retain, keep current, and
surrender upon request by the US Securities and Exchange Commission (SEC)
staff all the information needed to access and download stored records and
indexes.
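The combination of non-rewriteable storage and an audit trail can be illustrated in a few lines of Python. This is a conceptual sketch only: `WriteOnceStore` is a hypothetical in-memory class, not a compliant storage product, and actual compliance depends on storage systems and procedures that satisfy the regulator's requirements.

```python
import hashlib
from datetime import datetime, timezone


class WriteOnceStore:
    """In-memory illustration of write-once storage with an audit trail."""

    def __init__(self) -> None:
        self._records: dict[str, bytes] = {}
        self.audit_log: list[dict[str, str]] = []

    def _audit(self, event: str, record_id: str, data: bytes) -> None:
        # Every attempt is logged with a timestamp and a hash of the payload.
        self.audit_log.append({
            "event": event,
            "record_id": record_id,
            "sha256": hashlib.sha256(data).hexdigest(),
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def put(self, record_id: str, data: bytes) -> None:
        """Store a record once; any later attempt to change it is refused."""
        if record_id in self._records:
            self._audit("rejected_overwrite", record_id, data)
            raise PermissionError(f"record {record_id!r} cannot be changed")
        self._records[record_id] = bytes(data)
        self._audit("stored", record_id, data)

    def get(self, record_id: str) -> bytes:
        return self._records[record_id]
```

Because rejected attempts are logged alongside successful writes, the audit log shows both when a record was input and when someone tried to alter it.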
Information Archiving Solutions
To help facilitate archiving compliance, many, if not most, enterprises
have implemented a comprehensive enterprise information archiving (EIA)
solution. The leading EIA solution providers, according to Gartner,4 are:
- Microsoft
- Smarsh
- Proofpoint
- Mimecast
- Global Relay
- Veritas
The Microsoft Compliance suite, for example, is not a single product; it
“offers an integrated set of solutions to address … information risk and
archiving challenges.” These include:
- “Seamless risk management. [The] Microsoft 365 Advanced
eDiscovery solution … can automatically collect linked content with
the original message in Microsoft Teams, Yammer, and Outlook.”
- “Leveraging machine learning to manage data at scale. With the
large volume of data created every day, it is impossible to manually
manage an organization’s content.” Managing data “at scale” is more
effective and efficient. Utilizing integrated machine learning,
Microsoft’s “trainable classifiers categorize content for retention,
deletion, and protection policies.”
- “New data types and multi-cloud compliance. The rise of text
messages, asynchronous communication, and other communication modes
creates a variety of formats to manage risk and compliance. [Recently],
Microsoft introduced 65 plus new connectors built by Microsoft and
partners” to manage imported data.5
Recommendations
Appoint a Content Curator
The term “curator” is normally associated with an individual, like a
museum curator, who collects, cultivates, and presents artistic works
according to a theme or strategy. Like museums, modern enterprises should appoint a curator, someone who
has responsibility for recommending – if not deciding – which information
elements are archive-worthy. Advances in information storage and
retrieval allow enterprises to be less discriminating in terms of the
information they retain – even encouraging some enterprises to adopt a
“keep it all” philosophy. Volume, however, does not equal value and,
thus, some individual or team should be charged with identifying which
information elements are vital and which elements are not. Only by
knowing which information elements are critical can enterprises extract
the maximum value from their information assets.
Utilize Information Archiving Solutions
Specifically, favor EIA solutions that are powered by artificial
intelligence and machine learning algorithms. “Use of AI/ML enables
organizations to train policies that automate and improve results of data
classification and assess user actions to streamline e-discovery and
supervisory review.”6
Monitor Information Archiving Cases
The enterprise legal department should endorse the enterprise approach to
information archiving. In addition, enterprise lawyers should provide
feedback on any pending judicial cases that involve archiving. Such
analysis is critical to ensure that enterprise procedures do not violate
crucial archiving-related laws and regulations.
References
1 David Hague. “How NFPA 232 Can Help You Protect Your
Records.” NFPA Journal. March/April 2002:53.
2 Gillian Orr. “Pixar’s Billion-Dollar Delete Button Nearly
Lost Toy Story 2 Animation.” Independent. May 17, 2012.
3 Mark W. Storer, Kevin Greenan, and Ethan L. Miller.
“Long-Term Threats to Secure Archives.” Storage Systems Research Center,
University of California, Santa Cruz, CA. 2006.
4 Timothy King. “The 6 Major Players in Enterprise Information
Archiving, 2022.” Solutions Review. March 3, 2022.
5 Rudra Mitra. “Gartner Names Microsoft a Leader in the 2022
Magic Quadrant for Enterprise Information Archiving.” Microsoft. January
28, 2022.
6 Michael Hoeck and Jeff Vogel. “Magic Quadrant for Enterprise
Information Archiving.” Gartner. January 24, 2022.
Web Links
Global Relay: https://www.globalrelay.com/
Iron Mountain: http://www.ironmountain.com/
Microsoft: https://www.microsoft.com/
Mimecast: https://www.mimecast.com/
National Fire Protection Association: https://www.nfpa.org/
National Institute of Standards and Technology: https://www.nist.gov/
Proofpoint: https://www.proofpoint.com/
Smarsh: https://www.smarsh.com/
Sungard Availability Services: https://www.sungardas.com/
Veritas: https://www.veritas.com/