Big Data Marketplace


Big Data Marketplace

by James G. Barr

Docid: 00021308

Publication Date: 2005

Report Type: MARKET


The term “Big Data” refers to the massive amounts of data being generated on
a daily basis by businesses and consumers alike – data that cannot be processed
using conventional data analysis tools owing to its sheer size and, in many
cases, its unstructured nature. Convinced that such data holds the key to
improved productivity and profitability, enterprise planners are searching for
tools capable of processing Big Data, and information technology providers are
developing solutions to accommodate new Big Data market demands.



Big Data Size

Analyst Sohini Mitter reports that, with the amount of data in the world
doubling every two years, by 2020 the digital universe will occupy 44 zettabytes, or 44 trillion gigabytes.1 That’s a lot of data to manage.
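As a back-of-the-envelope check on those figures (assuming decimal SI units), the zettabyte-to-gigabyte conversion and the two-year doubling rate can be sketched in a few lines of Python:

```python
ZB_IN_GB = 10**12  # 1 zettabyte = 10^21 bytes; 1 gigabyte = 10^9 bytes

digital_universe_zb = 44
print(f"{digital_universe_zb * ZB_IN_GB:,} GB")  # 44,000,000,000,000 GB

def projected_zb(year, base_year=2020, base_zb=44):
    """Project digital-universe size assuming data doubles every two years."""
    return base_zb * 2 ** ((year - base_year) / 2)

print(projected_zb(2022))  # one doubling period later -> 88.0
```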

The situation today is somewhat analogous to the data
management dilemmas of the early 1990s in which enterprises were unable to
properly process large amounts of customer and other structured data. That
Big Data problem was ameliorated, at least in part, through the fusion of
inexpensive storage and a new technology called massively parallel processing (MPP)
that enabled the creation of large-scale data warehouses – repositories from
which enterprise planners could sift through terabytes of data to gain critical
insight into how to improve business operations.

That experience, as much as any other factor, convinces
today’s planners that there’s real value in this next-generation version of Big
Data, provided next-generation tools are developed to efficiently – and economically –
store and process the text, audio, video, and other complex data structures that surround
and pervade enterprise operations.

Big Data Processing

IBM, a Big Data leader, believes that Big Data
spans four dimensions: Volume, Velocity, Variety, and Veracity.

Volume – Enterprises are awash with ever-growing
data of all types, easily amassing terabytes – even petabytes – of
information. How can an enterprise, for example:

  • Turn 12 terabytes of Tweets created daily into
    improved product sentiment analysis?

  • Convert 350 billion meter readings per annum to
    better predict power consumption?

Velocity – Sometimes two minutes is too late. For time-sensitive processes such as catching fraud,
Big Data must be used as it
streams into the enterprise in order to maximize its value. How can an enterprise:

  • Scrutinize five million trade events per day to
    identify potential fraud?

  • Analyze 500 million call detail records per day in
    real-time to predict customer churn faster?

Variety – Big
Data is any type of data – structured
and unstructured data such as text, sensor data, audio, video, click streams,
log files, and more. New insights are found when analyzing these data
types together. How can an enterprise:

  • Use hundreds of live video feeds from surveillance
    cameras to monitor points of interest?

  • Take advantage of the 80 percent data growth in
    images, video, and documents to improve customer satisfaction?

Veracity – Many business leaders
distrust the information they use to make decisions. How can an enterprise:

  • Act with any confidence upon untrustworthy information?

These are questions that Big Data vendors like IBM are trying to answer.

Big Data Market

As reported by Markets&Markets, the global Big Data market should grow from
$138.9 billion in 2020 to $229.4 billion by 2025, representing a respectable
compound annual growth rate (CAGR) of 10.6 percent during the forecast period.
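The quoted endpoints and growth rate are mutually consistent; as a quick sanity check (the dollar figures come from the passage above, the computation is mine):

```python
def cagr(start, end, years):
    """Compound annual growth rate implied by a start value, end value, and span."""
    return (end / start) ** (1 / years) - 1

# Markets&Markets figures: $138.9 billion in 2020 growing to $229.4 billion by 2025
rate = cagr(138.9, 229.4, years=5)
print(f"{rate:.1%}")  # ~10.6%
```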

According to the firm, major growth factors include increasing:

  • Utilization of Internet of Things (IoT) devices
  • Availability of data which might be mined for competitive advantage
  • Government investments aimed at enhancing digital technologies.

In terms of geographic presence, North America is dominant, with the Asia Pacific (APAC)
sector exhibiting the highest growth rate.2




Perhaps the single biggest contributor to the Big Data phenomenon is the
machine-to-machine (M2M) movement. The goal of M2M [now more commonly called the “Internet of
Things” (IoT)] is to make every individual
machine, device, or circuit "addressable", and capable of communicating and
interacting with every other machine, device, or circuit.

At its most basic, M2M/IoT involves four simple machine-to-machine functions:

  1. Collection – Select data is extracted
    from “Machine A,” a temperature sensor, for example.
  2. Transmission – The data is forwarded from Machine A – via
    a wired or wireless connection – to "Machine B" for analysis.
  3. Assessment – The data is evaluated by
    Machine B to determine what, if any, action should be taken; for example, the
    room temperature – as recorded by the temperature sensor (Machine A) – may
    be too high.
  4. Reaction – Machine B initiates the appropriate response, either
    activating the HVAC unit, or alerting a human operator. In the first
    instance, Machine B would interact directly with the HVAC system, essentially
    starting a second M2M transaction.
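The four functions above can be sketched as a minimal control loop. The device names and the 75°F threshold below are illustrative assumptions, not part of any particular product:

```python
# Minimal sketch of the four M2M/IoT functions: collection, transmission,
# assessment, and reaction. Threshold and device names are hypothetical.

TEMP_THRESHOLD_F = 75.0

def collect(sensor_reading):
    """Collection: Machine A (a temperature sensor) yields a reading."""
    return {"device": "sensor-A", "temp_f": sensor_reading}

def transmit(payload):
    """Transmission: forward the reading to Machine B (stubbed as a local call)."""
    return payload  # in practice, a wired or wireless send

def assess(payload):
    """Assessment: Machine B decides whether any action is needed."""
    return payload["temp_f"] > TEMP_THRESHOLD_F

def react(too_hot):
    """Reaction: activate the HVAC unit or alert no one."""
    return "activate HVAC" if too_hot else "no action"

print(react(assess(transmit(collect(82.0)))))  # prints "activate HVAC"
```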

M2M/IoT is the foundational technology that undergirds the
“Smart Grid,” our national effort to reduce energy consumption by fully
automating the generation, distribution, and utilization of electricity.

Big Data Beneficiaries

Among those industrial sectors that will benefit most from harnessing Big
Data are:

  • Banking
  • Discrete Manufacturing
  • Process Manufacturing
  • Federal Government
  • Professional Services3

Private sector companies are expected to devote more resources to Big Data
analysis than public sector agencies owing to competitive pressures. One
exception might be public sector administration in Europe where many countries
are driving austerity programs – programs tied to increasing productivity and
eliminating extravagant expenditures. Big Data, it is hoped, will produce
Big Productivity.

Big Data-Small Data Push-Pull

At least for now, the Big Data push is being opposed by a Small Data pull.
Rather than embracing the mountains of information generated each day,
especially e-mails, many organizations, concerned about regulation, litigation,
and other potential exposures like e-discovery demands, have imposed strict data
retention policies aimed at ensuring that "legacy" data is purged on a regular
schedule. As more Big Data analysis tools become available, these organizations
may be persuaded that the value of processing all – or
most of – their data is actually worth the risk of keeping it.

Google, Facebook & Amazon

Massive web properties like Amazon, Facebook, and Google – which were developing
Big Data infrastructures even before the term
“Big Data” was coined – will continue to invest heavily in commodity
hardware to build out their massive Big Data processing platforms.

National Security & Law Enforcement

Homeland Security Research Corporation reports that the "National
Security & Law Enforcement" sector commands a major portion of the Big
Data & Data
Analytics market, with industry revenues growing at an estimated CAGR (compound
annual growth rate) of 17.5 percent between 2015 and 2022.

Market drivers include:

  • "The increased use of smartphones, wearables, and other
    smart connected devices
    (cars, machines, IoT, etc.), which will continue
    to create enormous amounts of information that Homeland Security and Public
    Safety organizations can use to their advantage, mostly in Sigint (signal
    intelligence) related activities.
  • "The rise of state-of-the-art attack technologies
    (e.g., cyber-warfare, encrypted communication, cyber-crime, chemical warfare
    agents, and GPS jammers) as well as other advanced techniques employed by
    terror organizations … and
    other 21st century criminals. Big Data and
    data analytics is one of the preferred ways to deal with this new reality.
  • "[The] growing number of countries … [increasingly]
    monitoring … citizen activities. [Ominously, countries] such as
    Russia, Turkey and China are expected to increase investments in Big Data
    gathering and intelligence gathering for internal security on vast amounts of
    data."4
Lower Market Volatility

Overall, the Big Data market is maturing and becoming less volatile. Analyst Todd Goldman
predicts that "soon, the number of
companies getting acquired or simply disappearing will match the number of new
entrants. Fewer vendors will get funded, but more of them will be delivering a
greater level of innovation and value. The market is no longer in the mood to
tolerate crowds of vendors with little differentiation or providers that aren’t
delivering real value or [revenue generation]."5



The market leaders in the Big Data space are easy to identify as they fall
into four basic categories:

  1. Companies capable of producing the physical infrastructure – the hardware
    – necessary to store and process Big Data – firms like Dell, HPE, and IBM.
  2. Companies skilled in managing and manipulating extremely large, or Big
    Data, databases – firms like Oracle and SAP.
  3. Companies offering or operating large-scale cloud
    storage repositories – firms like Amazon Web Services (AWS) and Google.
  4. Companies with established Professional Services organizations, since
    Big Data clients will require outside counsel to develop and deploy Big Data
    applications; these pro services firms include, prominently, Accenture.

Other prominent Big Data providers (encompassing hardware,
software, and services) include:

  • Teradata (US)
  • SAS Institute (US)
  • Adobe (US)
  • Talend (US)
  • Qlik (US)
  • TIBCO Software (US)
  • Alteryx (US)
  • Sisense (US)
  • Informatica (US)
  • Cloudera (US)
  • Splunk (US)
  • Palantir Technologies (US)
  • 1010data (US)
  • Hitachi Vantara (US)
  • Fusionex (Malaysia)
  • Information Builders (US)
  • Salesforce (US)
  • Micro Focus (UK)
  • MicroStrategy (US)
  • ThoughtSpot (US)
  • Yellowfin (Australia)6


Market Growth

As reported by Markets&Markets, the global Big Data market should grow from
$138.9 billion in 2020 to $229.4 billion by 2025, representing a respectable
compound annual growth rate (CAGR) of 10.6 percent during the forecast period.

According to the firm, major growth factors include increasing:

  • Utilization of Internet of Things (IoT) devices
  • Availability of data which might be mined for competitive advantage
  • Government investments aimed at enhancing digital technologies

In terms of geographic presence, North America is dominant, with the Asia
Pacific (APAC) sector exhibiting the highest growth rate.7

Open Source

Despite the availability of proprietary (vendor) products, many enterprises
are processing their Big Data using open source solutions.

As observed by analyst Sohini Mitter, in Qubole’s 2018 Big Data Activation
Report, "About 76 percent [of] companies ‘actively leverage
at least three Big Data open source engines’ and put those findings into ‘active
use.’ The most popular engines are Apache Hadoop/Hive, Apache Spark, and
Presto, and these are used for data preparation, machine learning, and reporting
and analysis workloads."8

Data Privacy

Individuals, both employees and consumers, are concerned about the
confidentiality and integrity of their personally identifiable information
(PII), fearing identity theft and
other privacy-related violations. Big Data initiatives have the potential to
exacerbate these worries since increasing amounts of PII will be gathered and retained,
and breaching a Big Data warehouse will likely expose a big volume of
sensitive information.

Accordingly, the promoters of Big Data will have to ensure that Big Data
repositories are properly locked down. Moreover, they will have to convince an
already skeptical public that Big Data operations are both worthwhile and safe.
Another complication is the set of new governmental rules and regulations, as
codified, in particular, by:

  • GDPR – Any
    organization responsible for collecting, processing, or storing data
    belonging to the citizens of the European Union must comply with the EU
    General Data Protection Regulation (GDPR). Analyst Andrada Coos cautions
    that "companies that process EU data subjects’ personal information have
    very clear obligations as data controllers and processors. Prior
    authorization for processing is needed from data controllers and can only be
    done as per the documented instructions provided by them. Confidentiality is
    imposed on personnel processing sensitive data. Clear measures to protect
    personal data must be adopted and sub-processors cannot be engaged without
    the explicit authorization of data controllers."9
  • CCPA – The
    California Consumer Privacy Act (CCPA) of 2018 grants consumers various
    increased rights with regard to personal information held by a business.
    Among the expanded rights are the right to request a business to delete any
    personal information that is collected by the business, and the business is
    required to comply with such a verifiable consumer request unless the data
    is necessary to carry out specified acts.
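As a rough illustration of the CCPA logic described above – honor a verifiable consumer deletion request unless the data is needed to carry out a specified act – here is a hypothetical sketch. The record model and exemption list are invented for illustration and are not legal guidance:

```python
# Hypothetical CCPA deletion-request decision; the exemption purposes and
# record shape are illustrative assumptions, not legal guidance.

EXEMPT_PURPOSES = {"complete_transaction", "legal_obligation", "security"}

def handle_deletion_request(record, request_verified):
    """Decide how a business responds to a consumer deletion request."""
    if not request_verified:
        return "deny: not a verifiable consumer request"
    if record.get("purpose") in EXEMPT_PURPOSES:
        return "retain: data needed to carry out a specified act"
    return "delete"

print(handle_deletion_request({"purpose": "marketing"}, True))  # prints "delete"
print(handle_deletion_request({"purpose": "security"}, True))
```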

New Technology

Analyst Alex Woodie predicts the emergence of new Big Data technology. "Many
of the major Big Data frameworks and databases that are driving innovation today
were created by the web giants in Silicon Valley, and released as open source.
The good news is there’s no sign the well is drying up. If anything, innovation
may be accelerating."

Woodie admonishes practitioners to "retain as much flexibility as possible in
their creations. While it may be tempting to cement your application to a
certain technology for performance reasons, that could come back to haunt you
when something better and faster comes along."10

Data Curation

Data curation is the active and on-going management of data throughout its
lifecycle, from creation or acquisition to disposal. Traditionally undervalued
within the enterprise, data curation is being rendered essential by Big Data,
both because of the volume of data under management and the importance of
discovering its potential value.

As analyst Keith D. Foote predicts, "many organizations will find the position of Data Curator (DC) has
become a new necessity. The Data Curator’s role will combine
responsibility for managing the organization’s metadata, as well as Data
Protection, Data Governance, and Data Quality."11

Big Science

Now, and in the future, Big Data will contribute to our
understanding of Big Science, such as climate change. As
analyst Marc Botha explains, "Big Data solutions can
make a decisive contribution to the debate on climate
change. The [multi-sourced data] can come from
meteorological institutes, various research institutions
for geosciences or particle physics, or even data sets
from ocean research."12

Small Intuition

While it may seem counterintuitive, Data Science Manager David
Thompson at Pareto Intelligence, a health tech company, feels that old-fashioned
(and not particularly scientific) intuition plays a role in Big Data
analysis. "I have been watching the re-emergence of intuition in developing analytic
strategies. While originally an enemy of data science, intuition seems to be
acting more like an ally. Data scientists and analysts that have been working
with Big Data for a while seem to have developed a recalibrated intuition that
is rooted in deep experience with the data rather than general experience within
the industry only. And as we continue to nearly drown in messy and erroneous
data, I believe intuition will begin to play an even larger role in analytics."13

Hybrid Cloud

To help support their Big Data ambitions, enterprise planners are
increasingly adopting a hybrid cloud information infrastructure, combining
private on-premises clouds with public clouds. This formula permits data to be
captured, stored, and processed close to its point of origination, often
invoking AI-powered tools. It also allows data to be shuffled from short-term to
long-term repositories for follow-on trend analysis. Finally, these
private-public cloud partnerships provide the immediate scalability that Big
Data applications demand.14

Data Democratization

The Big Data concept works best when data is made available to those
individuals uniquely qualified to spot trends and render meaningful
interpretations; in other words, when the analyst community is expanded to
include subject matter and other experts who can exercise their judgment (and
execute special analytic tools) to discover insights that more generic analysts
and applications might miss.

Making Big Data more accessible to more people – a phenomenon called "data
democratization" – will help optimize enterprise data analysis operations.15

Healthcare AI

Big healthcare data will yield better clinical results and improved patient
satisfaction. According to Director of Advanced
Analytics Adam Dubrow at Crossix, "Some of the most important trends
for the healthcare industry include the rapid growth of data available for new
applications in improving patient health outcomes, including real-world data
from the healthcare system as well as consumer and patient-generated data. From
there, we receive transformative insights and solutions made possible by
connecting these data sets and applying machine learning and AI. There are many
opportunities for using real-world data for regulatory and clinical applications
such as supporting FDA applications."16

Strategic Planning Implications


The Federal Viewpoint

According to the US National Institute of Standards and Technology, there is
broad agreement among commercial, academic, and government leaders about the
potential of Big Data to spark innovation, fuel commerce, and drive progress.

The availability of vast data resources carries the potential to answer
questions previously out of reach, including the following:

  • How can a potential pandemic reliably be detected early enough to intervene?
  • Can new materials with advanced properties be predicted before these materials have ever been synthesized?
  • How can the current advantage of the attacker over the defender in guarding against cyber-security threats be reversed?

There is also broad agreement on the ability of Big Data to overwhelm traditional approaches.
The growth rates for data volumes, speeds, and complexity are outpacing scientific and
technological advances in data analytics, management, transport, and data user spheres.

Despite widespread agreement on the inherent opportunities and current
limitations of Big Data, a lack of consensus on some important fundamental questions continues to confuse potential
users and stymie progress. These questions include the following:

  • How is Big Data defined?
  • What attributes define Big Data solutions?
  • What is new in Big Data?
  • What is the difference between Big Data and bigger data that has been collected for years?
  • How is Big Data different from traditional data environments and related applications?
  • What are the essential characteristics of Big Data environments?
  • How do these environments integrate with currently deployed architectures?
  • What are the central scientific, technological, and standardization
    challenges that need to be addressed to accelerate the deployment of robust,
    secure Big Data solutions?17

The Enterprise Perspective

As reported by analyst Louis Columbus, "[a]ccording
to an Accenture study, 79 percent of enterprise executives agree that companies
that do not embrace Big Data will lose their competitive position and could face
extinction. Even more, 83 percent, have pursued Big Data projects to seize a
competitive edge."18

From the standpoint of a cynical IT manager, Big Data might appear as the
“next big thing,” the “technology du jour.” Unfortunately, most IT
departments are still busy trying to comprehend and implement the “last big
thing,” Edge Computing, and the phenomenon that preceded it, “Cloud Computing.”

If enterprise business planners believe in Big Data, they will likely have to
inject additional resources into their IT departments, or establish Big Data as
the number one enterprise data management priority.

Finally, there is a shortage of trained Big Data
technology experts. Before investing in any big Big Data projects,
enterprise planners may need to engage the services of prominent IT consulting
firms for Big Data design, development, and deployment planning. As an
object lesson, these planners should consider recent enterprise experience with
Virtualization, in which some companies and agencies did not achieve the level
of resource reduction they had anticipated – due, in part, to misconceptions
about the technology as well as poor planning.

Enterprise planners should push for the development of Big Data best practices, with special concentration on measures designed to promote
Big Data quality, security, and governance.



About the Author


James G. Barr is a leading business continuity
analyst and business writer with more than 30 years’ IT experience. A
member of “Who’s Who in Finance and Industry,” Mr. Barr has designed, developed,
and deployed business continuity plans for a number of Fortune 500 firms. He is
the author of several books, including How to Succeed in Business BY
Really Trying, a member of Faulkner’s Advisory Panel, and a senior editor
for Faulkner’s Security Management Practices. Mr. Barr can be
reached via e-mail.
