Natural Language Generation
Copyright 2022, Faulkner Information Services. All
Rights Reserved.
Docid: 00021396
Publication Date: 2208
Publication Type: TUTORIAL
Preview
Natural language generation (NLG), also known as automated narrative
generation (ANG), is a technology that transforms enterprise data into
narrative reports by recognizing and extracting key insights contained
within the data, and translating those findings into plain English (or
another language). There are articles, for example, prepared by respected
news outlets like the Associated Press that are actually “penned” by
computers. While perhaps overly optimistic, some analysts have contended
that as much as 90 percent of news could be algorithmically generated by
the mid-2020s, much of it without human intervention.
Executive Summary
“News content has always
been limited by resources. More events happen in the world each day than
can be covered by available journalists.” Automated journalism
“[transcends] the limits of humans. Because the algorithms that write news
stories can, given an ample input of data, spit out an exponentially
greater number of news stories. A news story essentially becomes an
instantly attainable object, much like the results of a search engine
query.”
– Matt Carlson1
Please note: This report was written by a real human being. This
acknowledgement may sound silly but some articles prepared by respected
news outlets like the Associated Press are actually “penned” by computers
– artificial intelligence programs practicing a new form of authorship
called natural language generation.
Related Faulkner Reports:
- Artificial Intelligence Tutorial
- Computational Linguistics Tutorial
- Smart Machines Tutorial
While we’ve long known that mechanical operations like automotive
manufacturing can be reduced to repeatable processes – which, in many
cases, robots can assimilate and perform with greater efficiency, fewer
defects, and lower costs than humans – it turns out that certain
intellectual activities, like translating a company’s quarterly earnings
report into an article for investors, are also “mechanical” – and can be
accomplished by computer.
This new reality was glimpsed a few years ago when application developers
started to produce programs that could sort through massive amounts of
evidentiary material accumulated through e-discovery orders and identify
items of interest to litigators. Instead of engaging an army of
associates and paralegals to analyze the documents, firm lawyers could
delegate the work to computers, which not only operate faster, but are
immune to the type of fatigue-based errors and omissions that might
characterize reviews by humans.
Natural language generation (NLG), also known as automated narrative
generation (ANG), is a technology that transforms enterprise data into
narrative reports by recognizing and extracting key insights contained
within the data, and translating those findings into plain English (or
another language). The automated narratives can be generated in multiple
forms, each tailored to a specific audience.
Eliminating Cognitive Bias
As emphasized by Deloitte analysts, one of the principal advantages of
NLG – and the reason for pursuing machine writing – is the elimination of
human cognitive bias. “Seeking out pre-conceived insights from a set of
data, as opposed to performing objective analysis, increases as humans
strive to make sense of increasingly complicated datasets and data models.
As data growth transcends volume and impacts variety and velocity of data,
the risk of cognitive [bias] and its limitations in the production of
valuable insights become more material.”2 In such situations,
natural language generation is, increasingly, preferred.
The NLG Market
Reflecting the emergence of NLG as a valuable – and viable –
technology, MarketsandMarkets reports that the natural language generation
market, which was valued at $277.2 million in 2017, is expected to
reach $825.3 million by 2023, representing an impressive compound annual
growth rate (CAGR) of 20.8 percent during the forecast period.3
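As a rough illustration of how a compound annual growth rate is derived, the sketch below computes one from a start value, an end value, and a number of years. The function name `cagr` is hypothetical, and the six-year span (2017 to 2023) is an assumption: the base year of the forecast period is not stated here, and a different base year or base value would yield a different rate than the one this sketch prints.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.2 means 20 percent)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes in millions of US dollars, as quoted above.
# ASSUMPTION: growth is measured over the six years from 2017 to 2023.
rate = cagr(277.2, 825.3, years=6)
print(f"Implied CAGR: {rate:.1%}")
```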
Technology
NLG and CL
For technological context, natural language generation (NLG) is
generally recognized as a branch of computational linguistics (CL), which
also encompasses natural language processing (NLP) and natural language
understanding (NLU).
Natural Language Processing
As the engineering arm of computational linguistics, natural language
processing (NLP) helps streamline and expedite business processes by:
- Automating routine tasks, through chatbots and other digital
assistants.
- Improving searches, “disambiguating” word definitions based on context
(carrier, for example, means something different in biomedical than in
industrial contexts).
- Enabling search engine optimization, improving an enterprise’s rank –
and, thus, visibility – in online searches.
- Analyzing and organizing large document collections, including
corporate reports and scientific documents.
- Advancing social media analytics, evaluating customer reviews and
social media comments to make better sense of huge volumes of
information.
- Providing market insights, analyzing the language of customers to
determine their needs and how to communicate with them.
- Moderating user or customer content, [maintaining] quality and
civility by analyzing not only the words, but also the tone and intent
of comments.4
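The context-based disambiguation mentioned in the list above can be sketched with a simple word-overlap heuristic. This is a simplified, Lesk-style approach chosen purely for illustration; the source does not say which technique any NLP product uses. The sense inventory, cue words, and function name are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical sense inventory: each sense of "carrier" is described by a
# few cue words typical of its domain. Real systems use far richer
# resources; this is only an illustration.
SENSES = {
    "carrier": {
        "biomedical": {"gene", "disease", "patient", "mutation", "infection"},
        "industrial": {"freight", "shipping", "cargo", "truck", "logistics"},
    }
}


def disambiguate(word: str, sentence: str) -> Optional[str]:
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None if the word is unknown or no cue word appears.
    """
    senses = SENSES.get(word.lower())
    if not senses:
        return None
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    best_sense, best_overlap = None, 0
    for sense, cues in senses.items():
        overlap = len(tokens & cues)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("carrier", "The patient is a carrier of the gene mutation."))
print(disambiguate("carrier", "The carrier delayed the freight shipping order."))
```

A sentence with no cue words returns `None`, which signals that the surrounding context is not enough for this heuristic to decide.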
Natural Language Understanding
Natural language understanding (NLU) is also called “Natural Language
Interpretation” (NLI). Through NLU analysis, “computers manage to
interpret … language and define a user’s intent.” Unlike simple speech
recognition, NLU “focuses on the determination of intent, sentiment, and
context.”
Commenting on the potential of NLU, analyst Bogdan Koretski predicts that
“Due to the deep and correct language interpretation, machine-to-human
communication will reach a completely new level. Business processes like
data collection and analysis, data vetting, and facts checking will be
automated and errors excluded.”5
Natural Language Generation
Natural language generation (NLG) is a technology that transforms
enterprise data into narrative reports by recognizing and extracting key
insights contained within the data, and translating these findings into
plain English (or another language). The automated narratives can be
generated in multiple forms, each tailored to a specific audience.
While the output of an NLG program is text, the input can take various
forms:
- Textual data;
- Non-linguistic data, like sensor data; and, more recently,
- Visual data, like images or videos.
NLG output can be:
- In written form, like a report or press release; and
- In spoken form, like data delivered by a chatbot.
As described by analyst Ivy Wigmore, the natural language generation
function is divided into six stages:
- “Content analysis – Data is filtered to determine what should
be included in the content produced at the end of the process.
- “Data understanding – The data is interpreted, patterns are
identified, and it’s put into context. Machine learning is often used at
this stage.
- “Document structuring – A document plan is created and a
narrative structure chosen based on the type of data being interpreted.
- “Sentence aggregation – Relevant sentences or parts of
sentences are combined in ways that accurately summarize the topic.
- “Grammatical structuring – Grammatical rules are applied to
generate natural-sounding text. The program deduces the syntactical
structure of the sentence. It then uses this information to rewrite the
sentence in a grammatically correct manner.
- “Language presentation – The final output is generated based on
a template or format the user or programmer has selected.”6
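The six stages above can be sketched as a small pipeline in which each stage is one function. This is a toy illustration and not how any commercial NLG product works: the sales records, function names, and output template are all hypothetical, and the “data understanding” stage uses simple arithmetic where a real system might use machine learning.

```python
from dataclasses import dataclass


@dataclass
class Record:
    region: str
    previous: float  # sales last quarter (hypothetical units)
    current: float   # sales this quarter


def content_analysis(records):
    """Stage 1: keep only records usable for the narrative."""
    return [r for r in records if r.previous > 0 and r.current >= 0]


def data_understanding(records):
    """Stage 2: interpret the data as a percentage change per region."""
    return [
        {"region": r.region, "change": (r.current - r.previous) / r.previous * 100}
        for r in records
    ]


def document_structuring(facts):
    """Stage 3: plan the narrative, strongest result first, weakest last."""
    return sorted(facts, key=lambda f: f["change"], reverse=True)


def sentence_aggregation(plan):
    """Stage 4: merge the strongest and weakest facts into one message."""
    return {"best": plan[0], "worst": plan[-1]}


def verb_phrase(change):
    """Choose a verb phrase that matches the direction of the change."""
    if change > 0:
        return f"rose {change:.1f} percent"
    if change < 0:
        return f"fell {abs(change):.1f} percent"
    return "were flat"


def grammatical_structuring(message):
    """Stage 5: realize the message as a grammatical sentence."""
    best, worst = message["best"], message["worst"]
    if best["region"] == worst["region"]:
        return f"Sales in the {best['region']} region {verb_phrase(best['change'])}."
    return (
        f"Sales in the {best['region']} region {verb_phrase(best['change'])}, "
        f"while sales in the {worst['region']} region {verb_phrase(worst['change'])}."
    )


def language_presentation(sentence, title="Quarterly Sales Summary"):
    """Stage 6: place the text in the selected output template."""
    return f"{title}\n{sentence}"


def generate(records):
    """Run all six stages in order and return the finished narrative."""
    usable = content_analysis(records)
    if not usable:
        return language_presentation("No usable sales data was available.")
    facts = data_understanding(usable)
    plan = document_structuring(facts)
    message = sentence_aggregation(plan)
    return language_presentation(grammatical_structuring(message))


report = generate([
    Record("North", 100.0, 112.0),
    Record("South", 80.0, 76.0),
    Record("West", 0.0, 50.0),  # dropped in stage 1: no baseline to compare
])
print(report)
```

Keeping each stage as a separate function mirrors the staged description: the template in the final stage can be swapped for a different audience without touching the earlier analysis.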
Neural Text Generation
From a technological perspective, natural language generation is still in
its infancy. As analyst Robert Dale observes, “today’s commercial NLG
technology appears to be relatively simple in terms of how it works.
Nonetheless, there is a market for the results that these techniques can
produce, and much of the real value of the solutions on offer comes down
to how easy they are to use and how seamlessly they fit into existing
workflows.
“[In] terms of [the] underlying technology for generating language,
there is of course a new kid on the block: neural text generation, which
has radically revised the NLG research agenda.”7 Neural
networks, which are loosely modeled on the human brain, enable computer
programs to recognize patterns and solve problems as complex – and often
as confounding – as natural language generation.
Applications
Much of the impetus behind the development of natural language generation
can be traced to Big Data.
Enabled by rapid advances in storage capacity and processing power,
companies are spending big on Big Data, hoping to gain vital business
intelligence from the mountains of data being created or extracted each
day.
Big Data is useless, however, without the ability to reduce it to
actionable intelligence and, just as importantly, present that
intelligence in a digestible form. Applying the aphorism that “one
picture is worth a thousand words,” many information analysts have
utilized dashboards containing charts, graphs, “infographics,” and other
visual effects to summarize the meaning behind Big Data.
But as analysts at Automated Insights observe, “dashboards disappoint.”
“Everyone reads dashboards
differently. Without guidance, it can be easy to miss significant
attributes. Some users may miss elements whose meaning is obvious to
others – even though that insight may be of crucial business
importance. Charts require titles, notes, and, all too often,
in-person explanations from their creators. Dashboards can create
beautiful visualizations that leave CEOs and line-workers alike asking:
what does it mean? In other words, dashboards may help data
experts formulate insights. But data experts still have to use
narrative to impart those insights to others in an
organization.”8
To realize the full benefits of Big Data, natural language is needed to
explain, clearly and authoritatively:
- What insights a particular Big Data set is offering, and
- Why such insights are relevant to the enterprise and its business
interests.
NLG Use Cases
Natural language generation is commonly used to facilitate the following:
Narrative Generation – NLG can convert
charts, graphs, and spreadsheets into clear, concise text.
Chatbot Communications – NLG can craft
context-specific responses to user queries.
Industrial Support – NLG can give voice
to Internet of Things (IoT) sensors, improving equipment performance and
maintenance.
Language Translation – NLG can
transform one natural language into another.
Sentiment Analysis – First, NLU
determines which of several languages resonates with users; second, NLG
delivers messages in the users’ preferred languages.9
Speech Transcription – First, speech
recognition is employed to understand an audio feed; second, NLG turns the
speech into text.
Content Customization – NLG can create
marketing and other communications tailored to a specific group or even
individual.
Robot Journalism – NLG can write
routine news stories, like sporting event wrap-ups or financial earnings
summaries.
Emotional Content NLG
While most natural language generation is geared toward a “professional”
audience – and evinces a professional tone – some NLG scientists are
starting to embrace emotional content through the development of emotional
chatbots. According to analyst Bardia Eshghi, “The technology behind
emotional chatbots strives to make them grasp an understanding of the
user’s feelings and emotions and to then offer applicable advice or course
of action.”
Among the prospective users of emotional chatbots are:
- Human resources representatives,
- Customer services representatives, and
- Psychologists or other mental health professionals10
AI Story Generation
Perhaps the most promising application of natural language generation is
the creation of fictional works.
Some observers have remarked that there are only a finite number of plots
in all of literature and that all storylines are just variations on these
plots. Some suggest (only partly in jest) that watching network
television reinforces this hypothesis, with critics claiming that they
find many TV and movie scripts “derivative.”
One solution being pursued by NLG and AI researchers is “AI Story
Generation,” or “the use of [machine intelligence] to produce a fictional
story from a minimal set of inputs.” As analyst Mark Riedl explains,
“Aside from the grand challenge of an AI system that can write a book that
people would want to read,” storytelling offers other advantages. For
example:
- “Humans often find it easier to explain [situations or circumstances]
via vignettes [or evocative descriptions or accounts], and are often
able to more easily process complex procedural information via
vignettes.
- “Telling and listening to stories is … a way that humans build
rapport.”11
Recommendations
Select the Right Tool
To achieve optimum results, enterprise planners should exercise care in
selecting a natural language generation tool. Critical criteria include:
- Summarization – How effectively – and completely –
does the solution summarize target dataset contents?
- Simplicity – Is the generated text easy to read and
understand?
- English Proficiency – Does the generated text
comply with standard rules of English grammar and syntax?
- Multilingual Capabilities – Can the solution
accommodate languages other than English?
- Security – Can the solution be adopted without
compromising current security standards?
- Scalability – Can the solution grow with the
business?
- Platform – Can the solution be deployed on premises or
in the cloud (as desired)?12
Be Realistic About ROI
Understand the financial commitment associated with natural language
generation, and consider the return on investment.
Analyst Mike Kaput cautions that “NLG solutions, even basic ones,
typically require substantial time to set up. You also need to pay for a
solution, and possibly related NLG services. You’ll want to take a
realistic look at the technology, what it can do for you, and how much you
can scale using it.
“Start by analyzing how long reports, articles or narratives currently
take, then see how much time NLG can potentially shave off.
“Finally, apply those time savings to all staff whom NLG would affect. An
hour saved per week per employee may make financial sense for your
organization.”13
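The back-of-the-envelope estimate described above can be sketched as follows. Every figure here (hours per report, headcount, hourly cost, working weeks, solution cost) is a hypothetical placeholder to be replaced with an organization’s own numbers, and the function name is illustrative.

```python
def annual_nlg_savings(
    hours_before: float,
    hours_after: float,
    reports_per_week: float,
    staff_count: int,
    hourly_cost: float,
    weeks_per_year: int = 48,
) -> float:
    """Estimate yearly labor savings from faster report writing."""
    # Time saved on a single report; never negative.
    hours_saved_per_report = max(hours_before - hours_after, 0.0)
    # Apply the saving to every affected employee.
    weekly_hours_saved = hours_saved_per_report * reports_per_week * staff_count
    return weekly_hours_saved * weeks_per_year * hourly_cost


# Hypothetical inputs: each of 20 employees writes one report a week,
# and NLG cuts the time per report from 3 hours to 2 hours.
savings = annual_nlg_savings(
    3.0, 2.0, reports_per_week=1, staff_count=20, hourly_cost=50.0
)
solution_cost = 30_000.0  # hypothetical yearly license plus setup cost
print(f"Estimated savings: ${savings:,.0f}")
print(f"Net benefit: ${savings - solution_cost:,.0f}")
```

Comparing the estimated savings with the cost of the solution and any related services gives a first, rough view of whether the investment makes financial sense.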
Document All Enterprise Processes
Finally, one area in which all enterprises – small to large – will
benefit from natural language generation is process documentation.
Understanding how enterprise operations are conducted on a daily basis
can promote process improvement, even encourage business reengineering –
reducing recurring costs and enhancing customer and employee satisfaction.
Web Links
- Automated Insights: http://www.automatedinsights.com/
- IBM: http://www.ibm.com/
- US National Institute of Standards and Technology: http://www.nist.gov/
References
1 Matt Carlson. “What Automated Journalism Looks Like Now and
in the Future.” Data Driven Journalism. January 9, 2018.
2 Christos Giogkarakis, Weng Kit Ng, Raminderjit Singh, and
Aung Htet. “Automation in Reporting: Breaching the Cognitive Ceiling with
Natural Language Generation.” Deloitte. February 12, 2018.
3 “Natural Language Generation (NLG) Market by Application
(CEM, Fraud Detection & Anti-Money Laundering), Component (Software
& Services), Business Function, Deployment Model, Organization Size,
Industry Vertical, and Region – Global Forecast to 2023.” MarketsandMarkets.
May 2018.
4 “What Is Natural Language Processing?” Oracle. 2022.
5 Bogdan Koretski. “Five Recent Trends in Natural Language
Processing You Need to Know.” YSBM Group sp. z o.o. November 5, 2019.
6 Ivy Wigmore. “Natural Language Generation (NLG).” TechTarget. July 2021.
7 Robert Dale. “Natural Language Generation: The Commercial
State of the Art in 2020.” Natural Language Engineering 26, 481–487 |
Cambridge University Press. 2020.
8 “Big Data Needs Big Insights: The Business Case for Natural
Language Generation.” Automated Insights. p. 4.
9 Mike Kaput. “Natural Language Generation (NLG): Everything
You Need to Know.” Marketing AI Institute. April 20, 2022.
10 Bardia Eshghi. “The Ultimate Guide To Emotional Chatbots in
2022.” AIMultiple. March 9, 2022.
11 Mark Riedl. “An Introduction to AI Story Generation.” The
Gradient. August 21, 2021.
12 “Guide to Choosing a Narrative Generation Tool.” Yseop.
April 2016:8.
13 Mike Kaput. “Natural Language Generation (NLG): Everything
You Need to Know.” Marketing AI Institute. April 20, 2022.
About the Author
James G. Barr is a leading business continuity analyst
and business writer with more than 30 years’ IT experience. A member
of “Who’s Who in Finance and Industry,” Mr. Barr has designed, developed,
and deployed business continuity plans for a number of Fortune 500
firms. He is the author of several books, including How to Succeed in
Business BY Really Trying, a member of Faulkner’s Advisory Panel, and a
senior editor for Faulkner’s Security Management Practices.
Mr. Barr can be reached via e-mail at jgbarr@faulkner.com.