Natural Language Processing Powers Data Hygiene for Commercial Insurance

Artificial Intelligence - November 24 2020

Today, commercial insurance brokers and carriers are inundated by mountains of internal and external data. With massive volumes of data created every day, 80% of it unstructured, the insurance industry struggles to access information locked away in enterprise data stores and data lakes. According to a 2019 report by the International Data Corporation (IDC), annual data creation is forecast to reach 163 zettabytes by 2025. This consists of both structured and unstructured data sets.

As an industry predicated on data, insurance depends on data hygiene and quality to fuel informed risk assessment, pricing, and decision making. However, according to an article in the Harvard Business Review, only 3% of companies’ data meets basic data quality standards. Common data quality and hygiene issues include data that is flawed or incomplete, poorly defined, incorrect, out-of-date, irrelevant, or difficult to access and interpret. This is an issue that the insurance industry knows all too well. For insurance carriers and brokers, the manual business processes and practices that have plagued the industry for decades often result in dirty data for the following reasons:

  • Duplicate data
  • Manual data re-keying & duplication of errors across core systems
  • Misreading data on insurance applications/submissions
  • Data silos
  • Lack of a data-driven culture 
  • Lack of consolidation of coverage requests
  • Continuous reissuing of policies and updates due to errors and omissions

“Research estimates that this lack of data hygiene can comprise up to 2% of a carrier’s expense ratio.” – Atticus Associates Ltd.

Despite a push towards standardization and straight-through processing, underwriters still sift through submissions by hand during application intake and extract less than half of the available data points to assess risk and make business decisions. Submissions can easily number in the hundreds or thousands a day, so underwriters make judgment calls based on a limited set of data points as they try to process as many applications as possible. Although brokers take the time to provide hundreds of data points and artifacts, time constraints and the manual work of reading and extracting data mean that underwriters use only a small percentage of what is available to them.

“Accenture reports that on average, only 25% of an underwriter’s day is spent on selling and broker engagement, while more than 50% is spent on core processing.”

Advances in Artificial Intelligence (AI), computing power, and cloud-based technologies are providing insurers with new ways to address the industry’s data dilemma. AI can help commercial insurance brokers and carriers interpret massive amounts of data and enable correlations that would not otherwise be possible because there are simply too many factors for the human brain to process.

Developing better data hygiene practices can enable commercial insurance carriers and brokers to reduce underwriting expense ratios, improve overall loss ratios, and deliver a superior customer experience. Real-time data extraction can provide faster access to the data needed to inform better business decisions, allowing underwriters to focus their time and energy on submissions that meet the risk profile and help shape the book of business. Using AI-powered solutions, Natural Language Processing, and Machine Learning, data can be automatically extracted from digital insurance documents and forms, including tables and nested or embedded documents.

AI Drives Data Hygiene and Hyper Efficiency

Unlike redline or regular expression technology, AI can contextually understand the data, reading documents in much the same way that skilled human underwriters and knowledge workers do. The data does not need to be in the same location within the document to be recognized. Named Entity Recognition (NER) is the ability to identify data types such as person, insured, currency, geographic location (city, state, zip code), and percentages by applying classification techniques.
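To make the idea of classifying by context concrete, here is a toy sketch in Python. It is not a trained NER model: the labels, feature names, and weights are all hypothetical and hand-set for illustration, whereas a real system would learn them from labeled insurance documents. It only shows how the same five-digit token can be labeled differently depending on the words around it, wherever it appears in the text.

```python
import re

# Hypothetical label set for this sketch.
LABELS = ("ZIP", "CURRENCY", "PERCENT", "OTHER")

# Hand-set (feature, label) weights. A real NER model would learn these.
WEIGHTS = {
    ("five_digits", "ZIP"): 1.0,
    ("prev_is_state", "ZIP"): 2.0,
    ("has_dollar", "CURRENCY"): 3.0,
    ("prev_is_money_word", "CURRENCY"): 2.0,
    ("ends_percent", "PERCENT"): 3.0,
    ("bias", "OTHER"): 0.5,
}

STATES = {"NY", "CA", "TX"}                      # illustrative subset
MONEY_WORDS = {"limit", "premium", "deductible"}  # illustrative subset


def features(tokens, i):
    """Describe token i by its own shape and by the token before it."""
    tok = tokens[i]
    prev = tokens[i - 1] if i > 0 else ""
    feats = {"bias"}
    if re.fullmatch(r"\d{5}", tok):
        feats.add("five_digits")
    if tok.startswith("$"):
        feats.add("has_dollar")
    if tok.endswith("%"):
        feats.add("ends_percent")
    if prev in STATES:
        feats.add("prev_is_state")
    if prev.lower().rstrip(":") in MONEY_WORDS:
        feats.add("prev_is_money_word")
    return feats


def classify(tokens, i):
    """Pick the label with the highest summed weight for token i."""
    feats = features(tokens, i)
    scores = {
        label: sum(WEIGHTS.get((f, label), 0.0) for f in feats)
        for label in LABELS
    }
    return max(LABELS, key=lambda label: scores[label])


def tag(text):
    """Return (token, label) pairs for every token not labeled OTHER."""
    tokens = text.split()
    labeled = [(tok, classify(tokens, i)) for i, tok in enumerate(tokens)]
    return [(tok, label) for tok, label in labeled if label != "OTHER"]
```

Calling `tag("Premium: 25000 for location in Albany NY 12207 with 10% coinsurance")` labels `25000` as currency and `12207` as a zip code, even though both have the same shape, because the preceding token differs.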

Machine learning is more powerful than rules-based automation because it can apply reinforcement learning and keep learning over time. In short, the solution gets smarter and more accurate based on feedback from humans in the loop. It also requires less maintenance overhead, because the rules, or if/then logic, in robotic process automation (RPA) solutions must be updated manually to cater to specific business cases or changes in business logic. While outsiders may think insurance never changes much, brokers and carriers are always making subtle and not-so-subtle changes to products and underlying business logic.
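As a rough illustration of learning from humans in the loop, the sketch below uses a perceptron-style update, a simple online supervised technique that stands in here for whatever learning method a given product actually uses. The class name, labels, and feature strings are hypothetical. Each time a reviewer corrects a wrong prediction, the weights shift toward the corrected label, so no hand-written rule has to be edited.

```python
from collections import defaultdict


class FeedbackClassifier:
    """Toy classifier that adjusts its weights when a reviewer corrects it."""

    def __init__(self, labels):
        self.labels = list(labels)
        # (feature, label) -> weight; every weight starts at 0.0
        self.weights = defaultdict(float)

    def predict(self, features):
        """Return the label with the highest summed weight (first label on ties)."""
        def score(label):
            return sum(self.weights[(f, label)] for f in features)
        return max(self.labels, key=score)

    def correct(self, features, right_label):
        """Apply a reviewer's correction; return what the model had guessed."""
        guess = self.predict(features)
        if guess != right_label:
            for f in features:
                self.weights[(f, right_label)] += 1.0  # reward the right label
                self.weights[(f, guess)] -= 1.0        # penalize the wrong guess
        return guess


# Usage: features are plain strings describing a token and its context.
clf = FeedbackClassifier(["ZIP", "CURRENCY"])
premium_feats = {"five_digits", "prev=premium"}
zip_feats = {"five_digits", "prev=state"}

clf.correct(premium_feats, "CURRENCY")  # reviewer: this one is a dollar amount
clf.correct(zip_feats, "ZIP")           # reviewer: this one is a zip code
```

After these two corrections the classifier separates the two cases on its own, which is the behavior the paragraph above contrasts with manually maintained rules.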

With Semantic Analysis, the machine can contextually understand the meaning and interpretation of words, signs and sentence structure the same way humans do – only hundreds of times faster. It reads all the words to capture the true meaning of any text, pulling out relevant pieces of information, assigning value to those words, and intelligently analyzing structured and unstructured text. The machine analyzes context in the surrounding text to accurately understand the relationship between words. Artificial intelligence solutions purpose-built for insurance can help mitigate human error, improve data hygiene, and achieve straight-through processing, bringing new efficiencies and increasing quote to bind ratios.

In most instances today, commercial insurance underwriters are responsible for reading and manually extracting data from applications for new policies, renewals and endorsement requests. If that information is misread or re-keyed incorrectly into multiple systems, the cleanliness of the data suffers system-wide. To avoid dirty data, artificial intelligence solutions that combine natural language processing, machine learning, and named entity recognition can extract insurance data and automatically prepopulate systems such as policy management, broker portals, and rating engines with clean and consistent data, reducing the risk of human error. Real-time access to data also enables carriers and brokers to replace cumbersome manual processes and tasks with intelligent AI-powered underwriting and brokering workflows.

AI-powered solutions enable insurers to automate and standardize processes like application intake and policy checking to minimize policy reissuance and rework due to errors and omissions. Re-designing existing manual, time-consuming, error-prone processes can give insurers peace of mind that they are delivering accurate policies to brokers and policyholders. AI solutions enable underwriters to upload a new policy and compare it on-screen against a binder or other source of truth, using a customizable checklist to verify accuracy by validating the policy language and key insurance-specific data points such as address, dollar values, coverage, premiums, and endorsements. Skilled knowledge workers can review irregularities or issues flagged by the AI solution at a glance and correct policy details immediately.
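The checklist comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the data points have already been extracted from the binder and the policy into dictionaries; the field names, the `CHECKLIST` tuple, and the `Finding` type are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical checklist; in practice it would be configured per line of business.
CHECKLIST = ("insured", "address", "limit", "premium")
MONEY_FIELDS = {"limit", "premium"}


@dataclass
class Finding:
    field: str
    binder_value: object
    policy_value: object
    issue: str  # "missing" or "mismatch"


def normalize(field, value):
    """Reduce formatting noise so only real differences are flagged."""
    text = " ".join(str(value).split()).lower()
    if field in MONEY_FIELDS:
        # Decimal raises InvalidOperation on non-numeric text; a fuller
        # version would turn that into a finding instead of an exception.
        return Decimal(text.replace("$", "").replace(",", ""))
    return text


def check_policy(binder, policy, checklist=CHECKLIST):
    """Compare each checklist field in the policy against the binder."""
    findings = []
    for field in checklist:
        expected = binder.get(field)
        actual = policy.get(field)
        if expected is None or actual is None:
            findings.append(Finding(field, expected, actual, "missing"))
        elif normalize(field, expected) != normalize(field, actual):
            findings.append(Finding(field, expected, actual, "mismatch"))
    return findings


binder = {
    "insured": "Acme Tools LLC",
    "address": "12 Main St, Albany NY 12207",
    "limit": "$1,000,000",
    "premium": "$4,250.00",
}
policy = {
    "insured": "ACME Tools  LLC",
    "address": "12 Main St, Albany NY 12207",
    "limit": "1000000.00",
    "premium": "$4,520.00",
}
findings = check_policy(binder, policy)
```

In this example only the premium is flagged: differences in capitalization, spacing, and currency formatting are normalized away, while the transposed digits in the premium remain for a reviewer to correct.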

To keep up with customer expectations and market demands, underwriters need to throw away the yellow highlighters and embrace AI solutions to automate mind-numbing manual tasks.

Creating top-notch data involves more than just technology. It also involves having the right data practices and policies in place and developing a data-driven culture across the organization. By deploying AI solutions that extract data in real-time and automate insurance-specific workflows, commercial insurers can eliminate error-prone manual processes that create bottlenecks and friction in the insurance value chain and ultimately impact customers. The lack of data hygiene can be seen and felt in many ways:

  • Inability to respond fast enough to broker submissions
  • Failure to respond to submissions at all due to volume
  • Delays in issuing policies
  • Endless policy rework
  • High risk of E&O exposure

Whether a carrier is underwriting or a broker is selling a single line of business insurance or a complex book of business spanning D&O, BOP, professional liability, business interruption, general liability, property, cyber, inland marine, auto or other lines, AI solutions that understand the language of insurance can promote good data hygiene. Down the line, that hygiene streamlines customer-centric analytics projects and drives faster turnaround times, making for happier and more loyal customers.
