Data Quality Key to Building a Good Cat Model

February 29, 2008

After seven devastating hurricanes spun into the United States between 2004 and 2005, there appeared to be widespread surprise at the significant discrepancies between actual and modelled losses. The first reaction was to condemn the models. But when companies compared the data they had entered into the models with what they had actually been insuring, a different story transpired.

Descriptions of insured locations were often incomplete or even erroneous. An infamous example was the destruction of the floating casinos of Louisiana, which cost millions of dollars more in insured losses than expected, largely due to business interruption. These huge barges proved highly vulnerable to storm surge but had been incorrectly categorized in the models. More worryingly, thousands of properties located close to the sea had been identified only by their area zip code, making them appear far less vulnerable than they actually were.

Improving the quality of data entered into models can have a dramatic effect on the models’ output. Changes to loss estimates by a factor of four for a single building, or by 25% across a whole portfolio, are not uncommon when information is enhanced.

The impact is so significant that rating agencies are now using an organization’s exposure data quality as a proxy for the effectiveness of its management controls. Despite this, some risk managers are still fairly ambivalent about their data — they prefer to go to their broker for advice on what is needed for insurance placement, or they (mistakenly) believe that providing better information might result in higher insurance costs.

SPOTTING AND CORRECTING MISSING DATA

Missing information is relatively easy to spot, but identifying wrong data that appears complete poses an even greater challenge. Humans are very good at using intuition to spot anomalies in patterns and can catch some of these errors, but accurately reviewing extensive insurance schedules is unrealistic. Even when problems are identified, someone needs to decide how to correct them.

Software can be taught how to recognize and address irregularities in a systematic and auditable way, using predefined rules or heuristics that replicate expert intuition. Designing good heuristics is difficult: they need to be smart enough to decide when data is wrong and how it should be changed, but retain the ability to ask for human intervention where there is ambiguity. The best heuristics can learn from experience, creating processes that get increasingly automated, reliable and accurate.
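To make this concrete, the sketch below shows what such rules might look like in Python. It is illustrative only: the record fields (id, distance_to_coast_km, geocode_resolution, construction_class) and the thresholds are assumptions rather than a real exposure schema, and the rules simply distinguish between corrections that can be applied automatically and ambiguous cases that should be escalated for human review.

```python
# A minimal, hypothetical sketch of rule-based exposure data checks.
# Field names and thresholds are assumptions, not a real vendor schema.
from dataclasses import dataclass

@dataclass
class Finding:
    record_id: str
    rule: str
    action: str   # "auto-corrected" or "needs review"
    detail: str

def check_exposure(record: dict) -> list[Finding]:
    findings = []

    # Rule 1: a coastal site located only by zip code understates surge
    # exposure, and there is no safe automatic fix, so escalate to a human.
    if (record.get("distance_to_coast_km", 999.0) < 5.0
            and record.get("geocode_resolution") == "zip"):
        findings.append(Finding(record["id"], "coastal_zip_geocode",
                                "needs review",
                                "coastal site geocoded only to zip level"))

    # Rule 2: a missing construction class can be defaulted conservatively,
    # but the change is logged so the audit trail shows what was assumed.
    if record.get("construction_class") in (None, "", "unknown"):
        record["construction_class"] = "frame"  # conservative default
        findings.append(Finding(record["id"], "missing_construction",
                                "auto-corrected",
                                "defaulted to conservative frame class"))

    return findings
```

Recording every correction and escalation as a finding is what makes the process auditable: the model sees cleaner data, and the underwriter can see exactly what was changed and why.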

Combining modelled heuristics with metrics that monitor data resolution and completeness allows companies to understand their progress in improving data quality over time and across business units. Quality cannot be managed until it has been measured.
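One simple way to put that measurement in place is to track, per business unit, how many of the fields the model actually needs are populated. The sketch below assumes a hypothetical list of exposure records, each tagged with a business_unit field; the list of required fields is illustrative.

```python
# A sketch of a basic completeness metric, assuming hypothetical record
# fields; "required" fields here are illustrative, not prescriptive.
from collections import defaultdict

REQUIRED_FIELDS = ["construction_class", "occupancy", "year_built", "insured_value"]

def completeness_by_unit(records: list[dict]) -> dict[str, float]:
    """Share of required fields populated, averaged per business unit."""
    filled = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        unit = rec.get("business_unit", "unassigned")
        for field_name in REQUIRED_FIELDS:
            total[unit] += 1
            if rec.get(field_name) not in (None, "", "unknown"):
                filled[unit] += 1
    return {unit: filled[unit] / total[unit] for unit in total}
```

Tracked over successive renewals, a metric like this shows whether data quality is genuinely improving or merely being discussed.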

Ironically, it was probably easier for insurers to get information about the properties they were insuring 200 years ago than it has been until recently.

From 1867 to 1970, Sanborn Fire Insurance Maps documented the rise of American cities with building-level detail and colour-coded construction classes. But only in recent years — as the cost of computer storage has decreased and technological performance has increased — has the insurance market been provided with similar levels of detailed information. Having lost the habit of collecting good data, the market needs to relearn how to identify and store the information required to make decisions.

Even the best heuristics need help. An independent view of the building construction, occupancy type and valuation can provide additional information to complement or even replace the original data. The construction and real estate investment industry has access to robust databases that contain some of the key information underwriters need; this data can be converted into a form insurers can use. Using such data in combination with aerial photographs and building surveys, insurers will have an independent view of what they are underwriting, where it is and what it is worth.
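As an illustration, the sketch below compares a submitted record against a hypothetical external property database keyed by address, filling gaps where the submission is silent and flagging fields where the independent view disagrees. The field names and the external_db structure are assumptions made for the example, not a description of any particular data provider.

```python
# A hedged sketch of schedule enrichment against an external property
# database; all field names and the external_db structure are assumed.
def enrich_record(record: dict, external_db: dict) -> dict:
    ext = external_db.get(record.get("address"))
    if ext is None:
        return record  # no independent view available; keep original data

    enriched = dict(record)
    for field_name in ("construction_class", "occupancy", "insured_value"):
        submitted = record.get(field_name)
        independent = ext.get(field_name)
        if submitted in (None, "", "unknown") and independent is not None:
            enriched[field_name] = independent  # fill the gap
        elif independent is not None and submitted != independent:
            # Disagreement between sources: flag rather than overwrite.
            enriched.setdefault("discrepancies", []).append(field_name)
    return enriched
```

Flagging disagreements rather than silently overwriting them keeps the underwriter in control of which source to trust.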

To be successful, both the databases and heuristics need to be accurate and timely. Companies excelling at assessing data quality are using systems integrated into the underwriting process, so there is no meaningful increase in analysis time.

MOVING FORWARD

All major events provide new insights for catastrophe models, and the hurricanes of 2004 and 2005 were no exception. The models have long since been updated to reflect these lessons. But the problem of poor quality data still affects the industry and remains one of the biggest barriers to accurate loss assessment, portfolio management and underwriting.

Following another year of relatively quiet hurricane activity, rates are starting to slip; there are worrying indications that risks are again being written with insufficient attention to location, construction type and size. Ultimately, the efforts insurers and reinsurers make to raise the standards of their data will determine how well-prepared the industry is for the next major event.