TSB IT failure: lessons must be learnt

Published on Wed, 30/05/2018 - 10:31

After the recent failure at TSB, caused by a botched switch to a new IT system, CEO Paul Pester admitted that the bank was "on its knees". Just how risk managers can forge productive relationships with their IT departments will be the subject of a workshop at the annual conference, chaired by Peter Erceg of Lockton.

Unlike many other high-profile cyber incidents, the TSB event was not caused by any criminal or malicious act. Instead, it resulted from an IT migration the bank undertook in order to reduce costs and improve services.

The resultant fall-out and costs are multifaceted: severe business interruption; possible loss of customers and reduced customer loyalty; reputational damage to the brand and senior management; and as yet unquantified financial costs from compensating customers and hiring various service providers to mitigate and address the problem.

This incident demonstrates a far wider trend: business interruption is increasingly caused by non-physical events, with cyber-incidents chief among them (see Allianz's Risk Barometer 2018 for a fuller exposition of this trend).

It is also a further reminder of how changes to a company's value/supply chain for the purposes of improving efficiencies can involve various risks that are easy to underestimate.

Below are some of the key lessons to be taken from this incident:

1. Challenges of IT migrations

Some defects are always likely immediately following a large-scale IT migration. Thorough multi-stage testing, staggering the migration process, and monitoring the system during and immediately after the migration can help to minimise the risks and mitigate any problems before they lead to large-scale customer dissatisfaction.

Testing is vitally important but can be very expensive - and the more customers that use a system, the more testing is typically required and the more expensive and time-heavy the process can be. The final stage of testing (quality assurance) before a new system goes live should involve stress-testing an environment that is as close as possible to the IT system that will ultimately be used.

When testing a system that could be used by many thousands of customers, however, creating a testing environment that constitutes a robust facsimile of real life can be challenging. Such thorough, rigorous testing can also sometimes conflict with budgetary and contractual pressures, as well as boardroom expectations.

Once thorough multi-stage testing has been conducted, companies should also try to avoid making a wholesale, all-in-one-go migration. Instead, IT migrations should be staggered, with various different customer segments migrated piecemeal.

Following staggered migration, a company should monitor the new environment closely for any anomalies or early warning signs. Having a back-up plan, making decisions early and acting quickly if there are problems will also minimise customer inconvenience.
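As a rough illustration of the staggered approach described above, the sketch below migrates one customer segment at a time, checks an error rate after each step, and falls back if a threshold is breached. It is not a description of how TSB or any other bank ran its migration: the function names, the segment names and the 1% threshold are all hypothetical, chosen only to make the sequence of steps concrete.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class MigrationResult:
    # Segments that were migrated and passed the post-migration check.
    migrated: List[str] = field(default_factory=list)
    # Segment whose error rate breached the threshold, if any.
    halted_at: Optional[str] = None


def staggered_migration(
    segments: List[str],
    migrate: Callable[[str], None],      # hypothetical: moves one segment to the new system
    error_rate: Callable[[str], float],  # hypothetical: observed error rate for that segment
    rollback: Callable[[str], None],     # hypothetical: the back-up plan for that segment
    max_error_rate: float = 0.01,        # assumed threshold, for illustration only
) -> MigrationResult:
    """Migrate segments one by one, stopping at the first sign of trouble."""
    result = MigrationResult()
    for segment in segments:
        migrate(segment)
        # Monitor immediately after each step rather than at the very end.
        if error_rate(segment) > max_error_rate:
            rollback(segment)            # act quickly: revert this segment
            result.halted_at = segment   # and do not touch the remaining ones
            break
        result.migrated.append(segment)
    return result


if __name__ == "__main__":
    # Made-up error rates per segment, purely to exercise the sketch.
    observed = {"staff": 0.0, "pilot": 0.002, "retail": 0.05, "business": 0.0}
    outcome = staggered_migration(
        list(observed),
        migrate=lambda s: print("migrating", s),
        error_rate=lambda s: observed[s],
        rollback=lambda s: print("rolling back", s),
    )
    print(outcome)
```

The point of the design is that a problem in one segment limits the damage to that segment: later segments are never moved until the issue is understood.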

2. Poor post-crisis response

This incident further demonstrates the importance of fast, accurate and empathetic crisis management. While cyber-related losses alone can cause great harm, a company's response can also be critical - enabling it to quickly recover, or causing further damage.

In this case, TSB's CEO was accused of being "extraordinarily complacent" by MPs after he said the bank's move to a new IT system had mostly run smoothly. "What we are hearing this afternoon is the most staggering example of a chief executive who seems unwilling to realise the scale of the problem that is being faced," said MP Nicky Morgan.

In order to influence the crisis narrative, a company's communication to the outside world should be proactive and speedy. Otherwise the vacuum that you've left will be filled by other commentators.

The most effective protection against a major reputational incident is to plan for the worst-case scenarios. It is important to train, rehearse and test the people who will handle these different scenarios.

Empathy with affected third parties is vital: the crisis is not about you - it is about the people affected by it. Your crisis is guaranteed to escalate unless you try to put yourself in other people's shoes.

3. Responsibility to third parties

TSB has said customers will receive compensation not only for any financial loss but also for emotional distress and inconvenience, adding that no customer would be left "out of pocket".

What will constitute reasonable/appropriate compensation to affected third parties in future is somewhat unclear. The implications of needing to provide comprehensive compensation to many thousands of customers need to be considered, particularly if some of them claim to have been severely inconvenienced and materially affected.

Regulators are increasingly keen on protecting customers, so compensation following such incidents could become larger in future and cover a greater range of non-monetary costs and losses. The possibility of lawsuits following such incidents is high.

Peter Erceg is senior vice president, global cyber & technology, global professional & financial risks at Lockton. peter.erceg@uk.lockton.com

His conference workshop (C9: Getting on top of cyber risk management - how to work more effectively with the CISO) takes place at 10.00-11.15 on Tuesday 12 June.