At least 30% of enterprise data is considered redundant, obsolete, trivial, or “dark”1 data. Organizations spend as much as US$34 million holding onto data that could otherwise be deleted,2 and since September 2020, regulators have issued approximately US$3.4 billion in record-keeping-related fines.3 Despite the excessive cost, defensible disposal is rarely treated as a business imperative, and when initiatives are started, they often stall before real progress is made, leaving organizations exposed on numerous fronts.
While the financial impact of data overretention is striking, it is only one piece of a larger puzzle. In addition to cost increases, 85% of general counsel surveyed said the rising variety and volume of data types are also driving increased risk.4 Indeed, when teams fail to defensibly dispose of data and uphold rigorous deletion and retention practices across all information sources, an array of legal, compliance, financial, and operational risks can quickly accumulate. The result is increased vulnerability to regulatory fines, crisis incidents, IT failures, e-discovery exposures, and other negative business outcomes.
Common threats that result from data overretention include:
- Compliance challenges—Highly regulated organizations, such as those in the financial services, life sciences, and energy industries, are required to meet specific retention guidelines for what data they may store, how they may store it, and for how long. However, retained data, whether known or hiding in the shadows, may be in violation of those requirements or may otherwise undermine the organization’s compliance policies.
- Privacy violations—Many global data protection and privacy laws limit how long personal and sensitive information may be stored. Without a defensible deletion policy, it is difficult to comply with these requirements and demonstrate compliance to authorities, particularly under laws such as the EU General Data Protection Regulation (GDPR)5 and China’s Personal Information Protection Law (PIPL).6 For example, many financial services institutions are placing renewed focus on understanding where personal data lives within their organization and with contracted third parties, how that information is managed, and whether it is being managed according to a retention and deletion schedule.
- Legal hold and e-discovery exposures—Large data volumes make legal hold and other e-discovery functions difficult. When systems are not mapped or managed, defensibility of preservation is difficult to uphold and verify, which may lead to spoliation issues. E-discovery costs can increase exponentially if an organization does not know what data it has, where it is located, or how to access it. Keeping 10 or more years of data as opposed to following retention policies will inevitably result in significant costs in collection, processing, and review.
- Operational inefficiencies—Where legacy data systems are in play, there is an increased risk of hardware or supporting software failures. Legacy systems are also often difficult or impossible to integrate with modern systems, which hampers access to data and creates inefficiencies for incident response and other business needs. In addition, teams often waste time parsing through excess data stores to find what they need, which puts further pressure on resources.
- Innovation stagnation and impaired decision making—When excess data is not centrally managed, it is difficult to use it for insights and strategic business decisions. An organization’s data can be a valuable asset, but only if it is accessible, known, and useable.
Barriers to Entry
Conversely, implementing a rigorous and enforced defensible deletion program can yield many benefits for organizations in any industry. Organizations with active litigation portfolios can decrease the time, cost, and complexity of e-discovery. Defensible deletion can also help increase trust by demonstrating sound governance, reduce the cost of third-party and onsite storage, and shrink the overall volume of data that could be impacted in an investigation, litigation, data breach, or other security event.
With such a clear cost and risk-to-benefits ratio, why is defensible deletion such a difficult hurdle for organizations to overcome? It is a challenging endeavor for many reasons, including:
- Process reengineering—Manual work must be completed before putting in place technology solutions and automated processes. This work requires diligent data search, mapping, and classification across large data repositories, which are often spread across geographies and dozens of enterprise functions or departments.
- Legal hold assessment—Legal hold policy and existing legal holds must be evaluated to identify information that must be preserved and separate it from information that can be deleted.
- Defensibility assessment—In addition to reviewing for legal hold obligations, data must also pass a review for any other retention requirements arising from legal and regulatory obligations. Records, emails, file shares, and other data stores must pass this test before they can be approved for disposition.
- Alignment with backups—Data residing in legacy backups must be mapped and managed before anything can be deleted. If an organization deletes data but does not address its copy in backups, legal and regulatory problems can arise later. Mapping also helps confirm whether data on backups is unique; unique backup data can be more difficult and expensive to restore than data pulled from more accessible sources.
- Resource and sensitivity challenges—Most organizations have an active investigation, litigation, or legal hold, which adds sensitivity and a degree of fluidity to the process of assessing data for possible disposal. There is rarely an optimal time to conduct this kind of sweeping project, which leads to hesitation and eventual delays. Additionally, internal teams often do not have the time to conduct comprehensive reviews of data and go through the process of attesting to the defensibility of specific deletions.
- Scope and scale—Organizations are creating vast amounts of data every day, much of which is being stored. Some enterprises have followed a save-everything philosophy for years, leading to data environments so large that remediation can seem insurmountable. With so much data, it is often difficult for organizations to even know where to begin. In addition, organizations that do have policies may have too many retention categories for their data, which can be nearly impossible to operationalize over the long term, resulting in the policy being ignored.
- Mergers and acquisitions—Organizations that have grown through mergers and acquisitions may have highly complicated and dispersed data environments that have been patched together with each new transaction. As organizations merge or absorb new divisions, they also acquire legacy systems, data stores, and complex IT environments, often without any remediation at the time of purchase. This exacerbates an already large and complex data landscape.
- Misconceptions regarding data—Many organizations retain data due to a desire to use it in the future for business purposes and insight. When this happens, teams may begin to stray from the existing retention policy. However, when data is not mapped and managed, it becomes useless for providing insight or supporting valuable analysis.
- Outdated policies—If a policy exists, the organization may wrongly assume that all is well. It is common for policies and the associated retention and deletion categories to go for years without a refresh, despite key changes in the organization that might impact data classifiers, or the implementation of new data sources that were not accounted for in the original policy.
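The legal hold, defensibility, and backup checks described in the list above can be pictured as a series of gates that a data set must clear before disposal. The following Python sketch is illustrative only: the `DataSet` fields, the `disposal_blockers` function, and the rule that the longest applicable retention period governs are assumptions made for the example, not a method prescribed by any regulation or vendor.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataSet:
    """Hypothetical description of one mapped data store or record group."""
    name: str
    created: date
    on_legal_hold: bool = False
    # Retention periods (in years) from every obligation found to apply.
    retention_years: list = field(default_factory=list)
    backups_mapped: bool = True


def add_years(d: date, years: int) -> date:
    """Add whole years to a date, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def disposal_blockers(ds: DataSet, today: date) -> list:
    """Return the reasons a data set cannot yet be deleted (empty list = eligible)."""
    blockers = []
    if ds.on_legal_hold:
        blockers.append("subject to legal hold")
    if not ds.retention_years:
        blockers.append("no retention requirement assessed")
    else:
        # Assumption: the longest applicable requirement governs.
        expires = add_years(ds.created, max(ds.retention_years))
        if today < expires:
            blockers.append(f"retention runs until {expires.isoformat()}")
    if not ds.backups_mapped:
        blockers.append("backup copies not mapped")
    return blockers
```

Returning the list of reasons, rather than a simple yes or no, gives reviewers something they can record when attesting to why a specific deletion was or was not approved.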
A Path Forward
To overcome these common barriers, organizations can build a program in stages, starting with the most impactful and sensitive data. This provides a way to demonstrate the value of data disposition and helps stakeholders see that it is indeed possible to achieve. Showing return on investment for a specific set of data can help build a business case for tackling more complex data stores.
Another step that helps make the effort more manageable (and sustainable for the long term) is to narrow and modernize the data categories and retention schedule. As an example, FTI Technology, a consulting organization, has seen clients with upwards of 1,000 categories in their retention schedule, far too many to maintain.7 By bringing that number down, an organization can make it easier for teams to apply retention and deletion to their data consistently.
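As a rough illustration of what narrowing a retention schedule might look like, the sketch below collapses a handful of legacy categories into broader buckets. The category names, bucket names, and retention periods are invented for the example, and keeping the longest period in each bucket is an assumption; in practice the mapping and the resulting periods would need legal and records management review, because the longest period can hold some data longer than its original category required.

```python
from collections import defaultdict

# Hypothetical legacy schedule: (legacy category, consolidated bucket, retention years).
LEGACY_SCHEDULE = [
    ("AP invoices - EMEA", "Finance records", 7),
    ("AP invoices - US", "Finance records", 7),
    ("Expense reports", "Finance records", 6),
    ("Offer letters", "HR records", 5),
    ("Performance reviews", "HR records", 4),
]


def consolidate(legacy):
    """Collapse legacy categories into buckets, keeping the longest period per bucket."""
    schedule = defaultdict(int)
    for _legacy_name, bucket, years in legacy:
        schedule[bucket] = max(schedule[bucket], years)
    return dict(schedule)


consolidated = consolidate(LEGACY_SCHEDULE)
print(f"{len(LEGACY_SCHEDULE)} legacy categories -> {len(consolidated)} buckets")
for bucket, years in consolidated.items():
    print(f"{bucket}: retain {years} years")
```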
Once data is mapped, categorized, and reviewed for legal and regulatory obligations, automation capabilities can be implemented. This may include tools that can scan and monitor records as they are created and through their lifecycle, putting the retention policy into action and removing data that has been verified as defensible for disposal.
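A minimal sketch of such automation follows. It assumes the mapping and review steps have already produced, for each record, a disposal date and a sign-off flag; the `Record` fields, the `sweep` function, and the `delete` callback are hypothetical names rather than features of any particular tool. The sweep defaults to a dry run and returns an audit trail, so that each disposal decision can be evidenced later.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """Hypothetical record metadata produced by earlier mapping and review."""
    record_id: str
    category: str
    dispose_after: date  # date on which the retention period ends
    cleared: bool        # True once legal and regulatory review has signed off


def sweep(records, today, delete, dry_run=True):
    """Dispose of cleared, expired records and return an audit trail."""
    audit = []
    for rec in records:
        if not rec.cleared:
            action = "skipped: not cleared"
        elif today < rec.dispose_after:
            action = "skipped: retention active"
        elif dry_run:
            action = "would delete"
        else:
            delete(rec.record_id)  # caller supplies the actual deletion step
            action = "deleted"
        audit.append({
            "record_id": rec.record_id,
            "category": rec.category,
            "date": today.isoformat(),
            "action": action,
        })
    return audit
```

Running the sweep in dry-run mode first lets stakeholders review what would be removed before any deletion takes place, which may help with the hesitation that often stalls these projects.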
Conclusion
Overretention continues to be an issue in litigation and regulatory enforcement, particularly related to data privacy laws. Even amid a perceived relaxation of enforcement by US federal agencies, global and state-level jurisdictions are actively pursuing data-related violations. The European Union is pursuing enforcement under the GDPR, the Digital Operational Resilience Act (DORA),8 and other similar data regulations. Across Asia Pacific and Latin America, similar actions are taking place under laws including Australia’s Privacy Act,9 Brazil’s General Data Protection Law (LGPD),10 and many others. These actions, combined with antiquated or inadequate data deletion practices, pose a serious vector of risk, especially for organizations in financial services, healthcare, and other highly regulated industries. Organizations that take defensible deletion seriously as a business imperative and work with experts to solve the issue in a pragmatic manner will reduce the cost and risk associated with organizational data, deriving value from it now and in the future.
Endnotes
1 IBM, “What is Dark Data”
2 Hackley, M.; “3 Common Information Governance Challenges (And How to Overcome Them),” Access, 2023
3 Global Relay Intelligence & Practice (Grip), “Recordkeeping Fines and Actions,” 11 September 2023
4 FTI Technology, The General Counsel Report 2025
5 Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation [GDPR]) (Text with EEA relevance)
6 Personal Information Protection Law (PIPL), Personal Information Protection Law of the People’s Republic of China, CN, 2021
7 FTI Technology, case studies
8 European Insurance and Occupational Pensions Authority (EIOPA), “Digital Operational Resilience Act (DORA),” EU
9 Australian Government, Office of the Australian Information Commissioner, “The Privacy Act,” AUS
10 International Association of Privacy Professionals (IAPP), “Brazilian General Data Protection Law (LGPD, English translation),” 2020
Steven Stein, CIPP, PMP
is a senior managing director at FTI Technology. Stein is a data privacy and information governance leader and former civil litigation attorney with more than 20 years of experience developing and implementing data risk and compliance programs. He focuses on serving clients in the financial services, healthcare, life sciences, and power and utilities industries. He advises in the areas of privacy, records management, information lifecycle, legal hold, defensible data disposition, data protection and litigation readiness.
John Goff, CISSP
is a managing director at FTI Technology. Goff is an expert in helping corporations develop and implement a wide range of information governance and e-discovery programs to reduce the costs and risk of enterprise data. His experience includes identifying sensitive information for remediation, conducting vendor risk assessments, bringing e-discovery software and processes in-house, developing and implementing social media policies and cyberrisk response plans, updating records management and legal hold programs, and operationalizing e-discovery for easier cross-department cost transparency and billing. With a background in IT, he is adept at collaborating across IT, legal, and information security teams to produce practical results.