Threat actors pose an increasing danger to any organization’s digital trustworthiness. Criminals go where the money is. For instance, executing a ransomware attack is usually going to be more profitable against a company, educational institution, or government agency than against individuals. Less work, generally less risk, and a significantly larger payoff are why ransomware attacks against organizations are increasing, rising 13% in the last five years.1
Assume Breach Mentality
The reality is that a threat actor has a multitude of ways to try to gain access to an organization. Often the easiest is a phishing attack, and while phishing has traditionally arrived via email, the rise in voice phishing (vishing) is concerning, especially with the use of GenAI to clone the voice of someone trusted.2 We can increase our educational efforts and apply the other standard practices to address these types of attacks, but all it takes is the right person having a bad day, being distracted, or losing focus for a single message or call, and a threat actor is in the environment.
Given that a breach can happen at any time and a threat actor does not have to launch a sophisticated attack to get in, the recommendation within cybersecurity today is to adopt an “assume breach” mentality. In other words, start from the position that adversaries are already in the environment and seek to detect, contain, and eject them while maintaining or restoring capability as quickly as possible. One write-up recommends three strategies to employ: visibility, robustness, and containment.3 Let us look at those three strategies.
Visibility
Detecting mistakes and issues is key, whether the focus is on day-to-day operations or cybersecurity. From a trust perspective, we want our systems, processes, and services to be available and fully functional to their consumers/customers. The faster an issue is detected, the quicker we can head off a problem and reduce the likelihood of downstream impact. Visibility throughout our ecosystem is critical to maintaining a high trust level for our various relationships.
If we ensure appropriate visibility, especially for the systems we control, we should be able to detect the possibility of a cyber incident more quickly. This gives us the opportunity to head off or minimize the impact of a system or data breach. An organization can perform its due diligence and build all kinds of security mechanisms into its own platforms, but a vulnerability in a third-party component or a supply chain attack may bypass all of the organization’s hard work. Therefore, visibility is a necessity everywhere feasible.
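Visibility of this sort is often built from simple building blocks. As a minimal sketch, the following checks a set of service health endpoints and reports which appear down; the service names and URLs are hypothetical, and a production deployment would feed such results into alerting and escalation rather than just returning them:

```python
import urllib.request
import urllib.error

# Hypothetical health endpoints for services we control.
SERVICES = {
    "orders-api": "https://orders.example.com/health",
    "payments-api": "https://payments.example.com/health",
}

def check_service(name, url, timeout=5):
    """Return True if the service answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # unreachable or errored counts as down

def run_checks(services=SERVICES):
    """Check every service and return the names of those that appear down."""
    return [name for name, url in services.items() if not check_service(name, url)]
```

The point is not the specific tooling but that detection is continuous and automated, so a failure surfaces in minutes rather than when a customer calls.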
Robustness
Robustness is more than just capacity. It is the ability to keep running when there are challenges to the operating environment. Oftentimes, this is referred to as IT resilience.4 The visibility to detect problems and correct them quickly helps with robustness/resilience. More important, though, are the architectures of core systems and the incident/resiliency plans to deal with various disruptions. Speaking of architecture, there are two key areas to consider.
First, we must design systems where a single failure does not take down the whole system. Examples of resilient architecture include multiple, redundant servers handling the same function, and deploying functionality across multiple regions and cloud providers. Having back-up and stand-by systems can also be part of the resilient architecture. For instance, if the primary WAN line has a fiber cut, does the organization have a secondary line with a different pathway that would not be impacted? Is there a backup mechanism to accept payments if the primary means has a service outage?
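The payment example can be sketched in a few lines. This is illustrative only: the processor functions are placeholders standing in for real payment integrations, with the primary deliberately simulating an outage so the fallback path is exercised:

```python
# Sketch of failover between a primary and a standby payment processor.
# Both processor callables are placeholders, not a real payment API.

def pay_with_primary(amount):
    raise ConnectionError("primary processor unreachable")  # simulated outage

def pay_with_backup(amount):
    return f"backup processed ${amount:.2f}"

def process_payment(amount, processors=(pay_with_primary, pay_with_backup)):
    """Try each processor in order; fall back to the next on failure."""
    last_error = None
    for processor in processors:
        try:
            return processor(amount)
        except ConnectionError as exc:
            last_error = exc  # in practice: log, alert, then try the next one
    raise RuntimeError("all payment processors unavailable") from last_error
```

The same shape applies to redundant servers or WAN links: enumerate the alternatives, try them in order, and make the failure of the last one loud.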
Second, we want to employ security best practices such as zero trust, surface area reduction, and appropriate hardening configurations. With a proper zero-trust architecture in place, a compromised workstation cannot reach as many servers in the environment. Even with administrative rights, the workstation will not be able to access the database server that holds critical intellectual property, and a threat actor will not be able to launch a Denial of Service (DoS) attack against key customer-facing systems. Surface area reduction goes hand-in-hand with that. Even in cases where one platform can talk to another platform, if the second platform does not expose a remote procedure call (RPC) because it is not necessary, an RPC coming from the first platform cannot be used to attack it. Hardening efforts are usually straightforward things like ensuring default configuration and account/password combinations are not used, input validation is appropriately handled (which can sometimes be done at the service layer rather than within the application), and a TLS connection is enforced even if a non-encrypted connection is initiated. A number of industry benchmarks assist with this type of effort, and they should be investigated and used as appropriate.
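As one concrete hardening example, enforcing modern TLS on a client connection can be done in a few lines of Python's standard library. This is a minimal sketch, not a complete hardening baseline; consult the relevant industry benchmark for your platform:

```python
import ssl

def hardened_tls_context():
    """Build a client TLS context reflecting common hardening guidance:
    a modern protocol floor, certificate verification, and hostname checks."""
    ctx = ssl.create_default_context()            # verifies certificates by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse legacy SSL/TLS protocols
    ctx.check_hostname = True                     # reject certificate/host mismatches
    ctx.verify_mode = ssl.CERT_REQUIRED           # never accept unverified peers
    return ctx
```

Passing such a context to the HTTP or socket layer ensures a non-compliant connection fails closed instead of silently downgrading.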
Containment
If a system develops a problem, the area with the problem should be isolated to prevent other issues from occurring. For instance, if a particular component is incorrectly performing interest calculations, then it should be isolated and contained so that those incorrect calculations are not being written or accepted elsewhere. Imagine a case where the component is calculating not a 5% interest rate on a loan but a 50% interest rate! If a borrower were to see that, it would certainly impact trust, and even more so if it affected a significant number of borrowers and the miscalculation hit the news. We can imagine a potential headline: “ACME Corporation tried to cheat its customers by increasing interest rates 10-fold!” Any kind of failure or anomaly should be contained, whether operational or cyber related.
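One simple containment mechanism for that scenario is a sanity gate between the calculating component and everything downstream. The sketch below quarantines any result outside an expected range instead of posting it; the bounds and loan records are illustrative:

```python
# Sketch of containing a miscalculating component: results outside a
# sanity range are quarantined for review rather than written downstream.

MIN_RATE, MAX_RATE = 0.0, 0.25  # accept rates between 0% and 25% (illustrative bounds)

def partition_results(calculated):
    """Split (loan_id, rate) results into accepted and quarantined sets."""
    accepted, quarantined = [], []
    for loan_id, rate in calculated:
        if MIN_RATE <= rate <= MAX_RATE:
            accepted.append((loan_id, rate))
        else:
            quarantined.append((loan_id, rate))  # hold for review; do not post
    return accepted, quarantined
```

A 50% rate never reaches a borrower's statement; it lands in a review queue, and a spike in the quarantine count becomes its own alert.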
With regard to cyber-related incidents, if the organization can lock down a threat actor and prevent lateral movement within a particular system, and especially movement to other systems, it can keep more of its services and offerings available. That reduces the monetary loss in the near term and hopefully minimizes the loss of trust in the long term. Tying this back to an assume breach mentality: if we work under the operating principle that we are already compromised, then we must also work under the assumption that we can never shut everything down. If we must continue to operate while minimizing the impact of threat actors already present, then our containment mechanisms are necessary defenses to plan for, implement, and test regularly.
Employ Active Testing Techniques
Speaking of testing, how do we test? We can test passively and actively. Both are needed in an assume breach mentality. Most organizations perform passive “testing” via the use of vulnerability scanners and the like. It is important that this type of scanning be conducted regularly; however, it is equally important that the results of those scans be reviewed and acted upon. If scans are scheduled and run but no one is following up on them, that is the same as “if no one is looking at the logs, it’s not a control.” However, passive techniques are not enough. Active techniques, specifically penetration testing and threat hunting, are critical.
Proper penetration testing will reveal weaknesses that passive vulnerability scans will not. There are tools in the marketplace to assist with this type of testing, such as dynamic application security testing tools,5 but tools are not enough. Properly trained personnel running the tools and assessing the results is key. For instance, if a tool indicates there is a possible vulnerability, personnel will need to have the skills to assess whether the indicator is a false positive or not.
Just as vulnerability scanning is not enough, neither is penetration testing, as both focus on known assets. Threat hunting6 focuses on detecting vulnerabilities that are not known. Scanning results are used as a starting point, but ultimately, threat hunting investigates anomalies or areas not normally covered by scans. For instance, a scan might detect an unusual device on the network. Threat hunting would involve probing the device to determine what it is and if it is allowed. Threat hunting, therefore, is not just technical. It could also reveal cases where proper documentation and other processes were not properly followed. Typically, though, we focus on the technical side because we are looking for the threat actor we assume to already be in the environment. The sooner we find them, the sooner we can take proper countermeasures.
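The unusual-device example can be reduced to a single hunting step: compare what is actually discovered on the network against the asset inventory. In this sketch the inventory and MAC addresses are made up, and a real hunt would then probe each unknown device rather than stop at listing it:

```python
# Sketch of one threat-hunting step: flag devices seen on the network
# that do not appear in the asset inventory. Addresses are illustrative.

KNOWN_ASSETS = {"aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02"}

def find_unknown_devices(discovered, inventory=KNOWN_ASSETS):
    """Return discovered MAC addresses absent from the inventory, normalized."""
    return sorted(mac.lower() for mac in discovered if mac.lower() not in inventory)
```

Every address this returns is a lead: either a threat actor's foothold, or a device someone deployed without following the documentation process, which is a finding in its own right.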
A Short Word About Plans
Everything discussed up to this point involves having proper plans. There should be plans in place to ensure proper visibility of systems. That plan should also encompass alerting and escalation. Plans around robustness should include procedures for expected issues and failures. What should be failed over? What is the process for bringing the backup system online when the primary one develops an issue? Of course, vulnerability scanning and threat hunting should have solid plans as well. In reality, plans should exist for the most likely scenarios whether it is ransomware or a severe weather event such as a hurricane or wildfire. Plans should encompass both technology and organizational responses and be regularly tested.
It is not realistic to test all types of plans by carrying out the actions. However, periodic reviews of plans, tabletop exercises, and the like provide forms of testing that help detect issues with those plans. Communication mechanisms should be periodically verified to ensure they work and everyone is familiar with them. Every aspect that can be tested should be. If a plan is not tested, how will the organization know if it will work when needed? This is something we talk about a lot with regard to database administration. If I never restore the database backups, how do I know they are good? If I never exercise the recovery plans, how do I know we will meet RPO (recovery point objective) and RTO (recovery time objective) requirements? Plans are good, but tested plans are better.
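The database point can be made concrete with a restore test. As a toy sketch using SQLite, the only proof a backup is good is restoring it somewhere and reading the data back; the `accounts` table here is a stand-in for whatever data matters to the organization:

```python
import sqlite3

def verify_backup(source):
    """Restore a backup of the source database into a scratch database
    and confirm the restored data matches, the only real proof it is good."""
    scratch = sqlite3.connect(":memory:")
    source.backup(scratch)  # take the "backup" (a real restore test would use a stored file)
    restored = scratch.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
    original = source.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
    scratch.close()
    return restored == original
```

A production equivalent restores last night's backup file to a standby server on a schedule, timing the restore against the RTO and diffing the data against expectations.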
Operational Availability Is an Organizational Advantage
The organization that can stay operational in adverse situations while its competitors flounder has an organizational advantage over said competitors. The organization that can weather issues and delays and stay online earns trust. The organization that can rapidly respond to a threat actor and stop or minimize a data breach scenario while others lose millions of records stands out. Adopting an assume breach mentality and building systems and plans accordingly is critical to being the organization that keeps running when others falter. After all, the most trusted organizations in the digital ecosystem are the ones we can depend on to be up when others are not.
Endnotes
1 Sobers, R.; “Ransomware Statistics, Data, Trends, and Facts [updated 2024],” Varonis, 13 September 2024
2 Shea, S., Krishnan, A.; “How AI is Making Phishing Attacks More Dangerous,” TechTarget, 22 October 2024
3 Arora, K.; “Zero Trust: Three Key Strategic Components of Assume Breach,” F5, 13 April 2023
4 Edwards, J.; “IT Resilience and How to Achieve It,” InformationWeek, 6 September 2023
5 OWASP, “Dynamic Application Security Testing (DAST),” OWASP DevSecOps Guideline – v-0.2
6 Cisco, “What is Threat Hunting?”
K. BRIAN KELLEY | CISA, CDPSE, CSPO, MCSE, SECURITY+
Is an author and columnist focusing primarily on Microsoft SQL Server and Windows security. He currently serves as a data architect and an independent infrastructure/security architect concentrating on Active Directory, SQL Server, and Windows Server. He has served in a myriad of other positions, including senior database administrator, data warehouse architect, web developer, incident response team lead, and project manager. Kelley has spoken at 24 Hours of PASS, IT/Dev Connections, SQLConnections, the TechnoSecurity and Forensics Investigation Conference, the IT GRC Forum, SyntaxCon, and at various SQL Saturdays, Code Camps, and user groups.