OCR, Cybersecurity, and Threat Intelligence: Using Optical Character Recognition and Machine Learning to Identify Risk
Introduction and key takeaways
Optical character recognition (OCR) can immediately help security and fraud teams augment their data collections with timely, actionable intelligence. For example, OCR technology enables CTI and SOC teams to proactively identify when sensitive organizational or customer data—derived from images—is posted by threat actors in illicit communities and actively being leveraged across the deep and dark web.
In this article we:
- Define OCR—its history, development, and applications across the cybersecurity, corporate security, and physical security landscape;
- Examine how OCR and machine learning can be leveraged as a time-saving organizational tool—especially for intelligence gathering and threat hunting teams;
- Describe how threat actors leverage images as “proof,” enabling them to more effectively conduct business;
- Outline use cases—plus the explicit ROI that OCR can generate for cyber and physical security teams in the financial services industry and other public and private sectors.
What is Optical Character Recognition (OCR)?
First developed more than 100 years ago as a way to convert printed text into sounds for the visually impaired, and later commercialized by Ray Kurzweil and Xerox in the 1980s, OCR technology has come a long way.
Today, OCR is primarily used across a variety of applications and industries to extract text, logos, and other identifiable objects from images. For security and fraud teams, OCR can help organizations identify potential cyber, corporate, and physical threats, and then take the necessary steps to mitigate them.
Why OCR is a vital tool for security and fraud teams (and the CISOs who lead them)
Machine learning and searchable OCR
The information that OCR extracts and classifies from images should be immediately searchable, eliminating systematic barriers that may exist between raw data and actionable intelligence.
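As a rough illustration of what "immediately searchable" can mean in practice, the sketch below builds a toy inverted index over text that an OCR engine has already extracted. The class name `OcrSearchIndex`, the image IDs, and the sample text are hypothetical and are not drawn from Flashpoint's product; a production system would more likely rely on a dedicated search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class OcrSearchIndex:
    """Toy inverted index: token -> IDs of images whose OCR text contains it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, image_id, ocr_text):
        """Index the text extracted from one image (hypothetical OCR output)."""
        for token in tokenize(ocr_text):
            self._postings[token].add(image_id)

    def search(self, query):
        """Return sorted IDs of images whose text contains every query token."""
        tokens = tokenize(query)
        if not tokens:
            return []
        matches = set.intersection(
            *(self._postings.get(token, set()) for token in tokens)
        )
        return sorted(matches)


# Illustrative data only; these strings stand in for real OCR output.
index = OcrSearchIndex()
index.add("img-001", "Example Bank  Account balance: $12,400")
index.add("img-002", "Employee badge ACME Corp")
print(index.search("example bank"))  # ['img-001']
```

Because the index is updated as each image is processed, an analyst's query can match an image as soon as its text has been extracted.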
When paired with machine learning algorithms, OCR is an effective and efficient method of inspecting, identifying, and classifying images. Flashpoint’s OCR technology can even identify information in images regardless of language or text orientation (horizontal, vertical, diagonal, or curved).
Altogether, the most capable OCR enables security and fraud teams to quickly identify images of fraudulent activity within illicit communities and take meaningful programmatic steps that mitigate those risks.
Why images matter: Risk exposures and OCR
Threat actors are increasingly showcasing sensitive information on chat platforms, including compromised bank accounts, domains, usernames, passwords, networks, devices, phone numbers, stolen identities, exploit code, as well as Remote Desktop Protocol (RDP) access.
Furthermore, threat actors may use images to boast about their recent activities, share sensitive or exploitative information, and even plan future attacks.
As image-rich, mobile-first messaging becomes a primary mode of communication in illicit communities, organizations across the public and private sectors require insight into how these threat actors gain access to, and then leverage, sensitive information.
While posting images may lend legitimacy to threat actors’ claims, it also provides valuable information for security teams to discover and take appropriate action.
Inside the minds of threat actors
Let’s pretend for a moment that you’re in the market to purchase stolen credentials to a bank account. To do so, you scan through multiple text-based posts on an illicit channel (a chat server, for example); each poster claims to have access to various types of banking and investment account data.
How do threat actors on the buy side verify that the sellers actually have access to financial accounts? On the other hand, how do these sellers provide proof to potential buyers that the stolen credentials are viable?
Finally, if you’re a member of a fraud team that’s diligently hunting for indicators of compromise (IOCs), how would you determine which claims are credible, and therefore require action, and which are false positives?
The answer? Images provide proof.
Use cases and OCR applications
Security teams can leverage OCR technology to discover, and be alerted to, activities that could pose a threat to their business and customers. Let’s look at a few real-life use cases to better understand how this works.
OCR, fraud, and compromised credentials
Fraud teams at banks and other financial institutions have certainly had their hands full. The number of threats and their level of sophistication have been steadily on the rise, with a 149% increase in digital fraud this year alone. Fraudulent purchases and compromised accounts cost banks billions annually, making fraud mitigation a primary goal.
This poses a huge challenge for fraud teams, who must not only know which illicit communities to monitor across the deep and dark web, but also sift through massive amounts of data to identify relevant threats to their business and customers, such as credit card fraud made possible by stolen credentials. OCR technology can be a game changer in helping to overcome this challenge.
When pointed toward a robust dataset of illicit online communities, OCR capabilities allow fraud teams to search through not only text chatter, but also images posted by threat actors, such as a company logo on a customer statement that has fallen into the wrong hands.
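One simple way to picture this kind of search is a watchlist match over OCR output. The sketch below is a minimal illustration under stated assumptions: `CONFUSABLES`, `normalize`, and `find_mentions` are hypothetical names, the records are invented, and plain substring matching stands in for whatever matching a real platform performs. It folds a few characters that OCR engines often confuse (such as `0` and `O`) before comparing.

```python
# Fold characters that OCR engines commonly confuse onto one canonical form.
# This small mapping is an illustrative assumption, not an exhaustive list.
CONFUSABLES = str.maketrans({"0": "o", "1": "l", "5": "s", "|": "l"})


def normalize(text):
    """Lowercase, fold confusable characters, and collapse whitespace."""
    folded = text.lower().translate(CONFUSABLES)
    return " ".join(folded.split())


def find_mentions(ocr_records, watchlist):
    """Yield (image_id, term) for each watchlist term found in a record.

    ocr_records: iterable of (image_id, ocr_text) pairs.
    watchlist: iterable of names to look for, e.g. a company name.
    """
    normalized_terms = [(term, normalize(term)) for term in watchlist]
    for image_id, ocr_text in ocr_records:
        haystack = normalize(ocr_text)
        for term, needle in normalized_terms:
            if needle in haystack:
                yield image_id, term


# Invented sample records standing in for OCR output from posted images.
records = [
    ("img-1", "EXAMPLE  BANK statement"),
    ("img-2", "Examp1e Bank 0nline login"),
    ("img-3", "unrelated screenshot"),
]
for hit in find_mentions(records, ["Example Bank"]):
    print(hit)
```

Folding confusable characters trades precision for recall: it catches noisy extractions such as "Examp1e", but it can also produce false positives that an analyst would still need to review.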
OCR for internal and insider threats
It’s essential for CTI and SOC teams to gain a stereoscopic understanding of potential vulnerabilities, including risk apertures opened by internal actors. These data leaks can be caused by honest mistakes, such as inadvertently storing code in open repositories, or by insiders with malicious intent who seek to profit from their access to sensitive information.
OCR allows security teams to detect when this activity occurs. By searching images for company names and logos, a security analyst can identify images an insider has posted as proof of access, such as an employee badge or a screenshot of a system or application of interest.
OCR for physical security teams
Cyber and application security teams are not the only ones that can benefit from OCR search technology. Although the benefit may be less apparent, physical security teams can use this capability as well.
As in the insider threat example above, an employee offering a picture of their badge as proof of physical access can be caught with OCR. Bad actors may also post pictures of building entrances that include the company’s name or logo. Keypads, scanners, and other physical security devices may have known exploits that could allow unauthorized access to company facilities.
Using OCR to search for images of the physical security devices your company has in use could help your security department discover and take action against these types of risks.
Other important OCR applications
There are many other use cases OCR can help address for security and fraud teams, in addition to the few mentioned above.
OCR enables CTI and SOC teams to search for posts that include pictures of ID documents bearing real or fraudulent personally identifiable information, including driver’s licenses, Social Security cards, and passports.
OCR enables the discovery of access to stolen accounts—bank, investment, credit card, insurance, social media, and other spaces where sensitive organizational and customer data can be exposed.
OCR enables security teams to find evidence of counterfeit checks, including cashier’s checks issued directly by banks.
Extremist images (physical security)
OCR technology enables organizations to explore images that might reveal the physical location of businesses, employees, and customers—all of which could be used to cause harm or disruption, including reputational damage, or gain unauthorized access to information systems.
Illicit trade of pharmaceuticals
OCR can provide actionable intelligence that can help security teams mitigate fraud. This includes the falsification of prescription and other medical documentation, enabling threat actors to gain unauthorized access to in-demand medications.
OCR can identify whether the addresses and balances of crypto wallets have been exposed online.
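To make the wallet example concrete, the sketch below scans OCR-extracted text for strings shaped like wallet addresses. It is a minimal illustration, not a description of Flashpoint's method: `WALLET_PATTERNS` and `find_wallet_candidates` are hypothetical names, only two common address formats are covered, and the patterns check shape alone without validating checksums, so hits are candidates for review rather than confirmed addresses.

```python
import re

# Shape-only patterns (assumptions for illustration; no checksum validation).
WALLET_PATTERNS = {
    # Legacy Bitcoin: starts with 1 or 3, Base58 alphabet (no 0, O, I, l).
    "bitcoin_legacy": re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b"),
    # Ethereum: "0x" followed by 40 hexadecimal characters.
    "ethereum": re.compile(r"\b0x[a-fA-F0-9]{40}\b"),
}


def find_wallet_candidates(ocr_text):
    """Return (kind, address) pairs for strings shaped like wallet addresses."""
    hits = []
    for kind, pattern in WALLET_PATTERNS.items():
        hits.extend((kind, match) for match in pattern.findall(ocr_text))
    return hits


# Sample text standing in for OCR output; the address is the widely
# published Bitcoin genesis block address, used here only as a test string.
sample = "Send to 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa balance 0.5 BTC"
print(find_wallet_candidates(sample))
```

OCR noise can corrupt a single character in a long address, so a real pipeline would likely pair pattern matching like this with checksum validation and analyst review.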
The ROI of OCR: Public and private sector wins
Flashpoint customers have found immense value leveraging our OCR capabilities to identify potential risks to their organizations.
- A financial services customer identified $10.1M in at-risk account balances and secured the accounts within six days of Flashpoint’s launch of OCR capabilities.
- A financial services customer identified an average of 125 account screenshots with $6M at risk on a monthly basis.
- Multiple gaming industry customers have found OCR searches helpful to identify new types of “cheats” targeting their platforms.
- A Flashpoint customer identified an image within Flashpoint’s collections showcasing a tutorial targeting a state’s unemployment insurance site. Flashpoint’s OCR capability identified the image as a step-by-step example of how to claim unemployment, which mentioned the customer’s corporate name as the employer in one of the steps.
See Flashpoint’s OCR Solution in Action
Organizations across a variety of sectors, from financial institutions to governments and law enforcement, leverage Flashpoint’s OCR technology to proactively identify threats, prevent fraud, and take action to combat exposure to risk. To see our OCR technology in action and learn how it can help your organization, request a demo or sign up for a free 90-day trial today.