Stop Cyber Attacks Before They Start: Data Harvesting and Targeting

The Greek philosopher Plato wrote that “the beginning is the most important part of the work.” The great American statesman, scientist, and philosopher Benjamin Franklin similarly emphasized the importance of planning when he stated that “by failing to prepare, you are preparing to fail.”

It is unfortunate that many cybercriminals heed their advice today. The number of threats continues to grow rapidly; malware infections have ballooned from fewer than a quarter million per month a little more than two years ago to nearly a half million today. Hacktivism, social and government disruption, and espionage rank at the top of the list of motives, as cybercriminals sell information on the Dark Web and encrypt data and systems for extortion.

The days when cybercriminals predominantly targeted credit cards are past; medical information now is worth 10 times more. Cybercriminals use ransomware to infiltrate critical services such as healthcare systems that rely on real-time information. Their stealth activities span everything from reprogrammed USB devices, to new malware that evades sandboxing and morphs so quickly that antivirus software cannot match its signatures, to activity hijacking that uses mirror versions of legitimate apps or reputable websites.

Think these are restricted only to high-profile industries and large enterprises? You’re mistaken. This new world of cyberthreats knows no boundaries and company size is irrelevant.  

Reconnaissance: The First Step

Reconnaissance is the first step most cybercriminals take when planning an attack or intrusion, and it is one of the techniques they use most often. They employ a variety of techniques, both active and passive, to gain information about the vulnerabilities of a network. Prime targets include metadata from social media, email attachments, and documents published on corporate websites, which can reveal system names, login usernames and passwords, departments, and other details. This information is leveraged to set the stage for phishing and other social engineering activities.

The following are some of the ways cybercriminals conduct metadata reconnaissance:

  • Information harvesting. Seeks enough information to become credible and trustworthy. Typically, a name isn’t enough; a name needs to be overlaid with a company name or a location.
  • Vulnerabilities. Culls machine IDs, system names, application versions, and directory structures to mount an attack through known and unknown vulnerabilities.
  • Social media. Social networks are proving to be rich repositories for cybercriminals seeking personal information that can be exploited.
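Defenders can run the same kind of search against their own material before it is published. The following is a minimal sketch, not a complete audit tool: the function name `find_exposed_details` and the three patterns are illustrative assumptions, and a real review would need patterns tuned to the organization's own naming conventions.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive list):
# UNC paths expose server and share names, user profile paths expose
# login names, and email addresses reveal naming conventions.
PATTERNS = {
    "unc_path": re.compile(r"\\\\[A-Za-z0-9_.-]+\\[^\s\"<>|]+"),
    "windows_user_dir": re.compile(r"[A-Za-z]:\\Users\\[^\\\s\"<>|]+"),
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def find_exposed_details(text):
    """Return (label, matched_text) pairs for details an attacker could harvest."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(0)))
    return findings


if __name__ == "__main__":
    sample = r"Saved by jdoe@example.com to \\FILESRV01\finance\q3.docx"
    for label, value in find_exposed_details(sample):
        print(label, value)
```

Running a check like this over text extracted from outbound documents gives a rough view of what a stranger could learn from them.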

It’s in the Document

Documents contain “hidden” information that can be harvested; this includes metadata such as usernames and system names. Word documents alone contain various types of hidden data and personal information: 

  • Comments and revision marks from tracked changes, versions, and ink annotations
  • Document properties and personal information
  • Headers, footers, and watermarks
  • Hidden text
  • Document server properties
  • Custom XML data
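To see how little effort this takes, consider the document properties alone. A .docx file is a ZIP package, and its author fields usually sit in a part named docProps/core.xml. The sketch below uses only the Python standard library; the helper name `read_core_properties` and the choice of fields are assumptions made for illustration, and it does not look at comments, tracked changes, or hidden text.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml in Office Open XML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Fields chosen for illustration; the part can hold others as well.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "title": "dc:title",
}


def read_core_properties(docx_file):
    """Return populated author-related properties from a .docx path or file object."""
    with zipfile.ZipFile(docx_file) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))

    found = {}
    for label, tag in FIELDS.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

Calling `read_core_properties("report.docx")` on a published file (the file name here is hypothetical) returns whatever usernames and titles the author's software recorded.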

Documents contain a surprising range of information, and they can be found in emails, on intranets, and even on external websites. There are several approaches that organizations can leverage to help break the attack chain at the reconnaissance stage, thwarting cybercriminals before they get started:

  • Deep content inspection. Deep content inspection enables organizations to detect embedded metadata, revision history, and fast save data. This includes recursive decomposition, true binary detection—even for embedded objects—and subcomponent detection (e.g., header, footer, properties).
  • Document sanitization. A gateway detects and removes metadata and other embedded data automatically and consistently (i.e., set and forget) across email, the cloud, and social media.
  • Adaptive data loss prevention. A common policy engine automatically removes sensitive data and malicious content as it passes in and out of your network; it removes only the exact content that violates policy, enabling the communication to continue.
  • Intrusion prevention. Intrusion prevention system technology can pinpoint scanning during an attack and shut it down before the attacker can gain too much knowledge of an organization’s network.
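As a rough illustration of the sanitization idea, the sketch below copies a .docx package and blanks two author fields in docProps/core.xml while leaving the rest of the document intact. It is not how any particular gateway product works: the function name `sanitize_docx` and the list of fields to clear are assumptions, and a real sanitizer would also need to handle comments, revision history, custom XML, and embedded objects.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"

# Keep the conventional prefixes when the XML is written back out.
for prefix, uri in {
    "cp": CP,
    "dc": DC,
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}.items():
    ET.register_namespace(prefix, uri)

# Hypothetical policy: the core properties to blank before release.
FIELDS_TO_CLEAR = (f"{{{DC}}}creator", f"{{{CP}}}lastModifiedBy")


def sanitize_docx(data: bytes) -> bytes:
    """Return a copy of a .docx package with the listed author fields emptied."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            payload = source.read(item.filename)
            if item.filename == "docProps/core.xml":
                root = ET.fromstring(payload)
                for tag in FIELDS_TO_CLEAR:
                    for element in root.iter(tag):
                        element.text = ""
                payload = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            # Every other part is copied through unchanged.
            target.writestr(item.filename, payload)
    return output.getvalue()
```

Only the offending fields are emptied and the document body passes through untouched, which mirrors the principle of removing just the content that violates policy so the communication can continue.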

By Dr. Guy Bunker, SVP of Products and Marketing - Clearswift  @guybunker
