Malware Terminology

Malware report

We collect malware reports from threat intelligence feeds (blocklists) that identify the URL, IP address, or domain name in the report as one which is serving up or distributing malware.

Some of the feeds that we employ report on multiple threat types, e.g., spam, malware, botnet, phish, etc. For malware studies, we only use reports that identify an activity that involves a form or forms of malware.

We gather information about malware activity from four threat feeds. More than one feed may identify a given malware activity, so de-duplication is necessary.
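
The de-duplication step can be sketched as follows. This is a minimal illustration, not our production pipeline: the feed labels, the `normalize_indicator` and `deduplicate` helper names, and the normalization rule (lower-casing the host, leaving the path untouched) are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def normalize_indicator(indicator: str) -> str:
    """Normalize a reported URL, domain name, or IP address for comparison."""
    value = indicator.strip()
    if "://" in value:
        parts = urlsplit(value)
        # Scheme and host are case-insensitive; the path and query are kept as reported.
        return parts._replace(
            scheme=parts.scheme.lower(), netloc=parts.netloc.lower()
        ).geturl()
    # Bare domain or IP address: lower-case it and drop a trailing root dot.
    return value.lower().rstrip(".")


def deduplicate(reports):
    """Merge reports naming the same indicator, keeping the set of feeds that listed it."""
    merged = {}
    for feed, indicator in reports:
        merged.setdefault(normalize_indicator(indicator), set()).add(feed)
    return merged


# Hypothetical reports: two feeds list the same URL with different letter case.
reports = [
    ("feed_a", "http://Example.XYZ/malwareA.exe"),
    ("feed_b", "http://example.xyz/malwareA.exe"),
    ("feed_c", "example.xyz."),
]
merged = deduplicate(reports)
```

Keeping the set of reporting feeds, rather than discarding duplicates outright, preserves the information that more than one feed identified the same activity.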

When we prefix “report” with a particular threat such as phishing, we are indicating that we are using only those reports identified as phish for our analyses.

Malware records

We enrich the threat intelligence data with ancillary data; for example, domain name and Internet registration data, DNS data, routing data, and statistical data (e.g., domains under management of a registry). The resulting malware records allow us to determine where malware is being served up or distributed, where malware attackers acquired the resources they used in a malware attack, and in some cases, how they acquired the resource (for example, through direct purchase of a domain name registration or through the compromise of a vulnerable device or computer).

There may be a many-to-one relationship between malware records and domain names or IP addresses; for example, when the domain name or IP address appears in URLs with different PATH elements (http://example.xyz/malwareA.exe and http://example.xyz/malwareB.doc) and also when subdomains are delegated from a domain name (http://cdn105-example.xyz/malwareA.exe and http://cdn110-example.xyz/malwareA.exe).
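
The many-to-one relationship can be illustrated by grouping reported URLs by host name. This is a sketch under assumptions: `records_by_host` is a hypothetical helper, and a real malware record carries far more than the URL.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def records_by_host(urls):
    """Group reported URLs by the host name that appears in each URL."""
    grouped = defaultdict(list)
    for url in urls:
        host = urlsplit(url).hostname  # lower-cased host, without any port
        if host:
            grouped[host].append(url)
    return dict(grouped)


urls = [
    "http://example.xyz/malwareA.exe",
    "http://example.xyz/malwareB.doc",
    "http://cdn105-example.xyz/malwareA.exe",
]
grouped = records_by_host(urls)
# Two records map to example.xyz; one maps to cdn105-example.xyz.
```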

Malware threat intelligence feeds (blocklists)

We consume two types of malware reporting services (feeds): URL blocklists (URLBLs) and domain blocklists (DBLs or DNSBLs).

Our URL source feeds for malware (Malware Patrol, MalwareURL, and URLhaus) analyze URLs to determine whether they harbor malware. We also use the Spamhaus DBL. This feed does not provide target information, but it does classify domains according to the type of threat the domain is used to perpetrate. For malware reporting, we use only the DBL response codes 127.0.1.5 (malware domain) and 127.0.1.105 (abused legit malware).
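
Selecting the malware-related DBL responses can be sketched as a simple lookup. The two codes and their labels come from the text above; the `malware_listing` helper and the idea of passing it a list of already-resolved response codes are assumptions for illustration (the DNS query itself is not shown).

```python
# DBL response codes that we treat as malware reports (from the text above).
MALWARE_DBL_CODES = {
    "127.0.1.5": "malware domain",
    "127.0.1.105": "abused legit malware",
}


def malware_listing(dbl_response_codes):
    """Return the labels of the malware-related codes among a domain's DBL responses."""
    return [
        MALWARE_DBL_CODES[code]
        for code in dbl_response_codes
        if code in MALWARE_DBL_CODES
    ]


# A domain whose responses include a non-malware code and a malware code.
labels = malware_listing(["127.0.1.2", "127.0.1.5"])
```

An empty result means the domain is not counted in malware reporting, even if the DBL lists it for another threat type.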

The feed operators analyze the files or executables associated with domains, IP addresses or URLs to determine the purpose that the malware serves. Some of these feeds provide sufficient information for us to classify malware.

Classifying malware

Malware can be written to perform different functions. There are hundreds of malware executables, many of which are polymorphic. Some malware evolves by adding or borrowing code from other malware, open source, or commercial software. A malware may begin as an executable with a single purpose, e.g., to download other malware, but the creator or others may add new components or functionality to a malware that sees success in the wild, for example to serve up ransomware. Researchers, blocklist service providers, and commercial security companies further complicate classification by adopting their own naming conventions.

Classification, including ours, is thus subjective. Our classification may be consistent with that of some but not all malware research or commercial anti-malware companies.

We began by “normalizing” metadata provided by MalwareURL and URLhaus, where our subscriptions provided sufficient metadata to study the types of malware that were being served from hosting resources. We use a classification of malware proposed by the Computer Antivirus Research Organization (CARO) as a baseline to create a taxonomic ranking, where:

Class = Threat

Order = Cybercrime

Family = Crime Type

Sub-family = Target or Origin

Genus = Malware Type

Species = Malware (name)
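
One way to picture the ranking is as a record with one field per rank. This is only a sketch: the class name, the field names, and the example values (including the malware name "ExampleLocker") are hypothetical and chosen for illustration.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class MalwareClassification:
    threat: str        # Class
    cybercrime: str    # Order
    crime_type: str    # Family
    sub_family: str    # Sub-family (Target or Origin)
    malware_type: str  # Genus
    malware_name: str  # Species

    def lineage(self) -> str:
        """Render the ranks from Class down to Species as one string."""
        return " > ".join(astuple(self))


# Illustrative values only; "ExampleLocker" is not a real malware name.
sample = MalwareClassification(
    threat="Threat",
    cybercrime="Cybercrime",
    crime_type="Malware",
    sub_family="Endpoint Malware",
    malware_type="Ransomware",
    malware_name="ExampleLocker",
)
```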

Order (Cybercrime)

The Order, Cybercrime, adopts the cyberthreats identified as cybercrimes in the Council of Europe’s Convention on Cybercrime. We measure Crime Types that the Convention describes as illegal access or misuse (malware, generally) and as interference with data or systems (e.g., ransomware).

Sub-family

We attempt to group or classify malware according to the primary or original purpose the malware serves. We identify three sub-families in Crime Type = Malware:

  • IoT Malware targets Internet of Things (IoT) devices (such as surveillance cameras, sensors, or embedded technologies).

  • Endpoint Malware targets user-attended devices (such as computers or mobile phones). Endpoint Malware compromises these mostly human-attended devices through a user action such as the opening of an email attachment or the visiting of a malicious URL through a browser.

  • Malware operated from a Malicious IP address includes scripts or executables that are used maliciously and hence are malicious software emanating from a malicious IP address. Reports of such malware identify the origins of these attacks.

Figure 1. Malware Taxonomy

Genus (Malware Type)

Endpoint Malware

For Endpoint Malware, we include:

Backdoor/RAT. A backdoor is malware that installs a software tool that provides remote access or administration of the infected endpoint, i.e., a means for an attacker to enter the computer unobserved or “through a back door”. RAT is an acronym for remote administration tool or trojan.

Bot. A bot (Internet robot, also called zombie, spider, or crawler) is a form of malware that installs on an infected device and then contacts a command-and-control host (C2) to be “enrolled” into a criminal hosting infrastructure. Once enrolled, the bot communicates with the C2 for instructions or to download malware for second stage attacks, e.g., denial-of-service, relay spam, keylogging, or backdoor installation.

Cryptocurrency malware. Malware that targets cryptocurrency. Some cryptocurrency malware targets digital wallets (much like a banking trojan) but others exploit or “hijack” the infected devices’ resources to mine cryptocurrencies and are called cryptojackers.

Dropper/loader. A dropper/loader is a malware that installs other malware. The terms “dropper” and “loader” are often used interchangeably, but some use the term “dropper” for malware that is installed from something physically present on an infected device, e.g., a removable media or a malicious email attachment, and reserve the term “loader” for malware that is downloaded over a network connection from a host that an attacker uses to serve malware to infected computers.

Infostealer. A type of malware that steals usernames, passwords, or banking or credit card credentials, or any personal or sensitive information that can be used or sold for profit.

Malicious document. An Office document that contains a malicious macro, or a PDF, compressed file, image, or archive (ISO) file that contains harmful code or a component for a malicious executable, is considered a malicious document.

Malicious executable. A harmful, self-executing computer program, for example, a Windows, Linux, or Android application or app, a script, or a (Java) applet. Also known as malicious code or “malex”.

Ransomware. Malware that is used for extortion. Originally, criminals used ransomware to extract payments from individuals for the recovery of personal information. Today, attackers extort payments from corporations, government agencies, healthcare services, and critical infrastructures (power grids, water supply systems, etc.) for the recovery of sensitive information or service restoration.

Remote code execution. Remote code execution (RCE) malware exploits vulnerabilities that can grant a malicious actor unauthorized access to a computer program or operating system. Following a successful exploitation, the attacker can execute arbitrary code on the compromised remote host from a LAN or the Internet.

IoT Malware

Generally speaking, a bot is any software that performs an automated task using the Internet. Devices that become infected with bot malware are used for a variety of malicious purposes, including denial-of-service attacks and deeper infiltration of networks to which infected devices are connected. We include in the IoT Malware sub-family the addresses and names of devices that are not attended by humans: the surveillance cameras, sensors, or embedded technologies that comprise the Internet of Things.

Malicious Traffic Sources (Malicious IPs)

In the Malicious IP sub-family, we include:

Traffic injectors, e.g., infected devices, typically PCs, that inject unwanted advertisements or malicious URLs or pollute PHP, HTTP, or Web forums with posts containing inappropriate or malicious content. Here, we also include infected devices that host credential-stuffing bots or captcha bypass bots, or bots that disrupt merchant services (bidding snipers, download stat boosters).

Attackware, e.g., malicious executables that have been reported for targeting systems with traffic that scans for ways to disrupt or break into targeted systems or services. Here, we include reports of attacker IP addresses that target services (e.g., Apache, IMAP, FTP, Postfix, SSH) and reports of IPs that are participating in DDoS attacks, scraping attacks, or click fraud.

Within Genus, we identify malware by one of the names commonly associated with the malware.

In most cases, we adopted a simplified Malware Type that is based on the CARO naming scheme. When confronted with multiple names for a given malware (e.g., Quakbot, Qbot, Qakbot), we chose arbitrarily from these. To impose our classification on malware reports that do not provide sufficient information to identify a Malware Type and Malware Name, we submitted malware URLs to VirusTotal, Hybrid Analysis, or ANY.RUN and augmented our metadata with information from these reports. While we were still unable to obtain a Malware Name in all attempts, we were able to associate a Malware Type with significantly more malware URLs.
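
Collapsing multiple names for one malware can be sketched as an alias table. The three aliases come from the example above; the choice of "Qakbot" as the canonical form, the `ALIASES` table, and the `canonical_name` helper are assumptions for illustration, in keeping with the arbitrary choice described in the text.

```python
# Alias table keyed by lower-cased reported name. The canonical form is an
# arbitrary pick among the aliases, as described in the text.
ALIASES = {
    "quakbot": "Qakbot",
    "qbot": "Qakbot",
    "qakbot": "Qakbot",
}


def canonical_name(reported: str) -> str:
    """Map a reported malware name to its canonical form; unknown names pass through."""
    cleaned = reported.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


names = [canonical_name(n) for n in ["Quakbot", "QBOT", " qakbot "]]
```

Names that are not in the table are returned unchanged, so a missing alias never silently reassigns a report to the wrong malware.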

Malware Activity