Reputation Database

The Reputation Database (RepDB for short) is a big-data-scale database maintained by Network Box Security Response. It is the core repository for the categorization and classification information we hold on threats and productivity. Given its importance, we will now explain RepDB in more detail and show how Network Box uses it to protect our customers and to work with partners.

RepDB is made up of a few related components:

Item Types

RepDB stores items of information of different types such as email addresses, IP addresses, telephone numbers, URLs, hashes and fingerprints. Overall, it stores data for more than a dozen different types.

Reputations

Each item entered into the database, together with its type, has an associated reputation. The reputation stores information such as the category (politics, proxies & translators, real estate, virus/malware infected, spam, etc.), the classification (spam, malware, executable, etc.), and the percentage confidence in that classification. Most importantly, multiple sources can provide categorizations, classifications and confidences for the same item, so we can raise or lower the overall confidence based on the number of sources reporting, the confidence reported by each source, and our confidence in that source.
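To make the multi-source idea concrete, here is a minimal Python sketch. The source does not describe how RepDB actually combines confidences, so the class names, the `source_trust` weight and the "noisy-OR" combination rule below are all assumptions chosen for illustration. Confidences are written as fractions (0.8 means 80%).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceReport:
    """One source's opinion about an item (hypothetical structure)."""
    source: str
    classification: str
    confidence: float    # confidence reported by the source, 0.0 to 1.0
    source_trust: float  # our confidence in the source itself, 0.0 to 1.0


@dataclass
class ReputationItem:
    """An item of a given type plus every report received for it."""
    item_type: str  # e.g. "url", "ip", "email"
    value: str
    reports: List[SourceReport] = field(default_factory=list)

    def combined_confidence(self, classification: str) -> float:
        # Assumed rule (noisy-OR): each agreeing report removes a share of
        # the remaining doubt, scaled by how much we trust that source.
        remaining_doubt = 1.0
        for report in self.reports:
            if report.classification == classification:
                remaining_doubt *= 1.0 - report.confidence * report.source_trust
        return 1.0 - remaining_doubt


item = ReputationItem("url", "http://evil.example/")
item.reports.append(SourceReport("partner-a", "spam", 0.8, 0.5))
item.reports.append(SourceReport("honeypot-1", "spam", 0.5, 1.0))
print(round(item.combined_confidence("spam"), 2))
```

Under this rule, extra agreeing sources can only raise the combined figure. Lowering confidence when sources disagree would need an additional rule, which this sketch does not attempt.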

Reputation History

A full history is kept for each modification made to RepDB over time. This allows us to call up a complete list of changes to any reputation item.
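One common way to keep such a history is an append-only change log. The Python sketch below illustrates that general technique only; it is not a description of RepDB's storage. The `Change` fields and the `History` class are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Change:
    """One immutable modification record (hypothetical fields)."""
    item_key: str
    field_name: str
    old_value: str
    new_value: str
    source: str
    at: datetime


class History:
    """Append-only log: records are added, never edited or deleted."""

    def __init__(self) -> None:
        self._log: List[Change] = []

    def record(self, item_key: str, field_name: str,
               old_value: str, new_value: str, source: str) -> Change:
        change = Change(item_key, field_name, old_value, new_value,
                        source, datetime.now(timezone.utc))
        self._log.append(change)
        return change

    def changes_for(self, item_key: str) -> List[Change]:
        # The complete list of changes to one item, oldest first.
        return [c for c in self._log if c.item_key == item_key]


history = History()
history.record("url:evil.example", "classification", "", "spam", "partner-a")
history.record("ip:192.0.2.1", "classification", "", "malware", "honeypot-1")
history.record("url:evil.example", "confidence", "40", "70", "honeypot-1")
for change in history.changes_for("url:evil.example"):
    print(change.field_name, change.old_value, "->", change.new_value)
```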

Signatures & Threats

Triggered by changes to reputation items, signatures are automatically generated, assigned threat IDs, and distributed to Network Box threat protection devices in real-time using both PUSH and cloud signature delivery mechanisms.

Statistical Feedback

Network Boxes under management periodically report back the threats they have seen. These reports are integrated back into RepDB (using the relationship between signatures, threats and reputation items), so real-time and historical statistics show whether a particular reputation item has been seen in the real world, whether we are blocking it, and how prevalent it is.
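The feedback path can be sketched in Python as follows. This is an illustration under assumptions: the report shape (device ID, threat ID, count, blocked flag), the `FeedbackStats` class, and the use of "number of distinct reporting devices" as a prevalence measure are all hypothetical, not taken from RepDB.

```python
from collections import defaultdict
from typing import Dict


class FeedbackStats:
    """Aggregates device reports per reputation item (hypothetical)."""

    def __init__(self, threat_to_item: Dict[int, str]) -> None:
        # Threat ID -> reputation item key, i.e. the signature/threat
        # relationship used to route a report back to its item.
        self.threat_to_item = dict(threat_to_item)
        self.seen = defaultdict(int)
        self.blocked = defaultdict(int)
        self.devices = defaultdict(set)

    def ingest(self, device_id: str, threat_id: int,
               count: int, blocked: bool) -> None:
        item = self.threat_to_item.get(threat_id)
        if item is None:
            return  # unknown threat ID: nothing to attribute it to
        self.seen[item] += count
        if blocked:
            self.blocked[item] += count
        self.devices[item].add(device_id)

    def summary(self, item: str) -> Dict[str, int]:
        return {
            "seen": self.seen.get(item, 0),
            "blocked": self.blocked.get(item, 0),
            "devices": len(self.devices.get(item, ())),
        }


stats = FeedbackStats({101: "url:evil.example"})
stats.ingest("box-a", 101, 3, True)
stats.ingest("box-b", 101, 2, False)
stats.ingest("box-a", 999, 5, True)  # ignored: threat ID not known
print(stats.summary("url:evil.example"))
```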

All this is stored in the cloud, in a high availability real-time distributed database system. We currently track more than 14 million reputation signatures, covering more than 100 million individual items, with historical data going back more than 15 years. RepDB is currently growing at the rate of more than 200,000 signatures a month.

Partnerships and threat information sharing arrangements are key to this system. As well as information coming in from partners and devices under management, we also maintain a large collection of honeypots and spam traps. Overall, we have several hundred sources of threat data and intelligence, all feeding into RepDB in real-time.

Let’s look at an example, to show how this works:

  1. RepDB receives threat intelligence (from a partner or honeypot) and creates reputation items to record the classifications and confidences.
  2. RepDB immediately raises signatures and threat indicators.
  3. Network Box Z-SCAN immediately issues protection.
  4. Traditional signatures are raised and pushed out to (a) mail scanners, (b) file scanners, and (c) on-demand scanners.
  5. Threat indicators and samples are provided to our information sharing partners.
  6. Over time, as Network Box devices start to record blocks on this emerging threat, the statistics flow back to RepDB. These statistics are used to strengthen or weaken reputation scores, based on real-world experience.
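The steps above can be sketched as a small publish/subscribe loop in Python. Everything here is a simplified assumption for illustration: the `MiniRepDB` class, the sequential threat IDs, the callback-based delivery, and the fixed 0.05 adjustment in step 6 are hypothetical and do not reflect how RepDB is implemented.

```python
import itertools
from typing import Callable, Dict, List


class MiniRepDB:
    """Toy model of the ingest, signature, delivery and feedback loop."""

    def __init__(self) -> None:
        self.items: Dict[str, float] = {}     # item value -> confidence (0 to 1)
        self.signatures: Dict[int, str] = {}  # threat ID -> item value
        self.subscribers: List[Callable[[int, str], None]] = []
        self._next_id = itertools.count(1)

    def subscribe(self, deliver: Callable[[int, str], None]) -> None:
        # A subscriber stands in for any delivery target (steps 3 to 5).
        self.subscribers.append(deliver)

    def ingest(self, value: str, confidence: float) -> int:
        # Steps 1 and 2: record the reputation item and raise a signature.
        self.items[value] = confidence
        threat_id = next(self._next_id)
        self.signatures[threat_id] = value
        # Steps 3 to 5: hand the new signature to every subscriber.
        for deliver in self.subscribers:
            deliver(threat_id, value)
        return threat_id

    def feedback(self, threat_id: int, blocks_seen: int) -> float:
        # Step 6 (assumed rule): nudge confidence up if devices reported
        # blocks, down if they did not, clamped to the range 0 to 1.
        value = self.signatures[threat_id]
        delta = 0.05 if blocks_seen > 0 else -0.05
        self.items[value] = min(1.0, max(0.0, self.items[value] + delta))
        return self.items[value]


db = MiniRepDB()
delivered = []
db.subscribe(lambda threat_id, value: delivered.append((threat_id, value)))
tid = db.ingest("evil.example", 0.6)
new_score = db.feedback(tid, blocks_seen=3)
print(delivered, round(new_score, 2))
```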

While traditional anti-malware vendors continue to work on a scale of hours to release protection, Network Box’s RepDB has been fine-tuned over the past 15 years to the point where we can issue threat protection signatures in milliseconds. Both cloud-based and PUSH technologies are used to release these protection signatures, and statistical feedback loops keep us informed of how effective that protection is.