Cyber Crime Cases and Confusion Matrix

What is a Confusion Matrix?

A Confusion matrix is the comparison summary of the predicted results and the actual results in any classification problem use case. The comparison summary is extremely necessary to determine the performance of the model after it is trained with some training data.

  • Negative(N): the predicted result is Negative (Example: Images is not a cat)
  • True Positive(TP): Here TP basically indicates the predicted and the actual values is 1(True)
  • True Negative(TN): Here TN indicates the predicted and the actual value is 0(False)

Accuracy and Components of Confusion Matrix

After the confusion matrix is created and we determine all the components values, it becomes quite easy for us to calculate the accuracy. So, let us have a look at the components to understand this better.

An Overview of False Positives and False Negatives

Understanding the differences between false positives and false negatives, and how they’re related to cybersecurity is important for anyone working in information security. Why? Investigating false positives is a waste of time as well as resources and distracts your team from focusing on real cyber incidents (alerts) originating from your SIEM.

What Are False Positives?

False positives are mislabeled security alerts, indicating there is a threat when in actuality, there isn’t. These false/non-malicious alerts (SIEM events) increase noise for already over-worked security teams and can include software bugs, poorly written software, or unrecognized network traffic.

What Are False Negatives?

False negatives are uncaught cyber threats — overlooked by security tooling because they’re dormant, highly sophisticated (i.e. file-less or capable of lateral movement) or the security infrastructure in place lacks the technological ability to detect these attacks.

Strengthening Your Cybersecurity Posture

The existence of both false positives and false negatives begs the question: Does your cybersecurity strategy include proactive measures? Most security programs rely on preventative and reactive components — — establishing strong defenses against the attacks those tools know exist. On the other hand, proactive security measures include implementing incident response policies and procedures and proactively hunting for hidden/unknown attacks.

Here are a few simple rules to help govern your approach to cybersecurity with a preventative, reactive, and proactive mindset:

  • Assume you’re breached and begin your offensive (proactive) initiatives with the goal of finding those breaches. By doing so, you’ll seek to validate the strength of your defensive/prevention tools with the understanding that none of them are 100% effective.
  • Use asset discovery tools to discover the hosts, systems, servers, and applications within your network environment, because you can’t protect what you don’t know exists.
  • Execute regular compromise assessments (we recommend at least once a week) and inspect every asset residing on your network.
  • Define security policies and procedures, and implement educational/training requirements so your entire team knows what to do in the event you discover a hidden breach, or worse, fall victim to a data breach.
  • Time is your most valuable asset, so implementing tools/technology to speed your speed of detection and time to respond are key and can help your security team prevent a data breach.

Steps to Improve Forensic Analytics

Forensic analytics — the combination of advanced analytics, forensic accounting and investigative techniques — is making breakthroughs every day in identifying rare events of fraud, corruption and other schemes. To meet rising regulatory and customer demand for fraud mitigation, forensic analytics can reveal signals of emerging risks months — or sometimes even years — before they happen. Of course, predicting anomalous events can also create false positives.

  1. Employ network mapping and analysis — Explore fraudsters’ networks, affinities and relationships, as well as others committing similar illicit acts.
  2. Leverage both supervised and unsupervised modeling — Supervised modeling employs algorithms to sift through data, applying historical fraud patterns and digital fingerprints of fraudsters to new data and scoring the level of risk involved in new events based on historical data. Unsupervised modeling uses algorithms to sift through data independent of patterns relating to known historical cases, looking for new events following unprecedented patterns.
  3. Use natural language processing (NLP) — Sift through unstructured data, including emails, messaging, audio and video files to unearth unexpected nuance to communication or connections otherwise unclear in structured, text-only data. For example, the ability of NLP to analyze word choice, tone and possible stress levels expressed in a voicemail can sometimes offer more insight during investigations than text on page alone could offer.
  4. Training and self-learning — Train analytics to learn from a variety of data sources, such as risk issues the organization has confronted in the past. The corresponding models can adapt over time to future risks.
  5. Back testing — Scientifically test forensic analytics performance to evaluate its continued use. Backtesting can help establish confidence that pattern recognition models and algorithms work well and are effective in finding suspicious patterns of interest.
  6. Iterative approach — Iteratively develop, adapt and scale forensic analytics models so they respond to new and evolving fraud patterns. At the same time, develop a broader view of the risks an enterprise may face. This approach enables an organization to build the forensic analytics platform in stages — one step at a time with input and validation from the business stakeholders — while still staying a step ahead of bad actors.