Cyber Security Needs Its Self-driving Car


It’s a great time to be in the field of data science. Systems based on machine learning algorithms have made large strides in recent years and are increasingly impacting our daily lives. We have digital assistants that learn our habits better than our friends, facial recognition is essentially commoditized and close to human ability, computers can describe what they see, and cars are even beginning to drive themselves. We are also reaching an exciting point in time where these innovations can start helping defenders in the cyber security industry.

In this blog post we will briefly discuss what we at FireEye are doing in data science and what we think the metaphorical “self-driving car” might look like. While we don’t think that detection algorithms should take the human out of the loop, we think they can automate parts of the work and act as a force multiplier, helping security teams find and focus on the evidence that most likely points to compromise.

In simple terms, a self-driving car can be thought of as a system that takes a variety of inputs from sensors such as GPS, engine controls, and range-sensing equipment and uses them to make a decision, which may then result in an action (e.g., applying the brakes). The analogue here at FireEye is a system that observes a different sensor suite (e.g., network activity, endpoint events, log events) and interprets those observations to determine whether a breach is being attempted or has already occurred, and then takes action, either actively or passively.

As technology evolves, and the threat evolves, so too could the sensor inputs. Sensors could include physical security like proximity card readers, or they could be running on less traditional hardware like printers or routers. Much like a self-driving car, a system like this would almost certainly have to rely on the power of the ensemble: a collection of imperfect signals can be combined, possibly over a window of time, to make a stronger determination. As more evidence is collected, more confidence can be placed in the alert.
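To make the ensemble idea concrete, here is a minimal sketch in Python of one way weak signals could be combined over a time window. It is an illustration only, not a description of any FireEye system. It assumes the signals are independent (a naive Bayes combination), and the `Signal` type, the sensor names, the likelihood ratios, and the prior are all hypothetical values chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Signal:
    timestamp: float         # seconds since an arbitrary epoch
    source: str              # hypothetical sensor name, e.g. "network"
    likelihood_ratio: float  # assumed P(signal | compromise) / P(signal | benign)


def posterior_compromise(signals, now, window_seconds, prior=0.001):
    """Estimate P(compromise) from the signals seen in the last window_seconds.

    Assumes signals are independent given the true state, so each one adds
    the log of its likelihood ratio to the prior log-odds.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for s in signals:
        # Only evidence inside the time window contributes.
        if now - window_seconds <= s.timestamp <= now:
            log_odds += math.log(s.likelihood_ratio)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical evidence: three weak signals in the last hour, one stale signal.
signals = [
    Signal(timestamp=9_000.0, source="network", likelihood_ratio=20.0),
    Signal(timestamp=9_500.0, source="endpoint", likelihood_ratio=15.0),
    Signal(timestamp=9_900.0, source="logs", likelihood_ratio=30.0),
    Signal(timestamp=1_000.0, source="badge_reader", likelihood_ratio=50.0),
]

for count in range(1, 4):
    p = posterior_compromise(signals[:count], now=10_000.0, window_seconds=3_600.0)
    print(f"{count} signal(s) in window: P(compromise) = {p:.4f}")
```

No single signal in this example is convincing on its own, but the estimate rises as more of them arrive within the window, which is the behavior described above. Real sensor outputs are rarely independent, so a production system would need a more careful model.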

Data Science at FireEye

FireEye is in a unique and exciting position. We are a product company with a large services division. Our services team is responsible for everything from vulnerability tests to incident response, and our customers include most of the Fortune 500. This means we have a great deal of in-house domain expertise to draw on. Combine that expertise with strong customer relationships that allow for beta deployments, and you have a formula for real opportunities to innovate.

At FireEye, our data science vision leverages visibility across our security stack: network to endpoint to cloud. It involves building systems that generate signals and methods that combine these signals into high-confidence alerts. The problem extends beyond just detecting malware. The adversary can be an employee stealing intellectual property or a nation-state attacker exfiltrating data. They can be using custom-built software or standard tools already available on the host operating system.

To help meet this vision, we’ve engaged in discussions with local universities to understand how best to establish research partnerships within our industry. We also employ two types of Subject Matter Experts (SMEs) to ensure our success: data scientists and security SMEs. Security SMEs include our incident responders, intelligence analysts, and reverse engineers, all with unique and deep security backgrounds. The data scientists possess a deep understanding of probability theory, optimization, and signal processing that complements the expertise of our security SMEs.

While many hurdles exist, and machine learning in adversarial environments can be challenging, we have the necessary experience and the right ingredients to make this vision a reality. From appliances deployed across a large and diverse set of customers, we can observe enormous quantities of real-world data. Our existing detections, augmented by red team activity, can generate labeled data sets for training and evaluation. Working together, our security experts and data scientists can build detection models that focus on the essence of the attack rather than on signatures or narrowly defined rules.

The problem is still very challenging, but we are excited about what we have already accomplished and what is in store for the future. If you like hard problems, please see our careers page.