USENIX Enigma 2016 – What Makes Software Exploitation Hard?

HAWKES: Hi. My name is Ben Hawkes. And I’m a member
of Project Zero. This talk is
about software exploitation, what makes software
exploitation hard, and why does it even matter? About 18 months ago,
a group of security engineers at Google formed Project Zero
with the stated mission to make 0day hard, to make software exploitation using 0day vulnerabilities
harder than it is today. Behind that is the idea that we think the cost of an exploit is out of balance with the capabilities that an attacker gains from using that exploit. And also, we knew that
there was a private market for exploit purchase and sale, a global market
for exploit purchases, and the exploits were being used
to harm our users and to harm companies
like Google. And we thought as
security engineers that we could work
on this problem. Here’s a concrete example. $130,000. This was the purchase price
recently for what we might think of as the canonical exploit chain
of recent years, at least, which is an Adobe Flash exploit, 0day exploit, combined
with a Windows Kernel exploit. $40,000 for a Flash exploit and about $90,000
for a Windows Kernel exploit. And at first glance,
you might look at that number and think that's a lot of money. If you compare it to the cost of setting up a phishing operation or something like that, this is much more expensive. But then if you think about
what this exploit is, it’s a full system compromise on hundreds of millions
of machines, just by having a user visit
a URL you control, or if you have
network access and the ability to inject into network traffic, not even that. So all of the classical
targeted attacks that we’ve heard about, system
administrators being compromised and network access being stolen, diplomats losing documents, company executives
losing e-mails, cryptographic key material
being stolen, back doors being inserted, all of this is open to you by this exploit’s capability. So, in practice,
we think exploits are too cheap and too numerous. So this talk is, in essence,
about our technical strategy to try to address this problem. There are other
nontechnical facets about how you might approach
this, policy issues, outreach education. But today I’m focusing
on our technical strategy. I want to sort of explain the observations and principles behind our technical strategy, why we think it’s effective, and then obviously
get feedback from you all about whether you agree
with that assessment and ideas of other
sort of technical facets that we can introduce
into our work on this problem. In essence, our answer
to the question, “What makes software
exploitation hard?” is twofold. It’s strategically targeted
vulnerability research and incrementally increasing
the state-of-the-art of exploit mitigations. But, in essence, behind that
is the idea that Project Zero is
an attack-research team in the same mold
that a private attacker, a private actor
might want to set up, except we do our work
transparently in the public for defensive purposes. In 18 months, we’ve reported
600 vulnerabilities, most of these
in non-Google software. So, importantly our scope is beyond just what Google
is producing as software. We look holistically
at all the types of software that are used
in the ecosystem of our users. 600 vulnerabilities
in about 18 months. We have a blog
that’s focused heavily on technical reports
about how you discover vulnerabilities, exploitation, how exploitation
actually happens, and mitigation design and
how we can break mitigations. A new blog post comes out about every two weeks. And we've also been involved in shipping
some exploit mitigations. Think of Chrome, Linux, Flash, Windows, and Android: all of them ship mitigations that are either influenced
by our work or are directly designed
by our team. It's a start. I would love to do more on this final point, and we'll address that shortly. But here's another number. 26. So, 26 is the number of cases of
exploits discovered in the wild in 2015 that affected
the set of software targets that we’re monitoring. This includes Internet Explorer,
the Windows Kernel, Adobe Flash, Microsoft Office. 26 cases of discovered
0day exploits in the wild. And at first glance again, you would look at that number
and we’re very used to in the security community
hearing very high numbers of thousands of attacks per day
against systems, and so on, and this is relatively
a small number. But we have to decompress that and look at what’s behind it. This number, 26, represents the failure case
for the attacker. It’s where something
has gone wrong, some technical process, some operational-security
procedure has gone wrong, or the attacker
just gets very unlucky in order to allow
the security community to detect and discover
these instances of 0day attacks. But there’s an unknown here, and the unknown is the ratio
of discovery to nondiscovery. We don't know the rate at which 0day exploits are being used in the wild without ever being seen or discovered. If, say, only one exploit in ten is ever caught, those 26 observed cases would imply hundreds in active use. So, we need to be very careful
about looking at this number, this failure case, and taking it too far, taking too much insight
from just this number alone. And the idea
behind this is something that is, I think, very central to my approach to security, which is this idea
that it’s the attacker’s job to deprive us of the data that we need to make
optimal decisions about defense. This is, I think,
a significant mind-set and it’s stating the obvious,
of course. Security is unique in the sense that we have an adversary, and as engineers, our desire to be data-driven
in all things is very strong. It’s natural, in fact. But security is a little bit
different in the sense that in many cases — not all cases,
but in many cases — just taking the data alone at face value will lead to suboptimal decision
making in security. So we try to think a little bit deeper about what is going on behind the scenes, about what model we can build of attacker behavior. One approach is to start
to imagine yourself as an individual
attack researcher. On a day-to-day basis, Project
Zero is performing this work, and we know also
the individual anxieties, the things that a researcher worries about on
the attack research side. It’s questions like this. “Will I be able to find
a good vulnerability?” We know now that not
all vulnerabilities are equal. Some are more useful
than others. “What are the chances
that this particular bug I've just found will be fixed? It might be discovered by the vendor itself or an open-source project, or it might be discovered by another defensive-security researcher. That's very disruptive to me as an attacker if I want to use this bug as an exploit." And "how long will it take me
to write an exploit?” We know that attackers
are also resource constrained and that even though theoretically a bug
might be exploitable, there are prioritizations
that have to take place. And “how can I make
this exploit reliable?” Reliability is something
that I think defense understands very poorly
in the context of exploitation. When you’re intending
to use an exploit in the wild in operation, if the exploit fails, this is a very costly thing
to occur. So we need to, as attackers,
make those exploits reliable. Now, at Project Zero,
we think about these questions and we think about how best
to approach each of these in a way to heighten the concern
of the attack researcher, the anxiety
of the attack researcher. So I wanted to dive into our
technical strategy a little bit. There’s two sides,
the vulnerability research, and there’s
the exploit-mitigation side. And I’ll start with
our vulnerability research. And our strategy
really is based around one idea, which is what we call
contention. Contention is the occurrence of multiple researchers arriving at the same research output: bug collision, or research rediscovery, in essence. Our strategy is built around
trying to increase the rate of bug collision
between ourselves and attack researchers in the private market. We do this in sort
of two main threads. Not exclusively, but primarily, these are our two approaches, a pincer strategy of sorts. The first is to eliminate low-hanging fruit: to use Google's immense
machine resources and incredible access to corpus data,
file-format data, from our web crawler
and other sources to build some of
the world’s best fuzzers. We know historically
that fuzzing is a method that a subset
of attackers are willing to use. Perhaps they’re not
the most advanced attackers, but there are attackers
out there who are willing to use
fuzzing as a methodology to discover bugs
that are then exploited. And we think
that’s just unacceptable. It’s too easy. It’s too cheap. And we believe
that we can actually substantially affect
the ability of attackers to use fuzzing as
a viable research methodology. Here I would say that the contention comes from a shared methodology, a shared approach. There isn't such a wide variety of fuzzing techniques at this current moment that different fuzzers
are always gonna find different sets of bugs. There's gonna be some level of overlap in the output of independently written fuzzers.
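To make that shared-methodology point concrete, here is a minimal sketch of a mutational fuzzer in C. It is only an illustration of the general technique being described, not Project Zero's actual tooling; the seed file seed.bin and the ./target_parser command are hypothetical placeholders.

```c
/*
 * Minimal mutational fuzzer sketch: load a seed input from a corpus,
 * flip a few random bits, and run a target parser on the result.
 * Real fuzzers add coverage feedback, crash triage, and massive scale.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void) {
    srand((unsigned)time(NULL));

    /* Load one corpus file into memory. */
    FILE *f = fopen("seed.bin", "rb");
    if (!f) { perror("seed.bin"); return 1; }
    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    rewind(f);
    if (len <= 0) { fclose(f); return 1; }
    unsigned char *seed = malloc(len);
    unsigned char *buf = malloc(len);
    if (!seed || !buf || fread(seed, 1, len, f) != (size_t)len) return 1;
    fclose(f);

    for (int iter = 0; iter < 1000; iter++) {
        /* Start from the pristine seed, then flip a few random bits. */
        memcpy(buf, seed, len);
        for (int i = 0; i < 8; i++)
            buf[rand() % len] ^= (unsigned char)(1u << (rand() % 8));

        /* Hand the mutated input to the target and watch for crashes. */
        FILE *out = fopen("fuzz_input.bin", "wb");
        if (!out) break;
        fwrite(buf, 1, len, out);
        fclose(out);
        if (system("./target_parser fuzz_input.bin") != 0)
            printf("iteration %d: target exited abnormally\n", iter);
    }
    free(seed);
    free(buf);
    return 0;
}
```

Two teams that independently build fuzzers in this style will mutate similar corpora in similar ways, which is exactly why their outputs collide.

The other approach is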
the last step of the bug chain, and here what
I want to emphasize is that a modern exploit is not a single-shot
vulnerability anymore. They tend to be a chain
of vulnerabilities that add up
to a full-system compromise. A classic example of this is
when you attack a browser, you need a renderer process bug, and you need a sandbox escape. Then you may also need
a kernel bug to get administrator access. So the observation here
is that the attacker requires the full chain, the complete chain in order
to have a complete capability. And along the way on that chain, there are some parts,
some links of the chain that are more fragile
than others, that are perhaps
constrained attack surfaces or that are less densely buggy
in some sense. And this is where we try
to focus much of our effort. The prime example
would be a sandbox escape or a kernel-privilege-escalation
bug. And here we’re really willing
to use any methodology possible, a mix of automated methodologies and manual approaches,
manual analysis. But the question
still remains — How on earth do we know
what to look at? There’s such a wide variety
of software targets out there in the world that
are potentially under attack. We are very limited
in our individual resources. Our team size is around
10 security researchers. So we need to prioritize
our own target selection. And again,
this comes back to the idea that we don’t have
perfect insight into the target selection
of attackers. They don’t share their lists
of exploits they have stored up. They don’t share the lists
of software targets that they’re willing to purchase
exploits against. So we need to model this again. And for us, it’s a mix of different sources
of information. Firstly,
we look at that number — 26 — the observed attacks — and try and gain
some insights from that. But as I mentioned,
this is a little risky when you’re studying the
failure case of the attacker. If you read too much into it, you run the risk of drawing false conclusions. So we also try to gain
other insights about what the attackers
are going after. So a mix of external feedback from participants
within the private market, either directly, through
our relationships with them, or through other means, when information becomes public about that world, which happens from time to time. And the final way,
I think, and perhaps the most useful,
is our own internal deduction. The premise of our team
is that we hire some of the top vulnerability researchers
in the world. We want to use their expertise
and their experience to predict — to have predictive value about what they think
will be the next big thing. More often than not, I believe, they are right on the money. So in practice,
what does that mean? We’re focusing heavily
on endpoint attacks against ubiquitous
software systems. So mobile is a big thing.
We work on both Android and iOS. Desktop operating systems — The major kernels are a big part of what we work on. Almost every browser
is on our radar. And document readers and recently, also, antivirus endpoint
security systems. But vulnerability research
is only one side of the coin. We have a pipeline of sorts
where we find vulnerabilities. For some set
of those vulnerabilities, we write exploits. And then from those exploits, we gain insight about the state-of-the-art
of exploit mitigations. And this is an area where we think that
our practical insights about the realities
of exploitation can lead
to better exploit mitigations. There’s kind
of two sides to that. In the process
of writing exploits, we have to bypass
exploit mitigations that are already
shipped to users. So in essence,
we can find those edge cases — find the areas
where an implementation is not correct in relation to its design, or where the design is not
we can start to plan or suggest new ideas
for exploit mitigations. So using our insight
about our own exploits, we have a sense
of where in the exploit there is fragility — where we got a little bit
lucky, in essence. Where might we incur
a lot more time or effort if we have
a new mitigation idea? And then we share that with the broader
software-development community and the broader
security community. In essence, I see ourselves as advocates for good
exploit-mitigation design. There’s a lot of bad
exploit-mitigation design ideas out there, as well. But I’ve been talking
about exploit mitigations without actually really defining
what I mean, and this is something
that I think, in the past, we as an attack-research community have been quite bad at: talking in generalities and not in specifics. So, this is one approach
to one sort of — Well, we call it a taxonomy. It’s a vocabulary,
if you will, about how to talk
about mitigations. It’s not the perfect taxonomy. It’s not the only one,
certainly. It is sufficient
as a starting point. So, to run through
some of these: a Strong Mitigation. These are ideas like MemGC, a new allocator type that Microsoft's Edge browser recently shipped. This may well be considered
a Strong Mitigation. Now, it's early days yet, but it might end the bug class of use-after-frees in Microsoft Edge, which would be a really amazing accomplishment on Microsoft's part.
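To illustrate the bug class in question, here is a hedged sketch in C of the use-after-free pattern that a garbage-collected allocator like MemGC is designed to kill. It shows the general pattern only and assumes nothing about Edge internals.

```c
/*
 * Use-after-free sketch: an object is freed while a stale pointer to it
 * survives, and the attacker then tries to reallocate the freed slot
 * with data they control. A deferred, garbage-collected free refuses to
 * reuse the memory while references to it may still exist.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct object {
    void (*callback)(void);   /* a function pointer makes this attractive */
    char name[24];
};

static void greet(void) { puts("legitimate callback"); }

int main(void) {
    struct object *obj = malloc(sizeof(*obj));
    if (!obj) return 1;
    obj->callback = greet;

    free(obj);   /* the object dies, but the pointer lives on */

    /* An attacker grooms the heap so controlled bytes may land in the
     * freed slot; if that succeeds, obj->callback now points at
     * attacker-chosen data. */
    char *groom = malloc(sizeof(struct object));
    if (groom) memset(groom, 0x41, sizeof(struct object));

    obj->callback();   /* use after free: undefined behavior */
    return 0;
}
```

Another example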
you might consider is /GS or DEP: stack cookies making stack buffer overflows more or less unexploitable. Or, perhaps, other forms of compiler technology, like CFI or CFG.
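As a reminder of what those compiler mitigations buy you, here is the classic bug shape that stack cookies address, as a small hedged C example. Compiled with /GS on MSVC or -fstack-protector on GCC and Clang, the overflow is caught before the function returns.

```c
/*
 * Stack buffer overflow sketch: with no bounds check, a long input
 * overwrites the stack cookie and then the saved return address. The
 * cookie check at function exit turns a control-flow hijack into a
 * controlled abort.
 */
#include <string.h>

void parse_name(const char *input) {
    char name[16];
    strcpy(name, input);   /* classic bug: no bounds check */
}

int main(int argc, char **argv) {
    if (argc > 1)
        parse_name(argv[1]);
    return 0;
}
```

Now, Weak Mitigations sit a little bit more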
on the tactical level. So, we observe often
that there’s commonalities between a sequence
of different exploits, where they’re reliant
on the same ideas. And a recent example of this is
Project Zero's work with Adobe. We worked with Adobe to partition out a certain object type called Vector.<uint>, which we had seen in exploits time and time again over the last two or three years. And by partitioning out
that particular object type, it forced attackers to find
different objects to go after. Eventually, this leads to a more generalized approach and a generalized solution.
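To show why that one object type mattered so much, here is a hedged C sketch of the underlying primitive: corrupting the length field of an adjacent array-like object turns a small heap overflow into a wide-ranging read/write primitive. The struct layout is illustrative only, not Flash's actual Vector.<uint> representation.

```c
/*
 * Length-corruption sketch: the length field is trusted by every later
 * bounds check, so a small overflow that reaches an adjacent object's
 * header defeats the checks entirely. Partitioning such objects into
 * their own heap region keeps attacker-controlled overflows away from
 * these headers.
 */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

struct uint_vector {
    uint32_t length;      /* trusted by the bounds check below */
    uint32_t data[4];
};

int main(void) {
    /* Two adjacent objects standing in for heap neighbors. */
    struct uint_vector vec[2];
    memset(vec, 0, sizeof(vec));
    vec[1].length = 4;

    /* Simulate a 4-byte overflow off the end of vec[0].data: it lands
     * exactly on vec[1].length. */
    uint32_t huge = 0x7fffffff;
    memcpy((unsigned char *)vec + offsetof(struct uint_vector, data)
               + 4 * sizeof(uint32_t),
           &huge, sizeof(huge));

    /* The "bounds check" now passes for wildly out-of-range indices. */
    uint32_t idx = 1000000;
    if (idx < vec[1].length)
        printf("bounds check passed for index %u: out-of-bounds "
               "read/write is now possible\n", idx);
    return 0;
}
```

When saying that something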
is a Weak Mitigation, it doesn’t necessarily mean it’s
worse than a Strong Mitigation. It's just the observation that the properties given by the mitigation are unlikely to stop
the exploit from occurring. It’s about the overall cost,
the investment — the time and investment
going into the exploit. There are a couple of other types of exploit mitigations
that I’ll briefly discuss, and they’re on the verge. Some people wouldn’t consider
these exploit mitigations. Personally, I do. And there are two types. Attack Surface Reduction — so, an example that
we’re working on at the moment with Chrome is — as we mentioned,
the canonical exploit chain is a Flash exploit combined
with a kernel exploit. Now, the problem, in essence, is that Flash has
access to the kernel. Does it need it?
It turns out perhaps it doesn't. We're working on a project to remove access to Win32k from the Flash plug-in process, to effectively remove the kernel attack surface from the purview of Flash.
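Windows exposes a mechanism that makes this kind of lockdown available to a sandboxed process: the win32k system-call-disable mitigation policy. Here is a minimal sketch using that documented API; it shows the mechanism only, and is not Chrome's actual sandbox code.

```c
/*
 * Attack surface reduction sketch: on Windows 8 and later, a process
 * can irrevocably drop its ability to call into win32k.sys. After this
 * succeeds, bugs in that kernel component are unreachable from here.
 */
#include <windows.h>
#include <stdio.h>

int main(void) {
    PROCESS_MITIGATION_SYSTEM_CALL_DISABLE_POLICY policy = {0};
    policy.DisallowWin32kSystemCalls = 1;

    if (!SetProcessMitigationPolicy(ProcessSystemCallDisablePolicy,
                                    &policy, sizeof(policy))) {
        printf("win32k lockdown failed: %lu\n", GetLastError());
        return 1;
    }
    printf("win32k system calls are now blocked for this process\n");
    return 0;
}
```

Other examples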
are font transcoders, shader transcoders,
the types of ideas that we’ve had in
browser security in the past. And the final idea is perhaps
the rarest type of mitigation — Chain Extension. As I mentioned, exploit
mitigations are chains — sorry, exploits are chains. And extending the chain means, effectively, that a new bug has to be discovered. The chances to introduce
the chain extension are actually quite rare. Think of, perhaps,
something like ASLR, where you now need an information-leak bug, or introducing a sandbox or virtualization, where you need a VM escape. These are the types of ideas we would consider a chain extension.
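The effect of ASLR is easy to see for yourself. This small C program prints a code, stack, and heap address; on an ASLR-enabled system the values change on every run, which is precisely the information an exploit must now recover through a separate leak.

```c
/*
 * Chain-extension illustration: the addresses an exploit needs to know
 * move on every execution under ASLR, so a memory-corruption bug alone
 * is no longer enough; an information-leak bug must join the chain.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int local = 0;
    void *heap = malloc(16);
    printf("code:  %p\n", (void *)main);
    printf("stack: %p\n", (void *)&local);
    printf("heap:  %p\n", heap);
    free(heap);
    return 0;
}
```

So, to wrap things up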
just quickly, here’s a quick example of one of the projects
we’ve been working on. I’ve mentioned it. Flash. We’ve reported about 180
vulnerabilities in 18 months, and we’ve worked
with Adobe very closely. And Adobe have been
working very hard on exploit-mitigation
improvements. The observed price increase
for an exploit, based on just the public data, not with any sort of market insight, was 60% in 18 months. I'd like that to be higher. The reality is that, after some of these mitigation improvements, we may well see an increase
in that price over time. Certainly on the ground, as individual researchers within our team, we have noticed exploitation work against Flash becoming more difficult over the last 18 months. So, what makes
software exploitation hard? For us, it’s the combination
of vulnerability research and exploit mitigation. One without the other
is not quite as effective. If you just
perform vulnerability research, you are ceding
an advantage to attackers when they do eventually
find a bug. You are letting them build
reliable, reusable exploits for a very cheap investment. On the other hand, if you only
perform exploit mitigation, we know that most mitigations have documented edge cases, and if you have a very densely buggy attack surface, then it's easier to find edge-case bugs which aren't covered
by your mitigation approaches. So, we need to think
of exploits as chains. There’s two approaches. You make
each link harder to achieve or you extend the chain. Thank you. [ Applause ]
