ample ROC curve shown in Figure 1 this is achieved at
t_p = 0.0518 and f_p = 0.00029; i.e., only 5.18% of viable
victims are attacked. If attacking all of the top 0.1% of
the population is not profitable then we might expect a
slope of about 1,000, which is achieved at t_p = 0.0019
so that only 0.19% of viable victims would be attacked.
This pushes the attacker to the extreme left of the ROC
curve, where (as we saw in Section 3.1) social good is
increased and fewer users are attacked.
3.3 As density decreases slope increases
Observe that as the density of viable targets, d, de-
creases, the slope at the OOP (given in (3)) increases.
Recall (from Section 2.6) that the slope of the ROC
curve is monotonically decreasing. Thus, as d → 0,
the optimal operating point will retreat leftward along
the ROC curve. As we’ve seen in Section 3.1 this means
fewer true positives and fewer total users attacked. Hence,
as the number of viable targets decreases the attacker
must make more conservative attack decisions. This is
true even though the gain G, the cost C and the ability to
distinguish viable from non-viable targets are unchanged.
For example, suppose, using the ROC curve of Figure
1, an attack has G/C = 9, i.e., the gain from a success-
ful attack is 9× the cost of an unsuccessful one. Further
suppose d = 1/10 which makes the slope at the OOP
equal to one. We already saw that the unity slope tan-
gent resulted in only 81.8% of viable targets and 18.2%
of non-viable targets being attacked. Since d = 1/10
we have that 10% of users are viable targets. Thus,
0.818×0.1 = 0.0818 or 8.18% of the population are suc-
cessfully attacked and 0.818 × 0.1 + 0.182 × 0.9 = 0.246
or 24.6% of all users will be attacked.
Now suppose that the density is reduced by a factor
of 10 so that d = 1/100. Everything else remains un-
changed. From (3) the slope at the OOP must now be:
100 × (1 − 1/100) × 1/9 = 11. Although not shown, the
tangent with this slope meets the ROC curve in Figure
1 at approximately t_p = 0.34 and f_p = 0.013. Thus, the
optimal strategy now attacks only 34.0% of viable tar-
gets and 1.3% of non-viable targets. Since d = 1/100
we have that 1% of users are viable targets. Thus,
0.34×0.01 = 0.0034 or 0.34% of the population are suc-
cessfully attacked and 0.34×0.01+0.013×0.99 = 0.0163
or 1.63% of all users are attacked. Hence, in this case, a
10× reduction in the victim density reduces the number
of true positives by about 24× and the number of all
attacked users by about 15×.
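The arithmetic of this example can be reproduced with a short sketch. It assumes that (3) has the form slope = (1 − d)/d × C/G, which is inferred from the calculation 100 × (1 − 1/100) × 1/9 = 11 above; the function names are illustrative and do not come from the paper. The operating points are the values quoted in the text as read from Figure 1.

```python
def oop_slope(d, gain_to_cost):
    """Slope of the ROC curve at the optimal operating point.

    Assumed form of equation (3), inferred from the worked arithmetic:
    slope = (1 - d) / d * (C / G), with gain_to_cost = G / C.
    """
    return (1.0 - d) / d / gain_to_cost


def attack_fractions(d, tp, fp):
    """Return (successfully attacked, all attacked) as population fractions."""
    successful = tp * d                 # viable users who are attacked
    total = tp * d + fp * (1.0 - d)     # viable plus non-viable users attacked
    return successful, total


# Operating points quoted in the text for the ROC curve of Figure 1.
dense = attack_fractions(d=1 / 10, tp=0.818, fp=0.182)
sparse = attack_fractions(d=1 / 100, tp=0.34, fp=0.013)

print("slopes:", oop_slope(1 / 10, 9), oop_slope(1 / 100, 9))
print("d = 1/10 :", dense)
print("d = 1/100:", sparse)
print("reduction in successful attacks:", dense[0] / sparse[0])
print("reduction in all attacks:       ", dense[1] / sparse[1])
```

The two ratios printed at the end are the reduction factors for true positives and for all attacked users.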
While the exact improvement depends on the particu-
lar ROC curve, dramatic deterioration in the attacker’s
situation is guaranteed when density gets low enough.
Independent of the ROC curve, it is easy to see from
(3) that a factor of K reduction in density implies at
least a factor of K increase in slope (for K > 1). For
many classifiers the slope of the ROC tends to ∞ as
f_p → 0. (We show some such distributions in Section
3.4.) Very high slope for small values of density implies
that the true positive rate falls very quickly with further
decreases in d.
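The factor-of-K claim can be verified in one line. The sketch below assumes that (3) gives the slope at the OOP as S(d) = (1 − d)/d × C/G, a form inferred from the worked example in this section; the symbol S is introduced here only for illustration.

```latex
\[
\frac{S(d/K)}{S(d)}
  = \frac{\dfrac{1 - d/K}{d/K}\cdot\dfrac{C}{G}}
         {\dfrac{1 - d}{d}\cdot\dfrac{C}{G}}
  = \frac{K - d}{1 - d}
  = K + \frac{(K - 1)\,d}{1 - d}
  \;\ge\; K
  \qquad (K > 1,\; 0 < d < 1).
\]
```

Under that assumption the slope grows by slightly more than a factor of K, and the excess (K − 1)d/(1 − d) vanishes as d → 0.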
3.4 Different ROC Curves
We have used the ROC curve of Figure 1 to illustrate
several of the points made. While, as shown in Section
3.3, it is always true that decreasing density reduces the
optimal number of viable victims attacked, the numbers
given were particular to the ROC curve, gain ratio G/C
and density d chosen. We now examine some alternatives.
As stated earlier, a convenient parametric model is to
assume that pdf(x | viable) and pdf(x | non-viable)
have the same form but different
means. For example, with unit-variance normal distributions
we would have pdf(x | non-viable) = N(0, 1)
and pdf(x | viable) = N(µ, 1). That is, by choosing µ we
can achieve any desired degree of overlap between the
two populations. This is shown in Figure 2 for three different
values of µ. When µ is small the overlap between
N(0, 1) and N(µ, 1) is large and the classifier cannot
be very good. As µ increases the overlap decreases and
the classifier gets better.
The ROC curves for the distributions shown in Figure
2 are given in Figure 3, with AUC values of 0.9, 0.95
and 0.99. The rightmost curve in Figure 2 corresponds
to the uppermost (i.e., best) classifier in Figure 3. These
AUC values correspond to an attacker’s ability to distinguish
a randomly chosen viable target from a randomly chosen
non-viable one 90%, 95% and 99%
of the time. The highest curve (i.e., AUC = 0.99) is
clearly the best among the classifiers.
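The unit-variance normal model can be sketched in a few lines using only the Python standard library. The sketch relies on two standard facts about this model rather than on anything specific to the paper: AUC = Φ(µ/√2), and the ROC slope at threshold x equals the likelihood ratio exp(µx − µ²/2). The function names are illustrative, and the thresholding rule (attack when x exceeds the threshold) is an assumption.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)


def mu_from_auc(auc):
    """Mean separation mu that yields the requested AUC = Phi(mu / sqrt(2))."""
    return sqrt(2.0) * STD_NORMAL.inv_cdf(auc)


def roc_point(mu, threshold):
    """(fp, tp) when every user with x above the threshold is attacked."""
    fp = 1.0 - STD_NORMAL.cdf(threshold)        # non-viable ~ N(0, 1)
    tp = 1.0 - STD_NORMAL.cdf(threshold - mu)   # viable ~ N(mu, 1)
    return fp, tp


def point_at_slope(mu, slope):
    """(fp, tp) at which the ROC curve has the given slope.

    The slope at threshold x is the likelihood ratio
    pdf(x | viable) / pdf(x | non-viable) = exp(mu * x - mu**2 / 2),
    which is solved here for x.
    """
    threshold = (log(slope) + mu * mu / 2.0) / mu
    return roc_point(mu, threshold)


for auc in (0.9, 0.95, 0.99):
    mu = mu_from_auc(auc)
    fp, tp = point_at_slope(mu, 1.0)
    print(f"AUC={auc}: mu={mu:.3f}, unit-slope point fp={fp:.3f}, tp={tp:.3f}")
```

For AUC = 0.9 the unit-slope point in this sketch falls at a true positive rate of about 0.82 and a false positive rate of about 0.18.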
This parametric model, using normal distributions, is
very common in detection and classification work [22].
It has an additional advantage in our case. Viability
often requires the AND of many things; for example
it might require that the victim have money, and have
a particular software vulnerability, and do banking on
the affected machine and that money can be moved ir-
reversibly from his account. The lognormal distribution
is often used to model variables that are the product of
several positive variables, and thus is an ideal choice
for modeling the viability variable x. Since the ROC
curve is unaffected by a monotonic transformation of
x, the curves for pdf(x | non-viable) = ln N(0, 1) and
pdf(x | viable) = ln N(µ, 1) are identical to those plotted
in Figure 3.
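The invariance under a monotonic transformation can be illustrated empirically. The sketch below draws normal scores, maps them to lognormal scores with exp(), and compares an empirical AUC before and after. The sample size, seed and value of µ are arbitrary choices for illustration and are not taken from the paper.

```python
import math
import random


def empirical_auc(viable, non_viable):
    """Fraction of (viable, non-viable) pairs in which the viable score is larger."""
    wins = sum(v > n for v in viable for n in non_viable)
    return wins / (len(viable) * len(non_viable))


rng = random.Random(0)  # fixed seed so the run is repeatable
mu = 1.8                # illustrative separation, not a value from the paper
non_viable = [rng.gauss(0.0, 1.0) for _ in range(300)]
viable = [rng.gauss(mu, 1.0) for _ in range(300)]

auc_normal = empirical_auc(viable, non_viable)

# exp() is strictly increasing, so it turns normal scores into lognormal
# scores without changing which of any two scores is larger.
auc_lognormal = empirical_auc([math.exp(v) for v in viable],
                              [math.exp(n) for n in non_viable])

print(auc_normal, auc_lognormal)
```

Because only the ordering of scores enters the comparison, the two printed values agree.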
In Figure 4 we plot the slope of each of these ROC
curves as a function of log_10 t_p. These show that large
slopes are achieved only at very small true positive
rates. For example, a slope of 100 is achieved at a true
positive rate of 5.18%, 20.6% and 59.4% by the curves
with AUC of 0.9, 0.95 and 0.99 respectively. Similarly, a
slope of 1000 is achieved at 0.19%, 3.52% and 32.1% re-