In my previous post I looked at how a group of experts may be combined into a single, more powerful classifier, which I call NaiveBoost after the related AdaBoost. I'll illustrate how it can be used with a few examples.
As before, we're faced with making a binary decision, which we can view as an unknown label \(L \in \{+1, -1\}\). Furthermore, the prior distribution on \(L\) is assumed to be uniform. Let our experts' independent probabilities be \(p_1 = 0.8\), \(p_2 = 0.7\), \(p_3 = 0.6\) and \(p_4 = 0.5\). Our combined NaiveBoost classifier is \(C(S) = \sum_i \frac{L_i}{2} \log\left(\frac{p_i}{1-p_i}\right)\), where \(S = \{L_i\}\). A few things to note are that \(\log\left(\frac{p_i}{1-p_i}\right)\) is \(\mathrm{logit}(p_i)\), and an expert with \(p = 0.5\) contributes 0 to our classifier. This latter observation is what we'd expect, as \(p = 0.5\) is random guessing. Also, experts with probabilities \(p_i\) and \(p_j\) such that \(p_i = 1 - p_j\) are equivalent if we replace the stated label \(L_j\) with \(-L_j\).
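To make this concrete, here's a minimal Python sketch of the combiner under the assumptions above (uniform prior, independent experts). The names `naiveboost_score` and `combined_probability` are my own, not from any library.

```python
import math

def naiveboost_score(labels, probs):
    """Combined NaiveBoost score C(S) = sum_i (L_i / 2) * logit(p_i).

    labels: stated labels L_i, each +1 or -1
    probs:  each expert's probability p_i of being correct (0 < p_i < 1)
    """
    return sum(L / 2 * math.log(p / (1 - p)) for L, p in zip(labels, probs))

def combined_probability(score):
    """Recover Pr(L = +1 | S) from C(S) by normalizing exp(C) and exp(-C)."""
    return math.exp(score) / (math.exp(score) + math.exp(-score))
```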
Ignoring the last (uninformative) expert, we end up with the combined classifier \(C(S) = \frac{L_1}{2}\log(4) + \frac{L_2}{2}\log\left(\frac{7}{3}\right) + \frac{L_3}{2}\log\left(\frac{3}{2}\right)\). If the overall value is positive, the classifier's label is \(L = +1\); if it's negative, the classifier's label is \(L = -1\). Note the base of the logarithm doesn't matter, and we could also ignore the factor of \(\frac{1}{2}\), as these don't change the sign of \(C(S)\). However, the factor of \(\frac{1}{2}\) must be left in if we want the ability to properly recover the actual combined probability via normalization.
Now say \(L_1 = -1\), \(L_2 = +1\), \(L_3 = +1\). What's our decision? Doing the math, we get \(C(S) = \frac{1}{2}\log\left(\frac{7}{8}\right)\), and as \(7 < 8\), \(C(S) < 0\) and our combined classifier says \(L = -1\). If we wanted to recover the probability, note \(\exp\left(\frac{1}{2}\log\left(\frac{7}{8}\right)\right) = \left(\frac{7}{8}\right)^{1/2}\), hence our classifier states \(\Pr(L = +1 \mid S) = \frac{(7/8)^{1/2}}{(7/8)^{1/2} + (7/8)^{-1/2}} = \frac{7/8}{7/8 + 1} = \frac{7}{15}\), and of course \(\Pr(L = -1 \mid S) = \frac{8}{15}\).
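As a sanity check on the arithmetic, the sketch above reproduces both the sign of the score and the recovered probability:

```python
# Experts with p = 0.8, 0.7, 0.6 (the p = 0.5 expert contributes nothing and is dropped)
labels = [-1, +1, +1]
probs = [0.8, 0.7, 0.6]

score = naiveboost_score(labels, probs)
print(score)                        # 0.5 * log(7/8) ≈ -0.0668, negative, so predict L = -1
print(combined_probability(score))  # ≈ 0.4667 = 7/15
```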
As a second example, consider the @CutTheKnotMath puzzle of two liars on an island. Here we have A and B, each of whom lies with probability 2/3 and tells the truth with probability 1/3. A makes a statement and B confirms that it's true. What's the probability that A's statement is actually truthful? We can solve this in a complicated way by observing that this is equivalent to an ensemble of experts, where \(L \in \{+1, -1\}\), the prior on \(L\) is uniform and \(L_1 = L_2 = +1\). The probability that \(L = +1\) is precisely the probability that A is telling the truth.
Following the first example, \(C(S) = \frac{L_1}{2}\log\left(\frac{1}{2}\right) + \frac{L_2}{2}\log\left(\frac{1}{2}\right) = \log\left(\frac{1}{2}\right)\). Continuing as before, we get \(\Pr(L = +1 \mid S) = \frac{1/2}{1/2 + 2} = \frac{1}{5}\).
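And the same sketch reproduces the puzzle's answer when A and B are treated as two experts, each correct with probability 1/3, both asserting +1:

```python
# Two liars: each expert is correct with probability 1/3, and both say +1
score = naiveboost_score([+1, +1], [1/3, 1/3])
print(combined_probability(score))  # 0.2 = 1/5
```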
In the event tree there are two cases where B says the statement is true: one with probability 1/9 (A tells the truth and B also says it is true, 1/3 * 1/3) and the other with probability 4/9 (A lies, so the statement is false, and B also lies and says it is true, 2/3 * 2/3). So the probability that A is really telling the truth, given that B said it is true, is (1/9)/(1/9 + 4/9), or 1/5.
Yes, that's simpler. I used this approach to illustrate how a puzzle such as this is actually related to a more complicated idea such as ensembles in machine learning.
Understood. Just checking the math. I don't know anything about machine learning...