## Monday, April 7, 2014

### Five Free Student Tickets for the SaberSeminar in Boston (August 17-18, 2014)

Meredith Wills, Will Carroll and myself are donating four student two-day tickets, including lunch, for the upcoming baseball analytics Saberseminar run by Dan Brooks. This is a wonderful event, and 100% of the proceeds are donated to the Jimmy Fund. You must be a current student. Meredith and myself will by choosing four students by the end of this week, Sunday April 13, 2014.

• These tickets are for both days, August 17-18, 2014
• The event is in Boston, MA
• Lunch is included, but no other meals
• Transportation and lodging are not included

If you would like to be considered for a donated ticket, please send:
• Your full name (first and last)
• If you're outside of the Boston area, how will you be getting to the event?
• Your school affiliation and whether high school or college
• Do you see yourself working in baseball? For a team, as a journalist, or something else?

Please email the above information to me at sabermetrics@gmail.com.

Again, please do so by the end of the day on Sunday, April 13, 2014. Once the tickets are awarded they're gone.

## Monday, December 30, 2013

### A Stupid and Strange Way of Looking at Sports Power Ratings that could be Smart and Useful

As I've mentioned previously, a common method used in sports for estimating game outcomes known as log5 can be written $p = \frac{p_1 q_2}{p_1 q_2+q_1 p_2}$ where $$p_i$$ is the fraction of games won by team $$i$$ and $$q_i$$ is the fraction of games lost by team $$i$$. We're assuming that there are no ties. What's the easiest way to derive this estimate? Here's one argument. Assume team $$i$$ has a probability $$p_i$$ of beating an average team (a team that wins half its games). Now imagine that this means for any given game the team has some "strength" sampled from [0,1] with median $$p_i$$ and that the stronger team always wins. Thus, the probability that team 1 beats team 2 is $p = \int_0^1 \int_0^1 \! \mathrm{Pr}(p_1 > p_2) \, \mathrm{d} p_1 \mathrm{d} p_2 .$ This looks complicated, but but with probability $$p_1$$ team 1 is stronger than an average team and with probability $$p_2$$ team 2 is stronger than an average team. From this perspective the log5 estimate is just the Bayesian probability that team 1 will be stronger than an average team while team 2 will be weaker than an average team, conditional on either team 1 being stronger than an average team and team 2 weaker than an average team, or team 1 weaker than an average team and team 2 stronger than an average team. In these cases it's unambiguous which team is stronger. The cases where the strength of both teams is stronger or weaker than an average team (the ambiguous cases) are thus discarded.

How could this be useful? Instead of ignoring the ambiguous outcomes when estimating the outcome probabilities under this "latent strength" model, we could instead determine which probability distributions best fit the outcome distributions for a given league! Furthermore, this allows us to cohesively put a power rating system into a Bayesian framework by assigning to each team a Bayesian prior strength distribution. These priors could either be uninformative or informative using e.g. preseason rankings.

## Thursday, November 7, 2013

### Building a Personal Supercomputer

It's time for a workstation upgrade; here's what I've assembled.

The massive case has plenty of space for additional drives to store the extreme amount of data that nearly every sport is now collecting together with the associated video footage. The case, power supply and motherboard allow up to 3 additional video cards to use as GPU units as your analytical needs demand (and budget can handle).
1. Cooler Master Cosmos II - Ultra Full Tower Computer Case

An absolutely massive case - plenty of room for an E-ATX motherboard, large power supply, multiple video cards and hard drives.

2. PC Power & Cooling 1200W Silencer MK III Series Modular Power Supply

An exceptionally large power supply, but platinum rated and plenty of room to spare allows it to operate nearly silenty, plus it'll handle any additional video cards you'll add later for GPU computing.

3. ASUS Rampage IV Black Edition LGA 2011 Extended ATX Intel Motherboard

Space for 4 video cards, 8 memory sticks (up to 64GB), superb quality and extreme tweakability.

4. Intel i7-4930K LGA 2011 64 Technology Extended Memory CPU

Ivy Bridge, 6 cores, exceptional ability to overclock.

5. (2) Corsair Dominator Platinum 32GB (4x8GB) DDR3 2133 MHz

Total of 64GB. Top-quality, you'll need your RAM in matched sets of 4 to enable quad-channel.

6. EVGA GeForce GTX TITAN SuperClocked 6GB GDDR5 384bit

Top-of-the-line NVIDIA Kepler card for GPU computing. 2688 CUDA cores that reach 4800 TFLOPS single-precision and 1600 TFLOPS double-precision.

7. (2) Seagate NAS HDD 2TB SATA 6GB NCQ 64 MB Cache Bare Drive

Solid hard drive; you'll need 2 or more drives for a RAID 10 array.

8. Samsung Electronics 840 Pro Series 2.5-Inch 256 GB SATA 6GB/s Solid State Drive

Small, fast SSD for quick booting and anything bottlenecked by drive read/write times.

9. Silverstone Tek 3.5-inch to 2 x 2.5-Inch Bay Converter

Needed to adapt the SSD to the Cosmos II case.

10. Ubuntu Linux

Ubuntu Linux - what else?

## Sunday, September 15, 2013

### The Good, the Bad and the Weird: Duels and the Gentleman's Draw

As I mentioned in the previous article "The Good, the Bad and the Weird: Duels, Truels and Utility Functions", a classic probability puzzle involves a 3-way duel (called a "truel").
A, B and C are to fight a three-cornered pistol duel. All know that A's chance of hitting his target is 0.3, C's is 0.5, and B never misses. They are to fire at their choice of target in succession in the order A, B, C, cyclically (but a hit man loses further turns and is no longer shot at) until only one man is left unit. What should A's strategy be?
There's a subtle issue involved in these types of problems in that we don't know how each participant values each outcome. If we allow duelists to deliberately miss there are $$2^3-1=7$$ possible outcomes; each person may or may not be shot and at least one person will not be shot. Even if deliberate missing isn't allowed, there are still 3 possible outcomes. A, for example, could conceivably value B winning more than C winning.

The classic solution concludes that with the given hit probabilities the optimal strategy for the first trueler is to deliberately miss. My contention is that this is an incomplete solution; for some sets of utility values this may not be the first trueler's optimal strategy. Let's examine duels using utility values as the first step towards addressing truels.

Let the two duelers have hit probabilities $$p_i$$ for $$i=1,2$$, and also let the values each assigns to a dueling win, loss or tie be $$W_i, L_i, T_i$$. I'm assuming these values are all known to both duelers and that $$W_i > T_i > L_i$$. Note that a strategy that optimizes expected utility is preserved under a positive affine transformation, so under the assumption that all values are finite, we may assume $$W_i=1$$ and $$L_i=0$$ for all $$i$$.

We'll declare the duel a "gentleman's draw" if both duelers deliberately miss in a single round. Let the expected utility value for dueler $$i$$ be $$U_i$$.

Trivially, if the optimal strategy for dueler 1 is trying for a hit, the optimal strategy for dueler 2 must be to also try for a hit. Conversely, the optimal strategy for dueler 1 can be to deliberately miss if and only if the optimal strategy for dueler 2 is also to deliberately miss. Assume the draw is taken if there's indifference (they are, after all, gentlemen).

If dueler 1 deliberately misses, dueler 2 has two choices. If he deliberately misses, it's a gentleman's draw and $$U_2 = T_2$$; if he tries for a hit, both duelers will subsequently try to hit each other and $$U_2 = p_2 + q_2 q_1 U_2$$. Solving, we get $U_2 = \frac{p_2}{1-q_1 q_2}.$ It's therefore optimal for dueler 2 to take the gentleman's draw if and only if $T_2 \geq \frac{p_2}{1-q_1 q_2}.$ As a consequence, dueler 1 will not deliberately miss if $T_2 < \frac{p_2}{1-q_1 q_2}.$ If dueler 1 tries for a hit, both will subsequently try to hit each other and we have $$U_1 = p_1 + q_1 q_2 U_1$$. Solving, we get $U_1 = \frac{p_1}{1-q_1 q_2}.$ Thus, the optimal strategy for dueler 1 is to deliberately miss if and only if $T_1 \geq \frac{p_1}{1-q_1 q_2}$ and $T_2 \geq \frac{p_2}{1-q_1 q_2}.$ These are, as expected, precisely the conditions under which dueler 2 will deliberately miss.

For example, if $$p_1=p_2=1/2$$, it'll be a gentleman's draw if and only if $$T_1,T_2 \geq 2/3$$. Paradoxically, both will fire under these hit probabilities if $$T_1 = 4/5$$ and $$T_2 = 1/2$$, even though this results in lower expected utility for both duelers than if they had agreed to a draw. This is a type of prisoner's dilemma.

## Saturday, August 10, 2013

### The Good, the Bad and the Weird: Duels, Truels and Utility Functions

In the excellent (and highly recommended) book "Fifty Challenging Problems in Probability with Solution", Frederick Mosteller poses "The Three-Cornered Duel":
A, B and C are to fight a three-cornered pistol duel. All know that A's chance of hitting his target is 0.3, C's is 0.5, and B never misses. They are to fire at their choice of target in succession in the order A, B, C, cyclically (but a hit man loses further turns and is no longer shot at) until only one man is left unit. What should A's strategy be?
This is problem 20 in Mosteller's book, and it also appears (with an almost identical solution) in Larsen & Marx "An Introduction to Probability and Its Applications".

Mosteller's solution:
A is naturally not feeling cheery about this enterprise. Having the first shot he sees that, if he hits C, B will then surely hit him, and so he is not going to shoot at C. If he shoots at B and misses him, then B clearly shoots the more dangerous C first, and A gets one shot at B with probability 0.3 of succeeding. If he misses this time, the less said the better. On the other hand, suppose A hits B.  Then C and A shoot alternately until one hits. A's chance of winning is $$(.5)(.3) + (.5)^2(.7)(.3) + (.5)^3(.7)^2(.3) + \ldots$$ . Each term corresponds to a sequence of misses by both C and A ending with a final hit by A. Summing the geometric series we get ... $$3/13 < 3/10$$. Thus hitting B and finishing off with C has less probability of winning for A than just missing the first shot. So A fires his first shot into the ground and then tries to hit B with his next shot. C is out of luck.
Is this right? What if B were to follow A's example and fire into the ground? What if all three were to keep firing into the ground? That this type of an outcome isn't unreasonable for certain sets of shot accuracy probabilities can be illustrated by considering the case where A's accuracy is 0.98, B's accuracy is 1.0 and C's accuracy is 0.99. Mosteller's argument is equally applicable in this case, but if B shoots C after A deliberately misses he'll be shot by A with probability 0.98. Is that reasonable?

Under the assumption that deliberately missing is allowed, there are $$2^3-1=7$$ possible outcomes - each participant can be shot or not shot, and there must be at least one participant not shot. The lack of clarity for what the ideal strategies are for A, B and C in the general case arises from the utility of 2- or 3-way ties to each of the participants being undefined.

In the next article I'll analyze 2-way duels where deliberate missing is allowed by using such fully-defined utility functions. These results will be used in a third article on 3-way duels (truels); in particular, I'll re-examine Mosteller's solution.

## Friday, August 9, 2013

### Solving a Bar Bet Using at Most Three Different Operations

A bar bet as presented in the YouTube video The HARDEST Puzzle Yet! involves starting with three of the same number from 0 through 9, then adding mathematical operations that result in an evaluation of 6. For example, if we start with three of the number 6, one solution could be $6+6-6=6 .$ I'll demonstrate a method for solving this bar bet puzzle starting with three of any number, say $$N$$, which involves using at most three different mathematical operations (although some of these may be used many, many times).

If $$0 \leq N \leq 2$$ we have the solutions
\begin{eqnarray}
(0! + 0! + 0!)! &= 6 \\
(1! + 1! + 1!)! &= 6 \\
2+2+2 &= 6.
\end{eqnarray} If $$N\geq 3$$, concatenate the three numbers. Repeatedly applying the square-root operation we'll eventually end up with a result $$x$$ with $$3 \leq x < 9$$.  If we now take the greatest integer $$\lfloor x \rfloor$$ we have an integer $$n$$ with $$3 \leq n \leq 8$$. If we can exhibit solutions for each of these cases that use only square-roots, greatest integers and one other operation, we'll be done. Using factorial for the third operation, some possibilities are
\begin{eqnarray}
\href{http://www.wolframalpha.com/input/?i=3%21}{3!} &= 6\\
\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%284%21%29%21%7D%7D%7D%7D%7D+%5Cright%5Crfloor%21%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\left\lfloor \sqrt{\sqrt{\sqrt{\sqrt{\sqrt{(4!)!}}}}} \right\rfloor!}} \right\rfloor !} &= 6\\
\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B5%21%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{5!}} \right\rfloor !} &= 6\\
\href{http://www.wolframalpha.com/input/?i=6}{6} &= 6 \\
\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B7%21%7D%7D%5Cright%5Crfloor+%21%7D%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\sqrt{\left\lfloor \sqrt{\sqrt{7!}}\right\rfloor !}}} \right\rfloor !} &= 6\\
\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B8%21%7D%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\sqrt{8!}}} \right\rfloor !} &= 6.
\end{eqnarray}
I've added Wolfram Alpha links so you can verify that these do indeed evaluate to 6.

As an illustrative example, when $$N=1337$$ we have the solution $\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%28++%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B133713371337%7D%7D%7D%7D+%5Cright%5Crfloor+%21%29%21%7D%7D%7D%7D%7D+%5Cright%5Crfloor%21%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\left\lfloor \sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\left( \left\lfloor \sqrt{\sqrt{\sqrt{\sqrt{133713371337}}}} \right\rfloor !\right)!}}}}} \right\rfloor!}} \right\rfloor !} = 6.$