tag:blogger.com,1999:blog-9149402429183581490Wed, 10 Dec 2014 08:58:45 +0000The Angry Statisticianhttp://angrystatistician.blogspot.com/noreply@blogger.com (Christopher Long)Blogger44125tag:blogger.com,1999:blog-9149402429183581490.post-7402880594498963226Sun, 12 Oct 2014 03:38:00 +00002014-10-11T20:41:45.665-07:00A Strange Recursive Relation, AutomaticHofstadter mentions the following recursive relation in his great book "Gödel, Escher, Bach": \[<br />\begin{align}<br />g(0) &= 0;\\<br />g(n) &= n-g(g(n-1)).<br />\end{align}<br />\] I claim that \( g(n) = \left\lfloor \phi\cdot (n+1) \right\rfloor \), where \( \phi = \frac{-1+\sqrt{5}}{2}\), and I'll show this using a technique that makes proving many identities of this type nearly automatic.<br /><br />Let \( \phi\cdot n = \left\lfloor \phi\cdot n \right\rfloor + e\), where \( 0 < e < 1\) as \( \phi \) is irrational, nor can \(e = 1-\phi\), and note that \(\phi\) satisfies \( {\phi}^2 + \phi - 1 = 0\). Some algebra gives \[<br />\begin{align}<br />n-\left\lfloor \left( \left\lfloor \phi\cdot n \right\rfloor + 1 \right) \cdot \phi \right\rfloor<br />&= n-\left\lfloor \left( n\cdot \phi - e + 1 \right) \cdot \phi \right\rfloor \\<br />&= n-\left\lfloor n\cdot {\phi}^2 - e\cdot \phi + \phi \right\rfloor \\<br />&= n-\left\lfloor n\cdot \left(1-\phi\right) - e\cdot \phi + \phi \right\rfloor \\<br />&= n-n-\left\lfloor -n\cdot \phi - e\cdot \phi + \phi \right\rfloor \\<br />&= -\left\lfloor -n\cdot \phi - e\cdot \phi + \phi \right\rfloor \\<br />&= -\left\lfloor -n \cdot \phi + e - e - e\cdot \phi + \phi \right\rfloor \\<br />&= \left\lfloor \phi\cdot n \right\rfloor -\left\lfloor - e - e\cdot \phi + \phi \right\rfloor.<br />\end{align}<br />\]<br />Now if \[<br />\begin{align}<br />0 < e < 1-\phi &\implies 0 < - e - e\cdot \phi + \phi < \phi;\\<br />1-\phi < e < 1 &\implies -1 < - e - e\cdot \phi + \phi < 0.<br />\end{align}<br />\]<br />This implies \[ n-\left\lfloor \left( \left\lfloor \phi\cdot n \right\rfloor + 1 \right) \cdot \phi \right\rfloor = \left\lfloor \phi\cdot (n+1) \right\rfloor .\] Since \( \left\lfloor \phi\cdot (0+1) \right\rfloor = 0\), we're done.<br /><br />The point of the algebra was to move all terms involving \(n\) out, and then checking to see how the remaining term varied with \( e\). A simple idea, but very useful.http://angrystatistician.blogspot.com/2014/10/a-strange-recursive-relation-automatic.htmlnoreply@blogger.com (Christopher Long)1tag:blogger.com,1999:blog-9149402429183581490.post-562657659715110374Sun, 05 Oct 2014 19:26:00 +00002014-10-05T21:01:29.267-07:00Closed Under MeansHere's a nice little problem from the 13th All Soviet Union Mathematics Olympiad.<br /><blockquote class="tr_bq">Given a set of real numbers \(S\) containing 0 and 1 that's closed under finite means, show that it contains all rational numbers in the interval \(\left[0,1\right]\).</blockquote>This isn't a difficult problem, but there's a particularly nice solution.<br /><br />First observe that if \(x\in S\) then both \(\frac{x}{4}\) and \(\frac{3x}{4}\) are in \(S\); average \(\{0,x\}\) to get \(\frac{x}{2}\), average \(\{0, \frac{x}{2}\}\) to get \(\frac{x}{4}\), average \(\{\frac{x}{2}, x\}\) to get \(\frac{3x}{4}\).<br /><br />We could show any rational number \(\frac{m}{n}\) with \(1\leq m < n\) is in \(S\) if we had \(n\) distinct elements from \(S\) that summed to \(m\). Let's exhibit one.<br /><br />Start with an array with \(m\) 1s on the left, \(n-m\) 0s on the right. Repeatedly replace adjacent \(x,y\) values with \(\frac{3(x+y)}{4}, \frac{(x+y)}{4}\), where either \(x=1,y\neq1\) or \(x\neq 0, y=0\), until there is one 0 and one 1 left. We can do this in exactly \(n-2\) steps, each resulting number is guaranteed to be in \(S\) by the above note, and each number is guaranteed to be distinct by construction!<br /><br />Examples:<br /><br />\(\frac{1}{3}: \left[1,0,0\right] \to \left[\frac{3}{4},\frac{1}{4},0\right] \)<br /><br />\(\frac{2}{5}: \left[1,1,0,0,0\right] \to \left[1,\frac{3}{4},\frac{1}{4},0,0\right] \to \left[1,\frac{3}{4},\frac{3}{16},\frac{1}{16},0\right] \)http://angrystatistician.blogspot.com/2014/10/closed-under-means.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-8609248843020646690Fri, 29 Aug 2014 04:28:00 +00002014-08-30T01:15:37.813-07:00Learn One Weird Trick And Easily Solve The Product-Sum ProblemA tribute to <a href="https://en.wikipedia.org/wiki/Martin_Gardner" target="_blank"><b>Martin Gardner</b></a>.<br /><br />For which sets of positive integers does the sum equal the product? For example, when does \( x_1 + x_2 = x_1\cdot x_2\)? It's easy to see that this is only true when \(x_1 = x_2 = 2\).<br /><br />In the general case our equality is \(\sum_{i=1}^{n} x_i= \prod_{i=1}^{n} x_i \). We can rearrange terms to give \[x_1+x_2+\sum_{i=3}^{n} x_i= x_1\cdot x_2\cdot \prod_{i=3}^{n} x_i,\] and this in turn factors nicely to give us \[\left( x_1\cdot \prod_{x=3}^{n} x_i - 1\right)\cdot \left( x_2\cdot \prod_{x=3}^{n} x_i - 1\right) = \left( \prod_{x=3}^{n} x_i \right)\cdot \left(\sum_{x=3}^{n} x_i \right) + 1.\] How does this help? Consider the case \(n=5\), and without loss of generality assume \(x_i \ge x_{i+1}\) for all \(i\). If \(x_3=x_4=x_5=1\) our factorized equation becomes \[(x_1-1)\cdot(x_2-1)=4,\] with the obvious solutions \( x_1=5, x_2=2; x_1=3, x_2=3\). The only remaining case to consider is \(x_3=2\), as any other case forces \( \sum_{i=1}^{n} x_i < \prod_{i=1}^{n} x_i \). For this case our equality becomes \[\left( 2 x_1 - 1\right)\cdot \left( 2 x_2 - 1\right) = 9.\] This gives us the remaining solution \(x_1 = x_2 = x_3 = 2\) with the other \(x_i = 1\).<br /><br />This is quite efficient. For example, for the case \(n=100\) the only equations we need to consider are \begin{aligned} (x_1-1)\cdot(x_2-1) &= 99\\<br /> (2 x_1-1)\cdot(2 x_2-1) &= 199\\<br /> (3 x_1-1)\cdot(3 x_2-1) &= 301\\<br /> (4 x_1-1)\cdot(4 x_2-1) &= 405\\<br /> (4 x_1-1)\cdot(4 x_2-1) &= 401\\<br /> (6 x_1-1)\cdot(6 x_2-1) &= 607\\<br /> (9 x_1-1)\cdot(9 x_2-1) &= 919\\<br /> (8 x_1-1)\cdot(8 x_2-1) &= 809\\<br /> (12 x_1-1)\cdot(12 x_2-1) &= 1225\\<br /> (16 x_1-1)\cdot(16 x_2-1) &= 1633.<br />\end{aligned} Now 199, 401, 607, 919, 809 are all prime ruling them out immediately, and 301 and 1633 don't have factors of the right form. The other equations yield the five solutions \( (100,2), (34,4), (12,10), (7,4,4), (3,3,2,2,2) \) with the other \(x_i = 1\).<br /><br />For \(n = 1000 \) you'd need to consider 52 cases, but most of these are eliminated immediately. I get the six solutions \( (1000,2), (334,4), (112,10), (38,28), (67,4,4), (16,4,4,4) \).<br /><br />Have fun!<br /><br />http://angrystatistician.blogspot.com/2014/08/learn-this-one-weird-trick-and-easily.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-5713515223902536661Wed, 09 Jul 2014 03:01:00 +00002014-07-08T20:30:01.826-07:00Finding the Best Book Quotes: Power Ranking GoodreadsMy friend Jordan Ellenberg <a href="http://online.wsj.com/articles/the-summers-most-unread-book-is-1404417569" target="_blank">wrote an article for the Wall Street Journal</a> recently describing a metric to roughly measure the least read books, which he calls the Hawking Index (HI). As he describes it, take the page numbers of a book's five top highlights, average them, and divide by the number of pages in the whole book.<br /><br />On a discussion thread on Facebook, this led to a proposal from me for measuring the general quality of a quote. Assume a user has some level of discrimination \(D\); the higher the value of \(D\), the more likely they are to quote a passage. Now assume each passage has some measure of quality \(Q\); the higher the value of \(Q\), the more likely a passage is to be quoted. Let's try a classic <a href="http://en.wikipedia.org/wiki/Pairwise_comparison" target="_blank">Bradley-Terry-Luce</a> model - if a user with discrimination \(D\) quotes at least one passage from a particular work, the probability \(p\) that they'll quote a given passage with quality \(Q\) from that same work is roughly \[ p = \frac{Q}{Q+D}.\] <br />Nice, but where can we find data to actually test this theory? It turns out an excellent source is <a href="https://www.goodreads.com/" target="_blank">Goodreads</a>. Users are numbered sequentially starting with Goodreads founder <a href="https://www.goodreads.com/user/show/1" target="_blank">Otis Chandler's ID of 1</a>; a user's <a href="https://www.goodreads.com/quotes/list/1" target="_blank">associated quote page</a> has a simple URL structure. Furthermore, every user's quote list has an <a href="https://www.goodreads.com/quotes/list_rss/1-otis-chandler" target="_blank">associated RSS/Atom URL</a> that provides an easy to parse XML file.<br /><br />The steps:<br /><ol><li>Write a simple Go scraper to uses a goroutine to quickly and sequentially get user quote pages, grab the count of quotes shown on that page (truncated at 30), then grab the RSS/Atom URL. This is able to process about 2500/users minutes over a slow internet connection.</li><li>Write a simple Ruby program to fetch quote XML files for every user that has at least one quote.</li><li>Write a simple Ruby program to fetch the author URL and work URL for every quote cited by at least one user.</li><li>Load all this data into a PostgreSQL database.</li><li>Write a simple R program to pull the data from the database and fit our model. We assign a value for the outcome "quoted" of 1 if the user quoted a passage; we assign "quoted" the value 0 if the user quote a passage from that work, but not that particular passage. Both user "discrimination" and quote "quality" are treated as factors and modeled as random effects; the R package "lmer" was used to fit the model.</li></ol><div>If there is interest, I can make all of this code publicly available through my <a href="https://github.com/octonion" target="_blank">GitHub account</a>. None of it is proprietary.</div><div><br /></div><div>As a test I grabbed the data from 333760 users, 10724 had at least one quote, there were 83160 total quotes and 32621 unique quotes. The model took about 3.5 minutes to fit using an optimized BLAS library and 4 hyperthreaded CPU cores.</div><div><br /></div><div>The top 10 quotes on Goodreads, ranked by quality:<br /><ol><li><a href="https://www.goodreads.com/quotes/11035-no-one-can-make-you-feel-inferior-without-your-consent" target="_blank">“No one can make you feel inferior without your consent.”</a><br /><b>Eleanor Roosevelt</b>, <i>This is My Story</i></li><li><a href="https://www.goodreads.com/quotes/352-outside-of-a-dog-a-book-is-man-s-best-friend" target="_blank">“Outside of a dog, a book is man's best friend. Inside of a dog it's too dark to read.”</a><br /><b>Groucho Marx</b>, <i>The Essential Groucho: Writings For By And About Groucho Marx</i></li><li><a href="https://www.goodreads.com/quotes/2340-twenty-years-from-now-you-will-be-more-disappointed-by" target="_blank">“Twenty years from now you will be more disappointed by the things that you didn't do than by the ones you did do. So throw off the bowlines. Sail away from the safe harbor. Catch the trade winds in your sails. Explore. Dream. Discover.”</a><br /><b>H. Jackson Brown Jr.</b>, <i>P.S. I Love You</i></li><li><a href="https://www.goodreads.com/quotes/1198-i-am-so-clever-that-sometimes-i-don-t-understand-a" target="_blank">“I am so clever that sometimes I don't understand a single word of what I am saying.”</a><br /><b>Oscar Wilde</b>, <i>The Happy Prince and Other Stories</i></li><li><a href="https://www.goodreads.com/quotes/5543-insanity-is-doing-the-same-thing-over-and-over-again" target="_blank">“Insanity is doing the same thing, over and over again, but expecting different results.”</a><br /><b>Narcotics Anonymous</b>, <i>Narcotics Anonymous</i></li><li><a href="https://www.goodreads.com/quotes/2644-he-has-achieved-success-who-has-lived-well-laughed-often" target="_blank">“He has achieved success who has lived well, laughed often, and loved much;Who has enjoyed the trust of pure women, the respect of intelligent men and the love of little children;Who has filled his niche and accomplished his task;Who has never lacked appreciation of Earth's beauty or failed to express it;Who has left the world better than he found it,Whether an improved poppy, a perfect poem, or a rescued soul;Who has always looked for the best in others and given them the best he had;Whose life was an inspiration;Whose memory a benediction.”</a><br /><b>Bessie Anderson Stanley</b>, <i>More Heart Throbs Volume Two in Prose and Verse Dear to the American People And by them contributed as a Supplement to the original $10,000 Prize Book HEART THROBS</i></li><li><a href="https://www.goodreads.com/quotes/2627-the-trouble-with-having-an-open-mind-of-course-is" target="_blank">“The trouble with having an open mind, of course, is that people will insist on coming along and trying to put things in it.”</a><br /><b>Terry Pratchett</b>, <i>Diggers</i></li><li><a href="https://www.goodreads.com/quotes/2621-love-all-trust-a-few-do-wrong-to-none" target="_blank">“Love all, trust a few, do wrong to none.”</a><br /><b>William Shakespeare</b>, <i>All's Well That Ends Well</i></li><li><a href="https://www.goodreads.com/quotes/6396-the-paradoxical-commandments-people-are-illogical-unreasonable-and-self-centered-love" target="_blank"><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">“The Paradoxical Commandments</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />People are illogical, unreasonable, and self-centered.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Love them anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />If you do good, people will accuse you of selfish ulterior motives.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Do good anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />If you are successful, you will win false friends and true enemies.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Succeed anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />The good you do today will be forgotten tomorrow.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Do good anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />Honesty and frankness make you vulnerable.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Be honest and frank anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />The biggest men and women with the biggest ideas can be shot down by the smallest men and women with the smallest minds.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Think big anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />People favor underdogs but follow only top dogs.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Fight for a few underdogs anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />What you spend years building may be destroyed overnight.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Build anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />People really need help but may attack you if you do help them.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;">Help people anyway.</span><span style="background-color: white; color: #181818; font-family: georgia, serif; font-size: 14px; line-height: 18px;"><br />Give the world the best you have and you'll get kicked in the teeth.</span></a><span style="background-color: white;"><span style="color: #181818; font-family: georgia, serif;"><span style="font-size: 14px; line-height: 18px;"><a href="https://www.goodreads.com/quotes/6396-the-paradoxical-commandments-people-are-illogical-unreasonable-and-self-centered-love" target="_blank">Give the world the best you have anyway.”</a></span></span><br /><span style="color: #181818; font-family: georgia, serif;"><span style="font-size: 14px; line-height: 18px;"><b>Kent M. Keith</b>, <i>The Silent Revolution: Dynamic Leadership in the Student Council</i></span></span></span></li><li><span style="background-color: white;"><span style="color: #181818; font-family: georgia, serif;"><span style="font-size: 14px; line-height: 18px;"><a href="https://www.goodreads.com/quotes/2795-she-is-too-fond-of-books-and-it-has-turned" target="_blank">“She is too fond of books, and it has turned her brain.”</a></span><br /><span style="font-size: 14px; line-height: 18px;"><b>Louisa May Alcott</b>, <i>Work: A Story of Experience</i></span></span></span></li></ol></div>http://angrystatistician.blogspot.com/2014/07/finding-best-book-quotes-power-ranking.htmlnoreply@blogger.com (Christopher Long)6tag:blogger.com,1999:blog-9149402429183581490.post-5487884645169400777Tue, 08 Apr 2014 03:04:00 +00002014-04-09T23:39:54.078-07:00Five Free Student Tickets for the SaberSeminar in Boston (August 17-18, 2014)<a href="https://twitter.com/Bbl_Astrophyscs" target="_blank">Meredith Wills</a>, <a href="https://twitter.com/injuryexpert" target="_blank">Will Carroll</a> and <a href="https://twitter.com/octonion" target="_blank">myself</a> are donating four student two-day tickets, including lunch, for the upcoming baseball analytics <a href="http://saberseminar.com/" target="_blank">Saberseminar</a> run by <a href="https://twitter.com/brooksbaseball" target="_blank">Dan Brooks</a>. This is a wonderful event, and 100% of the proceeds are donated to the <a href="https://www.jimmyfund.org/" target="_blank">Jimmy Fund</a>. You <b>must</b> be a current student. Meredith and myself will by choosing four students by the end of this week, Sunday April 13, 2014.<br /><br />Please note:<br /><ul><li>These tickets are for both days, August 17-18, 2014</li><li>The event is in Boston, MA</li><li>Lunch is included, but <b>no</b> other meals</li><li>Transportation and lodging are <b>not</b> included</li></ul><div><br />If you would like to be considered for a donated ticket, please send:</div><div><ul><li>Your full name (first and last)</li><li>If you're outside of the Boston area, how will you be getting to the event?</li><li>Your school affiliation and whether high school or college</li><li>Best contact email address (if different from reply-to address)</li><li>A little about your baseball interests, analytical or otherwise</li><li>Do you see yourself working in baseball? For a team, as a journalist, or something else?</li></ul><div><br />Please email the above information to me at <a href="mailto:sabermetrics@gmail.com" target="_blank">sabermetrics@gmail.com</a>.</div><div><br /></div><div>Again, please do so by the end of the day on Sunday, April 13, 2014. Once the tickets are awarded they're gone.</div></div>http://angrystatistician.blogspot.com/2014/04/four-free-student-tickets-for.htmlnoreply@blogger.com (Christopher Long)2tag:blogger.com,1999:blog-9149402429183581490.post-3718907100550685167Mon, 30 Dec 2013 17:51:00 +00002013-12-30T10:30:04.425-08:00A Stupid and Strange Way of Looking at Sports Power Ratings that could be Smart and UsefulAs I've <a href="http://angrystatistician.blogspot.com/2013/03/baseball-chess-psychology-and.html"><b>mentioned previously</b></a>, a common method used in sports for estimating game outcomes known as <a href="https://en.wikipedia.org/wiki/Log5"><b>log5</b></a> can be written \[p = \frac{p_1 q_2}{p_1 q_2+q_1 p_2}\] where \(p_i\) is the fraction of games won by team \(i\) and \(q_i\) is the fraction of games lost by team \(i\). We're assuming that there are no ties. What's the easiest way to derive this estimate? Here's one argument. Assume team \(i\) has a probability \(p_i\) of beating an average team (a team that wins half its games). Now imagine that this means for any given game the team has some <b>"strength"</b> sampled from [0,1] with median \(p_i\) and that the stronger team always wins. Thus, the probability that team 1 beats team 2 is \[ p = \int_0^1 \int_0^1 \! \mathrm{Pr}(p_1 > p_2) \, \mathrm{d} p_1 \mathrm{d} p_2 .\] This looks complicated, but but with probability \(p_1\) team 1 is stronger than an average team and with probability \(p_2\) team 2 is stronger than an average team. From this perspective the log5 estimate is just the <a href="https://en.wikipedia.org/wiki/Bayes'_theorem" target="_blank"><b>Bayesian probability</b></a> that team 1 will be stronger than an average team while team 2 will be weaker than an average team, conditional on either team 1 being stronger than an average team and team 2 weaker than an average team, or team 1 weaker than an average team and team 2 stronger than an average team. In these cases it's unambiguous which team is stronger. The cases where the strength of both teams is stronger or weaker than an average team (the ambiguous cases) are thus discarded.<br /><br />How could this be useful? Instead of ignoring the ambiguous outcomes when estimating the outcome probabilities under this <b>"latent strength"</b> model, we could instead determine which probability distributions best fit the outcome distributions for a given league! Furthermore, this allows us to cohesively put a power rating system into a Bayesian framework by assigning to each team a <a href="https://en.wikipedia.org/wiki/Prior_probability" target="_blank"><b>Bayesian prior</b></a> strength distribution. These priors could either be <a href="https://en.wikipedia.org/wiki/Prior_probability#Uninformative_priors" target="_blank"><b>uninformative</b></a> or <a href="https://en.wikipedia.org/wiki/Prior_probability#Informative_priors" target="_blank"><b>informative</b></a> using e.g. preseason rankings.http://angrystatistician.blogspot.com/2013/12/a-stupid-and-strange-way-of-looking-at.htmlnoreply@blogger.com (Christopher Long)2tag:blogger.com,1999:blog-9149402429183581490.post-6620747293907708484Thu, 07 Nov 2013 23:28:00 +00002013-11-07T16:01:12.861-08:00Building a Personal SupercomputerIt's time for a workstation upgrade; here's what I've assembled.<br /><br />The massive case has plenty of space for additional drives to store the extreme amount of data that nearly every sport is now collecting together with the associated video footage. The case, power supply and motherboard allow up to 3 additional video cards to use as GPU units as your analytical needs demand (and budget can handle).<br /><ol><li><a href="http://www.amazon.com/gp/product/B00BCXF4JQ/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00BCXF4JQ&linkCode=as2&tag=octonion-20" target="_blank">Cooler Master Cosmos II - Ultra Full Tower Computer Case</a><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://ecx.images-amazon.com/images/I/91O40bKe-mL._SL1500_.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="320" src="http://ecx.images-amazon.com/images/I/91O40bKe-mL._SL1500_.jpg" title="Cooler Master Cosmos II" width="259" /></a></div><br />An absolutely massive case - plenty of room for an E-ATX motherboard, large power supply, multiple video cards and hard drives. <br /><br /></li><li><a href="http://www.amazon.com/gp/product/B008U12856/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B008U12856&linkCode=as2&tag=octonion-20" target="_blank">PC Power & Cooling 1200W Silencer MK III Series Modular Power Supply</a><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://ecx.images-amazon.com/images/I/41-vt3pnUJL.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="214" src="http://ecx.images-amazon.com/images/I/41-vt3pnUJL.jpg" title="PC Power & Cooling 1200W Silencer MK III" width="320" /></a></div><br />An exceptionally large power supply, but platinum rated and plenty of room to spare allows it to operate nearly silenty, plus it'll handle any additional video cards you'll add later for GPU computing. <br /><br /></li><li><a href="http://www.newegg.com/Product/Product.aspx?Item=N82E16813132053" target="_blank">ASUS Rampage IV Black Edition LGA 2011 Extended ATX Intel Motherboard</a><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://images10.newegg.com/NeweggImage/productimage/13-132-053-02.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="240" src="http://images10.newegg.com/NeweggImage/productimage/13-132-053-02.jpg" title="ASUS Rampage IV Black Edition" width="320" /></a></div><br />Space for 4 video cards, 8 memory sticks (up to 64GB), superb quality and extreme tweakability. <br /><br /></li><li><a href="http://www.amazon.com/gp/product/B00EMHM622/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00EMHM622&linkCode=as2&tag=octonion-20" target="_blank">Intel i7-4930K LGA 2011 64 Technology Extended Memory CPU</a><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://ecx.images-amazon.com/images/I/91++fq52iVL._SL1500_.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="320" src="http://ecx.images-amazon.com/images/I/91++fq52iVL._SL1500_.jpg" title="Intel i7-4930K LGA 2011" width="236" /></a></div><br />Ivy Bridge, 6 cores, exceptional ability to overclock. <br /><br /></li><li><a href="http://www.amazon.com/gp/product/B008RAS8AO/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B008RAS8AO&linkCode=as2&tag=octonion-20" target="_blank">(2) Corsair Dominator Platinum 32GB (4x8GB) DDR3 2133 MHz</a><img alt="" border="0" height="1" src="http://ir-na.amazon-adsystem.com/e/ir?t=octonion-20&l=as2&o=1&a=B008RAS8AO" style="border: none !important; margin: 0px !important;" width="1" /><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://ecx.images-amazon.com/images/I/51H33Y5+3zL.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="320" src="http://ecx.images-amazon.com/images/I/51H33Y5+3zL.jpg" title="Corsair Dominator Platinum 32GB (4x8GB)" width="320" /></a></div><br />Total of 64GB. Top-quality, you'll need your RAM in matched sets of 4 to enable quad-channel.<br /><br /></li><li><a href="http://www.amazon.com/gp/product/B00BL8BX7O/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00BL8BX7O&linkCode=as2&tag=octonion-20" target="_blank">EVGA GeForce GTX TITAN SuperClocked 6GB GDDR5 384bit</a><img alt="" border="0" height="1" src="http://ir-na.amazon-adsystem.com/e/ir?t=octonion-20&l=as2&o=1&a=B00BL8BX7O" style="border: none !important; margin: 0px !important;" width="1" /><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://ecx.images-amazon.com/images/I/71phIxf42XL._SL1200_.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="320" src="http://ecx.images-amazon.com/images/I/71phIxf42XL._SL1200_.jpg" title="EVGA GeForce GTX TITAN SuperClocked" width="320" /></a></div><br /> Top-of-the-line NVIDIA Kepler card for GPU computing. 2688 CUDA cores that reach 4800 TFLOPS single-precision and 1600 TFLOPS double-precision.<br /><br /></li><li><a href="http://www.amazon.com/gp/product/B00D1GYNT4/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B00D1GYNT4&linkCode=as2&tag=octonion-20" target="_blank">(2) Seagate NAS HDD 2TB SATA 6GB NCQ 64 MB Cache Bare Drive</a><img alt="" border="0" height="1" src="http://ir-na.amazon-adsystem.com/e/ir?t=octonion-20&l=as2&o=1&a=B00D1GYNT4" style="border: none !important; margin: 0px !important;" width="1" /><br /><br />Solid hard drive; you'll need 2 or more drives for a RAID 10 array.<br /><br /></li><li><a href="http://www.amazon.com/gp/product/B009NB8WRU/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B009NB8WRU&linkCode=as2&tag=octonion-20" target="_blank">Samsung Electronics 840 Pro Series 2.5-Inch 256 GB SATA 6GB/s Solid State Drive</a><img alt="" border="0" height="1" src="http://ir-na.amazon-adsystem.com/e/ir?t=octonion-20&l=as2&o=1&a=B009NB8WRU" style="border: none !important; margin: 0px !important;" width="1" /><br /><br />Small, fast SSD for quick booting and anything bottlenecked by drive read/write times.<br /><br /></li><li><a href="http://www.amazon.com/gp/product/B008O510FW/ref=as_li_ss_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B008O510FW&linkCode=as2&tag=octonion-20" target="_blank">Silverstone Tek 3.5-inch to 2 x 2.5-Inch Bay Converter</a><img alt="" border="0" height="1" src="http://ir-na.amazon-adsystem.com/e/ir?t=octonion-20&l=as2&o=1&a=B008O510FW" style="border: none !important; margin: 0px !important;" width="1" /><br /><br />Needed to adapt the SSD to the Cosmos II case.<br /><br /></li><li><a href="http://www.ubuntu.com/" target="_blank">Ubuntu Linux</a><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://design.ubuntu.com/wp-content/uploads/logo-ubuntu_st_no%C2%AE-black_orange-hex.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="226" src="http://design.ubuntu.com/wp-content/uploads/logo-ubuntu_st_no%C2%AE-black_orange-hex.png" title="Ubuntu Linux" width="320" /></a></div><br /> Ubuntu Linux - what else? </li></ol>http://angrystatistician.blogspot.com/2013/11/building-personal-supercomputer.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-1224011523756248298Mon, 16 Sep 2013 06:58:00 +00002013-09-16T00:10:30.580-07:00The Good, the Bad and the Weird: Duels and the Gentleman's Draw<div>As I mentioned in the previous article <a href="http://angrystatistician.blogspot.com/2013/08/the-good-bad-and-weird-duels-truels-and.html">"The Good, the Bad and the Weird: Duels, Truels and Utility Functions"</a>, a classic probability puzzle involves a 3-way duel (called a <a href="http://en.wikipedia.org/wiki/Truel">"truel"</a>).</div><blockquote class="tr_bq">A, B and C are to fight a three-cornered pistol duel. All know that A's chance of hitting his target is 0.3, C's is 0.5, and B never misses. They are to fire at their choice of target in succession in the order A, B, C, cyclically (but a hit man loses further turns and is no longer shot at) until only one man is left unit. What should A's strategy be?</blockquote><div>There's a subtle issue involved in these types of problems in that we don't know how each participant values each outcome. If we allow duelists to deliberately miss there are \(2^3-1=7\) possible outcomes; each person may or may not be shot and at least one person will not be shot. Even if deliberate missing isn't allowed, there are still 3 possible outcomes. A, for example, could conceivably value B winning more than C winning.<br /><br />The classic solution concludes that with the given hit probabilities the optimal strategy for the first trueler is to deliberately miss. My contention is that this is an incomplete solution; for some sets of utility values this may not be the first trueler's optimal strategy. Let's examine duels using utility values as the first step towards addressing truels.<br /><br /></div><div>Let the two duelers have hit probabilities \( p_i \) for \( i=1,2\), and also let the values each assigns to a dueling win, loss or tie be \( W_i, L_i, T_i\). I'm assuming these values are all known to both duelers and that \( W_i > T_i > L_i\). Note that a strategy that optimizes expected utility is preserved under a <a href="http://www.econ.ucsb.edu/~tedb/Courses/Ec100C/VarianExpectedUtility.pdf%E2%80%8E" target="_blank">positive affine transformation</a>, so under the assumption that all values are finite, we may assume \(W_i=1\) and \(L_i=0\) for all \(i\).<br /><br />We'll declare the duel a "gentleman's draw" if both duelers deliberately miss in a single round. Let the expected utility value for dueler \(i\) be \(U_i\).<br /><br />Trivially, if the optimal strategy for dueler 1 is trying for a hit, the optimal strategy for dueler 2 must be to also try for a hit. Conversely, the optimal strategy for dueler 1 can be to deliberately miss if and only if the optimal strategy for dueler 2 is also to deliberately miss. Assume the draw is taken if there's indifference (they are, after all, gentlemen).<br /><br />If dueler 1 deliberately misses, dueler 2 has two choices. If he deliberately misses, it's a gentleman's draw and \(U_2 = T_2\); if he tries for a hit, both duelers will subsequently try to hit each other and \(U_2 = p_2 + q_2 q_1 U_2 \). Solving, we get \[U_2 = \frac{p_2}{1-q_1 q_2}.\] It's therefore optimal for dueler 2 to take the gentleman's draw if and only if \[T_2 \geq \frac{p_2}{1-q_1 q_2}.\] As a consequence, dueler 1 will not deliberately miss if \[T_2 < \frac{p_2}{1-q_1 q_2}.\] If dueler 1 tries for a hit, both will subsequently try to hit each other and we have \( U_1 = p_1 + q_1 q_2 U_1\). Solving, we get \[ U_1 = \frac{p_1}{1-q_1 q_2}.\] Thus, the optimal strategy for dueler 1 is to deliberately miss if and only if \[ T_1 \geq \frac{p_1}{1-q_1 q_2}\] and \[T_2 \geq \frac{p_2}{1-q_1 q_2}.\] These are, as expected, precisely the conditions under which dueler 2 will deliberately miss.<br /><br />For example, if \(p_1=p_2=1/2\), it'll be a gentleman's draw if and only if \(T_1,T_2 \geq 2/3\). Paradoxically, both will fire under these hit probabilities if \( T_1 = 4/5 \) and \( T_2 = 1/2 \), even though this results in lower expected utility for both duelers than if they had agreed to a draw. This is a type of <a href="http://en.wikipedia.org/wiki/Prisoner's_dilemma" target="_blank">prisoner's dilemma</a>.</div>http://angrystatistician.blogspot.com/2013/09/the-good-bad-and-weird-duels-and.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-5812573797954626341Sun, 11 Aug 2013 09:10:00 +00002013-08-11T02:10:20.672-07:00A Man, a Plan, a Cat, a Canal - Panama!A man, a plan, a cat, a master, a minae, a lunula, a tora, a vela, a cola, a belga, a tola, an uta, a dona, a tiara, an aura, a kata, a mana, a nipa, a cabana, a law, a harem, a taces, a liard, a gnat, a mut, a dawed, a bura, a nim, a levator, an eh, a trams, a sall, a vacates, an up, a garb, a keets, a serin, a sored, a patamar, a gater, a lees, a strep, a lama, a pat, a lame, a haded, a wag, a naan, a kalian, a fer, a yaws, a remit, a keep, a loom, a neep, a ha, an ai, a retailor, a cabals, a subah, a slits, a pated, a callet, a xu, a faena, a tups, a sati, a rapider, a reps, a rapid, a korat, a dens, a sled, a tical, a pipages, a rats, a seeks, a defi, a wasts, a peracid, a nomas, a rosarian, a bran, a nonet, a gama, a sac, a scrota, a lab, a cain, an aid, a statin, a reel, a tiff, a haj, a talas, a nares, a mares, a laics, an allot, a til, an arena, a nits, a palace, a fanum, a yarer, a daub, a basts, a masas, a called, a slay, a galoots, a senega, a ton, a span, a mum, a dracaenas, a coda, a must, a satis, a lion, a yob, a reknit, a dessert, a lex, an aver, a red, a fab, a katas, a deet, a diaper, a nos, a snips, a spot, a fed, a wap, a past, a galax, an apos, a zaps, a lat, a rai, a resale, a tallit, a boob, a hag, a fayed, a sports, a loof, a ragg, a sashay, an agar, a selah, a spit, a brad, a strap, a wolf, a reified, a navar, a cases, a bedaub, a tassels, a gaes, a runaway, an amahs, a burg, a sorter, a del, a basely, a kae, an evades, a lader, a waved, a spirts, a pas, a slam, a hasps, a nalas, a snog, a cedar, an ulama, a lasagnas, a leud, a poh, a larum, a si, a retained, a pay, a pit, a raps, a repots, a sell, a datums, a yard, a hey, a terces, a prod, a skips, a gip, a lin, a sib, a lac, a jaded, a later, a beef, a sallet, a panicum, a gnar, a pall, a gas, a rotator, a tamis, a gel, a snawed, a decaf, a pol, a jawans, a dual, a gateman, a tins, a ran, a riff, a za, a minim, a get, a bard, a spud, a sum, a stem, a rases, a tsar, a lobate, a hasp, a hwan, a lotahs, an agas, a pad, a reed, a slim, an attar, a steels, a medina, a gar, a tub, an asarum, a pows, an ava, a mib, a map, a stab, a rasp, a tala, a nadir, a seton, a stoped, a repoll, a waste, a hades, a ragi, a tsades, a bap, a rain, a ratels, a hasten, a gid, a snot, a basal, a vinas, an upas, a nonacid, a naval, a lahar, a sonar, a lovat, a wares, a bal, a ripsaw, a jatos, a sup, a sh, a haed, a hailed, a moor, a vaned, a delf, a revel, a tan, a meter, a sob, an evil, a deific, a peracids, an ass, a bait, a buran, a far, a lamas, a robalo, a gaum, a sha, an arak, a macer, a macs, a snub, a stirps, a deffer, a perp, a naw, a jar, a vain, a ban, a miladi, a rasps, a wallop, a rex, a waney, a rase, a caird, a patrons, a tun, a stob, a japans, a maws, an awa, a con, a spoons, a fir, a stroma, a mil, a keen, a lace, a canon, an allod, a spay, a ten, a kop, a yah, a regal, a modal, a dinar, a madame, a hallahs, a dor, a repel, a reflet, a noma, a gen, a fast, a waggon, a dewan, a recap, a wort, a pin, a sung, a bag, a sos, a tepal, an abash, an ami, a manias, a pees, a diols, a repp, a taxed, a deb, a spans, a reman, a reward, a poons, a smug, a now, a teff, a hakeem, a seem, a rami, a simaruba, a ness, a cadets, a teraohm, a knob, a sewan, a seven, an oot, a skeets, a demur, a lanai, an unassuaged, a stared, a seam, a meed, a naos, a bus, a wad, a xis, a hon, a dan, a snail, a kayak, a looter, a gob, a men, a sorb, an inner, an oxen, an emes, a seder, a reknits, a loid, a saw, a pint, a castes, a stop, a debut, a tows, a tog, a kips, a rep, a cask, a yam, a minareted, a reflow, a laced, a bun, a keels, an ort, an assent, a faun, a reign, a manat, a rakus, a lade, a dater, a minarets, a nay, a fanos, a matt, an arhat, a tram, a tel, a vat, a ba, a pal, a pager, a tarps, a plug, a nod, a rash, a rotas, a ramal, a butane, a galiot, a paws, a pals, a stressed, a derat, an um, a mac, a sri, a mas, a lagan, an ajar, a laid, a big, a lass, a vara, a mail, a sikas, a yawl, an add, a simas, a romano, a kalam, a tannin, a less, a tater, a carer, a pasts, a caseic, a papayas, a salal, a haffits, a snivel, an atlas, a bases, a macaber, a tress, an alerts, a water, an imaginer, a pec, a golf, a pupils, a gnaws, a mos, a la, a baklawa, a gan, a stums, a step, an ikat, a keel, a dot, a loll, a tares, a renamed, a nanas, a sematic, a tap, a zareeba, a mob, an ales, a lox, a tass, a matsah, a ports, a names, a batik, a secs, a fane, a haleru, a lasts, a fats, a valley, a spam, a spools, a saner, an aspis, a hun, a stops, a tots, a tils, a gamely, a put, a las, a vaw, a ranid, a ham, an ataps, a barca, a gig, a strow, a struts, a speed, a draw, an axon, a gaga, a birr, an agnail, a pow, a tuber, a teel, a rip, a tail, a tatar, a hay, a ray, a lall, a barege, a janitor, a caff, an auk, a hah, an areic, a raster, a casks, a basil, a bubal, a donas, a bananas, a lobar, a lades, a cadi, a caw, a tad, an ogam, a janes, a sirup, a cram, a lets, a passus, a nuts, a muton, a cyma, a mix, a madam, a let, a corban, a partan, a melanin, a top, a wot, a bad, a bows, a resod, a nae, a parol, a vastness, an anime, a hares, a rax, a lam, a tabbis, a bins, a liar, a straw, a pant, a cares, a papa, a res, a sass, a mattes, a nopal, a daff, a yapon, a toped, an ivy, a maven, a ganef, a sovran, a gorp, a divan, a dexes, a moot, a prat, a nips, a redips, a mood, a sabir, a zarebas, a shales, a nett, a papal, a japan, a swots, a seral, a sip, a tarots, a path, a baseness, a callan, a drib, a nairu, a salep, a last, a babuls, a kiths, an orc, a daps, a galoot, a stink, a sail, an ameer, a haslets, a pats, a medal, a ganof, a dim, an atoll, a hall, a feral, a robalos, a sin, a radioed, a cycases, a kuna, an as, an arepa, a ki, a haps, an asks, a mair, a saloops, a vug, a stang, a tophs, a dak, a yay, a serac, a vav, a lap, a gad, a laden, a manioc, a fin, a dub, a loons, a teemer, a sris, a soar, a matador, a saros, a vavassor, a kayos, a setal, a maid, a waw, a palet, a rayon, a solon, a dine, a rumaki, a hae, a gaff, a lipases, a ladder, a stub, an akees, a seller, a manats, a barytas, a lied, a nona, a mars, an ait, a suber, a cimex, a tau, a vail, a grana, a pan, a sums, a ramadas, a rood, a teels, a muts, a torr, a paeon, a sow, an alfa, a rob, a mures, a toper, a rerun, a mall, a canoe, an apart, a seer, a saps, a wait, a laxer, a cats, a manor, a baton, a tag, a strown, a fax, a wab, a jauk, a racial, a lev, a gasser, a careens, a swat, a rein, a wanigan, a tenner, a flora, a rolf, a rennet, an agin, a wanier, a taws, a sneer, a caress, a gavel, a laic, a raku, a jab, a wax, a fanworts, a gat, a not, a baron, a mast, a carex, a lati, a wasp, a sarees, a trap, an aeon, a call, a manurer, a repot, a serum, a bora, a flan, a wos, an oe, a parrot, a stum, a sleet, a door, a sad, a marasmus, a napa, an argali, a vau, a taxemic, a rebus, a ti, an asrama, an on, a deil, a satyr, a bast, an amarelles, a seek, an abuts, a redd, a lases, a pilaff, a gae, a haik, a muraenid, a nolos, an oy, a ratel, a paw, a wadi, a malates, a soy, a kaross, a vavasor, a sarod, a tamaraos, a sirs, a remeet, a snool, a bud, an if, a coin, a maned, a lad, a gap, a lav, a vac, a resay, a yak, a dashpot, a gnats, a guv, a spool, a sari, a masks, an asp, a haika, a per, an asana, an ukases, a cycadeoid, a ranis, a sol, a boral, a refall, a hallot, an amid, a fon, a gal, a demast, a pastels, a hareem, an alias, a knits, a tool, a gasp, a dacron, a shtik, a slub, a bats, a lapel, a saurian, a bird, an all, a cassenes, a baht, a pastor, a tapis, a lares, a stows, a nap, a jalap, a patten, a selahs, a saber, a zaribas, a doom, a spider, a spin, a tarp, a toom, a sexed, an avid, a prog, an arvos, a fen, a ganev, a mayvin, a depot, an op, a yaff, a dalapon, a sett, a massas, a sera, a pap, a ser, a catnap, a warts, a rail, a snib, a sibb, a tam, a lax, a raser, a haemin, an assents, a valor, a paean, a doser, a swob, a dab, a tow, a pot, an in, a leman, a trapan, a brocatel, a mad, a maxima, a myc, a notum, a stun, a suss, a pastel, a marc, a puris, a sen, a jam, a gonad, a taw, a caid, a cased, a lar, a bolas, an anabas, a nodal, a bubalis, a basks, a carets, a racier, a nah, a haku, a naff, a carotin, a jaeger, a ball, a lay, a rayah, a rat, a tali, a tapir, a leet, a rebut, a wop, a liang, an arriba, a gag, an ox, an award, a deeps, a sturts, a worts, a giga, a crab, a spat, an amah, a din, a raw, a vasal, a tup, a ylem, a gaslit, a stot, a spots, an uh, a sips, an arenas, a sloops, a maps, a yell, a vast, a fasts, a laurel, a haen, a fasces, a kit, a baseman, a strop, a hast, a mass, a taxol, a sel, an aboma, a beer, a zap, a tacit, a mesas, an anadem, an eraser, a tallol, a tod, a leek, a takin, a pets, a smuts, an aga, a walk, a baal, a som, a swang, a slipup, a flog, a cep, a renig, a minaret, a wastrel, an assert, a reb, a camases, a basalt, an alevins, a stiff, a halalas, a say, a papacies, a casts, a parer, a caret, a tassel, an inn, a tamal, a kaon, a moras, a misadd, an alway, a sakis, a li, a maar, a vassal, a gib, a dial, a raj, a nan, a galas, a mairs, a cam, a mun, a tared, a desserts, a slap, a swap, a toil, a gaen, a tubal, a maras, a torahs, a radon, a gulp, a sprat, a reg, a palapa, a bat, a valet, a mart, a tahr, an att, a mason, a fay, an aster, an imaret, a daedal, a suk, a ratan, a mangier, a nu, a fatness, a natron, a sleek, a nub, a decal, a wolfer, a deter, an imam, a yaks, a caper, a spik, a got, a swot, a tubed, a pots, a sets, a catnip, a was, a diol, a stinker, a redes, a semen, an exon, a rennin, a bros, an em, a bog, a retool, a kay, a kalians, an ad, a noh, a six, a daw, a sub, a so, an adeem, a maes, a derats, a degauss, an unai, an alarumed, a steeks, a toon, a neves, an awes, a bonk, a mho, a retasted, a cassena, a bur, a misaim, a ramees, a meek, a haffet, a won, a gums, a snoop, a drawer, a namer, a snaps, a bed, a dex, a tapper, a sloid, a seep, a sain, a maim, an ahs, a banal, a petasos, a gab, a gnus, a nip, a trow, a pacer, an awed, a nogg, a wats, a fanega, a mon, a telfer, a leper, a rod, a shall, a haem, a damar, a nidal, a domal, a gerah, a yapok, a net, a yaps, a doll, an anon, a caecal, an eek, a lima, a morts, a rif, a snoops, an oca, a wan, a swam, a snap, a jabots, a nut, a snort, a padri, a caesar, a yen, a waxer, a poll, a wasps, a raid, a liman, a bani, a var, a jawan, a prep, a reffed, a sprits, a buns, a scam, a rec, a makar, an aahs, a mu, a gaol, a boras, a malar, a fan, a rubati, a bass, an asdic, a repacified, a liven, a bos, a retem, an at, a lever, a fled, a den, a varoom, a deli, a hade, a hahs, a pus, a sot, a jaw, a spiral, a baser, a wat, a volar, an osar, a halal, a vanadic, a nonas, a punas, a nival, a sabatons, a dig, a nets, a haslet, a rani, a rap, a based, a staig, a rased, a haets, a walloper, a depots, a notes, a rid, an aal, a taps, a rabats, a pam, a bima, a van, a swop, a muras, an abut, a raga, an idem, a sleets, a rattan, a mils, a deer, a dap, a sag, an ash, a tolan, a whaps, a haet, a bolar, a stases, a ramets, a mus, a dups, a drab, a teg, a minima, a zaffir, an ar, a snit, a nametag, a laud, a snaw, a jalop, a faced, a dewans, a leg, a sim, a tarot, a toras, a gall, a parang, a mucin, a patellas, a feeb, a ret, a laded, a jacal, a bis, a nil, a pig, a spiks, a dorp, a secret, a yeh, a dray, a smut, a dalles, a stoper, a spar, a tip, a yap, a deni, a terais, a mural, a hop, a duel, a sang, a salaam, a lunar, a decagons, a sal, an asps, a hamals, a sap, a strips, a dev, a wared, a lased, a venae, a kayles, a baled, a retros, a grub, a shaman, a yaw, an urase, a gasless, a tabu, a debases, a caravan, a deifier, a flow, a parts, a darb, a tips, a hales, a rag, an ayahs, a saggar, a fool, a strops, a dey, a fag, a haboob, a till, a tael, a serai, a ratal, a spaz, a sop, an axal, a gats, a papaw, a def, a tops, a spins, a son, a repaid, a teed, a sat, a kab, a fader, a rev, an axel, a tressed, a tinker, a boy, a noil, a sit, a satsuma, a doc, a sane, a cardamum, a naps, a nota, a genes, a stool, a gayals, a dell, a casas, a masts, a babu, a darer, a yamun, a faecal, a pastina, an er, an alit, a toll, an asci, a laser, a maser, a nasal, a taj, a haffit, a leer, a nit, a tsadi, an ani, a cabala, a torcs, a casa, a mag, a tenon, an arb, an air, a soras, a monadic, a repasts, a waifed, a skees, a star, a seg, a pipal, a citadels, a sned, a tarok, a dip, a rasper, a redip, a raitas, a sputa, a ne, a faux, a tell, a cadet, a pastils, a habus, a slab, a caroli, a terai, an aah, a peen, a mool, a peek, a timer, a sway, a ref, a nail, a kana, a nag, a waded, a haemal, a tapa, a malaperts, a seel, a retag, a ram, a tapaderos, an ires, a steek, a brag, a pun, a set, a cavallas, a smart, a hen, a rot, a velamina, a rub, a dew, a datum, a tang, a drail, a sec, a tamer, a hawala, an abaca, a pina, an ama, a taka, a ruana, a raita, an oda, a tuna, a lota, a gleba, a loca, a leva, a rota, a lunulae, an imarets, a mat, a canal - Panama!http://angrystatistician.blogspot.com/2013/08/a-man-plan-cat-canal-panama.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-3974033514939034397Sun, 11 Aug 2013 06:03:00 +00002013-08-11T15:34:21.222-07:00The Good, the Bad and the Weird: Duels, Truels and Utility FunctionsIn the excellent (and highly recommended) book <a href="http://www.amazon.com/Challenging-Problems-Probability-Solutions-Mathematics/dp/0486653552" target="_blank">"Fifty Challenging Problems in Probability with Solution"</a>, Frederick Mosteller poses "The Three-Cornered Duel":<br /><blockquote class="tr_bq">A, B and C are to fight a three-cornered pistol duel. All know that A's chance of hitting his target is 0.3, C's is 0.5, and B never misses. They are to fire at their choice of target in succession in the order A, B, C, cyclically (but a hit man loses further turns and is no longer shot at) until only one man is left unit. What should A's strategy be?</blockquote>This is problem 20 in Mosteller's book, and it also appears (with an almost identical solution) in Larsen & Marx <a href="http://www.amazon.com/Introduction-Mathematical-Statistics-Its-Applications/dp/0321693949" target="_blank">"An Introduction to Probability and Its Applications"</a>.<br /><br />Mosteller's solution:<br /><div><blockquote class="tr_bq">A is naturally not feeling cheery about this enterprise. Having the first shot he sees that, if he hits C, B will then surely hit him, and so he is not going to shoot at C. If he shoots at B and misses him, then B clearly shoots the more dangerous C first, and A gets one shot at B with probability 0.3 of succeeding. If he misses this time, the less said the better. On the other hand, suppose A hits B. Then C and A shoot alternately until one hits. A's chance of winning is \( (.5)(.3) + (.5)^2(.7)(.3) + (.5)^3(.7)^2(.3) + \ldots\) . Each term corresponds to a sequence of misses by both C and A ending with a final hit by A. Summing the geometric series we get ... \(3/13 < 3/10\). Thus hitting B and finishing off with C has less probability of winning for A than just missing the first shot. So A fires his first shot into the ground and then tries to hit B with his next shot. C is out of luck.</blockquote>Is this right? What if B were to follow A's example and fire into the ground? What if all three were to keep firing into the ground? That this type of an outcome isn't unreasonable for certain sets of shot accuracy probabilities can be illustrated by considering the case where A's accuracy is 0.98, B's accuracy is 1.0 and C's accuracy is 0.99. Mosteller's argument is equally applicable in this case, but if B shoots C after A deliberately misses he'll be shot by A with probability 0.98. Is that reasonable?<br /><br />Under the assumption that deliberately missing is allowed, there are \(2^3-1=7\) possible outcomes - each participant can be shot or not shot, and there must be at least one participant not shot. The lack of clarity for what the ideal strategies are for A, B and C in the general case arises from the utility of 2- or 3-way ties to each of the participants being undefined.<br /><br />In the next article I'll analyze 2-way duels where deliberate missing is allowed by using such fully-defined utility functions. These results will be used in a third article on 3-way duels (truels); in particular, I'll re-examine Mosteller's solution.</div>http://angrystatistician.blogspot.com/2013/08/the-good-bad-and-weird-duels-truels-and.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-2191741530849988465Fri, 09 Aug 2013 08:23:00 +00002013-08-10T01:21:01.004-07:00Solving a Bar Bet Using at Most Three Different OperationsA bar bet as presented in the YouTube video <a href="http://www.youtube.com/watch?v=DoRB7FL02t4" target="_blank">The HARDEST Puzzle Yet!</a> involves starting with three of the same number from 0 through 9, then adding mathematical operations that result in an evaluation of 6. For example, if we start with three of the number 6, one solution could be \[ 6+6-6=6 .\] I'll demonstrate a method for solving this bar bet puzzle starting with three of any number, say \( N \), which involves using at most three different mathematical operations (although some of these may be used many, many times).<br /><br />If \( 0 \leq N \leq 2 \) we have the solutions<br />\begin{eqnarray}<br />(0! + 0! + 0!)! &= 6 \\<br />(1! + 1! + 1!)! &= 6 \\<br />2+2+2 &= 6.<br />\end{eqnarray} If \( N\geq 3\), concatenate the three numbers. Repeatedly applying the square-root operation we'll eventually end up with a result \( x \) with \( 3 \leq x < 9\). If we now take the greatest integer \(\lfloor x \rfloor\) we have an integer \( n \) with \( 3 \leq n \leq 8 \). If we can exhibit solutions for each of these cases that use only square-roots, greatest integers and one other operation, we'll be done. Using factorial for the third operation, some possibilities are<br />\begin{eqnarray}<br />\href{http://www.wolframalpha.com/input/?i=3%21}{3!} &= 6\\<br />\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%284%21%29%21%7D%7D%7D%7D%7D+%5Cright%5Crfloor%21%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\left\lfloor \sqrt{\sqrt{\sqrt{\sqrt{\sqrt{(4!)!}}}}} \right\rfloor!}} \right\rfloor !} &= 6\\<br />\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B5%21%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{5!}} \right\rfloor !} &= 6\\<br />\href{http://www.wolframalpha.com/input/?i=6}{6} &= 6 \\<br />\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B7%21%7D%7D%5Cright%5Crfloor+%21%7D%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\sqrt{\left\lfloor \sqrt{\sqrt{7!}}\right\rfloor !}}} \right\rfloor !} &= 6\\<br />\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B8%21%7D%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\sqrt{8!}}} \right\rfloor !} &= 6.<br />\end{eqnarray}<br />I've added Wolfram Alpha links so you can verify that these do indeed evaluate to 6.<br /><br />As an illustrative example, when \(N=1337\) we have the solution \[<br />\href{http://www.wolframalpha.com/input/?i=%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%28++%5Cleft%5Clfloor+%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B%5Csqrt%7B133713371337%7D%7D%7D%7D+%5Cright%5Crfloor+%21%29%21%7D%7D%7D%7D%7D+%5Cright%5Crfloor%21%7D%7D+%5Cright%5Crfloor+%21}{\left\lfloor \sqrt{\sqrt{\left\lfloor \sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\left( \left\lfloor \sqrt{\sqrt{\sqrt{\sqrt{133713371337}}}} \right\rfloor !\right)!}}}}} \right\rfloor!}} \right\rfloor !} = 6.\]http://angrystatistician.blogspot.com/2013/08/solving-bar-bet-using-at-most-three.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-848226778682792744Sat, 27 Jul 2013 20:25:00 +00002014-10-09T00:40:45.830-07:00Solving Sir Arthur Eddington's Zoo Puzzle using Dirac Matrices<a href="http://en.wikipedia.org/wiki/Arthur_Eddington" target="_blank">Sir Arthur Eddington</a> posed this difficult logic puzzle to the readers of Caliban's (<a href="http://en.wikipedia.org/wiki/Hubert_Phillips" target="_blank">Hubert Phillips</a>) newspaper puzzle columns, and famously (or infamously) provided a solution using <a href="http://mathworld.wolfram.com/DiracMatrices.html" target="_blank">Dirac matrices</a>.<br /><br /><b>Sir Eddington's zoo puzzle: </b><br /><div><br /></div>I took some nephews and nieces to the Zoo, and we halted at a cage marked<br /><ol><li>Tovus Slithius, male and female. </li><li>Beregovus Mimsius, male and female. </li><li>Rathus Momus, male and female. </li><li>Jabberwockius Vulgaris, male and female. </li></ol>The eight animals were asleep in a row, and the children began to guess which was which. "That one at the end is Mr Tove." "No, no! It's Mrs Jabberwock," and so on. I suggested that they should each write down the names in order from left to right, and offered a prize to the one who got most names right.<br /><br />As the four species were easily distinguished, no mistake would arise in pairing the animals; naturally a child who identified one animal as Mr Tove identified the other animal of the same species as Mrs Tove.<br /><br />The keeper, who consented to judge the lists, scrutinised them carefully. "Here's a queer thing. I take two of the lists, say, John's and Mary's. The animal which John supposes to be the animal which Mary supposes to be Mr Tove is the animal which Mary supposes to be the animal which John supposes to be Mrs Tove. It is just the same for every pair of lists, and for all four species.<br /><br />"Curiouser and curiouser! Each boy supposes Mr Tove to be the animal which he supposes to be Mr Tove; but each girl supposes Mr Tove to be the animal which she supposes to be Mrs Tove. And similarly for the other animals. I mean, for instance, that the animal Mary calls Mr Tove is really Mrs Rathe, but the animal she calls Mrs Rathe is really Mrs Tove."<br /><br />"It seems a little involved," I said, "but I suppose it is a remarkable coincidence."<br /><br />"Very remarkable," replied Mr Dodgson (whom I had supposed to be the keeper) "and it could not have happened if you had brought any more children."<br /><br />How many nephews and nieces were there? Was the winner a boy or a girl? And how many names did the winner get right?<br /><div><br /><b>Solution:</b><br /><br /> Note that every child permutes the species, and we may indicate these as \(4\times 4\) permutation matrices, where the row number indicates the actual species numbers and the column number indicates the species number guessed by the child. We can use a little trick to include gender. Consider the species permutation matrix for a given child, but use a -1 in the entry if the child additionally switches gender, too.<br /><br />Here's an illustrative example. Assume a child transposes species 1 (Tovus Slithius) and species 3 (Rathus Momus), but doesn't switch genders; and transposes species 2 (Beregovus Mimsius) and species 4 (Jabberwockius Vulgaris), but in this case also switches genders. We'd represent this child's guesses with the <a href="http://en.wikipedia.org/wiki/Generalized_permutation_matrix#Signed_permutation_group" target="_blank">signed permutation matrix</a> \[ A = \left( \begin{array}{cccc} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{array} \right).\] Let \( A_i \) be the guess matrix for child \(i\). The keeper's first observation implies that these matrices satisfy \( A_i A_j = - A_j A_i \) for all \( i\neq j \). If the child is a boy, the keeper's second observation implies that \( A_i A_i = I\), where \( I\) is the \( 4\times 4\) identity matrix; if the child is a girl, the keeper's second observation implies that \( A_i A_i = -I\). Also note that this observation implies children either get species right, or transpose two species. Thus, the species permutations can be written as the product of disjoint transpositions.<br /><br />From the second observation, any child that gets the species right for an animal must be a boy. Furthermore, because the species guesses are the product of disjoint transpositions, every child must get an even number of species correct.<br /><br />If a boy had gotten all the species and genders right, his matrix would commute with the matrix of any other child, but we know there is at least one boy and at least one girl. Thus no child could have gotten all of the animals correct. Furthermore, if any boy gets a species right, no other child could also get that species right, otherwise this would violate the keeper's first observation.<br /><br />We could continue along these lines and work out the answer, but let's use Eddington's sledgehammer - the keeper's observations imply that these matrices will form a set of anticommuting <a href="http://mathworld.wolfram.com/DiracMatrices.html" target="_blank">Dirac matrices</a> if we multiply the girl's matrices by the imaginary unit \( i\). Such sets have size at most five; sets of size five have three boys and two girls, and the unique winner is a boy with four animals correct.<br /><br />Here's one possible set. Note that I've multiplied two of the matrices given on the Wolfram site by the imaginary unit \( i \) to convert them to signed permutation matrices. As I've mentioned, this doesn't affect the anticommutativity, but it does change the square of the matrix from \( I\) to \( -I\); thus, Dirac matrices with imaginary entries correspond to girls and Dirac matrices with real entries correspond to boys. \begin{eqnarray}<br />A_1 &= \left( \begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{array} \right)\\<br />A_2 &= \left( \begin{array}{cccc} 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0<br />\end{array} \right)\\<br />A_3 &= \left( \begin{array}{cccc} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0<br />\end{array} \right)\\<br />A_4 &= \left( \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1<br />\end{array} \right)\\<br />A_5 &= \left( \begin{array}{cccc} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0<br />\end{array} \right)\\<br /> \end{eqnarray} The boys are \( A_1, A_3, A_4\) and the girls are \( A_2, A_5\). The unique winner was \(A_4\) with four animals correct (two correct species with correct gender, so four animals).</div>http://angrystatistician.blogspot.com/2013/07/solving-sir-arthur-eddingtons-zoo.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-4237471736186941582Fri, 26 Jul 2013 00:09:00 +00002013-07-25T19:28:23.450-07:00Solving the TopSpin Puzzle using GAPTopSpin is an oval-track permutation puzzle that was made by Binary Arts; similar puzzles are made and sold by other manufacturers. Here's the <a href="http://www.amazon.com/Binary-Arts-Top-Spin-Top-Spin/dp/B001SDX244/" target="_blank">Binary Arts TopSpin</a>.<br /><br />It's not a difficult puzzle to solve if you play around with it for a few hours and figure out how to generate various permutations. It's more interesting (and difficult) if you observe that the turntable has a distinguishable top and bottom. This suggests an interesting question - can you invert the turntable while keeping the numbers in the track in the same order?<br /><br />The answer is, perhaps surprisingly, yes. Here's one way to find a sequence of operations that produces precisely this outcome.<br /><br /><a href="http://en.wikipedia.org/wiki/GAP_(computer_algebra_system)" target="_blank">GAP (Groups, Algorithms and Programming)</a> is a freely available programming language that specializes in computational group theory, and it's perfect for solving permutation puzzles. Here's my GAP code for TopSpin. Label the top of the turntable with 21 and the bottom with 22. Flipping the turntable generates the permutation \[(1,4)(2,3)(21,22);\] rotating the oval to the left generates the permutation \[ (2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,1). \] Denoting these two permutations as \(x\) and \(y\), we can now simply ask GAP to find a sequence of operations on the free group generated by \(x,y\) that results in the pre-images \((1,2)\), flipping two adjacent numbers on the track, leaving everything else the same (including turntable parity); and \((21,22)\), flipping turntable parity, leaving everything else the same. This corresponds to the operation of flipping the turntable while keeping the order of the numbers in the track the same.<br /><br /><script src="https://gist.github.com/octonion/6084770.js"></script><br />Running the code, we get these lovely results:<br /><blockquote class="tr_bq">(1,2) = y*x^-1*y^-1*x^-1*y*x^-1*y*x^-1*y^-1*x^-1*y^-1*x^-1*y^2*x^-1*y^-1*x^-1*y^-1*x^-1*y^4*x^-1*y^-1*x^-1*y^2*x^-1*y^-2*x^-1*y*x^-1*y^2*x^-1*y^-3*x^-1*y^-3*x*y^-4*x*y*x*y^-1*x*y^4*x^-1*y^-5*x^-1*y^-1*x^-1*y^5*x^-1*y^-6*x^-1*y^2*x^-1*y^-1*x^-1*y^5*x*y*x*y*x*y*x*y*x*y^5*x*y*x*y^-1*x*y^-5*x^-1*y^-1*x^-1*y^-1*x^-1*y^-2*x*y*x*y^-1*x*y^4*x*y^-1*x*y^3*x^-1*y^-1*x^-1*y^6*x^-1*y^-1*x^-2*y^-4*x^-3*y^-3*x^-2*y^-1*x^-1*y*x^-1*y^2*x^-1*y^-2*x^-1*y^-1*x^-1*y*x*y*x^-1*y*x^-1*y^-1*x^-1*y^-2 </blockquote><blockquote class="tr_bq">(21,22) = y*x^-1*y^2*x*y^-1*x*y*x*y^-1*x*y^-2*x*y^2*x*y^-1*x*y*x*y*x*y^-2*x^-1*y*x^-1*y^-2*x^-1</blockquote>Since there are sequences of operations that allow us to flip any two adjacent numbers or the parity of the turntable, it follows that all possible configurations are both solvable and achievable.<br />http://angrystatistician.blogspot.com/2013/07/solving-topspin-puzzle-using-gap.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-5335926023005868274Wed, 10 Jul 2013 07:13:00 +00002013-07-10T00:16:09.399-07:00A Slightly Less Pointless Solution to Le Monde puzzle #824Here's <b><a href="http://xianblog.wordpress.com/2013/06/13/le-monde-puzzle-824/" target="_blank">Le Monde puzzle #824</a>:</b><br /><br />Show that, for any integer \(y\), \[(\sqrt{3}-1)^{2y}+(\sqrt{3}+1)^{2y}\] is an integer multiple of a power of two.<br /><br /><b>Solution:</b><br /><b><br /></b>Consider \[f(n) = (-1+\sqrt{3})^{n}+(-1-\sqrt{3})^{n}\] and observe that the two bases are the roots of the quadratic \(x^2 + 2x - 2\), hence \( f(n) \) obeys the recursion \( x_{n+2} = -2 x_{n+1} + 2 x_n \) with \( x_0=2\) and \(x_1=-2 \). It follows that \( f(n) \) is always an even integer.http://angrystatistician.blogspot.com/2013/07/a-slightly-less-pointless-solution-to.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-2326554146127308818Sat, 29 Jun 2013 22:05:00 +00002013-06-29T15:05:59.179-07:00Asteroids Algebra<br /><b>Problem:</b><br /><br />Waldo is playing Asteroids. This game starts with 3 ships and you earn an extra ship for every 10000 points you score. At the end of his first game, Waldo noticed that he had the lowest score possible while averaging exactly 9000 points per ship. At the end of his second game, Waldo noticed that he had the highest score possible while averaging exactly 9000 points per ship. What were his two scores?<br /><br /><b>Solution:</b><br /><br />Let the number of starting ships be \(S\), points for an extra ship be \(X\) and the player's average scoring rate be \(A\). Furthermore, let \(R = A/X\) and the number of bonus ships earned by the player during the game be \(B\). The number of bonus ships earned by the player during his game is the floor of his final score over the bonus value \(X\). Algebraically, we have \[ B = \lfloor (S+B)\cdot R \rfloor.\] Now let \[ (S+B)\cdot R = \lfloor (S+B)\cdot R \rfloor + f,\] where \( 0 \leq f < 1 \). This gives us<br />\begin{aligned}<br />(S+B)\cdot R &= B + f ,\\<br />SR + B(R-1) &= f,\\<br />B &= (SR-f)/(1-R).<br />\end{aligned}<br />Together with \(0 \leq f < 1\) we get that \[\frac{SR-1}{1-R} < B \leq \frac{SR}{1-R}.\] Thus, the smallest possible number of bonus ships is \[ \left\lfloor \frac{SR-1}{1-R} + 1 \right\rfloor = \left\lfloor \frac{SR-R}{1-R}\right\rfloor \] and the greatest possible number of bonus ships is \[ \left\lfloor \frac{SR}{1-R}\right\rfloor. \] It follows that the lowest possible score while averaging \(A\) is \[ A\cdot \left\lfloor \frac{SR-R}{1-R} + S \right\rfloor = A\cdot \left\lfloor \frac{S-R}{1-R}\right\rfloor \] and the highest possible score while averaging \(A\) is \[ A\cdot \left\lfloor \frac{SR}{1-R} + S\right\rfloor = A\cdot \left\lfloor \frac{S}{1-R}\right\rfloor. \]<br />For Waldo, \(S=3\), \(A=9000\) and \(R=9000/10000 = 0.9\). Waldo therefore scored<br />\[ 9000\cdot \left\lfloor \frac{3-0.9}{1-0.9} \right\rfloor = 189000\] points in his first game, \[ 9000\cdot \left\lfloor \frac{3}{1-0.9} \right\rfloor = 270000\] points in his second game.<br /><br />http://angrystatistician.blogspot.com/2013/06/asteroids-algebra.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-5111765361407049258Wed, 26 Jun 2013 06:31:00 +00002013-06-25T23:35:16.273-07:00Interview Doubler<br /><b>Problem:</b><br /><br />Find the smallest positive number that doubles when you move the last digit to the front.<br /><br /><b>Solution:</b><br /><br />The answer is \( 105263157894736842 \), and the solution is similar to <a href="http://angrystatistician.blogspot.com/2013/06/twisty-temperature.html" target="_blank">Twisty Temperature</a>. Let this number be \[ x = x_{n}\cdot 10^{n-1} + ... + x_{1}\cdot 10^1 + x_{0} \] with \( 0 < x_{n} < 10 \), then after moving the last digit to the front we get \[ y = x_{0}\cdot 10^{n-1} + (x-x_{0})/10. \] We also have that \( y = 2 x \), so our equation becomes \[ 19\cdot x = x_0 \cdot (10^{n}-1). \] Now 10 is a <a href="http://en.wikipedia.org/wiki/Primitive_root_modulo_n" target="_blank">primitive root</a> modulo 19, and so it follows that \( x \) is integral if and only if \( n \) is of the form \( 18\cdot m \). Also note that we need \( 2 \le x_0 \le 9 \) since we require \( 0< x_{n} < 10\). When \( m=1 \) and \( x_0 = 2\) we get \( x =105263157894736842 \); when \( m=2\) and \(x_0 = 2\) we get \[ 2 (10^{36}-1)/19 = 105263157894736842105263157894736842.\] That this number indeed doubles when you move the last digit to the front can be verified by <a href="http://www.wolframalpha.com/input/?i=105263157894736842105263157894736842*2" target="_blank">WolframAlpha</a>.http://angrystatistician.blogspot.com/2013/06/interview-doubler.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-9183522418970664436Tue, 25 Jun 2013 06:51:00 +00002013-06-24T23:51:11.169-07:00Twisty Temperature<br /><b>Problem:</b><br /><br />When Waldo recently did a conversion of a positive integral Celsius temperature \( c = 275 \) to its Fahrenheit equivalent \( f \) (which turned out to be \( 527\) ), he noticed to his amazement that he could have simply moved the last digit of \( c \) to the front to obtain \( f \). Doing some intense calculations he failed to discover the next largest such example. Does one exist, and if so, what is it?<br /><br /><b>Solution:</b><br /><br />Let \( c = x_{n}\cdot 10^{n-1} + ... + x_{1}\cdot 10^1 + x_{0} \) with \( x_{n} > 0 \), then \[ f = x_{0}\cdot 10^{n-1} + (c-x_{0})/10. \] We also have that \[ f = (9/5)\cdot c + 32. \] Notice that in order for \( f \) to be integral \( c \) must be divisible by 5; this implies that \( x_0=5 \) since it cannot equal 0 (since as a number \( f>c \)). Our equation then becomes \[ (9/5)\cdot c + 32 = 5\cdot 10^{n-1} + (c-5)/10 \] implies \[ c = 5\cdot (10^n - 65)/17. \] Now it turns out that 10 is a <a href="http://en.wikipedia.org/wiki/Primitive_root_modulo_n" target="_blank"><b>primitive root</b></a> modulo 17, and so it follows that \( c \) is integral if and only if \( n \) is of the form \( 16\cdot m + 3 \). When \( m=0\) we get \( c=275\); when \( m=1\) we get the next highest such temperature, which is \[ 5\cdot (10^{19}-65)/17 = 2941176470588235275.\]<br />http://angrystatistician.blogspot.com/2013/06/twisty-temperature.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-7078318996494822863Wed, 10 Apr 2013 20:31:00 +00002013-04-10T13:48:24.313-07:00Lunchtime Sports Science: Introducing tanh5As I mentioned in a <a href="http://angrystatistician.blogspot.com/2013/03/baseball-chess-psychology-and.html">previous article on ratings systems</a>, the <a href="http://en.wikipedia.org/wiki/Log5"><b>log5</b></a> estimate for participant 1 beating participant 2 given respective success probabilities \( p_1, p_2 \) is<br /><div>\begin{align}<br />p &= \frac{p_1 q_2}{p_1 q_2+q_1 p_2}\\<br />&= \frac{p_1/q_1}{p_1/q_1+p_2/q_2}\\<br />\frac{p}{q} &= \frac{p_1}{q_1} \cdot \frac{q_2}{p_2}<br />\end{align} where \( q_1=1-p_1, q_2=1-p_2, q=1-p \).<br /><br /><div>Where does this come from? Assume that both participants each played average opposition. In a <a href="http://en.wikipedia.org/wiki/Pairwise_comparison"><b>Bradley-Terry</b></a> setting, this means</div>\begin{align}<br />p_1 &= \frac{R_1}{R_1 + 1}\\<br />p_2 &= \frac{R_2}{R_2 + 1},<br />\end{align} where \( R_1 \) and \( R_2 \) are the (latent) Bradley-Terry ratings; the \( 1 \) in the denominators is an estimate for the average rating of the participants they've played en route to achieving their respective success probabilities.</div><div><div><br /></div><div>In a Bradley-Terry setting, it's true that the product of the ratings in the entire pool is taken to equal 1. But participants don't play themselves! Thus, if participant 1 played every participant but itself, the average opponent would have rating \( R \), where \( R_1 \cdot R^{n-1} = 1 \). Here \( n \) is the number of participants in the pool.<br /><br />Our strength estimate for the average opponent faced is then</div><div>\begin{align}<br />R &= {R_1}^{-\frac{1}{n-1}}.<br />\end{align} </div><div>There are two extreme cases. If \( n=2 \), then \( R = \frac{1}{R_1} \); as \( n \to +\infty \), \( R \to 1 \).</div><div><br />The limit extreme case is <b>log5</b>; the \(n=2\) extreme case I call <b>tanh5</b>. We compute</div><div>\begin{align}<br />p_1 &= \frac{R_1}{R_1 + 1/R_1}\\<br />p_2 &= \frac{R_2}{R_2 + 1/R_2}\\<br />\textrm{tanh5} = p &= \frac{ \sqrt{p_1 q_2} }{ \sqrt{p_1 q_2} + \sqrt{q_1 p_2} }.<br />\end{align} </div><div>Why <b>tanh5</b>? We can think of <b>log5</b> as derived from the <a href="http://en.wikipedia.org/wiki/Logistic_function" target="_blank"><b>logistic function</b></a> by setting \( \log(R)=0 \) for the opponent's rating; analogously, <b>tanh5</b> is derived from the <a href="http://en.wikipedia.org/wiki/Hyperbolic_function" target="_blank"><b>hyperbolic tangent function</b></a> by setting \( \log(R)=0 \) for the opponent's rating.</div><div><br /></div><div>Note that we have a spectrum of estimates corresponding to each value for \(n\), but these are the two extremes. This also gives a new spectrum of <a href="http://en.wikipedia.org/wiki/Activation_function">activation functions</a> for <a href="http://en.wikipedia.org/wiki/Neural_network">neural networks</a>, but I'll explore this application later.</div></div>http://angrystatistician.blogspot.com/2013/04/lunchtime-sports-science-introducing.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-7002778674416622507Wed, 03 Apr 2013 22:06:00 +00002013-04-03T15:18:27.757-07:00Lunchtime Sports Science: Fitting a Bradley-Terry ModelPower rankings are game rankings that also allow you to estimate the likely outcome if two opponents were to face each other. One of the simplest of these models is known as the <a href="http://en.wikipedia.org/wiki/Pairwise_comparison" target="_blank"><b>Bradley-Terry-Luce</b></a> model (or commonly, Bradley-Terry). The idea is that each player \( i \) is assumed to have an unknown rating \( R_i \). If players \( i \) and \( j \) compete, the probability that \( i \) wins under this model is expected to be about \[ \frac{R_i}{R_i + R_j}. \] This model is very popular for hockey and other games; one commonly seen version is called <a href="http://www.collegehockeynews.com/info/?d=krach" target="_blank"><b>KRACH</b></a>.<br /><br />Let's fit a Bradley-Terry model to the current season of NCAA D1 men's hockey. The <a href="http://www.ncaa.com/sports/icehockey-men/d1" target="_blank"><b>Frozen Four</b></a> starts on Thursday, April 11, so you'll get to see how well your predictions do.<br /><br />You'll need to have R installed. Once R is installed, install the "BradleyTerry2" package that's freely available for R (thanks to Heather Turner and David Firth). To do this, start R and run the following command; you'll have to pick a source.<br /><pre class="code_r">install.packages("BradleyTerry2")</pre>Next, download two files from my hockey GitHub - R code that fits a basic Bradley-Terry model and a data file containing the NCAA D1 men's hockey game results going back to 1998.<br /><br /><a href="https://github.com/octonion/hockey/blob/master/lunchtime/uscho_btl.R">https://github.com/octonion/hockey/blob/master/lunchtime/uscho_btl.R</a><br /><a href="https://github.com/octonion/hockey/blob/master/lunchtime/uscho_games.csv">https://github.com/octonion/hockey/blob/master/lunchtime/uscho_games.csv</a><br /><br />Make sure both files are in the same directory and run the R code. That's it, you've built a power ranking using a Bradley-Terry model. You should get output that looks like this:<br /><br /><span style="font-family: Courier New, Courier, monospace;"> ability s.e.</span><br /><span style="font-family: Courier New, Courier, monospace;">Quinnipiac 1.687042594 0.5678939</span><br /><span style="font-family: Courier New, Courier, monospace;">Massachusetts-Lowell 1.480098569 0.5872701</span><br /><span style="font-family: Courier New, Courier, monospace;">Minnesota 1.428503522 0.5638946</span><br /><span style="font-family: Courier New, Courier, monospace;">Yale 1.115338226 0.5641414</span><br /><span style="font-family: Courier New, Courier, monospace;">Miami 1.114307264 0.5479346</span><br /><span style="font-family: 'Courier New', Courier, monospace;">Notre Dame 1.109912670 0.5523657</span><br /><span style="font-family: 'Courier New', Courier, monospace;">Boston College 1.091836391 0.5815233</span><br /><span style="font-family: 'Courier New', Courier, monospace;">St. Cloud State 1.079314965 0.5573018</span><br /><span style="font-family: Courier New, Courier, monospace;"><br /></span>The order should be the same as <a href="http://www.uscho.com/rankings/krach/d-i-men/" target="_blank"><b>USCHO's KRACH rankings</b></a>.<br /><span style="font-family: inherit;"><br /></span><span style="font-family: inherit;">How do we use these ability estimates to predict game outcomes? These values are the logarithms of the ratings I've mentioned above, so first apply the exponential to get the rating, then the estimated winning probability is the team's rating divided by the sum of the team and opponent ratings. For the teams in the Frozen Four we get a power rating of \( e^{1.687} = 5.40 \) for Quinnipiac and </span>\( e^{1.079} = 2.94 \) for St. Cloud State, so we estimate the probability of Quinnipiac beating St. Cloud State to be about \[ \frac{5.40}{5.40+2.94} = 0.65. \] What's your estimate for Massachusetts-Lowell beating Yale?http://angrystatistician.blogspot.com/2013/04/lunchtime-sports-science-fitting.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-8956268123135262261Mon, 01 Apr 2013 23:03:00 +00002013-04-01T16:03:44.806-07:00Lunchtime Sports Science: Cracking a New SportThis is the first and what will be a series of relatively short pieces on sports analytics. I'll be using a variety of sports for examples, including both team sports and single-player sports, and I'll also make my code available through my <a href="https://github.com/octonion" target="_blank">GitHub account</a>.<br /><br />Here are my recommended tools. If you're unfamiliar with some of these, don't worry. You'll pick them up as you go along, and they form a powerful suite that will keep you on the cutting edge even as a professional data scientist.<br /><ol><li><b>Hardware</b> - Ideally you want at least 4GB of RAM for larger data sets, but you'll be able to do high-level analysis with almost any modern computing hardware.</li><li><b>Linux operating system</b> - You can certainly do top-notch data analysis using any operating system, but Linux is an excellent (and free) working environment. There are a variety of ways to install and use Linux, but I'd recommend <a href="http://www.ubuntu.com/download/desktop/windows-installer" target="_blank"><b>Ubuntu's Windows installer</b></a>. This will allow you to easily install Ubuntu alongside Windows, and it also makes it easy to uninstall Ubuntu later (if you choose). Ubuntu is just one of the (many) Linux distributions available, but it's very popular and well supported.</li><li><b>R programming language</b> - R is a powerful statistical programming language, and it has thousands of available packages available. If you're using Ubuntu, installing R is simple - <a href="http://cran.r-project.org/bin/linux/ubuntu/README" target="_blank"><b>sudo apt-get install r-base</b></a>. That's it!</li><li><b>Python programming language</b> -Python is a powerful and relatively easy to use programming language. One of the most common tasks for sports analytics is <a href="http://en.wikipedia.org/wiki/Web_scraping" target="_blank">web scraping</a>, and <a href="http://www.slideshare.net/maikroeder/overview-of-python-web-scraping-tools" target="_blank">Python is an excellent choice</a> thanks to libraries such as <a href="http://wwwsearch.sourceforge.net/mechanize/" target="_blank">Mechanize</a>, <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/" target="_blank">Beautiful Soup</a> and <a href="http://lxml.de/" target="_blank">lxml</a>. It's also a great language for <a href="http://en.wikipedia.org/wiki/Data_cleansing" target="_blank">data cleansing</a>. Installing Python under Ubuntu - <b>sudo apt-get install python</b>.</li><li><b>PostgreSQL database server</b> - There are many ways to store and analyze data sets, but a dedicated <a href="https://en.wikipedia.org/wiki/Relational_database" target="_blank">relational database</a> server is necessary tool for high-level analytics. <a href="http://www.postgresql.org/" target="_blank">PostgreSQL</a> is my personal recommendation, but there are other good choices (such as <a href="http://www.mysql.com/" target="_blank">MySQL</a>). PostgreSQL is free, fast, powerful and has a <a href="http://en.wikipedia.org/wiki/PostgreSQL#Procedural_languages" target="_blank">huge variety of procedural languages available</a> (including R and Python). Installing PostgreSQL under Ubuntu - <b>sudo apt-get install postgresql-9.2</b>.</li><li><b>GitHub account</b> - Setting up a <a href="https://github.com/" target="_blank"><b>GitHub account</b></a> is free and will allow you to automatically following any changes to various sports analytics GitHub projects (<a href="https://github.com/octonion" target="_blank">such as mine</a>). Later, you can set up your own repositories if you'd like to share your own work with other people. Don't forget to install git under Ubuntu - <b>sudo apt-get install git</b>.</li><li>We'll start with analyzing hockey. If you'd like to take a look at some of my hockey code and data that I've scraped, you can find them in <a href="https://github.com/octonion/hockey" target="_blank">my hockey GitHub repository</a>. If you've set up Ubuntu and have installed git you can execute the command <b>git clone https://github.com/octonion/hockey.git</b> to make a local copy of my repository.</li></ol><div>Here's a basic outline for tackling a new sport.</div><div><ol><li><b>Understand how teams win</b> - Build a model to project the likely outcome for a team when facing a particular opponent. Example - <a href="http://www.collegehockeynews.com/ratings/krach.php" target="_blank">Krach for hockey</a> (which is based on the Bradley-Terry model).</li><li><b>Understand how teams score</b> - Build a model to project how many goals/points/runs teams are likely to score or allow when facing a particular opponent. Exampe - <a href="http://hockeyanalytics.com/2004/09/poisson-toolbox/" target="_blank">Poisson distribution and hockey</a>.</li><li><b>Relate the two</b> - Characterize the relationship between scoring and winning or losing. Example - <a href="http://en.wikipedia.org/wiki/Pythagorean_expectation" target="_blank">Pythagorean win expectation</a>.</li><li><b>Understand how players score/prevent scoring</b> - Determine which aspects of player performance impact team offense and defense and by how much.</li><li><b>Understand player contribution to winning/losing</b> - This is nearly automatic once you understand the relationship between team offense/defense and team winning/losing.</li></ol><div>In my next article we'll build a basic <a href="http://en.wikipedia.org/wiki/Sports_rating_system" target="_blank">power ranking</a> model for hockey to predict likely game outcomes.</div></div>http://angrystatistician.blogspot.com/2013/04/lunchtime-sports-science-cracking-new.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-3725092235959553735Wed, 27 Mar 2013 05:05:00 +00002013-03-26T22:05:11.366-07:00Solving Recurrences with Difference EquationsHere's an example from Robert Sedgewick's <a href="https://class.coursera.org/introACpartI-001/class" target="_blank">course on analytic combinatorics</a>.<br /><br />Solve the recurrence \[ a_n = 3 a_{n−1} − 3 a_{n−2} + a_{n−3} \] for \( n>2 \) with \( a_0=a_1=0 \) and \( a_2=1 \).<br /><br />Let \( f(n) = a_n \); in the language of difference equations the above becomes simply<br />\[ \frac{{\Delta^3} f}{{\Delta n}^3 } = 0 . \] Immediately,<br />\[ f(n) = c_2 n^2 + c_1 n + c_0 . \] Applying the initial conditions we get<br />\[ c_0 = 0, c_2 + c_1 = 0, 4 c_2 + 2 c_1 = 1, \] and so the solution is \( a_n = \frac{1}{2} n^2 - \frac{1}{2} n \).<br /><br />Now what if the initial conditions are changed so \( a_1 = 1 \)?<br /><br />http://angrystatistician.blogspot.com/2013/03/solving-recurrences-with-difference.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-1454188759237578763Thu, 14 Mar 2013 11:10:00 +00002013-03-14T14:57:37.212-07:00Baseball, Chess, Psychology and Pychometrics: Everyone Uses the Same Damn Rating SystemHere's a short summary of the relationship between common models used in baseball, chess, psychology and education. The starting point for examining the connections between various extended models in these areas. The next steps include multiple attempts, guessing, ordinal and multinomial outcomes, uncertainty and volatility, multiple categories and interactions. There are also connections to standard optimization algorithms (neural nets, simulated annealing).<br /><br /><b>Baseball</b><br /><br />Common in baseball and other sports, the <a href="http://en.wikipedia.org/wiki/Log5" target="_blank"><b>log5</b></a> method provides an estimate for the probability \( p \) of participant 1 beating participant 2 given respective success probabilities \( p_1, p_2 \). Also let \( q_* = 1 -p_* \) in the following. The log5 estimate of the outcome is then:<br /><br />\begin{align}<br />p &= \frac{p_1 q_2}{p_1 q_2+q_1 p_2}\\<br /> &= \frac{p_1/q_1}{p_1/q_1+p_2/q_2}\\<br />\frac{p}{q} &= \frac{p_1}{q_1} \cdot \frac{q_2}{p_2}<br />\end{align}<br /><br />The final form uses the <a href="http://en.wikipedia.org/wiki/Odds_ratio" target="_blank"><b>odds ratio</b></a>, \( \frac{p}{q} \). Additional factors can be easily chained using this form to provide more complex estimates. For example, let \( p_e \) be an environmental factor, then:<br /><br />\begin{align}<br />\frac{p}{q} &= \frac{p_1}{q_1} \cdot \frac{q_2}{p_2} \cdot \frac{q_e}{p_e}<br />\end{align}<br /><br /><b>Chess</b><br /><br />The most common rating system in chess is the <a href="http://en.wikipedia.org/wiki/Elo_rating_system" target="_blank"><b>Elo rating system</b></a>. This has also been adopted for various other uses, e.g. ``hot or not'' websites. This system assigns ratings \( R_1, R_2 \) to players 1 and 2 such that the probability of player 1 beating player 2 is approximately:<br /><br />\begin{align}<br />p &= \frac{e^{R_1/C}}{e^{R_1/C}+e^{R_2/C}}<br />\end{align}<br /><br />Here \( C \) is just a scaling factor (typically \( 400/\ln{10} \) ). The Elo rating is connected to log5 via setting \( e^{R/C} = p/q \). We then recover:<br /><br />\begin{align}<br />\frac{p}{q} &= e^{R/C}\\<br />p &= \frac{e^{R/C}}{1+e^{R/C}}\\<br />R &= C\cdot \ln(p/q)<br />\end{align}<br /><br />Note that \( p \) is also the probability of this player beating another player with Elo rating 0. The Elo system generally includes enhancements accounting for ties, first-move advantage and also an online algorithm for updating ratings. We'll revisit these features later.<br /><br /><b>Psychology</b><br /><br />The <a href="http://en.wikipedia.org/wiki/Pairwise_comparison" target="_blank"><b>Bradley-Terry-Luce</b></a> (BTL) model is commonly used in psychology. Given two items, the probability \( p \) that item 1 is ranked over item 2 is approximately:<br /><br />\begin{align}<br />p &= \frac{Q_1}{Q_1+Q_2}<br />\end{align}<br /><br />In this context \( Q_* \) typically reflects the amount of a certain quality. That this model is equivalent to the previous models is immediate:<br /><br />\begin{align}<br />Q &= e^{R/C} = p/q\\<br />R &= C\cdot \ln(Q) = C\cdot \ln(p/q)\\<br />p &= \frac{Q}{1+Q}<br />\end{align}<br /><br /><b>Psychometrics</b><br /><br />The dichotomous (two-response) <a href="http://en.wikipedia.org/wiki/Rasch_model" target="_blank"><b>Rasch</b></a> and <a href="http://en.wikipedia.org/wiki/Item_response_theory" target="_blank"><b>item response models</b></a> are commonly used in psychometrics. For the Rasch model, let \( r_1 \) represent a measurement of ability and \( r_2 \) the difficulty of the test item. The Rasch model estimates the probability of correct response \( p \) as:<br /><br />\begin{align}<br />p &= \frac{e^{r_1-r_2}}{1+e^{r_1-r_2}}<br />\end{align}<br /><br />The one-parameter item response model estimates:<br /><br />\begin{align}<br />p &= \frac{1}{1+e^{r_2-r_1}}<br />\end{align}<br /><br />These are clearly equivalent to each other and to the previous models.http://angrystatistician.blogspot.com/2013/03/baseball-chess-psychology-and.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-3268426197222386284Mon, 25 Feb 2013 19:41:00 +00002013-02-25T11:43:35.302-08:00Rejoining the San Diego PadresGood news - I'm rejoining the San Diego Padres as a consulting analyst!<br /><br /><a href="https://twitter.com/octonion" target="_blank">Follow me on Twitter</a><br /><a href="https://www.linkedin.com/pub/christopher-d-long/2/67b/432" target="_blank">Join me on LinkedIn</a>http://angrystatistician.blogspot.com/2013/02/rejoining-san-diego-padres.htmlnoreply@blogger.com (Christopher Long)0tag:blogger.com,1999:blog-9149402429183581490.post-4955939776889959120Thu, 21 Feb 2013 07:08:00 +00002013-02-21T01:31:38.025-08:00Mining the First 3.5 Million California Unclaimed Property Records<br /><span style="font-family: Courier New, Courier, monospace;">As I mentioned in my <a href="http://angrystatistician.blogspot.com/2013/02/sergey-brin-please-pick-up-your.html" target="_blank">previous article</a> the state of California has over $6 billion in assets listed in its <a href="http://scoweb.sco.ca.gov/UCP/" target="_blank">unclaimed property database</a>. </span><span style="font-family: 'Courier New', Courier, monospace;">The search interface that California provides is really too simplistic for this type of search, as misspelled names and addresses are both common and no doubt responsible for some of these assets going unclaimed. There is an alternative, however - scrape the entire database and mine it at your leisure using any tools you want.</span><br /><span style="font-family: 'Courier New', Courier, monospace;"><br /></span><span style="font-family: 'Courier New', Courier, monospace;">Here's a <a href="https://github.com/galizur/ilikemoney" target="_blank">basic little scraper written in Ruby</a>.</span><br /><span style="font-family: 'Courier New', Courier, monospace;"><br /></span><span style="font-family: 'Courier New', Courier, monospace;">It's a slow process, but I've managed to pull about 10% of the full database in the past 24 hours (</span><span style="font-family: Courier New, Courier, monospace;">3.5 million out of about 36 million).</span><br /><br /><span style="font-family: Courier New, Courier, monospace;">What does the distribution of these unclaimed assets look like? </span><br /><span style="font-family: 'Courier New', Courier, monospace;"><br /></span><span style="font-family: 'Courier New', Courier, monospace;">Among those with non-zero cash reported amounts:</span><br /><ul><li><span style="font-family: Courier New, Courier, monospace;">Total value - $511 million</span></li><li><span style="font-family: Courier New, Courier, monospace;">Median value - $15</span></li><li><span style="font-family: Courier New, Courier, monospace;">Mean value - $157</span></li><li><span style="font-family: Courier New, Courier, monospace;">90th percentile - $182</span></li><li><span style="font-family: Courier New, Courier, monospace;">95th percentile - $398</span></li><li><span style="font-family: Courier New, Courier, monospace;">98th percentile - $1,000</span></li><li><span style="font-family: Courier New, Courier, monospace;">99th percentile - $1,937</span></li><li><span style="font-family: Courier New, Courier, monospace;">99.9th percentile - $14,203</span></li><li><span style="font-family: Courier New, Courier, monospace;">99.99th percentile - $96,478</span></li></ul><div><span style="font-family: Courier New, Courier, monospace;">Visually, it looks like this:</span></div><div><span style="font-family: Courier New, Courier, monospace;"><br /></span></div><div class="separator" style="clear: both; text-align: center;"><a href="http://2.bp.blogspot.com/-Tf972lF1hjU/USWyHBNXxTI/AAAAAAAAAfY/z6mDo1R3bxA/s1600/ucp_distribution.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="http://2.bp.blogspot.com/-Tf972lF1hjU/USWyHBNXxTI/AAAAAAAAAfY/z6mDo1R3bxA/s320/ucp_distribution.jpg" width="320" /></a></div><ul><li><span style="font-family: Courier New, Courier, monospace;">548309 have value >= $100</span></li><li><span style="font-family: Courier New, Courier, monospace;">67452 have value >= $1,000</span></li><li><span style="font-family: Courier New, Courier, monospace;">4954 have value >= $10,000</span></li><li><span style="font-family: Courier New, Courier, monospace;">304 have have value >= $100,000</span></li><li><span style="font-family: Courier New, Courier, monospace;">4 have value >= $1,000,000</span></li><li><span style="font-family: Courier New, Courier, monospace;">The largest value was $8,050,000</span></li></ul><div><span style="font-family: Courier New, Courier, monospace;">The top 10 by value:</span></div><div><ol><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=836294" target="_blank">$8,050,000 to Jasmine Holdco</a>. Entered as Hold Co. Sent <a href="http://investing.businessweek.com/research/stocks/private/person.asp?personId=133589" target="_blank">CEO Alexander Slusky</a> a note on LinkedIn.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/InterimDetails.aspx?propertyRecID=79851">$2,183,062 to someone associated with Procket Networks</a>. Procket was bought by Cisco. Sent former <a href="http://www.businesswire.com/news/home/20080721005949/en/U4EA-Technologies-Appoints-Curt-Mason-CFO" target="_blank">Procket CFO Curtis Mason</a> a note on LinkedIn.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=618617" target="_blank">$1,669,561 to Wyle Systems</a>. Address and city misspelled. Sent email to Wyle, was told this money belongs to Wyle Electronics, which in turn was purchased by Arrow Electronics.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=213042" target="_blank">$1,419,929 to Anne Baronia</a>. Appears to have moved. Sent her notes on Facebook and LinkedIn.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=854852" target="_blank">$777,856 to Citi Residential Lending</a>. Premium refund.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=213043" target="_blank">$639,000 to Martin Peter Wright</a>. Funds for liquidation.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=1420760" target="_blank">$611,460 to Jeanne and Melvin Hing</a>. Mature CD or savings certificate.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=1566054" target="_blank">$520,761 to Loretta Nisewander</a>. Checking account.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=10077958" target="_blank">$507,077 to Payroll</a> (no address). Now you know what to name your next kid. "This here's my son Payroll, and this one's my daughter Unknown".</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=23834" target="_blank">$450,000 Joseph Mallon</a>. Funds for liquidation.</span></li></ol><div><span style="font-family: Courier New, Courier, monospace;">People who should be easy to contact:</span></div></div><div><ol><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=216232" target="_blank">$58,946 to Ehren Maedge</a>. COO of Scale Computing, Foundation Capital.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=680624" target="_blank">$408,105</a> and <a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=680625" target="_blank">$94,001</a> to Dane Prenovitz. Director of the Dani Investment Collection, appears to go by <a href="http://www.djdanemitchell.com/inc/press.cfm" target="_blank">DJ Dane Mitchell</a> now.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/InterimDetails.aspx?propertyRecID=83223" target="_blank">$322,826</a> to the <a href="https://www.facebook.com/pages/Hearts-Afire-Foundation/116433678461773" target="_blank">Hearts Afire Foundation</a>.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/InterimDetails.aspx?propertyRecID=121232" target="_blank">$160,825</a> to the Vernon Otto Wahrenbrock Trust. Wahrenbrock's was the biggest used bookstore here in San Diego - it was very sad when Vernon passed away and Wahrenbrock's closed. His grandson, Craig Maxwell, <a href="http://maxwellshouseofbooks.com/about" target="_blank">owns a bookstore in La Mesa</a>.</span></li><li><span style="font-family: Courier New, Courier, monospace;"><a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=22965850" target="_blank">$10 to OJ Simpson from NetZero</a>.</span></li></ol><div><span style="font-family: Courier New, Courier, monospace;">Finally, <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=19287386">my vote for the person who wants the money back the least</a>.</span></div></div>http://angrystatistician.blogspot.com/2013/02/mining-first-35-million-california.htmlnoreply@blogger.com (Christopher Long)10tag:blogger.com,1999:blog-9149402429183581490.post-1924694804887930878Tue, 19 Feb 2013 06:35:00 +00002013-02-19T04:09:02.872-08:00Sergey Brin, Please Pick up your Paychecks<span style="font-family: inherit;">The state of California is currently holding over <a href="http://sco.ca.gov/upd.html" target="_blank">$6 billion</a> in unclaimed property belonging to millions of people. What type of property and who are the rightful owners? According to California's official unclaimed property website, these assets fall into the following categories:</span><br /><ol><li><span style="background-color: white; font-family: inherit; font-size: 13px;">Bank accounts and safe deposit box contents</span></li><li><span style="background-color: white; font-family: inherit; font-size: 13px;">Stocks, mutual funds, bonds, and dividends</span></li><li><span style="background-color: white; font-family: inherit; font-size: 13px;">Uncashed cashier's checks or money orders</span></li><li><span style="background-color: white; font-family: inherit; font-size: 13px;">Certificates of deposit</span></li><li><span style="background-color: white; font-family: inherit; font-size: 13px;">Matured or terminated insurance policies</span></li><li><span style="background-color: white; font-family: inherit; font-size: 13px;">Estates</span></li><li><span style="background-color: white; font-family: inherit; font-size: 13px;">Mineral interests and royalty payments, trust funds, and escrow accounts</span></li></ol><div><span style="font-family: inherit; ">People forget, people die, people move around. But $6 billion is a staggering amount of money; some of these amounts have to be really large. Let's try to find some interesting examples.</span></div><div><span style="font-family: inherit; font-size: x-small;"><br /></span></div><div><span style="font-family: inherit;">This is <a href="http://scoweb.sco.ca.gov/UCP/Default.aspx" target="_blank">official California UCP search form</a>. Programmer and database types will notice one problem immediately - no <a href="http://en.wikipedia.org/wiki/Approximate_string_matching" target="_blank">fuzzy string matching</a>. If your name or address was misspelled on the assets, or munged in the recording process, tracking down any assets belonging to you could become a difficult to impossible process. Given that this database has <b>at most</b> 18 million rows, there's no excuse for such a basic (and important) feature to be missing.</span></div><div><span style="font-family: inherit; font-size: x-small;"><br /></span></div><div><span style="font-family: inherit">Leaving aside the flaws and the apparent lack of due diligence on the part of the state of California, let's try to find some interesting examples of owed property.</span></div><div><span style="font-family: inherit; font-size: x-small;"><br /></span></div><div><span style="font-family: inherit;">Strategy - celebrities and the super wealthy with either unusual names and/or known cities (through Forbes or others).</span></div><div><ul><li><span style="font-family: inherit;">Steven Spielberg - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=19027674" style="font-family: inherit;" target="_blank">Cashier's checks for $10000</a> (George Lucas is owed $35.92)</span></li><li><span style="font-family: inherit;">Richard P Feynman - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=3001298" target="_blank">Liquidating fund earnings for $2864.57</a> (plus others totaling $644.16)</span></li><li><span style="font-family: inherit;">Charles R Schwab - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=22454882" target="_blank">Funds for liquidation for $35682</a></span></li><li><span style="font-family: inherit;">Current Governor Jerry Brown - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=4013355" target="_blank">Other for $2000</a> (plus others totaling $9505.02)</span></li><li><span style="font-family: inherit;">Sergey Brin - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=3605986" target="_blank">Stanford University paychecks for $1184.49</a> (plus others totaling $1071.32)</span></li><li><span style="font-family: inherit;">Michael Jackson - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=23582687" target="_blank">Safe deposit box</a> (plus wages from NBC for $93; mineral interests)</span></li><li><span style="font-family: inherit;">Paris Hilton - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=17690465" target="_blank">Credit balance for $472.96</a> (plus others)</span></li><li><span style="font-family: inherit;">Steve Wozniak - <a href="http://scoweb.sco.ca.gov/UCP/PropertyDetails.aspx?propertyRecID=13074419" target="_blank">Misc outstanding checks for $269</a> (plus some money from Apple)</span></li><li><span style="font-family: inherit;">Meg Whitman - <a href="http://scoweb.sco.ca.gov/UCP/NoticeDetails.aspx?propertyRecID=1067370" target="_blank">$146.14</a> (credit balance from Tiffany & Co)</span></li></ul><div><span style="font-family: inherit;">On a personal note, I've found about $40K for friends plus businesses they own (over $20K in one case).</span></div></div><div><br /></div>http://angrystatistician.blogspot.com/2013/02/sergey-brin-please-pick-up-your.htmlnoreply@blogger.com (Christopher Long)12