
Showing posts from March, 2013

Why does Kaggle use Log-loss?

Solving Recurrences with Difference Equations

Here's an example from Robert Sedgewick's course on analytic combinatorics.

Solve the recurrence \[ a_n = 3 a_{n−1} − 3 a_{n−2} + a_{n−3} \] for \( n>2 \) with \( a_0=a_1=0 \) and \( a_2=1 \).

Let \( f(n) = a_n \); in the language of difference equations the above becomes simply
\[ \frac{{\Delta^3} f}{{\Delta n}^3 } = 0 . \] Immediately, since a function whose third difference vanishes is a polynomial of degree at most two,
\[ f(n) = c_2 n^2 + c_1 n + c_0 . \] Applying the initial conditions we get
\[ c_0 = 0, c_2 + c_1 = 0, 4 c_2 + 2 c_1 = 1, \] and so the solution is \( a_n =  \frac{1}{2} n^2 - \frac{1}{2} n \).
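As a quick sanity check, the closed form can be compared against the recurrence directly. This is a minimal sketch; the function names are mine, not from the original post:

```python
def a_recurrence(n_max):
    """Generate a_0..a_{n_max} from the recurrence
    a_n = 3 a_{n-1} - 3 a_{n-2} + a_{n-3}, with a_0 = a_1 = 0, a_2 = 1."""
    a = [0, 0, 1]
    for n in range(3, n_max + 1):
        a.append(3 * a[n - 1] - 3 * a[n - 2] + a[n - 3])
    return a[: n_max + 1]

def a_closed(n):
    """Closed form a_n = (n^2 - n) / 2, i.e. the binomial coefficient C(n, 2)."""
    return n * (n - 1) // 2

# The two agree on every index checked.
seq = a_recurrence(20)
assert all(a_closed(n) == seq[n] for n in range(21))
```

The closed form is just \( \binom{n}{2} \), which is consistent with the triangular-number growth the recurrence produces.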

Now what if the initial conditions are changed so \( a_1 = 1 \)?

Baseball, Chess, Psychology and Psychometrics: Everyone Uses the Same Damn Rating System

Here's a short summary of the relationship between common models used in baseball, chess, psychology and education. It is a starting point for examining the connections between various extended models in these areas. The next steps include multiple attempts, guessing, ordinal and multinomial outcomes, uncertainty and volatility, multiple categories and interactions. There are also connections to standard optimization algorithms (neural nets, simulated annealing).

Baseball

Common in baseball and other sports, the log5 method provides an estimate for the probability \( p \) of participant 1 beating participant 2 given respective success probabilities \( p_1, p_2 \). Also let \( q_* = 1 -p_* \) in the following. The log5 estimate of the outcome is then:

\begin{align}
p &= \frac{p_1 q_2}{p_1 q_2+q_1 p_2}\\
  &= \frac{p_1/q_1}{p_1/q_1+p_2/q_2}\\
\frac{p}{q} &= \frac{p_1}{q_1} \cdot \frac{q_2}{p_2}
\end{align}

The final form uses the odds ratio, \( \frac{p}{q} \). Additional fac…
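The first form of the estimate translates directly into code. A minimal sketch (the function name is mine, not from the original post):

```python
def log5(p1, p2):
    """Log5 estimate of the probability that participant 1 beats
    participant 2, given their individual success probabilities."""
    q1, q2 = 1 - p1, 1 - p2
    return (p1 * q2) / (p1 * q2 + q1 * p2)

# Two equal participants are a coin flip.
assert abs(log5(0.6, 0.6) - 0.5) < 1e-12

# Against a .500 opponent, log5 returns the participant's own
# success probability -- a defining property of the method.
assert abs(log5(0.7, 0.5) - 0.7) < 1e-12
```

The second assertion illustrates why the odds-ratio form is natural: multiplying by \( \frac{q_2}{p_2} = 1 \) when \( p_2 = 0.5 \) leaves participant 1's odds unchanged.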