Posts

Solving Recurrences with Difference Equations

Here's an example from Robert Sedgewick's course on analytic combinatorics. Solve the recurrence \[ a_n = 3 a_{n-1} - 3 a_{n-2} + a_{n-3} \] for \( n>2 \) with \( a_0=a_1=0 \) and \( a_2=1 \). Let \( f(n) = a_n \). Moving every term to the left-hand side gives \( a_n - 3 a_{n-1} + 3 a_{n-2} - a_{n-3} = 0 \), which is the third backward difference of \( f \), so in the language of difference equations the recurrence becomes simply \[ \frac{{\Delta^3} f}{{\Delta n}^3 } = 0 . \] A vanishing third difference means \( f \) is a polynomial of degree at most two, so immediately \[ f(n) = c_2 n^2 + c_1 n + c_0 . \] Applying the initial conditions we get \[ c_0 = 0, \quad c_2 + c_1 = 0, \quad 4 c_2 + 2 c_1 = 1, \] and so the solution is \( a_n = \frac{1}{2} n^2 - \frac{1}{2} n \). Now what if the initial conditions are changed so \( a_1 = 1 \)?
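The closed form can be checked against the recurrence numerically. The following is a minimal Python sketch, not part of the original course material; the function names `recurrence_terms` and `quadratic_coefficients` are made up for illustration. It iterates the recurrence directly and solves the same three linear equations for \( c_2, c_1, c_0 \) from arbitrary initial values.

```python
from fractions import Fraction


def recurrence_terms(a0, a1, a2, count):
    """Iterate a_n = 3 a_{n-1} - 3 a_{n-2} + a_{n-3} from three initial values."""
    terms = [a0, a1, a2]
    while len(terms) < count:
        terms.append(3 * terms[-1] - 3 * terms[-2] + terms[-3])
    return terms[:count]


def quadratic_coefficients(a0, a1, a2):
    """Solve c0 = a0, c2 + c1 + c0 = a1, 4 c2 + 2 c1 + c0 = a2 exactly."""
    c0 = Fraction(a0)
    c2 = (Fraction(a2) - 2 * Fraction(a1) + c0) / 2
    c1 = Fraction(a1) - c0 - c2
    return c2, c1, c0


def closed_form(coeffs, n):
    """Evaluate c2 n^2 + c1 n + c0."""
    c2, c1, c0 = coeffs
    return c2 * n * n + c1 * n + c0


if __name__ == "__main__":
    coeffs = quadratic_coefficients(0, 0, 1)
    terms = recurrence_terms(0, 0, 1, 10)
    # Every iterated term should agree with the quadratic.
    print(all(closed_form(coeffs, n) == t for n, t in enumerate(terms)))
```

Calling `quadratic_coefficients(0, 1, 1)` handles the modified initial condition \( a_1 = 1 \) in the same way, and the same comparison against `recurrence_terms(0, 1, 1, 10)` confirms the resulting quadratic.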

Baseball, Chess, Psychology and Psychometrics: Everyone Uses the Same Damn Rating System

Here's a short summary of the relationship between common models used in baseball, chess, psychology and education. It is a starting point for examining the connections between the various extended models in these areas. The next steps include multiple attempts, guessing, ordinal and multinomial outcomes, uncertainty and volatility, multiple categories and interactions. There are also connections to standard optimization algorithms (neural nets, simulated annealing).

Baseball: the log5 method, common in baseball and other sports, provides an estimate for the probability \( p \) of participant 1 beating participant 2 given respective success probabilities \( p_1, p_2 \). Also let \( q_* = 1 - p_* \) in the following. The log5 estimate of the outcome is then: \begin{align} p &= \frac{p_1 q_2}{p_1 q_2+q_1 p_2}\\   &= \frac{p_1/q_1}{p_1/q_1+p_2/q_2}\\ \frac{p}{q} &= \frac{p_1}{q_1} \cdot \frac{q_2}{p_2} \end{align} The final form uses the odds ratio, \( \frac{p}{q}...
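The log5 estimate is easy to check numerically. Here is a minimal Python sketch, assuming both success probabilities lie strictly between 0 and 1; the helper names `log5` and `odds` are chosen for illustration. It computes the first form of the estimate and confirms that the odds of the result equal the ratio of the two participants' odds.

```python
def odds(p):
    """Convert a probability to odds p / q, with q = 1 - p."""
    return p / (1.0 - p)


def log5(p1, p2):
    """log5 estimate that participant 1 beats participant 2."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return p1 * q2 / (p1 * q2 + q1 * p2)


if __name__ == "__main__":
    p = log5(0.6, 0.4)
    print(round(p, 4))
    # The odds of the estimate match (p1/q1) * (q2/p2) up to rounding error.
    print(abs(odds(p) - odds(0.6) / odds(0.4)) < 1e-9)
```

Two sanity checks follow from the formula: evenly matched participants give `log5(x, x) == 0.5`, and an opponent with \( p_2 = 0.5 \) leaves \( p_1 \) unchanged.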

Mining the First 3.5 Million California Unclaimed Property Records

As I mentioned in my previous article, the state of California has over $6 billion in assets listed in its unclaimed property database. The search interface that California provides is really too simplistic for this type of search, as misspelled names and addresses are both common and no doubt responsible for some of these assets going unclaimed. There is an alternative, however: scrape the entire database and mine it at your leisure using any tools you want. Here's a basic little scraper written in Ruby. It's a slow process, but I've managed to pull about 10% of the full database in the past 24 hours (3.5 million out of about 36 million). What does the distribution of these unclaimed assets look like? Among those with non-zero reported cash amounts:

Total value - $511 million
Median value - $15
Mean value - $157
90th percentile - $182
95th percentile - $398
98th percentile - $1,000
99th percentile - $1,937
99.9th percentile - $14,203
99.99th perc...
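A summary of this kind can be computed along the following lines. This is a minimal Python sketch, not the code used for the figures above: the function names are hypothetical, the nearest-rank percentile definition is an assumption (the post does not say which definition was used), and the amounts in the usage example are made up.

```python
import math
import statistics


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list, for 0 < pct <= 100."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(amounts, pcts=(90, 95, 98, 99)):
    """Total, median, mean and selected percentiles of the non-zero amounts."""
    cash = sorted(a for a in amounts if a > 0)
    summary = {
        "total": sum(cash),
        "median": statistics.median(cash),
        "mean": statistics.mean(cash),
    }
    for pct in pcts:
        summary[f"p{pct}"] = percentile(cash, pct)
    return summary


if __name__ == "__main__":
    # Made-up amounts, for illustration only.
    print(summarize([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Zero amounts are filtered out before anything is computed, matching the "non-zero" restriction in the text.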

Sergey Brin, Please Pick up your Paychecks

The state of California is currently holding over $6 billion in unclaimed property belonging to millions of people. What type of property and who are the rightful owners? According to California's official unclaimed property website, these assets fall into the following categories:

Bank accounts and safe deposit box contents
Stocks, mutual funds, bonds, and dividends
Uncashed cashier's checks or money orders
Certificates of deposit
Matured or terminated insurance policies
Estates
Mineral interests and royalty payments, trust funds, and escrow accounts

People forget, people die, people move around. But $6 billion is a staggering amount of money; some of these amounts have to be really large. Let's try to find some interesting examples. This is the official California UCP search form. Programmer and database types will notice one problem immediately: no fuzzy string matching. If your name or address was misspelled on the assets, or munged in the recording proce...
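To illustrate what fuzzy string matching would add, here is a minimal Python sketch using the standard library's `difflib.get_close_matches`. It is only an illustration of the idea, not how the state's search form works; the function name `fuzzy_lookup`, the 0.8 similarity cutoff, and the owner names are all made up.

```python
import difflib


def fuzzy_lookup(query, names, cutoff=0.8, limit=5):
    """Return recorded names whose similarity to the query is at least cutoff."""
    normalized = [name.upper().strip() for name in names]
    return difflib.get_close_matches(
        query.upper().strip(), normalized, n=limit, cutoff=cutoff
    )


if __name__ == "__main__":
    # Hypothetical recorded owner names, including misspellings.
    records = ["JON SMITH", "JOHN SMYTHE", "MARY JONES"]
    print(fuzzy_lookup("John Smith", records))
```

An exact-match search for "JOHN SMITH" would return nothing from these records, while the similarity search still surfaces the misspelled variants and leaves out the unrelated name.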

Learning SQL

If you aren't aware of it, there's a free online course on databases (and SQL). I took it back when it ran live, but it's just as good self-paced. Jennifer Widom (Stanford) is an outstanding lecturer and the videos and assignments are excellent. SQLite is used to grade the online exercises, so I'd suggest installing a local copy to experiment with, as it's free. I'd also strongly recommend installing either MySQL or PostgreSQL (I recommend PostgreSQL; both are free) so you can learn while using a full-featured database server. BaseX is very helpful for learning XML and mastering XPath and XQuery for web scraping (also free).

https://www.coursera.org/course/db
http://www.postgresql.org/
http://basex.org/

BaseX has a module for handling JSON. I haven't used it personally, but it looks useful for learning about JSON.

http://docs.basex.org/wiki/JSON_Module

NBA Predictions for 11/21/2012

h_str = home team strength (including home court advantage)
o_str = opponent team strength (including away court disadvantage)
pr_home = estimated probability of home team winning

 home | opp | h_str | o_str | pr_home
------+-----+-------+-------+---------
 ATL  | WAS |  1.03 |  0.93 |    0.85
 BOS  | SAS |  1.00 |  1.03 |    0.40
 CHA  | TOR |  0.99 |  0.96 |    0.63
 CLE  | PHI |  0.98 |  0.98 |    0.48
 DAL  | NYK |  1.02 |  1.08 |    0.28
 GSW  | BRK |  1.01 |  1.00 |    0.56
 HOU  | CHI |  1.02 |  0.97 |    0.70
 IND  | NOH |  1.01 |  0.95 |    0.72
 MIA  | MIL |  1.07 |  0.99 |    0.78
 MIN  | DEN |  1.01 |  0.99 |    0.58
 OKC  | LAC |  1.05 | ...