Saturday, February 7, 2015

Touring Waldo; Overfitting Waldo; Scanning Waldo; Waldo, Waldo, Waldo

Randal Olson has written a nice article on finding Waldo, "Here’s Waldo: Computing the optimal search strategy for finding Waldo." Randal presents a variety of machine learning methods to find very good search paths among the 68 known locations of Waldo. Of course, there's no need for an approximation; modern algorithms can optimize tiny problems like these exactly.

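One approach would be to treat this as a traveling salesman problem with Euclidean distances as edge weights. A TSP solution is a closed tour, though, and what we want is an open path, so you'll need to add a dummy node connected to every other node by an edge of weight 0. Once you have the optimal tour, delete the dummy node and what remains is your optimal Hamiltonian path: the two free edges into and out of the dummy node mark where the path starts and ends.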
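To make the dummy-node trick concrete, here is a minimal Python sketch. It brute-forces the tour, so it is only usable on a handful of points; it illustrates the reduction and is not a stand-in for a real solver. The function names and the four test points are made up for illustration and are not Waldo's actual locations.

```python
import itertools
import math


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def shortest_hamiltonian_path(points):
    """Shortest open path through all points via the dummy-node reduction.

    Brute force over permutations: fine for ~8 points, hopeless for 68.
    """
    n = len(points)
    dummy = n  # index of the extra node
    # (n+1) x (n+1) matrix; the dummy row and column stay at 0.
    dist = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            dist[i][j] = math.dist(points[i], points[j])

    best_tour, best_len = None, math.inf
    # Fix the dummy node as the start of the tour and permute the rest.
    for perm in itertools.permutations(range(n)):
        tour = (dummy,) + perm
        length = tour_length(tour, dist)
        if length < best_len:
            best_tour, best_len = tour, length

    # Dropping the dummy node turns the closed tour into an open path.
    return list(best_tour[1:]), best_len


if __name__ == "__main__":
    # Hypothetical collinear points, chosen so the answer is easy to check by eye.
    path, length = shortest_hamiltonian_path([(0, 0), (3, 0), (1, 0), (2, 0)])
    print(path, length)
```

Because the dummy edges cost nothing, the tour length equals the path length, so the best tour and the best path coincide.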

I haven't coded in the dummy node yet, but here's the Waldo problem as a traveling salesman problem using TSPLIB format.
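For what it's worth, here is one way the dummy node might be written into a TSPLIB file. This is a sketch under a few assumptions: the dummy node has no coordinates, so the file switches from a coordinate list to an explicit full distance matrix; TSPLIB edge weights are integers, so distances are multiplied by an arbitrary `scale` and rounded; and the three points in the usage line are placeholders, not Waldo's locations. The function name `tsplib_with_dummy` is hypothetical.

```python
import math


def tsplib_with_dummy(points, name="waldo", scale=100):
    """Return TSPLIB text for the points plus one zero-cost dummy node.

    Uses an EXPLICIT / FULL_MATRIX weight section because the dummy node
    cannot be expressed as a coordinate. `scale` is an arbitrary choice to
    keep some precision after rounding to integers.
    """
    n = len(points)
    lines = [
        f"NAME: {name}",
        "TYPE: TSP",
        f"DIMENSION: {n + 1}",
        "EDGE_WEIGHT_TYPE: EXPLICIT",
        "EDGE_WEIGHT_FORMAT: FULL_MATRIX",
        "EDGE_WEIGHT_SECTION",
    ]
    for i in range(n + 1):
        row = []
        for j in range(n + 1):
            if i == n or j == n:
                row.append(0)  # every edge touching the dummy node is free
            else:
                row.append(round(scale * math.dist(points[i], points[j])))
        lines.append(" ".join(str(w) for w in row))
    lines.append("EOF")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder coordinates for illustration only.
    print(tsplib_with_dummy([(0, 0), (3, 4), (6, 8)]))
```

The last row and column of the matrix belong to the dummy node; after solving, removing that node from the tour leaves the path.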


The Concorde software package optimizes this in a fraction of a second:


I'll be updating this article to graphically show you the results for the optimal Hamiltonian path. There are also many additional questions I'll address. Do we really want to use this as our search path? We're obviously overfitting. Do we want to assume Waldo will never appear in a place he hasn't appeared before? When searching for Waldo we see an entire little area, not a point, so a realistic approach would be to develop a scanning algorithm that covers the entire image and accounts for our viewing point and the posterior Waldo density. Our eyes can also jump quickly from point to point when we aren't actively searching for Waldo, whereas scanning is much slower.

1 comment:

  1. Great/important stuff. Would love to see an efficient way to pull/clean CA's xls data!
