Touring Waldo; Overfitting Waldo; Scanning Waldo; Waldo, Waldo, Waldo

Randal Olson has written a nice article on finding Waldo - Here’s Waldo: Computing the optimal search strategy for finding Waldo. Randal presents a variety of machine learning methods to find very good search paths among the 68 known locations of Waldo. Of course, there's no need for an approximation; modern algorithms can optimize tiny problems like these exactly.

One approach would be to treat this as a traveling salesman problem with Euclidean distances as edge weights, but you'll need to add a dummy node that has edge weight 0 to every node. Once you have the optimal tour, delete the dummy node and you have your optimal Hamiltonian path.

I haven't coded in the dummy node yet, but here's the Waldo problem as a traveling salesman problem using TSPLIB format.

	NAME : waldo68
	COMMENT : 68 locations of Waldo
	TYPE : TSP
	DIMENSION : 68
	EDGE_WEIGHT_TYPE : EUC_2D
	NODE_COORD_SECTION
	0 0.625000000 7.708333333
	1 4.944444444 6.569444444
	2 5.430555556 6.402777778
	3 5.902777778 6.083333333
	4 5.430555556 5.444444444
	5 4.791666667 5.444444444
	6 5.305555556 5.125000000
	7 3.666666667 5.291666667
	8 6.069444444 4.638888889
	9 3.513888889 4.319444444
	10 2.388888889 4.486111111
	11 1.763888889 4.333333333
	12 2.083333333 3.847222222
	13 1.125000000 2.569444444
	14 0.805555556 2.250000000
	15 1.444444444 2.097222222
	16 0.638888889 1.777777778
	17 0.958333333 1.277777778
	18 0.958333333 0.972222222
	19 2.083333333 0.805555556
	20 2.388888889 2.729166667
	21 2.166666667 2.250000000
	22 2.486111111 2.250000000
	23 3.347222222 2.569444444
	24 3.513888889 2.083333333
	25 3.041666667 1.916666667
	26 4.305555556 2.888888889
	27 5.263888889 2.729166667
	28 5.750000000 2.250000000
	29 5.597222222 1.930555556
	30 4.625000000 1.277777778
	31 5.263888889 0.819444444
	32 5.416666667 0.333333333
	33 7.666666667 7.361111111
	34 6.861111111 6.736111111
	35 8.458333333 6.722222222
	36 10.37500000 7.194444444
	37 12.44444444 7.694444444
	38 10.37500000 6.430555556
	39 11.16666667 6.402777778
	40 11.33333333 6.083333333
	41 9.888888889 5.777777778
	42 9.250000000 5.291666667
	43 9.958333333 4.791666667
	44 9.097222222 4.652777778
	45 7.944444444 5.305555556
	46 7.763888889 5.000000000
	47 8.000000000 4.791666667
	48 7.333333333 5.138888889
	49 6.694444444 4.972222222
	50 7.166666667 4.500000000
	51 7.486111111 4.006944444
	52 11.01388889 4.819444444
	53 11.80555556 4.652777778
	54 12.27777778 4.805555556
	55 9.416666667 3.694444444
	56 9.250000000 3.222222222
	57 11.48611111 3.541666667
	58 11.95833333 3.541666667
	59 7.986111111 2.430555556
	60 6.694444444 1.625000000
	61 12.12500000 2.909722222
	62 11.51388889 2.756944444
	63 11.30555556 2.555555556
	64 10.72222222 1.930555556
	65 11.16666667 1.791666667
	66 12.44444444 1.625000000
	67 12.12500000 1.291666667
	EOF

view raw waldo69.tsp hosted with ❤ by GitHub

The Condorde software package optimizes this in a fraction of a second:

	68
	0 7 5 1 2 6 4 8 48 46
	45 42 43 55 44 47 50 51 49 3
	34 33 35 36 41 38 37 39 40 52
	54 53 58 57 61 66 67 65 64 63
	62 56 59 60 28 29 31 32 30 24
	25 19 17 18 15 16 14 13 20 22
	21 23 26 27 9 10 11 12

view raw waldo69.sol hosted with ❤ by GitHub

I'll be updating this article to graphically show you the results for the optimal Hamiltonian path. There are also many additional questions I'll address. Do we really want to use this as our search path? We're obviously overfitting. Do we want to assume Waldo will never appear in a place he hasn't appeared before? When searching for Waldo we see an entire little area, not a point, so a realistic approach would be to develop a scanning algorithm that covers the entire image and accounts for our viewing point and posterior Waldo density. We can also jump where we're looking at from point to point quickly while not searching for Waldo, but scans are much slower.

Comments

AnonymousFebruary 8, 2015 at 10:30 AM
Great/important stuff. Would love to see an efficient way to pull/clean CA's xls data!
ReplyDelete
Replies

The Angry Statistician

Search This Blog

Probability and Cumulative Dice Sums

Touring Waldo; Overfitting Waldo; Scanning Waldo; Waldo, Waldo, Waldo

Comments

Post a Comment

Popular posts from this blog

Mining the First 3.5 Million California Unclaimed Property Records

Mixed Models in R - Bigger, Faster, Stronger

A Bayes' Solution to Monty Hall