Monthly Archives: November 2007

iSnark

Apple’s running this obnoxious commercial featuring a possibly bogus airline pilot who, as a passenger on a weather-delayed flight, heroically saves the day by using his magical iPhone to check the weather:


(video link)

Now an anecdote has emerged in which another guy tried to use his iPhone on a delayed flight to second-guess the airline’s ability to check the weather. He got a flight attendant to pass his weather prediction on to the pilot, and the pilot’s response over the intercom is hilarious:

“If the passenger with the IPhone would be kind enough to use it to check the weather at our alternate, calculate our fuel burn due to being rerouted around the storms, call the dispatcher to arrange our release, and then make a phone call to the nearest Air Traffic Control center to arrange our timely departure amongst the other aircraft carrying passengers with IPhones, then we will be more than happy to depart. Please ring your call button to advise the Flight Attendant and your fellow passengers when you deem it ready and responsible for this multi-million dollar aircraft and its passengers to safely leave.”

ZING!

I don’t know if the story is true. But, dammit, I still love it.

(source)

Computers is smart

This fortune I read today seems like a natural follow-up to my previous post:

This is the first numerical problem I ever did. It demonstrates the
power of computers:

Enter lots of data on calorie & nutritive content of foods. Instruct
the thing to maximize a function describing nutritive content, with a
minimum level of each component, for fixed caloric content. The
results are that one should eat each day:

1/2 chicken
1 egg
1 glass of skim milk
27 heads of lettuce.

— Rev. Adrian Melott

Do you want to be a millionaire?

All you have to do is help Netflix read people’s minds!

The Netflix Prize is a contest that has been going on since October 2006. I didn’t hear about it until today. When you rate movies on the Netflix DVD rentals site, their proprietary Cinematch algorithm predicts which other movies you might like, based on the ratings made by all Netflix users. It is very similar to Amazon’s “people who bought X also bought Y” feature. The Netflix Prize challenge is to come up with a new prediction system that is 10% more accurate than Cinematch. Whoever does this will get the top prize of $1,000,000. Netflix is also awarding a periodic “progress prize” of $50,000 to people who can beat the last progress prize winner by 1%.

The current progress prize winner, an AT&T Labs team named BellKor, has a technique that yields a prediction improvement of 8.5% over Cinematch. Read about their technique here. Their technique makes my brain hurt: it blends together 107 different results from a large ensemble of data mining models, including neighborhood-based models (k-NN), factorization models (fit with techniques such as ridge regression), regressions based on Gaussian priors, restricted Boltzmann machines, asymmetric factor models, and regression models (using principal components analysis for feature selection, and SVD vectors as predictor variables). Basically, they packed a data mining shotgun with as much shot as they could find, and pulled the trigger. That’s a hell of a lot of work for $50,000!
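To get a feel for what "blending" means, here is a minimal sketch in Python that mixes just two sets of predictions with a single least-squares weight. This is not BellKor's method (they combined 107 results, and I am not reproducing their procedure). The ratings and both "models" below are made-up numbers, and the function names are hypothetical.

```python
def blend_weight(pred_a, pred_b, actual):
    """Least-squares weight w for the blend w*a + (1 - w)*b."""
    num = sum((a - b) * (y - b) for a, b, y in zip(pred_a, pred_b, actual))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    # If both models always agree, any weight gives the same blend.
    return num / den if den else 0.5


def blend(pred_a, pred_b, w):
    """Mix two prediction lists element by element."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]


def mse(pred, actual):
    """Mean squared error between predictions and true ratings."""
    return sum((p - y) ** 2 for p, y in zip(pred, actual)) / len(actual)


# Made-up ratings on a 1-5 scale and two made-up models.
actual = [4, 3, 5, 1, 2]
model_a = [3.5, 3.2, 4.1, 2.0, 2.6]
model_b = [4.4, 2.1, 4.9, 1.5, 1.2]

w = blend_weight(model_a, model_b, actual)
blended = blend(model_a, model_b, w)
print(w, mse(model_a, actual), mse(model_b, actual), mse(blended, actual))
```

On the data used to fit the weight, the blend can never score worse than either model alone, because using only one model is just the special case w = 1 or w = 0. In a real contest setting the weight would be fit on held-out ratings, since fitting and scoring on the same data flatters the result.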


Figure 1: Oh, no! Data mining engineers found out that I like terrible movies!

By comparison, Netflix’s own Cinematch algorithm uses the following techniques, as quoted in their Netflix Prize FAQ:

How does Cinematch do it?
Straightforward statistical linear models with a lot of data conditioning. But a real-world system is much more than an algorithm, and Cinematch does a lot more than just optimize for RMSE. After all, we have a website to support. In production we have to worry about system scaling and performance, and we have additional sources to data we can use to guide our recommendations.
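The RMSE mentioned in that FAQ answer is root mean squared error, the yardstick the contest uses to compare predictions against real ratings. Here is a small sketch of the metric and of the "percent better than the baseline" calculation. The ratings and predictions are invented for illustration, and `improvement_pct` is a hypothetical helper name, not anything from Netflix.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def improvement_pct(baseline_rmse, candidate_rmse):
    """Percent reduction in RMSE relative to the baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Invented example: true ratings and two sets of predictions.
actual = [4, 3, 5, 1]
baseline = [3, 3, 4, 2]
candidate = [3.5, 3, 4.5, 1.5]

base_score = rmse(baseline, actual)
cand_score = rmse(candidate, actual)
print(base_score, cand_score, improvement_pct(base_score, cand_score))
```

Lower RMSE is better, so the grand prize target of 10% means getting the candidate's RMSE down to 90% of the baseline's.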

Netflix has begun a very interesting bounty hunt. As of today, 23,021 teams from 164 different countries are clamoring for the cash. I love the idea of putting up public bounties for innovation. The X Prize comes to mind, particularly the Google-sponsored Lunar X Prize.

(another informative article about the Netflix Prize)

Can evolution reveal a killer’s mind?

Computer science professor Dr. Ryan Garlick of University of North Texas has a very interesting setup for his symbolic processing course this semester: Each student’s objective is to contribute towards cracking the unsolved 340-character Zodiac cipher.

From a UNT news article:

Cracking the cipher is a difficult task for more than one reason, which is why Garlick, along with his students, are currently developing several computer software techniques that will hopefully make the process far more feasible.
“There are just too many possible keys to look at them all,” Garlick said. “There are 63 different symbols, and each can represent 26 possible letters (we think), which is just too many possible combinations to evaluate them all.”
This is where the computer techniques they are fashioning will hopefully come in handy. Corey Rosemurgy, an Austin senior and computer science major in Garlick’s class, is currently developing a genetic algorithm to solve the cipher.
“The genetic algorithm that I am developing models itself after the inherent properties of biological evolution and the theory of survival of the fittest where only the strong survive,” Rosemurgy said.
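To put a number on "too many possible keys": taking the figures in the quote at face value (63 symbols, each free to stand for any of 26 letters), the key count is 26 to the 63rd power. The sketch below just does that arithmetic. The guessing rate is a made-up assumption for scale, not a measurement of any real software.

```python
SYMBOLS = 63   # distinct cipher symbols, per the quote
LETTERS = 26   # candidate letters per symbol, per the quote

# Each symbol independently picks one of 26 letters.
keys = LETTERS ** SYMBOLS
digits = len(str(keys))

# Assumed (hypothetical) brute-force speed: one billion keys per second.
KEYS_PER_SECOND = 10 ** 9
SECONDS_PER_YEAR = 60 * 60 * 24 * 365
years = keys // (KEYS_PER_SECOND * SECONDS_PER_YEAR)

print("possible keys:", keys)
print("number of digits:", digits)
print("years to try them all:", years)
```

The key count has 90 digits, which is why nobody is proposing brute force and why search heuristics like genetic algorithms are on the table.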

I was very interested to discover this course, since I have been working on a similar approach since around March of this year. In my free time (what little there is), I’ve been running experiments using ECJ, a Java-based evolutionary computing framework. So far, my focus has been on trying to get the algorithm to solve Zodiac’s 408-character cipher, which has a known solution. Using a dictionary-oriented approach, the algorithm was able to find the correct solution using a limited 400-word dictionary. Now I am trying to improve this by adding more words to the dictionary used by the algorithm. The basic idea is to get this test case working well before attempting to have it solve the really difficult 340-character cipher. This has proven to be very difficult, because the search space (number of possible solutions) is extremely large, and there is exactly one correct solution. The needle is tiny, and the haystack is vast. Evolutionary computing methods tend to be better-suited for finding really good solutions, rather than the one best solution, so this approach is quite challenging (if not flawed).
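For the curious, here is roughly what a dictionary-oriented fitness function looks like, as a toy sketch in Python. It is not my ECJ code and not the course's software. It is a bare-bones mutate-and-keep-if-not-worse loop (no population, no crossover), the cipher is a simple one-symbol-per-letter toy rather than a real Zodiac cipher, and the word list and all names are hypothetical.

```python
import random
import string

# Tiny hypothetical dictionary; a real run would use hundreds of words.
WORDS = ["like", "killing", "people", "because", "fun"]


def decode(cipher, key):
    """Turn a list of symbol ids into text using a symbol -> letter key."""
    return "".join(key[s] for s in cipher)


def fitness(cipher, key, words=WORDS):
    """Score a key by the total length of dictionary words in the decoded text."""
    text = decode(cipher, key)
    return sum(len(w) * text.count(w) for w in words)


def mutate(key, rng):
    """Copy the key and reassign one randomly chosen symbol to a random letter."""
    child = dict(key)
    symbol = rng.choice(sorted(child))
    child[symbol] = rng.choice(string.ascii_lowercase)
    return child


def evolve(cipher, generations, rng):
    """Keep a single key; accept a mutation whenever it scores at least as well."""
    best = {s: rng.choice(string.ascii_lowercase) for s in sorted(set(cipher))}
    best_score = fitness(cipher, best)
    for _ in range(generations):
        child = mutate(best, rng)
        score = fitness(cipher, child)
        if score >= best_score:
            best, best_score = child, score
    return best, best_score


# Build a toy cipher: each distinct plaintext letter gets one numeric symbol.
plaintext = "ilikekillingpeoplebecauseitissomuchfun"
letters = sorted(set(plaintext))
true_key = dict(enumerate(letters))
cipher = [letters.index(ch) for ch in plaintext]

best_key, best_score = evolve(cipher, 200, random.Random(0))
print(best_score, decode(cipher, best_key))
print(fitness(cipher, true_key), decode(cipher, true_key))
```

Even on this toy, the score is zero for almost every key, so the search gets very little guidance until a whole word happens to fall into place. That is the needle-and-haystack problem in miniature.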

I’m glad that many people are still working on this problem; it would be nice to finally find a solution. Even so, there is a strong possibility that there is no solution, and the cipher is just gibberish designed to keep people unnecessarily busy. If so, then the Zodiac killer succeeded beyond his wildest dreams.

More info:
article | course page (of interest here are the syllabus and powerpoint presentation) | google code page and code repository for zodiac decoder software (this is the repository of software used and developed by students in the course)

Count the Start buttons

Another evil incarnation of “turduckenology” arose yesterday at work. From my virtual Windows XP instance running inside Parallels Desktop on my Mac, I needed to make a remote desktop connection to my Windows XP desktop, which was running a Virtual PC 2007 virtual machine that itself had a remote desktop connection over a VPN to a machine on a client’s network:

That’s me diving through the rabbit hole of *four* instances of Windows via my Mac, so I can install our software on a client’s machine. The horror!

What really bends my noodle during this is figuring out how to copy and paste all the way up and down the chain of Windows instances. And it’s really easy to lose track of which Windows XP you are in when you are clicking around – it’s easy to run the wrong program in the wrong place!

These are the kinds of memories that will preoccupy my demented dreams when I am an old man in a nursing home.