Zodiac webtoy: Another update

I’ve finally made some more updates to the Zodiac webtoy! The biggest change is the addition of simple letter-frequency statistics.



Remember when Ralphie cracked the “secret code” in A Christmas Story?
Ralphie: [Reading his decoding]: “Be… sure… to… drink… your… Ovaltine. Ovaltine? A crummy commercial?!? Son of a bitch!!”

Here is the list of noteworthy changes:

  • Added frequency tabulations for symbols, decoded plaintext letters, and expected letter frequencies (as specified here). Frequency analysis of this sort might be useful for cracking the cipher. Eventually, I would like to include more n-gram statistics (occurrences of letter combinations of length n) to make this more useful.
  • Added the 408 cipher, which has a known (and creepy) solution. Click on the “Switch to 408 cipher” link to see it. The known solution link is below the 408 cipher (it is labeled “The correct one” next to “Interesting decoders”).
  • Fixed some formatting bugs that were causing the columns of the cipher grid to get squashed.
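
The frequency tabulations and n-gram statistics mentioned in the list above can be sketched in a few lines of Python. This is only an illustration, not the webtoy's actual JavaScript: the function names and the sample text are made up for the example.

```python
from collections import Counter


def ngram_counts(text, n=1):
    """Count every run of n consecutive letters in text.

    Non-letters are dropped first, so n-grams may span word boundaries
    (the Zodiac ciphers have no word breaks anyway).
    """
    letters = [ch for ch in text.upper() if ch.isalpha()]
    grams = ("".join(letters[i:i + n]) for i in range(len(letters) - n + 1))
    return Counter(grams)


def relative_frequencies(counts):
    """Turn raw counts into fractions of the total, for comparison with expected letter frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Made-up sample text, purely for illustration.
sample = "ATTACK AT DAWN"
print(ngram_counts(sample, 1).most_common(3))  # single-letter counts
print(ngram_counts(sample, 2).most_common(3))  # bigram counts
```

Setting n to 2 or 3 gives bigram and trigram counts from the same function, which is the kind of extra statistic that could sit next to the single-letter table.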

The webtoy seems to be getting much slower in Internet Explorer 6, and I have not tested it in Internet Explorer 7, but it works pretty well in Firefox and Safari. I am afraid of all the horrible, horrible JavaScript code that I wrote, so please let me know of any problems you find! If you find the new version to be too crappy, you can still use the old version by clicking here.

Reality is what you can get away with

I enjoyed this brief glimpse of the underbelly of Apple’s product marketing. An interviewer asks a touchy question about iTunes acting as a monopoly, and Apple’s PR folks start to freak out:


(video link)

Such is the ugly reality of companies attempting to control expectations.

Silicon heaven

From Microsoft’s Help and Support pages:

Computer Randomly Plays Classical Music
View products that this article applies to.
Article ID : 261186
Last Review : March 27, 2007
Revision : 3.3
This article was previously published under Q261186
SUMMARY
During normal operation or in Safe mode, your computer may play “Fur Elise” or “It’s a Small, Small World” seemingly at random. This is an indication sent to the PC speaker from the computer’s BIOS that the CPU fan is failing or has failed, or that the power supply voltages have drifted out of tolerance. This is a design feature of a detection circuit and system BIOSes developed by Award/Unicore from 1997 on.

Anyone remember the famous scene in 2001 when HAL the computer gets shut down?

[HAL’s shutdown]
HAL: I’m afraid. I’m afraid, Dave. Dave, my mind is going. I can feel it. I can feel it. My mind is going. There is no question about it. I can feel it. I can feel it. I can feel it. I’m a… fraid. Good afternoon, gentlemen. I am a HAL 9000 computer. I became operational at the H.A.L. plant in Urbana, Illinois on the 12th of January 1992. My instructor was Mr. Langley, and he taught me to sing a song. If you’d like to hear it I can sing it for you.
Dave Bowman: Yes, I’d like to hear it, HAL. Sing it for me.
HAL: It’s called “Daisy.”
[sings while slowing down]
HAL: Daisy, Daisy, give me your answer do. I’m half crazy all for the love of you. It won’t be a stylish marriage, I can’t afford a carriage. But you’ll look sweet upon the seat of a bicycle built for two.

iSnark

Apple’s running this obnoxious commercial featuring a possibly bogus airline pilot who, as a passenger on a weather-delayed flight, heroically saves the day by using his magical iPhone to check the weather:


(video link)

Now, an anecdotal tale has emerged where another guy tried to use his iPhone on a delayed flight to second-guess the airline’s ability to check the weather. He gets a flight attendant to pass his weather prediction on to the pilot, and the pilot’s response over the intercom is hilarious:

“If the passenger with the IPhone would be kind enough to use it to check the weather at our alternate, calculate our fuel burn due to being rerouted around the storms, call the dispatcher to arrange our release, and then make a phone call to the nearest Air Traffic Control center to arrange our timely departure amongst the other aircraft carrying passengers with IPhones, then we will be more than happy to depart. Please ring your call button to advise the Flight Attendant and your fellow passengers when you deem it ready and responsible for this multi-million dollar aircraft and its passengers to safely leave.”

ZING!

I don’t know if the story is true. But, dammit, I still love it.

(source)

Computers is smart

This fortune I read today seems like a natural follow-up to my previous post:

This is the first numerical problem I ever did. It demonstrates the
power of computers:

Enter lots of data on calorie & nutritive content of foods. Instruct
the thing to maximize a function describing nutritive content, with a
minimum level of each component, for fixed caloric content. The
results are that one should eat each day:

1/2 chicken
1 egg
1 glass of skim milk
27 heads of lettuce.

— Rev. Adrian Melott

Do you want to be a millionaire?

All you have to do is help Netflix read people’s minds!

The Netflix Prize is a contest that has been going on since October 2006. I didn’t hear about it until today. When you rate movies on the Netflix DVD rental site, their proprietary Cinematch algorithm predicts which other movies you might like, based on the ratings made by all Netflix users. It is very similar to Amazon’s “people who bought X also bought Y” feature.

The Netflix Prize challenge is to come up with a new prediction system that is 10% more accurate than Cinematch. Whoever does this will get the top prize of $1,000,000. Netflix is also awarding a periodic “progress prize” of $50,000 to people who can beat the last progress prize winner by 1%.

The current progress prize winner, an AT&T Labs team named BellKor, has a technique that yields a prediction improvement of 8.5% over Cinematch. Read about their technique here. Their technique makes my brain hurt: it blends together 107 different results from a large ensemble of data mining models, including neighborhood-based models (k-NN), factorization models (such as Ridge regression), regressions based on Gaussian priors, restricted Boltzmann machines, asymmetric factor models, and regression models (using principal components analysis for feature selection, and SVD vectors as predictor variables). Basically, they packed a data mining shotgun with as much shot as they could find, and pulled the trigger. That’s a hell of a lot of work for $50,000!


Figure 1: Oh, no! Data mining engineers found out that I like terrible movies!

By comparison, Netflix’s own Cinematch algorithm uses the following techniques, as quoted in their Netflix Prize FAQ:

How does Cinematch do it?
Straightforward statistical linear models with a lot of data conditioning. But a real-world system is much more than an algorithm, and Cinematch does a lot more than just optimize for RMSE. After all, we have a website to support. In production we have to worry about system scaling and performance, and we have additional sources to data we can use to guide our recommendations.
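
The FAQ answer above mentions RMSE (root mean squared error). Here is a minimal sketch of how RMSE and a percentage improvement over a baseline could be computed. It assumes that "more accurate" means a relative reduction in RMSE; the function names and the ratings below are invented for illustration and are not Netflix data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty rating lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def improvement_percent(baseline_rmse, candidate_rmse):
    """Relative RMSE reduction of a candidate over a baseline, as a percentage."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Invented star ratings, purely for illustration.
actual = [5, 3, 4, 1]
baseline = [4.0, 3.5, 3.0, 2.0]
candidate = [4.5, 3.2, 3.6, 1.5]
print(improvement_percent(rmse(baseline, actual), rmse(candidate, actual)))
```

Because the errors are squared before averaging, a few badly wrong predictions hurt the score more than many slightly wrong ones.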

Netflix has begun a very interesting bounty hunt. As of today, 23,021 teams from 164 different countries are clamoring for the cash. I love the idea of putting up public bounties for innovation – the X Prize comes to mind, particularly the Google-sponsored Lunar X Prize.

(another informative article about the Netflix Prize)

Can evolution reveal a killer’s mind?

Computer science professor Dr. Ryan Garlick of the University of North Texas has a very interesting setup for his symbolic processing course this semester: each student’s objective is to contribute towards cracking the unsolved 340-character Zodiac cipher.

From a UNT news article:

Cracking the cipher is a difficult task for more than one reason, which is why Garlick, along with his students, are currently developing several computer software techniques that will hopefully make the process far more feasible.
“There are just too many possible keys to look at them all,” Garlick said. “There are 63 different symbols, and each can represent 26 possible letters (we think), which is just too many possible combinations to evaluate them all.”
This is where the computer techniques they are fashioning will hopefully come in handy. Corey Rosemurgy, an Austin senior and computer science major in Garlick’s class, is currently developing a genetic algorithm to solve the cipher.
“The genetic algorithm that I am developing models itself after the inherent properties of biological evolution and the theory of survival of the fittest where only the strong survive,” Rosemurgy said.
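
To put a number on “too many possible combinations”: taking the figures from the quote at face value (63 symbols, each standing for any of 26 letters), the number of keys is 26 raised to the 63rd power. A quick sketch of that arithmetic follows. The rate of one billion keys per second is a made-up figure for illustration only.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def key_space(symbols, letters):
    """Number of ways to assign one of `letters` letters to each of `symbols` symbols."""
    return letters ** symbols


def years_to_enumerate(keys, keys_per_second):
    """Whole years needed to try every key at a fixed (hypothetical) rate."""
    return keys // (keys_per_second * SECONDS_PER_YEAR)


keys = key_space(63, 26)
print(f"{keys:.2e} possible keys ({len(str(keys))} digits)")
print(f"about {years_to_enumerate(keys, 10**9):.1e} years at a billion keys per second")
```

The key count comes out as a 90-digit number, so brute force is hopeless and some kind of guided search is the only realistic option.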

I was very interested to discover this course, since I have been working on a similar approach since around March of this year. In my free time (what little there is), I’ve been running experiments using ECJ, a Java-based evolutionary computing framework. So far, my focus has been on trying to get the algorithm to solve Zodiac’s 408-character cipher, which has a known solution. Using a dictionary-oriented approach, the algorithm was able to find the correct solution using a limited 400-word dictionary. Now I am trying to improve this by adding more words to the dictionary used by the algorithm. The basic idea is to get this test case working well before attempting to have it solve the really difficult 340-character cipher. This has proven to be very difficult, because the search space (number of possible solutions) is extremely large, and there is exactly one correct solution. The needle is tiny, and the haystack is vast. Evolutionary computing methods tend to be better suited to finding really good solutions than to finding the one best solution, so this approach is quite challenging (if not flawed).
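
To give a feel for the dictionary-oriented idea, here is a deliberately tiny evolutionary search in Python. It is not the ECJ setup described above and not the students' code: it uses only mutation and keep-the-best-half selection (no crossover), and the scoring rule, function names, and parameters are all assumptions made for this sketch.

```python
import random
import string


def decode(cipher, key):
    """Apply a symbol -> letter key to a cipher given as a list of symbols."""
    return "".join(key[symbol] for symbol in cipher)


def fitness(plaintext, dictionary):
    """Score a decoding: each dictionary word found adds its length squared."""
    return sum(len(word) ** 2 * plaintext.count(word) for word in dictionary)


def mutate(key, rng):
    """Return a copy of the key with one randomly chosen symbol reassigned."""
    child = dict(key)
    symbol = rng.choice(sorted(child))
    child[symbol] = rng.choice(string.ascii_uppercase)
    return child


def evolve(cipher, dictionary, generations=200, population_size=50, seed=0):
    """Keep the best half of the keys each generation and refill with mutants."""
    rng = random.Random(seed)
    symbols = sorted(set(cipher))
    population = [
        {s: rng.choice(string.ascii_uppercase) for s in symbols}
        for _ in range(population_size)
    ]

    def score(key):
        return fitness(decode(cipher, key), dictionary)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        survivors = population[: population_size // 2]
        mutants = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + mutants
    return max(population, key=score)


# Toy cipher: several symbols may map to the same letter, as in the Zodiac ciphers.
toy_cipher = [0, 1, 2, 3, 4, 5, 0, 1, 2]
best = evolve(toy_cipher, ["THE", "CAT"])
print(decode(toy_cipher, best))
```

Because the score rewards any dictionary words at all, the search can settle on a key that spells plenty of real words and is still wrong, which is the needle-in-a-haystack problem in miniature.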

I’m glad that many people are still working on this problem; it would be nice to finally find a solution. Even so, there is a strong possibility that there is no solution, and the cipher is just gibberish designed to keep people unnecessarily busy. If so, then the Zodiac killer succeeded beyond his wildest dreams.

More info:
article | course page (of interest here are the syllabus and PowerPoint presentation) | Google Code page and code repository for Zodiac decoder software (this is the repository of software used and developed by students in the course)

Count the Start buttons

Another evil incarnation of “turduckenology” arose yesterday at work. From my virtual Windows XP instance running inside Parallels Desktop on my Mac, I needed to make a remote desktop connection to my Windows XP desktop. That desktop was running a Virtual PC 2007 virtual machine, which itself had a remote desktop connection over a VPN to a machine on a client’s network:

That’s me diving through the rabbit hole of *four* instances of Windows via my Mac, so I can install our software on a client’s machine. The horror!

What really bends my noodle during this is figuring out how to copy and paste all the way up and down the chain of Windows instances. And it’s really easy to lose track of which Windows XP you are in when you are clicking around – it’s easy to run the wrong program in the wrong place!

These are the kinds of memories that will preoccupy my demented dreams when I am an old man in a nursing home.

Go @TEAM!

Chris sent me this funny email he received this morning:

From: terptix
Sent: Mon 10/22/2007 1:38 AM
To: Chris
Subject: Ticket Notification

This is just a reminder that the Request and Claim period for student
tickets for the @Sport game vs. @Opponent on @EventDate will begin
shortly online. Please check the student ticket website for exact
schedule information.

Somebody needs to put some data behind that template!

The cold case that just won’t die

I was surprised tonight to discover that somebody posted my little Zodiac cipher webtoy on Digg recently, and it has been getting some significant traction there:

http://digg.com/playable_web_games/Can_You_Crack_The_Zodiac_Killer_s_Code

I had a lot of fun making it, and I am glad that folks are getting some use out of it. Thanks for the support! I hope we can sustain the Digg Effect!!

I’ve been getting a lot of good feature suggestions from people, such as adding frequency analysis, allowing multiple letters per symbol, and supporting cipher transpositions. I hope to make some of these improvements to the app in the near future.