New site launched today: AgainButSlower.com

I launched a new site this morning: AgainButSlower.com.



I saw this xkcd comic recently, and it made me want to be able to see Wikipedia articles side-by-side with their “simple” counterparts.

“Simple English Wikipedia is a version of the Wikipedia encyclopedia, written in Simple English and started in 2004. The encyclopedia is supposed to be used by children, who might not understand the complicated articles in the English Wikipedia, and other people who are still learning English.”

AgainButSlower.com is a quick hack I put together that lets you view the articles side-by-side. To do this, go to the site and type an article name in the search box (for example, War, or Peace, or Chocolate). Or, paste the article’s URL directly from Wikipedia (for example, http://en.wikipedia.org/wiki/Time_Cube). Then click the “Again, but slower” button. The site will try to load the original article and the simplified article side-by-side. If it doesn’t find the simple version, try a different article, because not all of Wikipedia’s articles have simple versions.
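
For the curious, here is a rough Python sketch of the lookup step described above: turn whatever was typed (an article name or a pasted URL) into a pair of addresses, one on en.wikipedia.org and one on simple.wikipedia.org. This is not the site's actual code, which isn't shown here; the function names are made up for illustration, and it assumes the simple article has the same title as the original, which is not always true.

```python
from urllib.parse import quote, unquote, urlparse

def article_title(user_input):
    """Return the article title from either a plain name or a Wikipedia URL."""
    text = user_input.strip()
    if text.startswith(("http://", "https://")):
        path = urlparse(text).path              # e.g. "/wiki/Time_Cube"
        text = unquote(path.rsplit("/", 1)[-1])
    return text.replace(" ", "_")               # Wikipedia uses underscores for spaces

def side_by_side_urls(user_input):
    """Build the (English, Simple English) URL pair for one article."""
    title = quote(article_title(user_input))
    return (
        "http://en.wikipedia.org/wiki/" + title,
        "http://simple.wikipedia.org/wiki/" + title,
    )

english_url, simple_url = side_by_side_urls("http://en.wikipedia.org/wiki/Time_Cube")
```

A real version would also have to cope with the simple article being missing, which is the "try a different article" case mentioned above.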

You can also try some examples by choosing one from the pulldown list on the page. Or, try your luck with a random article by clicking the Random button. If you click the full formatting checkbox, the original formatting of the Wikipedia articles will be displayed (the site displays the printable stripped-down format by default).

links for 2009-03-06: Pile o’ toys

This impressive augmented reality demo from GE inserts computer-generated 3D objects into live video. First, watch the short video. Then, try it yourself.
Israeli musician “Kutiman” took a big pile of seemingly random YouTube video clips and used them as instruments in his own musical compositions. I could not stop listening to these. My favorites are tracks 2 and 3. His site is overloaded at the time of this post; for now you can see samples here, here, and here.
Can you be an awesome DJ using nothing but a web browser and your computer’s keyboard? Yes you can.
A curious programmer, inspired by Roger Alsing’s evolution of the Mona Lisa, asks if the technique could be a good way to compress images. Also take a look at the nice online version of the image evolver he wrote, in which you can set your own target image.
Hilarious Livejournal diary done in the style of Rorschach from the Watchmen comic book series.
The Crisis of Credit, Visualized – An extremely well-produced video describing the credit crisis in simple terms.
instantwatcher.com – “Netflix for impatient people”. A remix of the Netflix site that is “about a quadrillion times easier to browse than Netflix’s own site”.
$timator: How much is your web site worth?
Cursebird. A real time feed of people swearing on Twitter. THANK YOU, INTERNET!
Leapfish. An interesting new meta-search engine with a clean interface. “It’s OK, you’re not cheating on Google.”
Twittersheep. “Enter your twitter username to see a tag cloud from the ‘bios’ of your twitter flock.”
PWN! YouTube. This is a great idea. You just type “pwn” in front of “youtube” in the URL, and voila; instant links for downloading and saving the videos.

User Interface Candy

Microsoft showed this “view of the future” in a presentation at a recent business technology conference:

Video: Future Vision Montage (http://video.msn.com/?mkt=en-GB&playlist=videoByUuids:uuids:a517b260-bb6b-48b9-87ac-8e2743a28ec5&showPlaylist=true&from=shared)

C’mon, people; hurry up and build these awesome user interfaces! A keyboard and mouse can only do so much.

Source

My share of the stimulus package

Now I can pretend to be on Wall Street, seizing untold riches with my filthy, Ponzi-scheme stained paws!

…or does this hyperinflationary currency from Zimbabwe’s crumbling economy portend the future of our own currency?

By the way… uh… is it just me, or is the typeface on the 10 trillion dollar banknote the same as the one used for Rock Band?

They really know how to party in Zimbabwe.

Simulated evolution parlor tricks

Here are some interesting tidbits of evolutionary computing to honor Darwin’s birthday yesterday:

Evolution of Mona Lisa

(youtube link)

Roger Alsing’s idea is to start with a random pile of polygons. Random mutations are applied to the polygons. The result is compared to the Mona Lisa source image, and mutations resulting in improvements are kept. Over many generations, the evolved image begins to resemble the Mona Lisa.
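
The mutate, compare, keep-if-better loop is small enough to sketch in plain Python. This is a toy under loud assumptions, not Roger Alsing's program: it uses axis-aligned gray rectangles on an 8×8 grid instead of translucent polygons on a photo, it replaces a whole shape instead of nudging it, and the target image is a made-up square.

```python
import random

W, H = 8, 8  # tiny grayscale "image" so the sketch runs instantly

def render(shapes):
    """Paint each rectangle (x0, y0, x1, y1, gray) onto a black canvas."""
    img = [[0] * W for _ in range(H)]
    for x0, y0, x1, y1, gray in shapes:
        for y in range(min(y0, y1), max(y0, y1) + 1):
            for x in range(min(x0, x1), max(x0, x1) + 1):
                img[y][x] = gray
    return img

def error(img, target):
    """Sum of squared pixel differences; lower means a closer match."""
    return sum((a - b) ** 2 for ra, rb in zip(img, target) for a, b in zip(ra, rb))

def random_shape(rng):
    return (rng.randrange(W), rng.randrange(H),
            rng.randrange(W), rng.randrange(H), rng.randrange(256))

def evolve(target, n_shapes=6, generations=2000, seed=0):
    rng = random.Random(seed)
    best = [random_shape(rng) for _ in range(n_shapes)]
    best_err = start_err = error(render(best), target)
    for _ in range(generations):
        child = list(best)
        child[rng.randrange(n_shapes)] = random_shape(rng)  # random mutation
        child_err = error(render(child), target)
        if child_err < best_err:   # keep only mutations that improve the match
            best, best_err = child, child_err
    return start_err, best_err

# Hypothetical target: a bright square on a dark background.
target = [[200 if 2 <= x <= 5 and 2 <= y <= 5 else 0 for x in range(W)]
          for y in range(H)]
start_err, final_err = evolve(target)
```

Because a mutation is only kept when it lowers the error, `final_err` can never be worse than `start_err`; how close it gets depends on the number of shapes and generations.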

This particular application of genetic algorithms is very popular. See what many other people have tried.

Evolectronica

This site evolves music by generating loops randomly from sounds and effects. Listeners to the site’s audio streams rank the results, and the genetic algorithm creates “baby loops” for the listeners to rank.

CSS Evolve

This site shows you variations of a web site’s cascading style sheets. You pick the best results, and their genetic algorithm breeds them to create new styles for the web site.

Pepsanity

This is the new Pepsi logo:

This is evidence of the complete and utter insanity that went into the design of the new Pepsi logo by the Arnell Group, the advertising agency retained to re-brand Pepsi:

What a Michelangelean Da Vincian effort. And it only cost $1 million. Totally worth it if you want your brand to become the center of some bizarre fictional universe. A universe that is also inhabited by the Hoff:

Download the entire “Pepsi Gravitational Field” document here. And read some of the backstory here and here. (I am still wondering if this is some kind of elaborate hoax to make the Pepsi ad agency look like some bizarre combination of Scientology and Time Cube metaphysics.)

May your emotive forces shape the gestalt of your brand identity!

I wish my 401(k) was this much fun.

Every letter is powerful

A fun nugget from my new favorite blog, Futility Closet:

Show this bold Prussian that praises slaughter, slaughter brings rout. Teach this slaughter-lover his fall nears.

Grim, no? But remove the first letter of each word and the mood changes:

How his old Russian hat raises laughter — laughter rings out! Each, his laughter over, is all ears.

Check out Futility Closet for more fascinating curiosities tinted with language, math, science, antiquity, puzzles, and amusement. I especially enjoy The Random Item Button.

Wolves in sheep’s clothing

“There is a story, which is fairly well known, about when the missionaries came to Africa. They had the Bible and we, the natives, had the land. They said ‘Let us pray,’ and we dutifully shut our eyes. When we opened them, why, they now had the land and we had the Bible.”

– Desmond M. Tutu, “Religious Human Rights and the Bible.”

Automatic programming for the lazy



You never know… a random walk may lead to serendipity.

Today’s xkcd comic is well-timed because just yesterday I sent out an announcement of the availability of my implementation of Cartesian Genetic Programming for ECJ, a Java-based evolutionary computing software framework. Genetic programming (GP) is a problem-solving technique in computer science that is inspired by evolution in biology. You start with a population of randomized computer programs and measure the “fitness” of each program. The fitness is a measurement of how well a program solves a particular problem. You can think of the program itself as the “gene”.

In this simulation of evolution, the best programs in the bunch are selected for “breeding” to produce the next generation. Breeding is done by exchanging pieces of genetic material – in this case, we exchange pieces of the computer programs themselves. Good programs have different parts that are useful, and these parts are combined or exchanged in new “child” programs that can be even better than their parents. Then, every so often, random mutations are introduced into the programs to promote diversity. Good mutations survive to future generations, and bad ones die out.



Parent programs producing offspring by exchanging pieces of themselves. Credit: Michael Adam Lones, Enzyme Genetic Programming
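
The select, breed, mutate cycle can be sketched in a few lines of Python. To keep it short, this toy evolves plain bit strings rather than programs, and its "fitness" is simply the number of 1 bits. The tournament selection and the trick of copying the best individual into the next generation unchanged are choices made for this sketch, not something ECJ or genetic programming requires.

```python
import random

def fitness(bits):
    """Toy fitness: count of 1 bits (the "OneMax" problem)."""
    return sum(bits)

def crossover(mom, dad, rng):
    """Single-point crossover: a prefix from one parent, the rest from the other."""
    cut = rng.randrange(1, len(mom))
    return mom[:cut] + dad[cut:]

def mutate(bits, rng, rate=0.02):
    """Flip each bit with a small probability."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def tournament(pop, rng, k=3):
    """Pick the fittest of k randomly chosen individuals."""
    return max(rng.sample(pop, k), key=fitness)

def run(pop_size=40, length=30, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    first_best = max(map(fitness, pop))
    for _ in range(generations):
        elite = max(pop, key=fitness)  # assumption: keep the current best unchanged
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite] + children
    return first_best, max(map(fitness, pop))

first_best, last_best = run()
```

Genetic programming follows the same cycle, except that the things being cut, spliced, and mutated are pieces of programs.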

This sounds like a completely random way to solve a problem, but it is surprisingly effective for many kinds of problems, such as learning mathematical expressions that can describe some data set, discovering winning game-playing strategies, making forecasts and predictions from data sets, learning decision trees to classify data, evolving emergent behavior, and optimization of complex systems. Of great interest to me in applying genetic programming is the emergence of unique and fascinating solutions that are not likely to be conceived by human minds.

Cartesian Genetic Programming (CGP), the genetic programming technique I implemented for ECJ, is a variation of genetic programming invented by Julian Miller. CGP uses simple lists of numbers to represent the programs (most GP implementations use some kind of explicit tree representation). It has some interesting benefits over traditional tree-based genetic programs, such as improved search performance, reduced overhead, and less “bloat” in generated programs. My implementation includes some sample problems: regression (fitting a bunch of data to an equation), classification (identifying a species of iris flower based on simple measurements of its parts, or predicting if a breast cancer tumor will be benign or malignant), and parity (determining whether the number of “on” bits in a binary string is even or odd).
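
To give a feel for what "a list of numbers as a program" means, here is a minimal Python sketch of decoding one. The gene layout (three integers per node, then one output gene) and the tiny function set are assumptions chosen for illustration; they are not the encoding used by the ECJ implementation, and the genome is written by hand rather than evolved.

```python
# Function set: each function gene is an index into this list.
FUNCTIONS = [
    lambda a, b: a & b,   # 0: AND
    lambda a, b: a | b,   # 1: OR
    lambda a, b: a ^ b,   # 2: XOR
]

def evaluate(genome, inputs):
    """Decode a flat CGP-style genome and run it on one set of inputs.

    Layout (an assumption for this sketch): three integers per node
    (function index, first input, second input), then one output gene.
    Connection genes index the program inputs first, then earlier nodes.
    """
    values = list(inputs)
    node_genes, output_gene = genome[:-1], genome[-1]
    for i in range(0, len(node_genes), 3):
        func, a, b = node_genes[i:i + 3]
        values.append(FUNCTIONS[func](values[a], values[b]))
    return values[output_gene]

# Hand-written genome for 3-bit parity:
# node 3 = XOR(in0, in1), node 4 = XOR(node 3, in2),
# node 5 = AND(in0, in1) is never referenced, so it does not affect the output.
genome = [2, 0, 1,   2, 3, 2,   0, 0, 1,   4]

def parity(bits):
    """Return 1 if an odd number of the three bits are on, else 0."""
    return evaluate(genome, bits)
```

Mutating a program is then just a matter of changing one of the integers in `genome`. Unreferenced nodes like node 5 can sit in the list doing nothing until a later mutation connects them.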

The iris classification problem is a classic machine learning problem, dating all the way back to 1936. You have a set of measurements taken from various kinds of iris flowers, and your task is to figure out which species each flower belongs to: iris virginica, iris versicolor, or iris setosa.



From left to right: iris virginica, iris versicolor, and iris setosa.

Starting with randomized programs, one of my CGP tests evolved the following programs (expressions) which correctly classified about 95% of the irises:

virginica = nand (> (- (+ 1.7777395 (/ sepalwidth sepallength)) (if 0.0053305626 petalwidth -0.6896746)) (/ -0.8330147 (neg 1.6308627))) (- (* (+ (- (+ 1.7777395 (/ sepalwidth sepallength)) (- petallength petalwidth)) 1.7777395) (- petallength petalwidth)) petalwidth)

versicolor = * (nand (> (- (+ 1.7777395 (/ sepalwidth sepallength)) (if 0.0053305626 petalwidth -0.6896746)) (/ -0.8330147 (neg 1.6308627))) (- (* (+ (- (+ 1.7777395 (/ sepalwidth sepallength)) (- petallength petalwidth)) 1.7777395) (- petallength petalwidth)) petalwidth)) (- (+ 1.7777395 (/ sepalwidth sepallength)) (- petallength petalwidth))

setosa = - (+ 1.7777395 (/ sepalwidth sepallength)) (- petallength petalwidth)

I don’t expect you to be able to read and understand the expressions – they are in a format that isn’t easy to read! What’s more important is that I didn’t have to create them myself – the CGP algorithm discovered them for me using only the input data (iris measurements) and the fitness function (a measurement of how many irises were correctly identified).
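
For anyone who wants to try reading one anyway: the expressions are in prefix notation, where the operator comes first and its arguments follow. The setosa expression is the shortest, so here it is unpacked into ordinary arithmetic as a Python function. The measurements in the example call are hypothetical. Two things are not stated above and are left open here: how the three output values are turned into a final species decision, and whether the division is a "protected" one that avoids dividing by zero.

```python
def setosa_expression(sepallength, sepalwidth, petallength, petalwidth):
    """Infix form of:
    - (+ 1.7777395 (/ sepalwidth sepallength)) (- petallength petalwidth)
    """
    return (1.7777395 + sepalwidth / sepallength) - (petallength - petalwidth)

# Hypothetical measurements, for illustration only.
value = setosa_expression(sepallength=5.0, sepalwidth=3.5,
                          petallength=1.5, petalwidth=0.5)
```

Notice that this same sub-expression shows up inside the virginica and versicolor expressions too, so the three programs share pieces.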

CGP also had good results when I tested the Wisconsin breast cancer data set. This data set contains measurements taken from fine needle biopsies of suspicious breast lumps. Our task is to predict whether the lumps are benign or malignant using only the measurements.



Fine needle aspiration. A thin needle is used to sample material from a suspicious lump in a breast. The data set contains the microscopic measurements taken from the sampled material.

CGP evolved the following program that correctly diagnoses the tumors 95% of the time:


malignant = not (nor (+ (+ (* cellShapeUniformity 1.5756812) bareNuclei) (<= (> (= cellSizeUniformity 0.08695793) -1.9803793) normalNucleoli)) (nor (> (- (iflez (if blandChromatin mitoses 0.75769496) bareNuclei normalNucleoli) (* 1.97491 (+ cellSizeUniformity clumpThickness))) (> 0.08695793 singleEpiCellSize)) (nor (>= marginalAdhesion (not (- (= (> (or clumpThickness 1.5756812) (= cellSizeUniformity 0.08695793)) (+ cellSizeUniformity clumpThickness)) (* (* 1.97491 (+ cellSizeUniformity clumpThickness)) mitoses)))) (iflez (if blandChromatin mitoses 0.75769496) bareNuclei normalNucleoli))))

Again, using only a fitness function and some test data, we are able to evolve a highly accurate cancer diagnosis tool.

Seems too good to be true? Well, it can be. Overfitting is a big problem when evolving these kinds of classifiers. For example, if each of the malignant tumor patients happened to be wearing red shoes during the biopsy (and the shoe type was included in the data set), a machine might be inclined to think that wearing red shoes was what determined the diagnosis. So, the evolved classifier is going to be very sensitive to whatever data set you unleash it upon.
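
The red shoes scenario is easy to reproduce in a few lines of Python. The records below are made up for the sketch and do not come from the Wisconsin data set. The "classifier" is the spurious rule itself, and the accuracy function plays the role of the fitness measurement.

```python
def accuracy(classifier, records):
    """Fraction of records whose label the classifier predicts correctly."""
    hits = sum(classifier(r) == r["malignant"] for r in records)
    return hits / len(records)

def red_shoe_rule(record):
    """The spurious rule: predict malignant whenever the patient wore red shoes."""
    return record["red_shoes"]

# Made-up toy records, not taken from any real data set.
training = [
    {"red_shoes": True,  "malignant": True},
    {"red_shoes": True,  "malignant": True},
    {"red_shoes": False, "malignant": False},
    {"red_shoes": False, "malignant": False},
]
held_out = [
    {"red_shoes": True,  "malignant": False},
    {"red_shoes": False, "malignant": True},
    {"red_shoes": False, "malignant": False},
    {"red_shoes": True,  "malignant": True},
]

train_acc = accuracy(red_shoe_rule, training)   # 1.0 on the data it was tuned to
test_acc = accuracy(red_shoe_rule, held_out)    # 0.5 on data it has not seen
```

Scoring a classifier on records that were kept out of the evolution run, as `held_out` is here, is the usual way to catch this kind of problem.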

Oh and then there’s the whole “no free lunch” thing. But that’s a depressing topic for another time.

If you want to find out more about my CGP implementation, check out the documentation. If you want to give it a whirl yourself, grab the distribution (you need at least Java 1.5). Check out the ECJ project page for more info about the evolutionary software framework my CGP implementation uses.