
Cobwebs

I’ve upgraded the blog software and moved this much-neglected blog to a new host. I was curious to see which of the old posts were popular over the years, so here’s a sampling:

PlayShinro.com launch

I’ve finally finished the web site for my new book of puzzles.

Go there and check out the book. You can even download a free edition!

United we stand

You enter a contest. A million dollars is at stake. Forty-one thousand teams from 186 different countries are clamoring for the prize and the glory. You edge into the top 5 contestants, but there is only one prize, and one winner. Second place is the first loser. What do you do?

Team up with the winners, of course.

The Netflix Prize is a competition that is awarding $1,000,000 to whoever can come up with the best improvement to their movie recommendation engine. Their system looks at the massive amounts of movie rental data to try to predict how well users will like other movies. For example, if you like Coraline, you may also like Sweeney Todd. But Netflix’s recommendation engine isn’t great at making predictions, so they decided to offer a bounty to anyone who could come up with a system that verifiably improves on Netflix’s prediction accuracy by 10%.
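The contest scored entries by root mean squared error (RMSE), so a "10% improvement" means an RMSE 10% lower than that of Netflix's own system. Here is a minimal sketch of that arithmetic in Python. The ratings and predictions are made up for illustration and are not contest data, and the function names are my own:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_percent(baseline_rmse, new_rmse):
    """How much lower new_rmse is than baseline_rmse, as a percentage."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical star ratings and two hypothetical sets of predictions.
actual = [4, 3, 5, 2, 1]
baseline = [3.0, 3.0, 4.0, 3.0, 2.0]      # stand-in for the existing engine
challenger = [3.5, 3.0, 4.5, 2.5, 1.5]    # stand-in for a contest entry

baseline_error = rmse(baseline, actual)
challenger_error = rmse(challenger, actual)
gain = improvement_percent(baseline_error, challenger_error)
print(f"baseline {baseline_error:.4f}, challenger {challenger_error:.4f}, gain {gain:.1f}%")
```

In this toy data the challenger halves every error, so the gain is far larger than anything seen in the contest. The point is only how the percentage is computed.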

The contest recently ended with two teams jockeying for the prize. During the two and a half years the contest was active, several individuals and small groups dominated the leaderboard, out of a field of 41,000 teams from 186 different countries. The competition became fierce, resulting in coalitions forming. The team “BellKor’s Pragmatic Chaos” formed from the separate teams “BellKor” (part of the Statistics Research Group in AT&T labs), “BigChaos” (a group of folks who specialize in building recommender systems), and “PragmaticTheory” (two Canadian engineers with no formal machine learning or mathematics training). Another conglomerate team, “The Ensemble”, is made up of “Grand Prize Team” (itself a coalition of members combining strategies to win the prize), “Vandelay Industries” (another mish-mash of volunteers), and “Opera Solutions”.


At first, it looked like BellKor’s Pragmatic Chaos won. But now it looks like The Ensemble won. Netflix says it will verify and announce the winner in a few weeks.

Who the hell cares? Why is this interesting in the slightest? Ten percent seems so insignificant.

Well, predicting human behavior seems impossible. But this contest has clearly shown that some amount of improvement in prediction of complicated human behavior is indeed possible. And what’s really interesting about the winning teams is that no single machine learning or statistical technique dominates by itself. Each of the winning teams “blends” a lot of different approaches into a single prediction engine.

Artificial neural networks. Singular value decomposition. Restricted Boltzmann Machines. K-Nearest Neighbor Algorithms. Nonnegative matrix factorization. These are all important algorithms and techniques, but they aren’t best in isolation. Blending is key. Even the teams in the contest were blended together.

United we stand.

Each technique has its strengths and weaknesses. Where one predictor fails, another can take up the slack with its own unique take on the problem.
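To see why one predictor can take up the slack for another, here is a toy sketch in Python. It assumes two hypothetical models, one that tends to guess too high and one that tends to guess too low, and blends them with a plain equal-weight average. The contest teams fit their blending weights on held-out data with far more elaborate methods, so treat this as an illustration of the idea and not as their approach:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predictions and true ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def blend(predictions, weights):
    """Weighted average of several predictors' outputs, item by item."""
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(len(predictions[0]))
    ]

# Made-up true ratings and made-up predictions from two imaginary models.
actual = [4, 2, 5, 3, 1, 4]
model_a = [4.6, 2.8, 5.0, 3.7, 1.9, 4.5]   # tends to overshoot
model_b = [3.4, 1.5, 4.2, 2.6, 1.0, 3.3]   # tends to undershoot

blended = blend([model_a, model_b], weights=[1, 1])

error_a = rmse(model_a, actual)
error_b = rmse(model_b, actual)
error_blend = rmse(blended, actual)
print(f"A {error_a:.3f}, B {error_b:.3f}, blend {error_blend:.3f}")
```

Because the two models err in opposite directions, their mistakes partly cancel and the blend beats both. If they made the same mistakes, averaging would buy nothing.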

BellKor, in their 2008 paper describing their approach, made the following conclusions about what was important in making predictions:

  • Movies are selected deliberately by users to be rated. The movies are not randomly selected.
  • Temporal effects:
    • Movies go in and out of popularity over time.
    • User biases change. For example, a user may rate average movies “4 stars”, but later on decide to rate them “3 stars”.
    • User preferences change. For example, a user may like thrillers one year, then a year later become a fan of science fiction.
  • Not all data features are useful. For example, details about descriptions of movies were significant, and explained some user behaviors, but did not improve prediction accuracy.
  • Matrix factorization models were very popular in the contest. Variations of these models were very accurate compared to other models.
  • Neighborhood models and their variants were also popular.
  • For this problem, increasing the number of parameters in the models resulted in more accuracy. This is interesting, because usually when you add more parameters, you risk over-fitting the data. For example, a naive algorithm that has “shoe color” as an input parameter might see a bank that was robbed by someone wearing red shoes, and conclude that anyone wearing red shoes was a potential bank robber. For another classic example of over-fitting, see the Hindenburg Omen.
  • To make a great predictive system, use a few well-selected models. But to win a contest, small incremental improvements are needed, so you need to blend many models to refine the results.
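For the curious, the matrix factorization idea from the list above can be sketched in a few lines of Python. This is a bare-bones version that assumes nothing more than (user, movie, rating) triples, learns a short vector of numbers for each user and each movie with stochastic gradient descent, and predicts a rating as their dot product. It is not BellKor's model, which layered biases, temporal effects, and much more on top. The data, function names, and settings (rank, learning rate, regularization) are all invented for illustration:

```python
import math
import random

def train_mf(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit user and item factor vectors to (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    users = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(users, items, u, i)
            for k in range(rank):
                pu, qi = users[u][k], items[i][k]
                # Step toward lower squared error, with a small pull toward zero.
                users[u][k] += lr * (err * qi - reg * pu)
                items[i][k] += lr * (err * pu - reg * qi)
    return users, items

def predict(users, items, u, i):
    """Predicted rating: dot product of the user's and the item's factors."""
    return sum(pu * qi for pu, qi in zip(users[u], items[i]))

def rmse(users, items, ratings):
    """Root mean squared error of the model on the given ratings."""
    total = sum((r - predict(users, items, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))

# Made-up ratings: users 0-1 like movies 0-1, users 2-3 like movies 2-3.
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 2),
]

untrained = train_mf(ratings, n_users=4, n_items=4, epochs=0)
trained = train_mf(ratings, n_users=4, n_items=4)

untrained_error = rmse(*untrained, ratings)
trained_error = rmse(*trained, ratings)
print(f"training RMSE before {untrained_error:.3f}, after {trained_error:.3f}")
```

Note that this only measures error on the ratings the model was trained on. A real evaluation would hold some ratings back, which is exactly where the over-fitting worry from the list comes in.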


RMSE (error) goes down as the number of blended predictors goes up. But the steepest reduction in error happens with only a handful of predictors — the rest of them only gradually draw down the error rate.

Yehuda Koren, one of the members of BellKor’s Pragmatic Chaos and a researcher for Yahoo! Israel, went on to publish another paper that goes into more juicy details about their team’s techniques.

I hope to see more contests like this. The KDD Cup is the most similar one that comes to mind. But where is the ginormous cash prize???

(previously)

RuleDaddy

Bob Parsons, the entrepreneur who started the highly successful GoDaddy domain registration company, published a list of his 16 rules for “success in business and life in general” way back in 2006. I only recently discovered the list, and many of his rules really ring true to me, especially #3, #7, #9, and #16:

1. Get and stay out of your comfort zone. I believe that not much happens of any significance when we’re in our comfort zone. I hear people say, “But I’m concerned about security.” My response to that is simple: “Security is for cadavers.”

2. Never give up. Almost nothing works the first time it’s attempted. Just because what you’re doing does not seem to be working, doesn’t mean it won’t work. It just means that it might not work the way you’re doing it. If it was easy, everyone would be doing it, and you wouldn’t have an opportunity.

3. When you’re ready to quit, you’re closer than you think. There’s an old Chinese saying that I just love, and I believe it is so true. It goes like this: “The temptation to quit will be greatest just before you are about to succeed.”

4. With regard to whatever worries you, not only accept the worst thing that could happen, but make it a point to quantify what the worst thing could be. Very seldom will the worst consequence be anywhere near as bad as a cloud of “undefined consequences.” My father would tell me early on, when I was struggling and losing my shirt trying to get Parsons Technology going, “Well, Robert, if it doesn’t work, they can’t eat you.”

5. Focus on what you want to have happen. Remember that old saying, “As you think, so shall you be.”

6. Take things a day at a time. No matter how difficult your situation is, you can get through it if you don’t look too far into the future, and focus on the present moment. You can get through anything one day at a time.

7. Always be moving forward. Never stop investing. Never stop improving. Never stop doing something new. The moment you stop improving your organization, it starts to die. Make it your goal to be better each and every day, in some small way. Remember the Japanese concept of Kaizen. Small daily improvements eventually result in huge advantages.

8. Be quick to decide. Remember what General George S. Patton said: “A good plan violently executed today is far and away better than a perfect plan tomorrow.”

9. Measure everything of significance. I swear this is true. Anything that is measured and watched, improves.

10. Anything that is not managed will deteriorate. If you want to uncover problems you don’t know about, take a few moments and look closely at the areas you haven’t examined for a while. I guarantee you problems will be there.

11. Pay attention to your competitors, but pay more attention to what you’re doing. When you look at your competitors, remember that everything looks perfect at a distance. Even the planet Earth, if you get far enough into space, looks like a peaceful place.

12. Never let anybody push you around. In our society, with our laws and even playing field, you have just as much right to what you’re doing as anyone else, provided that what you’re doing is legal.

13. Never expect life to be fair. Life isn’t fair. You make your own breaks. You’ll be doing good if the only meaning fair has to you, is something that you pay when you get on a bus (i.e., fare).

14. Solve your own problems. You’ll find that by coming up with your own solutions, you’ll develop a competitive edge. Masaru Ibuka, the co-founder of SONY, said it best: “You never succeed in technology, business, or anything by following the others.” There’s also an old Asian saying that I remind myself of frequently. It goes like this: “A wise man keeps his own counsel.”

15. Don’t take yourself too seriously. Lighten up. Often, at least half of what we accomplish is due to luck. None of us are in control as much as we like to think we are.

16. There’s always a reason to smile. Find it. After all, you’re really lucky just to be alive. Life is short. More and more, I agree with my little brother. He always reminds me: “We’re not here for a long time; we’re here for a good time.”

Is it working for him?



Perhaps. :)

I’m very guilty of violating rule #3 (“When you’re ready to quit, you’re closer than you think”). I find it very odd how quickly I’ll neglect some project when it is very close to completion and/or success. Must be some kind of weird self-destructive streak. Or fear of FAIL.

I like Bob Parsons’ list because it is practical and not as “touchy-feely” as many of the other self-improvement lists out there. Maybe it will help me get rid of my boarding pass to the FAIL BOAT.

links for 2008-09-24

links for 2008-09-23

links for 2008-09-22

links for 2008-09-19

  • Stanford is offering free computer science courses, such as introductory programming, robotics, natural language processing, machine learning, linear dynamical systems, the Fourier transform, and convex optimization. Includes all course materials!
  • "I was already in this mental cone of silence when the doctor lifted up the covers of my eyeball flaps using what looked like metal chopsticks, mixing around a stir fry while I watched, first-person perspective, from within the wok."
    (tags: medical)

links for 2008-09-18

  • A neat way to visualize stock market performance by industry sector. Similar to the disk space treemaps generated by http://www.derlien.com/ and http://windirstat.info/.

links for 2008-09-17