I've just finished the first version of a new project currently called August Shield 743. The story behind the name is simple: Google generated it when I created the API key I needed to complete the project. But since the name sounds like a secret Navy SEALs operation, I decided to keep it.
August Shield 743 is a blog article recommendation system, and it took about a week to complete the first live version. The idea behind a recommendation system is to recommend x based on a, b, c, ... where x could be a movie, a book, or, as in this case, a blog article. If you have visited Amazon and seen the heading "Customers Who Bought This Item Also Bought" at the bottom of the page, you will know what I'm talking about. For example, if you want to buy the Steve Jobs book, you will notice that other people also bought the books Einstein and Benjamin Franklin. Amazon finds these suggestions with the help of a recommendation system.
Last year I read the book The Everything Store, which is a biography of Jeff Bezos, who founded Amazon. The section of the book I remember best is the one where the author talks about the first version of Amazon's recommendation system. It goes like this:
Eric Benson took about two weeks to construct a preliminary version that grouped together customers who had similar purchasing histories and then found books that appealed to the people in each group. That feature, called Similarities, immediately yielded a noticeable uptick in sales and allowed Amazon to point customers toward books that they might not otherwise have found. Greg Linden, an engineer who worked on the project, recalls [Jeff] Bezos coming into his office, getting down on his hands and knees, and joking, "I'm not worthy."
According to the book Big Data, a third of all of Amazon's sales are said to result from its recommendation and personalization systems.
So a recommendation system is a really powerful tool you can use to increase sales. With this thought in mind, I had earlier installed another recommendation system on this blog, called LinkWithin. I used that system for a while, but I realized that it didn't give any good recommendations. It's not enough to have just any recommendation system - you need one that generates good results.
Improving a recommendation system is actually very complicated. When Netflix wanted to improve its recommendation system, it turned the problem into a competition called the Netflix Prize, where the winning team would win 1 million USD. Despite the reward, it took about 3 years before the competitors had developed a sufficiently improved system.
But improving LinkWithin took about a week, and the solution looks like this:
- Read the blogger-data with JavaScript
- Clean the data to remove strange characters and unneeded words
- Find recommendations by using a similarity measure based on the Jaccard index (one minus the Jaccard distance). For each pair of articles I computed the similarity between the titles, the texts, and the labels and then added the three values together, so a pair of articles gets a score between 0 and 3, where 3 means that it's the same article. The system is general and should work for any blog because I didn't specialize it to just this one
- Store the data in a database
- Read the data and add recommendations to blog articles
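The cleaning and scoring steps above can be sketched roughly as follows. This is a minimal illustration and not the actual August Shield 743 code: the stop-word list, the article shape (`title`, `text`, `labels`), and the function names are all assumptions made for the example, and the database and Blogger parts are left out.

```javascript
// Hypothetical stop-word list; the post does not say which words were removed.
const STOP_WORDS = new Set(["a", "an", "and", "the", "of", "to", "in", "is", "with"]);

// Clean a text: lowercase it, replace anything that is not a-z, 0-9 or
// whitespace with a space (a simplification that also drops non-ASCII
// letters), and remove stop words. Returns the set of remaining words.
function tokenize(text) {
  const words = text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0 && !STOP_WORDS.has(word));
  return new Set(words);
}

// Jaccard index: |A ∩ B| / |A ∪ B|, from 0 (nothing shared) to 1 (identical sets).
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0; // convention chosen for this sketch
  let intersection = 0;
  for (const item of a) {
    if (b.has(item)) intersection += 1;
  }
  return intersection / (a.size + b.size - intersection);
}

// Add the title, text, and label similarities: 0 (unrelated) to 3 (same article).
function articleScore(x, y) {
  return (
    jaccard(tokenize(x.title), tokenize(y.title)) +
    jaccard(tokenize(x.text), tokenize(y.text)) +
    jaccard(new Set(x.labels), new Set(y.labels))
  );
}

// Rank all other articles by score and keep the best `count` of them.
function recommend(article, allArticles, count) {
  return allArticles
    .filter((other) => other !== article)
    .map((other) => ({ title: other.title, score: articleScore(article, other) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, count);
}

// Made-up sample data, only to show the call.
const articles = [
  { title: "Tesla Motors Simulator", text: "A simulator of a Tesla car built in Unity", labels: ["tesla", "unity"] },
  { title: "Tesla Motors Simulator Update", text: "An update to the Tesla simulator built in Unity", labels: ["tesla", "unity"] },
  { title: "Quote: Edward Tufte", text: "A quote about charts", labels: ["quote"] },
];
const recommendations = recommend(articles[0], articles, 2);
console.log(recommendations);
```

Working with sets of words keeps the measure simple: word order and word counts are ignored, so two articles score high only when they share a large part of their vocabulary or labels.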
This was the result:
Recommended articles with the LinkWithin system:
- The Random Show with Kevin Rose and Tim Ferriss (has nothing to do with the article)
- Experiments with Blender (has a little bit to do with the article because I developed the car model in Blender)
- Quote: Edward Tufte (has nothing to do with the article)
Recommended articles with the August Shield 743 system:
- Tesla Motors Simulator Update (has everything to do with the article)
- Tesla Motors Simulator (has everything to do with the article)
- Tesla Motors Test Track Simulator (has everything to do with the article)
- Catacomb Snatch 3D (has a little bit to do with the article because both Catacomb Snatch 3D and the simulator were developed in Unity)
- The Engineer Update 3 (has a little bit to do with the article because The Engineer is a biography of Elon Musk, who co-founded Tesla Motors)
If you are interested in learning how to develop your own recommendation system, you should read the following books:
- The free book Mining of Massive Datasets
- Programming Collective Intelligence - includes a very good example of how to develop a complete recommendation system for movie ratings
Do you plan to release August Shield to the public? I'm really impressed by the results and the clean look and would really like to use it, or something similar, on my own site http://physicsinsider.com