What I do<br/>
By doing whatever you can become whoever<br/>
Arek Paterek, http://www.blogger.com/profile/10024063805816504858, noreply@blogger.com<br/>
<br/>
Nobody in the world knows how to train one hidden layer (2016-03-13)<br/>
I said it.<br/>
In recent years, there has been a lot of buzz about deep learning,
where the learning algorithm is not based on the Bayes rule and probability.
People are optimizing arbitrary, complicated cost functions, and they are
doing it with gradient descent, so they don't even reach the minimum
of the (incorrect) function that they want to optimize.<br/>
I just wanted to remind everyone that nobody in the world knows how to train
one hidden layer well, so instead of all the hand-waving about deep learning,
which gets annoying, it may be worth re-examining simpler,
fundamental models.<br/>
<br/>
My publication on the Netflix Prize is now free. Download it
<a href="http://arek-paterek.com/book/">here</a>.<br/>
The previous 4-page publication has so far
<a href="http://scholar.google.com/scholar?cites=11221835444786555275">over 450 citations</a>,
and the newer publication has 195 pages and 0 citations.<br/>
<br/>
So read it, cite it.
My h-index is 1, and I want to increase it to 2.
I don't feel like a real scientist with an h-index of 1.<br/>
<br/>
You have to read it so that you don't fall behind your competition.
<br/>
<br/>
My thoughts on AlphaGo (2016-03-11)<br/>
AlphaGo is winning 2-0 against one of the best go players in the world.<br/>
(Edit: it won 3-0)<br/>
(Edit2: they played all five games. It won 4-1.)<br/>
<br/>
I happen to know a similar domain well: computer chess.<br/>
In 2005 my chess playing program won the championship of my country.
The unique thing about my program was that I successfully used machine learning
to learn the weights of my program's evaluation function.
I used value adaptation and move adaptation from games of players rated above 2300 Elo, and I also used
learning from self-play.
The learning methods used in my program were similar to those used in Logistello and Deep Blue.
The difference was that the creators of Deep Blue ultimately used weights chosen by hand,
and my program used automatically learned weights.<br/>
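As a rough illustration of learning evaluation weights from game results (this is not the code of the program described above), the sketch below fits a linear evaluation function so that a logistic transform of its score matches the game outcome. The two features, the logistic link, the learning rate, and the toy positions are all assumptions made up for the example.<br/>

```python
import math


def evaluate(weights, features):
    """Linear static evaluation: weighted sum of position features."""
    return sum(w * f for w, f in zip(weights, features))


def win_probability(weights, features):
    """Map the evaluation score to an expected game result in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-evaluate(weights, features)))


def fit_weights(positions, n_features, epochs=200, lrate=0.1):
    """Fit weights by stochastic gradient descent on game results.

    positions: list of (features, result) pairs, where result is
    1.0 for a win, 0.5 for a draw, 0.0 for a loss (side to move).
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features, result in positions:
            error = result - win_probability(weights, features)
            for i, f in enumerate(features):
                weights[i] += lrate * error * f
    return weights


# Hypothetical features: (material difference, mobility difference).
TOY_POSITIONS = [
    ([1.0, 0.0], 1.0),
    ([-1.0, 0.0], 0.0),
    ([2.0, 1.0], 1.0),
    ([-2.0, -1.0], 0.0),
    ([0.0, 0.0], 0.5),
]

learned = fit_weights(TOY_POSITIONS, n_features=2)
```

With real data the positions would come from recorded games or from self-play, and the feature list would be much longer.<br/>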
<br/>
So I am one of the few people in the world who understand the challenge and
benefits of using machine learning in computer chess, go, and similar board games.<br/>
<br/>
My thoughts:<br/>
1. I guess that the most recent large improvements of the skill of AlphaGo
come from improving leaf evaluation, and moving from Monte Carlo Search
and playouts to an algorithm closer to alpha-beta search.
Previously they had to use playouts to evaluate leaves, to correct
inaccuracies of static leaf evaluation. Playouts are always suboptimal - they
should be replaced with a properly designed search and static evaluation.<br/>
2. The goal of the AlphaGo project is PR for Google. I think that they downplay
the effort put in feature engineering, and they exaggerate the role of deep learning.<br/>
3. The best human go players aren't as good at playing go as advertised.<br/>
4. Computer go is not as hard as advertised.<br/>
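To make point 1 concrete, here is a minimal sketch of alpha-beta search (in negamax form) that calls a static evaluation at the leaves instead of running playouts. It is a generic textbook version, not AlphaGo's search; the toy tree, the `children` and `evaluate` callbacks, and the leaf scores are hypothetical.<br/>

```python
def alpha_beta(node, depth, alpha, beta, children, evaluate):
    """Negamax alpha-beta search.

    children(node) returns the successor positions;
    evaluate(node) returns a static score from the viewpoint
    of the side to move at that node.
    """
    successors = children(node)
    if depth == 0 or not successors:
        return evaluate(node)  # static leaf evaluation, no playout
    best = float("-inf")
    for child in successors:
        score = -alpha_beta(child, depth - 1, -beta, -alpha, children, evaluate)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # the opponent will avoid this line: prune
    return best


# Hypothetical two-ply game tree with made-up leaf scores.
TOY_TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
TOY_SCORES = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}


def toy_children(node):
    return TOY_TREE.get(node, [])


def toy_evaluate(node):
    return TOY_SCORES.get(node, 0)


best = alpha_beta("root", 2, float("-inf"), float("inf"), toy_children, toy_evaluate)
print(best)  # prints 3
```

In this toy tree the leaf `b2` is never evaluated, because after seeing `b1` the search already knows that move `b` is worse than move `a`.<br/>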
<br/>
Random Strangers (2013-12-14)<br/>
My new project <a href="http://random-strangers.pl">Random Strangers</a> is a random chat like Chatroulette or Omegle, but combined with a recommendation engine. Users rate conversations on a scale of 1-5, and the idea is to match users who will like each other, based on predicted ratings. This should solve the well-known <a href="http://youtu.be/I7XmmnlR0e0">Chatroulette problem</a> (I always wanted to solve humanity's biggest problems).<br/>
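One possible way to turn predicted ratings into matches is sketched below. It is only an illustration, not the algorithm used on the site: the greedy pairing, the use of the lower of the two predicted ratings as the pair score, and the `predicted_rating` callback are all assumptions.<br/>

```python
from itertools import combinations


def mutual_score(a, b, predicted_rating):
    """A pair is only as good as the less satisfied of the two users."""
    return min(predicted_rating(a, b), predicted_rating(b, a))


def match_users(waiting, predicted_rating):
    """Greedily pair waiting users, best mutual predicted rating first.

    predicted_rating(a, b) is the rating user a is predicted
    to give to a conversation with user b (scale 1-5).
    """
    candidates = sorted(
        combinations(waiting, 2),
        key=lambda pair: mutual_score(pair[0], pair[1], predicted_rating),
        reverse=True,
    )
    matched, pairs = set(), []
    for a, b in candidates:
        if a not in matched and b not in matched:
            pairs.append((a, b))
            matched.update((a, b))
    return pairs
```

Taking the minimum of the two directions favors conversations that both sides are expected to like, rather than ones where only one side is happy.<br/>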
<br/>
<a href="http://random-strangers.pl/"><img src="http://random-strangers.pl/strangers.png" width="400"></a>
<br/>
<br/>
5000 best things (2013-03-19)<br/>
It's time to sum up my growing website - <a href="http://5000best.com/">lists of 5000 best things</a>.
It started as a simple, searchable list of 5000 movies.
The response was enthusiastic (Hacker News front page, Wykop front page and <a href="http://gigazine.net/news/20121012-5000-best-movies/">an article in Gigazine</a> retweeted about 1000 times),
so I decided to put extra work into it and explore the opportunity.
So far I created four lists: <a href="http://5000best.com/movies/">movies</a>, <a href="http://5000best.com/books/">books</a>, <a href="http://5000best.com/websites/">websites</a> and <a href="http://5000best.com/videos/">videos</a>.<br/>
<br/>
The movies part is the most advanced, with instant search,
filtering by 13 genres, film/TV series or by year, links to IMDb, Wikipedia, Rotten Tomatoes, Netflix, etc.,
54 different rankings of the 5000 movies,
and additionally, one personalized ranking calculated
by a recommendation engine (just like in all my previously written
recommendation engines, a new ranking is calculated
and displayed immediately at the moment of rating a movie).
5000 best websites are divided into 24 categories,
and 5000 best videos into 35 categories.<br/>
<br/>
What to do next?
All suggestions are welcome.
Create more lists or extend the existing ones?
Or put some serious effort into marketing? People tell me that I don't
advertise the website.<br/>
<br/>
In the last few weeks I got 9k visits from StumbleUpon to websites/Porn.
Makes me think - should I listen to the market?<br/>
<br/>
<a href="http://5000best.com/movies/"><img src="http://i.imgur.com/1aA9UZ5.png" width="190px" border="0"/></a>
<a href="http://5000best.com/books/"><img src="http://i.imgur.com/96WSuJU.png" width="190px" border="0"/></a><br/>
<a href="http://5000best.com/websites/"><img src="http://i.imgur.com/Db8hDcX.png" width="190px" border="0"/></a>
<a href="http://5000best.com/videos/"><img src="http://i.imgur.com/zsT3QUm.png" width="190px" border="0"/></a><br/>
<br/>
The blog was renamed to "What I Do", because this is what this blog ultimately became - news about my projects for anyone interested.
I stole the subtitle from Remi Gaillard.
I am thinking about starting another, more technical blog on programming,
machine learning, the Internet and business (I know nothing about business),
but who would read that. It kind of does not make sense to write into the void.
And another blog about life and everything (I know nothing about life, but I don't think it matters).
<br/>
<br/>
I wrote a book (2012-06-22)<br/>
Or rather, I call it a monograph.<br/>
<br/>
Why a book? I wanted to do something for humanity. Still, I am not so concerned with humanity as to give away all the fruits of my work for free.<br/>
<br/>
The title is "Predicting movie ratings and recommender systems" and you can <strike>buy</strike> download it <a href="http://arek-paterek.com/book">here</a>.
It's pretty obscure, specialized stuff on what I understood about the Netflix Prize data, recommender systems, and about prediction tasks in general.
Definitely not a publication for everyone.<br/>
<br/>
So now that I have done something for humanity, for the rest of my life I will:
1) earn money and 2) do whatever I want.<br/>
<br/>
Movie discovery and recommendations (2012-06-08)<br/>
As a side effect of digging into the Netflix Prize data
I created a set of Flash applications.
For a long time I was the only user, but I am not that beyond.
I finished it recently and it is ready to share.
It has colors and all. Users should like it.<br/>
<br/>
I launched it under the name <a href="http://arek-paterek.com/movie-galaxy/">"The Galaxy of Movies"</a>.
What's inside: two ways of visualizing similar movies,
a 2D recommender system, and a quiz game.<br/>
<br/>
Features:
interactive 2d maps, search, filtering by classic and experimental
genres, recommendations for two people, option of importing ratings.
The application is self-contained, and is smaller than 150k together
with all data.
The recommendations are built-in, calculated within the flash application,
without having a specialized server.<br/>
<br/>
<a href="http://i.imgur.com/pSV0t.png"><img src="http://i.imgur.com/Mbgth.png" width="190px" alt="tgom - movie visualization" border="0"/></a>
<a href="http://i.imgur.com/TahGO.png"><img src="http://i.imgur.com/klcqH.png" width="190px" alt="tgom - quiz game" border="0"/></a><br/>
<a href="http://i.imgur.com/OGr71.png"><img src="http://i.imgur.com/593oz.png" width="190px" alt="tgom - movie recommendations" border="0"/></a>
<a href="http://i.imgur.com/JtAOp.png"><img src="http://i.imgur.com/2dXIZ.png" width="190px" alt="tgom - movie visualization 2" border="0"/></a><br/>
<br/>
I named the six most contributing features (hidden genres)
learned by the regularized SVD,
creating a set of 12 new, experimental genres (after discretizing) - they can be used to filter the movies:<br/>
* Realism vs. Idealization<br/>
* Safety vs. Surprise<br/>
* Fairy Tale vs. Distrust<br/>
* Feminism vs. Testosterone<br/>
* Innocence vs. Heroism<br/>
* Growing up vs. Journey<br/>
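For readers who have not seen regularized SVD, the idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact model behind the application: the learning rate, regularization constant, number of features, and the threshold-based split into two opposite "genres" per feature are all made up for the example.<br/>

```python
import random


def predict(user_vec, movie_vec):
    """Predicted rating: dot product of user and movie feature vectors."""
    return sum(u * m for u, m in zip(user_vec, movie_vec))


def train_regularized_svd(ratings, n_users, n_movies, n_features=2,
                          lrate=0.01, reg=0.02, epochs=500, seed=0):
    """Learn hidden features by SGD on squared error with L2 regularization.

    ratings: list of (user_index, movie_index, rating) triples.
    """
    rng = random.Random(seed)
    user_f = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
              for _ in range(n_users)]
    movie_f = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(n_movies)]
    for _ in range(epochs):
        for u, m, r in ratings:
            err = r - predict(user_f[u], movie_f[m])
            for k in range(n_features):
                uf, mf = user_f[u][k], movie_f[m][k]
                user_f[u][k] += lrate * (err * mf - reg * uf)
                movie_f[m][k] += lrate * (err * uf - reg * mf)
    return user_f, movie_f


def split_feature_into_genres(movie_f, k, threshold):
    """Discretize feature k into two opposite 'genres' (hypothetical rule)."""
    high = [m for m, vec in enumerate(movie_f) if vec[k] > threshold]
    low = [m for m, vec in enumerate(movie_f) if vec[k] < -threshold]
    return high, low
```

Each learned feature yields two lists of movies, one for each end of the scale, which is how six features can give twelve genres.<br/>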
<br/>
From the point of view of rating prediction
those new genres carry much more information
(they let you better assess whether you will like the movie)
than the standard genres, like Comedy, Drama, etc.<br/>
<br/>
Moving on with one thing (2012-05-25)<br/>
A difficult subject that I feel I have to explain. I do not want to repeat myself to everyone, so I will just write it once here. In 2004 I enrolled in "PhD studies" in computer science at Warsaw University. It turned out that not only does the Institute of Informatics have nothing to offer didactically to people like me, but over all these years I did not manage to convince the university that what I do is worth any funding or any other kind of support. So I was close to this decision every year. I have not kept in touch with those people for a long time, so this is rather stating the obvious - I do not want to have a PhD from that place.<br/>
<br/>
The whole situation is odd, because my publication, with over 170 citations, is the most cited in the whole faculty since 2007. My understanding of all that is that my presence greatly bothered those various eternal beneficiaries of the system. Maybe I am too stubborn, and people who bow their heads low are more welcome in such places. Well, this is how the system works and nothing can be done about it.<br/>
<br/>
Predictions for soccer (2010-07-12)<br/>
4th place in Kaggle's World Cup <a href="http://web.archive.org/web/20101124180543/http://kaggle.com/worldcupconf?viewtype=results">"confidence challenge"</a>. This time all predictions were "by hand", using the bookmakers' odds.<br/>
<br/>
Netflix Grand Prize & what's new (2009-09-25)<br/>
1. The <a href="http://netflixprize.com/">Netflix Prize</a> is over, won by the 7-person team BellKor's Pragmatic Chaos. Congratulations to the winners.<br/>
<br/>
I ended up in 43rd place individually, and in 34th place in the two-person team with John Tomfohr.<br/>
The best place I reached was 3rd, in June 2007, ten months after the start of the contest, and the best place our team reached was 2nd, at the moment of creating it, one day before the first Progress Prize in October 2007.<br/>
<br/>
2. <a href="http://arek-paterek.com">arek-paterek.com</a> is now my homepage. Basically, that website will be the center of my presence on the web, with my current e-mail, links to things I share, and so on.<br/>
<br/>
3. A short summary of my service lolrate/svdsystem, a "recommender system for everything": I launched it in December 2008, it was online for a few months, and it had about 300 users. It is now closed. It will probably evolve into another project.<br/>
<br/>
<a href="http://i.imgur.com/Y8qvG.png"><img src="http://i.imgur.com/Y8qvG.png" height="120px" border="0" alt="lolrate - recommender system for everything"/></a>
<a href="http://i.imgur.com/hTkPz.png"><img src="http://i.imgur.com/hTkPz.png" height="120px" border="0" alt="lolrate - recommender system for everything - screenshot"/></a><br/>
<br/>
What's new 2 (2008-08-10)<br/>
Back in Poland. Time to get back to work on the PhD.<br/>
<br/>
What's new (2008-05-29)<br/>
John Tomfohr and I gave a talk at Morgan Stanley in New York about our Netflix Prize solutions.<br/>
<br/>
Some time later we met again at Stanford Startup School. I liked these two talks most:
<a href="http://www.youtube.com/watch?v=0CDXJ6bMkMY">[1]</a> <a href="http://www.youtube.com/watch?v=6nKfFHuouzA">[2]</a><br/>
<br/>
US, Santa Cruz and chess (2008-01-16)<br/>
I moved to the US for some time. For the past month I have been settling in in Santa Cruz, California, visiting UCSC.<br/>
<br/>
I wouldn't be myself without looking for a place to play chess. It turned out that every week there is a tournament in a cafe in a large bookstore in the middle of downtown Santa Cruz. I played there, and later I was surprised to find an analysis of one of my games on the Internet. I lost a game to Dana Mackenzie, and you can follow the analysis on his <a href="http://www.danamackenzie.com/blog/?p=70">blog</a>. Dana is a freelance writer and former mathematics teacher, and holds a PhD from Princeton. His USCF rating happens to be equal to my FIDE rating, 2048.<br/>
<br/>
Sharpgame.net (2007-11-29)<br/>
I launched a new service - a simple online game:<br /><a style="text-decoration: line-through" href="http://sharpgame.net/">sharpgame.net</a> - English version<br /><a style="text-decoration: line-through" href="http://gra-w-karteczki.net/b">gra-w-karteczki.net</a> - Polish version<br/>
<br/>
Netflix Prize: Progress Prize (2007-11-29)<br/>
Team <a href="http://www.research.att.com/~volinsky/netflix/">Belkor/Korbell</a> from AT&T Research won the $50,000 Progress Prize (see links to the <a href="http://www.research.att.com/~volinsky/netflix/blogs.html">press coverage</a>). Congratulations!<br />According to the rules, they were required to <a href="http://www.research.att.com/~volinsky/netflix/ProgressPrize2007BellKorSolution.pdf">describe their solution</a> - this is a very interesting paper.<br /><br />I've joined John Tomfohr in team <a href="http://tomfohr.blogspot.com/">basho</a> and we are currently in 4th place with a score of 8.25%.
The leaders have a score of 8.50%.<br/>
<br/>
Netflix Prize: Ideas giveaway (2007-08-13)<br/>
I learned a lot today. A very short report from the conference KDD 2007, San Jose, California:<br />My <a href="http://arek-paterek.com/ap_kdd.pdf">paper</a>, <a href="http://arek-paterek.com/ap_kdd_poster.pdf">poster</a> and <a href="http://arek-paterek.com/ap_kdd_slides.pdf">slides</a>.<br /><a href="http://www.cs.uic.edu/~liub/KDD-cup-2007/proceedings.html">The workshop proceedings</a>.<br/>
<br/>
Netflix Prize: The Long Walk (2007-06-29)<br/>
<a href="http://www.cs.toronto.edu/~rsalakhu/mltoronto.html">ML@UToronto</a>, a team representing one of the strongest <a href="http://learning.cs.toronto.edu/">machine learning research groups</a>, regains 3rd place with their new solution.<br /><br /><div style="text-align: center;"><a href="http://i.imgur.com/GUjD8.jpg"><img src="http://i.imgur.com/GUjD8.jpg" alt="Netflix Prize Leaderboard - Top 10" /><br /></a><br /></div>
<br/>
Netflix Prize: Even more progress (2007-06-27)<br/>
3rd place in the Netflix Prize.<br/>
<br/>
Netflix Prize: More progress (2007-06-26)<br/>
I am in 4th place in the Netflix Prize with a score of over 7%.<br />This is one of those rare times when I can say I am quite pleased with myself.<br 
/><br /><div style="text-align: center;"><a href="http://i.imgur.com/jpPwD.jpg"><img src="http://i.imgur.com/jpPwD.jpg" alt="Netflix Prize Leaderboard - Top 10" /><br /></a><br /></div>
<br/>
Netflix Prize: Progress (2007-06-11)<br/>
My last submission is in ninth place. Team Bellkor takes the lead.<br /><br /><div style="text-align: center;"><a href="http://i.imgur.com/k9Eux.jpg"><img src="http://i.imgur.com/k9Eux.jpg" alt="Netflix Prize Leaderboard - Top 10" /><br /></a><br /></div>
<br/>
Netflix Prize: The truth is out there, in the data (2007-06-08)<br/>
Nine months after the start of the contest, still in 10th place, less than 1% behind the leading team.<br /><br />Currently leading is the four-person team Gravity from Budapest University of Technology and Economics. In third place is a four-person team from the University of Toronto, including Prof. Geoffrey Hinton, a well-known person in the machine learning community.<br /><br />It looks like the competitors approach the task really professionally. A recent article in the New York Times, <a href="http://www.nytimes.com/2007/06/04/technology/04netflix.html?ex=1338609600&en=eda04055720ce432&ei=5088&partner=rssnyt&emc=rss">"Netflix Prize Still Awaits a Movie Seer"</a>, quotes the leader of Team Gravity:<br />"Domonkos Tikk, a data mining expert who is a senior researcher at the university in Budapest, leads Team Gravity. Dr. Tikk said that since October, his team, which is composed of three Ph.D. 
candidates and himself, has spent eight hours a day, seven days a week on the problem".<br/>
<br/>
Netflix Prize (2007-05-22)<br/>
I advanced to the top ten in the <a href="http://netflixprize.com/">Netflix Prize</a>.<br /><br />"Torture the data long enough, and sooner or later it will confess." (Julian Faraway / Ronald Coase)<br /><br />Eight months after the start of the contest the data still do not want to confess. The $1 million prize is a strong motivation. Almost 2000 teams have submitted their solutions.<br /><br /><br /><div style="text-align: center;"><a href="http://i.imgur.com/t133E.jpg"><img src="http://i.imgur.com/t133E.jpg" alt="Netflix Prize Leaderboard" /><br /></a><br /></div>
<br/>
First note (2007-05-22)<br/>
So, I've decided to start a blog. If the Polish prime minister can have a <a style="text-decoration: line-through" href="http://kmarcinkiewicz.blog.onet.pl/">blog</a>, then I can have one too.