The opinions of a master like Peter Norvig deserve careful study. Here are the points that interest me.
- Two phases of Google's search algorithms
  - An offline phase, which is time-consuming and query-independent.
  - An online phase, which responds to a user query within a few milliseconds.
- Tons of training data ... from the armies of "raters" employed by Google
- The big surprise is that Google still uses a manually crafted formula for its search results, even though its best machine-learned model is now as good as, and sometimes better than, the hand-tuned formula on the result-quality metrics that Google uses.
  - Two reasons:
    - The human experts who created the algorithm believe they can do better than a machine-learned model.
    - Google's search team worries that machine-learned models may be susceptible to catastrophic errors on unforeseen query types, that is, queries that differ from the training data.
- In The Black Swan, Nassim Taleb divides phenomena into two classes:
  - Mediocristan
  - Extremistan
- The current generation of machine learning algorithms can work well in Mediocristan but not in Extremistan.
So the open question is whether new machine learning algorithms can be devised that work well in Extremistan, or whether it can be proved that this cannot be done.
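
To make the Mediocristan/Extremistan distinction concrete, here is a minimal sketch in Python. It is my own illustration, not something from Norvig or Taleb: I assume a Gaussian as a stand-in for Mediocristan and a Pareto distribution with shape 1.1 as a stand-in for Extremistan, and the names `max_share` and `simulate` are hypothetical. The idea it shows is that in the thin-tailed case no single observation matters much to the total, while in the heavy-tailed case one observation can account for a noticeable fraction of it.

```python
import random


def max_share(samples):
    """Fraction of the total contributed by the single largest observation."""
    return max(samples) / sum(samples)


def simulate(n=100_000, seed=0):
    """Return (thin_tail_share, heavy_tail_share) for n samples each.

    Both distributions are illustrative assumptions, not taken from the source.
    """
    rng = random.Random(seed)
    # Mediocristan stand-in: height-like values, thin-tailed Gaussian.
    mediocristan = [rng.gauss(170.0, 10.0) for _ in range(n)]
    # Extremistan stand-in: wealth-like values, heavy-tailed Pareto (alpha = 1.1).
    extremistan = [rng.paretovariate(1.1) for _ in range(n)]
    return max_share(mediocristan), max_share(extremistan)


if __name__ == "__main__":
    thin, heavy = simulate()
    print(f"largest sample's share of total, Gaussian: {thin:.2e}")
    print(f"largest sample's share of total, Pareto:   {heavy:.2e}")
```

A model fitted to the Gaussian-like data rarely meets a test point far outside what it has already seen. With the Pareto-like data, the next sample can dwarf everything in the training set, which is roughly the situation the search team worries about with unforeseen query types.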

