Wednesday, 9 March 2016

Why Google's Go win against Lee Se-dol is so massive

THE VERGE

DeepMind’s dramatic victory over legendary Go player Lee Se-dol earlier today is a huge moment in the history of artificial intelligence, and something many predicted would be decades away. "I was very surprised," says Lee. "I didn't expect to lose. I didn't think AlphaGo would play the game in such a perfect manner."
But why is it so impressive that DeepMind’s AlphaGo program — backed by the might of Google — has beaten one of the game’s most celebrated figures? To understand that, you have to understand the game's roots, and how the DeepMind team has built AlphaGo to uproot them.
Go, known as weiqi in China, igo in Japan, and baduk in Korea, is an abstract board game that dates back nearly 3,000 years. It's a game of strategy played across a 19 x 19 grid; players take turns placing black and white stones to surround points on the grid and capture their opponent's territory. Although the ruleset is very small, it creates a challenge of staggering depth and complexity.
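To get a feel for that scale, here is a rough back-of-the-envelope sketch (an illustration, not anything from the article): each of the board's 361 points can be empty, black, or white, which gives a loose upper bound on the number of board configurations.

```python
# Rough illustration: each of the 361 points on a 19 x 19 board is
# empty, black, or white, giving a loose upper bound on the number of
# board configurations (many of these aren't legal positions, so this
# is only a ceiling).
GRID = 19
points = GRID * GRID          # 361 intersections
upper_bound = 3 ** points     # empty / black / white at each point

# ~10^172 configurations, dwarfing the ~10^80 atoms estimated in the
# observable universe.
print(f"{points} points, about 10^{len(str(upper_bound)) - 1} configurations")
```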

"It’s one of the great intellectual mind sports of the world," says Toby Manning, treasurer of the British Go Association and referee of AlphaGo’s victory over European champion Fan Hui last year. "It’s got extremely simple rules, but these rules give rise to an awful lot of complexity." Manning cites a classic quote from noted 20th-century chess and Go player Edward Lasker: "While the baroque rules of chess could only have been created by humans, the rules of Go are so elegant, organic, and rigorously logical that if intelligent life forms exist elsewhere in the universe, they almost certainly play Go."

Because of Go’s deep intricacy, human players become experts through years of practice, honing their intuition and learning to recognize gameplay patterns. "The immediate appeal is that the rules are simple and easy to understand, but then the long-term appeal is that you can’t get tired of this game because there is such a depth," says Korea Baduk Association secretary general Lee Ha-jin. "Although you are spending so much time, there is always something new to learn and you feel that you can get better and stronger."
After starting to play the game at five years old, Lee Ha-jin displayed such a level of talent that her parents decided to send her to a private Go school in Seoul. She lived with her teacher, went to regular school in the daytime, then came back and played Go for several hours every night. Lee eventually turned professional at the age of 16.
A visit to her current workplace, the Korea Baduk Association, illustrates the game’s stature in this country. Members of the Korea Women Baduk League play out matches in stoic silence on one floor. Another floor hosts a room stacked with storied trophies, many of which are slightly creepy disembodied hands. (One old metaphorical name for the game translates as "hand talk.") And in the basement, there’s a full-fledged operating center for Baduk TV, a cable channel dedicated to Go. One of its studios has a mock-up stage for the AlphaGo showdown, where the channel can reenact the matches and provide extra analysis.
Every Go player I’ve spoken to says the same thing about the game: its appeal lies in depth through simplicity. And that also gets to the heart of why it’s so difficult for computers to master. There’s limited data available just from looking at the board, and choosing a good move demands a great deal of intuition.
ALPHAGO GETS BETTER BY PLAYING ITSELF
"Chess and checkers do not need sophisticated evaluation functions," says Jonathan Schaeffer, a computer scientist at the University of Alberta who wrote Chinook, the first program to solve checkers. "Simple heuristics get most of what you need. For example, in chess and checkers the value of material dominates other pieces of knowledge — if I have a rook more than you in chess, then I am almost always winning. Go has no dominant heuristics. From the human's point of view, the knowledge is pattern-based, complex, and hard to program. Until AlphaGo, no one had been able to build an effective evaluation function."

So how did DeepMind do it? AlphaGo uses deep learning and neural networks to essentially teach itself to play. Just as Google Photos lets you search for all your pictures with a cat in them because it holds the memory of countless cat images that have been processed down to the pixel level, AlphaGo’s intelligence is based on it having been shown millions of Go positions and moves from human-played games.
The twist is that DeepMind continually reinforces and improves the system’s ability by making it play millions of games against tweaked versions of itself. This trains a "policy" network to help AlphaGo predict the next moves, which in turn trains a "value" network to ascertain and evaluate those positions. AlphaGo looks ahead at possible moves and permutations, going through various eventualities before selecting the one it deems most likely to succeed. The combined neural nets save AlphaGo from doing excess work: the policy network helps reduce the breadth of moves to search, while the value network saves it from having to internally play out the entirety of each match to come to a conclusion.
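As a rough sketch of how those two networks cut the search down, consider the simplification below. This is not DeepMind's code: AlphaGo actually pairs its networks with Monte Carlo tree search, for which this plain beam-pruned minimax is a stand-in, and every name here is an assumption made for illustration. The policy network trims the breadth (only its top few suggestions are explored) and the value network trims the depth (positions are scored directly instead of being played out to the end).

```python
# Minimal, purely illustrative sketch of policy/value-guided search.
# policy_net(pos) -> [(move, prior), ...]; value_net(pos) -> our win
# probability; apply_move(pos, move) -> new position. All three are
# assumed callables, not real AlphaGo interfaces.

def best_move(position, policy_net, value_net, apply_move,
              depth=2, beam=3):
    """Pick the move whose resulting position the search scores best."""

    def top_moves(pos):
        # Policy net cuts the BREADTH: keep only its `beam` best ideas.
        return sorted(policy_net(pos), key=lambda mp: -mp[1])[:beam]

    def search(pos, d, our_turn):
        if d == 0:
            # Value net cuts the DEPTH: score the position directly
            # instead of playing the game to its conclusion.
            return value_net(pos)
        scores = [search(apply_move(pos, m), d - 1, not our_turn)
                  for m, _ in top_moves(pos)]
        # We maximise our win probability; the opponent minimises it.
        return max(scores) if our_turn else min(scores)

    return max(top_moves(position),
               key=lambda mp: search(apply_move(position, mp[0]),
                                     depth - 1, False))[0]
```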
This reinforcement learning system makes AlphaGo a lot more human-like and, well, artificially intelligent than something like IBM’s Deep Blue, which beat chess grandmaster Garry Kasparov by using brute force computing power to search for the best moves — something that just isn’t practical with Go. It’s also why DeepMind can’t tweak AlphaGo in between matches this week, and since the system only improves by teaching itself, the single match each day isn’t going to make a dent in its learning. DeepMind founder Demis Hassabis says that although AlphaGo has improved since beating Fan Hui in October, it’s using roughly the same computing power for the Lee Se-dol matches, having already hit a point of diminishing returns in that regard.

That’s not to say that AlphaGo as it exists today would be a better system for chess, according to one of Deep Blue’s creators. "I suspect that it could perhaps produce a program that is superior to all human grandmasters," says IBM research scientist Murray Campbell, who describes AlphaGo as a "very impressive" program. "But I don’t think it would be state of the art, and why I say that is that chess is a qualitatively different game on the search side — search is much more important in chess than it is in Go. There are certainly parts of Go that require very deep search, but it’s more a game about intuition and evaluation of features and seeing how they interact. In chess there’s really no substitute for search, and modern programs — the best program I know is a program called Komodo — it’s incredibly efficient at searching through the many possible moves and searching incredibly deeply as well. I think it would be difficult for a general mechanism like the one created in AlphaGo to be applied to chess. I just don’t think it’d be able to recreate that search, and it’d need another breakthrough."

DeepMind, however, believes that the principles it uses in AlphaGo have broader applications than just Go. Hassabis makes a distinction between "narrow" AIs like Deep Blue and artificial "general" intelligence (AGI), the latter being more flexible and adaptive. Ultimately the Google unit thinks its machine learning techniques will be useful in robotics, smartphone assistant systems, and healthcare; last month DeepMind announced that it had struck a deal with the UK’s National Health Service.

Today, though, the focus is on Go, and with good reason — the first victory over Lee Se-dol is major news even if AlphaGo loses the next four matches. "Go would lose one big weapon," Lee Ha-jin told me last week when asked about what defeat for Lee Se-dol would mean for the game at large. "We were always so proud that Go was the only game that cannot be defeated by computers, but we wouldn’t be able to say that any more, so that would be a little disappointing."
"WE’RE ABSOLUTELY IN SHOCK."
But AlphaGo could also open up new avenues for the game. Members of the Go community are as stunned by the inventive, aggressive way AlphaGo won as by the fact that it won at all. "There were some moves at the beginning — what would you say about those three moves on the right on the fifth line?" American Go Association president Andy Okun asked VP of operations Andrew Jackson, who also happens to be a Google software engineer, at the venue following the match. "As it pushes from behind?" Jackson replied. "If I made those same moves…" Okun continued. "Our teachers would slap our wrists," Jackson agreed. "They’d smack me!" said Okun. "You don’t push from behind on the fifth line!"
"We’re absolutely in shock," said Jackson. "There’s a real question, though. We’ve got this established Go orthodoxy, so what’s this going to reveal to us next? Is it going to shake things up? Are we going to find these things that we thought were true — these things you think you know and they just ain’t so?"
