IBM Watson: The Science Behind an Answer
What is Watson? The Science Behind an Answer.
How Watson answers a question in four (not so simple) steps.
The first person mentioned by name in 'The Man in the Iron Mask' is this hero of
a previous book by the same author.
This is a typical Jeopardy clue, presented in the tricky Jeopardy format that can be
pretty difficult for a person to understand, much less a computer.
Remember: a computer understands code, ones and zeros, not nouns and verbs,
or people and places, let alone the relationships between them.
Watson can't see or hear, so audio and visual questions are off-limits
but everything else is fair game.
And of course, it's impossible to know everything.
As Alex Trebek often quips, the hardest Jeopardy question is the one you don't know.
How does a computing system reach a single answer to clues
posed in human language?
Unlike traditional databases designed for computers, real language is implicit, ambiguous and
full of complexity.
That's one of the reasons why, until now, computer searches have only spit out documents
filled with keywords.
It's been up to us to find the answers in those documents.
Like our brains, Watson's knowledge base is entirely self-contained,
except that while our brain fits in a shoebox, Watson's brain takes up more space
than eight large refrigerators.
When Watson answers a Jeopardy question, there's no Internet or helpline,
so Watson consumes a steady diet of information to prepare for a game.
Watson needs to absorb so much information because Jeopardy is open-domain, which means
a clue can be about anything.
So Watson reads, analyzes and tries to understand millions of books and documents, content written
in the way humans communicate, before it gets anywhere near a game board.
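
To make that preparation concrete, here is a minimal Python sketch of the general idea: read a handful of documents ahead of time and build a keyword index that can be searched later without any outside connection. The two-document corpus, the tokenizer and the index structure are all invented for illustration; this is not Watson's actual ingestion pipeline.

from collections import defaultdict
import re

# Hypothetical mini-corpus standing in for the millions of documents
# Watson ingests before a game; the titles and text are illustrative only.
corpus = {
    "dumas_musketeers": "The Three Musketeers by Alexandre Dumas follows the hero d'Artagnan.",
    "dumas_iron_mask": "The Man in the Iron Mask is a later novel by Alexandre Dumas.",
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

index = build_index(corpus)
print(index["dumas"])  # both documents mention Dumas

Everything Watson can consult at game time has to be prepared like this in advance, because there is no outside connection during play.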
Never before in the history of computing has a machine been able to so precisely answer such
a wide breadth of questions in such a short time.
Let's see how Watson does it.
Step 1: question analysis.
The first thing Watson does is parse the question into its parts of speech and identify
the different roles the words and phrases in the sentence are playing.
This helps Watson determine two distinct things: what type of question is being asked and
what the question is asking for.
During this early stage of the process, Watson doesn't know how to find the best answer yet,
so it increases its chances by looking at many different options of what the question
might be asking for.
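
As a rough illustration of that first step, the sketch below guesses what a clue is asking for by looking at the word that follows 'this', and pulls out the quoted title and other capitalized phrases as the entities the clue mentions. These hand-written rules are an invented stand-in for real parsing and role labeling, not Watson's actual question analysis.

import re

def analyze_clue(clue):
    """Crude question analysis: guess the answer type and list named entities."""
    # Jeopardy clues often name the answer type right after 'this', e.g. "this hero".
    match = re.search(r"\bthis ([a-z]+)", clue.lower())
    answer_type = match.group(1) if match else "unknown"
    # Treat the quoted title and capitalized phrases as entities mentioned in the clue.
    pairs = re.findall(r"'([^']+)'|\b([A-Z][a-z]+(?: [A-Z][a-z]+)*)", clue)
    entities = [quoted or capitalized for quoted, capitalized in pairs]
    return {"answer_type": answer_type, "entities": entities}

clue = ("The first person mentioned by name in 'The Man in the Iron Mask' "
        "is this hero of a previous book by the same author.")
print(analyze_clue(clue))
# {'answer_type': 'hero', 'entities': ['The', 'The Man in the Iron Mask']}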
Step 2: hypothesis generation.
For each interpretation of the question, Watson quickly searches through hundreds of millions of
documents to come up with thousands of possible answers.
At this point, quantity trumps accuracy.
It's more important for Watson to generate a large number of possible answers and
narrow them down from there because if the correct answer isn't included during the initial sweep,
there is no possible way for Watson to identify and justify the right answer at the end.
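
Here is a small sketch of that wide net, with an invented search function and two made-up interpretations of the clue; the point is only the shape of the step: pool every result as a candidate now and leave the filtering for later.

def generate_candidates(interpretations, search_fn, per_query=200):
    """Run a search for every interpretation and pool all hits as candidate answers.
    Recall matters most here; wrong candidates get filtered out in later stages."""
    candidates = set()
    for query in interpretations:
        candidates.update(search_fn(query, limit=per_query))
    return candidates

# Toy search results, hard-coded for illustration only.
def toy_search(query, limit):
    fake_results = {
        "hero of a previous book by Alexandre Dumas": ["d'Artagnan", "Edmond Dantes"],
        "first person named in The Man in the Iron Mask": ["d'Artagnan", "Aramis"],
    }
    return fake_results.get(query, [])[:limit]

interpretations = [
    "hero of a previous book by Alexandre Dumas",
    "first person named in The Man in the Iron Mask",
]
print(generate_candidates(interpretations, toy_search))  # pooled candidates, duplicates removed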
Step 3: hypothesis and evidence scoring.
Of course it's not enough for Watson to just come up with answers,
it has to support and defend them.
So after downgrading obviously wrong answers, Watson finds passages from many different sources
to collect positive and negative evidence
for all of the remaining possibilities.
Watson understands these passages because it has learned the relationships between words,
relationships such as 'books have heroes' or 'authors create characters'.
Scoring algorithms then rate the quality of this evidence based on everything from
the source material's reliability to whether the time and location
appear correct.
There are still hundreds of possible answers left, so thousands of algorithms work in parallel
to score the evidence for each and every one of them.
Remember: this must happen in seconds.
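
The sketch below mimics that stage with two toy scorers, a keyword-overlap check and a flat source-reliability score, run over the candidates concurrently with a thread pool. The scorers, passages and numbers are invented for illustration; Watson combines hundreds of far more sophisticated algorithms.

from concurrent.futures import ThreadPoolExecutor

def keyword_overlap(candidate, passage):
    """Fraction of the candidate's words that appear in the supporting passage."""
    c_words = set(candidate.lower().split())
    p_words = set(passage.lower().split())
    return len(c_words & p_words) / max(len(c_words), 1)

def source_reliability(candidate, passage):
    """Pretend every passage in this toy example is moderately reliable."""
    return 0.7

SCORERS = [keyword_overlap, source_reliability]

def score_candidate(candidate, passages):
    """Run every scorer against every evidence passage for one candidate."""
    return [scorer(candidate, passage) for passage in passages for scorer in SCORERS]

def score_all(candidates_with_evidence):
    """Score many candidates concurrently, mimicking the parallel evidence scoring."""
    with ThreadPoolExecutor() as pool:
        futures = {c: pool.submit(score_candidate, c, passages)
                   for c, passages in candidates_with_evidence.items()}
        return {c: f.result() for c, f in futures.items()}

evidence = {
    "d'Artagnan": ["d'Artagnan is the hero of The Three Musketeers by Dumas."],
    "Aramis": ["Aramis is one of the three musketeers in Dumas's novel."],
}
print(score_all(evidence))  # per-candidate lists of evidence scores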
Step 4: final merging and ranking.
Different types of evidence are better at solving different types of questions, so just
like a person learns from practice, Watson uses the experience it gains from trying to
answer similar questions in order to weigh the importance of its different types of evidence.
It's not about memorizing trivia.
By playing thousands of practice games, Watson learns how to weigh, apply and combine its own
algorithms to help decide the degree to which each piece of evidence is useful or not.
These weighted evidence scores are merged together to decide the final rankings for
all the possible answers, with the highest-ranked answers appearing in order
on Watson's answer panel.
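
A minimal sketch of that merge, assuming three evidence scores per candidate and a set of invented weights standing in for what Watson learns from practice games: weight the scores, sum them, squash the total into something that reads like a confidence, and sort.

import math

def merge_scores(evidence_scores, weights):
    """Combine per-scorer evidence scores into one confidence value using
    learned weights (invented here), squashed through a logistic function."""
    total = sum(w * s for w, s in zip(weights, evidence_scores))
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical learned weights and per-candidate evidence scores.
weights = [2.0, 1.5, 0.5]  # e.g. type match, passage support, popularity
candidates = {
    "d'Artagnan": [0.9, 0.8, 0.6],
    "Aramis": [0.4, 0.3, 0.6],
    "Porthos": [0.3, 0.2, 0.5],
}

ranked = sorted(((merge_scores(scores, weights), name)
                 for name, scores in candidates.items()), reverse=True)
for confidence, name in ranked:
    print(f"{name}: {confidence:.2f}")  # highest-ranked answer first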
In Jeopardy, contestants lose money if they buzz in with the wrong answer, so Watson estimates
its confidence that its top answer, along with every other answer possibility, is correct.
This confidence is based on how highly the answers
were rated during evidence scoring and ranking.
If Watson's confidence for its top answer is low, under 50% for example,
then Watson won't answer.
This is an important step for computing.
Watson knows what it knows and it knows what it doesn't know.
There is no fixed confidence level deciding whether or not Watson buzzes in.
The threshold is constantly changing, based on how well Watson is doing
relative to the other players and how much money is left on the board.
In this case, Watson arrived at its answer with 78% confidence.
For that stage of the game, it was a high enough confidence level to buzz in and
it won Watson $800 in the process.
Who is d'Artagnan?
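
To show how such a moving threshold might work, here is an invented sketch in which a 50% base threshold is nudged up or down by the current lead and the money left on the board. The formula and the game-state numbers are made up for illustration; Watson's actual buzzing strategy is far more elaborate.

def should_buzz(confidence, my_score, best_opponent_score, money_left_on_board):
    """Decide whether to buzz in: protect a lead late in the game by raising
    the threshold, take more risks when behind by lowering it."""
    base_threshold = 0.50
    lead = my_score - best_opponent_score
    adjustment = 0.15 * (lead / max(money_left_on_board, 1))
    threshold = min(max(base_threshold + adjustment, 0.30), 0.85)
    return confidence >= threshold, threshold

# Hypothetical game state; with 78% confidence this toy model would buzz in too.
buzz, threshold = should_buzz(confidence=0.78, my_score=5200,
                              best_opponent_score=4800, money_left_on_board=12000)
print(buzz, f"{threshold:.3f}")  # True 0.505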
IBM is using Jeopardy as a way to push the science of deep analytics and
natural language processing forward, but Jeopardy is just the first step.
The same technology that helps Watson answer a Jeopardy question could have enormous implications
for healthcare, finance or virtually any other industry where people are using information
to help them make better decisions.
Watson's ultimate success will be measured not by 'Daily Doubles'
but by what it means for society.
Let's build a smarter planet.