
DeepQA - Getting the pieces together

A cognitive query engine has three phases:

  1. Index and store text
  2. Search text
  3. Process text


The first two were achieved easily, since we already have a multi-platform, Java-based search engine: Solr.

Check out my Notes on SOLR.

How we index the data is the most important decision. Once the data is indexed and stored, searching is fairly easy. However, since we only need the passage that closely matches what we are asking, we want a highlight (the matching snippet) as the result, not the full document.

For those who are not familiar with highlighting and search, here is a small example of a Solr search and highlight query.

http://localhost:8983/solr/testCollection/select?q=content_raw%3Aborn&wt=xml&indent=true&hl=true&hl.fl=content&hl.simple.pre=%3Cem%3E&hl.simple.post=%3C%2Fem%3E
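For anyone who prefers to build that query in code rather than by hand, here is a minimal sketch using only the JDK. The class name `HighlightQueryUrl` and its `build` method are made up for illustration; the parameter names and values are the ones from the URL above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Solr: assembles the highlight query URL.
final class HighlightQueryUrl {

    // Percent-encode a single parameter value for use in a query string.
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    static String build(String baseUrl, String collection, String query, String highlightField) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);                 // e.g. content_raw:born
        params.put("wt", "xml");                // response format
        params.put("indent", "true");
        params.put("hl", "true");               // switch highlighting on
        params.put("hl.fl", highlightField);    // field to take snippets from
        params.put("hl.simple.pre", "<em>");    // marker before a match
        params.put("hl.simple.post", "</em>");  // marker after a match

        StringJoiner queryString = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            queryString.add(e.getKey() + "=" + enc(e.getValue()));
        }
        return baseUrl + "/" + collection + "/select?" + queryString;
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", "testCollection",
                "content_raw:born", "content"));
    }
}
```

Called with the collection and field names used in this post, it should give back the same URL as above. Only the values are encoded, since the parameter names here contain no reserved characters.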


The response contains both a result section and a highlight section. It's the highlight that we analyse to fetch the answer.
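To make the highlight part concrete, here is a rough sketch of pulling the snippets out of an XML response with the JDK's own XML and XPath classes. The `SAMPLE` response is invented for illustration (it is not output from the collection above) and is trimmed to the `highlighting` section. The class and method names are hypothetical too.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: collects highlight snippets from a Solr XML response.
final class HighlightExtractor {

    // Made-up sample response, reduced to the highlighting section.
    static final String SAMPLE =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<response>\n"
        + "  <lst name=\"highlighting\">\n"
        + "    <lst name=\"doc1\">\n"
        + "      <arr name=\"content\">\n"
        + "        <str>Einstein was &lt;em&gt;born&lt;/em&gt; in Ulm.</str>\n"
        + "      </arr>\n"
        + "    </lst>\n"
        + "  </lst>\n"
        + "</response>\n";

    static List<String> snippets(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // Every <str> under highlighting/<document id>/<field> is one snippet.
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//lst[@name='highlighting']/lst/arr/str", doc,
                            XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // getTextContent() decodes &lt;em&gt; back into <em>.
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse Solr response", e);
        }
    }

    public static void main(String[] args) {
        snippets(SAMPLE).forEach(System.out::println);
    }
}
```

In a real setup a client library such as SolrJ would probably be the more comfortable route. The plain XML version is shown only because it needs nothing beyond the JDK.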


The highlighted text is split into meaningful sentences and processed with OpenNLP, which takes care of Step 3. OpenNLP helps find occurrences of names, locations, dates, etc., along with their confidence scores.
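As a rough illustration of the sentence-splitting step only, here is a sketch that strips the `<em>` markers and cuts a snippet into sentences. Note that it does not use OpenNLP: it swaps in the JDK's `BreakIterator` as a stand-in so that it runs without model files, and it leaves out the entity-finding step entirely. The class name and the sample snippet are made up.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: JDK-only stand-in for the sentence-splitting step.
final class HighlightSentences {

    static List<String> split(String snippet) {
        // Drop the highlight markers before any further processing.
        String plain = snippet.replace("<em>", "").replace("</em>", "");

        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        boundaries.setText(plain);

        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            String sentence = plain.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Made-up snippet, for illustration only.
        split("Einstein was <em>born</em> in Ulm. He later moved to Munich.")
                .forEach(System.out::println);
    }
}
```

Each sentence that comes out of this would then be tokenized and handed to the OpenNLP name finders for the actual entity extraction.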

