Watson for Oncology (WFO) – more details

Back to Watson for Oncology (WFO) … today was deep-dive day, looking at what papers have been written specifically about WFO.

So, on Sunday, June 23, 2019, using Google Scholar … the list below covers the most useful items I could find.

Conclusions:

  • Shows promise
  • Not ready for solo flight (i.e. needs clinicians to work with it).
  • Benefits from adding diagnostic tests like GEA (gene expression assays).
  • Keep working on improving WFO, and understand specifics better.

 

Literature I looked at (I will go through it again in more detail and provide further insights):

  1. Choi, Y. I., Chung, J. W., Kim, K. O., Kwon, K. A., Kim, Y. J., Park, D. K., … & Sung, K. H. (2019). Concordance Rate between Clinicians and Watson for Oncology among Patients with Advanced Gastric Cancer: Early, Real-World Experience in Korea. Canadian Journal of Gastroenterology and Hepatology, 2019.
  2. Kim, Y. Y., Oh, S. J., Chun, Y. S., Lee, W. K., & Park, H. K. (2018). Gene expression assay and Watson for Oncology for optimization of treatment in ER-positive, HER2-negative breast cancer. PloS one, 13(7), e0200100.
  3. Schmidt, C. (2017). MD Anderson breaks with IBM Watson, raising questions about artificial intelligence in oncology. JNCI: Journal of the National Cancer Institute, 109(5).
  4. Zhang, X. C., Zhou, N., Zhang, C. T., Lv, H. Y., Li, T. J., Zhu, J. J., … & Liu, G. (2017). 544P Concordance study between IBM Watson for Oncology (WFO) and clinical practice for breast and lung cancer patients in China. Annals of Oncology, 28(suppl_10), mdx678-001.
  5. Zou, F., Liu, C. Y., Liu, X. H., Tang, Y. F., Ma, J. A., & Hu, C. H. (2018). Concordance Study between IBM Watson for Oncology and Real Clinical Practice for Cervical Cancer Patients in China: A Retrospective Analysis. Available at SSRN 3287513.
  6. Somashekhar, S. P., Sepúlveda, M. J., Puglielli, S., Norden, A. D., Shortliffe, E. H., Rohit Kumar, C., … & Ramya, Y. (2018). Watson for Oncology and breast cancer treatment recommendations: agreement with an expert multidisciplinary tumor board. Annals of Oncology, 29(2), 418-423.
  7. Somashekhar, S. P., Sepúlveda, M. J., Norden, A. D., Rauthan, A., Arun, K., Patil, P., … & Kumar, R. C. (2017). Early experience with IBM Watson for Oncology (WFO) cognitive computing system for lung and colorectal cancer treatment.
  8. Somashekhar, S. P., Kumarc, R., Rauthan, A., Arun, K. R., Patil, P., & Ramya, Y. E. (2017). Abstract S6-07: Double blinded validation study to assess performance of IBM artificial intelligence platform, Watson for oncology in comparison with Manipal multidisciplinary tumour board–First study of 638 breast cancer cases.
  9. Liu, C., Liu, X., Wu, F., Xie, M., Feng, Y., & Hu, C. (2018). Using artificial intelligence (Watson for oncology) for treatment recommendations amongst Chinese patients with lung cancer: Feasibility study. Journal of medical Internet research, 20(9), e11087.
  10. Ross, C., & Swetlitz, I. (2017). IBM pitched its Watson supercomputer as a revolution in cancer care. It’s nowhere close. STAT News.
  11. Zauderer, M. G., Gucalp, A., Epstein, A. S., Seidman, A. D., Caroline, A., Granovsky, S., … & Petri, J. (2014). Piloting IBM Watson Oncology within Memorial Sloan Kettering’s regional network.
  12. Herath, D. H., Wilson-Ing, D., Ramos, E., & Morstyn, G. (2016). Assessing the natural language processing capabilities of IBM Watson for oncology using real Australian lung cancer cases.
  13. Bach, P., Zauderer, M. G., Gucalp, A., Epstein, A. S., Norton, L., Seidman, A. D., … & Keesing, J. (2013). Beyond Jeopardy!: Harnessing IBM’s Watson to improve oncology decision making.
  14. Kris, M. G., Gucalp, A., Epstein, A. S., Seidman, A. D., Fu, J., Keesing, J., … & Setnes, M. (2015). Assessing the performance of Watson for oncology, a decision support system, using actual contemporary clinical cases.
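
Many of the papers above report a concordance rate between WFO and clinicians. As a rough sketch of the arithmetic only: the function below counts a case as concordant when the clinicians' choice appears among the options the system considers acceptable. The exact definition varies between studies, and the function name, regimen labels, and case data are all hypothetical, made up for illustration.

```python
def concordance_rate(cases):
    """Share of cases where the clinicians' choice is among the system's acceptable options.

    Each case is a (clinician_choice, system_options) pair. This is an assumed,
    simplified definition; individual studies define concordance in their own way.
    """
    if not cases:
        raise ValueError("no cases to compare")
    concordant = sum(
        1 for clinician_choice, system_options in cases
        if clinician_choice in system_options
    )
    return concordant / len(cases)


# Hypothetical cases, for illustration only.
cases = [
    ("regimen_A", {"regimen_A", "regimen_B"}),  # concordant
    ("regimen_C", {"regimen_A"}),               # not concordant
    ("regimen_B", {"regimen_B"}),               # concordant
    ("regimen_D", {"regimen_D", "regimen_E"}),  # concordant
]

rate = concordance_rate(cases)
print(f"{rate:.0%}")  # 3 of 4 cases agree: 75%
```

A single percentage like this hides which cancer types or stages drive the disagreement, which is why the "understand specifics better" conclusion matters.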

A more serious study of the Public Git Archive (PGA)

Following up on the Octoverse clues, I uncovered this gem: Markovtsev, Vadim, and Waren Long. “Public git archive: a big code dataset for all.” In Proceedings of the 15th International Conference on Mining Software Repositories, pp. 34-37. ACM, 2018. You can look at the arXiv version here.

This study points to the following as the most popular programming languages:

  1. C
  2. JS
  3. C++ 
  4. Java
  5. PHP
  6. Go
  7. Python
  8. Obj-C
  9. C#
  10. Ruby

If you’re into data mining and analysis of REALLY large public datasets, this one offers lots to work with. According to the authors, the Public Git Archive occupies 3.0 TB on disk. Enjoy …

 

Hagen’s Biological and clinical data integration in healthcare study is great!

Just finished looking at Matt Hagen’s 2014 PhD dissertation, “Biological and clinical data integration and its applications in healthcare.” This is a great piece of work … You can find it here.

While it’s around 5 years old, the insights and discussion are excellent. I like the detailed breakdown of how different ontologies and vocabularies align (and how things fall through the cracks). I liked the discussion of using Neo4j to analyze relationships and simplify searches and relationship mappings.

I particularly liked the discussion of using ontologies to “facilitate improved prioritization of intensive care admissions and accurate clustering of multimorbidity conditions”. THIS IS BIG, with enormous potential.

Also worth noting is the discussion of his BioSPIDA relational database translator and how it contrasts with the separate Entrez Gene, PubMed, CDD, RefSeq, MMDB, and BioSystems NCBI databases.

His Table 7.2: Descriptions of patient clusters is rather illuminating, as is his discussion and analysis of ICU Electronic Health Records and the findings associated with morbidity outcomes.

For example, Cluster 1 contains the following Most Prevalent Conditions: Coronary arteriosclerosis, Hypercholesterolemia, Diabetes, Gastroesophageal reflux disease, Atrial fibrillation, Hyperlipidemia, Tobacco dependence, which led to the following Most Prevalent Procedures: Catheterization of left heart, Cardiopulmonary bypass operation, Angiocardiography of left heart.

 

I am surprised this work is not cited as much as it should be! IMHO, this work definitely should be used as a blueprint for additional investigations.

 

 

Artificial Intelligence and the Vatican’s Secret Archives

The Atlantic Magazine has a really interesting story … “Artificial Intelligence Is Cracking Open the Vatican’s Secret Archives”

This is pretty fascinating on many levels … Sam Kean’s Atlantic article says this about the new project:

Known as In Codice Ratio, it uses a combination of artificial intelligence and optical-character-recognition (OCR) software to scour these neglected texts and make their transcripts available for the very first time.

Check it out … maybe some old puzzles will be solved. Certainly, it will provide material for scholars and Hollywood filmmakers for a long time. 🙂

 

The In Codice Ratio project states:

The project concentrates on the collections of the Vatican Secret Archives, one of the largest and most important historical archive in the world. In an extension of 85 kilometres of shelving, it maintains more than 600 archival collections containing historical documents on the Vatican activities, such as, all the acts promulgated by the Vatican, account books, correspondence of the popes, starting from the eighth century.

and lists the following publications on the project:

  • Donatella Firmani, Paolo Merialdo, Marco Maiorino, Elena Nieddu: Towards Knowledge Discovery from the Vatican Secret Archives. In Codice Ratio-Episode 1: Machine Transcription of the Manuscripts. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD2018) (2018)

  • Donatella Firmani, Paolo Merialdo, Marco Maiorino, Elena Nieddu: Towards Knowledge Discovery from the Vatican Secret Archives. In Codice Ratio-Episode 1: Machine Transcription of the Manuscripts. arXiv:1803.03200

  • Donatella Firmani, Paolo Merialdo, Marco Maiorino: In Codice Ratio: Scalable Transcription of Vatican Registers. ERCIM News 2017(111) (2017)

  • Serena Ammirati, Donatella Firmani, Marco Maiorino, Paolo Merialdo, Elena Nieddu and Andrea Rossi. In Codice Ratio: Scalable Transcription of Historical Handwritten Documents. 25th Italian Symposium on Advanced Database Systems, SEBD 2017 (2017) – PDF

  • Donatella Firmani, Paolo Merialdo, Elena Nieddu and Simone Scardapane. In Codice Ratio: OCR of Handwritten Latin Documents using Deep Convolutional Networks. 11th Italian Workshop on Artificial Intelligence for Cultural Heritage (2017) – PDF

 

 

 

Swoogle Semantic Search Engine

OK … just filing this for future reference: Swoogle2007, a Semantic Web search engine … kinda concerning that it’s got 2007 in its title … According to its about page,

Swoogle (http://swoogle.umbc.edu/) is a specialized web search engine that discovers, analyzes and indexes knowledge encoded in semantic web documents published on the Web. Swoogle reasons about these documents and their constituent parts (e.g., terms, individuals, triples) and records meaningful metadata about them. Swoogle provides webscale semantic web data access service, which helps human users and software systems to find relevant documents, terms and triples, via its search and navigation services.

There’s more. Its blog has an entry on Knowledge Graph Fact Prediction via Knowledge-Enriched Tensor Factorization (here), and suggests related entries.

 

 

OK .. we’ll save this for later. Interesting!

 

Artificial Intelligence in Medicine (AIM) – IBM’s WATSON and WatsonPaths

I am working on providing pointers and discussion on AI in Medicine (AIM).

The general place to check first is the main Artificial Intelligence in Medicine resource page. You’ll find some useful ideas about Hetnets in biomedicine. These are heterogeneous networks with multiple node or relationship types, useful for data integration, translation, and biomedical knowledge mining. You’ll also find out about Project Rephetio (Drug Repurposing), developed to predict new uses for existing compounds. In addition, you’ll find links to OBI, the Ontology for Biomedical Investigations.

References and links associated with IBM Watson / WatsonPaths medical applications are located here.
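
To make the hetnet idea concrete, here is a minimal sketch of a network whose nodes and relationships both carry types, plus a walk along a typed path (a "metapath"). It only illustrates the data structure; it is not how Project Rephetio is implemented, and the class, node names, and relationship names are all hypothetical.

```python
class Hetnet:
    """Toy heterogeneous network: typed nodes joined by typed relationships."""

    def __init__(self):
        self.node_type = {}   # node id -> node type, e.g. "Gene"
        self.edges = {}       # (node id, relationship) -> set of neighbour ids

    def add_node(self, node, kind):
        self.node_type[node] = kind

    def add_edge(self, source, relationship, target):
        # Stored in both directions so a path can be walked from either end.
        self.edges.setdefault((source, relationship), set()).add(target)
        self.edges.setdefault((target, relationship), set()).add(source)

    def walk(self, start, metapath):
        """Follow a metapath, given as (relationship, node type) steps, from start."""
        frontier = {start}
        for relationship, kind in metapath:
            frontier = {
                neighbour
                for node in frontier
                for neighbour in self.edges.get((node, relationship), set())
                if self.node_type[neighbour] == kind
            }
        return frontier


# Hypothetical data, for illustration only.
net = Hetnet()
net.add_node("compound_A", "Compound")
net.add_node("gene_X", "Gene")
net.add_node("disease_Y", "Disease")
net.add_node("disease_Z", "Disease")
net.add_edge("compound_A", "binds", "gene_X")
net.add_edge("gene_X", "associates", "disease_Y")
net.add_edge("compound_A", "treats", "disease_Z")

# Diseases reachable from compound_A via Compound-binds-Gene-associates-Disease.
candidates = net.walk("compound_A", [("binds", "Gene"), ("associates", "Disease")])
print(candidates)  # {'disease_Y'}
```

Keeping the relationship type in the edge key is what separates this from a plain graph: the same pair of node types can be linked in several distinct ways ("treats" versus "binds" then "associates"), and a query can ask for one of them specifically.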

For convenience and robustness, I am including the initial references below. Enjoy.

 

AI in Medicine (AIM) approaches and applications have assisted in both trivial and profound ways, and they hold great promise. We argue that there are even larger systemic benefits when AI enabled medicine is considered at a national level.

This page aims to present relevant information and links to resources useful in furthering AIM objectives. Some links point to reports, preprints, papers, and books; others point to active and inactive databases; still others point to software repositories and AIM-specific software and platforms. While some of the links point to completed and/or terminated projects, we believe there’s much to be learned from the linked resources, and we hope these are used to spark curiosity and further ideas and progress in the spirit of “on the shoulders of Giants”.

AIM Related Projects

IBM WATSON / WatsonPaths: IBM’s Watson architecture has been, and is being, employed in medical applications.
WatsonPaths: Scenario-Based Question Answering and Inference over Unstructured Information is a key paper available here. As an illustration, the paper discusses a patient with erythropoietin deficiency, via the query “A 32-year-old woman with type 1 diabetes mellitus has had progressive renal failure… Her hemoglobin concentration is 9 g/dL… A blood smear shows normochromic, normocytic cells. What is the problem?”

The table below lists some of the key patents in IBM’s Watson intellectual property portfolio; the last column of each row links to the patent.

Columns: id, title, inventor/author, priority date, grant date (absent for published applications), result link
US-10216804-B2 Providing answers to questions using hypothesis pruning Jennifer Chu-Carroll, David A. Ferrucci, David C. Gondek, Adam P. Lally, James C. Murdock, IV 9/28/10 2/26/19 https://patents.google.com/patent/US10216804B2/en
US-10133808-B2 Providing answers to questions using logical synthesis of candidate answers Eric W. Brown, Jennifer Chu-Carroll, David A. Ferrucci, Adam P. Lally, James W. Murdock, John M. Prager 9/28/10 11/20/18 https://patents.google.com/patent/US10133808B2/en
US-9805613-B2 System and method for domain adaptation in question answering Sugato Bagchi, David A. Ferrucci, David C. Gondek, Anthony T. Levas, Wlodek W. Zadrozny 5/14/08 10/31/17 https://patents.google.com/patent/US9805613B2/en
US-9798800-B2 Providing question and answers with deferred type evaluation using text with limited structure Pablo A. Duboue, James J. Fan, David A. Ferrucci, James W. Murdock, IV, Christopher A. Welty, Wlodek W. Zadrozny 9/24/10 10/24/17 https://patents.google.com/patent/US9798800B2/en
US-9690861-B2 Deep semantic search of electronic medical records Keerthana Boloor, Eric W. Brown, Murthy V. Devarakonda, David Ferrucci, John M. Prager 7/17/14 6/27/17 https://patents.google.com/patent/US9690861B2/en
US-9529845-B2 Candidate generation in a question answering system Jennifer Chu-Carroll, James J. Fan, David A. Ferrucci 8/13/08 12/27/16 https://patents.google.com/patent/US9529845B2/en
US-9508038-B2 Using ontological information in open domain type coercion David A. Ferrucci, Aditya Kalyanpur, James W. Murdock, IV, Christopher A. Welty, Wlodek W. Zadrozny 9/24/10 11/29/16 https://patents.google.com/patent/US9508038B2/en
US-9454603-B2 Semantically aware, dynamic, multi-modal concordance for unstructured information analysis Branimir K. Boguraev, Youssef Drissi, David A. Ferrucci, Paul T. Keyser, Anthony T. Levas 8/6/10 9/27/16 https://patents.google.com/patent/US9454603B2/en
US-9262938-B2 Combining different type coercion components for deferred type evaluation Sugato Bagchi, James J. Fan, David A. Ferrucci, Aditya A. Kalyanpur, James W. Murdock, IV, Christopher A. Welty 3/15/13 2/16/16 https://patents.google.com/patent/US9262938B2/en
US-9189541-B2 Evidence profiling Eric W. Brown, Jennifer Chu-Carroll, James J. Fan, David A. Ferrucci, David C. Gondek, Anthony T. Levas, James W. Murdock, IV 9/24/10 11/17/15 https://patents.google.com/patent/US9189541B2/en
US-9165252-B2 Utilizing failures in question and answer system responses to enhance the accuracy of question and answer systems Michael A. Barborak, Jennifer Chu-Carroll, David A. Ferrucci, James W. Murdock, IV, Wlodek W. Zadrozny 7/15/11 10/20/15 https://patents.google.com/patent/US9165252B2/en
US-9153142-B2 User interface for an evidence-based, hypothesis-generating decision support system Sugato Bagchi, Michael A. Barborak, Steven D. Daniels, David A. Ferrucci, Anthony T. Levas 5/26/11 10/6/15 https://patents.google.com/patent/US9153142B2/en
US-9146917-B2 Validating that a user is human Michael A. Barborak, David A. Ferrucci, James W. Murdock, IV, Wlodek W. Zadrozny 7/15/11 9/29/15 https://patents.google.com/patent/US9146917B2/en
US-9031832-B2 Context-based disambiguation of acronyms and abbreviations Branimir K. Boguraev, Jennifer Chu-Carroll, David A. Ferrucci, Anthony T. Levas, John M. Prager 9/29/10 5/12/15 https://patents.google.com/patent/US9031832B2/en
US-8972321-B2 Fact checking using and aiding probabilistic question answering David A. Ferrucci, David C. Gondek, Aditya A. Kalyanpur, Adam P. Lally, Siddharth Patwardham 9/29/10 3/3/15 https://patents.google.com/patent/US8972321B2/en
US-8943051-B2 Lexical answer type confidence estimation and application James J. Fan, David A. Ferrucci, David C. Gondek, Aditya A. Kalyanpur, Adam P. Lally, James W. Murdock, Wlodek W. Zadrozny 9/24/10 1/27/15 https://patents.google.com/patent/US8943051B2/en
US-8880388-B2 Predicting lexical answer types in open domain question and answering (QA) systems David A. Ferrucci, Alfio M. Gliozzo, Aditya A. Kalyanpur 8/4/11 11/4/14 https://patents.google.com/patent/US8880388B2/en
US-2014164303-A1 Method of answering questions and scoring answers using structured knowledge mined from a corpus of data Sugato Bagchi, David A. Ferrucci, Anthony T. Levas, Erik T. Mueller 12/11/12 https://patents.google.com/patent/US20140164303A1/en
US-8738362-B2 Evidence diffusion among candidate answers during question answering David A. Ferrucci, David C. Gondek, Aditya A. Kalyanpur, Adam P. Lally 9/28/10 5/27/14 https://patents.google.com/patent/US8738362B2/en
US-8738617-B2 Providing answers to questions using multiple models to score candidate answers Eric W. Brown, David A. Ferrucci, James W. Murdock, IV 9/28/10 5/27/14 https://patents.google.com/patent/US8738617B2/en
US-2014108322-A1 Text-based inference chaining David W. Buchanan, David A. Ferrucci, Adam P. Lally 10/12/12 https://patents.google.com/patent/US20140108322A1/en
US-2014072948-A1 Generating secondary questions in an introspective question answering system Branimir K. Boguraev, David W. Buchanan, Jennifer Chu-Carroll, David A. Ferrucci, Aditya A. Kalyanpur, James W. Murdock, IV, Siddharth A. Patwardhan 9/11/12 https://patents.google.com/patent/US20140072948A1/en
US-8560300-B2 Error correction using fact repositories David A. Ferrucci, David C. Gondek, Wlodek W. Zadrozny 9/9/09 10/15/13 https://patents.google.com/patent/US8560300B2/en
US-8510327-B2 Method and process for semantic or faceted search over unstructured and annotated data Branimir Konstantinov Boguraev, Eric William Brown, Youssef Drissi, David Angelo Ferrucci, Paul Turquand Keyser, Anthony Tom Levas, Dafna Sheinwald 9/24/10 8/13/13 https://patents.google.com/patent/US8510327B2/en
US-8332394-B2 System and method for providing question and answers with deferred type evaluation James Fan, David Ferrucci, David C. Gondek, Wlodek W. Zadrozny 5/23/08 12/11/12 https://patents.google.com/patent/US8332394B2/en
US-8301438-B2 Method for processing natural language questions and apparatus thereof David Angelo Ferrucci, Li Ma, Yue Pan, Zhao Ming Qiu, Chen Wang, Christopher Welty, Lei Zhang 4/23/09 10/30/12 https://patents.google.com/patent/US8301438B2/en
US-8280838-B2 Evidence evaluation system and method based on question answering David A. Ferrucci, Wlodek W. Zadrozny 9/17/09 10/2/12 https://patents.google.com/patent/US8280838B2/en
US-8275803-B2 System and method for providing answers to questions Eric W. Brown, David Ferrucci, Adam Lally, Wlodek W. Zadrozny 5/14/08 9/25/12 https://patents.google.com/patent/US8275803B2/en
CA-2843405-A1 A decision-support application and system for problem solving using a question-answering system Sugato Bagchi, David A. Ferrucci, Anthony T. Levas, Erik T. Mueller 3/8/11 https://patents.google.com/patent/CA2843405A1/en
US-8200656-B2 Inference-driven multi-source semantic search Eric W. Brown, Jennifer Chu-Carroll, James J. Fan, David A. Ferrucci, David C. Gondek, Anthony T. Levas, James William Murdock, IV 11/17/09 6/12/12 https://patents.google.com/patent/US8200656B2/en
US-2011125734-A1 Questions and answers generation Pablo A. Duboue, David A. Ferrucci, David C. Gondek, James W. Murdock, IV, Wlodek W. Zadrozny 11/23/09 https://patents.google.com/patent/US20110125734A1/en
US-7757163-B2 Method and system for characterizing unknown annotator and its type system with respect to reference annotation types and associated reference taxonomy nodes Yurdaer N. Doganata, Youssef Drissi, David A. Ferrucci, Tong-haing Fin, Genady Grabarnik, Lev Kozakov 1/5/07 7/13/10 https://patents.google.com/patent/US7757163B2/en
US-7333967-B1 Method and system for automatic computation creativity and specifically for story generation Selmer Conrad Bringsjord, David Angelo Ferrucci 12/23/99 2/19/08 https://patents.google.com/patent/US7333967B1/en
US-7178105-B1 Method and system for document component importation and reconciliation David Angelo Ferrucci, Steinar Flatland, Adam Patrick Lally 2/4/00 2/13/07 https://patents.google.com/patent/US7178105B1/en
US-7139752-B2 System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations Andrei Z Broder, David Carmel, Arthur C Ciccolo, David Ferrucci, Yoelle Maarek, Yosi Mass, Aya Soffer, Wlodek W Zadrozny 5/30/03 11/21/06 https://patents.google.com/patent/US7139752B2/en
US-7131057-B1 Method and system for loose coupling of document and domain knowledge in interactive document configuration David Angelo Ferrucci, Steinar Flatland, Adam Patrick Lally 2/4/00 10/31/06 https://patents.google.com/patent/US7131057B1/en
US-2004243554-A1 System, method and computer program product for performing unstructured information management and automatic text analysis Andrei Broder, Arthur Ciccolo, David Ferrucci, Alan Marwick, Wlodek Zadrozny 5/30/03 https://patents.google.com/patent/US20040243554A1/en
US-2004243556-A1 System, method and computer program product for performing unstructured information management and automatic text analysis, and including a document common analysis system (CAS) David Ferrucci, Thilo Goetz, Thomas Hampp, Alan Marwick, Oliver Suhre, Wlodek Zadrozny 5/30/03 https://patents.google.com/patent/US20040243556A1/en
US-2004243560-A1 System, method and computer program product for performing unstructured information management and automatic text analysis, including an annotation inverted file system facilitating indexing and searching Andrei Broder, David Ferrucci, Alan Marwick, Yosi Mass, Wlodek Zadrozny 5/30/03 https://patents.google.com/patent/US20040243560A1/en

 

SOME BOOKS ON IBM WATSON

(unless otherwise specified, these are of general domain applicability)

  1. Rob High and Tanmay Bakshi, (2019), Cognitive Computing with IBM Watson: Build smart applications using artificial intelligence as a service
  2. IBM Redbooks, IBM Watson Content Analytics: Discovering Actionable Insight from Your Content. 3rd Edition
  3. Stephen Baker, (2011), Final Jeopardy: Man vs. Machine and the Quest to Know Everything

RELEVANT SOURCES INCLUDING THOSE CITED IN PATENTS:

  1. IBM’s DeepQA Research Team Publications
  2. Ferrucci et al., “Towards the Open Advancement of Question Answering Systems,” IBM Technical Report RC24789, Computer Science, Apr. 22, 2009.
  3. David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager, Nico Schlaefer, and Chris Welty, (2010), Building Watson: An Overview of the DeepQA Project, AI Magazine, Fall 2010.
  4. William Murdock (2015), Decision Making in IBM Watson Question Answering Web presentation: Ontology Summit 2015
  5. M. Devarakonda, Dongyang Zhang, Ching-Huei Tsou, M. Bornea, Problem-oriented patient record summary: An early report on a Watson application, e-Health Networking, Applications and Services (Healthcom), 2014 IEEE 16th International Conference on, pp. 281-286
  6. WatsonPaths: Scenario-based Question Answering and Inference over Unstructured Information, IBM Research Report RC25489, IBM, 2014
  7. Nico Schlaefer, (2011), Statistical Source Expansion for Question Answering, PhD Thesis, CMU-LTI-11-019
  8. Special Issue on Question Answering, AI Magazine Vol 31 No 3: Fall 2010
  9. Bernstein et al., Ginseng: A Guided Input Natural Language Search Engine for Querying Ontologies, 2006, Jena User Conference, pp. 1-3.
  10. Blitzer, Domain Adaptation of Natural Language Processing Systems, Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, 2007.
  11. Bollen et al., Mining associative relations from website logs and their application to context-dependent retrieval using spreading activation, 1999, ACM, pp. 1-6.
  12. Broekstra et al., Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, 2002, ISWC, vol. 2342/2002, pp. 54-68.
  13. Chang et al., “Creating An Online Dictionary of Abbreviations from MEDLINE,” J Am Med Inform Assoc. 2002; 9:612-620. DOI 10.1197/jamia.M1139.
  14. Chu-Carroll et al., “In Question-Answering, Two Heads are Better than One”, HLT-NAACL’03, May-Jun. 2003, pp. 24-31, Edmonton, Canada.
  15. Cucerzan et al., “Factoid Question Answering over Unstructured and Structured Web Content”, In Proceedings of the 14th Text Retrieval Conference TREC 2005, Dec. 31, 2005.
  16. Finin, Swoogle: a search and metadata engine for the semantic web, 2004, ACM, pp. 652-659.
  17. Finin et al., Information Retrieval and the Semantic Web, 2005, System Sciences, pp. 1-10.
  18. Kaufmann et al., How Useful Are Natural Language Interfaces to the Semantic Web for Casual End-Users?, 2007, Springer, pp. 281-294.
  19. Ran et al., Natural Language Query System for RDF Repositories, 2007, SNLP, pp. 1-6.
  20. Tablan et al., A Natural Language Query Interface to Structured Information, 2008, Springer, pp. 1-15.
  21. Wang et al., PANTO: A Portable Natural Language Interface to Ontologies, 2007, Springer, pp. 473-487.
  22. Bouzegboub et al, Search and Composition of Learning Objects in a Visual Environment, in Learning in the Synergy of Multiple Disciplines, Lecture Notes in Computer Science, Springer: Berlin & Heidelberg, vol. 5794 (2009) ISSN 0302-9743 (Print) 1611-3349 (Online), ISBN 978-3-642-04635-3.
  23. Cao, TH. et al.; A robust ontology-based method for translating natural language queries to conceptual graphs, 2008.
  24. Chabane Djeraba, Marinette Bouet, and Henri Briand, “Concept-Based Query in Visual Information Systems,” IEEE International Forum on Research and Technology Advances in Digital Libraries ADL’98, pp. 299-308.
  25. D. Braga, A. Campi, and S. Ceri, XQBE (XQuery by Example): A Visual Interface to the Standard XML Query Language, ACM Transactions on Database Systems 30.2 (2005) 398-443.
  26. G. Barzdins, E. Liepins, M. Veilande, & M. Zviedris, “Ontology Enabled Graphical Database Query Tool for End-Users,” in H.-M. Haav & A. Kalja, edd., Databases and Information Systems V (2009) 105-116.
  27. Gustavo O. Arocena, Alberto O. Mendelzon, and George A. Mihaila, “Applications of a Web query language,” Computer Networks and ISDN Systems, 29.8-13 (Sep. 1997) 1305-1316 = Papers from the Sixth International World Wide Web Conference.
  28. Irna M.R. Evangelista Filha, Altigran S. Da Silva, Alberto H.F. Laender, and David W. Embley, “Using Nested Tables for Representing and Querying Semistructured Web Data,” in Anne Banks Pidduck, John Mylopoulos, Carson C. Woo, and M. Tamer Ozsu, edd., Advanced Information Systems Engineering LNCS 2348 (2002) 719-723.
  29. Kudelka, M. et al.; Semantic Analysis of Web Pages Using Web Patterns, 2006 (IEEE).
  30. Li et al, “XGI: A Graphical Interface for XQuery Creation,” in AMIA Annu Symp Proc. (2007) 453-457.
  31. Moller, M. et al, RadSem: semantic annotation and retrieval for medical images, 2009.
  32. Petropoulos et al, Querying and Reporting Semistructured Data, QURSED, 2002.
  33. S. Jeromy Carriere and Rick Kazman, “WebQuery: searching and visualizing the Web through connectivity,” Computer Networks and ISDN Systems 29.8-13 (Sep. 1997) 1257-1267 = Papers from the Sixth International World Wide Web Conference.
  34. Sriram Raghavan and Hector Garcia-Molina, “Complex Queries over Web Repositories,” Proceedings 2003 VLDB Conference (2003) 33-44.
  35. Urbain, J. et al.; Probabilistic passages models for semantic search of genomics literature, 2008.
  36. Wen-Syan Li and Junho Shim, “Facilitating complex Web queries through visual user interfaces and query relaxation,” in Computer Networks and ISDN Systems, vol. 30, Issues 1-7, Apr. 1998, pp. 149-159 = Proceedings of the Seventh International World Wide Web Conference.
  37. Wen-Syan Li, Junho Shim and K. Selcuk Candan, “WebDB: A System for Querying Semi-structured Data on the Web,” Journal of Visual Languages & Computing 13.1 (Feb. 2002) 3-33.
  38. Xian Ding et al, An ontology-based semantic expansion search model using semantic condition transform, 2009.
  39. Apache incubator, Apache UIMA, http://incubator.apache.org/uima/.
  40. Berger et al., A Maximum Entropy Approach to Natural Language Processing, Association for Computational Linguistics, 1996.
  41. Etzioni et al., “Open information extraction from the web,” Communications of the ACM, vol. 51 Issue 12, Dec. 2008, pp. 68-74.
  42. Wikipedia, UIMA, http://en.wikipedia.org/wiki/UIMA.
  43. “INDRI Language modeling meets inference networks,” http://www.lemurproject.org/indri/, last modified May 23, 2011; pp. 1-2.
  44. “Question answering,” From Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Question-answering
  45. Adar, “SaRAD: a Simple and Robust Abbreviation Dictionary,” Bioinformatics, Mar. 2004, pp. 527-533, vol. 20 Issue 4.
  46. Aditya et al., “Leveraging Community-built Knowledge for Type Coercion in Question Answering,” Proceedings of ISWC 2011.
  47. Balahur, “Going Beyond Traditional QA Systems: Challenges and Keys in Opinions Question Answering,” Coling 2010: Poster Volume, pp. 27-35, Beijing, Aug. 2010.
  48. … more references coming 🙂


Beyond the Usual AI and Machine Intelligence

the Beyond topics
  1. George Gilder, Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy. Worth reading to obtain additional perspectives. Some may be right, some may be wrong. Definitely technologically provocative. Will Google/Alphabet last? Do you know about The Dalles? You should. My first clue was through the book … OK … find out more about Google’s Data Centers. Find out more about other pieces worth knowing.

the Artificial and Machine Intelligence related topics

  1. Gelernter, D. (2016). The tides of mind: Uncovering the spectrum of consciousness. WW Norton & Company.
  2. Marquis, P., Papini, O., & Prade, H. (2014). Some Elements for a Prehistory of Artificial Intelligence in the Last Four Centuries. ECAI.
  3. Scheutz, M. (Ed.). (2002). Computationalism: new directions. MIT Press.
  4. Russell, S. J., & Norvig, P. (2016). Artificial intelligence: a modern approach.
    This is an updated edition of the 2010 version containing extensive current references. [Note: the book is sometimes hard to find, due to demand and its being the definitive AI textbook. Check the edition you are using/getting.]
  5. Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press. This is an updated (2nd) edition of the 1998 version
  6. Nilsson, N. J. (1998). Artificial intelligence: a new synthesis. Morgan Kaufmann.
  7. Poole, D. L., Mackworth, A. K., & Goebel, R. (1998). Computational intelligence: a logical approach (Vol. 1). New York: Oxford University Press.
    see also Artificial Intelligence: Foundations of Computational Agents 2nd Edition by the same authors.
  8. Pratt, V. (1987). Thinking Machines—The Evolution of Artificial Intelligence. Oxford: Basil Blackwell. – this is a general history of earlier machines … great reference to get historical insights not easily obtained elsewhere.
  9. Turing, A. M. (1948). Intelligent machinery. NPL. Mathematics Division. See also Turing, A. (2004). Intelligent machinery (1948). In B. Jack Copeland (Ed.), The Essential Turing: Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life plus The Secrets of Enigma, 395, which provides context and pointers to additional Turing resources.
  10. B. Jack Copeland (2004), Computability: Turing, Gödel, Church, and Beyond, The MIT Press.

Hard(er) Core Science Fiction and Speculative Fiction works

    1. John C. Wright’s Count to the Eschaton series is worth reading … it provides an interesting glimpse into a possible (far) future. It’s also fun to read … so good ideas and an interesting, universe-spanning plot.

Natural Question Answering Research at Google

Just announced on the Google AI Blog …

Natural Questions: a New Corpus and Challenge for Question Answering Research

 

This is pretty exciting … I hope to see this grow and find fruitful implementation in the Google search engine.

This is what the Google AI researchers are saying:

…. there are currently no large, publicly available sources of naturally occurring questions (i.e. questions asked by a person seeking information) and answers that can be used to train and evaluate QA models. This is because assembling a high-quality dataset for question answering requires a large source of real questions and significant human effort in finding correct answers.

To help spur research advances in QA, we are excited to announce Natural Questions (NQ), a new, large-scale corpus for training and evaluating open-domain question answering systems, and the first to replicate the end-to-end process in which people find answers to questions. NQ is large, consisting of 300,000 naturally occurring questions, along with human annotated answers from Wikipedia pages, to be used in training QA systems. We have additionally included 16,000 examples where answers (to the same questions) are provided by 5 different annotators,

I am really looking forward to digging into this … good questions and good answers are definitely part of the key to solving some great puzzles …

have fun …