The OAQA Biomedical Question Answering (BioASQ) System aims to identify relevant documents, concepts and passages (snippets) and automatically generate exact answer texts for arbitrary biomedical questions (factoid, list, yes/no). It was the best-performing system in the factoid and list categories of the BioASQ QA Challenges two years in a row, in 2015 and 2016 (see official results).
System description papers have the most details about the design and implementation of the architecture and the algorithms:
- Zi Yang, Niloy Gupta, Xiangyu Sun, Di Xu, Chi Zhang, and Eric Nyberg. Learning to Answer Biomedical Factoid & List Questions: OAQA at BioASQ 3B. In Proceedings of CLEF 2015 Evaluation Labs and Workshop, 2015. [pdf]
- Zi Yang, Yue Zhou, and Eric Nyberg. Learning to Answer Biomedical Questions: OAQA at BioASQ 4B. In Proceedings of Workshop on Biomedical Natural Language Processing, 2016. [pdf]
Please contact Zi Yang if you have any questions or comments.
This system uses the ECD/CSE framework (an extension to the Apache UIMA framework that supports formal, declarative YAML-based descriptors for the space of system and component configurations to be explored during system optimization), the BaseQA type system, and various natural language processing and information retrieval algorithms and tools.
The system employs a three-layered design for both Java source code and YAML descriptors:
Layer | Description |
---|---|
`baseqa` | Domain-independent QA components, and the basic input/output definition of a QA pipeline, intermediate data objects, QA evaluation components, and data processing components. [source] [descriptor] |
`bioqa` | Biomedical resources that can be used in any biomedical QA task (outside the context of BioASQ). [source] [descriptor] |
`bioasq` | BioASQ-specific components, e.g. GoPubMed services. [source] [descriptor] |
Each layer contains packages for each processing step, e.g. preprocess, question analysis, abstract query generation, document retrieval and reranking, concept retrieval and reranking, passage retrieval, answer type prediction, evidence gathering, answer generation and ranking. Please refer to the architecture diagrams in the system description papers.
We define the following workflow descriptors (i.e. entry points) under `bioasq` for preprocessing, training, evaluating, and testing Phase A (retrieval tasks) and Phase B (factoid, list and yes/no answer generation).
Descriptor | Description |
---|---|
`preprocess-kb-cache` | Cache the requests and responses of concept and concept search services |
`preprocess-answer-type-gslabel` | Label gold-standard answer types |
`phase-a-train-concept-document` | Train document and concept reranking models |
`phase-a-train-snippet` | Train snippet reranking models |
`phase-a-evaluate`, `phase-a-test` | Evaluate (using development subset) and test (using test set) retrieval performance |
`phase-b-train-answer-type` | Train answer type prediction model for factoid and list questions |
`phase-b-train-answer-score` | Train answer scoring model for factoid and list questions |
`phase-b-train-answer-collective-score` | Train answer collective scoring model for list questions |
`phase-b-train-yesno` | Train yes/no prediction model |
`phase-b-evaluate-factoid-list`, `phase-b-test-factoid-list` | Evaluate (using development subset) and test (using test set) factoid and list QA |
`phase-b-evaluate-yesno`, `phase-b-test-yesno` | Evaluate (using development subset) and test (using test set) yes/no QA |
A workflow descriptor can be executed by the `ECDDriver`, which has been configured as the main class in the Maven `exec` goal, and thus it can be executed from the command line with the `config` specified as the `path.to.the.descriptor`.
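As a minimal sketch of how such an invocation can be assembled, the Python 3 helper below builds the Maven argument list for a given descriptor name, following the `mvn exec:exec -Dconfig=...` and `mvn clean compile exec:exec -Dconfig=...` command forms that appear later in this README. The function names `build_workflow_command` and `run_workflow` are illustrative and not part of the project.

```python
import subprocess


def build_workflow_command(descriptor, recompile=False):
    """Return the Maven argument list that runs one workflow descriptor.

    `descriptor` is the dotted name passed to -Dconfig, e.g. "bioasq.phase-a-test".
    """
    if not descriptor or any(ch.isspace() for ch in descriptor):
        raise ValueError("descriptor must be a non-empty name without whitespace")
    # Optionally run `clean compile` first, as the training commands do.
    goals = ["clean", "compile", "exec:exec"] if recompile else ["exec:exec"]
    return ["mvn"] + goals + ["-Dconfig=" + descriptor]


def run_workflow(descriptor, project_dir=".", recompile=False):
    """Run the workflow from the project root and return Maven's exit code."""
    command = build_workflow_command(descriptor, recompile)
    return subprocess.run(command, cwd=project_dir).returncode


print(" ".join(build_workflow_command("bioasq.phase-a-test")))
# prints: mvn exec:exec -Dconfig=bioasq.phase-a-test
```

Keeping command construction separate from execution makes it easy to inspect the command before starting a run that may take hours.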
The system also depends on other types of resources, including dictionaries, pretrained machine learning models, and service-related properties.
- Update Lucene from version 5.5.1 to 6.2.1, which results in a change of the default similarity.
- Update skr-webapi from version 0.0.4 to 0.0.6, due to an upstream API update to version 2.3.
- Update uts-api from version 0.0.2 to 0.0.3, due to an upstream API update.
- Update the TmTool URL to HTTPS (https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#RESTfulAPIs).
- Bug fixes, including improved stability to avoid ConcurrentModificationException in LuceneDocumentScorer and ShapeDistanceCollectiveAnswerScorer, a fix for a possible DuplicateKey in LuceneInMemorySentenceRetrievalExecutor, and retrying when the UTS service fails to obtain a service ticket.
This system needs to access external structured and unstructured resources for question answering and files for evaluating the system. Due to licensing issues, you may have to obtain these resources or credentials on your own. If you are a CMU OAQA person, please read the internal resource preparation instruction instead.
- Pre-prerequisites. Java 8, Maven 3, Python 2.
- (Recommended) UMLS license/account. The system needs to access the online UMLS services (UTS and MetaMap), which require a UMLS license/account (username, password, email). You can request one from https://uts.nlm.nih.gov//license.html. Otherwise, you need to remove all the `*-uts-*` and `*-metamap-*` steps from the descriptors, which will substantially hurt performance.

  If you want to increase the system's throughput, you may consider downloading and installing local instances of the UMLS and MetaMap services. Currently, we only have the Web services integrated.
- (Recommended) Medline corpus and Lucene index. The system can use a local Medline index or the GoPubMed Web API for searching PubMed. However, we recommend a local index because the reranking component may send up to hundreds of search requests per question, so processing a single question through a Web API can be very slow.
  - Download `.xml.gz` or `.xml` files from https://www.nlm.nih.gov/databases/download/pubmed_medline.html.
  - (Optional) Check out the `medline-indexer` project.
  - Create a Lucene index using the `StandardAnalyzer`. The index should contain three mandatory fields: `pmid`, `abstractText`, and `articleTitle`. We include example Java code, `MedlineCitationIndexer.java`, that indexes `.xml.gz` or `.xml` files inside a directory.
  - Create a sqlite database that has a `pmid2abstract` table with two fields, `pmid` and `abstract`, which is used to fix the section label errors in the provided development set. We include example Java code, `MedlineAbstractStoreBuilder.java`, that builds the sqlite file.
- Biomedical ontology dumps and Lucene index. You can skip this step if you don't need relevant concept retrieval, but in that case please also remove the `concept-retrieval` and `concept-rerank` steps from the descriptors. If you prefer using a local biomedical ontology index (recommended) to the official GoPubMed services, you need to obtain the ontology dumps and create a Lucene index.
  - Download the ontology dumps from all the sources according to the official resources guideline document.
  - (Optional) Check out the `biomedical-concept-indexer` project.
  - Create a Lucene index. The index should contain four mandatory fields: `id`, `name`, `definition`, and `source`. Ontologies from different sources need to be adapted to the same single schema, with the `source` and `id` fields specifying the original ontology source and the concept's identifier in it. The `definition` and `name` fields are intended to be used for retrieval. We include example Java code, `BiomedicalConceptIndexer.java`, that indexes multiple ontologies.
- BioASQ development and test files. You will need the test files for the `*-test-*` workflows and the development files for the `*-evaluate-*` and `*-train-*` workflows. However, the official development file has various errors. We created a Python script, `bioasq-dev-fixer.py`, to fix the errors; its fixes include `update_year`, `fix_go_url`, `normalize_yesno_answer`, `listify_ideal_answer`, `listify_exact_answer`, `split_parenthesis_answer`, `fix_section_label`, etc.
  - Obtain the test set and development set (containing the gold-standard answers) from the BioASQ website.
  - Install the Python `editdistance` package.
  - Use the provided script to fix the formatting errors in the development file.

        python bioasq-dev-fixer.py path_to_orig_4b_dev_set path_to_pmid2abstract_db 4b-dev.json.auto.fulltext

  - The resulting file should have an md5 of `8751b3a962eafb5c2aa8f09d5998fcd4`.
- (Optional) PubMed Central corpus and document service. Since the PubMed Central full text is no longer used in the evaluation as of BioASQ 2016, it is not integrated into the predefined workflow descriptors. If you plan to use the PubMed Central corpus for passage retrieval (see below), you also need to download the PMC corpus and set up a document server.
  - Download the PMC open access subset: https://www.ncbi.nlm.nih.gov/pmc/tools/ftp/
  - Use the BioASQTasks.jar (provided in the official preparation package prior to 2015) to convert the xml files to a single JSON Array file.

        java -jar BioASQTasks.jar

  - Create a directory `pmc`, split the JSON Array file into individual JSON documents, each containing a single document and named by its PMID, and put them into the `pmc` directory.
  - Set up an HTTP document server with the resource root being the directory that contains the `pmc` directory. Make sure you can access each document using the URL `http://HOST:PORT/pmc/DOC_ID`.
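As one illustration of the prerequisites above, the `pmid2abstract` SQLite table from the Medline step could be built along the following lines. This is a Python 3 sketch, not the project's `MedlineAbstractStoreBuilder.java`: the column types (`TEXT`) and the helper names `build_abstract_store` and `lookup_abstract` are assumptions, and only the table name and the two field names come from this README.

```python
import sqlite3


def build_abstract_store(db_path, records):
    """Create the pmid2abstract table and load (pmid, abstract) pairs into it."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS pmid2abstract "
                "(pmid TEXT PRIMARY KEY, abstract TEXT)"  # column types are assumed
            )
            conn.executemany(
                "INSERT OR REPLACE INTO pmid2abstract (pmid, abstract) VALUES (?, ?)",
                records,
            )
    finally:
        conn.close()


def lookup_abstract(db_path, pmid):
    """Return the stored abstract for a PMID, or None if it is absent."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT abstract FROM pmid2abstract WHERE pmid = ?", (pmid,)
        ).fetchone()
    finally:
        conn.close()
    return row[0] if row else None
```

If the fixer script expects a different column type for `pmid`, the `CREATE TABLE` statement would need to be adjusted accordingly.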
- Clone the project into a local directory.
- Put the test json files into the `input` directory, and rename them to `dryrun-a.json`, `dryrun-b.json`, `1b-1-a.json`, ..., `4b-5-b.json`. (Read the `collection-reader.file` parameter value in each descriptor to understand what the system will look for.) If you use a customized input directory and/or json file names, please change the `collection-reader.file` parameter in the workflow descriptor.
- Create the `result` directory under the project folder, which is used for the system's final output. If you use a customized output directory, you need to change the descriptors that reference it.
- Create the `persistence` directory and download the `oaqa-cse.db3` file into the `persistence` folder. As this project uses the CSE framework, the sqlite database persists the experiment metadata, the intermediate data objects (optionally), and the evaluation results for debugging and reporting purposes. If you use a customized persistence directory and/or file name, you can create your own `persistence-provider` descriptor and update the `persistence-provider` parameters wherever used.
- Create `concept-search-cache`, `metamap-cache`, `synonym-cache`, and `tmtool-cache` directories under the `src/main/resources/` directory. If you don't need the cache, you can replace the `*-cached` descriptors with the non-cached versions (direct access). If you use customized cache directories, you need to update the `db-file` parameter in the `*-cached` descriptors, including
  - `resources/bioqa/providers/kb/concept-search-uts-cached.yaml.template`
  - `resources/bioqa/providers/kb/metamap-cached.yaml.template`
  - `resources/bioqa/providers/kb/synonym-uts-cached.yaml.template`
  - `resources/bioqa/providers/kb/tmtool-cached.yaml.template`
(Checkpoint) At this point, the project structure should look like this unless you have customized it.

    |-- bioasq/
    |   |-- input/
    |   |   |-- 1b-1-a.json
    |   |   |-- .
    |   |   |-- .
    |   |   |-- .
    |   |   |-- 4b-5-b.json
    |   |   |-- 4b-dev.json.auto.fulltext
    |   |   |-- dryrun-a.json
    |   |   |-- dryrun-b.json
    |   |   |-- one-question.json
    |   |-- persistence/
    |   |   |-- oaqa-cse.db3
    |   |-- result/
    |   |-- src/
    |   |   |-- main/
    |   |   |   |-- java/
    |   |   |   |-- resources/
    |   |   |   |   |-- baseqa/
    |   |   |   |   |-- bioasq/
    |   |   |   |   |-- bioqa/
    |   |   |   |   |-- concept-search-cache/
    |   |   |   |   |-- dictionaries/
    |   |   |   |   |-- metamap-cache/
    |   |   |   |   |-- models/
    |   |   |   |   |-- synonym-cache/
    |   |   |   |   |-- tmtool-cache/
    |   |   |   |-- script/
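The checkpoint can also be verified mechanically. The Python 3 sketch below lists whatever is missing from the default layout shown above; the helper name `missing_paths` is illustrative, and the lists would need to be edited if you customized any directory or file name.

```python
from pathlib import Path

# Directories and files taken from the default checkpoint layout.
EXPECTED_DIRS = [
    "input",
    "persistence",
    "result",
    "src/main/java",
    "src/main/resources/baseqa",
    "src/main/resources/bioasq",
    "src/main/resources/bioqa",
    "src/main/resources/concept-search-cache",
    "src/main/resources/dictionaries",
    "src/main/resources/metamap-cache",
    "src/main/resources/models",
    "src/main/resources/synonym-cache",
    "src/main/resources/tmtool-cache",
    "src/main/script",
]
EXPECTED_FILES = [
    "persistence/oaqa-cse.db3",
    "input/4b-dev.json.auto.fulltext",
]


def missing_paths(project_root):
    """Return the expected directories and files that do not exist yet."""
    root = Path(project_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_paths(".")` from the project root should return an empty list once the steps above are complete.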
- Update the `index` parameter in the `lucene-bioconcept` descriptors with the path to the Lucene index for the biomedical ontology. Also, you need to change other parameters if you use customized field names. Remove the `.template` suffix from the file names.
- Update the `index` parameter in the `lucene-medline` descriptors with the path to the Lucene Medline index. Also, you need to change other parameters if you use customized field names. Remove the `.template` suffix from the file names.
- Update the `version`, `username`, `password`, and `email` parameters in the `uts` and `metamap` related providers, and remove the `.template` suffix from the file names, including
  - `resources/bioqa/providers/kb/concept-search-uts.yaml.template`
  - `resources/bioqa/providers/kb/concept-search-uts-cached.yaml.template`
  - `resources/bioqa/providers/kb/metamap.yaml.template`
  - `resources/bioqa/providers/kb/metamap-cached.yaml.template`
  - `resources/bioqa/providers/kb/synonym-uts.yaml.template`
  - `resources/bioqa/providers/kb/synonym-uts-cached.yaml.template`

  Note that the `version` parameter takes a string value, which means you have to add single or double quotes around the MetaMap version number, e.g. `'1516'`. Otherwise, YAML would interpret `1516` as an integer.
- Install the dependencies and compile the resources via Maven:

      mvn clean compile

  When you see `BUILD SUCCESS`, the installation is done.
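Removing the `.template` suffix from many descriptor files by hand is error-prone, so a small helper can do it in bulk. The Python 3 sketch below is an illustration only: the function name `activate_templates` is hypothetical, and it copies each template instead of renaming it (a deliberate choice so the original templates remain available). The parameters inside each resulting `.yaml` file still have to be edited as described in the steps above.

```python
import shutil
from pathlib import Path


def activate_templates(resources_dir, overwrite=False):
    """Copy every *.yaml.template under resources_dir to the same name without .template."""
    created = []
    for template in sorted(Path(resources_dir).rglob("*.yaml.template")):
        target = template.with_suffix("")  # drops only the trailing ".template"
        if target.exists() and not overwrite:
            continue  # keep descriptors that were already configured
        shutil.copyfile(template, target)
        created.append(target)
    return created
```

For example, `activate_templates("src/main/resources/bioqa/providers/kb")` would create `metamap.yaml` next to `metamap.yaml.template` without touching an existing `metamap.yaml`.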
- (Optional, Recommended) Execute the `preprocess-kb-cache` workflow if you haven't done so yet:

      mvn exec:exec -Dconfig=bioasq.preprocess-kb-cache

  At the end of the execution, you should see mapdb files generated in the `*-cache` directories. This step could be extremely slow (> 10 hours) depending on the workload on the UTS/MetaMap servers.
- Execute any `*-test-*` workflow descriptor to test the pipeline:

      mvn exec:exec -Dconfig=bioasq.phase-a-test
      mvn exec:exec -Dconfig=bioasq.phase-b-test-factoid-list
      mvn exec:exec -Dconfig=bioasq.phase-b-test-yesno

- You should see the output in the `result` directory at the end of each execution.
The common evaluation metrics are defined in the BaseQA project's `eval` package. The system extends the evaluation metrics for the BioASQ task in the `eval` package. All the `*-evaluate-*` descriptors add additional post-processing steps to generate the evaluation results automatically.
- Put the `4b-dev.json.auto.fulltext` file under the `input` directory if you haven't done so yet. If you use a customized directory and/or file name, you need to change the `resources/bioasq/gs/bioasq-qa-decorator.yaml` descriptor content accordingly.
- (Optional, Recommended) Execute the `preprocess-kb-cache` workflow if you haven't done so yet; at the end of the execution, you should see mapdb files generated in the `*-cache` directories.
- Execute any `*-evaluate-*` workflow descriptor to evaluate the pipeline.

      mvn exec:exec -Dconfig=bioasq.phase-a-evaluate
      mvn exec:exec -Dconfig=bioasq.phase-b-evaluate-factoid-list
      mvn exec:exec -Dconfig=bioasq.phase-b-evaluate-yesno

- You should see the evaluation results at the end of the execution in the console.
For example,
Experiment: 8f5876cc-7dcf-41c2-9da3-7fe841ae92d9:1 traceId,Answer/Answer/YESNO_COUNT,Answer/Answer/YESNO_MEAN_ACCURACY,Answer/Answer/YESNO_MEAN_NEG_ACCURACY,Answer/Answer/YESNO_MEAN_POS_ACCURACY 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: 
bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.liblinear-predict#classifier:inherit: bioqa.answer.yesno.liblinear#feature-file:result/answer-yesno-predict-liblinear.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ],28.0000,0.5714,0.3333,0.6842 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: 
bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|AllYesYesNoAnswerPredictor[inherit:baseqa.answer.yesno.all-yes],28.0000,0.6786,0.0000,1.0000 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: 
bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-logistic-predict#classifier:inherit: bioqa.answer.yesno.weka-logistic#feature-file:result/answer-yesno-predict-weka-logistic.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ],28.0000,0.6429,0.2222,0.8421 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: 
bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-cvr-predict#classifier:inherit: bioqa.answer.yesno.weka-cvr#feature-file:result/answer-yesno-predict-weka-cvr.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ],28.0000,0.6071,0.6667,0.5789
For better visualization, you can split the lines into cells using the comma separators, like this:

traceId | COUNT | ACCURACY | NEG_ACCURACY | POS_ACCURACY |
---|---|---|---|---|
`...>13\|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.liblinear-predict#classifier:inherit: bioqa.answer.yesno.liblinear#feature-file:result/answer-yesno-predict-liblinear.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ]` | 28.0000 | 0.5714 | 0.3333 | 0.6842 |
`...>13\|AllYesYesNoAnswerPredictor[inherit:baseqa.answer.yesno.all-yes]` | 28.0000 | 0.6786 | 0.0000 | 1.0000 |
`...>13\|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-logistic-predict#classifier:inherit: bioqa.answer.yesno.weka-logistic#feature-file:result/answer-yesno-predict-weka-logistic.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ]` | 28.0000 | 0.6429 | 0.2222 | 0.8421 |
`...>13\|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-cvr-predict#classifier:inherit: bioqa.answer.yesno.weka-cvr#feature-file:result/answer-yesno-predict-weka-cvr.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ]` | 28.0000 | 0.6071 | 0.6667 | 0.5789 |
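The comma splitting described above can be scripted. The following Python 3 sketch parses the header line and the result rows of a console report into dictionaries; it assumes that metric names and values never contain commas and that the last pipeline step looks like `13|Name[...]`, as in the sample output. The function name `parse_results` is illustrative.

```python
def parse_results(lines):
    """Parse the header line and the result rows of an evaluation report.

    `lines` must start at the `traceId,...` header (skip the `Experiment:` line).
    """
    header, *rows = [line.strip() for line in lines if line.strip()]
    # Keep only the last path segment, e.g. "YESNO_MEAN_ACCURACY".
    metric_names = [name.rsplit("/", 1)[-1] for name in header.split(",")[1:]]
    results = []
    for row in rows:
        # Split from the right so the long traceId stays in one piece.
        trace_id, *values = row.rsplit(",", len(metric_names))
        last_step = trace_id.rsplit(">", 1)[-1]
        component = last_step.split("|", 1)[-1].split("[", 1)[0]
        metrics = dict(zip(metric_names, map(float, values)))
        results.append({"component": component, "traceId": trace_id, **metrics})
    return results
```

Sorting the returned list by `YESNO_MEAN_ACCURACY` then gives a quick ranking of the compared configurations.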
The system includes pretrained models, trained using the predefined `*-train-*` descriptors (i.e. on the 4b-dev set minus the 3b-5 test set). However, if you plan to retrain the models, you can follow these steps. Please be aware that the models are saved under `resources/models` and loaded directly from the classpath, which means you might want to recompile the project using `mvn clean compile` to copy the newly generated models into the `target` directory between training runs, so that the next training run can use the models from the previous one.
- Put the `4b-dev.json.auto.fulltext` file under the `input` directory if you haven't done so. If you use a customized directory and/or gold-standard file, you need to change the `resources/bioasq/gs/bioasq-qa-decorator.yaml` descriptor content accordingly.
- (Optional, Recommended) Execute the `preprocess-kb-cache` workflow if you haven't done so yet; at the end of the execution, you should see mapdb files generated in the `*-cache` directories.
- Execute the `preprocess-answer-type-gslabel` workflow if you haven't done so yet; at the end of the execution, you should see the `4b-dev-gslabel-tmtool.json` and `4b-dev-gslabel-uts.json` files generated in the `resources/models/bioqa/answer_type` directory.

      mvn clean compile exec:exec -Dconfig=bioasq.preprocess-answer-type-gslabel

  This step could take about 30 minutes.
- Training Phase A requires execution of `phase-a-train-concept-document` before `phase-a-train-snippet`.

      mvn clean compile exec:exec -Dconfig=bioasq.phase-a-train-concept-document
      mvn clean compile exec:exec -Dconfig=bioasq.phase-a-train-snippet

  Executing `phase-a-train-concept-document` could take 3-4 hours, and executing `phase-a-train-snippet` could take 80 minutes.

  Training Phase B factoid and list QA requires execution of `phase-b-train-answer-type` first, then `phase-b-train-answer-score`, and finally `phase-b-train-answer-collective-score`.

      mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-answer-type
      mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-answer-score
      mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-answer-collective-score

  Executing `phase-b-train-answer-type` or `phase-b-train-answer-score` could take 30 minutes each. Executing `phase-b-train-answer-collective-score` could take 10 minutes.

  Training Phase B yes/no QA requires execution of `phase-b-train-yesno`.

      mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-yesno

  Executing `phase-b-train-yesno` could take about 10 minutes.
- You should see cross-validation results at the end of each training.
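The ordering constraints among the training workflows can be captured in a few lines. The Python 3 sketch below encodes only the orderings stated in the steps above and expands a list of requested workflows into a run order with prerequisites first; the names `RUN_AFTER`, `training_plan`, and `training_commands` are illustrative and not part of the project.

```python
# Each workflow maps to the workflow that must be trained immediately before it.
RUN_AFTER = {
    "phase-a-train-snippet": "phase-a-train-concept-document",
    "phase-b-train-answer-score": "phase-b-train-answer-type",
    "phase-b-train-answer-collective-score": "phase-b-train-answer-score",
}


def training_plan(targets):
    """Return the workflows to run, in order, so that every prerequisite comes first."""
    plan = []
    for target in targets:
        # Walk back through the chain of prerequisites for this target.
        chain = [target]
        while chain[-1] in RUN_AFTER:
            chain.append(RUN_AFTER[chain[-1]])
        for workflow in reversed(chain):
            if workflow not in plan:
                plan.append(workflow)
    return plan


def training_commands(targets):
    """Render the plan as the Maven commands used in this README."""
    prefix = "mvn clean compile exec:exec -Dconfig=bioasq."
    return [prefix + workflow for workflow in training_plan(targets)]
```

For instance, `training_plan(["phase-b-train-answer-collective-score"])` yields the answer-type, answer-score, and collective-score workflows in that order. Each command recompiles first so that models from the previous step are copied into `target`.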
You can use your own biomedical questions to test the system in either Phase A or Phase B, similar to testing on the BioASQ test set.
- You can refer to the `input/one-question.json` file, and update the question.

      {
        "questions": [
          {
            "body": "What is the role of MMP-1 in breast cancer?",
            "type": "factoid",
            "id": "0"
          }
        ]
      }

- You need to change the `collection-reader.file` parameter to `input/one-question.json` in the `phase-a-test` descriptor to test Phase A.
- You need to manually add relevant snippets to the `input/one-question.json` file, similar to the Phase B test file (i.e. `*b-*-b.json`).
- You need to change the `collection-reader.file` parameter to `input/one-question.json` in the `phase-b-test-factoid-list` descriptor to test Phase B.
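If you want to generate the question file programmatically, a Python 3 sketch such as the following writes a single question in the same layout as `input/one-question.json`. The function name `write_question_file` is illustrative, and the sketch does not add the snippets that Phase B requires.

```python
import json


def write_question_file(path, body, question_type="factoid", question_id="0"):
    """Write a single-question input file in the layout of input/one-question.json."""
    document = {
        "questions": [
            {"body": body, "type": question_type, "id": question_id}
        ]
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(document, handle, indent=2, ensure_ascii=False)
    return document
```

Calling `write_question_file("input/one-question.json", "What is the role of MMP-1 in breast cancer?")` reproduces the example above.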
We are working on testing an end-to-end QA system that combines Phase A and Phase B workflows. You may also creatively combine the steps from both descriptors on your own.
Since the PubMed Central full text is not used in the evaluation from BioASQ 2016, it is not integrated into the predefined workflow descriptors. However, you can still use it for relevant passage retrieval.
- Make sure you have the PubMed Central full text and document server.
- Update the `url-format` parameter in `resources/bioasq/passage/pmc-content.yaml.template` with the PubMed Central document server URL, and remove the `.template` suffix from the file name.
- Add the `pmc-content` step after the `document-retrieval`/`document-rerank` step, but before the `passage-retrieval` step, in the descriptor.
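The document layout that the server is expected to expose (one JSON file per document under `pmc/`, reachable at `http://HOST:PORT/pmc/DOC_ID`) can be prepared with a sketch like the one below, written for Python 3. The key that holds each document's identifier in the converted JSON Array file is not specified in this README, so `id_key="pmid"` is a hypothetical default that you would need to adapt. The sketch also loads the whole array into memory, which may not be practical for the full corpus.

```python
import json
from pathlib import Path


def split_json_array(array_file, output_root, id_key="pmid"):
    """Write each element of a JSON array to <output_root>/pmc/<id> and return the ids."""
    pmc_dir = Path(output_root) / "pmc"
    pmc_dir.mkdir(parents=True, exist_ok=True)
    with open(array_file, encoding="utf-8") as handle:
        documents = json.load(handle)  # loads the whole array into memory
    written = []
    for document in documents:
        doc_id = str(document[id_key])  # `id_key` is a hypothetical field name
        with open(pmc_dir / doc_id, "w", encoding="utf-8") as out:
            json.dump(document, out)
        written.append(doc_id)
    return written


def document_url(host, port, doc_id):
    """Fill in the http://HOST:PORT/pmc/DOC_ID pattern."""
    return "http://{}:{}/pmc/{}".format(host, port, doc_id)
```

Any static file server rooted at `output_root` should then serve the documents at the URLs returned by `document_url`.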
The official GoPubMed server is sometimes slow. If you use a local or proxy GoPubMed server different from the official server specified in the `properties` folder, and you plan to use the GoPubMed components (which are not used in the predefined workflow descriptors), you can change the `conf` parameter in the `gopubmed` related descriptors, including
- `resources/bioasq/concept/retrieval/gopubmed.yaml`
- `resources/bioasq/concept/retrieval/gopubmed-separate.yaml`
- `resources/bioasq/concept/rerank/scorers/gopubmed.yaml`
- `resources/bioasq/document/retrieval/gopubmed.yaml`
- `resources/bioasq/triple/retrieval/gopubmed.yaml`
The system is far from perfect, and it needs tuning and component development. In addition to the system description papers, you may also read the UIMA and OAQA Tutorial to get familiar with the UIMA/ECD/CSE frameworks used by this system.
We thank Ying Li, Xing Yang, Venus So, James Cai and the other team members at Roche Innovation Center New York for their support of OAQA and biomedical question answering research and development.
This project is licensed under the Apache License ver 2.0 - see the LICENSE.txt file for details. However, please note that some third-party dependencies may be licensed differently.