Conversation

Contextualized Query Rewriting

Dataset com.github.aagohary.canard

datamaestro.data.ml.Supervised

Question-in-context rewriting

Tags: query, context, conversation

Tasks: query rewriting

External link: https://sites.google.com/view/qanta/projects/canard

CANARD is a dataset for question-in-context rewriting. It pairs each question, given in its dialog context, with a context-independent rewrite of that question. The context of a question is the sequence of dialog utterances that precede it. CANARD can be used to evaluate question rewriting models that must handle linguistic phenomena such as co-reference and ellipsis resolution.

Each dataset is an instance of datamaestro_text.data.conversation.CanardDataset
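To make the task concrete, here is a minimal sketch of a question-in-context rewriting example and of how the dialog context might be concatenated with the question to form a model input. The class `RewriteExample`, its field names, the helper `model_input`, and the separator are all hypothetical and chosen for illustration; they are not the datamaestro_text API or the CANARD file format, and the sample dialog is invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RewriteExample:
    """One question-in-context rewriting example (hypothetical field names)."""

    history: List[str]  # dialog utterances that precede the question
    question: str       # question as asked; may depend on the context
    rewrite: str        # context-independent rewrite of the question


def model_input(example: RewriteExample, separator: str = " ||| ") -> str:
    """Join the preceding utterances and the question into a single string."""
    return separator.join(example.history + [example.question])


# Toy dialog invented for illustration; not taken from CANARD.
example = RewriteExample(
    history=["Who wrote Hamlet?", "William Shakespeare."],
    question="When was he born?",
    rewrite="When was William Shakespeare born?",
)

print(model_input(example))
```

In this toy example the rewrite resolves the pronoun "he" using the context, which is the kind of co-reference resolution a rewriting model is evaluated on.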

Dataset com.github.prdwb.orconvqa.preprocessed

datamaestro.data.ml.Supervised

Open-Retrieval Conversational Question Answering datasets

Tags: query, context, conversation

Tasks: query rewriting

External link: https://github.com/prdwb/orconvqa-release

OrConvQA is an aggregation of three existing datasets:

  1. the QuAC dataset that offers information-seeking conversations,

  2. the CANARD dataset that consists of context-independent rewrites of QuAC questions, and

  3. the Wikipedia corpus that serves as the knowledge source for answering questions.

Each dataset is an instance of datamaestro_text.data.conversation.OrConvQADataset
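One way to picture how the three sources listed above fit together is sketched below: each conversational turn carries the question as asked (the QuAC role), its context-independent rewrite (the CANARD role), and an identifier pointing into a passage collection (the Wikipedia role). The names `Turn`, `passage_id`, and `attach_passages` are assumptions made for illustration, not the actual OrConvQADataset interface, and the data is invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Turn:
    """One conversational turn (hypothetical field names)."""

    question: str              # question as asked in the conversation
    rewrite: str               # context-independent rewrite of the question
    passage_id: Optional[str]  # id of a supporting passage, if any


def attach_passages(
    turns: List[Turn], passages: Dict[str, str]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each rewritten question with the text of its supporting passage."""
    return [
        (turn.rewrite, passages.get(turn.passage_id) if turn.passage_id else None)
        for turn in turns
    ]


# Toy data invented for illustration; not taken from OrConvQA.
passages = {"p1": "William Shakespeare was baptised on 26 April 1564."}
turns = [
    Turn("Who wrote Hamlet?", "Who wrote Hamlet?", None),
    Turn("When was he born?", "When was William Shakespeare born?", "p1"),
]

pairs = attach_passages(turns, passages)
```

A plain dictionary stands in for the passage collection here; a real store over a full Wikipedia corpus would need an on-disk index rather than an in-memory mapping.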

Dataset com.github.prdwb.orconvqa.passages

datamaestro_text.data.ir.stores.OrConvQADocumentStore

OrConvQA Wikipedia files

External link: https://github.com/prdwb/orconvqa-release

OrConvQA is an aggregation of three existing datasets:

  1. the QuAC dataset that offers information-seeking conversations,

  2. the CANARD dataset that consists of context-independent rewrites of QuAC questions, and

  3. the Wikipedia corpus that serves as the knowledge source for answering questions.