IR-Datasets Integration
Datamaestro-text provides an interface to the ir-datasets library, giving access to hundreds of IR benchmarks through a unified API.
Install ir-datasets:
pip install ir-datasets
Usage:
from datamaestro import prepare_dataset
# Load any ir-datasets collection via the irds namespace
dataset = prepare_dataset("irds.msmarco-passage")
# Same API as native datasets
for doc in dataset.documents.iter_documents():
    print(doc)
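The dataset IDs listed on this page appear to follow a simple convention: the ir_datasets ID with "/" replaced by ".", prefixed with "irds.", and optionally suffixed with a component name (documents, queries, or qrels). The helper below is a minimal sketch of that convention as inferred from the listing. `to_datamaestro_id` is a hypothetical name and is not part of datamaestro or ir_datasets.

```python
from typing import Optional

# Component suffixes that appear in the listing on this page
PARTS = ("documents", "queries", "qrels")


def to_datamaestro_id(irds_id: str, part: Optional[str] = None) -> str:
    """Map an ir_datasets ID (e.g. "antique/test") to the dotted form used here.

    This mirrors the naming pattern visible in the listing; it is an
    illustration, not the library's own implementation.
    """
    if part is not None and part not in PARTS:
        raise ValueError(f"unknown part: {part!r}")
    dotted = "irds." + irds_id.strip("/").replace("/", ".")
    return dotted if part is None else f"{dotted}.{part}"


print(to_datamaestro_id("antique/test"))
print(to_datamaestro_id("antique/test", "qrels"))
```

Note that in the listing the documents component hangs off the corpus ID (for example irds.antique.documents), while queries and qrels hang off a specific split (for example irds.antique.test.queries).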
The list below is auto-generated and may not reflect the exact version of ir-datasets installed on your system.
Data Types
These wrapper types provide the datamaestro interface for ir-datasets data:
- XPM Config datamaestro_text.datasets.irds.data.Topics(*, irds, id)
  Bases: TopicsStore, IRDSId
  - irds: str
    The ID used to load the dataset from ir_datasets
  - id: str
    The unique (sub-)dataset ID
- XPM Config datamaestro_text.datasets.irds.data.Documents(*, irds, id, count, file_access)
  Bases: DocumentStore, IRDSId
  - irds: str
    The ID used to load the dataset from ir_datasets
  - id: str
    The unique (sub-)dataset ID
  - count: int
    Number of documents
  - file_access: FileAccess = FileAccess.MMAP
    How to access the file collection (may have no effect, depending on the docstore)
- XPM Config datamaestro_text.datasets.irds.data.AdhocAssessments(*, irds, id)
  Bases: AdhocAssessments, IRDSId
  - irds: str
    The ID used to load the dataset from ir_datasets
  - id: str
    The unique (sub-)dataset ID
See also LZ4DocumentStore in the Information Retrieval API section.
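The file_access option above defaults to FileAccess.MMAP. How a given docstore uses it is not described on this page. As general background only, memory-mapped access lets a program read slices of a file on demand instead of loading it whole. The sketch below uses Python's standard mmap module on a temporary stand-in file; the tab-separated, one-record-per-line layout is made up for illustration and is not the docstore's actual format.

```python
import mmap
import os
import tempfile

# Write a small stand-in "collection" file (hypothetical layout: id<TAB>text)
with tempfile.NamedTemporaryFile("wb", delete=False) as fp:
    fp.write(b"doc1\tfirst passage\ndoc2\tsecond passage\n")
    path = fp.name

# Map the file read-only and slice out a single record without reading the rest
with open(path, "rb") as fp, mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    start = mm.find(b"doc2\t")
    end = mm.find(b"\n", start)
    record = mm[start:end].decode("utf-8")

os.remove(path)
print(record)
```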
Available Datasets
ANTIQUE
“ANTIQUE is a non-factoid question answering dataset based on the questions and answers of Yahoo! Webscope L6.”
- Documents: Short answer passages (from Yahoo Answers)
- Queries: Natural language questions (from Yahoo Answers)
- Dataset paper: https://arxiv.org/abs/1905.08957
- Dataset irds.antique.documents
  datamaestro_text.datasets.irds.data.Documents
  “ANTIQUE is a non-factoid question answering dataset based on the questions and answers of Yahoo! Webscope L6.”
  - Documents: Short answer passages (from Yahoo Answers)
  - Queries: Natural language questions (from Yahoo Answers)
  - Dataset paper: https://arxiv.org/abs/1905.08957
- Dataset irds.antique.test.queries
  datamaestro_text.datasets.irds.data.Topics
  Official test set of the ANTIQUE dataset.
- Dataset irds.antique.test.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  Official test set of the ANTIQUE dataset.
- Dataset irds.antique.test
  datamaestro_text.datasets.irds.data.Adhoc
  Official test set of the ANTIQUE dataset.
- Dataset irds.antique.test.non-offensive.queries
  datamaestro_text.datasets.irds.data.Topics
  antique/test without a set of queries deemed by the authors of ANTIQUE to be “offensive (and noisy).”
- Dataset irds.antique.test.non-offensive.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  antique/test without a set of queries deemed by the authors of ANTIQUE to be “offensive (and noisy).”
- Dataset irds.antique.test.non-offensive
  datamaestro_text.datasets.irds.data.Adhoc
  antique/test without a set of queries deemed by the authors of ANTIQUE to be “offensive (and noisy).”
- Dataset irds.antique.train.queries
  datamaestro_text.datasets.irds.data.Topics
  Official train set of the ANTIQUE dataset.
- Dataset irds.antique.train.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  Official train set of the ANTIQUE dataset.
- Dataset irds.antique.train
  datamaestro_text.datasets.irds.data.Adhoc
  Official train set of the ANTIQUE dataset.
- Dataset irds.antique.train.split200-train.queries
  datamaestro_text.datasets.irds.data.Topics
  antique/train without the 200 queries used by antique/train/split200-valid.
- Dataset irds.antique.train.split200-train.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  antique/train without the 200 queries used by antique/train/split200-valid.
- Dataset irds.antique.train.split200-train
  datamaestro_text.datasets.irds.data.Adhoc
  antique/train without the 200 queries used by antique/train/split200-valid.
- Dataset irds.antique.train.split200-valid.queries
  datamaestro_text.datasets.irds.data.Topics
  A held-out subset of 200 queries from antique/train. Use in conjunction with antique/train/split200-train.
- Dataset irds.antique.train.split200-valid.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  A held-out subset of 200 queries from antique/train. Use in conjunction with antique/train/split200-train.
- Dataset irds.antique.train.split200-valid
  datamaestro_text.datasets.irds.data.Adhoc
  A held-out subset of 200 queries from antique/train. Use in conjunction with antique/train/split200-train.
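The split200-train and split200-valid subsets above are complementary: one removes a fixed set of queries from antique/train and the other keeps only those queries. The sketch below shows that kind of query-level split on toy data. Which 200 queries are held out is fixed by ir_datasets and is not reproduced here; the function name, query IDs, and texts are all made up for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Qrel = Tuple[str, str, int]  # (query_id, doc_id, relevance)
Part = Tuple[Dict[str, str], List[Qrel]]  # (queries, qrels)


def split_by_queries(
    queries: Dict[str, str], qrels: List[Qrel], valid_ids: Iterable[str]
) -> Tuple[Part, Part]:
    """Return ((train_queries, train_qrels), (valid_queries, valid_qrels))."""
    held_out = set(valid_ids)
    train_q = {q: t for q, t in queries.items() if q not in held_out}
    valid_q = {q: t for q, t in queries.items() if q in held_out}
    # Judgments follow their query into the same part
    train_r = [r for r in qrels if r[0] not in held_out]
    valid_r = [r for r in qrels if r[0] in held_out]
    return (train_q, train_r), (valid_q, valid_r)


# Toy data standing in for antique/train
queries = {"q1": "why is the sky blue", "q2": "how do magnets work", "q3": "what is irony"}
qrels = [("q1", "d1", 3), ("q2", "d7", 4), ("q3", "d2", 1), ("q3", "d9", 2)]

(train_q, train_r), (valid_q, valid_r) = split_by_queries(queries, qrels, ["q3"])
print(sorted(train_q), sorted(valid_q))
```

The same filter, applied with a list of excluded query IDs, also describes how a subset such as antique/test/non-offensive relates to antique/test.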
AOL-IA (Internet Archive)
This is a version of the AOL Query Log. Documents are the versions archived by the Internet Archive around the time of the query log (early 2006).
The query log does not include document or query IDs; ir_datasets creates them instead. Document IDs are assigned using a hash of the URL that appears in the query log. Query IDs are assigned using a hash of the normalized query. All unique normalized queries are available from queries, and all clicked documents are available from qrels (with the iteration value set to the user ID). Full information (including the original query) is available from qlogs.
- Dataset irds.aol-ia.documents
  datamaestro_text.datasets.irds.data.Documents
  This is a version of the AOL Query Log. Documents are the versions archived by the Internet Archive around the time of the query log (early 2006).
  The query log does not include document or query IDs; ir_datasets creates them instead. Document IDs are assigned using a hash of the URL that appears in the query log. Query IDs are assigned using a hash of the normalized query. All unique normalized queries are available from queries, and all clicked documents are available from qrels (with the iteration value set to the user ID). Full information (including the original query) is available from qlogs.
- Dataset irds.aol-ia.queries
  datamaestro_text.datasets.irds.data.Topics
  This is a version of the AOL Query Log. Documents are the versions archived by the Internet Archive around the time of the query log (early 2006).
  The query log does not include document or query IDs; ir_datasets creates them instead. Document IDs are assigned using a hash of the URL that appears in the query log. Query IDs are assigned using a hash of the normalized query. All unique normalized queries are available from queries, and all clicked documents are available from qrels (with the iteration value set to the user ID). Full information (including the original query) is available from qlogs.
- Dataset irds.aol-ia.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  This is a version of the AOL Query Log. Documents are the versions archived by the Internet Archive around the time of the query log (early 2006).
  The query log does not include document or query IDs; ir_datasets creates them instead. Document IDs are assigned using a hash of the URL that appears in the query log. Query IDs are assigned using a hash of the normalized query. All unique normalized queries are available from queries, and all clicked documents are available from qrels (with the iteration value set to the user ID). Full information (including the original query) is available from qlogs.
- Dataset irds.aol-ia
  datamaestro_text.datasets.irds.data.Adhoc
  This is a version of the AOL Query Log. Documents are the versions archived by the Internet Archive around the time of the query log (early 2006).
  The query log does not include document or query IDs; ir_datasets creates them instead. Document IDs are assigned using a hash of the URL that appears in the query log. Query IDs are assigned using a hash of the normalized query. All unique normalized queries are available from queries, and all clicked documents are available from qrels (with the iteration value set to the user ID). Full information (including the original query) is available from qlogs.
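The AOL-IA description says that document IDs come from a hash of the URL, query IDs from a hash of the normalized query, and that each qrel carries the user ID in its iteration field. The sketch below illustrates that general idea only. The hash function (truncated SHA-256), the normalization (lowercasing and whitespace collapsing), the relevance value of 1 for a click, and all names are assumptions made for this example; they are not the scheme ir_datasets actually uses.

```python
import hashlib
from typing import NamedTuple


def normalize_query(query: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(query.lower().split())


def stable_id(text: str, length: int = 12) -> str:
    """Derive a deterministic ID from text (assumed: truncated SHA-256 hex digest)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


class Qrel(NamedTuple):
    query_id: str
    doc_id: str
    relevance: int
    iteration: str  # carries the user ID, as described for this collection


def click_to_qrel(user_id: str, query: str, url: str) -> Qrel:
    """Turn one logged click into a qrel record."""
    return Qrel(stable_id(normalize_query(query)), stable_id(url), 1, user_id)


# Two users issuing the same query (up to case and spacing) and clicking the same URL
a = click_to_qrel("42", "Weather  Paris", "http://example.com/")
b = click_to_qrel("7", "weather paris", "http://example.com/")
print(a.query_id == b.query_id, a.doc_id == b.doc_id, a.iteration, b.iteration)
```

Because the IDs are derived from content, the same URL or normalized query always maps to the same ID, so clicks from different users can be grouped without a separate ID table.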
AQUAINT
A collection of about 1M English newswire documents. Sources are the Xinhua News Service (People’s Republic of China), the New York Times News Service, and the Associated Press Worldstream News Service.
- Dataset details: https://catalog.ldc.upenn.edu/LDC2002T31
- Dataset irds.aquaint.documents
  datamaestro_text.datasets.irds.data.Documents
  A collection of about 1M English newswire documents. Sources are the Xinhua News Service (People’s Republic of China), the New York Times News Service, and the Associated Press Worldstream News Service.
  - Dataset details: https://catalog.ldc.upenn.edu/LDC2002T31
- Dataset irds.aquaint.trec-robust-2005.queries
  datamaestro_text.datasets.irds.data.Topics
  The TREC Robust 2005 dataset. Contains a subset of 50 “hard” queries from trec-robust04.
  - Documents: News articles
  - Queries: keyword queries, descriptions, narratives
  - Relevance: Deep judgments
  - Shared task site: https://trec.nist.gov/data/robust/05/05.guidelines.html
  - Task overview paper: https://trec.nist.gov/pubs/trec14/papers/ROBUST.OVERVIEW.pdf
  - See also: trec-robust04
- Dataset irds.aquaint.trec-robust-2005.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  The TREC Robust 2005 dataset. Contains a subset of 50 “hard” queries from trec-robust04.
  - Documents: News articles
  - Queries: keyword queries, descriptions, narratives
  - Relevance: Deep judgments
  - Shared task site: https://trec.nist.gov/data/robust/05/05.guidelines.html
  - Task overview paper: https://trec.nist.gov/pubs/trec14/papers/ROBUST.OVERVIEW.pdf
  - See also: trec-robust04
- Dataset irds.aquaint.trec-robust-2005
  datamaestro_text.datasets.irds.data.Adhoc
  The TREC Robust 2005 dataset. Contains a subset of 50 “hard” queries from trec-robust04.
  - Documents: News articles
  - Queries: keyword queries, descriptions, narratives
  - Relevance: Deep judgments
  - Shared task site: https://trec.nist.gov/data/robust/05/05.guidelines.html
  - Task overview paper: https://trec.nist.gov/pubs/trec14/papers/ROBUST.OVERVIEW.pdf
  - See also: trec-robust04
args.me version 1.0
Corpus version 1.0 with 387 606 arguments crawled from Debatewise, IDebate.org, Debatepedia, and Debate.org. It was released on July 9, 2019 on Zenodo (https://zenodo.org/record/3274636). The cleaned version, argsme/1.0-cleaned, should be preferred.
This collection is licensed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Individual rights to the content still apply.
- Dataset irds.argsme.1.0.documents
  datamaestro_text.datasets.irds.data.Documents
  Corpus version 1.0 with 387 606 arguments crawled from Debatewise, IDebate.org, Debatepedia, and Debate.org. It was released on July 9, 2019 on Zenodo (https://zenodo.org/record/3274636). The cleaned version, argsme/1.0-cleaned, should be preferred.
  This collection is licensed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Individual rights to the content still apply.
- Dataset irds.argsme.1.0.touche-2020-task-1.uncorrected.queries
  datamaestro_text.datasets.irds.data.Topics
  Version of argsme/2020-04-01/touche-2020-task-1 that uses the argsme/1.0 corpus with uncorrected relevance judgements derived from crowdworkers. This dataset’s relevance judgements should not be used without preprocessing.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
- Dataset irds.argsme.1.0.touche-2020-task-1.uncorrected.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  Version of argsme/2020-04-01/touche-2020-task-1 that uses the argsme/1.0 corpus with uncorrected relevance judgements derived from crowdworkers. This dataset’s relevance judgements should not be used without preprocessing.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
- Dataset irds.argsme.1.0.touche-2020-task-1.uncorrected
  datamaestro_text.datasets.irds.data.Adhoc
  Version of argsme/2020-04-01/touche-2020-task-1 that uses the argsme/1.0 corpus with uncorrected relevance judgements derived from crowdworkers. This dataset’s relevance judgements should not be used without preprocessing.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
args.me version 1.0 cleaned
Corpus version 1.0-cleaned with 382 545 arguments crawled from Debatewise, IDebate.org, Debatepedia, and Debate.org. This version contains the same arguments as argsme/1.0, but was cleaned as described in the corresponding publication. It was released on October 27, 2020 on Zenodo (https://zenodo.org/record/4139439).
This collection is licensed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Individual rights to the content still apply.
- Dataset irds.argsme.1.0-cleaned.documents
  datamaestro_text.datasets.irds.data.Documents
  Corpus version 1.0-cleaned with 382 545 arguments crawled from Debatewise, IDebate.org, Debatepedia, and Debate.org. This version contains the same arguments as argsme/1.0, but was cleaned as described in the corresponding publication. It was released on October 27, 2020 on Zenodo (https://zenodo.org/record/4139439).
  This collection is licensed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Individual rights to the content still apply.
argsme/2020-04-01/debateorg
Subset of argsme/2020-04-01 containing the 338 620 arguments crawled from the debate portal Debate.org.
- Dataset irds.argsme.2020-04-01.debateorg.documents
  datamaestro_text.datasets.irds.data.Documents
  Subset of argsme/2020-04-01 containing the 338 620 arguments crawled from the debate portal Debate.org.
argsme/2020-04-01/debatepedia
Subset of argsme/2020-04-01 containing the 21 197 arguments crawled from the debate portal Debatepedia.
- Dataset irds.argsme.2020-04-01.debatepedia.documents
  datamaestro_text.datasets.irds.data.Documents
  Subset of argsme/2020-04-01 containing the 21 197 arguments crawled from the debate portal Debatepedia.
argsme/2020-04-01/debatewise
Subset of argsme/2020-04-01 containing the 14 353 arguments crawled from the debate portal Debatewise.
- Dataset irds.argsme.2020-04-01.debatewise.documents
  datamaestro_text.datasets.irds.data.Documents
  Subset of argsme/2020-04-01 containing the 14 353 arguments crawled from the debate portal Debatewise.
argsme/2020-04-01/idebate
Subset of argsme/2020-04-01 containing the 13 522 arguments crawled from the debate portal IDebate.org.
- Dataset irds.argsme.2020-04-01.idebate.documents
  datamaestro_text.datasets.irds.data.Documents
  Subset of argsme/2020-04-01 containing the 13 522 arguments crawled from the debate portal IDebate.org.
argsme/2020-04-01/parliamentary
Subset of argsme/2020-04-01 containing the 48 arguments crawled from Canadian Parliament discussions.
- Dataset irds.argsme.2020-04-01.parliamentary.documents
  datamaestro_text.datasets.irds.data.Documents
  Subset of argsme/2020-04-01 containing the 48 arguments crawled from Canadian Parliament discussions.
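The five source-specific subsets above (debateorg, debatepedia, debatewise, idebate, parliamentary) have sizes that add up exactly to the 387 740 arguments reported on this page for the full argsme/2020-04-01 corpus, which suggests that together they cover the corpus without overlap. A quick arithmetic check, using only the counts quoted in the descriptions:

```python
# Argument counts quoted in the subset descriptions on this page
subset_sizes = {
    "debateorg": 338_620,
    "debatepedia": 21_197,
    "debatewise": 14_353,
    "idebate": 13_522,
    "parliamentary": 48,
}

total = sum(subset_sizes.values())
print(total)  # 387740, the size reported for argsme/2020-04-01
```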
argsme/2020-04-01/processed
Pre-processed version of argsme/2020-04-01 where each argument is split into sentences.
- Dataset irds.argsme.2020-04-01.processed.documents
  datamaestro_text.datasets.irds.data.Documents
  Pre-processed version of argsme/2020-04-01 where each argument is split into sentences.
- Dataset irds.argsme.2020-04-01.processed.touche-2022-task-1.queries
  datamaestro_text.datasets.irds.data.Topics
  Decision-making processes, whether at the societal or the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval, held at CLEF 2022 and featuring three tasks.
  Given a query about a controversial topic, retrieve and rank a relevant pair of sentences from a collection of arguments (argsme/2020-04-01/processed).
  Documents are judged on their general topical relevance and on their rhetorical quality, i.e., the “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices.
  - Task 1 website: https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-controversial-questions.html
  - Lab website: https://touche.webis.de/clef22/touche22-web/
  - Overview paper: https://doi.org/10.1007/978-3-030-99739-7_43
- Dataset irds.argsme.2020-04-01.processed.touche-2022-task-1.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  Decision-making processes, whether at the societal or the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval, held at CLEF 2022 and featuring three tasks.
  Given a query about a controversial topic, retrieve and rank a relevant pair of sentences from a collection of arguments (argsme/2020-04-01/processed).
  Documents are judged on their general topical relevance and on their rhetorical quality, i.e., the “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices.
  - Task 1 website: https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-controversial-questions.html
  - Lab website: https://touche.webis.de/clef22/touche22-web/
  - Overview paper: https://doi.org/10.1007/978-3-030-99739-7_43
- Dataset irds.argsme.2020-04-01.processed.touche-2022-task-1
  datamaestro_text.datasets.irds.data.Adhoc
  Decision-making processes, whether at the societal or the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval, held at CLEF 2022 and featuring three tasks.
  Given a query about a controversial topic, retrieve and rank a relevant pair of sentences from a collection of arguments (argsme/2020-04-01/processed).
  Documents are judged on their general topical relevance and on their rhetorical quality, i.e., the “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices.
  - Task 1 website: https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-controversial-questions.html
  - Lab website: https://touche.webis.de/clef22/touche22-web/
  - Overview paper: https://doi.org/10.1007/978-3-030-99739-7_43
args.me
Corpus version 2020-04-01 with 387 740 arguments crawled from Debatewise, IDebate.org, Debatepedia, Debate.org, and Canadian Parliament discussions. It was released on April 1, 2020 on Zenodo (https://zenodo.org/record/3734893).
This collection is licensed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Individual rights to the content still apply.
- Dataset irds.argsme.2020-04-01.documents
  datamaestro_text.datasets.irds.data.Documents
  Corpus version 2020-04-01 with 387 740 arguments crawled from Debatewise, IDebate.org, Debatepedia, Debate.org, and Canadian Parliament discussions. It was released on April 1, 2020 on Zenodo (https://zenodo.org/record/3734893).
  This collection is licensed under the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). Individual rights to the content still apply.
- Dataset irds.argsme.2020-04-01.touche-2020-task-1.queries
  datamaestro_text.datasets.irds.data.Topics
  Decision-making processes, whether at the societal or the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one’s stance. Thus, technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. Touché 2020 is the first lab on argument retrieval, held at CLEF 2020 and featuring two tasks.
  Given a question on a controversial topic, retrieve relevant arguments from a focused crawl of online debate portals (argsme/2020-04-01).
  Documents are judged based on their general topical relevance.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
- Dataset irds.argsme.2020-04-01.touche-2020-task-1.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  Decision-making processes, whether at the societal or the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one’s stance. Thus, technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. Touché 2020 is the first lab on argument retrieval, held at CLEF 2020 and featuring two tasks.
  Given a question on a controversial topic, retrieve relevant arguments from a focused crawl of online debate portals (argsme/2020-04-01).
  Documents are judged based on their general topical relevance.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
- Dataset irds.argsme.2020-04-01.touche-2020-task-1
  datamaestro_text.datasets.irds.data.Adhoc
  Decision-making processes, whether at the societal or the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one’s stance. Thus, technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. Touché 2020 is the first lab on argument retrieval, held at CLEF 2020 and featuring two tasks.
  Given a question on a controversial topic, retrieve relevant arguments from a focused crawl of online debate portals (argsme/2020-04-01).
  Documents are judged based on their general topical relevance.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
- Dataset irds.argsme.2020-04-01.touche-2021-task-1.queries
  datamaestro_text.datasets.irds.data.Topics
  Decision-making processes, whether at the societal or the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2021 is the second lab on argument retrieval, held at CLEF 2021 and featuring two tasks.
  Given a question on a controversial topic, retrieve relevant arguments from a focused crawl of online debate portals (argsme/2020-04-01).
  Documents are judged on their general topical relevance and on their rhetorical quality, i.e., the “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices.
  - Task 1 website: https://webis.de/events/touche-21/shared-task-1.html
  - Lab website: https://webis.de/events/touche-21/
  - Overview paper: https://doi.org/10.1007/978-3-030-85251-1_28
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI8FDfYnzcjbsf26RIatNgM3
- Dataset irds.argsme.2020-04-01.touche-2021-task-1.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  Decision-making processes, whether at the societal or the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2021 is the second lab on argument retrieval, held at CLEF 2021 and featuring two tasks.
  Given a question on a controversial topic, retrieve relevant arguments from a focused crawl of online debate portals (argsme/2020-04-01).
  Documents are judged on their general topical relevance and on their rhetorical quality, i.e., the “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices.
  - Task 1 website: https://webis.de/events/touche-21/shared-task-1.html
  - Lab website: https://webis.de/events/touche-21/
  - Overview paper: https://doi.org/10.1007/978-3-030-85251-1_28
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI8FDfYnzcjbsf26RIatNgM3
- Dataset irds.argsme.2020-04-01.touche-2021-task-1
  datamaestro_text.datasets.irds.data.Adhoc
  Decision-making processes, whether at the societal or the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2021 is the second lab on argument retrieval, held at CLEF 2021 and featuring two tasks.
  Given a question on a controversial topic, retrieve relevant arguments from a focused crawl of online debate portals (argsme/2020-04-01).
  Documents are judged on their general topical relevance and on their rhetorical quality, i.e., the “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices.
  - Task 1 website: https://webis.de/events/touche-21/shared-task-1.html
  - Lab website: https://webis.de/events/touche-21/
  - Overview paper: https://doi.org/10.1007/978-3-030-85251-1_28
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI8FDfYnzcjbsf26RIatNgM3
- Dataset irds.argsme.2020-04-01.touche-2020-task-1.uncorrected.queries
  datamaestro_text.datasets.irds.data.Topics
  Version of argsme/2020-04-01/touche-2020-task-1 that uses uncorrected relevance judgements derived from crowdworkers. This dataset’s relevance judgements should not be used without preprocessing.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
- Dataset irds.argsme.2020-04-01.touche-2020-task-1.uncorrected.qrels
  datamaestro_text.datasets.irds.data.AdhocAssessments
  Version of argsme/2020-04-01/touche-2020-task-1 that uses uncorrected relevance judgements derived from crowdworkers. This dataset’s relevance judgements should not be used without preprocessing.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
- Dataset irds.argsme.2020-04-01.touche-2020-task-1.uncorrected
  datamaestro_text.datasets.irds.data.Adhoc
  Version of argsme/2020-04-01/touche-2020-task-1 that uses uncorrected relevance judgements derived from crowdworkers. This dataset’s relevance judgements should not be used without preprocessing.
  - Task 1 website: https://webis.de/events/touche-20/shared-task-1.html
  - Lab website: https://webis.de/events/touche-20/
  - Overview paper: https://doi.org/10.1007/978-3-030-58219-7_26
  - Workshop videos: https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4
beir/arguana
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.beir.arguana.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.beir.arguana.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.beir.arguana.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.beir.arguana
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
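The entries in this listing appear to follow one naming pattern: the ir-datasets ID (for example beir/arguana) is prefixed with irds, its slashes become dots, and a component suffix (documents, queries or qrels) is appended for the individual parts. The sketch below encodes that pattern as read off this page. The helper name to_datamaestro_id is hypothetical and is not part of datamaestro, and the pattern itself is an assumption that may not hold for every collection.

```python
from typing import Optional

# Component suffixes seen in this listing (assumption: others may exist elsewhere).
COMPONENTS = {"documents", "queries", "qrels"}


def to_datamaestro_id(irds_id: str, component: Optional[str] = None) -> str:
    """Build a datamaestro dataset ID from an ir-datasets ID.

    Hypothetical helper: it only mirrors the naming pattern visible in
    this listing and does not check that the dataset exists.
    """
    if component is not None and component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    # "beir/arguana" -> "irds.beir.arguana"
    base = "irds." + irds_id.strip("/").replace("/", ".")
    return base if component is None else f"{base}.{component}"


print(to_datamaestro_id("beir/arguana"))
print(to_datamaestro_id("beir/climate-fever", "qrels"))
```

The resulting string has the same form as the ID passed to prepare_dataset in the Usage section, so it could be used there in place of a hand-written ID.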
beir/climate-fever
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.beir.climate-fever.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.beir.climate-fever.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.beir.climate-fever.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.beir.climate-fever
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
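Each entry above pairs a dataset ID with a wrapper type: IDs ending in documents, queries and qrels are listed with Documents, Topics and AdhocAssessments respectively, and the ID without a suffix is listed with Adhoc. The following sketch records that correspondence as plain strings so it can be checked without installing anything. The names WRAPPERS and wrapper_for are hypothetical, and the mapping covers only the suffixes that occur in this part of the listing.

```python
# Module path of the wrapper types, as printed in this listing.
WRAPPER_MODULE = "datamaestro_text.datasets.irds.data"

# Suffix -> wrapper class name, read off the entries above.
WRAPPERS = {
    "documents": "Documents",
    "queries": "Topics",
    "qrels": "AdhocAssessments",
}


def wrapper_for(dataset_id: str) -> str:
    """Return the wrapper type this listing shows for a dataset ID.

    Hypothetical helper: IDs without a known component suffix are
    assumed to be full ad-hoc collections (Adhoc).
    """
    suffix = dataset_id.rsplit(".", 1)[-1]
    return f"{WRAPPER_MODULE}.{WRAPPERS.get(suffix, 'Adhoc')}"


print(wrapper_for("irds.beir.climate-fever.qrels"))
print(wrapper_for("irds.beir.climate-fever"))
```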
beir/cqadupstack/android
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>android</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.android.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>android</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.android.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>android</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.android.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>android</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.android
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>android</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
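CQADupStack is exposed as one dataset per StackExchange subforum, so an evaluation over the whole benchmark has to loop over the subsets. The sketch below enumerates the IDs for the subforums that appear in this listing. The names SUBFORUMS and cqadupstack_ids are hypothetical, and the list is copied from this page rather than queried from ir-datasets, so it may differ from the installed version.

```python
from typing import List, Optional

# Subforum names as they appear in this listing.
SUBFORUMS = [
    "android", "english", "gaming", "gis", "mathematica", "physics",
    "programmers", "stats", "tex", "unix", "webmasters", "wordpress",
]


def cqadupstack_ids(component: Optional[str] = None) -> List[str]:
    """List datamaestro IDs for every CQADupStack subset.

    `component` may be "documents", "queries" or "qrels"; leave it as
    None for the full ad-hoc dataset of each subforum.
    """
    suffix = f".{component}" if component else ""
    return [f"irds.beir.cqadupstack.{name}{suffix}" for name in SUBFORUMS]


for dataset_id in cqadupstack_ids("queries"):
    print(dataset_id)
```

Each ID could then be passed to prepare_dataset as in the Usage section. That call is left out here because it would download the data.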
beir/cqadupstack/english
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>english</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.english.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>english</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.english.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>english</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.english.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>english</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.english
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>english</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/gaming
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gaming</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gaming.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gaming</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gaming.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gaming</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gaming.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gaming</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gaming
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gaming</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/gis
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gis</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gis.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gis</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gis.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gis</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gis.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gis</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.gis
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>gis</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/mathematica
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>mathematica</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.mathematica.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>mathematica</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.mathematica.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>mathematica</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.mathematica.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>mathematica</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.mathematica
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>mathematica</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/physics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>physics</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.physics.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>physics</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.physics.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>physics</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.physics.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>physics</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.physics
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>physics</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/programmers
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>programmers</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.programmers.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>programmers</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.programmers.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>programmers</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.programmers.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>programmers</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.programmers
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>programmers</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/stats
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>stats</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.stats.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>stats</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.stats.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>stats</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.stats.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>stats</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.stats
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>stats</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/tex
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>tex</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.tex.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>tex</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.tex.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>tex</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.tex.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>tex</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.tex
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>tex</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/unix
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>unix</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.unix.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>unix</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.unix.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>unix</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.unix.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>unix</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.unix
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>unix</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/webmasters
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>webmasters</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.webmasters.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>webmasters</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.webmasters.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>webmasters</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.webmasters.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>webmasters</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.webmasters
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>webmasters</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/cqadupstack/wordpress
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>wordpress</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.wordpress.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>wordpress</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.wordpress.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>wordpress</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.wordpress.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>wordpress</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
-
Dataset irds.beir.cqadupstack.wordpress
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the <kbd>wordpress</kbd> StackExchange subforum. </p> <ul> <li><a href=”https://people.eng.unimelb.edu.au/tbaldwin/pubs/adcs2015.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cis.unimelb.edu.au/resources/cqadupstack/”>Dataset website</a></li> <li><a href=”https://github.com/D1Doris/CQADupStack”>Dataset repository</a></li> </ul>
beir/dbpedia-entity
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
-
Dataset irds.beir.dbpedia-entity.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
-
Dataset irds.beir.dbpedia-entity.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
-
Dataset irds.beir.dbpedia-entity.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> A random sample of 67 queries from the official test set, used as a dev set. </p>
-
Dataset irds.beir.dbpedia-entity.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A random sample of 67 queries from the official test set, used as a dev set. </p>
-
Dataset irds.beir.dbpedia-entity.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> A random sample of 67 queries from the official test set, used as a dev set. </p>
-
Dataset irds.beir.dbpedia-entity.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> The official test set, without the 67 queries used as a dev set. </p>
-
Dataset irds.beir.dbpedia-entity.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The official test set, without the 67 queries used as a dev set. </p>
-
Dataset irds.beir.dbpedia-entity.test
datamaestro_text.datasets.irds.data.Adhoc
<p> The official test set, without the 67 queries used as a dev set. </p>
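The entries in this list appear to follow a regular naming pattern: the ir-datasets ID (for example beir/dbpedia-entity/dev) has its slashes replaced by dots and is prefixed with irds., and a .documents, .queries, or .qrels suffix selects one part of the dataset. The sketch below is a minimal illustration that assumes this pattern holds for every entry. The helper to_datamaestro_id and the table WRAPPER_BY_PART are hypothetical names introduced here; they are not part of datamaestro or ir-datasets.

```python
from typing import Optional

# Wrapper classes as they appear in this listing, keyed by ID suffix.
# None stands for the bare dataset ID (listed here as an Adhoc dataset).
WRAPPER_BY_PART = {
    "documents": "datamaestro_text.datasets.irds.data.Documents",
    "queries": "datamaestro_text.datasets.irds.data.Topics",
    "qrels": "datamaestro_text.datasets.irds.data.AdhocAssessments",
    None: "datamaestro_text.datasets.irds.data.Adhoc",
}


def to_datamaestro_id(irds_id: str, part: Optional[str] = None) -> str:
    """Hypothetical helper: build the datamaestro ID for an ir-datasets ID.

    Assumes the naming pattern observed in this listing; it does not
    check that the resulting dataset actually exists.
    """
    if part not in WRAPPER_BY_PART:
        raise ValueError(f"unknown dataset part: {part!r}")
    # "beir/fever/dev" -> "irds.beir.fever.dev"
    base = "irds." + irds_id.strip("/").replace("/", ".")
    return base if part is None else f"{base}.{part}"


print(to_datamaestro_id("beir/dbpedia-entity/dev", "qrels"))
print(WRAPPER_BY_PART["qrels"])
```

Under the same assumption, the returned string is what would be passed to prepare_dataset, as in the Usage example at the top of this page.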
beir/fever
<p> A version of the FEVER dataset for fact verification. Includes queries from the /train, /dev, and /test subsets. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
-
Dataset irds.beir.fever.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the FEVER dataset for fact verification. Includes queries from the /train, /dev, and /test subsets. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
-
Dataset irds.beir.fever.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the FEVER dataset for fact verification. Includes queries from the /train, /dev, and /test subsets. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
-
Dataset irds.beir.fever.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The official dev set. </p>
-
Dataset irds.beir.fever.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The official dev set. </p>
-
Dataset irds.beir.fever.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The official dev set. </p>
-
Dataset irds.beir.fever.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> The official test set. </p>
-
Dataset irds.beir.fever.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The official test set. </p>
-
Dataset irds.beir.fever.test
datamaestro_text.datasets.irds.data.Adhoc
<p> The official test set. </p>
-
Dataset irds.beir.fever.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The official train set. </p>
-
Dataset irds.beir.fever.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The official train set. </p>
-
Dataset irds.beir.fever.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The official train set. </p>
beir/fiqa
<p> A version of the FIQA-2018 dataset (financial opinion question answering). Queries include those in the /train, /dev, and /test subsets. </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
-
Dataset irds.beir.fiqa.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the FIQA-2018 dataset (financial opinion question answering). Queries include those in the /train, /dev, and /test subsets. </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
-
Dataset irds.beir.fiqa.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the FIQA-2018 dataset (financial opinion question answering). Queries include those in the /train, /dev, and /test subsets. </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
-
Dataset irds.beir.fiqa.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Random sample of 500 queries from the official dataset. </p>
-
Dataset irds.beir.fiqa.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Random sample of 500 queries from the official dataset. </p>
-
Dataset irds.beir.fiqa.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Random sample of 500 queries from the official dataset. </p>
-
Dataset irds.beir.fiqa.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Random sample of 648 queries from the official dataset. </p>
-
Dataset irds.beir.fiqa.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Random sample of 648 queries from the official dataset. </p>
-
Dataset irds.beir.fiqa.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Random sample of 648 queries from the official dataset. </p>
-
Dataset irds.beir.fiqa.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dataset without the 1148 queries sampled for /dev and /test. </p>
-
Dataset irds.beir.fiqa.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dataset without the 1148 queries sampled for /dev and /test. </p>
-
Dataset irds.beir.fiqa.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dataset without the 1148 queries sampled for /dev and /test. </p>
beir/hotpotqa
<p> A version of the Hotpot QA dataset for multi-hop question answering. Queries include all those in /train, /dev, and /test. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
-
Dataset irds.beir.hotpotqa.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the Hotpot QA dataset for multi-hop question answering. Queries include all those in /train, /dev, and /test. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
-
Dataset irds.beir.hotpotqa.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the Hotpot QA dataset for multi-hop question answering. Queries include all those in /train, /dev, and /test. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
-
Dataset irds.beir.hotpotqa.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> A random selection of 5447 queries from /train. </p>
-
Dataset irds.beir.hotpotqa.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A random selection of 5447 queries from /train. </p>
-
Dataset irds.beir.hotpotqa.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> A random selection of 5447 queries from /train. </p>
-
Dataset irds.beir.hotpotqa.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official <em>dev</em> set from HotpotQA, here used as a test set. </p>
-
Dataset irds.beir.hotpotqa.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official <em>dev</em> set from HotpotQA, here used as a test set. </p>
-
Dataset irds.beir.hotpotqa.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Official <em>dev</em> set from HotpotQA, here used as a test set. </p>
-
Dataset irds.beir.hotpotqa.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set, without the 5447 randomly selected queries used for /dev. </p>
-
Dataset irds.beir.hotpotqa.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set, without the 5447 randomly selected queries used for /dev. </p>
-
Dataset irds.beir.hotpotqa.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set, without the 5447 randomly selected queries used for /dev. </p>
beir/msmarco
<p> A version of the MS MARCO passage ranking dataset. Includes queries from the /train, /dev, and /test sub-datasets. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
-
Dataset irds.beir.msmarco.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the MS MARCO passage ranking dataset. Includes queries from the /train, /dev, and /test sub-datasets. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
-
Dataset irds.beir.msmarco.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the MS MARCO passage ranking dataset. Includes queries from the /train, /dev, and /test sub-datasets. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
-
Dataset irds.beir.msmarco.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the MS MARCO passage ranking dev set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/dev</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
-
Dataset irds.beir.msmarco.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the MS MARCO passage ranking dev set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/dev</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
-
Dataset irds.beir.msmarco.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the MS MARCO passage ranking dev set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/dev</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
-
Dataset irds.beir.msmarco.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the TREC Deep Learning 2019 set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a></li> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.beir.msmarco.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the TREC Deep Learning 2019 set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a></li> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.beir.msmarco.test
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the TREC Deep Learning 2019 set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a></li> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.beir.msmarco.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the MS MARCO passage ranking train set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/train</a></li> </ul>
-
Dataset irds.beir.msmarco.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the MS MARCO passage ranking train set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/train</a></li> </ul>
-
Dataset irds.beir.msmarco.train
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the MS MARCO passage ranking train set. </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage/train</a></li> </ul>
beir/nfcorpus
<p> A version of the NF Corpus (Nutrition Facts). Queries use the “title” variant of the query, which here is often a natural language question. Queries include all those from /train, /dev, and /test. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
-
Dataset irds.beir.nfcorpus.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the NF Corpus (Nutrition Facts). Queries use the “title” variant of the query, which here is often a natural language question. Queries include all those from /train, /dev, and /test. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
-
Dataset irds.beir.nfcorpus.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the NF Corpus (Nutrition Facts). Queries use the “title” variant of the query, which here is often a natural language question. Queries include all those from /train, /dev, and /test. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
-
Dataset irds.beir.nfcorpus.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Combined dev set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/dev</a></li> </ul>
-
Dataset irds.beir.nfcorpus.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Combined dev set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/dev</a></li> </ul>
-
Dataset irds.beir.nfcorpus.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Combined dev set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/dev</a></li> </ul>
-
Dataset irds.beir.nfcorpus.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Combined test set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/test</a></li> </ul>
-
Dataset irds.beir.nfcorpus.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Combined test set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/test</a></li> </ul>
-
Dataset irds.beir.nfcorpus.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Combined test set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/test</a></li> </ul>
-
Dataset irds.beir.nfcorpus.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Combined train set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/train</a></li> </ul>
-
Dataset irds.beir.nfcorpus.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Combined train set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/train</a></li> </ul>
-
Dataset irds.beir.nfcorpus.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Combined train set of NFCorpus. </p> <ul> <li>See also: <a class=”ds-ref”>nfcorpus/train</a></li> </ul>
beir/nq
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs from what is done in both <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the BEIR paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.beir.nq.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs from what is done in both <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the BEIR paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.beir.nq.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs from what is done in both <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the BEIR paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.beir.nq.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs from what is done in both <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the BEIR paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.beir.nq
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs from what is done in both <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the BEIR paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
beir/quora
<p> A version of the Quora duplicate question detection dataset (QQP). Includes queries from /dev and /test sets. </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
-
Dataset irds.beir.quora.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the Quora duplicate question detection dataset (QQP). Includes queries from /dev and /test sets. </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
-
Dataset irds.beir.quora.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the Quora duplicate question detection dataset (QQP). Includes queries from /dev and /test sets. </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
-
Dataset irds.beir.quora.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> A 5,000-question subset of the original dataset, with no overlap with the other subsets. </p>
-
Dataset irds.beir.quora.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A 5,000-question subset of the original dataset, with no overlap with the other subsets. </p>
-
Dataset irds.beir.quora.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> A 5,000-question subset of the original dataset, with no overlap with the other subsets. </p>
-
Dataset irds.beir.quora.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> A 10,000-question subset of the original dataset, with no overlap with the other subsets. </p>
-
Dataset irds.beir.quora.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A 10,000-question subset of the original dataset, with no overlap with the other subsets. </p>
-
Dataset irds.beir.quora.test
datamaestro_text.datasets.irds.data.Adhoc
<p> A 10,000-question subset of the original dataset, with no overlap with the other subsets. </p>
beir/scidocs
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.beir.scidocs.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.beir.scidocs.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.beir.scidocs.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.beir.scidocs
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
beir/scifact
<p> A version of the SciFact dataset, for fact verification. Queries include those from the /train and /test sets. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
-
Dataset irds.beir.scifact.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the SciFact dataset, for fact verification. Queries include those from the /train and /test sets. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
-
Dataset irds.beir.scifact.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the SciFact dataset, for fact verification. Queries include those from the /train and /test sets. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
-
Dataset irds.beir.scifact.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> The official <em>dev</em> set. </p>
-
Dataset irds.beir.scifact.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The official <em>dev</em> set. </p>
-
Dataset irds.beir.scifact.test
datamaestro_text.datasets.irds.data.Adhoc
<p> The official <em>dev</em> set. </p>
-
Dataset irds.beir.scifact.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The official train set. </p>
-
Dataset irds.beir.scifact.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The official train set. </p>
-
Dataset irds.beir.scifact.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The official train set. </p>
beir/trec-covid
<p> A version of the TREC COVID (complete) dataset, with titles and abstracts as documents. Queries are the question variant. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>cord19/trec-covid</a>. </p> <ul> <li><a href=”https://www.semanticscholar.org/cord19”>Document collection site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>cord19/trec-covid</a></li> </ul>
-
Dataset irds.beir.trec-covid.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the TREC COVID (complete) dataset, with titles and abstracts as documents. Queries are the question variant. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>cord19/trec-covid</a>. </p> <ul> <li><a href=”https://www.semanticscholar.org/cord19”>Document collection site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>cord19/trec-covid</a></li> </ul>
-
Dataset irds.beir.trec-covid.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the TREC COVID (complete) dataset, with titles and abstracts as documents. Queries are the question variant. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>cord19/trec-covid</a>. </p> <ul> <li><a href=”https://www.semanticscholar.org/cord19”>Document collection site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>cord19/trec-covid</a></li> </ul>
-
Dataset irds.beir.trec-covid.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the TREC COVID (complete) dataset, with titles and abstracts as documents. Queries are the question variant. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>cord19/trec-covid</a>. </p> <ul> <li><a href=”https://www.semanticscholar.org/cord19”>Document collection site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>cord19/trec-covid</a></li> </ul>
-
Dataset irds.beir.trec-covid
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the TREC COVID (complete) dataset, with titles and abstracts as documents. Queries are the question variant. </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>cord19/trec-covid</a>. </p> <ul> <li><a href=”https://www.semanticscholar.org/cord19”>Document collection site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>cord19/trec-covid</a></li> </ul>
beir/webis-touche2020
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020.documents
datamaestro_text.datasets.irds.data.Documents
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020.queries
datamaestro_text.datasets.irds.data.Topics
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020
datamaestro_text.datasets.irds.data.Adhoc
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
beir/webis-touche2020/v2
<p> Version 2 of the Touché-2020 dataset, for argument retrieval. This version uses the “corrected” version of the qrels, mapped to version 1 of the corpus. </p> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020.v2.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 2 of the Touché-2020 dataset, for argument retrieval. This version uses the “corrected” version of the qrels, mapped to version 1 of the corpus. </p> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020.v2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version 2 of the Touché-2020 dataset, for argument retrieval. This version uses the “corrected” version of the qrels, mapped to version 1 of the corpus. </p> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020.v2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version 2 of the Touché-2020 dataset, for argument retrieval. This version uses the “corrected” version of the qrels, mapped to version 1 of the corpus. </p> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.beir.webis-touche2020.v2
datamaestro_text.datasets.irds.data.Adhoc
<p> Version 2 of the Touché-2020 dataset, for argument retrieval. This version uses the “corrected” version of the qrels, mapped to version 1 of the corpus. </p> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
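The dataset identifiers in this list appear to follow a regular pattern: the ir-datasets ID with each `/` replaced by `.`, prefixed with `irds.`, and optionally suffixed with `.documents`, `.queries`, or `.qrels`. The helper below is a minimal sketch of that mapping. The name `to_datamaestro_id` is hypothetical and not part of datamaestro-text, and the pattern is inferred from the entries listed here rather than from the library's source.

```python
from typing import Optional

# Component suffixes seen in this list (assumption: inferred from the entries).
COMPONENTS = ("documents", "queries", "qrels")


def to_datamaestro_id(irds_id: str, component: Optional[str] = None) -> str:
    """Map an ir-datasets ID to the ID pattern used in this list.

    Hypothetical helper for illustration; not part of datamaestro-text.
    """
    if component is not None and component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    # "beir/webis-touche2020/v2" becomes "irds.beir.webis-touche2020.v2"
    base = "irds." + irds_id.strip("/").replace("/", ".")
    return base if component is None else f"{base}.{component}"


if __name__ == "__main__":
    print(to_datamaestro_id("beir/webis-touche2020/v2"))
    print(to_datamaestro_id("beir/webis-touche2020/v2", "qrels"))
```

The resulting string is the kind of identifier passed to `prepare_dataset` in the Usage section; whether a given ID exists depends on the installed ir-datasets version.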
c4/en-noclean-tr
<p> The “en-noclean” train subset of the corpus, consisting of ~1B documents written in English. Document IDs are assigned as proposed by the <a href=”https://trec-health-misinfo.github.io/”> TREC Health Misinformation 2021 track</a>. </p>
-
Dataset irds.c4.en-noclean-tr.documents
datamaestro_text.datasets.irds.data.Documents
<p> The “en-noclean” train subset of the corpus, consisting of ~1B documents written in English. Document IDs are assigned as proposed by the <a href=”https://trec-health-misinfo.github.io/”> TREC Health Misinformation 2021 track</a>. </p>
-
Dataset irds.c4.en-noclean-tr.trec-misinfo-2021.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Health Misinformation 2021 track. </p> <ul> <li><a href=”https://trec-health-misinfo.github.io/”>Shared Task Website</a></li> </ul>
car/v1.5
<p> Version 1.5 of the TREC CAR dataset. This version is used for year 1 (2017) of the TREC CAR shared task. </p>
-
Dataset irds.car.v1.5.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 1.5 of the TREC CAR dataset. This version is used for year 1 (2017) of the TREC CAR shared task. </p>
-
Dataset irds.car.v1.5.test200.queries
datamaestro_text.datasets.irds.data.Topics
<p> Unofficial test set consisting of manually selected articles. Sometimes used as a validation set. </p>
-
Dataset irds.car.v1.5.test200.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Unofficial test set consisting of manually selected articles. Sometimes used as a validation set. </p>
-
Dataset irds.car.v1.5.test200
datamaestro_text.datasets.irds.data.Adhoc
<p> Unofficial test set consisting of manually selected articles. Sometimes used as a validation set. </p>
-
Dataset irds.car.v1.5.train.fold0.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 0 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold0.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 0 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold0
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 0 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 1 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 1 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold1
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 1 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 2 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 2 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold2
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 2 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 3 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 3 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold3
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 3 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold4.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 4 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold4.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 4 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.train.fold4
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 4 of the official large training set for TREC CAR 2017. Relevance assumed from hierarchical structure of pages (i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.trec-y1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test set of TREC CAR 2017 (year 1). </p>
-
Dataset irds.car.v1.5.trec-y1.auto.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test set of TREC CAR 2017 (year 1), using automatic relevance judgments (assumed from hierarchical structure of pages, i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.trec-y1.auto.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test set of TREC CAR 2017 (year 1), using automatic relevance judgments (assumed from hierarchical structure of pages, i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.trec-y1.auto
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test set of TREC CAR 2017 (year 1), using automatic relevance judgments (assumed from hierarchical structure of pages, i.e., paragraphs under a header are assumed relevant.) </p>
-
Dataset irds.car.v1.5.trec-y1.manual.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test set of TREC CAR 2017 (year 1), using manual graded relevance judgments. </p>
-
Dataset irds.car.v1.5.trec-y1.manual.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test set of TREC CAR 2017 (year 1), using manual graded relevance judgments. </p>
-
Dataset irds.car.v1.5.trec-y1.manual
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test set of TREC CAR 2017 (year 1), using manual graded relevance judgments. </p>
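The automatic judgments described for the training folds and `trec-y1/auto` treat every paragraph under a heading as relevant to the query formed from that heading. A minimal sketch of that idea on a toy page follows. The page structure, the query ID format, and the `auto_qrels` function are hypothetical and do not reflect the actual TREC CAR data format or tooling.

```python
from typing import Dict, List, Tuple

# Toy page: heading path -> paragraph IDs found directly under that heading.
# Hypothetical structure for illustration only.
Page = Dict[Tuple[str, ...], List[str]]


def auto_qrels(title: str, page: Page) -> List[Tuple[str, str, int]]:
    """Return (query_id, paragraph_id, relevance) triples.

    Each heading path becomes one query; every paragraph under it is
    assumed relevant (relevance 1). No human assessment is involved.
    """
    qrels = []
    for headings, paragraph_ids in page.items():
        query_id = "/".join((title,) + headings)
        for pid in paragraph_ids:
            qrels.append((query_id, pid, 1))
    return qrels


page = {
    ("History",): ["p1", "p2"],
    ("History", "Early years"): ["p3"],
}
for row in auto_qrels("Example", page):
    print(row)
```

Judgments built this way are binary and cheap to produce at scale, which is why they suit the large training folds, while the `manual` variant relies on graded human assessments.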
car/v2.0
<p> Version 2.0 of the TREC CAR dataset. </p>
-
Dataset irds.car.v2.0.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 2.0 of the TREC CAR dataset. </p>
Highwire (TREC Genomics 2006-07)
<p> Medical document collection from <a href=”https://www.highwirepress.com/”>Highwire Press</a>. Includes 162,259 scientific articles from 49 journals. </p> <p> This dataset is used for the TREC 2006-07 Genomics track. </p> <p> Note that these documents are split into passages based on paragraph tags in the HTML. </p> <ul> <li>Documents: Biomedical journal articles</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2006data.html#docs”>Information about document collection</a></li> </ul>
-
Dataset irds.highwire.documents
datamaestro_text.datasets.irds.data.Documents
<p> Medical document collection from <a href=”https://www.highwirepress.com/”>Highwire Press</a>. Includes 162,259 scientific articles from 49 journals. </p> <p> This dataset is used for the TREC 2006-07 Genomics track. </p> <p> Note that these documents are split into passages based on paragraph tags in the HTML. </p> <ul> <li>Documents: Biomedical journal articles</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2006data.html#docs”>Information about document collection</a></li> </ul>
-
Dataset irds.highwire.trec-genomics-2006.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Genomics Track 2006 benchmark. Contains 28 queries with passage-level relevance judgments. </p> <ul> <li>Documents: Biomedical journal articles</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, by passage</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2006data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/GEO06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.highwire.trec-genomics-2006.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Genomics Track 2006 benchmark. Contains 28 queries with passage-level relevance judgments. </p> <ul> <li>Documents: Biomedical journal articles</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, by passage</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2006data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/GEO06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.highwire.trec-genomics-2006
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Genomics Track 2006 benchmark. Contains 28 queries with passage-level relevance judgments. </p> <ul> <li>Documents: Biomedical journal articles</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, by passage</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2006data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/GEO06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.highwire.trec-genomics-2007.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Genomics Track 2007 benchmark. Contains 36 queries with passage-level relevance judgments. </p> <ul> <li>Documents: Biomedical journal articles</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, by passage</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2007data.html”>Shared task data site</a></li> <li><a href=”https://dmice.ohsu.edu/hersh/trec-07-genomics.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.highwire.trec-genomics-2007.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Genomics Track 2007 benchmark. Contains 36 queries with passage-level relevance judgments. </p> <ul> <li>Documents: Biomedical journal articles</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, by passage</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2007data.html”>Shared task data site</a></li> <li><a href=”https://dmice.ohsu.edu/hersh/trec-07-genomics.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.highwire.trec-genomics-2007
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Genomics Track 2007 benchmark. Contains 36 queries with passage-level relevance judgments. </p> <ul> <li>Documents: Biomedical journal articles</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, by passage</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2007data.html”>Shared task data site</a></li> <li><a href=”https://dmice.ohsu.edu/hersh/trec-07-genomics.pdf”>Shared task paper</a></li> </ul>
medline/2004
<p> 3M Medline articles including titles and abstracts, used for the TREC 2004-05 Genomics track. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2004data.html”>Information about document collection</a></li> </ul>
-
Dataset irds.medline.2004.documents
datamaestro_text.datasets.irds.data.Documents
<p> 3M Medline articles including titles and abstracts, used for the TREC 2004-05 Genomics track. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2004data.html”>Information about document collection</a></li> </ul>
-
Dataset irds.medline.2004.trec-genomics-2004.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Genomics Track 2004 benchmark. Contains 50 queries with article-level relevance judgments. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, graded</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2004data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/GEO.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2004.trec-genomics-2004.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Genomics Track 2004 benchmark. Contains 50 queries with article-level relevance judgments. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, graded</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2004data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/GEO.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2004.trec-genomics-2004
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Genomics Track 2004 benchmark. Contains 50 queries with article-level relevance judgments. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, graded</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2004data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/GEO.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2004.trec-genomics-2005.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Genomics Track 2005 benchmark. Contains 50 queries with article-level relevance judgments. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, graded</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2005data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/GEO.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2004.trec-genomics-2005.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Genomics Track 2005 benchmark. Contains 50 queries with article-level relevance judgments. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, graded</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2005data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/GEO.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2004.trec-genomics-2005
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Genomics Track 2005 benchmark. Contains 50 queries with article-level relevance judgments. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Natural language questions</li> <li>Qrels: deep, graded</li> <li><a href=”https://dmice.ohsu.edu/trec-gen/2005data.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/GEO.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
medline/2017
<p> 26M Medline and AACR/ASCO Proceedings articles including titles and abstracts. This collection is used for the TREC 2017-18 Precision Medicine track. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li><a href=”http://www.trec-cds.org/2017.html”>Information about document collection</a></li> </ul>
-
Dataset irds.medline.2017.documents
datamaestro_text.datasets.irds.data.Documents
<p> 26M Medline and AACR/ASCO Proceedings articles including titles and abstracts. This collection is used for the TREC 2017-18 Precision Medicine track. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li><a href=”http://www.trec-cds.org/2017.html”>Information about document collection</a></li> </ul>
-
Dataset irds.medline.2017.trec-pm-2017.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Precision Medicine (PM) Track 2017 benchmark. Contains 30 queries containing disease, gene, and target demographic information. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Specific to TREC PM information need</li> <li>Qrels: deep, graded</li> <li><a href=”http://www.trec-cds.org/2017.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-PM.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2017.trec-pm-2017.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Precision Medicine (PM) Track 2017 benchmark. Contains 30 queries containing disease, gene, and target demographic information. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Specific to TREC PM information need</li> <li>Qrels: deep, graded</li> <li><a href=”http://www.trec-cds.org/2017.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-PM.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2017.trec-pm-2017
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Precision Medicine (PM) Track 2017 benchmark. Contains 30 queries containing disease, gene, and target demographic information. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Specific to TREC PM information need</li> <li>Qrels: deep, graded</li> <li><a href=”http://www.trec-cds.org/2017.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-PM.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2017.trec-pm-2018.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Precision Medicine (PM) Track 2018 benchmark. Contains 50 queries containing disease, gene, and target demographic information. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Specific to TREC PM information need</li> <li>Qrels: deep, graded</li> <li><a href=”http://www.trec-cds.org/2018.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-PM.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2017.trec-pm-2018.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Precision Medicine (PM) Track 2018 benchmark. Contains 50 queries containing disease, gene, and target demographic information. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Specific to TREC PM information need</li> <li>Qrels: deep, graded</li> <li><a href=”http://www.trec-cds.org/2018.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-PM.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.medline.2017.trec-pm-2018
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Precision Medicine (PM) Track 2018 benchmark. Contains 50 queries containing disease, gene, and target demographic information. </p> <ul> <li>Documents: Biomedical article titles and abstracts</li> <li>Queries: Specific to TREC PM information need</li> <li>Qrels: deep, graded</li> <li><a href=”http://www.trec-cds.org/2018.html”>Shared task data site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-PM.pdf”>Shared task paper</a></li> </ul>
clinicaltrials/2017
<p> A snapshot of <a href=”https://clinicaltrials.gov/”>ClinicalTrials.gov</a> from April 2017 for use with the <a class=”ds-ref”>clinicaltrials/2017/trec-pm-2017</a> and <a class=”ds-ref”>clinicaltrials/2017/trec-pm-2018</a> Clinical Trials subtasks. </p> <ul> <li><a href=”http://www.trec-cds.org/2017.html#documents”>Dataset information</a></li> </ul>
-
Dataset irds.clinicaltrials.2017.documents
datamaestro_text.datasets.irds.data.Documents
<p> A snapshot of <a href=”https://clinicaltrials.gov/”>ClinicalTrials.gov</a> from April 2017 for use with the <a class=”ds-ref”>clinicaltrials/2017/trec-pm-2017</a> and <a class=”ds-ref”>clinicaltrials/2017/trec-pm-2018</a> Clinical Trials subtasks. </p> <ul> <li><a href=”http://www.trec-cds.org/2017.html#documents”>Dataset information</a></li> </ul>
-
Dataset irds.clinicaltrials.2017.trec-pm-2017.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC 2017 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2017.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-PM.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>medline/2017/trec-pm-2017</a></li> </ul>
-
Dataset irds.clinicaltrials.2017.trec-pm-2017.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC 2017 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2017.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-PM.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>medline/2017/trec-pm-2017</a></li> </ul>
-
Dataset irds.clinicaltrials.2017.trec-pm-2017
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC 2017 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2017.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-PM.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>medline/2017/trec-pm-2017</a></li> </ul>
-
Dataset irds.clinicaltrials.2017.trec-pm-2018.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC 2018 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2018.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-PM.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>medline/2017/trec-pm-2018</a></li> </ul>
-
Dataset irds.clinicaltrials.2017.trec-pm-2018.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC 2018 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2018.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-PM.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>medline/2017/trec-pm-2018</a></li> </ul>
-
Dataset irds.clinicaltrials.2017.trec-pm-2018
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC 2018 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2018.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-PM.pdf”>Shared task paper</a></li> <li>See also: <a class=”ds-ref”>medline/2017/trec-pm-2018</a></li> </ul>
clinicaltrials/2019
<p> A snapshot of <a href=”https://clinicaltrials.gov/”>ClinicalTrials.gov</a> from May 2019 for use with the <a class=”ds-ref”>clinicaltrials/2019/trec-pm-2019</a> Clinical Trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2019.html#documents”>Dataset information</a></li> </ul>
-
Dataset irds.clinicaltrials.2019.documents
datamaestro_text.datasets.irds.data.Documents
<p> A snapshot of <a href=”https://clinicaltrials.gov/”>ClinicalTrials.gov</a> from May 2019 for use with the <a class=”ds-ref”>clinicaltrials/2019/trec-pm-2019</a> Clinical Trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2019.html#documents”>Dataset information</a></li> </ul>
-
Dataset irds.clinicaltrials.2019.trec-pm-2019.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC 2019 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2019.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.PM.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clinicaltrials.2019.trec-pm-2019.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC 2019 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2019.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.PM.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clinicaltrials.2019.trec-pm-2019
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC 2019 Precision Medicine clinical trials subtask. </p> <ul> <li><a href=”http://www.trec-cds.org/2019.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.PM.pdf”>Shared task paper</a></li> </ul>
clinicaltrials/2021
<p> A snapshot of <a href=”https://clinicaltrials.gov/”>ClinicalTrials.gov</a> from April 2021 for use with the <a href=”http://www.trec-cds.org/2021.html”>TREC Clinical Trials 2021 Track</a>. </p> <ul> <li><a href=”http://www.trec-cds.org/2021.html#documents”>Dataset information</a></li> </ul>
-
Dataset irds.clinicaltrials.2021.documents
datamaestro_text.datasets.irds.data.Documents
<p> A snapshot of <a href=”https://clinicaltrials.gov/”>ClinicalTrials.gov</a> from April 2021 for use with the <a href=”http://www.trec-cds.org/2021.html”>TREC Clinical Trials 2021 Track</a>. </p> <ul> <li><a href=”http://www.trec-cds.org/2021.html#documents”>Dataset information</a></li> </ul>
-
Dataset irds.clinicaltrials.2021.trec-ct-2021.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Clinical Trials 2021 track. </p> <ul> <li><a href=”http://www.trec-cds.org/2021.html”>Shared Task Website</a></li> </ul>
-
Dataset irds.clinicaltrials.2021.trec-ct-2021.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Clinical Trials 2021 track. </p> <ul> <li><a href=”http://www.trec-cds.org/2021.html”>Shared Task Website</a></li> </ul>
-
Dataset irds.clinicaltrials.2021.trec-ct-2021
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Clinical Trials 2021 track. </p> <ul> <li><a href=”http://www.trec-cds.org/2021.html”>Shared Task Website</a></li> </ul>
-
Dataset irds.clinicaltrials.2021.trec-ct-2022.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Clinical Trials 2022 track. </p> <ul> <li><a href=”https://www.trec-cds.org/2022.html”>Shared Task Website</a></li> </ul>
ClueWeb09
<p> ClueWeb 2009 web document collection. Contains over 1B web pages, in 10 languages. </p> <p> The dataset is obtained for a fee from CMU, and is shipped on hard drives. More information is provided <a href=”https://lemurproject.org/clueweb09/”>here</a>. </p> <ul> <li><a href=”https://lemurproject.org/clueweb09/”>Document collection site</a></li> </ul>
-
Dataset irds.clueweb09.documents
datamaestro_text.datasets.irds.data.Documents
<p> ClueWeb 2009 web document collection. Contains over 1B web pages, in 10 languages. </p> <p> The dataset is obtained for a fee from CMU, and is shipped on hard drives. More information is provided <a href=”https://lemurproject.org/clueweb09/”>here</a>. </p> <ul> <li><a href=”https://lemurproject.org/clueweb09/”>Document collection site</a></li> </ul>
-
Dataset irds.clueweb09.trec-mq-2009.queries
datamaestro_text.datasets.irds.data.Topics
<p> TREC 2009 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/MQ09OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.trec-mq-2009.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> TREC 2009 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/MQ09OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.trec-mq-2009
datamaestro_text.datasets.irds.data.Adhoc
<p> TREC 2009 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/MQ09OVERVIEW.pdf”>Shared task paper</a></li> </ul>
clueweb09/ar
<p> Subset of ClueWeb09 with only Arabic-language documents. </p>
-
Dataset irds.clueweb09.ar.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only Arabic-language documents. </p>
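Section headings in this list use the slash-separated ir-datasets ID (clueweb09/ar), while the dataset entries use a dotted ID under the irds namespace (irds.clueweb09.ar.documents). The sketch below illustrates that naming pattern. It is inferred from the entries in this list, not taken from the datamaestro-text source, and the helper name to_datamaestro_id is hypothetical.

```python
def to_datamaestro_id(irds_id: str, component: str = "") -> str:
    """Build a dotted datamaestro ID from a slash-separated ir-datasets ID.

    Hypothetical helper: it mirrors the naming pattern visible in this
    list and is not part of datamaestro-text.
    """
    # "clueweb09/catb" -> ["irds", "clueweb09", "catb"]
    parts = ["irds", *irds_id.strip("/").split("/")]
    # Optional component suffix such as "documents", "queries" or "qrels"
    if component:
        parts.append(component)
    return ".".join(parts)


print(to_datamaestro_id("clueweb09/ar", "documents"))
print(to_datamaestro_id("clueweb09/catb/trec-web-2009"))
```

The resulting string can be passed to prepare_dataset in the same way as in the usage example at the top of this page, provided the installed ir-datasets version ships the corresponding collection.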
clueweb09/catb
<p> Subset of ClueWeb09 with the first ~50 million English-language documents. Used as a smaller collection for TREC Web Track tasks. </p>
-
Dataset irds.clueweb09.catb.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with the first ~50 million English-language documents. Used as a smaller collection for TREC Web Track tasks. </p>
-
Dataset irds.clueweb09.catb.trec-web-2009.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2009 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2009.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2009 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2009
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2009 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2009.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2009 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2009.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2009 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2009.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2009 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2010.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2010 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2010.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2010 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2010
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2010 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2010.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2010 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2010.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2010 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2010.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2010 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2011.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2011 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2011.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2011 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2011
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2011 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2011.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2011 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2011.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2011 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2011.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2011 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2012.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2012 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2012.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2012 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2012
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2012 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2012.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2012 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2012.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2012 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.catb.trec-web-2012.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2012 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
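Each benchmark in this list appears up to three times: with a .queries suffix (Topics), with a .qrels suffix (AdhocAssessments), and without a component suffix (Adhoc). Document collections use a .documents suffix (Documents). The sketch below encodes that observation as a lookup. The function wrapper_type is hypothetical, and the Adhoc fallback is an assumption that holds for the entries shown in this section but may not hold for every ir-datasets collection.

```python
# Suffix -> wrapper class name, as observed in the entries of this list.
COMPONENT_TYPES = {
    "documents": "Documents",
    "queries": "Topics",
    "qrels": "AdhocAssessments",
}


def wrapper_type(dataset_id: str) -> str:
    """Guess the wrapper type listed for a dotted dataset ID.

    Hypothetical helper based on the naming pattern in this list only.
    """
    # Last dotted component, e.g. "qrels" in "irds.clueweb09.trec-mq-2009.qrels"
    suffix = dataset_id.rsplit(".", 1)[-1]
    # Assumption: IDs without a known component suffix are Adhoc bundles.
    name = COMPONENT_TYPES.get(suffix, "Adhoc")
    return f"datamaestro_text.datasets.irds.data.{name}"


for dataset_id in (
    "irds.clueweb09.catb.documents",
    "irds.clueweb09.catb.trec-web-2012.queries",
    "irds.clueweb09.catb.trec-web-2012.qrels",
    "irds.clueweb09.catb.trec-web-2012.diversity",
):
    print(dataset_id, "->", wrapper_type(dataset_id))
```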
clueweb09/de
<p> Subset of ClueWeb09 with only German-language documents. </p>
-
Dataset irds.clueweb09.de.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only German-language documents. </p>
clueweb09/en
<p> Subset of ClueWeb09 with only English-language documents. </p>
-
Dataset irds.clueweb09.en.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only English-language documents. </p>
-
Dataset irds.clueweb09.en.trec-web-2009.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2009 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2009.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2009 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2009
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2009 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2009.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2009 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2009.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2009 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2009.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2009 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web09.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec18/papers/WEB09.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2010.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2010 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2010.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2010 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2010
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2010 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2010.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2010 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2010.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2010 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2010.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2010 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web10.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2011.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2011 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2011.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2011 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2011
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2011 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2011.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2011 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2011.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2011 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2011.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2011 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec20/papers/WEB.OVERVIEW.pdf”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2012.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2012 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2012.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2012 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2012
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2012 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2012.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2012 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2012.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2012 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb09.en.trec-web-2012.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2012 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2012.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec21/papers/WEB12.overview.pdf”>Shared task paper</a></li> </ul>
clueweb09/es
<p> Subset of ClueWeb09 with only Spanish-language documents. </p>
-
Dataset irds.clueweb09.es.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only Spanish-language documents. </p>
clueweb09/fr
<p> Subset of ClueWeb09 with only French-language documents. </p>
-
Dataset irds.clueweb09.fr.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only French-language documents. </p>
clueweb09/it
<p> Subset of ClueWeb09 with only Italian-language documents. </p>
-
Dataset irds.clueweb09.it.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only Italian-language documents. </p>
clueweb09/ja
<p> Subset of ClueWeb09 with only Japanese-language documents. </p>
-
Dataset irds.clueweb09.ja.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only Japanese-language documents. </p>
clueweb09/ko
<p> Subset of ClueWeb09 with only Korean-language documents. </p>
-
Dataset irds.clueweb09.ko.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only Korean-language documents. </p>
clueweb09/pt
<p> Subset of ClueWeb09 with only Portuguese-language documents. </p>
-
Dataset irds.clueweb09.pt.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only Portuguese-language documents. </p>
clueweb09/zh
<p> Subset of ClueWeb09 with only Chinese-language documents. </p>
-
Dataset irds.clueweb09.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of ClueWeb09 with only Chinese-language documents. </p>
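The ten language subsets above all follow the same ID pattern, irds.clueweb09.<lang>.documents. As a small sketch, the IDs can be generated rather than typed by hand. The language codes are the ones listed in this section; the constant and function names are hypothetical and not part of datamaestro-text.

```python
# Language codes of the ClueWeb09 subsets listed in this section.
CLUEWEB09_LANGUAGES = ("ar", "de", "en", "es", "fr", "it", "ja", "ko", "pt", "zh")


def clueweb09_document_ids(languages=CLUEWEB09_LANGUAGES):
    """Return the dotted document-collection ID for each language subset."""
    return [f"irds.clueweb09.{lang}.documents" for lang in languages]


for dataset_id in clueweb09_document_ids():
    print(dataset_id)
```

Note that clueweb09/catb is not a language subset (it is the first ~50 million English-language documents), so it is deliberately left out of the tuple.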
ClueWeb12
<p> ClueWeb 2012 web document collection. Contains 733M web pages. </p> <p> The dataset is obtained for a fee from CMU, and is shipped as hard drives. More information is provided <a href=”https://lemurproject.org/clueweb12/”>here</a>. </p> <ul> <li><a href=”https://lemurproject.org/clueweb12/”>Document collection site</a></li> <li><a href=”http://boston.lti.cs.cmu.edu/clueweb12/”>Dataset construction details</a></li> </ul>
-
Dataset irds.clueweb12.documents
datamaestro_text.datasets.irds.data.Documents
<p> ClueWeb 2012 web document collection. Contains 733M web pages. </p> <p> The dataset is obtained for a fee from CMU, and is shipped as hard drives. More information is provided <a href=”https://lemurproject.org/clueweb12/”>here</a>. </p> <ul> <li><a href=”https://lemurproject.org/clueweb12/”>Document collection site</a></li> <li><a href=”http://boston.lti.cs.cmu.edu/clueweb12/”>Dataset construction details</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2013.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2013 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2013.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2013.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2013 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2013.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2013
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2013 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2013.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2013.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2013 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2013.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2013.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2013 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2013.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2013.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2013 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2013.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2014.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2014 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2014.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2014.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2014 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2014.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2014
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2014 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2014.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2014.diversity.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2014 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2014.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2014.diversity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2014 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2014.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.trec-web-2014.diversity
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2014 diverse ranking benchmark. Contains 50 queries with deep subtopic relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/web2014.html”>Shared task site</a></li> <li><a href=”http://www-personal.umich.edu/~kevynct/pubs/trec-web-2014-overview.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.touche-2020-task-2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Decision-making processes, be they at the societal or at the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one’s stance. Technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. Touché 2020 is the first lab on argument retrieval at CLEF 2020, featuring two tasks. </p> <p> Given a comparative question, retrieve and rank documents from ClueWeb12 that help to answer the comparative question. </p> <p> Documents are judged based on their general topical relevance. </p> <ul> <li><a href=”https://webis.de/events/touche-20/shared-task-2.html”>Task 2 website</a></li> <li><a href=”https://webis.de/events/touche-20/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-58219-7_26”>Overview paper</a></li> <li><a href=”https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4”>Workshop videos</a></li> </ul>
-
Dataset irds.clueweb12.touche-2020-task-2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Decision-making processes, be they at the societal or at the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one’s stance. Technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. Touché 2020 is the first lab on argument retrieval at CLEF 2020, featuring two tasks. </p> <p> Given a comparative question, retrieve and rank documents from ClueWeb12 that help to answer the comparative question. </p> <p> Documents are judged based on their general topical relevance. </p> <ul> <li><a href=”https://webis.de/events/touche-20/shared-task-2.html”>Task 2 website</a></li> <li><a href=”https://webis.de/events/touche-20/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-58219-7_26”>Overview paper</a></li> <li><a href=”https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4”>Workshop videos</a></li> </ul>
-
Dataset irds.clueweb12.touche-2020-task-2
datamaestro_text.datasets.irds.data.Adhoc
<p> Decision-making processes, be they at the societal or at the personal level, eventually come to a point where one side will challenge the other with a why-question, which is a prompt to justify one’s stance. Technologies for argument mining and argumentation processing are maturing at a rapid pace, giving rise for the first time to argument retrieval. Touché 2020 is the first lab on argument retrieval at CLEF 2020, featuring two tasks. </p> <p> Given a comparative question, retrieve and rank documents from ClueWeb12 that help to answer the comparative question. </p> <p> Documents are judged based on their general topical relevance. </p> <ul> <li><a href=”https://webis.de/events/touche-20/shared-task-2.html”>Task 2 website</a></li> <li><a href=”https://webis.de/events/touche-20/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-58219-7_26”>Overview paper</a></li> <li><a href=”https://www.youtube.com/playlist?list=PLgD1TOdHQCI90NnCLg9f4g32KLuOfPXR4”>Workshop videos</a></li> </ul>
-
Dataset irds.clueweb12.touche-2021-task-2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Decision-making processes, be it at the societal or at the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2021 is the second lab on argument retrieval at CLEF 2021 featuring two tasks. </p> <p> Given a comparative question, retrieve and rank documents from the ClueWeb12 that help to answer the comparative question. </p> <p> Documents are judged based on their general topical relevance and for rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <ul> <li><a href=”https://webis.de/events/touche-21/shared-task-2.html”>Task 2 website</a></li> <li><a href=”https://webis.de/events/touche-21/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-85251-1_28”>Overview paper</a></li> <li><a href=”https://www.youtube.com/playlist?list=PLgD1TOdHQCI8FDfYnzcjbsf26RIatNgM3”>Workshop videos</a></li> </ul>
-
Dataset irds.clueweb12.touche-2021-task-2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Decision-making processes, be it at the societal or at the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2021 is the second lab on argument retrieval at CLEF 2021 featuring two tasks. </p> <p> Given a comparative question, retrieve and rank documents from the ClueWeb12 that help to answer the comparative question. </p> <p> Documents are judged based on their general topical relevance and for rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <ul> <li><a href=”https://webis.de/events/touche-21/shared-task-2.html”>Task 2 website</a></li> <li><a href=”https://webis.de/events/touche-21/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-85251-1_28”>Overview paper</a></li> <li><a href=”https://www.youtube.com/playlist?list=PLgD1TOdHQCI8FDfYnzcjbsf26RIatNgM3”>Workshop videos</a></li> </ul>
-
Dataset irds.clueweb12.touche-2021-task-2
datamaestro_text.datasets.irds.data.Adhoc
<p> Decision-making processes, be it at the societal or at the personal level, often come to a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2021 is the second lab on argument retrieval at CLEF 2021 featuring two tasks. </p> <p> Given a comparative question, retrieve and rank documents from the ClueWeb12 that help to answer the comparative question. </p> <p> Documents are judged based on their general topical relevance and for rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <ul> <li><a href=”https://webis.de/events/touche-21/shared-task-2.html”>Task 2 website</a></li> <li><a href=”https://webis.de/events/touche-21/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-85251-1_28”>Overview paper</a></li> <li><a href=”https://www.youtube.com/playlist?list=PLgD1TOdHQCI8FDfYnzcjbsf26RIatNgM3”>Workshop videos</a></li> </ul>
clueweb12/b13
<p> Official subset of the ClueWeb12 datasets with 52M web pages. </p>
-
Dataset irds.clueweb12.b13.documents
datamaestro_text.datasets.irds.data.Documents
<p> Official subset of the ClueWeb12 datasets with 52M web pages. </p>
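The dataset names in this list appear to follow a simple pattern: the ir-datasets ID (for example clueweb12/b13) is prefixed with irds, slashes become dots, and an optional component suffix (documents, queries, qrels) is appended. The sketch below illustrates that pattern only. The helper name to_datamaestro_id is hypothetical and is not part of datamaestro-text or ir-datasets.

```python
def to_datamaestro_id(irds_id: str, part: str = "") -> str:
    """Build the dotted name used in this list from an ir-datasets ID.

    Hypothetical helper for illustration: it mirrors the naming pattern
    visible in this list and is not an API of datamaestro-text.
    """
    if not irds_id or irds_id.startswith("/") or irds_id.endswith("/"):
        raise ValueError(f"unexpected ir-datasets ID: {irds_id!r}")
    # "clueweb12/b13" -> "irds.clueweb12.b13"
    dotted = "irds." + irds_id.replace("/", ".")
    # Append the component name (documents, queries, qrels) when given
    return f"{dotted}.{part}" if part else dotted


print(to_datamaestro_id("clueweb12/b13", "documents"))
print(to_datamaestro_id("clueweb12/b13/clef-ehealth", "qrels"))
print(to_datamaestro_id("cord19/trec-covid/round1"))
```

The resulting string is what would be passed to prepare_dataset, as in the usage example at the top of this page.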
-
Dataset irds.clueweb12.b13.clef-ehealth.queries
datamaestro_text.datasets.irds.data.Topics
<p> The CLEF eHealth 2016-17 IR dataset. Contains consumer health queries and judgments containing trustworthiness and understandability scores, in addition to the normal relevance assessments. </p> <p> This dataset contains the combined 2016 and 2017 relevance judgments, since the same queries were used in both years. The assessment year can be distinguished using iteration (2016 is iteration 0, 2017 is iteration 1). </p> <ul> <li><a href=”https://sites.google.com/site/clefehealth2016/task-3”>2016 shared task site</a></li> <li><a href=”https://sites.google.com/site/clefehealth2017/task-3”>2017 shared task site</a></li> <li><a href=”http://ceur-ws.org/Vol-1609/16090015.pdf”>2016 shared task paper</a></li> <li><a href=”http://ceur-ws.org/Vol-1866/invited_paper_16.pdf”>2017 shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.clef-ehealth.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The CLEF eHealth 2016-17 IR dataset. Contains consumer health queries and judgments containing trustworthiness and understandability scores, in addition to the normal relevance assessments. </p> <p> This dataset contains the combined 2016 and 2017 relevance judgments, since the same queries were used in both years. The assessment year can be distinguished using iteration (2016 is iteration 0, 2017 is iteration 1). </p> <ul> <li><a href=”https://sites.google.com/site/clefehealth2016/task-3”>2016 shared task site</a></li> <li><a href=”https://sites.google.com/site/clefehealth2017/task-3”>2017 shared task site</a></li> <li><a href=”http://ceur-ws.org/Vol-1609/16090015.pdf”>2016 shared task paper</a></li> <li><a href=”http://ceur-ws.org/Vol-1866/invited_paper_16.pdf”>2017 shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.clef-ehealth
datamaestro_text.datasets.irds.data.Adhoc
<p> The CLEF eHealth 2016-17 IR dataset. Contains consumer health queries and judgments containing trustworthiness and understandability scores, in addition to the normal relevance assessments. </p> <p> This dataset contains the combined 2016 and 2017 relevance judgments, since the same queries were used in both years. The assessment year can be distinguished using iteration (2016 is iteration 0, 2017 is iteration 1). </p> <ul> <li><a href=”https://sites.google.com/site/clefehealth2016/task-3”>2016 shared task site</a></li> <li><a href=”https://sites.google.com/site/clefehealth2017/task-3”>2017 shared task site</a></li> <li><a href=”http://ceur-ws.org/Vol-1609/16090015.pdf”>2016 shared task paper</a></li> <li><a href=”http://ceur-ws.org/Vol-1866/invited_paper_16.pdf”>2017 shared task paper</a></li> </ul>
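Because the 2016 and 2017 judgments are combined, it can be useful to separate them by assessment year. The following is a minimal sketch on in-memory records, not the library's own API: the SampleQrel type, its field names, the sample IDs, and the assumption that iteration is stored as the string "0" or "1" are all illustrative. The real records also carry trustworthiness and understandability scores, which are omitted here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class SampleQrel(NamedTuple):
    """Hypothetical stand-in for a relevance judgment (field names are assumptions)."""
    query_id: str
    doc_id: str
    relevance: int
    iteration: str


# Mapping stated in the dataset description: iteration 0 is 2016, iteration 1 is 2017
ITERATION_TO_YEAR = {"0": 2016, "1": 2017}


def split_by_year(qrels: Iterable[SampleQrel]) -> Dict[int, List[SampleQrel]]:
    """Group judgments by assessment year using the iteration value."""
    by_year: Dict[int, List[SampleQrel]] = defaultdict(list)
    for qrel in qrels:
        # str() tolerates iteration being stored as an int or as a string
        year = ITERATION_TO_YEAR[str(qrel.iteration)]
        by_year[year].append(qrel)
    return dict(by_year)


# Made-up sample data, for illustration only
sample = [
    SampleQrel("q1", "doc-a", 1, "0"),
    SampleQrel("q1", "doc-b", 0, "1"),
    SampleQrel("q2", "doc-a", 2, "1"),
]
for year, judgments in sorted(split_by_year(sample).items()):
    print(year, len(judgments))
```

An unexpected iteration value raises a KeyError, which is deliberate in this sketch: silently dropping judgments would hide a mismatch between the assumed and the actual encoding.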
-
Dataset irds.clueweb12.b13.clef-ehealth.cs.queries
datamaestro_text.datasets.irds.data.Topics
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Czech. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.cs.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Czech. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.cs
datamaestro_text.datasets.irds.data.Adhoc
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Czech. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.de.queries
datamaestro_text.datasets.irds.data.Topics
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to German. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.de.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to German. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.de
datamaestro_text.datasets.irds.data.Adhoc
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to German. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.fr.queries
datamaestro_text.datasets.irds.data.Topics
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to French. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.fr.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to French. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.fr
datamaestro_text.datasets.irds.data.Adhoc
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to French. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.hu.queries
datamaestro_text.datasets.irds.data.Topics
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Hungarian. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.hu.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Hungarian. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.hu
datamaestro_text.datasets.irds.data.Adhoc
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Hungarian. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.pl.queries
datamaestro_text.datasets.irds.data.Topics
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Polish. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.pl.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Polish. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.pl
datamaestro_text.datasets.irds.data.Adhoc
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Polish. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.sv.queries
datamaestro_text.datasets.irds.data.Topics
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Swedish. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.sv.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Swedish. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.clef-ehealth.sv
datamaestro_text.datasets.irds.data.Adhoc
<p> The CLEF eHealth 2016-17 IR dataset, with queries professionally translated to Swedish. See <a class=”ds-ref”>clueweb12/b13/clef-ehealth</a> for more details. </p>
-
Dataset irds.clueweb12.b13.ntcir-www-1.queries
datamaestro_text.datasets.irds.data.Topics
<p> The NTCIR-13 We Want Web (WWW) 1 ad-hoc ranking benchmark. Contains 100 queries with deep relevance judgments (avg 255 per query). Judgments aggregated from two assessors. Note that the qrels contain additional judgments from the NTCIR-14 CENTRE track. </p> <ul> <li><a href=”http://www.thuir.cn/ntcirwww/”>Shared task site</a></li> <li><a href=”http://www.thuir.cn/ntcirwww/files/ntcir13wwwov.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.ntcir-www-1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The NTCIR-13 We Want Web (WWW) 1 ad-hoc ranking benchmark. Contains 100 queries with deep relevance judgments (avg 255 per query). Judgments aggregated from two assessors. Note that the qrels contain additional judgments from the NTCIR-14 CENTRE track. </p> <ul> <li><a href=”http://www.thuir.cn/ntcirwww/”>Shared task site</a></li> <li><a href=”http://www.thuir.cn/ntcirwww/files/ntcir13wwwov.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.ntcir-www-1
datamaestro_text.datasets.irds.data.Adhoc
<p> The NTCIR-13 We Want Web (WWW) 1 ad-hoc ranking benchmark. Contains 100 queries with deep relevance judgments (avg 255 per query). Judgments aggregated from two assessors. Note that the qrels contain additional judgments from the NTCIR-14 CENTRE track. </p> <ul> <li><a href=”http://www.thuir.cn/ntcirwww/”>Shared task site</a></li> <li><a href=”http://www.thuir.cn/ntcirwww/files/ntcir13wwwov.pdf”>Shared task paper</a></li> </ul>
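The judgment depth quoted above (an average number of judgments per query) is a simple statistic over the qrels. A minimal sketch of how such a figure could be computed is shown below. It works on plain (query_id, doc_id, relevance) tuples, which is an assumed simplification of the real records, and the sample data is made up.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumed simplified judgment shape: (query_id, doc_id, relevance)
Judgment = Tuple[str, str, int]


def judgments_per_query(qrels: Iterable[Judgment]) -> Counter:
    """Count how many judgments each query received."""
    return Counter(query_id for query_id, _doc_id, _relevance in qrels)


def average_depth(qrels: Iterable[Judgment]) -> float:
    """Average number of judgments per judged query."""
    counts = judgments_per_query(qrels)
    if not counts:
        raise ValueError("no judgments given")
    return sum(counts.values()) / len(counts)


# Made-up sample: q1 has three judgments, q2 has one
sample = [
    ("q1", "doc-a", 2),
    ("q1", "doc-b", 0),
    ("q1", "doc-c", 1),
    ("q2", "doc-a", 0),
]
print(average_depth(sample))
```

Queries without any judgment do not appear in the qrels and are therefore not counted in the denominator, so the result can differ from a figure computed over the full topic set.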
-
Dataset irds.clueweb12.b13.ntcir-www-2.queries
datamaestro_text.datasets.irds.data.Topics
<p> The NTCIR-14 We Want Web (WWW) 2 ad-hoc ranking benchmark. Contains 80 queries with deep relevance judgments (avg 345 per query). Judgments aggregated from two assessors. </p> <ul> <li><a href=”http://www.thuir.cn/ntcirwww2/”>Shared task site</a></li> <li><a href=”http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings14/pdf/ntcir/01-NTCIR14-OV-WWW-MaoJ.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.ntcir-www-2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The NTCIR-14 We Want Web (WWW) 2 ad-hoc ranking benchmark. Contains 80 queries with deep relevance judgments (avg 345 per query). Judgments aggregated from two assessors. </p> <ul> <li><a href=”http://www.thuir.cn/ntcirwww2/”>Shared task site</a></li> <li><a href=”http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings14/pdf/ntcir/01-NTCIR14-OV-WWW-MaoJ.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.ntcir-www-2
datamaestro_text.datasets.irds.data.Adhoc
<p> The NTCIR-14 We Want Web (WWW) 2 ad-hoc ranking benchmark. Contains 80 queries with deep relevance judgments (avg 345 per query). Judgments aggregated from two assessors. </p> <ul> <li><a href=”http://www.thuir.cn/ntcirwww2/”>Shared task site</a></li> <li><a href=”http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings14/pdf/ntcir/01-NTCIR14-OV-WWW-MaoJ.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.ntcir-www-3.queries
datamaestro_text.datasets.irds.data.Topics
<p> The NTCIR-15 We Want Web (WWW) 3 ad-hoc ranking benchmark. Contains 160 queries with deep relevance judgments (to be released). 80 of the queries are from <a class=”ds-ref”>clueweb12/b13/ntcir-www-2</a>. </p> <ul> <li><a href=”http://sakailab.com/www3/”>Shared task site</a></li> </ul>
-
Dataset irds.clueweb12.b13.trec-misinfo-2019.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Medical Misinformation 2019 dataset. </p> <ul> <li><a href=”https://trec.nist.gov/data/misinfo2019.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.D.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.trec-misinfo-2019.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Medical Misinformation 2019 dataset. </p> <ul> <li><a href=”https://trec.nist.gov/data/misinfo2019.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.D.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.clueweb12.b13.trec-misinfo-2019
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Medical Misinformation 2019 dataset. </p> <ul> <li><a href=”https://trec.nist.gov/data/misinfo2019.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.D.pdf”>Shared task paper</a></li> </ul>
CODEC
<p> CODEC Document Ranking sub-task. </p> <ul> <li>Documents: curated web articles</li> <li>Queries: challenging, entity-focused queries</li> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>kilt/codec</a>, the entity ranking subtask</li> </ul>
-
Dataset irds.codec.documents
datamaestro_text.datasets.irds.data.Documents
<p> CODEC Document Ranking sub-task. </p> <ul> <li>Documents: curated web articles</li> <li>Queries: challenging, entity-focused queries</li> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>kilt/codec</a>, the entity ranking subtask</li> </ul>
-
Dataset irds.codec.queries
datamaestro_text.datasets.irds.data.Topics
<p> CODEC Document Ranking sub-task. </p> <ul> <li>Documents: curated web articles</li> <li>Queries: challenging, entity-focused queries</li> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>kilt/codec</a>, the entity ranking subtask</li> </ul>
-
Dataset irds.codec.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> CODEC Document Ranking sub-task. </p> <ul> <li>Documents: curated web articles</li> <li>Queries: challenging, entity-focused queries</li> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>kilt/codec</a>, the entity ranking subtask</li> </ul>
-
Dataset irds.codec
datamaestro_text.datasets.irds.data.Adhoc
<p> CODEC Document Ranking sub-task. </p> <ul> <li>Documents: curated web articles</li> <li>Queries: challenging, entity-focused queries</li> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>kilt/codec</a>, the entity ranking subtask</li> </ul>
-
Dataset irds.codec.economics.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about economics. </p>
-
Dataset irds.codec.economics.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about economics. </p>
-
Dataset irds.codec.economics
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about economics. </p>
-
Dataset irds.codec.history.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about history. </p>
-
Dataset irds.codec.history.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about history. </p>
-
Dataset irds.codec.history
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about history. </p>
-
Dataset irds.codec.politics.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about politics. </p>
-
Dataset irds.codec.politics.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about politics. </p>
-
Dataset irds.codec.politics
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about politics. </p>
CORD-19
<p> Collection of scientific articles related to COVID-19. </p> <p> Uses the 2020-07-16 version of the dataset, corresponding to the “complete” collection used for TREC COVID. </p> <p> Note that this version of the document collection only provides article meta-data. To get the full text, use <a class=”ds-ref”>cord19/fulltext</a>. </p> <ul> <li><a href=”https://www.semanticscholar.org/cord19”>Document collection site</a></li> </ul>
-
Dataset irds.cord19.documents
datamaestro_text.datasets.irds.data.Documents
<p> Collection of scientific articles related to COVID-19. </p> <p> Uses the 2020-07-16 version of the dataset, corresponding to the “complete” collection used for TREC COVID. </p> <p> Note that this version of the document collection only provides article meta-data. To get the full text, use <a class=”ds-ref”>cord19/fulltext</a>. </p> <ul> <li><a href=”https://www.semanticscholar.org/cord19”>Document collection site</a></li> </ul>
-
Dataset irds.cord19.trec-covid.queries
datamaestro_text.datasets.irds.data.Topics
<p> The Complete TREC COVID collection. Queries related to COVID-19, including deep relevance judgments. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The Complete TREC COVID collection. Queries related to COVID-19, including deep relevance judgments. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid
datamaestro_text.datasets.irds.data.Adhoc
<p> The Complete TREC COVID collection. Queries related to COVID-19, including deep relevance judgments. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round5.queries
datamaestro_text.datasets.irds.data.Topics
<p> Round 5 of the TREC COVID task. Includes 50 queries related to COVID-19. This uses the “2020-07-16” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round5.html”>Round 5 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round5.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Round 5 of the TREC COVID task. Includes 50 queries related to COVID-19. This uses the “2020-07-16” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round5.html”>Round 5 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round5
datamaestro_text.datasets.irds.data.Adhoc
<p> Round 5 of the TREC COVID task. Includes 50 queries related to COVID-19. This uses the “2020-07-16” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round5.html”>Round 5 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
cord19/fulltext
<p> Version of the <a class=”ds-ref”>cord19</a> dataset that includes article full texts. This dataset takes longer to load than the version that only includes article meta-data. </p>
-
Dataset irds.cord19.fulltext.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of the <a class=”ds-ref”>cord19</a> dataset that includes article full texts. This dataset takes longer to load than the version that only includes article meta-data. </p>
-
Dataset irds.cord19.fulltext.trec-covid.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of the <a class=”ds-ref”>cord19/trec-covid</a> dataset that includes article full texts. This dataset takes longer to load than the version that only includes article meta-data. </p> <p> Queries and qrels are the same as in <a class=”ds-ref”>cord19/trec-covid</a>; this version just uses the extended documents from <a class=”ds-ref”>cord19/fulltext</a>. </p>
-
Dataset irds.cord19.fulltext.trec-covid.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of the <a class=”ds-ref”>cord19/trec-covid</a> dataset that includes article full texts. This dataset takes longer to load than the version that only includes article meta-data. </p> <p> Queries and qrels are the same as in <a class=”ds-ref”>cord19/trec-covid</a>; this version just uses the extended documents from <a class=”ds-ref”>cord19/fulltext</a>. </p>
-
Dataset irds.cord19.fulltext.trec-covid
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of the <a class=”ds-ref”>cord19/trec-covid</a> dataset that includes article full texts. This dataset takes longer to load than the version that only includes article meta-data. </p> <p> Queries and qrels are the same as in <a class=”ds-ref”>cord19/trec-covid</a>; this version just uses the extended documents from <a class=”ds-ref”>cord19/fulltext</a>. </p>
cord19/trec-covid/round1
<p> Round 1 of the TREC COVID task. Includes 30 queries related to COVID-19. This uses the “2020-04-10” version of the collection. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round1.html”>Round 1 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round1.documents
datamaestro_text.datasets.irds.data.Documents
<p> Round 1 of the TREC COVID task. Includes 30 queries related to COVID-19. This uses the “2020-04-10” version of the collection. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round1.html”>Round 1 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Round 1 of the TREC COVID task. Includes 30 queries related to COVID-19. This uses the “2020-04-10” version of the collection. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round1.html”>Round 1 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Round 1 of the TREC COVID task. Includes 30 queries related to COVID-19. This uses the “2020-04-10” version of the collection. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round1.html”>Round 1 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round1
datamaestro_text.datasets.irds.data.Adhoc
<p> Round 1 of the TREC COVID task. Includes 30 queries related to COVID-19. This uses the “2020-04-10” version of the collection. </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round1.html”>Round 1 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
cord19/trec-covid/round2
<p> Round 2 of the TREC COVID task. Includes 35 queries related to COVID-19. This uses the “2020-05-01” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round2.html”>Round 2 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round2.documents
datamaestro_text.datasets.irds.data.Documents
<p> Round 2 of the TREC COVID task. Includes 35 queries related to COVID-19. This uses the “2020-05-01” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round2.html”>Round 2 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Round 2 of the TREC COVID task. Includes 35 queries related to COVID-19. This uses the “2020-05-01” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round2.html”>Round 2 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Round 2 of the TREC COVID task. Includes 35 queries related to COVID-19. This uses the “2020-05-01” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round2.html”>Round 2 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round2
datamaestro_text.datasets.irds.data.Adhoc
<p> Round 2 of the TREC COVID task. Includes 35 queries related to COVID-19. This uses the “2020-05-01” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round2.html”>Round 2 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
cord19/trec-covid/round3
<p> Round 3 of the TREC COVID task. Includes 40 queries related to COVID-19. This uses the “2020-05-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round3.html”>Round 3 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round3.documents
datamaestro_text.datasets.irds.data.Documents
<p> Round 3 of the TREC COVID task. Includes 40 queries related to COVID-19. This uses the “2020-05-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round3.html”>Round 3 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Round 3 of the TREC COVID task. Includes 40 queries related to COVID-19. This uses the “2020-05-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round3.html”>Round 3 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Round 3 of the TREC COVID task. Includes 40 queries related to COVID-19. This uses the “2020-05-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round3.html”>Round 3 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round3
datamaestro_text.datasets.irds.data.Adhoc
<p> Round 3 of the TREC COVID task. Includes 40 queries related to COVID-19. This uses the “2020-05-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round3.html”>Round 3 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
cord19/trec-covid/round4
<p> Round 4 of the TREC COVID task. Includes 45 queries related to COVID-19. This uses the “2020-06-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round4.html”>Round 4 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round4.documents
datamaestro_text.datasets.irds.data.Documents
<p> Round 4 of the TREC COVID task. Includes 45 queries related to COVID-19. This uses the “2020-06-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round4.html”>Round 4 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round4.queries
datamaestro_text.datasets.irds.data.Topics
<p> Round 4 of the TREC COVID task. Includes 45 queries related to COVID-19. This uses the “2020-06-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round4.html”>Round 4 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round4.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Round 4 of the TREC COVID task. Includes 45 queries related to COVID-19. This uses the “2020-06-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round4.html”>Round 4 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.cord19.trec-covid.round4
datamaestro_text.datasets.irds.data.Adhoc
<p> Round 4 of the TREC COVID task. Includes 45 queries related to COVID-19. This uses the “2020-06-19” version of the collection. </p> <p> Note that the qrels do not contain results from the prior round(s). Use the “complete” version for this setting (<a class=”ds-ref”>cord19/trec-covid</a>). </p> <ul> <li><a href=”https://ir.nist.gov/covidSubmit/round4.html”>Round 4 Guidelines</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/index.html”>Shared task site</a></li> <li><a href=”https://ir.nist.gov/covidSubmit/papers/Forum_TRECCOVID1.pdf”>Shared task paper</a></li> </ul>
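The section titles above (e.g. cord19/trec-covid/round4) and the ds-ref links in the descriptions follow ir-datasets naming, while the dataset entries use dotted datamaestro IDs. Below is a minimal sketch of that mapping. It assumes the conversion is a plain slash-to-dot substitution with an irds. prefix, as the entries listed here suggest; to_datamaestro_id is a hypothetical helper and not part of either library.

```python
def to_datamaestro_id(irds_id: str) -> str:
    """Map an ir-datasets ID to the dotted ID used in the listing above.

    Assumption: the mapping is a plain slash-to-dot substitution with an
    "irds." prefix, as observed for the entries on this page.
    """
    parts = [part for part in irds_id.strip("/").split("/") if part]
    if not parts:
        raise ValueError("empty ir-datasets ID")
    return ".".join(["irds", *parts])


# The "complete" TREC-COVID version referenced in the round descriptions
complete_id = to_datamaestro_id("cord19/trec-covid")
print(complete_id)
```

The resulting string can be passed to prepare_dataset as in the usage example at the top of this page, provided the installed ir-datasets version exposes that dataset.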
Cranfield
<p> A small corpus of 1,400 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language questions</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/cran/”>Dataset Information</a></li> </ul>
-
Dataset irds.cranfield.documents
datamaestro_text.datasets.irds.data.Documents
<p> A small corpus of 1,400 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language questions</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/cran/”>Dataset Information</a></li> </ul>
-
Dataset irds.cranfield.queries
datamaestro_text.datasets.irds.data.Topics
<p> A small corpus of 1,400 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language questions</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/cran/”>Dataset Information</a></li> </ul>
-
Dataset irds.cranfield.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A small corpus of 1,400 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language questions</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/cran/”>Dataset Information</a></li> </ul>
-
Dataset irds.cranfield
datamaestro_text.datasets.irds.data.Adhoc
<p> A small corpus of 1,400 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language questions</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/cran/”>Dataset Information</a></li> </ul>
CSL
<p> The CSL dataset, used for the TREC NeuCLIR technical document task. </p>
-
Dataset irds.csl.documents
datamaestro_text.datasets.irds.data.Documents
<p> The CSL dataset, used for the TREC NeuCLIR technical document task. </p>
-
Dataset irds.csl.trec-2023.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC NeuCLIR 2023 technical document task. </p>
-
Dataset irds.csl.trec-2023.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC NeuCLIR 2023 technical document task. </p>
-
Dataset irds.csl.trec-2023
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC NeuCLIR 2023 technical document task. </p>
disks45/nocr
<p> A version of <a class=”ds-ref”>disks45</a> without the Congressional Record. This is the typical setting for tasks like TREC 7, TREC 8, and TREC Robust 2004. </p>
-
Dataset irds.disks45.nocr.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of <a class=”ds-ref”>disks45</a> without the Congressional Record. This is the typical setting for tasks like TREC 7, TREC 8, and TREC Robust 2004. </p>
-
Dataset irds.disks45.nocr.trec-robust-2004.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Robust retrieval task focuses on “improving the consistency of retrieval technology by focusing on poorly performing topics.” </p> <p> The TREC Robust document collection is from TREC disks 4 and 5. Due to the copyrighted nature of the documents, this collection is for research use only, which requires agreements to be filed with NIST. See details <a href=”https://trec.nist.gov/data/cd45/index.html”>here</a>. </p> <ul> <li>Documents: News articles</li> <li>Queries: keyword queries, descriptions, narratives</li> <li>Relevance: Deep judgments</li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/ROBUST.OVERVIEW.pdf”>Task Overview Paper</a></li> <li>See also: <a class=”ds-ref”>aquaint/trec-robust-2005</a></li> </ul>
-
Dataset irds.disks45.nocr.trec-robust-2004.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Robust retrieval task focuses on “improving the consistency of retrieval technology by focusing on poorly performing topics.” </p> <p> The TREC Robust document collection is from TREC disks 4 and 5. Due to the copyrighted nature of the documents, this collection is for research use only, which requires agreements to be filed with NIST. See details <a href=”https://trec.nist.gov/data/cd45/index.html”>here</a>. </p> <ul> <li>Documents: News articles</li> <li>Queries: keyword queries, descriptions, narratives</li> <li>Relevance: Deep judgments</li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/ROBUST.OVERVIEW.pdf”>Task Overview Paper</a></li> <li>See also: <a class=”ds-ref”>aquaint/trec-robust-2005</a></li> </ul>
-
Dataset irds.disks45.nocr.trec-robust-2004
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Robust retrieval task focuses on “improving the consistency of retrieval technology by focusing on poorly performing topics.” </p> <p> The TREC Robust document collection is from TREC disks 4 and 5. Due to the copyrighted nature of the documents, this collection is for research use only, which requires agreements to be filed with NIST. See details <a href=”https://trec.nist.gov/data/cd45/index.html”>here</a>. </p> <ul> <li>Documents: News articles</li> <li>Queries: keyword queries, descriptions, narratives</li> <li>Relevance: Deep judgments</li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/ROBUST.OVERVIEW.pdf”>Task Overview Paper</a></li> <li>See also: <a class=”ds-ref”>aquaint/trec-robust-2005</a></li> </ul>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold1.queries
datamaestro_text.datasets.irds.data.Topics
<p>Robust04 Fold 1 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Robust04 Fold 1 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold1
datamaestro_text.datasets.irds.data.Adhoc
<p>Robust04 Fold 1 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold2.queries
datamaestro_text.datasets.irds.data.Topics
<p>Robust04 Fold 2 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Robust04 Fold 2 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold2
datamaestro_text.datasets.irds.data.Adhoc
<p>Robust04 Fold 2 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold3.queries
datamaestro_text.datasets.irds.data.Topics
<p>Robust04 Fold 3 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Robust04 Fold 3 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold3
datamaestro_text.datasets.irds.data.Adhoc
<p>Robust04 Fold 3 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold4.queries
datamaestro_text.datasets.irds.data.Topics
<p>Robust04 Fold 4 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold4.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Robust04 Fold 4 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold4
datamaestro_text.datasets.irds.data.Adhoc
<p>Robust04 Fold 4 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold5.queries
datamaestro_text.datasets.irds.data.Topics
<p>Robust04 Fold 5 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold5.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Robust04 Fold 5 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec-robust-2004.fold5
datamaestro_text.datasets.irds.data.Adhoc
<p>Robust04 Fold 5 (Title) proposed by Huston & Croft (2014) and used in numerous works</p>
-
Dataset irds.disks45.nocr.trec7.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC 7 Adhoc Retrieval track. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec7/papers/overview_7.pdf.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.disks45.nocr.trec7.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC 7 Adhoc Retrieval track. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec7/papers/overview_7.pdf.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.disks45.nocr.trec7
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC 7 Adhoc Retrieval track. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec7/papers/overview_7.pdf.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.disks45.nocr.trec8.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC 8 Adhoc Retrieval track. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec8/papers/overview_8.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.disks45.nocr.trec8.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC 8 Adhoc Retrieval track. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec8/papers/overview_8.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.disks45.nocr.trec8
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC 8 Adhoc Retrieval track. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec8/papers/overview_8.pdf”>Task Overview Paper</a></li> </ul>
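The five Robust04 folds listed above are commonly rotated for cross-validation. The sketch below only enumerates the dataset IDs for each rotation. It assumes a plain leave-one-fold-out scheme; cross_validation_splits is a hypothetical helper, and individual papers may additionally reserve one training fold for validation.

```python
from typing import Iterator, List, Tuple

BASE = "irds.disks45.nocr.trec-robust-2004"
FOLDS = [f"{BASE}.fold{i}" for i in range(1, 6)]


def cross_validation_splits() -> Iterator[Tuple[List[str], str]]:
    """Yield (training fold IDs, held-out fold ID), one pair per fold.

    Assumption: plain leave-one-fold-out rotation; papers differ on whether
    one of the training folds is further set aside for validation.
    """
    for held_out in FOLDS:
        train = [fold for fold in FOLDS if fold != held_out]
        yield train, held_out


for train_ids, test_id in cross_validation_splits():
    print(test_id, "<-", ", ".join(train_ids))
```

Each ID can then be loaded with prepare_dataset, as in the usage example at the top of this page.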
DPR Wiki100
<p> A Wikipedia dump from 20 December 2018, split into passages of 100 words. Used in the DPR paper (and subsequent works) for retrieval experiments over Q&A collections. </p> <ul> <li><a href=”https://arxiv.org/pdf/2004.04906.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/facebookresearch/DPR”>Repository</a></li> </ul>
-
Dataset irds.dpr-w100.documents
datamaestro_text.datasets.irds.data.Documents
<p> A Wikipedia dump from 20 December 2018, split into passages of 100 words. Used in the DPR paper (and subsequent works) for retrieval experiments over Q&A collections. </p> <ul> <li><a href=”https://arxiv.org/pdf/2004.04906.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/facebookresearch/DPR”>Repository</a></li> </ul>
-
Dataset irds.dpr-w100.natural-questions.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Dev subset from the Natural Questions Q&A collection. This differs from the <a class=”ds-ref”>natural-questions/dev</a> dataset in that it uses the full Wikipedia dump and applies additional filtering (described in the DPR paper). </p> <ul> <li>See also: <a class=”ds-ref”>natural-questions</a></li> </ul>
-
Dataset irds.dpr-w100.natural-questions.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Dev subset from the Natural Questions Q&A collection. This differs from the <a class=”ds-ref”>natural-questions/dev</a> dataset in that it uses the full Wikipedia dump and applies additional filtering (described in the DPR paper). </p> <ul> <li>See also: <a class=”ds-ref”>natural-questions</a></li> </ul>
-
Dataset irds.dpr-w100.natural-questions.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Dev subset from the Natural Questions Q&A collection. This differs from the <a class=”ds-ref”>natural-questions/dev</a> dataset in that it uses the full Wikipedia dump and applies additional filtering (described in the DPR paper). </p> <ul> <li>See also: <a class=”ds-ref”>natural-questions</a></li> </ul>
-
Dataset irds.dpr-w100.natural-questions.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training subset from the Natural Questions Q&A collection. This differs from the <a class=”ds-ref”>natural-questions/train</a> dataset in that it uses the full Wikipedia dump and applies additional filtering (described in the DPR paper). </p> <ul> <li>See also: <a class=”ds-ref”>natural-questions</a></li> </ul>
-
Dataset irds.dpr-w100.natural-questions.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training subset from the Natural Questions Q&A collection. This differs from the <a class=”ds-ref”>natural-questions/train</a> dataset in that it uses the full Wikipedia dump and applies additional filtering (described in the DPR paper). </p> <ul> <li>See also: <a class=”ds-ref”>natural-questions</a></li> </ul>
-
Dataset irds.dpr-w100.natural-questions.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Training subset from the Natural Questions Q&A collection. This differs from the <a class=”ds-ref”>natural-questions/train</a> dataset in that it uses the full Wikipedia dump and applies additional filtering (described in the DPR paper). </p> <ul> <li>See also: <a class=”ds-ref”>natural-questions</a></li> </ul>
-
Dataset irds.dpr-w100.trivia-qa.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Dev subset from the Trivia QA dataset. Differing from the official Trivia QA collection, this uses the DPR Wikipedia dump as the source collection. Refer to the DPR paper for more details. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P17-1147.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cs.washington.edu/triviaqa/”>Dataset website</a></li> </ul>
-
Dataset irds.dpr-w100.trivia-qa.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Dev subset from the Trivia QA dataset. Differing from the official Trivia QA collection, this uses the DPR Wikipedia dump as the source collection. Refer to the DPR paper for more details. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P17-1147.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cs.washington.edu/triviaqa/”>Dataset website</a></li> </ul>
-
Dataset irds.dpr-w100.trivia-qa.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Dev subset from the Trivia QA dataset. Differing from the official Trivia QA collection, this uses the DPR Wikipedia dump as the source collection. Refer to the DPR paper for more details. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P17-1147.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cs.washington.edu/triviaqa/”>Dataset website</a></li> </ul>
-
Dataset irds.dpr-w100.trivia-qa.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training subset from the Trivia QA dataset. Differing from the official Trivia QA collection, this uses the DPR Wikipedia dump as the source collection. Refer to the DPR paper for more details. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P17-1147.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cs.washington.edu/triviaqa/”>Dataset website</a></li> </ul>
-
Dataset irds.dpr-w100.trivia-qa.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training subset from the Trivia QA dataset. Differing from the official Trivia QA collection, this uses the DPR Wikipedia dump as the source collection. Refer to the DPR paper for more details. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P17-1147.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cs.washington.edu/triviaqa/”>Dataset website</a></li> </ul>
-
Dataset irds.dpr-w100.trivia-qa.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Training subset from the Trivia QA dataset. Differing from the official Trivia QA collection, this uses the DPR Wikipedia dump as the source collection. Refer to the DPR paper for more details. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P17-1147.pdf”>Dataset paper</a></li> <li><a href=”http://nlp.cs.washington.edu/triviaqa/”>Dataset website</a></li> </ul>
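The DPR Wiki100 description above mentions splitting the Wikipedia dump into passages of 100 words. A minimal sketch of that kind of chunking follows. It assumes whitespace tokenization and non-overlapping windows; split_into_passages is a hypothetical helper and is not the preprocessing script used to build the collection.

```python
from typing import List


def split_into_passages(text: str, words_per_passage: int = 100) -> List[str]:
    """Split text into consecutive, non-overlapping passages.

    Assumptions: words are whitespace-separated tokens, and the final
    passage may be shorter than words_per_passage.
    """
    if words_per_passage <= 0:
        raise ValueError("words_per_passage must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + words_per_passage])
        for start in range(0, len(words), words_per_passage)
    ]


# A synthetic 250-word "article" used purely for illustration
article = " ".join(f"w{i}" for i in range(250))
passages = split_into_passages(article)
print([len(p.split()) for p in passages])
```

Fixed-size windows keep passage lengths comparable across articles, at the cost of occasionally cutting a sentence in two.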
CodeSearchNet
<p> A benchmark for semantic code search. </p> <ul> <li>Documents: Code functions in Python, Java, Go, PHP, Ruby, and JavaScript</li> <li>Queries: Inferred from docstrings, or keyword queries (challenge set)</li> <li><a href=”https://arxiv.org/pdf/1909.09436.pdf”>Dataset Paper</a></li> <li><a href=”https://wandb.ai/github/codesearchnet/benchmark/leaderboard”>Challenge Task Leaderboard</a></li> </ul>
-
Dataset irds.codesearchnet.documents
datamaestro_text.datasets.irds.data.Documents
<p> A benchmark for semantic code search. </p> <ul> <li>Documents: Code functions in Python, Java, Go, PHP, Ruby, and JavaScript</li> <li>Queries: Inferred from docstrings, or keyword queries (challenge set)</li> <li><a href=”https://arxiv.org/pdf/1909.09436.pdf”>Dataset Paper</a></li> <li><a href=”https://wandb.ai/github/codesearchnet/benchmark/leaderboard”>Challenge Task Leaderboard</a></li> </ul>
-
Dataset irds.codesearchnet.challenge.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official challenge set, with keyword queries and deep relevance assessments. </p>
-
Dataset irds.codesearchnet.challenge.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official challenge set, with keyword queries and deep relevance assessments. </p>
-
Dataset irds.codesearchnet.challenge
datamaestro_text.datasets.irds.data.Adhoc
<p> Official challenge set, with keyword queries and deep relevance assessments. </p>
-
Dataset irds.codesearchnet.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.valid.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official validation set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.valid.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official validation set, using queries inferred from docstrings. </p>
-
Dataset irds.codesearchnet.valid
datamaestro_text.datasets.irds.data.Adhoc
<p> Official validation set, using queries inferred from docstrings. </p>
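The train, valid, and test sets above use queries inferred from docstrings. The sketch below illustrates the general idea for Python source only. It assumes the query is the first paragraph of a function's docstring; docstring_queries is a hypothetical helper, and the actual CodeSearchNet preprocessing may differ.

```python
import ast
from typing import List, Tuple


def docstring_queries(source: str) -> List[Tuple[str, str]]:
    """Return (query, function name) pairs for documented functions.

    Assumption: the query is the first paragraph of the docstring; the
    actual CodeSearchNet preprocessing may differ.
    """
    pairs = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:
                first_paragraph = doc.split("\n\n")[0]
                # Collapse internal whitespace so the query is a single line
                pairs.append((" ".join(first_paragraph.split()), node.name))
    return pairs


example = '''
def add(a, b):
    """Add two numbers.

    Extra details that are not part of the query.
    """
    return a + b


def undocumented():
    pass
'''
print(docstring_queries(example))
```

Functions without a docstring yield no query and are skipped.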
GOV
<p> GOV web document collection. Used for early TREC Web Tracks. Not to be confused with <a class=”ds-ref”>gov2</a>. </p> <p> The dataset is obtained for a fee from the University of Glasgow (UoG) and is shipped on a hard drive. More information is provided <a href=”http://ir.dcs.gla.ac.uk/test_collections/access_to_data.html”>here</a>. </p> <ul> <li><a href=”http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm”>Document collection site</a></li> </ul>
-
Dataset irds.gov.documents
datamaestro_text.datasets.irds.data.Documents
<p> GOV web document collection. Used for early TREC Web Tracks. Not to be confused with <a class=”ds-ref”>gov2</a>. </p> <p> The dataset is obtained for a fee from the University of Glasgow (UoG) and is shipped on a hard drive. More information is provided <a href=”http://ir.dcs.gla.ac.uk/test_collections/access_to_data.html”>here</a>. </p> <ul> <li><a href=”http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm”>Document collection site</a></li> </ul>
-
Dataset irds.gov.trec-web-2002.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2002 ad-hoc ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t11.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/WEB.OVER.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2002.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2002 ad-hoc ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t11.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/WEB.OVER.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2002
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2002 ad-hoc ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t11.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/WEB.OVER.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2002.named-page.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2002 named page ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t11.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/WEB.OVER.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2002.named-page.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2002 named page ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t11.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/WEB.OVER.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2002.named-page
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2002 named page ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t11.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/WEB.OVER.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2003.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2003 ad-hoc ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t12.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2003.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2003 ad-hoc ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t12.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2003
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2003 ad-hoc ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t12.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2003.named-page.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2003 named page ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t12.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2003.named-page.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2003 named page ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t12.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2003.named-page
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2003 named page ranking benchmark. </p> <ul> <li><a href=”https://trec.nist.gov/data/t12.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2004.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Web Track 2004 ad-hoc ranking benchmark. </p> <p> Queries include a combination of topic distillation, homepage finding, and named page finding. </p> <ul> <li><a href=”https://trec.nist.gov/data/t13.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2004.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Web Track 2004 ad-hoc ranking benchmark. </p> <p> Queries include a combination of topic distillation, homepage finding, and named page finding. </p> <ul> <li><a href=”https://trec.nist.gov/data/t13.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov.trec-web-2004
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Web Track 2004 ad-hoc ranking benchmark. </p> <p> Queries include a combination of topic distillation, homepage finding, and named page finding. </p> <ul> <li><a href=”https://trec.nist.gov/data/t13.web.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec12/papers/WEB.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
GOV2
<p> GOV2 web document collection. Used for the TREC Terabyte Track. </p> <p> The dataset is obtained for a fee from the University of Glasgow (UoG) and is shipped on a hard drive. More information is provided <a href=”http://ir.dcs.gla.ac.uk/test_collections/access_to_data.html”>here</a>. </p> <ul> <li><a href=”http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm”>Document collection site</a></li> </ul>
-
Dataset irds.gov2.documents
datamaestro_text.datasets.irds.data.Documents
<p> GOV2 web document collection. Used for the TREC Terabyte Track. </p> <p> The dataset is obtained for a fee from the University of Glasgow (UoG) and is shipped on a hard drive. More information is provided <a href=”http://ir.dcs.gla.ac.uk/test_collections/access_to_data.html”>here</a>. </p> <ul> <li><a href=”http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm”>Document collection site</a></li> </ul>
-
Dataset irds.gov2.trec-mq-2007.queries
datamaestro_text.datasets.irds.data.Topics
<p> TREC 2007 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query07.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec16/papers/1MQ.OVERVIEW16.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-mq-2007.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> TREC 2007 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query07.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec16/papers/1MQ.OVERVIEW16.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-mq-2007
datamaestro_text.datasets.irds.data.Adhoc
<p> TREC 2007 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query07.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec16/papers/1MQ.OVERVIEW16.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-mq-2008.queries
datamaestro_text.datasets.irds.data.Topics
<p> TREC 2008 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query08.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec17/papers/MQ.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-mq-2008.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> TREC 2008 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query08.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec17/papers/MQ.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-mq-2008
datamaestro_text.datasets.irds.data.Adhoc
<p> TREC 2008 Million Query track. </p> <ul> <li><a href=”https://trec.nist.gov/data/million.query08.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec17/papers/MQ.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2004.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Terabyte Track 2004 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte04.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/TERA.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2004.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Terabyte Track 2004 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte04.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/TERA.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2004
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Terabyte Track 2004 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte04.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec13/papers/TERA.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Terabyte Track 2005 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Terabyte Track 2005 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Terabyte Track 2005 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.efficiency.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Terabyte Track 2005 efficiency ranking benchmark. Contains 50,000 queries from a search engine, including the 50 topics from <a class=”ds-ref”>gov2/trec-tb-2005</a>. Only the 50 topics have judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.efficiency.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Terabyte Track 2005 efficiency ranking benchmark. Contains 50,000 queries from a search engine, including the 50 topics from <a class=”ds-ref”>gov2/trec-tb-2005</a>. Only the 50 topics have judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.efficiency
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Terabyte Track 2005 efficiency ranking benchmark. Contains 50,000 queries from a search engine, including the 50 topics from <a class=”ds-ref”>gov2/trec-tb-2005</a>. Only the 50 topics have judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.named-page.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Terabyte Track 2005 named page ranking benchmark. Contains 252 queries with titles that resemble bookmark labels. Relevance judgments include near-duplicate pages and other pages that may satisfy the bookmark label. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.named-page.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Terabyte Track 2005 named page ranking benchmark. Contains 252 queries with titles that resemble bookmark labels. Relevance judgments include near-duplicate pages and other pages that may satisfy the bookmark label. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2005.named-page
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Terabyte Track 2005 named page ranking benchmark. Contains 252 queries with titles that resemble bookmark labels. Relevance judgments include near-duplicate pages and other pages that may satisfy the bookmark label. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte05.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec14/papers/TERABYTE.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Terabyte Track 2006 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Terabyte Track 2006 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Terabyte Track 2006 ad-hoc ranking benchmark. Contains 50 queries with deep relevance judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.efficiency.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Terabyte Track 2006 efficiency ranking benchmark. Contains 100,000 queries from a search engine, including the 50 topics from <a class=”ds-ref”>gov2/trec-tb-2006</a>. Only the 50 topics have judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.efficiency.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Terabyte Track 2006 efficiency ranking benchmark. Contains 100,000 queries from a search engine, including the 50 topics from <a class=”ds-ref”>gov2/trec-tb-2006</a>. Only the 50 topics have judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.efficiency
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Terabyte Track 2006 efficiency ranking benchmark. Contains 100,000 queries from a search engine, including the 50 topics from <a class=”ds-ref”>gov2/trec-tb-2006</a>. Only the 50 topics have judgments. </p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.efficiency.10k.queries
datamaestro_text.datasets.irds.data.Topics
<p> Small stream from <a class=”ds-ref”>gov2/trec-tb-2006/efficiency</a>, with 10,000 queries. </p>
-
Dataset irds.gov2.trec-tb-2006.efficiency.stream1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Stream 1 of <a class=”ds-ref”>gov2/trec-tb-2006/efficiency</a> (25,000 queries). </p>
-
Dataset irds.gov2.trec-tb-2006.efficiency.stream2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Stream 2 of <a class=”ds-ref”>gov2/trec-tb-2006/efficiency</a> (25,000 queries). </p>
-
Dataset irds.gov2.trec-tb-2006.efficiency.stream3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Stream 3 of <a class=”ds-ref”>gov2/trec-tb-2006/efficiency</a> (25,000 queries). </p>
-
Dataset irds.gov2.trec-tb-2006.efficiency.stream3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Stream 3 of <a class=”ds-ref”>gov2/trec-tb-2006/efficiency</a> (25,000 queries). </p>
-
Dataset irds.gov2.trec-tb-2006.efficiency.stream3
datamaestro_text.datasets.irds.data.Adhoc
<p> Stream 3 of <a class=”ds-ref”>gov2/trec-tb-2006/efficiency</a> (25,000 queries). </p>
-
Dataset irds.gov2.trec-tb-2006.efficiency.stream4.queries
datamaestro_text.datasets.irds.data.Topics
<p> Stream 4 of <a class=”ds-ref”>gov2/trec-tb-2006/efficiency</a> (25,000 queries). </p>
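The descriptions above cross-reference datasets by their slash-separated ir_datasets ID (for example gov2/trec-tb-2006/efficiency), while the entries in this list use dot-separated names under the irds namespace. The sketch below shows that mapping as it appears in this list. The helper name to_datamaestro_id is hypothetical and is not part of datamaestro-text; the rule is inferred from the entries on this page and may not hold for every collection.

```python
from typing import Optional


def to_datamaestro_id(irds_id: str, part: Optional[str] = None) -> str:
    """Turn an ir_datasets ID into the dotted name used in this list.

    Hypothetical helper: the "irds." prefix and the "/" -> "." rule are
    inferred from the entries on this page, not taken from the library.
    `part` is an optional suffix such as "queries", "qrels" or "documents".
    """
    name = "irds." + irds_id.strip("/").replace("/", ".")
    return f"{name}.{part}" if part else name


# The full benchmark and its query-only view for the 2006 efficiency task.
benchmark_id = to_datamaestro_id("gov2/trec-tb-2006/efficiency")
queries_id = to_datamaestro_id("gov2/trec-tb-2006/efficiency", "queries")
```

The resulting string can then be passed to prepare_dataset, as in the usage example at the top of this page.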
-
Dataset irds.gov2.trec-tb-2006.named-page.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Terabyte Track 2006 named page ranking benchmark. Contains 181 queries with titles that resemble bookmark labels. Relevance judgments include near-duplicate pages and other pages that may satisfy the bookmark label.</p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.named-page.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Terabyte Track 2006 named page ranking benchmark. Contains 181 queries with titles that resemble bookmark labels. Relevance judgments include near-duplicate pages and other pages that may satisfy the bookmark label.</p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.gov2.trec-tb-2006.named-page
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Terabyte Track 2006 named page ranking benchmark. Contains 181 queries with titles that resemble bookmark labels. Relevance judgments include near-duplicate pages and other pages that may satisfy the bookmark label.</p> <ul> <li><a href=”https://trec.nist.gov/data/terabyte06.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec15/papers/TERA06.OVERVIEW.pdf”>Shared task paper</a></li> </ul>
Istella22
<p> The Istella22 dataset facilitates comparisons between traditional and neural learning-to-rank by including query and document text along with LTR features (not included in ir_datasets). </p> <p> Note that to use the dataset, you must <b>read and accept</b> the <a href=”https://www.istella.ai/dataset/Istella22-LicenseAgreement.txt”>Istella22 License Agreement</a>. By using the dataset, you agree to be bound by the terms of the license: the <b>Istella dataset is solely for non-commercial use</b>. </p> <ul> <li><a href=”https://dl.acm.org/doi/abs/10.1145/3477495.3531740”>Paper</a></li> <li><a href=”https://istella.ai/data/istella22-dataset/”>Website</a></li> </ul>
-
Dataset irds.istella22.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Istella22 dataset facilitates comparisons between traditional and neural learning-to-rank by including query and document text along with LTR features (not included in ir_datasets). </p> <p> Note that to use the dataset, you must <b>read and accept</b> the <a href=”https://www.istella.ai/dataset/Istella22-LicenseAgreement.txt”>Istella22 License Agreement</a>. By using the dataset, you agree to be bound by the terms of the license: the <b>Istella dataset is solely for non-commercial use</b>. </p> <ul> <li><a href=”https://dl.acm.org/doi/abs/10.1145/3477495.3531740”>Paper</a></li> <li><a href=”https://istella.ai/data/istella22-dataset/”>Website</a></li> </ul>
-
Dataset irds.istella22.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test query set. </p>
-
Dataset irds.istella22.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test query set. </p>
-
Dataset irds.istella22.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold1
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold2
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold3
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold4.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold4.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold4
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold5.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold5.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test query set. </p>
-
Dataset irds.istella22.test.fold5
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test query set. </p>
KILT
<p> KILT is a corpus used for various “knowledge intensive language tasks”. </p> <ul> <li>Documents: Wikipedia articles</li> <li><a href=”https://github.com/facebookresearch/KILT”>Repository</a></li> <li><a href=”https://arxiv.org/abs/2009.02252”>Paper</a></li> <li><a href=”https://ai.facebook.com/tools/kilt/”>Leaderboard</a></li> </ul>
-
Dataset irds.kilt.documents
datamaestro_text.datasets.irds.data.Documents
<p> KILT is a corpus used for various “knowledge intensive language tasks”. </p> <ul> <li>Documents: Wikipedia articles</li> <li><a href=”https://github.com/facebookresearch/KILT”>Repository</a></li> <li><a href=”https://arxiv.org/abs/2009.02252”>Paper</a></li> <li><a href=”https://ai.facebook.com/tools/kilt/”>Leaderboard</a></li> </ul>
-
Dataset irds.kilt.codec.queries
datamaestro_text.datasets.irds.data.Topics
<p> CODEC Entity Ranking sub-task. </p> <ul> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>codec</a>, the document ranking subtask</li> </ul>
-
Dataset irds.kilt.codec.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> CODEC Entity Ranking sub-task. </p> <ul> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>codec</a>, the document ranking subtask</li> </ul>
-
Dataset irds.kilt.codec
datamaestro_text.datasets.irds.data.Adhoc
<p> CODEC Entity Ranking sub-task. </p> <ul> <li><a href=”https://github.com/grill-lab/CODEC”>Task Repository</a></li> <li>See also: <a class=”ds-ref”>codec</a>, the document ranking subtask</li> </ul>
-
Dataset irds.kilt.codec.economics.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about economics. </p>
-
Dataset irds.kilt.codec.economics.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about economics. </p>
-
Dataset irds.kilt.codec.economics
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about economics. </p>
-
Dataset irds.kilt.codec.history.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about history. </p>
-
Dataset irds.kilt.codec.history.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about history. </p>
-
Dataset irds.kilt.codec.history
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about history. </p>
-
Dataset irds.kilt.codec.politics.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about politics. </p>
-
Dataset irds.kilt.codec.politics.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about politics. </p>
-
Dataset irds.kilt.codec.politics
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>codec</a> that only contains topics about politics. </p>
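In the entries above, the last component of a dataset name indicates which wrapper type it resolves to: names ending in documents, queries and qrels are listed with Documents, Topics and AdhocAssessments respectively, and names without such a suffix are listed with Adhoc. A minimal sketch of that lookup follows. The function wrapper_type is hypothetical, and the table only covers the suffixes that appear in this part of the list; other collections may use further types.

```python
# Module path of the wrapper types, as printed under each entry in this list.
_BASE = "datamaestro_text.datasets.irds.data"

# Suffix -> wrapper type, as observed in the entries on this page only.
_SUFFIX_TO_TYPE = {
    "documents": "Documents",
    "queries": "Topics",
    "qrels": "AdhocAssessments",
}


def wrapper_type(dataset_id: str) -> str:
    """Return the wrapper type listed for a dotted dataset name.

    Hypothetical helper: names without a known suffix are assumed to be
    full benchmarks, which this list shows as Adhoc.
    """
    last = dataset_id.rsplit(".", 1)[-1]
    return f"{_BASE}.{_SUFFIX_TO_TYPE.get(last, 'Adhoc')}"


for name in ("irds.kilt.documents", "irds.kilt.codec.queries", "irds.kilt.codec"):
    print(name, "->", wrapper_type(name))
```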
lotte/lifestyle/dev
<p> Answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel. </p>
-
Dataset irds.lotte.lifestyle.dev.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel. </p>
-
Dataset irds.lotte.lifestyle.dev.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/lifestyle/dev</a>. </p>
-
Dataset irds.lotte.lifestyle.dev.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/lifestyle/dev</a>. </p>
-
Dataset irds.lotte.lifestyle.dev.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/lifestyle/dev</a>. </p>
-
Dataset irds.lotte.lifestyle.dev.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/lifestyle/dev</a>. </p>
-
Dataset irds.lotte.lifestyle.dev.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/lifestyle/dev</a>. </p>
-
Dataset irds.lotte.lifestyle.dev.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/lifestyle/dev</a>. </p>
lotte/lifestyle/test
<p> Queries and answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel. </p>
-
Dataset irds.lotte.lifestyle.test.documents
datamaestro_text.datasets.irds.data.Documents
<p> Queries and answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel. </p>
-
Dataset irds.lotte.lifestyle.test.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/lifestyle/test</a>. </p>
-
Dataset irds.lotte.lifestyle.test.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/lifestyle/test</a>. </p>
-
Dataset irds.lotte.lifestyle.test.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/lifestyle/test</a>. </p>
-
Dataset irds.lotte.lifestyle.test.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/lifestyle/test</a>. </p>
-
Dataset irds.lotte.lifestyle.test.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/lifestyle/test</a>. </p>
-
Dataset irds.lotte.lifestyle.test.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/lifestyle/test</a>. </p>
lotte/pooled/dev
<p> Combined version of <a class=”ds-ref”>lotte/lifestyle/dev</a>, <a class=”ds-ref”>lotte/recreation/dev</a>, <a class=”ds-ref”>lotte/science/dev</a>, <a class=”ds-ref”>lotte/technology/dev</a>, and <a class=”ds-ref”>lotte/writing/dev</a>. </p>
-
Dataset irds.lotte.pooled.dev.documents
datamaestro_text.datasets.irds.data.Documents
<p> Combined version of <a class=”ds-ref”>lotte/lifestyle/dev</a>, <a class=”ds-ref”>lotte/recreation/dev</a>, <a class=”ds-ref”>lotte/science/dev</a>, <a class=”ds-ref”>lotte/technology/dev</a>, and <a class=”ds-ref”>lotte/writing/dev</a>. </p>
-
Dataset irds.lotte.pooled.dev.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/pooled/dev</a>. </p>
-
Dataset irds.lotte.pooled.dev.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/pooled/dev</a>. </p>
-
Dataset irds.lotte.pooled.dev.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/pooled/dev</a>. </p>
-
Dataset irds.lotte.pooled.dev.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/pooled/dev</a>. </p>
-
Dataset irds.lotte.pooled.dev.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/pooled/dev</a>. </p>
-
Dataset irds.lotte.pooled.dev.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/pooled/dev</a>. </p>
lotte/pooled/test
<p> Combined version of <a class=”ds-ref”>lotte/lifestyle/test</a>, <a class=”ds-ref”>lotte/recreation/test</a>, <a class=”ds-ref”>lotte/science/test</a>, <a class=”ds-ref”>lotte/technology/test</a>, and <a class=”ds-ref”>lotte/writing/test</a>. </p>
-
Dataset irds.lotte.pooled.test.documents
datamaestro_text.datasets.irds.data.Documents
<p> Combined version of <a class=”ds-ref”>lotte/lifestyle/test</a>, <a class=”ds-ref”>lotte/recreation/test</a>, <a class=”ds-ref”>lotte/science/test</a>, <a class=”ds-ref”>lotte/technology/test</a>, and <a class=”ds-ref”>lotte/writing/test</a>. </p>
-
Dataset irds.lotte.pooled.test.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/pooled/test</a>. </p>
-
Dataset irds.lotte.pooled.test.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/pooled/test</a>. </p>
-
Dataset irds.lotte.pooled.test.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/pooled/test</a>. </p>
-
Dataset irds.lotte.pooled.test.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/pooled/test</a>. </p>
-
Dataset irds.lotte.pooled.test.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/pooled/test</a>. </p>
-
Dataset irds.lotte.pooled.test.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/pooled/test</a>. </p>
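The LoTTE entries follow a regular naming scheme: a domain, a split (dev or test), and then either the document collection or one of two query types (forum, search) with its queries, qrels and combined benchmark. The sketch below enumerates the names implied by that scheme. It assumes that every domain and split combination follows the pattern shown in this list; lotte_ids is a hypothetical helper, not part of datamaestro-text.

```python
from itertools import product

# Domains, splits and query types as they appear in the entries of this list.
DOMAINS = ["lifestyle", "pooled", "recreation", "science", "technology", "writing"]
SPLITS = ["dev", "test"]
QUERY_TYPES = ["forum", "search"]


def lotte_ids():
    """List the dotted LoTTE dataset names implied by the naming scheme."""
    ids = []
    for domain, split in product(DOMAINS, SPLITS):
        base = f"irds.lotte.{domain}.{split}"
        ids.append(f"{base}.documents")
        for qtype in QUERY_TYPES:
            ids.append(f"{base}.{qtype}.queries")
            ids.append(f"{base}.{qtype}.qrels")
            ids.append(f"{base}.{qtype}")  # combined benchmark (Adhoc)
    return ids


all_ids = lotte_ids()
# Seven names per domain/split pair: one collection plus three per query type.
print(len(all_ids))
```

Filtering this list, for example by the suffix .search, gives the benchmarks to evaluate when comparing one system across all domains.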
lotte/recreation/dev
<p> Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi. </p>
-
Dataset irds.lotte.recreation.dev.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi. </p>
-
Dataset irds.lotte.recreation.dev.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/recreation/dev</a>. </p>
-
Dataset irds.lotte.recreation.dev.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/recreation/dev</a>. </p>
-
Dataset irds.lotte.recreation.dev.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/recreation/dev</a>. </p>
-
Dataset irds.lotte.recreation.dev.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/recreation/dev</a>. </p>
-
Dataset irds.lotte.recreation.dev.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/recreation/dev</a>. </p>
-
Dataset irds.lotte.recreation.dev.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/recreation/dev</a>. </p>
lotte/recreation/test
<p> Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi. </p>
-
Dataset irds.lotte.recreation.test.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi. </p>
-
Dataset irds.lotte.recreation.test.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/recreation/test</a>. </p>
-
Dataset irds.lotte.recreation.test.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/recreation/test</a>. </p>
-
Dataset irds.lotte.recreation.test.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/recreation/test</a>. </p>
-
Dataset irds.lotte.recreation.test.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/recreation/test</a>. </p>
-
Dataset irds.lotte.recreation.test.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/recreation/test</a>. </p>
-
Dataset irds.lotte.recreation.test.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/recreation/test</a>. </p>
lotte/science/dev
<p> Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats. </p>
-
Dataset irds.lotte.science.dev.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats. </p>
-
Dataset irds.lotte.science.dev.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/science/dev</a>. </p>
-
Dataset irds.lotte.science.dev.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/science/dev</a>. </p>
-
Dataset irds.lotte.science.dev.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/science/dev</a>. </p>
-
Dataset irds.lotte.science.dev.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/science/dev</a>. </p>
-
Dataset irds.lotte.science.dev.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/science/dev</a>. </p>
-
Dataset irds.lotte.science.dev.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/science/dev</a>. </p>
lotte/science/test
<p> Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats. </p>
-
Dataset irds.lotte.science.test.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats. </p>
-
Dataset irds.lotte.science.test.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/science/test</a>. </p>
-
Dataset irds.lotte.science.test.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/science/test</a>. </p>
-
Dataset irds.lotte.science.test.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/science/test</a>. </p>
-
Dataset irds.lotte.science.test.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/science/test</a>. </p>
-
Dataset irds.lotte.science.test.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/science/test</a>. </p>
-
Dataset irds.lotte.science.test.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/science/test</a>. </p>
lotte/technology/dev
<p> Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps. </p>
-
Dataset irds.lotte.technology.dev.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps. </p>
-
Dataset irds.lotte.technology.dev.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/technology/dev</a>. </p>
-
Dataset irds.lotte.technology.dev.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/technology/dev</a>. </p>
-
Dataset irds.lotte.technology.dev.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/technology/dev</a>. </p>
-
Dataset irds.lotte.technology.dev.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/technology/dev</a>. </p>
-
Dataset irds.lotte.technology.dev.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/technology/dev</a>. </p>
-
Dataset irds.lotte.technology.dev.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/technology/dev</a>. </p>
lotte/technology/test
<p> Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps. </p>
-
Dataset irds.lotte.technology.test.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps. </p>
-
Dataset irds.lotte.technology.test.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/technology/test</a>. </p>
-
Dataset irds.lotte.technology.test.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/technology/test</a>. </p>
-
Dataset irds.lotte.technology.test.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/technology/test</a>. </p>
-
Dataset irds.lotte.technology.test.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/technology/test</a>. </p>
-
Dataset irds.lotte.technology.test.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/technology/test</a>. </p>
-
Dataset irds.lotte.technology.test.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/technology/test</a>. </p>
lotte/writing/dev
<p> Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing. </p>
-
Dataset irds.lotte.writing.dev.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing. </p>
-
Dataset irds.lotte.writing.dev.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/writing/dev</a>. </p>
-
Dataset irds.lotte.writing.dev.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/writing/dev</a>. </p>
-
Dataset irds.lotte.writing.dev.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/writing/dev</a>. </p>
-
Dataset irds.lotte.writing.dev.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/writing/dev</a>. </p>
-
Dataset irds.lotte.writing.dev.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/writing/dev</a>. </p>
-
Dataset irds.lotte.writing.dev.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/writing/dev</a>. </p>
lotte/writing/test
<p> Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing. </p>
-
Dataset irds.lotte.writing.test.documents
datamaestro_text.datasets.irds.data.Documents
<p> Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing. </p>
-
Dataset irds.lotte.writing.test.forum.queries
datamaestro_text.datasets.irds.data.Topics
<p> Forum queries for <a class=”ds-ref”>lotte/writing/test</a>. </p>
-
Dataset irds.lotte.writing.test.forum.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Forum queries for <a class=”ds-ref”>lotte/writing/test</a>. </p>
-
Dataset irds.lotte.writing.test.forum
datamaestro_text.datasets.irds.data.Adhoc
<p> Forum queries for <a class=”ds-ref”>lotte/writing/test</a>. </p>
-
Dataset irds.lotte.writing.test.search.queries
datamaestro_text.datasets.irds.data.Topics
<p> Search queries for <a class=”ds-ref”>lotte/writing/test</a>. </p>
-
Dataset irds.lotte.writing.test.search.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Search queries for <a class=”ds-ref”>lotte/writing/test</a>. </p>
-
Dataset irds.lotte.writing.test.search
datamaestro_text.datasets.irds.data.Adhoc
<p> Search queries for <a class=”ds-ref”>lotte/writing/test</a>. </p>
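The identifiers in this listing appear to follow one rule: the ir-datasets path shown in the cross-references (for example lotte/writing/test) is prefixed with irds., its slashes become dots, and a component suffix such as documents, queries, or qrels may follow. The sketch below encodes that rule as inferred from the entries above, not from the datamaestro-text source. The helper name irds_to_datamaestro_id is hypothetical.

```python
from typing import Optional


def irds_to_datamaestro_id(irds_path: str, component: Optional[str] = None) -> str:
    """Map an ir-datasets path to the dotted identifier used in this listing.

    Assumption: the mapping is a plain "/" -> "." substitution with an
    "irds." prefix, as suggested by the entries in this listing.
    """
    parts = [p for p in irds_path.strip("/").split("/") if p]
    if not parts:
        raise ValueError("empty ir-datasets path")
    dataset_id = "irds." + ".".join(parts)
    # Optional component suffix: "documents", "queries", "qrels", ...
    return f"{dataset_id}.{component}" if component else dataset_id


print(irds_to_datamaestro_id("lotte/writing/test/search"))
print(irds_to_datamaestro_id("lotte/writing/test/search", "qrels"))
print(irds_to_datamaestro_id("lotte/writing/test", "documents"))
```

The resulting string can then be passed to prepare_dataset, as in the usage example at the top of this page.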
miracl/ar
<p> The Arabic corpus. </p>
-
Dataset irds.miracl.ar.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Arabic corpus. </p>
-
Dataset irds.miracl.ar.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Arabic. </p>
-
Dataset irds.miracl.ar.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Arabic. </p>
-
Dataset irds.miracl.ar.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Arabic. </p>
-
Dataset irds.miracl.ar.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Arabic. </p>
-
Dataset irds.miracl.ar.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Arabic. </p>
-
Dataset irds.miracl.ar.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Arabic. </p>
-
Dataset irds.miracl.ar.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Arabic. </p>
-
Dataset irds.miracl.ar.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Arabic. </p>
miracl/bn
<p> The Bengali corpus. </p>
-
Dataset irds.miracl.bn.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Bengali corpus. </p>
-
Dataset irds.miracl.bn.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Bengali. </p>
-
Dataset irds.miracl.bn.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Bengali. </p>
-
Dataset irds.miracl.bn.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Bengali. </p>
-
Dataset irds.miracl.bn.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Bengali. </p>
-
Dataset irds.miracl.bn.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Bengali. </p>
-
Dataset irds.miracl.bn.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Bengali. </p>
-
Dataset irds.miracl.bn.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Bengali. </p>
-
Dataset irds.miracl.bn.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Bengali. </p>
miracl/de
<p> The German corpus. </p>
-
Dataset irds.miracl.de.documents
datamaestro_text.datasets.irds.data.Documents
<p> The German corpus. </p>
-
Dataset irds.miracl.de.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for German. </p>
-
Dataset irds.miracl.de.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for German. </p>
-
Dataset irds.miracl.de.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for German. </p>
-
Dataset irds.miracl.de.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for German. </p>
miracl/en
<p> The English corpus. </p>
-
Dataset irds.miracl.en.documents
datamaestro_text.datasets.irds.data.Documents
<p> The English corpus. </p>
-
Dataset irds.miracl.en.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for English. </p>
-
Dataset irds.miracl.en.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for English. </p>
-
Dataset irds.miracl.en.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for English. </p>
-
Dataset irds.miracl.en.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for English. </p>
-
Dataset irds.miracl.en.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for English. </p>
-
Dataset irds.miracl.en.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for English. </p>
-
Dataset irds.miracl.en.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for English. </p>
-
Dataset irds.miracl.en.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for English. </p>
miracl/es
<p> The Spanish corpus. </p>
-
Dataset irds.miracl.es.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Spanish corpus. </p>
-
Dataset irds.miracl.es.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Spanish. </p>
-
Dataset irds.miracl.es.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Spanish. </p>
-
Dataset irds.miracl.es.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Spanish. </p>
-
Dataset irds.miracl.es.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Spanish. </p>
-
Dataset irds.miracl.es.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Spanish. </p>
-
Dataset irds.miracl.es.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Spanish. </p>
-
Dataset irds.miracl.es.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Spanish. </p>
miracl/fa
<p> The Persian corpus. </p>
-
Dataset irds.miracl.fa.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Persian corpus. </p>
-
Dataset irds.miracl.fa.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Persian. </p>
-
Dataset irds.miracl.fa.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Persian. </p>
-
Dataset irds.miracl.fa.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Persian. </p>
-
Dataset irds.miracl.fa.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Persian. </p>
-
Dataset irds.miracl.fa.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Persian. </p>
-
Dataset irds.miracl.fa.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Persian. </p>
-
Dataset irds.miracl.fa.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Persian. </p>
miracl/fi
<p> The Finnish corpus. </p>
-
Dataset irds.miracl.fi.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Finnish corpus. </p>
-
Dataset irds.miracl.fi.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Finnish. </p>
-
Dataset irds.miracl.fi.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Finnish. </p>
-
Dataset irds.miracl.fi.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Finnish. </p>
-
Dataset irds.miracl.fi.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Finnish. </p>
-
Dataset irds.miracl.fi.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Finnish. </p>
-
Dataset irds.miracl.fi.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Finnish. </p>
-
Dataset irds.miracl.fi.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Finnish. </p>
-
Dataset irds.miracl.fi.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Finnish. </p>
miracl/fr
<p> The French corpus. </p>
-
Dataset irds.miracl.fr.documents
datamaestro_text.datasets.irds.data.Documents
<p> The French corpus. </p>
-
Dataset irds.miracl.fr.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for French. </p>
-
Dataset irds.miracl.fr.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for French. </p>
-
Dataset irds.miracl.fr.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for French. </p>
-
Dataset irds.miracl.fr.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for French. </p>
-
Dataset irds.miracl.fr.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for French. </p>
-
Dataset irds.miracl.fr.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for French. </p>
-
Dataset irds.miracl.fr.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for French. </p>
miracl/hi
<p> The Hindi corpus. </p>
-
Dataset irds.miracl.hi.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Hindi corpus. </p>
-
Dataset irds.miracl.hi.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Hindi. </p>
-
Dataset irds.miracl.hi.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Hindi. </p>
-
Dataset irds.miracl.hi.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Hindi. </p>
-
Dataset irds.miracl.hi.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Hindi. </p>
-
Dataset irds.miracl.hi.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Hindi. </p>
-
Dataset irds.miracl.hi.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Hindi. </p>
-
Dataset irds.miracl.hi.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Hindi. </p>
miracl/id
<p> The Indonesian corpus. </p>
-
Dataset irds.miracl.id.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Indonesian corpus. </p>
-
Dataset irds.miracl.id.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Indonesian. </p>
-
Dataset irds.miracl.id.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Indonesian. </p>
-
Dataset irds.miracl.id.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Indonesian. </p>
-
Dataset irds.miracl.id.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Indonesian. </p>
-
Dataset irds.miracl.id.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Indonesian. </p>
-
Dataset irds.miracl.id.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Indonesian. </p>
-
Dataset irds.miracl.id.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Indonesian. </p>
-
Dataset irds.miracl.id.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Indonesian. </p>
miracl/ja
<p> The Japanese corpus. </p>
-
Dataset irds.miracl.ja.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Japanese corpus. </p>
-
Dataset irds.miracl.ja.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Japanese. </p>
-
Dataset irds.miracl.ja.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Japanese. </p>
-
Dataset irds.miracl.ja.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Japanese. </p>
-
Dataset irds.miracl.ja.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Japanese. </p>
-
Dataset irds.miracl.ja.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Japanese. </p>
-
Dataset irds.miracl.ja.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Japanese. </p>
-
Dataset irds.miracl.ja.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Japanese. </p>
-
Dataset irds.miracl.ja.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Japanese. </p>
miracl/ko
<p> The Korean corpus. </p>
-
Dataset irds.miracl.ko.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Korean corpus. </p>
-
Dataset irds.miracl.ko.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Korean. </p>
-
Dataset irds.miracl.ko.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Korean. </p>
-
Dataset irds.miracl.ko.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Korean. </p>
-
Dataset irds.miracl.ko.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Korean. </p>
-
Dataset irds.miracl.ko.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Korean. </p>
-
Dataset irds.miracl.ko.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Korean. </p>
-
Dataset irds.miracl.ko.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Korean. </p>
-
Dataset irds.miracl.ko.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Korean. </p>
miracl/ru
<p> The Russian corpus. </p>
-
Dataset irds.miracl.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Russian corpus. </p>
-
Dataset irds.miracl.ru.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Russian. </p>
-
Dataset irds.miracl.ru.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Russian. </p>
-
Dataset irds.miracl.ru.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Russian. </p>
-
Dataset irds.miracl.ru.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Russian. </p>
-
Dataset irds.miracl.ru.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Russian. </p>
-
Dataset irds.miracl.ru.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Russian. </p>
-
Dataset irds.miracl.ru.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Russian. </p>
-
Dataset irds.miracl.ru.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Russian. </p>
miracl/sw
<p> The Swahili corpus. </p>
-
Dataset irds.miracl.sw.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Swahili corpus. </p>
-
Dataset irds.miracl.sw.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Swahili. </p>
-
Dataset irds.miracl.sw.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Swahili. </p>
-
Dataset irds.miracl.sw.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Swahili. </p>
-
Dataset irds.miracl.sw.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Swahili. </p>
-
Dataset irds.miracl.sw.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Swahili. </p>
-
Dataset irds.miracl.sw.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Swahili. </p>
-
Dataset irds.miracl.sw.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Swahili. </p>
-
Dataset irds.miracl.sw.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Swahili. </p>
miracl/te
<p> The Telugu corpus. </p>
-
Dataset irds.miracl.te.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Telugu corpus. </p>
-
Dataset irds.miracl.te.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Telugu. </p>
-
Dataset irds.miracl.te.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Telugu. </p>
-
Dataset irds.miracl.te.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Telugu. </p>
-
Dataset irds.miracl.te.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Telugu. </p>
-
Dataset irds.miracl.te.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Telugu. </p>
-
Dataset irds.miracl.te.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Telugu. </p>
-
Dataset irds.miracl.te.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Telugu. </p>
-
Dataset irds.miracl.te.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Telugu. </p>
miracl/th
<p> The Thai corpus. </p>
-
Dataset irds.miracl.th.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Thai corpus. </p>
-
Dataset irds.miracl.th.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Thai. </p>
-
Dataset irds.miracl.th.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Thai. </p>
-
Dataset irds.miracl.th.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Thai. </p>
-
Dataset irds.miracl.th.test-a.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version a) for Thai. </p>
-
Dataset irds.miracl.th.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Thai. </p>
-
Dataset irds.miracl.th.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Thai. </p>
-
Dataset irds.miracl.th.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Thai. </p>
-
Dataset irds.miracl.th.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Thai. </p>
miracl/yo
<p> The Yoruba corpus. </p>
-
Dataset irds.miracl.yo.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Yoruba corpus. </p>
-
Dataset irds.miracl.yo.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Yoruba. </p>
-
Dataset irds.miracl.yo.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Yoruba. </p>
-
Dataset irds.miracl.yo.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Yoruba. </p>
-
Dataset irds.miracl.yo.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Yoruba. </p>
miracl/zh
<p> The Chinese corpus. </p>
-
Dataset irds.miracl.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Chinese corpus. </p>
-
Dataset irds.miracl.zh.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> The dev set for Chinese. </p>
-
Dataset irds.miracl.zh.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The dev set for Chinese. </p>
-
Dataset irds.miracl.zh.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> The dev set for Chinese. </p>
-
Dataset irds.miracl.zh.test-b.queries
datamaestro_text.datasets.irds.data.Topics
<p> The held-out test set (version b) for Chinese. </p>
-
Dataset irds.miracl.zh.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> The train set for Chinese. </p>
-
Dataset irds.miracl.zh.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The train set for Chinese. </p>
-
Dataset irds.miracl.zh.train
datamaestro_text.datasets.irds.data.Adhoc
<p> The train set for Chinese. </p>
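The MIRACL entries above are not uniform: some languages have no test-a split, two have no train split, and the test splits expose queries only. The sketch below enumerates the identifiers for one language, using availability tables transcribed from this listing. They may differ for other ir-datasets versions, and the helper name miracl_ids is hypothetical.

```python
# Availability transcribed from the listing above (an assumption about
# this particular ir-datasets version, not a guarantee).
LANGUAGES = ["ar", "bn", "de", "en", "es", "fa", "fi", "fr", "hi",
             "id", "ja", "ko", "ru", "sw", "te", "th", "yo", "zh"]
NO_TEST_A = {"de", "es", "fa", "fr", "hi", "yo", "zh"}
NO_TRAIN = {"de", "yo"}


def miracl_ids(lang: str) -> list:
    """Return the dataset identifiers listed above for one MIRACL language."""
    if lang not in LANGUAGES:
        raise ValueError(f"unknown MIRACL language code: {lang}")
    base = f"irds.miracl.{lang}"
    ids = [f"{base}.documents"]
    # dev has queries, qrels, and the combined Adhoc dataset
    ids += [f"{base}.dev.queries", f"{base}.dev.qrels", f"{base}.dev"]
    # held-out test splits expose queries only (no qrels)
    if lang not in NO_TEST_A:
        ids.append(f"{base}.test-a.queries")
    ids.append(f"{base}.test-b.queries")
    if lang not in NO_TRAIN:
        ids += [f"{base}.train.queries", f"{base}.train.qrels", f"{base}.train"]
    return ids


for dataset_id in miracl_ids("de"):
    print(dataset_id)
```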
MSMARCO (passage)
<p> A passage ranking benchmark with a collection of 8.8 million passages and question queries. Most relevance judgments are shallow (typically at most 1-2 per query), but the TREC Deep Learning track adds deep judgments. Evaluation is typically conducted using MRR@10. </p> <p> Note that the original document source files for this collection contain a double-encoding error that causes strange sequences like “å¬” and “ðºð”. These are automatically corrected (properly converting previous examples to “公” and “🇺🇸”). </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-document</a></li> <li>Documents: Short passages (from web)</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
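MRR@10, the metric mentioned above, averages the reciprocal rank of the first relevant passage found in the top 10 results of each query, counting 0 when none is found. The sketch below is a plain-Python illustration on toy data. The input shapes (dictionaries keyed by query ID) are assumptions made for the example, not the datamaestro or ir-datasets API, and the average is taken over the queries that have judgments.

```python
def mrr_at_k(run, qrels, k=10):
    """Mean reciprocal rank at cutoff k.

    run:   query_id -> list of doc ids, best first
    qrels: query_id -> set of relevant doc ids
    """
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        for rank, doc_id in enumerate(run.get(qid, [])[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank  # only the first relevant hit counts
                break
    return total / len(qrels)


# Toy data: q1 hits at rank 2, q2 at rank 1, q3 has no relevant hit.
run = {"q1": ["d3", "d1", "d7"], "q2": ["d9", "d4"], "q3": ["d5"]}
qrels = {"q1": {"d1"}, "q2": {"d9"}, "q3": {"d2"}}
print(mrr_at_k(run, qrels))  # (1/2 + 1 + 0) / 3 = 0.5
```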
-
Dataset irds.msmarco-passage.documents
datamaestro_text.datasets.irds.data.Documents
<p> A passage ranking benchmark with a collection of 8.8 million passages and question queries. Most relevance judgments are shallow (typically at most 1-2 per query), but the TREC Deep Learning track adds deep judgments. Evaluation is typically conducted using MRR@10. </p> <p> Note that the original document source files for this collection contain a double-encoding error that causes strange sequences like “å¬” and “ðºð”. These are automatically corrected (properly converting previous examples to “公” and “🇺🇸”). </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-document</a></li> <li>Documents: Short passages (from web)</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
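The double-encoding error described above is the kind that arises when UTF-8 bytes are decoded with a single-byte codec and the result is saved again. The sketch below reproduces and reverses it, assuming the intermediate codec was Latin-1. It illustrates the class of error only and is not necessarily how ir-datasets performs its correction. The function name repair_double_encoding is hypothetical.

```python
def repair_double_encoding(text: str) -> str:
    """Undo one round of UTF-8-read-as-Latin-1 mojibake, if present.

    Assumption: the intermediate codec was Latin-1. Text that cannot be
    round-tripped this way is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Reproduce the error: UTF-8 bytes misread as Latin-1 characters.
broken_cjk = "公".encode("utf-8").decode("latin-1")
broken_flag = "🇺🇸".encode("utf-8").decode("latin-1")

print(repair_double_encoding(broken_cjk))   # 公
print(repair_double_encoding(broken_flag))  # 🇺🇸
print(repair_double_encoding("plain text")) # plain text
```

Already-correct text either round-trips to itself (pure ASCII) or raises inside the try block and is returned as-is, so the helper does not damage clean passages in these cases.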
-
Dataset irds.msmarco-passage.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev set. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available dev queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p>
-
Dataset irds.msmarco-passage.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev set. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available dev queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p>
-
Dataset irds.msmarco-passage.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev set. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available dev queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p>
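Because the scoreddocs described above all carry a score of 0, they work as a candidate list only: in the re-ranking setting each (query, passage) candidate is scored again and the candidates are sorted per query. The sketch below shows that loop on toy data. The term-overlap scorer and the tuple layout are stand-ins chosen for illustration, not part of datamaestro or ir-datasets.

```python
from collections import defaultdict


def rerank(candidates, score_fn, k=1000):
    """Re-score (query_id, doc_id) candidates and sort them per query.

    Returns query_id -> list of (doc_id, score), best first, cut at k.
    """
    per_query = defaultdict(list)
    for qid, did in candidates:
        per_query[qid].append((did, score_fn(qid, did)))
    return {
        qid: sorted(docs, key=lambda pair: pair[1], reverse=True)[:k]
        for qid, docs in per_query.items()
    }


# Toy collection and a toy scorer (number of shared terms).
queries = {"q1": "what is bm25"}
docs = {
    "d1": "bm25 is a ranking function",
    "d2": "cats sleep a lot",
    "d3": "what bm25 is and what it is not",
}


def overlap(qid, did):
    return len(set(queries[qid].split()) & set(docs[did].split()))


candidates = [("q1", "d1"), ("q1", "d2"), ("q1", "d3")]
ranked = rerank(candidates, overlap)
print(ranked["q1"])  # [('d3', 3), ('d1', 2), ('d2', 0)]
```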
-
Dataset irds.msmarco-passage.dev.2.queries
datamaestro_text.datasets.irds.data.Topics
<p> “Dev2” split of the <a class=”ds-ref”>msmarco-passage/dev</a> set. Originally released as part of the v2 corpus. </p>
-
Dataset irds.msmarco-passage.dev.2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> “Dev2” split of the <a class=”ds-ref”>msmarco-passage/dev</a> set. Originally released as part of the v2 corpus. </p>
-
Dataset irds.msmarco-passage.dev.2
datamaestro_text.datasets.irds.data.Adhoc
<p> “Dev2” split of the <a class=”ds-ref”>msmarco-passage/dev</a> set. Originally released as part of the v2 corpus. </p>
-
Dataset irds.msmarco-passage.dev.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-passage/dev</a> that only includes queries that have at least one qrel. </p>
-
Dataset irds.msmarco-passage.dev.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-passage/dev</a> that only includes queries that have at least one qrel. </p>
-
Dataset irds.msmarco-passage.dev.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-passage/dev</a> that only includes queries that have at least one qrel. </p>
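The judged subsets keep only the queries that have at least one qrel. The sketch below expresses that filter on toy data. The (query_id, doc_id, relevance) tuple layout and the helper name judged_only are assumptions made for the example.

```python
def judged_only(queries, qrels):
    """Keep only queries that have at least one relevance judgment.

    queries: query_id -> query text
    qrels:   iterable of (query_id, doc_id, relevance) tuples
    """
    judged_ids = {qid for qid, _doc_id, _relevance in qrels}
    return {qid: text for qid, text in queries.items() if qid in judged_ids}


queries = {"q1": "what is bm25", "q2": "who wrote hamlet", "q3": "unjudged query"}
qrels = [("q1", "d3", 1), ("q2", "d8", 1), ("q2", "d9", 1)]

print(judged_only(queries, qrels))
```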
-
Dataset irds.msmarco-passage.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official “small” version of the dev set, consisting of 6,980 queries (6.9% of the full dev set). </p>
-
Dataset irds.msmarco-passage.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official “small” version of the dev set, consisting of 6,980 queries (6.9% of the full dev set). </p>
-
Dataset irds.msmarco-passage.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official “small” version of the dev set, consisting of 6,980 queries (6.9% of the full dev set). </p>
-
Dataset irds.msmarco-passage.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Official “small” version of the dev set, consisting of 6,980 queries (6.9% of the full dev set). </p>
-
Dataset irds.msmarco-passage.eval.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official eval set for submission to MS MARCO leaderboard. Relevance judgments are hidden. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available eval queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p>
-
Dataset irds.msmarco-passage.eval.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official “small” version of the eval set, consisting of 6,837 queries (6.8% of the full eval set). </p>
-
Dataset irds.msmarco-passage.eval.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official “small” version of the eval set, consisting of 6,837 queries (6.8% of the full eval set). </p>
-
Dataset irds.msmarco-passage.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set. </p> <p> Not all queries have relevance judgments. Use <a class=”ds-ref”>msmarco-passage/train/judged</a> for a filtered list that only includes queries that have at least one qrel. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available train queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p> <p> docpairs provides access to the “official” sequence for pairwise training. </p>
-
Dataset irds.msmarco-passage.train.docpairs
<p> Official train set. </p> <p> Not all queries have relevance judgments. Use <a class=”ds-ref”>msmarco-passage/train/judged</a> for a filtered list that only includes queries that have at least one qrel. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available train queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p> <p> docpairs provides access to the “official” sequence for pairwise training. </p>
-
Dataset irds.msmarco-passage.train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official train set. </p> <p> Not all queries have relevance judgments. Use <a class=”ds-ref”>msmarco-passage/train/judged</a> for a filtered list that only includes queries that have at least one qrel. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available train queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p> <p> docpairs provides access to the “official” sequence for pairwise training. </p>
-
Dataset irds.msmarco-passage.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set. </p> <p> Not all queries have relevance judgments. Use <a class=”ds-ref”>msmarco-passage/train/judged</a> for a filtered list that only includes queries that have at least one qrel. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available train queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p> <p> docpairs provides access to the “official” sequence for pairwise training. </p>
-
Dataset irds.msmarco-passage.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set. </p> <p> Not all queries have relevance judgments. Use <a class=”ds-ref”>msmarco-passage/train/judged</a> for a filtered list that only includes queries that have at least one qrel. </p> <p> scoreddocs are the top 1000 results from BM25. These are used for the “re-ranking” setting. Note that these are sub-sampled to about 1/8 of the total available train queries by the MSMARCO authors for faster evaluation. The BM25 scores from scoreddocs are not available (all have a score of 0). </p> <p> docpairs provides access to the “official” sequence for pairwise training. </p>
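Pairwise training, which the docpairs above are meant for, consumes triples of a query, a preferred passage, and a less-preferred passage, and pushes the model to score the first passage higher. The sketch below computes a margin (hinge) loss over such triples in plain Python. The choice of loss, the lookup-table scorer, and the triple layout are illustrative assumptions, not something this documentation prescribes.

```python
def pairwise_hinge_loss(pairs, score_fn, margin=1.0):
    """Mean hinge loss over (query_id, positive_doc_id, negative_doc_id) triples.

    A triple contributes 0 once the positive passage outscores the
    negative one by at least `margin`.
    """
    losses = [
        max(0.0, margin - score_fn(qid, pos) + score_fn(qid, neg))
        for qid, pos, neg in pairs
    ]
    return sum(losses) / len(losses) if losses else 0.0


# Toy scores standing in for a model's output.
scores = {
    ("q1", "d1"): 2.0, ("q1", "d2"): 0.5,   # well separated: loss 0
    ("q2", "d3"): 0.2, ("q2", "d4"): 0.5,   # wrong order: loss 1.3
}
pairs = [("q1", "d1", "d2"), ("q2", "d3", "d4")]

loss = pairwise_hinge_loss(pairs, lambda q, d: scores[(q, d)])
print(round(loss, 2))  # 0.65
```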
-
Dataset irds.msmarco-passage.train.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have at least one qrel. </p>
-
Dataset irds.msmarco-passage.train.judged.docpairs
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have at least one qrel. </p>
-
Dataset irds.msmarco-passage.train.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have at least one qrel. </p>
-
Dataset irds.msmarco-passage.train.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have at least one qrel. </p>
-
Dataset irds.msmarco-passage.train.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have at least one qrel. </p>
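The judged subsets amount to a filter over the query set: a query is kept only if at least one qrel refers to it. A minimal sketch on toy data, assuming qrels are available as (query_id, doc_id, relevance) tuples; the data layout and the `judged_only` helper are hypothetical and only illustrate the idea.

```python
# Hypothetical toy data: query IDs mapped to text, and qrels as
# (query_id, doc_id, relevance) tuples.
queries = {"q1": "first query", "q2": "second query", "q3": "third query"}
qrels = [("q1", "d7", 1), ("q3", "d2", 1), ("q3", "d9", 1)]


def judged_only(queries, qrels):
    """Keep only the queries that appear in at least one qrel."""
    judged_ids = {query_id for query_id, _doc_id, _rel in qrels}
    return {qid: text for qid, text in queries.items() if qid in judged_ids}


judged_queries = judged_only(queries, qrels)
```

Here q2 is dropped because no qrel mentions it.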
-
Dataset irds.msmarco-passage.train.medical.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have a layman or expert medical term. Note that this includes about 20% false matches due to terms with multiple senses. </p>
-
Dataset irds.msmarco-passage.train.medical.docpairs
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have a layman or expert medical term. Note that this includes about 20% false matches due to terms with multiple senses. </p>
-
Dataset irds.msmarco-passage.train.medical.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have a layman or expert medical term. Note that this includes about 20% false matches due to terms with multiple senses. </p>
-
Dataset irds.msmarco-passage.train.medical.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have a layman or expert medical term. Note that this includes about 20% false matches due to terms with multiple senses. </p>
-
Dataset irds.msmarco-passage.train.medical
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> that only includes queries that have a layman or expert medical term. Note that this includes about 20% false matches due to terms with multiple senses. </p>
-
Dataset irds.msmarco-passage.train.split200-train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> without 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-train.docpairs
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> without 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> without 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> without 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-train
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> without 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-valid.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> with only 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-valid.docpairs
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> with only 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-valid.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> with only 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-valid.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> with only 200 queries that are meant to be used as a small validation set. From various works. </p>
-
Dataset irds.msmarco-passage.train.split200-valid
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-passage/train</a> with only 200 queries that are meant to be used as a small validation set. From various works. </p>
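The split200-train and split200-valid subsets are complements of each other within msmarco-passage/train. The sketch below shows that relationship on a handful of made-up query IDs; the real validation set is a fixed list of 200 queries, and every ID used here is hypothetical.

```python
# Hypothetical IDs; the real validation set is a fixed list of 200 queries.
all_train_query_ids = ["q1", "q2", "q3", "q4", "q5"]
valid_query_ids = {"q2", "q5"}

# split200-valid keeps only the held-out queries ...
valid_split = [qid for qid in all_train_query_ids if qid in valid_query_ids]
# ... and split200-train keeps everything else.
train_split = [qid for qid in all_train_query_ids if qid not in valid_query_ids]

# The two splits are disjoint and together cover the original query set.
assert not set(valid_split) & set(train_split)
assert sorted(valid_split + train_split) == sorted(all_train_query_ids)
```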
-
Dataset irds.msmarco-passage.train.triples-small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with the “small” triples file (a 10% sample of the full file). </p> <p> Note that to save on storage space (27GB), the contents of the file are mapped to their corresponding query and document IDs. This process takes a few minutes to run the first time the triples are requested. </p>
-
Dataset irds.msmarco-passage.train.triples-small.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with the “small” triples file (a 10% sample of the full file). </p> <p> Note that to save on storage space (27GB), the contents of the file are mapped to their corresponding query and document IDs. This process takes a few minutes to run the first time the triples are requested. </p>
-
Dataset irds.msmarco-passage.train.triples-small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with the “small” triples file (a 10% sample of the full file). </p> <p> Note that to save on storage space (27GB), the contents of the file are mapped to their corresponding query and document IDs. This process takes a few minutes to run the first time the triples are requested. </p>
-
Dataset irds.msmarco-passage.train.triples-small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with the “small” triples file (a 10% sample of the full file). </p> <p> Note that to save on storage space (27GB), the contents of the file are mapped to their corresponding query and document IDs. This process takes a few minutes to run the first time the triples are requested. </p>
-
Dataset irds.msmarco-passage.train.triples-small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with the “small” triples file (a 10% sample of the full file). </p> <p> Note that to save on storage space (27GB), the contents of the file are mapped to their corresponding query and document IDs. This process takes a few minutes to run the first time the triples are requested. </p>
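The mapping step mentioned above replaces the text of each triple with the matching query and passage IDs. The sketch below illustrates the idea with reverse lookup tables. It is not the ir-datasets implementation: the toy collections are hypothetical, and it assumes every text is unique, which may not hold on real data.

```python
# Hypothetical toy data standing in for the query and passage collections.
queries = {"q1": "what is bm25", "q2": "capital of france"}
passages = {
    "d1": "bm25 is a ranking function",
    "d2": "paris is the capital",
    "d3": "bread recipe",
}

# Text-based triples: (query text, positive passage text, negative passage text).
text_triples = [
    ("what is bm25", "bm25 is a ranking function", "bread recipe"),
    ("capital of france", "paris is the capital", "bread recipe"),
]

# Reverse lookups from text to ID (assumes texts are unique in this toy data).
query_id_of = {text: qid for qid, text in queries.items()}
passage_id_of = {text: did for did, text in passages.items()}

id_triples = [
    (query_id_of[q], passage_id_of[pos], passage_id_of[neg])
    for q, pos, neg in text_triples
]
```

Storing three short IDs per row instead of three full texts is what saves the space.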
-
Dataset irds.msmarco-passage.train.triples-v2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with version 2 of the triples file. </p> <p> This version of the triples file includes rows that were accidentally missing from version 1 of the file (see discussion <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/commit/4695a71c6c76ce85c07a51c0f12690cab19abbb0”>here</a>). </p> <p> Note that this is sorted by the IDs in the file, so you will probably want to shuffle it before use. <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/issues/21”>We opened an issue</a> suggesting that a shuffled third version of the file be provided so that the order is consistent across groups using the data, but at this time, no such file exists in an official capacity. </p>
-
Dataset irds.msmarco-passage.train.triples-v2.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with version 2 of the triples file. </p> <p> This version of the triples file includes rows that were accidentally missing from version 1 of the file (see discussion <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/commit/4695a71c6c76ce85c07a51c0f12690cab19abbb0”>here</a>). </p> <p> Note that this is sorted by the IDs in the file, so you will probably want to shuffle it before use. <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/issues/21”>We opened an issue</a> suggesting that a shuffled third version of the file be provided so that the order is consistent across groups using the data, but at this time, no such file exists in an official capacity. </p>
-
Dataset irds.msmarco-passage.train.triples-v2.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with version 2 of the triples file. </p> <p> This version of the triples file includes rows that were accidentally missing from version 1 of the file (see discussion <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/commit/4695a71c6c76ce85c07a51c0f12690cab19abbb0”>here</a>). </p> <p> Note that this is sorted by the IDs in the file, so you will probably want to shuffle it before use. <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/issues/21”>We opened an issue</a> suggesting that a shuffled third version of the file be provided so that the order is consistent across groups using the data, but at this time, no such file exists in an official capacity. </p>
-
Dataset irds.msmarco-passage.train.triples-v2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with version 2 of the triples file. </p> <p> This version of the triples file includes rows that were accidentally missing from version 1 of the file (see discussion <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/commit/4695a71c6c76ce85c07a51c0f12690cab19abbb0”>here</a>). </p> <p> Note that this is sorted by the IDs in the file, so you will probably want to shuffle it before use. <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/issues/21”>We opened an issue</a> suggesting that a shuffled third version of the file be provided so that the order is consistent across groups using the data, but at this time, no such file exists in an official capacity. </p>
-
Dataset irds.msmarco-passage.train.triples-v2
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, but with version 2 of the triples file. </p> <p> This version of the triples file includes rows that were accidentally missing from version 1 of the file (see discussion <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/commit/4695a71c6c76ce85c07a51c0f12690cab19abbb0”>here</a>). </p> <p> Note that this is sorted by the IDs in the file, so you will probably want to shuffle it before use. <a href=”https://github.com/microsoft/MSMARCO-Passage-Ranking/issues/21”>We opened an issue</a> suggesting that a shuffled third version of the file be provided so that the order is consistent across groups using the data, but at this time, no such file exists in an official capacity. </p>
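Since the version 2 triples come sorted by ID, one way to get a training order that others can reproduce is to shuffle with a fixed, published seed. A minimal sketch using only the standard library follows. The sample triples and the `shuffled` helper are hypothetical, and the approach assumes the triples fit in memory as a list, which may not be practical for the full file.

```python
import random

# Hypothetical ID triples (query, positive, negative), sorted by ID.
triples = [
    ("q1", "d1", "d5"),
    ("q1", "d1", "d6"),
    ("q2", "d3", "d4"),
    ("q2", "d3", "d8"),
]


def shuffled(items, seed=0):
    """Return a shuffled copy; the same seed yields the same order on a given Python version."""
    copy = list(items)
    random.Random(seed).shuffle(copy)
    return copy


first_run = shuffled(triples, seed=42)
second_run = shuffled(triples, seed=42)
```

Using a dedicated `random.Random` instance keeps the shuffle independent of any other code that touches the global random state.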
-
Dataset irds.msmarco-passage.trec-dl-2019.queries
datamaestro_text.datasets.irds.data.Topics
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2019.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2019.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2019
datamaestro_text.datasets.irds.data.Adhoc
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2019.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-2019.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-2019.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-2019.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-2020.queries
datamaestro_text.datasets.irds.data.Topics
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2020.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2020.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2020
datamaestro_text.datasets.irds.data.Adhoc
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-passage/eval</a>. A subset of these queries was judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-passage/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-2020.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-2020.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-2020.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-2020.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.queries
datamaestro_text.datasets.irds.data.Topics
<p> A more challenging subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a> and <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>. </p> <ul> <li><a href=”https://github.com/grill-lab/DL-Hard”>data website</a></li> <li>See Also: <a class=”ds-ref”>msmarco-document/trec-dl-hard</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-hard.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A more challenging subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a> and <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>. </p> <ul> <li><a href=”https://github.com/grill-lab/DL-Hard”>data website</a></li> <li>See Also: <a class=”ds-ref”>msmarco-document/trec-dl-hard</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-hard
datamaestro_text.datasets.irds.data.Adhoc
<p> A more challenging subset of <a class=”ds-ref”>msmarco-passage/trec-dl-2019</a> and <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>. </p> <ul> <li><a href=”https://github.com/grill-lab/DL-Hard”>data website</a></li> <li>See Also: <a class=”ds-ref”>msmarco-document/trec-dl-hard</a></li> </ul>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 1 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 1 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold1
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 1 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 2 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 2 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold2
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 2 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 3 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 3 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold3
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 3 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold4.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 4 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold4.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 4 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold4
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 4 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold5.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 5 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold5.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 5 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-passage.trec-dl-hard.fold5
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 5 of <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a> </p>
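The five folds lend themselves to a cross-validation loop in which each fold is held out once and the other four are used for training or tuning. The listing does not prescribe this usage, so the sketch below is an assumption about how the folds might be combined. The dataset IDs follow the names listed above, and `cross_validation_rounds` is a hypothetical helper.

```python
# Dataset IDs of the five folds, as listed above.
FOLD_IDS = [f"irds.msmarco-passage.trec-dl-hard.fold{i}" for i in range(1, 6)]


def cross_validation_rounds(fold_ids):
    """Yield (train_folds, test_fold) pairs, holding out each fold once."""
    for held_out in fold_ids:
        train = [fold for fold in fold_ids if fold != held_out]
        yield train, held_out


rounds = list(cross_validation_rounds(FOLD_IDS))
```

Each ID in a round could be loaded separately with `prepare_dataset`, as in the usage example at the top of this page.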
mmarco/de
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into German. </p>
-
Dataset irds.mmarco.de.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into German. </p>
-
Dataset irds.mmarco.de.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.de.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
mmarco/es
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.es.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
mmarco/fr
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into French. </p>
-
Dataset irds.mmarco.fr.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into French. </p>
-
Dataset irds.mmarco.fr.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.fr.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
mmarco/id
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.id.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
mmarco/it
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Italian. </p>
-
Dataset irds.mmarco.it.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Italian. </p>
-
Dataset irds.mmarco.it.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.it.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
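The entries in this list appear to follow a simple naming rule: the ir-datasets ID (for example mmarco/it/dev/small) is prefixed with irds. and its slashes become dots. A minimal sketch of that rule follows. The helper name irds_to_datamaestro_id is hypothetical and not part of datamaestro-text, and the rule itself is inferred from the listing rather than taken from the library.

```python
def irds_to_datamaestro_id(irds_id: str) -> str:
    """Turn an ir-datasets ID into the dotted ID used in this list.

    Hypothetical helper: the "irds." prefix and the "/" -> "." rule are
    inferred from the entries on this page, not from the library source.
    """
    parts = [part for part in irds_id.strip("/").split("/") if part]
    if not parts:
        raise ValueError("empty ir-datasets ID")
    return "irds." + ".".join(parts)


# Example: the Italian dev/small subset listed above.
print(irds_to_datamaestro_id("mmarco/it/dev/small"))
```

The resulting string can then be passed to prepare_dataset as shown in the Usage section.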
mmarco/pt
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.dev.small.v1.1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.dev.small.v1.1.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.dev.small.v1.1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.dev.small.v1.1
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.dev.v1.1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.dev.v1.1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.dev.v1.1
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.pt.train.v1.1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.train.v1.1.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.train.v1.1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
-
Dataset irds.mmarco.pt.train.v1.1
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. It also removes some duplicated query IDs. </p>
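Since the Portuguese subsets come in both an original and a corrected v1.1 form, it may be convenient to pick the corrected variant whenever one is listed. The sketch below shows one way to do that for the top-level (Adhoc) IDs only. The helper prefer_corrected and the constant PT_V1_1 are hypothetical names introduced for illustration, and the set of IDs is copied by hand from the mmarco/pt entries above.

```python
# Corrected (v1.1) Adhoc-level IDs, copied from the mmarco/pt entries above.
PT_V1_1 = frozenset({
    "irds.mmarco.pt.dev.v1.1",
    "irds.mmarco.pt.dev.small.v1.1",
    "irds.mmarco.pt.train.v1.1",
})


def prefer_corrected(dataset_id: str, corrected=PT_V1_1) -> str:
    """Return the v1.1 variant of an Adhoc-level ID when one is listed.

    Hypothetical helper; component IDs (…queries, …qrels) are returned
    unchanged because their v1.1 form puts the suffix in a different place.
    """
    candidate = dataset_id + ".v1.1"
    return candidate if candidate in corrected else dataset_id


print(prefer_corrected("irds.mmarco.pt.train"))
print(prefer_corrected("irds.mmarco.it.train"))
```

Whether the corrected files are the better choice depends on the experiment, for example when comparing against results reported on the original translation.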
mmarco/ru
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.ru.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
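In the mmarco entries on this page, the last component of a dataset ID lines up with the wrapper type shown under it: documents with Documents, queries with Topics, qrels with AdhocAssessments, scoreddocs with AdhocRun, and IDs without such a suffix with Adhoc. The lookup below is a sketch of that pattern only. wrapper_for is a hypothetical helper, the fallback to Adhoc is an assumption that holds for the entries shown here, and docpairs entries return None because no wrapper type is listed for them.

```python
PREFIX = "datamaestro_text.datasets.irds.data."

# Suffix -> wrapper class name, as observed in the entries on this page.
WRAPPER_BY_SUFFIX = {
    "documents": "Documents",
    "queries": "Topics",
    "qrels": "AdhocAssessments",
    "scoreddocs": "AdhocRun",
}


def wrapper_for(dataset_id: str):
    """Guess the wrapper type listed for a dataset ID (hypothetical helper)."""
    suffix = dataset_id.rsplit(".", 1)[-1]
    if suffix == "docpairs":
        return None  # no wrapper type is listed for docpairs entries
    # Assumption: any ID without a known component suffix is an Adhoc bundle.
    return PREFIX + WRAPPER_BY_SUFFIX.get(suffix, "Adhoc")


print(wrapper_for("irds.mmarco.ru.dev.small.scoreddocs"))
```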
mmarco/v2/ar
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Arabic. </p>
-
Dataset irds.mmarco.v2.ar.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Arabic. </p>
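Each mmarco/v2 language on this page exposes the same twelve entries, so the IDs for a given language can be generated rather than typed out. The following sketch builds them from a language code. mmarco_v2_ids is a hypothetical helper, and the component list and language codes are copied from this page, so they may not match the ir-datasets version installed on your system.

```python
# Components listed for every mmarco/v2 language on this page ("" = Adhoc bundle).
COMPONENTS = (
    "documents",
    "dev.queries", "dev.qrels", "dev",
    "dev.small.queries", "dev.small.scoreddocs", "dev.small.qrels", "dev.small",
    "train.queries", "train.docpairs", "train.qrels", "train",
)

# Language codes that appear under mmarco/v2 in this part of the list.
V2_LANGUAGES = ("ar", "de", "dt", "es", "fr", "hi", "id", "it", "ja", "pt")


def mmarco_v2_ids(lang: str) -> list:
    """Return the dataset IDs listed for one mmarco/v2 language (hypothetical helper)."""
    if lang not in V2_LANGUAGES:
        raise ValueError(f"language code not listed here: {lang!r}")
    return [f"irds.mmarco.v2.{lang}.{component}" for component in COMPONENTS]


for dataset_id in mmarco_v2_ids("ar"):
    print(dataset_id)
```

Any of the generated IDs can be handed to prepare_dataset in the same way as in the Usage section.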
mmarco/v2/de
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
-
Dataset irds.mmarco.v2.de.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into German. </p>
mmarco/v2/dt
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Dutch. </p>
-
Dataset irds.mmarco.v2.dt.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Dutch. </p>
mmarco/v2/es
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
-
Dataset irds.mmarco.v2.es.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Spanish. </p>
mmarco/v2/fr
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
-
Dataset irds.mmarco.v2.fr.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into French. </p>
mmarco/v2/hi
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Hindi. </p>
-
Dataset irds.mmarco.v2.hi.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Hindi. </p>
mmarco/v2/id
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
-
Dataset irds.mmarco.v2.id.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Indonesian. </p>
mmarco/v2/it
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
-
Dataset irds.mmarco.v2.it.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Italian. </p>
mmarco/v2/ja
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Japanese. </p>
-
Dataset irds.mmarco.v2.ja.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Japanese. </p>
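The identifiers in this list appear to follow a regular pattern: the ir-datasets ID (for example `mmarco/v2/ja/dev/small`) is prefixed with `irds`, its slashes become dots, and an optional component suffix such as `queries` or `qrels` is appended. The sketch below illustrates that pattern only. The helper name `to_datamaestro_id` is hypothetical and is not part of datamaestro-text, and the mapping is an assumption inferred from the entries on this page.

```python
from typing import Optional


def to_datamaestro_id(irds_id: str, component: Optional[str] = None) -> str:
    """Hypothetical helper: build an ID in the form used by the entries on this page.

    Assumption: slashes in the ir-datasets ID map to dots and everything else
    (hyphens, version dots such as "v1.1") is kept unchanged.
    """
    parts = ["irds", *irds_id.strip("/").split("/")]
    if component is not None:
        parts.append(component)
    return ".".join(parts)


print(to_datamaestro_id("mmarco/v2/ja/dev/small", "qrels"))
print(to_datamaestro_id("mr-tydi/ar"))
```

Because dots inside a path segment are preserved (as in `v1.1`), the mapping is not reversible by simply replacing dots with slashes.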
mmarco/v2/pt
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
-
Dataset irds.mmarco.v2.pt.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Portuguese. </p>
mmarco/v2/ru
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
-
Dataset irds.mmarco.v2.ru.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Russian. </p>
mmarco/v2/vi
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Vietnamese. </p>
-
Dataset irds.mmarco.v2.vi.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Vietnamese. </p>
mmarco/v2/zh
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.dev.small.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.v2.zh.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
mmarco/zh
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version of <a class=”ds-ref”>msmarco-passage</a>, with documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.dev.small.v1.1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. </p>
-
Dataset irds.mmarco.zh.dev.small.v1.1.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. </p>
-
Dataset irds.mmarco.zh.dev.small.v1.1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. </p>
-
Dataset irds.mmarco.zh.dev.small.v1.1
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. </p>
-
Dataset irds.mmarco.zh.dev.v1.1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. </p>
-
Dataset irds.mmarco.zh.dev.v1.1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. </p>
-
Dataset irds.mmarco.zh.dev.v1.1
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/dev</a>, with queries and documents translated into Chinese. </p> <p> Version 1.1 of this file includes manual corrections from the authors of the translated files. <a href=”https://github.com/unicamp-dl/mMARCO/issues/8#issuecomment-992810293”>See discussion here</a>. </p>
-
Dataset irds.mmarco.zh.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.train.docpairs
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
-
Dataset irds.mmarco.zh.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Version of <a class=”ds-ref”>msmarco-passage/train</a>, with queries and documents translated into Chinese. </p>
mr-tydi/ar
<p> Complete Arabic dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ar.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Arabic dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ar.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Arabic dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ar.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Arabic dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ar
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Arabic dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ar.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Arabic</p>
-
Dataset irds.mr-tydi.ar.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Arabic</p>
-
Dataset irds.mr-tydi.ar.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Arabic</p>
-
Dataset irds.mr-tydi.ar.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Arabic</p>
-
Dataset irds.mr-tydi.ar.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Arabic</p>
-
Dataset irds.mr-tydi.ar.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Arabic</p>
-
Dataset irds.mr-tydi.ar.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Arabic</p>
-
Dataset irds.mr-tydi.ar.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Arabic</p>
-
Dataset irds.mr-tydi.ar.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Arabic</p>
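The mr-tydi entries repeat the same layout for each language: a complete dataset plus `dev`, `test`, and `train` subsets. As a rough sketch, the per-split IDs can be enumerated programmatically, for instance to loop over languages in an evaluation script. The helper `mr_tydi_ids` is hypothetical, the language list is limited to codes that appear in this part of the list, and the code assumes every language exposes the same three splits.

```python
from itertools import product

# Language codes and splits as they appear in the mr-tydi entries of this list.
LANGUAGES = ["ar", "bn", "en", "fi", "id", "ja", "ko", "ru", "sw", "te"]
SPLITS = ["dev", "test", "train"]


def mr_tydi_ids(languages=LANGUAGES, splits=SPLITS):
    """Hypothetical helper: list per-split mr-tydi IDs in the form used on this page."""
    return [
        f"irds.mr-tydi.{lang}.{split}"
        for lang, split in product(languages, splits)
    ]


ids = mr_tydi_ids()
print(len(ids))
print(ids[:3])
```

Each resulting ID should then be loadable with `prepare_dataset`, as in the usage example at the top of this page, provided the installed ir-datasets version includes that subset.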
mr-tydi/bn
<p> Complete Bengali dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.bn.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Bengali dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.bn.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Bengali dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.bn.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Bengali dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.bn
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Bengali dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.bn.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Bengali</p>
-
Dataset irds.mr-tydi.bn.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Bengali</p>
-
Dataset irds.mr-tydi.bn.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Bengali</p>
-
Dataset irds.mr-tydi.bn.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Bengali</p>
-
Dataset irds.mr-tydi.bn.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Bengali</p>
-
Dataset irds.mr-tydi.bn.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Bengali</p>
-
Dataset irds.mr-tydi.bn.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Bengali</p>
-
Dataset irds.mr-tydi.bn.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Bengali</p>
-
Dataset irds.mr-tydi.bn.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Bengali</p>
mr-tydi/en
<p> Complete English dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.en.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete English dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.en.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete English dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.en.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete English dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.en
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete English dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.en.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for English</p>
-
Dataset irds.mr-tydi.en.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for English</p>
-
Dataset irds.mr-tydi.en.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for English</p>
-
Dataset irds.mr-tydi.en.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for English</p>
-
Dataset irds.mr-tydi.en.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for English</p>
-
Dataset irds.mr-tydi.en.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for English</p>
-
Dataset irds.mr-tydi.en.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for English</p>
-
Dataset irds.mr-tydi.en.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for English</p>
-
Dataset irds.mr-tydi.en.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for English</p>
mr-tydi/fi
<p> Complete Finnish dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.fi.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Finnish dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.fi.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Finnish dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.fi.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Finnish dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.fi
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Finnish dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.fi.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Finnish</p>
-
Dataset irds.mr-tydi.fi.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Finnish</p>
-
Dataset irds.mr-tydi.fi.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Finnish</p>
-
Dataset irds.mr-tydi.fi.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Finnish</p>
-
Dataset irds.mr-tydi.fi.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Finnish</p>
-
Dataset irds.mr-tydi.fi.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Finnish</p>
-
Dataset irds.mr-tydi.fi.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Finnish</p>
-
Dataset irds.mr-tydi.fi.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Finnish</p>
-
Dataset irds.mr-tydi.fi.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Finnish</p>
mr-tydi/id
<p> Complete Indonesian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.id.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Indonesian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.id.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Indonesian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.id.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Indonesian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.id
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Indonesian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.id.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Indonesian</p>
-
Dataset irds.mr-tydi.id.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Indonesian</p>
-
Dataset irds.mr-tydi.id.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Indonesian</p>
-
Dataset irds.mr-tydi.id.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Indonesian</p>
-
Dataset irds.mr-tydi.id.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Indonesian</p>
-
Dataset irds.mr-tydi.id.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Indonesian</p>
-
Dataset irds.mr-tydi.id.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Indonesian</p>
-
Dataset irds.mr-tydi.id.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Indonesian</p>
-
Dataset irds.mr-tydi.id.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Indonesian</p>
mr-tydi/ja
<p> Complete Japanese dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ja.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Japanese dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ja.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Japanese dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ja.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Japanese dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ja
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Japanese dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ja.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Japanese</p>
-
Dataset irds.mr-tydi.ja.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Japanese</p>
-
Dataset irds.mr-tydi.ja.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Japanese</p>
-
Dataset irds.mr-tydi.ja.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Japanese</p>
-
Dataset irds.mr-tydi.ja.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Japanese</p>
-
Dataset irds.mr-tydi.ja.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Japanese</p>
-
Dataset irds.mr-tydi.ja.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Japanese</p>
-
Dataset irds.mr-tydi.ja.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Japanese</p>
-
Dataset irds.mr-tydi.ja.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Japanese</p>
mr-tydi/ko
<p> Complete Korean dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ko.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Korean dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ko.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Korean dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ko.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Korean dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ko
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Korean dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ko.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Korean</p>
-
Dataset irds.mr-tydi.ko.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Korean</p>
-
Dataset irds.mr-tydi.ko.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Korean</p>
-
Dataset irds.mr-tydi.ko.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Korean</p>
-
Dataset irds.mr-tydi.ko.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Korean</p>
-
Dataset irds.mr-tydi.ko.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Korean</p>
-
Dataset irds.mr-tydi.ko.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Korean</p>
-
Dataset irds.mr-tydi.ko.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Korean</p>
-
Dataset irds.mr-tydi.ko.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Korean</p>
mr-tydi/ru
<p> Complete Russian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Russian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ru.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Russian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ru.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Russian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ru
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Russian dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.ru.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Russian</p>
-
Dataset irds.mr-tydi.ru.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Russian</p>
-
Dataset irds.mr-tydi.ru.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Russian</p>
-
Dataset irds.mr-tydi.ru.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Russian</p>
-
Dataset irds.mr-tydi.ru.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Russian</p>
-
Dataset irds.mr-tydi.ru.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Russian</p>
-
Dataset irds.mr-tydi.ru.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Russian</p>
-
Dataset irds.mr-tydi.ru.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Russian</p>
-
Dataset irds.mr-tydi.ru.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Russian</p>
mr-tydi/sw
<p> Complete Swahili dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.sw.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Swahili dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.sw.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Swahili dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.sw.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Swahili dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.sw
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Swahili dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.sw.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Swahili</p>
-
Dataset irds.mr-tydi.sw.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Swahili</p>
-
Dataset irds.mr-tydi.sw.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Swahili</p>
-
Dataset irds.mr-tydi.sw.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Swahili</p>
-
Dataset irds.mr-tydi.sw.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Swahili</p>
-
Dataset irds.mr-tydi.sw.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Swahili</p>
-
Dataset irds.mr-tydi.sw.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Swahili</p>
-
Dataset irds.mr-tydi.sw.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Swahili</p>
-
Dataset irds.mr-tydi.sw.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Swahili</p>
mr-tydi/te
<p> Complete Telugu dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.te.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Telugu dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.te.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Telugu dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.te.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Telugu dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.te
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Telugu dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.te.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Telugu</p>
-
Dataset irds.mr-tydi.te.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Telugu</p>
-
Dataset irds.mr-tydi.te.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Telugu</p>
-
Dataset irds.mr-tydi.te.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Telugu</p>
-
Dataset irds.mr-tydi.te.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Telugu</p>
-
Dataset irds.mr-tydi.te.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Telugu</p>
-
Dataset irds.mr-tydi.te.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Telugu</p>
-
Dataset irds.mr-tydi.te.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Telugu</p>
-
Dataset irds.mr-tydi.te.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Telugu</p>
mr-tydi/th
<p> Complete Thai dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.th.documents
datamaestro_text.datasets.irds.data.Documents
<p> Complete Thai dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.th.queries
datamaestro_text.datasets.irds.data.Topics
<p> Complete Thai dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.th.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Complete Thai dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.th
datamaestro_text.datasets.irds.data.Adhoc
<p> Complete Thai dataset, including all train, dev, and test queries and qrels. </p>
-
Dataset irds.mr-tydi.th.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>Development set for Thai</p>
-
Dataset irds.mr-tydi.th.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Development set for Thai</p>
-
Dataset irds.mr-tydi.th.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>Development set for Thai</p>
-
Dataset irds.mr-tydi.th.test.queries
datamaestro_text.datasets.irds.data.Topics
<p>Test set for Thai</p>
-
Dataset irds.mr-tydi.th.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Test set for Thai</p>
-
Dataset irds.mr-tydi.th.test
datamaestro_text.datasets.irds.data.Adhoc
<p>Test set for Thai</p>
-
Dataset irds.mr-tydi.th.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>Train set for Thai</p>
-
Dataset irds.mr-tydi.th.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>Train set for Thai</p>
-
Dataset irds.mr-tydi.th.train
datamaestro_text.datasets.irds.data.Adhoc
<p>Train set for Thai</p>
MSMARCO (document)
<p> “Based the questions in the [MS-MARCO] Question Answering Dataset and the documents which answered the questions a document ranking task was formulated. There are 3.2 million documents and the goal is to rank based on their relevance. Relevance labels are derived from what passages was marked as having the answer in the QnA dataset.” </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> <li>Documents: Text extracted from web pages</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://microsoft.github.io/msmarco/#docranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document.documents
datamaestro_text.datasets.irds.data.Documents
<p> “Based the questions in the [MS-MARCO] Question Answering Dataset and the documents which answered the questions a document ranking task was formulated. There are 3.2 million documents and the goal is to rank based on their relevance. Relevance labels are derived from what passages was marked as having the answer in the QnA dataset.” </p> <ul> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> <li>Documents: Text extracted from web pages</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://microsoft.github.io/msmarco/#docranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.dev.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official dev set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
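The "re-ranking" setting mentioned above can be sketched in plain Python: a second-stage scorer only re-orders the candidates returned by the first stage (here, the top 100 per query). The tuple layout, the `rerank` helper, and the toy scorer are illustrative assumptions, not part of datamaestro-text or ir-datasets.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed layout of one scored document: (query_id, doc_id, first-stage score)
ScoredDoc = Tuple[str, str, float]


def rerank(
    scoreddocs: Iterable[ScoredDoc],
    scorer: Callable[[str, str], float],
    top_k: int = 100,
) -> Dict[str, List[Tuple[str, float]]]:
    """Re-score the first-stage candidates of each query with `scorer`."""
    candidates = defaultdict(list)
    for query_id, doc_id, score in scoreddocs:
        candidates[query_id].append((doc_id, score))

    reranked = {}
    for query_id, docs in candidates.items():
        # Keep only the top_k candidates according to the first-stage score
        docs = sorted(docs, key=lambda d: d[1], reverse=True)[:top_k]
        # Replace first-stage scores with the second-stage scores
        rescored = [(doc_id, scorer(query_id, doc_id)) for doc_id, _ in docs]
        reranked[query_id] = sorted(rescored, key=lambda d: d[1], reverse=True)
    return reranked


# Toy data standing in for a scoreddocs run and a neural scorer
toy_run = [("q1", "d1", 3.0), ("q1", "d2", 2.0), ("q1", "d3", 1.0)]
toy_scores = {("q1", "d1"): 0.1, ("q1", "d2"): 0.9, ("q1", "d3"): 0.5}
result = rerank(toy_run, lambda q, d: toy_scores[(q, d)])
print(result["q1"])
```

Documents outside the candidate list are never scored, which is what distinguishes re-ranking from full retrieval.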
-
Dataset irds.msmarco-document.eval.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official eval set for submission to MS MARCO leaderboard. Relevance judgments are hidden. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.eval.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official eval set for submission to MS MARCO leaderboard. Relevance judgments are hidden. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.orcas.queries
datamaestro_text.datasets.irds.data.Topics
<p> “ORCAS is a click-based dataset associated with the TREC Deep Learning Track. It covers 1.4 million of the TREC DL documents, providing 18 million connections to 10 million distinct queries.” </p> <ul> <li>Queries: From query log</li> <li>Relevance Data: User clicks</li> <li>Scored docs: Indri Query Likelihood model</li> <li><a href=”https://arxiv.org/abs/2006.05324”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document.orcas.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> “ORCAS is a click-based dataset associated with the TREC Deep Learning Track. It covers 1.4 million of the TREC DL documents, providing 18 million connections to 10 million distinct queries.” </p> <ul> <li>Queries: From query log</li> <li>Relevance Data: User clicks</li> <li>Scored docs: Indri Query Likelihood model</li> <li><a href=”https://arxiv.org/abs/2006.05324”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document.orcas.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> “ORCAS is a click-based dataset associated with the TREC Deep Learning Track. It covers 1.4 million of the TREC DL documents, providing 18 million connections to 10 million distinct queries.” </p> <ul> <li>Queries: From query log</li> <li>Relevance Data: User clicks</li> <li>Scored docs: Indri Query Likelihood model</li> <li><a href=”https://arxiv.org/abs/2006.05324”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document.orcas
datamaestro_text.datasets.irds.data.Adhoc
<p> “ORCAS is a click-based dataset associated with the TREC Deep Learning Track. It covers 1.4 million of the TREC DL documents, providing 18 million connections to 10 million distinct queries.” </p> <ul> <li>Queries: From query log</li> <li>Relevance Data: User clicks</li> <li>Scored docs: Indri Query Likelihood model</li> <li><a href=”https://arxiv.org/abs/2006.05324”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official train set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set. All queries have exactly 1 (positive) relevance judgment. </p> <p> scoreddocs are the top 100 results from Indri QL. These are used for the “re-ranking” setting. </p>
-
Dataset irds.msmarco-document.trec-dl-2019.queries
datamaestro_text.datasets.irds.data.Topics
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2019.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2019.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2019
datamaestro_text.datasets.irds.data.Adhoc
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2019.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document.trec-dl-2019.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document.trec-dl-2019.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document.trec-dl-2019.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2019</a>, only including queries with qrels. </p>
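A "judged" subset keeps only the queries that have at least one relevance judgment. The filter can be sketched as follows, assuming queries are `(query_id, text)` pairs and qrels are `(query_id, doc_id, relevance)` triples; the `judged_subset` helper and the toy data are hypothetical.

```python
from typing import Iterable, List, Tuple

Query = Tuple[str, str]      # (query_id, text)
Qrel = Tuple[str, str, int]  # (query_id, doc_id, relevance)


def judged_subset(queries: Iterable[Query], qrels: Iterable[Qrel]) -> List[Query]:
    """Return the queries that appear in at least one qrel."""
    judged_ids = {query_id for query_id, _doc_id, _relevance in qrels}
    return [(query_id, text) for query_id, text in queries if query_id in judged_ids]


# Toy data: only q1 and q3 have judgments
queries = [("q1", "first query"), ("q2", "second query"), ("q3", "third query")]
qrels = [("q1", "d10", 2), ("q3", "d42", 0), ("q3", "d7", 1)]
print(judged_subset(queries, qrels))
```

A query judged only as non-relevant (label 0) still counts as judged under this definition.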
-
Dataset irds.msmarco-document.trec-dl-2020.queries
datamaestro_text.datasets.irds.data.Topics
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2020.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2020.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2020
datamaestro_text.datasets.irds.data.Adhoc
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-2020.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document.trec-dl-2020.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document.trec-dl-2020.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document.trec-dl-2020.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document.trec-dl-hard.queries
datamaestro_text.datasets.irds.data.Topics
<p> A more challenging subset of <a class=”ds-ref”>msmarco-document/trec-dl-2019</a> and <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>. </p> <ul> <li><a href=”https://github.com/grill-lab/DL-Hard”>data website</a></li> <li>See Also: <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-hard.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A more challenging subset of <a class=”ds-ref”>msmarco-document/trec-dl-2019</a> and <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>. </p> <ul> <li><a href=”https://github.com/grill-lab/DL-Hard”>data website</a></li> <li>See Also: <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-hard
datamaestro_text.datasets.irds.data.Adhoc
<p> A more challenging subset of <a class=”ds-ref”>msmarco-document/trec-dl-2019</a> and <a class=”ds-ref”>msmarco-document/trec-dl-2020</a>. </p> <ul> <li><a href=”https://github.com/grill-lab/DL-Hard”>data website</a></li> <li>See Also: <a class=”ds-ref”>msmarco-passage/trec-dl-hard</a></li> </ul>
-
Dataset irds.msmarco-document.trec-dl-hard.fold1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 1 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 1 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold1
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 1 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 2 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 2 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold2
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 2 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 3 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 3 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold3
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 3 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold4.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 4 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold4.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 4 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold4
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 4 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold5.queries
datamaestro_text.datasets.irds.data.Topics
<p> Fold 5 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold5.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Fold 5 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
-
Dataset irds.msmarco-document.trec-dl-hard.fold5
datamaestro_text.datasets.irds.data.Adhoc
<p> Fold 5 of <a class=”ds-ref”>msmarco-document/trec-dl-hard</a> </p>
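The five folds listed above lend themselves to cross-validation, where each fold is held out once and the other four are used for training or tuning. That usage is an assumption here (the listing itself only names the folds); the sketch below enumerates the splits by dataset ID, and `cross_validation_splits` is a hypothetical helper.

```python
from typing import Iterator, List, Tuple

# Dataset IDs as they appear in this listing
FOLD_IDS = [f"irds.msmarco-document.trec-dl-hard.fold{i}" for i in range(1, 6)]


def cross_validation_splits(fold_ids: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training fold IDs, held-out fold ID) for each fold in turn."""
    for held_out in fold_ids:
        train = [fold_id for fold_id in fold_ids if fold_id != held_out]
        yield train, held_out


for train_ids, test_id in cross_validation_splits(FOLD_IDS):
    print(test_id, "<-", len(train_ids), "training folds")
```

Each ID could then be loaded with `prepare_dataset`, as in the usage example at the top of this page.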
Anchor Text for Version 1 of MS MARCO
<p> For version 1 of MS MARCO, the anchor text collection enriches 1,703,834 documents with anchor text extracted from six Common Crawl snapshots. To keep the collection size reasonable, we sampled 1,000 anchor texts for documents with more than 1,000 anchor texts (with this sampling, all anchor text is included for 94% of the documents). The <code>text</code> field contains the concatenated anchor texts, and the <code>anchors</code> field contains the anchor texts as a list. The raw dataset with additional information (roughly 100GB) is <a href=”https://github.com/webis-de/ecir22-anchor-text”>available online</a>. </p>
-
Dataset irds.msmarco-document.anchor-text.documents
datamaestro_text.datasets.irds.data.Documents
<p> For version 1 of MS MARCO, the anchor text collection enriches 1,703,834 documents with anchor text extracted from six Common Crawl snapshots. To keep the collection size reasonable, we sampled 1,000 anchor texts for documents with more than 1,000 anchor texts (with this sampling, all anchor text is included for 94% of the documents). The <code>text</code> field contains the concatenated anchor texts, and the <code>anchors</code> field contains the anchor texts as a list. The raw dataset with additional information (roughly 100GB) is <a href=”https://github.com/webis-de/ecir22-anchor-text”>available online</a>. </p>
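The relationship between the sampling cap and the `text` and `anchors` fields can be illustrated with a small sketch. The uniform sampling, the fixed seed, the space separator, and the `build_anchor_document` helper are all assumptions made for illustration; the description above does not specify how the original collection was sampled or joined.

```python
import random
from typing import Dict, List, Union

MAX_ANCHORS = 1000  # cap stated in the dataset description


def build_anchor_document(
    doc_id: str, anchors: List[str], seed: int = 0
) -> Dict[str, Union[str, List[str]]]:
    """Build a record with a `text` field and an `anchors` field.

    Assumptions: documents above the cap are sampled uniformly without
    replacement, and anchor texts are joined with a single space.
    """
    if len(anchors) > MAX_ANCHORS:
        rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
        anchors = rng.sample(anchors, MAX_ANCHORS)
    return {"doc_id": doc_id, "text": " ".join(anchors), "anchors": list(anchors)}


small = build_anchor_document("D1", ["home", "about us"])
big = build_anchor_document("D2", [f"anchor {i}" for i in range(1500)])
print(small["text"])
print(len(big["anchors"]))
```

Documents with at most 1,000 anchor texts pass through unchanged, which corresponds to the share of documents for which all anchor text is included.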
MSMARCO (document, version 2)
<p> Version 2 of the MS MARCO document ranking dataset. The corpus contains 12M documents (roughly 3x as many as version 1). </p> <ul> <li>Version 1 of dataset: <a class=”ds-ref”>msmarco-document</a></li> <li>Documents: Text extracted from web pages</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 2 of the MS MARCO document ranking dataset. The corpus contains 12M documents (roughly 3x as many as version 1). </p> <ul> <li>Version 1 of dataset: <a class=”ds-ref”>msmarco-document</a></li> <li>Documents: Text extracted from web pages</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.dev1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev1 set with 4,552 queries. </p>
-
Dataset irds.msmarco-document-v2.dev1.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official dev1 set with 4,552 queries. </p>
-
Dataset irds.msmarco-document-v2.dev1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev1 set with 4,552 queries. </p>
-
Dataset irds.msmarco-document-v2.dev1
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev1 set with 4,552 queries. </p>
-
Dataset irds.msmarco-document-v2.dev2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev2 set with 5,000 queries. </p>
-
Dataset irds.msmarco-document-v2.dev2.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official dev2 set with 5,000 queries. </p>
-
Dataset irds.msmarco-document-v2.dev2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev2 set with 5,000 queries. </p>
-
Dataset irds.msmarco-document-v2.dev2
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev2 set with 5,000 queries. </p>
-
Dataset irds.msmarco-document-v2.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set with 322,196 queries. </p>
-
Dataset irds.msmarco-document-v2.train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official train set with 322,196 queries. </p>
-
Dataset irds.msmarco-document-v2.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set with 322,196 queries. </p>
-
Dataset irds.msmarco-document-v2.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set with 322,196 queries. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2019.queries
datamaestro_text.datasets.irds.data.Topics
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document-v2/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.trec-dl-2019.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document-v2/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.trec-dl-2019
datamaestro_text.datasets.irds.data.Adhoc
<p> Queries from the TREC Deep Learning (DL) 2019 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document-v2/trec-dl-2019/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.07820.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.trec-dl-2019.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-document-v2/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2019.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-document-v2/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2019.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-document-v2/trec-dl-2019</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2020.queries
datamaestro_text.datasets.irds.data.Topics
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document-v2/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.trec-dl-2020.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document-v2/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.trec-dl-2020
datamaestro_text.datasets.irds.data.Adhoc
<p> Queries from the TREC Deep Learning (DL) 2020 shared task, which were sampled from <a class=”ds-ref”>msmarco-document/eval</a>. A subset of these queries were judged by NIST assessors (filtered list available in <a class=”ds-ref”>msmarco-document-v2/trec-dl-2020/judged</a>). </p> <ul> <li><a href=”https://arxiv.org/pdf/2102.07662.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.msmarco-document-v2.trec-dl-2020.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of <a class=”ds-ref”>msmarco-document-v2/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2020.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of <a class=”ds-ref”>msmarco-document-v2/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2020.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of <a class=”ds-ref”>msmarco-document-v2/trec-dl-2020</a>, only including queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p> <p> Note that at this time, qrels are only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p> <p> Note that at this time, qrels are only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p> <p> Note that at this time, qrels are only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021
datamaestro_text.datasets.irds.data.Adhoc
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p> <p> Note that at this time, qrels are only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2021</a>, but filtered down to the 57 queries with qrels. </p> <p> Note that at this time, this is only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2021</a>, but filtered down to the 57 queries with qrels. </p> <p> Note that at this time, this is only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2021</a>, but filtered down to the 57 queries with qrels. </p> <p> Note that at this time, this is only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2021.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2021</a>, but filtered down to the 57 queries with qrels. </p> <p> Note that at this time, this is only available to those with TREC active participant login credentials. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2022.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that these qrels are <i>inferred</i> from the passage ranking task; a document’s relevance label is the maximum of the labels of its passages. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2022.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that these qrels are <i>inferred</i> from the passage ranking task; a document’s relevance label is the maximum of the labels of its passages. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2022.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that these qrels are <i>inferred</i> from the passage ranking task; a document’s relevance label is the maximum of the labels of its passages. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2022
datamaestro_text.datasets.irds.data.Adhoc
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that these qrels are <i>inferred</i> from the passage ranking task; a document’s relevance label is the maximum of the labels of its passages. </p>
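The inference rule stated above (a document's label is the maximum of the labels of its passages) can be written out as a short sketch. The tuple layout, the passage-to-document mapping, and the `infer_document_qrels` helper are illustrative assumptions, not the code used to build the official qrels.

```python
from typing import Dict, Iterable, Mapping, Tuple

PassageQrel = Tuple[str, str, int]  # (query_id, passage_id, relevance label)


def infer_document_qrels(
    passage_qrels: Iterable[PassageQrel],
    passage_to_doc: Mapping[str, str],
) -> Dict[Tuple[str, str], int]:
    """Give each (query, document) pair the maximum label of its judged passages."""
    doc_qrels: Dict[Tuple[str, str], int] = {}
    for query_id, passage_id, label in passage_qrels:
        key = (query_id, passage_to_doc[passage_id])
        # Keep the highest label seen so far for this (query, document) pair
        doc_qrels[key] = max(label, doc_qrels.get(key, label))
    return doc_qrels


# Toy data: passages p1 and p2 belong to document dA, p3 to document dB
passage_qrels = [("q1", "p1", 0), ("q1", "p2", 3), ("q1", "p3", 1)]
passage_to_doc = {"p1": "dA", "p2": "dA", "p3": "dB"}
doc_qrels = infer_document_qrels(passage_qrels, passage_to_doc)
print(doc_qrels)  # {('q1', 'dA'): 3, ('q1', 'dB'): 1}
```

Under this rule a document is considered as relevant as its best passage, regardless of how many of its passages were judged non-relevant.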
-
Dataset irds.msmarco-document-v2.trec-dl-2022.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2022.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2022.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2022.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> <a class=”ds-ref”>msmarco-document-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
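The "judged" subsets keep only the queries that have at least one qrel. A minimal sketch of that filtering, assuming queries are `(query_id, text)` pairs and qrels are `(query_id, doc_id, relevance)` triples (both layouts and the function name are hypothetical, chosen for illustration):

```python
def keep_judged_queries(queries, qrels):
    """Return only the queries that appear in at least one qrel.

    queries: iterable of (query_id, text) pairs (hypothetical layout).
    qrels: iterable of (query_id, doc_id, relevance) triples (hypothetical layout).
    """
    judged_ids = {query_id for query_id, _doc_id, _relevance in qrels}
    # Preserve the original query order while dropping unjudged queries.
    return [(query_id, text) for query_id, text in queries if query_id in judged_ids]


# Made-up data, for illustration only.
judged = keep_judged_queries(
    [("q1", "first query"), ("q2", "second query"), ("q3", "third query")],
    [("q1", "d7", 2), ("q3", "d9", 0)],
)
```

Note that a query counts as judged even when all of its labels are 0; only queries with no assessments at all are dropped.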
-
Dataset irds.msmarco-document-v2.trec-dl-2023.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official topics for the TREC Deep Learning (DL) 2023 shared task. </p>
-
Dataset irds.msmarco-document-v2.trec-dl-2023.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official topics for the TREC Deep Learning (DL) 2023 shared task. </p>
Anchor Text for version 2 of MS Marco
<p> For version 2 of MS MARCO, the anchor text collection enriches 4,821,244 documents with anchor text extracted from six Common Crawl snapshots. To keep the collection size reasonable, we sampled 1,000 anchor texts for documents with more than 1,000 anchor texts (with this sampling, all anchor text is included for 97% of the documents). The <code>text</code> field contains the concatenated anchor texts and the <code>anchors</code> field contains the anchor texts as a list. The raw dataset with additional information (roughly 100 GB) is <a href=”https://github.com/webis-de/ecir22-anchor-text”>available online</a>. </p>
-
Dataset irds.msmarco-document-v2.anchor-text.documents
datamaestro_text.datasets.irds.data.Documents
<p> For version 2 of MS MARCO, the anchor text collection enriches 4,821,244 documents with anchor text extracted from six Common Crawl snapshots. To keep the collection size reasonable, we sampled 1,000 anchor texts for documents with more than 1,000 anchor texts (with this sampling, all anchor text is included for 97% of the documents). The <code>text</code> field contains the concatenated anchor texts and the <code>anchors</code> field contains the anchor texts as a list. The raw dataset with additional information (roughly 100 GB) is <a href=”https://github.com/webis-de/ecir22-anchor-text”>available online</a>. </p>
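The relationship between the `anchors` and `text` fields, and the cap of 1,000 anchor texts per document, can be sketched as below. This is a rough illustration only: the description does not say how the sampling was done or how the anchor texts are joined, so the uniform sampling, the fixed seed, the single-space separator, and the function name are all assumptions.

```python
import random


def build_anchor_fields(anchors, limit=1000, seed=0):
    """Cap the anchor texts of one document and derive both fields.

    anchors: list of anchor-text strings for a single document.
    limit: maximum number of anchor texts kept (1,000 in the description).
    seed: assumption, used here only to make the sketch reproducible.
    """
    if len(anchors) > limit:
        # Assumption: uniform sampling without replacement.
        anchors = random.Random(seed).sample(anchors, limit)
    return {
        # Assumption: anchor texts are joined with a single space.
        "text": " ".join(anchors),
        "anchors": list(anchors),
    }


short = build_anchor_fields(["home page", "click here"])
capped = build_anchor_fields([f"anchor {i}" for i in range(2500)])
```

Documents at or below the limit keep all of their anchor texts, which according to the description is the case for 97% of the documents.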
MSMARCO (passage, version 2)
<p> Version 2 of the MS MARCO passage ranking dataset. The corpus contains 138M passages, which can be linked up with documents in <a class=”ds-ref”>msmarco-document-v2</a>. </p> <ul> <li>Version 1 of dataset: <a class=”ds-ref”>msmarco-passage</a></li> <li>Documents: Text extracted from web pages</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul> <p> Change Log </p> <ul> <li> On July 21, 2021, the task organizers <a href=”https://github.com/microsoft/msmarco/commit/41b3a684ed8ebd4e753250c3687547a77c62e7dd”> updated the train, dev1, and dev2 qrels</a> to remove duplicate entries from the files. This should not have changed results from evaluation tools, but may result in non-repeatable results if these files were used in another process (e.g., model training). The original qrels file for <a class=”ds-ref”>msmarco-passage-v2/train</a> can be found <a href=”https://mirror.ir-datasets.com/abf1fd024b6aca203364d2138c241a6d”>here</a> to aid in result repeatability. </li> </ul>
-
Dataset irds.msmarco-passage-v2.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 2 of the MS MARCO passage ranking dataset. The corpus contains 138M passages, which can be linked up with documents in <a class=”ds-ref”>msmarco-document-v2</a>. </p> <ul> <li>Version 1 of dataset: <a class=”ds-ref”>msmarco-passage</a></li> <li>Documents: Text extracted from web pages</li> <li>Queries: Natural language questions (from query log)</li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> </ul> <p> Change Log </p> <ul> <li> On July 21, 2021, the task organizers <a href=”https://github.com/microsoft/msmarco/commit/41b3a684ed8ebd4e753250c3687547a77c62e7dd”> updated the train, dev1, and dev2 qrels</a> to remove duplicate entries from the files. This should not have changed results from evaluation tools, but may result in non-repeatable results if these files were used in another process (e.g., model training). The original qrels file for <a class=”ds-ref”>msmarco-passage-v2/train</a> can be found <a href=”https://mirror.ir-datasets.com/abf1fd024b6aca203364d2138c241a6d”>here</a> to aid in result repeatability. </li> </ul>
-
Dataset irds.msmarco-passage-v2.dev1.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev1 set with 3,903 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.dev1.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official dev1 set with 3,903 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.dev1.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev1 set with 3,903 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.dev1
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev1 set with 3,903 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.dev2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev2 set with 4,281 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.dev2.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official dev2 set with 4,281 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.dev2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev2 set with 4,281 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.dev2
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev2 set with 4,281 queries. </p> <p> Note that the qrels in this dataset are not directly human-assessed; labels from <a class=”ds-ref”>msmarco-passage</a> are mapped to documents via URL, these documents are re-passaged, and then the best approximate match is identified. </p>
-
Dataset irds.msmarco-passage-v2.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set with 277,144 queries. </p>
-
Dataset irds.msmarco-passage-v2.train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official train set with 277,144 queries. </p>
-
Dataset irds.msmarco-passage-v2.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set with 277,144 queries. </p>
-
Dataset irds.msmarco-passage-v2.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set with 277,144 queries. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021
datamaestro_text.datasets.irds.data.Adhoc
<p> Official topics for the TREC Deep Learning (DL) 2021 shared task. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2021</a>, but filtered down to the 53 queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2021</a>, but filtered down to the 53 queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2021</a>, but filtered down to the 53 queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2021.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2021</a>, but filtered down to the 53 queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that the officially-released qrels <i>include</i> relevance labels propagated to duplicate passages, while results presented in the notebook papers remove duplicate documents. This means that the results are not directly comparable, and extra care should be taken when making comparisons among systems to ensure that they were evaluated in the same settings. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that the officially-released qrels <i>include</i> relevance labels propagated to duplicate passages, while results presented in the notebook papers remove duplicate documents. This means that the results are not directly comparable, and extra care should be taken when making comparisons among systems to ensure that they were evaluated in the same settings. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that the officially-released qrels <i>include</i> relevance labels propagated to duplicate passages, while results presented in the notebook papers remove duplicate documents. This means that the results are not directly comparable, and extra care should be taken when making comparisons among systems to ensure that they were evaluated in the same settings. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022
datamaestro_text.datasets.irds.data.Adhoc
<p> Official topics for the TREC Deep Learning (DL) 2022 shared task. </p> <p> Note that the officially-released qrels <i>include</i> relevance labels propagated to duplicate passages, while results presented in the notebook papers remove duplicate documents. This means that the results are not directly comparable, and extra care should be taken when making comparisons among systems to ensure that they were evaluated in the same settings. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2022.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> <a class=”ds-ref”>msmarco-passage-v2/trec-dl-2022</a>, but filtered down to only the queries with qrels. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2023.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official topics for the TREC Deep Learning (DL) 2023 shared task. </p>
-
Dataset irds.msmarco-passage-v2.trec-dl-2023.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official topics for the TREC Deep Learning (DL) 2023 shared task. </p>
msmarco-passage-v2/dedup
-
Dataset irds.msmarco-passage-v2.dedup.documents
MSMARCO (QnA)
<p> The MS MARCO Question Answering dataset. This is the source collection of <a class=”ds-ref”>msmarco-passage</a> and <a class=”ds-ref”>msmarco-document</a>. </p> <div class=”warn”> It is prohibited to use information from this dataset for submissions to the MS MARCO passage and document leaderboards or the TREC DL shared task. </div> <p> Query IDs in this collection align with those found in <a class=”ds-ref”>msmarco-passage</a> and <a class=”ds-ref”>msmarco-document</a>. The collection does not provide doc_ids, so these are assigned in the following format: <code>[msmarco_passage_id]-[url_seq]</code>, where <code>[msmarco_passage_id]</code> is the document from <a class=”ds-ref”>msmarco-passage</a> that has matching contents and <code>[url_seq]</code> is assigned sequentially for each URL encountered. In other words, all documents with the same prefix have the same text; they only differ in the originating document. </p> <p> Doc <code>msmarco_passage_id</code> fields are assigned by matching passage contents in <a class=”ds-ref”>msmarco-passage</a>, and this field is provided for every document. Doc <code>msmarco_document_id</code> fields are assigned by matching the URL to the one found in <a class=”ds-ref”>msmarco-document</a>. Due to how <a class=”ds-ref”>msmarco-document</a> was constructed, there is not necessarily a match (value will be <code class=”kwd”>None</code> if no match). </p> <ul> <li>Documents: Short passages (from web)</li> <li>Queries: Natural language questions (from query log), including type and natural-language answers.</li> <li><a href=”https://microsoft.github.io/msmarco/#qna”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li><a href=”https://github.com/microsoft/MSMARCO-Question-Answering”>More information</a></li> </ul>
-
Dataset irds.msmarco-qna.documents
datamaestro_text.datasets.irds.data.Documents
<p> The MS MARCO Question Answering dataset. This is the source collection of <a class=”ds-ref”>msmarco-passage</a> and <a class=”ds-ref”>msmarco-document</a>. </p> <div class=”warn”> It is prohibited to use information from this dataset for submissions to the MS MARCO passage and document leaderboards or the TREC DL shared task. </div> <p> Query IDs in this collection align with those found in <a class=”ds-ref”>msmarco-passage</a> and <a class=”ds-ref”>msmarco-document</a>. The collection does not provide doc_ids, so these are assigned in the following format: <code>[msmarco_passage_id]-[url_seq]</code>, where <code>[msmarco_passage_id]</code> is the document from <a class=”ds-ref”>msmarco-passage</a> that has matching contents and <code>[url_seq]</code> is assigned sequentially for each URL encountered. In other words, all documents with the same prefix have the same text; they only differ in the originating document. </p> <p> Doc <code>msmarco_passage_id</code> fields are assigned by matching passage contents in <a class=”ds-ref”>msmarco-passage</a>, and this field is provided for every document. Doc <code>msmarco_document_id</code> fields are assigned by matching the URL to the one found in <a class=”ds-ref”>msmarco-document</a>. Due to how <a class=”ds-ref”>msmarco-document</a> was constructed, there is not necessarily a match (value will be <code class=”kwd”>None</code> if no match). </p> <ul> <li>Documents: Short passages (from web)</li> <li>Queries: Natural language questions (from query log), including type and natural-language answers.</li> <li><a href=”https://microsoft.github.io/msmarco/#qna”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li><a href=”https://github.com/microsoft/MSMARCO-Question-Answering”>More information</a></li> </ul>
-
Dataset irds.msmarco-qna.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.dev.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official dev set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.eval.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official eval set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.eval.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official eval set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official train set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
-
Dataset irds.msmarco-qna.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set. </p> <p> The scoreddocs provide the roughly 10 passages presented to the user for annotation, where the score indicates the order in which they were presented. </p>
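The MSMARCO (QnA) description gives the doc_id format as `[msmarco_passage_id]-[url_seq]`, with all documents sharing a prefix having the same text. A small sketch of how such IDs could be split and grouped is shown below. The helper names and the example IDs are made up for illustration; the sketch assumes the `[url_seq]` part itself contains no hyphen.

```python
from collections import defaultdict


def split_qna_doc_id(doc_id):
    """Split a doc_id of the form [msmarco_passage_id]-[url_seq].

    Splits on the last hyphen, assuming url_seq contains no hyphen.
    Both parts are returned as strings.
    """
    passage_id, url_seq = doc_id.rsplit("-", 1)
    return passage_id, url_seq


def group_by_passage(doc_ids):
    """Group doc_ids that share the same msmarco_passage_id prefix."""
    groups = defaultdict(list)
    for doc_id in doc_ids:
        passage_id, _url_seq = split_qna_doc_id(doc_id)
        groups[passage_id].append(doc_id)
    return dict(groups)


# Made-up IDs, for illustration only.
groups = group_by_passage(["123-0", "123-1", "456-0"])
```

Each group then collects the documents that have identical text and differ only in their originating document.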
nano-beir/arguana
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.arguana.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.arguana.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.arguana.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.arguana
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the ArguAna Counterargs dataset, for argument retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/P18-1023.pdf”>Dataset paper</a></li> <li><a href=”http://argumentation.bplaced.net/arguana/data”>Dataset website</a></li> </ul>
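Judging from the entries in this list, an ir-datasets identifier such as `nano-beir/arguana` corresponds to the datamaestro identifier `irds.nano-beir.arguana`, with `.documents`, `.queries`, `.qrels`, or `.scoreddocs` appended for individual components where they are listed. The helper below sketches that naming pattern. It is a hypothetical convenience function inferred from the listing, not part of datamaestro or ir-datasets, and it does not check that the resulting dataset actually exists.

```python
def to_datamaestro_id(irds_id, component=None):
    """Build a datamaestro dataset ID from an ir-datasets ID.

    irds_id: ir-datasets identifier using "/" separators.
    component: optional suffix such as "documents", "queries", "qrels",
        or "scoreddocs"; not every dataset lists every component.
    This is a hypothetical helper based on the naming seen in this list.
    """
    parts = ["irds", *irds_id.split("/")]
    if component is not None:
        parts.append(component)
    return ".".join(parts)


full_id = to_datamaestro_id("nano-beir/arguana")
qrels_id = to_datamaestro_id("nano-beir/arguana", "qrels")
```

The resulting string is the kind of identifier passed to `prepare_dataset` in the usage example at the top of this page.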
nano-beir/climate-fever
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.climate-fever.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.climate-fever.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.climate-fever.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.climate-fever
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate. </p> <ul> <li><a href=”https://arxiv.org/pdf/2012.00614.pdf”>Dataset paper</a></li> <li><a href=”https://www.sustainablefinance.uzh.ch/en/research/climate-fever.html”>Dataset website</a></li> </ul>
nano-beir/dbpedia-entity
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.dbpedia-entity.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.dbpedia-entity.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.dbpedia-entity.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.dbpedia-entity
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the DBPedia-Entity-v2 dataset for entity retrieval. </p> <ul> <li><a href=”http://hasibi.com/files/sigir2017-dbpedia_entity.pdf”>Dataset paper</a></li> <li><a href=”https://github.com/iai-group/DBpedia-Entity”>Dataset website</a></li> </ul>
nano-beir/fever
<p> A version of the FEVER dataset for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.fever.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the FEVER dataset for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.fever.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the FEVER dataset for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.fever.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the FEVER dataset for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.fever
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the FEVER dataset for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/N18-1074.pdf”>Dataset paper</a></li> <li><a href=”https://fever.ai/resources.html”>Dataset website</a></li> </ul>
nano-beir/fiqa
<p> A version of the FIQA-2018 dataset (financial opinion question answering). </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
-
Dataset irds.nano-beir.fiqa.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the FIQA-2018 dataset (financial opinion question answering). </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
-
Dataset irds.nano-beir.fiqa.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the FIQA-2018 dataset (financial opinion question answering). </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
-
Dataset irds.nano-beir.fiqa.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the FIQA-2018 dataset (financial opinion question answering). </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
-
Dataset irds.nano-beir.fiqa
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the FIQA-2018 dataset (financial opinion question answering). </p> <ul> <li><a href=”https://dl.acm.org/doi/10.1145/3184558.3192301”>Dataset paper</a></li> <li><a href=”https://sites.google.com/view/fiqa/home”>Dataset site</a></li> </ul>
nano-beir/hotpotqa
<p> A version of the Hotpot QA dataset for multi-hop question answering. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.hotpotqa.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the Hotpot QA dataset for multi-hop question answering. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.hotpotqa.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the Hotpot QA dataset for multi-hop question answering. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.hotpotqa.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the Hotpot QA dataset for multi-hop question answering. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.hotpotqa
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the Hotpot QA dataset for multi-hop question answering. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/D18-1259”>Dataset paper</a></li> <li><a href=”https://github.com/hotpotqa/hotpot”>Dataset website</a></li> </ul>
nano-beir/msmarco
<p> A version of the MS MARCO passage ranking dataset. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
-
Dataset irds.nano-beir.msmarco.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the MS MARCO passage ranking dataset. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
-
Dataset irds.nano-beir.msmarco.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the MS MARCO passage ranking dataset. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
-
Dataset irds.nano-beir.msmarco.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the MS MARCO passage ranking dataset. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
-
Dataset irds.nano-beir.msmarco
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the MS MARCO passage ranking dataset. </p> <p> Note that this version differs from <a class=”ds-ref”>msmarco-passage</a>, in that it does not correct the encoding problems in the source documents. </p> <ul> <li><a href=”https://microsoft.github.io/msmarco/#ranking”>Leaderboard</a></li> <li><a href=”https://arxiv.org/abs/1611.09268”>Dataset Paper</a></li> <li>See also: <a class=”ds-ref”>msmarco-passage</a></li> </ul>
nano-beir/nfcorpus
<p> A version of the NF Corpus (Nutrition Facts). </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
-
Dataset irds.nano-beir.nfcorpus.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the NF Corpus (Nutrition Facts). </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
-
Dataset irds.nano-beir.nfcorpus.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the NF Corpus (Nutrition Facts). </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
-
Dataset irds.nano-beir.nfcorpus.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the NF Corpus (Nutrition Facts). </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
-
Dataset irds.nano-beir.nfcorpus
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the NF Corpus (Nutrition Facts). </p> <p> Data pre-processing may be different than what is done in <a class=”ds-ref”>nfcorpus</a>. </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>nfcorpus</a></li> </ul>
nano-beir/nq
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs both from what is done in <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the Beir paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.nano-beir.nq.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs both from what is done in <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the Beir paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.nano-beir.nq.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs both from what is done in <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the Beir paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.nano-beir.nq.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs both from what is done in <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the Beir paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
-
Dataset irds.nano-beir.nq
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the Natural Questions dev dataset. </p> <p> Data pre-processing differs both from what is done in <a class=”ds-ref”>natural-questions</a> and <a class=”ds-ref”>dpr-w100/natural-questions</a>, especially with respect to the document collection and filtering conducted on the queries. See the Beir paper for details. </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>natural-questions</a>, <a class=”ds-ref”>dpr-w100/natural-questions</a></li> </ul>
nano-beir/quora
<p> A version of the Quora duplicate question detection dataset (QQP). </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.quora.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the Quora duplicate question detection dataset (QQP). </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.quora.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the Quora duplicate question detection dataset (QQP). </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.quora.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the Quora duplicate question detection dataset (QQP). </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.quora
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the Quora duplicate question detection dataset (QQP). </p> <ul> <li><a href=”https://www.kaggle.com/c/quora-question-pairs”>Dataset website</a></li> </ul>
nano-beir/scidocs
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scidocs.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scidocs.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scidocs.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scidocs
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the SciDocs dataset, used for citation retrieval. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.acl-main.207.pdf”>Dataset paper</a></li> <li><a href=”https://allenai.org/data/scidocs”>Dataset website</a></li> </ul>
nano-beir/scifact
<p> A version of the SciFact dataset, for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scifact.documents
datamaestro_text.datasets.irds.data.Documents
<p> A version of the SciFact dataset, for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scifact.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of the SciFact dataset, for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scifact.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of the SciFact dataset, for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.scifact
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of the SciFact dataset, for fact verification. </p> <ul> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset paper</a></li> <li><a href=”https://www.aclweb.org/anthology/2020.emnlp-main.609.pdf”>Dataset website</a></li> </ul>
nano-beir/webis-touche2020
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.webis-touche2020.documents
datamaestro_text.datasets.irds.data.Documents
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.webis-touche2020.queries
datamaestro_text.datasets.irds.data.Topics
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.webis-touche2020.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
-
Dataset irds.nano-beir.webis-touche2020
datamaestro_text.datasets.irds.data.Adhoc
<p> Original version of the Touché-2020 dataset, for argument retrieval. </p> <div class=”warn”> Consider using <a class=”ds-ref”>beir/webis-touche2020/v2</a> instead; it uses an updated, more complete version of the qrels. </div> <ul> <li><a href=”https://link.springer.com/chapter/10.1007%2F978-3-030-58219-7_26”>Dataset paper</a></li> <li><a href=”https://webis.de/events/touche-20/”>Dataset website</a></li> </ul>
neumarco/fa
<p>The <a class=”ds-ref”>msmarco-passage</a> corpus, translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.documents
datamaestro_text.datasets.irds.data.Documents
<p>The <a class=”ds-ref”>msmarco-passage</a> corpus, translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.judged
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train.docpairs
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train.judged.docpairs
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Persian (Farsi).</p>
-
Dataset irds.neumarco.fa.train.judged
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Persian (Farsi).</p>
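The identifiers in this listing appear to follow a simple convention: the ir-datasets ID (for example neumarco/fa/dev/judged) has its slashes replaced by dots and is prefixed with irds., optionally followed by a component suffix such as .queries or .qrels. The sketch below illustrates that mapping. It is inferred from the entries on this page only; to_datamaestro_id and COMPONENTS are hypothetical names, not part of datamaestro-text.

```python
from typing import Optional

# Component suffixes that appear in this listing (assumed, not exhaustive).
COMPONENTS = {"documents", "queries", "qrels", "docpairs", "scoreddocs"}


def to_datamaestro_id(irds_id: str, component: Optional[str] = None) -> str:
    """Map an ir-datasets ID to the naming pattern used in this listing.

    Hypothetical helper: the convention is inferred from the entries on
    this page, not taken from the datamaestro-text source code.
    """
    if component is not None and component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    # "neumarco/fa/dev" -> ["irds", "neumarco", "fa", "dev"]
    parts = ["irds", *irds_id.strip("/").split("/")]
    if component is not None:
        parts.append(component)
    return ".".join(parts)


print(to_datamaestro_id("neumarco/fa/dev/judged", "queries"))
# irds.neumarco.fa.dev.judged.queries
```

Not every subset exposes every component (docpairs, for instance, is only listed for the train subsets), so a generated ID should be checked against this listing before use.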
neumarco/ru
<p>The <a class=”ds-ref”>msmarco-passage</a> corpus, translated to Russian.</p>
-
Dataset irds.neumarco.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p>The <a class=”ds-ref”>msmarco-passage</a> corpus, translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.judged
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train.docpairs
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train.judged.docpairs
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Russian.</p>
-
Dataset irds.neumarco.ru.train.judged
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Russian.</p>
neumarco/zh
<p>The <a class=”ds-ref”>msmarco-passage</a> corpus, translated to Chinese.</p>
-
Dataset irds.neumarco.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p>The <a class=”ds-ref”>msmarco-passage</a> corpus, translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.judged
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/judged</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.small.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.small.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.dev.small
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/dev/small</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train.docpairs
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/train</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train.judged.docpairs
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Chinese.</p>
-
Dataset irds.neumarco.zh.train.judged
datamaestro_text.datasets.irds.data.Adhoc
<p>A version of <a class=”ds-ref”>msmarco-passage/train/judged</a>, with the corpus translated to Chinese.</p>
NFCorpus (NutritionFacts)
<p> “NFCorpus is a full-text English retrieval data set for Medical Information Retrieval. It contains a total of 3,244 natural language queries (written in non-technical English, harvested from the NutritionFacts.org site) with 169,756 automatically extracted relevance judgments for 9,964 medical documents (written in a complex terminology-heavy language), mostly from PubMed.” </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> </ul>
-
Dataset irds.nfcorpus.documents
datamaestro_text.datasets.irds.data.Documents
<p> “NFCorpus is a full-text English retrieval data set for Medical Information Retrieval. It contains a total of 3,244 natural language queries (written in non-technical English, harvested from the NutritionFacts.org site) with 169,756 automatically extracted relevance judgments for 9,964 medical documents (written in a complex terminology-heavy language), mostly from PubMed.” </p> <ul> <li><a href=”https://www.cl.uni-heidelberg.de/statnlpgroup/nfcorpus/”>Dataset website</a></li> <li><a href=”https://link.springer.com/chapter/10.1007/978-3-319-30671-1_58”>Dataset paper</a></li> </ul>
-
Dataset irds.nfcorpus.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.dev.nontopic.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.dev.nontopic.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.dev.nontopic
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.dev.video.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.dev.video.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.dev.video
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.test.nontopic.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.test.nontopic.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.test.nontopic
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.test.video.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official test set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.test.video.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official test set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.test.video
datamaestro_text.datasets.irds.data.Adhoc
<p> Official test set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set. Queries include both the title and a combined “all” text field (titles, descriptions, topics, transcripts and comments). </p>
-
Dataset irds.nfcorpus.train.nontopic.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.train.nontopic.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.train.nontopic
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set, filtered to exclude queries from topic pages. </p>
-
Dataset irds.nfcorpus.train.video.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.train.video.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set, filtered to only include queries from video pages. </p>
-
Dataset irds.nfcorpus.train.video
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set, filtered to only include queries from video pages. </p>
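As a rough illustration of how one of the subsets above might be inspected, the sketch below returns the first few documents of a dataset. It assumes that the object returned by prepare_dataset exposes documents.iter_documents(), as in the usage example at the top of this page; whether that holds for every Adhoc subset is an assumption. preview_documents and its loader parameter are hypothetical names introduced here so the helper can be tried with a stub instead of a real download.

```python
from itertools import islice
from typing import Any, Callable, List, Optional


def preview_documents(
    dataset_id: str,
    limit: int = 3,
    loader: Optional[Callable[[str], Any]] = None,
) -> List[Any]:
    """Return the first `limit` documents of a dataset.

    Hypothetical helper. `loader` defaults to datamaestro's
    `prepare_dataset`; the loaded object is assumed to expose
    `documents.iter_documents()`.
    """
    if loader is None:
        # Imported lazily so the helper can be exercised with a stub loader.
        from datamaestro import prepare_dataset as loader
    dataset = loader(dataset_id)
    # islice stops after `limit` items without reading the whole collection.
    return list(islice(dataset.documents.iter_documents(), limit))


# Example call (may download data on first use):
#   preview_documents("irds.nfcorpus.dev", limit=3)
```

Passing a custom loader keeps the helper independent of network access, which can be convenient in unit tests.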
Natural Questions
<p> Google Natural Questions is a Q&A dataset containing long, short, and Yes/No answers from Wikipedia. <kbd>ir_datasets</kbd> frames this around an ad-hoc ranking setting by building a collection of all long answer candidate passages. However, short and Yes/No annotations are also available in the <kbd>qrels</kbd>, as are the passages presented to the annotators (via <kbd>scoreddocs</kbd>). </p> <p> Importantly, the document collection does not consist of all Wikipedia passages, but instead a union of the candidate passages presented to the annotators (akin to MS MARCO). <a class=”ds-ref”>dpr-w100/natural-questions/train</a> and <a class=”ds-ref”>dpr-w100/natural-questions/dev</a> contain a filtered set of the questions in this dataset and a full Wikipedia dump (which is a more realistic retrieval setting). </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>dpr-w100</a></li> </ul>
-
Dataset irds.natural-questions.documents
datamaestro_text.datasets.irds.data.Documents
<p> Google Natural Questions is a Q&A dataset containing long, short, and Yes/No answers from Wikipedia. <kbd>ir_datasets</kbd> frames this around an ad-hoc ranking setting by building a collection of all long answer candidate passages. However, short and Yes/No annotations are also available in the <kbd>qrels</kbd>, as are the passages presented to the annotators (via <kbd>scoreddocs</kbd>). </p> <p> Importantly, the document collection does not consist of all Wikipedia passages, but instead a union of the candidate passages presented to the annotators (akin to MS MARCO). <a class=”ds-ref”>dpr-w100/natural-questions/train</a> and <a class=”ds-ref”>dpr-w100/natural-questions/dev</a> contain a filtered set of the questions in this dataset and a full Wikipedia dump (which is a more realistic retrieval setting). </p> <ul> <li><a href=”https://ai.google.com/research/NaturalQuestions”>Dataset website</a></li> <li><a href=”https://storage.googleapis.com/pub-tools-public-publication-data/pdf/1f7b46b5378d757553d3e92ead36bda2e4254244.pdf”>Dataset paper</a></li> <li>See also: <a class=”ds-ref”>dpr-w100</a></li> </ul>
-
Dataset irds.natural-questions.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official dev set. </p>
-
Dataset irds.natural-questions.dev.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official dev set. </p>
-
Dataset irds.natural-questions.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official dev set. </p>
-
Dataset irds.natural-questions.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Official dev set. </p>
-
Dataset irds.natural-questions.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official train set. </p>
-
Dataset irds.natural-questions.train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official train set. </p>
-
Dataset irds.natural-questions.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official train set. </p>
-
Dataset irds.natural-questions.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official train set. </p>
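The following is a minimal sketch of how the first few passages of a collection such as irds.natural-questions.documents might be inspected. It assumes that prepare_dataset returns an object exposing iter_documents(), as in the usage example at the top of this page. The helper name first_documents and its loader parameter are hypothetical and exist only so the function can be tried without downloading any data.

```python
from itertools import islice


def first_documents(dataset_id, n=3, loader=None):
    """Return the first n documents of a datamaestro document collection."""
    if loader is None:
        # Imported lazily so the helper can be exercised without datamaestro installed
        from datamaestro import prepare_dataset as loader
    documents = loader(dataset_id)
    # islice stops after n items, so the whole collection is never materialized
    return list(islice(documents.iter_documents(), n))


# With datamaestro-text and ir-datasets installed (this triggers a download):
# first_documents("irds.natural-questions.documents", n=3)
```

Passing a custom loader is only a convenience for testing; in normal use the default prepare_dataset is picked up.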
NYT
<p> The New York Times Annotated Corpus. Consists of articles published between 1987 and 2007. It is used in TREC Core 2017 and it is also useful for transferring relevance signals in cases where training data is in short supply. </p> <p> Uses data from <a href=”https://catalog.ldc.upenn.edu/LDC2008T19”>LDC2008T19</a>. The source collection can be downloaded from the LDC. </p>
-
Dataset irds.nyt.documents
datamaestro_text.datasets.irds.data.Documents
<p> The New York Times Annotated Corpus. Consists of articles published between 1987 and 2007. It is used in TREC Core 2017 and it is also useful for transferring relevance signals in cases where training data is in short supply. </p> <p> Uses data from <a href=”https://catalog.ldc.upenn.edu/LDC2008T19”>LDC2008T19</a>. The source collection can be downloaded from the LDC. </p>
-
Dataset irds.nyt.trec-core-2017.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Common Core 2017 benchmark. </p> <p> Note that this dataset only contains the 50 queries assessed by NIST. </p> <ul> <li>Queries: TREC-style (keyword, description, narrative)</li> <li>Relevance: Deeply-annotated</li> <li><a href=”https://github.com/trec-core/2017”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-CC.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.nyt.trec-core-2017.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Common Core 2017 benchmark. </p> <p> Note that this dataset only contains the 50 queries assessed by NIST. </p> <ul> <li>Queries: TREC-style (keyword, description, narrative)</li> <li>Relevance: Deeply-annotated</li> <li><a href=”https://github.com/trec-core/2017”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-CC.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.nyt.trec-core-2017
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Common Core 2017 benchmark. </p> <p> Note that this dataset only contains the 50 queries assessed by NIST. </p> <ul> <li>Queries: TREC-style (keyword, description, narrative)</li> <li>Relevance: Deeply-annotated</li> <li><a href=”https://github.com/trec-core/2017”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec26/papers/Overview-CC.pdf”>Shared Task Paper</a></li> </ul>
-
Dataset irds.nyt.wksup.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set (without held-out <a class=”ds-ref”>nyt/wksup/valid</a>) for transferring relevance signals from NYT corpus. </p>
-
Dataset irds.nyt.wksup.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set (without held-out <a class=”ds-ref”>nyt/wksup/valid</a>) for transferring relevance signals from NYT corpus. </p>
-
Dataset irds.nyt.wksup
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set (without held-out <a class=”ds-ref”>nyt/wksup/valid</a>) for transferring relevance signals from NYT corpus. </p>
-
Dataset irds.nyt.wksup.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set (without held-out <a class=”ds-ref”>nyt/wksup/valid</a>) for transferring relevance signals from NYT corpus. </p>
-
Dataset irds.nyt.wksup.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set (without held-out <a class=”ds-ref”>nyt/wksup/valid</a>) for transferring relevance signals from NYT corpus. </p>
-
Dataset irds.nyt.wksup.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set (without held-out <a class=”ds-ref”>nyt/wksup/valid</a>) for transferring relevance signals from NYT corpus. </p>
-
Dataset irds.nyt.wksup.valid.queries
datamaestro_text.datasets.irds.data.Topics
<p> Held-out validation set for transferring relevance signals from NYT corpus (see <a class=”ds-ref”>nyt/wksup/train</a>). </p>
-
Dataset irds.nyt.wksup.valid.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Held-out validation set for transferring relevance signals from NYT corpus (see <a class=”ds-ref”>nyt/wksup/train</a>). </p>
-
Dataset irds.nyt.wksup.valid
datamaestro_text.datasets.irds.data.Adhoc
<p> Held-out validation set for transferring relevance signals from NYT corpus (see <a class=”ds-ref”>nyt/wksup/train</a>). </p>
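The dataset names on this page appear to follow a simple pattern: the ir_datasets ID (for example nyt/wksup/valid) with slashes replaced by dots and prefixed with irds, optionally followed by a component such as documents, queries, or qrels. The sketch below illustrates that pattern. It is inferred from the names listed here, not taken from the library code, and the function name to_datamaestro_id is hypothetical.

```python
def to_datamaestro_id(irds_id, component=None):
    """Build a datamaestro-style ID from an ir_datasets ID.

    Assumption: the naming pattern is inferred from the listing on this page.
    """
    # "nyt/wksup/valid" -> ["irds", "nyt", "wksup", "valid"]
    parts = ["irds"] + irds_id.strip("/").split("/")
    if component is not None:
        # Optional sub-dataset, e.g. "documents", "queries" or "qrels"
        parts.append(component)
    return ".".join(parts)


print(to_datamaestro_id("nyt/wksup/valid"))
print(to_datamaestro_id("nyt/trec-core-2017", "qrels"))
```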
pmc/v1
<p> Subset of PMC articles used for the TREC 2014 and 2015 tasks (v1). Includes titles, abstracts, and full text. Collected from the open access segment on January 21, 2014. </p> <ul> <li><a href=”http://www.trec-cds.org/2014.html#documents”>Information on documents</a></li> </ul>
-
Dataset irds.pmc.v1.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of PMC articles used for the TREC 2014 and 2015 tasks (v1). Includes titles, abstracts, and full text. Collected from the open access segment on January 21, 2014. </p> <ul> <li><a href=”http://www.trec-cds.org/2014.html#documents”>Information on documents</a></li> </ul>
-
Dataset irds.pmc.v1.trec-cds-2014.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Clinical Decision Support (CDS) track from 2014. </p> <ul> <li><a href=”http://www.trec-cds.org/2014.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec23/papers/overview-clinical.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.pmc.v1.trec-cds-2014.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Clinical Decision Support (CDS) track from 2014. </p> <ul> <li><a href=”http://www.trec-cds.org/2014.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec23/papers/overview-clinical.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.pmc.v1.trec-cds-2014
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Clinical Decision Support (CDS) track from 2014. </p> <ul> <li><a href=”http://www.trec-cds.org/2014.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec23/papers/overview-clinical.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.pmc.v1.trec-cds-2015.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Clinical Decision Support (CDS) track from 2015. </p> <ul> <li><a href=”http://www.trec-cds.org/2015.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec24/papers/Overview-CL.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.pmc.v1.trec-cds-2015.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Clinical Decision Support (CDS) track from 2015. </p> <ul> <li><a href=”http://www.trec-cds.org/2015.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec24/papers/Overview-CL.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.pmc.v1.trec-cds-2015
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Clinical Decision Support (CDS) track from 2015. </p> <ul> <li><a href=”http://www.trec-cds.org/2015.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec24/papers/Overview-CL.pdf”>Task Overview Paper</a></li> </ul>
pmc/v2
<p> Subset of PMC articles used for the TREC 2016 task (v2). Includes titles, abstracts, and full text. Collected from the open access segment on March 28, 2016. </p> <ul> <li><a href=”http://www.trec-cds.org/2016.html#documents”>Information on documents</a></li> </ul>
-
Dataset irds.pmc.v2.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of PMC articles used for the TREC 2016 task (v2). Includes titles, abstracts, and full text. Collected from the open access segment on March 28, 2016. </p> <ul> <li><a href=”http://www.trec-cds.org/2016.html#documents”>Information on documents</a></li> </ul>
-
Dataset irds.pmc.v2.trec-cds-2016.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Clinical Decision Support (CDS) track from 2016. </p> <ul> <li><a href=”http://www.trec-cds.org/2016.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec25/papers/Overview-CL.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.pmc.v2.trec-cds-2016.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Clinical Decision Support (CDS) track from 2016. </p> <ul> <li><a href=”http://www.trec-cds.org/2016.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec25/papers/Overview-CL.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.pmc.v2.trec-cds-2016
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Clinical Decision Support (CDS) track from 2016. </p> <ul> <li><a href=”http://www.trec-cds.org/2016.html”>Shared task site</a></li> <li><a href=”https://trec.nist.gov/pubs/trec25/papers/Overview-CL.pdf”>Task Overview Paper</a></li> </ul>
Touché Image Search
<p> Corpus version 2022-06-13 with 23 841 images. It was released on June 13, 2022 on <a href=”https://zenodo.org/record/3734893”>Zenodo</a>. </p> <p> This collection is licensed under the <a href=”https://creativecommons.org/licenses/by/4.0/”>Creative Commons Attribution 4.0 International</a> license. Individual rights to the content still apply. </p> <ul> <li><a href=”https://zenodo.org/record/6873575”>Zenodo</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/image-retrieval-for-arguments.html”>Touché 2022 task 3 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Touché 2022 lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.touche-image.2022-06-13.documents
datamaestro_text.datasets.irds.data.Documents
<p> Corpus version 2022-06-13 with 23 841 images. It was released on June 13, 2022 on <a href=”https://zenodo.org/record/3734893”>Zenodo</a>. </p> <p> This collection is licensed under the <a href=”https://creativecommons.org/licenses/by/4.0/”>Creative Commons Attribution 4.0 International</a> license. Individual rights to the content still apply. </p> <ul> <li><a href=”https://zenodo.org/record/6873575”>Zenodo</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/image-retrieval-for-arguments.html”>Touché 2022 task 3 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Touché 2022 lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.touche-image.2022-06-13.touche-2022-task-3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a controversial topic, the task is to retrieve images (from <a class=”ds-ref”>touche-image/2022-06-13</a>) for each stance (pro/con) that show support for that stance. </p> <p> Systems are evaluated on Touché topics 1-50 by the ratio of images among the 20 retrieved images for each topic (10 images for each stance) that meet all three criteria: relevant to the topic, argumentative, and showing the associated stance. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/image-retrieval-for-arguments.html”>Task 3 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.touche-image.2022-06-13.touche-2022-task-3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a controversial topic, the task is to retrieve images (from <a class=”ds-ref”>touche-image/2022-06-13</a>) for each stance (pro/con) that show support for that stance. </p> <p> Systems are evaluated on Touché topics 1-50 by the ratio of images among the 20 retrieved images for each topic (10 images for each stance) that meet all three criteria: relevant to the topic, argumentative, and showing the associated stance. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/image-retrieval-for-arguments.html”>Task 3 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.touche-image.2022-06-13.touche-2022-task-3
datamaestro_text.datasets.irds.data.Adhoc
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a controversial topic, the task is to retrieve images (from <a class=”ds-ref”>touche-image/2022-06-13</a>) for each stance (pro/con) that show support for that stance. </p> <p> Systems are evaluated on Touché topics 1-50 by the ratio of images among the 20 retrieved images for each topic (10 images for each stance) that meet all three criteria: relevant to the topic, argumentative, and showing the associated stance. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/image-retrieval-for-arguments.html”>Task 3 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
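The evaluation measure described for Touché 2022 Task 3 can be sketched as a per-topic ratio. The code below is an illustration of that description, not the official evaluation script: the function name touche_image_score and the tuple layout of the judgments are hypothetical, and the ratio is taken over however many retrieved images are passed in (20 per topic in the task setup).

```python
def touche_image_score(judgments):
    """Fraction of retrieved images meeting all three criteria.

    judgments: list of (on_topic, argumentative, stance_ok) booleans,
    one tuple per retrieved image of a single topic (hypothetical layout).
    """
    if not judgments:
        return 0.0
    # An image counts only if it is on-topic, argumentative AND has the right stance
    hits = sum(
        1
        for on_topic, argumentative, stance_ok in judgments
        if on_topic and argumentative and stance_ok
    )
    return hits / len(judgments)


# 20 retrieved images for one topic, 5 of which satisfy all three criteria
example = (
    [(True, True, True)] * 5
    + [(True, True, False)] * 5
    + [(True, False, False)] * 10
)
print(touche_image_score(example))
```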
Touché 2022 Task 2: Argument Retrieval for Comparative Questions
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a comparative topic and a collection of documents, the task is to retrieve relevant argumentative passages for either compared object or for both and to detect their respective stances with respect to the object they talk about. </p> <p> Documents are judged on their general topical relevance and on rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <p> Additionally, the stance of the retrieved text passages towards the compared objects in the question must be classified. For instance, in the question <i>Who is a better friend, a cat or a dog?</i> the terms <i>cat</i> and <i>dog</i> are the comparison objects. An answer candidate like <i>Cats can be quite affectionate and attentive, and thus are good friends</i> should be classified as pro the <i>cat</i> object, while <i>Cats are less faithful than dogs</i> as supporting the <i>dog</i> object. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-comparative-questions.html”>Task 2 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.clueweb12.touche-2022-task-2.documents
datamaestro_text.datasets.irds.data.Documents
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a comparative topic and a collection of documents, the task is to retrieve relevant argumentative passages for either compared object or for both and to detect their respective stances with respect to the object they talk about. </p> <p> Documents are judged on their general topical relevance and on rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <p> Additionally, the stance of the retrieved text passages towards the compared objects in the question must be classified. For instance, in the question <i>Who is a better friend, a cat or a dog?</i> the terms <i>cat</i> and <i>dog</i> are the comparison objects. An answer candidate like <i>Cats can be quite affectionate and attentive, and thus are good friends</i> should be classified as pro the <i>cat</i> object, while <i>Cats are less faithful than dogs</i> as supporting the <i>dog</i> object. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-comparative-questions.html”>Task 2 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.clueweb12.touche-2022-task-2.queries
datamaestro_text.datasets.irds.data.Topics
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a comparative topic and a collection of documents, the task is to retrieve relevant argumentative passages for either compared object or for both and to detect their respective stances with respect to the object they talk about. </p> <p> Documents are judged on their general topical relevance and on rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <p> Additionally, the stance of the retrieved text passages towards the compared objects in the question must be classified. For instance, in the question <i>Who is a better friend, a cat or a dog?</i> the terms <i>cat</i> and <i>dog</i> are the comparison objects. An answer candidate like <i>Cats can be quite affectionate and attentive, and thus are good friends</i> should be classified as pro the <i>cat</i> object, while <i>Cats are less faithful than dogs</i> as supporting the <i>dog</i> object. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-comparative-questions.html”>Task 2 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.clueweb12.touche-2022-task-2.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a comparative topic and a collection of documents, the task is to retrieve relevant argumentative passages for either compared object or for both and to detect their respective stances with respect to the object they talk about. </p> <p> Documents are judged on their general topical relevance and on rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <p> Additionally, the stance of the retrieved text passages towards the compared objects in the question must be classified. For instance, in the question <i>Who is a better friend, a cat or a dog?</i> the terms <i>cat</i> and <i>dog</i> are the comparison objects. An answer candidate like <i>Cats can be quite affectionate and attentive, and thus are good friends</i> should be classified as pro the <i>cat</i> object, while <i>Cats are less faithful than dogs</i> as supporting the <i>dog</i> object. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-comparative-questions.html”>Task 2 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
-
Dataset irds.clueweb12.touche-2022-task-2
datamaestro_text.datasets.irds.data.Adhoc
<p> Decision-making processes, whether at the societal or the personal level, often reach a point where one side challenges the other with a why-question, which is a prompt to justify some stance based on arguments. Since technologies for argument mining are maturing at a rapid pace, ad-hoc argument retrieval is also becoming a feasible task. Touché 2022 is the third lab on argument retrieval at CLEF 2022 featuring three tasks. </p> <p> Given a comparative topic and a collection of documents, the task is to retrieve relevant argumentative passages for either compared object or for both and to detect their respective stances with respect to the object they talk about. </p> <p> Documents are judged on their general topical relevance and on rhetorical quality, i.e., “well-writtenness” of the document: (1) whether the text has a good style of speech (formal language is preferred over informal), (2) whether the text has a proper sentence structure and is easy to read, (3) whether it includes profanity, has typos, and makes use of other detrimental style choices. </p> <p> Additionally, the stance of the retrieved text passages towards the compared objects in the question must be classified. For instance, in the question <i>Who is a better friend, a cat or a dog?</i> the terms <i>cat</i> and <i>dog</i> are the comparison objects. An answer candidate like <i>Cats can be quite affectionate and attentive, and thus are good friends</i> should be classified as pro the <i>cat</i> object, while <i>Cats are less faithful than dogs</i> as supporting the <i>dog</i> object. </p> <ul> <li><a href=”https://touche.webis.de/clef22/touche22-web/argument-retrieval-for-comparative-questions.html”>Task 2 website</a></li> <li><a href=”https://touche.webis.de/clef22/touche22-web/”>Lab website</a></li> <li><a href=”https://doi.org/10.1007/978-3-030-99739-7_43”>Overview paper</a></li> </ul>
Touché 2022 Task 2: Argument Retrieval for Comparative Questions (Expanded)
<p> Pre-processed version of <a class=”ds-ref”>clueweb12/touche-2022-task-2</a> where each passage has been expanded with queries generated using DocT5Query. </p>
-
Dataset irds.clueweb12.touche-2022-task-2.expanded-doc-t5-query.documents
datamaestro_text.datasets.irds.data.Documents
<p> Pre-processed version of <a class=”ds-ref”>clueweb12/touche-2022-task-2</a> where each passage has been expanded with queries generated using DocT5Query. </p>
-
Dataset irds.clueweb12.touche-2022-task-2.expanded-doc-t5-query.queries
datamaestro_text.datasets.irds.data.Topics
<p> Pre-processed version of <a class=”ds-ref”>clueweb12/touche-2022-task-2</a> where each passage has been expanded with queries generated using DocT5Query. </p>
-
Dataset irds.clueweb12.touche-2022-task-2.expanded-doc-t5-query.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Pre-processed version of <a class=”ds-ref”>clueweb12/touche-2022-task-2</a> where each passage has been expanded with queries generated using DocT5Query. </p>
-
Dataset irds.clueweb12.touche-2022-task-2.expanded-doc-t5-query
datamaestro_text.datasets.irds.data.Adhoc
<p> Pre-processed version of <a class=”ds-ref”>clueweb12/touche-2022-task-2</a> where each passage has been expanded with queries generated using DocT5Query. </p>
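Query-based document expansion, as mentioned for the expanded-doc-t5-query variant, generally means appending model-generated queries to a passage before indexing. The sketch below shows only that concatenation step under stated assumptions: the queries are supplied as plain strings (no DocT5Query model is run here), and the function name expand_passage and the separator are hypothetical. How the ir_datasets version actually stores the expansions is not specified on this page.

```python
def expand_passage(text, generated_queries, separator=" "):
    """Append generated queries to a passage (illustrative sketch only)."""
    # Drop empty or whitespace-only queries before joining
    extra = [q.strip() for q in generated_queries if q.strip()]
    return separator.join([text.strip()] + extra)


passage = "Cats can be quite affectionate and attentive."
queries = ["are cats good pets", "  ", "cat vs dog as a friend"]
print(expand_passage(passage, queries))
```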
TREC Arabic
<p> A collection of news articles in Arabic, used for multi-lingual evaluation in TREC 2001 and TREC 2002. </p> <p> Document collection from <a href=”https://catalog.ldc.upenn.edu/LDC2001T55”>LDC2001T55</a>. </p>
-
Dataset irds.trec-arabic.documents
datamaestro_text.datasets.irds.data.Documents
<p> A collection of news articles in Arabic, used for multi-lingual evaluation in TREC 2001 and TREC 2002. </p> <p> Document collection from <a href=”https://catalog.ldc.upenn.edu/LDC2001T55”>LDC2001T55</a>. </p>
-
Dataset irds.trec-arabic.ar2001.queries
datamaestro_text.datasets.irds.data.Topics
<p> Arabic benchmark from TREC 2001. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec10/papers/clirtrack.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-arabic.ar2001.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Arabic benchmark from TREC 2001. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec10/papers/clirtrack.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-arabic.ar2001
datamaestro_text.datasets.irds.data.Adhoc
<p> Arabic benchmark from TREC 2001. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec10/papers/clirtrack.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-arabic.ar2002.queries
datamaestro_text.datasets.irds.data.Topics
<p> Arabic benchmark from TREC 2002. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/OVERVIEW.gey.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-arabic.ar2002.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Arabic benchmark from TREC 2002. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/OVERVIEW.gey.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-arabic.ar2002
datamaestro_text.datasets.irds.data.Adhoc
<p> Arabic benchmark from TREC 2002. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec11/papers/OVERVIEW.gey.ps.gz”>Task Overview Paper</a></li> </ul>
TREC Mandarin
<p> A collection of news articles in Mandarin in Simplified Chinese, used for multi-lingual evaluation in TREC 5 and TREC 6. </p> <p> Document collection from <a href=”https://catalog.ldc.upenn.edu/LDC2000T52”>LDC2000T52</a>. </p>
-
Dataset irds.trec-mandarin.documents
datamaestro_text.datasets.irds.data.Documents
<p> A collection of news articles in Mandarin in Simplified Chinese, used for multi-lingual evaluation in TREC 5 and TREC 6. </p> <p> Document collection from <a href=”https://catalog.ldc.upenn.edu/LDC2000T52”>LDC2000T52</a>. </p>
-
Dataset irds.trec-mandarin.trec5.queries
datamaestro_text.datasets.irds.data.Topics
<p> Mandarin Chinese benchmark from TREC 5. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec5/papers/multilingual_track.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-mandarin.trec5.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Mandarin Chinese benchmark from TREC 5. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec5/papers/multilingual_track.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-mandarin.trec5
datamaestro_text.datasets.irds.data.Adhoc
<p> Mandarin Chinese benchmark from TREC 5. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec5/papers/multilingual_track.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-mandarin.trec6.queries
datamaestro_text.datasets.irds.data.Topics
<p> Mandarin Chinese benchmark from TREC 6. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec6/papers/csiro.chinese.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-mandarin.trec6.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Mandarin Chinese benchmark from TREC 6. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec6/papers/csiro.chinese.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-mandarin.trec6
datamaestro_text.datasets.irds.data.Adhoc
<p> Mandarin Chinese benchmark from TREC 6. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec6/papers/csiro.chinese.ps.gz”>Task Overview Paper</a></li> </ul>
TREC Spanish
<p> A collection of news articles in Spanish, used for multi-lingual evaluation in TREC 3 and TREC 4. </p> <p> Document collection from <a href=”https://catalog.ldc.upenn.edu/LDC2000T51”>LDC2000T51</a>. </p>
-
Dataset irds.trec-spanish.documents
datamaestro_text.datasets.irds.data.Documents
<p> A collection of news articles in Spanish, used for multi-lingual evaluation in TREC 3 and TREC 4. </p> <p> Document collection from <a href=”https://catalog.ldc.upenn.edu/LDC2000T51”>LDC2000T51</a>. </p>
-
Dataset irds.trec-spanish.trec3.queries
datamaestro_text.datasets.irds.data.Topics
<p> Spanish benchmark from TREC 3. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec3/papers/overview.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-spanish.trec3.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Spanish benchmark from TREC 3. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec3/papers/overview.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-spanish.trec3
datamaestro_text.datasets.irds.data.Adhoc
<p> Spanish benchmark from TREC 3. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec3/papers/overview.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-spanish.trec4.queries
datamaestro_text.datasets.irds.data.Topics
<p> Spanish benchmark from TREC 4. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec4/overview.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-spanish.trec4.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Spanish benchmark from TREC 4. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec4/overview.ps.gz”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-spanish.trec4
datamaestro_text.datasets.irds.data.Adhoc
<p> Spanish benchmark from TREC 4. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec4/overview.ps.gz”>Task Overview Paper</a></li> </ul>
trec-tot/2023
<p> Corpus for the TREC 2023 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2023.documents
datamaestro_text.datasets.irds.data.Documents
<p> Corpus for the TREC 2023 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2023.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Train query set for TREC 2023 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2023.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Train query set for TREC 2023 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2023.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Train query set for TREC 2023 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2023.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Dev query set for TREC 2023 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2023.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Dev query set for TREC 2023 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2023.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Dev query set for TREC 2023 tip-of-the-tongue search track. </p>
trec-tot/2024
<p> Corpus for the TREC 2024 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2024.documents
datamaestro_text.datasets.irds.data.Documents
<p> Corpus for the TREC 2024 tip-of-the-tongue search track. </p>
-
Dataset irds.trec-tot.2024.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test query set for TREC 2024 tip-of-the-tongue search track. </p>
TripClick
<p> TripClick is a large collection from the <a href=”https://www.tripdatabase.com/”>Trip Database</a>. Relevance is inferred from click signals. </p> <p> A copy of this dataset can be obtained from the Trip Database through the process described <a href=”https://tripdatabase.github.io/tripclick/#getting-the-data”>here</a>. Documents, queries, and qrels require the “TripClick IR Benchmark”; for scoreddocs and docpairs, you will also need to request the “TripClick Training Package for Deep Learning Models”. </p> <ul> <li>Documents: <a href=”https://www.nlm.nih.gov/medline/medline_overview.html”>Medline</a> article titles and abstracts</li> <li>Queries: user queries issued to the <a href=”https://www.tripdatabase.com/”>Trip Database</a></li> <li>Qrels: Inferred from clicks</li> <li><a href=”https://docs.google.com/document/d/1RHVxVnZsPBDDZMDcSvbB8VyNZDl2cn6KpeeSvIu6g_c/edit?usp=sharing”>Dataset request form</a></li> <li><a href=”https://tripdatabase.github.io/tripclick/”>Dataset website</a></li> <li><a href=”https://arxiv.org/abs/2103.07901”>Dataset paper</a></li> </ul>
-
Dataset irds.tripclick.documents
datamaestro_text.datasets.irds.data.Documents
<p> TripClick is a large collection from the <a href=”https://www.tripdatabase.com/”>Trip Database</a>. Relevance is inferred from click signals. </p> <p> A copy of this dataset can be obtained from the Trip Database through the process described <a href=”https://tripdatabase.github.io/tripclick/#getting-the-data”>here</a>. Documents, queries, and qrels require the “TripClick IR Benchmark”; for scoreddocs and docpairs, you will also need to request the “TripClick Training Package for Deep Learning Models”. </p> <ul> <li>Documents: <a href=”https://www.nlm.nih.gov/medline/medline_overview.html”>Medline</a> article titles and abstracts</li> <li>Queries: user queries issued to the <a href=”https://www.tripdatabase.com/”>Trip Database</a></li> <li>Qrels: Inferred from clicks</li> <li><a href=”https://docs.google.com/document/d/1RHVxVnZsPBDDZMDcSvbB8VyNZDl2cn6KpeeSvIu6g_c/edit?usp=sharing”>Dataset request form</a></li> <li><a href=”https://tripdatabase.github.io/tripclick/”>Dataset website</a></li> <li><a href=”https://arxiv.org/abs/2103.07901”>Dataset paper</a></li> </ul>
-
Dataset irds.tripclick.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/test/head</a>, <a class=”ds-ref”>tripclick/test/torso</a>, and <a class=”ds-ref”>tripclick/test/tail</a>. </p> <p> The scoreddocs are the official BM25 results from Anserini. </p>
-
Dataset irds.tripclick.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/test/head</a>, <a class=”ds-ref”>tripclick/test/torso</a>, and <a class=”ds-ref”>tripclick/test/tail</a>. </p> <p> The scoreddocs are the official BM25 results from Anserini. </p>
-
Dataset irds.tripclick.test.head.queries
datamaestro_text.datasets.irds.data.Topics
<p> The most frequent queries in the test set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.test.head.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> The most frequent queries in the test set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.test.tail.queries
datamaestro_text.datasets.irds.data.Topics
<p> The least frequent queries in the test set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.test.tail.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> The least frequent queries in the test set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.test.torso.queries
datamaestro_text.datasets.irds.data.Topics
<p> The moderately frequent queries in the test set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.test.torso.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> The moderately frequent queries in the test set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/train/head</a>, <a class=”ds-ref”>tripclick/train/torso</a>, and <a class=”ds-ref”>tripclick/train/tail</a>. </p> <p> The dataset provides docpairs in a full text format; we map this text back to the query and doc IDs. A small number of docpairs could not be mapped back, so they are skipped. </p>
-
Dataset irds.tripclick.train.docpairs
<p> Training subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/train/head</a>, <a class=”ds-ref”>tripclick/train/torso</a>, and <a class=”ds-ref”>tripclick/train/tail</a>. </p> <p> The dataset provides docpairs in a full text format; we map this text back to the query and doc IDs. A small number of docpairs could not be mapped back, so they are skipped. </p>
-
Dataset irds.tripclick.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/train/head</a>, <a class=”ds-ref”>tripclick/train/torso</a>, and <a class=”ds-ref”>tripclick/train/tail</a>. </p> <p> The dataset provides docpairs in a full text format; we map this text back to the query and doc IDs. A small number of docpairs could not be mapped back, so they are skipped. </p>
-
Dataset irds.tripclick.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Training subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/train/head</a>, <a class=”ds-ref”>tripclick/train/torso</a>, and <a class=”ds-ref”>tripclick/train/tail</a>. </p> <p> The dataset provides docpairs in a full text format; we map this text back to the query and doc IDs. A small number of docpairs could not be mapped back, so they are skipped. </p>
-
Dataset irds.tripclick.train.head.queries
datamaestro_text.datasets.irds.data.Topics
<p> The most frequent queries in the train set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.head.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The most frequent queries in the train set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.head
datamaestro_text.datasets.irds.data.Adhoc
<p> The most frequent queries in the train set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.head.dctr.queries
datamaestro_text.datasets.irds.data.Topics
<p> The most frequent queries in the train set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.head.dctr.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The most frequent queries in the train set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.head.dctr
datamaestro_text.datasets.irds.data.Adhoc
<p> The most frequent queries in the train set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.hofstaetter-triples.queries
datamaestro_text.datasets.irds.data.Topics
<p> A version of <a class=”ds-ref”>tripclick/train</a> that replaces the original (noisy) training triples (docpairs) with triples sampled from BM25, as suggested by Hofstätter et al. (2022). </p> <ul> <li><a href=”https://arxiv.org/pdf/2201.00365.pdf”>Paper</a></li> </ul>
-
Dataset irds.tripclick.train.hofstaetter-triples.docpairs
<p> A version of <a class=”ds-ref”>tripclick/train</a> that replaces the original (noisy) training triples (docpairs) with triples sampled from BM25, as suggested by Hofstätter et al. (2022). </p> <ul> <li><a href=”https://arxiv.org/pdf/2201.00365.pdf”>Paper</a></li> </ul>
-
Dataset irds.tripclick.train.hofstaetter-triples.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A version of <a class=”ds-ref”>tripclick/train</a> that replaces the original (noisy) training triples (docpairs) with triples sampled from BM25, as suggested by Hofstätter et al. (2022). </p> <ul> <li><a href=”https://arxiv.org/pdf/2201.00365.pdf”>Paper</a></li> </ul>
-
Dataset irds.tripclick.train.hofstaetter-triples
datamaestro_text.datasets.irds.data.Adhoc
<p> A version of <a class=”ds-ref”>tripclick/train</a> that replaces the original (noisy) training triples (docpairs) with triples sampled from BM25, as suggested by Hofstätter et al. (2022). </p> <ul> <li><a href=”https://arxiv.org/pdf/2201.00365.pdf”>Paper</a></li> </ul>
-
Dataset irds.tripclick.train.tail.queries
datamaestro_text.datasets.irds.data.Topics
<p> The least frequent queries in the train set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.tail.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The least frequent queries in the train set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.tail
datamaestro_text.datasets.irds.data.Adhoc
<p> The least frequent queries in the train set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.torso.queries
datamaestro_text.datasets.irds.data.Topics
<p> The moderately frequent queries in the train set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.torso.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The moderately frequent queries in the train set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.train.torso
datamaestro_text.datasets.irds.data.Adhoc
<p> The moderately frequent queries in the train set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/val/head</a>, <a class=”ds-ref”>tripclick/val/torso</a>, and <a class=”ds-ref”>tripclick/val/tail</a>. </p> <p> The scoreddocs are the official BM25 results from Anserini. </p>
-
Dataset irds.tripclick.val.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/val/head</a>, <a class=”ds-ref”>tripclick/val/torso</a>, and <a class=”ds-ref”>tripclick/val/tail</a>. </p> <p> The scoreddocs are the official BM25 results from Anserini. </p>
-
Dataset irds.tripclick.val.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/val/head</a>, <a class=”ds-ref”>tripclick/val/torso</a>, and <a class=”ds-ref”>tripclick/val/tail</a>. </p> <p> The scoreddocs are the official BM25 results from Anserini. </p>
-
Dataset irds.tripclick.val
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation subset of <a class=”ds-ref”>tripclick</a>, including all queries from <a class=”ds-ref”>tripclick/val/head</a>, <a class=”ds-ref”>tripclick/val/torso</a>, and <a class=”ds-ref”>tripclick/val/tail</a>. </p> <p> The scoreddocs are the official BM25 results from Anserini. </p>
-
Dataset irds.tripclick.val.head.queries
datamaestro_text.datasets.irds.data.Topics
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.head.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.head.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.head
datamaestro_text.datasets.irds.data.Adhoc
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.head.dctr.queries
datamaestro_text.datasets.irds.data.Topics
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.head.dctr.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.head.dctr.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.head.dctr
datamaestro_text.datasets.irds.data.Adhoc
<p> The most frequent queries in the validation set. This represents 20% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.tail.queries
datamaestro_text.datasets.irds.data.Topics
<p> The least frequent queries in the validation set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.tail.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> The least frequent queries in the validation set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.tail.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The least frequent queries in the validation set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.tail
datamaestro_text.datasets.irds.data.Adhoc
<p> The least frequent queries in the validation set. This represents 50% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.torso.queries
datamaestro_text.datasets.irds.data.Topics
<p> The moderately frequent queries in the validation set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.torso.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> The moderately frequent queries in the validation set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.torso.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The moderately frequent queries in the validation set. This represents 30% of the search engine traffic. </p>
-
Dataset irds.tripclick.val.torso
datamaestro_text.datasets.irds.data.Adhoc
<p> The moderately frequent queries in the validation set. This represents 30% of the search engine traffic. </p>
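The tripclick/train description above notes that docpairs are distributed as full text and are mapped back to query and document IDs, with unmappable pairs skipped. The sketch below illustrates that idea with exact-match lookups. It is an assumption about how such a mapping could work, not the ir-datasets implementation; map_docpairs and its arguments are hypothetical names.

```python
from typing import Dict, Iterable, List, Tuple

TextTriple = Tuple[str, str, str]  # (query text, positive doc text, negative doc text)
IdTriple = Tuple[str, str, str]    # (query_id, positive doc_id, negative doc_id)


def map_docpairs(
    text_triples: Iterable[TextTriple],
    queries: Dict[str, str],  # query_id -> query text
    docs: Dict[str, str],     # doc_id -> document text
) -> Tuple[List[IdTriple], int]:
    """Map full-text training triples back to IDs, skipping failures."""
    # Reverse indexes: text -> ID (assumes exact text matches)
    query_ids = {text: qid for qid, text in queries.items()}
    doc_ids = {text: did for did, text in docs.items()}

    mapped: List[IdTriple] = []
    skipped = 0
    for q_text, pos_text, neg_text in text_triples:
        try:
            mapped.append((query_ids[q_text], doc_ids[pos_text], doc_ids[neg_text]))
        except KeyError:
            # At least one text has no known ID: skip the whole triple
            skipped += 1
    return mapped, skipped


queries = {"q1": "heart failure"}
docs = {"d1": "text a", "d2": "text b"}
triples = [
    ("heart failure", "text a", "text b"),
    ("heart failure", "text a", "text missing from the corpus"),
]
mapped, skipped = map_docpairs(triples, queries, docs)
print(mapped, skipped)
```

Returning the skip count alongside the mapped triples makes it easy to check how much training data was lost in the mapping.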
tripclick/logs
<p> Raw query logs from TripClick. </p> <p> Note that this subset includes a broader set of documents than the main collection, but they only provide the title and URL. </p>
-
Dataset irds.tripclick.logs.documents
datamaestro_text.datasets.irds.data.Documents
<p> Raw query logs from TripClick. </p> <p> Note that this subset includes a broader set of documents than the main collection, but they only provide the title and URL. </p>
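The head, torso, and tail subsets of TripClick are described above by the share of search traffic they cover (20%, 30%, and 50%). The sketch below shows one way a raw query log could be bucketed by cumulative traffic share. It is purely illustrative: the function name, the thresholds as parameters, and the tie-handling are assumptions, and this is not claimed to be the procedure used to build TripClick.

```python
from collections import Counter
from typing import Dict, Iterable, List


def split_by_traffic(
    query_log: Iterable[str], head_pct: int = 20, torso_pct: int = 30
) -> Dict[str, List[str]]:
    """Bucket distinct queries into head/torso/tail by cumulative traffic.

    Illustrative assumption: queries are visited from most to least
    frequent, and each one is assigned according to the share of traffic
    already covered by the queries before it.
    """
    counts = Counter(query_log)
    total = sum(counts.values())
    buckets: Dict[str, List[str]] = {"head": [], "torso": [], "tail": []}
    covered = 0  # number of log entries covered so far
    for query, freq in counts.most_common():
        # Integer arithmetic avoids floating-point edge cases at the thresholds
        if covered * 100 < head_pct * total:
            buckets["head"].append(query)
        elif covered * 100 < (head_pct + torso_pct) * total:
            buckets["torso"].append(query)
        else:
            buckets["tail"].append(query)
        covered += freq
    return buckets


log = ["a", "a", "b", "b", "c", "d", "e", "f", "g", "h"]
buckets = split_by_traffic(log)
print(buckets)
```

In this toy log, one query accounts for the head traffic while five distinct queries make up the tail, which mirrors the usual long-tailed shape of query logs.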
Tweets 2013 (Internet Archive)
<p> A collection of tweets from a 2-month window archived by the Internet Archive. This collection can be a stand-in document collection for the TREC Microblog 2013-14 tasks. (Even though it is not exactly the same collection, <a href=”https://cs.uwaterloo.ca/~jimmylin/publications/Sequiera_Lin_SIGIR2017.pdf”>Sequiera and Lin</a> show that it is close enough.) </p> <p> This collection is automatically downloaded from the Internet Archive, though download speeds are often slow, so it takes some time. ir_datasets constructs a new directory hierarchy during the download process to facilitate fast lookups and slices. </p> <ul> <li>Documents: Tweets</li> <li><a href=”https://cs.uwaterloo.ca/~jimmylin/publications/Sequiera_Lin_SIGIR2017.pdf”>Information about dataset (paper)</a></li> <li><a href=”https://github.com/castorini/Tweets2013-IA”>Information about dataset (repository)</a></li> </ul>
-
Dataset irds.tweets2013-ia.documents
datamaestro_text.datasets.irds.data.Documents
<p> A collection of tweets from a 2-month window archived by the Internet Archive. This collection can be a stand-in document collection for the TREC Microblog 2013-14 tasks. (Even though it is not exactly the same collection, <a href=”https://cs.uwaterloo.ca/~jimmylin/publications/Sequiera_Lin_SIGIR2017.pdf”>Sequiera and Lin</a> show that it is close enough.) </p> <p> This collection is automatically downloaded from the Internet Archive, though download speeds are often slow, so it takes some time. ir_datasets constructs a new directory hierarchy during the download process to facilitate fast lookups and slices. </p> <ul> <li>Documents: Tweets</li> <li><a href=”https://cs.uwaterloo.ca/~jimmylin/publications/Sequiera_Lin_SIGIR2017.pdf”>Information about dataset (paper)</a></li> <li><a href=”https://github.com/castorini/Tweets2013-IA”>Information about dataset (repository)</a></li> </ul>
-
Dataset irds.tweets2013-ia.trec-mb-2013.queries
datamaestro_text.datasets.irds.data.Topics
<p> TREC Microblog 2013 test collection. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/MB.OVERVIEW.pdf”>Shared Task Paper</a></li> <li><a href=”https://github.com/lintool/twitter-tools/wiki/TREC-2013-Track-Guidelines”>Shared Task Site</a></li> </ul>
-
Dataset irds.tweets2013-ia.trec-mb-2013.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> TREC Microblog 2013 test collection. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/MB.OVERVIEW.pdf”>Shared Task Paper</a></li> <li><a href=”https://github.com/lintool/twitter-tools/wiki/TREC-2013-Track-Guidelines”>Shared Task Site</a></li> </ul>
-
Dataset irds.tweets2013-ia.trec-mb-2013
datamaestro_text.datasets.irds.data.Adhoc
<p> TREC Microblog 2013 test collection. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec22/papers/MB.OVERVIEW.pdf”>Shared Task Paper</a></li> <li><a href=”https://github.com/lintool/twitter-tools/wiki/TREC-2013-Track-Guidelines”>Shared Task Site</a></li> </ul>
-
Dataset irds.tweets2013-ia.trec-mb-2014.queries
datamaestro_text.datasets.irds.data.Topics
<p> TREC Microblog 2014 test collection. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec23/papers/overview-microblog.pdf”>Shared Task Paper</a></li> <li><a href=”https://github.com/lintool/twitter-tools/wiki/TREC-2014-Track-Guidelines”>Shared Task Site</a></li> </ul>
-
Dataset irds.tweets2013-ia.trec-mb-2014.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> TREC Microblog 2014 test collection. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec23/papers/overview-microblog.pdf”>Shared Task Paper</a></li> <li><a href=”https://github.com/lintool/twitter-tools/wiki/TREC-2014-Track-Guidelines”>Shared Task Site</a></li> </ul>
-
Dataset irds.tweets2013-ia.trec-mb-2014
datamaestro_text.datasets.irds.data.Adhoc
<p> TREC Microblog 2014 test collection. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec23/papers/overview-microblog.pdf”>Shared Task Paper</a></li> <li><a href=”https://github.com/lintool/twitter-tools/wiki/TREC-2014-Track-Guidelines”>Shared Task Site</a></li> </ul>
Vaswani
<p> A small corpus of roughly 11,000 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language keywords</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/npl/”>Dataset Information</a></li> </ul>
-
Dataset irds.vaswani.documents
datamaestro_text.datasets.irds.data.Documents
<p> A small corpus of roughly 11,000 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language keywords</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/npl/”>Dataset Information</a></li> </ul>
-
Dataset irds.vaswani.queries
datamaestro_text.datasets.irds.data.Topics
<p> A small corpus of roughly 11,000 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language keywords</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/npl/”>Dataset Information</a></li> </ul>
-
Dataset irds.vaswani.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A small corpus of roughly 11,000 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language keywords</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/npl/”>Dataset Information</a></li> </ul>
-
Dataset irds.vaswani
datamaestro_text.datasets.irds.data.Adhoc
<p> A small corpus of roughly 11,000 scientific abstracts. </p> <ul> <li>Documents: Scientific abstracts</li> <li>Queries: Natural language keywords</li> <li><a href=”http://ir.dcs.gla.ac.uk/resources/test_collections/npl/”>Dataset Information</a></li> </ul>
wapo/v2
<p> Version 2 of the Washington Post collection, consisting of articles published between 2012 and 2017. </p> <p> The collection can be obtained by requesting it from NIST <a href=”https://trec.nist.gov/data/wapost/”>here</a>. </p> <p> body contains all body text in plain text format, including paragraphs and multi-media captions. body_paras_html contains only source paragraphs and contains HTML markup. body_media contains images, videos, tweets, and galleries, along with a link to the content and a textual caption. </p> <ul> <li><a href=”https://trec.nist.gov/data/wapost/”>Collection Website</a></li> </ul>
-
Dataset irds.wapo.v2.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 2 of the Washington Post collection, consisting of articles published between 2012 and 2017. </p> <p> The collection can be obtained by requesting it from NIST <a href=”https://trec.nist.gov/data/wapost/”>here</a>. </p> <p> body contains all body text in plain text format, including paragraphs and multi-media captions. body_paras_html contains only source paragraphs and contains HTML markup. body_media contains images, videos, tweets, and galleries, along with a link to the content and a textual caption. </p> <ul> <li><a href=”https://trec.nist.gov/data/wapost/”>Collection Website</a></li> </ul>
-
Dataset irds.wapo.v2.trec-core-2018.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC Common Core 2018 benchmark. </p> <ul> <li>Queries: TREC-style (keyword, description, narrative)</li> <li>Relevance: Deeply-annotated</li> <li><a href=”https://github.com/trec-core/2018”>Shared Task Website</a></li> </ul>
-
Dataset irds.wapo.v2.trec-core-2018.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC Common Core 2018 benchmark. </p> <ul> <li>Queries: TREC-style (keyword, description, narrative)</li> <li>Relevance: Deeply-annotated</li> <li><a href=”https://github.com/trec-core/2018”>Shared Task Website</a></li> </ul>
-
Dataset irds.wapo.v2.trec-core-2018
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC Common Core 2018 benchmark. </p> <ul> <li>Queries: TREC-style (keyword, description, narrative)</li> <li>Relevance: Deeply-annotated</li> <li><a href=”https://github.com/trec-core/2018”>Shared Task Website</a></li> </ul>
-
Dataset irds.wapo.v2.trec-news-2018.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC News 2018 Background Linking task. The task is to find relevant background information for the provided articles. </p> <ul> <li>Queries: Articles via the doc_id field</li> <li><a href=”http://trec-news.org/”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-News.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.wapo.v2.trec-news-2018.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC News 2018 Background Linking task. The task is to find relevant background information for the provided articles. </p> <ul> <li>Queries: Articles via the doc_id field</li> <li><a href=”http://trec-news.org/”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-News.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.wapo.v2.trec-news-2018
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC News 2018 Background Linking task. The task is to find relevant background information for the provided articles. </p> <ul> <li>Queries: Articles via the doc_id field</li> <li><a href=”http://trec-news.org/”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec27/papers/Overview-News.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.wapo.v2.trec-news-2019.queries
datamaestro_text.datasets.irds.data.Topics
<p> The TREC News 2019 Background Linking task. The task is to find relevant background information for the provided articles. </p> <ul> <li>Queries: Articles via the doc_id field</li> <li><a href=”http://trec-news.org/”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.N.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.wapo.v2.trec-news-2019.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> The TREC News 2019 Background Linking task. The task is to find relevant background information for the provided articles. </p> <ul> <li>Queries: Articles via the doc_id field</li> <li><a href=”http://trec-news.org/”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.N.pdf”>Shared task paper</a></li> </ul>
-
Dataset irds.wapo.v2.trec-news-2019
datamaestro_text.datasets.irds.data.Adhoc
<p> The TREC News 2019 Background Linking task. The task is to find relevant background information for the provided articles. </p> <ul> <li>Queries: Articles via the doc_id field</li> <li><a href=”http://trec-news.org/”>Shared Task Website</a></li> <li><a href=”https://trec.nist.gov/pubs/trec28/papers/OVERVIEW.N.pdf”>Shared task paper</a></li> </ul>
wapo/v4
-
Dataset irds.wapo.v4.documents
wikiclir/ar
<p> WikiCLIR with Arabic documents. </p>
-
Dataset irds.wikiclir.ar.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Arabic documents. </p>
-
Dataset irds.wikiclir.ar.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Arabic documents. </p>
-
Dataset irds.wikiclir.ar.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Arabic documents. </p>
-
Dataset irds.wikiclir.ar
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Arabic documents. </p>
wikiclir/ca
<p> WikiCLIR with Catalan documents. </p>
-
Dataset irds.wikiclir.ca.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Catalan documents. </p>
-
Dataset irds.wikiclir.ca.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Catalan documents. </p>
-
Dataset irds.wikiclir.ca.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Catalan documents. </p>
-
Dataset irds.wikiclir.ca
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Catalan documents. </p>
wikiclir/cs
<p> WikiCLIR with Czech documents. </p>
-
Dataset irds.wikiclir.cs.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Czech documents. </p>
-
Dataset irds.wikiclir.cs.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Czech documents. </p>
-
Dataset irds.wikiclir.cs.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Czech documents. </p>
-
Dataset irds.wikiclir.cs
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Czech documents. </p>
wikiclir/de
<p> WikiCLIR with German documents. </p>
-
Dataset irds.wikiclir.de.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with German documents. </p>
-
Dataset irds.wikiclir.de.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with German documents. </p>
-
Dataset irds.wikiclir.de.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with German documents. </p>
-
Dataset irds.wikiclir.de
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with German documents. </p>
wikiclir/en-simple
<p> WikiCLIR with Simple English documents. </p>
-
Dataset irds.wikiclir.en-simple.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Simple English documents. </p>
-
Dataset irds.wikiclir.en-simple.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Simple English documents. </p>
-
Dataset irds.wikiclir.en-simple.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Simple English documents. </p>
-
Dataset irds.wikiclir.en-simple
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Simple English documents. </p>
wikiclir/es
<p> WikiCLIR with Spanish documents. </p>
-
Dataset irds.wikiclir.es.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Spanish documents. </p>
-
Dataset irds.wikiclir.es.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Spanish documents. </p>
-
Dataset irds.wikiclir.es.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Spanish documents. </p>
-
Dataset irds.wikiclir.es
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Spanish documents. </p>
wikiclir/fi
<p> WikiCLIR with Finnish documents. </p>
-
Dataset irds.wikiclir.fi.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Finnish documents. </p>
-
Dataset irds.wikiclir.fi.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Finnish documents. </p>
-
Dataset irds.wikiclir.fi.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Finnish documents. </p>
-
Dataset irds.wikiclir.fi
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Finnish documents. </p>
wikiclir/fr
<p> WikiCLIR with French documents. </p>
-
Dataset irds.wikiclir.fr.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with French documents. </p>
-
Dataset irds.wikiclir.fr.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with French documents. </p>
-
Dataset irds.wikiclir.fr.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with French documents. </p>
-
Dataset irds.wikiclir.fr
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with French documents. </p>
wikiclir/it
<p> WikiCLIR with Italian documents. </p>
-
Dataset irds.wikiclir.it.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Italian documents. </p>
-
Dataset irds.wikiclir.it.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Italian documents. </p>
-
Dataset irds.wikiclir.it.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Italian documents. </p>
-
Dataset irds.wikiclir.it
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Italian documents. </p>
wikiclir/ja
<p> WikiCLIR with Japanese documents. </p>
-
Dataset irds.wikiclir.ja.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Japanese documents. </p>
-
Dataset irds.wikiclir.ja.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Japanese documents. </p>
-
Dataset irds.wikiclir.ja.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Japanese documents. </p>
-
Dataset irds.wikiclir.ja
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Japanese documents. </p>
wikiclir/ko
<p> WikiCLIR with Korean documents. </p>
-
Dataset irds.wikiclir.ko.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Korean documents. </p>
-
Dataset irds.wikiclir.ko.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Korean documents. </p>
-
Dataset irds.wikiclir.ko.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Korean documents. </p>
-
Dataset irds.wikiclir.ko
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Korean documents. </p>
wikiclir/nl
<p> WikiCLIR with Dutch documents. </p>
-
Dataset irds.wikiclir.nl.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Dutch documents. </p>
-
Dataset irds.wikiclir.nl.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Dutch documents. </p>
-
Dataset irds.wikiclir.nl.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Dutch documents. </p>
-
Dataset irds.wikiclir.nl
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Dutch documents. </p>
wikiclir/nn
<p> WikiCLIR with Norwegian (Nynorsk) documents. </p>
-
Dataset irds.wikiclir.nn.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Norwegian (Nynorsk) documents. </p>
-
Dataset irds.wikiclir.nn.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Norwegian (Nynorsk) documents. </p>
-
Dataset irds.wikiclir.nn.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Norwegian (Nynorsk) documents. </p>
-
Dataset irds.wikiclir.nn
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Norwegian (Nynorsk) documents. </p>
wikiclir/no
<p> WikiCLIR with Norwegian (Bokmål) documents. </p>
-
Dataset irds.wikiclir.no.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Norwegian (Bokmål) documents. </p>
-
Dataset irds.wikiclir.no.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Norwegian (Bokmål) documents. </p>
-
Dataset irds.wikiclir.no.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Norwegian (Bokmål) documents. </p>
-
Dataset irds.wikiclir.no
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Norwegian (Bokmål) documents. </p>
wikiclir/pl
<p> WikiCLIR with Polish documents. </p>
-
Dataset irds.wikiclir.pl.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Polish documents. </p>
-
Dataset irds.wikiclir.pl.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Polish documents. </p>
-
Dataset irds.wikiclir.pl.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Polish documents. </p>
-
Dataset irds.wikiclir.pl
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Polish documents. </p>
wikiclir/pt
<p> WikiCLIR with Portuguese documents. </p>
-
Dataset irds.wikiclir.pt.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Portuguese documents. </p>
-
Dataset irds.wikiclir.pt.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Portuguese documents. </p>
-
Dataset irds.wikiclir.pt.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Portuguese documents. </p>
-
Dataset irds.wikiclir.pt
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Portuguese documents. </p>
wikiclir/ro
<p> WikiCLIR with Romanian documents. </p>
-
Dataset irds.wikiclir.ro.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Romanian documents. </p>
-
Dataset irds.wikiclir.ro.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Romanian documents. </p>
-
Dataset irds.wikiclir.ro.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Romanian documents. </p>
-
Dataset irds.wikiclir.ro
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Romanian documents. </p>
wikiclir/ru
<p> WikiCLIR with Russian documents. </p>
-
Dataset irds.wikiclir.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Russian documents. </p>
-
Dataset irds.wikiclir.ru.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Russian documents. </p>
-
Dataset irds.wikiclir.ru.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Russian documents. </p>
-
Dataset irds.wikiclir.ru
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Russian documents. </p>
wikiclir/sv
<p> WikiCLIR with Swedish documents. </p>
-
Dataset irds.wikiclir.sv.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Swedish documents. </p>
-
Dataset irds.wikiclir.sv.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Swedish documents. </p>
-
Dataset irds.wikiclir.sv.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Swedish documents. </p>
-
Dataset irds.wikiclir.sv
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Swedish documents. </p>
wikiclir/sw
<p> WikiCLIR with Swahili documents. </p>
-
Dataset irds.wikiclir.sw.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Swahili documents. </p>
-
Dataset irds.wikiclir.sw.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Swahili documents. </p>
-
Dataset irds.wikiclir.sw.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Swahili documents. </p>
-
Dataset irds.wikiclir.sw
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Swahili documents. </p>
wikiclir/tl
<p> WikiCLIR with Tagalog documents. </p>
-
Dataset irds.wikiclir.tl.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Tagalog documents. </p>
-
Dataset irds.wikiclir.tl.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Tagalog documents. </p>
-
Dataset irds.wikiclir.tl.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Tagalog documents. </p>
-
Dataset irds.wikiclir.tl
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Tagalog documents. </p>
wikiclir/tr
<p> WikiCLIR with Turkish documents. </p>
-
Dataset irds.wikiclir.tr.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Turkish documents. </p>
-
Dataset irds.wikiclir.tr.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Turkish documents. </p>
-
Dataset irds.wikiclir.tr.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Turkish documents. </p>
-
Dataset irds.wikiclir.tr
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Turkish documents. </p>
wikiclir/uk
<p> WikiCLIR with Ukrainian documents. </p>
-
Dataset irds.wikiclir.uk.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Ukrainian documents. </p>
-
Dataset irds.wikiclir.uk.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Ukrainian documents. </p>
-
Dataset irds.wikiclir.uk.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Ukrainian documents. </p>
-
Dataset irds.wikiclir.uk
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Ukrainian documents. </p>
wikiclir/vi
<p> WikiCLIR with Vietnamese documents. </p>
-
Dataset irds.wikiclir.vi.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Vietnamese documents. </p>
-
Dataset irds.wikiclir.vi.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Vietnamese documents. </p>
-
Dataset irds.wikiclir.vi.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Vietnamese documents. </p>
-
Dataset irds.wikiclir.vi
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Vietnamese documents. </p>
wikiclir/zh
<p> WikiCLIR with Chinese documents. </p>
-
Dataset irds.wikiclir.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikiCLIR with Chinese documents. </p>
-
Dataset irds.wikiclir.zh.queries
datamaestro_text.datasets.irds.data.Topics
<p> WikiCLIR with Chinese documents. </p>
-
Dataset irds.wikiclir.zh.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> WikiCLIR with Chinese documents. </p>
-
Dataset irds.wikiclir.zh
datamaestro_text.datasets.irds.data.Adhoc
<p> WikiCLIR with Chinese documents. </p>
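The entries in this list suggest a simple naming pattern: the ir-datasets ID with slashes replaced by dots, prefixed with irds, and optionally followed by a component such as documents, queries, or qrels. The sketch below illustrates that pattern only. The helper name to_datamaestro_id is hypothetical and not part of datamaestro-text, and the pattern is inferred from the listing rather than taken from the library's source.

```python
from typing import Optional


def to_datamaestro_id(irds_id: str, part: Optional[str] = None) -> str:
    """Map an ir-datasets ID to the dotted ID used in this listing.

    Hypothetical helper: the rule (slashes become dots, "irds." prefix)
    is inferred from the entries above, not from datamaestro-text itself.
    """
    dotted = "irds." + irds_id.strip("/").replace("/", ".")
    # Optional component suffix, e.g. "documents", "queries", "qrels"
    return f"{dotted}.{part}" if part else dotted


print(to_datamaestro_id("wikiclir/zh"))               # irds.wikiclir.zh
print(to_datamaestro_id("wikir/en1k/test", "qrels"))  # irds.wikir.en1k.test.qrels
```

The resulting string is what would be passed to prepare_dataset, as in the usage example at the top of this page.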
wikir/en1k
<p> A small version of WikIR for English. </p>
-
Dataset irds.wikir.en1k.documents
datamaestro_text.datasets.irds.data.Documents
<p> A small version of WikIR for English. </p>
-
Dataset irds.wikir.en1k.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.training.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.training.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.training.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.training
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.validation.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.validation.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.validation.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en1k.validation
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation set of <a class=”ds-ref”>wikir/en1k</a>. Scoreddocs are the provided BM25 run. </p>
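Each WikIR split pairs a provided BM25 run (scoreddocs) with relevance assessments (qrels), which is enough to score the run. The sketch below computes precision at k on toy data. It uses plain tuples as a stand-in for the real records, so the tuple layouts and the function name precision_at_k are assumptions for illustration, not the datamaestro-text or ir-datasets API.

```python
from collections import defaultdict


def precision_at_k(run, qrels, k=10):
    """Precision@k per query.

    run:   iterable of (query_id, doc_id, score)      (assumed layout)
    qrels: iterable of (query_id, doc_id, relevance)  (assumed layout)
    """
    # Collect the relevant documents (relevance > 0) for each query
    relevant = defaultdict(set)
    for qid, doc_id, rel in qrels:
        if rel > 0:
            relevant[qid].add(doc_id)

    # Group the scored documents by query
    by_query = defaultdict(list)
    for qid, doc_id, score in run:
        by_query[qid].append((score, doc_id))

    result = {}
    for qid, scored in by_query.items():
        # Keep the k highest-scoring documents
        top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
        hits = sum(1 for _, doc_id in top if doc_id in relevant[qid])
        result[qid] = hits / k
    return result


# Toy data, invented for this example
run = [("q1", "d1", 3.2), ("q1", "d2", 2.1), ("q1", "d3", 0.5),
       ("q2", "d4", 1.0), ("q2", "d5", 0.9)]
qrels = [("q1", "d1", 2), ("q1", "d3", 1), ("q2", "d5", 0), ("q2", "d6", 1)]

print(precision_at_k(run, qrels, k=2))  # {'q1': 0.5, 'q2': 0.0}
```

For real experiments an established evaluation tool is a safer choice; this only shows how the two components relate.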
wikir/en59k
<p> WikIR for English. </p>
-
Dataset irds.wikir.en59k.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikIR for English. </p>
-
Dataset irds.wikir.en59k.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.training.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.training.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.training.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.training
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.validation.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.validation.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.validation.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en59k.validation
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation set of <a class=”ds-ref”>wikir/en59k</a>. Scoreddocs are the provided BM25 run. </p>
wikir/en78k
<p> WikIR for English. This is one of the two versions used in <a href=”https://aclanthology.org/2020.lrec-1.237.pdf”>Frej2020Wikir</a>. </p>
-
Dataset irds.wikir.en78k.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikIR for English. This is one of the two versions used in <a href=”https://aclanthology.org/2020.lrec-1.237.pdf”>Frej2020Wikir</a>. </p>
-
Dataset irds.wikir.en78k.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.training.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.training.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.training.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.training
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.validation.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.validation.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.validation.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.en78k.validation
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation set of <a class=”ds-ref”>wikir/en78k</a>. Scoreddocs are the provided BM25 run. </p>
wikir/ens78k
<p> WikIR for English, using the first sentences of articles as queries. This is one of the two versions used in <a href=”https://aclanthology.org/2020.lrec-1.237.pdf”>Frej2020Wikir</a>. </p>
-
Dataset irds.wikir.ens78k.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikIR for English, using the first sentences of articles as queries. This is one of the two versions used in <a href=”https://aclanthology.org/2020.lrec-1.237.pdf”>Frej2020Wikir</a>. </p>
-
Dataset irds.wikir.ens78k.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.training.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.training.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.training.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.training
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.validation.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.validation.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.validation.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.ens78k.validation
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation set of <a class=”ds-ref”>wikir/ens78k</a>. Scoreddocs are the provided BM25 run. </p>
wikir/es13k
<p> WikIR for Spanish. </p>
-
Dataset irds.wikir.es13k.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikIR for Spanish. </p>
-
Dataset irds.wikir.es13k.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.training.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.training.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.training.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.training
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.validation.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.validation.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.validation.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.es13k.validation
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation set of <a class=”ds-ref”>wikir/es13k</a>. Scoreddocs are the provided BM25 run. </p>
wikir/fr14k
<p> WikIR for French. </p>
-
Dataset irds.wikir.fr14k.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikIR for French. </p>
-
Dataset irds.wikir.fr14k.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.training.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.training.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.training.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.training
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.validation.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.validation.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.validation.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.fr14k.validation
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation set of <a class=”ds-ref”>wikir/fr14k</a>. Scoreddocs are the provided BM25 run. </p>
wikir/it16k
<p> WikIR for Italian. </p>
-
Dataset irds.wikir.it16k.documents
datamaestro_text.datasets.irds.data.Documents
<p> WikIR for Italian. </p>
-
Dataset irds.wikir.it16k.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.test.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Test set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.training.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.training.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.training.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.training
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.validation.queries
datamaestro_text.datasets.irds.data.Topics
<p> Validation set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.validation.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Validation set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.validation.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Validation set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
-
Dataset irds.wikir.it16k.validation
datamaestro_text.datasets.irds.data.Adhoc
<p> Validation set of <a class=”ds-ref”>wikir/it16k</a>. Scoreddocs are the provided BM25 run. </p>
TREC Fair Ranking
<p> The TREC Fair Ranking track evaluates systems according to how well they fairly rank documents. </p> <ul> <li><a href=”https://fair-trec.github.io/”>Website</a></li> </ul>
-
Dataset irds.trec-fair.2021.documents
datamaestro_text.datasets.irds.data.Documents
<p> The TREC Fair Ranking track evaluates systems according to how well they fairly rank documents. </p> <ul> <li><a href=”https://fair-trec.github.io/”>Website</a></li> </ul>
-
Dataset irds.trec-fair.2021.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official TREC Fair Ranking 2021 train set. </p>
-
Dataset irds.trec-fair.2021.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official TREC Fair Ranking 2021 train set. </p>
-
Dataset irds.trec-fair.2021.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official TREC Fair Ranking 2021 train set. </p>
-
Dataset irds.trec-fair.2021.eval.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official TREC Fair Ranking 2021 evaluation set. </p>
-
Dataset irds.trec-fair.2021.eval.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official TREC Fair Ranking 2021 evaluation set. </p>
-
Dataset irds.trec-fair.2021.eval
datamaestro_text.datasets.irds.data.Adhoc
<p> Official TREC Fair Ranking 2021 evaluation set. </p>
trec-fair/2022
<p> The TREC Fair Ranking 2022 track focuses on fairly prioritising Wikimedia articles for editing, so that articles from different groups receive fair exposure. </p> <ul> <li><a href=”https://fair-trec.github.io/”>2022 Track Website</a></li> </ul>
-
Dataset irds.trec-fair.2022.documents
datamaestro_text.datasets.irds.data.Documents
<p> The TREC Fair Ranking 2022 track focuses on fairly prioritising Wikimedia articles for editing, so that articles from different groups receive fair exposure. </p> <ul> <li><a href=”https://fair-trec.github.io/”>2022 Track Website</a></li> </ul>
-
Dataset irds.trec-fair.2022.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official TREC Fair Ranking 2022 train set. </p>
-
Dataset irds.trec-fair.2022.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official TREC Fair Ranking 2022 train set. </p>
-
Dataset irds.trec-fair.2022.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Official TREC Fair Ranking 2022 train set. </p>
trec-cast/v0
<p> Version 0 of the TREC CAsT corpus. This version uses documents from the Washington Post (version 2), TREC CAR (version 2), and MS MARCO passage (version 1). </p> <p> This corpus was originally meant to be used for evaluation of the 2019 task, but the Washington Post corpus was not included for scoring in the final version because “an error in the process led to ambiguous document ids,” and Washington Post documents were removed from participating systems. As such, <a class=”ds-ref”>trec-cast/v1</a> (which doesn’t include the Washington Post) should be used for the 2019 version of the task. However, this version can still be used for the training set (<a class=”ds-ref”>trec-cast/v0/train</a>) or for replicating the original submissions to the track (prior to the removal of Washington Post documents). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.13624.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-cast.v0.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 0 of the TREC CAsT corpus. This version uses documents from the Washington Post (version 2), TREC CAR (version 2), and MS MARCO passage (version 1). </p> <p> This corpus was originally meant to be used for evaluation of the 2019 task, but the Washington Post corpus was not included for scoring in the final version because “an error in the process led to ambiguous document ids,” and Washington Post documents were removed from participating systems. As such, <a class=”ds-ref”>trec-cast/v1</a> (which doesn’t include the Washington Post) should be used for the 2019 version of the task. However, this version can still be used for the training set (<a class=”ds-ref”>trec-cast/v0/train</a>) or for replicating the original submissions to the track (prior to the removal of Washington Post documents). </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.13624.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-cast.v0.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Training set provided by TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v0.train.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Training set provided by TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v0.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Training set provided by TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v0.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Training set provided by TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v0.train.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> <a class=”ds-ref”>trec-cast/v0/train</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v0.train.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> <a class=”ds-ref”>trec-cast/v0/train</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v0.train.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> <a class=”ds-ref”>trec-cast/v0/train</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v0.train.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> <a class=”ds-ref”>trec-cast/v0/train</a>, but with queries that do not appear in the qrels removed. </p>
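The judged variants keep only the queries that have at least one entry in the qrels. A minimal sketch of that filtering step, on toy data, is shown below. The tuple layouts and the function name keep_judged are assumptions made for illustration; the actual subsets are provided ready-made by ir-datasets and need no such code.

```python
def keep_judged(queries, qrels):
    """Drop queries that never appear in the qrels.

    queries: iterable of (query_id, text)                (assumed layout)
    qrels:   iterable of (query_id, doc_id, relevance)   (assumed layout)
    """
    # IDs of all queries with at least one assessment, whatever its grade
    judged_ids = {qid for qid, _doc_id, _rel in qrels}
    return [(qid, text) for qid, text in queries if qid in judged_ids]


# Toy data, invented for this example
queries = [("1_1", "first turn"), ("1_2", "second turn"), ("2_1", "other topic")]
qrels = [("1_1", "docA", 1), ("1_1", "docB", 0), ("2_1", "docC", 2)]

print(keep_judged(queries, qrels))
# [('1_1', 'first turn'), ('2_1', 'other topic')]
```

Note that a query judged only with relevance 0 still counts as judged in this sketch, since it does appear in the qrels.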
trec-cast/v1
<p> Version 1 of the TREC CAsT corpus. This version uses documents from TREC CAR (version 2) and MS MARCO passage (version 1). This version of the corpus was used for TREC CAsT 2019 and 2020. </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.13624.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-cast.v1.documents
datamaestro_text.datasets.irds.data.Documents
<p> Version 1 of the TREC CAsT corpus. This version uses documents from TREC CAR (version 2) and MS MARCO passage (version 1). This version of the corpus was used for TREC CAsT 2019 and 2020. </p> <ul> <li><a href=”https://arxiv.org/pdf/2003.13624.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-cast.v1.2019.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official evaluation set for TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v1.2019.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> Official evaluation set for TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v1.2019.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official evaluation set for TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v1.2019
datamaestro_text.datasets.irds.data.Adhoc
<p> Official evaluation set for TREC CAsT 2019. </p>
-
Dataset irds.trec-cast.v1.2019.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> <a class=”ds-ref”>trec-cast/v1/2019</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v1.2019.judged.scoreddocs
datamaestro_text.datasets.irds.data.AdhocRun
<p> <a class=”ds-ref”>trec-cast/v1/2019</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v1.2019.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> <a class=”ds-ref”>trec-cast/v1/2019</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v1.2019.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> <a class=”ds-ref”>trec-cast/v1/2019</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v1.2020.queries
datamaestro_text.datasets.irds.data.Topics
<p> Official evaluation set for TREC CAsT 2020. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec29/papers/OVERVIEW.C.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-cast.v1.2020.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Official evaluation set for TREC CAsT 2020. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec29/papers/OVERVIEW.C.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-cast.v1.2020
datamaestro_text.datasets.irds.data.Adhoc
<p> Official evaluation set for TREC CAsT 2020. </p> <ul> <li><a href=”https://trec.nist.gov/pubs/trec29/papers/OVERVIEW.C.pdf”>Task Overview Paper</a></li> </ul>
-
Dataset irds.trec-cast.v1.2020.judged.queries
datamaestro_text.datasets.irds.data.Topics
<p> <a class=”ds-ref”>trec-cast/v1/2020</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v1.2020.judged.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> <a class=”ds-ref”>trec-cast/v1/2020</a>, but with queries that do not appear in the qrels removed. </p>
-
Dataset irds.trec-cast.v1.2020.judged
datamaestro_text.datasets.irds.data.Adhoc
<p> <a class=”ds-ref”>trec-cast/v1/2020</a>, but with queries that do not appear in the qrels removed. </p>
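The identifiers in this listing appear to follow a simple pattern: the ir-datasets ID with `/` replaced by `.`, prefixed with `irds.`, and optionally followed by a component suffix such as `.queries` or `.qrels`. The sketch below encodes that observed pattern. `to_datamaestro_id` is a hypothetical helper, and the pattern is inferred from the entries on this page rather than taken from the datamaestro API.

```python
def to_datamaestro_id(irds_id: str, component: str = "") -> str:
    """Build a dataset ID in the style used on this page.

    irds_id:   an ir-datasets ID such as "trec-cast/v1/2020/judged"
    component: optional suffix such as "documents", "queries", or "qrels"
    """
    # Assumption: "/" maps to "." and the "irds." namespace is prepended.
    base = "irds." + irds_id.strip("/").replace("/", ".")
    return f"{base}.{component}" if component else base


full_id = to_datamaestro_id("trec-cast/v1/2020/judged", "qrels")
```

A string built this way can then be passed to `prepare_dataset`, as in the Usage section at the top of this page.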
hc4/fa
<p> The Persian collection contains English queries and Persian documents for retrieval. Human- and machine-translated queries are provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Persian is available. </p>
-
Dataset irds.hc4.fa.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Persian collection contains English queries and Persian documents for retrieval. Human- and machine-translated queries are provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Persian is available. </p>
-
Dataset irds.hc4.fa.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Development split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Development split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Development split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Train split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Train split of <a class=”ds-ref”>hc4/fa</a>. </p>
-
Dataset irds.hc4.fa.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Train split of <a class=”ds-ref”>hc4/fa</a>. </p>
hc4/ru
<p> The Russian collection contains English queries and Russian documents for retrieval. Human- and machine-translated queries are provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Russian is available. </p>
-
Dataset irds.hc4.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Russian collection contains English queries and Russian documents for retrieval. Human- and machine-translated queries are provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Russian is available. </p>
-
Dataset irds.hc4.ru.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Development split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Development split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Development split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Train split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Train split of <a class=”ds-ref”>hc4/ru</a>. </p>
-
Dataset irds.hc4.ru.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Train split of <a class=”ds-ref”>hc4/ru</a>. </p>
hc4/zh
<p> The Chinese collection contains English queries and Chinese documents for retrieval. Human- and machine-translated queries are provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Chinese is available. </p>
-
Dataset irds.hc4.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Chinese collection contains English queries and Chinese documents for retrieval. Human- and machine-translated queries are provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Chinese is available. </p>
-
Dataset irds.hc4.zh.dev.queries
datamaestro_text.datasets.irds.data.Topics
<p> Development split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.dev.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Development split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.dev
datamaestro_text.datasets.irds.data.Adhoc
<p> Development split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.test.queries
datamaestro_text.datasets.irds.data.Topics
<p> Test split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.test.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Test split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.test
datamaestro_text.datasets.irds.data.Adhoc
<p> Test split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.train.queries
datamaestro_text.datasets.irds.data.Topics
<p> Train split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.train.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Train split of <a class=”ds-ref”>hc4/zh</a>. </p>
-
Dataset irds.hc4.zh.train
datamaestro_text.datasets.irds.data.Adhoc
<p> Train split of <a class=”ds-ref”>hc4/zh</a>. </p>
neuclir/1/fa
<p> The Persian collection contains English queries (to be released) and Persian documents for retrieval. Human- and machine-translated queries will be provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Persian is available. </p>
-
Dataset irds.neuclir.1.fa.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Persian collection contains English queries (to be released) and Persian documents for retrieval. Human- and machine-translated queries will be provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Persian is available. </p>
-
Dataset irds.neuclir.1.fa.trec-2022.queries
datamaestro_text.datasets.irds.data.Topics
<p> Topics and assessments for the TREC NeuCLIR 2022 (Persian language CLIR). </p>
-
Dataset irds.neuclir.1.fa.trec-2022.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Topics and assessments for the TREC NeuCLIR 2022 (Persian language CLIR). </p>
-
Dataset irds.neuclir.1.fa.trec-2022
datamaestro_text.datasets.irds.data.Adhoc
<p> Topics and assessments for the TREC NeuCLIR 2022 (Persian language CLIR). </p>
-
Dataset irds.neuclir.1.fa.trec-2023.queries
datamaestro_text.datasets.irds.data.Topics
<p> Topics and assessments for the TREC NeuCLIR 2023 (Persian language CLIR). </p>
-
Dataset irds.neuclir.1.fa.trec-2023.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Topics and assessments for the TREC NeuCLIR 2023 (Persian language CLIR). </p>
-
Dataset irds.neuclir.1.fa.trec-2023
datamaestro_text.datasets.irds.data.Adhoc
<p> Topics and assessments for the TREC NeuCLIR 2023 (Persian language CLIR). </p>
neuclir/1/fa/hc4-filtered
<p> Subset of the Persian collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/fa/dev</a> and <a class=”ds-ref”>hc4/fa/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.fa.hc4-filtered.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of the Persian collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/fa/dev</a> and <a class=”ds-ref”>hc4/fa/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.fa.hc4-filtered.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of the Persian collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/fa/dev</a> and <a class=”ds-ref”>hc4/fa/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.fa.hc4-filtered.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of the Persian collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/fa/dev</a> and <a class=”ds-ref”>hc4/fa/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.fa.hc4-filtered
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of the Persian collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/fa/dev</a> and <a class=”ds-ref”>hc4/fa/test</a> sets combined. </p>
neuclir/1/multi
<p> A combined corpus of NeuCLIR v1 including all Persian, Russian, and Chinese documents. </p>
-
Dataset irds.neuclir.1.multi.documents
datamaestro_text.datasets.irds.data.Documents
<p> A combined corpus of NeuCLIR v1 including all Persian, Russian, and Chinese documents. </p>
-
Dataset irds.neuclir.1.multi.trec-2023.queries
datamaestro_text.datasets.irds.data.Topics
<p> Topics and assessments for the TREC NeuCLIR 2023 multi-language retrieval task. </p>
-
Dataset irds.neuclir.1.multi.trec-2023.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Topics and assessments for the TREC NeuCLIR 2023 multi-language retrieval task. </p>
-
Dataset irds.neuclir.1.multi.trec-2023
datamaestro_text.datasets.irds.data.Adhoc
<p> Topics and assessments for the TREC NeuCLIR 2023 multi-language retrieval task. </p>
neuclir/1/ru
<p> The Russian collection contains English queries (to be released) and Russian documents for retrieval. Human- and machine-translated queries will be provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Russian is available. </p>
-
Dataset irds.neuclir.1.ru.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Russian collection contains English queries (to be released) and Russian documents for retrieval. Human- and machine-translated queries will be provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Russian is available. </p>
-
Dataset irds.neuclir.1.ru.trec-2022.queries
datamaestro_text.datasets.irds.data.Topics
<p> Topics and assessments for the TREC NeuCLIR 2022 (Russian language CLIR). </p>
-
Dataset irds.neuclir.1.ru.trec-2022.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Topics and assessments for the TREC NeuCLIR 2022 (Russian language CLIR). </p>
-
Dataset irds.neuclir.1.ru.trec-2022
datamaestro_text.datasets.irds.data.Adhoc
<p> Topics and assessments for the TREC NeuCLIR 2022 (Russian language CLIR). </p>
-
Dataset irds.neuclir.1.ru.trec-2023.queries
datamaestro_text.datasets.irds.data.Topics
<p> Topics and assessments for the TREC NeuCLIR 2023 (Russian language CLIR). </p>
-
Dataset irds.neuclir.1.ru.trec-2023.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Topics and assessments for the TREC NeuCLIR 2023 (Russian language CLIR). </p>
-
Dataset irds.neuclir.1.ru.trec-2023
datamaestro_text.datasets.irds.data.Adhoc
<p> Topics and assessments for the TREC NeuCLIR 2023 (Russian language CLIR). </p>
neuclir/1/ru/hc4-filtered
<p> Subset of the Russian collection that intersects with HC4. The 54 queries are the <a class=”ds-ref”>hc4/ru/dev</a> and <a class=”ds-ref”>hc4/ru/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.ru.hc4-filtered.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of the Russian collection that intersects with HC4. The 54 queries are the <a class=”ds-ref”>hc4/ru/dev</a> and <a class=”ds-ref”>hc4/ru/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.ru.hc4-filtered.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of the Russian collection that intersects with HC4. The 54 queries are the <a class=”ds-ref”>hc4/ru/dev</a> and <a class=”ds-ref”>hc4/ru/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.ru.hc4-filtered.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of the Russian collection that intersects with HC4. The 54 queries are the <a class=”ds-ref”>hc4/ru/dev</a> and <a class=”ds-ref”>hc4/ru/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.ru.hc4-filtered
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of the Russian collection that intersects with HC4. The 54 queries are the <a class=”ds-ref”>hc4/ru/dev</a> and <a class=”ds-ref”>hc4/ru/test</a> sets combined. </p>
neuclir/1/zh
<p> The Chinese collection contains English queries (to be released) and Chinese documents for retrieval. Human- and machine-translated queries will be provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Chinese is available. </p>
-
Dataset irds.neuclir.1.zh.documents
datamaestro_text.datasets.irds.data.Documents
<p> The Chinese collection contains English queries (to be released) and Chinese documents for retrieval. Human- and machine-translated queries will be provided in the query object for running monolingual retrieval or cross-language retrieval, assuming the machine query translation into Chinese is available. </p>
-
Dataset irds.neuclir.1.zh.trec-2022.queries
datamaestro_text.datasets.irds.data.Topics
<p> Topics and assessments for the TREC NeuCLIR 2022 (Chinese language CLIR). </p>
-
Dataset irds.neuclir.1.zh.trec-2022.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Topics and assessments for the TREC NeuCLIR 2022 (Chinese language CLIR). </p>
-
Dataset irds.neuclir.1.zh.trec-2022
datamaestro_text.datasets.irds.data.Adhoc
<p> Topics and assessments for the TREC NeuCLIR 2022 (Chinese language CLIR). </p>
-
Dataset irds.neuclir.1.zh.trec-2023.queries
datamaestro_text.datasets.irds.data.Topics
<p> Topics and assessments for the TREC NeuCLIR 2023 (Chinese language CLIR). </p>
-
Dataset irds.neuclir.1.zh.trec-2023.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Topics and assessments for the TREC NeuCLIR 2023 (Chinese language CLIR). </p>
-
Dataset irds.neuclir.1.zh.trec-2023
datamaestro_text.datasets.irds.data.Adhoc
<p> Topics and assessments for the TREC NeuCLIR 2023 (Chinese language CLIR). </p>
neuclir/1/zh/hc4-filtered
<p> Subset of the Chinese collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/zh/dev</a> and <a class=”ds-ref”>hc4/zh/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.zh.hc4-filtered.documents
datamaestro_text.datasets.irds.data.Documents
<p> Subset of the Chinese collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/zh/dev</a> and <a class=”ds-ref”>hc4/zh/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.zh.hc4-filtered.queries
datamaestro_text.datasets.irds.data.Topics
<p> Subset of the Chinese collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/zh/dev</a> and <a class=”ds-ref”>hc4/zh/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.zh.hc4-filtered.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> Subset of the Chinese collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/zh/dev</a> and <a class=”ds-ref”>hc4/zh/test</a> sets combined. </p>
-
Dataset irds.neuclir.1.zh.hc4-filtered
datamaestro_text.datasets.irds.data.Adhoc
<p> Subset of the Chinese collection that intersects with HC4. The 60 queries are the <a class=”ds-ref”>hc4/zh/dev</a> and <a class=”ds-ref”>hc4/zh/test</a> sets combined. </p>
SARA
<p> A set of sensitivity-aware relevance assessments. More information is available here: </p>
<ul> <li><a href=”https://github.com/JackMcKechnie/SARA-A-Collection-of-Sensitivity-Aware-Relevance-Assessments”>SARA</a></li> </ul>
-
Dataset irds.sara.documents
datamaestro_text.datasets.irds.data.Documents
<p> A set of sensitivity-aware relevance assessments. More information is available here: </p>
<ul> <li><a href=”https://github.com/JackMcKechnie/SARA-A-Collection-of-Sensitivity-Aware-Relevance-Assessments”>SARA</a></li> </ul>
-
Dataset irds.sara.queries
datamaestro_text.datasets.irds.data.Topics
<p> A set of sensitivity-aware relevance assessments. More information is available here: </p>
<ul> <li><a href=”https://github.com/JackMcKechnie/SARA-A-Collection-of-Sensitivity-Aware-Relevance-Assessments”>SARA</a></li> </ul>
-
Dataset irds.sara.qrels
datamaestro_text.datasets.irds.data.AdhocAssessments
<p> A set of sensitivity-aware relevance assessments. More information is available here: </p>
<ul> <li><a href=”https://github.com/JackMcKechnie/SARA-A-Collection-of-Sensitivity-Aware-Relevance-Assessments”>SARA</a></li> </ul>
-
Dataset irds.sara
datamaestro_text.datasets.irds.data.Adhoc
<p> A set of sensitivity-aware relevance assessments. More information is available here: </p>
<ul> <li><a href=”https://github.com/JackMcKechnie/SARA-A-Collection-of-Sensitivity-Aware-Relevance-Assessments”>SARA</a></li> </ul>
trec-tot/2025
-
Dataset irds.trec-tot.2025.documents
trec-tot/2025/train
-
Dataset irds.trec-tot.2025.train.documents
-
Dataset irds.trec-tot.2025.train.queries
-
Dataset irds.trec-tot.2025.train.qrels
-
Dataset irds.trec-tot.2025.train
trec-tot/2025/dev1
-
Dataset irds.trec-tot.2025.dev1.documents
-
Dataset irds.trec-tot.2025.dev1.queries
-
Dataset irds.trec-tot.2025.dev1.qrels
-
Dataset irds.trec-tot.2025.dev1
trec-tot/2025/dev2
-
Dataset irds.trec-tot.2025.dev2.documents
-
Dataset irds.trec-tot.2025.dev2.queries
-
Dataset irds.trec-tot.2025.dev2.qrels
-
Dataset irds.trec-tot.2025.dev2
trec-tot/2025/dev3
-
Dataset irds.trec-tot.2025.dev3.documents
-
Dataset irds.trec-tot.2025.dev3.queries
-
Dataset irds.trec-tot.2025.dev3.qrels
-
Dataset irds.trec-tot.2025.dev3