About FakeCovid
FakeCovid is the first multilingual cross-domain dataset of 7623 fact-checked news articles for COVID-19, collected from 04/01/2020 to 01/07/2020 (DD/MM/YYYY) and updated monthly. We collected the fact-checked articles from 92 fact-checking websites, which we identified through references from Poynter and Snopes. We manually annotated the collected articles into 11 categories of fact-checked news according to their content. The resulting dataset covers 40 languages and 105 countries. We also built a classifier to detect fake news and predict its class. Our model achieves an F1 score of 0.76 in detecting the false class versus other fact-checked articles. For more detailed information about the FakeCovid dataset, you can refer to the following paper:
DATA COLLECTION UPDATE:
We crawl the fact-check articles published on the fact-checking websites at regular intervals and release an updated dataset each month. The release history so far is listed below (dates are DD-MM-YYYY):
Last Updated On | Data Count |
---|---|
08-06-2020 | 5182 |
13-07-2020 | 7623 |
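As a small illustration of how to read the table above, the following Python sketch parses the day-first dates and computes how many articles were added between releases. The names `UPDATES`, `parse_day_first`, and `growth_between_releases` are hypothetical helpers introduced here for illustration; they are not part of the dataset or its tooling.

```python
from datetime import datetime

# Rows copied from the update table above: (date as DD-MM-YYYY, article count).
UPDATES = [
    ("08-06-2020", 5182),
    ("13-07-2020", 7623),
]


def parse_day_first(text):
    """Parse a DD-MM-YYYY string into a date object."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def growth_between_releases(updates):
    """Return (days elapsed, articles added) for each consecutive pair of releases."""
    rows = [(parse_day_first(day), count) for day, count in updates]
    return [
        ((later - earlier).days, later_count - earlier_count)
        for (earlier, earlier_count), (later, later_count) in zip(rows, rows[1:])
    ]


if __name__ == "__main__":
    for days, added in growth_between_releases(UPDATES):
        print(f"{added} articles added over {days} days")
```

Parsing with an explicit `%d-%m-%Y` format avoids the month/day ambiguity that a generic date parser could introduce for entries such as 08-06-2020.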
If you use this dataset in your research, please cite this paper:
@inproceedings{shahifakecovid,
title={Fake{C}ovid -- A Multilingual Cross-domain Fact Check News Dataset for COVID-19},
author={Shahi, Gautam Kishore and Nandini, Durgesh},
booktitle={Workshop Proceedings of the 14th International {AAAI} {C}onference on {W}eb and {S}ocial {M}edia},
year = {2020},
url = {http://workshop-proceedings.icwsm.org/pdf/2020_14.pdf}
}
Download
The FakeCovid dataset is free to download for research purposes under the CC BY-NC 4.0 license terms. Before you download the dataset, please read these terms and click the button below to confirm that you agree to them.
We have also documented the steps followed to collect the data from the different fact-checking websites. We used Poynter and Snopes as reference sources to identify the fact-checking websites. For more details about the data collection, please refer to this document:
Related/Ongoing Work
Related to this work, we also analysed the propagation of misinformation on other platforms. For more details, please visit StopCovid.