Semantic similarity models for automated fact-checking: ClaimCheck as a claim matching tool

  1. Larraz, Irene 1
  2. Míguez, Rubén 2
  3. Sallicati, Francesca 2
  1. 1 Universidad de Navarra, Pamplona, Spain. ROR https://ror.org/02rxc7m23
  2. 2 Newtral
Journal: El profesional de la información

ISSN: 1386-6710 1699-2407

Year of publication: 2023

Issue title: Network activisms

Volume: 32

Issue: 3

Type: Article

DOI: 10.3145/EPI.2023.MAY.21 (open access, publisher)

Abstract

This article presents the experimental design of ClaimCheck, an artificial intelligence tool for detecting repeated falsehoods in political discourse using a semantic similarity model developed by the fact-checking organization Newtral in collaboration with ABC Australia. The study reviews the state of the art in the use of algorithms for fact-checking and proposes a definition of claim matching. It also details the annotation scheme for similar sentences and presents the results of experiments with the tool.
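As a rough illustration of what claim matching involves, the sketch below compares a new statement against a small list of previously fact-checked claims and returns the closest one when its score clears a threshold. It is not the ClaimCheck implementation: the abstract does not describe the model's internals, so this example assumes a plain bag-of-words cosine similarity as a stand-in for a semantic similarity model. The function names, the example sentences, and the 0.5 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_claim(claim: str, fact_checked: list[str], threshold: float = 0.5):
    """Return (best_match, score); best_match is None below the threshold.

    Assumption: word overlap stands in for semantic similarity here.
    A real system would compare sentence embeddings instead.
    """
    query = Counter(tokenize(claim))
    best, best_score = None, 0.0
    for candidate in fact_checked:
        score = cosine_similarity(query, Counter(tokenize(candidate)))
        if score > best_score:
            best, best_score = candidate, score
    if best_score < threshold:
        return None, best_score
    return best, best_score


# Hypothetical example data, not taken from the article.
checked = [
    "There are four million unemployed people in Spain",
    "Taxes went down last year",
]
match, score = match_claim("Spain has four million unemployed people", checked)
print(match, round(score, 2))
```

Word overlap misses paraphrases that share no vocabulary, which is the gap a semantic similarity model is meant to close. In such a setup the scoring function would be replaced by a comparison of sentence embeddings, while the thresholded nearest-match logic would stay much the same.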
