FIRE 2023

Forum for Information Retrieval Evaluation

Goa Business School, Goa University, Panjim

15th-18th December

Recent years have brought dramatic gains in the effectiveness of search and recommendation, even enabling entirely novel ways of information access. This revolutionary progress in NLP and IR is fueled by ever larger language models, with larger models typically leading to higher effectiveness. But is this the whole answer? Deep principles in nature are often driven by seemingly opposite forces; think of the Zipfian distribution of language being attributed to communication efficiency. Can we tailor our models to include all, but also only, those aspects needed for the task at hand? Analyzing this helps us better understand which aspects are responsible for the performance of the model, as well as the exact role of complex NLP in IR (and vice versa).
Over the last decade, the fields of IR and NLP have grown closer, both in terms of tasks and of the models used. However, this presents new challenges in how to evaluate NLP/IR systems performing these tasks and also reveals the limits of state-of-the-art systems. This talk will present experience from a range of NLP/IR tasks at CLEF within the SimpleText and JOKER tracks. The SimpleText track investigates the barriers that non-experts face when searching for scientific information. The JOKER track aims at the automatic analysis of wordplay and humor, including detection, search, interpretation, and translation. Our general conclusion is that integrating both IR and NLP aspects in the evaluation presents numerous new research opportunities.

Broken Telephone
by Evangelos Kanoulas

Recent advances in deep learning, but also in human-computer interfaces, have allowed people to communicate with machines in a more natural way, breaking down the communication barrier of having to carefully design the input to a machine program. Human-to-human communication is being augmented with human-to-machine as well as machine-to-machine communication. Often a message has to travel across a line of humans and machines to achieve its purpose. As in the Broken Telephone game, this often distorts the actual message, calling for algorithms and methods that are robust to such distortions. In this talk I will discuss a number of problems I am working on where such distortion takes place, including Question Answering, Entity Linking, Known-Item Retrieval, and Knowledge-grounded Dialogues. Then I will discuss some of the ongoing work to address this challenge.
On the World Wide Web and on social media, a large amount of content of different nature and origin is generated and spread without any form of reliable external control. In this context, the risk that applications such as search engines provide users with disinformation is not negligible. In recent years, an increasing awareness of the possible risks of running into fake content has emerged. This has motivated a considerable amount of research effort aimed at defining systems that are able to assess the truthfulness of content disseminated online. Most of them are data-driven approaches based on machine learning techniques, but recently model-driven approaches have also been studied. When we want to provide search engines with the capability of assessing the truthfulness of the content they return in answer to user queries, different issues arise: how should content truthfulness be accounted for in relevance assessment? How should the effectiveness of a search engine coping with this aspect be evaluated? In this talk I will address these questions by outlining the main issues and challenges.

Quantum Computing for Information Access: Is It That Scary?
by Nicola Ferro, University of Padua, Italy

Quantum Computing (QC) is a research field that has been in the limelight in recent years. In fact, many researchers and practitioners believe that it can provide benefits in terms of efficiency and effectiveness when employed to solve certain computationally intensive tasks that may require years of computation on high-performance computers. In Information Retrieval (IR) and Recommender Systems (RS) we are required to process very large and heterogeneous amounts of data by means of complex operations, often combinatorial in nature. It is thus natural to wonder whether QC could be applied to boost their performance from both the efficiency and the effectiveness point of view.
In this talk I will describe the beginning of a journey into QC from the perspective of a non-QC specialist, who is learning how these technologies can be applied to core IR problems like feature selection or clustering and who is discovering how the barriers to accessing them are getting lower and lower, bringing them within everyone’s reach. I will also introduce QuantumCLEF, a lab which we will run at CLEF 2024 to engage the community in designing, developing, and evaluating approaches for QC for information access.
In this talk, I will trace the evolution of experiments in information retrieval and the growing demand for IR platforms that can support complex and diverse ranking pipelines. I will describe the recent PyTerrier platform, which enables researchers and practitioners to seamlessly design and evaluate complex retrieval pipelines in a declarative way. I will also discuss the emerging challenges of ensuring reproducible and repeatable experiments in the era of neural information retrieval, and propose a new terminology to better capture these concepts in the current research landscape.