Most machine learning models are designed to be self-contained and encode both “knowledge” and “reasoning” in their parameters. However, such models cannot perform effectively for tasks that require knowledge grounding and tasks that deal with non-stationary data, such as news and social media. Besides, these models often require huge number of parameters to encode all the required knowledge. These issues can be addressed via augmentation with a retrieval model. This category of machine learning models, which is called Retrieval-enhanced machine learning (REML), has recently attracted considerable attention in multiple research communities. For instance, REML models have been studied in the context of open-domain question answering, fact verification, and dialogue systems and also in the context of generalization through memorization in language models and memory networks. We believe that the information retrieval community can significantly contribute to this growing research area by designing, implementing, analyzing, and evaluating various aspects of retrieval models with applications to REML tasks. The goal of this full-day hybrid workshop is to bring together researchers from industry and academia to discuss various aspects of retrieval-enhanced machine learning, including effectiveness, efficiency, and robustness of these models in addition to their impact on real-world applications.
The vast majority of machine learning (ML) systems are designed to be self-contained, with both knowledge and reasoning encoded in model parameters. They suffer from a number of major shortcomings that can be (fully or partially) addressed, if machine learning models have access to efficient and effective retrieval models:
Recent research has demonstrated that errors in many REML models are mostly due to the failure of retrieval model as opposed to the augmented machine learning model, confirming the well-known "garbage in, garbage out" phenomenon. We believe that the expertise of the IR research community is pivotal for further progress in REML models. Therefore, we propose to organize a workshop with a fresh perspective on retrieval-enhanced machine learning through an information retrieval lens. For more information, refer to the following perspective paper:
H. Zamani, F. Diaz, M. Dehghani, D. Metzler, M. Bendersky. "Retrieval-Enhanced Machine Learning". In Proc. of SIGIR 2022.
The workshop will focus on models, techniques, data collections, and evaluation methodologies for various retrieval-enhanced machine learning problems. These include but are not limited to:
Submissions must be in English, in PDF format, and in the current ACM two-column conference format. Suitable LaTeX, Word, and Overleaf templates are available from the ACM Website (use “sigconf” proceedings template for LaTeX and the Interim Template for Word). Submissions may consist of up to 6 pages of content, plus unrestricted space for appendices and references. The papers can represent reports of original research or preliminary research results. The review process is double-blind. Authors are required to take all reasonable steps to preserve the anonymity of their submission. The submission must not include author information and must not include citations or discussion of related work that would make the authorship apparent. However, it is acceptable to refer to companies or organizations that provided datasets, hosted experiments or deployed solutions if there is no implication that the authors are currently affiliated with these organizations. Papers will be evaluated according to their significance, originality, technical content, style, clarity, relevance to the workshop, and likelihood of generating discussion. Authors should note that changes to the author list after the submission deadline are not allowed. At least one author of each accepted paper is required to register for, attend , and present the work at the workshop (either in person or remotely). All papers are to be submitted via EasyChair.
Papers presented at the workshop will be required to be uploaded to arXiv.org after the acceptance notification, but will be considered non-archival, and may be submitted elsewhere (modified or not), although the workshop site will maintain a link to the arXiv versions. Therefore, articles being under consideration by other venues as long as they are within the submission policy of the other venues can be submitted to REML. This makes the workshop a forum for the presentation and discussion of current work, without preventing the work from being published elsewhere.
Will be announced shortly!
University of Massachusetts Amherst