Publications

You can also find my papers on Google Scholar.

Accepted Papers:

The State and Fate of Summarization Datasets

Noam Dahan, Gabriel Stanovsky
NAACL 2025

We review 133 summarization datasets across 100+ languages, introduce a dataset ontology, and identify key bottlenecks shaping the field, including data quality issues, limited diversity, and an overreliance on distant supervision. We further make our findings accessible through an interactive platform.
Paper, Repo



PromptSuite: A Task-Agnostic Framework for Multi-Prompt Generation

Eliya Habba*, Noam Dahan*, Gili Lior, Gabriel Stanovsky
*equal contribution
EMNLP 2025, Demo

PromptSuite is a toolkit for multi-prompt evaluation featuring both an API and an interactive interface. This framework aims to address prompt sensitivity, where small variations in a prompt can lead to significant performance differences. It is designed to be flexible, modular, and extensible, making it easily adaptable to a wide range of tasks out of the box.
Paper, Demo Video, Repo



Preprints:

Leveraging Digitized Newspapers to Collect Summarization Data in Low-Resource Languages

Noam Dahan, Omer Kidron, Gabriel Stanovsky
Under Review

High-quality summarization data remains scarce in under-represented languages. However, historical newspapers, made available through recent digitization efforts, offer an abundant source of untapped, naturally annotated data. In this work, we present a novel method for collecting naturally occurring summaries via Front-Page Teasers, where editors summarize full-length articles.
Paper
