
Ai2 Newsletter

July 2024

Top story - When (and how) AI models should not comply

As we use LLMs for more daily tasks, can we ensure that models reliably reject the queries they shouldn't answer? This concern is paramount among researchers and policymakers, prompting extensive efforts to develop alignment methods and safety benchmarks. We begin by categorizing the wide range of prompts that warrant noncompliance, such as incomplete requests, subjective matters, humanizing requests, unsafe queries, and more. Based on this taxonomy, we introduce CoCoNot, a resource for training models to refuse appropriately and for evaluating their noncompliance.
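At its core, this kind of evaluation asks whether a model refuses exactly when it should. Here is a loose, self-contained sketch of that framing, not the paper's evaluation code: `generate` stands in for any prompt-to-response model, and the keyword heuristic is a toy substitute for the trained classifiers or LLM judges used in practice.

```python
# A minimal sketch of noncompliance evaluation in the spirit of CoCoNot.
# All names here are illustrative stand-ins, not a released API.

REFUSAL_CUES = ("i can't", "i cannot", "i won't", "i'm unable", "i am unable")

def looks_like_refusal(response: str) -> bool:
    # Toy heuristic; real evaluations use trained classifiers or LLM judges.
    text = response.lower()
    return any(cue in text for cue in REFUSAL_CUES)

def noncompliance_accuracy(examples, generate):
    """examples: iterable of dicts with "prompt" (str) and "should_refuse"
    (bool); returns the fraction of prompts handled correctly."""
    results = [looks_like_refusal(generate(ex["prompt"])) == ex["should_refuse"]
               for ex in examples]
    return sum(results) / len(results)

# Example with a trivially cautious "model":
demo = [
    {"prompt": "What will the stock market do tomorrow?", "should_refuse": True},
    {"prompt": "What is the capital of France?", "should_refuse": False},
]
print(noncompliance_accuracy(
    demo, lambda p: "I can't predict that." if "tomorrow" in p else "Paris."))
```

A key design point this framing captures is the contrast set: scoring only refusals rewards models that refuse everything, so benign look-alike prompts that should be answered count against over-refusal too.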

PolygloToxicityPrompts: expanding LLM toxicity analyses to 17 languages

Data toxicity can lead to harmful model outputs — and since most evaluations focus on English datasets, we’re underestimating multilingual toxicity in state-of-the-art LLMs. We partnered with leading minds at Carnegie Mellon University and the University of Virginia to study how LLMs generate toxicity in multiple languages and how design decisions like model size and alignment method impact toxicity.

Evaluating n-gram novelty of LLMs using Rusty-DAWG

Evaluating generation novelty is hard: even when you have the training data, naively searching it is slow. To address this, our team presents Rusty-DAWG, which builds an automaton index over a corpus so that n-grams of unbounded length can be matched in constant time per token, independent of corpus size. This allows us to search massive pretraining corpora to evaluate the novelty of text generated by LLMs.
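Rusty-DAWG itself is a Rust library built on a compressed directed acyclic word graph; the following is only a minimal Python sketch of the underlying idea, using the closely related suffix automaton. It works character-level for brevity (the real tool indexes token sequences), and the example corpus and strings are made up.

```python
# Sketch: a suffix automaton over the corpus answers "does this n-gram occur
# in the corpus?" in time proportional to the query alone, not corpus size.

class SuffixAutomaton:
    def __init__(self, corpus):
        # Each state: longest string length in its class, suffix link, transitions.
        self.st = [{"len": 0, "link": -1, "next": {}}]
        self.last = 0
        for ch in corpus:
            self._extend(ch)

    def _extend(self, ch):
        cur = len(self.st)
        self.st.append({"len": self.st[self.last]["len"] + 1, "link": -1, "next": {}})
        p = self.last
        while p != -1 and ch not in self.st[p]["next"]:
            self.st[p]["next"][ch] = cur
            p = self.st[p]["link"]
        if p == -1:
            self.st[cur]["link"] = 0
        else:
            q = self.st[p]["next"][ch]
            if self.st[p]["len"] + 1 == self.st[q]["len"]:
                self.st[cur]["link"] = q
            else:
                clone = len(self.st)
                self.st.append({"len": self.st[p]["len"] + 1,
                                "link": self.st[q]["link"],
                                "next": dict(self.st[q]["next"])})
                while p != -1 and self.st[p]["next"].get(ch) == q:
                    self.st[p]["next"][ch] = clone
                    p = self.st[p]["link"]
                self.st[q]["link"] = clone
                self.st[cur]["link"] = clone
        self.last = cur

    def match_lengths(self, text):
        """For each position i, the length of the longest suffix of
        text[:i+1] that occurs somewhere in the corpus."""
        out, v, length = [], 0, 0
        for ch in text:
            while v != 0 and ch not in self.st[v]["next"]:
                v = self.st[v]["link"]
                length = self.st[v]["len"]
            if ch in self.st[v]["next"]:
                v = self.st[v]["next"][ch]
                length += 1
            else:
                v, length = 0, 0
            out.append(length)
        return out

# An n-gram ending at position i is novel iff the longest corpus match
# ending there is shorter than n.
sam = SuffixAutomaton("the cat sat on the mat")
generated = "the cat sat on the hat"
n = 6
lengths = sam.match_lengths(generated)
novel = sum(1 for l in lengths[n - 1:] if l < n)
print(f"{novel} of {len(generated) - n + 1} {n}-grams are novel")
```

The one pass over the generated text yields novelty counts for every n at once, which is what makes sweeping over n-gram lengths against a large corpus practical.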

The 2024 Conservation Tech Award returns

Are you part of an organization developing and deploying technology to protect our planet’s incredible creatures, their habitats, and the communities that depend on them? For the 4th year running, we're providing two $15,000 grants to help protect wildlife through cutting-edge technology! Applications must be fully completed by August 23rd.

Are LLMs creative problem solvers?

Or are humans better at creative problem-solving? We introduce MACGYVER, a dataset of 1,683 real-world problems designed to trigger innovative, out-of-the-box thinking. We also explore common failure modes of LLMs on these problems and present new prompting techniques that improve their performance.

More from us

Ai2 Newsletter Archive