AI models ‘subliminally’ transmit unsafe behaviours when training other systems
An AI model showed a preference for owls despite never being trained to show such a bias. Credit: Denis Moskvinov/Shutterstock

Data generated by artificial-intelligence models can contain subliminal signals that ‘teach’ other large language models (LLMs) particular traits and biases, suggests a study published in Nature today¹. Such biases can be benign, such as a preference for a…