Jan Leike, a prominent AI researcher who recently parted ways with OpenAI amid criticism of the company’s AI safety approach, has found a new home at Anthropic, a rival organization. At Anthropic, Leike will spearhead a new “superalignment” team dedicated to advancing AI safety and security measures.
Leike’s team at Anthropic will concentrate on several key areas, including “scalable oversight,” “weak-to-strong generalization,” and automated alignment research. Scalable oversight refers to techniques for supervising the behavior of large-scale AI systems, keeping them predictable and steering them toward desirable outcomes.
According to insider information obtained by TechCrunch, Leike will report directly to Jared Kaplan, Anthropic’s chief science officer. Additionally, researchers at Anthropic currently focused on scalable oversight will transition to Leike’s team as it ramps up its efforts.
The mission of Leike’s team at Anthropic bears resemblance to that of OpenAI’s recently disbanded Superalignment team, which Leike co-led. The Superalignment team aimed to tackle the technical challenges associated with controlling superintelligent AI within a four-year timeframe but faced obstacles within OpenAI’s leadership structure.
Anthropic has positioned itself as prioritizing AI safety, in contrast to OpenAI’s perceived shift toward commercial interests. Anthropic’s CEO, Dario Amodei, previously served as VP of research at OpenAI but left the company over disagreements about its direction, particularly its increasing commercial focus. Notably, Amodei brought several former OpenAI employees with him when founding Anthropic, including Jack Clark, OpenAI’s former policy lead.