Are Large Language Models Actually Getting Safer?
Each new release of large language models (LLMs) is often accompanied by claims of improved performance and enhanced safety. However, safety assessments are rarely standardized, and little work has tracked how safety metrics change across model generations. This working paper addresses that gap by analyzing the performance of LLMs released over the last three years on a set of standardized safety benchmarks, to assess whether models are actually becoming safer.