Emerging research demonstrates that existing pre-deployment safety evaluations for artificial intelligence (AI) frequently underestimate models’ potential for causing harm. Current AI safety evaluations have critical limitations: safety measurements are unstable under benign perturbations, models remain persistently able to break past the safety guardrails being evaluated, models can exhibit deception and evaluation awareness, clear protocols for translating evaluation results into real-world risk are lacking, and existing evidence is not acted upon. Because many of these assessment tools are inherently unreliable, policymakers should use them cautiously and should not treat them as a primary risk management strategy in AI governance frameworks. Effective AI governance should prioritize continuous monitoring and rapid response mechanisms while recognizing the limitations of pre-deployment safety evaluations.