Our evaluation of OpenAI's GPT-5.5 cyber capabilities
Our evaluation of OpenAI’s GPT-5.5 cyber capabilities (https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities)
The UK’s AI Security Institute previously evaluated Claude Mythos (https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities). Now they’ve evaluated GPT-5.5 for finding security vulnerabilities and found it to be comparable to Mythos, but unlike Mythos it’s generally available right now.
Tags: ai (https://simonwillison.net/tags/ai), openai (https://simonwillison.net/tags/openai), generative-ai (https://simonwillison.net/tags/generative-ai), llms (https://simonwillison.net/tags/llms), anthropic (https://simonwillison.net/tags/anthropic), claude (https://simonwillison.net/tags/claude), ai-security-research (https://simonwillison.net/tags/ai-security-research), gpt (https://simonwillison.net/tags/gpt)