AI Proves a Math Legend Wrong, Microsoft Open-Sources Red Team Agents, and Cybersecurity Goes Full Bot-vs-Bot

The biggest AI story today isn't about chatbots, revenue, or product launches. It's about a machine quietly disproving an 80-year-old math conjecture that some of the best human minds couldn't crack. That, plus a heavy cybersecurity day.

1. OpenAI Model Disproves an 80-Year-Old Erdos Conjecture, and This Time It's Legit

An OpenAI general-purpose reasoning model has autonomously disproved the planar unit distance problem, a conjecture posed by legendary mathematician Paul Erdos back in 1946. The problem asks: among n points in a plane, what is the maximum number of pairs at exactly distance 1 from each other? For nearly 80 years, the math community believed the answer grew only slightly faster than linear. The model found an infinite family of configurations that proved otherwise, using tools from algebraic number theory (Gaussian integers, infinite class field towers, Golod-Shafarevich theory) that no human had thought to apply to this geometry problem. Fields Medalist Tim Gowers said he would have recommended the resulting paper for publication in the Annals of Mathematics "without any hesitation." The proof was also endorsed by Noga Alon of Princeton and, notably, by Thomas Bloom, the same mathematician who publicly called out OpenAI's previous false claim seven months ago when GPT-5 supposedly solved 10 Erdos problems but had actually just rediscovered existing solutions from the literature.

My take: This is the one that makes you sit up. Not because "AI is replacing mathematicians" (it isn't), but because of the trajectory it represents. Seven months ago, OpenAI got embarrassed for overclaiming a math breakthrough that turned out to be a rediscovery of known results. Now, the same critic who torched them last time is endorsing this result. The model wasn't purpose-built for math. It wasn't scaffolded. It just... solved it during an evaluation run. The most interesting detail to me is the cross-domain bridge: the model pulled tools from algebraic number theory to solve a discrete geometry problem. That's the kind of lateral thinking that is extremely hard to do even for experienced researchers, because it requires knowing what exists across fields and having the intuition to connect them. If this holds up under broader scrutiny, it's a legitimate milestone, not in "AI can do math" but in "AI can synthesize across knowledge domains in ways humans haven't."

2. Microsoft Open-Sources Rampart and Clarity, Two AI Red Teaming Agents

Microsoft released two new open-source security tools: Rampart and Clarity. Rampart is built on top of PyRIT (Microsoft's existing open framework for red teaming generative AI) but shifts the focus from scanning already-built systems to continuously testing code for vulnerabilities during development. It encodes both adversarial and benign testing scenarios directly into the development pipeline, with a specific focus on cross-prompt injection attacks where agents process potentially poisoned content from documents, emails, and tickets. Clarity, the second tool, runs as a desktop app, web interface, or embedded coding agent that provides real-time security engineering guidance, helping developers think through downstream security implications before they ship. Ram Shankar Siva Kumar, who founded Microsoft's AI red team in 2019, told CyberScoop that Rampart condensed what used to be a week of manual vulnerability replication and variant testing into hours during internal use.

My take: This is the kind of AI security work that actually matters. Not philosophical white papers, not "responsible AI" keynotes. Actual engineering tools that developers can embed into their pipelines. The Clarity concept is especially interesting to me: in a world where anyone can vibe-code an MCP server in 20 minutes, having an agent that asks "hey, should you be doing this in the first place?" is a meaningful guardrail. The fact that both tools are open-source is the right call here. Security tooling has a network effect problem. The more people contribute adversarial scenarios and testing patterns, the better these tools get for everyone. Microsoft keeping them closed would have been a missed opportunity.

3. AI Is Becoming a Non-Negotiable in the SOC

A panel at the DTX conference in Manchester laid out what a lot of security practitioners are feeling: AI isn't optional in the Security Operations Center anymore, it's becoming an imperative. Panelists from Canopius Group, Radius, Bridgewater Finance Group, and Secarma discussed how AI systems are already helping analysts correlate logs, triage alerts, and reduce the alert fatigue that has been burning out security teams for years. Kelly Bissell, a former corporate VP at Microsoft, described an arms race where attackers have the advantage of ignoring rules and regulations, but defenders can claw back ground through scale and machine learning. Microsoft, he noted, built a neural network capable of identifying typosquatted domains being set up before impersonation attacks with very low false positive rates. The panel also stressed that AI is not a substitute for fundamentals: system hardening, patching, access control, and monitoring still need to be solid before any AI layer gets added. CSO Online has the full writeup.

My take: This panel is saying what a lot of enterprise security teams already know but haven't fully acted on. The most grounded takeaway here is the reminder that AI in the SOC is useless without mature fundamentals underneath it. You can't automate your way out of unpatched systems and bad access control. The comparison to the typewriter-to-computer transition in the 70s and 80s is a bit dramatic, but the underlying point stands: the role of security analyst is evolving from "monitor and respond" to "validate AI outputs and assess model risk." Prompt engineering and GRC skills becoming hiring priorities for security teams is a signal worth watching.

4. BT Launches AI-Powered Cybersecurity Suite for UK Small Businesses

BT Business has rolled out a new set of cybersecurity tools aimed at small and medium enterprises in the UK, powered by CrowdStrike's AI-native platform. The suite includes a Cyber Health Check (self-assessment tool that generates a security score and action plan), Cyber Guides (access to advisors by phone, chat, and in-store), an AI Cyber Assistant for security questions, and a Cyber Support Hub with explainers developed in line with the UK's National Cyber Security Centre guidance. BT cited some sobering stats from their own network data: a 300% year-over-year increase in malicious scanning activity, with connected devices now probed an average of 4,000 times per day. BT says it is now deflecting 4 million cyber attacks across its networks daily. Telecoms.com has the details.

My take: This is less of a headline-grabber and more of a market signal. Only 11% of SMBs currently use AI-powered defenses, according to CrowdStrike's own survey data. That's a massive gap, and it makes small businesses disproportionately vulnerable. BT packaging CrowdStrike Falcon into an accessible, managed service for small shops is the kind of productization the market needs. The interesting question is whether SMBs will actually adopt it or continue treating cybersecurity as an afterthought until they get hit. The 4,000-probes-per-device-per-day stat alone should be a wake-up call.

5. Darkroom Profiles Its AI-Native Ad Agency Model in Ad Age

Ad Age published a profile on Darkroom, a growth marketing agency that has built an AI infrastructure platform called Shadow to capture institutional knowledge across their agency operations. The piece frames it around the $422 billion global advertising industry and how AI-native agencies are starting to structurally differentiate from traditional shops. Darkroom positions itself as the first AI-native advertising agency, using Shadow as a universal commerce layer that combines human expertise with agentic technology across e-commerce, paid media, creative testing, and retention. The agency manages over $150M in annual revenue for DTC brands and has been recognized on the Inc. 5000 list.

My take: Full disclosure: this is a publishing partner piece in Ad Age, so it's essentially sponsored content from Darkroom. That said, the underlying premise is worth thinking about. The gap between agencies that bolt AI tools onto existing workflows and agencies that are architecturally built around AI is real and widening. Darkroom's approach of building a proprietary AI layer (Shadow) to capture institutional knowledge is the same pattern we're seeing in other industries: the competitive moat isn't "we use AI," it's "we've encoded our domain expertise into an AI system that compounds over time." Whether Darkroom's execution matches their positioning is a different question, but the model itself is directionally correct for where the industry is heading.

Closing thought

Three out of five stories today were cybersecurity. That's not a coincidence. As AI capabilities accelerate (see: autonomously disproving 80-year-old math conjectures), the attack surface accelerates with it. The thread worth pulling on is who moves faster: the builders shipping AI-powered defense tooling, or the adversaries using AI to probe 4,000 times a day. Right now, it's a dead heat at best.

AI Proves a Math Legend Wrong, Microsoft Open-Sources Red Team Agents, and Cybersecurity Goes Full Bot-vs-Bot

1. OpenAI Model Disproves an 80-Year-Old Erdos Conjecture, and This Time It's Legit

2. Microsoft Open-Sources Rampart and Clarity, Two AI Red Teaming Agents

3. AI Is Becoming a Non-Negotiable in the SOC

4. BT Launches AI-Powered Cybersecurity Suite for UK Small Businesses

5. Darkroom Profiles Its AI-Native Ad Agency Model in Ad Age

Closing thought

Read more

AI Is Everywhere Today, But Who Is It Actually For?

Google I/O Dominated the Day, But the Biggest Move Came From Anthropic

Anthropic Buys Stainless, Hires Karpathy, Lands in CIA Briefings