Google has announced that, for the first time, one of its AI models has found a memory-safety vulnerability in real-world software. The flaw, a stack buffer underflow in SQLite, was fixed before the affected code reached an official release.
Big Sleep, the LLM-based bug-detection tool, was developed in collaboration with DeepMind. According to the company, it is an evolution of an earlier project, Naptime, introduced in June.
The flaw in SQLite, a popular open-source database engine, could potentially allow attackers to cause crashes or even execute arbitrary code. It stemmed from an error in which the value -1 was used as an array index. The debug build of the program included a check that caught such values, but that check was absent from the release build.
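The pattern described above, a negative index guarded only in debug builds, can be illustrated with a minimal C sketch. This is not SQLite's code: `NO_COLUMN`, `N_COLUMNS`, `lookup_debug_only` and `lookup_checked` are hypothetical names chosen for illustration. It relies only on the standard behavior of `assert()`, which compiles to nothing when `NDEBUG` is defined, as is typical for release builds.

```c
#include <assert.h>

#define NO_COLUMN (-1)  /* hypothetical sentinel meaning "no real column" */
#define N_COLUMNS 4     /* hypothetical size of the array */

/* Guarded only in debug builds: with NDEBUG defined the assert vanishes,
 * and an index of -1 would read memory before the start of the array,
 * which is undefined behavior. */
int lookup_debug_only(const int *slots, int idx) {
    assert(idx >= 0 && idx < N_COLUMNS);
    return slots[idx];
}

/* Guarded in every build: out-of-range indexes, including the sentinel,
 * are rejected explicitly. Returns 0 on success and -1 on a bad index;
 * *out is written only on success. */
int lookup_checked(const int *slots, int idx, int *out) {
    if (idx < 0 || idx >= N_COLUMNS) {
        return -1;
    }
    *out = slots[idx];
    return 0;
}
```

Calling `lookup_checked(slots, NO_COLUMN, &value)` returns -1 and leaves `value` untouched whatever the build flags, whereas `lookup_debug_only` depends on the assertion still being compiled in.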
During the latest test, the team gathered recent commits from the SQLite repository and manually filtered out trivial changes so that the AI could focus on substantive ones. Ultimately, the model, built on Gemini 1.5 Pro, identified an error connected to changes in commit [1976c3f7].
The vulnerability could be exploited through a specially crafted database supplied to a victim, or via SQL injection. Google acknowledges that the flaw is difficult to exploit, but the company still views the AI’s success as a breakthrough.
Traditional vulnerability detection methods, such as fuzzing, had failed to identify this issue. According to Google, this is the first time an AI model has uncovered a previously unknown vulnerability in widely used software. Big Sleep detected the flaw in early October while analyzing changes in the project’s source code, and SQLite developers patched it the same day, preventing it from reaching an official release.
Google emphasizes that, despite significant advances in fuzzing, defenders need methods that can find vulnerabilities beyond fuzzing’s reach, and the company hopes that AI can help bridge this gap. Big Sleep remains in the research phase and is currently used to analyze small programs with known vulnerabilities. Google notes that the results are still experimental.