False Alarms: AI-Generated Bug Reports Waste Developers’ Time and Resources
Developers of open-source software are now facing a new challenge: an influx of low-quality bug reports generated by artificial intelligence. Experts liken this situation to the flood of misinformation on social media, which fact-checking systems have struggled to combat.
Seth Larson, a security expert at the Python Software Foundation, recently shared his observations in a blog post. He noted a sharp rise in spam vulnerability reports created with language models. These reports appear highly convincing, forcing developers to spend valuable time meticulously analyzing and debunking them.
This issue is particularly grave for major open-source projects like Python, WordPress, and Android, which form the backbone of the modern internet. Most of these projects are maintained by small groups of dedicated volunteers who work without compensation. Genuine vulnerabilities in widely used code libraries pose significant risks, as their exploitation can lead to catastrophic systemic damage.
Developer Daniel Stenberg openly criticized a user on HackerOne for submitting an AI-generated bug report, stating:
“You submitted a nonsensical report, evidently generated by artificial intelligence, claiming a security issue simply because the AI convinced you of its existence. You then wasted our time by failing to disclose that a program created this for you, and continued the discussion with equally nonsensical responses—likely AI-generated as well.”
Code generation is becoming an increasingly popular application of large language models, although its utility remains a topic of debate among developers. Tools like GitHub Copilot and ChatGPT are adept at scaffolding basic project structures and helping locate relevant functions in code libraries. However, like all current AI systems, language models sometimes produce flawed or incomplete code. They have no conceptual understanding of programming; they operate probabilistically, predicting likely output from patterns in their training data.
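The probabilistic prediction described above can be sketched with a toy bigram model. This is a deliberate simplification for illustration only: real language models use neural networks trained on vast corpora, but the underlying principle is the same, the model emits whatever continuation was statistically most common in its training data, with no understanding of whether it is true.

```python
from collections import Counter, defaultdict

# Tiny "training corpus": the model will learn which token tends
# to follow which, purely from frequency counts.
corpus = "the bug is real the bug is fake the report is fake".split()

# Count bigram transitions: transitions[prev][next] = occurrences.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the statistically most likely next token after `token`."""
    counts = transitions[token]
    if not counts:
        return "<unk>"  # token never seen in training
    return counts.most_common(1)[0][0]

# "fake" follows "is" twice in the corpus, "real" only once, so the
# model confidently predicts "fake" regardless of the actual facts.
print(predict_next("the"))  # -> bug
print(predict_next("is"))   # -> fake
```

The point of the sketch: the model's confidence reflects frequency in its training data, not correctness, which is exactly why a generated "vulnerability report" can read convincingly while being nonsense.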
Building a fully functional project still requires a deep understanding of the programming language, the ability to debug code, and a clear vision of the application's overall architecture. Experts predict that code-generation tools will have the greatest impact on novice programmers, enabling them to build simple applications almost entirely with AI assistance; indeed, such applications likely already exist.
Platforms like HackerOne, which reward vulnerability discoveries, may inadvertently encourage the misuse of tools like ChatGPT for analyzing code and submitting fraudulent AI-generated bug reports.
Although spam is far from a new phenomenon on the internet, advanced technology has significantly simplified its mass production. Addressing this issue may require new defensive mechanisms, such as more advanced CAPTCHA systems.