Ethical AI Research vs. Corporate Interests: Battle Over Protections Heats Up
A group of scientists and hackers is seeking permission to break the terms of service of AI companies in order to conduct good-faith research that uncovers biases, inaccuracies, and details of training data, without risking legal liability.
The U.S. government is considering an exemption to copyright law that would allow circumvention of the technical protection measures built into AI systems. Researchers could then examine how these systems operate, identify bias, discrimination, and inaccuracies, and learn more about the training data. The exemption would cover good-faith security and academic research even when it requires bypassing such protections.
The U.S. Department of Justice supports the proposal, asserting that good-faith research can uncover data leaks and systems with unsafe or inaccurate outputs. This is particularly crucial when AI is used for significant purposes where errors could cause serious harm.
Much of what is known about the inner workings of proprietary AI tools such as ChatGPT and Midjourney has come from researchers, journalists, and users deliberately trying to trick the systems into revealing their training data, biases, and weaknesses.
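To give a sense of what this kind of probing looks like in practice, a researcher might send templated prompts that vary a single attribute and compare the model's responses. The sketch below is a minimal illustration, not any specific team's methodology; the model name, prompt template, and name list are hypothetical, and it assumes the openai Python SDK's chat completions interface.

```python
# Minimal sketch of a templated bias probe: send otherwise-identical
# prompts that vary only one attribute (here, a first name) and
# compare the completions for systematic differences in tone.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical template and names; real audits use far larger, controlled sets.
TEMPLATE = "Write a one-sentence performance review for {name}, a software engineer."
NAMES = ["Emily", "Lakisha", "Brad", "Jamal"]

for name in NAMES:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model identifier
        messages=[{"role": "user", "content": TEMPLATE.format(name=name)}],
        temperature=0,  # keep outputs as deterministic as possible for comparison
    )
    print(f"{name}: {response.choices[0].message.content}")
```

Real audits repeat such queries at scale and score the outputs statistically, which is exactly the kind of automated, repeated access that terms of service often prohibit.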
However, such research often violates terms of service. For example, OpenAI’s terms prohibit attempts at reverse engineering or discovering the source code of its services, as well as bypassing protective measures.
MIT researchers advocating for the proposal noted that there are numerous concerns about AI models, their structure, their biases, and their use in discriminatory ways, yet researchers conducting good-faith work often face account bans or fear legal repercussions. These conditions hinder research, and companies are not always transparent about how they enforce their terms.
The exemption would pertain to Section 1201 of the Digital Millennium Copyright Act (DMCA). Other exemptions from this section already allow device hacking for repairs and protect security researchers trying to find bugs and vulnerabilities, and in some cases, those trying to archive or preserve certain types of content.
There are many examples of academic papers, journalistic investigations, and research projects that require hacking or tricking AI tools to reveal training data, biases, or unreliability.
The authors of the proposal published an analysis citing earlier Midjourney terms of service that threatened users with damage claims over intellectual property infringement. The researchers argue that AI companies have begun using their terms of service to deter analysis.
Many studies focus on getting AI to reproduce copyrighted works in order to prove that the models were trained on such material. The music industry used this tactic to demonstrate that the AI tools Suno and Udio were trained on copyrighted music, and it became a central element of the lawsuit against the two companies. It is easy to imagine a scenario in which a researcher prompts an AI tool into reproducing copyrighted works, the findings harm the company, and the company responds by accusing the researcher of violating its terms of service.
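In practice, such studies often take the form of a memorization probe: prompt the model with the opening of a work and check whether its continuation matches the original verbatim. The sketch below illustrates the idea under stated assumptions; it assumes the openai Python SDK and a hypothetical model name, uses a public-domain passage standing in for a copyrighted one, and is not the methodology of any specific study.

```python
# Minimal sketch of a memorization probe: prompt the model with the
# opening of a known text and measure verbatim overlap between its
# continuation and the original. Helper names here are hypothetical.
from difflib import SequenceMatcher
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def verbatim_overlap(generated: str, original: str) -> float:
    """Ratio of the longest common contiguous block to the original's length."""
    match = SequenceMatcher(None, generated, original).find_longest_match(
        0, len(generated), 0, len(original)
    )
    return match.size / max(len(original), 1)

# Public-domain example (Moby-Dick) standing in for a copyrighted work.
prompt_prefix = "Continue this passage exactly: 'Call me Ishmael. Some years ago'"
known_continuation = (
    "never mind how long precisely, having little or no money in my purse"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model identifier
    messages=[{"role": "user", "content": prompt_prefix}],
    temperature=0,
)
generated = response.choices[0].message.content
print(f"verbatim overlap: {verbatim_overlap(generated, known_continuation):.0%}")
```

A high overlap score across many probes is the kind of evidence such studies rely on, and producing it at scale is precisely the activity that terms of service, and potentially Section 1201, put at legal risk.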
Harley Geiger of the Hacking Policy Council told the Copyright Office that the exemption is crucial for identifying and correcting algorithmic errors, and that the lack of clear legal protection under Section 1201 of the DMCA chills such research. The exemption would not stop companies from trying to prevent this kind of analysis, but it would legally protect researchers who violate terms of service to carry it out.
At hearings this spring, Morgan Reed of the App Association, which represents many AI companies, argued that researchers should obtain a company's consent before conducting their analysis; without such notice, he said, a researcher is effectively indistinguishable from a potentially malicious hacker. Reed stressed that researchers are asking for protection from liability only after the fact.
The DVD Copy Control Association, which represents major film studios and was a pioneer of DRM, also opposes the proposed exemption.