Table 1 Cybersecurity risks in automatic code generation.
| Code no | Cybersecurity risks (CRs) | Description | Examples |
|---|---|---|---|
| CR1 | Injection attacks19 | Automatic code generation can inadvertently introduce injection vulnerabilities, such as SQL or command injection, where attackers can inject malicious input to manipulate the code’s behavior | Malicious inputs that manipulate database queries or system commands to leak sensitive data or gain unauthorized access |
| CR2 | Code quality and logic errors64 | Generated code may contain logical flaws or inefficiencies that lead to vulnerabilities. These flaws can arise from the AI's limited contextual understanding and might result in improper error handling, unchecked data flows, or poor design practices | Poor input validation or incorrectly thrown exceptions might expose the system to exploitation |
| CR3 | Malicious code injection into AI models | Adversaries could inject backdoors or other types of malicious code into the generative AI model during the model training process, or the model can mistakenly output insecure code | A malicious actor could train an AI system on malicious software, leading the tool to create software with embedded vulnerabilities or unauthorized entry points |
| CR4 | Vulnerabilities in reused code (legacy dependencies)65 | Most auto-generated code depends on existing libraries and frameworks whose risk profile is not being managed | Generating code that uses an old library version with a known buffer overflow, which an attacker could abuse |
| CR5 | Insufficient input validation30 | Generated code may not include sufficient input validation and could be compromised by users sending carefully crafted input designed to cause buffer overflows or execute malicious code | AI-generated code for user authentication fails to properly validate input fields, leaving it vulnerable to SQL injection or buffer overflow |
| CR6 | Weak authentication and authorization mechanisms66 | Generated code may lack secure, strong authentication and authorization mechanisms, exposing systems to unauthorized access, privilege elevation, etc | A web application created by AI might let users bypass the login, or its access control might be weak or ineffectively implemented, allowing privilege escalation |
| CR7 | Lack of encryption or insecure data handling67 | Generated code may overlook encrypting data or store it insecurely, inviting theft of or unauthorized access to data | A password management application that stores passwords or financial information in clear text; if the application is compromised, sensitive data can be easily captured |
| CR8 | Reusability of vulnerable code10 | Code sharing in AI models could propagate the same vulnerable code across different projects. Such security risks, if not addressed, can spread throughout different applications and enterprises | A widely used AI tool may produce code from an API with unchecked security flaws, and that code might go unnoticed across multiple systems |
| CR9 | Lack of secure code review and testing68 | AI-generated code might not be scrutinized as thoroughly as hand-written code, so security vulnerabilities could be missed | The source code is not necessarily subjected to static analysis for security weaknesses and exploitable vulnerabilities |
| CR10 | Adversarial attacks on AI models21 | The AI model used for code generation could be attacked by adversaries who manipulate the training data so that the AI outputs malicious or incorrect code | An adversarial input can fool the AI into producing code with vulnerabilities or adding backdoors to the system |
| CR11 | Over-reliance on AI model21 | Developers can rely too heavily on AI-generated scripts and skip due diligence, resulting in vulnerable systems or code that is not fully understood | Unthinkingly adopting AI-generated code without scrutinizing it can result in insecure algorithms or suboptimal security methods |
| CR12 | Privacy issues and data leakage69 | AI models can produce code that unintentionally exposes private data, resulting in data leakage or violations of privacy regulations such as GDPR or HIPAA | An AI-created web form might mismanage personal data and expose user information to third parties |
| CR13 | Insecure integration with other systems70 | Auto-generated code may create vulnerabilities if not safely integrated with other systems or services | A generated code snippet for third-party payment API integration might expose an API key or handle payment-gateway authentication poorly |
| CR14 | Insufficient logging and monitoring71 | Code produced by AI may lack proper logging and monitoring capabilities designed to trace anomalous activity or incidents | Without adequate logging, serious security incidents, including attempts to gain unauthorized access, could go unnoticed and unchallenged |
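To make CR1 and CR5 concrete, the following minimal Python sketch contrasts a vulnerable string-interpolated SQL query with a parameterized one. All names (table `users`, functions `find_user_vulnerable`/`find_user_safe`) are hypothetical and chosen only for illustration; the point is the pattern, not a specific generated codebase.

```python
import sqlite3

def setup_db():
    # In-memory database with one illustrative record
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    conn.commit()
    return conn

def find_user_vulnerable(conn, name):
    # UNSAFE (CR1/CR5): user input is interpolated directly into the SQL
    # string, so crafted input can alter the query's logic
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # SAFE: the driver binds the value as data, so it cannot change
    # the query structure
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = setup_db()
payload = "' OR '1'='1"  # classic injection payload
print(find_user_vulnerable(conn, payload))  # returns every row: [('alice',)]
print(find_user_safe(conn, payload))        # returns no rows: []
```

Reviewing generated database code for string-built queries of this kind, and replacing them with parameterized queries, addresses the most common instance of CR1.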