Table 1 Cybersecurity risks in automatic code generation.

From: A generative AI cybersecurity risks mitigation model for code generation: using ANN-ISM hybrid approach

| Code no. | Cybersecurity risks (CRs) | Description | Examples |
|---|---|---|---|
| CR1 | Injection attacks [19] | Automatic code generation can inadvertently introduce injection vulnerabilities, such as SQL or command injection, where attackers supply malicious input to manipulate the code's behavior | Malicious inputs that manipulate database queries or system commands to leak sensitive data or gain unauthorized access (see the SQL injection sketch after this table) |
| CR2 | Code quality and logic errors [64] | Generated code may contain logical flaws or inefficient constructs that lead to vulnerabilities. These flaws can arise from the AI's limited contextual understanding and may result in improper error handling, unchecked data flows, or poor design practices | Poor input validation or incorrect exception handling can leave the system open to exploitation |
| CR3 | Backdoors and malicious code [2,7] | Adversaries could inject backdoors or other malicious code into the generative AI model during training, or the model may inadvertently output insecure code | A malicious actor could train an AI system on malicious software so that the tool produces code with embedded vulnerabilities or unauthorized entry points |
| CR4 | Vulnerabilities in reused code (legacy dependencies) [65] | Auto-generated code often depends on existing libraries and frameworks whose risk profiles are not actively managed | Generated code that uses an old library version with a known buffer overflow, which an attacker could exploit |
| CR5 | Insufficient input validation [30] | Generated code may not include sufficient input validation and could be compromised by users sending carefully crafted input designed to cause buffer overflows or execute malicious code | AI-generated code for user authentication fails to properly validate input fields, leaving it vulnerable to SQL injection or buffer overflow |
| CR6 | Weak authentication and authorization mechanisms [66] | Generated code may lack secure, strong authentication and authorization mechanisms, exposing systems to unauthorized access, privilege escalation, etc. | An AI-generated web application might let users bypass the login, or its access controls might be weak or poorly implemented, allowing privilege escalation |
| CR7 | Lack of encryption or insecure data handling [67] | Generated code may neglect to encrypt data or may store it insecurely, inviting theft or unauthorized access to data | A password-management application that stores passwords and/or financial information in clear text; if the application is compromised, sensitive data is easily captured (see the hashing sketch after this table) |
| CR8 | Reusability of vulnerable code [10] | Code sharing in AI models could reproduce the same vulnerable code across different projects. If left unaddressed, such security risks can spread across applications and enterprises | A widely used AI tool may produce code from an API with unreviewed security flaws, and that code may go unnoticed across multiple systems |
| CR9 | Lack of secure code review and testing [68] | AI-generated code may not be scrutinized as rigorously as hand-written code, so security vulnerabilities can be missed | The source code is not necessarily subjected to static analysis for security weaknesses and exploitable vulnerabilities |
| CR10 | Adversarial attacks on AI models [21] | The AI model used for code generation could be attacked by adversaries who manipulate its training data so that the AI outputs malicious or incorrect code | Adversarial input can fool the AI into producing code with vulnerabilities or adding backdoors to the system |
| CR11 | Over-reliance on AI model [21] | Developers may rely too heavily on AI-generated scripts and skip due diligence, resulting in vulnerable systems or code that is not fully understood | Blindly accepting AI-generated code without scrutiny can lead to implementing insecure algorithms or suboptimal security methods |
| CR12 | Privacy issues and data leakage [69] | AI models can produce code that unintentionally exposes private data, causing two data-level issues seen across application domains: data leakage and violations of privacy regulations such as GDPR or HIPAA | An AI-created web form might mishandle personal data and expose user information to third parties |
| CR13 | Insecure integration with other systems [70] | Auto-generated code may introduce vulnerabilities if it is not integrated safely with other systems or services | A generated code snippet for a third-party payment API integration may expose an API key, or the payment gateway may handle authentication poorly (see the secrets-handling sketch after this table) |
| CR14 | Insufficient logging and monitoring [71] | AI-produced code may lack the logging and monitoring capabilities needed to trace anomalous activity or security incidents | Without adequate logging, serious security incidents, including unauthorized access attempts, could go unnoticed and unaddressed (see the logging sketch after this table) |
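To make CR1 and CR5 concrete, the sketch below contrasts the string-concatenated query pattern that generated code sometimes emits with the parameterized form that defeats SQL injection. It is a minimal sketch assuming a SQLite-backed user lookup; the function names and the users table are hypothetical.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # VULNERABLE (CR1/CR5): user input is spliced into the SQL text, so an
    # input such as "x' OR '1'='1" rewrites the query's logic.
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # SAFER: a parameterized query keeps the input as data, never as SQL.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()
```

The same data-not-code discipline addresses the command-injection variant: passing an argument list to a subprocess rather than interpolating user input into a shell string.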
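For CR7, this sketch shows the clear-text credential storage the example column describes, next to a safer variant that persists only a salted PBKDF2 hash via the standard-library hashlib. The file format and field names are illustrative assumptions, not a prescribed design.

```python
import hashlib
import os

def store_password_unsafe(path: str, username: str, password: str) -> None:
    # VULNERABLE (CR7): credentials are written in clear text, so any
    # compromise of the file exposes every password directly.
    with open(path, "a") as f:
        f.write(f"{username}:{password}\n")

def store_password_safe(path: str, username: str, password: str) -> None:
    # SAFER: store only a salted PBKDF2 hash; the clear-text password is
    # never persisted and cannot be recovered from the stored record.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    with open(path, "a") as f:
        f.write(f"{username}:{salt.hex()}:{digest.hex()}\n")
```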
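For CR13, the pattern below illustrates why hard-coding a secret into generated integration code is dangerous, alongside the common alternative of reading it from the environment at run time. The PAYMENT_API_KEY variable and the endpoint URL are hypothetical placeholders, not a real payment provider's API.

```python
import os
import urllib.request

# VULNERABLE (CR13): a secret embedded in generated source code ends up in
# version control and in every distributed copy of the program.
API_KEY_HARDCODED = "sk_live_example_do_not_do_this"  # hypothetical key

def call_payment_api(amount_cents: int) -> bytes:
    # SAFER: load the secret from the environment (or a secrets manager) so
    # it never appears in the source tree. PAYMENT_API_KEY and the endpoint
    # below are placeholder names used for illustration only.
    api_key = os.environ["PAYMENT_API_KEY"]
    req = urllib.request.Request(
        "https://payments.example.com/v1/charge",
        data=f"amount={amount_cents}".encode(),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```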
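Finally, for CR14, a minimal sketch of the audit logging whose absence the last row warns about, using Python's standard logging module; check_login and its password_ok flag are hypothetical stand-ins for a real credential check.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("auth")

def check_login(username: str, password_ok: bool) -> bool:
    if not password_ok:
        # CR14: without a record like this, repeated unauthorized-access
        # attempts leave no trace for monitoring or incident response.
        log.warning("failed login attempt for user %r", username)
        return False
    log.info("successful login for user %r", username)
    return True
```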