Introduction

Generative Artificial Intelligence (GenAI) is profoundly changing art and design creation. Generative AI software, represented by AI drawing tools, has spread rapidly by virtue of its efficient generation and style transfer capabilities; its market continues to expand and is now at a critical stage of “quality upgrading”. According to the “China Artificial Intelligence Industry ‘Tenth Five-Year Plan’ Development Situation Research and Forecast Report” by the China Research Institute of Puhua, the global artificial intelligence market had exceeded 600 billion U.S. dollars by the end of 20241. Such tools have lowered the threshold of creativity and promoted a nationwide trend of “everyone can create”, but their popularization is also accompanied by controversy. Some scholars believe that bringing artificial intelligence into content production alienates creative workers from their role as the subject of labor: creative work originally done by humans is gradually ceded to AI as a quasi-subject, which leads to technological dependence and the erosion of workers’ own creative ability2,3. Other scholars regard the rapid development of AI as a double-edged sword for individual creativity. On the one hand, AI enables individuals to complete more complex and challenging tasks, further strengthening their creative role identity; on the other hand, the deep involvement of AI can shift humans from a dominant role in creativity to a dependent one4,5,6,7.
These conflicting findings show that the academic community still lacks systematic empirical evidence on the intrinsic mechanisms by which generative AI technologies affect human innovation behavior. Few existing discussions reveal how external factors form synergistic or antagonistic relationships with individual creativity from the dual perspectives of user interaction experience and psychological cognition. Therefore, this study employs the Stimulus-Organism-Response (SOR) theory as its structural framework and integrates Self-Determination Theory (SDT) as a theoretical lens for psychological transformation mechanisms. This integration addresses the limitations of traditional SOR models in explaining proactive creativity by positioning AI painting tools as stimulus sources that fulfill users’ intrinsic needs, such as competence and autonomy. Building on this foundation, the study combines Partial Least Squares Structural Equation Modeling (PLS-SEM), Necessary Condition Analysis (NCA), and Fuzzy-Set Qualitative Comparative Analysis (fsQCA) to reveal, from a configurational perspective, the influencing factors and driving modes of innovative behavior among users of such tools, and to provide actionable multidimensional paths for optimizing the human-machine collaboration paradigm.

Literature review

The research team compiled literature related to four aspects of this study (see Fig. 1): the fundamental framework underpinning this research, an overview of human-computer interaction theories, current research directions on AI painting tools, and a conceptual analysis of creativity and innovative behavior. This section clarifies the research subject, research framework, current state of research, and research gaps in this study.

Fig. 1 Literature review.

The basic framework integrating the SOR model and Self-Determination Theory

The Stimulus-Organism-Response (SOR) theory was proposed by Mehrabian and Russell in 1974. It was initially applied in environmental psychology to explain the influence of environmental stimuli on human behavior and is also known as the environmental psychology model8. By introducing the concept of the “Organism”, SOR theory addresses a shortcoming of traditional behaviorist stimulus-response (S-R) models, which neglected psychological mediating mechanisms, and places greater emphasis on internal psychological processes9. SOR theory holds that external stimuli (Stimulus) from the environment, space, symbols and similar factors change an individual’s internal cognition and emotions (Organism), which in turn shape the individual’s behavioral response (Response). For example, Gao Xuezheng10 and Hu Ting11 used SOR theory to study how external stimuli such as crowding affect the satisfaction of consumers and tourists, with emotions acting as the mediator.

As SOR theory has developed, its applications have gradually extended from offline consumption scenarios to digital ones. Examples include studies of users’ privacy concerns about the interaction interfaces of social robots12, the impact of a virtual anchor’s “persona” on consumers’ purchase intention13, and the mechanisms behind the willingness to pay a premium for digital tourism products14, which together demonstrate the theory’s cross-domain applicability. In the context of AI painting explored in this study, tool performance and interface design grounded in human-computer interaction principles constitute the core external stimuli. These technical features drive individual innovative behavioral responses by either fulfilling or suppressing users’ intrinsic psychological needs.

However, traditional SOR models are often criticized for their behaviorist bias, tending to view users as passive responders to environmental stimuli. This approach proves inadequate when explaining highly proactive and volitional “innovative behaviors.” To resolve this epistemological tension, this study incorporates Self-Determination Theory (SDT) and Barab et al.’s perspectives on motivation and gamification in digital environments to illuminate the “Organism” black box. SDT emphasizes that individual behavior is driven by intrinsic psychological needs (such as Competence and Autonomy)15, while Barab’s research further indicates that in environments rich with interactive technologies, users’ Immersion and Agency are key to stimulating creativity16,17. Within this integrated framework, SOR provides the logical structure for causal pathways, while theories like SDT explain why external technological stimuli (S) can translate into intrinsic innovative motivation (O). Specifically, when the technical features of AI tools (S) support users’ creative self-efficacy (satisfying competence needs) and role identification (satisfying autonomy needs), such environmental stimuli cease to be mere reflex triggers and instead become catalysts for users’ innovative behaviors18.

Theoretical foundations related to human-computer interaction

Human-Computer Interaction (HCI) is a theoretical framework for analyzing the interactive relationship between systems and users. It refers to the design approach where designers employ effective interaction methods to achieve optimal usability and user experience through human-machine interaction, while also meeting users’ psychological expectations19. HCI lies at the intersection of computer science, cognitive psychology, ergonomics, design, and several other research fields. Its foundational work can be traced back to the General Problem Solver (GPS) proposed by Newell and Simon20,21. The essence of GPS design lies in formalizing the “means-ends analysis” strategy—considered the core of human general intelligence—into an algorithmic framework that can run on a computer. American computer scientist Ben Shneiderman further expanded the psychological dimension of human-computer interaction design theory. His works present his eight golden rules of interface design (such as achieving consistency, providing information feedback, and offering user-friendly error prompts), emphasizing that interface design must align with human scale to ensure usability, ease of use, and efficiency22. Alan Dix believes that interactive interfaces need accurate prompts and feedback so that users can correct subsequent operations based on the feedback content. The formative evaluation proposed in his work emphasizes the importance of design iteration, which is consistent with the logic of AI drawing tool users using prompts to correct the content of their works23. Jenny Preece focuses on the diverse changes in user emotions and feelings during human-computer interaction that are influenced by interface design24. In summary, HCI theory provides concrete physical property definitions for the “Stimulus” dimension within the SOR framework. 
Specifically, Shneiderman’s eight golden rules for interface design serve not only as design principles but also as the theoretical foundation for the core independent variables in this study. The principles “Offer Informative Feedback” and “Reduce Short-Term Memory Load” directly define the “Tool Performance” variable, while “Support Internal Locus of Control” serves as the core basis for the “Human-AI Interaction Level” variable. By mapping HCI principles to concrete environmental stimuli (S), this research establishes a direct causal chain linking technical features to user psychology (O).

Research related to AI drawing tools

AI painting tools are software platforms25 (e.g., Midjourney, Stable Diffusion and Instant Dream AI) that use generative AI technologies such as Generative Adversarial Networks (GANs) and diffusion models to generate new artworks automatically by algorithmically parsing and learning from massive image data and simulating the human painting process. Their core functions, including style transfer, multi-model fusion, text-to-image and image-to-image generation, have significantly lowered the threshold of entry to painting and design in an easy and intuitive way. Current research on AI drawing tools focuses on the following aspects:

  1. Users’ willingness to continue using AI drawing tools. Examples include empirical research on the use and conversion intentions of various groups, such as Chinese users26, college students27, laymen28 and designers29, based on the Technology Acceptance Model, the Unified Theory of Acceptance and Use of Technology, and the Expectation Confirmation Model.

  2. Human-machine interaction modes and AI drawing practice methods. For example, Xiaoyue Ma et al.30 studied the categories of communication barriers that users encounter in the AI drawing process and the corresponding repair strategies, deepening the understanding of users’ psychology and preferences with a view to improving the efficiency of human-machine interaction. Many other scholars have explored the effects of AI drawing tools such as Midjourney and Wenxin Yige in industrial design31,32, product design33, intangible cultural heritage innovation34 and textile and clothing design35, with a view to forming a new design paradigm36.

  3. Philosophical and ethical reflections on AI art37. For example, Chen Jiajia et al.38 ask, “Will professional values be formatted by AI painting?” and argue that creators should not be overly anxious but should, with the help of AI technology, form new artistic values and break through creative bottlenecks at the conceptual and practical levels. Li Wanyue et al.39 used experimental methods to explore differences in individuals’ aesthetic evaluation of AI paintings, electronic paintings and handmade paintings, and concluded that underestimation of creativity is the main factor.

Existing studies thus offer no systematic explanation of the “double-edged sword” effect of AI technology and rarely examine the deeper links among technical characteristics, psychological variables and user behavior. A theoretical gap remains regarding the influencing factors and driving mechanisms of innovative behavior among users of AI painting tools.

Creativity and innovative behavior

Although creativity and innovative behavior are often used interchangeably, they are distinct concepts40. Creativity, which is unique to human beings, is a complex and multifaceted combination of knowledge, intelligence, ability and good personality qualities; it refers both to the ability to produce novel and valuable ideas, methods and things, and to the mental qualities needed to accomplish creative activities41. Innovative behavior, by contrast, is more specific: it focuses on the process of transforming creativity into practical results with the aim of producing innovative outcomes42. Creativity therefore provides the original breakthroughs in thinking that innovative behavior needs, while innovative behavior is the outgrowth and implementation of creativity. Existing research on the factors influencing individual innovative behavior mostly falls into three categories: personal factors (personal psychological traits such as personality49, motivation43 and curiosity44), environmental factors (such as work pressure, colleague support45 and psychological atmosphere46) and leadership factors (leadership style and management level)42. Research on the association between AI and innovative behavior focuses on whether the former empowers or disempowers the latter and on the underlying drivers of innovation. For example, Zhang Yuqi et al.47 found that AI anxiety has both positive and negative effects on individuals’ innovative work behaviors but promotes them overall; Wang Hongli et al.7 constructed a theoretical model through long-term tracking research and concluded that the use of AI may lead to automation bias towards AI, undermining the two beneficial paths of “I can innovate” and “I should innovate” and degrading the individual’s creative personality.
However, current research rarely focuses on the users of a specific type of AI tool. This study introduces SOR theory and proposes an integrated path of “technical performance - psychological cognition - innovation behavior”, which can inform the optimization of user experience and the promotion of innovative behavior among AI drawing tool users.

Methods

Research methodology and process

Partial least squares structural equation modeling

Partial Least Squares Structural Equation Modeling (PLS-SEM) is a multivariate statistical analysis technique that combines features of Principal Component Analysis and Canonical Correlation Analysis. It is suitable for exploratory research and for testing complex models48, including the analysis of mediating and moderating effects. PLS-SEM copes well with small samples and non-normal data49 and is widely used in the social sciences, including market research, information systems, and organizational management. In this study, PLS-SEM was used to test data reliability and validate the model, identifying the condition variables that affect users’ innovative behavior and the strength of their associations.

Necessary condition analysis

To address the limited treatment of “necessary but not sufficient” logical relationships in contemporary organizational science research50, Professor Dul of the Rotterdam School of Management in the Netherlands proposed Necessary Condition Analysis (NCA) in 2016, a method for identifying and testing necessary but not sufficient conditions in data51. As a relatively new data analysis tool, it complements existing methods by identifying necessary conditions in a more rigorous and comprehensive way.

Fuzzy-set qualitative comparative analysis

Compared with traditional regression analysis, Fuzzy-set Qualitative Comparative Analysis (fsQCA), which takes a configurational perspective, can handle causal complexity such as conjunctural causation, equifinality and asymmetry52. In this study, the innovative behavior of AI drawing tool users results from multiple conditions acting together, so this method allows us to explore the driving paths of complex causality and identify multiple, equally effective combinations of conditions.
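To make the configurational logic concrete, the following minimal sketch shows how membership in a combination of conditions and its set-theoretic consistency and coverage are commonly computed in fuzzy-set analysis. It is an illustration only, not the study's procedure: the function names and the membership scores are hypothetical, and the fsQCA software used in the study performs these steps internally.

```python
import numpy as np


def configuration_membership(conditions):
    """Membership in a conjunction (logical AND) of conditions.

    `conditions` has shape (n_conditions, n_cases); the fuzzy AND is the
    case-wise minimum across conditions.
    """
    return np.min(np.asarray(conditions, dtype=float), axis=0)


def sufficiency_consistency(x, y):
    """Degree to which membership in X is a subset of membership in Y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.minimum(x, y).sum() / x.sum()


def sufficiency_coverage(x, y):
    """Share of the outcome Y that is accounted for by X."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.minimum(x, y).sum() / y.sum()


# Hypothetical calibrated scores for three cases (not data from the study).
tool_performance = [0.8, 0.6, 0.2]
role_identity = [0.9, 0.7, 0.1]
innovative_behavior = [0.9, 0.5, 0.3]

config = configuration_membership([tool_performance, role_identity])
consistency = sufficiency_consistency(config, innovative_behavior)
coverage = sufficiency_coverage(config, innovative_behavior)
```

Because different configurations can each reach high consistency for the same outcome, this set-theoretic view accommodates several equally effective paths, which regression coefficients do not express directly.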

Research process

The overall research process is illustrated in Fig. 2 and comprises four main stages. First, relevant literature was reviewed and synthesized to clarify the research direction. Second, a research model was established based on SOR, SDT, and HCI theories, with variables for each stage determined according to existing research findings, resulting in a schematic model diagram. Third, mature scales from domestic and international sources were adapted and pre-tested to finalize the questionnaire items, and the collected data were organized and analyzed using a mixed methodology of PLS-SEM, NCA, and fsQCA. Finally, findings were discussed based on the analysis results.

Fig. 2 Schematic diagram of the study process.

Variable selection and research hypotheses

Degree of human-AI interaction

The classification of the degree of human-AI interaction is rooted in the theories of Newell, Simon, Dix, and others20,21,22,23,24: a high degree of interaction corresponds to human-AI interactive interaction, and a low degree corresponds to human-AI automation interaction6. Human-AI interactive interaction refers to close cooperation between humans and AI to complete a task53, in which interactive feedback is used to correct the output continuously. It usually forms a “human-led, AI-assisted” role relationship, which complies with the “Support internal locus of control” principle in Shneiderman’s golden rules. According to Shneiderman’s Human-Centered AI (HCAI) framework54, high-level human-computer interaction implies that users act as “drivers” rather than “passengers.” This sense of control over the system serves as a potent environmental stimulus, directly alleviating anxiety stemming from technological black boxes and thereby activating users’ innovative role identity. In human-AI automation interaction, by contrast, AI production takes the lead: humans use AI to complete tasks mechanically, often take the automated results directly as their own output, and become “auxiliary workers” of the technology. Relevant studies have concluded that human-AI interactive interaction often improves work efficiency and strengthens human intellectual ability rather than replacing it, freeing humans from mechanized and repetitive work so that they can take on more complex and challenging tasks55. It also gives humans more opportunities to participate in creative activities, which brings a sense of achievement and further enhances their creative role identity56.
In the human-AI automation interaction mode, humans simply rely on AI to complete tasks. Long-term use fosters technological dependence, limits their own learning ability, and triggers de-skilling57. At the same time, dependence on AI technology establishes AI as an authority, which may undermine individuals’ sense of efficacy in their innovation ability58, awakening a sense of being replaced by technology and a loss of innovation role identity59.

Tool performance

Tool performance, as the core variable for users to evaluate AI drawing tools, mainly reflects the two dimensions of perceived usefulness and perceived ease of use, which constitute the core factors of the Technology Acceptance Model (TAM)60. Perceived ease of use is usually based on the tool’s ease of operation, the intuitiveness and aesthetics of the interface, the smoothness of the interaction experience, and the high quality of the generated content, which influences the user’s willingness to continue to use the tool and behavior61. Perceived usefulness, on the other hand, focuses on the expected benefits of using an AI drawing tool, such as the improvement of creative efficiency, the satisfaction of personalized needs, and the assistance of personal creative goals. When the tool’s performance is synergistic with its usefulness and ease of use, users will form a sense of confirmation of their expectations, satisfy their needs for fun, and then form a continuous use behavior62. From an HCI perspective, tool performance is not merely a collection of features, but rather the concrete embodiment of the golden design principles: “Offer informative feedback” and “Permit easy reversal of actions.” For instance, AI painting tools generate multiple thumbnails for instant selection (feedback) and allow real-time prompt modification (reversibility). This combination creates a high-intensity technical stimulus that signals “environmental controllability” to users, thereby satisfying the organism’s psychological need for certainty within the SOR model.

Social influence

Social influence refers to the attitudes and expectations of the user’s social circle (family, friends, teachers, classmates, superiors, etc.) towards the use of AI drawing tools, i.e., whether this behavior is popular, accepted or encouraged. This variable is one of the core factors of the Unified Theory of Acceptance and Use of Technology (UTAUT)63. When users consider whether or not to use a technology, they are often influenced by the views conveyed at the social level. If users perceive that the groups around them view AI drawing tools positively, they will strengthen their own sense of gain and their confidence in their ability to innovate (i.e., innovation self-efficacy)64, while internalizing the use of the technology as a role identity consistent with a professional or creative identity.

Creative self-concept framework

The innovative self-concept framework contains two core dimensions: self-efficacy and role identity65. These dimensions align closely with the fundamental psychological needs outlined in SDT. Self-efficacy stems from an individual’s positive belief in their own innovative capabilities (“I can innovate”), corresponding to the “competence” dimension in SDT. When individuals perceive themselves as capable of controlling the task process, they are more willing to invest time and effort into challenging goals and complete the task spontaneously7. Role identity emphasizes the internalization of one’s recognition of the “innovator” identity (“I should innovate”), reflecting the “need for autonomy” within SDT. Research has indicated that the novelty requirements of creative ideas often induce feelings of risk, tension, and uncertainty in individuals, leading them to reject the implementation of such ideas in practice67. When AI tools foster high recognition of users’ creative abilities and affirm their identity as “innovators,” users’ intrinsic motivation for innovation is significantly activated. This surge in innovation motivation promotes individuals’ innovative attitudes and engagement, gradually solidifying into habits. Ultimately, it influences creative personality traits through the reconstruction of self-concept, positively impacting individuals’ innovative behaviors.

Perceived playfulness

Perceived playfulness is the third belief added by Davis et al. in the process of expanding the TAM model68. Some scholars have confirmed in subsequent internet product research that this variable is more important than perceived usefulness and perceived ease of use69. Perceived playfulness reflects users’ intrinsic motivation due to the novelty of interaction and the fun of creation in the process of using AI drawing tools70. Compared with external motivations such as perceived usefulness, this intrinsic emotional experience reflects users’ hedonic motivation to pursue novelty and entertainment, which can inspire users to actively try out the nontraditional functions of the tools, and this kind of positive experience of the task will make individuals willing to invest more efforts in pursuing challenging goals71. As demonstrated by the concept of “Transformational Play” proposed by Barab et al., when users inhabit a low-risk digital environment with instant feedback (similar to gaming), “playfulness” transcends mere entertainment72. It becomes a cognitive safe space that enables users to continuously experiment, hypothesize, and validate new ideas through trial and error. This gamified exploration mechanism serves as the psychological foundation enabling users to embrace breakthrough innovations with AI assistance. It provides the internal support for individuals to transition from technical tool users to innovation practitioners, ultimately driving the emergence and deepening of user-driven innovation.

Model building

To address the unique impact of AI technology on user innovation behavior, this study constructs an integrated model (see Fig. 3). Within this framework, HCI design principles (such as Shneiderman’s Golden Rules) form the theoretical specification for “Stimulus,” defining the technical attributes of AI painting tools. SDT explains the internal transformation of “Organism,” elucidating how technological stimuli drive behavior by satisfying the needs for competence and autonomy. Gender, educational attainment, industry affiliation, duration of exposure to AI painting tools, and age distribution serve as control variables. The specific path hypotheses are as follows:

H1: The degree of human-AI interaction positively affects the creative role identity of AI drawing tool users.

H2: The degree of human-AI interaction positively affects the creative self-efficacy of AI drawing tool users.

H3: The degree of human-AI interaction positively influences the innovative behavior of AI drawing tool users.

H4: AI drawing tool performance positively influences users’ perceived playfulness.

H5: Social influence positively affects AI drawing tool users’ creative role identity.

H6: Social influence positively affects AI drawing tool users’ creative self-efficacy.

H7: Social influence positively affects perceived playfulness of AI drawing tool users.

H8: Perceived playfulness of AI drawing tool users positively influences creative self-efficacy.

H9: Creative self-efficacy of AI drawing tool users positively influences innovative behavior.

H10: AI drawing tool users’ creative role identity positively influences innovative behavior.

Fig. 3 Research model.

Questionnaire design and revision

The questionnaire consists of two parts. The first part collects demographic information, i.e., basic information about the users, including gender, age distribution, education level and industry; these items are placed separately at the beginning and end of the questionnaire. The second part contains the scale items for variable measurement, which were referenced, integrated and modified from mature scales in the international literature and adjusted after pre-testing and user feedback. The final scale comprises 27 items (as shown in Table 1) corresponding to the 7 latent variables in the model, with at least 3 items per latent variable. All items were answered on a 5-point Likert scale to capture users’ tendencies toward the variables in the model.

Table 1 Scale variables and items.

Data collection

Because a complete sampling frame of AI drawing tool users is difficult to obtain, the research team used snowball sampling to recruit respondents. From January to April 2025, the team surveyed college teachers, designers, students and practitioners in non-design industries, distributing questionnaires both offline and online, and ultimately retrieved 305 valid responses (invalid responses were excluded based on trap questions). The demographic profile of the sample is as follows. Gender: 69.51% were female and 30.49% were male. Age: young people aged 18–35 made up the largest share (82.62%), and those under 18 the smallest (3.61%). Education: 74.76% of respondents had a bachelor’s degree or above, and 9.51% had a postgraduate degree or above. Industry: 62.95% were students and practitioners in design and related industries. Experience with AI drawing tools: 62.29% of respondents had used or been exposed to AI drawing tools for 2 months or more.

Statement: All methods were carried out in accordance with relevant guidelines and regulations. All experimental protocols were approved by the School of Media & Art Design, Wenzhou Business College (the first author’s affiliation). Informed consent was obtained from all participants and/or their legal guardians.

Results

Measurement model evaluation

The evaluation of the measurement model covers reliability, convergent validity and discriminant validity. Reliability was evaluated mainly through internal consistency, using Cronbach’s alpha and Composite Reliability (CR). The data were imported into SmartPLS 4.0 for calculation, and the results are shown in Table 2. The Cronbach’s alpha coefficients of all reflective constructs were greater than 0.8 and the CR values were all higher than 0.784, indicating that the measurement model has good reliability.
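For readers who want to see what the internal consistency check computes, the following is a minimal sketch of Cronbach's alpha using the standard formula. The study obtained its values from SmartPLS 4.0; the function name and the Likert responses below are hypothetical and serve only as an illustration.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    # Sample variance of each item and of the summed scale score.
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)


# Hypothetical 5-point Likert responses: 6 respondents, 4 items of one construct.
scores = [
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [3, 4, 3, 3],
]
alpha = cronbach_alpha(scores)
```

Alpha rises when the items of a construct vary together across respondents, which is why values above the conventional 0.7 cut-off are read as evidence of internal consistency.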

Convergent validity was assessed using factor loadings, Average Variance Extracted (AVE) and CR values; discriminant validity was assessed mainly using cross-loadings. As Tables 2 and 3 show, the standardized loadings of all items are greater than 0.6, the AVE values of the seven latent variables are all greater than 0.584, the CR values are all higher than 0.7, and the effects of each latent variable are significant. In addition, each item loads more highly on the construct to which it belongs than on any other construct, which satisfies the cross-loading criterion. The data therefore have good convergent and discriminant validity85.
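AVE and CR can both be derived from the standardized loadings of a construct's items. The sketch below uses the commonly cited formulas and assumes standardized reflective indicators, so that each item's error variance is one minus its squared loading. The loadings are hypothetical, not values from Table 2.

```python
def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings.

    Assumes each indicator's error variance equals 1 - loading**2.
    """
    sum_loadings = sum(loadings)
    error_variance = sum(1.0 - l * l for l in loadings)
    return sum_loadings**2 / (sum_loadings**2 + error_variance)


# Hypothetical loadings for a three-item construct.
example_loadings = [0.7, 0.8, 0.9]
ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
```

With these illustrative loadings, AVE is above 0.5 and CR is above 0.7, the same conventional thresholds against which the reported values are judged.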

Table 2 Measure model reliability and convergence validity assessment.
Table 3 Discriminant validity test.

The predictive power of the model was evaluated by the explained variance (R²) of the endogenous constructs; a larger R² indicates that the independent variables explain more of the variance of the dependent variable. In this study, the R² values of perceived playfulness, creative self-efficacy, creative role identity and innovation behavior all exceed 0.5 (values between 0.5 and 0.75 indicate moderate explanatory power)86, indicating that the explanatory power of the model meets the requirements.

Structural model evaluation

As shown in Tables 4 and 5, the path coefficients of H4 (β = 0.583, p < 0.001) and H8 (β = 0.524, p < 0.001) are relatively high, indicating stronger associations between tool performance and perceived playfulness and between perceived playfulness and creative self-efficacy. H3 (β = 0.214, p < 0.001), H9 (β = 0.325, p < 0.001) and H10 (β = 0.410, p < 0.001) hold, indicating that the degree of human-AI interaction, creative self-efficacy and creative role identity are each directly associated with user innovation behavior. The remaining hypotheses also have significant p-values, and there are seven mediating paths to innovation behavior, indicating that all the independent variables are directly or indirectly associated with the dependent variable. In addition, none of the control variables has a significant effect on user innovation behavior (p > 0.05).
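As a rough illustration of how a mediating path is quantified, the point estimate of a specific indirect effect is the product of the path coefficients along the chain. The sketch below applies this to the serial path from tool performance through perceived playfulness and creative self-efficacy to innovative behavior, using the coefficients quoted above and reading the H9 coefficient as 0.325. The values in Table 5 remain authoritative and may differ slightly because of rounding; the helper name is hypothetical.

```python
from math import prod

# Path coefficients quoted in the text for H4, H8 and H9.
paths = {"H4": 0.583, "H8": 0.524, "H9": 0.325}


def specific_indirect_effect(coefficients):
    """Point estimate of an indirect effect: the product of its path coefficients."""
    return prod(coefficients)


# Tool performance -> perceived playfulness -> creative self-efficacy -> innovative behavior
serial_effect = specific_indirect_effect(paths.values())
```

The product gives only a point estimate. Whether an indirect effect is significant is usually judged from bootstrapped confidence intervals, which this sketch does not attempt.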

Table 4 Hypothesis testing.
Table 5 Analysis of mediating effects.

Configuration analysis combining NCA and fsQCA

Data calibration

All independent variables were included as condition variables, with user innovation behavior as the outcome variable. The questionnaire scale data for each variable were averaged, and the direct calibration method was adopted, using the 95%, 50% and 5% quantiles as the three qualitative anchors for full membership (full affiliation), the crossover point and full non-membership, respectively; calibration was then performed with the algorithms of the fsQCA software87 (see Table 6). In fsQCA, cases with a membership score of exactly 0.5 are dropped by the software during the analysis; to overcome this problem, calibrated values of 0.5 were changed to 0.501 in this study88.
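
The calibration step can be illustrated with a short Python sketch of the direct method. It is an approximation under stated assumptions, not the fsQCA software's own code: the anchors are mapped to log-odds of +3, 0 and -3 as in common descriptions of the direct method, quantiles are computed by linear interpolation (the software or the authors may compute them differently), and all function names are hypothetical.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def calibrate(score, full_non, crossover, full_in):
    """Map a raw score to a fuzzy membership score via log-odds (direct method)."""
    if not full_non < crossover < full_in:
        raise ValueError("anchors must be strictly increasing")
    if score >= crossover:
        log_odds = 3.0 * (score - crossover) / (full_in - crossover)
    else:
        log_odds = 3.0 * (score - crossover) / (crossover - full_non)
    log_odds = max(-50.0, min(50.0, log_odds))  # guard against overflow in exp
    membership = 1.0 / (1.0 + math.exp(-log_odds))
    # Cases at exactly 0.5 would be dropped, so they are moved to 0.501.
    return 0.501 if abs(membership - 0.5) < 1e-12 else membership


def calibrate_variable(scores):
    """Calibrate one variable using its own 5%, 50% and 95% quantiles as anchors."""
    full_non = percentile(scores, 5)
    crossover = percentile(scores, 50)
    full_in = percentile(scores, 95)
    return [calibrate(s, full_non, crossover, full_in) for s in scores]


# Hypothetical averaged scale scores for one condition variable.
print([round(m, 3) for m in calibrate_variable([1, 2, 3, 4, 5, 6, 7])])
```

Under this mapping a score at the full-membership anchor receives a membership of about 0.95 and a score at the full non-membership anchor about 0.05, which is the intended meaning of the 95% and 5% anchors.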

Table 6 Variable data calibration results.

Analysis of necessary conditions

The NCA method generates the ceiling function by ceiling envelopment (CE) or ceiling regression (CR) and uses the corresponding effect size (d) and p-value to determine whether a condition is necessary for the outcome variable89. First, the d value should exceed the threshold of 0.1, since a smaller effect is too small to be meaningful; second, the p-value based on the permutation test should be less than 0.05 to show that the effect size is not a random result. Table 7 was calculated with the NCA package in RStudio, and the CE results were used in this study. The results show that tool performance, creative role identity and perceived playfulness meet the d-value and p-value conditions with accuracy greater than 95%, so they are necessary conditions for the outcome variable; that is, when the membership scores of these three conditions are low, the user will not show high innovation behavior.
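
To make the two criteria concrete, the following Python sketch computes a ceiling-envelopment effect size and an approximate permutation p-value. It is an illustration under assumptions, not the NCA package's implementation: the ceiling is taken as a non-decreasing step function (free disposal hull) over the empirical scope, the p-value is the share of shuffled samples whose effect size is at least the observed one, and the function names and data are hypothetical.

```python
import random


def ce_effect_size(x, y):
    """Effect size d: empty area above the step ceiling divided by the scope area."""
    ceiling = {}
    for xi, yi in zip(x, y):
        ceiling[xi] = max(yi, ceiling.get(xi, yi))  # highest outcome at each x
    xs = sorted(ceiling)
    y_min, y_max = min(y), max(y)
    scope = (xs[-1] - xs[0]) * (y_max - y_min)
    if scope == 0:
        return 0.0
    zone = 0.0
    running_max = y_min
    for left, right in zip(xs, xs[1:]):
        running_max = max(running_max, ceiling[left])  # ceiling never decreases
        zone += (right - left) * (y_max - running_max)
    return zone / scope


def permutation_p_value(x, y, n_permutations=1000, seed=0):
    """Share of random re-pairings of x and y with d at least as large as observed."""
    rng = random.Random(seed)
    observed = ce_effect_size(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if ce_effect_size(x, shuffled) >= observed:
            hits += 1
    return hits / n_permutations


def is_necessary(x, y, d_threshold=0.1, alpha=0.05):
    """Apply both criteria from the text: d above 0.1 and p below 0.05."""
    return ce_effect_size(x, y) > d_threshold and permutation_p_value(x, y) < alpha


# Hypothetical calibrated condition (x) and outcome (y) scores.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
y = [0.1, 0.1, 0.3, 0.2, 0.5, 0.4, 0.7, 0.6, 0.9]
print(round(ce_effect_size(x, y), 3), permutation_p_value(x, y))
```

A large empty zone in the upper-left corner of the scatter plot gives a large d; the permutation step asks how often such a zone would arise if condition and outcome were unrelated.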

Table 7 Results of the necessary condition analysis.

Necessary conditions act as bottlenecks for the outcome50,90, so the three necessary conditions shown in Table 7 were analyzed for bottleneck effects. Table 8 was calculated in RStudio; the bottleneck table allows necessity to be stated in degree. Table 8 shows that when user innovation behavior is at the 10% level, creative role identity is not required; to reach the 50% level, 22.6% tool performance, 23.4% creative role identity, and 17.2% perceived playfulness are required; and to reach the 100% level, 22.6% tool performance, 71.3% creative role identity, and 17.2% perceived playfulness are required. The results of the NCA are used to complement the configuration analysis that follows.
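
A bottleneck table of this kind can be read as a lookup on the ceiling: for each target level of the outcome, it reports the lowest level of the condition at which that outcome has been observed. The Python sketch below illustrates the idea for a step ceiling, with levels expressed as percentages of the observed range. It is a simplified assumption-based illustration, not the NCA package's routine, and the function name and data are hypothetical.

```python
def bottleneck_level(x, y, y_percent):
    """Required condition level (percent of its range) for a target outcome level.

    Returns None when the condition is not necessary at that outcome level,
    i.e. when the target is already reached at the lowest observed x.
    """
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    target = min(y_max, y_min + (y_max - y_min) * y_percent / 100.0)
    # With a step ceiling, the requirement is the smallest x among the cases
    # whose outcome reaches the target.
    required = min(xi for xi, yi in zip(x, y) if yi >= target)
    if required <= x_min:
        return None
    return 100.0 * (required - x_min) / (x_max - x_min)


# Hypothetical calibrated scores for one condition and the outcome.
x = [0.1, 0.3, 0.3, 0.5, 0.7, 0.9]
y = [0.2, 0.2, 0.4, 0.6, 0.6, 0.9]
for level in (10, 50, 100):
    print(level, bottleneck_level(x, y, level))
```

Rows that return None correspond to the "not required" entries of a bottleneck table, as reported for creative role identity at the 10% outcome level.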

Table 8 Results of NCA bottleneck effect analysis.

Configuration analysis

The truth table was computed in the software with the case frequency threshold set to 3, the raw consistency threshold set to 0.887, and the PRI (proportional reduction in inconsistency) consistency threshold set to 0.690. Combining the parsimonious solution91 with the results of the necessary condition analysis yields the configuration table for high user innovation behavior (see Table 9). The consistency values of the four resulting configurations all exceed 0.9, and the overall coverage is 0.93, indicating that the configuration paths together cover 93% of the cases. All configurations must satisfy the necessity requirement corresponding to innovation behavior membership > 0.5 (necessity requirement in Table 9); otherwise the paths to high innovation behavior identified by fsQCA will not produce the expected results.
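
The consistency, PRI consistency and coverage values reported for each configuration follow standard set-theoretic formulas, which the Python sketch below reproduces for illustration. It assumes fuzzy AND as the minimum across conditions and negation as one minus membership; the function names, the condition labels (TP, CSE) and the scores are hypothetical and are not the data of this study.

```python
def configuration_membership(case, configuration):
    """Fuzzy AND over conditions; absent conditions enter as 1 - membership."""
    return min(
        case[name] if present else 1.0 - case[name]
        for name, present in configuration.items()
    )


def consistency(x, y):
    """Degree to which configuration membership x is a subset of outcome y."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(x)


def coverage(x, y):
    """Share of the outcome y that is accounted for by configuration x."""
    return sum(min(a, b) for a, b in zip(x, y)) / sum(y)


def pri_consistency(x, y):
    """Consistency after removing the part of x shared by both y and not-y."""
    overlap = sum(min(a, b) for a, b in zip(x, y))
    shared = sum(min(a, b, 1.0 - b) for a, b in zip(x, y))
    return (overlap - shared) / (sum(x) - shared)


# Hypothetical cases: TP = tool performance, CSE = creative self-efficacy.
cases = [
    {"TP": 0.9, "CSE": 0.8, "outcome": 0.9},
    {"TP": 0.7, "CSE": 0.6, "outcome": 0.5},
    {"TP": 0.2, "CSE": 0.4, "outcome": 0.3},
]
config = {"TP": True, "CSE": True}
x = [configuration_membership(c, config) for c in cases]
y = [c["outcome"] for c in cases]
print(round(consistency(x, y), 3), round(pri_consistency(x, y), 3), round(coverage(x, y), 3))
```

In a truth-table analysis, rows whose consistency or PRI consistency falls below the chosen thresholds are not treated as sufficient for the outcome, which is how the thresholds stated above shape the final configurations.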

Table 9 Configuration analysis table.
  1.

    A1: Technology-Efficiency Synergy

    This configuration takes high tool performance (S) as its technical premise and integrates users’ creative self-cognition, in the form of high creative self-efficacy and high creative role identity (O), with the intrinsic emotional need of perceived playfulness (O). The stability of technical performance guarantees the functional feasibility of users’ innovation behavior. Creative self-efficacy reflects this group’s belief in their own ability to innovate (“I can innovate”); it is not disturbed by external motivation93 such as social influence (S), so these users persist in innovation behavior. Role identity is internalized into a continuous motivation to innovate (“I should innovate”), while perceived playfulness acts as a lubricant, stimulating the motivation to innovate through the emotional design of the product; in this configuration, these two conditions are effective even at a low degree of input. Together, the four elements produce a resonance between technology and individual cognition: when technology tools and individual creative self-cognition form positive feedback, users of AI painting tools can break through innovation lag and make the leap from “technology that is available” to “technology that is used well”. The raw coverage of 81.5% indicates that this path to innovative behavior is relatively common, which supports the generality of the synergy between technology tools and individual capabilities.

  2.

    A2: Individual Cognitively Dominant

    This configuration presents a ternary reinforcement structure of “tool performance-creative role identity-perceived playfulness”, the three necessary conditions for innovative behavior identified in the NCA. Its raw coverage is close to 84%, making it the most common innovation path. Individual cognition (creative role identity and perceived playfulness) is the main factor, and technical performance is the complementary factor (the level of tool performance only needs to exceed 0.226). When individuals strengthen their attribution of responsibility through role identity and obtain emotional incentives through perceived playfulness, the deep activation of cognitive resources can still drive innovation even if the technological tools do not reach peak performance. This supports the autonomy of the organism (O) and the leverage effect of the technological stimulus (S) in the SOR model.

  3.

    A3: Interactive Empowerment

    The human-AI interaction mode (high degree of interaction) enables individuals to regard AI as a collaborator while using the AI drawing tool and to form a creative co-existence with the AI by continually revising the “prompt”; the interactive interface reduces the user’s cognitive load through multimodal design, which yields a sense of pleasure. The instant feedback mechanism strengthens confidence in innovation, deepens the role identity of “I should innovate”, and sustains continuous innovation behavior. In this case, tool performance and role identity only need to meet the basic requirements. This configuration confirms the interaction between environmental stimuli and cognition in the SOR model, and reflects the paradigm shift from “human-machine collaboration” to “human-intelligence symbiosis” in the context of smart technology.

  4.

    A4: Social Compensatory

    With high social influence and high perceived playfulness as its core elements, this configuration is an innovation path formed under low creative self-efficacy, and its low coverage (40.6%) underlines its distinctiveness. Social norm pressure compensates for the lack of individual confidence through the group demonstration effect, which is in line with the external incentive compensation mechanism in social exchange theory: users of AI drawing tools feel the value of their own existence after receiving benefits or recognition from the environment, which intrinsically motivates them to continue their innovative behavior94. In this process, tool performance (S) and creative role identity (O) build a basic support system, and environmental recognition combined with perceived pleasure enables users’ innovation breakthroughs. For example, some users who, under the influence of their environment, complete works with AI painting tools and receive considerable praise and recognition from others derive pleasure from it, internalize the identity of “digital art creator”, and develop creative self-propulsion.

Robustness testing

After setting the case frequency threshold to 4, raising the raw consistency threshold to 0.9, and adjusting the PRI consistency threshold to 0.65, two condition combinations consisting only of necessary conditions were generated. They are basically consistent with A2 in the original solution, and the consistency and coverage did not change substantially, indicating that the configuration results are robust.

Conclusion and discussion

Theoretical and practical contributions

Theoretical contributions

At the theoretical level, the article focuses on a specific group of AI drawing tool users, and makes up for the limitations of traditional research on user innovation behavior by integrating multiple methods.

First, this study breaks through the limitations of a single theoretical perspective by constructing an integrated theoretical framework combining SOR, HCI, and SDT. Using SOR as the backbone, it defines the technical performance boundaries of AI painting tools by incorporating stimulating concepts such as “controllability” and “feedback” from the HCI perspective. Simultaneously, incorporating the game motivation perspectives of SDT and Barab successfully unraveled the “organism” black box, confirming that innovative behavior is not merely an environmental reflex. Instead, it is an active process triggered by technological stimuli that satisfy individuals’ sense of competence (self-efficacy) and autonomy (role identity), unfolding within a gamified psychological safety space (perceived playfulness).

Second, PLS-SEM validated the direct effects of human-AI interaction intensity, creative self-efficacy, and creative role identity within the “S” and “O” components of the empirical model on innovative behavior, along with the indirect influence of other factors. NCA identified tool performance, creative role identity, and perceived playfulness as necessary conditions for innovative behavior among AI painting tool users, meaning that the absence of any one of these three elements would directly lead to innovation stagnation.

Finally, fsQCA identified four types of configuration, which further revealed differences in the innovation paths of different user groups. Technology-Efficiency Synergy (A1) and Social Compensatory (A4) validate the coexistence of the “technology-driven” and “socially-drawn” innovation logics, indicating that there are multiple dynamic paths for the interaction between the environmental stimulus (S) and the organism (O) in the SOR model. This finding breaks through the over-reliance on a single causal chain in traditional theories, and provides a more three-dimensional theoretical framework for explaining the complex context of human-machine collaborative innovation.

The theoretical framework of this study significantly diverges from previous empirical research grounded in singular perspectives. Prior studies based on TAM or UTAUT often treated AI as a passive technological object for adoption, while traditional SOR research focused on unidirectional environmental stimuli. In contrast, this integrated model reveals a unique dual-driver mechanism of “technology + psychology” within the generative AI context: generative AI functions not only as a tool but also as a partner possessing “human-like interactive attributes” (HCI perspective). Through high-frequency interactions, it awakens users’ intrinsic sense of competence and autonomy (SDT perspective), transforming them from mere “operators” of technology into “creative leaders.” This finding theoretically explains why generative AI elicits deeper psychological resonance and behavioral stickiness than traditional design software, validating the paradigm shift in human-computer interaction from “instrumental rationality” to “value co-creation.”

Practical guidance

At the practical level, this study offers concrete, actionable design guidelines for the development and operation of AI painting tools (see Fig. 4).

Fig. 4 Practical guidance.

First, for “Technology-Efficiency Synergy” users, developers should implement gamified emotional design to enhance enjoyment while continuously optimizing tool performance. For instance, the system could trigger celebratory animations (e.g., confetti effects) when users generate high-quality artwork, or unlock customizable UI palettes, to boost perceived playfulness and sustain users’ flow experience.

Second, for the most prevalent “Individual Cognitively Dominant” users, developers should prioritize “Lightweight Empowerment & Identity Customization.” Since these users can leverage limited tool performance (>0.226) to achieve high levels of innovation, system design should focus on “subtraction” rather than “addition.” Specifically, introducing style fine-tuning capabilities is recommended, allowing users to train personalized lightweight AI painting models. This significantly fulfills their desire to internalize AI as an extension of their creative process, strengthening their creative role identity. Simultaneously, to heighten perceived playfulness and activate cognitive resources, interfaces could incorporate random dice tools or Mystery Box Generation modes. By leveraging the serendipity of unpredictable outcomes, these features guide users toward deep cognitive exploration and creative reconstruction beyond basic functionalities.

Third, for “Interactive Empowerment” configurations, developers should focus on reducing cognitive load through multimodal interaction. Tools can incorporate voice-to-image commands or gesture-based editing on mobile devices, adhering to universal usability principles to minimize complex parameter adjustments. This transforms AI drawing tools into true “creative partners” that understand human language.

Fourth, for the “Social Compensatory” configuration, platforms can establish community-driven identity verification systems. Recognizing that such users rely on external validation, platforms may introduce weekly “Prompt of the Week” badges or create co-creation galleries. Through community recognition and official certification, these initiatives compensate for users’ perceived lack of self-efficacy, granting them identity validation as “digital artists.”

Research shortcomings

Although this study has achieved certain results in theoretical integration and empirical analysis, the following limitations remain:

First, the measurement of objective HCI interaction metrics remains inadequate. This study primarily relies on questionnaires to obtain subjective perception data regarding “Tool Performance” and “Degree of Human-AI Interaction,” lacking objective log records of actual user operational behavior. This may introduce subjective bias in validating relevant interaction principles.

Second, the study fails to capture the dynamic evolution of psychological needs. Users’ competence and autonomy based on SDT are not static. While the cross-sectional data reveals current configuration effects, it cannot track the dynamic migration mechanisms of innovation behavior pathways as users transition from novices to experts.

Third, limitations in control variables and sample coverage. The model inadequately accounts for the moderating effect of AI literacy on technology acceptance. While the sample spans multiple industries, it disproportionately includes design professionals, necessitating larger-scale validation to confirm the findings’ generalizability beyond specialized fields.

Prospect of future research

Based on the aforementioned limitations and the findings of this study, future research can be deepened in the following three dimensions:

First, conduct HCI guideline validation based on experimental design. Future studies can move beyond simple questionnaire surveys by designing testing experiments that control the interface characteristics of AI tools. This approach will enable precise quantification of the causal effects of interaction rules and gamification mechanisms on users’ “Perceived Playfulness” and innovation output, providing more accurate parameter guidance for product design.

Second, construct longitudinal tracking models to explore the dynamic evolution of “human-machine symbiosis.” Incorporating a temporal dimension, track users’ psychological trajectories across exploration, adaptation, and symbiosis phases. Focus particularly on identifying which SDT motivational mechanisms sustain long-term innovation engagement after novelty wears off.

Third, expand the model’s cross-modal and cross-scenario applicability. Extend the integrated “SOR + SDT + HCI” framework beyond AI painting to other generative AI domains like AI video generation and AI music composition. Validate the universality of the dual “technology-psychology” driving mechanism while incorporating variables such as “AI literacy” and “Technology Anxiety” to refine the boundaries of human-machine collaborative innovation theory.