Abstract
The integration of electric vehicles (EVs) into the grid via Vehicle-to-Grid (V2G) technology is one potential remedy for grid stability and energy efficiency. The challenge lies in determining how best to charge and discharge EVs based on driver preferences, electricity rates, and changing grid requirements. This research compares three cutting-edge approaches as V2G system optimization solutions: predictive charging using Artificial Intelligence (AI), dynamic pricing using Game Theory, and adaptive control using Reinforcement Learning (RL). The AI-powered predictive charging method uses predictive models to forecast grid conditions and adjust charging times accordingly. The Game Theory model uses dynamic pricing to balance supply and demand among EVs, grid operators, and the power market. RL provides an adaptive control system that makes optimal charging and discharging decisions in real time. This paper proposes a hierarchical collaborative fusion method in which an AI-based demand forecasting methodology constrains the game-theoretic pricing layer, and the reinforcement learning control policy acts on the price signals it generates. This cooperation allows V2G systems to make consistent strategic, financial, and operational decisions. The article highlights the potential of combining these approaches to improve energy management in V2G systems, making them more economical, reducing grid congestion, and improving sustainability. Evaluation with simulated data demonstrates the methodologies’ suitability for real-world applications.
Introduction
To find the optimal charging and discharging schedules, machine learning (ML) and artificial intelligence algorithms consider weather, grid load, pricing, and driving history1. This approach minimizes battery wear by optimizing the charge cycle, lowers charging costs by charging during periods of low demand, and stabilizes the grid by balancing demand spikes2. In a Game Theory-Based Dynamic Pricing Strategy, game theory models are applied to reconcile EV owner preferences with grid demand3. EVs bid on charging/discharging slots with incentives, price signals, and availability4. The advantages of this approach include encouraging EV drivers to discharge when the grid needs it most and recharge when it does not5, deriving maximum economic benefits for EV owners and utilities, and avoiding grid overload by efficiently managing charging load6. Reinforcement Learning-Based Adaptive Control employs RL to learn and adapt continuously based on grid conditions, electricity prices, and user trends for optimal V2G planning7. By controlling power flows in real time and adjusting to unanticipated factors, such as abrupt changes in grid demand or fluctuations in renewable resources, real-time adjustments optimize long-term energy savings for EV users and increase V2G efficiency8. Unlike the existing literature, which implements these methods separately, this investigation discusses the coordinated interaction of AI forecasting, game-theoretic pricing, and reinforcement learning. The peculiarity is that a single V2G optimization pipeline with a clearly defined information flow and role division is implemented across the forecasting, pricing, and control layers.
Contributions
The major contributions are:
- A hierarchical fusion of RL, AI forecasting, and game-theoretic pricing for V2G system control.
- A pricing mechanism in which Nash equilibrium prices are established using artificial intelligence forecasts.
- An incentive structure for pricing control that accounts for market conditions and uses RL policies that adapt to dynamic prices.
Related works
To control EV charging trends and prevent grid overload, smart scheduling algorithms are required, as the widespread adoption of electric vehicles (EVs) strains the current power grid infrastructure9. It is observed that e-mobility services based on AI techniques enhance the performance of energy management systems within EV ecosystems, optimize charging strategies, and adapt to varying operating environments10. Some of the machine learning methodologies explored in EV charging management include charging protocols, demand response, energy management, and integration of renewable energy sources to enhance and accelerate the charging operations11. Based on machine learning predictions, optimal charging schedules and habits can minimize fossil fuel consumption and environmental impacts12. Moreover, to reduce charging costs and enhance the effectiveness of charging systems, AI-inspired optimization methods have focused on scheduling, clustering, and prediction techniques13.
EV integration into the power system has its benefits and drawbacks for system operators and electric vehicle owners14. In recent years, dynamic pricing systems based on game theory have attracted increased interest as a cost-efficient strategy for optimizing energy delivery and charging15. Such plans will reduce grid strain by promoting EV charging during off-peak hours, when demand is lower16. Game-theoretic formulations in V2G systems have modeled the interactions between grid operators and EV owners as non-cooperative games to optimize pricing policies and charging schedules, thereby avoiding grid congestion and maximizing system-level benefits17. Dynamic pricing mechanisms can be used to achieve the goals of reducing peak demand and charging prices through strategic interactions between grid operators and electric vehicle (EV) users18. As an additional means to enhance social welfare, stabilize pricing mechanisms, and achieve greater user satisfaction than available through a non-cooperative solution, it has been proposed that cooperative game theory models be used to facilitate coordination between grid operators and EV customers19.
Two complex and dynamic systems that have attracted the attention of reinforcement learning (RL) researchers for optimizing decision-making are power management and electric vehicle charging8. RL-based charging methods are highly suitable in the smart grid setting, as they can adapt to varying grid demand, power pricing, and vehicle availability20. To minimize charging costs and meet battery energy demands under uncertainty, electric vehicle charging control problems have been formulated as Markov decision processes (MDPs)21. These MDPs are solved using deep reinforcement learning methods, such as the deep deterministic policy gradient (DDPG) and related algorithms22. To reduce operational costs while maintaining the EV’s best performance, multi-agent deep reinforcement learning systems have proven successful; these methods employ centralized training with decentralized execution23. To address the limitations of distribution network capacity and high-power charging, deep reinforcement learning applied to real-time electric vehicle charging control can be considered a solution24.
In the case of multimicrogrids with V2G technology, the authors proposed a coordinated load-frequency control using an improved multi-agent deep deterministic policy gradient (MA-DDPG) algorithm. This strategy can be used to coordinate the work of distributed EV aggregators in a dynamic operating environment and regulate the system frequency25. Simulation results reveal that compared to the traditional controllers, there is reduced frequency deviation and increased stability. As shown in this paper, the concept of multi-agent reinforcement learning is effective in solving EV-related grid stability problems at a massive scale.
Researchers proposed an optimal scheduling process for microgrids that utilizes V2G and deep Q-learning. Even when the renewable generation and electric vehicle mobility are unpredictable, the proposed model-free approach can be used to learn charge and discharge policies26. The outcomes are more adaptable and have lower operating costs than those of rule-based systems. The study claims that smart grids, facilitated by intelligent V2G scheduling, are enabled by Q-learning methods.
The frequency control method for charging electric vehicles at charging stations in isolated microgrids used model predictive control in conjunction with virtual synchronous generator technology. This approach allows electric vehicle charging stations to comply with charging limitations while also providing rapid inertial support27. The simulation results showed lower oscillations and increased frequency stability. The paper demonstrates that AI-based optimization and improved control can ensure microgrid stability, even with high EV penetration.
A V2G scheduling model based on deep reinforcement learning (Soft Actor-Critic) was proposed to enable a multi-energy microgrid to dynamically coordinate electric vehicle charging and discharging even when prices and demand are unknown. The approach trains the DRL agent to maximize microgrid profitability, subject to operational constraints, by formulating V2G planning as a Markov decision process28. The simulation shows better performance, both economically and in terms of responsiveness, compared to static/rule-based schedulers. This study identifies the potential of state-of-the-art DRL algorithms for real-time grid optimization in the complex V2G setting.
To address the issue of the deep reinforcement learning-based errors in solar power predictions, a journal article in Sustainability represents V2G operations as a Markov decision process. Given the state of charging stations and the uncertainty in solar output, the DRL agent adapts EV charging and discharging29. The comparison with more conventional techniques indicates that the proposed solution reduces the impact of forecasting errors while enhancing grid stability. This paper demonstrates how DRL can enhance V2G energy management in the face of renewable variability.
To achieve a balance between grid demands and user autonomy, the authors propose VESTA, a semantically aware, intelligent V2G management platform based on blockchain, edge computing, and artificial intelligence. The semantic model can enhance energy distribution efficiency by approximately 15% and reduce response times by 20%, while prioritizing key vehicles, such as emergency services, during high grid demand30. The architecture demonstrates that V2G coordination can be enhanced by integrating AI with decentralized technologies, extending beyond common optimization and pricing algorithms.
A more recent study proposes a spatial-temporal data fusion model grounded in large language models to enhance demand prediction for electric vehicle charging in smart grid settings31. The framework is a better way to forecast charging behavior patterns than traditional forecasting models, as it integrates heterogeneous temporal and spatial information into a single framework. The findings show the increased prediction accuracy in dynamic urban settings. The method defines the potential of large language models for high-level energy demand forecasting and intelligent grid decision support.
Another similar paper presents a closed-loop vehicle-to-vehicle charging scheme based on a non-cooperative game-theoretic model. The model captures competitive relationships among electric vehicles at charging stations and derives equilibrium-based charging policies32. Simulation demonstrates that charging efficiency and fairness are enhanced compared to centralized control mechanisms. The paper is useful for understanding decentralized energy exchange and pricing methodologies applicable to V2G systems.
Based on the literature, a summary of EV charging and energy management studies is provided in Table 1, followed by the research gaps identified, along with the proposed solutions for the respective gaps, in Table 2.
Methodology used for smart charging for V2G optimization
This article presents three methods for smart charging in vehicle-to-grid optimization.
Artificial intelligence-based predictive charging
AI and ML models analyse historical driving patterns, weather forecasts, grid demand, and electricity prices to predict optimal charging and discharging schedules. Figure 1 shows the pictorial representation of Smart charging in E-vehicles.
Pictorial representation of Smart Charging in E-Vehicles.
AI-Based Predictive Charging Algorithm.
In the case of EVs operating in a V2G environment, such a system maps an estimated charging schedule through artificial intelligence. The inputs to the trained AI models include projections of renewable energy, user driving behaviour, electricity costs, and past grid demand. These models predict the grid’s demand, energy prices, and vehicle availability over the time frame specified in Algorithm 1. The system then uses these forecasts to evaluate all possible future time slots and determine the most efficient charging strategy. Charging is done during periods of low grid demand and low electricity prices; discharging is done during periods of high demand and high prices, when the battery state of charge is beyond a specified threshold. In all other situations, the car is kept idle to avoid unproductive operation. The charging controller executes the selected action, and the AI models are continually fed new information. In general, the predictive technique is used to enhance the operational stability of V2G networks, minimize expenses, and decrease grid stress.
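The charge/discharge/idle rule described above can be sketched as a simple threshold check; the numeric thresholds below are illustrative assumptions, not values taken from the paper.

```python
def decide_action(price, demand, soc,
                  price_low=0.10, price_high=0.25,
                  demand_low=3.0, demand_high=8.0,
                  soc_min=0.30):
    """Return 'charge', 'discharge', or 'idle' for one time slot.

    Charge when price and demand are low; discharge when both are high
    and the battery is above a minimum state of charge; otherwise idle.
    Thresholds are illustrative stand-ins for the AI model's forecasts.
    """
    if price <= price_low and demand <= demand_low:
        return "charge"
    if price >= price_high and demand >= demand_high and soc > soc_min:
        return "discharge"
    return "idle"
```

In a full system the thresholds would be replaced by the trained models' forecasted price and demand levels for each future slot.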
Battery Level Update Equation

\(B(t+\Delta t)=B(t)+{\eta}_{c}\,{P}_{c}\,\Delta t-\frac{{P}_{d}}{{\eta}_{d}}\,\Delta t\)

where:

- B(t) = Battery level at time t.
- Pc = Charging power.
- Pd = Discharging power.
- ηc, ηd = Charging and discharging efficiencies.
- Δt = Time interval.
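Given the variable definitions above, the per-interval battery update can be sketched as follows; the capacity clipping is an added assumption to keep the state physically valid.

```python
def update_battery(b, p_c, p_d, eta_c=0.95, eta_d=0.95, dt=1.0, capacity=60.0):
    """One-step battery level update (kWh):
    B(t+dt) = B(t) + eta_c * Pc * dt - (Pd / eta_d) * dt,
    clipped to [0, capacity]. Efficiency and capacity values are
    illustrative defaults, not parameters from the paper."""
    b_next = b + eta_c * p_c * dt - (p_d / eta_d) * dt
    return max(0.0, min(capacity, b_next))
```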
Optimal Charging Cost Calculation

\({C}_{charge}=\sum_{t\in T}{P}_{price}\left(t\right)\,{P}_{c}\,\Delta t\)

where:

- \(\:{C}_{charge}\) - Total charging cost.
- \(\:{P}_{price}\left(t\right)\) - Electricity price at time t.
- T - Set of optimal charging time slots.
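Summing price times charging power over the selected slots gives the total cost; a minimal sketch, with the charging power value as an illustrative assumption:

```python
def charging_cost(prices, slots, p_c=7.0, dt=1.0):
    """Total cost C_charge = sum over optimal slots T of
    P_price(t) * Pc * dt, with prices in $/kWh and Pc in kW."""
    return sum(prices[t] * p_c * dt for t in slots)
```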
Charging Decision Function

\(D(t)=\begin{cases}\text{Charge},&{P}_{price}(t)<{P}_{th}\;\text{and}\;{D}_{grid}(t)<{D}_{th}\\\text{Discharge},&{P}_{price}(t)\ge{P}_{th}\;\text{and}\;{D}_{grid}(t)\ge{D}_{th}\;\text{and}\;B(t)>{B}_{th}\\\text{Idle},&\text{otherwise}\end{cases}\)

where \({P}_{th}\), \({D}_{th}\), and \({B}_{th}\) denote the price, demand, and battery state-of-charge thresholds described above.
Flow chart of Artificial Intelligence-Based Predictive Charging.
First, the system boots up and verifies real-time data collection. During data collection, IoT sensors gather battery SOC (State of Charge), grid demand, electricity prices, user driving behaviour, and renewable energy availability. The data is then sent through MQTT or a cloud API to a central processing unit. Preprocessing and feature engineering ensure that the data is cleaned, normalized, and formatted for AI processing, as shown in Fig. 2. Time-based attributes (hour, day, month), electricity price trends, and consumption patterns are extracted. The AI model, an LSTM neural network, analyses historical patterns to forecast the optimal charging time and predict when to charge, based on grid stability and cost minimization. If the prediction favours charging, the system issues a command to the EV to initiate charging. If charging is not ideal, the system waits and continues to look for an opportune moment. If charging is undertaken, the system tracks battery health and dynamically adjusts charging speed. If charging is delayed, the model reassesses periodically. The system learns and improves continuously from real-time grid variability and user activity.
Game theory-based dynamic pricing strategy
This strategy uses game theory models to balance EV owner preferences with grid demand. EVs “compete” for charging/discharging slots based on incentives, price signals, and availability.
Game Theory-Based Dynamic Pricing Strategy Algorithm.
The system starts and gathers real-time information from the grid, EVs, and the electricity market. Smart meters and IoT sensors collect EV battery level, grid demand, electricity prices, and user preferences, and transmit this information to the centralized V2G control system described in Algorithm 2. The players of the game are then identified: the EV owners want to sell surplus energy at the best price, the grid operators need to purchase energy at the lowest price, and the electricity market determines dynamic energy pricing. Crucially, either a real-time auction or a Nash equilibrium model is employed to match demand (grid operators) and supply (EV owners), and price incentives are dynamically adjusted based on energy requirements. After this decision-making, there are two scenarios for EV owners: if the price offered is profitable, the EV supplies energy to the grid; if the price is below cost, the EV waits for favourable market conditions. The market dynamically updates prices based on actual transactions, and the process continues dynamically for subsequent time slots, as shown in Fig. 3.
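The matching of grid demand against EV supply can be illustrated as a simple uniform-price auction; this is a sketch standing in for the auction/Nash equilibrium step, not the paper's exact formulation.

```python
def clear_market(offers, demand_kwh):
    """Match grid demand (kWh) against EV sell offers, cheapest first.

    offers: list of (price_per_kwh, energy_kwh) from EV owners.
    Returns (accepted_trades, clearing_price), where the clearing price
    is the price of the last accepted offer (uniform-price convention,
    an illustrative assumption).
    """
    accepted, remaining, clearing_price = [], demand_kwh, None
    for price, energy in sorted(offers):
        if remaining <= 0:
            break
        take = min(energy, remaining)     # partial fills allowed
        accepted.append((price, take))
        remaining -= take
        clearing_price = price
    return accepted, clearing_price
```

An EV owner whose offer price exceeds the clearing price is simply not matched, which corresponds to the "wait for favourable market conditions" branch above.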
Flow chart of Game Theory-Based Dynamic Pricing Strategy.
Dynamic Pricing Strategy:

Bid Price Calculation for Charging

Bid Price Calculation for Discharging

- B min - Minimum required charge level.
- β - Weighting factor for discharging bid.

Profit Calculation for EV Owners

\(Profit=\sum_{t}\left[{P}_{sell}(t)\,{E}_{discharged}(t)-{P}_{buy}(t)\,{E}_{charged}(t)\right]\)

The weighting factors α and β represent the relative importance of grid demand stress and economic pricing incentives, respectively. These parameters are normalized coefficients constrained to α + β = 1, ensuring interpretability and stability of the bidding formulation.

- Psell(t) = Selling price at time t.
- Edischarged(t) = Energy discharged at time t.
- Pbuy(t) = Buying price at time t.
- Echarged(t) = Energy charged at time t.
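The profit definition above can be computed directly from the per-slot price and energy series:

```python
def ev_profit(sell_prices, discharged, buy_prices, charged):
    """Profit = sum_t [Psell(t) * Edischarged(t) - Pbuy(t) * Echarged(t)],
    with prices in $/kWh and energies in kWh, aligned per time slot."""
    return sum(ps * ed - pb * ec
               for ps, ed, pb, ec in zip(sell_prices, discharged,
                                         buy_prices, charged))
```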
Reinforcement learning-based adaptive control
The proposed method uses an RL algorithm to continuously learn and adapt to grid conditions, electricity tariffs, and user behaviour for optimal V2G scheduling. The key benefit of the system is real-time adaptation to unpredictable factors, such as sudden changes in grid demand or fluctuations in renewable energy. It maximizes long-term energy savings for EV owners and increases V2G efficiency by dynamically adjusting power flows.
Reinforcement Learning-Based Adaptive Control Algorithm.
First, the State and Action Space Definition is carried out; the State: [Battery Level, Grid Demand, Electricity Price, Time] and Action: [Charge, Discharge, Idle] are initialised. In Algorithm 3, the RL agent explores different actions and learns from rewards/penalties and Q-table updates based on rewards and future state estimates. The key optimization control monitors the following.
- High reward for charging when prices are low and demand is low (Allow).
- High reward for discharging when prices are high and demand is high (Allow).
- Penalties for inefficient charging/discharging (Not Allow).
After training, the model continuously monitors the grid and EV state. It selects the best action based on learned policies. The process repeats periodically, ensuring dynamic V2G optimization as shown in Fig. 4.
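The reward rules above can be sketched as a function of the chosen action and the observed price and demand; the reward magnitudes and thresholds here are illustrative assumptions.

```python
def reward(action, price, demand, price_th=0.18, demand_th=5.0):
    """Reward shaping for the V2G agent: reward charging in
    low-price/low-demand slots and discharging in high-price/high-demand
    slots; penalize inefficient charge/discharge; idling is neutral."""
    if action == "charge" and price < price_th and demand < demand_th:
        return 1.0
    if action == "discharge" and price > price_th and demand > demand_th:
        return 1.0
    if action == "idle":
        return 0.0
    return -1.0
```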
Q-Learning Update Rule

\(Q(s,a)\leftarrow Q(s,a)+\alpha\left[r+\gamma\,\underset{a{\prime}}{\text{max}}\,Q(s{\prime},a{\prime})-Q(s,a)\right]\)

- Q(s, a) = Q-value for state s and action a.
- α = Learning rate.
- r = Reward for taking action a in state s.
- γ = Discount factor for future rewards.
- max Q(s′, a′) = Maximum Q-value for the next state s′.
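The tabular update rule above translates directly into code; the state encoding and hyperparameter values are illustrative assumptions.

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Q is a dict keyed by (state, action); unseen entries default to 0."""
    q_sa = Q.get((s, a), 0.0)
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    Q[(s, a)] = q_sa + alpha * (r + gamma * best_next - q_sa)
    return Q[(s, a)]
```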
Flow chart of Reinforcement Learning-Based Adaptive Control.
Reward Function for Charging

Reward Function for Discharging
Collaborative fusion mechanism
The proposed framework consists of three interrelated layers. First, an artificial intelligence module anticipates grid demand and pricing patterns. These predictions constrain the game-theoretic pricing layer, ensuring that equilibrium solutions to the game remain consistent with the expected conditions of the system. The reinforcement learning agent then receives the dynamic prices as part of its state and reward formulation, enabling adaptive charging decisions informed by market and forecast awareness.
Collaborative fusion mechanism overview.
Figure 5 represents the fusion architecture and shows a hierarchical coordination scheme for vehicle-to-grid energy management. The predictive module is an AI-based forecasting tool that predicts future grid demand and electricity price trends. These predictions constrain the game-theoretic pricing layer, where equilibrium prices are computed to satisfy the expected system conditions. The dynamically generated prices are then fed into the reinforcement learning controller, which selects real-time charging, discharging, or idle actions. This systematic flow of information turns the individual techniques into a common decision-making system.
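The layered flow (forecast, then pricing, then control) can be sketched as a minimal pipeline; the naive forecast, linear pricing rule, and threshold policy below are placeholder assumptions standing in for the LSTM, game-theoretic, and RL components respectively.

```python
def forecast_demand(history):
    """Placeholder AI layer: naive mean of the most recent demand values."""
    recent = history[-3:]
    return sum(recent) / len(recent)

def price_from_forecast(d_hat, base=0.10, slope=0.02):
    """Placeholder pricing layer: price rises with forecasted demand,
    mimicking the forecast-constrained equilibrium price."""
    return base + slope * d_hat

def control_action(price, soc, price_th=0.18, soc_min=0.3):
    """Placeholder control layer: a fixed policy acting on the price
    signal (an RL agent would learn this mapping)."""
    if price < price_th:
        return "charge"
    if soc > soc_min:
        return "discharge"
    return "idle"

def v2g_step(history, soc):
    """One pass through the three-layer hierarchy."""
    d_hat = forecast_demand(history)     # AI forecast
    price = price_from_forecast(d_hat)   # game-theoretic pricing
    return control_action(price, soc), price  # RL-style control
```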
In Eq. (10), \(\:{a}_{t}^{*}\) denotes the optimal charging action selected at time \(\:t\). The state \(\:{S}_{t}\) includes battery state of charge and temporal information. \(\:{P}_{t}^{GT}\) represents the dynamic electricity price obtained from the game-theoretic pricing model, while \(\:{\widehat{D}}_{t}\) is the AI-predicted grid demand. By embedding both market price and forecasted demand into the \(\:Q\)-function, the reinforcement learning agent makes decisions that are economically efficient and grid-aware.
System model and assumptions
The proposed V2G framework is evaluated using a simulation-based system model that represents interactions between the power grid, electric vehicle fleets, and renewable energy sources. Due to limited access to real-time grid operation data, simulation enables controlled assessment under diverse operating conditions while ensuring reproducibility. Parameter ranges are selected to reflect per-vehicle energy contributions and scalable aggregation behavior consistent with engineering practice. Renewable energy availability is modeled as a stochastic input that affects supply constraints and pricing signals, reflecting the intermittent nature of renewable generation.
Scenario-based sensitivity analysis
To evaluate robustness and generalizability, the proposed framework is tested under varying EV fleet sizes and load conditions. This analysis ensures that observed performance trends are not restricted to a single operating configuration.
Scenario-based sensitivity analysis.
Figure 6 illustrates the performance of the proposed V2G framework under varying EV fleet sizes, evaluating its robustness and scalability. Charging cost and peak load metrics are compared across different participation levels to assess sensitivity to system scale. The consistent reduction in both metrics as fleet size increases indicates effective coordination between forecasting, pricing, and control layers. These results demonstrate that the proposed framework maintains stable and predictable behavior across diverse operating conditions, supporting the generalizability and engineering relevance of the simulation-based evaluation.
To assess robustness, α and β are varied across representative ranges, and their effects on charging costs and peak load are evaluated. This analysis confirms that the pricing mechanism remains stable and effective across reasonable parameter selections.
Baseline methods for comparison
To evaluate the effectiveness of the proposed framework, comparisons are conducted against representative state-of-the-art (SOTA) and benchmark V2G optimization methods. These include learning-based, game-theoretic, and heuristic optimization approaches commonly adopted in recent literature.
The comparison baselines include:
- (i) Transformer-based load and price prediction,
- (ii) Multi-agent deep reinforcement learning (MADRL),
- (iii) Distributed game-theoretic pricing optimization,
- (iv) Particle swarm optimization (PSO), and
- (v) Genetic algorithm (GA).
All methods are evaluated using unified metrics, including average charging cost, the grid peak–valley difference, and the EV user satisfaction index, ensuring a fair and consistent performance comparison.
Comparative performance evaluation
Comparative Performance Evaluation.
Figure 7 presents a quantitative comparison between the proposed fusion framework and representative SOTA and benchmark optimization methods. Charging cost and peak–valley difference are used as unified evaluation metrics. While learning-based and heuristic methods demonstrate moderate optimization capability, the proposed framework consistently achieves lower charging costs and improved load balancing. This improvement is attributed to coordinated forecasting, pricing, and adaptive control. The results confirm that the proposed approach outperforms existing standalone and heuristic-based solutions under identical evaluation conditions.
All baseline methods are evaluated under identical simulation conditions and input constraints to ensure fairness. Parameter settings follow commonly adopted values reported in the literature.
Analysis of smart charging algorithm
The data can simulate charging decisions, price fluctuations, battery levels, and market conditions over time. Below are example datasets for each method, which can be used to plot relevant graphs.
Artificial intelligence-based predictive charging
This method predicts the optimal charging time based on grid conditions, EV battery state of charge, and electricity prices, as shown in Table 3.
Example Data:
- Battery Level (SOC): 0–100%.
- Grid Demand: 0–10 kW.
- Electricity Price: $0.05–$0.30 per kWh.
- Charging Decision: 1 (charge), 0 (wait).
Simulation result of Artificial Intelligence-Based Predictive Charging.
The graph depicts an AI-driven predictive charging approach for an EV over 8 h. The battery charge rises steadily from 20% to almost 100%, indicating regulated charging. The charging choice switches between charging (1) and waiting (0) in a cycle, most probably to keep energy usage efficient, keep expenses low, and prolong battery life. This periodic charging method maintains grid demand within limits, maintains battery health at optimal levels, and provides a full charge within the prescribed time limit. The outcomes indicate that AI plays a key role in improving the efficiency and effectiveness of EV charging stations, along with issues such as grid strain, energy management, and environmental concerns, as shown in Fig. 8.
The modeled grid demand range (0–10 kW) represents the effective power contribution of an individual EV rather than the total feeder load. When aggregated across multiple vehicles, this abstraction corresponds to realistic distribution-level demand scales.
Feature selection rationale
The selected features (time attributes, electricity price trends, historical demand) are chosen based on their demonstrated influence on charging behavior and grid load variability reported in prior V2G studies. Time-related features capture periodic demand patterns, while electricity price trends reflect market-driven charging incentives. Historical demand provides temporal correlation, which is essential for short-term forecasting. Feature relevance is quantitatively validated using normalized importance scores derived from model sensitivity analysis, in which prediction performance is evaluated after removing individual features. Results indicate that time attributes and price trends contribute most significantly to forecasting accuracy.
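The ablation procedure described above (remove one feature, measure the error increase, normalize) can be sketched generically; the error values in the test are toy numbers, not results from the paper.

```python
def ablation_importance(error_full, errors_without):
    """Normalized feature importance from a leave-one-feature-out
    sensitivity analysis.

    error_full: prediction error of the model with all features.
    errors_without: dict mapping feature name -> error after removing it.
    Importance = error increase on removal, scaled so scores sum to 1."""
    increases = {f: max(0.0, e - error_full)
                 for f, e in errors_without.items()}
    total = sum(increases.values()) or 1.0  # avoid division by zero
    return {f: v / total for f, v in increases.items()}
```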
Feature selection analysis.
Figure 9 illustrates the impact of varying the weight factor α on charging cost performance. Results show that moderate weighting between grid demand and pricing incentives yields optimal outcomes, while extreme weighting leads to diminished performance. This confirms that the bidding mechanism is robust to parameter selection and does not rely on fine-tuned values, thereby enhancing reproducibility and practical applicability.
Game theory-based dynamic pricing strategy
This strategy uses dynamic pricing based on supply-and-demand interactions among EV owners, grid operators, and the electricity market, as shown in Table 4.
Example Data:
- EV Participation: Number of EVs that discharge to the grid.
- Market Price: Dynamic price of electricity per kWh.
- Grid Demand: Amount of energy requested by the grid.
- Profit (EVs): Profit gained from selling energy.
Simulation result of Game Theory-Based Dynamic Pricing Strategy.
Dynamic pricing strategies for EV charging based on game theory have been investigated in different models to maximize the interaction between EV owners and grid operators. The models are designed to balance the grid, minimize operation costs, and maximize energy distribution. Nash equilibrium, cooperative game theory, and Stackelberg games are effective in maximizing pricing efficiency and load management. As EV adoption increases, these policies will become even more important for integrating electric cars into the electricity grid and enhancing sustainable energy management. The graph above illustrates a Game Theory-Based Dynamic Pricing Policy for EV charging. The top subplot shows the market price ($/kWh) increasing steadily over time, reflecting a dynamic pricing system in which prices increase in steps. The lower subplot shows EV profit ($), which at first rises, peaks at about 4–5 h, and then drops, indicating an optimal charging interval for profit maximization. The strategy probably aims to balance supply and demand, maximize revenue, and promote strategic charging behaviour, as shown in Fig. 10. The number of participating EVs is allowed to vary over time, capturing stochastic arrival and departure behavior observed in real-world charging environments.
Reinforcement learning-based adaptive control
This approach uses reinforcement learning to determine optimal charging or discharging actions based on the state (battery level, grid demand, etc.), as shown in Table 5.
Example Data:
- Battery Level (SOC): 0–100%.
- Action: 1 (charge), 0 (discharge), 2 (wait).
- Reward: Positive for beneficial actions, negative for poor actions (e.g., overcharging, undercharging).
The graph shows a Reinforcement Learning-Based Adaptive Control strategy for electric vehicle charging. The upper subplot shows the decision-making for actions (charging, discharging, or waiting) changing over time, which implies an adaptive strategy for energy management. The middle subplot shows a steady increase in battery level (%), indicating controlled charging. The bottom subplot shows the reward function, which dynamically adapts to the actions taken, balancing power consumption, cost, and system stability. The empirical studies presented clearly show the multifaceted value and efficiency of RL, from Q-learning through DRL, in achieving energy delivery efficiency, operational cost savings, and grid stabilization. RL models enable the adjustment of real-time charging schedules based on grid status, energy pricing, and vehicle demand, an aspect that becomes increasingly important as EV adoption grows and grid control becomes more complex. These findings clearly show that RL has the potential to be a strong component of future smart grid solutions and energy management for EVs, as shown in Fig. 11.
Simulation result of Reinforcement Learning-Based Adaptive Control.
The three methods, AI-Based Predictive Charging, Game Theory-Based Dynamic Pricing, and Reinforcement Learning-Based Adaptive Control, while distinct in their approaches, share several key similarities and correlations when applied to the optimization of EV charging, grid management, and energy distribution.
The reward values are normalized to reflect relative operational preference rather than absolute monetary gain. This normalization ensures stable policy convergence and prevents reward saturation, a common issue in Q-learning-based energy management systems.
While the core state space focuses on battery level, grid demand, electricity price, and time, user travel urgency and battery aging costs are implicitly incorporated through charging constraints and reward penalties, ensuring tractable state dimensionality without compromising decision relevance.
The reward function \(\:{R}_{t}\) balances economic cost \(\:{E}_{t}\), grid stability \(\:{d}_{t}\), battery aging \(\:{C}_{aging}\), and user satisfaction \(\:{U}_{t}\). Weighting coefficients \(\:\alpha\), \(\:\beta\), \(\:\gamma\), \(\:\delta\) regulate the trade-offs between system-level and user-centric objectives, as given by Eq. (11).
The multi-objective reward formulation enables the RL agent to balance competing operational goals. Economic efficiency and grid stability are prioritized through cost and demand penalties, while battery aging cost discourages excessive cycling. User satisfaction is encouraged via timely charging completion. Weight normalization ensures stable learning and prevents dominance of any single objective. This design aligns with practical V2G operation, where economic, technical, and user-centric factors must be jointly optimized.
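The weighted multi-objective reward of Eq. (11) can be sketched as a normalized combination; the sign convention (costs penalized, satisfaction rewarded) follows the description above, while the weight values are illustrative assumptions.

```python
def multi_objective_reward(cost, demand_dev, aging, satisfaction,
                           alpha=0.4, beta=0.3, gamma=0.2, delta=0.1):
    """R_t = -alpha*E_t - beta*d_t - gamma*C_aging + delta*U_t,
    with weights normalized so alpha+beta+gamma+delta = 1 to prevent
    any single objective from dominating."""
    assert abs(alpha + beta + gamma + delta - 1.0) < 1e-9
    return (-alpha * cost - beta * demand_dev
            - gamma * aging + delta * satisfaction)
```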
Unified performance evaluation metrics
To ensure objective comparison, all methods are evaluated using unified quantitative metrics, including (i) the charging cost reduction rate, (ii) the grid peak–valley difference mitigation ratio, and (iii) the EV user satisfaction index. These metrics are computed directly from simulation outputs rather than subjective scoring.
Comparative performance evaluation
Charging cost reduction is defined as the percentage reduction in charging costs relative to uncontrolled charging. Peak–valley difference reduction measures the smoothing of base grid demand, as analysed in Table 6. The mean EV waiting time indicates the level of charging convenience for users. All metrics are generated under the same simulation settings and averaged across multiple runs to ensure consistency and reproducibility.
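The metric definitions above translate directly into computations over raw simulation outputs. The sketch below assumes hypothetical array names and shapes (the paper does not specify its data structures); it shows one plausible way to derive the three metrics.

```python
import numpy as np

def unified_metrics(cost_ctrl, cost_unctrl, load_ctrl, load_base, wait_times):
    """Compute the three unified metrics from raw simulation outputs.
    All names and array shapes here are illustrative assumptions."""
    # (i) percentage cost reduction vs. uncontrolled charging
    cost_reduction = 100.0 * (cost_unctrl - cost_ctrl) / cost_unctrl
    # (ii) peak-valley difference mitigation vs. base grid demand
    pv_base = load_base.max() - load_base.min()
    pv_ctrl = load_ctrl.max() - load_ctrl.min()
    pv_mitigation = 100.0 * (pv_base - pv_ctrl) / pv_base
    # (iii) mean EV waiting time as a convenience proxy
    return cost_reduction, pv_mitigation, float(np.mean(wait_times))

metrics = unified_metrics(
    cost_ctrl=80.0, cost_unctrl=100.0,
    load_ctrl=np.array([20.0, 30.0, 40.0]),
    load_base=np.array([10.0, 30.0, 50.0]),
    wait_times=np.array([5.0, 15.0]),
)  # -> (20.0, 50.0, 10.0)
```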
Below is an exploration of the correlation between these methods:
Common goal: optimization of EV charging and grid management
All three approaches are designed to maximize energy distribution efficiency and minimize EV charging operational costs. Each solution targets different charging dimensions—timing, pricing, and real-time adjustment. AI-Based Predictive Charging anticipates the best times to charge based on past and real-time data to optimize charging schedules. Game Theory-Based Dynamic Pricing optimizes pricing strategies between grid operators and EV owners, influencing when and how much energy EV owners charge. Reinforcement Learning-Based Adaptive Control uses real-time feedback to learn optimal charging and discharging strategies.
Data-driven decision-making
All three methods rely on data-driven decision-making for optimizing charging. They use real-time data from the grid, energy prices, and vehicle statuses, enabling them to adjust charging behaviour dynamically. AI-Based Predictive Charging uses machine learning or AI algorithms to predict future charging patterns based on historical and current data. Game Theory-Based Dynamic Pricing relies on real-time information regarding grid load and pricing to optimize the charging decisions for both EV owners and grid operators. Reinforcement Learning-Based Adaptive Control continuously adapts charging strategies by learning from real-time feedback from the grid and charging behaviour.
Interaction between EV owners and grid operators
These approaches are structured around the interaction between EV drivers and grid operators to achieve optimal charging schedules and grid stability. The distinction lies in how these interactions are modelled and controlled. AI-based predictive charging primarily optimizes the charging behavior of individual EVs based on forecasted data, without explicitly modelling interactions with the grid. Game Theory-Based Dynamic Pricing explicitly simulates the non-cooperative or cooperative interactions among EV owners (as consumers) and grid operators (as service providers) to optimize pricing strategies. Reinforcement Learning-Based Adaptive Control employs an adaptive feedback mechanism in which the system (RL agent) adapts based on interactions between the grid and the charging station, thereby maximizing charging efficiency iteratively.
Balancing load and reducing peak demand
All three methods help alleviate congestion on the grid and peak demand. However, they do so differently: AI-Based Predictive Charging forecasts when demand will peak and schedules EV charging for low-demand times, without adding further load to the grid. Game Theory-Based Dynamic Pricing employs dynamic pricing to encourage EV owners to charge their vehicles at low-demand times to balance supply and demand and reduce grid load. Reinforcement Learning-Based Adaptive Control adapts charging schedules continuously in real time, learning when charging is optimal based on grid load and energy prices.
Adaptability and flexibility
AI-Based Predictive Charging is very effective at predicting charging patterns but is static, as it uses past data to anticipate future charging behaviour. Game Theory-Based Dynamic Pricing is more dynamic in its response, adapting faster to sudden grid status changes and rapidly changing conditions through pricing adjustments. Reinforcement Learning-Based Adaptive Control is the most proactive of the three, as it learns and adjusts continuously to real-time information and changes, tailoring the charging process to optimize based on ongoing interactions with the grid.
Real-time decision making vs. long-term predictions
AI-Based Predictive Charging is largely concerned with long-term forecasting based on past trends and models. Game Theory-Based Dynamic Pricing is more concerned with short-term, real-time pricing decisions for grid management and customer behaviour. Reinforcement Learning-Based Adaptive Control centres on real-time decision-making and ongoing optimization, learning from feedback and adapting, as shown in Table 7.
AI-Based Predictive Charging is best suited for situations where charging behaviour can be predicted based on historical data. It’s more effective in stable environments but lacks real-time adaptability. Game Theory-Based Dynamic Pricing is best for optimizing pricing and guiding consumer behaviour. It works well for balancing demand and incentivizing off-peak charging, and it is easier to integrate into existing systems. Reinforcement Learning-Based Adaptive Control is most suitable for highly dynamic and complex systems that require continuous, real-time adjustments to optimize charging behaviour. It is computationally expensive but offers the highest flexibility and adaptability, as shown in Table 7.
The smart charging framework's decision-making draws on grid demand, power price, battery state of charge (SOC), renewable energy availability, user flexibility, and normalized internal scores (Table 8). These scores are ordinal, derived by threshold-based standardization rather than from raw physical measures. All values are expressed as multiples of five for readability, robust aggregation of heterogeneous parameters, and low sensitivity to minor fluctuations. This level of discretization facilitates stable optimization and learning in the AI and RL modules while preserving decision correctness. Sensitivity analysis shows that decision trends and comparative results are unaffected by the score resolution, so the findings can be relied on. The evaluation metrics (charging cost reduction, peak–valley difference reduction, and average EV waiting time) follow the definitions given in the comparative performance evaluation and are obtained from identical simulation settings, averaged over multiple runs, to ensure consistency and reproducibility.
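Threshold-based standardization onto a multiples-of-five scale can be sketched as follows. The function name, bounds, and step size are assumptions for illustration; the paper does not publish its exact mapping.

```python
def to_score(value, vmin, vmax, step=5, smax=100):
    """Threshold-based standardization of a raw measure onto an ordinal
    0..smax score expressed in multiples of `step` (illustrative mapping)."""
    # clamp the normalized fraction so out-of-range inputs saturate
    frac = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    return int(round(frac * smax / step)) * step

to_score(0.62, 0.0, 1.0)  # maps 0.62 on [0, 1] to the score 60
to_score(1.40, 0.0, 1.0)  # out-of-range input clamps to 100
```

The coarse step is what gives the scores their low sensitivity to minor swings: any raw value within the same 5-point band maps to the same score.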
Comparison of charging Algorithms based on correlation factors.
Figure 12 indicates that the normalized internal decision scores differ across conditions and operational time spans. The values shown are not real-world physical quantities but the relative influence of grid demand, power prices, and battery condition on charging decisions. The discrete scoring method is used to examine the system's dynamics, as it emphasizes consistent trends and comparisons. The selected scale minimizes sensitivity to noise and stabilizes AI-based decision-making, although finer granularity could increase numerical accuracy. The findings hold irrespective of the discretization level adopted, because the observed charging patterns and transitions are similar across all levels.
Fusion evaluation
The performance evaluation focuses on the coordinated operation of the proposed fusion framework rather than isolated optimization techniques.
Fusion evaluation analysis.
Figure 13 compares the performance of individual optimization methods with the proposed fusion framework. Charging cost and peak load metrics are shown for AI-only, game-theory-only, reinforcement-learning-only, and fused approaches. The fusion framework consistently achieves lower charging costs and reduced peak demand, demonstrating the benefit of coordinated decision-making. The results confirm that integrating forecasting, pricing, and adaptive control yields superior system-level performance compared to applying each technique independently.
While public datasets provide valuable historical insights, they lack closed-loop interaction between forecasting, pricing, and control layers. Therefore, simulation is employed to enable dynamic coordination, which is essential to the proposed framework. All baseline methods are evaluated under identical simulation conditions and input constraints to ensure fairness. Parameter settings follow commonly adopted values reported in the literature. All parameters, weight ranges, and evaluation conditions are explicitly defined to ensure reproducibility. Sensitivity analyses demonstrate robustness to parameter variation, supporting scientific rigor and repeatability.
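The closed-loop coordination described above can be illustrated with a minimal pipeline in which each layer is a deliberately simple stand-in: a persistence forecast for the AI layer, a linear demand-to-price rule for the game-theoretic layer, and a threshold rule for the RL layer. All function names and parameters are hypothetical; the point is the information flow (forecast → price → action), not the internal models.

```python
import numpy as np

def forecast_demand(history):
    # AI layer stand-in: naive persistence forecast (hypothetical)
    return float(history[-1])

def equilibrium_price(demand, p0=0.20, k=0.002):
    # Game-theoretic layer stand-in: price rises with forecast demand
    return p0 + k * demand

def rl_policy(soc, price, threshold=0.30):
    # RL layer stand-in: a threshold rule a trained agent might learn
    if soc < 0.2:
        return "charge"  # protect the user's minimum driving range
    return "discharge" if price > threshold else "charge"

history = np.array([40.0, 55.0, 80.0])              # past grid demand (MW)
price = equilibrium_price(forecast_demand(history))  # 0.20 + 0.002*80 = 0.36
action = rl_policy(soc=0.6, price=price)             # high price -> "discharge"
```

Even with these toy layers, the coupling is visible: a higher demand forecast raises the price signal, which flips the control decision from charging to discharging.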
Mechanistic interpretation and applicability analysis
Reinforcement learning behavior analysis
The strong real-time flexibility of the RL-based approach stems from its online policy-update mechanism, which maps system states to actions without requiring global optimization. In contrast to optimization-based approaches, RL responds instantly to fluctuations in price and demand, adapting quickly to stochastic grid and user behavior. This makes RL especially well suited to real-time charging management in dynamic systems.
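The online update can be illustrated with one tabular Q-learning step; the states, actions, and hyperparameters below are hypothetical and only sketch the mechanism, not the paper's trained agent.

```python
def q_update(Q, s, a, r, s_next, actions, lr=0.1, gamma=0.95):
    """One online tabular Q-learning step: an incremental local correction
    toward the observed reward, with no global horizon-wide optimization."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    td_target = r + gamma * best_next
    Q[(s, a)] = Q.get((s, a), 0.0) + lr * (td_target - Q.get((s, a), 0.0))

actions = ("charge", "discharge", "idle")
Q = {}
# A price spike makes discharging rewarding; the value estimate shifts
# immediately after a single observed transition.
q_update(Q, ("high_price", "high_soc"), "discharge", r=1.0,
         s_next=("mid_price", "mid_soc"), actions=actions)
```

Because each update touches only the state-action pair just visited, the policy adapts as fast as observations arrive, which is the source of the real-time flexibility described above.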
Economic interpretation of game-theoretic pricing
EV profit rises at the beginning of the game, when dynamic pricing encourages participation while demand and price are favorable. As participation grows, however, competition among EVs drives price convergence and lowers marginal profits. This is a classical equilibrium saturation effect, in which an increasing number of participants dilutes individual gains, consistent with non-cooperative game theory.
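The dilution effect can be made concrete with a symmetric toy game in which the discharge price follows a hypothetical inverse-demand curve (all parameters are illustrative, not fitted to the simulation):

```python
def per_ev_profit(n, energy=10.0, p0=0.60, k=0.05, cost=0.20):
    """Per-EV discharge profit when n EVs participate. The price falls as
    total participation rises (hypothetical inverse demand), so individual
    gains dilute with n."""
    price = p0 / (1.0 + k * n)      # more sellers -> lower clearing price
    return energy * (price - cost)  # revenue minus per-unit discharge cost

profits = [per_ev_profit(n) for n in (1, 10, 50)]
# Profit shrinks monotonically with participation and eventually crosses
# zero, which is where equilibrium participation saturates.
```

In this stylized model, rational EVs keep joining only while profit stays positive, reproducing the saturation behavior the simulation exhibits.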
Predictive model effect on system behavior
The AI-based predictive charging method enhances system efficiency by shifting demand to low-price, low-load periods. Its performance, however, depends on the accuracy and stability of forecasts over time. Prediction errors or unexpected changes in demand can degrade performance, underscoring the method's reliance on historical rather than current conditions.
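A minimal sketch of this load-shifting behavior, assuming a hypothetical hourly price forecast: charging hours are chosen greedily from the cheapest forecast slots, so any forecast error moves charging into the wrong slots, which is exactly the fragility noted above.

```python
import numpy as np

def schedule_charging(price_forecast, hours_needed):
    """Greedy predictive scheduling: pick the cheapest forecast hours.
    If the forecast is wrong, charging lands in suboptimal slots."""
    cheapest = np.argsort(price_forecast)[:hours_needed]
    return sorted(int(h) for h in cheapest)

forecast = np.array([0.30, 0.12, 0.10, 0.25, 0.40, 0.18])  # $/kWh per hour
schedule_charging(forecast, 3)  # selects hours [1, 2, 5]
```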
Scenario-specific applicability
Table 9 summarizes the appropriateness of each approach for representative V2G deployment scenarios, such as urban distribution networks and highway fast-charging stations.
Predictive charging based on AI is most efficiently applied in settings where demand patterns remain constant, e.g., residential grids. Game-theoretic pricing is effective in market-oriented situations where the number of rational participants is large. RL-based control is effective in highly dynamic environments that require quick adaptation, as in highway charging stations. These limits emphasize that there are no general-purpose approaches that work in every situation and that a coordinated or hybrid strategy is needed.
Quantitative summary
RL is the most adaptable, but its training is the most complex. Game-theoretic approaches are economically interpretable but sensitive to equilibrium assumptions. AI-based predictive models minimize peak load, but this depends on forecast precision. These trade-offs reflect the functional strengths and shortcomings of each approach. The observed behaviors align with the engineering constraints of real V2G systems, as latency, user behavior variability, and infrastructure constraints directly affect the effectiveness of the methods. Quantitative evaluation shows that game-theoretic dynamic pricing achieves the greatest reduction in charging costs, highlighting its effectiveness in economic optimization. Reinforcement learning-based control achieves superior reduction of peak–valley differences and lower EV waiting times through its adaptive decision-making. AI-based predictive charging delivers stable but moderate performance, particularly under predictable demand conditions. These results confirm that each method exhibits distinct strengths aligned with specific optimization objectives rather than universal superiority.
Reproducibility and evaluation transparency
All simulation parameters, evaluation metrics, and comparison conditions are explicitly defined to support reproducibility. Each experiment is conducted under identical system settings, and reported results represent averaged outcomes over multiple simulation runs to reduce stochastic bias. Unified quantitative metrics are used across all compared methods to ensure fair and transparent performance evaluation. This design allows independent researchers to replicate the experimental setup and verify the reported conclusions.
Conclusion
This research integrates three key technologies (AI forecasting, game-theoretic pricing, and reinforcement learning control) into a unified V2G energy management system. Unlike existing approaches that apply these methods separately, our framework allows them to share information across layers, making collective decisions more efficient. Our comparative evaluation demonstrates that the framework outperforms existing state-of-the-art methods. AI models predict grid conditions to enable better scheduling, game theory optimizes pricing to balance supply and demand, and reinforcement learning adjusts charging in real time based on these prices. Together, these three methods reduce costs, improve grid stability, and enable sustainable EV integration. As EV adoption increases, this hierarchical approach combines the strengths of forecasting, game theory, and reinforcement learning into a practical, cost-effective solution for large-scale EV integration.
Although the proposed framework demonstrates effective performance under simulated V2G scenarios, several limitations remain. First, the evaluation relies on simulation-based data rather than real-world grid operation datasets, which may not capture all infrastructure constraints. Second, the reinforcement learning model does not explicitly model battery aging dynamics or long-term user behavior. Third, coordination among multiple charging stations is not considered. Future work will focus on real-world deployment, enhanced battery degradation modeling, and large-scale multi-station coordination.
Data availability
All data supporting the findings of this study are included within the article itself. No additional datasets were generated or analyzed beyond those presented in the manuscript.
References
Shahriar, S., Al-Ali, A. R., Osman, A. H., Dhou, S. & Nijim, M. Prediction of EV charging behavior using machine learning. IEEE Access. 9, 111576–111586. https://doi.org/10.1109/ACCESS.2021.3103119 (2021).
Yan, S., Shah, M. H., Li, J., O’Connor, N. & Liu, M. A review on AI algorithms for energy management in e-mobility services. arXiv:2309.15140, 1–8. (2023). https://doi.org/10.48550/arXiv.2309.15140
Thwany, H., Alolaiwy, M., Zohdy, M., Edwards, W. & Kobus, C. J. Machine learning approaches for EV charging management: A systematic literature review. In 2023 IEEE International Conference on Artificial Intelligence, Blockchain, and Internet of Things (AIB Things). (2023). https://doi.org/10.1109/AIBThings58340.2023.10292487
Shahriar, S., Al-Ali, A. R., Osman, A. H., Dhou, S. & Nijim, M. Machine learning approaches for EV charging behavior: A review. IEEE Access. 8, 168980–168993. https://doi.org/10.1109/ACCESS.2020.3023388 (2020).
Shern, S. J. et al. Artificial intelligence optimization for user prediction and efficient energy distribution in electric vehicle smart charging systems. Energies 17, 1–25. https://doi.org/10.3390/en17225772 (2024).
Kazemtarghi, A., Mallik, A. & Chen, Y. Dynamic pricing strategy for electric vehicle charging stations to distribute the congestion and maximize the revenue. Int. J. Electr. Power Energy Syst. 58. https://doi.org/10.1016/j.ijepes.2024.109946 (2024).
Chen, P. et al. Game theory based optimal pricing strategy for V2G participating in demand response. IEEE Trans. Ind. Appl. 59 (4). https://doi.org/10.1109/TIA.2023.3273209 (2023).
Rasheed, M. B., Llamazares, Á., Ocaña, M. & Revenga, P. A game-theoretic approach to mitigate charging anxiety for electric vehicle users through multi-parameter dynamic pricing and real-time traffic flow. Energy 304, 132103. https://doi.org/10.1016/j.energy.2024.132103 (2024).
Zavvos, E., Gerding, E. H. & Brede, M. A comprehensive game-theoretic model for electric vehicle charging station competition. IEEE Trans. Intell. Transp. Syst. 23 (8). https://doi.org/10.1109/TITS.2021.3111765 (2022).
Gao, Q., Li, H., Peng, K., Zhang, C. & Qu, X. A real-time charging price strategy of distribution network based on comprehensive demand response of EVs and cooperative game. J. Energy Storage. 101, 113805. https://doi.org/10.1016/j.est.2024.113805 (2024).
Zhang, F., Yang, Q. & An, D. CDDPG: A deep-reinforcement-learning-based approach for electric vehicle charging control. IEEE Internet Things J. 8 (5). https://doi.org/10.1109/JIOT.2020.3015204 (2021).
Park, K. & Moon, I. Multi-agent deep reinforcement learning approach for EV charging scheduling in a smart grid. Appl. Energy. 328, 120111. https://doi.org/10.1016/j.apenergy.2022.120111 (2022).
Bertolini, A., Martins, M. S. E., Vieira, S. M. & Sousa, J. M. C. Power output optimization of electric vehicles smart charging hubs using deep reinforcement learning. Expert Syst. Appl. 201, 116995. https://doi.org/10.1016/j.eswa.2022.11 (2022).
Wang, S., Bi, S. & Zhang, Y. A. Reinforcement learning for real-time pricing and scheduling control in EV charging stations. IEEE Trans. Industr. Inf. 17 (2). https://doi.org/10.1109/TII.2019.2950809 (2021).
Lee, W. et al. A real-time intelligent energy management strategy for hybrid electric vehicles using reinforcement learning. IEEE Access. 9. https://doi.org/10.1109/ACCESS.2021.3079903 (2021).
Jia, Z., Chen, Q. & Xu, Q. The Art of balancing price and plug: developing a theoretical model for dynamic pricing in the electric vehicle market. Sustainability 16, 9325. https://doi.org/10.3390/su16219325 (2024).
Liu, D., Wang, W., Wang, L., Jia, H. & Shi, M. Dynamic pricing strategy of electric vehicle aggregators based on DDPG reinforcement learning algorithm. IEEE Access. 9, 21556–21566. https://doi.org/10.1109/ACCESS.2021.3055517 (2021).
Chavhan, S., Gupta, D., Alkhayyat, A., Alharbi, M. & Rodrigues, J. J. P. C. AI-empowered game theoretic-enabled dynamic electric vehicles charging price scheme in smart City. IEEE Syst. J. 17 (4). https://doi.org/10.1109/JSYST.2023.3307497 (2023).
Piao, L., Ai, Q. & Fan, S. Game theoretic based pricing strategy for electric vehicle charging stations. In International Conference on Renewable Power Generation (RPG 2015). (2015). https://doi.org/10.1049/cp.2015.0557
Muratori, M. Impact of uncoordinated plug-in electric vehicle charging on residential power demand. Nat. Energy. 3, 193–201. https://doi.org/10.1038/s41560-017-0074-z (2018).
Richardson, D. B. Electric vehicles and the electric grid: A review of modelling approaches, impacts, and renewable energy integration. Renew. Sustain. Energy Rev. 19, 247–254. https://doi.org/10.1016/j.rser.2012.11.042 (2013).
Ma, Z., Callaway, D. S. & Hiskens, I. A. Decentralized charging control of large populations of plug-in electric vehicles. IEEE Trans. Control Syst. Technol. 21 (1), 67–78. https://doi.org/10.1109/TCST.2011.2174059 (2013).
Dorokhova, M., Martinson, Y., Ballif, C. & Wyrsch, N. Deep reinforcement learning control of electric vehicle charging in the presence of photovoltaic generation. Appl. Energy. 301, 117504. https://doi.org/10.1016/j.apenergy.2021.117504 (2021).
Heendeniya, C. B. & Nespoli, L. A stochastic deep reinforcement learning agent for grid-friendly electric vehicle charging management. Energy Inf. 5, 28. https://doi.org/10.1186/s42162-022-00197-5 (2022).
Fan, P., Zhang, Y., Liu, X. & Wang, Z. A load frequency coordinated control strategy for multimicrogrids with V2G based on improved MA-DDPG. Int. J. Electr. Power Energy Syst. 146, 108765. https://doi.org/10.1016/j.ijepes.2022.108765 (2023).
Wen, Y., Li, Y., Zhang, H. & Wang, J. An optimal scheduling strategy of a microgrid with V2G based on deep Q-learning. Sustainability 14 (16), 10351. https://doi.org/10.3390/su141610351 (2022).
Ke, S., Zhang, Y., Liu, J. & Wang, L. A frequency control strategy for EV stations based on MPC-VSG in islanded microgrids. IEEE Trans. Industr. Inf. 20 (2), 1819–1831. https://doi.org/10.1109/TII.2023.3272456 (2023).
Pan, Z., Li, Q., Sun, H. & Wang, B. Deep reinforcement learning-based online vehicle-to-grid scheduling for multi-energy microgrids. Energies 17 (11), 2491. https://doi.org/10.3390/en17112491 (2024).
Zhang, R., Liu, Y., Chen, X. & Wang, P. Deep reinforcement learning-based vehicle-to-grid operation considering solar power forecast errors. Sustainability 16 (9), 3851. https://doi.org/10.3390/su16093851 (2024).
Elkhodr, M., Shahrestani, S. & Cheung, H. Semantic-aware intelligent vehicle-to-grid energy management framework. Computers 13 (10), 249. https://doi.org/10.3390/computers13100249 (2024).
Shang, Y. et al. Spatio-temporal data fusion framework based on large Language model for enhanced prediction of electric vehicle charging demand in smart grid management. Inform. Fusion. 103692. https://doi.org/10.1016/j.inffus.2024.103692 (2025).
Li, Z. et al. An accessible close-loop V2V charging mechanism under charging station with non-cooperative game. Energy Rep. 8, 1038–1044. https://doi.org/10.1016/j.egyr.2022.01.098 (2022).
Funding
The authors received no specific funding for this study.
Author information
Contributions
Conceptualization, methodology: VN & KB; formal analysis and investigation: PKN, VN & KB; writing - original draft preparation: WKW; writing - review and editing: WKW & PKN; supervision: SK, KB & VR. All authors reviewed the results and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Nandagopal, V., Bhaskar, K., Periakaruppan, S. et al. A hierarchical fusion framework for vehicle to grid energy management using predictive intelligence and learning based pricing. Sci Rep 16, 6019 (2026). https://doi.org/10.1038/s41598-026-37243-1