Introduction

Real estate is a pillar industry of the national economy, and its price fluctuations and stable development directly affect overall economic performance. The relationship between supply and demand in the real estate market has become increasingly complex, and price changes have become more frequent and harder to predict. Accurate prediction of real estate price (REP) can help investors judge when to enter the market, reduce investment risk, and improve returns. It can also help developers track market dynamics, formulate reasonable development strategies and sales plans, and improve the profitability and market competitiveness of their projects (Sharma et al., 2024; Song and Ma, 2024). Studying REP prediction models is therefore of considerable practical importance and helps to promote the housing market.

Research on REP forecasting currently faces several technical difficulties. The first is the large number of factors influencing REP, which include macroeconomic conditions, fluctuations in the policy environment, the delicate balance between market demand and supply, the uniqueness of geographical location, and the quality of surrounding supporting facilities (Møller et al., 2024). These factors interact intricately (Wu, 2024), and some are difficult to quantify directly because of their abstract or dynamic nature, which hinders the construction of accurate prediction models. Secondly, the relationships between REP and these factors are not linear but nonlinear (Nirmala et al., 2024; Millar and White, 2024). A prediction model therefore needs a high degree of flexibility and deep learning ability to capture the nonlinear patterns in the data, which poses a severe challenge to traditional prediction methods (Rey-Blanco et al., 2024). Finally, acquiring and processing real estate market data is also a major challenge. Data incompleteness and lag are particularly prominent: key data such as land and housing transactions are often missing or delayed, which makes it difficult for a model to reflect market dynamics comprehensively and in a timely manner (Li, 2023; Mao, 2023). Such defects in data quality limit the prediction model and affect the accuracy and timeliness of its results (Liu and Ma, 2024; Yang et al. 2024; Yang et al. 2024).

To solve the above problems, many REP prediction models have been proposed. Wheeler et al. (2014) discussed the differences between the functional forms used in the hedonic (characteristic) price model for evaluating housing prices, and found that the Bayesian method achieves better results for this task. Wei et al. (2022) reviewed 124 studies on using big data to improve real estate evaluation with the hedonic price model. Guirguis et al. (2024) used autoregression to predict the house price index, and their model surpassed the autoregressive moving average model and the GARCH model in out-of-sample empirical prediction. Zhao et al. (2019) established an autoregressive integrated moving average (ARIMA) model based on a training and validation method, and used it to forecast house prices in New Zealand. The results demonstrate that ARIMA often outperforms multiple linear regression models (Soltani and Lee, 2024). Zulkifley et al. (2020) employed SVM to build a price prediction model and used a Genetic Algorithm (GA) to optimize the parameter selection of the SVM. The empirical analysis showed that the model predicted well. Fang (2022) used a BP network to predict the prices of auction houses (Zhu and Li, 2021) and used GA to optimize the model. Alfaro-Navarro et al. (2020) applied a variety of ensemble algorithms to predict housing prices in Spain. Wang et al. (2021) introduced a novel network based on the Bagging ensemble method and macroeconomic data, and predicted the housing price index of four municipalities directly under the central government of China. These methodologies, which include machine learning and deep learning approaches, have brought significant improvements in model structure and point to directions for enhancing feature processing.

In terms of model structure, deep learning methods such as convolutional neural networks (CNNs), recurrent neural networks, and their variants have been explored to improve the predictive capabilities of housing price models. These architectures extract hierarchical features from the input data, enabling the models to learn more complex representations of housing market dynamics.

Although current REP prediction models have made notable progress in exploration and practice, they continue to encounter significant challenges (Liu, 2022; Khrais and Shidwan, 2023; Rampini and Re Cecconi, 2022), particularly in predicting prices accurately under the combined influence of numerous factors. These factors are complex and intertwined, so conventional deep learning models cannot express their relationships fully and accurately. To account for these diverse influencing factors more effectively, we propose an REP prediction model based on an adaptive loss function (ALF) (Baik et al., 2021) and feature embedding optimization. First, because the influence of different factors on REP may change dynamically with time, market conditions, and other circumstances, we design a loss function whose weights are adjusted automatically. This adaptive mechanism enables the model to respond flexibly to changes in the importance of the various factors during training. Secondly, to process and integrate the multi-dimensional features used in REP prediction, we use feature embedding to transform high-dimensional, sparse original features into low-dimensional, dense vector representations while retaining key information. We further optimize the feature embedding process so that the model can fully mine and exploit the latent relationships among these features and thereby improve prediction accuracy.

Related works

REP prediction is a hot research topic now, and many excellent results have been achieved. Numerous studies have been conducted to investigate the various influencing factors of REP. Ganioğlu and Seven (2021) took the developing country Turkey as an analysis sample and found that the price in Turkey was influenced by the inflow of income, population, education, unemployment, and refugees, and the housing price in Turkey showed long-term convergence. Churchill et al. (2018) delved into the convergence patterns exhibited by residential house prices across the capital cities of Australian states and subsequently constructed a nonlinear model to capture the intricacies of house price dynamics. The results showed that the house prices in Australian states did not converge. Duca et al. (2021), on the other hand, established a connection between the housing market and various aspects such as credit markets, broader economic phenomena and so on (Turnbull and Zheng, 2021; Turnbull et al., 2018; Mathur, 2017), illustrating the interconnectedness and far-reaching implications of housing market dynamics.

For the prediction of REP, Peng et al. (2023) used GA as the entry point to improve the model. In the actual data analysis and application, it was empirically found that the improved model has better valuation accuracy. Zhao et al. (2024) improved the BP network upon the fruit fly and frog-leaping algorithm, and the valuation effect of the model was greatly improved. Gabauer et al. (2024) employed the high-dimensional sparse vector autoregressive model to predict the REP of 35 cities, which could better mine the key explanatory variables and economic information, and had a better prediction effect. Jiang et al. (2023) found that the use of web crawler technology can identify key factors that can significantly impact housing prices, and the prediction accuracy of housing prices can be improved by combining Internet data with a VAR model. Liang (2023) proved through research that ARIMA model can make continuous predictions for the price of the Chinese second-hand housing market, which provides a certain reference basis for buyers and sellers in China’s real estate market. Lorenz et al. (2023) constructed random forest, boosting, and Bagging models based on the advanced network search data to predict the REP. Comparative analysis showed that the random forest model combined with network search data had the best prediction effect. Du et al. (2014) combined an SVR model with linear regression to predict the REP, and proved that the fusion model had higher prediction accuracy and better fitting effect than a single model. Peng et al. (2023) introduced the Barbara method into SVR, so that it could adjust the three parameters to the maximum extent, so as to establish the BA-SVR&WSD prediction model.

Methodology

We propose an REP prediction model based on the ALF and feature embedding optimization for the realistic situation in which REP is strongly affected by many complex, intertwined factors. The core of the model lies in two innovations. First, an optimized feature embedding framework (FEF) is constructed to analyze and characterize the multi-dimensional factors affecting housing prices. Secondly, a reinforcement learning strategy with the ALF is introduced to dynamically adjust the optimization direction of the model during training and so improve the robustness of the prediction results.

First, by mining and analyzing extensive real estate market data, we employ feature embedding techniques to transform the many factors influencing housing prices (such as geographical location, surrounding environment, building quality, and policy regulations) into quantifiable, dense feature vectors of lower dimension. This yields an efficient representation and a dimensionality reduction of the factors affecting housing prices. The process not only reduces the complexity of the data but also significantly improves the ability of the model to capture how housing prices change. Secondly, to increase accuracy, we introduce the ALF. Unlike a traditional fixed loss function, the adaptive mechanism dynamically adjusts the parameters of the loss function according to the current performance and prediction error of the model. The model can therefore adapt its optimization strategy to different types of house price fluctuations, reduce prediction bias, and achieve more accurate house price prediction.
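
The embedding step described above can be illustrated with a minimal sketch. It assumes a single categorical attribute (a hypothetical ten-level floor category) and a randomly initialized embedding table; the names `make_embedding_table`, `embed_by_lookup`, and `embed_by_matmul` are illustrative and not taken from the paper. The sketch shows that looking up a row of the table is equivalent to multiplying the sparse one-hot vector by the table, which is why an embedding maps a high-dimensional sparse input to a low-dimensional dense vector.

```python
import random


def make_embedding_table(num_categories, dim, seed=0):
    """Build a num_categories x dim table of small random values.

    In a trained model these values would be learned; random values are
    an assumption used here only for illustration.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(num_categories)]


def one_hot(index, num_categories):
    """Sparse representation: a vector with a single 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_categories)]


def embed_by_lookup(table, index):
    """Dense representation: simply read row `index` of the table."""
    return list(table[index])


def embed_by_matmul(table, one_hot_vec):
    """Same result computed as one_hot_vec @ table."""
    dim = len(table[0])
    return [sum(one_hot_vec[r] * table[r][c] for r in range(len(table)))
            for c in range(dim)]


NUM_FLOOR_LEVELS = 10   # hypothetical number of categories
EMBED_DIM = 3           # hypothetical embedding size

table = make_embedding_table(NUM_FLOOR_LEVELS, EMBED_DIM)
sparse = one_hot(4, NUM_FLOOR_LEVELS)      # 10 values, nine of them zero
dense = embed_by_lookup(table, 4)          # 3 values, all informative
dense_via_matmul = embed_by_matmul(table, sparse)
```

Here the ten-dimensional one-hot vector is replaced by a three-dimensional dense vector; in practice the table entries are trained jointly with the rest of the network.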

In summary, by integrating the dual advantages of feature embedding optimization and ALF reinforcement learning, the proposed model not only overcomes the limitations of traditional REP prediction methods in dealing with complex factors, but also significantly improves the accuracy and practicability of the prediction, which provides strong support for real estate market analysis, investment decision-making and policy-making.

Fused feature embedding

To extensively integrate various conditions that affect the real estate and understand the internal factors of REP changes, we propose a REP representation method with feature embedding optimization to achieve a deep understanding of REP characteristics.

Because REP is affected by so many factors, a plain recurrent model suffers from vanishing gradients: it captures only the local relationships among the factors, cannot effectively learn long-range dependencies, and retains little memory of the elements at the front of the sequence. Therefore, we redesign the bidirectional GRU (B-GRU) network (RoselinKiruba et al., 2024) to extract the sequence features composed of multiple factors. The network is shown in Fig. 1. It contains a forward part and a reverse part, each of which is a separate GRU. Extracting features from the input in both directions allows the model to learn the contextual relationships among the factors more fully. The state at time t consists of two parts, the forward hidden state h′_n and the reverse hidden state h″_{t−n}, which can be represented as follows:

$${h}_{n}^{{\prime} }={\rm{GRU}}({F}_{t},{h}_{t-1})$$
(1)
$${h}_{t-n}^{{\prime} {\prime} }={\rm{GRU}}(h\,\cap \sim {h}_{n}^{{\prime} },{h}_{t-n+1})$$
(2)
$${h}_{t}={w}_{t}{F}_{t}+{\nu }_{t}{h}_{t-n}^{{\prime} {\prime} }+{b}_{t}$$
(3)

where w_t and ν_t are weight parameters applied to the input F_t and to the reverse hidden state h″_{t−n}, respectively, and b_t is a bias parameter that adjusts the baseline level of the output. Together, these parameters enable the model to learn and capture context from the input data and to generate the final output h_t. In addition, regularization is introduced during training, in particular the dropout operation, which reduces co-adaptation between neurons by temporarily and randomly discarding a portion of the neurons in the network.

Fig. 1 The structure of B-GRU.
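
A bidirectional GRU pass can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scalar inputs and a scalar hidden state, the conventional GRU gate equations, and a weighted sum of the forward and reverse states (weights `w`, `v`, bias `b`), which may differ in detail from Eqs. (1) to (3). All function names and parameter values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x, h, p):
    """One GRU update for scalar input x and scalar hidden state h."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])              # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])              # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])   # candidate state
    return (1.0 - z) * h + z * cand


def run_gru(xs, p):
    """Run a GRU over the sequence xs and return every hidden state."""
    h, states = 0.0, []
    for x in xs:
        h = gru_step(x, h, p)
        states.append(h)
    return states


def bidirectional_gru(xs, p_fwd, p_bwd, w=0.5, v=0.5, b=0.0):
    """Combine a forward pass and a reverse pass position by position."""
    fwd = run_gru(xs, p_fwd)
    # The reverse GRU reads the sequence back to front; its states are
    # flipped again so that index t lines up with the forward states.
    bwd = run_gru(xs[::-1], p_bwd)[::-1]
    return [w * f + v * g + b for f, g in zip(fwd, bwd)]


# Hypothetical parameter values, chosen only so the example runs.
params = {"wz": 0.5, "uz": 0.1, "bz": 0.0,
          "wr": 0.4, "ur": 0.2, "br": 0.0,
          "wh": 0.9, "uh": 0.3, "bh": 0.0}

factor_sequence = [0.2, 0.5, 0.2]   # e.g. three scaled factor values
states = bidirectional_gru(factor_sequence, params, params)
```

Because each position receives information from both passes, the state for the first factor already reflects the factors that follow it, which is the contextual benefit the text attributes to the B-GRU.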

B-GRU is good at capturing the complex relationships through which house prices are affected by many factors. To mine the deep semantic features of each single factor's impact on house prices, we integrate a CNN, which efficiently extracts local correlation information from the data. This is particularly important for analyzing feature patterns in house price data. We use convolution kernels of several different sizes (M-CNN) to model the influence of a single factor on house prices over different ranges and intensities, as shown in Fig. 2. This design allows the model to understand the data at multiple dimensions and scales, which enhances its flexibility and adaptability. By applying filters of different sizes, we obtain both broader and finer views of the data and thus an effective and comprehensive representation of the factors influencing housing prices.

Fig. 2 The framework of M-CNN.
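
The multi-kernel idea can be sketched in a few lines. The example below is an illustrative assumption rather than the paper's architecture: it applies one-dimensional convolutions with three hypothetical kernels of widths 2, 3, and 4 to a short factor sequence, applies ReLU, and keeps the maximum response of each kernel (global max pooling) as one feature.

```python
def conv1d_valid(seq, kernel):
    """Slide `kernel` over `seq` without padding and return the responses."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]


def relu(values):
    return [max(0.0, v) for v in values]


def multi_kernel_features(seq, kernels):
    """One pooled feature per kernel; wider kernels see a wider context."""
    features = []
    for kernel in kernels:
        feature_map = relu(conv1d_valid(seq, kernel))
        features.append(max(feature_map))   # global max pooling
    return features


# Hypothetical scaled values of one factor observed at six positions.
factor_sequence = [0.1, 0.4, 0.3, 0.9, 0.2, 0.5]

# Hypothetical kernels: a difference filter and two averaging filters.
kernels = [
    [1.0, -1.0],                 # width 2: reacts to sharp local drops
    [0.5, 0.5, 0.5],             # width 3: medium-range level
    [0.25, 0.25, 0.25, 0.25],    # width 4: broader average
]

features = multi_kernel_features(factor_sequence, kernels)
```

In a trained network the kernel values are learned and many kernels of each width are used; the pooled features are then concatenated and passed to later layers.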

The integrated model structure combining B-GRU and CNN not only fully leverages the advantages of GRU in processing sequential data and capturing long-term dependencies, but also leverages the powerful feature extraction capabilities of CNN to deeply uncover the intrinsic relationships and underlying patterns among various factors in housing price data. This comprehensive model demonstrates significant performance improvements and higher prediction accuracy in applications such as housing price prediction. By incorporating the bidirectional nature of GRU, the model can effectively capture information from both past and future contexts, enabling it to more accurately predict housing prices based on a comprehensive understanding of market trends and historical data. Additionally, the CNN component enhances the model’s ability to extract intricate patterns and relationships within the data, which further refines the prediction process and boosts the overall accuracy of the housing price forecasts. Together, these capabilities make the B-GRU-CNN hybrid model a powerful tool for real estate market analysis and price prediction.

Reinforcement learning with adaptive loss functions

To make full use of the many factors affecting REP in the prediction model, we propose a reinforcement learning method with an ALF; the framework is shown in Fig. 3. The framework dynamically adjusts the loss function to capture the different impacts of the individual factors on the prediction accuracy of REP and thereby optimize prediction performance. Through reinforcement learning, the model automatically identifies and emphasizes the factors that most strongly influence the results during learning, while weakening or ignoring the interference caused by secondary factors, and finally achieves more accurate and robust REP prediction.

Fig. 3 The framework of reinforcement learning with the ALF.

We use reinforcement learning as the decision-making unit of the framework to adjust the learning policy and parameters based on performance feedback from the prediction model. Through trial and error, we optimize the ALF and the parameter settings of B-GRU and CNN to maximize prediction accuracy. By training in a simulated environment, the reinforcement learning agent learns how to configure the prediction model most effectively in different market environments, which improves its generalization ability and robustness in practical applications.
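
The trial-and-error loop can be illustrated with a deliberately simple stand-in. The text does not specify the reinforcement learning algorithm, so the sketch below assumes an epsilon-greedy multi-armed bandit: each arm is a candidate configuration, and the reward is a hypothetical score such as the negative validation error. The function `toy_reward` and the configuration key `w_mse` are invented for illustration only.

```python
import random


def epsilon_greedy_search(configs, evaluate, rounds=50, epsilon=0.2, seed=0):
    """Pick the configuration with the highest average reward.

    With probability epsilon a random configuration is explored;
    otherwise the currently best one is exploited.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(configs)
    counts = [0] * len(configs)

    # Try every configuration once so that each average is defined.
    for i, cfg in enumerate(configs):
        totals[i] += evaluate(cfg)
        counts[i] += 1

    for _ in range(rounds):
        if rng.random() < epsilon:
            i = rng.randrange(len(configs))                       # explore
        else:
            i = max(range(len(configs)),
                    key=lambda j: totals[j] / counts[j])          # exploit
        totals[i] += evaluate(configs[i])
        counts[i] += 1

    best = max(range(len(configs)), key=lambda j: totals[j] / counts[j])
    return configs[best], counts


def toy_reward(cfg):
    """Hypothetical stand-in for the negative validation error of a model
    trained with the given loss weight; it peaks at w_mse = 0.7."""
    return -((cfg["w_mse"] - 0.7) ** 2)


candidate_configs = [{"w_mse": w} for w in (0.1, 0.3, 0.5, 0.7, 0.9)]
best_config, trial_counts = epsilon_greedy_search(candidate_configs, toy_reward)
```

In a real setting `evaluate` would train or fine-tune the prediction model and return a validation score, so each trial is far more expensive than in this toy example.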

We build our loss on the Dice loss (Abraham and Khan, 2019), which is defined as:

$${DLoss}({y},{y}^{\prime} )=1-\frac{2\sum {y}_{i}{y}^{\prime}_{i}+\epsilon}{\sum{y}_{i}+\sum {y}^{\prime}_{i}+\epsilon}$$
(4)

where y refers to the predicted results, y’ denotes the truth of the sample, and ε is set to 0.0001. Then, to compensate for Dice’s inability to handle multiple real estate factors, we redesign the loss function as follows:

$${MDL}{oss}=1-2\times \frac{\sum {y}_{i}{y^{\prime} }_{i}+\epsilon }{\sum {w}_{i}\sum {y}_{i}+\sum {w^{\prime} }_{i}\sum {y^{\prime} }_{i}+\epsilon }$$
(5)

where w_i and w′_i are trainable matrices. To avoid a zero denominator in the loss function, the proportion of each factor is corrected by its inverse, which reduces the correlation between each factor and the improved loss. The MSE between the predicted results and the ground truth is backpropagated during model training to realize price regression. The MSE is given as follows:

$${MSE}(y,{y}^{{\prime} })=\frac{\mathop{\sum }\limits_{i=1}^{n}{\left({y}_{i}-{y^{\prime} }_{i}\right)}^{2}}{n}$$
(6)
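
Equations (4) to (6) can be transcribed directly into code. The sketch below is a plain-Python reading of the formulas as printed, under two assumptions that are not stated in the text: the values of y and y′ are scaled to the range [0, 1], and the trainable weights w and w′ are treated as simple per-factor weight lists whose sums enter the denominator of Eq. (5).

```python
EPS = 1e-4   # the smoothing constant epsilon given in the text


def dice_loss(y, y_true, eps=EPS):
    """Eq. (4): 1 - (2 * sum(y * y') + eps) / (sum(y) + sum(y') + eps)."""
    overlap = sum(a * b for a, b in zip(y, y_true))
    return 1.0 - (2.0 * overlap + eps) / (sum(y) + sum(y_true) + eps)


def md_loss(y, y_true, w, w_true, eps=EPS):
    """Eq. (5): the Dice denominator is rescaled by the weights w and w'."""
    overlap = sum(a * b for a, b in zip(y, y_true))
    denominator = sum(w) * sum(y) + sum(w_true) * sum(y_true) + eps
    return 1.0 - 2.0 * (overlap + eps) / denominator


def mse(y, y_true):
    """Eq. (6): mean of the squared differences."""
    return sum((a - b) ** 2 for a, b in zip(y, y_true)) / len(y)


# Hypothetical scaled predictions and targets.
predicted = [1.0, 0.0, 1.0]
target = [1.0, 0.0, 1.0]

perfect_dice = dice_loss(predicted, target)
perfect_md = md_loss(predicted, target, w=[1.0], w_true=[1.0])
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

With weights that sum to one, Eq. (5) differs from Eq. (4) only in where the smoothing term ε enters the numerator, so the two losses nearly coincide; other weight values change how strongly the prediction and target sums penalize the loss.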

Finally, to avoid the limitation of fixed weights to multiple losses, we adopt adaptive weights, so that the model can adaptively assign weights according to the loss values of different training stages during the training process, which are shown in the following equation:

$$L{oss}={w}_{1}{MSE}+{w}_{2}{MDL}{oss}$$
(7)

where w1 and w2 are trainable matrices. By dynamically adjusting the loss function, intelligently screening the key features and optimizing the learning strategy, the complexity and uncertainty problems faced in the REP prediction are effectively solved, and a more accurate and reliable prediction tool is provided for the real estate industry.
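
One possible way to realize loss-dependent weights in Eq. (7) is sketched below. The text states only that w1 and w2 are trainable and adapt to the loss values at different training stages; the softmax rule used here, in which the currently larger loss receives the larger weight, is an assumption made for illustration, as are the function names and the `temperature` parameter.

```python
import math


def adaptive_weights(loss_values, temperature=1.0):
    """Softmax over the current loss values (assumed weighting rule).

    The weights are positive and sum to one; a larger loss value
    receives a larger weight.
    """
    largest = max(loss_values)
    exps = [math.exp((v - largest) / temperature) for v in loss_values]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(mse_value, md_value, temperature=1.0):
    """Eq. (7): Loss = w1 * MSE + w2 * MDLoss with adaptive w1 and w2."""
    w1, w2 = adaptive_weights([mse_value, md_value], temperature)
    return w1 * mse_value + w2 * md_value, (w1, w2)


# Hypothetical loss values at two training stages.
early_total, early_weights = combined_loss(mse_value=0.90, md_value=0.30)
late_total, late_weights = combined_loss(mse_value=0.05, md_value=0.20)
```

In this toy run the MSE term dominates early in training and the MDLoss term dominates later. In a gradient-based framework the weights would typically be treated as constants during backpropagation or learned as separate parameters.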

Experiments

Dataset and implementation settings

We test the REP prediction model on the House sale Dataset (https://zenodo.org/records/6423459, https://doi.org/10.5281/zenodo.6423459). This dataset contains information collected from the Fotocasa and Idealista websites between April 4th and April 7th, 2022. Each entry describes a house listed for sale in the Salamanca or Villaverde district of Madrid using the following attributes: title, location, price, floor area in square meters, number of rooms, floor level, number of photos, availability of floor plans, 3D views, and videos, home staging status, and the full description. The dataset can be used for various research purposes, such as REP prediction, market analysis, and consumer behavior studies. By analyzing these data, researchers can understand the relationship between housing prices and housing characteristics, predict future price trends, evaluate market trends, and gain insight into consumers' preferences for different housing features. The dataset is also a valuable resource for developers, investors, and policymakers, who can use it to better understand market demand and the competitive landscape and to formulate more reasonable pricing strategies, investment plans, and policy measures.

During the training phase, we used a Ryzen 7600X processor together with six Nvidia RTX 3070 GPUs to improve computational efficiency. We chose PyTorch as our framework and configured it according to the training parameters listed in Table 1.

Table 1 Parameter settings.

To evaluate our method fully, we use the mean squared error (MSE), explained variance score (EVS), mean absolute error (MAE), and coefficient of determination (R²) to analyze the accuracy of each model. Higher R² and EVS values indicate better performance, whereas lower MSE and MAE values are better. Before training, the real estate data are cleaned by deletion, imputation, or interpolation to handle missing values, outliers, and duplicates. Feature scaling is then applied, and categorical attributes are converted to numerical form through one-hot encoding.
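
For reference, the four metrics can be computed as in the sketch below, which uses their standard definitions; the sample values are hypothetical and are not taken from the dataset. The example also shows how EVS and R² differ: a prediction that is off by a constant has EVS equal to 1 but R² below 1, because EVS ignores a systematic bias.

```python
def mean(values):
    return sum(values) / len(values)


def mse(y_true, y_pred):
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mae(y_true, y_pred):
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def variance(values):
    m = mean(values)
    return mean([(v - m) ** 2 for v in values])


def r2_score(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)


# Hypothetical targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

scores = {
    "MSE": mse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),
    "R2": r2_score(y_true, y_pred),
    "EVS": explained_variance(y_true, y_pred),
}

# A prediction with a constant offset of +1 on every sample.
biased_pred = [t + 1.0 for t in y_true]
biased_evs = explained_variance(y_true, biased_pred)
biased_r2 = r2_score(y_true, biased_pred)
```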

Ablation experiments

To enhance the predictive precision of REP, we introduce two distinct modules: FEF and ALF. The performance of FEF and ALF is verified by conducting ablation experiments. A common deep CNN was used as a Baseline in the experiment.

We conducted a quantitative analysis of these two independent modules and present the results in Fig. 4. When the FEF module is added to the baseline, the performance of the model improves notably: R² increases from 0.854 to 0.956, while the MSE decreases from 0.0425 to 0.0123. This result shows that the FEF module contributes substantially to the model. We also observe an improvement when ALF alone is added to the baseline, with a decrease of 0.005 in MAE and an increase of 0.038 in EVS. This demonstrates the effectiveness of the ALF module in improving prediction accuracy and interpretability. Furthermore, to examine how the two modules work together, we apply FEF and ALF to the baseline simultaneously. This combination yields the best performance: the MSE is reduced to 0.0059, the MAE is only 0.0099, R² reaches 0.975, and the EVS is 0.951. These indicators show the prediction accuracy of the model and also reflect its explanatory power and generalization ability. In summary, the joint application of the FEF and ALF modules improves the baseline model across all metrics.
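
The size of these improvements can be checked with a few lines of arithmetic. The sketch below uses only the values reported in the paragraph above; the helper names are illustrative.

```python
def absolute_change(before, after):
    return after - before


def relative_change(before, after):
    """Change expressed as a fraction of the starting value."""
    return (after - before) / before


baseline = {"R2": 0.854, "MSE": 0.0425}
with_fef = {"R2": 0.956, "MSE": 0.0123}
with_fef_and_alf = {"R2": 0.975, "MSE": 0.0059}

r2_gain_fef = absolute_change(baseline["R2"], with_fef["R2"])
mse_drop_fef = relative_change(baseline["MSE"], with_fef["MSE"])
mse_drop_full = relative_change(baseline["MSE"], with_fef_and_alf["MSE"])

print(f"R2 gain with FEF: {r2_gain_fef:+.3f}")
print(f"MSE change with FEF: {mse_drop_fef:+.1%}")
print(f"MSE change with FEF and ALF: {mse_drop_full:+.1%}")
```

Adding FEF raises R² by about 0.10 and cuts the MSE by roughly 71% relative to the baseline, and the full model cuts the MSE by roughly 86%.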

Fig. 4 The ablation of FEF and ALF.

In order to clearly show the specific role and advantages of these two modules in REP prediction, we designed and executed a qualitative ablation experiment. In the experiment, we specially select the housing price data of two representative areas in the data set within a year as samples, and evaluate the performance of different modules through comparative analysis. The results are in Fig. 5, where the real price trend of the housing market is clearly depicted by the red solid line. First, we focus on the samples where housing prices show an upward trend. In this case, the comparison results show that both ALF and FEF modules can more closely track and predict the real change of house prices than the baseline, that is, their prediction curve is closer to the real price represented by the red solid line. What is more remarkable is that when the ALF and FEF modules are used together, the prediction accuracy is extremely high. Subsequently, we extend our analysis to the sample where house prices remain relatively stable. Through this comparison, we again verify the previous conclusion that the use of ALF and FEF modules alone or in combination can effectively increase the prediction accuracy and make the prediction results more aligned with the actual housing price. This finding not only strengthens the positive role of ALF and FEF modules in house price forecasting but also further proves their wide applicability and stability under different market conditions.

Fig. 5 The ablation of two locations' REP.

From these experimental results we can also draw the following conclusions. The ALF and FEF modules complement each other in the prediction process. The ALF module improves the sensitivity of the model to complex market dynamics by focusing on key information in the data, while the FEF module enhances the quality of the input features and so provides richer and more valuable information to the model. Combining the two makes the model more comprehensive and accurate in capturing market trends and predicting price changes. The results also show that both modules maintain stable performance in different market environments, which reflects their adaptability and robustness. This is particularly important for the highly uncertain and complex task of REP prediction, because market conditions often change rapidly and a reliable forecasting model must remain accurate in a variety of scenarios.

Comparison with other methods

We conduct an exhaustive and comprehensive performance evaluation of the newly proposed REP prediction model, aiming to verify its effectiveness and superiority in practical applications. For this purpose, we select Li et al. (2017), Li (2023), Liu and Ma (2024), Demirhan and Baser (2024), Zhao et al. (2024), Sharma et al. (2024), Ozalp and Akıncı (2024), and Samadadiya et al. (2024) as the comparison benchmarks, which represent the representative and advanced research results in this field. Meanwhile, we choose a plain CNN as our baseline. The CNN extracts local features from the input data through convolutional operations, and reduces the dimensionality and the number of parameters of the data through pooling operations, ultimately performing classification or regression through fully connected layers.

The evaluation results, shown in Table 2 and Fig. 6, indicate that our REP prediction model performs well on all key evaluation indicators. The MSE of the model is 0.0059, the MAE is 0.795, R² is 0.824, and the EVS is 0.951, which shows the strength of the model in capturing data variability. The detailed comparison with the benchmarks highlights its advantages. Compared with Liu and Ma (2024), our MSE is lower by 0.0064 and our EVS is higher by 0.062. The improvement in these two indicators reflects the gains of our model in prediction accuracy and explanatory ability. Compared with Demirhan and Baser (2024) and Zhao et al. (2024), our MAE improves on theirs by 0.0168 and 0.0133, respectively, which shows that our predictions are closer to the real values and that our error control is better. Likewise, our R² values are 7.9% and 8.9% ahead of those of Demirhan and Baser (2024) and Zhao et al. (2024), which again confirms the model fit and prediction accuracy of our method. Sharma et al. (2024) and Ozalp and Akıncı (2024) perform worse than our method on all evaluation metrics. Although the method of Samadadiya et al. (2024) performs similarly to ours, it still lags behind our model on all key evaluation indicators.

Table 2 Comparison experiments for our method.

Fig. 6 Comparison with other methods.

Our REP prediction model shows excellent performance in the performance evaluation, which is not only significantly better than many comparison methods in various evaluation indicators, but also provides a new solution for accurate prediction and decision support of the real estate market.

Real sample testing

To evaluate the efficiency of our model in the REP prediction application, we visualize the price prediction results on real samples and show the comparisons in Figs. 7 and 8. We use the same real samples as in the ablation experiment, Location 1 and Location 2. Comparing the house price predictions for Location 1, we find that our method is closer to the real prices than those of Ozalp and Akıncı (2024) and Samadadiya et al. (2024). We also report the prediction time of our model for different numbers of houses in Location 1 over 12 months, which indicates that the model scales well with sample size. Fig. 8 leads to the same conclusion for Location 2.

Fig. 7 Method application test for Location 1.

Fig. 8 Method application test for Location 2.

To evaluate the efficiency and performance of our method in REP prediction comprehensively, we take a series of rigorous steps and visualize the key results on real samples. Figures 7 and 8 compare our predictions in different scenarios, which not only verifies the performance of our model but also reveals its particular advantages.

In Fig. 7, we focus on Location 1 and compare the prediction results of our method with those of existing methods. The comparison shows that our prediction curve is closer to the real housing price trend: it is consistent with the overall trend and is also more accurate in local fluctuations. This result demonstrates the advantage of our method in capturing market dynamics and predicting future house prices. We also examine the prediction time for Location 1 with different numbers of listings within twelve months. The experimental results show that, however the number of listings changes, our model completes the prediction in a reasonable time and its accuracy remains stable. This finding demonstrates the efficiency of our model and supports its applicability in practice.

To further consolidate our conclusions, we show the comparison of house price predictions for Location 2 in Fig. 8. As for Location 1, our method predicts better than the other methods. This cross-location validation supports the generality of our method in different market environments and further strengthens its reliability as an REP prediction tool.

Through the above comparative experiments and visual presentation, we can draw the following conclusions. Our method shows higher accuracy in house price forecasting and is able to capture market dynamics and price trends more precisely. The model shows good scalability when dealing with different numbers of listings, and is able to adapt to datasets of different sizes while ensuring the prediction accuracy. The cross-site validation results show that our method can maintain stable prediction performance in different market environments, and has high universality and application value.

Discussion

After experimental verification, we tested the REP representation method with the fused feature embedding optimization technique combined with a reinforcement learning framework based on an ALF. The experimental data fully show that our proposed model has high effectiveness and wide practicability in the field of REP prediction.

Through the refined feature embedding optimization strategy, the model significantly improves the richness and accuracy of data representation, so that the model can more deeply understand the dynamics of the market and the internal laws of price changes. Meanwhile, the introduction of ALF enables the model to intelligently adjust the optimization direction in the training process, and give different attention to different types of prediction errors, thus further improving the accuracy of prediction. Experiments show the proposed model achieves obvious advantages in terms of prediction accuracy, stability, and generalization ability. This verifies the rationality and innovation of the model design. The ALF can dynamically adjust to the characteristics of the data, focusing more on errors that need more attention. This can lead to improved prediction accuracy, especially in complex scenarios where REPs are influenced by various factors. By emphasizing larger errors and adjusting the model accordingly, the ALF helps in reducing overall prediction error. Feature embedding optimization schemes convert raw data into a more compact and meaningful representation, which can capture the underlying patterns and relationships within the data. This efficient representation of features allows the model to learn more effectively from the data, improving its ability to generalize and make accurate predictions on unseen data. REPs are influenced by a multitude of complex and often intertwined factors. Feature embedding techniques can help in extracting and representing these factors in a way that makes it easier for the model to understand and learn from them.

This study contributes across academic, practical, and policy domains. Academically, it advances predictive modeling by integrating reinforcement learning with an ALF, enabling dynamic feature weighting in response to market fluctuations. Additionally, optimizing the feature embedding process enhances the model’s ability to process FEF, improving interpretability and robustness in deep learning applications. The study also bridges theory and practice by providing a replicable hybrid modeling framework for complex forecasting tasks in financial and real estate analytics. Practically, the model improves forecasting accuracy, which helps investors and developers make informed decisions, and its improved predictive reliability helps stakeholders mitigate financial risks in volatile real estate markets. Furthermore, its scalability and adaptability allow application across diverse regional markets and varying data conditions. From a policy perspective, the model can support urban planning by offering insights into predicted price trends, facilitate data-driven policy formulation, and inform resource allocation in housing supply, subsidy distribution, and infrastructure planning to promote sustainable urban development.

Limitations

While the proposed model demonstrates notable predictive performance, it introduces a certain level of computational complexity, primarily due to the integration of reinforcement learning and the adaptive loss optimization framework. The reinforcement learning module requires iterative environment interaction and policy updates, which can result in increased training time and memory consumption, especially when applied to large-scale or real-time datasets. Similarly, the dynamic adjustment of the loss function necessitates continuous gradient recalculations, potentially imposing additional computational overhead.

From an implementation perspective, deploying such a model in production environments may demand high-performance computing infrastructure and technical expertise, which could be a barrier for small-scale enterprises or government agencies with limited resources. To mitigate these challenges, future work may explore model compression techniques such as knowledge distillation, or the development of lightweight surrogate models that retain predictive performance while reducing complexity. Hybrid deployment strategies, in which offline training is combined with simplified online inference, may also offer practical trade-offs between accuracy and efficiency.
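As an illustration of the knowledge distillation idea mentioned above, a lightweight student model can be trained against a blend of the ground-truth prices and the predictions of the full model (the teacher). The sketch below is a generic regression-style distillation objective, not a method evaluated in this study; the function name `distillation_loss` and the mixing coefficient `alpha` are assumptions introduced for the example.

```python
def distillation_loss(student_pred, teacher_pred, y_true, alpha=0.5):
    """Illustrative distillation objective for a regression student (hypothetical).

    Blends two mean squared errors:
      - hard term: student vs. ground-truth prices
      - soft term: student vs. teacher predictions
    alpha = 1 ignores the teacher; alpha = 0 imitates the teacher only.
    """
    n = len(y_true)
    if n == 0 or len(student_pred) != n or len(teacher_pred) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    hard = sum((s - y) ** 2 for s, y in zip(student_pred, y_true)) / n
    soft = sum((s - t) ** 2 for s, t in zip(student_pred, teacher_pred)) / n
    return alpha * hard + (1.0 - alpha) * soft


# Toy values, for illustration only.
print(distillation_loss([1.0, 2.0], [1.5, 2.5], [2.0, 3.0], alpha=0.5))
```

Under this scheme the expensive teacher is trained offline, and only the compact student would need to run at inference time, which is consistent with the hybrid deployment trade-off described above.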

The model has also been evaluated on a specific regional dataset, which may limit its generalizability to other real estate markets with different socio-economic structures or regulatory environments. Although the proposed architecture is theoretically adaptable, cross-market testing is necessary to verify robustness under varying conditions. Moreover, the performance of data-driven models depends heavily on the availability and quality of input data. In real-world applications, inconsistencies, biases, or incomplete records in real estate datasets may reduce prediction accuracy. While not the focus of this study, responsible data handling and transparent model interpretation will be essential in future work to ensure reliability and stakeholder trust in practical deployments.

Conclusion

To deal effectively with the dynamic real estate market, we propose a REP prediction method that combines an ALF with feature embedding optimization. The method builds a more refined REP representation model by mining the internal relationships among the FEF that affect the REP. A reinforcement learning mechanism based on the ALF is then introduced, which improves the model’s adaptability to a complex market environment and increases the weight given to key influencing factors in price prediction. In the experiments, the prediction model reaches an R-squared value of 0.975 and an EVS value of 0.951, which supports its effectiveness for REP prediction and provides a reliable basis for data-driven decision-making. In future work, we will explore more effective feature extraction and embedding methods for a wider range of real estate markets, in order to enhance the model’s ability to capture and process critical information and to make it applicable to more market environments and conditions.