Abstract
Efficient waste management plays a crucial role in ensuring a clean and green environment in smart cities. This study investigates the critical role of efficient trash classification in achieving sustainable solid waste management within smart city environments. We conduct a comparative analysis of various trash classification methods utilizing deep learning models built on convolutional neural networks (CNNs). Leveraging the PyTorch open-source framework and the TrashBox dataset, we perform experiments involving ten unique deep neural network models. Our approach aims to maximize training accuracy. Through extensive experimentation, we observe the consistent superiority of the ResNeXt-101 model compared to others, achieving exceptional training, validation, and test accuracies. These findings illuminate the potential of CNN-based techniques in significantly advancing trash classification for optimized solid waste management within smart city initiatives. Lastly, this study presents a distributed framework based on federated learning that can be used to optimize the performance of a combination of CNN models for trash detection.
Introduction
Urban landscapes are growing and evolving rapidly. With growing size and population, it is becoming quite challenging to manage waste at scale. While recycling is preached and practiced to varying degrees around the world, the growing hazard of unmanageable waste has reinvigorated research into new and maintainable methods to recycle waste1. Delivering approaches to sustainable recycling is considered one of the leading challenges facing the environment and the economy today. The need of the hour is holistic policies and approaches that ensure recyclability and recycling. Such policies would serve to reduce the burden on and reliance upon natural resources, decrease greenhouse gas emissions and lower the carbon footprint1,2. They should be based on a vision of a circular economy, where products are designed to be recycled, and where materials are continuously reused and repurposed2. Comprehensive realization of such a policy would be impossible without employing the latest tools and technologies that can be leveraged to maximize every aspect of waste management and recycling.
The term waste management can have a wide range of meanings. Typically, all policies and procedures adopted to collect, manage, dispose of and reuse waste fall within the purview of waste management. It should have comprehensive oversight of tactics to reduce waste and to manage its production, transport, and ultimate disposal. Disposal may include recycling, composting, incineration, landfilling, waste-to-energy conversion, etc., as illustrated in Fig. 1. Waste management entails optimizing the use of these methods in accordance with the nature of the economy, the society and the waste itself, with an emphasis on the reduction, reuse and recycling of waste.
Recycled materials hold significant economic value, creating new markets and job opportunities in collection, processing, and manufacturing sectors. Diverting them from landfills translates to reduced waste disposal costs for municipalities, freeing up resources for crucial urban development projects2. Moreover, investments in cutting-edge sorting technologies and infrastructure fuel innovation in areas like artificial intelligence and robotics, leading to more efficient and cost-effective solutions3. By prioritizing and actively promoting effective sorting practices, smart cities can unlock a path towards a more sustainable and resource-efficient future. This approach benefits not only the environment but also the economy and the well-being of urban communities.
As the world becomes more environmentally conscious, there is a growing emphasis on recycling a wide range of materials. Despite encountering obstacles, global patterns suggest a notable increase in the gathering and treatment of recyclable materials spanning different categories. In4, a growing awareness of responsible medicine waste management is suggested. The study highlights an increase in regulations and collection systems. The authors in5 highlight the widespread reluctance to adopt environmentally sustainable methods for healthcare waste disposal on a large scale, citing cost, accessibility, and feasibility as key barriers. There remains a lack of a universally scalable green solution. Furthermore, the review underscores that the global health implications of healthcare waste disposal methods tend to vary with a country's level of development.
Similar to healthcare waste, the management of electronic waste (e-waste) has garnered significant attention globally. In5, proposals for new regulations and increased enterprise collaboration have been advocated. In light of these developments, it is imperative to conduct an overview to assess the current status and explore potential new solutions and challenges concerning e-waste management in China. It has been reported that both central and local governments in China have made substantial endeavors to enhance e-waste management6. The presence of heavy metals in municipal solid waste underscores the necessity for effective waste management strategies due to limitations on its beneficial use or disposal. Understanding the occurrence of heavy metals in municipal solid waste is pivotal for refining waste handling practices. Compared to other pollutants, heavy metals are extremely toxic and can contaminate soil, water, and air. They pose serious risks to the environment and to the health of all manner of living things. Physicochemical fractionation has been shown to be a potentially viable approach to manage heavy metal waste by facilitating the separation and concentration of heavy metals from waste streams7. However, in the context of a city or municipality, using fractionation poses numerous challenges, as typical waste material is not as controlled or consistent as industrial raw materials. These approaches are not viable unless the chemical composition of the various articles of waste can reliably be determined. Stringent approaches to source control, waste sorting and regulatory compliance monitoring would be required for fractionation to be integrated into existing waste management frameworks7.
While the importance of reuse and recycling is well established, recent literature has determined that optimizing the sorting and separation of waste at its source would have the most significant impact in the management and recycling potential of waste8. Such approaches would also serve to minimize contamination. This would entail the segregation of recyclable materials at their point of origin, such as households or businesses, before they enter the waste stream. By promoting and enforcing stringent communal waste sorting practices, recycling rates can be enhanced to achieve a net reduction in waste disposal. This would contribute to the conservation of resources and would reduce the load on rapidly growing landfills8.
Conventional manual sorting methods are costly, hard to scale and impractical for widespread application. Consequently, the demand for automated waste segregation systems is growing significantly9. Many existing automation techniques are simplistic mechanical sorting devices that do not use smart computerized control systems. However, many approaches have recently started to incorporate smart computational platforms to automate the sorting process. Sensors are placed on the trash bins and data is transmitted to cloud servers using cellular wireless communications. Many of these approaches use artificial intelligence (AI), machine learning (ML) and neural networks to make these platforms smart. This research presents a comparative study of 10 waste classification methods that visually identify different types and classes of waste using deep neural networks (DNNs). All models were tested using the open-source PyTorch framework on the TrashBox dataset9. A secondary outcome of this research was to introduce techniques to enhance the training accuracy of these models. Any model that can accurately detect and identify various classes of waste materials at high speed has the potential to contribute to the development of a revolutionary waste sorting system.
Related work
To overcome the trash sorting challenge, novel approaches leveraging deep neural models trained on diverse trash image datasets have emerged. This section offers an overview of research endeavors in trash detection and classification.
TrashNet10, a dataset featuring 2527 images across six trash classes, was developed. Despite its limited size, TrashNet facilitated experiments with CNN and SVM models, with the SVM model achieving the highest accuracy at 63%. Another dataset, TACO (Trash Annotations in Context)11, presents images of trash objects found in various settings but with fewer annotations compared to TrashNet. To overcome the issues of both the TrashNet and TACO datasets, the authors in9 developed the TrashBox dataset, which has more images and classes than its counterparts.
Several studies have delved into machine and deep learning models for waste classification. The authors in12 utilized the ResNet-50 transfer learning model on TrashNet, achieving 87% accuracy with a multi-class SVM approach. Similarly, the authors in13 employed a CNN model with the Inception-v3 transfer learning model on the Plastic Model Detection dataset, reporting 92.5% accuracy. In14, the authors developed ScrapNet, based on the EfficientNet transfer learning model, achieving 98% accuracy on TrashNet and 92.87% on a new dataset. In15, a waste classification method is proposed using a multilayer hybrid CNN, achieving a classification accuracy of up to 92.6% on TrashNet.
In9, transfer learning models were trained using the TrashBox dataset to evaluate their adaptability. Impressively, these models demonstrated an accuracy rate of 98.47%. Moreover, a pioneering deep learning framework integrating quantum transfer learning methodologies is also explored in9. It is important to mention that in16 the large-scale TrashBox dataset is also suggested to be a promising choice for training CNNs in waste classification. RecycleNet17 introduces an optimized deep learning architecture tailored for classifying recyclable objects, utilizing the TrashNet dataset. Hybrid deep learning approaches, as explored by18 and19, investigate the impact of image features on classification performance, demonstrating the effectiveness of Multilayer Hybrid Systems and CNNs. Medical waste classification receives attention with ResNeXt models20, while DSCR-Net20 and studies like21 focus on open-source datasets and single-object identification, respectively.
Furthermore, studies investigate deep learning-based methods for garbage classification, with models like GCNet22 and DSCAM23 demonstrating high accuracy rates. Practical applications such as AlphaTrash24 offer automated sorting solutions, addressing the need for efficient waste management systems. Augmentation techniques for addressing data scarcity are explored by25, while metadata-based approaches26 and robotics-driven classification27 contribute to the research landscape. Moreover, studies utilizing CNN models28,29 underline the ongoing efforts to improve trash classification methodologies and their practical implications in waste management practices. Table 1 provides a comparison of some of the existing trash classification studies.
TrashBox dataset overview
To fill the significant gap observed in existing trash detection and classification datasets, researchers developed a new dataset called TrashBox9. This initiative stemmed from a critical assessment of prevailing benchmark datasets like TrashNet and TACO, revealing notable deficiencies in image frequency and representation of key waste types.
The analysis underscored the limited availability of images necessary for precise trash object classification, a concern acknowledged by the datasets’ own authors. Additionally, the absence of crucial waste categories like e-waste and medical waste highlighted the urgent need to address this gap, given their significance in contemporary waste management practices. In response, TrashBox was thoroughly constructed to encompass a diverse range of trash objects commonly encountered across various environments. The dataset comprises seven primary classes: medical waste, e-waste, glass, plastic, cardboard, paper, and metal, each carefully curated to ensure comprehensive coverage of trash types encountered in real-world scenarios. Some images of the TrashBox dataset for different classes are shown in Fig. 2. Furthermore, these classes are subdivided into subclasses to enable finer distinctions between trash objects and support nuanced research and analysis in the field.
Table 2 shows the various trash classes in the TrashBox dataset and their corresponding sub-classes, along with the number of images available for each category.
The development of the TrashBox dataset required a substantial effort. The first step mandated a meticulous and systematic approach to collecting data, i.e. large volumes of photographs of trash. Deep learning algorithms typically require tens of thousands of images before training becomes feasible. Various tools were used to download images from reputable online platforms. However, raw images cannot be used directly in a deep learning context. Numerous pre-processing tasks have to be performed to bring all images to a consistent standard and quality. These processes may involve resizing, calibrating aspect ratios, cropping, quality adjustments, noise removal and, finally, labelling. Labelling is the process of identifying the various classes of trash in the images that will be used to train the deep networks. To date, TrashBox serves as one of the most important repositories of a large number of diverse images of trash, immensely aiding researchers and practitioners in the field of trash detection and classification.
CNN-based classification techniques
In this section, we briefly explain CNN-based classification models that are commonly used in the literature. In this study, we use these DNN models to classify the TrashBox dataset.
GoogLeNet
GoogLeNet38, also referred to as Inception-v1, stands out as a CNN architecture developed by Google. It is distinguished by its unique “Inception” modules, comprising multiple convolutional layers of varying sizes and types. A notable feature of GoogLeNet was its capability to achieve high accuracy in image classification tasks while ensuring computational efficiency, compared to other models that were available at the time of its release. GoogLeNet demonstrated its ability to capture features of varying scales and complexities to analyse images. Its efficiency was attributed to the smart resource utilization of the underlying Inception modules. GoogLeNet further introduced concepts such as dimensionality reduction and parallel feature extraction that are a staple of all modern CNNs today. While GoogLeNet was primarily designed for visual computing tasks such as object detection and image segmentation, many researchers were able to showcase its applicability to tasks outside these domains. GoogLeNet contributed significantly to the popularity of CNNs in a wide range of applications31.
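The core Inception idea, parallel convolutions of different kernel sizes with 1x1 dimensionality reduction, concatenated along the channel axis, can be sketched as follows. This is a simplified toy module with made-up channel counts, not Google's exact configuration:

```python
import torch
import torch.nn as nn

class MiniInception(nn.Module):
    """Simplified Inception-style module: parallel 1x1, 3x3 and 5x5
    convolutions (the larger kernels preceded by a 1x1 reduction)
    whose outputs are concatenated channel-wise."""
    def __init__(self, in_ch):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 16, kernel_size=1)
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, 8, kernel_size=1),            # reduce channels
            nn.Conv2d(8, 16, kernel_size=3, padding=1))
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, 8, kernel_size=1),
            nn.Conv2d(8, 16, kernel_size=5, padding=2))

    def forward(self, x):
        # Each branch sees features at a different scale; concatenation
        # lets later layers pick whichever scale is informative.
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x)], dim=1)

x = torch.randn(1, 32, 56, 56)
print(MiniInception(32)(x).shape)   # torch.Size([1, 48, 56, 56])
```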
ResNet model
The deeper neural networks get, the harder and more time-consuming they become to train. ResNet (Residual Network) addressed this problem by introducing shortcuts known as “skip connections” in the feedforward path. The idea is to allow certain CNN layers to be skipped or bypassed while reusing the activations of the previous layers32. This is made possible by residual mapping, where layers learn a residual function instead of the underlying mapping. The skip connections perform identity mapping, and their outputs are added to the outputs of the stacked layers. CNNs with a residual mapping mechanism have proven to be easier to optimize than models without it. The authors of ResNet demonstrated that all these features can be incorporated without any added computational complexity32. Furthermore, ResNets were the first CNNs able to support networks hundreds of convolutional layers deep. The innovations introduced in ResNet made it a cutting-edge model of its time, allowing deep CNNs to be applied in real-world contexts. ResNets remain compelling models for image classification tasks.
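The skip-connection mechanism described above can be sketched as a minimal residual block. The channel counts are illustrative, and batch normalization is omitted for brevity:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: the identity skip connection adds the
    input x to the output of the stacked layers, so the layers only
    need to learn a residual F(x) rather than the full mapping."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)   # identity mapping added to stacked output
```

Because the shortcut is an identity, gradients can flow directly through it, which is what makes very deep networks trainable.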
ResNeXt model
The ResNeXt model33 is based on the ResNet model and introduces an additional dimension, cardinality: the number of parallel transformation paths within a network layer. Within this architecture, a network module conducts a series of transformations on low-dimensional embeddings and then integrates their outputs through summation. This design facilitates scalability to accommodate numerous transformations without necessitating specialized designs. Empirical evidence demonstrates that the model outperforms the original ResNet module, even while preserving computational complexity and model size. It has broad applicability in both visual and non-visual recognition tasks.
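In practice, the cardinality of parallel transformation paths is implemented compactly as a grouped convolution. The sketch below uses a cardinality of 32 with illustrative channel counts; the real block also includes batch normalization and ReLU activations:

```python
import torch
import torch.nn as nn

# ResNeXt-style bottleneck sketch: the 3x3 convolution with groups=32
# is mathematically equivalent to 32 parallel low-dimensional
# transformation paths whose outputs are aggregated.
resnext_branch = nn.Sequential(
    nn.Conv2d(256, 128, kernel_size=1),                        # reduce
    nn.Conv2d(128, 128, kernel_size=3, padding=1, groups=32),  # 32 paths
    nn.Conv2d(128, 256, kernel_size=1),                        # restore
)

x = torch.randn(1, 256, 14, 14)
out = resnext_branch(x) + x        # summed with the identity shortcut
print(out.shape)                   # torch.Size([1, 256, 14, 14])
```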
MobileNetV2
MobileNetV1 was developed in 2017 to cater to the unique needs of embedded systems and mobile phones. MobileNetV2 advances upon the original MobileNet by introducing several notable improvements. MobileNetV2 adopts inverted residual blocks with linear bottlenecks, enabling optimal resource utilization while upholding model accuracy. Moreover, it incorporates depth-wise separable convolutions to reduce parameters and computational costs without compromising spatial information capture. MobileNetV2 is a significant improvement over MobileNetV1 in terms of both accuracy and latency. These improvements can be attributed to innovative architectural features such as linear bottlenecks, residual connections, and improved skip connections34. A large body of literature has shown that MobileNetV2 performs consistently well across a wide variety of image processing tasks, especially in mobile or embedded use-cases. This suggests that MobileNetV2 is a robust and efficient choice for deploying deep CNN models on devices with constrained resources, making it an excellent candidate for applications like mobile phone-based image recognition or object detection.
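The parameter savings of the depth-wise separable convolution can be seen directly by comparing it with a standard convolution of the same input/output shape (the channel counts here are illustrative):

```python
import torch.nn as nn

# Depth-wise separable convolution as used in the MobileNet family:
# a per-channel spatial filter followed by a 1x1 "pointwise" channel mixer.
depthwise = nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=32)
pointwise = nn.Conv2d(32, 64, kernel_size=1)

# Standard convolution performing the same 32 -> 64 channel mapping.
standard = nn.Conv2d(32, 64, kernel_size=3, padding=1)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(depthwise) + count(pointwise), "vs", count(standard))  # 2432 vs 18496
```

The separable pair needs roughly an eighth of the parameters (and proportionally fewer multiply-accumulates), which is where MobileNet's efficiency comes from.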
MobileNetV3
The latest version of MobileNet, known as MobileNetV3, comes in two variants: small and large, developed by the project’s contributors35. The small variant prioritizes efficiency, making it ideal for devices with limited processing power. It achieves compactness through structural improvements in its underlying convolutional neural network architecture. This makes it a suitable option for tasks like image categorization and identifying objects in images. Conversely, the large variant prioritizes accuracy and performance, making it better suited for complex image processing and visual computing applications. Both variants utilize innovative features like efficient attention mechanisms and advanced optimization techniques, expanding their applicability across various domains.
ShuffleNetV2
ShuffleNetV236 is a CNN-based model that prioritizes real-time responses in image processing tasks, significantly improving upon its predecessor, ShuffleNetV1. This version incorporates a unique channel split operation within its blocks, enabling it to handle dependencies between channels more efficiently than the earlier version. Additionally, the processing pipeline has been restructured, strategically positioning the channel shuffle operation deeper within the block architecture. This required no additional components in the pipeline, yet led to improved performance over V1. V2 additionally incorporates techniques like feature map reorganization and channel shuffling to improve feature representation and extraction, making it more robust. All these considerations make ShuffleNetV2 an excellent candidate for use-cases where real-time response is of critical importance.
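The channel shuffle operation itself is a cheap tensor reordering that interleaves channels across groups, so information can flow between otherwise isolated grouped convolutions. A minimal sketch:

```python
import torch

def channel_shuffle(x, groups):
    """ShuffleNet channel shuffle: reshape channels into (groups, c/groups),
    transpose, and flatten back, interleaving channels across groups."""
    n, c, h, w = x.shape
    return (x.view(n, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(n, c, h, w))

# Channels 0..3 with 2 groups are reordered to 0, 2, 1, 3.
x = torch.arange(4.0).view(1, 4, 1, 1)
print(channel_shuffle(x, 2).flatten().tolist())   # [0.0, 2.0, 1.0, 3.0]
```

Because this is pure memory movement with no learned parameters, it adds essentially no computational cost, which fits ShuffleNetV2's real-time focus.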
Results
In this section, the performance of the models that we tested using the TrashBox dataset is presented and analyzed numerically. The section is divided into two parts: in the first part we explain the experimental setup, and in the second part we discuss the classification results obtained using different deep neural network models.
Experimental setup
To perform the task of classifying trash within the TrashBox dataset, we employ DNNs utilizing the PyTorch open-source platform. Specifically, we utilize a set of 10 DNNs for classification purposes. These models undergo training on an Nvidia GeForce GTX 1080 GPU with 64GB of RAM. The experimentation phase for all the models is run for 30 epochs. The division of data into training, validation and test sets follows the original TrashBox dataset9.
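The per-epoch training procedure in such a setup can be sketched as below. The model, data loader and hyper-parameters here are stand-ins for illustration, not the exact configuration used in this study:

```python
import torch
import torch.nn as nn

def train_one_epoch(model, loader, optimizer, loss_fn, device="cpu"):
    """Run one training epoch and return the epoch's training accuracy."""
    model.train()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)
        loss = loss_fn(logits, labels)
        loss.backward()
        optimizer.step()
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
    return correct / total

# Smoke test with a stand-in linear model and random 7-class "images".
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 7))
loader = [(torch.randn(4, 3, 8, 8), torch.randint(0, 7, (4,))) for _ in range(5)]
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
acc = train_one_epoch(model, loader, optimizer, nn.CrossEntropyLoss())
```

The same loop, run without the backward pass on the validation split, yields the validation accuracies reported below.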
Performance comparison of various DNNs
Upon analyzing the results obtained from various DNN models applied to classify the TrashBox dataset, it is evident from Tables 3 and 4 that ResNeXt-101 and ResNeXt-50 consistently outperform other models in terms of accuracy across training, validation, and test datasets. ResNeXt-101 achieves the highest test accuracy of 89.62% and the highest test F1 score of 89.66%, indicating its superior performance in classifying the dataset. ResNeXt-50 also exhibits strong performance with a test accuracy of 88.68% and a test F1 score of 87.01%.
ShuffleNetV2, despite demonstrating lower training accuracy compared to ResNeXt models, surprisingly shows comparable validation and test accuracies, suggesting potential overfitting in the ResNeXt models. However, its test F1 score of 83.11% indicates slightly lower performance in capturing dataset nuances compared to ResNeXt models. Among ResNet models, ResNet-34 shows decent performance with a test accuracy of 87.07% and a test F1 score of 86.86%, while ResNet-50 and ResNet-101 perform slightly lower, albeit still commendably.
MobileNetV2 and MobileNetV3-Large display relatively good performance, although their test accuracies and F1 scores fall short of the ResNeXt and ResNet models. MobileNetV3-Small exhibits the lowest performance among MobileNet models, suggesting potential underfitting due to its lower complexity. GoogLeNet achieves competitive performance with a test accuracy of 86.18% and a test F1 score of 85.71%, positioning it among the top-performing models, although slightly below the ResNeXt models.
From Tables 3 and 4, ResNeXt-101 emerges as the superior model due to its consistently high accuracy and F1 score across all datasets. ResNeXt-50 also performs well, albeit slightly lower than ResNeXt-101. ShuffleNetV2 shows promise despite its lower complexity, demonstrating its potential applicability in an embedded system or mobile-based platform. The most crucial aspect to consider in this study is the balance between model complexity and performance. The contrast between the ResNeXt and MobileNet models shows that higher complexity often leads to better results but requires more computational resources, and vice versa.
The training and validation accuracy curves for ResNeXt-101, as displayed in Fig. 3, provide key information about its learning process and ability to perform well on new data. The model achieves a training accuracy of 97.86% and a validation accuracy of 97.15%. These high values indicate that ResNeXt-101 learns effectively from the training data and can generalize well to unseen examples. In simpler terms, the model effectively captures the underlying patterns within the data without memorizing specific details, reducing the risk of overfitting.
The training loss and validation loss curves for ResNeXt-101 can be seen in Fig. 4. The figure gives us some insight into its training dynamics and generalization capacity to unseen data. Throughout the training process, both training and validation loss curves stabilize at approximately 0.1, indicating the model’s convergence to a consistent performance level. This stabilization suggests that the model has grasped the underlying data patterns effectively and has reached a point where further training yields marginal improvements. A noteworthy observation is that the training loss curve consistently dips below the validation loss curve after the 5th epoch and remains there throughout the training duration. While this hints at potential overfitting to the training data, the narrow gap between the two curves suggests minimal overfitting.
The confusion matrix provided in Fig. 5 offers a detailed breakdown of the ResNeXt-101 classification model’s performance across seven distinct waste classes: cardboard, e-waste, glass, medical, metal, paper, and plastic. Each row represents the actual class labels, while each column signifies the predicted class labels. The numerical values within the matrix indicate the count of instances classified into each respective class. For example, when examining the row corresponding to cardboard (Class0), it reveals that the majority of cardboard instances (227) were accurately classified as Class0. However, there are some misclassifications, such as a small number of instances being erroneously labeled as glass (2), medical (1), metal (2) and paper (7), with no instances being misclassified as plastic. By scrutinizing other rows, we can assess the accuracy and misclassification rates for each waste class. For instance, e-waste (Class1) demonstrates high accuracy with minimal misclassifications in other classes. Similarly, glass (Class2) exhibits relatively favorable performance with limited misclassifications.
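Per-class accuracy (recall) follows directly from such a matrix as the diagonal count divided by the row sum. A minimal sketch, using the cardboard row quoted above (the e-waste cell is assumed to be 0 here, since it is not quoted in the text):

```python
def per_class_recall(cm):
    """cm[i][j]: number of class-i samples predicted as class j.
    Returns the recall (diagonal / row sum) for each class."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]

# Cardboard row from Fig. 5: 227 correct out of 239 instances total
# (order: cardboard, e-waste, glass, medical, metal, paper, plastic).
cardboard = [227, 0, 2, 1, 2, 7, 0]
print(round(227 / sum(cardboard), 4))   # 0.9498
```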
The analysis of the different techniques based on their complexities in Table 5 reveals varying trade-offs in computational resources and model sizes. In our view, ResNeXt50 demonstrates the most balanced approach amongst all models compared: it presents a relatively moderate computational complexity and model size, making it adaptable for a variety of applications. The higher the computational complexity and model size requirement, the more resource-intensive the architecture becomes. This is where the debate over the balance between accuracy and resource requirements arises. Compared to ResNeXt50, ResNet101 exhibits slightly lower computational complexity and model size requirements, which is reflected in its slightly lower performance. ResNet50 strikes a middle ground, offering a reasonable compromise between complexity and performance, balancing efficiency and accuracy. This downward trend in model complexity continues with ResNet34 all the way to the MobileNet-based models, as can be seen in Table 5. As the name suggests, MobileNet models are quintessential lightweight models designed primarily for mobile devices, and they are becoming increasingly popular for deployment on embedded systems and edge devices. Due to their relatively light architectures, they have low computational requirements and, consequently, low power requirements. Nevertheless, they have been shown to work reasonably well, as is evident in Tables 3 and 4. To summarize, high performance corresponds to high computational requirements and large model sizes. The choice of model for deployment is contingent on the criticality of performance versus the type of computational platform that is most suitable for deployment. This varies from use-case to use-case and requires careful consideration of all associated factors before the most suitable model can be chosen.
Proposal for a federated learning framework
The various machine learning models discussed in the previous section have proven track records when it comes to discriminating and classifying different kinds of objects in images and video frames. While they may perform very well in distinguishing one kind of visual artefact, they may not perform equally well for another39. Each of the aforementioned models has its own strengths and weaknesses. Furthermore, their performance varies with working conditions, i.e. the quality and types of images they have to work with in real time. An ideal approach would allow models to cater to all kinds of environmental conditions while simultaneously leveraging the cumulative competence of several models. This is possible using a distributed framework based on Federated Learning (FL), an approach initially developed by Google. Federated learning comprises a framework that maintains a federated (master) model in the cloud that periodically communicates with several models running remotely38. Each remote model matures during inference (with experience), which results in changes to its model parameters. The remote models periodically share their parameters with the federated server, which may use any number of algorithms to consolidate the parameters into a so-called “collective wisdom”. This collective wisdom is then broadcast back to all the remote sites so that they can benefit from the learning of all instances of the model.
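The consolidation step on the federated server can be sketched as a FedAvg-style weighted average. For illustration, each client's parameters are flattened into a plain list of floats, and the weighting by local sample count is an assumption (any consolidation rule could be substituted):

```python
def federated_average(client_params, client_sizes):
    """FedAvg-style consolidation: average each parameter across clients,
    weighted by the number of local samples at each site."""
    total = sum(client_sizes)
    return [
        sum(params[k] * size
            for params, size in zip(client_params, client_sizes)) / total
        for k in range(len(client_params[0]))
    ]

# Two facilities: the second has three times as much local data,
# so its parameters dominate the consolidated model.
site_a = [1.0, 2.0]
site_b = [3.0, 4.0]
print(federated_average([site_a, site_b], [1, 3]))   # [2.5, 3.5]
```

In a real deployment the same averaging is applied tensor-by-tensor to each model's weights (W) and biases (B) before broadcasting them back to the sites.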
This work proposes the deployment of a federated framework for the visual detection of 7 different classes of trash at waste management facilities. Furthermore, it is proposed that 4 models that have been shown to work well with the TrashBox dataset be used, instead of a single model alone. The performance of every model varies slightly depending on the class of trash being identified (i.e. metal, plastic, glass, etc.), so the models can be used in tandem to obtain the best possible discrimination of trash overall. In the proposed framework, the federated server will compute the weighted average of model parameters, specifically weights (W) and biases (B). It will receive the individual parameters of each model via encrypted communication channels from numerous waste management facilities. Each facility would be responsible for running those models in tandem, as shown in Fig. 6. In return, the federated server would broadcast the consolidated parameters back to each waste management facility.
Based on the benchmarks and tests conducted in this study, the four models proposed to run at each remote site are ResNeXt-101, ShuffleNetV2, ResNet-34 and MobileNetV3-Large respectively. All four models have been shown in this study to perform the task of detection of different classes of trash very well. Furthermore, all of these models can be configured to function as reinforcement learning (RL) agents. This means that they have the ability to improve and adapt their parameters in accordance to their real-time performance via reinforcement learning. This would allow each model to be optimized locally for each class of trash that is provided in the TrashBox dataset, as shown in Fig. 7. Each individual waste management site will be able to share its unique learning experience with other sites via the federated server.
Conclusions
This study evaluated a variety of deep learning models to thoroughly test their efficacy in detecting and classifying different types of waste. The TrashBox dataset was used for all the models, which were tested on the same hardware platform. After rigorous testing, this study found that two models, namely ResNeXt-101 and ResNeXt-50, consistently outperformed the other models both in training and post-inference validation. Of these two, ResNeXt-101 achieved the highest test accuracy of 89.62% and the highest test F1 score of 89.66%. ResNeXt-50 was comparable with an 88.68% accuracy and a test F1 score of 87.01%. ShuffleNetV2 showed potential with comparable validation and test accuracies and a slightly lower test F1 score. This may suggest a possible case of overfitting in the ResNeXt-based models. Among the ResNet-based models, ResNet-34 performs commendably well, followed by ResNet-50 and ResNet-101. They clearly outperform all the MobileNet-based models. This could be because MobileNet models are lightweight models built for relatively lower computational requirements, resulting in potential underfitting.
In conclusion, while ResNeXt-101 stands out as the superior model, maintaining dependably high accuracy and F1 score, many of the other models were consistently accurate, making them viable for a trash-classification system. The key is balancing model complexity against model performance. This trade-off was observed between the ResNeXt and MobileNet models: the ResNeXt models were clearly optimal for performance, while the MobileNet models were designed to run on low-end computational platforms. The ability to run on inexpensive embedded hardware may be a crucial consideration in a practical context. This comparative analysis emphasizes the importance of selecting the most suitable DNN model for effective trash classification in smart-city waste management.
Lastly, this study has presented a proposal for a federated learning framework that uses four image classification models at each waste management site. The concept is to send the model parameters of several such sites to a federated server, which consolidates the parameters, specifically their weights and biases, into shared global knowledge. Based on the tests and benchmarks performed in this study, the four most suitable models to deploy are ResNeXt-101, ShuffleNetV2, ResNet-34, and MobileNetV3-Large.
Data availability
All related data is available in the paper.
References
Popova, Y. & Sproge, I. Decision-making within smart city: Waste sorting. Sustainability 13(19), 10586 (2021).
Mohammed, M. A., Abdulhasan, M. J., Manoj Kumar, N., Abdulkareem, K. H., Mostafa, S. A., Maashi, M. S., Khalid, L. S., Abdulaali, H. S., & Chopra, S. S. Automated waste-sorting and recycling classification using artificial neural network and features fusion: A digital-enabled circular economy vision for smart cities. Multimed. Tools Appl. 82(25), 39617–39632 (2023).
Blasi, S., Ganzaroli, A. & De Noni, I. Smartening sustainable development in cities: Strengthening the theoretical linkage between smart cities and SDGs. Sustain. Cities Soc. 80, 103793 (2022).
Kenny, C. & Priyadarshini, A. Review of current healthcare waste management methods and their effect on global health. Healthcare 9(3), 284 (2021).
Ali, M., Wang, W., Chaudhry, N. & Geng, Y. Hospital waste management in developing countries: A mini review. Waste Manag. Res. 35(6), 581–592 (2017).
Chenyu, L. et al. An overview of e-waste management in China. J. Mater. Cycles Waste Manage. 17, 1–12 (2015).
Zhang, H., He, P.-J. & Shao, L.-M. Implication of heavy metals distribution for a municipal solid waste management system—a case study in Shanghai. Sci. Total Environ. 402(2–3), 257–267 (2008).
Maletz, R., Dornack, C., & Ziyang, L. Source separation and recycling (Springer International Publishing, 2018).
Kumsetty, N.V., Nekkare, A. B., Sowmya, K. S., & Anand Kumar, M. Trashbox: Trash detection and classification using quantum transfer learning. In Proceedings of the 31st conference of open innovations association (FRUCT), pp. 125–130 (2022).
Thung, G., & Yang, M. Classification of trash for recyclability status [Online]. Available: https://github.com/garythung/trashnet (2016).
Proenca, P. F., & Simoes, P. Taco: Trash annotations in context for litter detection. arXiv preprint arXiv:2003.06975 (2020).
Adedeji, O. & Wang, Z. Intelligent waste classification system using deep learning convolutional neural network. Proc. Manuf. 35, 607–612 (2019).
Azis, F. A., Suhaimi, H., & Abas, E. Waste classification using convolutional neural network. In Proceedings of the 2nd international conference on information technology and computer communications, pp. 9–13 (2020).
Masand, A., Chauhan, S., Jangid, M., Kumar, R. & Roy, S. Scrapnet: An efficient approach to trash classification. IEEE Access 9, 130947–130958 (2021).
Shi, C., Tan, C., Wang, T., & Wang, L. A waste classification method based on a multilayer hybrid convolution neural network. Appl. Sci. 11(18) (2021).
Ting-Wei, W., Zhang, H., Peng, W., Lü, F. & He, P.-J. Applications of convolutional neural networks for intelligent waste identification and recycling: A review. Resour. Conserv. Recycl. 190, 106813 (2023).
Bircanoglu, C., Atay, M., Beser, F., Genc, O., & Kizrak, M. A. RecycleNet: Intelligent waste sorting using deep neural networks. In Proceedings Innovations in Intelligent Systems and Applications (INISTA), pp. 1–7 (2018).
Chu, Y., Huang, C., Xie, X., Tan, B., Kamal, S., & Xiong, X. Multilayer hybrid deep-learning method for waste classification and recycling. Comput. Intell. Neurosci. 1–9.
Zhou, H. et al. A deep learning approach for medical waste classification. Sci. Rep. 12(1), 1–9 (2022).
Song, F., Zhang, Y., & Zhang, J. Optimization of CNN-based garbage classification model. In Proceedings of the 4th International Conference on Computer Science and Application Engineering, pp. 1–5 (2020).
Meng, S., & Chu, W.-T. A study of garbage classification with convolutional neural networks. In Proceedings of the 2nd Conference on Computing, Analytics and Networks, pp. 152–157 (2020).
Liu, W., Ouyang, H., Liu, Q., Cai, S., Wang, C., Xie, J., & Hu, W. Image recognition for garbage classification based on transfer learning and model fusion. Math. Probl. Eng. 1–12 (2022).
Liu, F. et al. Depth-wise separable convolution attention module for garbage image classification. Sustainability 14(5), 3099 (2022).
Tiyajamorn, P., Lorprasertkul, P., Assabumrungrat, R., Poomarin, W., & Chancharoen, R. Automatic trash classification using convolutional neural network machine learning. In Proceedings of International Conference on Cybernetics and Intelligent Systems (CIS) and IEEE Conference on Robotics, Automation and Mechatronics (RAM), pp. 71–76 (2019).
Hong, J., Fulton, M., & Sattar, J. A generative approach towards improved robotic detection of marine litter. In Proceedings of International Conference on Robotics and Automation (ICRA), pp. 10525–10531 (2020).
Sun, A., & Xiao, H. ThanosNet: A novel trash classification method using metadata. In Proceedings of International Conference on Big Data (Big Data), pp. 1394–1401 (2020).
Dong, X. Research and design of marine trash classification robot based on color recognition. Proc. IOP Conf. Ser. Earth Environ. Sci. 514(3), 032043 (2020).
Liu, H., Guo, Z., Bao, J., & Xie, L. Research on trash classification based on artificial intelligence and sensor. In Proceedings of International Conference on Intelligent Computing and Human-Computer Interaction, pp. 274–279 (2021).
Patil, A., Tatke, A., Vachhani, N., Patil, M., & Gulhane, P. Garbage classifying application using deep learning techniques. In Proceedings of International Conference on Recent Trends in Electronics, Information and Communication Technology, pp. 122–130 (2021).
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9 (2015).
Sharma, N., Jain, V., & Mishra, A. An analysis of convolutional neural networks for image classification. Proc. Comput. Sci. 132, 377–384 (2018).
He, K., Zhang, X., Ren, S., & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 770–778 (2016).
Xie, S., Girshick, R., Dollar, P., Tu, Z., & He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 1492–1500 (2017).
Dong, K., Zhou, C., Ruan, Y., & Li, Y. MobileNetV2 model for image classification. In Proceedings 2nd International Conference on Information Technology and Computer Application (ITCA), pp. 476–480 (2020).
Howard, A., Sandler, M., Chu, G., Chen, L. C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q. V., & Adam, H. Searching for MobileNetV3. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 1314–1324 (2019).
Ma, N., Zhang, X., Zheng, H. T., & Sun, J. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. In Proceedings of the European conference on computer vision (ECCV), pp. 116–131 (2018).
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp. 4510–4520 (2018).
Li, L., Fan, Y., Tse, M., & Lin, K.-Y. A review of applications in federated learning. In Elsevier Computers and Industrial Engineering, pp. 106854 (2020).
Chen, L., Li, S., Bai, Q., Yang, J., Jiang, S., & Miao, Y. Review of image classification algorithms based on convolutional neural networks. In MDPI Remote Sensing, pp. 4712 (2021).
Funding
This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-RG23078).
Author information
Authors and Affiliations
Contributions
H.A.K, S.S.N, A.A.K.A, S.A and M.A developed the paper idea, H.A.K, S.S.N and A.A.K.A developed the simulation model and conducted simulations, H.A.K, S.S.N and A.A.K.A wrote the initial version of the paper. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ahmed Khan, H., Naqvi, S.S., Alharbi, A.A.K. et al. Enhancing trash classification in smart cities using federated deep learning. Sci Rep 14, 11816 (2024). https://doi.org/10.1038/s41598-024-62003-4
This article is cited by
- Federated propensity score matching: a method for estimating average treatment effect based on data in Islands. Annals of Operations Research (2025)
- Spatial interpolation of global DEM using federated deep learning. Scientific Reports (2024)