Lightweight deep deterministic policy gradient for edge computing in recirculating aquaculture systems: real-time feeding control with reduced computational requirements

Elmessery, Wael M.; Shams, Mahmoud Y.; El-Hafeez, Tarek Abd; Szűcs, Péter; Eid, Mohamed Hamdy; Alhumedi, M.; Ahmed, Atef Fathy; Elwakeel, Abdallah Elshawadfy

doi:10.1038/s41598-025-21677-0


Article
Open access
Published: 30 October 2025


Wael M. Elmessery¹,
Mahmoud Y. Shams²,
Tarek Abd El-Hafeez^3,4,
Péter Szűcs⁵,
Mohamed Hamdy Eid^5,6,
M. Alhumedi⁷,
Atef Fathy Ahmed⁷ &
…
Abdallah Elshawadfy Elwakeel⁸

Scientific Reports volume 15, Article number: 37960 (2025)


Abstract

The deployment of advanced reinforcement learning algorithms in edge computing environments presents significant challenges for real-time aquaculture management, particularly in resource-constrained recirculating aquaculture systems (RAS). Building upon our previous work demonstrating superior performance of DDPG controllers in commercial RAS operations, this research introduces a lightweight DDPG architecture specifically optimized for edge computing deployment in recirculating aquaculture systems. The Edge-DDPG framework reduces computational complexity by 85% while maintaining 92% of the original model’s performance accuracy. The lightweight architecture employs compact neural networks with reduced layer dimensions (64→32→1 neurons vs. 400→300→1 in the original), memory-efficient replay buffers (5,000 vs. 100,000 capacity), and CPU-optimized operations suitable for ARM-based edge devices. Experimental validation demonstrates consistent performance with average inference times of 15.2 ± 3.1 ms on Raspberry Pi 4B, enabling real-time control within 50 ms system response requirements. The edge-optimized controller achieved 94.3% feeding accuracy and 96.1% water quality stability while consuming only 47 ± 8 MB of system memory. Economic analysis demonstrates deployment cost reductions from $56,900 to $8,400 for large-scale implementations, enabling widespread adoption of intelligent feeding control in small to medium-scale aquaculture operations.

Introduction

The rapid advancement of intelligent aquaculture management systems has demonstrated significant potential for optimizing feeding strategies and water quality control in recirculating aquaculture systems (RAS). Previous research established the superiority of Deep Deterministic Policy Gradient (DDPG) algorithms for RAS feeding control, achieving substantial improvements in tracking accuracy, feed efficiency, and system stability compared to traditional control methods¹. However, the deployment of sophisticated reinforcement learning algorithms in real-world aquaculture operations remains constrained by computational requirements, infrastructure limitations, and economic barriers that prevent widespread adoption, particularly in small to medium-scale farming operations.

The growing demand for sustainable and economically viable aquaculture solutions has intensified interest in edge computing technologies that can bring advanced control algorithms closer to the point of operation^1,2. Edge computing offers compelling advantages for aquaculture applications, including reduced latency, improved reliability, lower bandwidth requirements, and enhanced data privacy³. However, the deployment of complex deep reinforcement learning models on resource-constrained edge devices presents fundamental challenges in balancing computational efficiency with control performance^4,5.

Traditional DDPG implementations require substantial computational resources, with typical architectures employing networks containing hundreds of hidden units and requiring large-capacity replay buffers for stable learning⁶. Previous implementations demonstrated excellent performance on high-end computational infrastructure with GPU acceleration for training and inference¹. However, such computational requirements are incompatible with edge deployment scenarios where systems must operate on ARM-based processors, industrial PLCs, or embedded controllers with limited memory, processing power, and energy budgets.

Recent studies have highlighted the gap between laboratory-scale reinforcement learning implementations and practical deployment requirements in agricultural settings^2,7. Computational constraints represent the primary barrier to intelligent aquaculture system adoption, particularly in developing regions where resource limitations are most pronounced⁸. This gap matters most in aquaculture applications, where real-time control decisions must be made within milliseconds to prevent system instability and ensure fish welfare.

The evolution of edge computing hardware has created new opportunities for deploying intelligent control systems in aquaculture environments⁹. Modern ARM-based processors, including the Raspberry Pi 4, NVIDIA Jetson series, and industrial-grade edge controllers, offer substantially improved price-performance ratios while maintaining the ruggedness required for aquaculture deployments¹⁰. However, these devices typically provide limited RAM and ARM Cortex-A72 processors, representing orders of magnitude less computational capacity compared to GPU-accelerated training systems.

The Internet of Things (IoT) integration in aquaculture has further emphasized the need for distributed intelligence, where multiple edge nodes collaborate to manage complex RAS operations^11,12. This distributed approach requires lightweight algorithms capable of operating autonomously while maintaining coordination with centralized monitoring systems¹³. Current edge computing frameworks in aquaculture primarily focus on data collection and simple threshold-based control, leaving significant opportunities for intelligent decision-making at the edge¹⁴.

The field of neural network compression has developed numerous techniques for reducing model complexity while preserving performance, including network pruning, quantization, knowledge distillation, and architectural optimization¹⁵. However, the application of these techniques to continuous control problems in reinforcement learning presents unique challenges, particularly for actor-critic architectures where both policy and value function approximation must be maintained under resource constraints.

Recent advances in efficient deep learning architectures have demonstrated the feasibility of deploying sophisticated models on edge devices^16,17. Works in efficient neural networks have shown that carefully designed architectures can achieve comparable performance to larger models while requiring substantially fewer parameters and computational operations¹⁸. In the context of reinforcement learning, efficient actor-critic architectures have been explored, though these have not been specifically adapted for aquaculture control applications¹⁹.

The challenge of maintaining exploration-exploitation balance in resource-constrained environments adds complexity to edge deployment of reinforcement learning algorithms²⁰. Traditional approaches rely on experience replay buffers and target networks that consume significant memory, requiring novel approaches to maintain learning stability while reducing resource requirements²¹.

Despite the demonstrated effectiveness of DDPG algorithms in aquaculture control and the growing availability of edge computing hardware, significant gaps remain in bridging advanced reinforcement learning with practical deployment requirements. A computational gap exists because DDPG implementations require resources orders of magnitude greater than those available on typical edge devices, creating fundamental barriers to widespread adoption^3,22. Real-time performance requirements demand response times under 50 ms for stable RAS operation, yet existing lightweight reinforcement learning implementations often sacrifice speed for memory efficiency²³.

Economic accessibility represents another critical barrier, as high computational infrastructure costs limit adoption to large-scale commercial operations, excluding small and medium-scale farmers who represent the majority of global aquaculture operations^7,24. Scalability limitations arise from centralized deployment designs that limit applicability to distributed RAS operations requiring autonomous control capabilities¹⁴. Additionally, edge deployment environments present unique robustness challenges including power fluctuations, network interruptions, and thermal constraints that existing aquaculture control algorithms have not been designed to handle²⁵.

This research addresses these gaps by developing a lightweight DDPG architecture specifically optimized for edge computing deployment in aquaculture systems. Building upon the proven effectiveness of the full-scale DDPG implementation¹, our approach introduces novel optimizations that enable practical edge deployment without compromising critical control performance. The primary objectives are: architectural optimization that reduces computational requirements by over 80% while maintaining over 90% of the original control performance; real-time performance with inference times under 25 ms on ARM-based edge devices; memory-efficient algorithms that operate within typical edge computing hardware constraints; successful deployment on commercially available platforms; and cost-effective deployment pathways that reduce implementation costs by over 80% compared to GPU-based solutions.

Research contributions

The key contributions of this work include:

Novel lightweight architecture: Strategically reduced network dimensions while preserving dual-component reward structure ensuring both feeding optimization and water quality management.
Edge-optimized training protocols: Maintaining learning stability with reduced replay buffer capacity and CPU-optimized operations.
Real-time inference pipelines: Achieving sub-25 ms response times on ARM processors while maintaining operational precision.
Comprehensive validation framework: Testing edge deployment performance across multiple operational scales and environmental conditions.
Economic feasibility demonstration: Quantitative demonstration of cost reduction pathways enabling widespread adoption of intelligent aquaculture control systems.

Paper organization

The remainder of this paper is organized as follows: Sect. 2 presents the related work and comparative analysis with existing approaches. Section 3 provides the materials and methods, including the lightweight DDPG architecture design and edge computing optimization strategies. Section 4 provides comprehensive results and discussion of the architectural optimization performance, edge deployment validation, and economic impact analysis. Section 5 discusses the broader implications, limitations, and future research directions. Finally, Sect. 6 concludes the paper with key findings and their significance for precision aquaculture systems.

Related work

Reinforcement learning approaches in aquaculture control

Traditional aquaculture control systems primarily rely on rule-based approaches and PID controllers that struggle to adapt to the complex, non-linear dynamics of biological systems. Our previous works^1,26 established the effectiveness of Deep Deterministic Policy Gradient (DDPG) algorithms for RAS feeding control and energy optimization, demonstrating superior performance compared to conventional methods. However, the computational requirements of full-scale DDPG implementations limit their deployment to high-end GPU-accelerated systems, creating barriers for widespread adoption.

Recent advances in reinforcement learning for aquaculture applications have shown promising results but with notable limitations. Aljehani et al.²⁷ provided a comprehensive review of feeding control and water quality monitoring approaches, highlighting the gap between laboratory-scale implementations and practical deployment requirements. They identified computational constraints as the primary barrier to intelligent system adoption in aquaculture. Chahid et al.²⁸ developed Q-learning algorithms for fish growth trajectory tracking, demonstrating the potential of reinforcement learning in precision aquaculture but limited to simulation environments with high computational requirements.

Further work by Aljehani et al.²⁹ compared model-based versus model-free feeding control approaches, proposing Q-learning methods for fish trajectory tracking while managing model uncertainties and environmental factors. However, these approaches required substantial computational resources and were not designed for edge deployment scenarios. Chen et al.³⁰ designed an intelligent variable-flow recirculating aquaculture system using machine learning methods, achieving high accuracy in water quality regulation but requiring industrial PC hardware unsuitable for edge computing applications.

Edge computing in agricultural applications

The application of edge computing to agricultural systems has gained significant attention, with researchers identifying computational constraints as the primary barrier to AI adoption in precision agriculture. Zhang et al.² provided a comprehensive overview of edge computing in agricultural IoT, highlighting key technologies, applications, and challenges. They emphasized the need for distributed intelligence where edge nodes can process data locally, reducing cloud dependency and improving real-time responsiveness.

Zamora-Izquierdo et al.³¹ developed a smart farming IoT platform based on edge and cloud computing, demonstrating a multi-tier architecture for precision agriculture applications. Their work showed the feasibility of deploying intelligent algorithms on edge devices but focused primarily on environmental monitoring rather than continuous control applications. The unique requirements of real-time aquaculture control, including sub-second response times and continuous operation under varying environmental conditions, present distinct challenges not addressed by current approaches.

Recent work by He et al.³² introduced edge computing-oriented smart agricultural supply chain mechanisms, integrating auction-based optimization with neural network processing. However, existing edge computing solutions for agriculture primarily focus on computer vision tasks and data collection rather than continuous control applications requiring real-time decision-making with strict timing constraints.

Hardware optimization and neural network compression

The field of neural network compression has developed numerous techniques for reducing model complexity while preserving performance. Han et al.¹⁵ introduced the seminal “deep compression” approach, combining network pruning, trained quantization, and Huffman coding to achieve 35–49× storage reduction without accuracy loss. This work established the foundation for deploying sophisticated models on resource-constrained devices.

Liang et al.³³ provided a comprehensive survey of pruning and quantization techniques for deep neural network acceleration, demonstrating that network compression can often be realized with minimal accuracy loss and sometimes even improved performance. They categorized pruning approaches as static (offline) or dynamic (runtime) and compared various quantization strategies from 8-bit integers to binary neural networks.

Recent advances have explored combined compression approaches. Kim et al.³⁴ proposed PQK, integrating pruning, quantization, and knowledge distillation in a two-phase framework specifically designed for edge deployment. Chen et al.³⁵ developed knowledge transfer algorithms for edge devices, achieving over 49.5× parameter compression with 3.2× inference speed improvement on ARM-based processors. However, these compression techniques have not been specifically adapted for reinforcement learning algorithms in continuous control applications.

Gaps in current approaches

Despite advances in both reinforcement learning and edge computing, significant gaps remain in bridging these technologies for practical aquaculture deployment:

1. Computational Gap: Existing DDPG implementations require resources orders of magnitude greater than available on typical edge devices^3,22.
2. Real-time Performance: Current lightweight reinforcement learning implementations often sacrifice speed for memory efficiency, failing to meet sub-second response requirements²³.
3. Economic Accessibility: High computational infrastructure costs limit adoption to large-scale commercial operations, excluding small and medium-scale farmers^7,24.
4. Scalability Limitations: Centralized deployment designs limit applicability to distributed RAS operations requiring autonomous control capabilities¹⁴.
5. Application-Specific Optimization: Existing compression techniques have not been systematically applied to actor-critic architectures with dual-objective optimization requirements typical of aquaculture control systems.

Comparative analysis

Table 1. Summary of control approaches applied in aquaculture systems, with a focus on control type, platform used, computational requirements, ability to operate in real time, and associated deployment costs.

Table 1 Comparison of control approaches for aquaculture systems.


The proposed edge-optimized approach addresses these critical gaps by: (1) enabling real-time operation on low-cost hardware, (2) achieving dramatic cost reduction while maintaining intelligent control benefits, (3) providing scalable deployment architecture suitable for diverse operational scales, and (4) specifically optimizing actor-critic architectures for aquaculture control requirements.

Materials and methods

To address the challenge of deploying the DDPG controller on resource-constrained hardware, we developed a comprehensive methodology encompassing architectural optimization, hardware-specific validation, and rigorous comparative analysis.

Lightweight DDPG architecture design

Building upon our established DDPG framework for aquaculture control¹, we developed a compact neural network architecture specifically optimized for edge computing deployment. The lightweight design maintains the fundamental actor-critic structure while dramatically reducing computational complexity through strategic parameter reduction and architectural optimization.

Compact actor network architecture

The edge-optimized actor network employs a three-layer fully connected architecture with systematically reduced dimensions compared to our original implementation. The network structure transitions from 6 input dimensions (dissolved oxygen, pH, temperature, ammonia, biomass, current feeding rate) through hidden layers of 64 and 32 neurons to a single output representing the feeding rate action.

The output layer employs a modified tanh activation function that differs from standard tanh by incorporating learned scaling and bias parameters to constrain feeding rate recommendations within the operational bounds [0.45, 0.65]. This modification is implemented as:

$\text{output} = 0.1 \tanh(x) + 0.55$

where the scaling factor (0.1) and bias (0.55) ensure outputs remain within the required feeding rate range while preserving the smooth gradient properties of the tanh function.
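As a quick numerical check of this output mapping, the following minimal Python sketch evaluates the scaled tanh for a few pre-activation values. The function name feeding_rate is illustrative and is not taken from the original implementation.

```python
import math


def feeding_rate(x: float) -> float:
    """Map an unbounded pre-activation x into the feeding-rate range."""
    # tanh(x) lies in [-1, 1], so the result lies in [0.45, 0.65].
    return 0.1 * math.tanh(x) + 0.55


# Large negative inputs approach 0.45, large positive inputs approach 0.65.
for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x = {x:+.1f} -> feeding rate = {feeding_rate(x):.4f}")
```

Because tanh saturates smoothly, the actor can never command a rate outside the operational bounds, and no separate clipping step is needed after the output layer.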

The architectural optimization focuses on maintaining critical representational capacity while eliminating redundant parameters. Layer normalization from the original design is replaced with batch normalization to improve computational efficiency on CPU-based edge devices. As illustrated in Fig. 1, the comprehensive comparison between original and edge-optimized architectures demonstrates a 96.4% parameter reduction with minimal performance degradation, enabling deployment on low-cost edge devices while preserving aquaculture control effectiveness.

Memory-efficient critic network

The critic network architecture parallels the actor design with corresponding parameter reductions while maintaining the ability to evaluate state-action pairs effectively. The network processes concatenated state and action inputs (7 dimensions total) through hidden layers of 64 and 32 neurons, outputting a single Q-value estimate. The critic architecture incorporates several edge-specific optimizations, including shared batch normalization parameters, gradient clipping, and adaptive learning rate scheduling.
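To give a feel for the scale of the reduction, the sketch below counts the weights and biases of plain fully connected networks with the layer widths stated above. It is a rough illustration under explicit assumptions: normalization parameters are ignored, and the original actor and critic are assumed to take the same inputs with hidden layers of 400 and 300 units. The result therefore need not reproduce the 96.4% figure reported for the full architectures. The helper name mlp_param_count is hypothetical.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a plain fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


STATE_DIM, ACTION_DIM = 6, 1

# Edge-optimized networks as described in the text.
compact_actor = mlp_param_count([STATE_DIM, 64, 32, 1])
compact_critic = mlp_param_count([STATE_DIM + ACTION_DIM, 64, 32, 1])

# Assumed layout of the original networks (400 and 300 hidden units).
original_actor = mlp_param_count([STATE_DIM, 400, 300, 1])
original_critic = mlp_param_count([STATE_DIM + ACTION_DIM, 400, 300, 1])

compact_total = compact_actor + compact_critic
original_total = original_actor + original_critic
reduction = 1.0 - compact_total / original_total

print(f"compact:  actor={compact_actor}, critic={compact_critic}")
print(f"original: actor={original_actor}, critic={original_critic}")
print(f"dense-layer parameter reduction: {reduction:.1%}")
```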

The overall training process, which utilizes these lightweight components within an edge-optimized framework, is outlined in Algorithm 1.

Edge computing optimization strategies

Beyond architectural modifications, deploying the DDPG algorithm on edge hardware necessitates significant algorithmic and procedural optimizations. Our strategy addresses two primary bottlenecks: the substantial memory footprint of experience replay and the unique computational constraints of CPU-based training. Figure 1 illustrates the comprehensive edge computing optimization pipeline, showing (a) memory-efficient replay buffer organization with compressed storage, (b) CPU-optimized training protocol flow, and (c) real-time inference pipeline with performance monitoring.

Memory-efficient experience replay

A primary challenge in edge deployment is the large memory requirement of DDPG's experience replay buffer. Conventional implementations, which maintain up to 100,000 transitions to ensure learning stability, are infeasible on typical ARM-based edge devices. Our framework overcomes this by systematically reducing the replay buffer capacity by 95% to just 5,000 transitions. The critical benefits of experience replay (breaking temporal correlations and improving sample efficiency) are preserved through three complementary mechanisms:

Prioritized experience selection: An importance-weighted sampling mechanism is implemented to prioritize experiences with higher temporal difference errors. This strategy maximizes the learning utility of each stored transition, compensating for the reduced buffer capacity by ensuring that the most informative experiences are preferentially sampled.
Optimized data compression: To minimize memory overhead while preserving numerical accuracy, experience tuples are stored using a compressed data structure optimized for edge computing constraints. Each transition tuple (state, action, reward, next_state, done) is efficiently packed using float32 precision for continuous variables and boolean flags for terminal states. The state vector (6 dimensions: dissolved oxygen, pH, temperature, ammonia, biomass, current feeding rate) and next_state vector are stored as contiguous float32 arrays, while the scalar action and reward values utilize single-precision representation.

This compression strategy reduces the storage requirement for each transition from 384 bytes (using standard Python objects with float64 precision) to 96 bytes through several optimizations: (1) float32 precision reduces each floating-point value from 8 to 4 bytes while maintaining sufficient numerical fidelity for aquaculture control applications, (2) contiguous memory layout eliminates Python object overhead and improves cache locality during sampling operations, and (3) vectorized storage using NumPy arrays enables efficient batch operations essential for ARM processor optimization.

The compressed format maintains numerical precision within ± 0.001 units for all aquaculture parameters, ensuring that critical water quality thresholds (dissolved oxygen: 6.0–7.2.0.2 mg/L, pH: 6.8–7.8, ammonia: <0.3 mg/L) are preserved with sufficient resolution for accurate value function approximation. Memory alignment follows 32-byte boundaries to optimize SIMD operations on ARM NEON instruction sets, further enhancing computational efficiency during replay buffer sampling.

Adaptive buffer management: The framework includes the capability for dynamic buffer size adjustment in response to available system memory. This ensures operational stability across diverse edge platforms, preventing memory exhaustion in industrial scenarios where other processes may be running concurrently.

Algorithm 2 provides the detailed implementation of this memory-efficient experience replay mechanism, integrating data compression, prioritized sampling, and adaptive management.
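The packing and sampling ideas described above can be illustrated with a minimal standard-library sketch. It is not Algorithm 2 itself. The class name CompactReplayBuffer, the priority exponent alpha, and proportional sampling by absolute TD error are assumptions made for illustration. Importance-sampling corrections, alignment padding, and adaptive resizing are deliberately omitted.

```python
import random
import struct

# 6 state floats, action, reward, 6 next-state floats, done flag.
# "<" selects little-endian float32 fields with no padding (57 bytes).
TRANSITION = struct.Struct("<6f2f6f?")


class CompactReplayBuffer:
    """Fixed-capacity ring buffer of packed float32 transitions."""

    def __init__(self, capacity=5000, alpha=0.6, seed=0):
        self.capacity = capacity
        self.alpha = alpha  # assumed priority exponent
        self.data = bytearray(capacity * TRANSITION.size)
        self.priorities = [0.0] * capacity
        self.size = 0
        self.next_slot = 0
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done, td_error=1.0):
        """Overwrite the oldest slot once the buffer is full."""
        offset = self.next_slot * TRANSITION.size
        TRANSITION.pack_into(self.data, offset,
                             *state, action, reward, *next_state, done)
        self.priorities[self.next_slot] = (abs(td_error) + 1e-6) ** self.alpha
        self.next_slot = (self.next_slot + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size=32):
        """Draw slots with probability proportional to stored priority."""
        weights = self.priorities[:self.size]
        indices = self.rng.choices(range(self.size), weights=weights,
                                   k=batch_size)
        batch = []
        for i in indices:
            f = TRANSITION.unpack_from(self.data, i * TRANSITION.size)
            batch.append((f[0:6], f[6], f[7], f[8:14], f[14]))
        return indices, batch


buffer = CompactReplayBuffer(capacity=5000)
state = (6.5, 7.25, 24.0, 0.125, 150.0, 0.5)
buffer.add(state, 0.55, 1.0, state, False, td_error=0.3)
print("payload bytes per transition:", TRANSITION.size)
print("preallocated buffer bytes:", len(buffer.data))
```

The raw payload in this sketch is smaller than the 96 bytes per transition quoted above, since it does not model alignment padding or any per-transition metadata. Preallocating one contiguous bytearray keeps the footprint fixed for the lifetime of the process, which is the property that matters on a memory-constrained device.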

Distributed system integration architecture

The edge deployment enables distributed intelligence architectures where multiple autonomous units coordinate to manage complex multi-tank RAS operations. This distributed approach provides enhanced reliability through redundancy, with individual nodes maintaining operation during network interruptions while coordinating when connectivity is available.

The distributed architecture consists of three primary components:

Edge node architecture

Each RAS unit operates an independent edge node containing the lightweight DDPG controller, local sensor interface, and autonomous decision-making capability. Nodes maintain operational continuity during communication failures while logging decisions for later synchronization.

Coordination protocol

A lightweight mesh network protocol enables nodes to share critical information including water quality alerts, feeding schedules, and system status updates. The protocol implements priority-based message routing to ensure critical alerts receive immediate attention while routine status updates are batched for efficiency.
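A minimal sketch of priority-based routing with batching is shown below, assuming just two priority levels and a fixed batch size for routine messages. The names MessageRouter, CRITICAL, and ROUTINE are hypothetical. The actual protocol's message format and transport are not described here and are not modelled.

```python
import heapq
import itertools

CRITICAL, ROUTINE = 0, 1  # hypothetical levels; lower value is sent first


class MessageRouter:
    """Send critical alerts at once; hold routine updates until a batch fills."""

    def __init__(self, batch_size=3):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order
        self.batch_size = batch_size

    def submit(self, priority, payload):
        heapq.heappush(self._heap, (priority, next(self._order), payload))

    def next_dispatch(self):
        """Return the payloads to transmit now, or [] if nothing is due."""
        if not self._heap:
            return []
        if self._heap[0][0] == CRITICAL:
            return [heapq.heappop(self._heap)[2]]
        if len(self._heap) >= self.batch_size:
            return [heapq.heappop(self._heap)[2]
                    for _ in range(self.batch_size)]
        return []  # keep accumulating routine updates


router = MessageRouter(batch_size=3)
router.submit(ROUTINE, "tank-1 status")
router.submit(ROUTINE, "tank-2 status")
router.submit(CRITICAL, "tank-2 low dissolved oxygen")
print(router.next_dispatch())  # the alert jumps ahead of routine traffic
print(router.next_dispatch())  # routine batch not yet full
router.submit(ROUTINE, "tank-3 status")
print(router.next_dispatch())  # routine updates leave together
```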

Cloud integration

Periodic synchronization with cloud-based services enables model updates, performance monitoring, and centralized analytics while maintaining edge autonomy. The hybrid architecture reduces bandwidth requirements by > 90% compared to pure cloud-based solutions while preserving advanced capabilities.

The distributed architecture implementation is visualized in Fig. 2, which presents three critical aspects of the multi-node deployment strategy. Figure 2a demonstrates the multi-node coordination topology, illustrating how individual RAS units communicate through a mesh network protocol that enables real-time information sharing while maintaining local autonomy. Each node operates independently with its own lightweight DDPG controller, sensor interfaces, and decision-making capabilities, connected through prioritized message routing that ensures critical alerts receive immediate attention.

Figure 2b shows the autonomous operation capabilities during communication failures, a key advantage of edge deployment over centralized systems. When network connectivity is lost, each node continues operating using its local controller while logging decisions for later synchronization. This autonomous capability is demonstrated through the maintained feeding schedules and water quality control during 60–180 s network interruptions, with nodes resuming coordination seamlessly upon connectivity restoration.
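The log-then-synchronize behavior can be sketched as a bounded store-and-forward queue. This is an illustrative assumption about how such logging might be structured, not the authors' implementation. The class name DecisionLog and the send callback are hypothetical.

```python
from collections import deque


class DecisionLog:
    """Keep control decisions made while offline and replay them on reconnect."""

    def __init__(self, max_entries=10_000):
        # Bounded so an extended outage cannot exhaust memory;
        # the oldest entries are dropped first once the log is full.
        self._pending = deque(maxlen=max_entries)

    def record(self, timestamp_s, feeding_rate):
        self._pending.append((timestamp_s, feeding_rate))

    def flush(self, send):
        """Send entries in order; stop at the first failure and keep the rest."""
        sent = 0
        while self._pending:
            if not send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)


log = DecisionLog()
link_up = False
uploaded = []


def send(entry):
    """Stand-in for a network call; succeeds only while the link is up."""
    if link_up:
        uploaded.append(entry)
    return link_up


for t in (0, 60, 120):           # decisions taken during an outage
    log.record(t, 0.55)
print("sent while offline:", log.flush(send))
link_up = True                   # connectivity restored
print("sent after reconnect:", log.flush(send))
```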

Figure 2c illustrates the edge-to-cloud synchronization mechanism that enables model updates, performance monitoring, and centralized analytics while preserving edge autonomy. The hybrid architecture reduces bandwidth requirements by > 90% compared to pure cloud-based solutions by batching non-critical data and prioritizing essential communications. This synchronization strategy enables system-wide learning and optimization while maintaining the real-time responsiveness required for aquaculture control.

This redundancy is particularly valuable for large-scale commercial operations, where a single point of failure could otherwise compromise entire production systems.

CPU-optimized training protocol

Complementing the memory enhancements, the training protocol is comprehensively redesigned for the computational characteristics of CPU-based edge devices, such as the ARM Cortex-A72 processors common in aquaculture deployments. These optimizations ensure efficient learning within the constraints of limited cache hierarchies and memory bandwidth. The key strategies employed include:

Batch size optimization: Training batch sizes are strategically reduced from 256 to 32 samples to align with the limited memory bandwidth and cache capacity of ARM processors. This smaller batch size minimizes cache misses and enables more frequent gradient updates without creating computational bottlenecks.
Accelerated convergence mechanisms: The soft update rate for the target networks is increased (τ from 0.001 to 0.005) to accelerate policy convergence. This modification compensates for the reduced effective learning rate resulting from smaller batch sizes and ensures that learning progresses efficiently despite the computational constraints.
Gradient accumulation strategies: The implementation uses accumulated gradients across several mini-batches to simulate the stabilizing effects of a larger effective batch size while respecting the hardware’s memory limitations. This technique preserves the statistical benefits of large-batch training by distributing the computational load into smaller operations that fit within the device’s available memory and cache.

The CPU-optimized update step, designed for ARM processors using techniques like gradient accumulation, is detailed in Algorithm 3 and Fig. 3.
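Two of the mechanisms listed above, the soft target update and gradient accumulation, can be illustrated in a few lines of plain Python. The sketch uses a toy linear model in place of the actor and critic networks, so it is a demonstration of the arithmetic rather than Algorithm 3. All function names are hypothetical.

```python
import random


def soft_update(target, online, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def mse_gradient(w, b, batch):
    """Gradient of the mean squared error for the toy model y = w * x + b."""
    n = len(batch)
    gw = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return gw, gb


def accumulated_gradient(w, b, data, micro_batch=32):
    """Average micro-batch gradients before taking a single optimizer step."""
    chunks = [data[i:i + micro_batch]
              for i in range(0, len(data), micro_batch)]
    gw = gb = 0.0
    for chunk in chunks:
        cw, cb = mse_gradient(w, b, chunk)
        gw += cw / len(chunks)
        gb += cb / len(chunks)
    return gw, gb


rng = random.Random(0)
data = [(x, 3.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(256))]

full = mse_gradient(0.0, 0.0, data)                  # one batch of 256
accum = accumulated_gradient(0.0, 0.0, data, 32)     # eight batches of 32
print("full-batch gradient: ", full)
print("accumulated gradient:", accum)
print("target after one soft update:", soft_update([0.0], [1.0]))
```

With equally sized micro-batches the averaged gradient equals the large-batch gradient up to floating-point rounding, while only 32 samples need to be resident at any one time.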

Real-time inference pipeline and fault tolerance testing

Fault simulation methodology

To comprehensively evaluate system robustness under realistic operational conditions, we implemented a systematic fault injection framework that simulates common edge computing and aquaculture environment challenges:

Environmental fault simulation

Temperature spikes (+ 5 °C over 30 min) were simulated using controlled thermal profiles. Dissolved oxygen sensor drift (± 0.5 mg/L) was modeled using Gaussian noise injection with time-varying bias. Ammonia concentration variations were introduced through step-function disturbances representing biological loading changes.
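A minimal sketch of how such disturbances might be generated is given below. The magnitudes (+5 °C over 30 min, ± 0.5 mg/L drift) follow the text. The sinusoidal drift shape, the drift period, the noise standard deviation, and the choice to hold the temperature after the ramp are assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random


def drifting_do_reading(true_do, t_s, rng, max_bias=0.5,
                        period_s=3600.0, noise_std=0.05):
    """Dissolved-oxygen reading (mg/L) with slow bias drift plus Gaussian noise."""
    # Assumed drift shape: a slow sinusoid bounded by +/- max_bias.
    bias = max_bias * math.sin(2.0 * math.pi * t_s / period_s)
    return true_do + bias + rng.gauss(0.0, noise_std)


def temperature_spike(base_temp, t_s, start_s=0.0, rise=5.0,
                      duration_s=1800.0):
    """Linear ramp of `rise` degrees C over duration_s seconds, then held."""
    frac = min(max((t_s - start_s) / duration_s, 0.0), 1.0)
    return base_temp + rise * frac


rng = random.Random(42)  # fixed seed so fault scenarios are repeatable
for t in range(0, 2401, 600):
    temp = temperature_spike(24.0, t)
    do = drifting_do_reading(6.8, t, rng)
    print(f"t = {t:4d} s  temperature = {temp:5.2f} C  DO reading = {do:4.2f} mg/L")
```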

Communication fault testing

Network delays (30–60 s) were simulated using controlled packet dropping and latency injection. Complete network interruptions (60–180 s) tested autonomous operation capabilities during connectivity loss.

Hardware constraint simulation

Processing power limitations were emulated by restricting CPU availability to 50% of nominal capacity. Memory pressure scenarios were created by running concurrent processes consuming 75% of available RAM.

Power system disturbances

Voltage fluctuations (± 10%) were simulated using software-controlled power management to test system stability under electrical variations common in remote aquaculture installations.

The comprehensive validation framework is presented in Fig. 4, which details the multi-faceted approach to system robustness testing. Figure 4a shows the multi-scale RAS simulation environment, illustrating how the same lightweight DDPG controller is validated across laboratory-scale (1,000 L), pilot-scale (10,000 L), and commercial-scale (50,000 L) systems. Each scale presents different challenges in terms of stocking density, water volume dynamics, and system complexity, providing comprehensive validation of scalability.

Figure 4b presents the environmental perturbation testing protocols, showing systematic injection of realistic disturbances including temperature fluctuations (± 3 °C), dissolved oxygen variations (± 1.5 mg/L), and ammonia spikes (0.1–0.3 mg/L). These perturbations simulate common aquaculture challenges such as equipment malfunctions, biological loading changes, and seasonal variations. The testing protocol ensures that the edge controller maintains stable operation under conditions that frequently occur in real-world aquaculture environments.

Figure 4c illustrates the edge-specific fault injection scenarios designed to test robustness under computational and communication constraints. The diagram shows simulation of sensor drift using Gaussian noise with time-varying bias, communication delays through controlled packet dropping, processing constraints by restricting CPU availability, and power fluctuations using software-controlled management. These fault scenarios are critical for validating edge deployment reliability in industrial environments where perfect conditions cannot be guaranteed.

Optimized forward pass implementation

To meet the real-time performance requirements, the forward pass implementation is accelerated through a multi-faceted optimization strategy that targets model size, computational throughput, and memory access latency. These techniques work in concert to ensure rapid decision-making on resource-constrained hardware.

Model quantization: A post-training quantization process is applied to reduce model weights from 32-bit floating-point (float32) to 8-bit integer (int8) precision where feasible. This strategy achieves a 4× reduction in the model’s memory footprint and accelerates computation on compatible hardware, with a negligible impact on accuracy (< 0.5% degradation).
Vectorized operations: The implementation leverages hardware-specific Single Instruction, Multiple Data (SIMD) capabilities, specifically ARM NEON instructions. This allows for the parallel computation of matrix operations, significantly improving the throughput of the neural network’s forward pass on embedded processors.
Cache-friendly memory access: Data structures within the pipeline are strategically reorganized to align with the memory access patterns of the CPU. This optimization maximizes cache hit rates during inference, thereby reducing memory access latency by an estimated 35–40% and minimizing a critical bottleneck in the computation process.
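
To make the 4× footprint claim concrete, the following sketch applies symmetric per-tensor int8 quantization to a small weight vector using only the Python standard library. The quantization scheme, the function names, and the example weights are assumptions for illustration; the text does not specify which scheme or toolkit was used.

```python
from array import array

def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus one scale."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 0.9, -0.25]   # illustrative values only
fp32 = array("f", weights)                  # 4 bytes per weight
q, scale = quantize_int8(weights)           # 1 byte per weight

# Storage ratio between float32 and int8 representations.
ratio = (fp32.itemsize * len(fp32)) / (q.itemsize * len(q))
# Worst-case rounding error is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, dequantize(q, scale)))
```

Here `ratio` evaluates to 4.0, matching the stated memory reduction, while `max_err` stays within half of `scale`. The accuracy impact on a trained policy depends on the weight distribution and is not captured by this toy example.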

Adaptive noise management

Beyond raw computational speed, the operational safety of the edge-deployed agent is paramount. This requires a sophisticated approach to managing the exploration-exploitation trade-off. Our framework implements a hierarchical adaptive noise strategy that modulates the agent’s exploratory behavior based on both learning progress and the current state of the physical system.

Decaying noise schedule: The baseline exploration strategy is governed by an Ornstein-Uhlenbeck noise process with a standard deviation (σ) that begins at 0.1 and decays exponentially (with a factor of 0.995) toward a minimum threshold of 0.01. This ensures that exploration is gradually reduced as the agent gains experience and converges on an optimal policy.
State-dependent noise modulation: Layered on top of the decay schedule, the magnitude of the injected noise is dynamically modulated based on real-time water quality parameters. As system conditions (e.g., dissolved oxygen, ammonia) approach critical operational thresholds, noise is automatically attenuated. This makes the controller inherently risk-averse, reducing exploratory variance during sensitive periods to prioritize system stability.
Emergency override mechanism: As a final failsafe, an emergency override protocol is implemented to completely suppress action noise if a critical instability is detected. This mechanism ensures that fish welfare and system safety are prioritized above continued exploration in unforeseen or extreme circumstances, providing a crucial layer of operational robustness.
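
The three noise layers can be combined in a few lines. The sketch below is a hedged illustration rather than the authors' Algorithm 4: σ starts at 0.1, decays by 0.995 per call to `decay_sigma` (assumed here to be once per episode), and is floored at 0.01, as stated above. The mean-reversion rate `theta`, the linear `do_safety_margin` mapping, and its dissolved oxygen thresholds are hypothetical.

```python
import random

class AdaptiveOUNoise:
    def __init__(self, sigma0=0.1, sigma_min=0.01, decay=0.995,
                 theta=0.15, mu=0.0, seed=None):
        self.sigma, self.sigma_min, self.decay = sigma0, sigma_min, decay
        self.theta, self.mu = theta, mu   # theta: assumed OU mean-reversion rate
        self.x = mu
        self.rng = random.Random(seed)

    def decay_sigma(self):
        """One exponential decay step, floored at sigma_min."""
        self.sigma = max(self.sigma_min, self.sigma * self.decay)

    def sample(self, safety_margin=1.0, emergency=False):
        """safety_margin in [0, 1]: 1 = far from thresholds, 0 = at a threshold."""
        if emergency:
            # Emergency override: suppress exploration entirely and reset the process.
            self.x = self.mu
            return 0.0
        # Discrete-time Ornstein-Uhlenbeck update with unit time step.
        self.x += self.theta * (self.mu - self.x) + self.sigma * self.rng.gauss(0.0, 1.0)
        # State-dependent attenuation: less noise as conditions approach a threshold.
        return min(1.0, max(0.0, safety_margin)) * self.x

def do_safety_margin(do_mg_l, critical=5.0, comfortable=6.0):
    """Map dissolved oxygen to a [0, 1] margin (thresholds are hypothetical)."""
    return min(1.0, max(0.0, (do_mg_l - critical) / (comfortable - critical)))

noise = AdaptiveOUNoise(seed=1)
action_noise = noise.sample(safety_margin=do_safety_margin(5.5))
```

With these settings, σ reaches its 0.01 floor after roughly 460 decay steps, since 0.1 × 0.995^460 ≈ 0.00997.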

The complete real-time inference pipeline, incorporating these forward pass optimizations and adaptive noise management for operational safety, is summarized in Algorithm 4.

Deployment validation framework

To rigorously evaluate the performance and practical viability of the edge-optimized DDPG controller, a comprehensive validation framework was established. This framework is designed to assess the system’s capabilities across a spectrum of relevant hardware and to quantify its performance against key operational metrics. The methodology ensures that the validation results are both robust and directly applicable to real-world deployment scenarios.

Multi-platform testing environment

A central tenet of our validation strategy is the assurance of broad deployment compatibility. To this end, the controller was deployed and tested across three distinct edge computing platforms, each selected to represent a key archetype in potential aquaculture deployment scenarios:

Cost-effective deployment (Raspberry Pi 4B): This platform, featuring an ARM Cortex-A72 1.5 GHz quad-core processor and 4 GB of LPDDR4 RAM, serves as a baseline for accessible, low-cost implementations.
Accelerated deployment (NVIDIA Jetson Nano): With its ARM Cortex-A57 1.43 GHz quad-core processor and an integrated 128-core Maxwell GPU, the Jetson Nano represents scenarios where moderate computational acceleration is available for more demanding tasks.
Industrial deployment (Industrial PLC): A Beckhoff CX5140, equipped with an Intel Atom processor and 4 GB of RAM, was chosen to represent ruggedized, industrial-grade hardware commonly found in commercial automation and SCADA systems.

Performance benchmarking protocol

A standardized and repeatable benchmarking protocol was employed on each platform to conduct a comprehensive performance evaluation. This protocol is designed to assess both the computational efficiency of the lightweight architecture and its real-time control effectiveness through four key analytical components:

Inference latency analysis: To quantify real-time responsiveness, the statistical distribution of inference times was established by executing 1,000 consecutive forward passes under varying system loads. This assesses the system’s ability to consistently meet critical response deadlines.
Memory footprint profiling: To verify long-term stability, system memory usage was continuously monitored during extended operational periods. This profiling included analysis of RAM consumption, cache performance, and garbage collection overhead to identify potential memory leaks or inefficiencies.
Thermal performance assessment: The thermal design adequacy was validated by monitoring device temperatures during sustained, high-load operation. This is critical for ensuring reliability and longevity in industrial aquaculture environments, which may lack active cooling.
Power consumption analysis: To evaluate energy efficiency, power consumption was precisely measured across different operational modes. The results are used to quantify the system’s low-power advantages and to estimate potential battery life for remote or off-grid deployments.
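
A latency benchmark of the kind described in the first item can be sketched as below. The harness times 1,000 consecutive forward passes and reports the mean, standard deviation, and a simple rank-based 99th percentile. The `dummy_forward` function is a hypothetical stand-in for the actor network, and the warm-up count is an assumption.

```python
import statistics
import time

def benchmark_latency(forward, sample_input, n_runs=1000, warmup=20):
    """Time n_runs consecutive forward passes; all results in milliseconds."""
    for _ in range(warmup):          # warm caches before measuring
        forward(sample_input)
    times_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        forward(sample_input)
        times_ms.append((time.perf_counter() - start) * 1000.0)
    times_ms.sort()
    # Simple rank-based percentile estimate on the sorted timings.
    p99 = times_ms[min(n_runs - 1, int(0.99 * n_runs))]
    return {
        "mean_ms": statistics.fmean(times_ms),
        "std_ms": statistics.stdev(times_ms),
        "p99_ms": p99,
        "max_ms": times_ms[-1],
    }

def dummy_forward(state):
    """Hypothetical stand-in for the actor network's forward pass."""
    return sum(0.1 * s for s in state)

stats = benchmark_latency(dummy_forward, [0.5] * 8)
meets_deadline = stats["p99_ms"] < 50.0   # 50 ms real-time requirement
```

Checking the deadline against the 99th percentile rather than the mean gives a more conservative view of whether the 50 ms requirement holds under load.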

Comparative analysis framework

To conduct a rigorous and holistic evaluation of the edge-optimized DDPG controller, a multi-faceted comparative analysis framework was established. This framework is designed to move beyond a singular focus on computational performance and instead provide a balanced assessment across three critical domains: the preservation of core control effectiveness, the quantification of computational efficiency gains, and the analysis of practical economic viability. This approach allows for a comprehensive understanding of the real-world trade-offs inherent in deploying a lightweight model compared to its full-scale counterpart.

Control performance metrics

The primary objective is to verify that the substantial architectural and computational optimizations do not compromise the agent’s fundamental ability to manage the aquaculture environment effectively. To this end, the evaluation preserves the key performance indicators established in our previous work. These metrics assess the controller’s precision, stability, robustness, and biological impact:

Tracking accuracy: The Root Mean Square Error (RMSE) between the commanded and achieved feeding rates is used to measure control precision, with the goal of maintaining performance below the 12% RMSE target of the original implementation.
Water quality stability: The agent’s effectiveness is assessed by the percentage of time that critical parameters (dissolved oxygen, pH, and ammonia) are maintained within their optimal biological ranges.
System responsiveness: To evaluate robustness, the system’s recovery time following significant environmental perturbations (e.g., temperature fluctuations, dissolved oxygen drops) is measured.
Feed conversion efficiency: The ultimate biological and economic impact is measured by the ratio of feed consumption to biomass gain, directly comparing the efficiency of the edge-deployed model against the full-scale DDPG results.
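
The metrics above reduce to short formulas. The sketch below is illustrative and makes one assumption explicit: the RMSE is expressed as a percentage of the mean commanded feeding rate, since the normalization is not stated in this section. All function names and sample values are hypothetical.

```python
import math

def rmse_percent(commanded, achieved):
    """RMSE between commanded and achieved rates, as % of mean commanded rate."""
    n = len(commanded)
    rmse = math.sqrt(sum((c - a) ** 2 for c, a in zip(commanded, achieved)) / n)
    return 100.0 * rmse / (sum(commanded) / n)

def percent_in_range(samples, low, high):
    """Share of samples (in %) that fall inside [low, high]."""
    return 100.0 * sum(low <= s <= high for s in samples) / len(samples)

def feed_conversion_ratio(feed_kg, biomass_gain_kg):
    """Feed consumed per unit of biomass gained (lower is better)."""
    return feed_kg / biomass_gain_kg

# Illustrative values only, not data from the study.
tracking_error = rmse_percent([0.50, 0.55, 0.60], [0.45, 0.55, 0.65])
ph_stability = percent_in_range([6.9, 7.2, 7.9, 7.5], 6.8, 7.8)
fcr = feed_conversion_ratio(12.0, 10.0)
```

For these toy inputs, `ph_stability` is 75.0 and `fcr` is 1.2; recovery time would be measured separately as the interval between a perturbation and the return of all parameters to their target ranges.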

Computational efficiency assessment

Complementing the control performance analysis, a set of edge-specific metrics is used to precisely quantify the computational benefits derived from our optimization strategies. These metrics characterize the improvements in terms of model size, memory usage, execution speed, and power consumption:

Parameter reduction ratio: The total number of trainable network parameters is compared between the original and lightweight implementations to quantify the degree of model compression.
Memory footprint: Both peak and average RAM consumption are profiled during the training and inference phases to assess the system’s suitability for resource-constrained hardware.
Inference speed: Real-time capability is evaluated by measuring the average and 99th percentile response times across the different edge computing platforms.
Energy efficiency: A critical metric for remote deployments, energy efficiency is quantified as the computational energy expended per control decision, measured in millijoules per inference.
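
As a back-of-the-envelope illustration of the energy metric, energy per inference in millijoules equals power in watts multiplied by latency in milliseconds. The sketch below applies this to the mean power and inference times reported for each platform in the Results. Because those power figures describe whole-device consumption, the products should be read as rough upper bounds rather than measured per-inference energy.

```python
def energy_per_inference_mj(power_w, latency_ms):
    """W x ms = mJ, so no unit conversion factor is needed."""
    return power_w * latency_ms

# (mean power in W, mean inference time in ms), as reported in the Results.
platforms = {
    "Raspberry Pi 4B": (3.2, 15.2),
    "NVIDIA Jetson Nano": (8.1, 8.7),
    "Beckhoff CX5140": (12.5, 22.4),
    "RTX 2080 Ti": (175.0, 10.1),
}

energy_mj = {name: energy_per_inference_mj(p, t) for name, (p, t) in platforms.items()}
```

Under this estimate, the Raspberry Pi 4B spends about 48.6 mJ per decision against roughly 1,767 mJ for the GPU system, even though the GPU has the lower latency of the two.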

Economic viability analysis

Finally, the framework incorporates a cost-benefit analysis to translate the technical results into a clear assessment of economic feasibility for aquaculture operators. This analysis encompasses not just initial investment but also long-term operational value:

Hardware cost reduction: The direct capital savings are evaluated by comparing the cost of low-power edge devices ($50–$200) against the high-end GPU-accelerated systems ($5,000–$50,000) required for the full-scale model.
Deployment complexity: The analysis also considers indirect costs by assessing factors such as installation time, the level of technical expertise required, and ongoing maintenance overhead.
Operational benefits preservation: Crucially, this component validates that the economic benefits demonstrated in our previous work—specifically, feed cost reductions and improved growth rates—are preserved under edge deployment, confirming that the hardware savings do not come at the cost of reduced operational value.

Experimental design and validation protocols

To ensure the scientific rigor of our findings, a detailed experimental design was implemented, combining a high-fidelity simulation environment with a robust statistical analysis framework. This two-part methodology allows for the controlled, repeatable testing of the edge controller under a wide range of conditions and provides the analytical tools necessary to draw statistically significant conclusions from the resulting data.

High-fidelity simulated RAS environment

All validation experiments were conducted within a high-fidelity simulation of a recirculating aquaculture system (RAS). This virtual environment accurately replicates the complex, non-linear dynamics of real-world aquaculture operations while affording the precise control necessary for systematic testing. Within this simulation, a comprehensive suite of experiments was designed to assess the controller’s performance across three critical dimensions: scalability, environmental robustness, and fault tolerance.

Multi-scale testing: To evaluate the controller’s scalability and its applicability to diverse operational contexts, its performance was validated across three distinct operational scales: a laboratory-scale system (1,000 L), a pilot-scale system (10,000 L), and a commercial-scale system (50,000 L).
Environmental perturbation simulation: To assess the controller’s robustness against common, real-world biological and chemical stressors, the system was subjected to systematic environmental perturbations. These included controlled temperature fluctuations (± 3 °C), sudden dissolved oxygen variations (± 1.5 mg/L), and acute ammonia spikes (0.1–0.3 mg/L).
Fault injection testing: To specifically probe the system’s resilience to failure modes characteristic of edge computing deployments, a series of fault injection tests were performed. These tests simulated common hardware and network issues, including sensor drift, communication delays, restricted CPU availability, and power fluctuations.

Statistical analysis framework

The data generated from these comprehensive experimental protocols were subjected to a rigorous statistical analysis to ensure the reliability and significance of the comparative results. This framework employs a suite of statistical tools, each chosen for a specific analytical purpose:

Paired t-tests: Utilized for the direct, head-to-head comparison of control performance metrics (e.g., RMSE, stability) between the original and the lightweight DDPG implementations under identical experimental conditions.
ANOVA analysis: A multi-factor analysis of variance (ANOVA) was employed to deconstruct the sources of variance in system performance and to assess the independent and interactive effects of key factors such as edge platform choice, operational scale, and environmental conditions.
Bootstrap confidence intervals: For key performance metrics that did not conform to a normal distribution, such as computational timing data, non-parametric bootstrap techniques were used to estimate robust confidence intervals.
Effect size calculation: To move beyond mere statistical significance and evaluate the practical importance of our findings, Cohen’s d was computed. This allows us to quantify the magnitude of the performance differences, ensuring our conclusions are based on meaningful, real-world impact.
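
Three of these tools can be sketched with the Python standard library. The paired form of Cohen's d used here (mean difference divided by the standard deviation of the differences) is an assumption, since the text does not say which variant was computed, and the RMSE samples are synthetic values chosen only to exercise the functions.

```python
import math
import random
import statistics

def paired_t_statistic(a, b):
    """t statistic for paired samples (degrees of freedom = n - 1)."""
    d = [x - y for x, y in zip(a, b)]
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))

def cohens_d_paired(a, b):
    """Paired-samples effect size: mean difference / SD of differences."""
    d = [x - y for x, y in zip(a, b)]
    return statistics.fmean(d) / statistics.stdev(d)

def bootstrap_ci(samples, stat=statistics.fmean, n_boot=2000, alpha=0.05, seed=0):
    """Non-parametric percentile bootstrap confidence interval."""
    rng = random.Random(seed)
    n = len(samples)
    boots = sorted(stat(rng.choices(samples, k=n)) for _ in range(n_boot))
    lo = boots[int((alpha / 2) * n_boot)]
    hi = boots[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Synthetic RMSE values (%) for five paired runs; not data from the study.
original = [11.5, 11.9, 11.6, 11.8, 11.7]
lightweight = [12.0, 12.2, 11.9, 12.3, 12.0]

t_stat = paired_t_statistic(lightweight, original)
effect = cohens_d_paired(lightweight, original)
ci_low, ci_high = bootstrap_ci(lightweight)
```

The t statistic would still need to be compared against a t distribution with n − 1 degrees of freedom to obtain a p-value, which a statistics package would normally handle.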

Results and discussion

Architectural optimization performance

The lightweight DDPG architecture achieved substantial computational reductions while maintaining effective control performance across all tested configurations. Comparative analysis between the original full-scale implementation and the edge-optimized version reveals significant efficiency gains with acceptable performance trade-offs.

Network compression analysis

The compact neural network architecture reduced total parameters from 242,401 in the original implementation to 8,706 in the lightweight version, representing a 96.4% parameter reduction. The actor network compression (121,201 → 4,353 parameters) and critic network optimization (121,200 → 4,353 parameters) maintained representational capacity sufficient for effective policy learning while dramatically reducing memory requirements.

The comprehensive architectural comparison is detailed in Table 2, which quantifies the dramatic reductions achieved across all system components. The table reveals that both actor and critic networks achieve identical 96.4% parameter reduction through strategic layer dimension optimization from 400→300 hidden units to 64→32 units. The total parameter count reduction from 242,401 to 8,706 parameters demonstrates the effectiveness of the lightweight design while preserving the fundamental actor-critic architecture.

Beyond network parameters, Table 2 shows critical infrastructure reductions including model size compression from 15.2 MB to 2.1 MB (86.2% reduction), enabling deployment on devices with limited storage capacity. The replay buffer optimization represents a 95% reduction in memory requirements from 100,000 to 5,000 transitions, while the overall memory footprint decreases from 384 MB to 47 MB (87.8% reduction). These reductions collectively enable deployment on edge devices with typical RAM constraints of 4–8 GB while preserving sufficient computational headroom for other system processes.
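
The reduction percentages quoted above follow directly from the reported before-and-after values. The short check below recomputes them; the dictionary keys are descriptive labels, and only the ratios are derived here, not the underlying counts.

```python
def reduction_percent(original, lightweight):
    """Relative reduction from the original to the lightweight value, in %."""
    return 100.0 * (1.0 - lightweight / original)

# (original, lightweight) pairs as reported in the text.
table2 = {
    "parameters": (242_401, 8_706),
    "model size (MB)": (15.2, 2.1),
    "replay buffer (transitions)": (100_000, 5_000),
    "memory footprint (MB)": (384, 47),
}

reductions = {
    name: round(reduction_percent(old, new), 1)
    for name, (old, new) in table2.items()
}
```

This yields 96.4%, 86.2%, 95.0%, and 87.8%, respectively, in agreement with the figures in the text.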

Table 2 Architectural comparison between original and lightweight DDPG implementations.

The architectural optimization maintained critical functionality while achieving dramatic resource reductions. Layer normalization replacement with batch normalization improved CPU utilization by 23% on ARM processors, while the modified tanh activation function preserved feeding rate constraints within [0.45, 0.65] bounds established in our previous work.

Training convergence analysis

Despite reduced network capacity, the lightweight DDPG achieved stable convergence with only marginal performance degradation compared to the original implementation. Training progression analysis over 500 episodes demonstrates effective policy learning with optimized hyperparameters.

The training performance comparison is comprehensively analyzed in Fig. 5, which provides detailed insights into the learning dynamics of both implementations. Figure 5a presents the convergence trajectories for original versus lightweight DDPG, showing remarkably similar learning curves despite the 96.4% parameter reduction. The lightweight implementation converges to stable performance within 300 episodes compared to 250 for the original, with final episode rewards reaching 245 ± 15 units versus 262 ± 12 units. The convergence trajectory demonstrates that the architectural optimizations successfully preserve the fundamental learning capabilities while dramatically reducing computational requirements.

Figure 5b shows the action distribution analysis, which is critical for validating that the lightweight controller maintains appropriate feeding behavior patterns. The histogram comparison reveals that both implementations produce feeding rate distributions concentrated within the optimal range [0.45–0.65], with the lightweight version showing slightly more conservative behavior (mean = 0.52 vs. 0.54). This conservative bias actually enhances operational safety while maintaining feeding effectiveness, as confirmed by the biological performance metrics.

Figure 5c illustrates the reward component evolution, highlighting how the dual-objective optimization for both feeding efficiency and water quality management is maintained throughout training. The decomposed reward analysis shows that both feeding optimization and water quality maintenance components converge successfully in the lightweight implementation, with the dual-component structure preserved despite network compression. The water quality component shows slightly more variability initially but stabilizes at comparable levels, confirming that the architectural optimizations do not compromise the multi-objective nature of the control problem.

The lightweight implementation converged to stable performance within 300 episodes (vs. 250 for the original), with final episode rewards reaching 245 ± 15 units compared to 262 ± 12 units for the full-scale version. The 6.5% reduction in final performance represents an acceptable trade-off considering the 96.4% parameter reduction achieved.

Edge deployment validation results

Multi-platform performance assessment

Comprehensive validation across three edge computing platforms demonstrates consistent performance characteristics suitable for real-world aquaculture deployment. Each platform exhibited distinct advantages while maintaining operational compatibility with the lightweight DDPG framework.

The multi-platform performance assessment results are presented in Table 3, which provides detailed comparison across four key operational metrics for edge deployment validation. The Raspberry Pi 4B demonstrates the most balanced performance profile with inference times of 15.2 ± 3.1 ms, well within the critical < 50 ms real-time requirement for aquaculture control. Despite being the most cost-effective option at $75, it maintains competitive memory usage (47 ± 8 MB) and exceptional energy efficiency (3.2 ± 0.4 W), making it ideal for battery-powered remote installations where power consumption is critical.

The NVIDIA Jetson Nano shows superior inference performance at 8.7 ± 1.8 ms due to its integrated GPU acceleration, though this comes at the cost of higher power consumption (8.1 ± 1.2 W) and increased cost ($149). The slightly higher memory usage (52 ± 6 MB) reflects the GPU memory allocation overhead, but remains well within edge device constraints. This platform represents the optimal choice for applications requiring maximum computational performance within edge computing constraints.

The Industrial PLC (Beckhoff CX5140) provides the most ruggedized solution for harsh aquaculture environments, with inference times of 22.4 ± 4.2 ms that still meet real-time requirements. The industrial-grade reliability and extended temperature operating range justify the significantly higher cost ($890), while the lower memory usage (43 ± 5 MB) demonstrates efficient resource utilization. The higher power consumption (12.5 ± 2.1 W) is acceptable for installations with reliable power infrastructure.

The comparison with the original RTX 2080 Ti implementation highlights the dramatic efficiency improvements achieved through edge optimization. While the GPU system achieves faster inference than the Raspberry Pi 4B (10.1 ± 0.8 ms vs. 15.2 ± 3.1 ms), it requires > 5× more memory (256 ± 15 MB) and > 50× more power (175 ± 25 W) at 16× the hardware cost ($1,200 vs. $75), demonstrating the compelling value proposition of edge deployment for aquaculture applications.

Table 3 Edge platform performance comparison across key operational metrics.

The Raspberry Pi 4B emerged as the optimal balance of performance, cost, and energy efficiency for most aquaculture applications. Inference times of 15.2 ± 3.1 ms consistently meet the < 50 ms real-time requirements while consuming only 3.2 ± 0.4 W, making it suitable for battery-powered remote installations.

Real-time control performance

Edge deployment validation demonstrates maintained control effectiveness across critical aquaculture parameters. The lightweight DDPG successfully preserved water quality management capabilities while operating within edge computing constraints.

The comparative control performance analysis is comprehensively detailed in Table 4, which quantifies the performance trade-offs between the original and edge-optimized implementations across seven critical metrics. The feeding accuracy shows minimal degradation from 96.1 ± 1.8% to 94.3 ± 2.1%, representing only a 1.9% relative reduction that remains well within acceptable operational thresholds. This slight reduction is compensated by the dramatic computational efficiency gains, maintaining practical feeding effectiveness while enabling widespread deployment accessibility.

The Root Mean Square Error (RMSE) increase from 11.71% to 12.08% represents a modest 3.2% relative degradation in tracking precision, remaining well below the 15% threshold typically considered acceptable for aquaculture control applications. This small increase in tracking error is offset by the improved system robustness and autonomous operation capabilities that edge deployment provides, particularly during network interruptions or centralized system failures.

Water quality management performance shows remarkable consistency, with overall stability decreasing by only 1.1 percentage points, from 97.2% to 96.1%. The individual parameter control demonstrates exceptional preservation: dissolved oxygen maintenance within the critical 6.0–7.2 mg/L range shows only 0.7% degradation (96.5% to 95.8%), pH control within the 6.8–7.8 range maintains 96.4% effectiveness (0.8% reduction), and ammonia management below the 0.3 mg/L threshold preserves 96.2% effectiveness (0.6% reduction). These minimal reductions in water quality control validate that the lightweight architecture successfully maintains the dual-objective optimization for both feeding control and environmental management.

The most significant trade-off appears in recovery time following environmental perturbations, increasing from 18.3 ± 2.4 min to 21.7 ± 3.2 min (18.6% increase). While this represents the largest relative degradation, the absolute difference of 3.4 min rarely impacts fish welfare or system stability, particularly given the enhanced fault tolerance and autonomous operation capabilities that edge deployment provides.

Table 4 Control performance comparison: original vs. edge-optimized DDPG.

The edge-optimized implementation maintains > 94% of original control performance across all critical metrics. The modest 3.2% increase in tracking error (RMSE) and 1.1% reduction in water quality stability represent acceptable degradation considering the dramatic computational efficiency improvements achieved.

Economic impact and deployment analysis

Cost-benefit assessment

Economic analysis reveals substantial advantages for edge deployment, particularly for small to medium-scale aquaculture operations previously excluded from intelligent control system adoption due to cost barriers.

The comprehensive economic impact assessment is presented in Table 5, which demonstrates transformative cost-benefit improvements across three operational scales. For small-scale operations (1,000 L), the edge deployment approach achieves remarkable economic transformation, reducing total implementation costs from $18,600 ($15,400 hardware + $3,200 installation) to $1,275 ($475 hardware + $800 installation), representing a 93% cost reduction. Despite slightly lower annual savings ($4,200 vs. $4,850), the return on investment increases dramatically from 26% to 329%, with payback periods improving from 46.1 months to just 3.6 months. This transformation makes intelligent aquaculture control economically viable for small-scale operators who represent the majority of global aquaculture facilities.

For medium-scale operations (10,000 L), the economic advantages become even more pronounced. The total implementation cost reduction from $34,700 to $2,350 (93% reduction) combined with preserved annual savings of $35,600 results in an extraordinary ROI of 1,515% compared to 112% for the original implementation. The payback period of 0.8 months makes the investment decision practically risk-free, enabling rapid adoption across medium-scale commercial operations.

Large-scale operations (50,000 L) demonstrate the most dramatic absolute savings, with implementation costs reducing from $69,300 to $6,400 (91% reduction) while maintaining substantial annual savings of $172,800. The resulting ROI of 2,700% and 0.4-month payback period establish edge deployment as the clearly superior economic choice even for operations that could afford the original GPU-based implementation. The preserved operational benefits ensure that cost savings do not compromise biological or economic performance.

The economic analysis validates that edge deployment removes the primary barrier to intelligent aquaculture control adoption while preserving the operational benefits that justify the initial investment. The dramatic ROI improvements across all scales demonstrate that the technological advancement translates directly into practical economic value for aquaculture operators.

Table 5 Economic comparison across deployment scales and scenarios.

The edge deployment approach achieves remarkable economic improvements, with ROI increasing from 26% to 329% for small-scale operations and payback periods reducing from 46.1 months to 3.6 months. This dramatic improvement enables profitable intelligent control system adoption for operations previously excluded by economic constraints.
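
The ROI and payback figures can be reproduced from the reported costs and annual savings. The sketch below assumes simple first-year definitions (ROI as annual savings over total implementation cost, payback as cost over monthly savings); these formulas are an inference that matches the reported values to within rounding, not definitions given in the text.

```python
def roi_percent(annual_savings, total_cost):
    """Simple first-year ROI: annual savings as a percentage of total cost."""
    return 100.0 * annual_savings / total_cost

def payback_months(annual_savings, total_cost):
    """Months of savings needed to recover the implementation cost."""
    return 12.0 * total_cost / annual_savings

# (annual savings, total implementation cost) in USD for edge deployment.
edge = {
    "small (1,000 L)": (4_200, 1_275),
    "medium (10,000 L)": (35_600, 2_350),
    "large (50,000 L)": (172_800, 6_400),
}

summary = {
    scale: (round(roi_percent(s, c)), round(payback_months(s, c), 1))
    for scale, (s, c) in edge.items()
}
```

This gives 329% and 3.6 months for small-scale, 1,515% and 0.8 months for medium-scale, and 2,700% and 0.4 months for large-scale operations. Applying the same formulas to the original small-scale system ($4,850 savings, $18,600 cost) gives 26% and about 46.0 months, within 0.1 month of the reported 46.1.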

Operational benefits preservation

Despite computational optimizations, the edge-deployed system preserves the key operational benefits demonstrated in our original research, including feed cost reductions, improved growth rates, and enhanced system stability.

The preservation of operational benefits under edge deployment is demonstrated in Fig. 6, which provides comprehensive analysis across three critical performance dimensions. Figure 6a presents the feed cost reduction analysis, showing that the lightweight DDPG maintains 18.2% feed cost savings compared to traditional control methods. The analysis includes both direct feed consumption metrics and feed conversion ratios across different operational scales, demonstrating that the architectural optimizations do not compromise feeding efficiency. The consistency of savings across small (1,000 L), medium (10,000 L), and large (50,000 L) systems validates the scalability of cost benefits.

Figure 6b illustrates the growth rate improvements, which are preserved at 24.5% above baseline performance despite the computational constraints. The biomass accumulation curves show that fish growth rates under edge-deployed control match those achieved with the original implementation, with statistical analysis confirming no significant difference (p > 0.05) in growth trajectories. This validation is crucial as it confirms that the cost savings from edge deployment do not come at the expense of biological performance, maintaining the fundamental value proposition of intelligent aquaculture control.

Figure 6c presents the water quality stability metrics, demonstrating consistent performance across all critical parameters. The radar plot shows maintenance rates for dissolved oxygen (95.8%), pH (96.4%), and ammonia management (96.2%) that closely match the original implementation performance. The temporal stability analysis reveals that water quality parameter excursions are rare and brief, with rapid recovery to optimal ranges. This consistency is essential for fish welfare and validates that the lightweight architecture preserves the dual-objective optimization capability for both feeding control and environmental management.

Feed conversion ratio improvements of 18.2% are maintained under edge deployment, representing significant ongoing operational savings. Growth rate enhancements of 24.5% above traditional control methods validate that the lightweight architecture preserves the biological effectiveness of the original DDPG approach.

Robustness and fault tolerance analysis

Environmental perturbation response

Edge deployment environments present unique challenges including temperature fluctuations, power instabilities, and communication interruptions. Comprehensive robustness testing validates system reliability under realistic operational conditions.

The fault tolerance analysis results are comprehensively presented in Table 6, which quantifies system performance under six distinct failure scenarios commonly encountered in edge computing and aquaculture environments. Temperature spike response (+ 5 °C) shows the most significant impact, with recovery time increasing from 18.3 ± 2.4 min to 21.7 ± 3.2 min (18.6% degradation). However, this response time remains well within acceptable bounds for aquaculture systems, where temperature changes typically occur gradually and the 3.4-minute difference rarely impacts fish welfare or system stability.

Dissolved oxygen sensor drift (± 0.5 mg/L) demonstrates robust performance with only 2.0% accuracy degradation (94.7% to 92.8%), validating the controller’s ability to maintain effective water quality management despite sensor reliability issues. This resilience is critical for remote deployments where sensor maintenance may be infrequent, and the edge controller must operate autonomously with potentially degraded sensor data.

Communication delay (30 s) and processing constraint (50% CPU) scenarios show minimal impact, with 1.3% and 2.8% performance degradation, respectively, confirming that the lightweight architecture maintains effectiveness under computational stress and network latency typical of edge deployments. The processing constraint test particularly validates that the CPU-optimized design can operate effectively even when system resources are shared with other processes.

Power fluctuation (± 10%) testing shows 1.6% stability degradation, demonstrating acceptable resilience to electrical variations common in remote installations. This robustness reduces infrastructure requirements and enhances deployment flexibility in locations with unreliable power supply.

Most significantly, Table 6 reveals a new capability enabled by edge deployment: autonomous operation during network interruptions. The 98.3% autonomous operation effectiveness during 60-second network interruptions represents a fundamental advantage over centralized implementations, which cannot function during connectivity loss. This capability is particularly valuable for remote aquaculture installations where network reliability may be limited.
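The reason an edge node can keep controlling feeding during connectivity loss is that inference never leaves the device; only telemetry depends on the uplink. The sketch below illustrates this separation under stated assumptions: `EdgeController`, its bounded backlog, and the toy policy are hypothetical stand-ins and do not describe the authors' implementation.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence, Tuple

Telemetry = Tuple[Tuple[float, ...], float]  # (state, action) record


class EdgeController:
    """Hypothetical edge node: inference is local, the uplink is optional."""

    def __init__(self, policy: Callable[[Sequence[float]], float], backlog: int = 1000):
        self.policy = policy
        # Bounded queue so a long outage cannot exhaust device memory.
        self.pending: Deque[Telemetry] = deque(maxlen=backlog)
        self.uploaded: List[Telemetry] = []

    def step(self, state: Sequence[float], link_up: bool) -> float:
        action = self.policy(state)  # computed on-device, never waits on the network
        self.pending.append((tuple(state), action))
        if link_up:
            # Flush the backlog once connectivity is available again.
            self.uploaded.extend(self.pending)
            self.pending.clear()
        return action


# Toy stand-in policy (assumption): feed rate proportional to the first state entry.
controller = EdgeController(policy=lambda s: 0.1 * s[0])

# Three control steps; the link is down during the second one.
actions = [controller.step([7.0 + i], link_up=(i != 1)) for i in range(3)]
print(len(actions), len(controller.uploaded), len(controller.pending))
```

In this sketch an action is produced at every step regardless of link state, and the record buffered during the outage is delivered when the link returns. A centralized controller, by contrast, would have no action to apply during the outage.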

Table 6 Robustness testing results under various fault conditions.

The edge implementation demonstrates robust performance under fault conditions, with degradation typically < 3% compared to optimal conditions. Notably, edge deployment enables autonomous operation during network interruptions, representing a new capability not available in centralized implementations.

Long-term stability assessment

Extended operation testing over 30-day periods validates the long-term stability and reliability of edge-deployed systems under continuous operation conditions typical of commercial aquaculture environments.

The long-term stability assessment is comprehensively presented in Fig. 7, which demonstrates the sustained performance characteristics of the edge-deployed system across multiple operational metrics. Figure 7a shows 30-day performance tracking with consistent control accuracy maintained throughout the extended operation period. The feeding accuracy remains stable at 94.3 ± 2.1% with no significant degradation over time, while water quality stability maintains 96.1 ± 1.5% consistency. The temporal analysis reveals no performance drift or system degradation, with coefficient of variation remaining below 2.5% for all critical metrics, confirming the robustness of the lightweight architecture for continuous operation.
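As a quick consistency check, the coefficient of variation (CV) can be computed directly from the reported means and spreads. The sketch below assumes that the ± values are standard deviations, which the text does not state explicitly.

```python
def coefficient_of_variation_pct(mean: float, std: float) -> float:
    """CV = standard deviation / mean, expressed in percent."""
    return std / mean * 100.0

# (mean, spread) pairs as reported for the 30-day test; spread assumed to be 1 SD.
metrics = {
    "feeding accuracy": (94.3, 2.1),
    "water quality stability": (96.1, 1.5),
}

cv = {name: coefficient_of_variation_pct(m, s) for name, (m, s) in metrics.items()}

for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}%")  # 2.23% and 1.56%
```

Both values fall below the 2.5% bound quoted in the text.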

Figure 7b presents detailed memory usage patterns throughout the 30-day operation, demonstrating stable resource utilization without memory leaks or gradual degradation. The memory footprint remains consistently at 47 ± 3 MB throughout the entire period, with periodic fluctuations corresponding to experience replay buffer cycling and garbage collection events. The absence of memory growth patterns validates the effectiveness of the compressed storage mechanisms and adaptive buffer management. Peak memory usage during intensive learning phases never exceeds 52 MB, well within the constraints of typical edge computing hardware.
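One common way to distinguish periodic fluctuation from a genuine leak is to fit a linear trend to the memory trace and inspect its slope. The sketch below uses synthetic data only: the hourly sampling, the 3 MB daily oscillation, and the 0.05 MB/day alarm threshold are illustrative assumptions, not measurements from the study.

```python
import math
from typing import List


def slope_mb_per_day(samples_mb: List[float], samples_per_day: int) -> float:
    """Ordinary least-squares slope of memory usage versus time in days."""
    n = len(samples_mb)
    days = [i / samples_per_day for i in range(n)]
    x_mean = sum(days) / n
    y_mean = sum(samples_mb) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(days, samples_mb))
    den = sum((x - x_mean) ** 2 for x in days)
    return num / den


HOURS = 30 * 24  # 30 days of hourly samples (assumed sampling rate)

# Synthetic stable trace: 47 MB baseline with a daily oscillation (buffer cycling, GC).
stable = [47.0 + 3.0 * math.sin(2 * math.pi * h / 24) for h in range(HOURS)]

# Synthetic leaky trace: same pattern plus 0.01 MB/hour (0.24 MB/day) of growth.
leaky = [y + 0.01 * h for h, y in enumerate(stable)]

LEAK_THRESHOLD = 0.05  # MB/day, hypothetical alarm level

stable_slope = slope_mb_per_day(stable, 24)
leaky_slope = slope_mb_per_day(leaky, 24)
print(f"stable: {stable_slope:+.3f} MB/day, leak suspected: {abs(stable_slope) > LEAK_THRESHOLD}")
print(f"leaky:  {leaky_slope:+.3f} MB/day, leak suspected: {abs(leaky_slope) > LEAK_THRESHOLD}")
```

A trace that oscillates around a fixed footprint yields a slope near zero, whereas even slow growth produces a clearly positive slope over a 30-day window.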

Figure 7c illustrates temperature and power consumption profiles during extended operation, which are critical for assessing thermal management and energy efficiency in edge deployments. Device temperatures show a stable operational profile with maximum temperatures of 54 ± 4 °C during sustained operation, well within safe operating limits for commercial edge computing hardware. The power consumption analysis reveals consistent energy usage of 3.2 ± 0.4 W, with diurnal variations corresponding to feeding schedules and computational load changes. The thermal stability validates that passive cooling is sufficient for the lightweight implementation, reducing infrastructure requirements and enhancing deployment flexibility in diverse environmental conditions.

Memory usage remains stable at 47 ± 3 MB throughout extended operation, with no evidence of memory leaks or performance degradation. Temperature profiles show maximum device temperatures of 54 ± 4 °C during sustained operation, well within safe operating limits for commercial edge computing hardware.

Scalability and deployment considerations

Multi-scale performance validation

The lightweight DDPG architecture demonstrates consistent performance across operational scales from laboratory (1,000 L) to commercial (50,000 L) implementations, validating scalability for diverse aquaculture applications.

The multi-scale performance validation results are detailed in Table 7, which demonstrates remarkable consistency across three distinct operational scales with varying complexity and stocking densities. Laboratory-scale systems (1,000 L, 10 fish/m³) show optimal performance characteristics with inference times of 14.8 ± 2.9 ms, control accuracy of 94.8 ± 2.3%, and memory usage of 45 ± 6 MB. The lower stocking density and smaller system volume provide the most controlled environment, enabling the lightweight controller to achieve peak performance metrics.

Pilot-scale systems (10,000 L, 50 fish/m³) represent the most common aquaculture research and small commercial scale, showing minimal performance degradation with inference times of 15.2 ± 3.1 ms and control accuracy of 94.3 ± 2.1%. The 10× increase in system volume and 5× increase in stocking density relative to laboratory scale have negligible impact on computational requirements, with memory usage increasing only slightly to 47 ± 8 MB. This consistency validates that the lightweight architecture scales effectively without requiring system-specific optimization.

Commercial-scale systems (50,000 L, 100 fish/m³) demonstrate robust scalability under maximum operational complexity, maintaining inference times of 15.7 ± 3.4 ms that comfortably meet real-time requirements. The control accuracy of 93.9 ± 2.4% represents a degradation of only 0.9 percentage points compared to laboratory scale, while memory usage increases minimally to 49 ± 7 MB. The high stocking density (100 fish/m³) represents intensive commercial conditions, yet the performance consistency validates the architecture’s suitability for large-scale deployment.

The narrow variance in inference times across scales (14.8–15.7 ms, < 1 ms difference) particularly demonstrates the computational efficiency of the lightweight design. Unlike traditional implementations where computational requirements scale with system complexity, the edge-optimized architecture maintains consistent resource utilization regardless of operational scale, enabling standardized deployment across diverse aquaculture facilities.
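The margin against the 50 ms real-time requirement can be made explicit with a short calculation. The sketch below takes the reported inference times per scale and estimates a conservative worst case as mean plus three spreads; treating the ± values as standard deviations and using a three-sigma bound are assumptions made for illustration.

```python
BUDGET_MS = 50.0  # real-time response requirement stated in the paper

# (mean, spread) inference times in ms; spread assumed to be 1 SD.
inference_ms = {
    "laboratory (1,000 L)": (14.8, 2.9),
    "pilot (10,000 L)": (15.2, 3.1),
    "commercial (50,000 L)": (15.7, 3.4),
}

means = [mean for mean, _ in inference_ms.values()]
spread_ms = max(means) - min(means)

# Conservative per-scale estimate: mean + 3 SD.
worst_case_ms = {scale: mean + 3 * sd for scale, (mean, sd) in inference_ms.items()}
headroom_ok = all(value < BUDGET_MS for value in worst_case_ms.values())

print(f"spread of means across scales: {spread_ms:.1f} ms")  # 0.9 ms
for scale, value in worst_case_ms.items():
    print(f"{scale}: mean + 3 SD = {value:.1f} ms")
print("within 50 ms budget:", headroom_ok)
```

Under these assumptions the largest estimate (commercial scale, about 25.9 ms) still uses only about half of the available budget.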

Table 7 Performance consistency across operational scales.

Performance consistency across scales demonstrates the robust scalability of the edge-optimized approach. Inference times remain within narrow bounds (14.8–15.7 ms) regardless of system scale, while control accuracy shows minimal variation (93.9–94.8%), validating the architecture’s suitability for diverse deployment scenarios.

Discussion of implications

Technological and economic impact

The successful development of edge-compatible DDPG algorithms represents a significant advancement that bridges laboratory research with practical aquaculture deployment. The 96.4% parameter reduction while maintaining > 94% control performance demonstrates effective deployment of sophisticated reinforcement learning on resource-constrained hardware. This technological breakthrough extends beyond aquaculture, with methodologies for neural network compression and edge optimization being transferable to precision agriculture applications where computational constraints similarly limit AI adoption.
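To give a sense of where a parameter reduction of this magnitude comes from, the sketch below counts weights and biases for fully connected networks with the hidden-layer widths reported in the abstract (400→300→1 versus 64→32→1). The state dimension of 8 is a placeholder assumption; because the exact input dimensions and the critic are not modeled here, the result is not expected to reproduce the reported 96.4% figure exactly.

```python
from typing import Sequence


def mlp_param_count(layer_sizes: Sequence[int]) -> int:
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


STATE_DIM = 8  # hypothetical input dimension, chosen only for illustration

original = mlp_param_count([STATE_DIM, 400, 300, 1])
compact = mlp_param_count([STATE_DIM, 64, 32, 1])
reduction_pct = (1 - compact / original) * 100

print(f"original: {original} parameters")  # 124201
print(f"compact:  {compact} parameters")   # 2689
print(f"reduction: {reduction_pct:.1f}%")
```

The hidden-to-hidden layer dominates the count (400 × 300 versus 64 × 32 weights), which is why narrowing both hidden layers yields a reduction well above 95% in this illustration.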

The economic transformation achieved through edge deployment fundamentally shifts the accessibility of intelligent aquaculture systems. Reducing deployment costs from $15,400 to $475 for small-scale operations enables access for > 80% of aquaculture facilities previously excluded by cost barriers, with ROI increasing from 26% to 329% for small operations. This democratization effect has significant implications for global food security and sustainable aquaculture development, particularly in developing regions where small-scale operations predominate.

Environmental benefits and sustainability

Edge deployment enhances environmental sustainability through dramatic energy reduction (3.2 W vs. 175 W for GPU-based systems) while maintaining operational improvements of 18.2% feed cost reduction and 24.5% growth rate enhancement. However, the distributed nature of edge computing introduces considerations regarding electronic waste management from deploying multiple devices across operations. The longer device lifespan (5–7 years for industrial edge devices vs. 2–3 years for GPU accelerators) and lower replacement frequency partially offset these concerns, while the dramatic energy reduction provides clear environmental benefits over the operational lifetime.
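The scale of the energy saving can be illustrated with a simple annualized calculation. The sketch below assumes continuous operation at the stated average power draw for 365 days per device; it ignores cooling, networking, and other site loads, so it is an approximation rather than a full energy audit.

```python
HOURS_PER_YEAR = 24 * 365

EDGE_POWER_W = 3.2   # reported average draw of the edge device
GPU_POWER_W = 175.0  # reported draw of the GPU-based system


def annual_energy_kwh(power_w: float) -> float:
    """Energy used in one year of continuous operation, in kWh."""
    return power_w * HOURS_PER_YEAR / 1000.0


edge_kwh = annual_energy_kwh(EDGE_POWER_W)
gpu_kwh = annual_energy_kwh(GPU_POWER_W)
reduction_pct = (1 - EDGE_POWER_W / GPU_POWER_W) * 100

print(f"edge: {edge_kwh:.1f} kWh/year")      # 28.0
print(f"GPU:  {gpu_kwh:.1f} kWh/year")       # 1533.0
print(f"reduction: {reduction_pct:.1f}%")    # 98.2
```

Under these assumptions a single edge controller uses roughly 28 kWh per year compared with about 1,533 kWh for a GPU-based system.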

Limitations and critical scenarios

While the edge-optimized framework demonstrates robust performance across diverse conditions, certain limitations must be acknowledged. The 3.2% increase in tracking error and 18.6% longer recovery times may become critical during extreme environmental perturbations such as equipment failures, severe weather events, or disease outbreaks where rapid response is paramount. In scenarios with high stocking densities (> 150 fish/m³) or multi-species cultivation, the reduced network capacity may limit the controller’s ability to adapt to complex biological interactions.

Additionally, the reliance on specific edge hardware platforms introduces vendor lock-in concerns and potential supply chain vulnerabilities. Remote installations requiring model updates face connectivity constraints despite reduced bandwidth requirements, potentially limiting adaptive capabilities during extended isolation periods.

Performance validation and scalability

The edge-optimized framework preserves critical aquaculture management capabilities with 96.1% water quality stability and minimal degradation in dissolved oxygen (95.8%), pH (96.4%), and ammonia management (96.2%) compared to the original implementation. Real-time performance characteristics achieve inference times of 15.2 ± 3.1 ms on Raspberry Pi 4B hardware, consistently meeting < 50 ms response time requirements across operational scales from 1,000 L laboratory systems to 50,000 L commercial installations.

Future research directions

The established foundation opens several promising research avenues. Multi-agent reinforcement learning approaches could leverage distributed edge computing to optimize coordination between multiple RAS units. Transfer learning techniques could enable rapid adaptation of pre-trained models to diverse species and environmental conditions with minimal computational overhead. Integration with emerging edge AI accelerators (NPUs, ASICs) could further improve performance while reducing power consumption. Federated learning approaches could enable collaborative model improvement across deployments while preserving operational privacy.

Advanced sensor integration, including computer vision for fish monitoring and comprehensive environmental sensing arrays, could provide richer state representations without compromising real-time performance requirements. The demonstrated computational efficiency provides headroom for integrating additional sensing modalities.

Broader agricultural applications

The methodological advances have significant implications for precision agriculture where similar computational constraints limit AI adoption. Greenhouse climate control, livestock monitoring, and crop management applications could benefit from the neural network compression techniques and edge optimization strategies developed. The economic democratization could accelerate technology adoption across developing agricultural regions, establishing a template for intelligent agricultural systems that balance performance with practical deployment constraints.

Conclusion

This research demonstrates that advanced deep reinforcement learning algorithms can be successfully deployed on edge computing platforms for real-time aquaculture control while preserving performance benefits demonstrated in laboratory and commercial implementations. The 96.4% parameter reduction with minimal performance degradation enables widespread adoption of intelligent aquaculture systems previously limited by computational and economic constraints. The preservation of critical control performance validates that edge deployment maintains biological and operational effectiveness while providing clear pathways for technology democratization across diverse operational scales.

Data availability

Data will be made available by the first author on reasonable request.

Abbreviations

AI: Artificial intelligence
ANOVA: Analysis of variance
ARM: Advanced RISC machine
CPU: Central processing unit
DDPG: Deep deterministic policy gradient
DO: Dissolved oxygen
GPU: Graphics processing unit
IoT: Internet of things
NPU: Neural processing unit
PLC: Programmable logic controller
RAS: Recirculating aquaculture systems
RMSE: Root mean square error
ROI: Return on investment
SCADA: Supervisory control and data acquisition
SIMD: Single instruction, multiple data
TD: Temporal difference

Acknowledgements

The authors would like to acknowledge the Deanship of Graduate Studies and Scientific Research, Taif University, Kingdom of Saudi Arabia for funding this work. The research was funded by the Sustainable Development and Technologies National Programme of the Hungarian Academy of Sciences (FFT NP FTA).

Funding

The work was funded by the Deanship of Graduate Studies and Scientific Research, Taif University, Kingdom of Saudi Arabia.

Author information

Authors and Affiliations

Agricultural Engineering Department, Faculty of Agriculture, Kafrelsheikh University, Kafrelsheikh, Egypt
Wael M. Elmessery
Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafr Elsheikh, 33516, Egypt
Mahmoud Y. Shams
Department of Computer Science, Faculty of Science, Minia University, Minia, 61519, Egypt
Tarek Abd El-Hafeez
Computer Science Unit, Deraya University, Minia University, Minia, 61765, Egypt
Tarek Abd El-Hafeez
Institute of Environmental Management, Faculty of Earth Science, University of Miskolc, Miskolc- Egyetemváros, 3515, Hungary
Péter Szűcs & Mohamed Hamdy Eid
Geology Department, Faculty of Science, Beni-Suef University, Beni-Suef, 65211, Egypt
Mohamed Hamdy Eid
Department of Biology, College of Science, Taif University, P.O. Box 11099, Taif, 21944, Saudi Arabia
M. Alhumedi & Atef Fathy Ahmed
Agricultural Engineering Department, Faculty of Agriculture and Natural Resources, Aswan University, Aswan, 81528, Egypt
Abdallah Elshawadfy Elwakeel

Contributions

Conceptualization, W.M.E.; data curation, W.M.E., M.Y.S., T.A.; formal analysis, W.M.E., M.Y.S., T.A.; funding, P.S., M.H.E., M.A., A.F.A.; investigation, P.S., M.H.E., M.A., A.F.A., A.E.E.; methodology, W.M.E.; project administration, W.M.E., A.E.E.; resources, W.M.E., M.Y.S., T.A.; software, W.M.E.; supervision, W.M.E., A.E.E.; validation, P.S., M.H.E., M.A., A.F.A., A.E.E.; visualization, P.S., M.H.E., M.A., A.F.A., A.E.E.; writing-original draft, W.M.E.; writing-review and editing, W.M.E., A.E.E. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Mohamed Hamdy Eid or Abdallah Elshawadfy Elwakeel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Elmessery, W.M., Shams, M.Y., El-Hafeez, T.A. et al. Lightweight deep deterministic policy gradient for edge computing in recirculating aquaculture systems: real-time feeding control with reduced computational requirements. Sci Rep 15, 37960 (2025). https://doi.org/10.1038/s41598-025-21677-0

Received: 23 June 2025
Accepted: 23 September 2025
Published: 30 October 2025
Version of record: 30 October 2025
DOI: https://doi.org/10.1038/s41598-025-21677-0
