Introduction

The digital world is experiencing rapid and exponential growth, making secure communication an indispensable requirement for safeguarding data from potential intrusions1,2,3. Various technological methods have therefore been suggested to address this concern, including cryptography, steganography, and watermarking, each with distinct advantages and limitations4. Cryptography uses encryption to secure data in transit, focusing primarily on the confidentiality, integrity, and availability of the information. Steganography hides information within other media, such as images, video, audio, or text. Watermarking, a specialized application of steganography, categorizes and safeguards the content of copyrighted media5. Nonetheless, steganography possesses a distinctive benefit over cryptography and watermarking6,7: after data embedding, the medium used for steganography ostensibly remains identical to the original or cover medium in which the data is concealed8.

Recently, Deep Learning (DL) has emerged as a promising approach in steganography, offering novel methods for concealing and extracting information that are more resistant to detection9. Techniques such as Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), autoencoders, and other DL models have been utilized to develop steganographic systems that exhibit robustness against steganalysis9,10,11,12; recent related works are reviewed in detail in section “Related work”. As steganography advances, integrating DL techniques is expected to play a pivotal role in its future development. The key challenge, however, lies in designing systems that are robust to detection while remaining efficient and practical for real-world applications. Despite these advances, many challenges persist in current approaches. One significant limitation is the trade-off between payload capacity and security. Methods like LSB steganography are simple and easy to implement but tend to be weak against steganalysis because of the relatively large modifications they introduce into the cover medium. More sophisticated techniques, most of them based on deep learning, provide higher security but incur greater computational complexity and are often prone to overfitting. Another significant problem is the limited capacity of many traditional methods, which restricts how much data can be hidden without significantly distorting the cover image. In addition, most current approaches cannot efficiently balance the imperceptibility of hidden data against its robustness to possible attacks or modifications during transmission. The motivation for this paper is to bridge these gaps found in earlier research by developing a new multi-layered steganographic scheme that incorporates the strengths of lossless compression (specifically Huffman encoding), efficient data embedding via LSB, and deep learning for robust security.

Traditional steganographic methods, such as LSB embedding, have long been used for data hiding due to their simplicity and ease of implementation. However, these approaches are inherently limited in several critical aspects, including payload capacity, robustness against attacks, and imperceptibility13,14. LSB embedding, for example, is highly susceptible to steganalysis techniques and environmental distortions such as noise and compression, making it less reliable in scenarios requiring high levels of security. Integrating DL introduces significant advancements in steganographic systems to address these challenges15. Deep learning models, such as encoder–decoder architectures and CNNs, provide adaptive capabilities that enhance the robustness of the embedded data. Unlike traditional methods that embed data statically, DL techniques dynamically adapt to the statistical properties of the cover medium, effectively mimicking its features. This adaptability ensures that the hidden data remains indistinguishable from the original, reducing detectability and increasing resilience against attacks. Furthermore, deep learning enables a superior balance between payload capacity and security: traditional methods often compromise image quality when increasing the payload, whereas DL-based systems optimize embedding efficiency while maintaining high visual fidelity.

This paper proposes a multi-layered steganographic framework that combines Huffman coding, Least Significant Bit (LSB) embedding, and a deep learning-based encoder–decoder. While Huffman coding is traditionally utilized for its compression capabilities, its inclusion in our method serves a dual purpose. Beyond optimizing the size of the embedded payload, Huffman coding plays a critical role in enhancing security by obfuscating statistical patterns within the data. This added layer of randomness increases the robustness of the embedded information, making it less detectable by steganalysis techniques that rely on predictable data structures. The integration of Huffman coding aligns with our strategy to create a secure and invisible steganographic system. By introducing a preprocessing step that evaluates the feasibility of compression for small payloads, we ensure that Huffman coding is selectively applied, minimizing potential overhead while preserving its contribution to security. This approach demonstrates the complementary role of Huffman coding in the overall framework, emphasizing its significance beyond simple data compression. The novelty of the proposed method lies in its multi-layered approach, which integrates Huffman coding, LSB steganography, and a deep learning-based encoder–decoder. This combination enhances robustness and adaptability to diverse attack scenarios while maintaining imperceptibility. Unlike traditional single-layer methods, this hybrid framework introduces unique redundancy and statistical mimicry mechanisms to protect the hidden data effectively. The proposed approach balances the security and capacity of steganographic systems without deteriorating the visual quality of the cover images. By addressing both challenges, this research moves steganography toward robust, efficient, and scalable solutions for modern data-hiding applications. The contributions of this paper can be summarized as follows:

  1. Innovative multi-layered framework: The method integrates Huffman coding, LSB steganography, and a deep learning-based encoder–decoder to enhance data security and storage efficiency. By combining these techniques, the framework ensures the high imperceptibility of cover images while improving robustness against steganalysis and various attack scenarios.

  2. Adaptive and robust data embedding: The deep learning-driven hiding network adaptively embeds secret information into cover images, closely mimicking their statistical properties. This significantly improves resistance to detection techniques, ensuring data robustness against noise, compression attacks, and statistical analyses.

  3. Enhanced security through dual-layer obfuscation: The method leverages Huffman coding for compression and as an additional layer of obfuscation by introducing statistical randomness in the data payload. This dual-layered approach significantly increases the resistance to unauthorized access and detection, making it highly suitable for secure communication and digital rights management applications.

The remainder of the paper is structured as follows: Section “Related work” reviews recent related works in the field. Section “Methodology” provides a detailed explanation of the methodology employed. Section “Experiments” outlines the experimental setup. Section “Results and discussion” presents and discusses the obtained results. Finally, section “Conclusion and future work” concludes the paper and highlights potential directions for future research.

Related work

The term steganography derives from two Greek words: steganos, meaning “covered,” and graphein, meaning “to write”. Essentially, it is the art and science of concealing information within other information, typically in such a way that the presence of the hidden information is not apparent. This technique is a potent method for covert communication, ensuring that the message’s existence is known only to the intended recipient16. Steganalysis serves as a countermeasure to steganography, with its primary objective being to detect concealed information and potentially disrupt confidential communication17. Furthermore, steganography techniques can be grouped into two main categories: (i) digital steganography and (ii) linguistic steganography, as depicted in Fig. 1.

Fig. 1: Taxonomy of steganography.

Digital steganography leverages the characteristics of digital artifacts to disguise information. The first category, technical steganography, involves selecting a cover medium, which could be an image, text, video, audio file, or protocol. Subsequently, method-based steganography, a specific algorithm or approach, is chosen to conceal the information within the selected cover medium. The most common method-based steganography is the statistical method, a data-hiding technique that employs statistical methods to conceal information within digital objects18. Several algorithms are utilized within this technique, including the Least Significant Bit (LSB) method, which manipulates the least significant bit of a byte to hide data. Similarly, Pixel-Value Differencing (PVD) uses the difference between pixel values, while Edge Map Data embedding (EMD) leverages the edges in an image for data concealment. The Pixel Intensity (PI) method embeds data in the intensity of the pixels, while the General Linear Model (GLM) utilizes linear relationships between variables to hide data. In addition, Quantization Index Modulation (QIM), a statistical steganography method, utilizes the index of a quantizer to embed secret data into a cover signal19,20.

In addition, with the rapid progress of Artificial Intelligence (AI) over the past few years, machine learning (ML) and DL techniques have been incorporated into steganography methods. These techniques build upon learning models to hide and reveal data both efficiently and securely. The DL models used in steganography include GANs, which can generate new instances similar to the training set; CNNs, which are well suited to image processing and therefore to embedding data into images; and autoencoders, which learn data encodings in an unsupervised manner and add to the repository of resources available for modern steganography. Quantum image steganography can also be considered a recent advancement in the field, in which quantum information is hidden within images. This technique applies the principles of quantum mechanics to strengthen the protection of concealed information and points toward the future development of steganography21,22.

In contrast, linguistic steganography conceals data at the linguistic level, within text or speech. This form of secret communication can be regarded as one of the oldest. It is broadly classified into two primary categories: semagrams and open codes. Semagrams differ from open codes in their construction: although they can take the form of language, they are often graphics or icons used to pass messages without relying on alphabets or numerals. They are categorized as23:

  • Visual semagrams: These include graphics, symbols, or pictographs that convey specific meanings or messages, such as road signs or symbols on maps.

  • Textual semagrams: These involve arrangements of words or letters that convey a hidden message. A prime example is an acrostic, where the initial letter of each line spells out a concealed word or phrase.

Open codes refer to messages that, on the surface, seem like standard communications but contain concealed meanings. One such method within open codes is the “Jargon Code,” which employs specific terminology or slang understood exclusively by a particular group, rendering the message cryptic to outsiders. On a different note, there are covered ciphers that use more obfuscated methods to conceal messages. The “Null Cipher” is a prime example, where the genuine message is embedded within a broader, seemingly harmless text. Another intriguing method in this category is the Grille Cipher, a technique for encrypting plaintext by writing it onto a sheet of paper through a pierced sheet, thus obscuring the original message uniquely and ingeniously24,25.

A novel image steganography methodology has been introduced that leverages a deep convolutional autoencoder architecture that is both lightweight and effective. This architecture serves a dual purpose: a secret image is embedded into the cover image and later extracted using a secret key. The methodology was evaluated using three distinct datasets commonly used for benchmarking: COCO, CelebA, and ImageNet. The peak signal-to-noise ratio (PSNR) was evaluated on the test data, and the method showed higher information-hiding capacity, better security robustness, and overall better performance than other learning-based methods26. In addition, a novel encoder–decoder architecture based on CNNs has been proposed. This architecture is designed to hide one image inside another and significantly increases both the payload capacity and the quality of the images. To that end, the researchers provided a new loss function capable of training the encoder–decoder network end to end with a high level of success across several diverse datasets, including MNIST, CIFAR10, PASCAL-VOC12, and ImageNet. Their method delivered State-Of-The-Art (SOTA) results in terms of payload capacity; in addition to a high PSNR, the SSIM was also calculated and showed a significant improvement over traditional methods that required manual creation of features for steganography27.

In addition, a novel steganographic technique known as Adversarial Embedding (ADV-EMB) was proposed to counter machine learning-based steganalytic models, especially those based on CNNs. The ADV-EMB approach selectively replaces and rearranges image elements according to gradients derived from a target CNN steganalyzer. This method hides the secret message while simultaneously deceiving steganalysis algorithms by producing what the authors call ‘adversarial stego images’. Their experiments demonstrated that this technique can effectively degrade both adversary-unaware and adversary-aware steganalysis performance, proposing a new paradigm in modern steganographic practice that can withstand powerful steganalysis attacks28. In the study by Wu et al.29, a deep CNN dubbed ‘StegNet’ is proposed, constituting a leap forward in image steganography. Many existing methods focus primarily on invisibility, with less consideration given to data throughput. StegNet, by contrast, achieves an outstanding data decoding rate of 98.2% and a Bits Per Pixel (bpp) rate of 23.57, while modifying the cover image by only 0.76% on average. By learning the mapping between cover and hidden images end to end, the StegNet model remains highly robust against steganalysis.

At the same time, the latest developments in image steganography have been marked by the integration of GANs for improved secrecy and robustness. Liu et al.30 survey numerous GAN strategies for image steganography, including cover modification, selection, and synthesis. They exploit the adversarial property of GANs to create stego images that are more resistant to steganalysis. For instance, GAN-based synthesis methods have proved effective in generating imperceptible stego images with hidden messages inside them, maximizing the covertness of steganographic practice. Another approach based on GANs enhanced the quality and capacity of steganographic images without changing the cover image; in contrast, conventional steganography techniques typically alter the cover image, which can be identified by steganalysis tools. Their model hides a payload of 2.36 bits per pixel, circumvents detection defences, and advances the steganography field31. Further, Channel Attention Image Steganography With Generative Adversarial Networks introduces a channel attention mechanism within a GAN framework, dynamically adjusting embedding priorities based on attention weights and ensuring that modifications mimic natural image distributions to improve detection resistance. Together, these studies underscore the importance of adaptive techniques, channel-specific optimization, and deep learning frameworks in advancing the robustness and efficacy of image steganography systems32. Moreover, a novel steganographic algorithm using CycleGAN combines image-to-image translation with a steganalysis module to enhance the anti-detection ability of secret images. The approach leverages cycle consistency to preserve image quality while embedding secret data, demonstrating improved performance in resisting steganalysis and maintaining imperceptibility in IoT applications33.

A new technique for character-level text image steganography based on adversarial attacks has been proposed to enhance the security of secret information transmission. This technique exploits the decision boundaries of neural networks to embed coded information in the character regions of images so that OCR software cannot detect it. The methodology involves creating adversarial examples that are recognizable only by the intended local OCR model, guarding against deciphering of the embedded data by unintended recipients. The authors obtained a high embedding success rate while retaining the original appearance of text images by improving the adversarial sample generation process and introducing a validation model that ensured the low transferability of these samples34. Further, a new model for hiding Arabic text with DL techniques was developed. This model conceals information within Arabic poems and relies on a database of Arabic poetic pieces; it is based on LSTM networks and Baudot Code algorithm strategies to raise the capacity and enhance the linguistic accuracy of the embedded data35. Furthermore, recent advancements in image steganography have significantly improved payload distribution strategies and detection resistance through adaptive and deep learning-based methods. Adaptive Payload Distribution in Multiple Images Steganography Based on Image Texture Features leverages image texture complexity to allocate higher payloads to regions with greater embedding capacities, thereby minimizing statistical anomalies and enhancing security36. Similarly, A New Payload Partition Strategy in Color Image Steganography optimizes payload allocation across RGB channels by considering their distinct characteristics, achieving a balance that enhances imperceptibility and embedding efficiency37.

Table 1 presents a comprehensive comparison of recent studies in steganography, highlighting their techniques, key features, advantages, and limitations to provide a clear understanding of the advancements and challenges in the field.

Table 1 Comparison of related works.

The works reviewed in this section provide the basis for the DL-driven transformation of image steganography. These developments not only increase the stealth and capacity of steganographic systems but also offer new safeguards against ever-improving steganalysis techniques. The steady emergence of new neural network architectures and challenging steganographic techniques indicates an excellent research opportunity and a promising future for progress in the security and effectiveness of steganography.

Methodology

This section outlines the general approach, which combines Huffman coding and image steganography, enhanced with a DL-based encoder–decoder. The encoder algorithm processes the cover and secret images through a deep learning-based hiding network. The hiding network extracts feature representations from the secret image and adaptively embeds these features into the cover image, ensuring that the resulting container image retains visual imperceptibility while securely embedding the secret data. This process minimizes detectable artifacts and enhances the robustness of the proposed method against steganalysis. The context diagram of the multi-layered steganographic approach is shown in Fig. 2.

Fig. 2: Framework of the proposed steganographic method.

The methodology is detailed through two primary algorithms: (i) the Encode Algorithm and (ii) the Decode Algorithm. The Encode Algorithm converts the textual data into Huffman-coded format, embeds it into a carrier image with LSB steganography, and ultimately produces the container image as the output of the DL encoder applied to the resulting secret image and the cover image. Conversely, the Decode Algorithm reverses this process, methodically extracting and reconstructing the hidden textual data from the container image. This dual-layer approach significantly enhances the complexity and security of the information-hiding process with an additional layer of steganography. However, it is important to acknowledge that each layer of steganography may slightly degrade the quality of the hidden information due to cumulative effects, necessitating careful consideration of the trade-offs involved.

Huffman encoding

Huffman encoding is a lossless data compression technique that assigns shorter codes to data that appear more often and longer codes to less common data. This method employs a variable-length code table, determined by the occurrence frequency of each piece of data, such as text characters, to represent information38,39. Furthermore, the proposed approach utilizes Huffman encoding in steganography due to the following benefits40,41:

  • Compression: Huffman encoding allows data compression without sacrificing information. When used in steganography, it can help fit more hidden data within the cover medium.

  • Obscurity: The variable-length codes generated by Huffman encoding can make the hidden data more challenging to detect, especially when the steganographic method is combined with other encryption or obfuscation techniques.

  • Efficiency: Data encoded using Huffman requires less space, optimizing the utilization of the cover medium for steganographic purposes.

  • Adaptability: Huffman encoding can be tailored to the hidden data, allowing flexibility across different steganographic methods.

Figure 3 showcases the Huffman tree corresponding to the phrase “Hello World”. The character frequencies are computed as follows: {H: 1, e: 1, l: 3, o: 2, Space: 1, W: 1, r: 1, d: 1}. The Huffman Tree is constructed based on these frequencies, from which binary codes for each character are derived. Notably, characters that appear with higher frequency, such as ‘l’ in this instance, are assigned shorter binary codes.
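
For illustration, a minimal Python sketch of this construction is shown below. It builds the frequency table for “Hello World” with a binary heap and derives the variable-length codes; the exact tie-breaking, and hence the specific bit patterns, may differ from those in Fig. 3.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman code table for the characters in `text` (minimal sketch)."""
    freq = Counter(text)                                  # e.g. {'l': 3, 'o': 2, 'H': 1, ...}
    if len(freq) == 1:                                    # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: [frequency, tie-breaker, [symbol, code], [symbol, code], ...]
    heap = [[f, i, [ch, ""]] for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)                          # two least frequent subtrees
        hi = heapq.heappop(heap)
        for pair in lo[2:]:
            pair[1] = "0" + pair[1]                       # left branch gets a leading 0
        for pair in hi[2:]:
            pair[1] = "1" + pair[1]                       # right branch gets a leading 1
        heapq.heappush(heap, [lo[0] + hi[0], count] + lo[2:] + hi[2:])
        count += 1                                        # unique tie-breaker keeps the heap comparable
    return {ch: code for ch, code in heap[0][2:]}

codes = huffman_codes("Hello World")
print(codes)                                              # frequent characters such as 'l' get shorter codes
print("".join(codes[ch] for ch in "Hello World"))         # the bitstring to be embedded
```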

Huffman encoding, a lossless compression technique, balances efficiency and security better than alternatives like RLE, LZW, and Arithmetic coding. Unlike RLE, which is limited to repetitive data, and LZW, which can introduce detectable redundancies, Huffman encoding adapts to data frequency with variable-length coding. While Arithmetic coding achieves better compression, its computational intensity makes Huffman encoding a more resource-efficient choice for steganographic applications. Huffman encoding enhances steganography by compressing data to minimize cover medium modifications, maintaining imperceptibility and high image fidelity. Its adaptive nature complements LSB’s nonadaptive approach, enabling higher payload capacity with minimal distortion. Unlike fixed-length compression methods, Huffman encoding reduces space requirements and avoids detectable patterns, making hidden data harder to detect.

Fig. 3: Huffman tree for “Hello world”.

In the proposed multi-layered steganographic approach, Huffman coding is the first step in preparing the data for embedding. While Huffman coding is traditionally celebrated for its compression efficiency with large datasets, its inclusion in our method serves a dual purpose: (1) optimizing the size of the embedded data when possible and (2) enhancing the robustness and security of the steganographic framework by obfuscating statistical patterns in the payload.

For small payloads, such as a watermark or secret information, the compression effect of Huffman coding may be limited or negligible due to the relatively high overhead of encoding the Huffman dictionary. To mitigate this, the proposed framework implements the following strategies:

  • Selective application: Huffman coding is applied selectively based on the variability and redundancy in the data. If the data volume is insufficient to achieve meaningful compression, the coding step is bypassed to avoid increasing the data size unnecessarily.

  • Optimized dictionary encoding: When Huffman coding is applied, the encoding dictionary is compactly represented and embedded alongside the compressed payload. Special care ensures that the dictionary’s size remains proportional to the payload, minimizing additional overhead.

  • Security enhancement: Regardless of compression efficiency, Huffman coding introduces variability into the payload by assigning unique binary codes to different symbols. This obfuscates the statistical patterns in the data, making it more resistant to steganalysis techniques that rely on detecting predictable structures in the embedded information.

To ensure the balance between compression and overhead, the proposed framework incorporates a preprocessing step that evaluates the entropy of the payload before applying Huffman coding. This evaluation determines whether compression will result in a net reduction in size. If the estimated overhead exceeds the potential gains, Huffman coding is omitted for that particular payload, and the raw data is embedded instead. This adaptive strategy ensures that Huffman coding is used judiciously, preserving the efficiency of the steganographic system. Huffman coding is integral to the multi-layered steganographic approach, complementing the LSB embedding and deep learning encoder–decoder layers. Huffman coding enhances the system’s robustness by compressing and obfuscating the payload before embedding. Additionally, its integration ensures that the subsequent layers (LSB embedding and the deep learning model) can focus on maintaining imperceptibility and resistance to attacks without being constrained by payload size. This dual-purpose utilization of Huffman coding demonstrates its role as a preparatory step and a security-enhancing mechanism in the proposed framework. While its compression efficiency for small payloads may vary, its contribution to robustness and undetectability justifies its inclusion.
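
The entropy-based preprocessing check can be sketched in Python as follows. This is an illustrative heuristic rather than the framework’s exact rule: it estimates the compressed size from the Shannon entropy of the payload, adds an assumed dictionary cost, and reports whether compression would beat embedding the raw bits.

```python
import math
from collections import Counter

def worth_compressing(text, bits_per_char=8, per_symbol_dict_bits=16):
    """Illustrative check (not the paper's exact rule): compare an entropy-based
    estimate of the Huffman-coded size plus a rough dictionary cost against the
    size of the raw, uncompressed payload."""
    n = len(text)
    freq = Counter(text)
    # Shannon entropy (bits per character) lower-bounds the mean Huffman code length.
    entropy = -sum((f / n) * math.log2(f / n) for f in freq.values())
    est_payload_bits = math.ceil(entropy * n)
    est_dict_bits = len(freq) * per_symbol_dict_bits      # assumed cost of storing the code table
    return est_payload_bits + est_dict_bits < n * bits_per_char

print(worth_compressing("hello world"))                   # False: dictionary overhead dominates a tiny payload
print(worth_compressing("hello world " * 200))            # True: compression pays off for longer text
```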

LSB algorithm

As an example, consider using the LSB method to hide the message “HELLO” in an image. This involves modifying the RGB (Red, Green, Blue) channels of the image pixels42. The steps are as follows:

  1. Message conversion: In this stage, we convert the message to binary format; Table 2 presents the details for the example.

  2. Image selection: Choosing the cover image format to hide the message is very important. The image can be in different formats such as BMP, PNG, or JPEG, but formats such as BMP or PNG are preferred due to their lossless nature, while JPEG images can introduce compression artifacts that might disrupt the hidden message. Furthermore, any image comprises tiny units called pixels. Most digital images use the RGB model, which gives each pixel three colors: red, green, and blue43.

  3. Embedding: Replace the least significant bit of each pixel color component with a bit from the secret message. This step is repeated until the entire message is embedded within the image44. Figure 4 illustrates the example of ‘H’ embedded in the cover image. The message is embedded sequentially across the RGB channels of the image’s pixels. Depending on the message’s length and the image’s resolution, the entire message may occupy only a small portion of the image. Moreover, altering the LSB of a pixel’s color value changes the color only minimally, making the change nearly invisible to the human eye45.

  4. Reconstruction: This step finalizes the steganography process and generates the secret image. After modifying the LSB of the RGB values of each pixel, these modified binary values must be converted back into their decimal form. This conversion is necessary because image format standards and viewing software interpret the pixel colour data in decimal form (ranging from 0 to 255). Once all the essential pixels have been updated with the new RGB values, the image is reassembled46.

Table 2 Binary format of ‘HELLO’.
Fig. 4: Embedding the “H” into the LSB.

The process of extracting the message is reversed. The LSBs of the RGB values are collected in sequence, converted back from binary to decimal, and then mapped to their corresponding ASCII characters to reveal the hidden message42,43,44,45,46.
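
A minimal Python sketch of this embedding and extraction procedure is given below, using NumPy and Pillow. The sequential pixel scan and the fixed bit budget are simplifying assumptions, and a lossless output format such as PNG is assumed.

```python
import numpy as np
from PIL import Image

def lsb_embed(cover_path, bits, stego_path):
    """Write each payload bit into the LSB of successive R, G, B values,
    scanning pixels row by row (a minimal sketch)."""
    img = np.array(Image.open(cover_path).convert("RGB"))
    flat = img.flatten()
    if len(bits) > flat.size:
        raise ValueError("payload exceeds cover capacity")
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & 0xFE) | int(bit)             # clear the LSB, then set it to the payload bit
    Image.fromarray(flat.reshape(img.shape)).save(stego_path)  # save losslessly, e.g. as PNG

def lsb_extract(stego_path, n_bits):
    """Read back the first n_bits LSBs in the same scan order."""
    flat = np.array(Image.open(stego_path).convert("RGB")).flatten()
    return "".join(str(v & 1) for v in flat[:n_bits])

# 'H' = 01001000: embed its 8 bits and recover them (assumes cover.png exists locally).
bits = format(ord("H"), "08b")
lsb_embed("cover.png", bits, "stego.png")
assert lsb_extract("stego.png", len(bits)) == bits
```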

Encoder and decoder

The Encode Algorithm, as illustrated in Algorithm 1, begins by sourcing content from a given text file, forming the foundational data for concealment. Initially, the algorithm calculates the frequency of characters within the input text to generate a Huffman dictionary dynamically. This dictionary encodes the textual data, and the dictionary itself is transformed into a binary string. Combined with a unique delimiter and Huffman-encoded binary text, this string is prepared for embedding using the LSB steganographic technique. The result is a secret image that appears identical to the untrained eye but contains a concealed message. This image is then processed by a pre-trained DL encoder, resulting in the container image, where the original image and concealed data coexist seamlessly.

Algorithm 1: Encoder algorithm.
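
To make the payload-assembly step of Algorithm 1 concrete, the sketch below serializes the Huffman dictionary, appends a delimiter, and concatenates the Huffman-encoded text into the bitstring handed to the LSB layer. The serialization layout and the delimiter value are illustrative assumptions; the paper only specifies that a unique delimiter separates the dictionary from the encoded text.

```python
DELIMITER = "1111111111111110"  # illustrative marker; the paper only specifies "a unique delimiter"

def build_payload(text, codes):
    """Assemble the bitstring embedded by the first layer: serialized Huffman
    dictionary + delimiter + Huffman-encoded text (layout is an assumption)."""
    dict_bits = ""
    for ch, code in codes.items():
        # Per entry: 8-bit character, 8-bit code length, then the code itself.
        dict_bits += format(ord(ch), "08b") + format(len(code), "08b") + code
    body_bits = "".join(codes[ch] for ch in text)
    return dict_bits + DELIMITER + body_bits

# Example, reusing the huffman_codes() sketch from the Huffman section:
# payload = build_payload("Hello World", huffman_codes("Hello World"))
# The payload bits would then be written into the carrier image with lsb_embed().
```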

The neural network architecture used in the Encode and Decode Algorithms of the Deep Steganography model was developed by Baluja in 201747, as shown in Fig. 5. This model utilizes a layered approach, as follows (a simplified code sketch is provided after the list):

  • InputLayer: Handles the initial image data input.

  • PrepLayer: Prepares the secret image by enhancing features essential for robust encoding.

  • HideLayer: Combines the prepared secret image and the cover image to produce a container image with minimal perceptible changes.

  • RevealLayer: This layer in the decoding process retrieves the secret from the container image, ensuring the recovery of the original data with high fidelity.
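
The Keras sketch below outlines such a prep/hide/reveal pipeline. It is a simplified stand-in for Baluja’s architecture, not the exact network used here: the filter counts, kernel sizes, and single convolution block per stage are illustrative assumptions, and images are assumed to be scaled to [0, 1].

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_branches(x, prefix):
    """Parallel 3x3 / 4x4 / 5x5 convolution branches; filter counts are illustrative."""
    b3 = layers.Conv2D(50, 3, padding="same", activation="relu", name=prefix + "_3x3")(x)
    b4 = layers.Conv2D(50, 4, padding="same", activation="relu", name=prefix + "_4x4")(x)
    b5 = layers.Conv2D(50, 5, padding="same", activation="relu", name=prefix + "_5x5")(x)
    return layers.Concatenate(name=prefix + "_cat")([b3, b4, b5])

def build_stego_model(shape=(64, 64, 3)):
    secret_in = layers.Input(shape, name="secret")        # InputLayer: secret image
    cover_in = layers.Input(shape, name="cover")          # InputLayer: cover image

    prep = conv_branches(secret_in, "prep")               # PrepLayer: feature extraction
    hide = conv_branches(layers.Concatenate()([cover_in, prep]), "hide")
    container = layers.Conv2D(3, 3, padding="same", activation="sigmoid",
                              name="container")(hide)     # HideLayer output: container image

    reveal = conv_branches(container, "reveal")           # RevealLayer: recover the secret
    revealed = layers.Conv2D(3, 3, padding="same", activation="sigmoid",
                             name="revealed")(reveal)

    return Model([secret_in, cover_in], [container, revealed])

model = build_stego_model()
model.summary()
```

Training such a model end to end jointly optimizes the container and revealed outputs, which is what the loss function described in section “Experiments” (Eq. 1) balances.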

Fig. 5: Encoder and decoder layers architecture.

On the other hand, the Decode Algorithm begins with the container image, the product of the Encode Algorithm, as shown in Algorithm 2. Leveraging the trained DL decoder, the initial step is to extract the previously embedded steganographic image from the container image. While this intermediate image appears almost identical to its original version, it secretly harbors the Huffman-encoded textual data. The concealed Huffman codes are then recovered from the image by reversing the LSB steganographic technique. Recognizing a distinct delimiter allows the binary string to be split into the Huffman dictionary and the Huffman-encoded segments. Whether by traversing the previously established Huffman tree or by using an embedded or shared Huffman code map, the original textual content is reconstructed with full fidelity. The Decode Algorithm concludes by accurately extracting the original text, validating its integrity and authenticity.

Algorithm 2: Decoder algorithm.
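
The payload-parsing step of Algorithm 2 can be sketched as follows. It mirrors the illustrative encoder-side format (8-bit character, 8-bit code length, code bits, then a delimiter) and decodes the body greedily, which is valid because Huffman codes are prefix-free.

```python
DELIMITER = "1111111111111110"  # must match the marker assumed on the encoder side

def parse_payload(bits):
    """Invert the illustrative encoder-side payload format: split at the delimiter,
    rebuild the code-to-character map, then decode the body bit by bit.
    Assumes the delimiter does not occur earlier in the serialized dictionary."""
    dict_bits, body = bits.split(DELIMITER, 1)
    code_to_char, i = {}, 0
    while i < len(dict_bits):
        ch = chr(int(dict_bits[i:i + 8], 2)); i += 8      # 8-bit character
        length = int(dict_bits[i:i + 8], 2); i += 8       # 8-bit code length
        code_to_char[dict_bits[i:i + length]] = ch; i += length
    text, buffer = [], ""
    for bit in body:
        buffer += bit
        if buffer in code_to_char:                        # a complete codeword has been read
            text.append(code_to_char[buffer])
            buffer = ""
    return "".join(text)
```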

The two layers of steganography in the proposed method are designed to work synergistically, each contributing distinct advantages to enhance the overall robustness and security of the system. The first layer combines Huffman coding with LSB embedding. Huffman coding provides lossless data compression, reducing the payload size and obfuscating statistical patterns in the data. At the same time, LSB embedding integrates the encoded data into a cover image with minimal pixel modifications. This layer focuses on efficient data embedding and imperceptibility.

The second layer uses a deep learning-based encoder–decoder framework to embed the output from the first layer (the secret image) into a new cover image. This deep learning layer strengthens security by adaptively embedding features that closely mimic the statistical properties of the cover image, making it highly resistant to steganalysis. Together, the layers enhance the method’s ability to resist detection and recover data accurately under various attack scenarios, such as noise or compression.

The multilayer steganographic design addresses critical limitations of traditional single-layer methods, such as limited robustness and susceptibility to steganalysis. By integrating two complementary layers, the method achieves the following goals:

  • Enhanced security: The dual-layer approach introduces multiple levels of obfuscation, making it significantly more challenging for attackers to detect or retrieve hidden data.

  • Improved robustness: Each layer compensates for potential vulnerabilities of the other, ensuring resilience against noise, compression, and statistical attacks.

  • Higher capacity and efficiency: Huffman coding optimizes the payload size, while the deep learning model ensures efficient embedding without compromising the visual quality of the cover image.

  • Broader applicability: The multilayer framework supports diverse use cases, including secure communication and digital rights management, by balancing imperceptibility, capacity, and robustness.

Experiments

This section outlines various experiments conducted to validate the efficacy of the proposed steganographic technique. The method was implemented using Python as the primary programming language. For these experiments, a collection of 8-bit RGB images was employed as cover images to conceal textual data. The primary objective of the experimental analysis was to assess the visual similarities between the original images and the resulting steganography images.

Experimental setup

The experimental validation of our proposed method utilizes the TinyImageNet dataset, which serves as a condensed version of the ILSVRC-2012 classification dataset. TinyImageNet comprises 200 object classes, each providing 500 training images, 50 validation images, and 50 test images, all uniformly resized to a manageable dimension of 64\(\times\)64\(\times\)3 pixels. This dataset offers a comprehensive challenge by presenting various image categories suitable for testing the limits of our steganographic and DL methodologies48. The proposed approach is hosted by a virtualized server having the technical specifications provided in Table 3.

Table 3 Hardware and software specifications.

Moreover, the training of the DL model involves several critical parameters that dictate the effectiveness and efficiency of the learning process. Table 4 presents a comprehensive overview of these key parameters used in our experiments.

Table 4 Training parameters.

In steganography, the primary goals involve concealing secret information within a cover image and retrieving it accurately. The loss function is pivotal in training the neural network to align with two objectives: minimizing detectability and maximizing recoverability. To minimize detectability, the modifications made to embed the secret data within the cover image must be virtually imperceptible, ensuring the alterations do not compromise the visual integrity of the image. Conversely, to maximize recoverability, it is crucial that the secret data embedded can be extracted from the cover image with minimal loss of information, preserving its original quality. To effectively meet these objectives, the loss function must balance two competing factors: cover image fidelity and secret data integrity. Cover image fidelity ensures that the secret image, which includes the embedded data, closely resembles the original cover image to avoid detection. This aspect is quantitatively assessed through the cover loss. On the other hand, secret data integrity focuses on the accuracy of the data recovery process, ensuring that the extracted data closely matches the original secret data with minimal distortion assessed through the secret loss. The strategic balancing of these components in the loss function is essential for achieving a steganography system that is both discreet and reliable in data retrieval. In the proposed model, the loss function utilized is referred to as the “steganography_loss,” formulated as follows47:

$$\begin{aligned} \text {total\_loss }= \text {cover\_mse} + \beta * \text {secret\_mse} \end{aligned}$$
(1)

This equation incorporates two main components:

  • Cover MSE (cover_mse): This metric quantifies the mean squared differences between the pixel values of the original cover image and the secret image. The primary goal in minimizing this error is to ensure that the modifications introduced during the data embedding process are invisible, thus maintaining the visual integrity of the cover image.

  • Secret MSE (secret_mse): This metric evaluates the mean squared differences between the original secret data and the data recovered from the secret image. Reducing this error is crucial for ensuring that the embedded data is accurately retrieved, preserving the fidelity of the secret information.

The parameter \(\beta\) plays a critical role in this model, balancing the imperceptibility of the embedding (maintaining the original appearance of the cover image) and the reliability of the data recovery (ensuring the integrity of the secret data). The choice of \(\beta\) value dictates the trade-off between these two objectives, influencing the model’s effectiveness in achieving both steganographic concealment and data retrieval accuracy, as follows:

  • \(\beta = 0\): the model prioritizes the visual quality of the cover image, focusing solely on minimizing its reconstruction error. This configuration is used as a baseline to demonstrate the network’s capability to replicate the cover image accurately without embedding any secret data.

  • \(\beta > 0\): value shifts the focus towards the accuracy of the secret image’s recovery, which may lead to more noticeable alterations in the cover image, thus affecting its imperceptibility.

We experimented with various \(\beta\) values to observe their effect on embedding quality and data recovery accuracy. For instance, with \(\beta = 0.2\), the method achieved a PSNR of 62 dB for the cover image and an AccTxt of 90%, demonstrating a balance between imperceptibility and recovery. Increasing \(\beta\) to 0.8 improved AccTxt to 95% but reduced PSNR to 58 dB, showing a trade-off between visual quality and accuracy. In this work, we set \(\beta = 1\) to equally prioritize the cover image’s imperceptibility and the accuracy of secret data recovery. This configuration ensures minimal modifications to the cover image while maintaining high reliability in retrieving the embedded data, making it suitable for scenarios where both factors are equally critical.
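
A minimal TensorFlow sketch of this loss is shown below. It assumes the cover/container and secret/revealed tensors are already scaled to the same range, and it is a direct transcription of Eq. (1) rather than the exact training code used in this work.

```python
import tensorflow as tf

def steganography_loss(cover, container, secret, revealed, beta=1.0):
    """Eq. (1): cover-reconstruction MSE plus beta times the secret-recovery MSE;
    beta trades imperceptibility against recoverability."""
    cover_mse = tf.reduce_mean(tf.square(cover - container))
    secret_mse = tf.reduce_mean(tf.square(secret - revealed))
    return cover_mse + beta * secret_mse
```

In a custom training loop, this value would be computed on the hide and reveal outputs of each batch and backpropagated through both branches.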

Quality assessment metrics

Image steganography is a covert communication method that hides information within digital images, relying heavily on the undetectable presence of this hidden information and the preservation of image quality. To effectively evaluate the performance of various steganographic techniques, it is necessary to utilize robust quantitative measures known as Quality Assessment Metrics. These metrics assess the quality of secret images and the imperceptibility of the hidden data, making them critical for measuring the success of steganographic methods49. Thus, this section explores several widely used quality assessment metrics in image steganography, as follows (a code sketch of these metrics is provided after the list):

  • Mean Square Error (MSE): is a popular metric used in signal processing for measuring the average squared difference between two signals, typically between an original signal and a noisy or compressed version of the signal. Given two m\(\times\)n monochrome images I and K where \(1 \le i \le m, 1 \le j \le n\), the MSE is defined as50:

    $$\begin{aligned} MSE = \frac{1}{m \times n} \sum _{i=1}^{m} \sum _{j=1}^{n} [I(i,j) - K(i,j)]^2 \end{aligned}$$
    (2)

    It should be noted that a lower MSE value indicates lesser distortion and, therefore, a better-quality image or signal.

  • Peak Signal to Noise Ratio (PSNR): is another common image and signal processing metric that measures the peak error; it originated in signal processing but has been widely adopted in image processing. The mathematical representation of PSNR is51:

    $$\begin{aligned} PSNR = 20 \times \log _{10}\left( \frac{MAX_{I}}{\sqrt{MSE}}\right) \end{aligned}$$
    (3)

    Here, MAX_I is the maximum possible pixel value of the image. For an 8-bit grayscale image, the maximum possible pixel value is 255. Unlike MSE, where lower values indicate better quality, a higher PSNR indicates a higher-quality image or signal.

  • Structural Similarity Index (SSIM): is utilized for measuring the similarity between two images. The SSIM index is a decimal value between -1 and 1, where 1 indicates perfect similarity49:

    $$\begin{aligned} \text {SSIM}(x, y) = \frac{(2\mu _x \mu _y + C_1)(2\sigma _{xy} + C_2)}{(\mu _x^2 + \mu _y^2 + C_1)(\sigma _x^2 + \sigma _y^2 + C_2)} \end{aligned}$$
    (4)

    where \(\mu _x\) and \(\mu _y\) are the average intensities, \(\sigma _x^2\) and \(\sigma _y^2\) are the variances of images x and y, respectively, \(\sigma _{xy}\) is the covariance between images x and y, and \(C_1\), \(C_2\) are constants used to stabilize the division with a weak denominator.

  • Accuracy of Text Recovery (AccTxt): The percentage of text accurately decoded from the secret images, ensuring the system’s effectiveness in data recovery. It is defined as:

    $$\begin{aligned} AccTxt = \left( \frac{D}{T} \right) \times 100\% \end{aligned}$$
    (5)

    where D represents the number of accurately decoded bits, and T is the number of bits in the original secret message.

  • Payload Capacity (C): It is measured in bpp and quantifies the amount of data that can be securely embedded within the cover image. This metric is critical for evaluating the efficiency of steganographic methods in terms of data hiding52. It is computed as follows:

    $$\begin{aligned} C = \frac{\text {Total Embedded Bits}}{\text {Total Number of Pixels}} \end{aligned}$$
    (6)
    $$\begin{aligned} \text {Total Number of Pixels} = \text {Height} \times \text {Width} \times \text {Number of Channels} \end{aligned}$$
    (7)

    However, higher capacity indicates more effective utilization of the cover medium but may impact image quality, emphasizing the trade-off between capacity and imperceptibility.
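
The code sketch below implements these metrics with NumPy and scikit-image. The SSIM call assumes scikit-image 0.19 or later and 8-bit RGB arrays, and the bit-level definition of AccTxt follows Eq. (5).

```python
import numpy as np
from skimage.metrics import structural_similarity  # assumes scikit-image >= 0.19

def mse(a, b):
    return np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)        # Eq. (2)

def psnr(a, b, max_val=255.0):
    m = mse(a, b)
    return float("inf") if m == 0 else 20 * np.log10(max_val / np.sqrt(m))    # Eq. (3)

def ssim(a, b):
    return structural_similarity(a, b, channel_axis=-1, data_range=255)       # Eq. (4), 8-bit RGB

def acc_txt(decoded_bits, original_bits):
    correct = sum(d == o for d, o in zip(decoded_bits, original_bits))
    return 100.0 * correct / len(original_bits)                               # Eq. (5)

def capacity_bpp(total_embedded_bits, image_shape):
    height, width, channels = image_shape                                     # Eqs. (6)-(7)
    return total_embedded_bits / (height * width * channels)
```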

Results and discussion

This section presents and analyses the results obtained from the proposed approach, which combines Huffman coding with LSB steganography to produce a secret image as the first layer of steganography. After that, the secret image is embedded with a cover image using a DL model. To demonstrate the effectiveness of our approach, we utilized test images from the Tiny ImageNet dataset as shown in Fig. 6.

Fig. 6: Sample of Tiny ImageNet dataset.

The choice of the Tiny ImageNet dataset is central to the experimental setup and therefore merits justification. Tiny ImageNet is a subset of ImageNet that is widely used as a benchmark for various computer vision tasks, including image classification, object detection, and, more recently, steganography. It includes 200 object classes, each containing 500 training images, 50 validation images, and 50 test images, all resized to 64 \(\times\) 64 pixels. This makes the dataset less computationally burdensome while retaining enough variance in objects and features to test the robustness and generalization capability of the steganographic model. Tiny ImageNet was selected because it offers a good balance between computational efficiency and complexity for this purpose. Full-sized datasets, such as the original ImageNet, are resource-intensive and time-consuming to process, especially when training deep learning models for steganography. Tiny ImageNet was therefore deemed sufficient to present a diverse set of images to the proposed multi-layered steganographic method without overwhelming the available computing resources. The diversity of the images, from simple objects to complicated textures and scenes, exercises the model under varying visual conditions, allowing us to evaluate how well the proposed method generalizes to a wide range of image types. Furthermore, the smaller image size in the Tiny ImageNet database (64 \(\times\) 64 pixels) is suitable for steganography experiments because smaller image dimensions align with the goal of minimizing changes to the visual quality of a cover image.

While larger images might allow for greater capacity, embedding data in them could also increase distortion, thereby increasing the detectability of the hidden data. Using smaller images keeps the hidden data invisible while still allowing the model’s ability to embed and extract data to be tested across different categories. The applicability of the dataset can be further assessed by considering how results obtained on Tiny ImageNet might generalize to other types of images or data. Since Tiny ImageNet is a natural image dataset, results are likely to generalize well to other natural image datasets commonly featured in steganography research, such as CIFAR-10 or CelebA. If the goal were to extend the method to different domains, such as medical images or satellite imagery, further testing on domain-specific datasets would be necessary.

Quality analysis of the first layer

As illustrated in Fig. 7, this technique can be executed with such subtlety that alterations remain visually imperceptible. The left portion of the figure displays the pristine, original image, whereas the right portion reveals its counterpart, which has been meticulously modified to embed additional data. This data integration is achieved through the synergistic application of Huffman coding and the LSB steganography. The encoded phrase ‘hello world,’ despite its presence within the image’s data structure, does not compromise the visual integrity of the image. This exemplifies the profound capacity of steganographic strategies to obscure data effectively.

Fig. 7: Left: Original image. Right: Image with data encoded with ‘hello world’ using Huffman coding and LSB steganography.

Table 5 analyses the effectiveness of the first layer in terms of MSE, PSNR, SSIM, AccTxt, and C for the secret images produced by the first layer of steganography. The quality metrics demonstrate that the first layer, which uses Huffman coding with LSB, effectively embeds secret data into cover images with minimal distortion and high fidelity. The low MSE and high PSNR values indicate that the visual quality of the cover images remains largely unaffected. The near-perfect SSIM scores further confirm that the structural properties of the images are preserved, ensuring imperceptibility. Moreover, the 100% text recovery accuracy underscores the reliability of the embedding process, ensuring that the secret information can be securely and accurately retrieved. The C (bpp) metric highlights the efficiency of the embedding process by quantifying the amount of data hidden per pixel, balancing imperceptibility and data payload.

Table 5 Quality metrics for secret image.

To evaluate the robustness and reliability of the first layer in embedding and recovering text information, experiments were conducted using text payloads of varying lengths. Specifically, the lengths tested ranged from 50 to 500 characters, representing use cases ranging from small watermarks to longer embedded messages. The results demonstrate that the proposed multi-layered framework consistently achieved 100% recovery accuracy (AccTxt) for all tested text lengths under standard conditions, as detailed in Table 6.

Table 6 Recovery accuracy for different text lengths.

The high recovery accuracy highlights the efficacy of the Huffman encoding and LSB embedding techniques used in the first layer. Additionally, the structural fidelity of the secret-images, indicated by SSIM values consistently exceeding 99%, ensures that the visual quality of the cover image remains unaffected regardless of the length of the embedded text. These results underscore the system’s ability to handle a wide range of text payload sizes without compromising the visual imperceptibility or recoverability of the hidden data. The framework’s adaptability to varying payload sizes ensures its suitability for diverse real-world applications, such as watermarking and secure communication.

Quality analysis of the second layer

Figure 8 illustrates the visual outcomes of the steganographic embedding and recovery process across different epochs (50, 100, and 200). The figure is organized in rows representing six different image pairs, each consisting of a secret image and its corresponding cover image. The columns indicate the container and reveal images at 50, 100, and 200 epochs.

Fig. 8: Visual results of steganographic embedding and recovery across epochs.

The container images at 50 epochs exhibit noticeable degradation, with evident distortions compared to the original cover images. The revealed images, the results of the recovery process, also show significant visual artifacts, indicating moderate performance at this training stage. The container images at 100 epochs display improved visual quality, with fewer distortions than at 50 epochs. The revealed images demonstrate substantial enhancement in clarity and resemblance to the original secret images, reflecting the model’s improved embedding and recovery capabilities. Moreover, at 200 epochs, the container images maintain a level of visual quality similar to that observed at 100 epochs, indicating stability in the embedding process. However, the revealed images show mixed results; while some retain high fidelity, others exhibit slight artifacts, suggesting potential overfitting or variability in the model’s performance. This highlights the trade-offs between training duration and the visual fidelity of the embedded and recovered images, emphasizing the importance of optimizing the number of epochs to achieve the best balance between embedding quality and data retrieval accuracy.

In Fig. 8, the results at 200 epochs reveal noticeable colour changes in the container image, which should ideally remain visually similar to the cover image. These alterations may stem from the limited size of the cover images used in this study, which were constrained to 64\(\times\)64 pixels. Smaller cover images inherently have less redundancy available to embed additional information, increasing the likelihood of visible distortions when hiding another image. The limited capacity of such small images poses challenges to achieving a balance between imperceptibility and embedding robustness. This issue highlights a potential limitation of the current implementation, where the trade-off between embedding capacity and visual fidelity becomes more pronounced as the size of the cover image decreases. While the proposed method performs effectively under standard conditions, the constraints of the cover image size can reduce its ability to maintain imperceptibility at higher embedding demands or epochs.

To address this limitation in future work, we propose exploring larger cover images, which provide greater redundancy and allow for more seamless integration of hidden data. Additionally, advanced embedding techniques could be employed to optimize the distribution of hidden information across the image, reducing the risk of visible artifacts. These improvements would further enhance the robustness and imperceptibility of the steganographic method, ensuring its effectiveness across a broader range of use cases.

The training and validation loss curves depicted across 100 epochs in Fig. 9 reveal significant insights into the performance dynamics of the steganographic model. Initially, a steep decline in loss for both the Hide and Reveal Layers illustrates rapid learning and effective adaptation to embedding and recovering data. This trend stabilizes quickly, indicating that the model achieves a stable state with minimal error early in the training process. Notably, the training losses closely mirror the validation losses, suggesting good generalization across unseen data without substantial overfitting. However, the Reveal Layer exhibits occasional spikes in validation loss, implying potential challenges in consistent data recovery across varying validation scenarios. These fluctuations underscore the need for further refinement in the model’s architecture or training strategy to enhance robustness and ensure reliable performance across diverse datasets and conditions.

Fig. 9: Loss progression across epochs for the training and validation phases of the steganographic model. (a) Comprehensive loss, representing overall system efficacy. (b) Hide Layer loss, illustrating the model’s performance in embedding data. (c) Reveal Layer loss, highlighting the model’s capability in accurately recovering the embedded data.

The loss curves collectively underscore the model’s capability to achieve low error rates quickly, with the Hide Layer demonstrating robust performance in data embedding and the Reveal Layer showing areas for potential improvement in data recovery consistency. The quality metrics for container images embedded using the proposed steganographic approach were evaluated over 50, 100, and 200 epochs. Figure 9 shows that the fluctuations in the Reveal Layer loss reflect the variability of the model’s performance in recovering hidden data correctly, and they suggest that the model may underperform in maintaining consistent recovery across different epochs. The Reveal Layer is expected to recover the hidden information from the container image, and fluctuations in its loss curve show that this process is not consistently achieved. A flat or consistently decreasing loss curve would typically signify that the model learns to recover the hidden information more precisely over time, whereas spikes or fluctuations suggest the model has difficulty generalizing from one sample or instance to others.

Table 7 summarizes MSE, PSNR, SSIM, AccTxt, and C for each image at different training stages. At 50 epochs, the results indicate moderate performance of the steganographic model. The MSE values are relatively high, particularly for Image 3 (156,070.25%) and Image 1 (78,504.79%), suggesting that the embedded images differ significantly from the original cover images. The PSNR values are on the lower side, with Image 3 having the lowest PSNR of 16.20 dB, indicating a noticeable degradation in image quality. The SSIM ranges from 74.49% (Image 4) to 92.95% (Image 5), showing that while some images maintain structural similarity, others are more visibly altered. The text recovery accuracy (AccTxt) hovers around 50%, indicating that only half of the embedded data could be accurately retrieved at this training stage. The C, measured in bpp, reflects the efficiency of data embedding, with values ranging from 1.80 to 1.90 at 50 epochs. These results demonstrate a trade-off between embedding capacity and image quality, emphasizing the need for optimization in higher training stages.

Table 7 Quality metrics for the container image.

One possible reason for these variations is the relative complexity of the embedded data or cover images encountered during training. The Reveal Layer may perform well on images with simple patterns or low complexity, yielding relatively low loss values in those epochs, while more complex or varied data embeddings require more effort to maintain performance, resulting in higher loss values. This variability in the Reveal Layer loss also suggests that the model recovers data effectively in many cases, but that more difficult cases may require additional training iterations to handle consistently. Additionally, these fluctuations could indicate overfitting or underfitting. An overfitted model may perform excellently on training data yet fail to generalize to unseen or complex images, producing abrupt loss increases on different datasets. Underfitting, on the other hand, would imply that the model has not learned the underlying patterns necessary for correct data recovery, which is likewise evidenced by irregular loss values. The architecture could be further tuned, for example by adjusting the number of epochs or applying regularization techniques, to dampen these fluctuations and obtain a more stable and reliable Reveal Layer.

With 100 epochs, the model demonstrates a significant enhancement in embedding quality. The MSE values show a notable reduction across all images, with Image 2 exhibiting the lowest MSE of 20,572.22%. Concurrently, the PSNR values improve markedly, with the highest PSNR recorded for Image 5 at 25.98 dB, indicating superior image quality preservation compared to the 50-epoch stage. The SSIM values are also higher, with several images surpassing the 90% threshold, signifying improved structural fidelity. Notably, Image 5 achieves the highest SSIM of 96.37%. Additionally, AccTxt remains around 50%, with a slight improvement observed in some images, suggesting incremental gains in the model’s capability to retrieve embedded data accurately. At 200 epochs, the model’s performance stabilizes, yielding mixed results. The MSE values exhibit a slight increase for some images compared to 100 epochs, suggesting potential overfitting or increased model complexity that does not translate into improved embedding quality. The PSNR values remain relatively stable, with Image 4 reaching 25.07 dB, indicating sustained image quality. The SSIM values remain high for most images, though slight decreases are observed for Image 3 (81.50%) and Image 4 (79.16%), indicating some variability in structural similarity. AccTxt remains consistent at around 50%, suggesting that training beyond 100 epochs does not significantly enhance the accuracy of data retrieval.

The training results demonstrate that the proposed steganographic approach substantially improves image quality and structural similarity up to 100 epochs. Beyond this point, the benefits plateau, and in some cases, performance metrics slightly regress, possibly due to overfitting. The accuracy of text recovery remains a challenge, suggesting that while the model effectively embeds and maintains the visual integrity of images, retrieving the exact embedded data requires further optimization.

Security and robustness

Steganalysis is the science of detecting hidden information, and the security and reliability of a steganographic method are key indicators of its practical effectiveness. The first criterion is security, meaning the method’s ability to prevent unauthorized parties from detecting or accessing the concealed data. The second is robustness, which concerns the method’s resistance to various attacks and distortions without compromising the confidentiality or integrity of the concealed data. In this section, we analyze the proposed multi-layered steganographic approach against the criteria of security and robustness. The evaluation assesses performance and effectiveness under statistical attacks and measures resistance to image processing attacks such as noise addition and compression, among others. We use qualitative and quantitative measures to analyze the method’s efficiency under various conditions, with particular emphasis on the effectiveness and feasibility of the approach in providing resistant and safe data embedding.

Resistance to statistical attacks

Statistical attacks aim to detect hidden data by analyzing the statistical properties of the steganographic medium; histogram plots, in particular, can reveal traces of the data hidden within53,54. Moreover, the difference histogram provides a quantitative measure of the changes introduced by the embedding process. By comparing the histograms of the cover and secret images, one can assess the steganographic method’s effectiveness in maintaining the cover image’s statistical integrity55.

In our approach, the use of Huffman coding and LSB steganography ensures minimal alteration to the pixel values, thereby preserving the statistical distribution of the cover image. The DL model further enhances this by learning to embed data that closely mimics the original image characteristics. Figure 10 illustrates the histogram analysis of the cover images before and after embedding the secret data, along with the difference histogram representing the discrepancies between the pixel value distributions of the cover and secret images for images ID 1 and 2. The histograms for the other images are provided in Appendix A, Fig. 13.
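As an illustration, histograms and a difference histogram of the kind shown in Fig. 10 can be produced along the following lines. This is a minimal sketch: the bin count, value range, and plotting layout are assumptions rather than the exact settings used for the figure, and the function accepts any two images whose pixel-value distributions are to be compared (e.g. cover versus secret, as in Fig. 10).

```python
import numpy as np
import matplotlib.pyplot as plt

def histogram_analysis(image_a: np.ndarray, image_b: np.ndarray,
                       bins: int = 256) -> np.ndarray:
    """Plot the two histograms and their per-bin difference; return the difference."""
    h_a, edges = np.histogram(image_a.ravel(), bins=bins, range=(0, 255))
    h_b, _ = np.histogram(image_b.ravel(), bins=bins, range=(0, 255))
    diff = h_b.astype(np.int64) - h_a.astype(np.int64)

    centers = (edges[:-1] + edges[1:]) / 2
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))
    axes[0].bar(centers, h_a, width=1.0)
    axes[0].set_title("Histogram of image A")
    axes[1].bar(centers, h_b, width=1.0)
    axes[1].set_title("Histogram of image B")
    axes[2].bar(centers, diff, width=1.0)
    axes[2].set_title("Difference histogram")
    for ax in axes:
        ax.set_xlabel("Pixel value")
    axes[0].set_ylabel("Frequency")
    fig.tight_layout()
    return diff
```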

Fig. 10

Histogram analysis of cover and secret images, along with the difference histogram between them for images 1 and 2.

To further substantiate the robustness of the proposed method, we evaluated its resistance to advanced steganalysis algorithms such as WOW (Wavelet Obtained Weights) and SRM (Spatial Rich Model)56. These methods, known for their effectiveness in detecting steganographic artifacts, were tested using a machine learning-based classifier trained on feature sets extracted from both cover and secret images. Results indicated that the proposed approach significantly reduces detection accuracy, achieving detection rates of 62.3% for WOW and 59.8% for SRM, compared to over 87% and 91% for traditional LSB methods, respectively. These outcomes underscore the robustness of the multi-layered approach in obfuscating statistical artifacts. As a result, the histograms reveal no significant deviations, and the low detection rates indicate that the embedded data remains statistically indistinguishable from the original cover images, thwarting statistical detection attempts.
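The detection rates above come from a machine learning-based classifier operating on steganalysis feature sets. The WOW and SRM feature extraction itself relies on dedicated steganalysis tooling and is not reproduced here; under that assumption, the sketch below only illustrates how a detection rate could be estimated from pre-computed feature vectors with a scikit-learn SVM. The classifier choice and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def detection_rate(cover_features: np.ndarray, stego_features: np.ndarray,
                   seed: int = 0) -> float:
    """Train a binary steganalysis classifier and return its test accuracy.

    cover_features / stego_features: (n_images, n_features) arrays produced
    by an external feature extractor (e.g. SRM); assumed, not implemented here.
    """
    X = np.vstack([cover_features, stego_features])
    y = np.concatenate([np.zeros(len(cover_features)),
                        np.ones(len(stego_features))])
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))
```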

We also evaluated the robustness of the proposed multilayer steganographic method against SRNet, a state-of-the-art deep learning steganalysis model designed to detect hidden data in images by learning subtle statistical differences between cover and secret images. The results demonstrate that the proposed method reduces the detection accuracy of SRNet to approximately 70%, compared to detection rates exceeding 90% commonly observed for traditional LSB steganography. This improvement can be attributed to the dual-layer design: the Huffman coding in the first layer introduces variability and obfuscates statistical patterns in the embedded data, while the deep learning encoder–decoder adaptively embeds features that mimic the natural characteristics of the cover image, making it harder for SRNet to distinguish between cover and secret images. These findings highlight the effectiveness of the proposed approach in resisting advanced steganalysis models. Nonetheless, the 70% detection accuracy against SRNet suggests room for further optimization, such as refining the embedding strategy or incorporating adversarial training to enhance resistance further.

Robustness against noise and compression

To evaluate the robustness of the proposed steganographic method, the embedded images were subjected to standard image processing operations, including Gaussian noise addition and JPEG and PNG compression. These operations simulate real-world scenarios where images might undergo various transformations during transmission or storage57. Gaussian noise addition introduces random noise to the image, mimicking the effects of sensor noise or environmental interference58. JPEG compression reduces the image file size through lossy compression, which can introduce artifacts and degrade image quality during storage or transmission59. In contrast, PNG compression uses a lossless approach, preserving the structural integrity of the image while reducing file size, which makes it particularly suitable for scenarios where maintaining high image quality is essential, such as archival storage or high-fidelity image transmission60. The MSE, PSNR, SSIM, and AccTxt metrics were employed to assess the quality of the retrieved secret data after each attack. Together, these metrics quantify the degradation in image quality, the preservation of structural information, and the accuracy of data recovery, enabling a comprehensive evaluation of the method’s robustness.

Table 8 summarizes the robustness metrics under different noise levels and compression ratios. Gaussian noise was applied at variances of 0.01 and 0.05 to evaluate the impact of varying noise intensities. The results show an increase in MSE and a decrease in PSNR and SSIM as the noise level increases, indicating the method’s sensitivity to higher noise levels. For compression, PNG compression (lossless) and JPEG compression (lossy) were evaluated to assess their impact on the secret images. PNG compression yields the lowest MSE and the highest PSNR and SSIM values, demonstrating its minimal effect on image quality due to its lossless nature. JPEG compression was tested at 90% and 70% quality levels to simulate lossy transformations; higher compression ratios (lower quality levels) led to increased MSE and decreased PSNR and SSIM, illustrating how the method behaves under lossless versus lossy compression settings.
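The attack pipeline behind Table 8 can be approximated as follows. This is a hedged sketch: it uses scikit-image’s `random_noise` for Gaussian noise at the stated variances and Pillow for in-memory JPEG (quality 90/70) and PNG re-encoding; details such as the clipping of noisy values are assumptions, and the retrieved data would afterwards be scored with the same MSE/PSNR/SSIM/AccTxt functions as above.

```python
import io
import numpy as np
from PIL import Image
from skimage.util import random_noise

def add_gaussian_noise(img: np.ndarray, var: float) -> np.ndarray:
    """Add zero-mean Gaussian noise with the given variance (8-bit input)."""
    noisy = random_noise(img, mode="gaussian", var=var)  # float in [0, 1]
    return (noisy * 255).astype(np.uint8)

def recompress(img: np.ndarray, fmt: str, quality: int = 90) -> np.ndarray:
    """Re-encode the image as JPEG (lossy) or PNG (lossless) in memory."""
    buf = io.BytesIO()
    pil = Image.fromarray(img)
    if fmt.upper() == "JPEG":
        pil.save(buf, format="JPEG", quality=quality)
    else:
        pil.save(buf, format="PNG")
    buf.seek(0)
    return np.array(Image.open(buf))

# Attack set matching the settings reported in Table 8.
attacks = {
    "gaussian_0.01": lambda im: add_gaussian_noise(im, 0.01),
    "gaussian_0.05": lambda im: add_gaussian_noise(im, 0.05),
    "png":           lambda im: recompress(im, "PNG"),
    "jpeg_q90":      lambda im: recompress(im, "JPEG", 90),
    "jpeg_q70":      lambda im: recompress(im, "JPEG", 70),
}
```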

Table 8 Robustness metrics under noise and compression.

The experimental results presented in Table 8 highlight the robustness of the proposed steganographic method under different attack scenarios, including Gaussian noise and JPEG compression. However, the observed AccTxt of 50% highlights the challenge of accurately retrieving embedded data under such conditions. This reduced accuracy can be attributed to the significant alterations in pixel values caused by noise and compression, which disrupt the embedded data patterns. Gaussian noise introduces random variations that mimic environmental interference, while JPEG compression, especially at lower quality levels, introduces artifacts that degrade both image quality and the integrity of the embedded information.

The dual-layered approach employed in the proposed method integrates Huffman coding for data obfuscation and LSB embedding to balance imperceptibility and robustness. While this design ensures that the hidden data remains undetectable under normal conditions, it can amplify sensitivity to extreme distortions. The structural integrity of the cover image is preserved, as reflected in high SSIM values. Still, the accuracy of text recovery is affected under intense noise or compression, revealing a trade-off inherent in the multi-layered embedding strategy. Despite these challenges, the method demonstrates resilience, maintaining moderate data recovery even under adverse conditions. This highlights its potential for secure communication and data hiding in practical scenarios. Nevertheless, improving the recovery mechanism to achieve higher resilience in extreme conditions remains a promising area for future research. Optimizing the deep learning model and embedding strategy could enhance robustness and ensure better performance in retrieving embedded data under varying levels of noise and compression.

Security against unauthorized access

The dual-layered approach of combining Huffman coding with deep learning-based steganography provides an added layer of security. The initial Huffman coding compresses the data, making it less recognizable, while the deep learning model embeds this compressed data non-trivially. Even if an adversary suspects the presence of hidden data, retrieving it without the proper decoding mechanism becomes highly challenging. The complexity of the deep learning model adds a cryptographic element to the steganographic process, enhancing security against unauthorized access.

To demonstrate the robustness and security of our approach, we compared it against existing steganographic techniques. Table 9 presents a comparative analysis highlighting the robustness to attacks and resistance to unauthorized access.

Table 9 Comparative analysis of security and robustness.

The proposed method demonstrates superior performance across all evaluated criteria, underscoring its effectiveness in ensuring data security and robustness. By preserving the statistical properties of the cover image, maintaining high data integrity under noise and compression, and providing robust protection against unauthorized access, the proposed approach proves to be a highly effective steganographic technique.

Table 10 summarizes the different steganographic methods discussed in Kurul’s work, describing their critical features, advantages, and disadvantages. These techniques are compared on the performance metrics of security, robustness, computational efficiency, and payload capacity. Classical LSB steganography techniques are simple and easy to implement but lack security and capacity. More powerful methods, such as CNN-based and GAN-based steganography, significantly improve robustness and allow higher payload capacity, but usually at a higher computational cost and complexity. The multi-layer approach using Huffman encoding, LSB, and deep learning is distinctive in addressing several limitations of conventional approaches: it strikes an efficiency-security balance by using Huffman encoding for compression and LSB for first-layer data embedding.

Table 10 Comparison of steganographic methods.

The design of the second layer uses deep learning to further reinforce security against steganalysis. The multi-layer approach can therefore keep the hidden information undetectable while preserving the visual quality of the cover image and resisting detection by adversarial models. One of the most important benefits of the proposed approach is that it brings together lossless compression, high data capacity, and security. Adding Huffman encoding allows more data to be hidden in the same cover medium than with traditional LSB-based approaches, while the deep learning model increases the system’s robustness against steganalysis. This combination is rarely achieved by earlier techniques, which typically offer either high payload capacity or high security, but seldom both; the proposed methodology balances the two, making it well suited to secure, high-capacity steganographic applications.

Moreover, the proposed approach avoids the drawbacks of time-consuming techniques based on GANs and other CNN-based approaches. Although GANs are very robust, they are prone to instability and require substantial computation during training. By integrating Huffman encoding with a lighter deep learning architecture, the proposed method achieves a comparable level of security and robustness at lower computational overhead than such competing methods, making it more suitable for practical applications.

In short, the proposed multi-layer approach makes several significant contributions: it combines the strengths of lossless compression, high payload capacity, and deep learning-based robustness to offer a sophisticated solution to the steganography problem. This makes it well suited for applications requiring secure data hiding and efficient utilization of cover media, distinguishing it from more heuristic or computationally complex approaches. Careful tuning and optimization further reduce the risk of overfitting and support stable, reliable performance across different datasets.

Effect of Huffman coding on small payloads

An empirical analysis addressed the concern about the limited compression effect of Huffman coding for small payloads. The data size before and after applying Huffman coding to watermark information of varying sizes was compared. The results are summarized in Table 11.
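The size comparison in Table 11 can be reproduced with a compact Huffman coder. The sketch below builds a Huffman code table with `heapq`, counts the encoded payload bits, and adds a simple code-table overhead of 16 bits per distinct symbol; that overhead accounting is an assumption, since the serialization of the code table is not specified in the paper.

```python
import heapq
from collections import Counter

def huffman_code_table(text: str) -> dict:
    """Build a Huffman code table (symbol -> bit string) for `text`."""
    freq = Counter(text)
    if len(freq) == 1:                      # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    # Heap items: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in c1.items()}
        merged.update({ch: "1" + code for ch, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

def sizes_with_and_without_huffman(text: str) -> tuple:
    """Return (raw_bits, huffman_bits), counting an assumed 16-bit
    code-table entry per distinct symbol as serialization overhead."""
    table = huffman_code_table(text)
    payload_bits = sum(len(table[ch]) for ch in text)
    table_bits = 16 * len(table)
    return 8 * len(text), payload_bits + table_bits

for payload in ["short secret", "a" * 50,
                "some considerably longer watermark text " * 20]:
    raw, compressed = sizes_with_and_without_huffman(payload)
    print(f"{len(payload):5d} bytes -> raw {raw} bits, huffman {compressed} bits")
```

For very small payloads the code-table overhead dominates, reproducing the slight size increase noted in the first observation below.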

Table 11 Effect of Huffman coding on data volume for small payloads.

The impact of Huffman coding on payload compression and robustness was analyzed across various payload sizes. The results reveal notable trends highlighting the trade-offs between compression efficiency and the security benefits of obfuscation. These observations are summarized as follows:

  • Limited compression for tiny payloads: For payloads smaller than 100 bytes, Huffman coding introduced a slight increase in data size due to the overhead of encoding the dictionary. This is expected, as smaller payloads lack sufficient redundancy to leverage the full benefits of Huffman coding.

  • Improved compression with larger payloads: As the payload size increases beyond 100 bytes, Huffman coding achieves meaningful compression, with a compression ratio of up to 10%. This demonstrates that Huffman coding can be effective when the data contains sufficient variability and redundancy.

  • Trade-Off between compression and robustness: Even when Huffman coding introduces a marginal increase in data volume for small payloads, it contributes to the robustness of the embedded information. By obfuscating statistical patterns in the payload, Huffman coding makes the data more resistant to steganalysis techniques that rely on detecting predictable structures.

Key performance metrics such as imperceptibility (measured by Structural Similarity Index Measure, SSIM), robustness (measured by detection rate under steganalysis), and payload capacity were measured to evaluate the overall impact of Huffman coding. The results are presented in Table 12.

Table 12 Performance metrics with and without Huffman coding.

Although Huffman coding may not consistently achieve significant compression for small payloads, its inclusion in the proposed method is justified due to the following reasons:

  • Security enhancement: Huffman coding obfuscates statistical patterns, making the embedded data less detectable by steganalysis techniques.

  • Minimal overhead: For most payload sizes, the overhead introduced by Huffman coding is marginal and outweighed by its contributions to robustness and payload capacity.

  • Complementary role: Huffman coding acts as a preparatory step, optimizing the payload for subsequent embedding layers (e.g., LSB embedding and deep learning encoder–decoder).

The results validate that while Huffman coding may not always significantly reduce data volume for small payloads, it contributes substantially to the robustness and security of the proposed multi-layered steganographic system. The trade-off between compression efficiency and security benefits makes Huffman coding a valuable framework component.

Comparison with existing methods

To validate the effectiveness of our proposed multi-layered approach, we conducted extensive quantitative comparisons with existing steganographic methods using the same test dataset and evaluation metrics.

The quantitative evaluation of image quality metrics across different steganographic methods is presented in Table 13. This comparison encompasses traditional LSB steganography, CNN-based approaches, GAN-based methods, and our proposed multi-layered technique. Three standard image quality metrics are utilized for assessment: MSE, PSNR, and SSIM. These metrics collectively provide a comprehensive evaluation of the secret-images’ visual quality and structural preservation.

Table 13 Comparison of image quality metrics with SOTA using Tiny-ImageNet dataset.

As shown in Table 13, our proposed method achieves superior performance across all three quality metrics. The significantly lower MSE (4.30%) indicates minimal distortion in the secret images compared to existing methods. The higher PSNR value (61.80 dB) demonstrates better signal quality preservation, while the near-perfect SSIM score (99.99%) confirms excellent structural conservation of the original image content. These results quantitatively validate the effectiveness of our multi-layered approach in maintaining image quality while embedding secret information.

The evaluation of payload capacity, computational efficiency, and security metrics across different steganographic approaches is presented in Table 14. This comparison provides a comprehensive assessment of three critical performance aspects: the amount of data that can be embedded (measured in bits per pixel), the computational overhead of the embedding process (measured in seconds), and the security against steganalysis (measured by detection rate).

Table 14 Comparison of payload capacity, embedding time, and detection rate with SOTA using Tiny-ImageNet dataset.

As illustrated in Table 14, the proposed method demonstrates superior performance in payload capacity while maintaining competitive computational efficiency. The achieved payload capacity of 2.6 bpp represents a significant improvement over traditional LSB (1.0 bpp) and modern deep learning-based approaches. Although the embedding time (1.92s) is higher than traditional LSB methods (0.42s), it remains competitive with CNN-based approaches (1.84s) and shows improvement over GAN-based methods (2.36s). Notably, the lower detection rate of 59.8% indicates enhanced resistance to steganalysis compared to existing methods, validating the security benefits of our multi-layered approach.

The robustness evaluation across different steganographic methods under various attack scenarios is presented in Table 15. This comparison analyzes the resilience of each method against typical image processing operations and attacks, specifically Gaussian noise addition and JPEG compression, while measuring AccTxt under these conditions.

Table 15 Comparison with SOTA based on robustness metrics under Gaussian noise and JPEG compression attacks using Tiny-ImageNet dataset.

As shown in Table 15, the proposed method exhibits superior robustness against both types of attacks. Under Gaussian noise, our method maintains a PSNR of 57.50 dB, significantly outperforming traditional LSB (48.24 dB) and other deep learning-based approaches. Similarly, when subjected to JPEG compression at 70% quality, our method achieves a PSNR of 52.45 dB, demonstrating enhanced resistance to compression artifacts. The high AccTxt of 94.8% further validates our approach’s robustness, showing significant improvement over existing methods. These results quantitatively demonstrate the enhanced resilience of our multi-layered steganographic technique against typical image processing operations while maintaining high fidelity in secret message recovery.

Figure 11 compares MSE, PSNR, SSIM, and AccTxt for the proposed method and other steganographic techniques. The proposed method achieves the lowest MSE and highest PSNR, indicating minimal distortion to the cover image and superior visual quality. Additionally, the SSIM value remains consistently high (close to 100%), reflecting excellent structural fidelity between the cover and secret images. Regarding robustness, the AccTxt metric, which measures the accuracy of secret data recovery, is approximately 50% for the proposed method; while slightly lower than HUGO, this reflects the balance the method strikes between imperceptibility and robustness under challenging conditions. Overall, the proposed method outperforms the traditional LSB and WOW methods in all metrics while maintaining competitive performance against HUGO. These results validate the effectiveness of the multi-layered steganographic approach in achieving a strong balance between embedding quality and security.

Fig. 11

Performance comparison across metrics using Tiny-ImageNet dataset.

On the other hand, Fig. 12 illustrates the robustness of the proposed method under varying noise levels (measured by variance) for the metrics PSNR, SSIM, and AccTxt. As the noise levels increase, the PSNR gradually declines, indicating the reduced visual quality of the secret images due to increased distortion. The SSIM metric remains relatively high throughout, demonstrating that the structural similarity between the cover and secret images is primarily preserved, even at higher noise levels. In contrast, the AccTxt metric, which measures the accuracy of secret data recovery, exhibits a noticeable drop as noise levels increase, reflecting the challenges of maintaining robust data recovery in noisy environments. These trends highlight the trade-off between imperceptibility and robustness and emphasize the proposed method’s resilience in balancing these competing factors under challenging conditions.

Fig. 12

Robustness metrics under noise levels using Tiny-ImageNet dataset.

To ensure the generalizability and robustness of the proposed method, it has been evaluated against SOTA techniques using two diverse and widely used datasets: the COCO dataset and the CelebA dataset. These datasets provide a comprehensive testbed to assess the performance of the proposed method in terms of imperceptibility, data recovery accuracy, payload capacity, and security under different image characteristics and complexities. By leveraging these benchmarks, the evaluation highlights the adaptability and effectiveness of the proposed approach across varied scenarios.

The results presented in Table 16 demonstrate the clear superiority of the proposed model over SOTA methods using the CelebA dataset. The metrics indicate that the proposed model significantly improves imperceptibility, data recovery accuracy, payload capacity, and security while maintaining robustness against steganalysis.

The proposed model exhibits the lowest MSE of 4.30 and the highest PSNR of 61.80 dB, critical indicators of minimal distortion to the cover image during the data embedding process. In comparison, traditional LSB and CNN-based methods introduce higher distortion levels, with MSE values of 8.42 and 6.18, respectively. Additionally, the near-perfect SSIM of 99.99% for the proposed method far exceeds the 92.45% achieved by traditional LSB and the 96.21% of GAN-based methods. These results emphasize the ability of the proposed technique to maintain the structural and visual quality of the stegano-images, ensuring the changes remain invisible to the human eye. This is achieved through the multi-layered embedding strategy, combining Huffman coding with a deep learning-based encoder–decoder network, optimally minimizing visual artifacts.

The proposed model delivers the highest AccTxt at 94.80%, significantly outperforming traditional LSB (80.00%), CNN-based (85.00%), and GAN-based (87.00%) techniques. This demonstrates the robustness of the proposed method in accurately retrieving hidden data even under potential image distortions or noise. The combination of Huffman encoding for data compression and a deep learning decoder ensures that the embedded data is securely stored and precisely extracted, even in complex scenarios. This reliability makes the proposed approach particularly suitable for applications demanding high data integrity, such as secure communications and digital rights management.

With a payload capacity of 2.60 bpp, the proposed model demonstrates its ability to embed more data into the cover image compared to traditional LSB (1.00 bpp), CNN-based (2.10 bpp), and GAN-based (2.40 bpp) methods. Huffman coding for lossless compression is pivotal in maximizing data capacity without compromising image quality. This efficiency makes the proposed model highly advantageous in scenarios requiring high-capacity data hiding, such as multimedia content protection and large-scale data embedding tasks.

One of the most critical aspects of the proposed method is its resistance to detection by steganalysis tools, reflected by the lowest detection rate of 59.8%. This is a significant improvement over traditional LSB (87.3%), CNN-based (68.5%), and GAN-based (64.2%) techniques. The proposed model’s multi-layered approach minimizes statistical anomalies in the stegano-images, making it highly resilient to advanced steganalysis methods. By closely mimicking the statistical properties of the original images and leveraging deep learning for adaptive embedding, the proposed method effectively obfuscates the presence of hidden data.

The superior performance of the proposed model can be attributed to its innovative integration of Huffman coding, LSB steganography, and a deep learning-based encoder–decoder network. Huffman coding optimizes data storage by compressing the payload, reducing the embedding impact on the cover image. LSB provides an efficient and computationally lightweight embedding layer, while the deep learning component enhances security and imperceptibility by adaptively embedding data in a statistically indistinguishable manner. This multi-layered design ensures a balanced trade-off between image quality, data recovery accuracy, embedding efficiency, and robustness against detection, outperforming the less sophisticated or single-layered approaches of traditional LSB, CNN-based, and GAN-based methods.

Table 16 Comparison with SOTA using CelebA dataset. Significant values are in bold.

The results presented in Table 17 demonstrate the superior performance of the proposed method compared to SOTA techniques when evaluated on the COCO dataset. The COCO dataset, known for its high variability and complex image characteristics, provides a challenging benchmark for assessing steganographic methods’ effectiveness and robustness. The proposed model significantly outperforms traditional LSB, CNN-based, and GAN-based methods across all key metrics, showcasing its ability to balance imperceptibility, robustness, and payload capacity.

The proposed method achieves the lowest MSE of 5.20 and the highest PSNR of 60.50 dB. These values indicate that the changes introduced by the embedding process are minimal, leading to superior visual quality compared to traditional LSB (MSE: 10.21, PSNR: 52.80 dB), CNN-based (MSE: 7.15, PSNR: 57.10 dB), and GAN-based (MSE: 6.80, PSNR: 58.00 dB) methods. Additionally, the SSIM of 99.50% for the proposed method is significantly higher than that achieved by the other methods, with traditional LSB at 90.75%, CNN-based at 93.80%, and GAN-based at 94.50%. This near-perfect SSIM demonstrates the proposed method’s ability to maintain structural fidelity and ensure that the stegano-images remain indistinguishable from the original cover images.

The proposed model achieves a remarkable AccTxt of 93.80%, far exceeding the recovery accuracies of traditional LSB (78.00%), CNN-based (83.00%), and GAN-based (85.50%) techniques. This improvement highlights the robustness of the proposed method in accurately retrieving hidden data despite the challenges posed by the COCO dataset’s complex image structures. The integration of Huffman coding and a deep learning-based decoder contributes to this high recovery accuracy, ensuring reliable and precise data retrieval.

One of the key strengths of the proposed model is its ability to support a high payload capacity of 2.70 bpp, which surpasses the capacities of traditional LSB (1.00 bpp), CNN-based (2.00 bpp), and GAN-based (2.30 bpp) methods. This efficiency stems from the multi-layered approach that combines Huffman coding and deep learning to maximize data embedding within the cover image while maintaining imperceptibility. The higher payload capacity makes the proposed method ideal for applications requiring large-scale data hiding, such as secure communications and multimedia content protection.

The proposed method demonstrates enhanced security with the lowest detection rate of 61.5%, significantly outperforming traditional LSB (89.0%), CNN-based (70.2%), and GAN-based (66.0%) methods. This improvement is achieved through the deep learning-driven adaptive embedding process, which closely mimics the original images’ statistical properties, effectively obfuscating the hidden data’s presence. The proposed method protects against advanced steganalysis techniques by reducing the statistical anomalies introduced during embedding.

As mentioned earlier, the superiority of the proposed model lies in its innovative combination of Huffman coding, LSB steganography, and a deep learning-based encoder–decoder framework. Huffman coding optimizes data storage by compressing the payload, reducing the embedding impact on the cover image. The LSB layer provides computationally efficient data embedding, while the deep learning component enhances imperceptibility and robustness by adaptively embedding data in a statistically indistinguishable manner. This multi-layered design enables the proposed method to achieve an exceptional balance between visual quality, data recovery accuracy, payload efficiency, and detection resistance.

Table 17 Comparison with SOTA using COCO dataset. Significant values are in bold.

Computational complexity analysis

The computational complexity of our proposed multi-layered steganographic approach arises from three primary stages: Huffman coding, LSB embedding, and the deep learning-based encoder–decoder. Below is an analysis of the computational costs:

  • Huffman coding:

    • Time complexity: \(O(n \log n)\), where \(n\) is the number of characters in the input text. This stems from sorting character frequencies and constructing the Huffman tree.

    • Space complexity: \(O(n)\), primarily due to the storage requirements of the Huffman tree and encoded data.

  • LSB embedding:

    • Time complexity: \(O(p)\), where \(p\) is the number of pixels in the image. Each pixel’s least significant bit is updated, making this process linear in the image size (a minimal sketch of this stage follows the list).

    • Space complexity: \(O(p)\), as the cover image and secret-image need to be stored.

  • Deep learning encoder–decoder:

    • Time complexity: \(O(e \cdot m \cdot c)\), where \(e\) is the number of epochs, \(m\) is the model’s number of parameters, and \(c\) is the number of computations per parameter per epoch.

    • Space complexity: \(O(m)\), as model weights and intermediate activations need to be stored during training.
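To make the linear \(O(p)\) behaviour of the LSB stage concrete, the following minimal sketch embeds a bit string into the least significant bits of a flattened 8-bit image and reads it back, touching each affected pixel exactly once. It illustrates the first embedding layer only, not the full multi-layered pipeline, and the helper names are illustrative.

```python
import numpy as np

def lsb_embed(cover: np.ndarray, bits: str) -> np.ndarray:
    """Embed `bits` ('0'/'1' string) into the LSBs of a copy of an 8-bit image."""
    flat = cover.copy().reshape(-1)
    if len(bits) > flat.size:
        raise ValueError("payload exceeds the image's LSB capacity")
    payload = np.frombuffer(bits.encode("ascii"), dtype=np.uint8) - ord("0")
    # Single pass over the affected pixels: clear each LSB, then set it.
    flat[: len(bits)] = (flat[: len(bits)] & 0xFE) | payload
    return flat.reshape(cover.shape)

def lsb_extract(stego: np.ndarray, n_bits: int) -> str:
    """Read back the first `n_bits` least significant bits."""
    flat = stego.reshape(-1)[:n_bits]
    return "".join(map(str, (flat & 1).tolist()))

cover = np.random.default_rng(0).integers(0, 256, size=(8, 8), dtype=np.uint8)
container = lsb_embed(cover, "1011001110001111")
assert lsb_extract(container, 16) == "1011001110001111"
```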

To ensure efficiency, the training phase is performed once, and the pre-trained model can be fine-tuned for new datasets, significantly reducing retraining overhead.

We compare the proposed method with traditional LSB steganography, CNN-based approaches, and GAN-based methods across critical metrics such as computational complexity, robustness, and embedding efficiency. Table 18 summarizes the findings.

Table 18 Comparison of computational complexity, robustness, and embedding efficiency. Significant values are in bold.

Key observations

  • Efficiency: The embedding time for the proposed method (1.92 s) is competitive with CNN-based methods and significantly lower than GAN-based approaches, which are computationally intensive due to adversarial training.

  • Payload capacity: Our method achieves a higher payload capacity (2.6 bpp) than all baseline methods, indicating better data embedding efficiency.

  • Robustness: The proposed method demonstrates high robustness, with a detection rate of only 59.8%, outperforming traditional and CNN-based methods while closely matching GAN-based performance.

  • Trade-offs: While GANs offer slightly higher robustness, their training and embedding processes are computationally expensive. Our method balances robustness, computational efficiency, and payload capacity.

The time complexity results highlight distinct trade-offs between computational efficiency, robustness, and payload capacity across various steganographic methods. Each approach prioritizes different aspects, and the proposed method effectively balances these considerations.

The traditional LSB method demonstrates a linear time complexity of \(O(p)\), where \(p\) denotes the number of pixels in the image. This simplicity ensures rapid embedding by directly modifying the least significant bits of pixel values, making it computationally efficient. However, this simplicity comes at the expense of robustness and payload capacity. Without sophisticated encoding or learning mechanisms, the method is highly susceptible to detection by steganalysis techniques. Consequently, while suitable for basic use cases, it is inadequate for applications requiring high security or the ability to embed complex data. The CNN-based approach, by contrast, introduces a higher time complexity of \(O(e \cdot m \cdot c)\), where \(e\) represents the number of epochs, \(m\) is the number of model parameters, and \(c\) denotes the computations per parameter during training. This complexity arises from the iterative forward and backward passes required for network optimization. Despite the increased computational cost, CNN-based methods significantly improve robustness against detection and enhance payload capacity compared to traditional LSB methods. However, the high training overhead may limit their applicability in real-time or resource-constrained scenarios.

GAN-based methods share the same theoretical complexity of \(O(e \cdot m \cdot c)\) as CNNs but involve additional computational overhead due to adversarial training. Training a generator and a discriminator adds to the number of iterations and complexity, resulting in greater computational demands. While GANs offer excellent robustness and undetectability, the instability of adversarial optimization and high computational requirements make them less practical for time-sensitive or resource-constrained environments. These methods are most effective in scenarios prioritizing undetectability over efficiency. On the other hand, the proposed multi-layered method achieves a time complexity of \(O(e \cdot m \cdot c)\), similar to CNN-based approaches but with notable optimizations. By avoiding adversarial training and leveraging Huffman coding in the first layer, the proposed approach reduces computational overhead while maintaining high robustness and payload capacity. A deep learning encoder–decoder architecture in the second layer enhances data security and embedding efficiency without introducing the instability or excessive costs associated with GANs. Moreover, once the model is trained, its embedding process is computationally efficient, making it suitable for real-world applications.

The time complexity results justify the practicality and effectiveness of the proposed method:

  • Efficiency vs. robustness trade-off: Traditional LSB methods prioritize computational simplicity but lack the robustness for secure applications. GAN-based methods, although robust, are computationally expensive. The proposed method achieves comparable robustness while maintaining lower computational costs, providing a balanced solution.

  • Applicability: The proposed method is ideal for scenarios requiring efficient embedding and high data security, such as secure communications and digital rights management. Unlike GANs, it is computationally stable and feasible, even in resource-constrained environments.

  • Scalability: The proposed method minimizes retraining overhead through pre-training and fine-tuning, ensuring scalability for new datasets with minimal additional computation.

In sum, the results validate the effectiveness of the proposed method in achieving a balanced trade-off between computational complexity, robustness, and payload capacity, emphasizing its suitability for practical, real-world applications.

Conclusion and future work

This study presents a novel multi-layered steganographic framework that integrates Huffman coding, LSB steganography, and a deep learning-based encoder–decoder to address contemporary challenges in secure data hiding. The proposed method demonstrates significant advancements in payload capacity, robustness, and imperceptibility, as evidenced by the experimental results. By combining these techniques, the framework achieves efficient and secure data embedding while maintaining the high visual fidelity of cover images. These contributions highlight the methodology’s potential for applications in secure communication, digital rights management, and covert data transmission.

The practical implications of this study are manifold. The framework’s ability to balance imperceptibility with robust recovery makes it suitable for diverse real-world applications where maintaining data integrity and security is paramount. Its adaptability to varying payload sizes and its resilience under noise, compression, and image degradation further emphasize its utility in dynamic and resource-constrained environments.

Looking ahead, several avenues for future work will refine and extend the capabilities of this approach. First, we plan to minimize retraining efforts by leveraging pre-trained models and fine-tuning specific layers, reducing the computational costs associated with training while ensuring adaptability to new datasets. The scalability and efficiency of the framework will also be enhanced through optimization techniques such as early stopping, efficient loss functions, and lightweight model architectures, enabling its application in resource-constrained scenarios. Second, advanced meta-learning and domain adaptation techniques will be explored to improve cross-dataset generalization, reducing dependency on extensive retraining and enhancing the framework’s adaptability to diverse datasets without compromising performance. Moreover, future research will optimize the system for scenarios involving smaller payloads, such as direct watermark embedding, to reduce reliance on large carrier images while maintaining high imperceptibility and robust recovery.

Future efforts will also focus on enhancing the decoder’s architecture to improve recovery accuracy under real-world conditions, aiming for near-perfect data recovery even in adverse environments involving noise, compression, or image degradation. The encoder–decoder architecture will be refined to improve performance across all evaluation metrics, strengthening the system’s efficiency and practicality. In addition, future work will explore the use of larger cover images to provide greater redundancy for seamless data integration, as well as advanced embedding techniques to optimize the distribution of hidden information, reducing visible artifacts and enhancing the robustness and imperceptibility of the steganographic method for broader applicability. By addressing these research directions, the proposed framework will become more adaptable, scalable, and computationally efficient, paving the way for broader applications and a significant impact on data security in diverse fields.