Introduction

In histopathology, frozen sections are commonly used to obtain immediate diagnosis during surgeries or other procedures, as they are fast and inexpensive to process. However, frozen sections suffer from poor image quality and artifacts and might not provide enough information for a definitive diagnosis, as the tissue may not be fully preserved, and certain features might be difficult to identify1,2,3,4. On the other hand, permanent sections are typically of higher quality than frozen sections, with fewer artifacts and better preservation of tissue architecture. This can provide a more detailed analysis of the tissue, allowing for a more accurate diagnosis and potential identification of additional features. However, permanent sections require much more preparation time and typically cannot be processed during surgery, which may delay diagnosis and treatment. Therefore, developing a method to improve frozen section images is critical.

The preparation of frozen sections involves rapidly freezing the tissue sample to preserve its structure and cellular details. This is typically done using a cryostat, which maintains the tissue at temperatures around −20 to −30 °C. The tissue is embedded in a gel-like medium, which consists of polyethylene glycol and polyvinyl alcohol. This medium ensures that the tissue remains stable and can be sliced into thin sections using a microtome. The rapid freezing process helps prevent the formation of most ice crystals that could damage the tissue, but it may still result in some artifacts and lower image quality compared to permanent sections.

In contrast, the preparation of permanent sections involves a more elaborate chemical process. The tissue is first fixed in formalin to preserve its structure and prevent degradation. It is then dehydrated through a series of alcohol baths, cleared with a solvent such as xylene, and finally embedded in paraffin wax. Once the wax hardens, the tissue block is sliced into thin sections using a microtome. These sections are then mounted on slides and stained to highlight different cellular components. The formalin fixation and paraffin embedding process provides excellent preservation of tissue architecture and cellular details, resulting in high-quality images suitable for detailed analysis.

Since the chemical preparation processes of frozen and permanent sections are different, most available datasets include unpaired frozen and permanent section images. This makes learning a mapping from frozen to permanent section images challenging, as there is no ground-truth paired permanent section to compare to. Since frozen sections often contain blank regions, it is crucial not to introduce artificial data into these areas, as this could mislead pathologists. An example of frozen and permanent section images can be found in Fig. 1. As can be seen, the sub-nuclei features are lost in the frozen section image in comparison to the permanent section image. The nuclei texture is useful for cancer diagnosis and grading, as well as for prognosis prediction5. For example, the detailed analysis of nuclear morphometry and chromatin texture has been shown to provide critical insights into the diagnosis and prognosis of hepatocellular carcinoma6, highlighting the importance of precise nuclear characteristics in enhancing the accuracy of pathological assessments. These analyses can be done with permanent sections and cannot be done with frozen sections.

Fig. 1

Examples of frozen (a) and permanent (b) section images. The frozen section has blank parts in the image, and the nuclei region almost does not contain details, compared to the permanent section. Scale bar: 50 μm.

With the rise of generative adversarial networks (GANs)7, several methods have been proposed for image-to-image translation, such as Pix2Pix8 for paired image-to-image translation and StarGAN9 for multi-domain image-to-image translation. To address unpaired image-to-image translation, other methods, such as CycleGAN8 and UNIT10, were proposed and have shown great promise in transferring images from one domain to another, including in the medical domain11,12,13,14,15. Recently, diffusion models16 were adapted for the task of image-to-image translation17,18,19.

In the histopathology domain, a few works have generated permanent section from frozen section images20 using CycleGAN with perceptual loss21, contrastive unpaired translation (CUT)22, and stain normalization (SN)23. Specifically, Ozyoruk et al.24 suggested a GAN-based model including a spatial attention block (SAB) in the generator and self-regularization (SR) loss to preserve clinically relevant features and retain the content of the input frozen images. However, using these methods, the generation of detailed texture and content inside the nuclei region, as it should appear in permanent section images, is still a challenging task, missing important diagnostic details. Additionally, these approaches generate content even in the blank regions of the frozen sections, introducing artificial information that could potentially mislead pathologists.

In this paper, we propose a new approach to enhance frozen section images under the guidance of permanent section images, using a combination of CycleGAN and a unique attention method. This new approach, called Segmented Attention Network (SAN), is based on nuclei segmentation that pushes the model to learn the details inside the nuclei region as they appear in permanent images, enhancing the frozen images to have richer data, especially in the nuclei area, yet without adding misleading details to the empty areas in the frozen section images. To achieve this, we developed a two-step training procedure, incorporating both the original pair of images and the nuclei-segmented pairs. While the task of the first step is to translate frozen images into permanent images, the second step acts as a hard-attention mechanism, transferring nuclei-segmented frozen images into nuclei-segmented permanent images. By incorporating nuclei-segmented pairs of frozen and permanent images in the training process and adding an additional loss for this process, our approach pushes the discriminator to focus more on the nuclei region, which, in turn, encourages the generator to generate more detailed content within this region, producing permanent images more accurately in these challenging regions.

Methods

Data collection and preprocessing

The data used to train our models contained breast, colon, and kidney cancer slide images obtained from The Cancer Genome Atlas (TCGA), a publicly funded project initiated by the National Cancer Institute and the National Human Genome Research Institute. Overall, we used 46,912 pairs of frozen and permanent images for breast, 25,362 pairs for colon, and 13,691 pairs for kidney for training. These pairs were prepared for each cancer type, with each pair of frozen and permanent slides acquired from the same patient to obtain the best available correspondence; exactly paired frozen and permanent images (from the same tissue slice) were not available because of the different preparation protocols. To achieve closer paired correspondence, renal tumor tissues were obtained via the Sheba Tissue Bank from patients who underwent partial nephrectomy under informed consent. Samples were taken from three kidney tumor resections (partial nephrectomies), all of which were diagnosed as conventional renal cell carcinoma. A central slice of each tumor was selected and divided into two adjacent tissue segments. One segment was submitted for frozen section, oriented with the cut surface facing up. The other segment was formalin-fixed and embedded in paraffin with the cut surface facing down, to ensure optimal alignment between the frozen and permanent sections for comparative evaluation. Trimming of the specimens was minimized to preserve tissue integrity. Frozen and permanent section slides were prepared according to the standard protocols of the Institute of Pathology at Sheba Medical Center. By maintaining patient-level pairing, we address the validation challenge of unpaired images, encouraging our model to learn clinically relevant transformations specific to each disease type rather than arbitrary tissue changes. We used the images with the highest resolution and cropped each of them into patches of 256 × 256 pixels, while removing images with significant blank parts. Afterward, the data was shuffled.
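The patch extraction described above can be illustrated with a minimal sketch. It assumes non-overlapping tiling and treats a pixel as blank when all channels are near white; the function name `crop_patches` and the thresholds `blank_level` and `max_blank_frac` are hypothetical, since the text does not state the criterion used to reject patches with significant blank parts.

```python
import numpy as np

def crop_patches(image, patch=256, blank_level=220, max_blank_frac=0.5):
    """Tile an RGB image (H, W, 3) into non-overlapping patch x patch crops,
    dropping crops whose near-white (blank) fraction exceeds max_blank_frac.
    Both thresholds are illustrative assumptions, not values from the paper."""
    h, w = image.shape[:2]
    kept = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            tile = image[top:top + patch, left:left + patch]
            # A pixel counts as blank when every channel is near white.
            blank = np.all(tile >= blank_level, axis=-1)
            if blank.mean() <= max_blank_frac:
                kept.append(tile)
    return kept

# Toy slide: left half tissue-coloured, right half blank glass.
slide = np.full((512, 512, 3), 255, dtype=np.uint8)
slide[:, :256] = (180, 90, 150)
patches = crop_patches(slide)

# Shuffle the retained patches with a fixed seed for reproducibility.
order = np.random.default_rng(0).permutation(len(patches))
patches = [patches[i] for i in order]
```

On this toy slide only the two tissue-covered tiles are retained; the two blank tiles are discarded.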

Segmented attention network (SAN) for frozen section image enhancement

Our goal is to utilize permanent images in order to improve frozen section images, placing particular emphasis on the nuclei regions to capture critical details with high accuracy.

The image-to-image translation from frozen images into permanent images was performed by CycleGAN.

We denote \(f\in F\), where \(f\) is a frozen section image taken from the frozen section set \(F\), and \(p\in P\), where \(p\) is a permanent section image taken from the permanent section set \(P\). The CycleGAN consists of two generators, \(G_{P}\) and \(G_{F}\), and two discriminators, \(D_{P}\) and \(D_{F}\). The generator \(G_{P}\) learns the mapping \(G_{P}: F\to P\), and the generator \(G_{F}\) learns the mapping \(G_{F}: P\to F\). Each of the generators has the architecture of Attention U-Net25,26, as it improves the model sensitivity with minimal computational overhead. The two CycleGAN discriminators, \(D_{P}\) and \(D_{F}\), learn to distinguish between real and generated images in the permanent and frozen section domains, respectively. Each of the discriminators has the architecture of Resnet-1827. Our loss function is:

$$\mathcal{L}\left(G_{F},G_{P},D_{F},D_{P}\right)=\mathcal{L}_{GAN}\left(G_{P},D_{P},F,P\right)+\mathcal{L}_{GAN}\left(G_{F},D_{F},P,F\right)+\lambda\,\mathcal{L}_{cyc}\left(G_{P},G_{F},F,P\right)$$
(1)

where \(\mathcal{L}_{GAN}\) is the adversarial loss, \(\mathcal{L}_{cyc}\) is the cycle-consistency loss, and \(\lambda\) is a weight that controls the relative importance of the adversarial and cycle-consistency objectives.
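To make the structure of this objective concrete, the following is a minimal NumPy sketch of the generator side of the loss. It assumes a least-squares adversarial term and an L1 cycle-consistency term, as in the original CycleGAN formulation, and a weight of 10; the text does not state which forms or which value of \(\lambda\) were used here. The function names and the toy generators and discriminators are hypothetical stand-ins for the trained networks.

```python
import numpy as np

def adversarial_loss(d_scores):
    """Least-squares generator loss (assumed form): the discriminator's scores
    on generated images are pushed toward 1, the 'real' label."""
    return float(np.mean((d_scores - 1.0) ** 2))

def cycle_loss(x, x_rec):
    """Mean absolute error between an image and its round-trip reconstruction."""
    return float(np.mean(np.abs(x - x_rec)))

def generator_objective(f, p, G_P, G_F, D_P, D_F, lam=10.0):
    """Generator-side value of the objective: two adversarial terms plus a
    weighted cycle-consistency term. lam=10.0 is an assumption."""
    fake_p, fake_f = G_P(f), G_F(p)
    l_gan = adversarial_loss(D_P(fake_p)) + adversarial_loss(D_F(fake_f))
    l_cyc = cycle_loss(f, G_F(fake_p)) + cycle_loss(p, G_P(fake_f))
    return l_gan + lam * l_cyc

# Toy batches and stand-in networks (not the Attention U-Net / Resnet-18 models).
rng = np.random.default_rng(0)
f = rng.random((2, 8, 8, 3))   # frozen batch
p = rng.random((2, 8, 8, 3))   # permanent batch
identity = lambda x: x                      # generator that changes nothing
always_fake = lambda x: np.zeros(len(x))    # discriminator that scores everything as fake
loss = generator_objective(f, p, identity, identity, always_fake, always_fake)
```

With identity generators the cycle term vanishes, and with a discriminator that rejects everything each adversarial term equals 1, so the toy objective evaluates to 2.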

To improve the generation of frozen to permanent images specifically on the nuclei regions, we introduce a new loss function:

$$\mathcal{L}\left(G_{F},G_{P},D_{F},D_{P}\right)=\mathcal{L}_{GAN}\left(G_{P},D_{P},F,P,F_{seg},P_{seg}\right)+\mathcal{L}_{GAN}\left(G_{F},D_{F},P,F,P_{seg},F_{seg}\right)+\lambda\,\mathcal{L}_{cyc}\left(G_{P},G_{F},F_{seg},P_{seg}\right),$$
(2)

where \(F_{seg}\) and \(P_{seg}\) are the nuclei-segmented images of the frozen and permanent images, respectively, which include only the nuclei contents of the image, as can be seen in Fig. 2.

Fig. 2

Examples of nuclei segmentation using StarDist on frozen and permanent patches. For the nuclei segmentation, we used the pre-trained StarDist model. Scale bar: 20 μm.

First, nuclei segmentation is done on the permanent and frozen patches using the StarDist model28, pre-trained on H&E nuclei segmentation. Next, we train the model in two steps. The first step is done on the original pairs of \(F\) and \(P\), and the second step is done on the nuclei-segmented pairs of \(F_{seg}\) and \(P_{seg}\). See the full algorithm flow in Table 1.
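As an illustration of how a nuclei-segmented image can be derived from a patch, the sketch below masks a patch with an instance-label map of the kind a nuclei segmentation model returns (0 for background, positive integers for nuclei). The helper name `nuclei_only` and the black background fill are assumptions; the text only states that the segmented images include the nuclei contents alone. A synthetic label map is used here in place of an actual StarDist prediction.

```python
import numpy as np

def nuclei_only(image, labels, background=0):
    """Keep pixels that fall inside segmented nuclei (labels > 0) and fill
    everything else with a constant background value (assumed to be black)."""
    mask = labels > 0
    out = np.full_like(image, background)
    out[mask] = image[mask]
    return out

# Toy 4 x 4 RGB patch with a single 2 x 2 "nucleus" in the centre.
patch = np.full((4, 4, 3), 100, dtype=np.uint8)
labels = np.zeros((4, 4), dtype=np.int32)
labels[1:3, 1:3] = 1
patch_seg = nuclei_only(patch, labels)
```

Applying the same masking to frozen and permanent patches yields the \(F_{seg}\) and \(P_{seg}\) inputs used in the second training step.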

Table 1 Algorithm flow—segmented attention network (SAN).

Applying the procedure in Table 1 ensures that the discriminator learns to distinguish between the cell nuclei in the frozen-section domain and those in the permanent-section domain in a hard-attention manner, while pushing the generator to improve the cell nuclei details in the enhanced frozen section image. A block diagram of the network flow is shown in Fig. 3.
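The control flow of the two-step procedure can be sketched as follows. This is only an outline under stated assumptions: it alternates the two steps once per batch, and `update` is a hypothetical callable standing in for one optimization step of the shared generators and discriminators. The actual schedule is the one given in Table 1.

```python
def train_san(pairs, seg_pairs, update, epochs=1):
    """Alternate two update steps on the same networks:
    step 1 on original (frozen, permanent) batches,
    step 2 on the corresponding nuclei-segmented batches.
    Per-batch alternation is an assumption made for this sketch."""
    log = []
    for _ in range(epochs):
        for (f, p), (f_seg, p_seg) in zip(pairs, seg_pairs):
            log.append(update(f, p, step="full"))
            log.append(update(f_seg, p_seg, step="nuclei"))
    return log

# Stub update that only records which step ran; a real one would compute the
# adversarial and cycle-consistency losses and apply the optimizer.
def record_step(a, b, step):
    return step

pairs = [("f0", "p0"), ("f1", "p1")]
seg_pairs = [("f0_seg", "p0_seg"), ("f1_seg", "p1_seg")]
history = train_san(pairs, seg_pairs, record_step)
```

Because both steps pass through the same `update`, the segmented batches influence the same discriminator weights as the full images, which is the property the hard-attention argument relies on.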

Fig. 3

Block-diagram of the proposed SAN method using CycleGAN architecture with dual-level training. The method trains simultaneously on two image pairs: original frozen/permanent sections (upper path) and nuclei-segmented frozen/permanent sections (lower path) through shared generators and discriminators. The CycleGAN framework ensures bidirectional transformation with adversarial and cycle consistency losses. The nuclei-segmented path acts as a hard-attention mechanism, compelling the generator to prioritize nuclear detail enhancement during training.

Model configuration

We trained the models on an ASUS TUF Dash F15 i7 PC with an NVIDIA GeForce RTX 3060 GPU. The model configuration, including the hyper-parameters used for each method, is presented in Table 2.

Table 2 Model configuration for each method.

Results

Grad-cam evaluation

Our method pushes the discriminator to discriminate better based on the feature of interest, which in our case is the nuclei region, and thereby drives the generator to produce more reliable content in this region. We first used Grad-cam29 to examine the influence of our method on the discriminator. Grad-cam is a visualization method that highlights the regions of an image that contribute the most to a neural network’s output, helping explain why the network made a particular decision. We compared CycleGAN with Attention UNET & Resnet against SAN, both with a Resnet-18 discriminator architecture. We generated Grad-cam maps for the discriminator \(D_{P}\). Figure 4 visually demonstrates that our method, SAN, puts more attention on the nuclei region, which pushes the generator to generate better nuclei content.

Fig. 4

Grad-cam comparison. Examples of permanent section patches and the Grad-cam maps of the discriminator \(D_{P}\). The Grad-cam maps on the second and third rows are for the discriminators of CycleGAN with Attention UNET & Resnet and SAN (our method), respectively. Using our method, SAN, the discriminator learns to put more focus on the nuclei parts of the image, while pushing the generator to put more effort into generating details in the nuclei regions as they should appear in a permanent section image. Scale bar: 20 μm.

FID score evaluation

To evaluate the similarity of our generated permanent section images to the unpaired permanent section targets, we used Frechet Inception Distance (FID)30, measuring the Wasserstein distance31 between multivariate Gaussian distributions fitted to feature representations of the generated and real images, typically using features from an Inception v3 net trained on ImageNet. FID can be computed from the mean and covariance of Inception activations on the generated and real images. Lower FID indicates greater similarity between the generated and real image distributions, making it useful for evaluating improvements in image generation quality for GANs and other generative models. The FID score comparison between our method, the CycleGAN, and CycleGAN with Attention UNET & Resnet on the test data is presented in Table 3.
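The distance between the two fitted Gaussians has a closed form, \(\lVert\mu_a-\mu_b\rVert^2+\mathrm{Tr}\left(\Sigma_a+\Sigma_b-2(\Sigma_a\Sigma_b)^{1/2}\right)\), which the sketch below evaluates with NumPy. Random toy vectors stand in for Inception activations, and the function name `frechet_distance` is hypothetical; this is not the evaluation code used for Table 3.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets,
    each of shape (n_samples, n_features)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the eigenvalues
    # of cov_a @ cov_b; tiny negative or imaginary parts are numerical noise.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Toy features: the second set is the first one shifted by a constant vector,
# so the covariances match and the distance reduces to the squared shift norm.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shift = np.array([1.0, 2.0, 0.0, 0.0])
fid_same = frechet_distance(real, real)
fid_shifted = frechet_distance(real, real + shift)
```

In the toy case the shifted set gives a distance equal to the squared norm of the shift (here 5), while identical sets give a value of zero up to rounding.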

Table 3 Comparison of FID scores between CycleGAN, CycleGAN with Attention UNET & Resnet and SAN (ours), using tissue sections from breast, colon, and kidney cancers.

Visual inspection

First, to verify that our SAN model succeeds in generating both permanent section patches from frozen section patches and nuclei-segmented permanent section patches from nuclei-segmented frozen section patches using the same model, we visually inspected the full test data, with one example shown in Fig. 5.

Fig. 5

Example of generated permanent section patch from frozen section patch and nuclei-segmented permanent section patch from nuclei-segmented frozen section patch using the same model.

For a full comparison, we present visual comparisons of the generated permanent section images using the CycleGAN, CycleGAN with Attention UNET & Resnet, and SAN, focusing on the breast, colon, and kidney tissue samples. Examples of the generated permanent section images alongside their corresponding frozen section counterparts are provided in Fig. 6. These examples illustrate the differences in image quality and detail preservation among the methods. CycleGAN tends to produce images with noticeable artifacts and a lack of fine details, particularly within the nuclei region. The CycleGAN with Attention UNET & Resnet shows a slight improvement in preserving details and avoiding artifacts, but still tends to represent the structures within the tissue samples inaccurately. In contrast, our proposed SAN method significantly enhances the frozen section, especially within the nuclei region, providing clearer and more diagnostically relevant images compared to the other methods.

Fig. 6

Example of generated breast permanent section images in zoom-in with CycleGAN, CycleGAN with Attention UNET & Resnet and with SAN (ours). Scale bar: 20 μm.

Nuclei texture analysis

To evaluate the content of the enhanced nuclei regions, we quantitatively analyzed the texture of kidney tissue images by comparing the segmentation of nuclei across frozen sections, permanent sections, and the enhanced frozen sections produced using our SAN method and CycleGAN with Attention UNET & Resnet. The focus is on extracting Gray-Level Co-occurrence Matrix (GLCM)32 features, which are crucial for evaluating texture patterns related to diagnostic markers in histopathology.

For this analysis, we segmented the nuclei regions in all test images (frozen, permanent, and generated sections) and computed GLCM features from 14 × 14-pixel crops taken exclusively from within the nuclei. This ensured that our texture analysis focused on the most diagnostically relevant regions. We calculated GLCM features using distances of 1, 2, 3, 4, 5, and 6 pixels and angles of 0°, 45°, 90°, and 135°. The features analyzed include contrast, correlation, energy, and homogeneity, which provide a comprehensive statistical description of the textures within the nuclei.
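For readers unfamiliar with GLCM features, the following NumPy sketch builds a symmetric, normalized co-occurrence matrix for one pixel offset and derives the four features named above. The feature definitions follow the commonly used conventions (energy as the square root of the angular second moment, as in scikit-image); the text does not state which implementation or gray-level quantization was used, so the helper names and the toy two-level crop are assumptions.

```python
import numpy as np

def glcm(img, dx, dy, levels):
    """Symmetric, normalized gray-level co-occurrence matrix of an integer
    image for the pixel offset (dx, dy)."""
    h, w = img.shape
    m = np.zeros((levels, levels), dtype=float)
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[img[y, x], img[y2, x2]] += 1
    m = m + m.T          # count each pair in both directions
    return m / m.sum()   # normalize to joint probabilities

def glcm_features(p):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    return {
        "contrast": float((((i - j) ** 2) * p).sum()),
        "correlation": float(((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j)),
        "energy": float(np.sqrt((p ** 2).sum())),
        "homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
    }

def offset(distance, angle_deg):
    """Map the four standard angles to (dx, dy) pixel offsets (y axis points down)."""
    unit = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}[angle_deg]
    return unit[0] * distance, unit[1] * distance

# Toy 14 x 14 crop: a two-level checkerboard, where horizontal neighbours always differ.
yy, xx = np.indices((14, 14))
crop = (xx + yy) % 2
feats = glcm_features(glcm(crop, *offset(1, 0), levels=2))

# Features for every distance and angle listed in the text.
all_feats = {(d, a): glcm_features(glcm(crop, *offset(d, a), levels=2))
             for d in range(1, 7) for a in (0, 45, 90, 135)}
```

For the checkerboard at distance 1 and angle 0°, every neighbouring pair differs by one gray level, which gives contrast 1, correlation −1, and homogeneity 0.5.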

The GLCM feature statistics are summarized in Table 4 below, which presents the mean and standard deviation (Std) for each feature across the permanent sections, frozen sections, CycleGAN with Attention UNET & Resnet generated images, and SAN enhanced images.

Table 4 GLCM feature statistics of permanent sections, frozen sections, CycleGAN with Attention UNET & Resnet and SAN (ours) from kidney tissues.

These GLCM features provide important insights into the similarity of textures between the real permanent sections and the generated sections. Notably, the SAN method demonstrates superior performance, producing GLCM feature values that closely match those of the permanent sections, particularly in terms of contrast and energy, which are important for capturing diagnostic texture details.

This can also be seen in the outline of the GLCM features histograms, presented in Fig. 7.

Fig. 7

Histograms outline of the GLCM features: contrast, correlation, energy, and homogeneity of permanent sections, frozen sections, and the generated sections: CycleGAN with Attention UNET & Resnet and SAN (ours) from kidney tissues.

In addition to the feature statistics, we also computed the Jensen-Shannon (JS) divergence33 to quantify the similarity of the GLCM feature distributions between each method and the real permanent sections. Table 5 presents the JS divergence for each GLCM feature:

Table 5 JS divergence for GLCM features between permanent sections and frozen sections, CycleGAN with Attention UNET & Resnet and SAN (ours) from kidney tissues.

The lower JS divergence values for SAN indicate that the textures generated by our method are closer to the real permanent sections. In particular, SAN achieves the lowest divergence in the image contrast feature (0.1651), indicating that it better reproduces the sharpness and variability of textures in the nuclei compared to both frozen sections and CycleGAN with Attention UNET & Resnet. In addition, SAN shows minimal divergence in the correlation feature (0.0057), suggesting that it accurately preserves the relationships between pixel intensities within the nuclei. Moreover, SAN exhibits improvements in the energy and homogeneity features over the other methods but still shows room for refinement in capturing smoothness and uniformity in the nuclei textures. Thus, the JS divergence analysis and GLCM feature statistics demonstrate that the proposed SAN method consistently outperforms both the frozen sections and CycleGAN with Attention UNET & Resnet in replicating the textures of the real permanent sections in the nuclei area. The significant reduction in divergence for contrast and correlation highlights SAN's ability to capture fine textural variations within the nuclei, which are crucial for accurate tissue diagnosis. In contrast, CycleGAN with Attention UNET & Resnet shows higher divergence across all features, particularly for contrast and homogeneity, suggesting that it struggles to maintain the necessary texture details within the nuclei regions. Frozen sections also show relatively high divergence, especially in the contrast feature, further illustrating the limitations of the direct use of frozen section images for diagnostic purposes without enhancement.
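For reference, the JS divergence between two feature histograms can be computed as in the sketch below. It assumes the histograms share the same bins and uses base-2 logarithms, so the value lies between 0 (identical distributions) and 1 (disjoint distributions); the binning and logarithm base behind Table 5 are not stated in the text. The histogram counts are made up for illustration.

```python
import math

def js_divergence(hist_p, hist_q):
    """Jensen-Shannon divergence between two histograms over the same bins,
    using base-2 logarithms (an assumption made for this sketch)."""
    sp, sq = sum(hist_p), sum(hist_q)
    p = [x / sp for x in hist_p]
    q = [x / sq for x in hist_q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; bins with zero mass in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Illustrative histogram counts of one GLCM feature (not data from the paper).
permanent = [5, 20, 40, 25, 10]
generated = [8, 22, 36, 24, 10]
js_close = js_divergence(permanent, generated)
js_identical = js_divergence(permanent, permanent)
js_disjoint = js_divergence([1, 0], [0, 1])
```

The measure is symmetric in its two arguments, which makes it suitable for comparing each method against the permanent sections without choosing a reference direction.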

Clinical analysis

To validate the clinical relevance and diagnostic utility of the proposed SAN method, we conducted a comprehensive pathological evaluation involving two independent senior pathologists, marked as Expert 1 (coauthor TK) and Expert 2 (coauthor AZ). They independently evaluated 99 matched histopathological image pairs from randomized kidney tissue samples using established criteria for frozen section assessment. Following standard pathological evaluation protocols, the pathologists assessed ten critical morphological parameters essential for accurate frozen section diagnosis: mitoses and apoptosis detection, cytoplasmic accumulation patterns, nucleolus characteristics, nuclear pleomorphism, nuclear-to-cytoplasm (NC) ratio, cellular number assessment, bubble artifact presence, stain quality, cell membrane definition, and nuclear membrane clarity. The evaluation methodology was based on common frozen section assessment rules: (1) technical quality including staining adequacy and artifact presence, (2) cellular detail analysis encompassing cytoplasm amount, nuclear size, NC ratio, nucleolus presence, mitotic activity, and membrane sharpness, (3) tumor identification and cell type confirmation, and (4) malignancy assessment criteria. Each parameter was evaluated on a binary scale (improved/not improved) comparing original frozen sections to SAN-enhanced images.

The pathological evaluation revealed several key findings regarding the clinical utility of SAN-enhanced images, directly corresponding to the established criteria for frozen section diagnosis (Table 6).

Table 6 Histopathological parameter assessment by two independent senior pathologists.

(a) Technical quality improvements: According to the pathologists’ assessment, our SAN method significantly addressed the primary technical limitations of frozen sections. Stain quality received the highest mean score (98.5% ± 0.7%), indicating successful enhancement of eosinophilic cytoplasm staining that more closely resembled permanent section characteristics. Bubble artifact reduction showed marked improvement with a mean score of 87.9% ± 7.1%, with Expert 2 noting improvement in 92/99 cases (92.9%) compared to 82/99 cases (82.8%) for Expert 1. The SAN-enhanced images demonstrated improved staining, with more eosinophilic cytoplasm (closer to the staining in permanent sections), and the reduction or elimination of bubble artifacts allowed for a more accurate pathological diagnosis.

(b) Nuclear morphology enhancement: The evaluation demonstrated exceptional performance in nuclear-related parameters crucial for malignancy assessment. Nuclear pleomorphism achieved a mean score of 96.9% ± 1.5%, nuclear membranes showed 97.0% ± 1.4% enhancement, and nucleolus characteristics demonstrated 96.4% ± 3.6% improvement. The pathologists noted that nuclear membrane definition, together with the size and form of the nucleus, helps to confirm nuclear pleomorphism, and that nuclear membrane definition and the presence of nucleoli help to confirm nuclear atypia.

(c) Cellular architecture assessment: Cell membrane definition showed substantial improvement with a mean score of 94.4% ± 3.6%, while nuclear-to-cytoplasm ratio assessment achieved 96.5% ± 0.7% enhancement. The pathologists emphasized that cell membrane clarity and nuclear membrane definition help to define the NC ratio, which is a critical parameter for malignancy assessment.

(d) Diagnostic significance: Nuclear pleomorphism together with nuclear atypia and NC ratio were helpful clues to confirm malignancy. The consistently high mean scores (> 95%) and low inter-pathologist variability (< 2% standard deviation) across these malignancy-related parameters demonstrated both the clinical relevance of SAN enhancements and strong inter-rater reliability for diagnostic accuracy.
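The mean ± standard deviation values above summarize the two experts' per-parameter improvement rates. As a worked check, the bubble artifact figures quoted in item (a) can be reproduced from the raw counts, assuming the sample standard deviation (n − 1 denominator) over the two raters; the variable names are illustrative.

```python
from statistics import mean, stdev

n_pairs = 99                                  # image pairs evaluated by each expert
improved = {"Expert 1": 82, "Expert 2": 92}   # bubble artifact: cases judged improved

# Per-expert improvement rate in percent.
rates = [100.0 * k / n_pairs for k in improved.values()]

# stdev() is the sample standard deviation (n - 1 denominator), assumed here.
summary = f"{mean(rates):.1f}% ± {stdev(rates):.1f}%"
print(summary)  # 87.9% ± 7.1%
```

With only two raters, the sample standard deviation is simply the absolute difference between the two rates divided by √2, so it mainly reflects how far apart the two experts' judgments are.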

Fig. 8

Representative examples of pathological assessment criteria. (a) The original frozen section. (b) The SAN-enhanced image with annotations highlighting key diagnostic features. Red circles indicate nuclear inclusions, green circles mark cytoplasmic inclusions, orange circles highlight nucleoli, and blue circles identify bubble artifacts. Scale bar: 100 μm. 

Figure 8 demonstrates representative examples of pathologist evaluation criteria showing the original frozen section (a) and the SAN-enhanced image (b) with annotations highlighting key diagnostic features. The enhanced image demonstrates improved visualization of nuclear details, particularly nucleoli, while reducing bubble artifacts that can obscure diagnostic features.

To further assess the clinical impact of our SAN method, 826 patches from frozen kidney tumor sections obtained at Chaim Sheba Medical Center were processed and evaluated. Each frozen section patch was enhanced using our SAN model, and the paired original and enhanced images were assessed by a qualified pathologist (coauthor GG) for diagnostic utility.

The quantitative clinical assessment revealed substantial diagnostic improvements:

(a) Nuclear detail enhancement: 74.9% of the SAN-enhanced patches showed significant improvement in diagnosis-relevant nuclear details, including better visualization of nuclear boundaries.

(b) Cytoplasmic clarity: 49.1% of enhanced patches demonstrated improved cytoplasmic clearing, reducing the characteristic “smear of pink color” typical of frozen sections.

(c) Cell border definition: 59.7% of patches showed enhanced cell boundary definition, providing clearer delineation between adjacent cells as typically seen in permanent sections.

These two pathologist evaluations demonstrate that our SAN method addresses critical limitations in frozen section diagnostics. The high scores for nuclear-related parameters (> 95% for most nuclear characteristics) directly support the method’s potential to improve the accuracy of rapid intraoperative diagnosis. The substantial improvement in nuclear detail visualization is particularly significant, as nuclear morphology assessment is fundamental to cancer diagnosis and grading protocols.

The inter-pathologist variability analysis reveals both the strengths and limitations of subjective pathological assessment, highlighting parameters where additional standardization may be beneficial. The consistently high performance across multiple tissue quality indicators suggests that SAN-enhanced images maintain diagnostic reliability while providing enhanced morphological detail.

These clinical validation results demonstrate that our SAN method produces diagnostically relevant improvements that are recognizable and valuable to practicing pathologists, supporting the translation of this technology into clinical workflows for enhanced frozen section diagnosis.

The extent of image enhancement achieved by our SAN method is further illustrated in Fig. 9, which presents side-by-side overlays of frozen and enhanced images, allowing for direct comparison of tissue quality and nuclear detail improvements.

Fig. 9

Visual comparison of frozen and SAN-enhanced images using overlay presentation. (a) Original frozen section (left half) compared with SAN-enhanced image (right half), and (b) SAN-enhanced image (left half) compared with original frozen section (right half). (c) Difference map between frozen and SAN-enhanced images. The overlay visualization demonstrates the substantial improvement in nuclear detail, cytoplasmic clarity, and overall tissue architecture achieved by the SAN method while preserving the fundamental tissue structure. Scale bar: 100 μm. Colorbar presents grayscale level differences.

Discussion and conclusion

In this paper, we introduced a new approach, called SAN, designed to address image-to-image translation with guided attention to specific details, and applied it to the challenging task of enhancing frozen section images by utilizing permanent section image guidance with a focus on the cell nuclei details. Frozen sections offer immediate diagnosis during surgeries but often suffer from artifacts and poor image quality, making it challenging to provide an accurate diagnosis, whereas permanent sections offer better image quality and preservation of tissue architecture but are time-consuming to prepare. Our approach aimed to bridge this gap by integrating a unique attention mechanism, based on nuclei segmentation, into the CycleGAN model. This frozen-section image enhancement is done by pushing the model generator to provide detailed content within the nuclei regions. Our method introduced a two-step training procedure, incorporating both the original pair of images and the nuclei-segmented pairs. While the first step translates frozen section images into permanent section images, the second step acts as a hard-attention mechanism, transferring nuclei-segmented frozen images into nuclei-segmented permanent images, which leads to more accurate and detailed images that capture nuclei-level information. Our method was evaluated using data samples from breast, kidney, and colon tissues. The unique attention method based on nuclei segmentation significantly improved the generator’s ability to capture and produce nuclei-level details, which are critical for precise diagnosis. Unlike previous approaches20,24 that apply general spatial attention mechanisms across entire images, our method introduces a nuclei-specific hard attention mechanism that directs enhancement exclusively to diagnostically critical nuclear regions.
While Siller et al.20 demonstrated the acceptance of AI-enhanced frozen sections by pathologists, their approach lacks the targeted nuclear enhancement that is crucial for diagnostic accuracy. Our segmented attention network specifically addresses this limitation by ensuring that the most diagnostically relevant cellular features, nuclear morphology and chromatin patterns, are preserved and enhanced during the translation process. The proposed SAN method builds upon the framework of CycleGAN but incorporates attention mechanisms to prioritize the fidelity of nuclei features in the translated images. To further highlight the effectiveness of our approach, we conducted a comparative analysis of our method, SAN, with CycleGAN, and CycleGAN with attention UNET & Resnet. Our approach demonstrated its advantage in enhancing frozen section images with richer content in the nuclei region. Moreover, we used Grad-cam maps to visualize the impact of our method. The Grad-cam maps illustrated that our approach pushed the discriminator to focus more on the nuclei region, which, in turn, encouraged the generator to generate more detailed content within this region, producing permanent section images more accurately in these challenging regions. In addition, we evaluated the clinical improvements based on pathologist examination. While our results demonstrate improvements in image quality and detail preservation, several important considerations merit discussion. First, it is crucial to acknowledge the inherent challenge of working with unpaired frozen and permanent section images. We have obtained image pairs from the same patient, but the frozen and permanent section images are not exactly paired due to different chemical preparation protocols. The lack of direct correspondence between the two types of sections poses a fundamental limitation. 
As a result, there is ambiguity regarding whether the model has learned features from cancerous or non-cancerous areas, as the presence or absence of cancerous tissue areas may vary within individual patients. Addressing this issue would require meticulous labeling by pathologists to confirm the characteristics of each image pair, a task that we intend to explore in future work. Second, ensuring the relevance of enhancement learned from permanent section patches to corresponding frozen section patches is essential for the effectiveness of our translation model. While we have attempted to mitigate this concern through careful data selection and training, it remains a challenge to guarantee the alignment of features between the two types of sections. To address this, incorporating pathologist labeling can be carried out to validate the suitability of each image pair for training, thereby enhancing the relevance and accuracy of our model learning process. Third, the discrepancy in morphological characteristics between frozen and permanent sections presents a significant obstacle in achieving fully paired data. While it is not feasible to eliminate this difference entirely, we recognize the importance of utilizing successive or closely positioned histological section paired data to refine our model further. By exploring different degrees of pairing and conducting additional model training on such data, it will be possible to optimize the performance of our approach and enhance its robustness in translating frozen images into permanent-like quality.

In conclusion, SAN represents a significant advancement in the field of histopathology image enhancement. By combining nuclei segmentation with CycleGAN, we have developed a general method, which specifically addresses the limitation of frozen section images and offers a fast and reliable solution for frozen section image enhancement. This approach has the potential to improve the accuracy and effectiveness of pathological diagnosis and can be extended to various applications where detailed attention to specific image regions is required.