Introduction

Discovering the earliest signs of Alzheimer’s disease (AD) and thereby starting treatment as early in the disease progression as possible is of great interest1,2. Neuropathological brain changes associated with AD, namely β-amyloid plaques and neurofibrillary tangles (NFT), are thought to begin years before clinical symptoms become evident. In contrast to β-amyloid plaques, NFT are more strongly correlated with cognitive deficits3 and progress in a hierarchical manner throughout the brain in typical AD4. This continuous accumulation of NFT is strongly related to loss of neurons4,5 and has been shown to be causally associated with cerebral atrophy in affected regions6. In sporadic AD (representing approximately 95% of all cases), the first region typically affected by neurofibrillary tau pathology is the medial perirhinal cortex (mPRC), which approximately corresponds to Brodmann area 35 and the transentorhinal cortex4,7,8. In later stages, this neurofibrillary tau pathology spreads to the medially located entorhinal cortex (ERC) and eventually to the hippocampus and throughout the brain4,9,10. In clinical settings, the diagnosis of AD often relies on visual assessment of atrophy, such as the ERC atrophy score (ERICA11) along with the medial temporal lobe atrophy score (MTA12), which evaluates the hippocampus, the choroid fissure, and the lateral ventricle. In this context, the mPRC is often overlooked and, if considered, is usually assessed in the form of the entire perirhinal cortex. Given its relatively small size, we believe that assessing mPRC integrity in clinical settings could be enhanced by using a computed measure of atrophy (e.g., cortical thickness). Recent studies are in line with this notion, reflecting the potential of the mPRC’s integrity in clinical settings. For example, Sone et al.13 discovered that in early stages of AD, regional NFT accumulation is associated with cortical thinning in the perirhinal cortex and ERC. 
As a result, brain structures that are initially impacted by NFT-induced atrophy (e.g., mPRC) likely serve as sensitive preclinical structural imaging biomarkers. Indeed, more recent studies focusing on mPRC atrophy found promising results suggesting mPRC integrity as a sensitive and specific marker for early AD (e.g.,14,15,16). In addition, these findings further indicate that the lateral part of the perirhinal cortex (lPRC, most likely encompassing Brodmann area 36) is only affected after the mPRC, which is in line with the proposed staging of NFT-related pathology4,17. Based on these results, we hypothesize that investigating the cortical thickness of mPRC and lPRC separately is more sensitive in early-stage AD than the cortical thickness of the entire perirhinal cortex.

Nonetheless, the perirhinal cortex, particularly the mPRC, remains underrepresented in AD research and diagnostics. One contributing factor is the ongoing debate regarding its precise anatomical boundaries in the human brain46. Foundational work in nonhuman primates47,48 has shaped our understanding of the medial and lateral subdivisions of the PRC. These areas are notably larger in humans and are critically involved in the progression of tau pathology in AD, highlighting their clinical relevance. Although manual segmentation protocols, such as those based on the cytoarchitectonic work of Insausti et al.49, enable differentiation between the mPRC and lPRC (e.g.,14,50), only a limited number of studies have adopted these methods. Anatomically, the transition from the mPRC to the lPRC is defined by the collateral sulcus of the medial temporal lobe, which exhibits considerable inter- and intraindividual variability (e.g., differences in length and form of the sulcus)8. This variability presents a major challenge to accurately segmenting the mPRC and lPRC regions, as it profoundly influences the delimitation of their boundaries (for an insight into manual segmentation see18). In a recent study we demonstrated excellent inter-rater reliability between two raters using an existing manual segmentation protocol for the mPRC, lPRC, and ERC, which takes collateral sulcus variability into account18. Although manual segmentation for quantification of brain regions from MRI is the gold standard, it is time-consuming to implement and is therefore not feasible for clinical and research settings19. Earlier work addressing this challenge (e.g.,42,43) used atlas-based tools, which create standardized templates by spatially aligning anatomical structures across individuals. 
These robust atlas-based approaches use manually annotated atlases as templates that are co-registered to the brain image data, and subsequently apply label fusion followed by an AdaBoost classifier to derive the final segmentation. In contrast, the proposed deep-learning-based method directly predicts the class label from the input data. The algorithm learns the anatomical variability of the collateral sulcus from the training data and has a smaller tendency toward the average. It is thus expected to better capture inter-individual differences, while possibly being less robust20.

Advancements in algorithms and computational resources over time have significantly propelled the development of different segmentation techniques for neuroimaging, such as FreeSurfer21 or Statistical Parametric Mapping (SPM22). In parallel, machine learning approaches based on convolutional neural network (CNN) architectures (e.g., U-Net) are experiencing a growing trend20. Against this backdrop, we aimed to develop an automated segmentation tool based on U-Net, a generic deep-learning-based software package for cell detection and cell segmentation, which can be trained and applied to new data. In addition, it is customizable, which allows adaptation to specific challenges23,24. Using the automated segmentation tool, we aimed to replicate the results described by Krumm et al.14 to evaluate a potential clinical benefit. That study found significant atrophy in the mPRC and ERC when comparing both Alzheimer’s dementia patients (dAD) and amnestic mild cognitive impairment (aMCI) patients with healthy controls. Notably, atrophy in the lPRC was observed exclusively in the dAD group, aligning with the NFT distribution pattern in early AD4,14,17. While Krumm et al.14 used a manual segmentation protocol to extract cortical thickness, we replicate the study using an automated segmentation method in the identical sample. This allows us to compare the automated segmentation to the manual segmentation for the cortical thickness of brain regions first affected by atrophy in typical AD (e.g., mPRC, ERC). A reliable automated segmentation, especially of the mPRC, would facilitate its use in research and clinical settings to improve early detection of AD.

Materials and methods

Participants and MRI acquisition

Training data set

The training data set (N = 126, mean age = 69.8 ± 10.8 years) consisted of 101 patients and 25 healthy control participants (NC). Written informed consent was obtained from all individuals prior to participation and the study was approved by the local ethics committee (EKNZ: Ethics Committee of Northwestern and Central Switzerland). All methods were performed in accordance with the relevant guidelines and regulations. NCs were recruited from the “Registry of Healthy Individuals Interested to Participate in Research” of the Memory Clinic FELIX PLATTER Basel, Switzerland. They had undergone a thorough medical screening and neuropsychological testing to confirm their cognitive health. In particular, the exclusion criteria encompassed severe impairments in auditory, visual, or speech abilities; substantial sensory or motor deficits; severe systemic illnesses; persistent moderate to intense pain; conditions with significant or likely effects on the central nervous system (e.g., neurological disorders such as cerebrovascular disease, generalized atherosclerosis, and psychiatric disorders); and the use of potent psychoactive substances, except for mild tranquilizers. In addition, all individuals classified as NC obtained standard scores within the normal range on the Mini-Mental State Examination (MMSE)25, California Verbal Learning Task26, Clock Drawing Test (Critchley, 1953), and the short version of the Boston Naming Test27. Of the 101 patients, 29 participants were diagnosed with mild cognitive disorder (MCI) according to DSM-IV28. Twenty-six participants were diagnosed with Major Depression (MD), including 14 participants recruited from the Memory Clinic FELIX PLATTER Basel, Switzerland, and 12 recruited from the University Psychiatric Clinics Basel, Switzerland. MDs had to score 10 or more points on the Beck Depression Inventory29, 13 or more points on the Beck Depression Inventory-II30, or 6 or more points on the Geriatric Depression Scale31. 
Eight participants were diagnosed with aMCI32 according to DSM-IV28 and Winblad et al. (2004) criteria. Eighteen participants were diagnosed with dementia due to AD (dAD) according to DSM-IV criteria28 and NINCDS-ADRDA33. aMCI and dAD were combined into one AD group (N = 26) based on the assumption that the progression from aMCI to the early dementia stage of AD is gradual and that the time of diagnosis can differ34. Further, 20 patients were diagnosed with dementia due to etiologies other than AD (non-AD; e.g., due to Lewy body disease) according to DSM-IV. For an overview see Table 1. All patients had been recruited either from the Memory Clinic FELIX PLATTER Basel, Switzerland, where they had received neuropsychological testing and medical and neurological examinations including blood analyses, or, in the case of the 12 MDs, from the University Psychiatric Clinics Basel, Switzerland. All participants were native Swiss-German or German-speaking adults.

Participants received T1-weighted 3D magnetization-prepared rapid acquisition gradient echo (MPRAGE) structural MRI using the same 3-Tesla scanner (MAGNETOM Skyra fit, Siemens; inversion time = 900 ms, repetition time = 2300 ms, echo time = 2.92 ms, flip angle = 9°; acquisition matrix = 256 × 256 mm, voxel size = 1 mm isotropic, acquisition time = 5 min 12 s) at the University Hospital Basel, Switzerland.

Table 1 Training data set—sample characteristics.

Test data set

The test data set (N = 103, mean age = 76.4 ± 7.0 years) is identical to the one used for group comparison in Krumm et al.14 and contained 46 healthy control participants (NC), 34 participants diagnosed with early Alzheimer’s dementia (dAD) according to NINCDS-ADRDA and DSM-IV criteria28 and 23 patients with amnestic mild cognitive disorder (aMCI) according to DSM-IV and Winblad et al.35 criteria (see Table 2). For a comprehensive overview of the inclusion and exclusion criteria, see Krumm et al.14. All patients had been recruited from the Memory Clinic FELIX PLATTER Basel, Switzerland, where they had received neuropsychological testing, and medical and neurological examinations including blood analyses. All participants were native Swiss-German or German-speaking adults.

Participants received T1-weighted 3D MPRAGE structural MRI using the same 3-Tesla scanner (MAGNETOM Verio, Siemens; inversion time = 1000 ms, repetition time = 2000 ms, echo time = 3.75 ms, flip angle = 8°; acquisition matrix = 256 × 256 mm, voxel size = 1 mm isotropic, acquisition time = 7 min 30 s) at the University Hospital Basel, Switzerland.

Table 2 Test data set—sample characteristics.

Preprocessing of structural MRI and manual segmentation

MRI scans were preprocessed using FreeSurfer36,37 (Massachusetts General Hospital, Boston, MA, USA; http://surfer.nmr.mgh.harvard.edu; accessed on 7 January 2020). In a semi-automated processing stream, FreeSurfer segmented the T1-weighted 3D MPRAGE volumes into grey and white matter. Next, the surface of white matter, represented by the transition area from white to grey matter, and the pial surface were modeled36. Lastly, tissue classification was visually confirmed for all participants, and, if required, manual adjustments were performed. Regions of interest (ROIs; i.e., mPRC, lPRC, and ERC) for both hemispheres were manually drawn by a blinded rater on coronal slices, according to the protocol depicted in Krumm et al.14, which takes collateral sulcus variation into account (for visual examples of the anterior-posterior borders of manual segmentation, see18).

Training and application of automated segmentation

The semi-automatic labels were mapped to the gray matter obtained by FreeSurfer and transformed to the 3D voxel space to create regional masks for mPRC, lPRC and ERC. Using each of the masks, we trained a separate network to segment the respective region as a voxel mask (for examples see Supplementary material). The predicted voxel mask was then mapped back to the FreeSurfer space to compute morphological characteristics such as the average cortical thickness. We used the nnU-Net38 framework to train the networks. The nnU-Net23,24 is a toolbox to train 2D and 3D U-Nets, specifically optimized for user-friendly model training and selection with biomedical imaging data. The U-Net23,24 is a multi-stage neural network architecture for semantic segmentation. The input image, a T1-weighted MRI in this work, is processed on multiple resolution levels. The features from the analysis path (with increasing voxel size) are combined with the features from the synthesis path (with decreasing voxel size) at every resolution level except the lowest. This leads to an effective combination of high-level features with large spatial context and low-level features with small spatial context. The output is a pixel-wise semantic segmentation. At the border of regions, the class labels are ambiguous. For example, a pixel may contain 50% of each of two classes due to interpolation. To better account for this ambiguity, we substituted the default sparse cross-entropy loss with a dense cross-entropy loss that is capable of modeling a full probability distribution. The conversion from surface-based annotations to voxel labels and back was done with FreeSurfer. Finally, we trained a separate network for each of the ERC, mPRC, and lPRC for 150 epochs.
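The difference between the sparse and the dense cross-entropy loss can be illustrated with a minimal sketch for a single border pixel. This is not the training code used in this work: the function names, the two-class setup, and the example logits are hypothetical, and the loss configuration actually used inside the nnU-Net framework is not reproduced here.

```python
import math


def softmax(logits):
    """Convert raw network outputs into class probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_cross_entropy(logits, class_index):
    """Cross-entropy against a single hard label (one class index)."""
    probs = softmax(logits)
    return -math.log(probs[class_index])


def dense_cross_entropy(logits, target_probs):
    """Cross-entropy against a full target probability distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_probs, probs) if t > 0.0)


# Hypothetical border pixel: interpolation left it 50% class 0 and 50% class 1.
logits = [0.2, 0.1]  # assumed network output, for illustration only
hard_loss = sparse_cross_entropy(logits, 0)           # forces a single class
soft_loss = dense_cross_entropy(logits, [0.5, 0.5])   # keeps the ambiguity
one_hot_loss = dense_cross_entropy(logits, [1.0, 0.0])
```

With a one-hot target the dense loss reduces to the sparse loss, so the two formulations differ only for pixels whose label is genuinely mixed.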

The inference of the MRI data was performed without additional pre-processing. In two cases, the prediction of one of the masks failed and could not be projected to the FreeSurfer space to compute cortical thickness values (i.e., ERC right hemisphere for one participant, lPRC right hemisphere for another participant). To ensure the accuracy of the automated segmentations, we performed a quality control assessment on a subset of 60 participants, with 20 randomly selected from each diagnostic group (healthy controls, aMCI, and AD). The process involved a detailed visual inspection of coronal slices in FreeSurfer, focusing on key anatomical landmarks such as the medial and lateral borders of all ROIs (ERC, mPRC, and lPRC). Each segmentation layer was inspected systematically from the anterior to the posterior border to detect any gross overextensions, under-segmentations, or incorrectly labeled pixels. A significant deviation would have included segmentation labels being entirely misplaced outside the medial temporal lobe; gross misplacement of the ROI, such as segmentation labels extending well beyond the expected anatomical boundaries; extensive gaps within the ROI where relevant pixels belonging to the cortical structure were consistently excluded; or a complete absence of labeled pixels for a given ROI. Additionally, a segmentation would have been flagged if it spanned fewer than 10 slices in the anterior-posterior direction, as this would indicate insufficient coverage of the expected anatomical region. In this sub-sample of 60 participants, the ROI masks performed as expected, with no significant deviations observed. Given these consistent results and the high ICC values between manual and automated segmentation, extending quality control to the full sample was deemed unnecessary. 
Supplementary Fig. 2 shows an example of the progression of segmentation masks across consecutive coronal slices from the anterior to the posterior boundary, alongside the corresponding unsegmented T1-weighted images. In addition, all quality control criteria used for evaluating the segmentation masks are summarized in Supplementary Table 1. Based on the regions that were analyzed in the study by Krumm et al.14, we additionally trained a separate network for the parahippocampal cortex. However, since this region is not the focus of this work, it is not further discussed in this manuscript.
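The slice-coverage criterion from the quality control (a mask spanning fewer than 10 slices in the anterior-posterior direction, or containing no labeled pixels at all) lends itself to a simple automated check. The following is a minimal sketch only: the quality control in this work was performed by visual inspection, and the function names, the nested-list mask layout (first index = coronal slice), and the toy masks are assumptions made for illustration.

```python
def labeled_slice_indices(mask):
    """Indices of coronal slices that contain at least one labeled voxel.

    `mask` is assumed to be a nested list indexed as mask[slice][row][col],
    with the first axis running in the anterior-posterior direction.
    """
    return [
        i for i, coronal_slice in enumerate(mask)
        if any(any(v != 0 for v in row) for row in coronal_slice)
    ]


def anterior_posterior_span(mask):
    """Number of slices between the first and last labeled slice (inclusive)."""
    indices = labeled_slice_indices(mask)
    if not indices:
        return 0  # complete absence of labeled voxels
    return indices[-1] - indices[0] + 1


def should_flag(mask, min_slices=10):
    """Flag masks that are empty or span fewer than `min_slices` slices."""
    return anterior_posterior_span(mask) < min_slices


def make_mask(n_slices, labeled):
    """Build a toy 2 x 2 mask per slice; purely illustrative data."""
    return [[[1 if i in labeled else 0, 0], [0, 0]] for i in range(n_slices)]


good_mask = make_mask(20, range(3, 15))   # labeled on 12 consecutive slices
short_mask = make_mask(20, range(8, 13))  # labeled on only 5 slices
empty_mask = make_mask(20, range(0))      # no labeled voxels
```

Such a check covers only the coverage and absence criteria; misplaced labels and gaps within the ROI would still need visual inspection.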

Statistical analyses

For each ROI, an aggregated bilateral cortical thickness value was used. Cortical thickness measurements were normalized for head size (as total intracranial volume [TIV]) as reported by Krumm et al.14 using the formula [(cortical thickness)/(TIV) × 100]. For reporting in Table 3, normalized values were retransformed to mm using the mean TIV of the two groups being compared (e.g., dAD versus NC mean TIV = 1453 cm3; aMCI versus NC mean TIV = 1480 cm3). Group differences were examined using univariate analysis of covariance (ANCOVA), incorporating age, sex, and education level as covariates. To address multiple comparisons, significance thresholds were adapted using the Bonferroni correction (e.g., p = 0.05/8 = 0.00625). In addition, to evaluate the agreement between the two methods (manual and automated segmentation), TIV-corrected bilateral cortical thickness values of all participants of the test data set were compared using intraclass correlation coefficient (ICC) estimates and their 95% confidence intervals based on a single-rating, consistency, 2-way mixed-effects model according to the guidelines of Koo and Li39. All analyses were executed in SPSS software; while Krumm et al.14 used SPSS 21.0, our replication used the subsequent version, SPSS 22.0 (IBM Corp. Released 2013. IBM SPSS Statistics for Windows, Version 22.0. Armonk, NY, USA).
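The normalization, the Bonferroni threshold, and the single-rating consistency ICC can be sketched in a few lines. The analyses in this work were run in SPSS, so the following is only an illustrative re-implementation under stated assumptions: all function names are hypothetical, the ICC uses the standard mean-square form for a two-way model with consistency definition, (MS_rows − MS_error) / (MS_rows + (k − 1) × MS_error), and confidence intervals are not computed.

```python
def normalize_thickness(thickness_mm, tiv_cm3):
    """Normalize cortical thickness for head size: thickness / TIV x 100."""
    return thickness_mm / tiv_cm3 * 100.0


def retransform_to_mm(normalized, mean_tiv_cm3):
    """Map a normalized value back to mm using the groups' mean TIV."""
    return normalized * mean_tiv_cm3 / 100.0


def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


def icc_consistency_single(ratings):
    """Single-rating, consistency ICC from a two-way ANOVA decomposition.

    `ratings` is a list of rows, one per participant, each holding one value
    per method (e.g., [manual, automated]).
    """
    n = len(ratings)       # participants
    k = len(ratings[0])    # methods (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denominator


# Threshold used in the text: 0.05 / 8 comparisons.
threshold = bonferroni_threshold(0.05, 8)

# Hypothetical thickness values (mm) for three participants: [manual, automated].
example = [[2.4, 2.5], [2.1, 2.2], [2.8, 2.9]]
example_icc = icc_consistency_single(example)
```

Because the consistency definition ignores a constant offset between methods, a systematic shift of the automated values relative to the manual values does not lower this ICC; an absolute-agreement ICC would penalize it.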

Results

Two-tailed, univariate ANCOVAs with sex, age, and education as covariates were performed to determine whether each ROI (mPRC, lPRC, and ERC) was atrophied in the aMCI and dAD groups relative to their corresponding NC sample. Significance was tested with Bonferroni-corrected p-values (i.e., p = 0.05/8 = 0.00625 according to Krumm et al.14).

In comparison to the NC group, the dAD group showed significantly lower average cortical thickness of the ERC, mPRC, and lPRC [ERC: F(1,60) = 39.820, p < 0.00625; mPRC: F(1,60) = 32.270, p < 0.00625; lPRC: F(1,60) = 10.907, p < 0.00625]. In comparison to the NC group, only the ERC was significantly atrophied in the aMCI group [ERC: F(1,64) = 13.249, p < 0.00625], while the p-values of the mPRC and lPRC did not fall below the Bonferroni-corrected threshold of 0.00625 [mPRC: F(1,64) = 4.884, p = 0.031; lPRC: F(1,64) = 6.408, p = 0.014]. The ICC analyses, based on a single-rating, consistency, 2-way mixed-effects model, between manual and automated segmentation for TIV-corrected cortical thickness estimates are summarized in Table 4.

Table 3 Normalized and Retransformed mean cortical thickness in each ROI over both hemispheres.
Table 4 ICC calculation for TIV corrected cortical thickness estimates between manual and automated segmentation using single-rating, consistency, two-way mixed-effects model.

Discussion

Our goal was to replicate the study of Krumm et al.14, who used a manual segmentation protocol to extract cortical thickness values of key regions within the parahippocampal gyrus (e.g., mPRC, lPRC, and ERC). Recognizing the labor-intensive nature of manual MRI-segmentation, we trained a deep convolutional network, specifically utilizing U-Net architecture, for the automated segmentation of the same regions and subsequently applied it to the identical sample as used in the group analysis in Krumm et al.14. In line with the findings of Krumm et al.14, our study found significant atrophy in the ERC, mPRC, and lPRC within the early dAD group when compared to the NC group. However, when comparing the aMCI group to the NC group, the null hypothesis of no difference could only be rejected in the ERC, while the null hypothesis of no difference in cortical thickness in the mPRC could not be rejected after applying strict correction for multiple comparisons, unlike the findings of Krumm et al.14. Nonetheless, the results are highly promising for future research in the early detection of AD, which will be discussed below. In addition, we compared the manual with the automated segmentation. The ICC analyses for cortical thickness estimates showed high ICC values between the manually and automatically generated cortical thickness values for all ROIs (mPRC, lPRC, and ERC), suggesting manual and automatic segmentation to generate comparable outcomes.

As highlighted, the early involvement of the mPRC in NFT pathology in typical AD positions a specific mPRC-integrity score (e.g., cortical thickness) as a promising early and sensitive imaging biomarker. Building upon our initial aim to replicate the critical findings of Krumm et al.14 regarding atrophy in the ERC and mPRC as early markers of Alzheimer’s disease, we have now bridged a significant gap in the field. The previously outlined very high reliability of a manual segmentation protocol, as detailed in the introduction, laid a solid foundation for accurate and detailed analysis of brain regions crucial for early detection of AD18. By integrating machine learning techniques, particularly through the training of a U-Net based deep convolutional network38, we have developed an automated segmentation tool that parallels the very high interrater reliability seen using the manual segmentation protocol. These findings establish a foundation for more efficient application in clinical and research settings, potentially improving the early diagnosis of AD.

A key strength of our approach lies in its ability to capture individual anatomical variability more accurately and with less bias than traditional atlas-based methods. On the other hand, the segmentation may be less robust20. Deep learning methods benefit strongly from larger data sets. Thus, combining both data sets of this study and adding further training data has the potential to improve prospective models. Earlier work (e.g.,42,43) used atlas-based tools that create standardized templates by averaging anatomical structures from multiple individuals. While these methods are robust, they tend to smooth out inter- and intra-individual differences due to the averaging process. In contrast, our deep-learning-based method predicts class labels directly from the input data, allowing the algorithm to learn and account for the anatomical variability of the collateral sulcus from the training data. This approach has a smaller tendency to average out these differences, making it better suited to capturing the unique anatomical features of each individual, which is crucial for personalized and precise measurements in clinical settings.

In evaluating the early dAD group against the NC group using the identical sample, we replicated significant atrophy findings in the ERC, mPRC, and lPRC as reported by Krumm et al.14. Yet, in the aMCI vs. NC comparison, a significant difference in cortical thickness was confined to the ERC. The results diverge from Krumm et al.’s14 findings regarding the mPRC, despite the same underlying sample. This discrepancy may be attributed to the inherent complexity of the sulcal pattern in this region. The collateral sulcus, where the mPRC is located, is known for its anatomical variability across individuals, which poses a challenge for automated segmentation algorithms8. While disease condition (e.g., AD) influences cortical thickness, it is not known to alter the anatomical borders of the mPRC (e.g., by changing the length of the collateral sulcus and thereby the anatomical borders)8. Hence, the variability observed in the automated compared to the manual segmentation is unlikely to be due to disease or disease progression but rather reflects the difficulty in consistently identifying the precise boundaries of the mPRC within a highly variable collateral sulcus.

Another factor to consider is the potential inclusion of pixels overlapping with the dura (see, for example, images in Table 3), which could impact atrophy measurements such as cortical thickness or volumes. However, it is crucial to note that cortical thickness was computed for both manual and automated segmentation methods using FreeSurfer. The computation of cortical thickness is performed after reconverting the voxel-based data back to the surface-based representation. While we believe the impact on cortical thickness, our key clinical measurement, is minimal, a potential influence cannot be entirely ruled out, particularly in volumetric analyses where surface area is also considered. Moreover, since the automated segmentation strives to replicate manual segmentation outcomes, its efficacy is inherently bounded by the precision of the manual segmentation technique. Although the manual segmentation served as the reference standard in our study, it still represents an estimation of the true cortical thickness, as all methodologies inherently possess limitations and potential biases. The results based on our sample suggest that the manual segmentation might be slightly more sensitive, but it lacks practicability in clinical or research settings. For example, manual segmentation for the mPRC requires about 20 min of labor per person, which is typically regarded as unfeasible in a clinical setting. The automated segmentation, on the other hand, runs unattended in the background in less than 5 min.

While the automated approach was not able to fully replicate the manual segmentation results for the mPRC in the aMCI vs. NC comparison, a clear trend was observed, even though it did not reach statistical significance as it did in Krumm et al.14. Together with the high consistency between the automated and manual segmentation, the automated method emerges as a promising alternative, especially for larger datasets where manual segmentation would not be feasible. Nonetheless, the future application and validation of our new automated tool remains of utmost importance. Conducting a longitudinal study with initially healthy individuals would be particularly beneficial. This approach not only aims to collect essential normative data for the wider application of our tool in clinical settings but also enables the retrospective analysis of cortical changes in participants who later develop symptomatic AD. One important consideration for future studies is the need for larger sample sizes in group comparisons. Given the small size and high anatomical variability of the mPRC, larger sample sizes are crucial to reliably detect subtle differences in cortical thickness between groups. This is particularly relevant in early-stage conditions like aMCI, where atrophy may be less pronounced and more difficult to detect. Additionally, in the context of future imaging studies, it would be worth considering the use of less stringent statistical corrections for group comparisons, as overly strict corrections can lead to underestimation of meaningful effects, particularly in smaller regions where variability is high (e.g., the mPRC). By expanding the dataset and adjusting statistical thresholds, it may be possible to capture more subtle cortical changes in the mPRC in early stages of AD.

Furthermore, the mPRC should be evaluated alongside other well-established markers in AD research and clinical practice, such as the ERC11,44,45. Combining this established marker with a new mPRC atrophy score can potentially enhance the early detection and monitoring of AD. This dual approach could provide a more comprehensive understanding of cortical degeneration patterns and improve diagnostic accuracy in the preclinical stages of AD. In addition, incorporating longitudinal studies that not only employ conventional neuropsychological assessments but also integrate newer, more specific neuropsychological tests for assessing specific perirhinal cortex function, such as the novel object recognition task developed by Frei et al.40, will be instrumental. Our automated segmentation method, initially designed for assessing cortical thickness, also shows potential for functional imaging studies. This capability offers a valuable tool for exploring the functional dynamics of medial temporal lobe subregions in early AD progression, while mitigating the need for labor-intensive manual segmentation.

Finally, we did not investigate cortical thickness values of the left and right hemisphere separately. Brain atrophy in typical AD often presents asymmetrically, with the left hemisphere being particularly vulnerable41. This emphasizes the importance of separate evaluation of the hemispheres (e.g., cortical thickness of the left and right mPRC) in future research and clinical assessments. Although the mPRC emerges as a promising structural biomarker in the early phases of AD, adopting a multi-domain approach that potentially includes a range of biomarkers becomes increasingly important as we move from the mild cognitive impairment spectrum towards asymptomatic stages. Such investigations require large data sets, for which manual segmentation is not feasible, a shortfall we believe to have successfully addressed with our presented automated segmentation method. This strategy not only aims to improve individualized patient evaluations but also promises to refine diagnostic precision and foster early, customized interventions in the incipient stages of the disease.

To conclude, our study extends beyond state-of-the-art validation methods by connecting segmentation outputs to clinical data, ensuring both anatomical precision and clinical relevance. Currently, there is no alternative that provides automated segmentation adhering to the same rigorous protocol. This integration marks a pivotal advancement, addressing the need for automated segmentation methods that are anatomically convincing while directly supporting clinical applications. By demonstrating strong agreement with manual segmentation and validating our findings against established clinical patterns, our approach represents a meaningful step forward in the development of automated tools. This dual-validation strategy enhances reliability and establishes a robust reference point for future studies aiming to integrate automated segmentation with clinical practice.

Conclusion

We aimed to replicate a prior study by Krumm et al.14, confirming early AD-associated cortical thinning within key parahippocampal gyrus regions via automated MRI-segmentation. The results showed high consistency for cortical thickness between the manual and automated segmentation. Therefore, despite the potentially slightly higher sensitivity of manual segmentation in our sample, the automated method still emerges as a promising tool, especially due to its applicability to larger datasets. We underscore the importance of future longitudinal studies, which should not only include initially healthy individuals but also focus on measuring unilateral cortical thickness values and incorporate neuropsychological testing, particularly tasks specifically assessing the function of the perirhinal cortex. This strategy not only aims to enhance diagnostic precision but also to pave the way for early, targeted intervention strategies, ultimately contributing to the development of personalized treatment plans and advancing our collective understanding of AD pathology.