Introduction

Successful communication relies on the capacity of community members to establish a shared understanding of the world. Language plays a central role in fostering this alignment by providing a medium for shared meanings. As such, past language experience influences not only an individual’s cognitive capabilities but also their capacity to align with others in communicative and thought processes. While the influence of language experience on linguistic and cognitive abilities is well-established, its role in facilitating shared processes across individuals in language comprehension and communication remains less understood.

One important aspect of language experience is the amount of exposure to a language. It is well-established that the quantity of language exposure predicts children’s vocabulary1, language2, and reading development3. Similarly, in adults, greater exposure to written texts has been linked to enhanced language and reading processing abilities4,5,6,7. Notably, the processing advantage observed in readers with higher print exposure over less frequent readers is particularly evident when comprehending sentence types that are more typical of written than spoken language7. This advantage persists even after controlling for general cognitive abilities such as executive function and perceptual speed8. Given that it is difficult to gauge the full scope of an adult’s cumulative language history, researchers have often used exposure to written text as a proxy. One widely used tool to assess this is the Author Recognition Test (ART), which measures the amount of exposure to written language and has been shown to predict reading speed, thus linking print exposure to reading abilities in adults9.

While the quantity of print exposure has been extensively studied, less is known about the specific content or topical distribution of what individuals read. Consider two adults with identical overall print exposure quantity: one reads exclusively mystery and suspense novels, while the other reads only literary fiction. Despite similar exposure quantity, their language and cognitive repertoires may diverge given their distinct topical focus. While this example is extreme, in reality, most people’s reading experiences are distributed across a range of topics and genres. Thus, the distribution of topics, i.e., the content of print exposure, should be considered a critical aspect of language experience. Studies have shown that the fiction and non-fiction sub-scores of the ART have differential explanatory effects in predicting language6,10 and reading comprehension abilities11. Similarly, print exposure in genres that are of strong interest to adolescents is a better predictor of their word reading ability than exposure in general genres12.

Crucially, what one reads, not just how much one reads, is related not only to their language or reading abilities, but also to how they process language. These effects operate at both the social-cognitive and lexical-semantic levels. On the one hand, reading literary fiction enhances theory of mind more effectively than reading other types of material13, and early-life exposure to literary or narrative fiction predicts the complexity of an individual’s worldview14. On the other hand, word meanings can vary significantly depending on the corpus in which they are computed, where a corpus can be seen as representing an agent or individual15,16. For example, news readership (CNN vs. BBC) can affect how people process words like humanitarian17. These findings highlight the necessity of considering not only the amount of reading experience but also the distribution of topics/genres in one’s print exposure, which should strongly affect the processing of linguistic materials.

Although some studies have demonstrated that reading abilities influence neural activity during reading, such as by modulating functional connectivity within individuals18 and showing positive correlations with neural synchronization across participants19, the role of cumulative reading exposure in neural alignment remains largely untested. Preliminary evidence has shown that shared language experiences, even brief ones, can lead to neural alignment, as demonstrated by increased similarity in neural profiles among individuals who have engaged in a conversation20. Similarity in post hoc recall of a narrative people have just read also reflected the degree of neural alignment while reading the narrative21. These findings suggest that individualized long-term experience, such as distributed reading experience, has the potential to affect how similar people’s neural activity patterns are in language processing.

In addition to individual experience, the type of text being processed also contributes to variation in comprehension and neural engagement. Neurobiological studies have revealed distinct neural substrates for the comprehension of expository versus narrative texts. Comprehension of narratives engages left-lateralized semantic regions and bilateral cortex in the default mode network (DMN), including the posterior cingulate cortex (PCC), precuneus, and prefrontal cortex18,19,22, which are associated with mental model updating and integration. In contrast, expository comprehension engages distinct brain areas23, showing significantly greater activation in the frontoparietal control network (FPN) compared to narrative comprehension, in addition to language and comprehension-related areas24,25. These findings align with the goals of narrative and expository reading. The primary aim of reading narratives is to simulate and learn social behavior26,27, while expository reading is concerned with learning factual information. As a result, compared with narratives, expository comprehension places a higher demand on situational integration and the reader’s background knowledge28. This underscores the importance of distinguishing between narrative and expository texts in both the assessment of language experience and the investigation of neural processes underlying comprehension.

Recent research has demonstrated the utility of inter-subject representational similarity analysis (IS-RSA) in recovering brain-behavior alignment during naturalistic stimuli exposure such as narratives or movies29,30. This method quantifies the extent to which individuals with similar behavioral profiles, such as reading abilities or cognitive traits, exhibit similar neural response patterns during language comprehension. It thus provides a powerful tool to link individual differences in reading experience to inter-subject neural alignment. In the current study, we investigated the influence of reading experience on neural dynamics during both narrative and expository comprehension. Given the well-established role of exposure quantity in prior research and the potential variability in its impact, we first investigated the influence of print exposure on inter-subject correlation of neural activity, and further distinguished its impact on high- and low-print exposure participants with IS-RSA analyses.

To assess the distribution of participants’ reading experience across various topics, we developed the topicalized Author Recognition Test (tART). Unlike traditional ART, which only measures overall print exposure quantity, the tART also assesses people’s text experience across nine topics/genres, based on the author names they recognize and how many books they have read by those authors. The results thus generate distributional vectors for individuals, representing their topic-specific exposures, which were then used to quantify the relative dissimilarity in reading experience among participants.

Participants also took part in an fMRI task in which they read two narratives and two expository texts. We pre-defined regions of interest (ROIs) corresponding to the language network, the default mode network (DMN), and the frontoparietal control network (FPN), which have been shown to be strongly linked to narrative and expository processing23,25,31,32. Inter-subject representational similarity analysis was then used to identify shared patterns of reading experience (i.e., topic distributions in tART) and neural activity.

Results

Reading experience measured as a topic distribution

We developed the tART to measure participants’ reading experience across various topics, including fiction and non-fiction. To differentiate participants’ reading experience, author names in the tART within each topic spanned a wide range of recognition difficulty, as visualized in Fig. 1a. The maximum exposure quantity score in the tART is 360, with 40 in each of the 9 topics. The participants’ average exposure quantity was 45.844 (SD = 24.975), ranging from 8 to 106. Given the well-established role of exposure quantity in prior research and the potential variability in its impact, we median-split the participants into high-exposure (M = 65.687, SD = 17.606) and low-exposure groups (M = 24.958, SD = 9.398) for later analyses according to their overall exposure scores. We compared print exposure in fiction and non-fiction topics: exposure was averaged separately across the fiction topics and across the non-fiction topics, and dependent t-tests showed that participants had overall higher exposure in fiction topics than in non-fiction topics, for the entire group (N = 39, p < 0.001), the high-exposure group (N = 20, p < 0.01), and the low-exposure group (N = 19, p < 0.05) (Fig. 1b).
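The median split described above can be illustrated with a minimal sketch. The participant labels and scores are hypothetical, and the rule that participants exactly at the median join the high-exposure group is an assumption (it is consistent with the reported group sizes of 20 and 19, but the text does not state it).

```python
from statistics import median

def median_split(exposure):
    """Split participants into high/low groups by overall exposure.

    ASSUMPTION: scores equal to the median are assigned to the high group.
    """
    cutoff = median(exposure.values())
    high = sorted(p for p, s in exposure.items() if s >= cutoff)
    low = sorted(p for p, s in exposure.items() if s < cutoff)
    return high, low

# Hypothetical overall exposure scores for five participants.
exposure = {"s01": 8, "s02": 24, "s03": 45, "s04": 66, "s05": 106}
high, low = median_split(exposure)
```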

Fig. 1: Results of tART.

a The ninety author names (10 in each topic) showed varying recognition rates. The 9 colors correspond to different topics. The topics marked with * are fiction topics. b Participants showed overall higher exposure in fiction topics than in non-fiction topics. c Examples of three participants varying in reading experience across topics assessed with tART. The topics marked with * are fiction topics. LF Literary Fiction, PT Poetry, MS Mystery and Suspense, YA Young Adult, DM Drama, PR Prose, HT History, PH Philosophy, SH Self-Help.

Participants also reported their frequency of reading on the nine topics in a survey. Print exposure measured using tART was significantly correlated with participants’ self-reported exposure, Spearman r = 0.744, p < 0.0001, suggesting that the tART could reliably assess their topic-specific reading experience.

In addition to quantifying overall print exposure, the test allowed us to profile a participant’s reading experience in each topic. In this approach, reading experience was represented as a distribution across multiple topics, which could be visualized using a spider chart for each participant (Fig. 1c). This allowed for a more detailed characterization of one’s reading experience.

Post-reading comprehension check

The post-reading comprehension test consisted of 24 multiple-choice questions, with 6 items for each text. Each item consisted of a related factual question and three options, one of which was the correct answer. The tests were manually scored by the researchers. Participants in the analysis showed acceptable accuracy (M = 0.873, SD = 0.061). Participants’ performance for expository texts (M = 0.917, SD = 0.063) was better (p < 0.001) than that for narrative texts (M = 0.829, SD = 0.098). The comprehension performance did not significantly correlate with print exposure quantity (Spearman rho = −0.109, p = 0.508).

Influence of overall print exposure on neural alignment

We investigated the relationship between print exposure quantity and neural dynamics during narrative and expository reading using the AnnaK model, which would predict that lower print exposure is linked to more idiosyncratic neural response patterns19,30. A neural representational dissimilarity matrix (RDM) and a predicted RDM based on print exposure assessed with the tART were constructed from the dissimilarity between each participant pair, and the two vectorized RDMs were then correlated (Fig. 2a). Specifically, neural dissimilarity was calculated as 1 minus the inter-subject correlation of neural signals. Predicted dissimilarity was defined as the mean print exposure of the participant pair, under the hypothesis that avid readers would show more consistent neural patterns, whereas infrequent readers would exhibit greater variability. Before the two RDMs were correlated, an RDM representing years of education was regressed out from the predicted RDM (see details in Methods). The AnnaK model is a variant of the IS-RSA method, which traditionally focuses only on positive correlations (r). In contrast, the current model is equally sensitive to effects in both directions: negative correlations reflect greater consistency among infrequent readers and greater idiosyncrasy among more frequent readers19. The AnnaK model was built for each of the selected 153 ROIs (Fig. 2b), and ROIs with q < 0.05 after Benjamini-Yekutieli false discovery rate (FDR) correction were reported as significant.
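The pipeline for a single ROI can be sketched as follows. This is an illustrative outline rather than the analysis code used in the study: the toy time series, exposure scores, and education values are hypothetical, the education RDM is assumed to be the pairwise mean of years of education (its exact definition is given in the Methods, not here), and significance testing is omitted. The sign of the resulting coefficient must be read against whatever orientation is chosen for the predicted RDM.

```python
import random
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def residualize(y, x):
    """Residuals of y after least-squares regression on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

def annak_isrsa(timeseries, exposure, education):
    """Correlate the neural RDM with the education-adjusted predicted RDM."""
    pairs = list(combinations(range(len(timeseries)), 2))
    # Neural dissimilarity: 1 minus the inter-subject correlation.
    neural = [1 - pearson(timeseries[i], timeseries[j]) for i, j in pairs]
    # Predicted RDM: mean print exposure of each pair, as stated in the text.
    predicted = [(exposure[i] + exposure[j]) / 2 for i, j in pairs]
    # ASSUMPTION: education RDM taken as the pairwise mean of education years.
    edu = [(education[i] + education[j]) / 2 for i, j in pairs]
    return spearman(neural, residualize(predicted, edu))

# Hypothetical data: 6 participants, 50 time points in one ROI.
random.seed(0)
timeseries = [[random.gauss(0, 1) for _ in range(50)] for _ in range(6)]
exposure = [8, 20, 35, 50, 80, 106]
education = [14, 15, 16, 16, 17, 18]
rho = annak_isrsa(timeseries, exposure, education)
```

In practice this would be repeated for each of the 153 ROIs, with the resulting p-values passed to the FDR correction.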

Fig. 2: The relationship between print exposure quantity and neural dynamics was investigated with the AnnaK model.

a The AnnaK model was built by correlating the vectorized neural and predicted RDMs, with each cell representing the dissimilarity of a participant pair. b The 153 ROIs were selected from the Schaefer-400 atlas. Different colors indicate whether an ROI belongs to the DMN, FPN, or supplemental language network (LAN). c Brain regions where the similarity of neural dynamics was associated with overall print exposure. There were different effects for narrative and expository reading. The color bar indicates Spearman correlation coefficients between predicted and neural RDMs. ROIs with FDR-corrected q < 0.05 were reported as significant.

The results showed that print exposure quantity was differentially associated with inter-subject correlation during narrative and expository reading. During narrative reading, the dissimilarity in exposure positively predicted variability in neural dynamics in the right medial frontal gyrus (MFG), anterior regions of bilateral middle temporal gyrus (aMTG), as well as the right posterior middle temporal gyrus (pMTG). However, during expository reading, print exposure showed negative effects in the left inferior frontal gyrus (IFG) and inferior parietal lobule (IPL), as well as posterior regions of the right inferior temporal gyrus (ITG). These results were consistent with previous findings that different types of discourse-level comprehension entail diverse neural demands.

Shared representational patterns during text reading

We then conducted IS-RSA to investigate the shared patterns between reading experience (i.e., topic distribution) and neural dynamics during text reading, separately for narrative and expository texts. The neural RDM was computed in the same way as in the previous analysis of print exposure, but the behavioral RDM was calculated as the Kullback-Leibler (KL) divergence between the distributional vectors representing participants’ reading experience across the topics (Fig. 3a). In addition, the predicted RDM, representing dissimilarity in exposure quantity, and the participants’ education RDM were regressed out from the vectorized behavioral RDM (see details in Methods). This focuses the analysis on differences in the distribution of reading experience and mitigates the influence of overall print exposure quantity.
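As an illustration of how a behavioral RDM could be derived from topic profiles, the sketch below normalizes each participant's topic scores into a probability distribution and computes a divergence for each pair. Two choices here are assumptions not stated in the text: a small smoothing constant `eps` guards against zero scores, and the divergence is symmetrized (averaging both directions) because KL divergence is asymmetric while an RDM needs one value per pair. The three-topic profiles are hypothetical.

```python
from itertools import combinations
from math import log

def to_distribution(topic_scores, eps=1e-6):
    """Normalize topic scores to sum to 1 (eps smoothing is an ASSUMPTION)."""
    smoothed = [s + eps for s in topic_scores]
    total = sum(smoothed)
    return [s / total for s in smoothed]

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))

def symmetric_kl(p, q):
    """ASSUMPTION: average of both KL directions to get a symmetric value."""
    return 0.5 * (kl(p, q) + kl(q, p))

def behavioral_rdm(profiles):
    """Pairwise divergence between participants' topic distributions."""
    dists = [to_distribution(p) for p in profiles]
    return {(i, j): symmetric_kl(dists[i], dists[j])
            for i, j in combinations(range(len(dists)), 2)}

# Hypothetical three-topic profiles for three participants.
profiles = [[10, 5, 0], [10, 5, 0], [0, 5, 10]]
rdm = behavioral_rdm(profiles)
```

Participants 0 and 1 have identical profiles, so their divergence is zero, while participants 0 and 2 concentrate on opposite topics and receive a large positive value.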

Fig. 3: Neural dynamics during text reading was associated with reading experience represented as topic distributions in tART.

a IS-RSA was performed by correlating the vectorized neural and behavioral RDMs, with each cell representing the dissimilarity of a participant pair. b Brain regions where the neural dynamics of high-exposure readers during narrative reading were significantly correlated with their reading topic distributions. The color bar indicates Spearman correlation coefficients between the behavioral and neural RDMs. ROIs with FDR-corrected q < 0.05 were reported as significant.

IS-RSA was performed separately for the high- and the low-exposure groups. To test the possibility that fiction reading experience is better at explaining the neural dynamics for narrative reading, and non-fiction reading experience is better at explaining the neural dynamics for expository reading, we represented reading experience as distributions over all 9 topics, over the 5 fiction topics in the tART (i.e., literary fiction, mystery and suspense, young adult, drama, and poetry), and over the 4 non-fiction topics (i.e., prose, history, philosophy, and self-help). FDR corrections were applied across the 153 ROIs for each group of analyses. The results are summarized in Table 1.
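For reference, the Benjamini-Yekutieli adjustment applied across ROIs can be sketched in a few lines. This is a generic implementation of the standard procedure, not the study's own code, and the four p-values are hypothetical stand-ins for the 153 per-ROI p-values.

```python
def benjamini_yekutieli(pvalues):
    """Return BY-adjusted q-values in the original order of `pvalues`."""
    m = len(pvalues)
    # Harmonic correction factor c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Step from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        qvalues[idx] = running_min
    return qvalues

# Hypothetical p-values for four ROIs.
pvalues = [0.001, 0.5, 0.02, 0.8]
qvalues = benjamini_yekutieli(pvalues)
significant = [q < 0.05 for q in qvalues]
```

With these toy values only the first ROI survives correction, which shows how much more conservative the adjusted threshold is than an uncorrected p < 0.05.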

Table 1 Summary of IS-RSA results

For narrative reading, significant effects emerged only in the high-exposure readers, both when their reading experience was represented as a 9-topic and a 5-topic distribution. Specifically, the dissimilarity of reading experience represented as the 9 topic distributions was positively related to neural dissimilarity in the bilateral precuneus and PCC, middle frontal gyrus (MFG), as well as the left IPL. The 5-topic fiction reading experience captured the neural variability in more limited brain regions, including right MFG, bilateral precuneus, and PCC (Fig. 3b). No significant results were found for the low-exposure group, suggesting that the distributed reading profiles have little influence on narrative processing for low-exposure readers. During expository reading, no significant effects were found, regardless of whether reading experience was represented by all 9 topics, the 5 fiction topic dimensions, or the 4 non-fiction topic dimensions.

Discussion

In this study, we investigated the association between past reading experience and neural dynamics during narrative and expository reading. Using a novel tART, we quantified participants’ reading profiles across nine topics and then investigated their associations with inter-subject correlation in neural responses. We found that exposure quantity was differentially associated with neural inter-subject similarity during narrative and expository reading. More importantly, the results from IS-RSA revealed that the inter-subject neural similarity is related to shared reading experience, and the effects were modulated by overall print exposure. These findings suggest that (1) increased print exposure enhanced neural alignment during narrative reading; (2) avid readers show higher individuality in expository comprehension; (3) distributed reading experience showed shared patterns with neural dynamics in subregions of the default mode network (DMN) during narrative reading in high-exposure readers.

The brain regions involved in narrative and expository reading were largely distinct and differentially associated with print exposure. During narrative reading, individuals with greater print exposure exhibited higher neural similarity in bilateral MTG (Fig. 2c), encompassing areas implicated in lower-level semantic processing and social-cognitive processing27,33,34,35. The results indicate that increased reading experience enhances neural alignment among individuals during narrative processing. A plausible explanation is that reading activities, particularly engagement with fictional narratives, strengthen alignment in social-cognitive processes, which are supported by regions within the DMN13,26,27. In contrast, during expository reading, print exposure showed primarily negative associations with neural similarity, indicating that readers with high exposure tend to comprehend expository texts more idiosyncratically. The negative effects emerged mainly in the FPN, especially in the posterior regions of the right ITG and the left IPL (Fig. 2c), which have been widely implicated in controlled semantic retrieval, executive processing and top-down regulation and attention36,37,38. Unlike previous studies linking neural ISC to reading ability measures, where higher word and sub-word level reading ability was associated with greater inter-subject similarity19, our findings revealed a more complex pattern. As experience grows, participants might actually preserve individuality in their neural activity. Readers with high levels of print exposure may engage in more diverse or individualized cognitive strategies when comprehending expository texts. This is potentially because high-exposure readers may leverage flexible, reader-specific pathways for textual integration along with factual information, leading to increased variability in FPN activation patterns. In contrast, low-exposure readers may rely more on uniform, surface-level strategies, resulting in more convergent neural responses when processing expository texts. This resonates with earlier findings in which increases in print exposure were linked to individual differences in reading comprehension patterns39,40.

Although a handful of prior studies have tried to profile the reading experience of skilled language users, the focus has generally been on the amount of print exposure3,9,41,42. In contrast, our novel tART aimed to chart not just the volume but also the thematic breadth of one’s reading profile. We found that reading experience, represented as a vector of topic distributions and controlling for overall print exposure, was associated with neural response patterns. However, the results differed between high- and low-exposure readers, and they varied depending on whether reading experience was represented using all topics in the tART or just the fiction or non-fiction topics. Only for narrative text processing were shared neural patterns associated with shared reading experience, and only among high-exposure readers. This may suggest that high-exposure readers are more strongly shaped by their print experience, which is consistent with Stoops & Montag7. Notably, this association remained robust whether reading experience was represented as a 5-fiction-topic distribution or by all 9 topics in the tART (Fig. 3b). The most robust brain areas that showed shared patterns with reading experience were the precuneus and PCC within the DMN. They have been implicated in higher-order semantic processing43, autobiographical memory retrieval44,45, and mentalization46,47. The shared patterns between reading experience and neural dynamics offer neurobiological support for the interplay between experience and cognition highlighted in prior behavioral research13,14. They additionally highlight the role of the precuneus as a convergence zone integrating prior experiences with ongoing semantic construction during reading.

While previous studies have emphasized increased neural alignment driven by reading ability19 or level of engagement during reading48, our findings establish a direct connection between past reading experience and current neural dynamics in processing linguistic materials. Reading experience, particularly when represented as distributional topics with overall print exposure quantity factored out, functions as an “experiential” index that does not entail evaluative judgments of good or bad. Recent research has highlighted the idiosyncratic nature of subjective experience, showing that both past experiences and anticipated future moments shape one’s present experience49. This view is supported by findings that consensus-building conversations can induce neural alignment in both sensory and integrative brain regions20. Extending this line of evidence, our IS-RSA results revealed a link between neural alignment and shared past experiences, even among participants who had never met. Notably, this effect emerged during narrative reading rather than expository reading, suggesting that neural alignment is driven by subjective interpretation of texts rather than by objective comprehension of facts.

There are also several possible explanations for the absence of shared patterns between reading experience and neural dynamics during expository reading. First, because we aimed to assess reading experience only within participants’ native language, the tART covered a limited range of topics and omitted certain expository categories such as science, technology, and some social sciences. This may have introduced a bias toward literary and humanities-related author knowledge, thereby failing to capture aspects of reading experience that reflect readers’ processing of expository texts. Nevertheless, we observed a significant effect of reading exposure on inter-subject neural alignment during expository reading (Fig. 2c), suggesting that the neural dynamics during expository text processing can, at least partially, be reflected by non-expository reading experience. Second, the neural dynamics associated with expository reading may also be captured by other forms of experience. For example, prior work has shown that comprehending expository texts is strongly influenced by reader characteristics, such as prior knowledge28,50. As expository reading recruits different brain areas from narrative reading23,25, it might be a complex cognitive activity involving strategy use and executive functions51,52.

Caution should be exercised when generalizing the results of this study. The absence of additional behavioral measures on language, cognition, and personal interests constrains the extent to which the intrinsic nature of the neural results can be interpreted. Although we established links between past reading experience and situational text reading, many other factors, such as IQ, socioeconomic status, personal interests, and cognitive traits, may have modulated this association, limiting the causal interpretation of our results. The potential interplay between the other factors and reading experience presents a “chicken and egg” dilemma. For example, one might be interested in reading about astronomy because of a pre-existing inclination, but it could also be that reading about astronomy fosters that interest. Further, there is ample evidence for the Matthew effect in reading: Better readers read more, which in turn improves their reading skills53. Disentangling the influences of these factors requires further investigation. The results of post-reading comprehension checks indicated that participants performed better on expository texts than narrative texts; however, all test questions were designed to focus on factual information, biasing towards expository reading. Better-designed and more comprehensive assessments, combined with standardized measures of comprehension, may provide a clearer interpretation of the results. In addition, existing ART-based measures of reading experience, including our adaptation, cannot fully account for individual variability in attention to or memory for author names, highlighting the need for more refined approaches to capturing past language and reading experience. Finally, due to the limited coverage of the current tART, the findings may not generalize to readers with non-humanities backgrounds or those unfamiliar with canonical texts. 
Moreover, as the tART primarily assesses the quantitative aspect of reading, namely, the amount of exposure, rather than qualitative dimensions such as depth of engagement, more comprehensive measures of reading behavior would be valuable for better explaining the neural patterns observed during reading activities.

Despite the limitations, our findings offer valuable insights for learning and education. While previous studies have established the effects of language experience, particularly print exposure, on the development of language and reading abilities3,5,41,54, they often overlook potential variability in the content of individual reading experience and in the interpretations of the same reading materials. While our world experiences are constantly reshaped into memory21, this process is also influenced by individual states55, leading to variation in how readers process the same text. Our findings specifically link topic distributions in reading experience to activation patterns in the precuneus, a core region of the DMN that integrates past experiences with ongoing processes of “sense-making”56. This neurocognitive bridge between accumulated experience and real-time language comprehension thus has implications for educational strategies, particularly in designing reading interventions that take into account the quantity and diversity of reading materials. In learning contexts, customizing reading experiences to align with individual exposure levels may enhance cognitive engagement, improve comprehension efficiency, and foster critical thinking and empathy development.

In summary, this study underscores the link between reading experience and neural dynamics during reading comprehension. Print exposure has distinct effects on narrative and expository reading, likely reflecting their differential neural demands. Reading experience across genres and topics was associated with neural activity patterns, with greater similarity in reading experience linked to more similar neural responses in the DMN regions for avid readers in narrative reading. These findings indicate that both the quantity of print exposure and the content of reading experience reflect shared neural representations during reading. Future research extending these results to a broader range of texts and incorporating additional linguistic and cognitive variables may yield deeper insights into the interplay between language experience and text processing, with meaningful implications for education and learning.

Methods

Participants

Forty-four healthy adult participants (32 females, 12 males) took part in this study. The participants were college students from the same university, aged between 18 and 24 years (M = 22.18, SD = 1.63). All were native Chinese speakers, right-handed, and had normal or corrected-to-normal vision. Three participants were excluded because they exhibited excessive head motion, with a maximum framewise displacement (FD) value greater than 2 mm. One participant was excluded due to poor performance on the comprehension test (accuracy < 0.667). One participant who reported little reading experience was also excluded. Data from 39 participants (27 females and 12 males) were included in the final analysis. All included participants had a mean FD value below 0.25 mm (group M = 0.131, group SD = 0.031). All participants provided informed written consent before participation.

Reading experience assessment

The participants’ reading experience was measured using the topicalized Author Recognition Test (tART) we developed, which included 135 author names spanning 9 topics/genres. Each topic comprised 10 real author names and 5 foils. In the test, for each name, the participants were asked: Is this the name of an author that you know or whose works you have read? They selected one of five options for each name: (1) I do not recognize the name and have not read his/her works; (2) I recognize this author but have not read his/her works; (3) I have read one book by this author; (4) I have read two books by this author; (5) I have read three or more books by this author. These options were converted to scores of 0, 1, 2, 3, and 4, respectively.
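The scoring scheme above can be expressed as a short sketch. The author labels, topic codes, and responses are hypothetical placeholders, and foils are simply skipped here because their handling is a separate correction step. With 10 real authors per topic and a maximum of 4 points per author, each topic score tops out at 40 and the total at 360.

```python
from collections import defaultdict

# Scores for the five response options, as listed in the text.
OPTION_SCORES = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}

def score_tart(responses, topic_of):
    """Sum option scores per topic over real author names only.

    responses: dict mapping name -> chosen option (1-5)
    topic_of:  dict mapping each real author name -> topic label;
               foils are absent from this mapping and are skipped.
    """
    topic_scores = defaultdict(int)
    for name, option in responses.items():
        if name in topic_of:
            topic_scores[topic_of[name]] += OPTION_SCORES[option]
    return dict(topic_scores)

# Hypothetical toy test: two topics, two real authors each, one foil.
topic_of = {"A1": "LF", "A2": "LF", "B1": "HT", "B2": "HT"}
responses = {"A1": 5, "A2": 2, "B1": 1, "B2": 3, "foil1": 1}
topic_scores = score_tart(responses, topic_of)
total_exposure = sum(topic_scores.values())
```

The per-topic scores form the distributional vector for a participant, and their sum gives the overall exposure quantity.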

The 9 topics in the tART were: literary fiction, poetry, mystery and suspense, young adult, drama, prose, history, philosophy, and self-help. These topics were chosen based on categories commonly found on popular online book stores and digital reading apps in mainland China, including Jingdong Books (https://book.jd.com/), Dangdang (https://www.dangdang.com/), and Weixin (https://weread.qq.com/). To assess readers’ experience with specific topics, we focused on genres that had at least 10 authors who specialized in that domain and who had each published at least three books in the respective genre. Additionally, to limit the assessment to the native language, we excluded foreign authors whose works had been translated into Chinese. While translated works are prevalent, they tend to dominate specific genres, potentially biasing the results. For example, the most popular science popularization books in China are translations, whereas most young adult novels are written by native authors. This imbalance might introduce confounds. Therefore, some typical genres, such as science, technology, and some social sciences, were not included. Based on the online book stores and reading apps, we first selected authors who had published at least three books within each topic. To further improve the efficiency of the test items, we removed authors who were either widely known or known to only a few readers (i.e., low discriminability), as determined in a pilot experiment.

In the pilot test, we recruited 32 subjects (22 females, age M = 27.5, SD = 3.90, range = 22–35) who did not participate in the main experiment. The pilot aimed to identify real author names within each topic that spanned a wide range of recognition difficulty, so that the test could differentiate subjects’ reading experience. In the pilot, participants were only asked to indicate whether they recognized each name as an author or not. Based on their responses, we selected 10 author names for each topic with recognition rates ranging from 6.3% to 87.5% (M = 30%, SD = 20%). The foils for each topic were constructed to resemble real author names within that topic and were verified to ensure they did not correspond to any actual authors. The actual recognition rates of author names in this study are reported in Fig. 1a and Supplementary Table 1. Topic-specific foil recognition rates are shown in Supplementary Fig. 1.

To address potential overclaiming in self-report measures, participants’ reported print exposure was adjusted by multiplying it by a scaling factor derived from their overall false-alarm rate for all foils. This method is analogous to the correction used in standard ARTs, where the number of falsely recognized authors is subtracted from the number of correctly recognized authors. In the present study, however, a single scaling factor was applied to each participant to consistently correct both overall and topic-specific exposure.
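
The scoring and correction described above can be sketched as follows. This is only an illustration: the exact form of the scaling factor is not specified here, so the sketch assumes it equals 1 minus the overall foil false-alarm rate, that any non-zero response to a foil counts as a false alarm, and that topic exposure is the sum of item scores. The function name `score_tart` and the input layout are hypothetical.

```python
def score_tart(real_scores, foil_scores):
    """Score one participant's tART responses (illustrative sketch).

    real_scores: dict mapping topic -> list of item scores (0-4) for real authors.
    foil_scores: dict mapping topic -> list of item scores (0-4) for foils.
    Returns (topic_exposure, overall_exposure, scaling_factor).
    """
    all_foils = [s for scores in foil_scores.values() for s in scores]
    # Assumption: any response above 0 on a foil is a false alarm.
    false_alarm_rate = sum(s > 0 for s in all_foils) / len(all_foils)
    # Assumption: the scaling factor is 1 minus the overall false-alarm rate.
    scale = 1.0 - false_alarm_rate
    # Assumption: raw topic exposure is the sum of item scores in that topic.
    # The same factor is applied to every topic of this participant.
    topic_exposure = {t: scale * sum(scores) for t, scores in real_scores.items()}
    overall_exposure = sum(topic_exposure.values())
    return topic_exposure, overall_exposure, scale
```

Because one factor is shared by all topics, the correction shrinks a participant's overall exposure without changing the relative shape of their topic profile.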

To test the reliability of the exposure measured by the tART, we asked participants to complete a survey in which they directly reported the approximate time they spent reading each topic. They were asked how often they had read each topic over the last three years, and five options were provided for each topic: Never; About 1 hour per month; About 1 hour per week; About 1 hour per day; More than 1 hour per day. The options were then converted to scores from 0 to 4. The overall exposure, quantified based on self-reported reading time, was correlated with that estimated by the tART using Spearman’s rank correlation.

Post-reading comprehension check

Before the MRI task, participants were told that they would complete a comprehension test after scanning, so they needed to read the texts attentively. Comprehension for each text was assessed using multiple-choice questions, each comprising a factual question about the text and three options, one of which was the correct answer. Sample questions are provided in Supplementary Note 1. Six questions were created for each text (24 in total), and participants’ responses were manually scored by the researchers. Data from participants with poor performance (fewer than 16 correct answers, i.e., an accuracy rate below 0.667) were excluded from further analysis. Differences in comprehension performance between narrative and expository texts were tested using a paired-samples t-test. The correlation between comprehension performance and the amount of reading exposure was examined with Spearman’s rank correlation, as the relationship between reading and comprehension may not be linear.

Reading materials

Participants read four texts in Chinese in the task, including two narratives and two expository texts. We analyzed the narrative and expository texts separately, as the two text types have different neural bases23,24. One of the narrative texts is a description of a dream (length = 577 characters) and the other is a fairy tale (length = 512 characters). One of the expository texts is about the planet Mars (length = 626 characters) and the other is about biomimetics (length = 577 characters). The narrative and expository texts were matched in character-based sentence length (expository: M = 19.603, SD = 0.610; narrative: M = 27.611, SD = 0.864) and mean parse tree depth (expository: M = 9.276, SD = 0.146; narrative: M = 10.707, SD = 0.213).

MRI procedure

The reading task was divided into two runs, each lasting approximately 4.5 minutes. Each run included an expository text and a narrative text presented in pseudo-random order. The texts were divided into phrases (number of characters: M = 5.29, SD = 1.83, range: 2–10). Quasi rapid serial visual presentation (RSVP) was used, with each phrase presented on a separate screen. The duration of each screen was determined by the type and number of characters and punctuation marks in the phrase: 170 ms for each Chinese character, 340 ms for commas and other within-sentence punctuation marks, and 890 ms for periods, question marks, and exclamation marks. These durations were chosen to approximate natural reading speed57. A blank screen of 10 s was shown at the end of each text. Before the experiment, participants were instructed to read each text attentively and were informed that they would take a reading comprehension test at the end. Both the comprehension test and the tART were administered after scanning.
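
As a rough illustration of the timing rule, the sketch below computes the duration of one screen. It assumes that per-symbol durations simply add up within a phrase, that the listed full-width marks are the only punctuation in the texts, and that every other symbol counts as a Chinese character. The constant and function names are hypothetical.

```python
CHAR_MS = 170          # each Chinese character
MINOR_PUNCT_MS = 340   # commas and similar marks
SENTENCE_END_MS = 890  # periods, question marks, exclamation marks

SENTENCE_END = set("。？！")
# Assumption: which marks fall in the 340 ms class besides the comma.
MINOR_PUNCT = set("，、；：")


def screen_duration_ms(phrase):
    """Return the presentation time of one phrase in milliseconds."""
    total = 0
    for symbol in phrase:
        if symbol in SENTENCE_END:
            total += SENTENCE_END_MS
        elif symbol in MINOR_PUNCT:
            total += MINOR_PUNCT_MS
        else:
            # Assumption: anything else is treated as a Chinese character.
            total += CHAR_MS
    return total
```

Under these assumptions, a two-character phrase followed by a comma would stay on screen for 2 × 170 + 340 = 680 ms.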

MRI data acquisition

Structural and functional MRI data were acquired at East China Normal University on a 3T Siemens Prisma scanner (Siemens, Erlangen, Germany), using a 64-channel head coil. T1-weighted structural images were collected with a 3D MPRAGE sequence with the following parameters: voxel size 1 × 1 × 1 mm, matrix size 224 × 224, TR = 2300 ms, TE = 2.25 ms, FA = 8°. Functional images were acquired with an interleaved multiband echo planar imaging (EPI) sequence with the following parameters: voxel size 2 × 2 × 2 mm, matrix size 96 × 96, TR = 1000 ms, TE = 32 ms, FA = 55°, multiband factor = 6, with 72 axial slices.

Data preprocessing and cleaning

Primary preprocessing of the MRI data was performed using fMRIPrep 20.2.0, which performs standard anatomical and functional preprocessing steps, including motion correction, co-registration, spatial normalization, and confound estimation. A 4-mm full-width-at-half-maximum Gaussian kernel was applied for spatial smoothing. To remove movement-related artifacts58, voxel-wise nuisance regression was performed separately for each participant on the BOLD time series, using 24 motion regressors (six motion parameters, their temporal derivatives, and the squared terms of each).
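
To make the composition of the 24 motion regressors concrete, the sketch below expands six realignment parameters per volume into the full set (6 parameters + 6 derivatives + 12 squared terms). It is an illustration only: the derivative is assumed to be a backward difference with the first volume set to 0, and the function name `expand_motion_24` is hypothetical. The regression itself is not shown.

```python
def expand_motion_24(motion):
    """Expand six motion parameters per volume into 24 nuisance regressors.

    motion: list of rows, one per volume, each holding 6 realignment parameters.
    Returns rows of 24 values ordered as:
    [6 parameters, 6 temporal derivatives, 6 squared parameters,
     6 squared derivatives].
    """
    expanded = []
    previous = None
    for row in motion:
        if len(row) != 6:
            raise ValueError("each volume needs exactly 6 motion parameters")
        # Assumption: backward difference, with 0 for the first volume.
        if previous is None:
            derivative = [0.0] * 6
        else:
            derivative = [cur - prev for cur, prev in zip(row, previous)]
        base = list(row) + derivative
        expanded.append(base + [value * value for value in base])
        previous = row
    return expanded
```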

fMRI ROI selection

All analyses involving neural data were conducted within 153 ROIs selected from the 400 ROIs in the Schaefer atlas. We marked ROIs labeled with strings containing ‘default’ (corresponding to the default mode network) as DMN, and ROIs labeled with strings containing ‘Cont’ (corresponding to the frontoparietal control network) as FPN. Additionally, language network ROIs were derived from the probabilistic functional atlas of a validated language localizer based on data from 806 individuals59. We defined language-related voxels as those with a probability greater than 0.35. Any ROI from the Schaefer atlas containing more than 25% language-related voxels was classified as a language ROI, and it was marked as LAN if it was included in neither the DMN nor the FPN. A total of 153 ROIs were selected, including 91 DMN ROIs, 52 FPN ROIs, and 10 language-only ROIs (Fig. 2b).
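
The selection rule can be written as a small decision function. This is a sketch under assumptions: the label matching for the default mode network is done case-insensitively, the fraction of language-related voxels per ROI is taken as already computed, and the function name `classify_roi` and the labels in the usage note are illustrative rather than taken from the analysis code.

```python
def classify_roi(label, language_fraction, threshold=0.25):
    """Assign an atlas ROI to DMN, FPN, LAN, or None (not selected).

    label: the ROI's atlas label string.
    language_fraction: share of the ROI's voxels whose language-atlas
        probability exceeds 0.35 (computed elsewhere).
    """
    if "default" in label.lower():
        return "DMN"
    if "Cont" in label:
        return "FPN"
    # Language ROIs are marked LAN only when they belong to neither network.
    if language_fraction > threshold:
        return "LAN"
    return None
```

For example, `classify_roi("7Networks_LH_SomMot_3", 0.30)` would return `"LAN"`, whereas an ROI with a `Default` label is kept as DMN regardless of its language fraction.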

Inter-subject correlation

We calculated pairwise inter-subject correlations (ISCs) for each of the selected ROIs. All ROIs were further limited to voxels that overlapped with the brain masks of all 39 participants. For each participant, brain responses were averaged across the voxels within each ROI. For each ROI and each pair of participants, the ISC of their neural dynamics was calculated for the four texts separately. The mean of the two narrative ISCs was taken as the neural similarity for narratives, and the mean of the two expository ISCs as the neural similarity for expository texts.
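
A minimal sketch of the pairwise ISC computation for a single ROI is given below. It assumes Pearson correlation between ROI-averaged time series and a plain dictionary layout for the data. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def pairwise_isc(roi_timeseries):
    """ISC for every participant pair in one ROI and one text.

    roi_timeseries: dict mapping participant id -> ROI-averaged BOLD series.
    Returns a dict mapping (id_a, id_b) -> correlation.
    """
    ids = sorted(roi_timeseries)
    return {
        (a, b): pearson(roi_timeseries[a], roi_timeseries[b])
        for a, b in combinations(ids, 2)
    }


def mean_isc(isc_text_1, isc_text_2):
    """Average the pairwise ISCs of two texts of the same type."""
    return {pair: (isc_text_1[pair] + isc_text_2[pair]) / 2 for pair in isc_text_1}
```

With 39 participants, `pairwise_isc` yields 39 × 38 / 2 = 741 pairs per ROI and text.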

Reading experience quantification

As illustrated in Fig. 1, participants’ reading experience was represented as a vector of exposure scores across the 9 topics in the tART. The overall print exposure quantity was calculated as the sum of exposure in all topics. We also compared print exposure between fiction and non-fiction topics: exposure was averaged across the fiction topics and across the non-fiction topics, and paired-samples t-tests were conducted with the scipy package in Python.

To control for exposure quantity when representing one’s reading experience as a topic distribution, we first applied the SoftMax function to transform the vector into a probability distribution. This transformation amplifies differences across topics while ensuring that the sum of all dimensions equals 1. We used KL divergence, a measure of the difference between two distributions, to quantify the dissimilarity in participants’ reading experiences. KL divergence is computed using Eq. (1):

$${D}_{\rm{KL}}(P\,||\,Q)=\mathop{\sum }\limits_{t\in T}{P}_{t}\,\log \frac{{P}_{t}}{{Q}_{t}}$$
(1)

where P and Q denote the topic distributions of two participants’ reading experience, and T represents the set of topics. As KL divergence is not symmetric, i.e., D_KL(P || Q) differs from D_KL(Q || P), the dissimilarity between topic distributions P and Q was calculated as the mean of D_KL(P || Q) and D_KL(Q || P).
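
The softmax transformation and the symmetrized KL divergence can be sketched in a few lines. The sketch uses the standard, non-negative definition of KL divergence (the sum over topics of P_t log(P_t / Q_t)) with the natural logarithm, which is an assumption about the log base. The function names are hypothetical.

```python
from math import exp, log


def softmax(scores):
    """Turn a vector of topic exposure scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """D_KL(P || Q) for two distributions with strictly positive entries."""
    return sum(p_t * log(p_t / q_t) for p_t, q_t in zip(p, q))


def reading_dissimilarity(scores_a, scores_b):
    """Mean of D_KL(P || Q) and D_KL(Q || P) after softmax."""
    p, q = softmax(scores_a), softmax(scores_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

Because softmax depends only on differences between topic scores, adding the same constant to every topic leaves the distribution unchanged, so two readers with the same profile shape but different overall levels receive a dissimilarity of 0.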

Association between print exposure and neural ISC

To examine whether low print exposure would be associated with more idiosyncratic neural responses, the AnnaK model, introduced in prior work19,30, was used as an analytical framework. According to this model, participants with lower exposure would exhibit higher variability in neural response patterns, echoing the famous line from the novel Anna Karenina, “every unhappy family is unhappy in its own way”. Specifically, AnnaK scores were computed by correlating the vectorized neural representational dissimilarity matrix (RDM) with the vectorized predicted RDM. A cell in the neural RDM represents 1 minus the ISC of a participant pair. For the predicted RDM, the behavior score (i.e., print exposure) was converted to ranked data, with the lowest score as 0 and the highest score as 1; the predicted similarity for each participant pair was calculated as the mean of the two participants’ ranked scores. To control for the potential influence of education, we used age as a proxy, given that all participants were college students and age in this cohort is strongly associated with years of education. An education RDM was regressed out from the vectorized predicted RDMs using linear models implemented in the Python package statsmodels.
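
The construction of the predicted matrix can be sketched as follows. This is an illustration under assumptions: tied scores share their average rank, ranks are rescaled linearly to [0, 1], and the step that regresses out the education RDM is omitted. The function names are hypothetical.

```python
from itertools import combinations


def rank_normalize(scores):
    """Convert scores to ranks rescaled to [0, 1] (lowest = 0, highest = 1)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2  # assumption: ties share the average rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n - 1) for r in ranks]


def annak_predicted(scores):
    """Vectorized AnnaK prediction: mean normalized rank of each pair.

    Pairs follow itertools.combinations order over participant indices.
    """
    ranked = rank_normalize(scores)
    return [(ranked[i] + ranked[j]) / 2 for i, j in combinations(range(len(ranked)), 2)]
```

Under this scheme, a pair of two high-exposure readers receives a value near 1, while any pair that includes a low-exposure reader receives a lower value, which is what encodes the expectation that low-exposure readers are dissimilar both to high-exposure readers and to each other.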

The AnnaK model was built for each of the 153 selected ROIs, separately for data collected during narrative reading and expository reading. ROIs with FDR-corrected q < 0.05 were reported as significant.
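
For reference, a correction across ROIs could look like the sketch below. The specific FDR procedure is not named in the text; the sketch assumes the Benjamini-Hochberg procedure, and the function name `fdr_bh` is hypothetical.

```python
def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumption: the FDR correction is the Benjamini-Hochberg procedure.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values
```

An ROI would then be reported as significant when its q-value falls below 0.05, with `p_values` holding one p-value per ROI (153 in this analysis).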

Shared patterns between reading experience and neural responses

We performed inter-subject representational similarity analysis (IS-RSA) to characterize the shared patterns between distributional reading experience and neural dynamics. A behavioral RDM was created by computing the symmetrized KL divergence between each pair of participants’ topic probability distributions in the tART (for the 9-topic, 5-topic, and 4-topic representations separately). To mitigate the influence of overall print exposure and education, the corresponding exposure RDM (i.e., the predicted RDM in the AnnaK model) and the education RDM were regressed out from the vectorized behavioral RDM using linear models in the Python package statsmodels.

Both the neural and behavioral RDMs were vectorized. The behavioral RDM was correlated with the neural RDM for each ROI using Spearman’s rank correlation (Fig. 3a). IS-RSA was performed separately for the high- and low-exposure groups, for the narrative and expository texts, and for the 9-, 5-, and 4-topic reading distributions, which represent different characterizations of reading experience. All p-values were corrected for multiple comparisons across the 153 ROIs using the FDR method, with q < 0.05 indicating statistical significance.
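
The core IS-RSA step for one ROI can be sketched as below: both RDMs are reduced to their upper triangles and compared with Spearman's rank correlation. The sketch assumes square, symmetric RDMs given as nested lists, uses average ranks for ties, and leaves out the regression of covariate RDMs and the significance test. The function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Ranks of the values, with tied values sharing their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def upper_triangle(rdm):
    """Vectorize the cells above the diagonal of a square matrix."""
    n = len(rdm)
    return [rdm[i][j] for i in range(n) for j in range(i + 1, n)]


def is_rsa(behavioral_rdm, neural_rdm):
    """Spearman correlation between the vectorized behavioral and neural RDMs."""
    behavior = average_ranks(upper_triangle(behavioral_rdm))
    neural = average_ranks(upper_triangle(neural_rdm))
    return pearson(behavior, neural)
```

A positive value indicates that participant pairs with more dissimilar topic distributions also show more dissimilar neural responses in that ROI.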