Transcriptional landscape of the cell cycle in a model thermoacidophilic archaeon reveals similarities to eukaryotes

Gomez-Raya-Vilanova, Miguel V.; Teulière, Jérôme; Medvedeva, Sofia; Dai, Yuping; Corel, Eduardo; Lopez, Philippe; Lapointe, François-Joseph; Bhattacharya, Debashish; Haraoui, Louis-Patrick; Turc, Elodie; Monot, Marc; Cvirkaite-Krupovic, Virginija; Bapteste, Eric; Krupovic, Mart

doi:10.1038/s41467-025-60613-8

Published: 01 July 2025

Nature Communications volume 16, Article number: 5697 (2025)

Abstract

Similar to many eukaryotes, the thermoacidophilic archaeon Saccharolobus islandicus follows a defined cell cycle program, with two growth phases, G1 and G2, interspersed by a chromosome replication phase (S), and followed by genome segregation and cytokinesis (M-D) phases. To study whether and which other processes are cell cycle-coordinated, we synchronized cultures of S. islandicus and performed an in-depth transcriptomic analysis of samples enriched in cells undergoing the M-G1, S, and G2 phases, providing a holistic view of the S. islandicus cell cycle. We show that diverse metabolic pathways, protein synthesis, cell motility and even antiviral defense systems are expressed in a cell cycle-dependent fashion. Moreover, application of a transcriptome deconvolution method defined sets of phase-specific signature genes, whose peaks of expression roughly matched those of yeast homologs. Collectively, our data elucidate the complexity of the S. islandicus cell cycle, suggesting that it more closely resembles the cell cycle of certain eukaryotes than previously appreciated.

Introduction

The life of a cell unfolds through a series of intricately coordinated events that culminate in the production of two daughter cells. Faithful execution of this program, known as the cell cycle, ensures the perpetuation of cellular life. The core processes essential for cell cycle progression, such as accumulation of biomass, genome replication and cytokinesis, are common to all organisms^1,2,3, but their coordination and the underlying molecular mechanisms exhibit remarkable diversity. For instance, eukaryotes encode a diverse set of cyclins¹, the molecular regulators of the cell cycle, for which no homologs exist in archaea and bacteria^2,3. Understanding the interplay between different cellular processes and the evolution of these relationships in different cellular domains is of fundamental interest and practical importance, because it offers insights into how both unicellular and multicellular life forms have evolved on our planet.

Bacteria and eukaryotes rely on distinct genome replication and cell division machineries, with bacteria using the FtsZ-based system for division⁴ and eukaryotes employing the ESCRT (endosomal sorting complexes required for transport) machinery for membrane abscission during cytokinesis^5,6. In most model eukaryotes in which cell cycle has been studied, genome replication and cytokinesis are separated in time with the cell cycle being divided into four phases^1,7,8: (i) during the first gap (G1) phase the cell grows and prepares for genome replication; (ii) the synthesis of genomic DNA takes place during the S phase; (iii) the second gap (G2) phase is a period of rapid cell growth and protein synthesis; and, finally, (iv) during the mitosis (M) phase the sister chromatids are segregated and the cell is divided in two. Progression through the cell cycle phases is tightly controlled at several checkpoints^1,7,8, with errors at any of the checkpoints leading to devastating consequences at both cellular and organismal levels, including cell death, cancers and various other pathologies^9,10,11.

In bacteria, the cell cycle is traditionally divided into three periods^2,12: (i) the birth (B) period defined as the time between cell birth and initiation of genome replication, (ii) the C period–from chromosome replication initiation to termination, and (iii) the D period, which corresponds to the time between completion of DNA replication and cell division. Unlike in eukaryotes, the periods of the bacterial cell cycle are typically less strictly separated in time. For instance, bacterial genome replication is concomitant with segregation of chromosomal DNA into developing daughter cells and, depending on the growth conditions, generation time can be shorter than the combined duration of the C and D periods, leading to multiple DNA replication initiation events and overlapping replication cycles in each cell¹³. Accordingly, bacteria generally lack the cell cycle checkpoints, although in some bacterial models, cell volume and other characteristics (e.g., motility or lack thereof) are important for cell cycle progression^14,15,16.

Archaea comprise a distinct domain of life and represent the closest known relatives of eukaryotes^17,18. These single-celled organisms display a remarkable diversity of metabolic capabilities, environmental adaptations and molecular machineries responsible for key cellular processes. Similar to bacteria, archaea have circular chromosomes with most genes organized into operons¹⁹. However, archaeal proteins involved in replication, transcription and translation are more closely related to homologs in eukaryotes^20,21. In contrast, cell division systems can be either bacterial-like, based on the FtsZ rings^22,23,24, or eukaryotic-like, centered around the ESCRT complex^25,26,27,28.

Thermoacidophilic archaea (optimal growth at ~80 °C and pH ~3) in the order Sulfolobales emerged as models for cell biology and cell cycle studies²⁹. The coccoid-shaped Sulfolobales cells use the ESCRT machinery for division and follow the eukaryotic-like cell cycle paradigm (Fig. 1A). An exponentially growing Sulfolobales cell starts the cycle with a short (<5% of the cell cycle) pre-replicative G1 phase, which is followed by the genome replication S phase (30-35% of the cycle). Then, the cell enters the longest G2 phase (>50% of the whole cycle) during which the cell prepares for genome segregation. Finally, the cycle culminates with two short, M and D, phases (each lasting <5% of the cycle) during which the genome copies are segregated and the cell is divided^19,29. The overall program of the cell cycle appears to be conserved throughout the class Thermoprotei (formerly known as phylum Crenarchaeota)³⁰. Importantly, the cell cycle in a Sulfolobales population can be synchronized using a transient treatment with acetic acid, which presumably induces respiration uncoupling, arresting the cells in the post-replicative G2 cell cycle phase. Acetic acid removal allows the near-synchronous resumption of the cell cycle²⁹. Whereas the overall outline of the Sulfolobales cell cycle and coordination between genome replication and cytokinesis have been defined^19,25,31,32,33,34, it remains unknown whether other central cellular processes, such as diverse metabolic and catabolic pathways or protein translation, are harmonized with the cell cycle.

**Fig. 1: Overview of the transcriptional landscape across cell cycle phases.**

To obtain a more integrated view of the different processes taking place during the archaeal cell cycle, we performed a deep transcriptomic analysis using Saccharolobus islandicus (formerly Sulfolobus islandicus; order Sulfolobales) as a model. We analyzed differential gene expression patterns during distinct cell cycle phases and harnessed the statistical framework of the consensus Gene Co-expression Networks (GCN) to analyze negative and positive correlations between all expressed genes. The two complementary analytical approaches showed that the cell cycle of S. islandicus more closely follows the general eukaryotic paradigm than previously appreciated. We show that not only replication, chromosome segregation and division, but also other cellular processes occur in a cell cycle-dependent manner. Finally, we used a transcriptome deconvolution method to identify signature genes that are specific to particular cell cycle phases. Remarkably, these genes were generally well-conserved across Thermoproteota and their timing of expression matched the peak of expression of their homologs in yeast. Collectively, our data substantially improve the understanding of the S. islandicus cell cycle, opening new avenues for future research.

Results and discussion

Overview of the transcriptional landscape across cell cycle phases

To analyze whether S. islandicus cells coordinate different processes as they progress through the cell cycle, we sequenced the transcriptomes at three time points during which the populations were enriched in cells undergoing the M-G1, S and G2 phases, as evidenced by flow cytometry (Fig. 1B, and Supplementary Fig. S1A). Due to the short duration of the consecutive M, D and G1 phases, populations specifically enriched in these phases could not be collected separately. Hence, the corresponding populations were pooled together within a single sample, denoted herein as M-G1. To obtain additional information about the coordination of gene expression, we leveraged the statistical framework of the consensus Gene Co-expression Networks (GCN). Out of the total of 2630 predicted genes, including protein coding genes, tRNA, rRNA and ncRNA, 2558 were expressed (i.e., at least one read per million reads), with 2356 genes being expressed in all three phases (Supplementary Fig. S1B, and Supplementary Data 1). Principal component analysis showed that 56.4% of the variance can be explained by the first three principal components and that samples can be clustered according to the cell cycle phases (Supplementary Fig. S1C). Notably, comparison of the transcriptomes from synchronized cultures with several publicly available transcriptomes from unsynchronized cultures showed that the two datasets clustered separately (Supplementary Fig. S1C), suggesting that the variance observed in our study is due to differences between the cell cycle phases.
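
The expression cutoff quoted above (at least one read per million reads) can be illustrated with a minimal Python sketch. The gene names and read counts are hypothetical toy values, and the helpers `counts_per_million` and `expressed_genes` are illustrative names rather than part of the published pipeline. The sketch also assumes that the cutoff is applied per sample and that a gene counts as expressed if it passes in at least one phase.

```python
def counts_per_million(counts):
    """Scale raw read counts of one sample to counts per million (CPM)."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def expressed_genes(samples, min_cpm=1.0):
    """Return, for each sample, the set of genes with CPM >= min_cpm."""
    return {
        name: {g for g, cpm in counts_per_million(counts).items() if cpm >= min_cpm}
        for name, counts in samples.items()
    }


# Hypothetical toy counts; each sample sums to one million reads.
samples = {
    "M-G1": {"geneA": 500_000, "geneB": 499_999, "geneC": 1, "geneD": 0},
    "S": {"geneA": 400_000, "geneB": 599_990, "geneC": 10, "geneD": 0},
    "G2": {"geneA": 300_000, "geneB": 699_999, "geneC": 0, "geneD": 1},
}

per_phase = expressed_genes(samples)
# Genes expressed in at least one phase (assumed analogue of the 2558 genes).
expressed_any = set.union(*per_phase.values())
# Genes expressed in all three phases (assumed analogue of the 2356 genes).
expressed_all = set.intersection(*per_phase.values())
```

In this toy example `geneC` and `geneD` pass the cutoff in some phases only, so they contribute to the "expressed" set but not to the set shared by all three phases.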

The S. islandicus chromosome is structurally organized into A and B compartments that have high and low gene expression, respectively^35,36. Consistently, genes in the A compartment had on average higher expression than genes in the B compartment during all cell cycle phases (Supplementary Fig. S1D), with the expression being the highest near the three origins of replication (ori). However, there was only negligible difference (t-test, p = 0.07) in the overall level of expression within the ori proximal regions during different cell cycle phases.

Pairwise comparison of the gene expression profiles during consecutive cell cycle phases, namely, M-G1 vs S, S vs G2 and G2 vs M-G1, revealed 743 differentially expressed genes (i.e., more than 1.4-fold change difference, adjusted p value < 0.01), which were either up- or down-regulated during a particular cell cycle phase (Fig. 1C; Supplementary Fig. S2), suggesting that expression of 28% of S. islandicus genes follows a coordinated pattern throughout the cell cycle. Of the 743 differentially expressed genes, 308 were specifically up- or down-regulated during a particular cell cycle phase, pointing to marked differences between the processes that occur during the three phases. The M-G1 phase displayed the highest number of differentially expressed genes when compared to either S or G2 phase (n = 198 in M-G1 vs n = 3 in S and n = 107 in G2).
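
The selection criteria stated above (more than 1.4-fold change, adjusted p value < 0.01) can be sketched in Python as follows. This is only an illustration: the input tuples are hypothetical, the function name `classify` is made up for the example, and the fold-change cutoff is assumed to apply symmetrically to up- and down-regulation on a log2 scale. The last line checks the arithmetic behind the quoted share, taking the 2630 predicted genes as the denominator.

```python
import math

FC_CUTOFF = 1.4      # minimum fold change, from the text
PADJ_CUTOFF = 0.01   # maximum adjusted p value, from the text


def classify(log2_fc, padj):
    """Label one gene in one pairwise comparison as up, down or unchanged."""
    if padj >= PADJ_CUTOFF or abs(log2_fc) <= math.log2(FC_CUTOFF):
        return "unchanged"
    return "up" if log2_fc > 0 else "down"


# Hypothetical results: (gene, log2 fold change, adjusted p value).
results = [
    ("geneA", 1.2, 0.0004),   # passes both cutoffs, higher expression
    ("geneB", -0.9, 0.003),   # passes both cutoffs, lower expression
    ("geneC", 0.3, 0.0001),   # significant, but fold change too small
    ("geneD", 2.0, 0.2),      # large change, but not significant
]
calls = {gene: classify(fc, padj) for gene, fc, padj in results}

# 743 differentially expressed genes out of 2630 predicted genes.
share_pct = round(100 * 743 / 2630)  # 28
```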

Differential gene expression (DGE) analysis provides information on the changes in the extent of expression of genes during a given cell cycle phase. However, although instructive, this information does not capture the subtle changes in the rewiring of gene co-expression patterns, which might have a major impact on the progression of the cell cycle. Thus, to gain an orthogonal view on the co-expression of genes during different phases of the cell cycle, we constructed consensus GCNs for each of the three phases. Each GCN consists of nodes and edges, where nodes are genes and edges, i.e., lines connecting the nodes, represent statistically significant positive or negative correlations between the expression values of the connected genes. Each GCN had a different number of nodes and edges, which were classified into (i) the ‘core’, i.e., common to all three phases, (ii) ‘phase-specific’, i.e., co-expressed during one of the phases or (iii) ‘other’, if the co-expression was detected during two of the three phases (Supplementary Fig. S3). This analysis revealed that the S phase is characterized by the most complex network containing the highest number of edges and nodes (Supplementary Data 1), indicative of higher coordination of gene expression during this phase compared to the other phases. The DGE and GCN analyses show that a considerable fraction of genes follows cell cycle-dependent patterns of expression that manifest either at the level of expression strength or co-expression wiring. Notably, the GCNs can be used to uncover diverse aspects of cell biology, including functional associations between genes and shared promoter signatures, among others (see Supplementary Note 1 for details).
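
The three-way classification of edges described above (core, phase-specific, other) amounts to counting in how many phase networks each edge occurs. The Python sketch below illustrates this under simplifying assumptions: the gene names and networks are hypothetical, edges are treated as undirected gene pairs, the sign of the correlation is ignored, and `classify_edges` is an illustrative helper, not the published implementation.

```python
from collections import Counter


def edge(gene_a, gene_b):
    """Represent an undirected co-expression edge as an order-free pair."""
    return frozenset((gene_a, gene_b))


def classify_edges(networks):
    """Label every edge by the number of phase networks that contain it."""
    presence = Counter()
    for edges in networks.values():
        presence.update(edges)  # each network contributes at most 1 per edge
    n_phases = len(networks)
    labels = {}
    for e, k in presence.items():
        if k == n_phases:
            labels[e] = "core"            # found in all phases
        elif k == 1:
            labels[e] = "phase-specific"  # found in exactly one phase
        else:
            labels[e] = "other"           # found in two of three phases
    return labels


# Hypothetical toy networks, one edge set per phase.
networks = {
    "M-G1": {edge("a", "b"), edge("b", "c")},
    "S": {edge("a", "b"), edge("b", "c"), edge("c", "d")},
    "G2": {edge("b", "a")},
}
labels = classify_edges(networks)
```

The same counting logic applies to nodes if the edge sets are replaced by sets of gene names.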

Composition and properties of the core network

Contrasting the phase-dependent expression of genes that, by definition, characterize a particular cell cycle phase, the genes and groups of genes that ensure the maintenance of cell viability are expected to be co-expressed throughout the cell cycle. This co-expression is expected to occur regardless of whether the products of the co-expressed genes are functionally coupled, i.e., function in the same process or pathway. Analysis of such constitutively co-expressed genes, which we refer to as the housekeeping ‘core’ network, allowed us to define the housekeeping tasks necessary for the functioning of a cell. The core GCN, consisting of the core nodes connected by the core edges, contained 417 genes (Supplementary Data 2; note that only groups of four or more co-expressing genes were included as part of the core network). Analysis of the overall expression levels of the genes forming the core network revealed notable variation in expression levels across cell cycle phases, with significantly higher activity during the S and G2 phases (p value < 0.0001; Fig. 2A).

**Fig. 2: Core network of *S. islandicus*.**

All the nodes (i.e., genes) in the core network were assigned to functional categories according to the archaeal Clusters of Orthologous Genes (arCOG) annotation³⁷ (Fig. 2B). Only ~15% of the genes in the core network could not be functionally annotated (arCOG categories R and S). Most genes in the core network were associated with central cellular processes such as translation, replication, energy production and various metabolic pathways. Analysis of the network topology revealed several differentiated subclusters (Fig. 2C). Their presence suggests the existence of transcriptional fine-tuning in functionally linked genes. For instance, the largest subcluster was enriched in genes responsible for ribosome structure (e.g., ribosomal proteins uL23, uL2, uS3, uS8), transcription (e.g., transcription elongation factor NusA, DNA-directed RNA polymerase), genome replication and repair (e.g., DNA polymerase B1, topoisomerase VI, HerA, NurB), respiration (e.g., ATP-synthase, succinate dehydrogenase), carbon fixation (e.g., succinyl-CoA synthetase, aconitase A) and included genes responsible for the maintenance of the cell envelope (S-layer protein) (Fig. 2C). By contrast, several ABC-type transporters and sulfur metabolism genes (e.g., heterodisulfide reductase, sulfite reductase, ATP-sulfurylase), although also included in the core network, formed separate GCN subclusters. Notably, some of the CRISPR defense system genes (e.g., Cas10, Cmr6g7, Cmr1g7 and Cmr5SS of the Cmr-β cassette) were part of the core network, included in the largest subcluster, whereas other cas genes were expressed during particular cell cycle phases (see below), suggesting subtle transcriptional control of this complex defense system.

We assessed whether the core network is enriched in essential genes. To this end, we took advantage of the genome-wide gene essentiality information available for a closely related S. islandicus strain M.16.4³⁸ and compared the fraction of essential genes in the core network to that of the non-core genes. The proportion of essential genes was more than twice as high in the core network as in the rest of the genome (34% vs 13%, p value < 0.01; Fig. 2D). The majority (n = 115, 82%) of the essential genes in the core network were found in the largest subcluster of co-expressed core genes (Supplementary Fig. S4A), suggesting the existence of a mechanism ensuring coordinated and stable co-expression of genes that are critical for the cell functions. We also assessed the distribution of the core genes with respect to the chromosomal A and B compartments. Slightly more than half (57.6%) of the core network genes were localized in the A compartment (p value < 0.01; Fig. 2E, Supplementary Fig. S4B). Collectively, these observations suggest that the S. islandicus chromosome has evolved to accommodate essential genes important throughout the cell cycle in the transcriptionally active, less tightly condensed part of the chromosome.
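
An enrichment comparison of this kind (34% vs 13% essential genes) is commonly tested with a one-sided hypergeometric test, which is equivalent to a one-sided Fisher's exact test. The text does not state which test was used, so the Python sketch below is an assumption about the general approach, not the authors' method. The gene counts are hypothetical toy values chosen only to reproduce the two quoted percentages, and `hypergeom_upper_tail` is an illustrative helper.

```python
from math import comb


def hypergeom_upper_tail(k, n_draws, n_success, n_total):
    """P(X >= k) when n_draws genes are sampled without replacement from
    n_total genes, of which n_success are essential."""
    denom = comb(n_total, n_draws)
    upper = min(n_draws, n_success)
    numer = sum(
        comb(n_success, i) * comb(n_total - n_success, n_draws - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


# Hypothetical toy counts (NOT the published numbers):
core_total, core_essential = 100, 34       # 34% essential in the core set
noncore_total, noncore_essential = 500, 65  # 13% essential elsewhere

ratio = (core_essential / core_total) / (noncore_essential / noncore_total)
p_value = hypergeom_upper_tail(
    k=core_essential,
    n_draws=core_total,
    n_success=core_essential + noncore_essential,
    n_total=core_total + noncore_total,
)
```

With these toy counts the proportion of essential genes in the core set is more than twice that of the remaining genes, and the upper-tail probability falls well below 0.01. The actual p value depends on the real gene counts, which are given in the supplementary data rather than in the text.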

We then evaluated to what extent the core network is conserved in other archaea. To this end, genes from the core network were assigned to one of the four categories: (i) genes exclusive to the genus Saccharolobus, (ii) those restricted to the order Sulfolobales, (iii) genes conserved across the phylum Thermoproteota, and (iv) genes present in Thermoproteota and three other archaeal phyla (as defined by GTDB), namely, Methanobacteriota, Halobacteriota and Thermoplasmatota. We found that the core genes displayed significantly higher conservation compared to non-core genes, with 93.5% of the core genes being conserved across Thermoproteota and 62.5% also being conserved in the three other phyla. By comparison, among the non-core genes these fractions decreased to 83.7% and 46.1%, respectively (p value < 0.01; Fig. 2F). Moreover, analysis of the network topology showed that the most widely conserved genes occupied the central position of the largest co-expression subcluster (Supplementary Fig. S4C). Indeed, the degree of the most widely conserved genes was significantly higher than that of less conserved genes (Supplementary Fig. S4D), suggesting that the interplay between the core genes evolved prior to the radiation of archaeal diversity and that new, taxon-specific genes and their interactions with the conserved components of the core network have been established subsequently, likely concomitant with archaeal diversification.

Key cellular processes are coordinated with the cell cycle

To determine how S. islandicus coordinates different processes along the cell cycle, we assessed the differential expression and co-expression during different cell cycle phases of all 2558 expressed genes (Fig. 3A, B and Supplementary Fig. S5A). Additionally, KEGG enrichment analysis facilitated the identification of metabolic pathways significantly upregulated during each phase (Fig. 3C, and Supplementary Fig. S5B–D).

**Fig. 3: Cell cycle phase specific processes.**

Lipid biosynthesis and membrane biogenesis

Following cell division, the daughter cells start accumulating biomass and increase in size, up until the next round of cell division¹⁹. Increase in the surface area of the cell necessitates the synthesis of additional lipids. Similar to other members of the Sulfolobales, the S. islandicus cell membrane primarily consists of different species of glycerol dibiphytanyl glycerol tetraethers (GDGTs), which contain diverse polar head groups and a variable number of cyclopentane rings in the hydrophobic isoprenoid core³⁹. Nine enzymes that participate in the synthesis of GDGT lipids have been identified^40,41,42,43 (Fig. 4A) and five of them are significantly upregulated during the M-G1 phase compared to either S or G2 phase (Fig. 4B). Among these, digeranylgeranylglyceryl phosphate (DGGGP) synthase, which catalyzes the formation of an ether bond linking the second isoprenoid chain to the lipid precursor⁴⁴, and calditol synthase (Cds), responsible for the synthesis of a unique cyclopentyl head group which plays a key role in the acid resistance of Sulfolobales⁴³, are the most strongly upregulated genes. Notably, the GCN analysis showed that the lipid biosynthesis pathway is highly coordinated during the M-G1 and S phases, compared to G2 (Fig. 4C, inset) and synchronized with expression of diverse membrane proteins, post-translational modification enzymes and various systems embedded in the membrane, such as the Complex II of the electron transport chain and the ATP synthase (Fig. 4C). This observation suggests the existence of a link between cell membrane dependent systems and lipid biosynthesis. During M-G1, we observed higher specific co-expression, compared to S and G2, of genes encoding glycosyltransferases (SiRe_RS03975, SiRe_RS04230, SiRe_RS08195 and SiRe_RS02080), which according to their arCOG annotation are predicted to participate in membrane biogenesis (Supplementary Fig. S5A). 
Furthermore, many of the genes co-expressed with the lipid biosynthesis enzymes encode poorly characterized proteins, which could represent missing players in the membrane biogenesis processes. For instance, glycosyltransferase (SiRe_RS06775) and N-acetylneuraminate lyase (SiRe_RS10400) identified in the GCN analysis could participate in the synthesis of the lipid headgroups (Fig. 4C).

**Fig. 4: Changes in membrane biogenesis, motility and adhesion across the cell cycle.**

Genome replication and architecture

Genome replication is one of the focal points of the cell cycle and, by definition, occurs during the S phase, as evidenced by flow cytometry analysis (Fig. 1B). Unexpectedly, some of the key components of the replisome, such as replicative DNA polymerase PolB1 (SiRe_RS07370), replicative minichromosome maintenance (MCM) helicase (SiRe_RS06220), Gins23 (SiRe_RS06225), which participates in replication initiation and elongation, PolB1 binding protein 2 (SiRe_RS07230), and PCNA sliding clamp (SiRe_RS08085), are upregulated during the M-G1 phase (versus the G2 phase), a period preceding the actual S phase (Supplementary Data 1). Moreover, Orc1-1 (SiRe_RS08850), Orc1-3 (SiRe_RS00005) and WhiP (SiRe_RS06120), the three replication initiators of S. islandicus, are also upregulated during the M-G1 phase (versus the G2 phase), with expression of Orc1-3 and WhiP being also maintained throughout the S phase. The peak expression of Orc1-1 and WhiP was observed during M-G1, whereas Orc1-3 expression peaks throughout the S phase. These results were validated for selected replisome genes using RT-qPCR and compared to non-synchronized cultures (Supplementary Fig. S6). Notably, the ATP-dependent DNA ligase (SiRe_RS09250), which ligates the Okazaki fragments during the lagging strand synthesis²¹, is upregulated during the S phase (when compared to the M-G1 phase), indicating that some replisome components have a different temporal expression during the cell cycle. Nevertheless, these observations suggest that the cell prepares for DNA replication in advance, by synthesizing most of the necessary enzymes during the M-G1 phase. Alternatively, the replisome components could also participate in DNA repair, preparing the genome for replication during the S phase. Indeed, some of the DNA repair genes are upregulated during the M-G1 phase as well (see Supplementary Note 2).

Many chromatin proteins, including Cren7 (SiRe_RS05625), two Sul7d family proteins (SiRe_RS03405 and SiRe_RS13370) and two Sso7c4 homologs (SiRe_RS07595 and SiRe_RS09950), were upregulated during the S phase compared to either the M-G1 or G2 phases (Fig. 3A, Supplementary Data 1), suggesting that chromatinization takes place concomitant with DNA replication. It has been recently suggested that Lrs14 family proteins of Sulfolobales should be considered as chromatin organizing proteins⁴⁵. Our data show that one of the Lrs14 family proteins in S. islandicus, SiRe_RS09945, displays a similar transcription pattern as the main chromatin proteins, supporting the conclusion of De Kock et al.⁴⁵ that Lrs14 is involved in chromatin organization. Given that chromatin proteins are among the most abundant proteins in the cell, chromatinization is likely to necessitate extensive protein translation during the S phase. Indeed, many translation-related genes, including those encoding a subset of ribosomal proteins, tRNAs, glycyl-tRNA synthetase, a subunit of the RNase P and translation initiation factor 6 as well as thermosome responsible for protein folding, were upregulated during the S phase (see Supplementary Note 3 for details).

Central metabolism

We next assessed whether the central metabolic pathways also displayed differential regulation during the cell cycle phases. To this end, we performed the KEGG enrichment analysis (Supplementary Fig. S5B-D) and analyzed the patterns of differential expression of genes assigned to arCOG categories related to metabolism (Fig. 3A). The two approaches provided congruent and complementary results. Many pathways, including those related to biosynthesis of amino acids and nucleotides (see Supplementary Note 4) as well as carbon metabolism, were not uniformly expressed across different cell cycle phases.

The tricarboxylic acid (TCA) cycle (also known as the Krebs or citric acid cycle) is one of the key energy-generating metabolic pathways of the cell. Through a series of biochemical reactions, TCA releases the energy stored in nutrients through the oxidation of acetyl-CoA derived from carbohydrates, fats, and proteins. We found that the carbon fixation pathways, an assemblage of metabolic pathways terminating in the TCA, are upregulated during the S phase (versus M-G1 phase) and stay active during G2 (Fig. 3C, Supplementary Fig. S5B-D). These pathways fix carbon through the synthesis of malonyl-CoA, which can then be transformed into succinyl-CoA, a key intermediate in the TCA cycle. The subunits of the acetyl-CoA carboxylase (SiRe_RS01265, SiRe_RS01270, SiRe_RS01275), which synthesizes malonyl-CoA, as well as the methylmalonyl-CoA epimerase (SiRe_RS01085) and mutase (SiRe_RS01080), which catalyze the last step in the transformation to succinyl-CoA, are upregulated during the S phase and even more strongly during G2, compared to M-G1 (Supplementary Data 1). Alternatively, the succinyl-CoA can enter the autotrophic hydroxybutyrate cycle⁴⁶ and produce acetoacetyl-CoA, which will be broken into two acetyl-CoA molecules. Some enzymes participating in this transformation, namely, succinyl-CoA reductase (SiRe_RS04600), succinate semialdehyde reductase (SiRe_RS07755), 3-hydroxybutyryl-CoA dehydrogenase (SiRe_RS11400) and two acetyl-CoA acetyltransferases (SiRe_RS07425 and SiRe_RS13030), follow expression patterns similar to enzymes producing succinyl-CoA. Moreover, although not significantly enriched in any of the phases, the TCA cycle includes several enzymes, such as succinate dehydrogenase (SiRe_RS00780 and SiRe_RS00785) and the aconitate hydratase (SiRe_RS0595), that are also upregulated during the S and G2 phases, compared to the M-G1 phase. These results indicate that the most central and important pathways for production of energy tend to be less active during M-G1.

Notably, the substrates used for energy production during the S and G2 phases appear to be different. For instance, fatty acid degradation pathways (Supplementary Fig. S5B) are upregulated during the S phase, according to the KEGG enrichment. This is concomitant with the downregulation during the S phase (compared to the M-G1 phase) of the transcriptional repressor FadR (SiRe_RS01515), a TetR-family transcriptional regulator that has been shown to repress β-oxidation⁴⁷. By contrast, the G2 phase is associated with higher activation of the glycolysis/gluconeogenesis and other carbohydrate-related pathways as well as sulfur metabolism. Many of the differentially expressed genes in G2 are implicated in glycolysis/gluconeogenesis and sulfur metabolism. These include sulfide:quinone oxidoreductase (SiRe_RS13005), implicated in sulfur metabolism, and phosphoenolpyruvate carboxykinase (SiRe_RS02025), aldose 1-dehydrogenase (SiRe_RS11380) and gluconate dehydratase (SiRe_RS10395), which participate in the glycolysis/gluconeogenesis and the pentose phosphate pathways (Fig. 3C). Because many of the enzymes in the glycolysis/gluconeogenesis pathway are shared with the fructose, mannose, sucrose and starch metabolism, the latter pathways were also enriched in the G2 phase (Supplementary Fig. S5C). However, no differential gene expression of enzymes specific for each of these pathways was found during the S and G2 phases. Moreover, because S. islandicus lacks the essential phosphofructokinase needed to perform glycolysis through the canonical Embden-Meyerhof-Parnas pathway⁴⁸, it is likely that glucose or fructose molecules are metabolized via the pentose phosphate pathway to glyceraldehyde or glycerate and then introduced into the glycolytic pathway to be fully oxidized. Finally, the profound shifts in the metabolic landscape during the G2 phase are also supported by the largest density of specifically co-expressed metabolism-related genes during this phase. In particular, we observed specific co-expression of genes implicated in energy production and conversion and the carbohydrate metabolism genes (Supplementary Fig. S5A).

Cell division, motility and adhesion

As explained above, mitosis (M), division (D) and the pre-replicative first gap (G1) phases occur in rapid succession, precluding us from obtaining populations enriched in these discrete phases. Nevertheless, some of the proteins are known to be markers for the M and D phases. In particular, chromosome segregation during the M phase is mediated by a pair of proteins, SegA and SegB⁴⁹, while cell division during the D phase is driven by the ESCRT-based machinery^27,28. Consistently, during M-G1, we observed strong upregulation of genes encoding the pair of genome segregation proteins (Fig. 3A, Supplementary Data 1) and ESCRT-based cell division machinery (including CdvA, ESCRT-III, ESCRT-III-1, ESCRT-III-2 and Vps4), compared to either S or G2 phase. Moreover, aCcr1, a transcription factor which terminates the cell division by repressing cdvA, exhibits peak expression during the M-G1 phase, consistent with the published results³². Notably, several other transcription factors display strongly pronounced cyclical expression patterns, suggestive of their importance for the progression of the cell cycle (see Supplementary Note 5 for details).

As in other members of the Sulfolobales^50,51,52,53,54, motility and adhesion in S. islandicus are mediated by two evolutionarily related but functionally distinct extracellular filaments, the archaeal flagellum (or archaellum) and adhesive pili, respectively. Both filaments are composed of pilins/archaellins related to bacterial type IV pilins^55,56,57 that are secreted through a membrane pore with the help of a cognate ATPase motor^58,59. Live cell imaging of S. acidocaldarius cells revealed changes in the cell adhesion and motility around the time of division, suggesting the existence of coordination between these processes⁶⁰. In particular, ~80% of the observed cells underwent a transient loss of adhesion immediately prior to or during cell division and >40% of newborn daughter cells rapidly moved away from the site of division⁶⁰. Our data are fully consistent with these findings. Differential gene expression analysis showed that the pore and ATPase that secrete the archaellins are upregulated during the M-G1 and S phases, compared to the G2 phase (Fig. 4D). By contrast, the pore and ATPase responsible for the export of the adhesive pilins are both upregulated during the G2 phase (Fig. 4D). These patterns suggest that adhesive pili, present during the G2 phase, would be replaced by archaella following cell division. Interestingly, during the S phase, adhesive pilins and the ATPase of the archaellum were inversely co-expressed with an FKBP family peptidyl-prolyl cis-trans isomerase chaperone (SiRe_RS06340), suggesting the involvement of the latter in the switch between swimming motility and adhesion (Fig. 4E).

Notably, the expression of the archaellum system seems to be activated by the one-component system ArnR, with deletion of this system in S. acidocaldarius affecting the expression of the archaellin and impairing motility⁶¹. Our data shows the ArnR ortholog in S. islandicus (SiRe_RS00635) is upregulated during the M-G1 phase (compared to G2), consistent with the possibility that activation of the archaellum operon during the M-G1 and S phases is regulated by the ArnR transcription factor. Moreover, the repression of the archaellum operon during the G2 phase is concomitant with the upregulation of the gene encoding coalescin (SiRe_RS00270), an SMC superfamily chromosome organizing protein which has been shown to have a high occupancy in the archaellum operon³⁶. High occupancy of coalescin within particular genomic loci impedes the access of RNA polymerase and results in transcriptional repression.

Defense systems

Our data shows a strong upregulation of various defense related genes during the S and G2 phases, compared to the M-G1 phase (Fig. 3A). Defense against viruses and mobile genetic elements in S. islandicus REY15A is primarily mediated by the CRISPR (clustered regularly interspaced short palindromic repeats) system⁶² (see Supplementary Note 6 for description of the S. islandicus CRISPR systems). In addition to the more extensively studied CRISPR-Cas systems, S. islandicus REY15A encodes two recently predicted, but functionally uncharacterized defense systems, namely, Hma⁶³, composed of a helicase (SiRe_RS03020), methyltransferase (SiRe_RS03035) and ATPase (SiRe_RS03030), and the Methylation Associated Defense System (MADS) (SiRe_RS00305 and SiRe_RS00315)⁶⁴ (Fig. 5A).

The differential gene expression analysis showed that different components of the Cascade complex, Cmr-α, and Cas6, which is responsible for the processing of the CRISPR RNA, were upregulated during the S phase, compared to the M-G1 phase, along with the Hma genes (Fig. 5B). All the aforementioned genes are also active during G2, when they are joined by the upregulated Cmr-β genes. Such pattern of expression suggests activation of the defense and surveillance systems during S and G2, albeit with slight variation between the different CRISPR types and modules. Curiously, concomitant with their downregulation during the M-G1 and subsequent de-repression during the S phase, networks of co-expression of CRISPR related genes showed an inverse correlation with transcriptional repressors during these two phases (two Lrp family transcriptional regulators, SiRe_RS02835 and SiRe_RS07120, during M-G1 and S, respectively; and two ArsR family transcriptional regulators, SiRe_RS09190 and SiRe_RS09585, during S phase) (Fig. 5C). In addition, a PIN-domain ribonuclease toxin (SiRe_RS03095) and several transposases (IS110 [SiRe_RS00565] and IS5 [SiRe_RS04210] family transposases and TnpB family nucleases [SiRe_RS04095, SiRe_RS05390, SiRe_RS06735]) also showed similar co-expression, indicative of a possible control by the defense system. By contrast, positive correlation between the expression of Cas nucleases with diverse cellular nucleases, including mRNA ribonuclease (SiRe_RS03015), Rrp4 cap of the RNA-degrading exosome complex⁶⁵ (SiRe_RS06470) and TatD-family nuclease (SiRe_RS05420), and proteins implicated in DNA repair, such as Mre11 (SiRe_RS00310), PolB3 (SiRe_RS09745) and NreA (SiRe_RS06100) suggests the presence of yet to be discovered cellular partners of the CRISPR-Cas systems. Some of these partner proteins could be, for instance, activated by cyclic oligoadenylate (cOA) as in the case of Csx1 or play a role in preserving genome integrity upon CRISPR-Cas activation. 
During the S phase, which contains the most coordinated network of the CRISPR-Cas systems (Fig. 5C inset), all three components of the Hma system and genes of the putative MADS defense system were co-expressed with different CRISPR-Cas modules, suggesting coordination between these distinct defense systems.

It is generally considered that defense systems are under tight regulatory control and activated only upon invasion of foreign mobile genetic elements, such as viruses or plasmids^66,67. Our results suggest that this might not be entirely the case. Activation of some of the defense systems during the S phase could be triggered by the exposed DNA replication intermediates or more active proliferation of transposons. Alternatively, CRISPR systems may additionally function in non-defense contexts, for instance, during DNA repair, as previously hypothesized⁶⁸. For example, it has recently been demonstrated that in the halophilic archaeon Haloferax volcanii, the Cas3 protein, a component of type I systems, facilitates rapid recovery from DNA damage⁶⁹. Indeed, our co-expression networks show coordination of DNA repair and defense systems, suggesting a role of the defense systems in safeguarding the integrity of the genome.

Phase signature genes defined via transcriptome deconvolution

To validate the results of the differential gene expression and GCN analyses, and to identify the signature genes defining each phase, we applied a transcriptome deconvolution method to the transcriptomics and flow cytometry data (Supplementary Data 4; see Methods). The flow cytometry data demonstrates that each sample, although enriched in cells from the targeted phase, includes cells from all three phases (Supplementary Fig. S1). Using non-negative least squares (nnls) optimization we corrected gene expression levels in each sample by excluding transcription signal possibly originating from cells in non-targeted phases. S. islandicus genes were clustered based on corrected expression levels into four different groups depending on whether they tended to be more expressed at one of the three phases studied (groups 1-3) or display similar expression throughout the cell cycle (group 4) (Supplementary Data 4). This allowed identification of gene sets showing phase-specific expression patterns. In the next step, to gain a clearer view on the processes taking place during each of the phases, we filtered out all the poorly annotated genes assigned to arCOG categories R or S (Supplementary Data 4). The picture which emerged from this analysis was fully consistent with and complementary to the conclusions drawn from the manual analysis of the differential gene expression patterns and GCNs described in the previous sections (Fig. 6A). In particular, the signature genes specific of the M-G1 phase included those for genome segregation, cell division, DNA replication as well as nucleotide and lipid metabolisms; S-specific genes set included genes for diverse chromatin proteins and translation related genes as well as some of the carbon metabolism genes; G2 phase was characterized by genes for carbohydrate and sulfur metabolism as well as changes in the motility and defense. 
Moreover, functional enrichment analysis of the phase-specific gene sets (Supplementary Data 4) shows significant enrichment of cell division, segregation and replication categories during the M-G1 phase; translation, transcription (which includes chromatin proteins), lipid metabolism and defense categories during the S phase; and defense and cell motility categories during the G2 phase.

Notably, some of the signature genes identified after transcriptome deconvolution did not stand out in the differential gene expression analysis. For instance, the signature gene set of the M-G1 phase includes the β-subunit of the proteasome and the proteasome-activating nucleotidase, a regulatory subunit that drives the conformational changes during the proteasome functional cycle⁷⁰. It has been shown that in the presence of a proteasomal inhibitor, the ESCRT-III rings cannot be disassembled, resulting in cell division arrest in S. islandicus²⁸ and other Sulfolobales species²⁷. These experimental results are consistent with the proteasomal genes being selected as the M-G1 signature genes. Another example is the S-layer: its two subunits, SlaA and SlaB (SiRe_RS08185 and SiRe_RS08180), show their peak of expression during the M-G1 phase, concomitant with the up-regulation of membrane biogenesis pathways observed during that phase. Thus, the three phase-specific gene sets defined via the transcriptome deconvolution method appear to adequately represent the gene expression patterns along the S. islandicus cell cycle and can be used in subsequent analyses to assess the state of cells under different experimental conditions. Importantly, this analysis indicates that many processes in S. islandicus are coordinated with the cell cycle, being expressed during particular phases, resembling, at least qualitatively, the transcriptional landscape of the cell cycle in most eukaryotes.

The identified signature genes were significantly (p value < 0.01) more conserved compared to the rest of the genes. Namely, 89.9% and 51.6% of the signature genes were conserved across Thermoproteota and three other archaeal phyla, respectively, whereas the non-signature genes displayed lower conservation in the corresponding lineages (78.6% and 44.5%, respectively). The higher conservation of the phase-specific genes suggests that the regulation and the overall structure of the Saccharolobus cell cycle is conserved in archaeal lineages beyond the order Sulfolobales. Finally, we assessed whether the expression of the S. islandicus phase-specific genes also follows a cell cycle dependent transcription pattern in eukaryotes. To this end, we compared the cell cycle phase affiliation of genes that are homologous between S. islandicus and eukaryotes represented in the Cyclebase database⁷¹ (see Methods). The budding yeast S. cerevisiae shared the largest number (n = 403) of homologs with S. islandicus (Table 1, Supplementary Data 5). Of the 86 S. islandicus signature genes specific to the M-G1 phase, 67 homologous genes displayed peak expression during M or G1 phase in S. cerevisiae. Of the 57 G2 signature genes of S. islandicus, 37 were also expressed during G2 in the budding yeast. The S-specific genes, the largest category with 260 genes, 28.8% (n = 75) of which encode ribosomal proteins and other translation or protein folding related proteins (tRNA synthetases, tRNA ligases, thermosome subunits, etc.), displayed less congruence in the temporal expression, with only 57 genes displaying peak expression during the S phase in both organisms (Table 1, Supplementary Data 5). Thus, despite certain differences in the timing of expression of certain functions, the overall structuring of the transcriptional landscape follows a defined program in S. islandicus, the budding yeast and possibly other Amorphea⁷.

Table 1 Percentage of similarity between expression peaks of S. cerevisiae genes homologous to S. islandicus signature phase-specific genes

Parallels to the eukaryotic cell cycle and limitations of the study

Our understanding of gene expression during the cell cycle in all organisms, and particularly in archaea, is rather limited. In most bacteria, gene expression appears to be primarily defined by the movement of the replication fork^72,73,74,75. Our data suggest that this is not the case in S. islandicus or other Sulfolobales, where the expression program appears to be more akin to that of eukaryotes. In addition to the previously demonstrated cyclic expression of Sulfolobales genes related to cell division and DNA replication^{25,28,29,32,76,77}, our results indicate that many other key cellular processes in S. islandicus, including defense systems, cell motility and adhesion apparatuses as well as diverse metabolic pathways, are coordinated with the progression of the cell cycle (Fig. 6B). Some of these processes are also expressed cyclically in certain eukaryotic model systems, e.g., the budding yeast^1,7,8,78.

Despite the disparity in the duration of the G1 phase in most studied eukaryotes⁸ and Sulfolobales, the overall ‘logic’ appears to be similar, that is, to prepare for genome replication. Indeed, our data suggests a transcriptional activation of DNA repair pathways and expression of the major replisome components, including the replicative DNA polymerase, MCM helicase and two of the three origin recognition genes (Orc1-1 and WhiP), during the M-G1 phase. Similarly, the expression of the MCM subunits, helicase loader Cdc6 and origin recognition subunits (Orc1-6) in G1 has been demonstrated in opisthokonts, including yeast and vertebrates, such as Xenopus laevis⁷⁹. In most eukaryotes where the cell cycle has been studied, G1 is the pivotal moment when the cell’s fate is decided: replicate, differentiate or die¹. Whether a similar checkpoint exists in S. islandicus remains to be investigated. In these eukaryotes, the G1 phase is also associated with high biosynthetic activity, accompanied by the increase in cell size¹. Although in S. islandicus most of the metabolic pathways displayed reduced activity during the M-G1 phase, lipid biosynthesis was significantly enhanced, suggesting cell growth.

DNA replication marks the start of the S phase. In opisthokonts, cyclin-dependent kinases initiate replication through an orchestrated process within different origins^1,79. Similarly, in Sulfolobales, the S phase starts with the almost simultaneous firing of two of the three origins of replication triggered by Orc1-1 and WhiP^80,81, even though their transcripts are produced during the M-G1 phase (see above), consistent with the previous observations⁷⁶. In opisthokonts, the DNA synthesis during the S phase is coordinated with the production of histones, facilitating the assembly of the newly replicated DNA into chromatin⁸². Remarkably, our data suggests that DNA replication and chromatinization are also coupled in S. islandicus. We found upregulation of the dominant chromatin-associated proteins, including Cren7, Sul7d and Lrs14, during the S phase⁴⁵. However, unlike in yeast and vertebrates, where apart from histones, protein synthesis appears to be generally low¹, we observed an upregulation of diverse translation-related genes, including those encoding the non-universally conserved core ribosomal proteins, tRNA synthetases, tRNAs and RNase P. Consistently, many genes encoding ribosomal proteins and tRNA synthetases were found as signature genes of the S phase after transcriptome deconvolution. Consistent with the low translation during the S phase in eukaryotes, only 10 of the 75 translation related genes specific to the S phase in S. islandicus showed peak expression at the same phase in S. cerevisiae. Instead, most of the S. cerevisiae homologs showed peak expression during the M or G2 phases (Supplementary Data 5). These differences may result from ultrastructural differences between eukaryotic and archaeal cells. In particular, presence of a nucleus in eukaryotes effectively uncouples transcription and translation, whereas in archaea, the two processes appear to be coupled^83,84. Another notable difference between the cell cycles of S. islandicus and opisthokonts concerns the duration of the G1 and G2 phases, with the G1 in eukaryotes being one of the longest and G2 one of the shortest phases⁸, and the opposite being true for Sulfolobales. Nevertheless, despite these differences, the general logic of the coordination of metabolism, DNA repair, and lipid biosynthesis during one of the two G phases appears to be shared.

In a seminal study, Takemata and colleagues have shown that chromosome organization into A and B compartments in Sulfolobales is coupled with the chromosomal distribution of the SMC superfamily protein coalescin, which is largely defined by the transcriptional state of particular loci³⁶. In non-synchronized S. islandicus populations, the archaellum operon, a heterodisulphide reductase gene cluster and fatty acid metabolism genes displayed high coalescin occupancy, consistent with their low transcriptional activity³⁶. We observed that the three operons display cell cycle-dependent differential expression, with the fatty acid metabolism genes being active during the S phase, archaellum-related genes during M-G1 and S, and the heterodisulphide reductase gene cluster during M-G1. The expression of these genes is inversely related to the highest transcription of the coalescin gene during the G2 phase. These results suggest that non-synchronized cultures in their transcriptional profile, coalescin occupancy and hence chromosome organization are likely to resemble the G2 phase, consistent with the reported DNA content profiles³⁶. Further research on chromosome organization during the cell cycle progression is likely to provide further insights into the functioning of Sulfolobales cells.

One of the limitations of our study is that changes in transcription do not necessarily correlate with protein expression and do not provide information on posttranslational protein modifications, which might be particularly relevant for cell cycle progression. Indeed, post-translational modifications, such as acetylation, phosphorylation and ubiquitylation, are critical to many complex regulatory networks that govern the cell cycle progression in eukaryotic cells⁸⁵. The proteome of S. islandicus has been shown to undergo extensive lysine methylation and N-terminal acetylation, with the methylation levels increasing with the progression of the growth phases⁸⁶. Notably, many chromatin proteins displayed differential methylation during different growth phases in non-synchronized populations. It will be interesting to see whether the chromatin methylation patterns change with the progression of the cell cycle. While ubiquitylation is absent in Sulfolobales, alternative mechanisms, e.g., SAMPylation, appear to play similar roles in protein turnover⁸⁷. Thus, studies on changes in posttranslational modifications throughout the S. islandicus cell cycle appear as a promising future research direction.

Collectively, our results illuminate the complexity of the transcriptional landscape in an archaeal model system. Notably, the overall program of the cell cycle appears to be conserved throughout the class Thermoproteia³⁰, suggesting that our findings could be extrapolated to other members of this archaeal lineage. In this context, the signature genes characteristic of different cell cycle phases could be particularly useful for future comparative studies.

Methods

Strains and growth conditions

Saccharolobus islandicus strain REY15A was grown aerobically at 76 °C with shaking in 25 ml of MTSV medium containing mineral salts (M), 0.2% (wt/vol) tryptone (T), 0.2% (wt/vol) sucrose (S) and a mixed vitamin solution (V); the pH was adjusted to 3.5 with sulfuric acid, as described previously⁸⁸.

Synchronization of Saccharolobus islandicus

Cells were synchronized using acetic acid (final concentration: 6 mM) as previously described⁸⁹. Briefly, cells were cultured in 25 ml of MTSV medium until OD₆₀₀ reached 0.2, then cultures were synchronized by addition of acetic acid to arrest the cells at the end of the G2 phase. Following the incubation for 6 h, the cells were pelleted at 2900 × g for 15 minutes, washed with 0.7% sucrose to remove the acetic acid and resuspended in warm acetic acid-free medium. Once synchronized, the cells were grown in the MTSV medium as described above and the progression of the cell cycle was followed by flow cytometry. Briefly, cells were fixed with 70% ethanol at +4 °C and washed once with PBS. Fixed cells were stained with 40 μg/ml PI (Invitrogen) in staining buffer (100 mM Tris pH 7.4, 0.5 mM NaCl, 1 mM CaCl₂, 0.5 mM MgCl₂, 0.1% Nonidet P-40) and their DNA content analyzed using CYTOflex (Beckman-Coulter). Fluorescence in CYTOflex was measured through the ECD channel with a manual height threshold in that channel at 5993 points, to filter non-fluorescent background, and default gain parameters.

RNA extraction and sequencing

Cells were plated on solid media and 15 single colonies were inoculated in liquid media, forming 15 cultures that represent completely independent biological replicates. The 15 biological replicates were grown in 3 different batches (5 cultures per batch); each batch was grown and synchronized at a different time (5 cultures at a time). RNA was then isolated from each batch on different days; no variance between batches was found using principal component analysis (see Supplementary Data 1 for batch number for each sample). The total RNA was extracted from the 15 biological replicates at three different time-points after synchronization: 2 h 30 min, 4 h and 6 h (Fig. 1B). The time points were selected based on the results of the flow cytometry analysis which showed that samples at these three time points are enriched in cells in the M-G1, S and G2 phases, respectively (Fig. 1C). Total RNA was extracted using TRI Reagent (Sigma-Aldrich), following the manufacturer’s protocol, and treated with DNase (TURBO DNA-free kit; Invitrogen) following the manufacturer’s instructions. DNase treated samples were further purified with RNeasy Mini Kit (Qiagen). RNA samples were quantified using a Qubit Fluorometer (Thermo Fisher Scientific) and assessed for quality with a BioAnalyzer (Agilent) (see Supplementary Data 1 for quality control parameters for each sample). Libraries were prepared using the Illumina® Stranded Total RNA Prep, Ligation with Illumina® Ribo-Zero Plus kit, with custom ribodepletion. Following PCR amplification, all samples underwent two rounds of purification with AMPure beads (Beckman Coulter) to remove small fragments. Libraries were subsequently validated using both the Qubit Fluorometer and the Fragment Analyzer (Agilent). Sequencing was performed on an Illumina NextSeq 2000 sequencer with a P3 50-cycle kit and a target of 30–40 million reads per sample. The reads obtained were mapped to the reference S. islandicus REY15A genome (RefSeq accession number: NC_017276 [https://www.ncbi.nlm.nih.gov/nuccore/NC_017276.1/]) with Bowtie-2⁹⁰ using default parameters.

Differential gene expression analysis

The number of reads per gene was counted with featureCounts⁹¹ with default parameters. Data was then processed using R and the edgeR library⁹². Read counts per gene were standardized to counts per million and filtered to eliminate all of the genes which did not have at least one read per million in all 15 samples. Finally, reads were normalized using the TMM method, which assumes that the majority of genes are not differentially expressed⁹³. The TMM method takes into account the sampling properties of the RNA-seq data and corrects for biases caused by different library sizes or differences in the expression properties of the whole sample, such as the presence of two chromosomes versus one chromosome (for more information see ref. ⁹³). Logarithmic fold change in base two (log₂FC) was calculated by subtracting the logarithmic averages of two groups, e.g., the log₂FC of M-G1 vs S is calculated by subtracting the logarithmic average of S from the logarithmic average of M-G1. The data was fitted to a linear model to calculate statistical significance using the limma package and p values were adjusted using the Benjamini-Hochberg method to correct for multiple testing. Genes were considered up- or down-regulated if the absolute log₂FC was at least 0.5 (a fold change of about 1.4) and the adjusted p value was <0.01; genes with an absolute log₂FC of at least 1 (a twofold change) were considered to be strongly up- or down-regulated.
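The thresholding logic described above can be illustrated with a minimal Python sketch. It is not the edgeR/limma pipeline used in the study: TMM normalization and the linear model are omitted, and the function names (`cpm`, `log2_fold_change`, `benjamini_hochberg`, `classify`) are hypothetical names chosen for illustration.

```python
import math

def cpm(counts):
    """Scale the raw read counts of one sample to counts per million."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def log2_fold_change(group_a, group_b):
    """log2FC of A vs B: mean log2 expression of A minus mean log2 expression of B."""
    mean_a = sum(math.log2(x) for x in group_a) / len(group_a)
    mean_b = sum(math.log2(x) for x in group_b) / len(group_b)
    return mean_a - mean_b

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify(log2fc, adj_p, alpha=0.01):
    """Apply the thresholds from the text: |log2FC| >= 0.5 and adjusted p < 0.01."""
    if adj_p >= alpha or abs(log2fc) < 0.5:
        return "unchanged"
    direction = "up" if log2fc > 0 else "down"
    strength = "strongly " if abs(log2fc) >= 1 else ""
    return f"{strength}{direction}-regulated"
```

For example, `classify(log2_fold_change([8, 8], [2, 2]), 0.001)` labels a fourfold increase as strongly up-regulated, whereas the same fold change with an adjusted p value of 0.5 is left as unchanged.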

KEGG enrichment

Information on the different metabolic pathways was extracted from the Kyoto Encyclopedia of Genes and Genomes (KEGG)⁹⁴. KEGG enrichment p values were calculated by taking into account the adjusted p values of the genes in each pathway and comparing them with the genes outside the pathway by applying a two-sided Wilcoxon test. Significance threshold was set at 0.05. Gene ratio was calculated as the number of genes for which RNA sequencing data was available relative to the number of genes annotated in each pathway. Density plots were generated using the ggplot2 package in R with the transcriptomic data of the genes in each pathway.
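A minimal Python sketch of this pathway test is given below. It assumes that the two-sided Wilcoxon test on two unpaired groups corresponds to the rank-sum (Mann-Whitney U) test, here taken from SciPy; the gene identifiers, p values and function names are made up for illustration.

```python
from scipy.stats import mannwhitneyu

def pathway_enrichment_p(adj_p_by_gene, pathway_genes):
    """Two-sided rank-sum test: adjusted p values inside vs outside a pathway."""
    in_pathway = [p for g, p in adj_p_by_gene.items() if g in pathway_genes]
    outside = [p for g, p in adj_p_by_gene.items() if g not in pathway_genes]
    return mannwhitneyu(in_pathway, outside, alternative="two-sided").pvalue

def gene_ratio(adj_p_by_gene, pathway_genes):
    """Fraction of annotated pathway genes for which expression data exist."""
    measured = sum(1 for g in pathway_genes if g in adj_p_by_gene)
    return measured / len(pathway_genes)

# Hypothetical adjusted p values from a differential expression analysis.
adj_p = {f"gene{i}": 0.001 * (i + 1) for i in range(5)}           # pathway members
adj_p.update({f"gene{i}": 0.2 + 0.05 * i for i in range(5, 15)})  # other genes
# One annotated pathway gene has no expression data.
pathway = {f"gene{i}" for i in range(5)} | {"gene_without_data"}

p_value = pathway_enrichment_p(adj_p, pathway)
ratio = gene_ratio(adj_p, pathway)
print(p_value < 0.05, ratio)
```

A pathway would be called enriched when `p_value` falls below the 0.05 threshold stated above.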

Gene annotation

Genes were annotated and classified using the archaeal Clusters of Orthologous Genes (arCOG) framework³⁷, where each gene is assigned a code indexed to a specific category according to its orthologs in other archaea. Essentiality information was extracted from the previous study on the closely related S. islandicus strain M.16.4³⁸. Information on compartmentalization of the genome was extracted from data obtained previously³⁶. Statistical significance in the distribution of essential genes and compartments in the core network was calculated by performing a Fisher’s exact test comparing the distribution of two groups. For proteins of interest, the arCOG annotations were supplemented with the results from profile-profile comparisons using HHpred.

Codon usage

Codon usage for each gene was calculated using the coRdon library in R (Elek A, Kuzman M, Vlahovicek K (2023). coRdon: Codon Usage Analysis and Prediction of Gene Expressivity. doi:10.18129/B9.bioc.coRdon, R package v1.20.0, https://bioconductor.org/packages/coRdon) using the coding sequences for REY15A extracted from GenBank (accession number: NC_017276 [https://www.ncbi.nlm.nih.gov/nuccore/NC_017276.1/]). Average codon usage for the full genome was calculated using the Sequence Manipulation Suite⁹⁵ using the bacterial (11) genetic code. Information on codon usage is provided in Supplementary Data 3.

Data visualization

Data was represented in the form of volcano plots, violin plots, box plots or dot plots using the ggplot2 package in R. Ggplot2 was used to generate and plot the regression model of the expression dependent on chromosome position. The method used was a generalized additive model.

Gene co-expression networks

Read count matrices quantifying the gene expression in M-G1, S and G2 were normalized using the VST transformation from the DESeq2 R package⁹⁶. The normalized matrices were subsampled to generate the highest possible number of inferences of GCNs and to produce multiple gene co-expression networks (GCNs) for each phase of the cell-cycle; specifically, within 15 transcriptomes, 5 transcriptomes can be randomly sampled to generate 20 replicates with no more than 2 shared transcriptomes between any pair of replicates. For each cell cycle phase, this protocol returned 20 subsamples, from which 20 GCNs were built. In each GCN, nodes correspond to sequences assigned to genes, connected by edges. To build these GCNs, we used the normalized count matrix as input for package WGCNA⁹⁷ and the function cor to compute the Pearson correlation coefficients (PCC). A high PCC threshold for edge inclusion in a GCN was set at 0.8 as a first step to avoid spurious correlations: the edges in each final filtered GCN were thus weighted by either strong positive (PCC > 0.8) or negative (PCC < -0.8) PCC values. Then, for each phase of the cell cycle, consensus GCNs were constructed using a majority rule that only retained high correlation edges present in more than 80% of the replicate GCNs (see Supplementary Note 7 for a more detailed description of the method used for the construction of the GCNs). This strategy identified strongly correlated gene co-expression very commonly observed in a cell cycle phase and robust to sampling effects, because these co-expressions are observed almost irrespectively of what transcriptome samples were used to describe each cell cycle phase.
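The subsample-and-vote logic can be sketched in Python with NumPy as follows. This is a simplified illustration, not the WGCNA-based procedure: subsamples are drawn purely at random (the constraint of at most 2 shared transcriptomes between replicates is not enforced), positive and negative correlations are pooled, and the toy expression matrix and function names are assumptions.

```python
import numpy as np

def strong_edges(expr, threshold=0.8):
    """Boolean gene-by-gene matrix of |PCC| > threshold (rows of expr are genes)."""
    pcc = np.corrcoef(expr)
    edges = np.abs(pcc) > threshold
    np.fill_diagonal(edges, False)  # no self-edges
    return edges

def consensus_edges(expr, n_replicates=20, subsample_size=5,
                    threshold=0.8, majority=0.8, seed=0):
    """Keep edges found in more than `majority` of the subsampled networks."""
    rng = np.random.default_rng(seed)
    n_genes, n_samples = expr.shape
    votes = np.zeros((n_genes, n_genes))
    for _ in range(n_replicates):
        cols = rng.choice(n_samples, size=subsample_size, replace=False)
        votes += strong_edges(expr[:, cols], threshold)
    return votes / n_replicates > majority

# Toy matrix: 4 genes x 15 transcriptomes of one phase (made-up values).
base = np.arange(15, dtype=float)
expr = np.vstack([base, 2 * base + 1, -base, np.cos(1.7 * base)])
network = consensus_edges(expr)
print(network.astype(int))
```

In this toy matrix the first three genes are perfectly (anti)correlated, so their edges survive every subsample, while the edges of the fourth gene depend on which transcriptomes are drawn.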

Classification of genes by conservation level

To estimate the conservation of S. islandicus genes among archaeal groups, a reference archaeal protein sequence dataset was assembled from 292 complete representative genome assemblies out of all archaeal genomic assemblies recorded in the RefSeq database⁹⁸ and all proteins of S. islandicus REY15A. Diamond⁹⁹ was used to perform an all-against-all comparison (with parameters -e 1e-5 -k 1000) using this reference protein sequence dataset. Gene families were computed from the results of the Diamond search using a sequence similarity network, built by filtering results (using standard thresholding parameters: minimum % identity = 30%; minimum mutual sequence coverage = 80%), using the cleanblast and familydetector scripts from the MultiTwin tool¹⁰⁰. The resulting gene families were mapped on a reference archaeal phylogeny¹⁰¹ using the ete3 Python package, and further categorized by conservation level, based on the phylogenetic distribution of their members. Differences in conservation levels between genes in the core network and non-core genes were tested by performing a Chi-square test comparing the conservation distribution of the non-core genome with the core genes.

Identification of phase-specific signature genes

To estimate the signature genes for each phase, we applied a transcriptome deconvolution method¹⁰². We assumed that each sample consisted of three subpopulations of cells: (i) M-G1, (ii) S or (iii) G2. Each subpopulation has a different proportion of cells: \(p_{\text{M-G1}} + p_{\text{S}} + p_{\text{G2}} = 100\%\) (\(p_{\text{M-G1}}, p_{\text{S}}, p_{\text{G2}}\) = percentage of cells in the M-G1, S and G2 subpopulations, respectively). Moreover, we assumed that each gene has a constant transcription level (number of reads) in all cells from one subpopulation (\(t_{\text{M-G1}}, t_{\text{S}}, t_{\text{G2}}\) = transcription levels of a gene in the subpopulations M-G1, S and G2). With these assumptions, the total transcription level of a gene (\(T\)) can be presented as a sum of the transcription levels from three subpopulations of cells: \(T = p_{\text{M-G1}} \cdot t_{\text{M-G1}} + p_{\text{S}} \cdot t_{\text{S}} + p_{\text{G2}} \cdot t_{\text{G2}}\). The total transcription level or number of reads of a gene (\(T\)) was calculated with featureCounts (see above). The percentage of cells in different subpopulations (\(p_{\text{M-G1}}, p_{\text{S}}, p_{\text{G2}}\)) was obtained from the flow cytometry data (see above) of three representative biological replicates (Supplementary Fig. S1A; Supplementary Data 4). From this data, for each gene we have nine linear equations (\(T = p_{\text{M-G1}} \cdot t_{\text{M-G1}} + p_{\text{S}} \cdot t_{\text{S}} + p_{\text{G2}} \cdot t_{\text{G2}}\)) with three unknown variables (\(t_{\text{M-G1}}, t_{\text{S}}, t_{\text{G2}}\)). The unknown variables were predicted using the non-negative least squares method in Python (nnls function, SciPy version 1.11.4). As a result, for each gene we estimate \(t_{\text{M-G1}}, t_{\text{S}}, t_{\text{G2}}\), the transcription levels or number of reads in each phase. Once the reads were predicted, we excluded all the genes whose expression was estimated to be lower than 3 counts per million reads in all phases. The remaining genes were clustered based on K-means into four different groups depending on whether they were more expressed at one of the three studied phases (groups 1-3) or displayed a similar expression throughout the cell cycle (group 4). A second round of clustering was performed excluding all poorly annotated genes (arCOG categories R and S).
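The per-gene deconvolution step can be sketched with the SciPy `nnls` function named above. The cell fractions and transcription levels below are made-up values (fractions rather than percentages), and `deconvolve_gene` is a hypothetical helper; the sketch covers one gene and omits the filtering and K-means clustering steps.

```python
import numpy as np
from scipy.optimize import nnls

def deconvolve_gene(proportions, totals):
    """Estimate per-phase levels (t_M-G1, t_S, t_G2) of one gene by NNLS.

    proportions: (n_samples, 3) fractions of cells in M-G1, S and G2 per sample.
    totals: (n_samples,) observed transcription level T of the gene per sample.
    """
    levels, residual_norm = nnls(np.asarray(proportions, dtype=float),
                                 np.asarray(totals, dtype=float))
    return levels, residual_norm

# Hypothetical cell fractions for nine samples (3 replicates x 3 time points).
proportions = np.array([
    [0.60, 0.25, 0.15], [0.55, 0.30, 0.15], [0.65, 0.20, 0.15],  # M-G1 enriched
    [0.20, 0.60, 0.20], [0.25, 0.55, 0.20], [0.15, 0.65, 0.20],  # S enriched
    [0.10, 0.20, 0.70], [0.15, 0.15, 0.70], [0.10, 0.25, 0.65],  # G2 enriched
])
true_levels = np.array([120.0, 10.0, 5.0])  # made-up gene peaking in M-G1
totals = proportions @ true_levels          # T = sum over phases of p * t

estimated, residual = deconvolve_gene(proportions, totals)
print(np.round(estimated, 3), residual)
```

Because the simulated totals are noise-free, the estimate should recover the made-up phase levels; with real data the residual norm indicates how well the three-subpopulation model fits a given gene.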

Functional enrichment analysis of phase-specific signature genes was performed using the clusterProfiler package in R. The function enricher was used to test the enrichment of the different arCOG categories, and its significance, in the phase-specific genes (M-G1, S and G2) compared to the full genome. The Benjamini-Hochberg method was used for multiple testing correction. An adjusted p value of less than 0.05 was considered significant.
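An over-representation test of this kind is commonly computed as a one-sided hypergeometric tail probability; the following Python sketch shows that calculation for a single category, under the assumption that this is the statistic of interest. The function name and the numbers in the usage note are illustrative only, and multiple testing correction is not included.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N genes in genome, K in category, n drawn).

    k: phase-specific genes that belong to the category
    n: size of the phase-specific gene set
    K: genes in the genome annotated with the category
    N: total number of annotated genes in the genome
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)
```

For instance, `hypergeom_enrichment_p(2, 2, 2, 4)` gives 1/6: when 2 of 4 genes carry an annotation, drawing both of them in a set of 2 happens for one of the six possible sets.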

Phylogenetic distribution of S. islandicus phase-specific genes

To determine whether S. islandicus phase-specific genes have homologs within bacterial and/or eukaryotic genomes, a reference protein sequence dataset was built using all the reference proteomes from Uniprot¹⁰³ for Bacteria (9285 proteomes) and Eukaryota (2625 proteomes). Diamond⁹⁹ was used to perform a search (with parameters -e 1e-5 -k 10000) for homologs of 1530 proteins encoded by S. islandicus phase-specific genes against this reference protein sequence dataset. Depending on the phylogenetic diversity in homolog gene sets, S. islandicus phase-specific genes were classified as Prokaryotic if shared with bacteria only, Archaeoeukaryotic if shared with eukaryotes only or Mixed if shared with both groups.

Phase-specific gene expression in S. islandicus and eukaryotes

To compare the cell cycle-dependent gene expression patterns in archaea and eukaryotes, eukaryotic homologs of S. islandicus phase-specific genes were identified in the CycleBase database⁷¹, which records the expression profiles of periodically expressed genes during the eukaryotic cell cycle for four eukaryotic species (S. cerevisiae, S. pombe, H. sapiens and A. thaliana). Only the transcriptomic data of the CycleBase dataset were used for comparison. A blastp search was performed (with parameters -evalue 1e-5 -word_size 5) using the proteins encoded by S. islandicus phase-specific genes as queries against a eukaryotic protein sequence dataset combining the proteins from the UniProt reference proteomes of S. cerevisiae (UP000002311), S. pombe (UP000002485), H. sapiens (UP000005640) and A. thaliana (UP000006548). Since hits were mostly found within S. cerevisiae proteins, expression peak times were then compared between S. islandicus and S. cerevisiae. Each S. islandicus phase-specific signature gene was considered similar to the S. cerevisiae cell cycle transcriptomic data when at least one of its three closest periodically expressed S. cerevisiae homologs (best blast hits) was maximally expressed at a corresponding phase, according to either relaxed (correspondences: archaeal M-G1 with eukaryotic G1, G1/S, G2/M and M phases; archaeal S with eukaryotic G1/S and S phases; archaeal G2 with eukaryotic G2 and G2/M phases) or strict (correspondences: archaeal M-G1 with eukaryotic G1 and M phases; archaeal S with eukaryotic S phase; archaeal G2 with eukaryotic G2 phase) phase correspondence rules.
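
The relaxed and strict correspondence rules can be written as lookup tables. The following Python sketch encodes them as given in the text and checks whether any of the three closest homologs peaks in a corresponding phase; the function name and the example peak lists are hypothetical, and the homologs are assumed to be ordered from best to worst blast hit.

```python
# Phase correspondences as listed in the text (archaeal phase -> yeast phases).
RELAXED = {
    "M-G1": {"G1", "G1/S", "G2/M", "M"},
    "S": {"G1/S", "S"},
    "G2": {"G2", "G2/M"},
}
STRICT = {
    "M-G1": {"G1", "M"},
    "S": {"S"},
    "G2": {"G2"},
}


def phases_match(archaeal_phase, yeast_peak_phases, rules, top_n=3):
    """True if any of the top_n closest homologs peaks in a corresponding phase."""
    allowed = rules[archaeal_phase]
    return any(phase in allowed for phase in yeast_peak_phases[:top_n])


# Invented example: an archaeal M-G1 gene whose best yeast homolog peaks at G2/M.
relaxed_hit = phases_match("M-G1", ["G2/M", "S"], RELAXED)
strict_hit = phases_match("M-G1", ["G2/M", "S"], STRICT)
print(relaxed_hit, strict_hit)
```

The same gene can therefore count as similar under the relaxed rules and not under the strict ones, which is why both tallies are reported separately.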

Two-step RT-qPCR

After RNA extraction and purification (see above), first-strand cDNAs were synthesized from the total RNA using the LunaScript® RT SuperMix Kit (New England Biolabs) following the manufacturer’s instructions. Briefly, 900 ng of RNA were mixed with the SuperMix and the reactions were incubated in a thermocycler at 25 °C for 2 minutes, followed by 10 minutes at 55 °C and finally 1 minute at 95 °C to heat-inactivate the enzyme. One microlitre of the product was used as template in qPCR to evaluate the mRNA levels of the targeted genes. qPCR was performed using Luna® Universal qPCR Master Mix (New England Biolabs) and gene-specific primers (Supplementary Table S1). The reaction was performed following the manufacturer’s instructions, with an initial denaturation at 95 °C for 2 min followed by 35 cycles of 95 °C for 15 seconds and 60 °C for 30 seconds. The qPCR was run on a CFX Opus 96 Real-Time PCR System (Bio-Rad), and the normalized expression was calculated using the software provided by the manufacturer with TBP (SiRe_RS05760) as the reference gene. Briefly, expression was calculated using the ΔΔCq normalization, or comparative Ct, method¹⁰⁴. This method calculates the relative quantity of the gene of interest (ΔCq) as: \(\mathrm{Relative\ Quantity}={2}^{(\mathrm{Cq}_{\mathrm{Min}}-\mathrm{Cq}_{\mathrm{Sample}})}\) (where \(\mathrm{Cq}_{\mathrm{Min}}\) = the average Cq for the sample with the lowest average Cq for the gene of interest; and \(\mathrm{Cq}_{\mathrm{Sample}}\) = the average Cq for the sample). The relative quantity (RQ) is then used to calculate the normalized expression (ΔΔCq) with the following formula: \(\mathrm{Normalized\ Expression}=\frac{\mathrm{RQ}_{\mathrm{Sample}}}{\mathrm{RQ}_{\mathrm{Sample\ Ref}}}\) (where \(\mathrm{RQ}_{\mathrm{Sample\ Ref}}\) = the relative quantity of the reference gene in the same sample).
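
As a worked example of the two formulas above, the Python sketch below computes relative quantities and normalized expression from average Cq values. It is an illustration, not the instrument software: the Cq values are invented, the function names are hypothetical, and it assumes that the reference gene's relative quantity is computed against its own lowest average Cq, in the same way as for the gene of interest.

```python
def relative_quantity(cq_sample, cq_min):
    """RQ = 2 ** (Cq_Min - Cq_Sample), as in the formula in the text."""
    return 2.0 ** (cq_min - cq_sample)


def normalized_expression(goi_cq, ref_cq):
    """Normalized expression per sample for one gene of interest.

    goi_cq, ref_cq: dicts mapping sample name to the average Cq of the
    gene of interest and of the reference gene, respectively.
    """
    goi_min = min(goi_cq.values())
    # Assumption of this sketch: Cq_Min for the reference gene is its own
    # lowest average Cq across samples.
    ref_min = min(ref_cq.values())
    return {
        sample: relative_quantity(goi_cq[sample], goi_min)
        / relative_quantity(ref_cq[sample], ref_min)
        for sample in goi_cq
    }


# Invented average Cq values for three samples.
goi = {"M-G1": 22.0, "S": 20.0, "G2": 21.0}
ref = {"M-G1": 18.0, "S": 18.0, "G2": 19.0}
expression = normalized_expression(goi, ref)
print(expression)
```

With these numbers the gene of interest is four-fold lower in the M-G1 sample than in the S sample, while the apparent two-fold drop in G2 disappears once the reference gene is taken into account.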

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The raw reads generated in this study were deposited in the European Nucleotide Archive under the accession number PRJEB75364 and in the Gene Expression Omnibus (GEO) repository under the accession number GSE296035.

References

Wang, Z. Cell cycle progression and synchronization: an overview. Methods Mol. Biol. 2579, 3–23 (2022).
Reyes-Lamothe, R. & Sherratt, D. J. The bacterial cell cycle, chromosome inheritance and cell growth. Nat. Rev. Microbiol 17, 467–478 (2019).
Cezanne, A., Foo, S., Kuo, Y. W. & Baum, B. The archaeal cell cycle. Annu Rev. Cell Dev. Biol. https://doi.org/10.1146/annurev-cellbio-111822-120242 (2024).
McQuillen, R. & Xiao, J. Insights into the structure, function, and dynamics of the bacterial cytokinetic FtsZ-Ring. Annu Rev. Biophys. 49, 309–341 (2020).
Hurley, J. H. ESCRTs are everywhere. EMBO J. 34, 2398–2407 (2015).
Vietri, M., Radulovic, M. & Stenmark, H. The many functions of ESCRTs. Nat. Rev. Mol. Cell Biol. 21, 25–42 (2020).
Harashima, H., Dissmeyer, N. & Schnittger, A. Cell cycle control across the eukaryotic kingdom. Trends Cell Biol. 23, 345–356 (2013).
Cooper, G. M. The eukaryotic cell cycle in The Cell: A Molecular Approach (Sinauer Associates, 2000).
Dang, F., Nie, L. & Wei, W. Ubiquitin signaling in cell cycle control and tumorigenesis. Cell Death Differ. 28, 427–438 (2021).
Milletti, G., Colicchia, V. & Cecconi, F. Cyclers’ kinases in cell division: from molecules to cancer therapy. Cell Death Differ. 30, 2035–2052 (2023).
Yang, L., Besschetnova, T. Y., Brooks, C. R., Shah, J. V. & Bonventre, J. V. Epithelial cell cycle arrest in G2/M mediates kidney fibrosis after injury. Nat. Med 16, 535–543 (2010).
Newton, A. & Ohta, N. Cell cycle regulation in bacteria. Curr. Opin. Cell Biol. 4, 180–185 (1992).
Skarstad, K., Steen, H. B. & Boye, E. Escherichia coli DNA distributions measured by flow cytometry and compared with theoretical computer simulations. J. Bacteriol. 163, 661–668 (1985).
Marczynski, G. T., Dingwall, A. & Shapiro, L. Plasmid and chromosomal DNA replication and partitioning during the Caulobacter crescentus cell cycle. J. Mol. Biol. 212, 709–722 (1990).
Si, F. et al. Invariance of initiation mass and predictability of cell size in Escherichia coli. Curr. Biol. 27, 1278–1287 (2017).
Beaufay, F., Coppine, J. & Hallez, R. When the metabolism meets the cell cycle in bacteria. Curr. Opin. Microbiol 60, 104–113 (2021).
Eme, L., Spang, A., Lombard, J., Stairs, C. W. & Ettema, T. J. G. Archaea and the origin of eukaryotes. Nat. Rev. Microbiol 15, 711–723 (2017).
Spang, A. et al. Complex archaea that bridge the gap between prokaryotes and eukaryotes. Nature 521, 173–179 (2015).
Lindas, A. C. & Bernander, R. The cell cycle of archaea. Nat. Rev. Microbiol 11, 627–638 (2013).
Olsen, G. J. & Woese, C. R. Archaeal genomics: an overview. Cell 89, 991–994 (1997).
Greci, M. D. & Bell, S. D. Archaeal DNA replication. Annu Rev. Microbiol 74, 65–80 (2020).
Liao, Y., Ithurbide, S., Evenhuis, C., Lowe, J. & Duggin, I. G. Cell division in the archaeon Haloferax volcanii relies on two FtsZ proteins with distinct functions in division ring assembly and constriction. Nat. Microbiol 6, 594–605 (2021).
Zhao, S. et al. Widespread photosynthesis reaction centre barrel proteins are necessary for haloarchaeal cell division. Nat. Microbiol 9, 712–726 (2024).
Nußbaum, P. et al. Proteins containing photosynthetic reaction centre domains modulate FtsZ-based archaeal cell division. Nat. Microbiol 9, 698–711 (2024).
Samson, R. Y., Obita, T., Freund, S. M., Williams, R. L. & Bell, S. D. A role for the ESCRT system in cell division in archaea. Science 322, 1710–1713 (2008).
Makarova, K. S. et al. Diversity, origin, and evolution of the ESCRT systems. mBio 15, e0033524 (2024).
Tarrason Risa, G. et al. The proteasome controls ESCRT-III-mediated cell division in an archaeon. Science https://doi.org/10.1126/science.aaz2532 (2020).
Liu, J. et al. A relay race of ESCRT-III paralogs drives cell division in a hyperthermophilic archaeon. mBio 16, e0099124 (2025).
Lundgren, M. & Bernander, R. Genome-wide transcription map of an archaeal cell cycle. Proc. Natl. Acad. Sci. USA 104, 2939–2944 (2007).
Lundgren, M., Malandrin, L., Eriksson, S., Huber, H. & Bernander, R. Cell cycle characteristics of crenarchaeota: unity among diversity. J. Bacteriol. 190, 5362–5367 (2008).
Hjort, K. & Bernander, R. Changes in cell size and DNA content in Sulfolobus cultures during dilution and temperature shift experiments. J. Bacteriol. 181, 5669–5675 (1999).
Yang, Y. et al. A novel RHH family transcription factor aCcr1 and its viral homologs dictate cell cycle progression in archaea. Nucleic Acids Res 51, 1707–1723 (2023).
Hurtig, F. et al. The patterned assembly and stepwise Vps4-mediated disassembly of composite ESCRT-III polymers drives archaeal cell division. Sci. Adv. 9, eade5224 (2023).
Yen, C. Y. et al. Chromosome segregation in Archaea: SegA- and SegB-DNA complex structures provide insights into segrosome assembly. Nucleic Acids Res 49, 13150–13164 (2021).
Badel, C., Samson, R. Y. & Bell, S. D. Chromosome organization affects genome evolution in Sulfolobus archaea. Nat. Microbiol 7, 820–830 (2022).
Takemata, N., Samson, R. Y. & Bell, S. D. Physical and functional compartmentalization of archaeal chromosomes. Cell 179, 165–179.e118 (2019).
Makarova, K. S., Wolf, Y. I. & Koonin, E. V. Archaeal clusters of orthologous genes (arCOGs): an update and application for analysis of shared features between thermococcales, methanococcales, and methanobacteriales. Life (Basel) 5, 818–840 (2015).
Zhang, C., Phillips, A. P. R., Wipfler, R. L., Olsen, G. J. & Whitaker, R. J. The essential genome of the crenarchaeal model Sulfolobus islandicus. Nat. Commun. 9, 4908 (2018).
Wang, F. et al. Spindle-shaped archaeal viruses evolved from rod-shaped ancestors to package a larger genome. Cell 185, 1297–1307.e11 (2022).
Jain, S., Caforio, A. & Driessen, A. J. Biosynthesis of archaeal membrane ether lipids. Front Microbiol 5, 641 (2014).
Guan, Z. et al. Gene deletions leading to a reduction in the number of cyclopentane rings in Sulfolobus acidocaldarius tetraether lipids. FEMS Microbiol. Lett. 365, fnx250 (2018).
Zeng, Z. et al. Identification of a protein responsible for the synthesis of archaeal membrane-spanning GDGT lipids. Nat. Commun. 13, 1545 (2022).
Zeng, Z., Liu, X. L., Wei, J. H., Summons, R. E. & Welander, P. V. Calditol-linked membrane lipids are required for acid tolerance in Sulfolobus acidocaldarius. Proc. Natl. Acad. Sci. USA 115, 12932–12937 (2018).
Hemmi, H., Shibuya, K., Takahashi, Y., Nakayama, T. & Nishino, T. (S)-2,3-Di-O-geranylgeranylglyceryl phosphate synthase from the thermoacidophilic archaeon Sulfolobus solfataricus. Molecular cloning and characterization of a membrane-intrinsic prenyltransferase involved in the biosynthesis of archaeal ether-linked membrane lipids. J. Biol. Chem. 279, 50197–50203 (2004).
De Kock, V., Peeters, E. & Baes, R. The Lrs14 family of DNA-binding proteins as nucleoid-associated proteins in the Crenarchaeal order Sulfolobales. Mol. Microbiol. 123, 132–142 (2025).
Berg, I. A., Kockelkorn, D., Buckel, W. & Fuchs, G. A 3-hydroxypropionate/4-hydroxybutyrate autotrophic carbon dioxide assimilation pathway in Archaea. Science 318, 1782–1786 (2007).
Wang, K. et al. A TetR-family transcription factor regulates fatty acid metabolism in the archaeal model organism Sulfolobus acidocaldarius. Nat. Commun. 10, 1542 (2019).
Schocke, L., Brasen, C. & Siebers, B. Thermoacidophilic Sulfolobus species as source for extremozymes and as novel archaeal platform organisms. Curr. Opin. Biotechnol. 59, 71–77 (2019).
Kalliomaa-Sanford, A. K. et al. Chromosome segregation in Archaea mediated by a hybrid DNA partition machine. Proc. Natl. Acad. Sci. USA 109, 3754–3759 (2012).
Chaudhury, P., Quax, T. E. F. & Albers, S. V. Versatile cell surface structures of archaea. Mol. Microbiol 107, 298–311 (2018).
Henche, A. L. et al. Structure and function of the adhesive type IV pilus of Sulfolobus acidocaldarius. Environ. Microbiol 14, 3188–3202 (2012).
Lassak, K. et al. Molecular analysis of the crenarchaeal flagellum. Mol. Microbiol 83, 110–124 (2012).
Ghosh, A., Hartung, S., van der Does, C., Tainer, J. A. & Albers, S. V. Archaeal flagellar ATPase motor shows ATP-dependent hexameric assembly and activity stimulation by specific lipid binding. Biochem J. 437, 43–52 (2011).
Shahapure, R., Driessen, R. P., Haurat, M. F., Albers, S. V. & Dame, R. T. The archaellum: a rotating type IV pilus. Mol. Microbiol 91, 716–723 (2014).
Kreutzberger, M. A. B. et al. The evolution of archaeal flagellar filaments. Proc. Natl. Acad. Sci. USA 120, e2304256120 (2023).
Makarova, K. S., Koonin, E. V. & Albers, S. V. Diversity and evolution of Type IV pili systems in archaea. Front Microbiol 7, 667 (2016).
Wang, F. et al. The structures of two archaeal type IV pili illuminate evolutionary relationships. Nat. Commun. 11, 3424 (2020).
Liu, J. et al. Two distinct archaeal type IV pili structures formed by proteins with identical sequence. Nat. Commun. 15, 5049 (2024).
Kreutzberger, M. A. B. et al. Convergent evolution in the supercoiling of prokaryotic flagellar filaments. Cell 185, 3487–3500.e3414 (2022).
Charles-Orszag, A., Lord, S. J. & Mullins, R. D. High-Temperature live-cell imaging of cytokinesis, cell motility, and cell-cell interactions in the thermoacidophilic crenarchaeon Sulfolobus acidocaldarius. Front Microbiol 12, 707124 (2021).
Lassak, K., Peeters, E., Wrobel, S. & Albers, S. V. The one-component system ArnR: a membrane-bound activator of the crenarchaeal archaellum. Mol. Microbiol 88, 125–139 (2013).
Manica, A. & Schleper, C. CRISPR-mediated defense mechanisms in the hyperthermophilic archaeal genus Sulfolobus. RNA Biol. 10, 671–678 (2013).
Payne, L. J. et al. Identification and classification of antiviral defence systems in bacteria and archaea with PADLOC reveals new system types. Nucleic Acids Res 49, 10868–10878 (2021).
Maestri, A. et al. The bacterial defense system MADS interacts with CRISPR-Cas to limit phage infection and escape. Cell Host Microbe 32, 1412–1426.e11 (2024).
Cvetkovic, M. A., Wurm, J. P., Audin, M. J., Schutz, S. & Sprangers, R. The Rrp4-exosome complex recruits and channels substrate RNA by a unique mechanism. Nat. Chem. Biol. 13, 522–528 (2017).
Quax, T. E. et al. Massive activation of archaeal defense genes during viral infection. J. Virol. 87, 8419–8428 (2013).
Leon-Sobrino, C., Kot, W. P. & Garrett, R. A. Transcriptome changes in STSV2-infected Sulfolobus islandicus REY15A undergoing continuous CRISPR spacer acquisition. Mol. Microbiol 99, 719–728 (2016).
Babu, M. et al. A dual function of the CRISPR-Cas system in bacterial antivirus immunity and DNA repair. Mol. Microbiol 79, 484–502 (2011).
Miezner, G. et al. An archaeal Cas3 protein facilitates rapid recovery from DNA damage. Microlife 4, uqad007 (2023).
Sakata, E., Eisele, M. R. & Baumeister, W. Molecular and cellular dynamics of the 26S proteasome. Biochim Biophys. Acta Proteins Proteom. 1869, 140583 (2021).
Santos, A., Wernersson, R. & Jensen, L. J. Cyclebase 3.0: a multi-organism database on cell-cycle regulation and phenotypes. Nucleic Acids Res 43, D1140–D1144 (2015).
Pountain, A. W. et al. Transcription-replication interactions reveal bacterial genome regulation. Nature 626, 661–669 (2024).
Hajduk, I. V., Rodrigues, C. D. & Harry, E. J. Connecting the dots of the bacterial cell cycle: Coordinating chromosome replication and segregation with cell division. Semin Cell Dev. Biol. 53, 2–9 (2016).
Zhou, P. & Helmstetter, C. E. Relationship between ftsZ gene expression and chromosome replication in Escherichia coli. J. Bacteriol. 176, 6100–6106 (1994).
Arjes, H. A. et al. Failsafe mechanisms couple division and DNA replication in bacteria. Curr. Biol. 24, 2149–2155 (2014).
Samson, R. Y. et al. Specificity and function of archaeal DNA replication initiator proteins. Cell Rep. 3, 485–496 (2013).
Bernander, R., Lundgren, M. & Ettema, T. J. Comparative and functional analysis of the archaeal cell cycle. Cell Cycle 9, 794–806 (2010).
Basu, S., Greenwood, J., Jones, A. W. & Nurse, P. Core control principles of the eukaryotic cell cycle. Nature 607, 381–386 (2022).
Limas, J. C. & Cook, J. G. Preparation for DNA replication: the key to a successful S phase. FEBS Lett. 593, 2853–2867 (2019).
Bell, S. D. Initiation of DNA Replication in the Archaea. Adv. Exp. Med Biol. 1042, 99–115 (2017).
Lundgren, M., Andersson, A., Chen, L., Nilsson, P. & Bernander, R. Three replication origins in Sulfolobus species: synchronous initiation of chromosome replication and asynchronous termination. Proc. Natl. Acad. Sci. USA 101, 7046–7051 (2004).
Nelson, D. M. et al. Coupling of DNA synthesis and histone synthesis in S phase independent of cyclin/cdk2 activity. Mol. Cell Biol. 22, 7459–7472 (2002).
Weixlbaumer, A., Grunberger, F., Werner, F. & Grohmann, D. Coupling of Transcription and Translation in Archaea: Cues From the Bacterial World. Front Microbiol 12, 661827 (2021).
French, S. L., Santangelo, T. J., Beyer, A. L. & Reeve, J. N. Transcription and translation are coupled in Archaea. Mol. Biol. Evol. 24, 893–895 (2007).
Cuijpers, S. A. G. & Vertegaal, A. C. O. Guiding mitotic progression by crosstalk between post-translational modifications. Trends Biochem Sci. 43, 251–268 (2018).
Vorontsov, E. A., Rensen, E., Prangishvili, D., Krupovic, M. & Chamot-Rooke, J. Abundant lysine methylation and N-terminal acetylation in Sulfolobus islandicus revealed by bottom-up and top-down proteomics. Mol. Cell Proteom. 15, 3388–3404 (2016).
Anjum, R. S. et al. Involvement of a eukaryotic-like ubiquitin-related modifier in the proteasome pathway of the archaeon Sulfolobus acidocaldarius. Nat. Commun. 6, 8163 (2015).
Deng, L., Zhu, H., Chen, Z., Liang, Y. X. & She, Q. Unmarked gene deletion and host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus. Extremophiles 13, 735–746 (2009).
Liu, J. et al. Archaeal extracellular vesicles are produced in an ESCRT-dependent manner and promote gene transfer and nutrient cycling in extreme environments. ISME J. 15, 2892–2905 (2021).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010).
Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28, 27–30 (2000).
Stothard, P. The sequence manipulation suite: JavaScript programs for analyzing and formatting protein and DNA sequences. Biotechniques 28, 1102–1104 (2000).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinforma. 9, 559 (2008).
O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res 44, D733–D745 (2016).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Corel, E. et al. MultiTwin: a software suite to analyze evolution at multiple levels of organization using multipartite graphs. Genome Biol. Evol. 10, 2777–2784 (2018).
Mendler, K. et al. AnnoTree: visualization and exploration of a functionally annotated microbial tree of life. Nucleic Acids Res 47, 4442–4448 (2019).
Avila Cobos, F., Alquicira-Hernandez, J., Powell, J. E., Mestdagh, P. & De Preter, K. Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat. Commun. 11, 5650 (2020).
UniProt Consortium. UniProt: the universal protein knowledgebase in 2023. Nucleic Acids Res 51, D523–D531 (2023).
Schmittgen, T. D. & Livak, K. J. Analyzing real-time PCR data by the comparative C(T) method. Nat. Protoc. 3, 1101–1108 (2008).


Acknowledgements

This work was supported by Agence Nationale de la Recherche grant ANR-23-CE13-022 to MK. The work in EB laboratory was supported by an ATM grant from the MNHN (ATM AAP 2023) and an Emergence grant from Sorbonne Université (S21JR31001—IP/S/V2 EMERG-ESPA). MGRV was supported by a stipend from the Pasteur-Paris University (PPU) International PhD Program. We also acknowledge the help of Pierre-Henri Commere and the Flow Cytometry platform at Institut Pasteur. The Biomics Platform, C2RT, Institut Pasteur, Paris, France, is supported by France Génomique (ANR-10-INBS-09) and IBISA.

Author information

These authors contributed equally: Miguel V. Gomez-Raya-Vilanova, Jérôme Teulière.

Authors and Affiliations

Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, 75015, Paris, France
Miguel V. Gomez-Raya-Vilanova, Sofia Medvedeva, Virginija Cvirkaite-Krupovic & Mart Krupovic
Sorbonne Université, Collège doctoral, F-75005, Paris, France
Miguel V. Gomez-Raya-Vilanova
Institut de Systématique, Évolution, Biodiversité (ISYEB), Sorbonne Université, CNRS, Museum National d’Histoire Naturelle, EPHE, Université Des Antilles, Paris, France
Jérôme Teulière, Yuping Dai, Eduardo Corel, Philippe Lopez & Eric Bapteste
Department of Computational, Quantitative and Synthetic Biology (CQSB), Sorbonne Université, CNRS, IBPS, UMR7238, Paris, 75005, France
Jérôme Teulière, Yuping Dai, Eduardo Corel, Philippe Lopez & Eric Bapteste
Department of Microbiology and Infectious Diseases, Faculty of Medicine and Health Sciences, Université de Sherbrooke, Sherbrooke, QC, Canada
Yuping Dai & Louis-Patrick Haraoui
Département de Sciences Biologiques, Complexe Des Sciences, Université de Montréal, Montréal, QC, Canada
François-Joseph Lapointe
Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA
Debashish Bhattacharya
Institut Pasteur, Université Paris Cité, Plate-forme Technologique Biomics, Paris, France
Elodie Turc & Marc Monot

Authors

Miguel V. Gomez-Raya-Vilanova
Jérôme Teulière
Sofia Medvedeva
Yuping Dai
Eduardo Corel
Philippe Lopez
François-Joseph Lapointe
Debashish Bhattacharya
Louis-Patrick Haraoui
Elodie Turc
Marc Monot
Virginija Cvirkaite-Krupovic
Eric Bapteste
Mart Krupovic

Contributions

M.G.R.V. performed the experimental studies, carried out the analysis, interpreted the data, wrote the original draft, reviewed and edited the manuscript. J.T. and Y.D. constructed the gene co-expression networks, carried out the analysis, interpreted the data, reviewed and edited the manuscript. S.M. carried out the analysis, reviewed and edited the manuscript. E.C., P.L., F.J.L., D.B. and L.P.H. reviewed and edited the manuscript. E.T. and M.M. prepared the library, performed the RNA sequencing and reviewed the manuscript. V.C.K. supervised the work, interpreted the data, reviewed and edited the manuscript. E.B. designed the co-expression network analysis, interpreted the data, supervised the work, reviewed and edited the manuscript. M.K. supervised the work, interpreted the data, wrote the original draft, reviewed and edited the manuscript.

Corresponding authors

Correspondence to Eric Bapteste or Mart Krupovic.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Daniela Barilla and the other, anonymous, reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Dataset 1

Supplementary Dataset 2

Supplementary Dataset 3

Supplementary Dataset 4

Supplementary Dataset 5

Supplementary Dataset 6

Reporting Summary

Transparent Peer Review file

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Gomez-Raya-Vilanova, M.V., Teulière, J., Medvedeva, S. et al. Transcriptional landscape of the cell cycle in a model thermoacidophilic archaeon reveals similarities to eukaryotes. Nat Commun 16, 5697 (2025). https://doi.org/10.1038/s41467-025-60613-8


Received: 23 October 2024
Accepted: 28 May 2025
Published: 01 July 2025
DOI: https://doi.org/10.1038/s41467-025-60613-8
