Abstract
As an agar-producing red seaweed, Gracilaria vermiculophylla plays a significant role in the food industry and in research fields including evolutionary studies, genetic diversity analysis, and ecological research. However, no high-quality chromosome-level genome of G. vermiculophylla is currently available. In this study, we report a high-quality chromosome-level genome assembly for G. vermiculophylla generated by combining short-read, Nanopore long-read, and Hi-C data. The final assembly is 77.5 Mb in size, with a contig N50 of 2.61 Mb and a scaffold N50 of 3.16 Mb, and comprises 22 pseudochromosomes. Transposable elements (TEs) constituted 45.93 Mb of the G. vermiculophylla genome, with long terminal repeats (LTRs) accounting for 55.03% of the predominant retrotransposons. The G. vermiculophylla genome contains a total of 10,689 protein-coding genes, of which 86.14% have been functionally annotated. BUSCO evaluation, GC content, and sequencing depth assessments demonstrated the high quality of the assembly and the success of the decontamination process. This high-quality genomic information provides an invaluable resource for agar development, evolution studies, comparative genomics, genetic diversity analysis, and ecological research.
Data availability
All data related to the genome of G. vermiculophylla are available through the following databases and links. The raw sequencing data were deposited in the NCBI Sequence Read Archive (SRA) under accession SRP564194 (ref. 48): the short-read DNA sequencing data under SRR32361124, the Hi-C data under SRR32361125, and the Nanopore Cyclone long-read DNA sequencing data under SRR32361123. The genome sequences were deposited in GenBank at https://www.ncbi.nlm.nih.gov/nuccore/JBPJGC000000000.1 (ref. 49). The genome sequences and annotation were also deposited in figshare (ref. 50) as four files: the genome sequences (Gv_unknow_new.fa), the annotation (Gv_unknow_new.gff), the coding sequences (Gv_unknow_new.cds.fa), and the peptide sequences (Gv_unknow_pep.fa). The data are publicly accessible under the CC BY 4.0 license via the persistent identifier https://doi.org/10.6084/m9.figshare.28667702.v3.
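For readers who download the figshare files, the following Python sketch shows one way to recompute basic summary statistics (total length, N50, and gene count) for comparison with the values reported in the abstract. It is not part of the study's workflow. The function names are hypothetical, and the sketch assumes that the FASTA file is uncompressed plain text and that the GFF file marks gene records with the feature type `gene` in its third column, which may not hold for the deposited annotation.

```python
from typing import Iterable, List, TextIO, Tuple


def fasta_lengths(handle: TextIO) -> List[int]:
    """Return the length of every sequence in a plain-text FASTA stream."""
    lengths: List[int] = []
    current = None  # stays None until the first '>' header is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


def count_gff_features(handle: TextIO, feature: str = "gene") -> int:
    """Count GFF records whose third column equals `feature` (assumed: 'gene')."""
    count = 0
    for line in handle:
        if line.startswith("#"):
            continue  # skip header and comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3 and cols[2] == feature:
            count += 1
    return count


def summarize(fasta_path: str, gff_path: str) -> Tuple[int, int, int]:
    """Return (total bases, N50, gene count) for a local FASTA/GFF pair."""
    with open(fasta_path) as fa:
        lengths = fasta_lengths(fa)
    with open(gff_path) as gff:
        genes = count_gff_features(gff)
    return sum(lengths), n50(lengths), genes
```

Calling `summarize("Gv_unknow_new.fa", "Gv_unknow_new.gff")` on local copies of the deposited files would return the three values. Because the N50 computed this way reflects whatever sequences the FASTA file contains, it corresponds to the scaffold N50 only if the file holds the final scaffolded assembly.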
Code availability
No custom code was developed in this study. Data analyses were conducted in accordance with the protocols outlined in the Methods section.
References
Borg, M. et al. Red macroalgae in the genomic era. New Phytologist 240, 471–488, https://doi.org/10.1111/nph.19211 (2023).
Ismail, M. M., Alotaibi, B. S. & El-Sheekh, M. M. Therapeutic Uses of Red Macroalgae. Molecules 25, https://doi.org/10.3390/molecules25194411 (2020).
Gulbransen, D. J., McGlathery, K. J., Marklund, M., Norris, J. N. & Gurgel, C. F. Gracilaria vermiculophylla (Rhodophyta, Gracilariales) in the Virginia coastal bays, USA: cox1 analysis reveals high genetic richness of an introduced macroalga. J Phycol 48, 1278–1283, https://doi.org/10.1111/j.1529-8817.2012.01218.x (2012).
Wang, X. et al. Diversity of Gracilariaceae (Rhodophyta) in China: An integrative morphological and molecular assessment including a description of Gracilaria tsengii sp. nov. Algal Research 71, 103074, https://doi.org/10.1016/j.algal.2023.103074 (2023).
Sousa, A. M., Alves, V. D., Morais, S., Delerue-Matos, C. & Gonçalves, M. P. Agar extraction from integrated multitrophic aquacultured Gracilaria vermiculophylla: evaluation of a microwave-assisted process using response surface methodology. Bioresour Technol 101, 3258–3267, https://doi.org/10.1016/j.biortech.2009.12.061 (2010).
Sousa, A. M. et al. Structural, physical, and chemical modifications induced by microwave heating on native agar-like galactans. Journal of agricultural and food chemistry 60, 4977–4985, https://doi.org/10.1021/jf2053542 (2012).
Souza, H. K., Sousa, A. M., Gómez, J. & Gonçalves, M. P. Complexation of WPI and microwave-assisted extracted agars with different physicochemical properties. Carbohydr Polym 89, 1073–1080, https://doi.org/10.1016/j.carbpol.2012.03.065 (2012).
Pereira, A. G. et al. The Use of Invasive Algae Species as a Source of Secondary Metabolites and Biological Activities: Spain as Case-Study. Mar Drugs 19, https://doi.org/10.3390/md19040178 (2021).
Magnoni, L. J. et al. Dietary supplementation of heat-treated Gracilaria and Ulva seaweeds enhanced acute hypoxia tolerance in gilthead sea bream (Sparus aurata). Biol Open 6, 897–908, https://doi.org/10.1242/bio.024299 (2017).
Valente, L. M. P. et al. Iodine enrichment of rainbow trout flesh by dietary supplementation with the red seaweed Gracilaria vermiculophylla. Aquaculture 446, 132–139, https://doi.org/10.1016/j.aquaculture.2015.05.004 (2015).
Xiang, J. X. et al. Genome-scale signatures of adaptive gene expression changes in an invasive seaweed Gracilaria vermiculophylla. Mol Ecol 32, 613–627, https://doi.org/10.1111/mec.16776 (2023).
Krueger-Hadfield, S. A. et al. Invasion of novel habitats uncouples haplo-diplontic life cycles. Mol Ecol 25, 3801–3816, https://doi.org/10.1111/mec.13718 (2016).
Liu, Y.-J. et al. The invasive alga Gracilaria vermiculophylla in the native northwest Pacific under ocean warming: Southern genetic consequence and northern range expansion. Frontiers in Marine Science 9, https://doi.org/10.3389/fmars.2022.983685 (2022).
Krueger-Hadfield, S. A. et al. Intraspecific diversity and genetic structure in the widespread macroalga Agarophyton vermiculophyllum. Journal of phycology 57, 1403–1410, https://doi.org/10.1111/jpy.13195 (2021).
Li, Y., Han, H. & Ma, X. Phylogenetic analysis of the complete chloroplast genome of Gracilaria vermiculophylla. Mitochondrial DNA Part B 5, 2141–2142, https://doi.org/10.1080/23802359.2020.1765708 (2020).
Nakamura-Gouvea, N. et al. Insights into agar and secondary metabolite pathways from the genome of the red alga Gracilaria domingensis (Rhodophyta, Gracilariales). J Phycol 58, 406–423, https://doi.org/10.1111/jpy.13238 (2022).
Lipinska, A. P. et al. The Rhodoexplorer Platform for Red Algal Genomics and Whole-Genome Assemblies for Several Gracilaria Species. Genome Biol Evol 15, https://doi.org/10.1093/gbe/evad124 (2023).
Lee, J. et al. Analysis of the Draft Genome of the Red Seaweed Gracilariopsis chorda Provides Insights into Genome Size Evolution in Rhodophyta. Mol Biol Evol 35, 1869–1886, https://doi.org/10.1093/molbev/msy081 (2018).
Sun, X. et al. Genomic analyses of unique carbohydrate and phytohormone metabolism in the macroalga Gracilariopsis lemaneiformis (Rhodophyta). BMC Plant Biol 18, 94, https://doi.org/10.1186/s12870-018-1309-2 (2018).
Flanagan, B. A. et al. Founder effects shape linkage disequilibrium and genomic diversity of a partially clonal invader. Mol Ecol 30, 1962–1978, https://doi.org/10.1111/mec.15854 (2021).
Krueger-Hadfield, S. A. et al. Using RAD-seq to develop sex-linked markers in a haplodiplontic alga. Journal of Phycology 57, 279–294, https://doi.org/10.1111/jpy.13088 (2021).
Lipinska, A. P. et al. Structural and evolutionary features of red algal UV sex chromosomes. Genome biology 26, 341, https://doi.org/10.1186/s13059-025-03797-y (2025).
Hu, Z. M., Zeng, X., Wang, A., Shi, C. & Duan, D. An efficient method for DNA isolation from red algae. Journal of Applied Phycology 16, 161–166 (2004).
Chen, Y. et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience 7, 1–6, https://doi.org/10.1093/gigascience/gix120 (2018).
Zhang, J.-Y. et al. A single-molecule nanopore sequencing platform. bioRxiv, 2024.08.19.608720, https://doi.org/10.1101/2024.08.19.608720 (2024).
Hu, J. et al. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome biology 25, 107, https://doi.org/10.1186/s13059-024-03252-4 (2024).
Vaser, R., Sović, I., Nagarajan, N. & Šikić, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome research 27, 737–746, https://doi.org/10.1101/gr.214270.116 (2017).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PloS one 9, e112963, https://doi.org/10.1371/journal.pone.0112963 (2014).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760, https://doi.org/10.1093/bioinformatics/btp324 (2009).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95, https://doi.org/10.1126/science.aal3327 (2017).
Hanschen, E. R. & Starkenburg, S. R. The state of algal genome quality and diversity. Algal Research 50, 101968, https://doi.org/10.1016/j.algal.2020.101968 (2020).
Burton, J. N., Liachko, I., Dunham, M. J. & Shendure, J. Species-level deconvolution of metagenome assemblies with Hi-C-based contact probability maps. G3 (Bethesda) 4, 1339–1346, https://doi.org/10.1534/g3.114.011825 (2014).
Chen, H. et al. Insights into the Ancient Adaptation to Intertidal Environments by Red Algae Based on a Genomic and Multiomics Investigation of Neoporphyra haitanensis. Mol Biol Evol 39, https://doi.org/10.1093/molbev/msab315 (2022).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).
Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res 110, 462–467, https://doi.org/10.1159/000084979 (2005).
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics Chapter 4, 4.10.1–4.10.14, https://doi.org/10.1002/0471250953.bi0410s25 (2009).
Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA 6, 11, https://doi.org/10.1186/s13100-015-0041-9 (2015).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res 35, W265–268, https://doi.org/10.1093/nar/gkm286 (2007).
Stanke, M., Diekhans, M., Baertsch, R. & Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24, 637–644, https://doi.org/10.1093/bioinformatics/btn013 (2008).
Brawley, S. H. et al. Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta). Proceedings of the National Academy of Sciences of the United States of America 114, E6361–E6370, https://doi.org/10.1073/pnas.1703088114 (2017).
Yu, X., Mo, Z., Tang, X., Gao, T. & Mao, Y. Genome-wide analysis of HSP70 gene superfamily in Pyropia yezoensis (Bangiales, Rhodophyta): identification, characterization and expression profiles in response to dehydration stress. BMC Plant Biol 21, 435, https://doi.org/10.1186/s12870-021-03213-0 (2021).
Nagano, Y., Kimura, K., Kobayashi, G. & Kawamura, Y. Genomic diversity of 39 samples of Pyropia species grown in Japan. PloS one 16, e0252207, https://doi.org/10.1371/journal.pone.0252207 (2021).
Cho, C. H. et al. Genome-wide signatures of adaptation to extreme environments in red algae. Nat Commun 14, 10, https://doi.org/10.1038/s41467-022-35566-x (2023).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915, https://doi.org/10.1038/s41587-019-0201-4 (2019).
Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome biology 20, 278, https://doi.org/10.1186/s13059-019-1910-1 (2019).
Keilwagen, J. et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res 44, e89, https://doi.org/10.1093/nar/gkw092 (2016).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome biology 9, R7, https://doi.org/10.1186/gb-2008-9-1-r7 (2008).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP564194 (2025).
Jian, J. Chromosome-level genome assembly of agar-producing red seaweed Gracilaria vermiculophylla. https://identifiers.org/ncbi/insdc.gca:GCA_054346105.1 (2025).
Jian, J. et al. Chromosome-level genome assembly of agar-producing red seaweed Gracilaria vermiculophylla. figshare https://doi.org/10.6084/m9.figshare.28667702.v1 (2025).
Zhou, Z. et al. Chromosome-level assembly and gene annotation of Kappaphycus striatus genome. Scientific Data 12, 249, https://doi.org/10.1038/s41597-025-04583-y (2025).
Petroll, R. et al. The expanded Bostrychia moritziana genome unveils evolution in the most diverse and complex order of red algae. Current Biology 35, 2771–2788.e8, https://doi.org/10.1016/j.cub.2025.04.044 (2025).
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome biology 21, 245, https://doi.org/10.1186/s13059-020-02134-9 (2020).
Manni, M., Berkeley, M. R., Seppey, M., Simao, F. A. & Zdobnov, E. M. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol Biol Evol 38, 4647–4654, https://doi.org/10.1093/molbev/msab199 (2021).
Marçais, G. et al. MUMmer4: A fast and versatile genome alignment system. PLoS Comput Biol 14, e1005944, https://doi.org/10.1371/journal.pcbi.1005944 (2018).
Tang, H. et al. JCVI: A versatile toolkit for comparative genomics analysis. iMeta 3, e211, https://doi.org/10.1002/imt2.211 (2024).
Acknowledgements
This work was financially supported by Research on industrial innovation technology for Guangdong modern marine ranching (2024-MRI-001-07), the STU Scientific Research Initiation Grant (NTF25030T), and the County-Level Innovation Base of the Guangdong “Hundreds–Thousands–Ten Thousands” High-Quality Development Initiative (Nan’ao County, Shantou City) (No. STKJ2024003).
Author information
Contributions
H. Du and J. Jian conceived the study. Y. Luo, J. Xu, Y. Zhong, Q. Liu, B. Luo, J. Chen and X. Yang collected the samples and performed the experiments, library construction and sequencing. J. Jian, Y. Peng and Z. Wu performed the bioinformatics analysis. J. Jian wrote the manuscript. H. Du and S. Wang revised the manuscript. All authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Jian, J., Luo, Y., Xu, J. et al. Chromosome-level genome assembly of agar-producing red seaweed Gracilaria vermiculophylla. Sci Data (2026). https://doi.org/10.1038/s41597-026-06635-3
DOI: https://doi.org/10.1038/s41597-026-06635-3