Scientific Data
High-Quality Genome Assemblies of Two Prototheca wickerhamii Strains
  • Data Descriptor
  • Open access
  • Published: 09 March 2026

  • Lihua Fang (ORCID: orcid.org/0009-0003-4864-0672) 1 na1,
  • Jian Guo 2 na1,
  • Qing Ning 3,
  • Yuhang Luo 4,
  • Jianbo Jian 4 &
  • Jie Ning 1

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Bacterial genetics
  • DNA sequencing

Abstract

Prototheca wickerhamii is a non-photosynthetic microalgal species that has been implicated in opportunistic human infections. Understanding its genomic features is crucial for both medical applications and symbiosis research. We generated high-quality genome assemblies for two strains of P. wickerhamii, Pw26 and PwS1, using PacBio HiFi reads. The assemblies were evaluated for completeness and accuracy using BUSCO analysis. The assembled genomes for Pw26 and PwS1 were 17.8 Mb and 17.4 Mb, respectively, with contig N50 values of 1.6 Mb. The number of assembled contigs is closely related to the number of chromosomes. The GC content was 63.5% for both genomes. Comparative analysis showed high similarity in genome size and alignment, with Pw26 having slightly more protein-coding genes (46,394) than PwS1 (44,702). Repeat sequences accounted for 6.03% and 4.18% of the Pw26 and PwS1 genomes, respectively. These high-quality genome assemblies provide a valuable resource for comparative genomics and functional exploration of P. wickerhamii. The detailed genomic characterization supports further studies on pathogenic mechanisms.
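For readers less familiar with the summary statistics quoted above, the following is a minimal sketch of how contig N50 and GC content are conventionally computed from a set of contigs. It is an illustration only, not the authors' pipeline: the function names and the toy sequences are hypothetical and are not drawn from the Pw26 or PwS1 assemblies.

```python
from typing import Iterable


def contig_n50(lengths: Iterable[int]) -> int:
    """Return the contig N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # no contigs supplied


def gc_content(sequences: Iterable[str]) -> float:
    """Return the GC percentage over A/C/G/T bases, ignoring N and other symbols."""
    gc = at = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        at += upper.count("A") + upper.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


# Toy contigs (hypothetical; lengths 8, 6, 4 and 2 bases).
toy_contigs = ["GGCCGGCC", "GCGCAT", "GCAT", "AT"]
toy_n50 = contig_n50(len(c) for c in toy_contigs)  # 8 + 6 = 14 >= 10, so N50 is 6
toy_gc = gc_content(toy_contigs)                   # 14 GC bases out of 20
```

In practice the contig sequences would be read from the assembly FASTA files; the toy list only keeps the arithmetic easy to follow by hand.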

Data availability

All data related to the genomes of P. wickerhamii strains Pw26 and PwS1 are available through the following databases and links. The sequencing data were deposited in the NCBI Sequence Read Archive (SRA) under BioProject ID PRJNA1314384. The HiFi long-read sequencing data for P. wickerhamii Pw26 and PwS1 have been deposited in the SRA under accession numbers SRR35231276 and SRR35231275, respectively. The whole-genome shotgun projects for strains Pw26 and PwS1 have been deposited in GenBank under accessions JBTKVZ000000000 (ref. 32) and JBTKVY000000000 (ref. 33), respectively. The SRA study is also available at https://identifiers.org/ncbi/insdc.sra:SRP616729. The Figshare record is available at https://doi.org/10.6084/m9.figshare.30030796.v1.
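As a convenience, the accessions listed above can be collected into a small lookup that builds the corresponding record URLs. This is a minimal sketch: the URL prefixes are the ones that appear in the reference list of this article, while the dictionary layout, the function name and the two regular expressions are assumptions made for illustration (the patterns are fitted only to the accessions quoted here, not to the full INSDC accession rules).

```python
import re

# URL prefixes as they appear in this article's reference list.
SRA_RESOLVER = "https://identifiers.org/ncbi/insdc.sra:"
NUCCORE_BASE = "https://www.ncbi.nlm.nih.gov/nuccore/"

# Accessions quoted in the Data availability statement.
ACCESSIONS = {
    "Pw26": {"sra_run": "SRR35231276", "wgs": "JBTKVZ000000000"},
    "PwS1": {"sra_run": "SRR35231275", "wgs": "JBTKVY000000000"},
}

# Assumed sanity-check patterns, fitted only to the accessions above.
SRA_RUN_PATTERN = re.compile(r"^SRR\d+$")
WGS_PATTERN = re.compile(r"^[A-Z]{6}\d{9}$")


def record_urls(strain: str) -> dict:
    """Return the SRA run and GenBank WGS URLs for one strain (hypothetical helper)."""
    entry = ACCESSIONS[strain]
    if not SRA_RUN_PATTERN.match(entry["sra_run"]):
        raise ValueError(f"unexpected SRA run accession: {entry['sra_run']}")
    if not WGS_PATTERN.match(entry["wgs"]):
        raise ValueError(f"unexpected WGS accession: {entry['wgs']}")
    return {
        "sra_run": SRA_RESOLVER + entry["sra_run"],
        "wgs": NUCCORE_BASE + entry["wgs"],
    }


if __name__ == "__main__":
    for strain in ACCESSIONS:
        print(strain, record_urls(strain))
```

The pattern checks are there mainly to catch copy-paste damage, such as a citation number fused onto the end of an accession.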

Code availability

This study did not involve the development of any specific code. The data analyses were conducted in accordance with the protocols outlined in the Methods section.

References

  1. Masuda, M. et al. Protothecosis in Dogs and Cats-New Research Directions. Mycopathologia. 186(1), 143–152 (2021).

  2. Kano, R. Emergence of Fungal-Like Organisms: Prototheca. Mycopathologia. 185(5), 747–754 (2020).

  3. Guo, J. et al. Integration of transcriptomics, proteomics, and metabolomics data for the detection of the human pathogenic Prototheca wickerhamii from a One Health perspective. Frontiers in cellular and infection microbiology. 13, 1152198 (2023).

  4. Bakuła, Z. et al. A first insight into the genome of Prototheca wickerhamii, a major causative agent of human protothecosis. BMC genomics. 22(1), 168 (2021).

  5. Guo, J. et al. Genome Sequences of Two Strains of Prototheca wickerhamii Provide Insight Into the Protothecosis Evolution. Frontiers in cellular and infection microbiology. 12, 797017 (2022).

  6. Lass-Flörl, C. & Mayr, A. Human protothecosis. Clinical microbiology reviews. 20(2), 230–242 (2007).

  7. Urban, M. et al. PHI-base: the pathogen-host interactions database. Nucleic acids research. 48(D1), D613–D620 (2020).

  8. Wolff, G., Plante, I., Lang, B. F., Kück, U. & Burger, G. Complete sequence of the mitochondrial DNA of the chlorophyte alga Prototheca wickerhamii. Gene content and genome organization. Journal of molecular biology. 237(1), 75–86 (1994).

  9. Bakuła, Z. et al. Sequencing and Analysis of the Complete Organellar Genomes of Prototheca wickerhamii. Frontiers in plant science. 11, 1296 (2020).

  10. Zhang, Q. Q., Zhu, L. P., Weng, X. H., Li, L. & Wang, J. J. Meningitis due to Prototheca wickerhamii: rare case in China. Medical mycology. 45(1), 85–88 (2007).

  11. Li, J., Huang, Z. & Zhang, R. Unmasking Prototheca wickerhamii: A rare case of cutaneous infection and its implications for clinical practice. The Brazilian journal of infectious diseases: an official publication of the Brazilian Society of Infectious Diseases. 29(3), 104525 (2025).

  12. Etchecopaz, A. N., Del Vecchio, L., Álvarez, C., Mesplet, M. & Cuestas, M. L. Cytological and microbiological analysis of a Prototheca wickerhamii infection in a cat with cutaneous lesions successfully treated with intralesional amphotericin B. The Journal of small animal practice (2025).

  13. Guo, J. et al. Two high-quality genomes of Prototheca bovis strain SH08 and Prototheca ciferrii strain SH13. Scientific data. (2025).

  14. Jagielski, T. et al. Occurrence of Prototheca Microalgae in Aquatic Ecosystems with a Description of Three New Species, Prototheca fontanea, Prototheca lentecrescens, and Prototheca vistulensis. Applied and environmental microbiology. 88(22), e0109222 (2022).

  15. Mareso, C. et al. Optimization of long-range PCR protocol to prepare filaggrin exon 3 libraries for PacBio long-read sequencing. Molecular biology reports. 50(4), 3119–3127 (2023).

  16. Ortigas-Vasquez, A. et al. High-fidelity long-read sequencing of an avian herpesvirus reveals extensive intrapopulation diversity in tandem repeat regions. PLoS pathogens. 21(8), e1013435 (2025).

  17. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27(2), 573–580 (1999).

  18. Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA. 6, 11 (2015).

  19. Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics. 21(Suppl 1), i351–i358 (2005).

  20. Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic and genome research. 110(1-4), 462–467 (2005).

  21. Stanke, M., Schöffmann, O., Morgenstern, B. & Waack, S. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics. 7, 62 (2006).

  22. Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 12(1), 491 (2011).

  23. Yang, Z. et al. Convergent horizontal gene transfer and cross-talk of mobile nucleic acids in parasitic plants. Nature Plants. 5(9), 991–1001 (2019).

  24. Kovaka, S. et al. Transcriptome assembly from long-read RNA-seq alignments with StringTie2. Genome Biol. 20(1), 278 (2019).

  25. Zhu, M., Wang, X. & Li, X. Genome-wide identification and expression analysis of glutamate receptor-like genes in three Dendrobium species. Biochimica et biophysica acta General subjects. 1869(6), 130789 (2025).

  26. Zuo, W. & Wang, Z. Identification of ulcerative colitis diagnostic markers from differentially expressed genes shared with Hirschsprung disease. Scientific reports. 15(1), 11274 (2025).

  27. Borza, T., Popescu, C. E. & Lee, R. W. Multiple metabolic roles for the nonphotosynthetic plastid of the green alga Prototheca wickerhamii. Eukaryotic cell. 4(2), 253–261 (2005).

  28. Zou, C. et al. Genome and transcriptome wide association study identify candidate genes regulating folate levels in maize. Frontiers in plant science. 16, 1606220 (2025).

  29. Zhu, J. et al. Genome-wide association study and transcriptomic analysis reveal the crucial role of sting1 in resistance to visceral white-nodules disease in Larimichthys polyactis. Frontiers in immunology. 16, 1562307 (2025).

  30. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR35231276 (2026).

  31. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR35231275 (2026).

  32. Fang, L. et al. High-Quality Genome Assemblies of Two Prototheca wickerhamii Strains, https://www.ncbi.nlm.nih.gov/nuccore/JBTKVZ000000000 (2026).

  33. Fang, L. et al. High-Quality Genome Assemblies of Two Prototheca wickerhamii Strains, https://www.ncbi.nlm.nih.gov/nuccore/JBTKVY000000000 (2026).

  34. Fang, L. et al. High-Quality Genome Assemblies of Two Prototheca wickerhamii Strains. figshare https://doi.org/10.6084/m9.figshare.30030796.v1 (2026).

  35. Zhou, Q. et al. Telomere-to-telomere gapless genome assembly of the giant grouper (Epinephelus lanceolatus). Scientific data. 11(1), 1342 (2024).

  36. Zuo, W. et al. Whole genome sequencing of a multidrug-resistant Bacillus thuringiensis HM-311 obtained from the Radiation and Heavy metal-polluted soil. Journal of global antimicrobial resistance. 21, 275–277 (2020).

  37. Qian, W. et al. Identification of novel single nucleotide variants in the drug resistance mechanism of Mycobacterium tuberculosis isolates by whole-genome analysis. BMC genomics. 25(1), 478 (2024).

  38. Zhou, X., Wang, E., Xu, X. & Zhang, B. Chromosome-level genome assembly of Phytoseiulus persimilis Athias-Henriot. Scientific data. 12(1), 293 (2025).

  39. Sodmann, A. et al. Human dorsal root ganglia are either preserved or completely lost after deafferentation by brachial plexus injury. British journal of anaesthesia. 133(6), 1250–1262 (2024).

Acknowledgements

This work was financially supported by the STU Scientific Research Initiation Grant (NTF25030T), the Sanming Project of Shenzhen Longhua District Central Hospital and the Municipal Financial Subsidy of Shenzhen Longhua District Key Medical Discipline Construction.

Author information

Author notes
  1. These authors contributed equally: Lihua Fang, Jian Guo.

Authors and Affiliations

  1. Department of Endocrinology, Shenzhen Longhua District Central Hospital, Shenzhen, 518110, China

    Lihua Fang & Jie Ning

  2. Department of Laboratory Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, 200120, China

    Jian Guo

  3. The Second School of Clinical Medicine, Southern Medical University, Guangzhou, Guangdong Province, 510515, China

    Qing Ning

  4. Guangdong Provincial Key Laboratory of Marine Biotechnology, Shantou University, Shantou, 515063, China

    Yuhang Luo & Jianbo Jian

Contributions

J. Jian conceived the study. J. Guo collected the samples and conducted the experiments. L. Fang, J. Jian and Y. Luo performed the bioinformatics analysis. L. Fang, Q. Ning and J. Jian wrote the manuscript. J. Guo and J. Ning provided suggestions and revised the manuscript. All authors have read and approved the final manuscript.

Corresponding authors

Correspondence to Jianbo Jian or Jie Ning.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Fang, L., Guo, J., Ning, Q. et al. High-Quality Genome Assemblies of Two Prototheca wickerhamii Strains. Sci Data (2026). https://doi.org/10.1038/s41597-026-06916-x

  • Received: 08 September 2025

  • Accepted: 12 February 2026

  • Published: 09 March 2026

  • DOI: https://doi.org/10.1038/s41597-026-06916-x

Associated content

Collection

Genetic markers, variants and recombination data

Scientific Data (Sci Data)

ISSN 2052-4463 (online)


© 2026 Springer Nature Limited
