Abstract
Lobsters, aquatic organisms of significant economic value, hold an important position in the global aquaculture and fisheries industries. However, due to overfishing and ecological change, the populations of certain lobster species have declined dramatically, prompting conservation efforts in various countries. However, limited genomics research has restricted our capacity to conserve and exploit lobster germplasm resources. Here, we present a chromosome-level reference genome for Panulirus homarus homarus constructed using PacBio long-read sequencing and Hi-C data. The genome assembly size was 2.61 Gb, with a contig N50 of 5.43 Mb, and a scaffold N50 of 36.69 Mb. The assembled sequences were anchored to 73 chromosomes, covering 96.05% of the total genome. A total of 25,580 protein-coding genes were predicted, and 99.98% of the genes were functionally annotated using various protein databases. The high-quality genome assembly provides a valuable resource for studying the biology and evolutionary history of P. h. homarus, and could facilitate sustainable resource management, aquaculture, and conservation of the species.
Similar content being viewed by others
Background & Summary
The scalloped spiny lobster, Panulirus homarus, belongs to Crustacea, Decapoda, Palinuridae, and Panulirus, and it consists of three economically valuable subspecies, including Panulirus h. homarus (Fig. 1a), Panulirus h. megasculptus, and Panulirus h. rubellus1,2. P. h. rubellus has also been reported to be a distinct species3,4. The species is an “engineer species” in coral reefs and rocky ecosystems, controlling population size by preying on benthic invertebrates (e.g., sea urchins, shellfish), maintaining ecological balance, and preventing overpopulation of certain species, which could destroy habitat structure5. In addition, P. homarus is distributed widely in the Indo-West Pacific (Fig. 1b), providing a good model for genetic comparisons of different geographic populations and for resolving gene flow, species differentiation, and ecological adaptation mechanisms4,6.
With increasing pressure on marine ecosystems globally, aquaculture is increasingly recognized as an important strategy for mitigating the depletion of wild fisheries resources7. P. homarus exhibits more rapid growth than other lobsters and it has been successfully farmed on a large scale in Vietnam and Indonesia, making it the optimal candidate for intensive large-scale lobster aquaculture8,9. Currently, the artificial nursery technology for scalloped spiny lobster remains valuable, with larvae mainly captured from the wild and then farmed in cage or industrial recirculating aquaculture systems9,10. Aquaculture research on P. homarus covers a range of topics, including resource assessment9 and the effects of nutrition11,12, salinity13, light14, and temperature15 on growth and reproduction. In addition, high-throughput sequencing technologies have advanced the study of lobster species genomes.
Currently, only the genomes of Homarus americanus16 and P. ornatus17 have been reported successfully. Genome assembly data for 13 Panulirus species are available in the NCBI database (Table S1). However, the average genome size of the species is only 1.5 Gb, which is significantly smaller than the anticipated genome size for lobsters. For instance, the genome size of P. h. homarus is merely 1.3 Gb, with a mere scaffold N50 of 2.9 kb, rendering the assembly quality inadequate to meet the demands of further research (Table S1). A comprehensive understanding of P. homarus is essential for effective management of its resources and the development of sustainable aquaculture practices.
In the present study, the authors adopted a comprehensive multi-platform sequencing approach, combining Illumina short-read sequencing, PacBio long-read sequencing, and Hi-C chromosome conformation capture technologies to generate a chromosome-level genome assembly for P. h. homarus (Fig. 2). The project generated 140.56 Gb of Illumina short-read data, 341.51 Gb of PacBio long-read data, and 364.32 Gb of Hi-C data, culminating in a final assembled genome with a size of 2.61 Gb, a contig N50 of 5.43 Mb, and a scaffold N50 of 36.69 Mb (Tables 1, 2). The chromosome-level assembly enhances the genomic resources available for lobsters substantially and provides a crucial reference genome.
Genomic landscape of P. h. homarus. Circos plot illustrating the genomic features of P. h. homarus. From the outermost to innermost rings: (a) gene density, (b) GC content, (c) densities of DNA transposons, (d) density of Long Terminal Repeats (LTRs), (e) density of Long Interspersed Nuclear Elements (LINEs), and (f) density of Short Interspersed Nuclear Elements (SINEs), all represented in 200-kb genomic windows.
Methods
Sample collection
An adult male P. h. homarus specimen was sourced from Hainan Yonghe Biotechnology Co., Ltd. (Qionghai, Hainan, China). Muscle tissue was collected after the lobster was anesthetized using cryogenic methods. The surface of the tissue was washed thoroughly several times with sterile phosphate buffered saline to effectively remove bacteria and impurities. The extraction of genomic DNA (gDNA) from muscle tissue for genome survey and library construction was carried out using the AMPure bead cleanup kit (Beckman Coulter, High Wycombe, UK) in strict accordance with the manufacturer’s instructions.
Total RNA was isolated from eye stalk, hemocyte, liver, muscle, intestine, and gills of the same specimen using TRIzol reagent, according to the manufacturer’s protocol. The integrity and quality of the RNA were evaluated by 1.5% agarose gel electrophoresis, whereas the concentrations were quantified precisely using a Qubit fluorometer (Thermo Fisher Scientific, Waltham, MA, USA).
Genome sequencing
A short-read library with an insert size of 350 bp was constructed and sequenced on the Illumina Novaseq-6000 (Illumina Inc., San Diego, CA, USA) platform, generating 2 × 150 bp paired-end reads. In total, 0.08 μg gDNA per sample was used as input material for the DNA library preparations. Library preparation was performed using the NEBNext® Ultra™ DNA Library Prep Kit (New England Biolabs, Ipswich, MA, USA), in strict accordance with the Illumina second-generation sequencing protocol, resulting in 140.56 Gb of raw data (Table 1).
For PacBio sequencing, gDNA was employed to construct SMRTbell libraries and sequenced on the PacBio Sequel (PacBio, Menlo Park, CA, USA) platform, leveraging single molecule real-time (SMRT) technology. In brief, the genomic DNA was first sheared into 6–20-kb fragments using g-TUBE. Subsequently, ExoVII (New England Biolabs, Beverly, MA, USA) was used to remove single-strand overhangs, followed by DNA damage repair with the SMRTbell Express Template Preparation Kit 2.0 (PacBio). T4 DNA polymerase and T4 PNK (New England Biolabs) were used to repair the ends, making them suitable for ligating SMRTbell hairpin adapters. After ligation, EXOIII (New England Biolabs) and ExoVII (New England Biolabs) enzymes were used to remove imperfect templates, and AMPure PB beads were used for purification. Subsequently, sequencing primers were annealed to the SMRTbell templates, and polymerase was bound to the template ends using the Binding Kit (PacBio). Finally, the library was loaded onto SMRT Cells for sequencing. A total of 341.51 Gb of continuous long reads was obtained, resulting in an extensive 131-fold coverage of the P. h. homarus genome (Table 1).
For Hi-C sequencing, high molecular weight gDNA was first cross-linked and then digested using the MboI restriction enzyme (New England Biolabs). The DNA was mechanically sheared into 300–500-bp fragments following 5′ overhang biotinylation and blunt-end ligation. Finally, the Hi-C library was sequenced on the Illumina NovaSeq 6000 platform (lllumina Inc., San Diego, CA, USA) in a 2 × 150-bp paired-end strategy, yielding 364.32 Gb raw reads, with a sequencing depth of 140× (Table 1).
For RNA sequencing, libraries of six tissues were prepared using the NEBNext® UltraTM RNA Library Prep Kit (New England Biolabs) for Illumina®, with all procedures following the manufacturer’s protocol rigorously. The RNA-seq libraries were subsequently sequenced on the Illumina NovaSeq 6000 platform in a 2 × 150-bp paired-end strategy and producing 55.7 Gb of clean reads.
Genome survey and assembly strategy
Prior to genome assembly, adapter sequences and low-quality reads from short-read sequencing data were filtered using Fastp software (v0.23.1)18 with default parameters, ensuring that only high-quality clean reads were retained for downstream processes. A comprehensive genome survey was performed to ascertain essential genomic characteristics, including overall size, heterozygosity, and repeat content. This was achieved by analyzing 17 distinct K-mer frequencies using SOAPec (v2.01)19 and GenomeScope (v2.0)20. Based on the analyses, the genome size of P. h. homarus was estimated to be 3,127.74 Mb, with 1.04% heterozygosity and 66.75% repetitive sequences at the dominant peak depth of 26 (Table S2, Fig. S1).
A dual-strategy utilizing two independent assembly software—Wtdbg2 (v2.5)21 and Flye (v2.9)22 with default parameters was employed for P. h. homarus genome assembly. The assembled genome drafts were subsequently improved using the Arrow (v8.0) polishing process23. The assembly results generated by Wtdbg2 and Flye after initial polish were merged using Quickmerge (v0.3)24. The merged assembly underwent further refinement through a comprehensive polishing process, involving two rounds of Arrow polishing followed by two rounds of Pilon polishing (v1.22)25, both utilizing default parameters. PacBio subreads were employed for Arrow polishing, whereas Illumina short reads were used for Pilon polishing to ensure high sequence accuracy. The rigorous and iterative assembly process culminated in the generation of 7,135 contigs, and a total assembly length of 2,720,451,987 bp (Table 2), representing a high-quality and robust genome assembly for P. h. homarus.
Chromosome-level assembly refinement
During the Hi-C scaffolding phase, the Juicer pipeline26 was used to align Fastp (v0.23.1)18 processed high-quality reads with draft genome assembly. The alignment was followed by the assembly of contigs into chromosomes, as well as to orient and sort contigs within each chromosome using the 3D-DNA pipeline27. Further refinement of the assembly was achieved through manual error correction using Juicebox Assembly Tools (v2.13.06)26. The rigorous scaffolding process successfully anchored 2,613.14 Mb of the genome to 73 chromosomes (Fig. 3), representing 96.05% of the total genome assembly (Table S3). The final assembly achieved a scaffold N50 of 36.69 Mb, reflecting a high level of continuity (Table 2). Remarkably, this assembly demonstrates exceptional contiguity, with 37 chromosomes containing no more than 30 gaps each (Table 3).
Comprehensive annotation of repetitive elements and noncoding RNAs
Repetitive sequences in the P. h. homarus genome were predicted using two strategies, including de novo assembly and homology matching28. Initially, the de novo-predicted repetitive sequence database was integrated with the Repbase homologous repetitive sequence database29. A suite of tools—including RepeatScout (v1.0.5)28, RepeatModeler (v2.0.1)30, Piler (v1.0)31, and LTR-FINDER (v1.0.6)32—was employed to identify transposable element (TE) families. Subsequently, classification of distinct repetitive elements was performed using RepeatMasker (v4.1.0)30, RepeatProteinMask (v4.1.0), and TRF (v4.0.9)33, aligning the P. h. homarus genome sequences against the integrated repetitive sequence database. After removing redundant entries from the various methods, repetitive sequences were observed to constitute 69.67% of the P. h. homarus genome (Table S4). Additionally, the Kimura divergence values of TEs were calculated using the ‘calcDivergenceFromalign.pl’ script34, and TE landscapes were visualized with ‘createRepeatLandscape.pl’35. The repeat elements identified included DNA transposons, which comprised 9.82% of the genome, long interspersed nuclear elements accounting for 37.44%, short interspersed nuclear elements representing 0.02%, and long terminal repeats making up 30.06% (Table 4 and Fig. 4).
Tools specialized for non-coding RNAs (ncRNAs) were used to annotate ncRNA in P. h. homarus genome. tRNA were identified using tRNAScan (v1.4)36, whereas rRNA were predicated via BLAST (v2.2.26)37. Additional ncRNAs, including miRNAs and snRNAs, were annotated by aligning sequences with the Rfam database38 using the INFERNAL tool (v1.0)39. Ultimately, four distinct classes of ncRNAs were annotated successfully, comprising 20,765 miRNAs, 3,608 tRNAs, 1,421 rRNAs, and 3,066 snRNAs (Table 5), providing valuable insight into the non-coding RNA landscape of P. h. homarus.
Integrated gene structure prediction and functional annotation
A comprehensive, integrated approach combining ab initio prediction, homology prediction, and transcriptome sequencing-based prediction was used to predict the gene structure of the P. h. homarus genome. For de novo gene predication, a robust suite of tools—AUGUSTUS (v3.2.3)40, GlimmerHMM (v3.02)41, SNAP (v2013.11.29)42, Geneid (v1.4)43, and Genscan (v1.0)44—was employed to predict gene structures directly from the genome sequence. For homology-based annotation, protein sequences from Drosophila melanogaster (fruit fly), Penaeus chinensis (Chinese shrimp), Eriocheir sinensis (Chinese mitten crab), Litopenaeus vannamei (Pacific white shrimp), Marsupenaeus japonicus (Kuruma shrimp), P. ornatus (ornate spiny lobster), Portunus trituberculatus (swimming crab), and H. americanus (American lobster) were retrieved from NCBI’s GenBank database and aligned to the P. h. homarus genome using BLAST (v2.2.26)37 and Genewise (v2.4.1)45. The integrated multifaceted strategy enabled comprehensive and accurate prediction of protein-coding genes, significantly advancing our understanding of the genetic architecture of P. h. homarus. A total of 8,545–178,660 homologous genes were identified for D. melanogaster, P. chinensis, E. sinensis, L. vannamei, M. japonicus, P. ornatus, P. trituberculatus, and H. americanus (Table 6). Gene length, along with CDS, exon, and intron lengths, was analyzed and compared to those of other species (Fig. 5). The mean transcript length in P. h. homarus was 31,472.77 bp, with CDS, exon, and intron lengths averaging 1,613.73 bp, 279.37 bp, and 6,251.44 bp, respectively (Table S5).
To further refine transcriptome assembly, two distinct methods were employed: genome-guided transcript assembly and de novo assembly using Trinity software (v2.11.0)46. Gene structures were identified using PASA (v2.1.0)47, and gene sets predicted through various methods were integrated into a non-redundant comprehensive gene set of 25,580 protein-coding genes using Evidence Modeler (v1.1.1)48 (Table 7 and Fig. 6a).
Functional annotation of these protein-coding genes was performed using BLASTp (v2.2.26)37 and Diamond (v0.8.22)49 to align sequences against several key protein databases, including SwissProt50, NCBI Nonredundant protein (NR), KEGG51, InterPro52, Gene Ontology (GO)53, and Pfam54, with an E-value cutoff of 1E-5. Protein domains and motifs were annotated using InterProScan (v5.52–86.0)55. Among the 25,580 predicted genes, 25,575 (99.98%) were annotated to at least one database (Table 7) and 16,526 proteins (64.61%) received annotation support from across all four databases (Fig. 6b).
Data Records
We have deposited the Hi-C sequencing data (SRR30872734), Illumina sequencing data (SRR30872735), PacBio sequencing data (SRR30872736), and transcriptomic sequencing data (SRR3105790260 - SRR3105790765) in the SRA at NCBI56.
The genome-wide shotgun project has been deposited in DDBJ/ENA/GenBank under accession number GCA_043589495.157, and the genome assembly along with its annotation information has been made available on Figshare58.
Technical Validation
The quality of the P. h. homarus genome assembly was technically verified rigorously through a multifaceted evaluation. First, the genomic quality was analyzed using Benchmarking Universal Single-Copy Orthologs (BUSCO) (v5.8.0)59, with the arthropoda_odb12 BUSCO database, to assess the presence of single-copy orthologous genes. Using tools such as tBLASTn (v2.2.26)37, AUGUSTUS (v3.2.3)40, and HMMER60, 98.2% of gene orthologs were detected, of which 97.2% were complete and 1.0% fragmented, indicating a highly comprehensive assembly (Table S6). Second, using the Core Eukaryotic Genes Mapping Approach (CEGMA) (v2.5)61, we identified homologs for 226 highly conserved core genes in P. h. homarus, representing 92.34% (229) of the total, further supporting the completeness of the assembly (Table S7). Third, the consensus quality value and k-mer (k = 21) completeness of the assembly evaluated using Merqury software62 were 31.78 and 87.59%, respectively (Table S8). In addition, alignment of Illumina sequencing reads to the nuclear genome using BWA (v0.7.8)63 yielded a high read mapping rate of 98.60% and a coverage rate of 94.85%, underscoring the robust integrity of the assembled genome and the consistency of the sequencing data (Table S9). Finally, to conduct genome-wide homology analysis, we used MCScanX within the JCVI toolkit (v1.1.12) (https://github.com/tanghaibao/jcvi) to perform a synteny comparison between the genomes of the P. h. homarus and P. ornatus, and visualized the macro-syntenic relationships using Circos (v0.69)64. The results showed that 73 chromosome-level scaffolds of P. h. homarus exhibited significant synteny with the corresponding chromosomes of P. ornatus (Fig. S2). These combined results affirm the exceptional quality and completeness of the P. h. homarus genome assembly.
Code availability
In the present study, no custom code was developed. All commands and pipelines used for data processing are detailed comprehensively in the methods section. For software where specific parameters are not explicitly mentioned, we adhered to the default settings as recommended by the software developers. The core code is available at https://github.com/sundongfang/Genome-Assembly-of-Panulirus-homarus.
References
George, R. W. Tethys sea fragmentation and speciation of Panulirus spiny lobsters. Crustaceana 78, 1281–1309 (2005).
Pancoro, A., Indriatmoko, Wijaya, D. & Karima, E. Sea surface temperature and current-related parameters affecting local adaptation of scalloped spiny lobster population in Indonesia’s archipelagic system. Sci. Rep. 14, 18403 (2024).
Sohana, P., Groeneveld, J. C., Al-Marzouqi, A. & Willows-Munro, S. A molecular phylogeny of the spiny lobster Panulirus homarus highlights a separately evolving lineage from the Southwest Indian Ocean. PeerJ 5, e3356 (2017).
Lavery, S. D. et al. Evolutionary divergence of geographic subspecies within the scalloped spiny lobster Panulirus homarus (Linnaeus 1758). PloS ONE 9, e97247 (2014).
Phillips, B. F., Wahle, R. A., & Ward, T. J. Lobsters: Biology, Management, Aquaculture and Fisheries. pp. 1–35 (2013).
Farhadi, A., Jeffs, A. G. & Lavery, S. D. Genome-wide SNPs in the spiny lobster Panulirus homarus reveal a hybrid origin for its subspecies. BMC Genomics 23, 750 (2022).
Jones, C. M., Le Anh, T. & Priyambodo, B. Lobsters: Biology, Fisheries and Aquaculture. pp. 541–570 (2019).
Hung, L. V., Tuan, L. A. & Williams, K. C. Lobster seacage culture in Vietnam. Australian Centre for International Agricultural Research (ACIAR), 10–17 (2009).
Priyambodo, B., Jones, C. M. & Sammut, J. Assessment of the lobster puerulus (Panulirus homarus and Panulirus ornatus, Decapoda: Palinuridae) resource of Indonesia and its potential for sustainable harvest for aquaculture. Aquaculture 528, 735563 (2020).
Huu, H. D. & Jones, C. M. Effects of dietary mannan oligosaccharide supplementation on juvenile spiny lobster Panulirus homarus (Palinuridae). Aquaculture 432, 258–264 (2014).
Kulmiye, A. J. Reproductive seasonality and fecundity of the spiny lobster Panulirus homarus homarus (Linnaeus, 1758) (Decapoda: Achelata: Palinuridae) in Kenya: implications for sustainable management. J. Crust. Biol. 42, ruac028 (2022).
Spencer, E. L., Fitzgibbon, Q. P., Day, R. D., Trotter, A. J. & Smith, G. G. Effects of acute salinity stress on the survival and haemolymph biochemistry of juvenile tropical rock lobster, Panulirus ornatus, at different moult stages. Aquaculture 573, 739597 (2023).
Chen, H. et al. Growth, health status and gut microbiota of the scalloped spiny lobster (Panulirus homarus) at different salinities. Aquaculture 562, 738779 (2023).
Ju, Z. M. et al. Cloning, expression analysis, and the photoperiodic response of CHH gene in Panulirus homarus. Aquacult. Rep. 40, 102552 (2025).
Kemp, J. O. G. & Britz, P. J. The effect of temperature on the growth, survival and food consumption of the east coast rock lobster Panulirus homarus rubellus. Aquaculture 280, 227–231 (2008).
Polinski, J. M. et al. The American lobster genome reveals insights on longevity, neural, and immune adaptations. Sci, Adv 7, eabe8290 (2021).
Ren, X. et al. Chromosome-level genome of the long-tailed marine-living ornate spiny lobster, Panulirus ornatus. Sci. Data 11, 662 (2024).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Li, R. et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20, 265–272 (2010).
Ranallo-Benavidez, T. R., Jaron, K. S. & Schatz, M. C. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat. Commun. 11, 1432 (2020).
Ruan, J. & Li, H. Fast and accurate long-read assembly with wtdbg2. Nat. Methods 17, 155–158 (2020).
Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. A. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 37, 540–546 (2019).
Wenger, A. et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 37, 1155–1162 (2019).
Chakraborty, M., Baldwin-Brown, J. G., Long, A. D. & Emerson, J. J. Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage. Nucleic Acids Res. 44, e147 (2016).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9, e112963 (2014).
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21(Suppl 1), i351–358 (2005).
Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet. Genome Res. 110, 462–467 (2005).
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics Chapter 4(1), 4.10.1–4.10.14 (2009).
Edgar, R. C. & Myers, E. W. PILER: identification and classification of genomic repeats. Bioinformatics 21(Suppl 1), i152–158 (2005).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).
Hubley, R. GitHub repository https://github.com/rmhubley/RepeatMasker/blob/master/util/createRepeatLandscape.pl (2023).
Rosen, J. GitHub repository https://github.com/rmhubley/RepeatMasker/blob/master/util/calcDivergenceFromAlign.pl (2020).
Chan, P. P., Lin, B. Y., Mak, A. J. & Lowe, T. M. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 49, 9077–9096 (2021).
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 33, D121–D124 (2005).
Nawrocki, E. P., Kolbe, D. L. & Eddy, S. R. Infernal 1.0: inference of RNA alignments. Bioinformatics 25, 1335–1337 (2009).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res 34, W435–W439 (2006).
Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).
Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).
Alioto, T., Blanco, E., Parra, G. & Guigo, R. Using geneid to identify genes. Curr. Protoc. Bioinformatics 64, e56 (2018).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).
Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Haas, B. J. et al. Improving the arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome. Biol. 9, R7 (2008).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 45–48 (2000).
Kanehisa, M., Sato, Y., Kawashima, M., Furumichi, M. & Tanabe, M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 44, D457–D462 (2016).
Finn, R. D. et al. InterPro in 2017-beyond protein family and domain annotations. Nucleic Acids Res. 45, D190–D199 (2017).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2014).
Mulder, N. & Apweiler, R. InterPro and InterProScan: tools for protein sequence classification and comparison. Methods Mol. Biol. 396, 59–70, https://doi.org/10.1007/978-1-59745-515-2_5 (2007).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP536394 (2024).
NCBI GenBank https://identifiers.org/ncbi/insdc.gca:GCA_043589495.1.
Sun, D. F. et al. Chromosome-level genome assembly of the scalloped spiny lobster Panulirus homarus. figshare. https://doi.org/10.6084/m9.figshare.28465208.v1 (2025).
Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Potter, S. et al. HMMER web server: 2018 update. Nucleic Acids Res. 46, W200–W204 (2018).
Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).
Rhie, A. et al. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Martin, K. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645 (2009).
Acknowledgements
This work was supported by grants from the Key R&D Program of Shandong Province, China [grant number 2024CXPT071-3]; the Project First National Survey of Aquaculture Germplasm Resources in the Yellow and Bohai Seas [grant number17210247]; the China Agriculture Research System of MOF and MARA [grant number CARS-48) and the Central Public-interest Scientific Institution Basal Research Fund, CAFS [grant number 2023TD50].
Statement of ethics
Animal experiments were conducted in accordance with relevant national and international guidelines and were approved by the Yellow Sea Fisheries Research Institute, Shandong, China. No specific permits were required for the collection of wild lobster from seawater in China. This study did not involve endangered or protected species.
Author information
Authors and Affiliations
Contributions
X.Y.R. and J.T.L. conceived the study and supervised the project. D.F.S. collected the sample and wrote the manuscript. J.J.L. performed the data analysis and data uploading. B.Q.G. and J.L. supervised this work and assisted in data analysis. S.T.J., P.L. collected the samples. All authors contributed to the final manuscript editing.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sun, D., Lv, J., Gao, B. et al. Chromosome-level genome assembly of scalloped spiny lobster Panulirus homarus homarus. Sci Data 12, 900 (2025). https://doi.org/10.1038/s41597-025-05253-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41597-025-05253-9