Chromatin landscape, transcriptomic and ChIP-seq profiling of Anopheles stephensi MSQ43 cell line

Lukyanchikova, Varvara; Nuriddinov, Miroslav; Khabarova, Anna; Gridina, Maria; Popov, Andrey; Belokopytova, Polina; Sharakhov, Igor V.; Fishman, Veniamin

doi:10.1038/s41597-025-05721-2

Data Descriptor
Open access
Published: 26 September 2025

Scientific Data volume 12, Article number: 1566 (2025)

Abstract

Anopheles mosquitoes are dominant vectors of malaria parasites and of viral and bacterial pathogens. Deciphering Anopheles genomes has opened a new era of research and allowed in-depth investigation of several molecular mechanisms involved in pathogen transmission. However, there is a lack of appropriate model systems to study specific vector-pathogen interactions at the molecular level. Even inbred mosquito colonies represent genetically heterogeneous populations and require special care facilities. Insect cell cultures could serve as a useful alternative due to the ease of handling and maintenance, especially for genomic studies requiring millions of cells per experiment. Here, we provide chromatin contact maps and a dataset of epigenetic characteristics, including histone mark profiles and RNA-seq data, as well as transfection conditions and promoter specificity for the mosquito cell line MSQ43.

Background & Summary

The rapid, uncontrolled spread of insects drives a parallel expansion of human and animal diseases all over the world. According to Bradshaw et al., invasive insects cost a minimum of US$70.0 billion per year globally, while associated health costs exceed US$6.9 billion per year¹. Mosquitoes, in particular, cause enormous damage to human health worldwide. The list of diseases transmitted by mosquitoes includes malaria, dengue, West Nile virus, chikungunya, yellow fever, filariasis, tularemia, dirofilariasis, Japanese encephalitis, Saint Louis encephalitis, Western and Eastern equine encephalitis, Venezuelan equine encephalitis, Ross River fever, Barmah Forest fever, La Crosse encephalitis, O’nyong-nyong virus (ONNV), and Zika virus, as well as newly detected Keystone virus and Rift Valley fever². In total, over one million people die from mosquito-borne diseases every year. Anopheles mosquitoes are responsible for the global spread of malaria in human populations, and An. stephensi is a primary vector of malaria in South Asia, capable of transmitting both P. falciparum and P. vivax parasites³ (Fig. 1a). Originating from Southeast Asia and the Arabian Peninsula, it is now a pervasive mosquito with an extensive geographical range, occupying territories from Northern Thailand through the Indian Subcontinent up to the Himalayan region and from the Persian Gulf area across the Middle East towards southern China⁴. More recently, An. stephensi was reported in the Lakshadweep Islands (2001)⁵, in countries in the Horn of Africa (2012)⁶, in Sri Lanka in 2016–2017⁷, in Ethiopia in 2016–2018⁸ and most recently in the Republic of Sudan (2019), which reflects its fast geographical expansion from Asian regions towards the African continent. The incursion of An. stephensi into North Africa, where over 40% of sub-Saharan Africans live in urban environments, is even more worrying given its ability to survive in both rural and urban environments and its resistance to several classes of insecticides⁹.

Transmission potential and vectorial capacity vary within the genus Anopheles¹⁰,¹¹. The biological reasons for this difference remain poorly understood and need further study. Mosquito immune responses to parasite invasion, as well as the molecular mechanisms activated during infection and involved in mosquito-Plasmodium interactions, remain challenging open questions. Recent studies have discovered several genomic loci that are probably involved in parasite transmission and the mosquito immune response¹²,¹³,¹⁴,¹⁵,¹⁶,¹⁷,¹⁸. Comparison of midgut chromatin between infected and non-infected mosquitoes detected large differences in transcriptional and epigenetic profiles¹⁷,¹⁹. As molecular technologies improve, they give rise to new opportunities for addressing key biological questions. Since functional genomic techniques often require tens of millions of cells per experiment, the use of cell cultures remains the most practical approach. Immortalized insect cell lines were introduced in the 1960s and have generally been used for recombinant protein expression; more recently they have emerged as a powerful tool for biomanufacturing in human medicine and veterinary science (reviewed in²⁰). Additionally, dipteran cell cultures are known as reliable tools for studying vector-parasite interactions²¹ as well as insecticide efficiency and toxicity²². To date, 24 Anopheles cell lines are listed in the ExPASy Cellosaurus database²³ (http://web.expasy.org/cellosaurus), including six cell lines established from An. stephensi. MSQ43 was originally established from a first-instar larva of An. stephensi as an adherent, monolayer, immortalized cell culture (Fig. 1b)²⁴. Laboratory strains of An. stephensi can be categorized into three ecological variants²⁵, and MSQ43 was obtained from the variety mysorensis²⁴. Since well-annotated biological models are in high demand, especially mosquito models given their close molecular connection with the transmission of human diseases, the MSQ43 cell line could serve as an appropriate model system for mosquito genomic and epigenetic studies.

Here, we comprehensively characterized the MSQ43 epigenetic and chromatin profiles and described transcriptional patterns based on RNA-seq data. Chromatin three-dimensional (3D) interactions were analysed using the Hi-C approach. We produced 78 million unique reads, containing 64 million Hi-C contacts. ChIP-seq experiments, performed in two replicates, used a set of five antibodies specific for active (H3K27Ac, H3K4me3) and repressive (H2AK119Ubiq, H3K27me3, H3K9me3) histone marks. On average, 7–15 million uniquely aligned reads were generated for each ChIP-seq library. To complement the epigenetic profiles, we generated an RNA-seq data set and compared it with embryonic data available from a recent study²⁶. Finally, we optimized transfection conditions and assessed the activity of several mammalian and Drosophila promoters frequently used in research laboratories, specifying conditions for delivery of exogenous genes into MSQ43 cells.

Methods

Cell culture

Adherent monolayer MSQ43 cells (Fig. 1b) were maintained at 28 °C in 5% CO2 at approximately 60–80% confluence in MEM (Minimum Essential Medium) with Earle's salts (Sigma Aldrich, #0275) containing 5% heat-inactivated fetal bovine serum (Sigma Aldrich, #F9665), 0.1% D-glucose (Sigma Aldrich, #G7021), 1 mM NEAA (Sigma Aldrich, #M7145), 2 mM L-glutamine (Thermo Fisher Scientific, #35050038), 1x vitamin solution (Thermo Fisher Scientific, #11120052), 2.2 g/L sodium bicarbonate, and 1% Penicillin/Streptomycin (Sigma Aldrich, #P4333). TrypLE Express (Thermo Fisher Scientific, #12604013) was used for harvesting and passaging. 10% DMSO in KSR (Thermo Fisher Scientific, #10828028) was used for freezing.

Transfection

Transfection was performed with Lipofectamine 3000 on the MSQ43 and HEK293T cell lines (the latter a human cell line used as a control) according to the manufacturer’s recommendations. To test promoter activity, we employed circular Addgene plasmids encoding red and green fluorescent proteins: 40259 and 25917 (to test pCMV), 80945 and 80946 (pCAG), 129717 and 129719 (pEF1a), 60511 (pRPBSA) and 112686 (pUbiquitin-63E), as well as a linear construct obtained from Addgene plasmid 25917 after digestion with the restriction enzymes NheI and NotI (pSFFV). Cells were collected 24 h after transfection for analysis by FACS (BD FACS Aria III) and fluorescence microscopy (ZOE Fluorescent Cell Imaging System).

Karyotyping and chromosome number analysis

MSQ43 was established in 1969 from first-instar larval tissues of An. stephensi, but its karyotype was not specified. We performed karyotype analysis according to a standard cytogenetic protocol²⁷. The metaphase spreads for karyotyping were prepared using an air-drying method. In brief, cells were treated with colcemid at a final concentration of 0.05 mg/ml for 1.5 h, then exposed to hypotonic solution, and fixed in methanol/glacial acetic acid solution (3:1 ratio). The number of chromosomes was analyzed using DAPI staining. A total of >400 metaphase spreads were examined to determine the modal number.
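Determining the modal number from scored spreads amounts to a frequency tally. The following is a minimal sketch, assuming each metaphase spread has been reduced to a single chromosome count; the function name `modal_number` and the example counts are hypothetical and are not the study's data.

```python
from collections import Counter

def modal_number(counts):
    """Return the most frequent chromosome count and its share of all spreads."""
    tally = Counter(counts)
    mode, n_spreads = tally.most_common(1)[0]
    return mode, n_spreads / len(counts)

# Hypothetical scores for illustration only; not taken from the study.
scored = [6] * 380 + [5] * 15 + [7] * 5 + [12] * 10
mode, share = modal_number(scored)
print(f"modal number: {mode} ({share:.1%} of {len(scored)} spreads)")
```

Reporting the share alongside the mode makes it visible how dominant the modal karyotype is relative to aneuploid or polyploid spreads.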

We determined the chromosome complement as 2n = 6, displaying characteristics of a male genotype (Fig. 1c–e). Regular PCR performed for the ASTEI00671, ASTEI00780, ASTEI01072, ASTEI07753, and Guy-1 genes from the 2R, X, and Y chromosomes, respectively, confirmed this (Fig. 1e; the set of primers is available in Table S1). Analysis of sequencing read coverage demonstrated that Chr X accumulated almost the same number of reads per 100 kb as the autosomal arms (Table S2). This finding suggests that Chr X was duplicated and, most likely, that two homologous X chromosomes merged to form a single X (shown in Fig. 1c). Interestingly, in MSQ43 cells DAPI-positive pericentric heterochromatin occupies a substantially smaller portion of chromosome X than of chromosome Y. This X-chromosome organization contrasts with the previously published karyotype of An. stephensi mosquitoes from the Indian strain²⁸, where both the X and Y chromosomes appear as acrocentric chromosomes carrying equally large blocks of DAPI-positive pericentric heterochromatin.
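The coverage argument can be illustrated with a minimal sketch. In a male cell carrying one X copy alongside diploid autosomes, X coverage per 100 kb would be expected at roughly half the autosomal level, whereas a ratio near 1 is consistent with two X copies. The read counts and arm lengths below are placeholders chosen for illustration, not values from Table S2, and the function name is hypothetical.

```python
def reads_per_100kb(read_count, length_bp):
    """Normalize a read count to reads per 100 kb of sequence."""
    return read_count / (length_bp / 100_000)

# Placeholder values for illustration only (not taken from Table S2).
arms = {
    "2R": {"reads": 9_000_000, "length": 90_000_000},
    "2L": {"reads": 6_000_000, "length": 60_000_000},
    "X": {"reads": 2_200_000, "length": 22_000_000},
}

density = {name: reads_per_100kb(a["reads"], a["length"]) for name, a in arms.items()}
autosomal = [d for name, d in density.items() if name != "X"]
autosomal_mean = sum(autosomal) / len(autosomal)

# Ratio near 0.5 suggests one X copy per diploid autosome set; near 1.0 suggests two.
x_ratio = density["X"] / autosomal_mean
print(f"X / autosome coverage ratio: {x_ratio:.2f}")
```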

In situ Hi-C for MSQ43

Hi-C was performed as technical duplicates from 5–10 million cells per experiment as described previously²⁹. In brief, cells were harvested with TrypLE Express (Thermo Fisher Scientific, #12604013), dissociated into a single-cell suspension, washed twice in PBS, and fixed in 2% formaldehyde solution (Sigma Aldrich, #8775) in PBS for 15 minutes. Then, cells were pelleted by centrifugation (500–900 g, 5 min), quenched with 2 M glycine solution, washed twice with PBS, flash-frozen, and stored at −80 °C until use. The lysis step was performed for 60 minutes in total, with samples incubated for 30 minutes on ice and 30 minutes at RT with overhead mixing. Cells were washed with lysis buffer and NEB 3.1 buffer and pelleted by centrifugation (900–1100 g, 7 min, 4 °C). A 0.3% SDS solution in NEB 3.1 buffer was used to open the chromatin. SDS was quenched by adding Triton X-100 to a final concentration of 1.8% and incubating at 37 °C for 30 minutes. The reaction buffer was replaced with fresh NEB 3.1 buffer containing 1.8% Triton X-100, and 400 U of DpnII (New England Biolabs, R0543M) was added for overnight incubation. The next day, to improve restriction efficiency, the reaction was supplemented with an additional 200 U of DpnII. The overhangs were labeled with Biotin-14-dATP (Thermo Fisher Scientific, #19524016), and DNA molecules located in close proximity in 3D nuclear space were ligated with 100 U of T4 ligase (local suppliers) in an overnight reaction. DNA was extracted by the phenol/chloroform method and sonicated with a Covaris M220 instrument, and Hi-C libraries were prepared for Illumina sequencing using the Kapa Hyper Prep Kit (Roche, #KK8504) and KAPA Single-Indexed Adapter Set A (Roche, #KK8701). Biotin pull-down was performed with Dynabeads (Thermo Fisher Scientific, #65601) after the adapter ligation step. Final PCR was performed with 6–8 cycles of amplification. Libraries were sequenced on DNBSEQ sequencing systems (BGI, China).

Processing of Hi-C experiments

Raw sequencing reads were mapped to the reference genome AsteI4, and then filtered and processed further using the Juicer pipeline, modified as described previously³⁰. Valid Hi-C read pairs were collected from two biological replicates, merged after confirming high concordance between them, and filtered for a mapping quality ≥ 30. Complete Hi-C statistics can be found in the Supplementary materials (library statistics in Table S4).
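The mapping-quality filter can be sketched as follows. This is an illustration only and not the Juicer implementation: it assumes that a pair is retained when both mates reach the threshold, and the `ReadPair` record and `filter_pairs` function are hypothetical names.

```python
from typing import NamedTuple

class ReadPair(NamedTuple):
    chrom1: str
    pos1: int
    mapq1: int
    chrom2: str
    pos2: int
    mapq2: int

def filter_pairs(pairs, min_mapq=30):
    """Keep pairs in which both mates reach the minimum mapping quality."""
    return [p for p in pairs if min(p.mapq1, p.mapq2) >= min_mapq]

# Toy read pairs for illustration only.
pairs = [
    ReadPair("2R", 1_000, 60, "2R", 50_000, 42),
    ReadPair("2R", 2_000, 12, "3L", 7_500, 60),  # first mate below threshold
    ReadPair("X", 300, 30, "X", 9_000, 30),      # exactly at threshold, kept
]
kept = filter_pairs(pairs)
print(f"{len(kept)} of {len(pairs)} pairs retained")
```

Taking the minimum of the two mate qualities means one ambiguously mapped end is enough to discard the contact, which is the conservative choice for contact maps.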

Immunocytochemistry

Cells were grown to approximately 60% confluence and fixed in 3% paraformaldehyde for 30 minutes at RT. Cell membranes were permeabilized with 0.1–0.2% Triton X-100 for 5 minutes. Nonspecific protein interactions were blocked with 5% BSA/PBS solution for 30 minutes at RT. Primary antibodies in dilution buffer (1% BSA, 0.1% Tween, PBS) were added and incubated overnight at 4 °C. The next day, after 3 washes in 0.2% Tween/PBS, secondary antibodies were added and incubated in dilution buffer for 2 h at RT. After 3 washes in 0.2% Tween/PBS, nuclei were stained with DAPI. Specific signal was visualized with an LSM 780 NLO confocal microscope (Zeiss).

Chromatin immunoprecipitation (ChIP-seq)

20–30 million MSQ43 cells were fixed in 1% paraformaldehyde for 15 minutes at RT with overhead mixing. After quenching with glycine, cells were lysed for 20 minutes in lysis buffer (10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, protease inhibitors) with 0.5% SDS. Chromatin was sonicated in the same SDS-containing buffer using a Bandelin Sonopulse instrument at 75% power for 8 cycles of 30/90 sec ON/OFF. The total chromatin concentration was determined by Qubit, and 20–40 μg was used per ChIP. Before incubation with specific antibodies, chromatin was diluted with lysis buffer to 0.1–0.2% SDS and pre-cleared by incubation with Protein A magnetic beads (New England Biolabs, S1425S) for 2 h at 4 °C with slow rotation. During this time, another aliquot of Protein A magnetic beads was washed in PBS, combined with 5 μg of target antibodies, and incubated for 2 h at 4 °C with slow rotation. The pre-clearing beads were removed, and the pre-cleared chromatin was immunoprecipitated with the antibody/magnetic bead complexes overnight at 4 °C with slow rotation. The next day, the beads were thoroughly washed in a series of buffers (Buffer 1: 10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, protease inhibitors; Buffer 2: 500 mM NaCl, 10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100, 0.1% SDS, 0.1% sodium deoxycholate, protease inhibitors; Buffer 3: 0.25 M LiCl, 10 mM Tris-HCl, 1 mM EDTA, 0.5% NP-40, 0.5% sodium deoxycholate; Buffer TE/Triton: 10 mM Tris-HCl, 1 mM EDTA, 1% Triton X-100; TE buffer: 10 mM Tris-HCl, 1 mM EDTA). Cross-links were reversed and DNA was eluted in 100 μL elution buffer (10 mM Tris-HCl, 1 mM EDTA, 1% SDS) by incubation at 65 °C for 14 hours. After treatment with RNAse A (New England Biolabs, T3018) and Proteinase K (New England Biolabs, P8107S), magnetic beads were removed. DNA was extracted using ChIP DNA Clean & Concentrator columns (Zymo Research, D5205). ChIP-seq libraries were prepared for sequencing using the Kapa Hyper Prep Kit (Roche, #KK8504) and KAPA Single-Indexed Adapter Set A (Roche, #KK8701). Libraries were sequenced on the DNBSEQ sequencing platform.

Processing of ChIP-seq data

The quality of raw reads was analyzed using FastQC³¹ and MultiQC³². Illumina adapters were removed using the Cutadapt³³ program. Then, raw reads were processed, including trimming, aligning, peak calling, filtering, and quality control, using the ENCODE ChIP-seq AQUAS pipeline³⁴. We used AsteI4 as a reference genome. Consistency between replicates was analyzed using makeTagDirectory and analyzeRepeats.pl from the HOMER software³⁵ to compute RPKM values. The obtained RPKM values were correlated using the Python library scipy.stats for Spearman’s correlation calculation and the matplotlib library for hexbin plotting.
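The replicate comparison can be sketched in a few lines. The example below is a dependency-free stand-in written for illustration: it is not the authors' code, the helper names (`rpkm`, `average_ranks`, `spearman`) and all numbers are hypothetical, and the total mapped read counts are assumed inputs rather than values HOMER would report. With SciPy installed, `scipy.stats.spearmanr` computes the same rank correlation.

```python
def rpkm(counts, lengths_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    per_million = total_mapped_reads / 1_000_000
    return [c / (length / 1_000) / per_million for c, length in zip(counts, lengths_bp)]

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy gene-body counts for two replicates (illustrative values only).
lengths = [1000, 2000, 1500, 800, 3000, 1200]
rep1_rpkm = rpkm([10, 200, 35, 0, 890, 54], lengths, total_mapped_reads=20_000_000)
rep2_rpkm = rpkm([12, 180, 40, 1, 950, 50], lengths, total_mapped_reads=25_000_000)
rho = spearman(rep1_rpkm, rep2_rpkm)
print(f"Spearman correlation between replicates: {rho:.3f}")
```

Because Spearman's coefficient depends only on ranks, it is unaffected by the per-library scaling in RPKM; the normalization matters mainly for the hexbin plot, where both axes should be on comparable scales.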

RNA-seq

Cells were grown to approximately 60% confluence, washed twice with 1x PBS, harvested with TrypLE Express solution (Thermo Fisher Scientific, #12604013), counted, and transferred to 1.5 ml tubes. Total RNA was extracted using the Aurum™ Total RNA Mini Kit (Bio-Rad, #7326820). 1–2 million cells were lysed, followed by purification on a silica membrane in a spin-column format, including DNA digestion for 25 minutes. Concentration was measured with a Qubit 4 fluorometer using the Qubit RNA HS Assay kit (Thermo Fisher Scientific, #Q32852), then samples were sent for RNA sequencing to the BGI facility (Hong Kong).

Processing of RNA-seq data

Raw reads were processed at usegalaxy.org³⁵ using the FastQC³¹, MultiQC³² and Trimmomatic³⁶ tools for quality control. Trimmed reads were aligned to the reference genome AsteI4 using Hisat2³⁷. Transcriptome assembly was done using the Stringtie³⁸ and Stringtie-merge tools. Normalized FPKM values were generated using Stringtie and the GTF file created by Stringtie-merge. Replicate consistency analysis and PCA were done in Python. Comparison of RNA-seq data between replicates and between tissues was performed using FPKM values.

Data Records

All raw reads are available in NCBI’s SRA database³⁹. Separate data accessions for ChIP-seq, RNA-seq, and Hi-C are provided in Table S6.

Technical Validation

ChIP-seq data validation and quality control

We described the epigenome of the MSQ43 cell line by producing chromatin immunoprecipitation–sequencing (ChIP-seq) maps for a comprehensive set of histone modifications: the active marks H3K27Ac and H3K4me3, the PRC1-mediated mark H2AK119Ubiq, the PRC2-mediated mark H3K27me3, and the heterochromatin mark H3K9me3 (Fig. 2).

First, we validated that the chosen antibodies are compatible with mosquito cells. The list of antibodies used in the study can be found in Table S3. We carried out immunostaining experiments using the protocol described in the Methods section, and all antibodies were confirmed to be suitable for chromatin immunoprecipitation experiments (Fig. 2a–e). Then, we performed immunoprecipitation experiments following the protocol described in Methods. Quality validation of the raw sequencing datasets was performed with the FastQC and MultiQC programs. The average base quality score for all ChIP-seq samples was in the range of 30–50 (green zone). On average, 40 million reads were generated for each ChIP-seq sample, with a read length of ~150 bp. Raw reads were aligned to AsteI4 by the Hisat2 tool, with mapping rates ranging from 41 to 81%.

The consistency between replicates was analyzed using RPKM-normalized read counts in gene bodies computed by HOMER. We calculated the Spearman correlation coefficient for every pair of replicates, obtaining R~0.9 for all samples except H3K27me3, where the correlation between replicates was slightly lower (R~0.8).

To study connections between the distribution of chromatin marks and gene expression, we built row-linked heatmaps showing the RPKM-normalized number of reads across a 10 kb genomic interval in 20 bp bins relative to the TSS, for A and B compartments separately (Fig. 2f–o). Genes in B-compartments clearly display the epigenetic characteristics of repressive histone marks: H3K9me3, H3K27me3, and H2AK119Ubiq signals were broadly enriched near the TSS in B-compartments, while H3K27Ac and H3K4me3 were abundant near TSSs located in A-compartments.
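The binning behind such heatmaps can be sketched as below. This is a simplified illustration, not the pipeline used for Fig. 2: it assumes the 10 kb interval is centered on the TSS, represents each read by a single position, flips minus-strand genes so that upstream is always on the left, and returns raw counts that would still need RPKM normalization. The function name and coordinates are hypothetical.

```python
def tss_profile(read_positions, tss, strand="+", window=10_000, bin_size=20):
    """Count reads in fixed-size bins across a window centered on a TSS."""
    n_bins = window // bin_size
    half = window // 2
    profile = [0] * n_bins
    for pos in read_positions:
        # Orient the offset so that negative values are upstream of the TSS.
        offset = pos - tss if strand == "+" else tss - pos
        if -half <= offset < half:
            profile[(offset + half) // bin_size] += 1
    return profile

# Toy read positions around a hypothetical TSS at coordinate 100,000.
reads = [100_000, 100_019, 100_020, 95_000, 105_000]
profile = tss_profile(reads, tss=100_000)
print(len(profile), "bins;", sum(profile), "reads inside the window")
```

Each gene contributes one such row of 500 bins; stacking the rows in the same gene order for every mark is what makes the heatmaps row-linked.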

RNA-seq profiling and quality control

We generated a gene expression profile for the MSQ43 cell line in three replicates (Fig. 3). The average base quality score for all RNA-seq samples was in the range of 30–50 (green zone). On average, 40 million reads were generated for an individual RNA-seq sample, with a read length of 100–150 bp. Raw RNA-seq reads were aligned to AsteI4 by the Hisat2 tool with a high mapping rate of about 80%.

The reproducibility of biological replicates was analysed in several ways. Spearman correlation analysis showed a correlation coefficient above 0.975 for each pair (Fig. 3a shows a representative example for one pair of replicates), indicating the high reliability of the RNA-seq data. To provide a baseline for the between-replicate correlation and assess the specificity of the RNA-seq results, we compared the MSQ43 RNA-seq data with a previously reported transcriptome of the An. stephensi embryo. In this comparison, the Spearman correlation coefficient was lower (0.593), confirming the difference between the embryo and MSQ43 transcriptomes (Fig. 3b). Concordance of the RNA-seq replicates was additionally tested by PCA: the three MSQ43 replicates clustered at one pole of the PCA plot, while the embryo replicates were located at the opposite one (Fig. 3c). The first principal component (PC1) explained 95.6% and the second (PC2) 2.4% of the variance between MSQ43 and embryonic data. In agreement with these results, cell-type-specific clustering was observed when RNA-seq data of MSQ43 and embryonic cells were subjected to hierarchical clustering analysis (Fig. 3d,e). Overall, these results suggest high quality of the produced RNA-seq data.
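For reference, the share of variance attributed to a principal component is conventionally computed from the eigenvalues of the sample covariance matrix. The notation below (λ_k for the k-th eigenvalue, p for the number of components, EVR for explained variance ratio) is introduced here for illustration and does not appear in the original analysis.

```latex
\mathrm{EVR}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j},
\qquad
\sum_{k=1}^{p} \mathrm{EVR}_k = 1
```

Under this definition, the value reported for PC1 corresponds to EVR_1 ≈ 0.956, meaning that nearly all of the variance in the combined data set lies along the axis separating MSQ43 from embryonic samples.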

Usage Notes

Cell lines are a convenient model because they allow simple and efficient perturbation through exogenous expression of genetic vectors. Such vectors can be used to overexpress specific coding or non-coding RNA, deliver components required for targeted genome editing, or create reporter systems and biosensors. One important characteristic of a cell line is therefore how readily it can be transfected with various genetic constructs. We first tested the activity of promoters in various genetic constructs available in our laboratory. The most widely used transfection methods rely on special chemicals (Lipofectamine) or electroporation. Comparing different methods makes it possible to choose the most effective one, since not all cell lines are transfected by Lipofectamine with high efficiency. In MSQ43 we observed activity of the CMV (Fig. S2a, Table S5) and Ubiquitin-63E (Fig. S2b, Table S5) promoters. When using the MSQ43 cell line for Anopheles genomics research, several limitations should be considered. First, as with all immortalized cell lines, MSQ43 may exhibit chromatin organization, transcriptomic profiles, and chromosomal alterations that differ from those of primary tissues. Second, evolutionary divergence in chromatin architecture may constrain the applicability of MSQ43 for studying Anopheles species distantly related to An. stephensi. Nevertheless, MSQ43 remains a suitable model for investigating all three ecological variants of An. stephensi, as the absence of post-copulatory reproductive barriers among them²⁵ suggests conserved epigenetic regulation.

Code availability

The code used for data analysis is available on GitHub: https://github.com/Asirix1/Anopheles_project; https://github.com/kundajelab/chipseq_pipeline; https://github.com/open2c/coolpuppy; https://meme-suite.org/meme/.

References

1. Bradshaw, C. J. A. et al. Massive yet grossly underestimated global costs of invasive insects. Nature Communications 7, 12986 (2016).
2. Vector-borne diseases. https://www.who.int/news-room/fact-sheets/detail/vector-borne-diseases.
3. Sinka, M. E. et al. The dominant Anopheles vectors of human malaria in the Asia-Pacific region: occurrence data, distribution maps and bionomic précis. Parasites Vectors 4, 89 (2011).
4. Sinka, M. E. et al. A new malaria vector in Africa: Predicting the expansion range of Anopheles stephensi and identifying the urban populations at risk. Proc. Natl. Acad. Sci. USA. 117, 24900–24908 (2020).
5. Sharma, S. K. & Hamzakoya, K. K. Geographical Spread of Anopheles stephensi Vector of Urban Malaria, and Aedes aegypti, Vector of Dengue/DHF, in the Arabian Sea Islands of Lakshadweep, India. WHO Regional Office for South-East Asia. (2001).
6. Faulde, M. K., Rueda, L. M. & Khaireh, B. A. First record of the Asian malaria vector Anopheles stephensi and its possible role in the resurgence of malaria in Djibouti, Horn of Africa. Acta Tropica 139, 39–43 (2014).
7. Gayan Dharmasiri, A. G. et al. First record of Anopheles stephensi in Sri Lanka: a potential challenge for prevention of malaria reintroduction. Malar J 16, 326 (2017).
8. Carter, T. E. et al. Genetic diversity of Anopheles stephensi in Ethiopia provides insight into patterns of spread. Parasites Vectors 14, 602 (2021).
9. Ishtiaq, F., Swain, S. & Kumar, S. S. Anopheles stephensi (Asian Malaria Mosquito). Trends in Parasitology 37, 571–572 (2021).
10. Budiansky, S. Creatures of Our Own Making. Science 298, 80–86 (2002).
11. Neafsey, D. E. et al. The Evolution of the Anopheles 16 Genomes Project. G3 Genes|Genomes|Genetics 3, 1191–1194 (2013).
12. Osta, M. A., Christophides, G. K. & Kafatos, F. C. Effects of Mosquito Genes on Plasmodium Development. Science 303, 2030–2032 (2004).
13. Riehle, M. M. et al. Natural Malaria Infection in Anopheles gambiae Is Regulated by a Single Genomic Control Region. Science 312, 577–579 (2006).
14. Blandin, S. A. et al. Dissecting the Genetic Basis of Resistance to Malaria Parasites in Anopheles gambiae. Science 326, 147–150 (2009).
15. Mitri, C., Thiery, I., Bourgouin, C. & Paul, R. E. L. Density-dependent impact of the human malaria parasite Plasmodium falciparum gametocyte sex ratio on mosquito infection rates. Proc. R. Soc. B. 276, 3721–3726 (2009).
16. Crawford, J. E. et al. Evidence for Population-Specific Positive Selection on Immune Genes of Anopheles gambiae. G3 Genes|Genomes|Genetics 2, 1505–1519 (2012).
17. Ruiz, J. L. et al. Chromatin changes in Anopheles gambiae induced by Plasmodium falciparum infection. Epigenetics & Chromatin 12, 5 (2019).
18. Ruiz, J. L., Ranford-Cartwright, L. C. & Gómez-Díaz, E. The regulatory genome of the malaria vector Anopheles gambiae: integrating chromatin accessibility and gene expression. NAR Genomics and Bioinformatics 3, lqaa113 (2021).
19. Gómez-Díaz, E., Rivero, A., Chandre, F. & Corces, V. G. Insights into the epigenomic landscape of the human malaria vector Anopheles gambiae. Front. Genet. 5 (2014).
20. Rubio, N. R., Fish, K. D., Trimmer, B. A. & Kaplan, D. L. Possibilities for Engineered Insect Tissue as a Food Source. Front. Sustain. Food Syst. 3, 24 (2019).
21. Sandiford, S. L. et al. Cytoplasmic Actin Is an Extracellular Insect Immune Factor which Is Secreted upon Immune Challenge and Mediates Phagocytosis and Direct Killing of Bacteria, and Is a Plasmodium Antagonist. PLoS Pathog 11, e1004631 (2015).
22. Mak, M. et al. Triangulation of methods using insect cell lines to investigate insecticidal mode‐of‐action. Pest Management Science 77, 492–501 (2021).
23. Bairoch, A. The Cellosaurus, a Cell-Line Knowledge Resource. J Biomol Tech 29, 25–38 (2018).
24. Pudney, M. & Varma, M. G. R. Anopheles stephensi var. mysorensis: Establishment of a larval cell line (Mos. 43). Experimental Parasitology 29, 7–12 (1971).
25. Subbarao, S. K., Vasantha, K., Adak, T., Sharma, V. P. & Curtis, C. F. Egg‐float ridge number in Anopheles stephensi: ecological variation and genetic analysis. Medical Vet Entomology 1, 265–271 (1987).
26. Lukyanchikova, V. et al. Anopheles mosquitoes reveal new principles of 3D genome organization in insects. Nat Commun 13, 1960 (2022).
27. Prokhorovich, M. A. et al. Cultures of hESM human embryonic stem cells: Chromosomal aberrations and karyotype stability. Bull Exp Biol Med 144, 126–129 (2007).
28. Jiang, X. et al. Genome analysis of a major urban malaria vector mosquito, Anopheles stephensi. Genome Biol 15, 459 (2014).
29. Salnikov, P. et al. Structural variants in the Epb41l4a locus: TAD disruption and Nrep gene misregulation as hypothetical drivers of neurodevelopmental outcomes. Sci Rep 14, 5288 (2024).
30. Gridina, M. et al. A cookbook for DNase Hi-C. Epigenetics & Chromatin 14, 15 (2021).
31. Andrews, S. FastQC: a quality control tool for high throughput sequence data (2010).
32. Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048 (2016).
33. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet j. 17, 10 (2011).
34. Landt, S. G. et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 22, 1813–1831 (2012).
35. The Galaxy Community. et al. The Galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update. Nucleic Acids Research 52, W83–W94 (2024).
36. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
37. Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol 37, 907–915 (2019).
38. Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol 33, 290–295 (2015).
39. NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRP532727 (2025).

Acknowledgements

This work was supported by the grant of the state program of the «Sirius» Federal Territory «Scientific and technological development of the «Sirius» Federal Territory» (Agreement №26-03, 27/09/2024) (Hi-C libraries preparation, data analysis, manuscript preparation). An. stephensi MSQ43 cell line was initially obtained through Malaria Research and Reference Reagent Resource Center (MR4) stocks and BEI Resources, NIAID, NIH. All computations were performed using nodes of the high-throughput cluster of the Novosibirsk State University (supported by the Ministry of Science and Higher Education of the Russian Federation, grant #FSUS-2024-0018), and bioinformatics resource center of the Institute of Cytology and Genetics (state project FWNR-2022-0019). We acknowledge the Center for Shared Use of Flow Cytometry SB RAS for providing FACS sorting facilities.

Author information

Authors and Affiliations

Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
Varvara Lukyanchikova, Miroslav Nuriddinov, Anna Khabarova, Maria Gridina, Andrey Popov, Polina Belokopytova & Veniamin Fishman
Novosibirsk State University, Novosibirsk, Russia
Varvara Lukyanchikova, Maria Gridina, Polina Belokopytova & Veniamin Fishman
Department of Entomology, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
Varvara Lukyanchikova & Igor V. Sharakhov
Fralin Life Science Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA
Varvara Lukyanchikova & Igor V. Sharakhov
Department of Genetics and Cell Biology, Tomsk State University, Tomsk, Russia
Miroslav Nuriddinov
Sirius University of Science and Technology, Sirius Federal Territory, Sochi, 354340, Russia
Maria Gridina & Veniamin Fishman
Artificial Intelligence Research Institute, AIRI, Moscow, Russia
Andrey Popov & Veniamin Fishman
Skolkovo Institute of Science and Technology, Moscow, Russia
Andrey Popov

Contributions

V.F. and I.V.S. conceived and supervised the study. V.L. cultivated MSQ43 cells and performed Hi-C, ChIP-seq, and RNA-seq with help from M.G. and A.K. M.N. performed Hi-C data processing and technical validation with help from A.P. P.B. performed ChIP-seq and RNA-seq data processing and technical validation. M.N. and V.L. generated genome assembly and analyzed chromosomal rearrangements. A.K. optimized transfection conditions. All the authors contributed to the manuscript preparation.

Corresponding author

Correspondence to Veniamin Fishman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Table 6 (XLSX)

Supplementary information (DOCX)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Lukyanchikova, V., Nuriddinov, M., Khabarova, A. et al. Chromatin landscape, transcriptomic and ChIP-seq profiling of Anopheles stephensi MSQ43 cell line. Sci Data 12, 1566 (2025). https://doi.org/10.1038/s41597-025-05721-2

Received: 21 October 2024
Accepted: 30 July 2025
Published: 26 September 2025
Version of record: 26 September 2025
DOI: https://doi.org/10.1038/s41597-025-05721-2