Cunningham: a BLAST Runtime Estimator
  • Manuscript
  • Open access
  • Published: 12 October 2011

  • James White (1)
  • Malcolm Matalka (1)
  • W. Florian Fricke (1)
  • Samuel Angiuoli (1)

Nature Precedings (2011)


Abstract

BLAST is arguably the single most important piece of software ever written for the biological sciences. It is the core of most bioinformatics workflows and a critical component of genome homology searches and annotation. It has influenced the landscape of biology by aiding in everything from functional characterization of genes to pathogen detection to the development of novel vaccines. While BLAST is very popular, it is also often one of the most computationally intensive parts of a bioinformatics analysis. In our workflows, BLAST typically accounts for the majority of CPU time, and we need to parallelize it to finish in a reasonable time frame. Waiting for BLAST to finish without any clue of how long it’s going to take is kind of depressing, and you could waste a day of work on a job that was never going to finish. If you feel the same way we do, then check out Cunningham, a tool we designed to estimate BLAST runtimes for shotgun sequence datasets using sequence composition statistics. We’ve trained its models on real metagenomic sequence data using the Amazon EC2 cloud, and it provides a relatively quick estimate for datasets with up to tens of millions of sequences. It’s not perfect, but it’ll give you at least some idea of the expected runtime, how large a cluster you’re going to need, how much you’ll need to partition your data, and so on. We use it all the time now, so we hope it’ll be useful to someone else out there. Cunningham has been implemented in CloVR for efficient autoscaling in the cloud and is freely available at http://clovr.org.
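
The abstract does not describe Cunningham's actual models or features, so the following is only a minimal sketch of the general idea (composition statistics in, a linear runtime estimate out, then a rough cluster size). The function names, the choice of total bases as the single predictor, and the assumption of perfect parallel scaling are all hypothetical and are not taken from Cunningham itself.

```python
import math
from typing import Dict, Iterable, Sequence, Tuple


def composition_stats(fasta_lines: Iterable[str]) -> Dict[str, float]:
    """Summarize a FASTA stream: sequence count, total bases, GC fraction."""
    n_seqs = 0
    total_bases = 0
    gc_bases = 0
    for raw in fasta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            n_seqs += 1  # each header line starts a new sequence
        else:
            seq = line.upper()
            total_bases += len(seq)
            gc_bases += seq.count("G") + seq.count("C")
    gc_fraction = gc_bases / total_bases if total_bases else 0.0
    return {"n_seqs": n_seqs, "total_bases": total_bases, "gc_fraction": gc_fraction}


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_cpu_hours(total_bases: float, slope: float, intercept: float) -> float:
    """Apply the fitted line; clamp at zero so the estimate is never negative."""
    return max(0.0, slope * total_bases + intercept)


def nodes_needed(cpu_hours: float, target_wall_hours: float, cores_per_node: int) -> int:
    """Rough cluster size, assuming (hypothetically) perfect parallel scaling."""
    if target_wall_hours <= 0 or cores_per_node <= 0:
        raise ValueError("target_wall_hours and cores_per_node must be positive")
    return max(1, math.ceil(cpu_hours / (target_wall_hours * cores_per_node)))
```

In practice one would fit `fit_line` on runtimes measured from earlier BLAST jobs against the same database and settings, then feed the statistics of a new dataset through `estimate_cpu_hours` and `nodes_needed`. A real estimator would likely use several composition features rather than total bases alone.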

Author information

Authors and Affiliations

  1. Institute for Genome Sciences, University of Maryland - School of Medicine, Baltimore, MD, 21201

    James White, Malcolm Matalka, W. Florian Fricke & Samuel Angiuoli

Corresponding author

Correspondence to James White.

Rights and permissions

Creative Commons Attribution 3.0 License.

About this article

Cite this article

White, J., Matalka, M., Fricke, W. et al. Cunningham: a BLAST Runtime Estimator. Nat Prec (2011). https://doi.org/10.1038/npre.2011.5593.2

  • Received: 12 October 2011

  • Accepted: 12 October 2011

  • Published: 12 October 2011

  • DOI: https://doi.org/10.1038/npre.2011.5593.2

Keywords

  • blast
  • runtime
  • linear models
  • sequence alignment