Main

Microarray-based comparative genomic hybridization (array CGH) provides a means to quantitatively measure DNA copy-number aberrations and to map them directly onto genomic sequence. Because arrays composed of large-insert genomic clones such as BACs provide reliable copy-number measurements on individual clones [1,2], they are potentially useful for research and clinical applications in medical genetics and cancer. Preparation and spotting of BAC DNA is problematic, however, because (i) BACs are single-copy vectors, (ii) the yield of DNA from BAC cultures is low compared to that from plasmid-bearing cultures, and (iii) spotting high-molecular-weight DNA at sufficient concentration to obtain a good signal-to-noise ratio in the hybridizations may be difficult. To overcome these problems, we used ligation-mediated PCR [3] to generate representations of human and mouse BAC DNAs for spotting on arrays. We produced sufficient spotting solution (0.8 μg/μl DNA in 20% DMSO) from 1 ng of BAC DNA to make tens of thousands of arrays (see Web Note A for methods). The ratios we measured using arrays composed of BAC representations are essentially identical to ratios previously reported for DNA from the same BACs [1]. Independently prepared DNA representations yield highly reproducible data (average variation of the linear ratios on individual clones from two independent preparations, 6.6%).

For copy-number assessment across the human genome, we printed 2,460 BAC and P1 clones in triplicate (approximately 7,500 elements) in a 12 mm × 12 mm square (HumArray 1.14; see Web Table A and Web Fig. A). Each clone contains at least one STS, allowing linkage to the genome sequence. Cytogenetic mapping indicated that 2,298 of the arrayed clones are single copy [4,5]; these arrays thus provide average resolution of approximately 1.4 Mb across the genome. We have also assembled an array of approximately 1,300 clones for the mouse, which will be reported elsewhere. With the human arrays, we have obtained highly reproducible measurements over a wide dynamic range in cancer cell lines (see Web Tables B and C for analyses of COLO320, HCT116, HT29, MDA-MB-231, MDA-MB-453, MPE600, SW837 and T47D). These copy-number alterations ranged from homozygous deletions (log2 ratio < −2, HCT116 chromosome 16) to very high-level amplifications (log2 ratio > 6, amplification of CMYC, COLO320). We also obtained nearly identical ratios (average s.d. of the log2 ratio = 0.08) in three replicate hybridizations with BT474 cell line DNA, two labeled by random priming and one by nick translation, using an array of 1,777 clones (HumArray 1.11; see Web Table D and Web Fig. B).
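The quoted average resolution can be checked with simple arithmetic. The sketch below assumes a human genome length of roughly 3,200 Mb, a figure that is not stated in the text, and divides it by the number of single-copy clones.

```python
# Assumption (not from the text): the human genome spans roughly 3,200 Mb.
ASSUMED_GENOME_LENGTH_MB = 3200

# Number of arrayed clones reported as single copy by cytogenetic mapping.
single_copy_clones = 2298

# Average spacing between clones if they were spread evenly over the genome.
average_resolution_mb = ASSUMED_GENOME_LENGTH_MB / single_copy_clones

print(f"Average resolution: {average_resolution_mb:.2f} Mb")
```

With this assumed genome length the result rounds to the approximately 1.4 Mb given above. Real clone spacing is not uniform, so local resolution will vary.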

To test our ability to measure single-copy changes (trisomies and monosomies), which is critical for applications in medical genetics and cancer, we measured 15 cell strains containing cytogenetically mapped partial or whole-chromosome aneuploidies (see Web Tables E–I). Figure 1 shows representative analyses, including detection of whole-chromosome gains (Fig. 1a), detection of a deletion (Fig. 1b) and its confirmation (Fig. 1c). The mean log2 ratios of trisomic chromosomal regions were 0.49±0.05, compared to the ideal value of 0.58 for a 3/2 ratio. In female/male comparisons, the mean log2 ratios on the X chromosome were 0.72±0.08, compared to the expectation of 1.0. The underestimation of the magnitude of copy-number deviations most likely results from incomplete suppression of repetitive sequences or errors in background subtraction [1].
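The ideal values quoted above follow directly from the copy-number ratio between test and reference DNA. The short sketch below reproduces them and compares them with the reported means. The function name is illustrative, and the "recovery" figures are just the ratio of observed to ideal log2 values, not a quantity defined in the text.

```python
import math

def expected_log2_ratio(test_copies, reference_copies=2):
    """Ideal log2 ratio for a region at test_copies versus reference_copies."""
    return math.log2(test_copies / reference_copies)

# Ideal values for the situations discussed in the text.
trisomy = expected_log2_ratio(3)                               # 3 vs 2 copies
monosomy = expected_log2_ratio(1)                              # 1 vs 2 copies
female_vs_male_x = expected_log2_ratio(2, reference_copies=1)  # X: 2 vs 1 copy

# Reported mean log2 ratios, expressed as a fraction of the ideal value.
trisomy_recovery = 0.49 / trisomy
x_recovery = 0.72 / female_vs_male_x

print(f"trisomy ideal {trisomy:.2f}, observed/ideal {trisomy_recovery:.2f}")
print(f"X chromosome ideal {female_vs_male_x:.2f}, observed/ideal {x_recovery:.2f}")
```

On this simple comparison, the measured trisomy and X-chromosome means reach roughly 84% and 72% of their ideal log2 values, which illustrates the underestimation described above.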

Figure 1: Measurement of single-copy changes.

a, Normalized copy-number ratios comparing genomic DNA from cell strain GM03576 with normal reference DNA (see Web Note for methods). Data are plotted as the mean log2 ratio of the triplicate spots for each clone normalized to the genome median log2 ratio. The BACs are ordered by position in the genome beginning at 1p and ending at Xq. Borders between chromosomes are indicated by vertical bars. Cytogenetic analysis indicates that this cell line is trisomic for chromosomes 2 and 21. b, Normalized copy-number variation of cell line GM03563 on BAC clones from chromosome 9. The mean log2 ratios of the triplicate spots normalized to the median log2 ratio for the genome are plotted relative to the position of the clones on the draft genomic sequence. The log2 ratio of approximately −1 indicates a single-copy deletion of the first two clones on chromosome 9. The standard deviation of the log2 ratios of the clones that are not deleted is 0.08. Colored arrows indicate clones hybridized to interphase nuclei in c. c, Confirmation of the deletion on 9p by fluorescence in situ hybridization to interphase GM03563 nuclei. The Texas red–labeled test clone RP11-28N06 included in the deletion (indicated by the red arrow in b) gave a single red hybridization signal and the FITC-labeled reference clone RP11-115L05 (indicated by the green arrow in b) gave two green signals.
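The normalization described in this legend (mean of the triplicate spots per clone, then centering on the genome median) can be sketched as follows. This is a minimal illustration, not the authors' software. It assumes the median is taken over the per-clone means, and the clone names and spot values are made up.

```python
from statistics import mean, median

def normalize_clone_ratios(spot_log2_ratios):
    """Map each clone to its mean replicate log2 ratio minus the genome median.

    spot_log2_ratios: dict of clone name -> list of replicate spot log2 ratios.
    Assumption: the genome median is computed over the per-clone means.
    """
    clone_means = {clone: mean(spots) for clone, spots in spot_log2_ratios.items()}
    genome_median = median(clone_means.values())
    return {clone: value - genome_median for clone, value in clone_means.items()}

# Hypothetical triplicate measurements for five clones.
example_spots = {
    "clone_A": [0.10, 0.12, 0.08],
    "clone_B": [0.60, 0.58, 0.62],    # gained region
    "clone_C": [0.11, 0.09, 0.10],
    "clone_D": [-0.90, -0.88, -0.92], # deleted region
    "clone_E": [0.10, 0.10, 0.10],
}

normalized = normalize_clone_ratios(example_spots)
for clone, value in normalized.items():
    print(f"{clone}: {value:+.2f}")
```

Centering on the median rather than the mean keeps the baseline anchored to the unchanged majority of clones, so a few strongly gained or deleted clones do not shift it.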

In principle, each clone may show a different relationship between ratio and copy number because of the differential ability to block its repetitive sequences. If so, we would expect that ratio differences among clones at the same copy number would become larger as the copy number departed farther from the genome average. In the aneuploid cell lines, we found that the vast majority of the autosomal clones had the same response to copy-number changes, as the standard deviations of the log2 ratios for autosomal clones at 1, 2 or 3 copies were all 0.09. However, on the X chromosome, the standard deviation of the log2 ratios increased from 0.10 in male/male comparisons to 0.15 in female/male comparisons. Moreover, the ratio variations among X chromosome clones were very reproducible (see Web Fig. C), suggesting that the sequence characteristics of individual clones, possibly differing amounts of sequence shared with the Y chromosome, do have a measurable effect on X chromosome ratios.

We detected copy-number gains and losses (Fig. 2a,b; Web Table J) as well as amplifications (Fig. 2b) using DNA isolated from trimmed, frozen breast tumor tissue blocks. Many of the ratio changes are of smaller magnitude than would be expected for single-copy changes in diploid genomes. For example, the log2 ratios of 0.47±0.08 (Fig. 2a) and 0.32±0.07 (Fig. 2b) recorded for parts of the genome are less than the expected log2 ratio=0.58 for a copy-number ratio of 1.5. These ratios most likely reflect the presence of admixed normal cell DNA, tetraploid DNA content and/or tumor heterogeneity. In particular, the intermediate ratios indicating a gain of 16p and loss of 16q in Fig. 2a probably result from the presence of these aberrations in only a portion of the tumor cells. The magnitude of these easily discriminated ratio changes is well below the 'two-fold' level often considered to be the limit for significant differences in expression-array measurements, indicating the potential of array technology to provide very precise ratio measurements.
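To illustrate how admixed normal cells alone could produce such intermediate ratios, the sketch below uses a simple two-population mixture model. The model is an assumption made for illustration and is not taken from the text: a fraction of cells carries the aberration in an otherwise diploid genome, the remainder is normal diploid, and the measurement itself is treated as unbiased.

```python
import math

def mixed_log2_ratio(tumor_copies, tumor_fraction, normal_copies=2):
    """log2 ratio expected when only tumor_fraction of cells carry the change."""
    average_copies = (tumor_fraction * tumor_copies
                      + (1 - tumor_fraction) * normal_copies)
    return math.log2(average_copies / normal_copies)

def tumor_fraction_from_log2(observed_log2, tumor_copies, normal_copies=2):
    """Invert the mixture model: fraction of cells carrying the aberration."""
    ratio = 2 ** observed_log2
    return (ratio - 1) * normal_copies / (tumor_copies - normal_copies)

# Reported mean log2 ratios, interpreted here as single-copy gains (3 copies).
fraction_fig2a = tumor_fraction_from_log2(0.47, tumor_copies=3)
fraction_fig2b = tumor_fraction_from_log2(0.32, tumor_copies=3)

print(f"Fig. 2a: about {fraction_fig2a:.0%} of cells with the gain")
print(f"Fig. 2b: about {fraction_fig2b:.0%} of cells with the gain")
```

Under this toy model, a log2 ratio of 0.47 would correspond to roughly 77% of cells carrying a single-copy gain and 0.32 to roughly 50%. These numbers are illustrative only: the cell-strain data show that measured ratios are already somewhat compressed, and tetraploidy or tumor heterogeneity would change the interpretation.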

Figure 2: Genome-wide copy-number variation in two breast tumors.

a and b, Normalized fluorescence ratios for breast tumors. We labeled DNA extracted from sections of trimmed, frozen tumor specimens with Cy3–dCTP by random priming and hybridized it to the array together with normal male reference DNA (see Web Note for methods). We found low-level gains and losses in both tumors and a high-level amplification on chromosome 20q in b. The elevated X-chromosome ratios reflect the male–female difference in X-chromosome copy number. Ratios are plotted as in Fig. 1a.

Previously, measuring DNA copy number using arrays assembled from representations of genomic clones prepared by other methods [6,7] resulted in highly variable ratios, so that detecting single-copy changes required averaging over several adjacent clones. In contrast, the arrays described here, produced from ligation-mediated PCR representations of the genomic clones, provide reliable data from individual clones, even in polyploid or heterogeneous specimens. This array CGH platform thus provides the performance required for potential clinical applications in medical genetic diagnosis and cancer.

Note: Supplementary information is available on the Nature Genetics web site (http://genetics.nature.com/supplementary_info/).