Although our understanding of gene expression is far more nuanced these days, the 'central dogma', namely that DNA encodes RNA, which in turn encodes protein, remains a valid principle. Back in the 1970s, proteins were known to be expressed in a highly regulated fashion in response to specific signals. How this regulation is achieved and how the signals are relayed have kept molecular biologists busy ever since. The first level of regulation to emerge was the control of transcription by trans-acting, sequence-specific transcription factors, which bind cis-regulatory DNA sequences to modulate transcription by RNA polymerase (pol) I, II or III (see Milestone 7). Today, this stands as a near-universal mode of gene regulation.

The first detailed mapping of DNA sequences bound by a transcription factor was published in 1978. Tjian used an adenovirus–simian-virus-40 (SV40) hybrid, which supported higher expression of a protein functionally similar to SV40 T antigen in a system more amenable to purification. He showed that the purified protein bound in a sequential manner to tandem recognition sequences at the SV40 replication origin, which turned out to overlap with the promoter. Tjian drew comparisons to phage lambda-repressor DNA binding, characterized a few years earlier by Ptashne and colleagues (see Milestone 2). He noted that the binding region contained palindromic stretches now emblematic of many other transcription-factor-binding sites, and that the protein probably bound as a multimer. Notably, large T-antigen binding to this region had been shown previously by electron microscopy.

Around the same time, Roeder observed that although RNA pol III alone could not transcribe 5S RNA genes, Xenopus laevis oocyte extracts contained an activity that allowed accurate transcription. Roeder and colleagues purified this activity, TFIIIA, in 1980, showed that it was specific for 5S genes and bound DNA independently of pol III, and mapped its promoter binding site by footprinting.

The following year, Yamamoto and colleagues reported specific in vitro binding of the glucocorticoid receptor to a 4.5-kb fragment of the mouse mammary tumour virus, which they had shown previously to mediate hormone-responsive transcription. One year later, McKnight and Kingsbury were the first to apply a 'linker-scanning' mutagenesis approach to produce a detailed map of the promoter region of the viral thymidine kinase gene. In 1983, Dynan and Tjian isolated the human transcriptional activator Sp1 using an in vitro transcription assay, and located its binding sites upstream of the transcription-initiation site.

The next breakthrough came from experiments by Brent and Ptashne that established the modular nature of transcription factors. Using a hybrid protein in which the DNA-binding region of the prokaryotic transcriptional repressor LexA was fused to a DNA-binding-deficient fragment of the yeast transcriptional activator Gal4, they showed that this chimaera supported transcription only from yeast promoters containing the lexA operator. Remarkably, this operator could drive expression even when located downstream of the promoter.

Arguably, sequence-specific transcription factors constitute the most important and diverse gene-regulatory mechanism. The combinatorial diversity afforded by transcription-factor binding is an effective means of coordinating the regulation of complex sets of genes. Moreover, signalling cascades regulate gene expression predominantly by modulating transcription-factor activity.

With transcription factors, the 'central dogma' had come full circle: protein regulates gene, hence message and protein.