Each of our billions
of cells contains about two metres or six feet of nuclear DNA. All of this DNA
has to be packed into a nucleus that is about 10 microns or one hundredth of a
millimetre across. In order to fit all this DNA into this tiny space most of
the DNA strand is wrapped into tiny loops called nucleosomes. This is true for
all organisms that have nuclei, including animals, plants and yeasts, which
together are called eukaryotes.
The nucleosomes are loops of DNA, 147 base pairs long, wrapped tightly around a
core made of eight protein molecules called histones. Each loop or knot of DNA
is connected to the next by a stretch of unwrapped DNA (called linker DNA) that
can be anything between 10 and 50 base pairs long. DNA in the nucleus,
therefore, resembles a string of beads, each bead representing a nucleosome.
The structure of individual nucleosomes with their histone core has been known
for some time. There are about 30 million nucleosomes in a human nucleus.
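The figures above can be checked against each other with a little arithmetic. The sketch below is only a rough consistency check, not a measurement: it takes the two metres of DNA, the 147 base pair wrap and the 10 to 50 base pair linker from the text, and assumes a spacing of about 0.34 nanometres per base pair (the usual textbook value for B-form DNA, which is not stated in the text). The function names are made up for illustration.

```python
# Rough consistency check on the figures quoted in the text.
DNA_LENGTH_M = 2.0          # from the text: about two metres of DNA per cell
RISE_PER_BP_M = 0.34e-9     # ASSUMPTION: ~0.34 nm per base pair (B-form DNA)
CORE_BP = 147               # from the text: DNA wrapped around one histone core
LINKER_RANGE_BP = (10, 50)  # from the text: shortest and longest linker


def base_pairs_in(length_m, rise_per_bp_m=RISE_PER_BP_M):
    """Number of base pairs in a DNA strand of the given length."""
    return length_m / rise_per_bp_m


def nucleosome_count(total_bp, linker_bp, core_bp=CORE_BP):
    """Nucleosomes needed if every bead uses core_bp plus one linker."""
    return total_bp / (core_bp + linker_bp)


total_bp = base_pairs_in(DNA_LENGTH_M)
print(f"total DNA: about {total_bp / 1e9:.1f} billion base pairs")
for linker in LINKER_RANGE_BP:
    n = nucleosome_count(total_bp, linker)
    print(f"linker {linker} bp: about {n / 1e6:.0f} million nucleosomes")
```

Under these assumptions the estimate comes out at roughly 30 to 37 million nucleosomes, depending on linker length, which sits comfortably alongside the figure of about 30 million.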
Now the fascinating thing about this arrangement is that its physical structure
affects the cell and the organism. It has a profound effect on the readout of
the DNA sequence. The very compact structure of DNA in the nucleosomes makes it
difficult to physically access the DNA. So, genes or parts of genes which are
located in a nucleosome are far less easily transcribed than genes located in
linker DNA. Transcription factors and other regulatory factors will bind more
readily to target sites located in linker DNA than in nucleosomes. Indeed, not
only do individual nucleosomes hide DNA from regulators and from transcription,
but complexes of nucleosomes packed tightly together do the same job, in
spades. This enables the genome to physically hide
non-functional binding sites from regulatory factors and moderate the
expression of genes.
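The string-of-beads picture can be turned into a toy model to make the idea of accessibility concrete. The sketch below is purely illustrative: the nucleosome positions, the site names and the 10 base pair site length are all made up, and real accessibility is a matter of degree rather than a yes or no answer. It simply asks whether a binding site overlaps any wrapped stretch of DNA.

```python
from dataclasses import dataclass

CORE_BP = 147  # from the text: base pairs wrapped in one nucleosome


@dataclass(frozen=True)
class Nucleosome:
    start: int  # hypothetical 0-based position of the first wrapped base pair

    @property
    def end(self):
        # Position just past the last wrapped base pair (exclusive).
        return self.start + CORE_BP


def is_wrapped(site_start, site_len, nucleosomes):
    """True if any base pair of the site lies inside a nucleosome."""
    site_end = site_start + site_len
    return any(site_start < n.end and n.start < site_end for n in nucleosomes)


# Made-up example: three nucleosomes separated by 30 bp and 40 bp linkers.
nucleosomes = [Nucleosome(0), Nucleosome(177), Nucleosome(364)]
sites = {"site_A": 60, "site_B": 150, "site_C": 330}  # hypothetical sites

for name, pos in sites.items():
    if is_wrapped(pos, 10, nucleosomes):
        print(name, "is wrapped in a nucleosome (hard to reach)")
    else:
        print(name, "is in linker DNA (easier to reach)")
```

In this invented layout site_A falls inside the first nucleosome, while site_B and site_C sit in linker DNA and so would be the easier targets for a regulatory factor.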
So, the obvious question is: since the position of nucleosomes along the DNA
sequence is so important, how is it controlled? Segal et al have shown
that the DNA sequence itself makes a major contribution to determining the
position of nucleosomes along the DNA.
What's the interesting thing here? Well we have been fascinated by the logical
code in DNA for decades. We have known for some time how proteins are coded
in DNA sequences. The idea that the physical arrangement of DNA is
important is more recent. But the fascinating thing that Segal et al show is
that DNA encodes proteins as we expect, and that it also encodes its
physical arrangement and packaging. It's very complex - it's like a computer
program that not only codes for the actions that it should perform, but also
for the computer that reads it out.
The code that Segal et al have revealed predicts that centromeres (where sister
chromatids connect) and telomeres (the ends of chromosomes) should be highly
occupied by nucleosomes. In contrast, genes that code for ribosomal and
transfer RNA, which need to be highly expressed, are predicted to be poor in
nucleosomes. Nucleosomes are not expected at the functional binding sites for
regulatory factors but are expected at the non-functional sites with similar
sequence. Transcription start sites, where gene transcription begins, are poor
in nucleosomes and so available to the transcription machinery. So the genome
itself encodes a physical packing structure that encourages regulatory factors
to bind at the active functional sites. As if the logical protein coding weren't
complex enough, we find it overlaid with coding for physical DNA packing, which
has functional consequences.
To make things even more complicated, nucleosome positions are adjusted
dynamically, and nucleosomes compete with regulatory factors to bind DNA.
Furthermore, nucleosome positions are determined not just by DNA sequence but
also by enzymes called chromatin remodelling factors, which are themselves
products of genes in the DNA. Biology is more complex, much more complex, than
we realised a couple of decades ago.
(First published on the evolutionpages blog.)