POSITIONAL CORRELATION OF PHYSICO-CHEMICAL CHARACTERISTICS WITHIN THE ALPHA-HELICES OF C2H2-TYPE ZINC-FINGER DNA-BINDING DOMAINS: ANALYSIS OF PHAGE DISPLAY DATA

Afonnikov D., Wingender E.
Institute of Cytology and Genetics, Novosibirsk, Russia; Gesellschaft für Biotechnologische Forschung, Braunschweig, Germany; E-mail: ada@bionet.nsc.ru
The C2H2 zinc-finger domain is one of the main classes of DNA-binding motifs. Its tertiary structure consists of two beta-strands packed against an alpha-helix. The alpha-helix fits into the major groove of DNA, where several of its side chains make specific contacts with the nucleotide bases (positions -1, 2, 3, and 6 relative to the first residue of the alpha-helix). However, the details of this specific recognition remain unclear. In this work, we analyze previously published data from phage display experiments using the Zif268 polypeptide (Choo Y., Klug A., 1994, PNAS, pp. 11163-11167 and 11168-11172). These data represent a set of amino acid sequences of the alpha-helical region that were selected for the highest DNA-binding affinity from a large pool of possible polypeptide mutants. Note that these "artificial selection" data carry no "evolutionary dependence", in contrast to sequence data extracted from protein databases. To estimate interdependencies between individual positions in the DNA-binding helix, we applied both linear and partial correlation coefficients to several physico-chemical parameters of the residues at each sequence position, namely isoelectric point, hydrophobicity, polarity, and side chain volume. We found that the isoelectric point values at positions -1, 1, 2, 3, and 6 are negatively correlated with each other (linear correlation coefficients range from -0.4 to -0.65, significant at the 95% level). Analysis of the other parameters also revealed several highly correlated position pairs. We also investigated the clustering of the phage display sequences in the space defined by these physico-chemical quantities at the sequence positions, reconstructing a tree diagram for each of the above quantities. 
Analysis of these diagrams shows that the cluster structures of the trees for isoelectric point, hydrophobicity, and polarity are similar to one another, in contrast to the tree for side chain volume. The tree diagram for the isoelectric point values also shows that sequences binding the same DNA site fall into the same cluster. Our results reflect the functional importance of positions -1, 1, 2, 3, and 6 of the alpha-helix and agree with existing data on the zinc-finger domain. In addition, they may provide information on the quantitative relationship between the isoelectric point values at these positions. We believe that this information could be useful for understanding the mechanism of zinc-finger/DNA recognition.
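
The positional correlation analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation. The sequences in TOY_SEQS are hypothetical and are not the published phage display data, the isoelectric point scale is an assumed set of approximate textbook values for free amino acids (the abstract does not state which scale was used), and the partial correlation shown is the simple first-order form that controls for a single third position.

```python
from math import sqrt

# Approximate textbook isoelectric points of free amino acids.
# ASSUMPTION: the abstract does not say which scale was used.
PI = {"R": 10.76, "K": 9.74, "H": 7.59, "D": 2.77, "E": 3.22, "A": 6.00,
      "S": 5.68, "T": 5.60, "N": 5.41, "Q": 5.65, "L": 5.98, "V": 5.96}

# Helix positions are numbered -1, 1, 2, ..., 6 (there is no position 0).
POSITIONS = [-1, 1, 2, 3, 4, 5, 6]

# HYPOTHETICAL toy sequences, one residue per helix position.
TOY_SEQS = ["RSDELTR", "QSAHLVR", "DSRNLTA", "RADELKH", "KSEALTD", "ESKQLVN"]


def column(seqs, pos, scale=PI):
    """Return the property values observed at one helix position."""
    i = POSITIONS.index(pos)
    return [scale[s[i]] for s in seqs]


def pearson(x, y):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


if __name__ == "__main__":
    a, b, c = (column(TOY_SEQS, p) for p in (-1, 3, 6))
    print("r(-1, 6)      =", round(pearson(a, c), 3))
    print("r(-1, 6 | 3)  =", round(partial(a, c, b), 3))
```

Comparing the two printed values shows how much of the apparent dependence between positions -1 and 6 remains once position 3 is held fixed. Controlling for all remaining positions at once would require the inverse of the full correlation matrix, which this sketch does not attempt.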

This work was supported by grants from NATO (N. HTECH.LG 951149), the Russian National Human Genome Project (N 12312GCh-5), the Russian Foundation for Basic Research (N 96-04-50006 and 97-04-49740), and a German ministerial grant (N X224.6).