Ponomarenko M. P., Titov I. I., Ponomarenko J. V., Kolchanov N. A., Mazin A. V., Kowalczykowski S. C.
Institute of Cytology & Genetics, 630090, Novosibirsk, Russia; FAX: +7(3832)356-558; E-mail: pon@bionet.nsc.ru; University of California, Davis, California 95616-8665, USA
The RecA protein plays a key role in both DNA repair and homologous
recombination. When RecA binds ssDNA, a nucleoprotein RecA filament
is formed, which is essential for all biologically important RecA-mediated
reactions. It has been commonly accepted that the vital importance of the
RecA-promoted functions for the entire E. coli genome excludes any
preference of the RecA filament for particular DNA sequences. Therefore,
the recent discovery that the RecA filament binds preferentially to certain
sequences (Mazin, Kowalczykowski, 1996) was quite unexpected and requires
an explanation. For this reason, we processed these data (Mazin,
Kowalczykowski, 1996) with the computer system ACTIVITY (Kolchanov, 1998).
The ssDNA/RecA-filament affinity was found to be maximal for sequences
devoid of the trinucleotide DRV={AAA, AAC, AGA,
AGC, GAA, GAC, GGA, GGC, TAA, TGA, AAG, AGG, TAG, TGG, GAG, GGG, TAC, TGC}
and to decrease with the DRV concentration in the vicinity of the ssDNA 5' end.
This concentration is calculated by a weighted formula [equation missing],
where the argument is a given sequence; R=A/G, V=A/G/C, and D=A/T/G; the
indicator term equals 1 if x=y and 0 if x≠y; and weight(i) decreases
exponentially with position i. On ten training sequences with known
ssDNA/RecA filament affinities, a simple regression was optimized
[equation missing]. The significance of this regression was tested on six
control sequences. The obtained linear regression coefficient was r=0.812;
significance, [value missing].
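The position-weighted DRV concentration described above can be sketched as follows. This is only an illustration, not the ACTIVITY implementation: the exact formula is not preserved here, so the weight function exp(-i/decay), the value of the hypothetical parameter `decay`, and the normalization by the total weight are all assumptions. Only the DRV pattern itself (D=A/T/G, R=A/G, V=A/G/C) is taken from the text.

```python
import math
from itertools import product

# Nucleotide classes exactly as defined in the text.
D = "ATG"  # D = A/T/G
R = "AG"   # R = A/G
V = "AGC"  # V = A/G/C

# The 18 trinucleotides matching the DRV pattern.
DRV = {d + r + v for d, r, v in product(D, R, V)}


def drv_concentration(seq, decay=10.0):
    """Position-weighted fraction of DRV trinucleotides, read from the 5' end.

    ASSUMPTIONS (not from the source): weight(i) = exp(-i / decay),
    `decay` = 10.0, and normalization by the sum of all weights.
    """
    seq = seq.upper()
    hits = 0.0
    total = 0.0
    for i in range(len(seq) - 2):
        w = math.exp(-i / decay)  # exponentially decreasing with position i
        total += w
        if seq[i:i + 3] in DRV:
            hits += w
    return hits / total if total else 0.0
```

With these assumptions, a DRV trinucleotide near the 5' end contributes more than the same trinucleotide near the 3' end, for example `drv_concentration("AAACCCCCC") > drv_concentration("CCCCCCAAA")`.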
Thus, this regression reliably predicts the ssDNA/RecA filament affinity
from the ssDNA sequence. Next, the DRV trinucleotides were superimposed on
the genetic code. The DRV trinucleotides turned out to correspond to
the codons of lysine, cysteine, serine, tyrosine, glycine, asparagine, tryptophan,
arginine, and both glutamic and aspartic acids. In proteins, these residues
are significantly frequent on the surface and rare in domain cores (Karlin, 1989).
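The superposition of DRV on the genetic code can be checked with a short script. This is a minimal sketch that assumes the standard genetic code (written with T in place of U); the names `CODON_TABLE`, `drv_residues`, and `drv_stops` are illustrative and do not come from the original work.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (assumption: the standard code is the one intended in the text).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}

# DRV trinucleotides: D = A/T/G, R = A/G, V = A/G/C.
DRV = {d + r + v for d, r, v in product("ATG", "AG", "AGC")}

# One-letter codes of the amino acids encoded by DRV codons.
drv_residues = sorted({CODON_TABLE[c] for c in DRV} - {"*"})

# DRV trinucleotides that are stop codons rather than sense codons.
drv_stops = sorted(c for c in DRV if CODON_TABLE[c] == "*")

print(drv_residues)
print(drv_stops)
```

Under the standard code, the ten residues recovered this way (C, D, E, G, K, N, R, S, W, Y) are the ones listed in the text; the three remaining DRV trinucleotides (TAA, TAG, TGA) are stop codons.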
Protein surfaces are commonly associated with functional sites. Hence,
the RecA filament is likely to avoid the gene regions encoding protein
functional sites and to prefer those encoding domain cores. Consequently,
structurally similar proteins may differ in the order of their domains,
while their functional sites should be the most conserved. Both of these
phenomena are, indeed, well-known facts.
This work was supported by a grant from the Russian Basic Research Foundation.