Kozlov N. N.
Keldysh Institute of Applied Mathematics, Russian Academy of Sciences, 125047, Moscow, Miusskaya Sq. 4, Russia
Overlapping genes of one RNA chain are investigated. The inverse problem
is posed: to compute all possible nucleotide sequences corresponding
to protein sequences whose genes overlap. Its solutions for double (Theorem
1) and triple (Theorem 2) overlaps are presented. From Theorem 1 it
follows that, for double overlaps, 286 different local overlaps can
be distinguished. Each of these overlaps determines one or two positions and
the type of nucleotide substitution that results in silent mutations.
Termination codons (ter) are contained in 187 of these overlaps, and codons
of leucine (Leu) or arginine (Arg) are involved in the remaining 99. From Theorem 2
it follows that positions of this type do not occur in regions of
triple overlap. The specific features of nucleotide occurrences
at the positions under examination are studied. For the genomes that contain
the longest regions of genetic overlap (two groups of viruses, HBV and
HIV), the non-random nature of occurrences of this type was established. Possible
reasons for this non-randomness, as well as features of local overlaps,
are discussed. For the codon families ter, Leu, and Arg, their special properties,
which result in the existence of the positions under consideration, have been
studied. Owing to the structure of these families and of the serine (Ser) family,
the degeneracy of the universal genetic code is not confined
to the third base of the codon. The specific features of the use of
the codon families Ser, Leu, and Arg in double and triple overlaps have been
studied. The analysis performed leads to a hypothesis about the origin
of these codons. It is suggested that the final "choice" of
these six corresponding triplet codons can be related to the evolution
of DNA molecules at the stages when, according to contemporary conceptions,
restrictions on genome size began to take effect and overlapping
genes appeared. The paper demonstrates that this "choice"
could not have been independent of the "choice" of ter codons.
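
The remark that the code is degenerate in more than the third base can be checked directly against the standard genetic code. The sketch below is only an illustration and is not the method of Theorems 1 and 2: it lists every single-nucleotide substitution in the first or second codon position that leaves the encoded amino acid (or the stop signal) unchanged. The names `CODE` and `silent_non_third_position` are hypothetical helpers introduced for this example, and the standard codon table is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G
# (first base varies slowest); "*" marks a termination codon (ter).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def silent_non_third_position():
    """Return the codon pairs that differ by one base in position 1 or 2
    and still encode the same amino acid or stop signal.

    Each item is (one-letter symbol, position 1..2, (codon_a, codon_b)).
    """
    found = set()
    for codon, aa in CODE.items():
        for pos in (0, 1):  # first and second base only
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == aa:
                    pair = tuple(sorted((codon, mutant)))
                    found.add((aa, pos + 1, pair))
    return sorted(found)


if __name__ == "__main__":
    for aa, pos, (a, b) in silent_non_third_position():
        print(f"{aa}: position {pos}: {a} <-> {b}")
```

Under the standard code this yields five pairs, all within the Leu, Arg and ter families. The Ser family does not appear, because its two codon blocks (UCN and AGU/AGC) differ in both the first and the second base.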
This research was supported by the Russian Foundation for Basic Research,
grants N 98-01-00059 and N 96-15-97229.