Abstract
Extensive statistical analysis of genes from bacteria, archaea, eukaryotes, plasmids and viruses has found that a maximal $C^3$-self-complementary trinucleotide circular code has the highest average occurrence in the reading frame of the ribosome during translation. Moreover, dinucleotide circular codes have been observed in non-coding regions of eukaryotic genomes. Using a recently developed graph-theoretical approach to circular codes, we study mixed circular codes $X \subseteq \mathcal{B}_2 \cup \mathcal{B}_3 \cup \mathcal{B}_4$, which are the union of a dinucleotide circular code $X_2 \subseteq \mathcal{B}_2$, a trinucleotide circular code $X_3 \subseteq \mathcal{B}_3$ and a tetranucleotide circular code $X_4 \subseteq \mathcal{B}_4$, where $\mathcal{B} = \{A, C, G, T\}$ is the $4$-letter genetic alphabet. We construct maximal mixed circular codes $X$ of (di,tri)-nucleotides, (tri,tetra)-nucleotides and (di,tri,tetra)-nucleotides. In particular, we show that any maximal dinucleotide circular code $D \subseteq \mathcal{B}_2$ of size $6$ can be embedded into a maximal mixed circular code $X \subseteq \mathcal{B}_2 \cup \mathcal{B}_3$ such that $X \cap \mathcal{B}_3$ is a maximal $C^3$-comma-free code. The growth function of self-complementary mixed circular codes of dinucleotides and trinucleotides is given. Self-complementary mixed circular codes could have been involved in primitive genetic processes, and an evolutionary model based on mixed circular codes is also proposed.
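
As an illustration of the graph-theoretical approach mentioned above, the following Python sketch builds a directed graph from a set of words over $\mathcal{B}$ and tests it for cycles. It assumes the criterion usually associated with this approach: each word contributes an edge from every proper prefix to the matching suffix, and the code is circular when the resulting graph is acyclic. Whether the criterion applies in exactly this form to mixed word lengths is taken here as an assumption. The function names (`code_graph`, `graph_is_acyclic`, `is_self_complementary`) are hypothetical and chosen only for this example.

```python
# Watson-Crick complement map on the genetic alphabet B = {A, C, G, T}.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def code_graph(code):
    """Return adjacency sets: one edge prefix -> suffix per proper split of each word."""
    edges = {}
    for word in code:
        for i in range(1, len(word)):
            prefix, suffix = word[:i], word[i:]
            edges.setdefault(prefix, set()).add(suffix)
            edges.setdefault(suffix, set())  # make sure every vertex has an entry
    return edges


def graph_is_acyclic(edges):
    """Depth-first search with three colours; a grey-to-grey edge closes a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = dict.fromkeys(edges, WHITE)

    def visit(vertex):
        colour[vertex] = GREY
        for successor in edges[vertex]:
            if colour[successor] == GREY:
                return False
            if colour[successor] == WHITE and not visit(successor):
                return False
        colour[vertex] = BLACK
        return True

    return all(colour[v] != WHITE or visit(v) for v in edges)


def reverse_complement(word):
    """Complement each letter and reverse the word."""
    return "".join(COMPLEMENT[letter] for letter in reversed(word))


def is_self_complementary(code):
    """True when the code contains the reverse complement of each of its words."""
    return all(reverse_complement(word) in code for word in code)


# A dinucleotide set of size 6 and a mixed (di,tri) set built from it.
dinucleotides = {"AC", "AG", "AT", "CG", "CT", "GT"}
mixed = dinucleotides | {"ACG"}

for name, code in [("dinucleotides", dinucleotides), ("mixed", mixed)]:
    print(name, graph_is_acyclic(code_graph(code)), is_self_complementary(code))
```

The set `{"AC", "CA"}` is a small counterexample: its graph contains the cycle A, C, A, which matches the fact that the circular word ACAC can be read both as AC·AC and as CA·CA.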