Biophysical Journal 71:1539-1544 (1996)



Is There an Error Correcting Code
in the Base Sequence in DNA?

by

Larry S. Liebovitch, Yi Tao, Angelo T. Todorov, and Leo Levine

Center for Complex Systems
Florida Atlantic University
777 Glades Road
Boca Raton, FL 33431
Telephone: (407) 367-2239
FAX: (407) 367-2223
Internet: LIEBOVITCH@WALT.CCS.FAU.EDU



ABSTRACT

Modern methods of encoding information into digital form include error check digits that are functions of the information digits. When digital information is transmitted, the values of the error check digits can be recomputed from the received information digits to determine whether the information has been received accurately. These error correcting codes make it possible to detect and correct common errors in transmission. The sequence of bases in DNA is also a digital code, consisting of 4 symbols: A, C, G, and T. Does DNA also contain an error correcting code? Such a code would allow repair enzymes to protect the fidelity of non-replicating DNA and increase the accuracy of replication. If a linear block error correcting code is present in DNA, then within each block of bases some bases would be a linear function of the other bases in that block. We developed an efficient procedure to determine whether such an error correcting code is present in the base sequence. We illustrate the use of this procedure by applying it to the lac operon and the gene for cytochrome c. These genes do not appear to contain such a simple error correcting code.
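
To make the idea of a linear block code over the four bases concrete, the following is a minimal sketch in Python and not the procedure developed in the paper. It rests on several assumptions that the abstract does not state: the bases are mapped to the integers 0 to 3 (the mapping BASE_VALUE is hypothetical), arithmetic is done modulo 4, each block carries a single check base in its last position, and blocks do not overlap. The function names check_base, fraction_consistent, and best_coeffs are illustrative only.

```python
from itertools import product

# Hypothetical assignment of bases to integers; the abstract does not give one.
BASE_VALUE = {"A": 0, "C": 1, "G": 2, "T": 3}


def check_base(info, coeffs, modulus=4):
    """Value of the check symbol as a linear function of the information bases.

    Assumes arithmetic modulo 4, which is an illustrative choice.
    """
    return sum(c * BASE_VALUE[b] for c, b in zip(coeffs, info)) % modulus


def fraction_consistent(seq, coeffs):
    """Fraction of non-overlapping blocks whose last base equals the check value.

    Each block has len(coeffs) information bases followed by one check base.
    Trailing bases that do not fill a whole block are ignored.
    """
    n = len(coeffs) + 1
    blocks = [seq[i:i + n] for i in range(0, len(seq) - n + 1, n)]
    if not blocks:
        return 0.0
    hits = sum(
        1 for blk in blocks
        if BASE_VALUE[blk[-1]] == check_base(blk[:-1], coeffs)
    )
    return hits / len(blocks)


def best_coeffs(seq, k):
    """Brute-force search over all 4**k coefficient vectors for the best fit."""
    return max(product(range(4), repeat=k),
               key=lambda c: fraction_consistent(seq, c))


if __name__ == "__main__":
    # "ACGT": A + C + G = 0 + 1 + 2 = 3, which maps to T under BASE_VALUE.
    print(fraction_consistent("ACGTACGT", (1, 1, 1)))
    print(fraction_consistent("ACGTACGA", (1, 1, 1)))
```

Under these assumptions, a sequence that really contained such a code would give a fraction near 1 for the correct coefficients, whereas a sequence with no code would be expected to satisfy any given check only about as often as chance allows.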