Thursday, August 22, 2013

How to search for something when you don't know what it is

One of the reasons why a computer scientist like me finds such great delight in the works of G. K. Chesterton is the continual appearance of hilarious truths, almost like the subtleties of Alice in Wonderland or the Gospels. It's the grand mystical revelation in a statement like this:
"Perhaps the weapon was too big to be noticed," said the priest, with an odd little giggle.
[GKC "The Three Tools of Death" in The Innocence of Father Brown]
The curious truth I am trying to express today is the challenging idea about which I wrote my doctoral dissertation: the idea of finding a string (a series of letters or characters) of some desired uniqueness within a given collection of information. That problem arose in a very curious need of molecular biologists - a puzzle a bit too elaborate to explain here and now, and about which I am developing a book. However, the method was summarized in the old cartoon version of Geisel's The Cat in the Hat, in the technique he called "Calculatus Eliminatus": the principle that to find something lost, you find out where it isn't. Very Chestertonian.
I defer the fascinating technical commentary to my planned book - but I think you would be amazed to hear what happens when one applies the tools of DNA sequence analysis to the works of GKC. Among other things, you can find the longest sequence of words that appears in (for example) both his Orthodoxy and his The Everlasting Man. And here is the answer:
But the repetition in Nature seemed sometimes to be an excited repetition, like that of an angry schoolmaster saying the same thing over and over again. The grass seemed signalling to me with all its fingers at once; the crowded stars seemed bent upon being understood. The sun would make me see him if he rose a thousand times. The recurrences of the universe rose to the maddening rhythm of an incantation, and I began to see an idea.
[GKC Orthodoxy CW1:263, emphasis added]

They began to betray to the world the fact that they were walking in a circle and saying the same thing over and over again. Philosophy began to be a joke; it also began to be a bore. That unnatural simplification of everything into one system or another, which we have noted as the fault of the philosopher, revealed at once its finality and its futility.
[GKC The Everlasting Man CW2:292-3, emphasis added]
So if you think you are happy with your "search engines", see if you can find such curiosities for yourself - but don't hold your breath. One needs to know a bit more automata theory if one wants to aid molecular biologists - or curious Chestertonians.

And if you think you know how to accomplish such things, you might try finding the longest repeated series of words between his St. Francis of Assisi and his St. Thomas Aquinas. (I'd ask you to find the longest repeated series in The Everlasting Man but I think I have already posted that answer.)
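
For anyone who wants to attempt the puzzle, here is a minimal sketch in Python of one way to find the longest run of words shared by two texts. It is not the automata-based approach alluded to above, just a plain dynamic-programming table over words. The function names `words` and `longest_common_word_run` are invented for illustration, and the sketch assumes that case and punctuation should be ignored.

```python
import re


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def longest_common_word_run(text_a, text_b):
    """Return the longest run of consecutive words found in both texts.

    Classic longest-common-substring table, applied to words rather than
    characters. Only the previous row is kept, so memory grows with the
    length of text_b. If several runs tie, the first one found is kept.
    """
    a, b = words(text_a), words(text_b)
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous word of each text.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


if __name__ == "__main__":
    orthodoxy = ("like that of an angry schoolmaster saying the same thing "
                 "over and over again. The grass seemed")
    everlasting = ("walking in a circle and saying the same thing "
                   "over and over again. Philosophy began")
    print(" ".join(longest_common_word_run(orthodoxy, everlasting)))
```

Run on the two quoted sentences, this prints "saying the same thing over and over again". The running time is proportional to the product of the two word counts, so it is fine for paragraphs but likely very slow for two whole books; that is where suffix-based structures and a bit of automata theory earn their keep.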