Tuesday, December 25, 2007

Strong verbal relationships?

My wife and I just conducted a thought experiment. If two bodies of text share words, how could one begin to measure their relationship?

We used a pair of three-line poems:
Little Jack Horner
Sat in a corner
Eating his Christmas Pie

Little Miss Muffet
Sat on a tuffet
Eating her Curds and Whey

On inspection, these share three words in three lines, not three words in six lines. As a result, I think I must modify my match algorithm to divide by the greater of the two verse counts rather than by their sum. I think I will also distinguish samech from sin by using w for sin. Previously recorded scores will therefore change, and can be read as x matches per verse.
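To make the change of divisor concrete, here is a minimal sketch in Python. It is not my actual match algorithm, only an illustration under some assumptions: each line of a poem stands in for a verse, words are compared case-insensitively, and the article "a" is ignored (both poems contain it, and counting it would give four shared words instead of three). The names shared_words, match_score, and IGNORED are made up for this example.

```python
# Hypothetical sketch: score two texts by shared words per verse.
# Assumption: the article "a" is not counted as a match.
IGNORED = {"a"}


def shared_words(text_a, text_b):
    """Return the set of distinct words that occur in both texts."""
    words_a = {w.lower() for line in text_a for w in line.split()}
    words_b = {w.lower() for line in text_b for w in line.split()}
    return (words_a & words_b) - IGNORED


def match_score(text_a, text_b, by_sum=False):
    """Shared words divided by a verse count.

    by_sum=False divides by the greater of the two verse counts;
    by_sum=True divides by their sum (the earlier approach).
    """
    matches = len(shared_words(text_a, text_b))
    if by_sum:
        divisor = len(text_a) + len(text_b)
    else:
        divisor = max(len(text_a), len(text_b))
    return matches / divisor


horner = ["Little Jack Horner", "Sat in a corner", "Eating his Christmas Pie"]
muffet = ["Little Miss Muffet", "Sat on a tuffet", "Eating her Curds and Whey"]

print(sorted(shared_words(horner, muffet)))       # ['eating', 'little', 'sat']
print(match_score(horner, muffet))                # 1.0 (3 matches / 3 verses)
print(match_score(horner, muffet, by_sum=True))   # 0.5 (3 matches / 6 verses)
```

Dividing by the greater count gives 1.0, which reads naturally as one match per verse, whereas dividing by the sum halves the score even though every line has a counterpart.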

