David J. Hand, the author of the book The Improbability Principle, has a list of laws that make up his principle:
- The Law of Inevitability
- The Law of Truly Large Numbers
- The Law of Selection
- The Law of the Probability Lever
- The Law of Near Enough
In the book he describes events that seem like they should never happen and shows how those laws make those events not particularly surprising.
The Odds of Finding Something
This week I’ve been tracing immigrants back to Sweden. That means trying to find information in American records that will be useful in matching them to Swedish records, and that has put the Law of Near Enough on my mind. Put simply, that law means that the odds of finding something increase as you loosen your criteria for what you consider to be a match. It is easy to see how that could contribute to our sense that a coincidence has occurred. If we think that two things happened at the same moment, we might find it interesting. We might still find it interesting if they occurred during the same hour. What if they happened during the same day? The same month? Somewhere, as we widen the span of time, those occurrences go from interesting to boring, provided we stop and consider how wide the span has become. If we don’t think about it, we can be fooled into thinking that something is wildly unlikely when it is actually rather probable.
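To put rough numbers on that widening span of time, here is a minimal sketch. It assumes, purely for illustration, that two events each happen at an independent, uniformly random moment within one year; real events are rarely that tidy, and the function name is my own invention. Under that assumption, the chance that the two fall within a given window of each other is 1 - (1 - w/T)^2, where w is the window and T is the length of the year.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def chance_within(window_hours: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Chance that two independent, uniformly random moments in a period
    fall within `window_hours` of each other (illustrative model only)."""
    if window_hours >= period_hours:
        return 1.0
    fraction = window_hours / period_hours
    return 1.0 - (1.0 - fraction) ** 2


# Widen the window step by step and watch the "coincidence" become ordinary.
for label, hours in [("an hour", 1), ("a day", 24), ("a month", 30 * 24)]:
    print(f"within {label} of each other: {chance_within(hours):.4f}")
```

The exact figures matter less than the trend: every time the window is loosened, the match becomes more likely and therefore less remarkable.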
Often we don’t even think about the fact that criteria exist but, like the time span above, they do, and they affect our research. In genealogy we are always trying to determine whether two records correspond to the same person. What criteria do we use? Consider the opposite of the Law of Near Enough: we can restrict our criteria until we will never match any two records with each other. How about an obituary that claims that a person died at 4am and a death certificate that matches the obituary in every detail but one: it records the time of death as 3:56am? Clearly different people, right? Wrong. Our allowable difference between the times is ridiculously small.
Where do we draw the line? If a document records a man’s age as 50 and the person we are looking for would have been 51 at the time, is that “near enough”? Now we need to start answering questions. How common was his name? How close was the record made to the expected place (another “near enough” question)? How likely is it that the man’s age was rounded down? How likely is a misunderstanding? Did anyone have a reason to lie? Might the person who made the statement not have known any better? Might the informant have given an accurate account of their uncertainty, only to have the clerk write down something exact? Could “about 50” have been said to a clerk who “simplified” the answer to “50”?
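The effect of moving that line can be seen in a small sketch. Everything in it is hypothetical: the records are invented for illustration, and the `Record` type and `candidates` function are not from any genealogy program. It only shows how the list of possible matches grows as the allowed age difference is loosened.

```python
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    age: int


def candidates(records, name, expected_age, tolerance):
    """Return records with the same name whose age is within `tolerance` years."""
    return [
        r for r in records
        if r.name == name and abs(r.age - expected_age) <= tolerance
    ]


# Invented example data, not taken from any real source.
records = [
    Record("Anders Olsson", 50),
    Record("Anders Olsson", 47),
    Record("Anders Olsson", 55),
    Record("Karl Olsson", 51),
]

# We expect our man to be 51. Loosen the age criterion and count the matches.
for tolerance in (0, 1, 5):
    found = candidates(records, "Anders Olsson", 51, tolerance)
    print(f"tolerance of {tolerance} years: {len(found)} candidate(s)")
```

A tolerance of zero finds nobody, and a tolerance of five finds everybody with the right name. Neither extreme answers the questions above; they only show why the questions have to be asked.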
An Old Story
An apparently old story, which I have to admit I had never heard before, involves a traveler going down the road past a barn. The side of the barn was peppered with arrows, each sitting dead center in a target. The traveler concluded that a master archer lived there, until turning the corner to discover a man carefully painting targets around each one of a set of arrows. As genealogists, we are stuck (so to speak) with the arrows that were shot into the barn generations ago, but we do need to think about the targets that we paint around them. Draw targets that are too small and no two arrows will sit in the same target. Draw targets that are too large and arrows that have nothing to do with each other will be in the same bullseye.
So how big do we draw the targets around our documents? How far out can a document be and still be a match for another document? Make the target large enough and we will find something, but that something will very likely be a false positive: something we consider “near enough” when it really isn’t. When we thought a moment ago about how common a person’s name was, we were really worrying about false positives. A woman with a very unusual name who was within a few years of the correct age and found in almost the right place is much less likely to be a false positive than is a man named James Smith in otherwise identical circumstances.
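A back-of-the-envelope sketch shows why the common name is so much riskier. It assumes, only for illustration, that names and ages are independent and that ages are spread evenly over an 80-year span. The population size and name frequencies below are made up, and the function name is hypothetical.

```python
def expected_false_matches(population, name_share, age_tolerance_years,
                           age_span_years=80):
    """Rough expected number of unrelated people who would still fit the target.

    Assumes names and ages are independent and ages are evenly spread
    over `age_span_years` (a simplification, not a demographic fact).
    """
    # A tolerance of t years accepts 2t + 1 distinct ages.
    window = min(2 * age_tolerance_years + 1, age_span_years)
    return population * name_share * window / age_span_years


# Made-up figures: a county of 20,000 people, one name held by 1 person
# in 100 and another held by 1 person in 100,000.
for label, share in [("common name", 0.01), ("rare name", 0.00001)]:
    for tolerance in (1, 5):
        n = expected_false_matches(20_000, share, tolerance)
        print(f"{label}, age within {tolerance} years: about {n:.3f} strangers fit")
```

Under these assumptions the larger target multiplies the number of strangers who fit it, and the common name multiplies it again.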
Then there is the question of confirmation. The obituary and the death certificate that differ by 4 minutes in the reported time of death don’t need confirmation in order to conclude that, with everything else matching, they record the same person. On the other hand, if we don’t find a match and expand our target, what kind of confirmation will we need to really show that the bigger target was justified? Expanding our target often means needing to find more arrows that hit it.