CHAPTER I-5
THE ROLE OF JUDGMENT IN STATISTICAL INFERENCE
The role of judgment is a key issue in the philosophy of
statistical inference. It also is a major source of controversy.
The issue of judgment appears in many contexts, and arises
even before we get to statistical inference. For example, to
produce a sensible forecast of any kind, continuity (sameness)
must always be combined with other knowledge. Whether one pre-
dicts a firm's sales next year, or the price of copper, or the
trend in the murder rate, or whether the fish will be biting
tomorrow, one must ask and answer such questions as: How far
away in time or space (or other dimension) will one seek and
employ data? How will one weight the relative importance of
various classes of data (a more general version of the previous
question)? Which other variables will be brought into the
discussion? The previous questions boil down to: How should one
apply the principle of sameness in a practical situation?
It is all very well to notice, as you are driving on a New
Hampshire road, that car after car has New Hampshire license
plates, and it is natural to forecast that upcoming cars will
have New Hampshire plates, too. But if you do not recognize and
take into account that you are approaching the Massachusetts
border, you may be in for a rude shock; sameness by itself is not
enough.
It seldom is wise - though we all may sometimes do so -
to attempt to create a system of prediction that is automatic and
objective in the sense that it disregards or does not seek out
such "outside" information.
An example of disregarding information, which arises in the
context of discussions of prior assumptions, is the assumption
that it is reasonable to implicitly assign equal probabilities of
.5 to two hypotheses in the absence of any solid knowledge. This
"principle of indifference" (better called "the principle of
agnosticism") implies either a) that there is no other relevant
knowledge, which is most unusual, or b) that one is ignoring other
knowledge, which is stupid in most cases (such as refusing to
consider whether an earthquake affected the meter reading by
pretending that there was no earthquake), though ignoring
information is sound practice in selected situations, as
discussed below. <1>
The principle of indifference is little more than a
preference for round or symmetric numbers. One can find few
realistic examples where one truly is agnostic about two
competing hypotheses.
It may be useful and appropriate to act as if one is agnos-
tic about, and indifferent toward, two hypotheses or outcomes.
For example, when comparing scientific hypotheses, there would be
no point in doing an experiment if one already is sure that one
knows the answer, or one expects others to already agree about
the answer, and it therefore may be good scientific strategy to
proceed as if one is agnostic and indifferent.<2> The same may
be said about comparing (say) two candidates for a job, such as
place-kicking specialists in football, where it is sound tactics
to have them start off apparently equal even if one has some
preconceived impressions. It is also sound to proceed as if one
is agnostic about hypotheses concerning the opponent's behavior
when the opponent can be expected to game you and be good at it -
as when you are playing a skilled poker opponent (rely here on
this one-time poker player who reported considerable winnings on
his income tax return the only year he played seriously), but
this has nothing to do with inferring knowledge.
Yes, I remember an incident when our family was in a desert
national park - Chaco Canyon - and rain threatened not long
before we planned to leave. We had absolutely no idea about the
probabilities that rain would make the dirt road impassable, or
how long the road might remain impassable, or the chances of
various amounts of rainfall. In a situation like this one you
feel as if you are absolutely bereft of relevant information and
knowledge that will help you reach a decision. But a case like
this one is most unusual.
The issue of judgment also arises in the decision about
whether to treat an astronomical or psychological observation as
an outlier; this interesting problem is one of the earliest roots
of statistical inference. One might think to simply check the
likelihood that a Normal distribution fitted to all the other
observations would produce such an outlier. But if someone tells
you that the telescope was disturbed by an earthquake at the time
of the outlying observation, you have no hesitation in immediate-
ly throwing out the observation (though you may prudently want to
study the effect of an earthquake by way of this observation).
Even the strictest non-Bayesian would act in this fashion, and
approve of it.
The important question with respect to the use of judgment
and outside information is not whether to rein in judgment, but
rather how to expand the domain of knowledge-getting that does
not rely only on judgment. The benefits of objectivization are
that a) it increases your ability to communicate your knowledge
to other persons, and b) it reduces the burden on judgment, which
can well be faulty. This is like replacing a skin-and-eyeball
judgment of outside temperature with a meter which is still not
perfect and which must be read by eye; there is less scope for
judgment with the meter even though it cannot be removed
completely.
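As a hedged illustration of the mechanical part of the outlier
check described above, the sketch below fits a Normal distribution
to the other observations and computes the two-sided probability
of a reading at least as far from the mean as the suspect one.
The readings, the function name tail_probability, and the use of
a plain Normal fit (rather than, say, a t distribution) are
assumptions made only for this example.

```python
from statistics import NormalDist, mean, stdev

def tail_probability(others, suspect):
    """Two-sided chance that a Normal fitted to `others` yields a value
    at least as far from its mean as `suspect` (illustrative helper)."""
    fitted = NormalDist(mean(others), stdev(others))
    lower = fitted.cdf(suspect)
    return 2 * min(lower, 1 - lower)

# Hypothetical readings, invented for illustration only.
others = [10.1, 9.9, 10.0, 10.2, 9.8]
print(tail_probability(others, 10.4))
```

Nothing in this calculation knows whether an earthquake occurred;
that knowledge, and the decision about what to do with the
resulting number, must come from outside the computation.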
Consider as an example the control room for the process of
separating oil from gas from water at the Prudhoe Bay oilfield in
Alaska. There are lots of complex instruments being used, plus
an intricate computer program that reads and compares readings to
target values. Additionally, human operators make adjustments to
the process based on their experience with the process. There
still is a subjective element. But there is much less scope for
subjective operator adjustment than in Russian plants that
operate without these instruments and computer programs.
The enormous virtue of random sampling is that analysis
which relies upon it requires less judgment than does analysis
where one cannot rely on the representativeness of the samples.
Yet this is a snare and a delusion if one thinks - as some
"objectivists" do - that one can get by completely with random
sampling and without any additional information or judgments.
THE INEVITABLE SUBJECTIVITY OF KNOWLEDGE
To understand statistics one must understand that
statistical inference can never be reduced to a set of rules
which can be routinely applied according to objective criteria of
particular problems. It has always been the dream of
mathematically-minded persons to reduce all statistical practices
to routines. But as Michael Polanyi (following Kant) makes very
clear, even classifications of all kinds can never be reduced to
rules because each circumstance is necessarily at least a bit
different from others.
... even a writer like Kant, so powerfully bent on
strictly determining the rules of pure reason,
occasionally admitted that into all acts of judgment
there enters, and must enter, a personal decision which
cannot be accounted for by any rules. Kant says that
no system of rules can prescribe the procedure by which
the rules themselves are to be applied. There is an
ultimate agency which, unfettered by any explicit
rules, decides on the subsumption of a particular
instance under any general rule or a general concept.
And of this agency Kant says only that it `is what
constitutes our so-called mother-wit'. (Critique of
Pure Reason, A.133.) Indeed, at another point he
declares that this faculty, indispensable to the
exercise of any judgment, is quite inscrutable. He
says that the way our intelligence forms and applies
the schema of a class to particulars `is a skill so
deeply hidden in the human soul that we shall hardly
guess the secret trick that Nature here employs'.
(Critique of Pure Reason, A.141.) (Page 105)
It is true that in certain cases you can apply a
statistical analysis to decide between regularity and
randomness. This method does work by strict
mathematical rules. But actually its application
depends both at the start and at its conclusion on
decisions that cannot be prescribed by strict rules.
We must start off by suggesting some regularity which
the deviations seem to possess -- for example, that
they are all in one direction or that they show a
definite periodicity -- and there exist no rules for
reasonably picking out such regularities. When a
suspected pattern has been fixed upon, we can compute
the chances that it might have arisen accidentally, and
this will yield a numerical value (for example 1 in 10
or 1 in 100) for the probability that the pattern was
formed by mere chance and is therefore illusory. But
having got this result, we have still to make up our
minds informally whether the numerical value of the
probability that the suspected regularity was formed by
chance warrants us in accepting it as real or else in
rejecting it as accidental.
Admittedly, rules for setting a limit to the
improbability of the chances which a scientist might
properly assume to have occurred have been widely
accepted among scientists. But these rules have no
other foundation than a vague feeling for what a
scientist may regard as unreasonable chances. (1969,
pp. 107-8)
Mathematics only inserts a formalized link in a
procedure which starts with the intuitive surmise of a
significant shape, and ends with an equally informal
decision to reject or accept it as truly significant by
considering the computed numerical probability of its
being accidental. (p. 108)
The foregoing quotation implies that judgment is required about
which "model" to use in a particular circumstance.
The greatest statisticians recognized the need for, and
inevitability of, the exercise of judgment, though followers
often thought differently.
The need for personal judgment -- for Fisher in the
choice of model and test statistic; for Neyman and
Pearson in the choice of a class of hypotheses and a
rejection region; for the Bayesians in the choice of a
prior probability -- as well as the existence of
alternative statistical conceptions, were [recognized
by these writers but were] ignored by most textbooks...
(Gigerenzer et al., 1989, pp. 106, 107)
The non-Bayesian statistician David Freedman writes:
When drawing inferences from data, even the most hard-
bitten objectivist usually has to introduce assumptions
and use prior information. The serious question is how
to integrate that information into the inferential
process and how to test the assumptions underlying the
analysis (quoted by Zellner, in Eatwell, John, Murray
Milgate, and Peter Newman, editors, The New Palgrave -
A Dictionary of Economics (Volume 1, A to D), (New
York: The Stockton Press, 1987), page 217).
Another non-Bayesian, John Tukey:
It is my impression that rather generally, not just in
econometrics, it is considered decent to use judgment
in choosing a functional form, but indecent to use
judgment in choosing a coefficient. If judgment about
important things is quite all right, why should it not
be used for less important ones as well? Perhaps the
real purpose of Bayesian techniques is to let us do the
indecent thing while modestly concealed behind a formal
apparatus. If so, this would not be a precedent. When
Fisher introduced the formalities of the analysis of
variance in the early 1920s, its most important
function was to conceal the fact that the data was
being adjusted for block means, an important step
forward which if openly visible would have been
considered by too many wiseacres of the time to be
"cooking the data." If so, let us hope that day will
soon come when the role of decent concealment can be
freely admitted....The coefficient may be better
estimated from one source or another, or, even best,
estimated by economic judgment...
It seems to me a breach of the statistician's
trust not to use judgment when that appears to be
better than using data (986, [1978], quoted by Zellner, in
Eatwell, John, Murray Milgate, and Peter Newman,
editors, The New Palgrave - A Dictionary of Economics
(Volume 1, A to D), (New York: The Stockton Press,
1987), p. 217).
This issue is not limited to the field of statistics. In
economics, any freshman can learn the mathematics of the models
of monopoly and competition. But only a master economist knows
which one to apply in a government anti-trust case, and even
famous economists whose theoretical and research skills are great
often lack good judgment in making this decision according to the
particulars of the situation being discussed.
The 20th Century has seen the success of two "impossibility"
ideas about human knowledge: 1) There cannot be complete
knowledge of any system; and 2) the point of view of the observer
cannot be omitted from the system. The two ideas can be seen as
a single idea. Both will now be discussed. I greatly hope that
the reader does not see this discussion as just one more
pretentious display of irrelevant but imposing ideas from afar
that may be found not infrequently in all types of writings.
1) There cannot be complete knowledge of any system. The
idea that we cannot ever have complete knowledge of any system
has a base in logic, deriving from Gödel's incompleteness theorem, which teaches
us that some statements must always be undecidable. This should
not be seen as a note of despair, however. As Nagel and Newman
say,
The discovery that there are formally indemonstrable
arithmetic truths does not mean that there are truths
which are forever incapable of being known, or that a
mystic intuition must replace cogent proof. It does
mean that the resources of the human intellect have not
been, and cannot be, fully formalized (1956, p. 1695).
This deductive proposition has a parallel in empirical
science. Heisenberg's Uncertainty Principle makes the point for
the physics of the small. But even for macro systems scientists
have long known that there must always be measurement error,
though we may reduce it in size with time and work; astronomy is
an important example. One can also see in a famous homely
example the necessary inexactness of measurement, and the
inevitable dependence of the outcome on the observer's decisions
and actions - that is, it depends on how you view the phenomenon.
What is the length of the coastline of England? The more
detailed the map you use, the more unevenness it shows, and
therefore the longer the line drawn around the coast. And this
will continue infinitely, until you are tracing around each grain
of sand on each beach, getting a longer and longer coastline with
each increase in detail of the map. (This ties in with the
concept of chance discussed in Chapter 00, and the concept of the
Normal distribution discussed in Chapter 00).
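The coastline example can be imitated with an idealized curve.
The sketch below is only an analogy, under the assumption that a
Koch curve stands in for the coast: each level of refinement
plays the role of a more detailed map, and the function names
koch_points and polyline_length are invented for this
illustration.

```python
import cmath

def koch_points(level):
    """Vertices of a Koch curve on the unit interval after `level` refinements."""
    pts = [0j, 1 + 0j]
    turn = cmath.exp(1j * cmath.pi / 3)  # rotate by 60 degrees
    for _ in range(level):
        refined = [pts[0]]
        for a, b in zip(pts, pts[1:]):
            d = (b - a) / 3
            # Replace each straight segment by four shorter ones.
            refined += [a + d, a + d + d * turn, a + 2 * d, b]
        pts = refined
    return pts

def polyline_length(pts):
    """Total length of the line drawn through the vertices in order."""
    return sum(abs(b - a) for a, b in zip(pts, pts[1:]))

for level in range(6):
    print(level, polyline_length(koch_points(level)))
```

In this idealized curve each refinement multiplies the measured
length by 4/3, so the "length" depends on the level of detail the
observer chooses; a real coast need not behave so regularly.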
Hayek's views of the impossibility of certain kinds of
knowledge of human systems (1967, Chapter 2) can be seen as similar
to Heisenberg's idea.
Still another reason for believing that there cannot be
complete knowledge of any system is that any system we study is,
in principle and in reality, embedded in a larger system, and it
is impossible even in principle to have full knowledge of the
largest encompassing system (Ekeland, 1988). This means that our
knowledge of any subsystem must be at best an approximation.
2) The point of view of the observer cannot be omitted from
the system. This is the central point of Einstein's Theory of
Special Relativity - that time is what you read on a clock, and
not a matter of "properties". To avoid thinking in terms of
properties is important in statistics in many places, as we have
already seen in the discussion of the operational definition of
probability. (The concept of operational definition is a
generalization of what Einstein did with Special Relativity.)
Physicists may be impressed that Eugene Wigner was
especially emphatic about the inevitability of human
consciousness in the process of doing physics. "[P]hysicists
have found it impossible to give a satisfactory description of
atomic phenomena without reference to the consciousness...the
consciousness evidently plays an indispensable role...[T]he laws
of quantum mechanics itself cannot be formulated, with all their
implications, without recourse to the concept of consciousness"
(1979, p. 202).
Wigner quotes with approval John von Neumann (1958) saying,
"The conception of objective reality...has thus evaporated
...into the transparent clarity of a mathematics that represents
no longer the behavior of elementary particles but rather our
knowledge of this behavior" (Wigner, 1979, p. 202). And he
quotes Heisenberg as saying, "The laws of nature which we formulate
mathematically in quantum theory deal no longer with the
particles themselves but with our knowledge of the elementary
particles" (Wigner, 1979, pp. 187-188).
Both the above ideas are involved with (or imply) the idea
that (following Kant, Einstein, Bohr) we create the scientific
models and equations that we use, rather than discovering them.
And different models are appropriate for different purposes;
there is no "real" model.
As Conant (1965, p. 14) says: "I even question such
statements as 'This table is really composed of empty space in
which are electrons and the nuclei of atoms'".
The general point here is that there can never be a single
organically-complete or logically-best method for any statistical
situation, let alone for all situations. Each method must have
loose ends, even in its most appropriate use.
(But the above paragraphs certainly do not imply that the
results of an inquiry depend only upon the observer's thoughts
and procedures. There is no ground for support here for the
"Idealist" view of Berkeley and others that it is all in our own
minds. If it were so, the betting odds should be the same for
all sports teams and racehorses, and you would not dress any
differently in the winter than in the summer. But the fact that
everyone does prepare differently for different conditions, and
demands different odds for betting on one team or horse than
another, demonstrates that no one believes that our ideas about
the world are unaffected by something "out there". (It is
sometimes difficult even for Idealist philosophers themselves to
believe that they mean what they say. But then, it is always
difficult even for philosophers, let alone laypersons, to take
seriously many of the ideas that other philosophers have taken
seriously over the centuries.))
The Necessity of Making Some Assumptions About the Population
It is quite impossible to conduct a statistical analysis
without making some arbitrary assumptions. Most fundamental are
the assumptions about the nature of the universe from which the
sample is drawn, or might have been drawn. Objectivists
sometimes argue against the assumptions that underlie resampling
distributions (see later chapters), but at the same time they
themselves assume in many cases that their samples have been
drawn from Normally distributed populations when they have no
immediate evidence that this is so; instead, they rely upon a
large body of experience, and implicitly identify the situation
at hand with some part of that body of experience. One can argue
that this is less arbitrary than Bayesian or resampling
distributions, but one cannot argue that it has no arbitrary
element. Certainly there is something to be said for less
arbitrary assumptions, which can be said to be more objective.
But no assumptions can be said to be perfectly objective, and
once that is admitted, the pristine purity of the objectivist
view would seem to be at risk.
The Specificity of Decisions to Individuals and Institutions
Another reason why pure objectivism is untenable is the
inherent unavoidable opposition and tension between general
acceptability on the one hand, and on the other hand the aim of
throwing light on how a specific individual or institution can
best make a decision. We measure objectivity mainly by the extent
to which a statement - or better, the process that produces the
statement - compels agreement among reasonable people, and that
can only happen when the statement and process are "general".
But a particular decision, and the process which leads to a
decision, is almost entirely specific to a particular situation,
and usually specific to a particular person, and therefore it
must be affected by particular knowledge and by particular costs
and benefits.
Judgment enters into all acquisition of knowledge, of
course, and not just statistical inference. The decision about
whether two dogs should be considered - or better, treated - as
similar or different depends largely upon one's purposes.
To get beyond controversy between the objectivist and
subjectivist camps, we must find some ways to go beyond the
simple opposition of these two goals. I suggest doing so by
recognizing both aspects explicitly - separating out the part of
the process that can be made reasonably objective, and then
suggesting how that element can usefully be embedded in the
larger process. To turn around the Caesar-God connection, I
suggest we take from science and mathematics what they can give
us, and take from the less objective personal matrix the facts
and considerations which we personally consider relevant.
What I seek to avoid is the subjectivists rejecting the
objectivist calculations as hopelessly inadequate, and the
objectivists rejecting the subjectivist conclusions as hopelessly
uncheckable and dangerous. It would be better if the argument
would move beyond each group saying to the other, "You can't do
that".
To be a bit more specific about concepts before getting down
to case examples: The objective process referred to here is the
computation of probabilities based on conventional rules for
estimators (including confidence limits) and for significance
tests. The personal considerations include an individual's
perceived costs and benefits of the possible outcomes of the
various alternatives that might be chosen in light of the
objective statistical calculations; the goals that the
individual seeks to attain for the society and for him/herself;
the person's desire to get to the beach soon; and many, many
other intangible and unclassifiable
influences. The personal knowledge includes the individual's
stock of information about the competence and integrity of per-
sons involved in the objective work; information about the spe-
cific case at hand - the particular patient who might be operated
upon, or the specific ship whose passage might be insured; and
much, much else.
The Bayesian statisticians try to objectify the subjective
process somewhat, largely by making explicit and therefore
objective as much as possible of the information and belief that
enters into the process of inference. And certainly Bayes'
formula, using such material as poll data in social science, or
data on patient outcomes in medicine, is quite objective. But
some material cannot easily be made objective and quantitative,
e.g., estimates of another researcher's character, or one's
reasons for choosing to work on this piece of research rather
than another. And there are (as we have seen earlier in the
chapter) some situations in which it best serves our purposes to
begin a piece of inference with as clean a slate as possible so
as to avoid the self-defeating result of only arriving at a
result that we have dictated in advance by including strong prior
beliefs in our assessment.
The entire issue echoes arguments about "value free" social
science. Science cannot be wholly free of influence from the
researcher's values. But this is not a warrant for us to cease
struggling to reduce the role of values as much as possible in
producing and communicating data and conclusions.
Perhaps an analogy may help frame the issues. Imagine that
about 1870 you and another person are in Dodge City, Kansas, and
you inquire about travel time on the stagecoach to Carson. The
station manager tells you that the average time over the past
forty trips was 12 hours, but with considerable variation caused
by misadventures with which he regales you. You compute a
standard deviation of 24 hours from his data - the data that is
right there in his log book in front of you - and the other
prospective patron (an early statistician) agrees with your
computation and interpretation. Now what?
Let's consider some of the decisions you might make partly
on the basis of this information. Let's first discuss your
decision about whether to take the stage that is leaving in one
hour. What is the purpose of the trip - to deliver a legal paper
that must reach Carson within the next 48 hours, or to visit an
old buddy, or to catch a poker game scheduled for tomorrow? What
would be the costs, and the expected travel time, if instead of
stage travel you hire a horse and a renowned Indian-fighter guide
to take you to Carson? What do you know about who will drive the
stage tomorrow - will it be Whisky Jack, whom you know as an
untrustworthy bum who may not even show up for the run, or Parson
Willy, who has the best on-time record in the industry? You've
lived in Carson for years, and you have a pretty good line on
such matters. And how friendly are the Indians these days - how
likely is an attack upon the stage? Are there soldiers around
now?
The other traveler's situation is very different from yours.
He works for the federal government, and is interested in
upgrading the transportation system. And his knowledge is very
different - he just arrived from Washington last week.
Even though your situations are very different, the two of
you might consider a friendly wager on the travel time of today's
run, and also on the mean travel time of runs over the next
month. I contend that both of you might wisely begin with the
same calculations, using conventional inferential statistics. It
might be that your personal knowledge - who is driving tomorrow,
and rumors about Indian activity - might so dominate your
thinking that you would pay little attention to the data and
calculations - as apparently has been the case with ship
insurance right up to the present (which is astonishing); yet I
still contend that conventional calculations are a good place for
you to start your estimate about the odds at which you will take
a bet from either direction (that is, whether you bet that the
trip would take either more, or less, time than the criterion in
the bet). On the other hand, a bettor who disregarded the local
knowledge would be likely to do less well in the long run (though
maybe not).
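The "conventional calculations" with which both travelers might
begin can be sketched roughly as follows, using the figures from
the log book (mean 12 hours, standard deviation 24 hours, forty
trips). The sketch assumes that the forty trips can be treated as
a random sample and that a Normal approximation with the usual
1.96 multiplier is acceptable for the mean; both are judgments,
and the function name mean_interval is invented for this example.

```python
import math

def mean_interval(sample_mean, sample_sd, n, z=1.96):
    """Standard error of the mean and a rough Normal-theory interval."""
    se = sample_sd / math.sqrt(n)
    return sample_mean - z * se, sample_mean + z * se, se

# Figures taken from the stagecoach example in the text.
low, high, se = mean_interval(12.0, 24.0, 40)
print(f"standard error of the mean = {se:.2f} hours")
print(f"rough 95% interval for the mean = {low:.2f} to {high:.2f} hours")
```

A standard deviation twice the size of the mean, for a quantity
that cannot be negative, signals a heavily skewed distribution of
trip times, so a Normal approximation for a single day's run
would be poor. Deciding what to do about that is again a matter
of judgment rather than of calculation.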
The interplay of objective and subjective elements may
emerge more sharply from these examples.
1. Every statistician knows that "the coin has no memory",
and will laugh at the bettor who pays attention to the fact that
the last five tosses came up heads. But after fifty heads in a
row, even the statistician will check the coin, and keep an eye
open to see if the person pitching the coin is a clever magician.
That is, the data at hand never constitute a perfectly closed
system - except for fools. On the other hand, there are great
dangers in the Bayesian method which comes directly to grips with
this problem by making explicit judgments.
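A minimal sketch of how such an explicit judgment might enter the
arithmetic: suppose, purely as an assumption for illustration,
that before any tosses one gives a two-headed coin a prior
probability of one in a million, and that a fair coin is the only
alternative. The function name posterior_two_headed and the prior
are hypothetical.

```python
def posterior_two_headed(n_heads, prior=1e-6):
    """Posterior probability of a two-headed coin after n_heads heads in a row,
    when the only alternative considered is a fair coin."""
    like_trick = 1.0            # a two-headed coin always shows heads
    like_fair = 0.5 ** n_heads  # a fair coin shows n heads with chance (1/2)^n
    numerator = prior * like_trick
    return numerator / (numerator + (1 - prior) * like_fair)

for n in (5, 50):
    print(n, 0.5 ** n, posterior_two_headed(n))
```

Under these assumptions five heads in a row leave the trick coin
very improbable, while fifty heads make it nearly certain. The
result depends on the chosen prior, which is exactly where the
judgment (and the danger) lies.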
It is also relevant that when presented with a sample of 9
heads out of ten tosses, the statistician does not compute a
population mean and standard deviation as s/he would if the data
referred to average temperatures in August rather than July.
Outside knowledge - call it "theory", if you will - always is
relevant if available. And people almost never start off with a
perfectly blank slate or with equal beliefs about all the
possibilities - even about whether Classic Coke or the new Coke
will be preferred by more people.
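The "theory" a statistician would bring to the coin sample is
typically the binomial model rather than a mean and standard
deviation. As a hedged sketch, assuming independent tosses of a
fair coin, the chance of nine or more heads in ten tosses can be
computed directly; the function name prob_at_least is invented
for this example.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Binomial probability of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

print(prob_at_least(9, 10))
```

For a fair coin this is 11/1024, or about 0.011. Whether that
figure should lead one to doubt the coin still depends on outside
knowledge about coins and about the person tossing them.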
The case of baseball batters and basketball shooters is
quite the same as the situation with the coins.
ON BASIC ASSUMPTIONS IN RELATING SAMPLES TO UNIVERSES
The issues discussed in this section take us to the border
between statistics and research methods, and into areas where
judgment is needed.
As discussed above, purpose must enter into all the many
judgments that are made in the course of statistical inference.
Rather than saying "one can" or "one cannot" make certain
inferences, I suggest that we say "it is reasonable to regard
...", and so on. This takes the discussion out of the realm of
logic and into the realm of soundness of judgment.
THE TENSION BETWEEN OPEN-SYSTEM AND CLOSED-SYSTEM THINKING
The need for the making of judgments as part of inference
directly opposes the strongest intellectual need and desire of
most people who do and teach statistics - the felt need to treat
the situation under discussion as logically closed. Without
closure, much of formal mathematics is not possible. And if
one's interest and profession is doing formal mathematics, it is
not surprising that one wishes to view systems as closed.
It is the assumption that systems may usually be regarded as
closed that Gödel threatened so dangerously, and this perhaps
explains why he brought such fear and trembling even to those on
whose work he did not directly impinge.
Indeed, the very measures of virtue in mathematics - and
indeed, in the rest of academia - relate to skill in manipulating
closed systems. These adjectives of virtue are: Rigorous.
Elegant. Sophisticated. These are the attributes of neat,
clever, and "beautiful" work.
Chapter 00 presents John Barrow's lovely scenario about
Martian mathematicians. Earthly mathematicians admire that which
is esthetic - rigorous, elegant, and sophisticated. It is this
that they consider makes a great work of the mind, and they
themselves make a living by being brilliant in these ways. In
contrast, engineers, businesspeople, policymakers (and I, I
confess) admire that which is helpful, workable, usable, and
useful - that is, pragmatic. Once again we notice the split that
was remarked as early as Roman times in comparing the great
orators - between those who seek to have the crowd say, "How well
he speaks", and those who seek to have the crowd say, "Let us
march".
The Desire for "Justification"
Many of us, especially those with a mathematical bent, have
a strong psychological need for "justification" - that is, to
feel a solid axiomatic structure underneath one's beliefs. But
it may well be that there is little or no logical need for such a
structure. Often one can simply start out with propositions at
the level of the lowest empirical work one wants to do; for
example, Milton Friedman suggests that microeconomics can begin
with empirical supply and demand curves, and can dispense with
all of the underlying theory of consumer behavior. Or, one can
choose axioms in a pragmatic way to fit the particular needs of
one's work, without being concerned that they are the "best"
foundation for the entire science. Alfred Whitehead came close
to suggesting that point of view for the most basic axioms for
all knowledge - but still he felt the need for some small set of
most basic propositions.
The point of view offered here fits with a vision of
knowledge-seeking as a group of people in the nighttime jungle
groping to find reference points to map the area. There will
never be an "ultimate" map, but merely improved ones. Each
person seeks for partial knowledge, not for the whole. Each
person can at best perceive the outlines of a fragment of the
whole. But unlike Rashomon, this does not imply that there
cannot be some version of the map or story that most people can
agree is mostly valid.
The beautiful axiomatic structure of mathematics seduces
people in many ways. Just one example here: Some believe that
because there is a function, there must be aspects of the world
that it describes - a natural structure like the mathematical
one. This fallacy is like the fallacy of believing that because
there is a word (say, unicorn), there must be a reality that it
represents. Environmental doomsters since before Malthus have
believed that because there is an exponential function, there
must be exponential growth of population. And in statistics the
existence of the extraordinarily beautiful Normal (from
"normalized") curve seduces many into expecting this to be a
frequently-observed distribution in nature - which it is not (see
Chapter 00).
The central point that the mathematical statisticians ought
to accept - but which few are prepared to accept - is that there
cannot be a logically-closed system of induction and of
statistical inference the way there can be a logically-complete
body of probability theory. Confidence intervals and tests of
hypotheses are blunt instruments that can validly point one in a
general direction - like knocking the rough outlines off a block
of granite for the sculptor of knowledge - but such devices can
never allow one to make precise statements about (say) the
probability of the location of a parameter, or the probability of
the existence of a particular difference between some groups.
It is to Ronald Fisher's credit that he emphasized that
science is a matter of approximation and judgment. As Gigerenzer
et al. said about him,
The choice of the test statistic, and of null
hypotheses worth testing, remains, for Fisher, an art,
and cannot be reduced to a mechanical process (1989,
p.**)
In Fisher's own words,
It is, I believe, nothing but an illusion to think that
this process can ever be reduced to a self-contained
mathematical theory of tests of significance.
Constructive imagination, together with much knowledge
based on experience of data of the same kind, must be
exercised before deciding on what hypotheses are worth
testing, and in what respects. Only when this
fundamental thinking has been accomplished can the
problem be given a mathematical form (Fisher, 1939, p.
6). (p. 95)
Yet Fisher was attracted to the exactness of mathematical
arguments, and used mathematics even when it was not necessary to
do so.
Jeffreys, too, emphasized the need to apply broad judgment
in the entire process of statistical inference (1961). But both
Jeffreys and Fisher included huge chunks of formal mathematics in
their writings, and both made major contributions to mathematical
statistics. Indeed, the mathematics was at the center of their
work, and the cautions about judgment were almost footnotes,
though important ones; this may explain why they and their work
were not rejected by their fellow statisticians. (Jeffreys was
only secondarily a statistician, and primarily a physicist.)
If deductive analysis of closed systems cannot be the whole
of statistical inference, but only a tool in the overall work,
and if one cannot rely on a set of first principles as a vantage
point from which to proceed when one is in doubt, how can we
expect to get knowledge? What other method is there?
Our most general method is successive approximations; if
anything deserves the name of the scientific method (see
discussion of the scientific method in Chapter 00), this is it.
We make a first guess at the dimension of a quantity or the
probability of an event, and we gradually improve our estimate
with successive work in statistical testing, gathering more
information, and connecting up to other bodies of knowledge.
This view of knowledge-getting fits with the viewpoint that we
operate in an open rather than a closed environment. It also
fits with the idea that scientific work can never be "value
free"; rather, our goal should be to reduce as much as possible
the influence of our values on the conclusions we reach, so that
others with different values can examine our methods and data
objectively, and hopefully agree on the validity of those
results.
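The idea of successive approximation can be sketched in a few
lines of Python. The sketch is only an illustration under stated
assumptions: the function name, the simulated event with a fixed
underlying probability, and the device of counting the first
guess as if it were two earlier observations are all hypothetical
choices made for the example, not a prescribed procedure.

```python
import random


def successive_estimates(true_p, batch_sizes, first_guess=0.5,
                         guess_weight=2, seed=0):
    """Return the running estimate of a probability after each batch.

    Assumption for illustration: the first guess is counted as if it
    were `guess_weight` earlier observations, so that accumulating
    data gradually outweighs it.
    """
    rng = random.Random(seed)           # seeded so the run is repeatable
    successes = first_guess * guess_weight
    trials = guess_weight
    estimates = [first_guess]           # the estimate before any data

    for n in batch_sizes:
        # Gather one more batch of information: n simulated trials.
        hits = sum(1 for _ in range(n) if rng.random() < true_p)
        successes += hits
        trials += n
        # Revise the estimate in the light of everything seen so far.
        estimates.append(successes / trials)

    return estimates


if __name__ == "__main__":
    # Each successive batch is larger, as when a pilot study is
    # followed by fuller investigations.
    for step, est in enumerate(successive_estimates(0.3, [10, 100, 1000])):
        print(step, round(est, 3))
```

The sketch leaves the judgment outside the code: what the first
guess should be, how heavily to weight it, and whether the later
batches are "the same kind" of data as the earlier ones are all
choices the investigator must make.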
CONCLUSIONS
I conclude about the use of judgment and outside information
that the proper issue is not whether to rein in judgment but
rather how to expand the domain of knowledge-getting that does
not rely only on judgment.
Given the inevitability of subjectivity and judgment in the
getting of knowledge (if one accepts that it is a good
description), one must have religious faith or be a mathematician
to believe that mathematics or anything else can be proven to be
exact knowledge. And to promise exactness to others is a fraud.
ENDNOTES
<1>: This is one of the few issues on which I differ from
Jeffreys (1961); see his pages 20 (?) and 401 (?)
<2>: See Appendix 00 for discussion of this issue in the context
of the interpretation of zero correlations.