In the words of Poincaré:
The most interesting facts are those which may serve many times; these are the facts which have a chance of coming up again. We have been so fortunate as to have been born in a world where there are such. Suppose that instead of sixty chemical elements there were sixty milliards of them, that they were not some common, the others rare, but that they were equally distributed. Then, every time we picked up a new pebble there would be great probability of its being formed of some unknown substance; all that we knew of other pebbles would be worthless for it; before each new object we should be as the new-born babe; like it we could only obey our caprices or our needs. Biologists would be just as much at a loss if there were only individuals and no species, and if heredity did not make sons like their fathers.[1]
[Footnote 1: Poincaré: _Foundations of Science_, p. 363.]
The aim of classification in science is grouping in such a way as to make manifest at once similarities in the behavior of objects. That characteristic is selected as a basis of classification with which is correlated the greatest number of other characteristics belonging to the facts in question. It would be possible to classify all living things according to color, but such a classification would be destitute of scientific value.
Biology offers some interesting examples of how an illuminating classification may be made on the basis of a single characteristic. It has been found, for example, that the differences or resemblances of animals are correlated with corresponding differences or resemblances in their teeth. In general, the function of classification may be summarized in Huxley's definition as modified by Jevons:
By the classification of any series of objects is meant the actual or ideal arrangement together of those things which are like and the separation of those things which are unlike, the purpose of the arrangement being, primarily, to disclose the correlations or laws of union of properties and circumstances, and, secondarily, to facilitate the operations of the mind in clearly conceiving and retaining in memory the characters of the object in question.
It should be noted that the object of classification is not simply to indicate similarities but to indicate distinctions or differences. In scientific inquiry, differences are as crucial in the forming of generalizations as similarities. It is only possible to classify a given fact under a scientific generalization when the given fact is set off from other facts, when it is seen to be the result of certain special conditions.
If a man infers from a single sample of grain as to the grade of wheat of the car as a whole, it is induction, and under certain circumstances, a _sound_ induction; other cases are resorted to simply for the sake of rendering that induction more guarded and correct. In the case of the various samples of grain, it is the fact that the samples are unlike, at least in the part of the carload from which they are taken, that is important. Were it not for this unlikeness, their likeness in quality would be of no avail in assisting inference.[1]
[Footnote 1: Dewey: _How We Think_, pp. 89-90.]
EXPERIMENTAL VARIATION OF CONDITIONS. In forming our generalizations from the observation of situations as they occur in Nature, we are at a disadvantage. If we observe cases just as we find them, there is much present that is irrelevant to our problem; much that is of genuine importance in its solution is hidden or obscure. In experimental investigation we are, in the words of Sir John Herschel, "active observers"; we deliberately invent crucial or test cases. That is, we deliberately arrange conditions so that every factor is definitely known and recognized. We then introduce into this set of completely known conditions one change, one new circumstance, and observe its effect. In Mill's phrase, we "take a phenomenon home with us," and watch its behavior. Mill states clearly the outstanding advantage of experimentation over observation:
When we can produce a phenomenon artificially, we can take it, as it were, home with us, and observe it in the midst of circumstances with which in all other respects we are accurately acquainted. If we desire to know what are the effects of the cause _A_, and are able to produce _A_ by means at our disposal, we can generally determine at our own discretion ... the whole of the circumstances which shall be present along with it; and thus, knowing exactly the simultaneous state of everything else which is within the reach of _A's_ influence, we have only to observe what alteration is made in that state by the presence of _A_.
For example, by the electric machine we can produce, in the midst of known circumstances, the phenomena which Nature exhibits on a grander scale in the form of lightning and thunder. Now let any one consider what amount of knowledge of the effects and laws of electric agency mankind could have obtained from the mere observation of thunderstorms, and compare it with that which they have gained, and may expect to gain, from electrical and galvanic experiments....
When we have succeeded in isolating the phenomenon which is the subject of inquiry, by placing it among known circumstances, we may produce further variations of circumstances to any extent, and of such kinds as we think best calculated to bring the laws of the phenomenon into a clear light. By introducing one well-defined circumstance after another into the experiment, we obtain assurance of the manner in which the phenomenon behaves under an indefinite variety of possible circumstances. Thus, chemists, after having obtained some newly discovered substance in a pure state, ... introduce various other substances, one by one, to ascertain whether it will combine with them, or decompose them, and with what result; and also apply heat or electricity or pressure, to discover what will happen to the substance under each of these circumstances.[1]
[Footnote 1: Mill: _Logic_ (London, 1872), vol. I, pp. 441-42.]
Through experiment, we are thus enabled to observe the relation of specific elements in a situation. We are, furthermore, enabled to observe phenomena which are so rare in occurrence that it is impossible to form generalizations from them or improbable that we should even notice them: "We might have to wait years or centuries to meet accidentally with facts which we can readily produce at any moment in a laboratory; and it is probable that many of the chemical substances now known, and many excessively useful products, would never have been discovered at all, by waiting till Nature presented them spontaneously to our observation." And phenomena, such as that of electricity, which can only be understood when the conditions of their occurrence are varied, are presented to us in Nature most frequently in a fixed and invariable form.
GENERALIZATIONS, THEIR ELABORATION AND TESTING. So far we have been concerned with the steps in the control of suggestion, the reexamination of the facts so that significant suggestions may be derived, and the separation of the significant from the insignificant in the elements of the situation as it first confronts us. In logically elaborating a suggestion, as we have already seen, we trace out the bearings of a given situation. We expand it; we see what it _implies_, what it means. Thus, if we came, for example, to a meeting that had been scheduled, and found no one present, we might have several solutions arise in our minds. The meeting, we might suppose, had been transferred to another room. If that were the case, there would probably be some notice posted. In all cases of deductive elaboration, we go through what might be called the If-Then process. If _such-and-such_ is the case, then _such-and-such_ will follow. We can then verify our suggested solution to a problem, by going back to the facts, to see whether they correspond with the implications of our suggestion. We may, to take another example, think that a man who enters our office is an insurance agent, or a book solicitor who had said he would call upon us at a definite date. If such is the case, he will say such-and-such things. If he does say them, then our suggestion is seen to be correct.
The advantages of developing a suggestion include the fact that some link in the logical chain may bear a more obvious relation to our problem than did the undeveloped suggestion itself.
The systematic sciences consist of such sets of principles so related that any single term implies certain others, which imply certain others and so on _ad infinitum_.
After the facts have been elaborated, the generalization, however plausible it may seem, must be subjected to experimental corroboration. That is, if a suggestion is found through logical elaboration to mean _A, B, C_, then the situation must be reexamined to see if the facts to be found tally with the facts deduced. In the case cited, the suggestion that the man who entered the room was the insurance agent we expected would be verified if he immediately broached the subject and the fact, say, of a previous conversation. In the case of disease, if the illness is typhoid, we shall find certain specific conditions in the patient. If these are found, the suggestion of typhoid is verified.
The _reliability_ of generalizations made by this scientific procedure varies according to several factors. It varies, in the first place, according to the correspondence of the predictions made on the basis of the generalization, with subsequent events. The reason we say the law of gravitation holds true is that in every instance where observations or experiments have been made, the results have tallied precisely with expectations based upon the generalization. We can, to a certain extent, determine the reliability of a generalization before comparing our predictions with subsequent events.
If a generalization made contradicts laws that have been established in so many instances that they are practically beyond peradventure, it is suspect. A law, for example, that should be an exception to the laws of motion or gravitation, is _a priori_ dubious.
If an induction conflicts with stronger inductions, or with conclusions capable of being correctly deduced from them, then, unless on reconsideration it should appear that some of the stronger inductions have been expressed with greater universality than their evidence warrants, the weaker one must give way. The opinion so long prevalent that a comet, or any other unusual appearance in the heavenly regions, was the precursor of calamities to mankind, or to those at least who witnessed it; the belief in the veracity of the oracles of Delphi or Dodona; the reliance on astrology, or on the weather prophecies in almanacs, were doubtless inductions supposed to be grounded on experience.... What has really put an end to these insufficient inductions is their inconsistency with the stronger inductions subsequently obtained by scientific inquiry, respecting the causes on which terrestrial events really depend.[1]
[Footnote 1: Mill: _Logic_ (London, 1872), vol. I, pp. 370-71.]
THE QUANTITATIVE BASIS OF SCIENTIFIC PROCEDURE. Science _is_ science, some scientists insist, in so far as it is mathematical. That is, in the precise determination of facts, and in their repetition with a view to their exact determination, quantities must be known. The sciences have developed in exactness, in so far as they have succeeded in expressing their formulations in numerical terms. The physical sciences, such as physics and chemistry, which have been able to frame their generalizations from precise quantities, have been immeasurably more certain and secure than such sciences as psychology and sociology, where the measurement of exact quantities is more difficult and rare. Jevons writes in his _Principles of Science_:
As physical science advances, it becomes more and more accurately quantitative. Questions of simple logical fact resolve themselves after a while into questions of degree, time, distance, or weight. Forces hardly suspected to exist by one generation are clearly recognized by the next, and precisely measured by the third generation.[1]
[Footnote 1: Jevons: _Principles of Science_, p. 270.]
The history of science exhibits a constant progress from rude guesses to precise measurement of quantities. In the earliest history of astronomy there were attempts at quantitative determinations, very crude, of course, in comparison with the exactness of present-day scientific methods.
Every branch of knowledge commences with quantitative notions of a very rude character. After we have far progressed, it is often amusing to look back into the infancy of the science, and contrast present with past methods. At Greenwich Observatory in the present day, the hundredth part of a second is not thought an inconsiderable portion of time. The ancient Chaldæans recorded an eclipse to the nearest hour, and the early Alexandrian astronomers thought it superfluous to distinguish between the edge and center of the sun.
By the introduction of the astrolabe, Ptolemy, and the later Alexandrian astronomers could determine the places of the heavenly bodies within about ten minutes of arc. Little progress then ensued for thirteen centuries, until Tycho Brahe made the first great step toward accuracy, not only by employing better instruments, but even more by ceasing to regard an instrument as correct.... He also took notice of the effects of atmospheric refraction, and succeeded in attaining an accuracy often sixty times as great as that of Ptolemy. Yet Tycho and Hevelius often erred several minutes in the determination of a star's place, and it was a great achievement of Roemer and Flamsteed to reduce this error to seconds. Bradley, the modern Hipparchus, carried on the improvement, his errors in right ascension, according to Bessel, being under one second of time, and those of declination under four seconds of arc. In the present day the average error of a single observation is probably reduced to the half or the quarter of what it was in Bradley's time; and further extreme accuracy is attained by the multiplication of observations, and their skillful combination according to the theory of error. Some of the more important constants... have been determined within a tenth part of a second of space.[2]
[Footnote 2: _Ibid._, pp. 271-72.]
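The gain in accuracy from "the multiplication of observations" may be sketched numerically. The following is a minimal illustration, not anything given by Jevons: it assumes that the errors of the separate observations are independent, of equal spread, and free from systematic bias, in which case the probable error of the mean falls as the square root of the number of observations. The function names and the sample readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error(spread, count):
    """Expected error of the mean of `count` observations.

    Assumes independent errors of equal spread and no systematic bias
    (an assumption of this sketch, not a claim made in the text).
    """
    if count < 1:
        raise ValueError("need at least one observation")
    return spread / sqrt(count)


def combine_observations(observations):
    """Combine repeated readings of one quantity.

    Returns the mean and the estimated standard error of that mean.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("need at least two observations")
    return mean(observations), stdev(observations) / sqrt(n)


# Hypothetical readings of one star's position, in seconds of arc.
readings = [10.2, 9.8, 10.1, 9.9, 10.0]
best_value, error_of_mean = combine_observations(readings)
```

Under these assumptions, `standard_error(4.0, 16)` gives 1.0: sixteen observations, each uncertain by four seconds, combine to a mean uncertain by about one second. Quadrupling the number of observations only halves the error, which is why better instruments mattered as much as more observations.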
The precise measurement of quantities is important because we can, in the first place, only through quantitative determinations be sure we have made accurate observations, observations uncolored by personal idiosyncrasies. Both errors of observation and errors of judgment are checked up and averted by exact quantitative measurements. The relations of phenomena, moreover, are so complex that specific causes and effects can only be understood when they are given precise quantitative determination. In investigating the solubility of salts, for example, we find variability depending on differences in temperature, pressure, the presence of other salts already dissolved, and the like. The solubility of salt in water differs again from its solubility in alcohol, ether, carbon bisulphide. Generalization about the solubility of salt, therefore, depends on the exact measurement of the phenomenon under all these conditions.[1]
[Footnote 1: See Jevons, p. 279 ff.]
The importance of exact measurement in scientific discovery and generalization may be illustrated briefly from one instance in the history of chemistry. The discovery of the chemical element _argon_ came about through some exact measurements by Lord Rayleigh and Sir William Ramsay of the nitrogen and the oxygen in a glass flask. It was found that the nitrogen derived from air was not altogether pure; that is, there were very minute differences in the weighings of nitrogen made from certain of its compounds and the weight obtained by removing oxygen, water, traces of carbonic acid, and other impurities from the atmospheric air. It was found that the very slightly heavier weight in one case was caused by the presence of argon (about one and one third times as heavy as nitrogen) and some other elementary gases. The discovery was here clearly due to the accurate measurement which made possible the discovery of this minute discrepancy.
It must be noted in general that accuracy in measurement is immediately dependent on the instruments of precision available. It has frequently been pointed out that the Greeks, although incomparably fresh, fertile, and direct in their thinking, yet made such a comparatively slender contribution to scientific knowledge precisely because they had no instruments for exact measurement. The thermometer made possible the science of heat. The use of the balance has been in large part responsible for advances in chemistry.
The degree to which sciences have attained quantitative accuracy varies among the physical sciences. The phenomena of light are not yet subject to accurate measurement; many natural phenomena have not yet been made the subject of measurement at all. Such are the intensity of sound, the phenomena of taste and smell, the magnitude of atoms, the temperature of the electric spark or of the sun's atmosphere.[1]
[Footnote 1: See Jevons, p. 273.]
The sciences tend, in general, to become more and more quantitative. All phenomena "exist in space and involve molecular movements, measurable in velocity and extent."
The ideal of all sciences is thus to reduce all phenomena to measurements of mass and motion. This ideal is obviously far from being attained. Especially in the social sciences are quantitative measurements difficult, and in these sciences we must remain therefore at best in the region of shrewd guesses or fairly reliable probability.
STATISTICS AND PROBABILITY. While in the social sciences, exact quantitative measurements are difficult, they are to an extent possible, and to the extent that they are possible we can arrive at fairly accurate generalizations as to the probable occurrence of phenomena. There are many phenomena where the elements are so complex that they cannot be analyzed and invariable causal relations established.
In a study of the phenomena of the weather, for example, the phenomena are so exceedingly complex that anything approaching a complete statement of their elements is quite out of the question.
The fallibility of most popular generalizations in these fields is evidence of the difficulty of dealing with such facts. Must we be content then simply to guess at such phenomena? ... In instances of this sort, another method ... becomes important: The Method of Statistics. In statistics we have an _exact_ enumeration of cases. If a small number of cases does not enable us to detect the causal relations of a phenomenon, it sometimes happens that a large number, accurately counted, and taken from a field widely extended in time and space, will lead to a solution of the problem.[1]
[Footnote 1: Jones: _Logic, Inductive and Deductive_, p. 190.]
If we find, in a wide variety of instances, two phenomena occurring in a certain constant correlation, we infer a causal relation. If the variations in the frequency of one correspond to variations in the frequency of the other, there is probability of more than connection by coincidence.
The correlation between phenomena may be measured mathematically; it is possible to express in figures the exact relations between the occurrence of one phenomenon and the occurrence of another. The number which expresses this relation is called the coefficient of correlation. This coefficient expresses relationship in terms of the mean values of the two series of phenomena by measuring the amount each individual phenomenon varies from its respective mean. Suppose, for example, that in correlating crime and unemployment, the coefficient of correlation were found to be .47. If in every case of unemployment crime were found and in every case of crime, unemployment, the coefficient of correlation would be +1. If crime were never found in unemployment, and unemployment never in crime, the coefficient of correlation would be -1, indicating a perfect inverse relationship.
A coefficient of 0 would indicate that there is no relationship.
The coefficient of .47 would accordingly indicate a significant but not a "high" correlation between crime and unemployment.
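The computation behind such a coefficient may be sketched briefly. What follows is a minimal illustration, assuming the ordinary product-moment form of the coefficient, in which each value is measured by its deviation from the mean of its own series; the text does not name a particular formula, and the function name and the figures are hypothetical, chosen only to show the three limiting cases.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Product-moment coefficient of correlation between two series.

    Each value is taken as a deviation from the mean of its own series;
    the result lies between -1 and +1.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    # Sum of products of paired deviations, and sums of squared deviations.
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    square_x = sum(a * a for a in dev_x)
    square_y = sum(b * b for b in dev_y)
    if square_x == 0 or square_y == 0:
        raise ValueError("a series that never varies has no correlation")
    return cross / sqrt(square_x * square_y)


# Hypothetical figures, not drawn from any actual inquiry.
unemployment = [1, 2, 3, 4]
perfect = correlation_coefficient(unemployment, [2, 4, 6, 8])
inverse = correlation_coefficient(unemployment, [8, 6, 4, 2])
unrelated = correlation_coefficient(unemployment, [1, -1, -1, 1])
```

Here `perfect` comes out as +1, `inverse` as -1, and `unrelated` as 0. A figure such as .47 falls between the second and third cases on the positive side. The coefficient measures only how closely the two series vary together; it does not of itself show which phenomenon, if either, is the cause of the other.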
We cannot consider here all the details of statistical methods, but attention may be called to a few of the more significant features of the process. Statistics is a science, and consists in much more than the mere counting of cases.
With the collection of statistical data, only the first step has been taken. The statistics in that condition are only raw material showing nothing. They are not an instrument of investigation any more than a kiln of bricks is a monument of architecture. They need to be arranged, classified, tabulated, and brought into connection with other statistics by the statistician. Then only do they become an instrument of investigation, just as a tool is nothing more than a mass of wood or metal, except in the hands of a skilled workman.[1]
[Footnote 1: Mayo-Smith: _Statistics and Sociology_, p. 18.]
The essential steps in a statistical investigation are: (1) the collection of material, (2) its tabulation, (3) the summary, and (4) a critical examination of the results. The terms are almost self-explanatory. There are, however, several general points of method to be noted.
In the collection of data a wide field must be covered, to be sure that we are dealing with invariable relations instead of with mere coincidences, "or overemphasizing the importance of one out of a number of cooperating causes." Tabulation of the data collected is very important, since classification of the data does much to suggest the causal relations sought.
The headings under which data will be collected depend on the purposes of the investigation. In general, statistics can suggest generalizations, rather than establish them. They indicate probability, not invariable relation.[2]
[Footnote 2: See Jones: _Logic_, pp. 213-25, for a discussion of Probability.]