Designing Scientific Experiments
People sometimes argue about which scientific discovery or advance ranks as the greatest, as the most significant in the history of humankind. Popular answers include electricity, computers, immunizations, and antibiotics. Yet we’d argue it’s none of these, but rather the scientific method of inquiry itself.
The scientific method came into being relatively recently in human history, as people became interested in testing the things that other people believed (for example, that the sun revolves around the Earth, that the Earth is flat, and that the Earth is only 5,000 or so years old). Science at its core is a method of thinking and observing that produces evidence which supports or refutes theories about how the world works. What makes science so powerful, and so novel a way of thinking, is that when done correctly it eliminates observer bias. No matter how hard we try to observe things impartially, it turns out we can’t, at least not without the help of the scientific method, which requires us to form a testable hypothesis about some aspect of how the world works and then figure out a way to actually test it. The scientific method requires that we regard as true only those things proven to be so by experiment, a controlled situation in which variables are manipulated one at a time and differences in results are recorded. Results that are free of contamination from observer bias can be believed. Results that remain influenced by observer bias (in other words, by what we already believe to be true or by what we want to be true) cannot. How amazing that we’ve discovered a tool that so reliably separates truth from fiction, a tool that’s essentially responsible for every modern convenience we have.
But what’s just as remarkable as the scientific method itself is the creativity with which scientists have learned to wield it. Suppose, for example, you wanted to test the idea, as scientist Stanley Milgram famously did, that people will obey authority even when it contradicts their moral impulses. How would you go about testing that? You couldn’t simply have people fill out a questionnaire asking them if they would have marched 6 million Jews into the gas chambers, because what you’d be testing then is how people predict what they’d do. Milgram was interested in discovering what people actually would do when told to follow a command that contradicted their moral impulses. That is, did the Nazis murder 6 million Jews because they were all uniquely evil, or was their behavior at least partially explained by some element of human nature common to us all?
So what did Milgram do? He designed an experiment. An experiment that hid the goal and design of the experiment itself from its subjects so that they would likely behave as they would were they to be faced with similar circumstances in the real world—an experiment that reproduced the choice that faced the Nazis who marched Jews into the showers. This was no easy feat. No one told him how to do it. He had to figure it out himself. From the “Method” section of his original paper from 1963:
After a general introduction on the presumed relation between punishment and learning, subjects were told: “But actually, we know very little about the effect of punishment on learning, because almost no truly scientific studies have been made of it in human beings. For instance, we don’t know how much punishment is best for learning—and we don’t know how much difference it makes as to who is giving the punishment, whether an adult learns best from a younger or an older person than himself—or many things of that sort. So in this study we are bringing together a number of adults of different occupations and ages. And we’re asking some of them to be teachers and some of them to be learners. We want to find out just what effect different people have on each other as teachers and learners, and also what effect punishment will have on learning in this situation. Therefore, I’m going to ask one of you to be the teacher here tonight and the other one to be the learner. Does either of you have a preference?”
Subjects then drew slips of paper from a hat to determine who would be the teacher and who would be the learner in the experiment. The drawing was rigged so that the naive subject was always the teacher and the accomplice always the learner. (Both slips contained the word “Teacher.”) Immediately after the drawing, the teacher and learner were taken to an adjacent room and the learner was strapped into an “electric chair” apparatus.
The experimenter explained that the straps were to prevent excessive movement while the learner was being shocked. The effect was to make it impossible for him to escape from the situation. An electrode was attached to the learner’s wrist, and electrode paste was applied “to avoid blisters and burns.” Subjects were told that the electrode was attached to the shock generator in the adjoining room.
In order to improve credibility the experimenter declared, in response to a question by the learner: “Although the shocks can be extremely painful, they cause no permanent tissue damage.”
Later, a description of the device to be used to administer the shocks:
Upon depressing a switch: a pilot light corresponding to each switch is illuminated in bright red; an electric buzzing is heard; an electric blue light, labeled “voltage energizer,” flashes; the dial on the voltage meter swings to the right; various relay clicks are sounded. The upper left-hand corner of the generator is labeled Shock Generator, Type ZLB, Dyson Instrument Company, Waltham, Mass. Output 15 Volts—450 Volts. Details of the instrument were carefully handled to insure an appearance of authenticity. The panel was engraved by precision industrial engravers, and all components were of high quality. No subject in the experiment suspected that the instrument was merely a simulated shock generator.
Sample shock. Each naive subject is given a sample shock on the shock generator, prior to beginning his run as teacher. This shock is always 45 volts, and is applied by pressing the third switch of the generator. The shock is applied to the wrist of the naive subject, and has its source in a 45-volt battery wired into the generator. This further convinces the subject of the authenticity of the generator.
Shock instructions. The subject is told to administer a shock to the learner each time he gives a wrong response. Moreover—and this is the key command—the subject is instructed to “move one level higher on the shock generator each time the learner flashes a wrong answer.” He is also instructed to announce the voltage level before administering a shock. This serves to continually remind subjects of the increasing intensity of shocks administered to the learner.
Feedback from the victim. In all conditions the learner gives a predetermined set of responses to the word pair test [the imaginary test Milgram used to convince his subjects they were participating in an experiment designed to test learning styles], based on a schedule of approximately three wrong answers to one correct answer. In the present experimental condition no vocal response or other sign of protest is heard from the learner until Shock Level 300 is reached. When the 300-volt shock is administered, the learner pounds on the wall of the room in which he is bound to the electric chair. The pounding can be heard by the subject. From this point on, the learner’s answers no longer appear on the four-way panel.
At this juncture, subjects ordinarily turn to the experimenter for guidance. The experimenter instructs the subject to treat the absence of a response as a wrong answer, and to shock the subject according to the usual schedule.
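To make the escalation rule in the passages quoted above concrete, here is a minimal Python sketch of the shock schedule. It is an illustration only and does not come from Milgram's paper: the function names are hypothetical, and the 15-volt step between switches is an assumption inferred from the quoted details (the generator is labeled 15 to 450 volts, and the third switch delivers 45 volts).

```python
# Illustrative sketch only; names and structure are hypothetical,
# not taken from Milgram's paper.
FIRST_VOLTS = 15      # lowest label on the generator, per the quoted description
LAST_VOLTS = 450      # highest label on the generator
STEP_VOLTS = 15       # assumption: inferred from the third switch being 45 volts
POUNDING_VOLTS = 300  # level at which the learner pounds on the wall


def shock_levels():
    """Return the voltage label of every switch, lowest first."""
    return list(range(FIRST_VOLTS, LAST_VOLTS + 1, STEP_VOLTS))


def run_session(answers_correct):
    """Apply the escalation rule to a sequence of answers.

    answers_correct is a sequence of booleans (True = correct answer).
    A missing answer should be passed as False, because the experimenter
    told subjects to treat no response as a wrong answer.

    Returns one (voltage, learner_pounds) pair per wrong answer.
    """
    levels = shock_levels()
    shocks = []
    for correct in answers_correct:
        if correct:
            continue  # no shock, and the level does not advance
        if len(shocks) == len(levels):
            break  # the top switch has already been used
        volts = levels[len(shocks)]  # one level higher per wrong answer
        shocks.append((volts, volts == POUNDING_VOLTS))
    return shocks
```

Under these assumptions, `shock_levels()` yields 30 switches, and a subject reaches the 450-volt switch only after 30 wrong or missing answers. `run_session([False, False, True, False])` returns shocks of 15, 30, and 45 volts, since the correct answer neither triggers a shock nor advances the level.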
Despite the dubious ethical nature of allowing subjects to think they were actually shocking people (the “Learners” were, of course, actors who were only pretending to be shocked), the design of this experiment is ingenious. In fact, its results produced an uproar at the time precisely because it was so well designed that people were forced to take its results seriously, namely that 65% of all subjects continued the experiment up to the final massive shock of 450 volts, even after the “Learner” was heard pounding on the wall in protest. In other words, Milgram’s original hypothesis was supported by his experiment’s results: the Nazi soldiers who gassed 6 million Jews weren’t necessarily all inherently evil, but some part of human nature predisposes people in certain circumstances to obey authority even when to do so is to violate their moral principles. Since that time, Milgram’s experiment has been repeated around the world, mostly (though interestingly not always) yielding the same results.
The point to remember then is this: as strange as we’re discovering the world to be, the methods we use to discover it are by necessity even stranger, and are dependent on whatever degree of creativity our scientists can bring to bear on experimental design, whose purpose isn’t merely to isolate that which is to be measured but to eliminate any biases that might interfere with its measurement. Though few people would rate creativity important when thinking about what’s required for a career in science, in fact few fields require as much—something that makes the knowledge we’ve painstakingly acquired over the last few centuries even more remarkable and makes all our physicians at ImagineMD determined to practice only evidence-based medicine.