Very quickly the subject learns to orient toward the other compartment during the tone and to shuttle into it as soon as the shock occurs, thus terminating both shock and tone. The subject is now escaping both shock and tone. After more experience, however, something new happens: after orienting toward the other compartment during the tone, the subject shuttles into it before the shock has begun. This act has two consequences: (1) it terminates the tone, and (2) it cancels the shock that was programmed to occur after 20 seconds of tone. The subject has escaped from the tone, and it has avoided the programmed shock.
Notice that the subject can avoid shock by shuttling either way -- from the left to the right compartment or vice versa. For this reason the procedure is called "two-way avoidance."
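The escape/avoidance distinction in this trial procedure can be summarized in a short sketch. This is only an illustration, not part of any actual experiment: the 20-second tone-shock delay comes from the description above, while the function name and the example latencies are hypothetical.

```python
def shuttlebox_trial(shuttle_latency, tone_shock_delay=20.0):
    """Classify one two-way avoidance trial by shuttle latency (seconds).

    The tone comes on at t = 0; shock is programmed at t = 20 s.  A
    shuttle always terminates the tone.  If it occurs before the shock
    is due, it also cancels the shock (avoidance); if it occurs after
    shock onset, the subject is escaping an ongoing shock.
    """
    if shuttle_latency < tone_shock_delay:
        return "avoidance"   # tone terminated, programmed shock cancelled
    return "escape"          # shock had begun; shuttle ends shock and tone

print(shuttlebox_trial(5.0))   # shuttled during the tone -> avoidance
print(shuttlebox_trial(25.0))  # shuttled after shock onset -> escape
```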
But now, here in the shuttlebox, subjects were learning to shuttle, apparently in the absence of any immediate stimulus change that could serve as a reinforcer. There was no shock present before the shuttle, and there continued to be no shock present afterward. What, then, could be serving as the reinforcer? This question came to be known as "the problem of avoidance."
Now that the tone produced strong fear, any response that terminated this conditioned stimulus would be strengthened through the process of negative reinforcement. What response terminates the CS? Shuttling to the other compartment! Thus, according to Mowrer's account, the subjects shuttle during the tone and before the shock begins, not to avoid the shock, but rather to escape the fear-inducing tone CS.
Note that Mowrer's account appeals to two processes or "factors." First, there is classical conditioning of fear to the tone, transforming it into an elicitor of strong fear. Second, there is instrumental conditioning of shuttling through negative reinforcement, the termination of the aversive and fear-evoking CS. Because it appeals to both classical and instrumental conditioning, the account is today known as Mowrer's two-factor theory of avoidance.
Two-factor theory not only failed to explain why extinction of avoidance follows such a slow course; it also predicted a phenomenon during extinction that was not observed to happen. According to two-factor theory, successful acquisition of avoidance should lead to the extinction of conditioned fear to the CS. That is because, when the subject successfully avoids the shocks, the CS is no longer being paired with them -- which constitutes an extinction procedure for the conditioned response to the CS. As fear to the CS extinguishes, escape from the CS by jumping over the hurdle will no longer be reinforcing, and shuttling should extinguish. But this will then lead to further pairings between CS and shock, reinstating fear of the CS and making escape from the CS an effective reinforcer once again. This cycle of acquisition, extinction, reacquisition, extinction, and so on is not observed in shuttlebox avoidance.
To explain the slow extinction of avoidance, some theorists went so far as to suggest that the process of evolution might have selected for organisms whose avoidance responses are extremely resistant to extinction. In this view, continuing to avoid when it is no longer necessary may be far less costly to the organism than failing to avoid when avoidance is necessary (to avoid being eaten, for example). Slowness of extinction was said to be a special property of avoidance responses.
Later it was realized that the slowness of extinction of avoidance could be accounted for without appeal to any special characteristics of avoidance. Subjects placed in the shuttlebox learn that they receive a shock at the end of the tone if they do not shuttle, and that they do not receive a shock at the end of the tone if they do shuttle. After a bit of practice, they almost always shuttle before the tone has reached its end. Consequently, when the shocker is disconnected, they have little opportunity to learn that the previous contingency between shuttling and shock-omission has changed. Tone-in-the-absence-of-shuttling still signals shock and, just as before, after a shuttle the tone ends and no shock follows. From the subject's perspective, nothing has changed, and so avoidance behavior continues. Only after the subject has experienced a few failures to shuttle that were not followed by shock can it learn that there is no need to shuttle.
The rats learned to avoid the shocks on this procedure, which came to be known as Sidman avoidance or, less commonly, free-operant avoidance. (I prefer to reserve the latter term as a generic category label for any free-operant, as opposed to discrete-trials, avoidance procedure.) In Sidman avoidance, the interval between response and shock is called the R-S or response-shock interval (typical value: 20 seconds). The interval between shocks in the absence of responding is called the S-S or shock-shock interval (typical value: 5 seconds). The short S-S interval is designed to break up freezing behavior, which tends to follow shocks, so that the baseline of lever-pressing remains above zero during initial training.
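The interplay of the two timers can be sketched in a few lines. The 20-second R-S and 5-second S-S values come from the text; the assumption that a session begins with the R-S timer running, and the example response times, are mine.

```python
def sidman_shocks(response_times, session_length, rs=20.0, ss=5.0):
    """Times (seconds) at which shocks are delivered under the Sidman contingency.

    Each response postpones the next shock to R-S seconds after that
    response; in the absence of responding, shocks recur every S-S
    seconds.  For simplicity the session is assumed to begin with the
    R-S timer running.
    """
    resp = iter(sorted(response_times) + [float("inf")])
    next_resp = next(resp)
    next_shock = rs
    shocks = []
    while next_shock <= session_length:
        if next_resp < next_shock:
            next_shock = next_resp + rs   # response resets the R-S timer
            next_resp = next(resp)
        else:
            shocks.append(next_shock)     # shock delivered
            next_shock += ss              # S-S timer runs until a response
    return shocks

print(sidman_shocks([], 30))            # no responses: shocks at 20, 25, 30 s
print(sidman_shocks([15, 30, 45], 60))  # pressing every 15 s postpones all shocks
```

Note how the S-S interval takes over only after a shock has been delivered: a rat that keeps resetting the R-S timer never contacts the 5-second schedule at all.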
Anger referred to these time-correlated CSs, somewhat tongue-in-cheek, as "conditioned aversive temporal stimuli," or "CATS." Anger's analysis thus came to be known as the "CATS theory of avoidance." The rats pressed the lever to escape from CATS . . .
The CATS theory ran into difficulty from data collected by Sidman in a modified version of his procedure. In this version, a warning signal came on five seconds before the end of the R-S interval and continued until shock occurred, unless a response occurred first and reset the R-S interval timer. This modification is called "signaled Sidman avoidance" to distinguish it from the standard variety in which there is no warning signal.
In unsignaled avoidance, subjects typically respond at a much higher rate than necessary, often resetting the R-S timer every few seconds. However, when the warning signal was added, subjects quickly learned to wait until the signal came on before responding. Even then they tended to respond in no apparent haste, waiting until almost the last possible moment before pressing the lever and resetting the R-S timer.
This behavior is problematic for the CATS theory, because the explicit signal presented just before the end of the R-S interval should have become highly aversive, just as the internal, time-correlated stimuli are supposed to do, and subjects should have been highly motivated to escape quickly from it by pressing the lever. Instead, subjects seemed to be using the signal more as a discriminative stimulus for lever-pressing. Rather than acting out of fear, the subjects seemed to be calmly acting to prevent the shock that would otherwise follow.
Sidman himself proposed a different analysis that did not depend on conditioned fear. Instead, it was based on "differential punishment of other behavior." The idea here is that, when the subject fails to press the lever, whatever other behaviors it engages in will sooner or later be paired with shock. This punishment of "other behaviors" suppresses them, leaving only lever-pressing unpunished and unsuppressed. In other words, the subjects in Sidman avoidance are lever-pressing because it's the only thing they can do that is not paired with shock.
In the Herrnstein and Hineline avoidance procedure, shocks were programmed to occur on a variable time (VT) schedule. This is like a variable interval schedule except that there is no response requirement: a shock is delivered at the end of each programmed interval regardless of what the subject does. This schedule programmed shocks to occur at an average rate of one per two minutes. However, if the rat pressed a lever, this act would switch control of shock delivery to a second VT schedule, this one programming shocks at an average rate of one per four minutes, or at half the rate of the first schedule. When the interval currently being timed by the VT 4-minute schedule ended, the shock was delivered and control was switched automatically back to the VT 2-minute schedule. At this time, another response on the lever could again switch control over shock-delivery to the VT 4-minute schedule, and so on.
Note that in this procedure, there are no regular time intervals between response and the next shock. In fact, if the subject pressed the lever just as the current VT 4-minute interval was ending, a shock would follow the response almost immediately. However, on the average, the time to shock would be longer following a response than if the response had not been made.
The procedure as designed did not allow subjects to avoid all shocks, but by pressing the lever immediately after each shock, they could keep the VT 4-minute schedule in effect for most of the session and thereby receive about half as many shocks as they would have received by not responding at all. The rats required quite a bit of experience with this schedule before they "caught on," but most did eventually learn to press the lever and avoid unnecessary shocks.
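The arithmetic behind that halving of shock rate can be sketched under simplifying assumptions. The VT 2-minute and VT 4-minute means come from the text; treating every interval as exactly equal to its schedule's mean (a real VT schedule varies the intervals around the mean) and the one-hour session length are assumptions made purely for illustration.

```python
def hh_shocks(respond_after_each_shock, session_length=3600.0):
    """Count shocks in a sketch of the Herrnstein-Hineline contingency.

    For simplicity each interval equals its schedule's mean: 120 s on
    the VT 2-min schedule, 240 s on the VT 4-min schedule.  A shock
    always returns control to the VT 2-min schedule; if the subject
    presses the lever promptly after each shock, control switches back
    to the VT 4-min schedule for the next interval.
    """
    t, shocks = 0.0, 0
    while True:
        # a prompt press puts the VT 4-min schedule in control of this interval
        interval = 240.0 if respond_after_each_shock else 120.0
        t += interval
        if t > session_length:
            return shocks
        shocks += 1

print(hh_shocks(False))  # never responding: 30 shocks per hour
print(hh_shocks(True))   # pressing after every shock: 15 shocks per hour
```

With these assumptions, an hour of prompt post-shock responding yields 15 shocks versus 30 without responding -- the roughly 50% reduction in shock frequency described above.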
In Herrnstein and Hineline's procedure, about all the subject can learn is the average rate at which shocks occur in the absence of responding, and the average rate at which they occur in the presence of responding. Herrnstein and Hineline proposed that subjects are in fact able to perceive these shock rates. If one makes the reasonable assumption that rats prefer a lower rate of shock-delivery over a higher one, they said, then the rats will learn a response that reduces the shock frequency. In other words, the reinforcer for avoidance responding is shock frequency reduction.
The idea that rats can sense the average rate of events over time, and that their choices are influenced by differences in such long-run outcomes, amounts to a molar account of avoidance behavior. Molar accounts assume that behavior can be sensitive to long-run outcomes. They require us to assume that there are mechanisms inside the organism that can do things like summate or average events over time -- rather sophisticated capabilities. In contrast, molecular accounts assume that behavior is sensitive only to immediate events, such as a response being followed closely by shock. An example is Sidman's differential punishment analysis, in which the immediate pairing of bits of behavior with shock suppresses those behaviors.