Originally published in Brush, F. R. (Ed.) (1971),
Aversive Conditioning and Learning, 183-233.
Only the first section (Introduction) is reproduced here, pp. 183-190.
There is something fundamentally wrong with our traditional interpretations of avoidance learning. There is probably also something basically wrong with the way in which avoidance experiments have traditionally been conducted. Consider the following contrasting pair of studies. Maatsch (1959) found that rats could learn in a single trial to jump out of a box where they had been shocked. On the other hand, when D'Amato and Schiff (1964) tried to train rats to press a bar to avoid shock they found that only 3 of their 24 Ss attained even a modest level of proficiency in 1000 trials. Other research, which we will consider shortly, indicates that although there were many differences in procedure between these two studies, much of this difference between 1-trial learning and 1000-trial failure to learn must be attributed to the requirement of different avoidance responses in the two cases.
Perhaps it is not surprising that jumping out of a box should be much easier to learn as an avoidance response than pressing a bar, but if we expect such a difference, then we would hope that our current theories of instrumental or operant learning would predict it and indicate why it occurs. However, our major theoretical statements of animal learning principles seem to be concerned with other matters since they contain no hint that the choice of the response is such an important parameter. They might attribute the poor acquisition of the bar press response to a low operant level. But this is an inadequate explanation because some of the D'Amato and Schiff Ss made hundreds of responses and still failed to learn to respond consistently. The response occurred, the presumed reinforcement contingency was applied, but little or no learning was found. It seems that either (1) we do not yet know what the effective reinforcing events are, so that we do not know how to arrange for their effective application, or (2) whatever the effective reinforcing contingencies are in avoidance learning, they simply are not effective in strengthening some responses in some situations. In this chapter I hope to convince the reader that we are actually beset by both of these difficulties, i.e., that the effective reinforcement contingencies in avoidance learning are not what they are usually assumed to be, and that there is, in fact, a rather limited class of responses that can serve effectively as avoidance responses. Let us consider first the problem of the response.
Learning theorists are naturally reluctant to recognize limits on the generality of their theories, and more than one theorist has gone to considerable trouble to encompass the phenomena of avoidance behavior. Certainly this is an important goal. The destiny of Pavlovian conditioning theory no doubt hinged in considerable part upon the classic papers of Hull (1929) who showed that it could account for avoidance learning and of Mowrer (1947) who showed that it could not. Perhaps equally important for the drive-reduction hypothesis of reinforcement were the papers of Mowrer (1939) showing how it explained avoidance learning and of Schoenfeld (1950) showing how it did not.
Today we may place less faith in the general systematic positions than we used to, but we are no less inclined to try to generalize our favorite assumptions. Thus, the widespread and largely successful application of various versions of reinforcement theory has gradually inclined us to see no limits to it. We have become inclined to believe that we may choose any animal, choose any response in its repertoire, and strengthen that response as much as we please by the appropriate application of a suitable reinforcer. Of course some allowance is made for the fact that learning is faster and more certain in some situations than in others, but this is treated not as a question of principle but only as a matter of technique. With sufficient diligence, the use of automated equipment, and a few tricks included in the preliminary training, we can train Ss to do almost anything. Such confidence in the powers of reinforcement may be justified in some cases (even though it detracts from the credit that belongs to a dazzling display of diligence, equipment, and tricks), but its application to avoidance behavior inspires little confidence. Applied to the aversive case, our principles of reinforcement have resulted in little practical success, not much ability to predict behavior, and almost no understanding of the behavior involved.
The trouble is that the difference between speed of learning to jump out of a box and the speed of learning to press a bar is a huge difference, vastly larger than can be produced by manipulating any of the usual array of experimental variables - including the events that are supposed to reinforce avoidance. This difference cannot be dismissed as a mere technical matter and there seems to be nothing in our bag of tricks that can get rid of it. We are proposing therefore that since the response requirement looms as such a large factor it should be made fundamental. It should be recognized as a first principle in avoidance learning.
As indicated elsewhere (Bolles, 1967; Bolles, Stokes, & Younger, 1966), the learning of an avoidance response (Ra) is greatly facilitated if it is chosen to be one of the S's innate defense reactions. Here, we will go one step further and argue that an Ra can be learned only if it is a species-specific defense reaction (SSDR), or at least that it is very much like one of the animal's SSDRs. Let us see how far the SSDR hypothesis can go toward accounting for the importance of what Ra is required of the animal.
It is clear that the responses which are the easiest for the rat to learn in aversive situations (excepting perhaps the fear reaction itself) are those that take it out of the situation. Our first assumption then has to be that taking flight is one of the rat's SSDRs. The rat does, in fact, withdraw from stimulus events, and particularly geographical locations, that elicit fear. If a rat is forcibly placed in a frightening situation, then it will leave if it can do so. If the rat gets itself into a frightening situation, then it retreats in a species-specific manner from whence it came. A large part of the daily activity of the rat in the wild involves tracing and retracing escape routes in its territory with the result that should the occasion arise it will be able to take flight very quickly and effectively.
If an experimental S is placed in a distinctive box and shocked a few seconds after a door is opened to an adjoining no-shock region, it will learn in very few trials to leave the shock box immediately upon being put there. This is the experimental paradigm of the "one-way" avoidance situation where the Ra is rapidly acquired, performed quite consistently, and is quite resistant to extinction (e.g., Clark, 1966; Theios, 1963). One-way avoidance learning can sometimes be demonstrated after a single trial (e.g., Maatsch, 1959). Santos (1960) reports that one-way avoidance of shock is acquired as rapidly as escape from shock: in fact, avoidance learning is so dependable in the one-way situation that no escape contingency is necessary. Here we find extremely rapid learning which is like avoidance in that the S eliminates situational cues that have been paired with shock by leaving the situation. There are a number of similar reports of rapid learning from "acquired drive" studies as well as avoidance studies (Denny in Chapter 4 of this volume describes some examples).
If an Ra which provides for a clean flight from the situation is readily acquired, we might also expect some moderate degree of learning of Ras that provide for a less clean leave-taking. The shuttlebox is a situation of this type since it permits S to leave its position but requires returning to a location where it has previously been shocked. The shuttlebox introduces an element of conflict because both sides of the apparatus elicit fear and the tendency to flee. Somewhat poorer acquisition should be expected than in the one-way situation and this is precisely what is reported (e.g., Theios & Dunaway, 1964). Typically, some proportion of the experimental group fails to learn (e.g., Brush, 1966; Kamin, 1959), and Ss that do learn the response are likely to be performing at a level of only 60 to 90% after 100 trials.
Considerably faster acquisition and better performance levels than these are usually reported in running wheel studies where the rat can run as though it were getting away but where it is not actually altering its location. In the running wheel nearly all rats acquire the Ra and performance approaches 100% within 40 trials or so (Bolles et al., 1966). The situations which involve equivocal flight-taking Ras provide a means of checking the SSDR hypothesis and making it more explicit. Unfortunately, there are relatively few data suitable for that purpose at the present time. We can only conclude that these Ras, and particularly the Ra required in the shuttlebox, are ambiguous: they are neither obviously SSDRs nor obviously not SSDRs. They occupy some middle ground that needs to be more carefully defined.
The rat has also been trained extensively in nonlocomotory situations where, in fact, flight is impossible. The investigator selects some "convenient, arbitrary piece of behavior" or operant, but what the selection invariably comes down to is pressing a bar or turning a small paddle wheel. The rat is notoriously poor at acquiring both of these responses as Ras (D'Amato & Schiff, 1964; Meyer, Cho, & Wesemann, 1960; Myers, 1959; Smith, McFarland, & Taylor, 1961).
Many Ss apparently never learn, as we noted above, and many more show a dropout effect in which a partially learned Ra begins to disintegrate after a few hundred trials (Anderson & Nakamura, 1964; Coons, Anderson, & Myers, 1960). The full extent of the difficulty is probably not apparent because failures to learn generally do not appear in the literature. We can only guess at the number of aspiring students of behavior who have undertaken to train rats to press a bar or engage in some other task to avoid shock, and who have given up after finding few Ss able to learn. (The first discriminated bar-press avoidance seems to have been reported as late as 1959 by Myers.) A further unfortunate consequence is that if the aspirant does not give up entirely, he may confine his study to a shuttlebox where the level of achievement is likely to be more encouraging. Due publication of his results in the shuttlebox is likely only to reinforce the widespread belief that of course the rat can learn any response, and we end up with serious ignorance about the actual range of responses that can serve the rat effectively as Ras.
In the last few years there have been a number of attempts to alter the bar-press situation so as to facilitate the acquisition of this response. Some of these attempts have been remarkably unsuccessful (e.g., Chapman & Bolles, 1964; D'Amato & Schiff, 1964; Meyer et al., 1960). But D'Amato has discovered that using discontinuous shock, i.e., brief pulses rather than continuous shock, produces considerable improvement (D'Amato & Fazzaro, 1966; D'Amato, Keller, & DiCara, 1964). Minimizing shock intensity seems to help (Bolles & Warren, 1965a; D'Amato & Fazzaro, 1966), as does using the longest possible CS-US interval (Bolles, Warren, & Ostrov, 1966). So does "shaping" the response (Feldman & Bremner, 1963; Keehn & Webster, 1968). But while all of these effects appear to be genuine and rationally explicable, none of these experimental manipulations makes the Ra easy to learn. Even under optimum conditions, bar pressing is a much more difficult Ra to acquire than running in a shuttle box.
From time to time we are told that the difficulty arises because the rat tends to freeze and hold onto the bar (Feldman & Bremner, 1963; Meyer et al., 1960). We may suppose, incidentally, that because freezing is such a characteristic part of the frightened rat's response repertoire, it too is an SSDR. Thus we find that the rat's defensive repertoire contains two quite different responses: taking flight and freezing. Now, rather than viewing freezing while holding the bar as a decremental factor, a competing response which interferes with the acquisition of bar pressing, it can be argued that the rat is only able to press the bar at all because it freezes. In a typical instance what we observe is that the shock comes on and stays on while S frantically jumps and scratches and runs around in the box. Then it runs into the bar. The shock stops and so does the rat. (The rat has a remarkable tendency to stop whatever it is doing immediately upon shock termination.) It freezes in whatever posture and in whatever location it happens to be in when the shock stops. Then when the next shock comes on the rat is right there, the probability of hitting the bar again is much higher, and what counts is that it is in a position to make a very short latency escape response. Within 50 trials the rat will be terminating the shock with a latency on the order of .05 sec (Bolles & McGillis, 1968). By freezing on the bar the rat is able to execute an extremely rapid escape response (Re) which limits the duration of the shock to a value that does not disrupt its freezing behavior. Any other response topography is likely to lead to slower escape responses, i.e., longer shocks, and consequent disruption of that topography (Campbell, 1962; Migler, 1963). In short, this procedure is the best possible technique for training the rat to freeze on the bar. Since freezing is part of its natural defense repertoire, the animal is trapped in the performance of this behavior.
Keehn (1967) has shown that the rat is very good at "learning" to hold the bar down continuously. It can be argued that since holding the bar down requires a less delicate touch than initiating a new press, it is more rapidly acquired because it is more compatible with freezing.
The rat can further reduce shock duration by partial execution of the response, i.e., by depressing the bar part way. When this occurs we should expect to start getting intertrial responses, some of which would avoid shock and produce a further reduction in the total amount of shock. However, the low levels of performance and the instability of the behavior so often found in this situation suggest that responses of this sort provide a poor starting point for establishing more substantial levels of performance. It is only after a number of "freezing trials" that some rats begin to initiate responses characterized by a relatively long latency, a clean topography, and increased variability. The fact that some rats ultimately acquire such behavior poses a challenge to the SSDR hypothesis since the bar-press response can scarcely be construed as an SSDR. The challenge can be met at this time only by making a set of assumptions. These are, first, that the rat must go through a phase in which it freezes on the bar; second, that it must go through a phase in which it minimizes shock partly by making adventitious Ras but mostly by making very quick Res. Third, it is then necessary to assume that a critical state is reached in which fear begins to dissipate and as it does the restriction upon S's repertoire relaxes so that S can not only flee and freeze, but also bar press. The transition cannot be made too quickly because, by the first assumption, freezing has to occur to keep S at the bar. It is a precarious situation and the transition is uncertain. Indeed, our assumptions are uncertain, but perhaps no more uncertain than whether in any given instance the rat will acquire the response. This assumed dissipation of fear and broadening of the response repertoire with continued training will be discussed further later.
The emphasis here is upon the early acquisition of Ra, and the point of the argument is that the available data from bar press situations is not wholly inconsistent with the SSDR hypothesis.
Just as the rat frequently makes its first bar press Ras in the effort to get out of the box, so it tends to make the first wheel turning Ras while attempting to get out through the window the wheel is mounted on. On subsequent trials the situation is the same as in the bar press apparatus: S freezes while attempting to hold onto the wheel. Because the wheel tends to turn when a little pressure is put on it, S has to readjust its paws onto the next rung and thereby generates a number of inter-trial responses, some of which serve to avoid shock. We have observed that essentially the same pattern occurs, although the topography of the response is a little different, when the rat is required to jump up on a pole suspended from the ceiling in order to escape and/or avoid shock.
In all of these cases, the response which is said to be a conditioned operant can be considered to be simply a slight modification of the S's freezing behavior. If the typical S learns anything in such situations it is to freeze while holding onto the manipulandum. Experimental procedures which minimize bar holding by punishing it sometimes facilitate performance (Feldman & Bremner, 1963; Brush, 1964a; Jones & Swanson, 1966) but they do not necessarily do so (Bolles & Warren, 1965b; Anderson, Rollins, & Riskin, 1966).
To summarize the argument so far: there is a clear alternative to the prevailing view that Ra can be any response in the S's repertoire. The alternative hypothesis states that the Ra is either a defense reaction or some very slight topographic modification of a defense reaction. The rat appears to have two such reactions, namely, fleeing and freezing. These are in a sense the only responses available to the frightened rat and we cannot profitably require it to learn any more than to do one rather than the other in any aversive situation. If we require a running Ra, in a shuttlebox, for example, the rat quickly learns it because this SSDR is already quite strong and only freezing competes with it. On the other hand, if we require bar pressing, then learning is uncertain because, to begin with, this Ra requires freezing in a particular location and with a particular posture, and this SSDR is in competition with all other freezing behavior and all of the S's flight behavior.
We must note that many animals, including the rat, have a particularly interesting defense reaction which is not really defensive at all but offensive. If a rat is shocked it will exhibit certain unconditioned reactions to the shock such as jumping, running, or flinching, depending upon the intensity (Kimble, 1955; Trabasso & Thompson, 1962). This behavior appears to be largely under the control of the prevailing shock stimulation: S appears to be stimulus bound. But Azrin and Ulrich and their collaborators have shown that if another animal is introduced into the situation there is a sudden and complete alteration in behavior: S will attack the other animal (see especially Azrin, Hutchinson, & Hake, 1967). Two aspects of this interesting phenomenon are relevant here. One is that the precise nature of the animal's repertoire when being aversively stimulated is more a function of other environmental stimuli than might be thought. The defensive repertoire is not inflexible by any means; it is highly adaptable to specific environmental constraints. In different apparatus S's behavior may consist mostly of running, jumping, or scratching at the floor. If a pathway for flight is available S is likely to find it. If no escape route is available S is likely to freeze, but it may attack another animal that is present, and it may attack even if there is no other animal: it may bite at the bar or the grid floor or the walls of the apparatus. Flight, freezing, and attack are all rather broad classes of behavior rather than specific responses with fixed topographies.
The second conclusion is that an animal such as the rat has three rather than two responses in its defense repertoire: it can leave the situation, it can freeze, or it can attack. It is entirely conceivable that the stance of the rat in front of the bar or in front of the paddle wheel has as its prototype the attack response or the threat reaction. Such an analysis is suggested by the findings (Bolles, unpublished) that bar-press avoidance is facilitated by having the bar about 3 in. above the floor rather than much closer, and by putting shock on the bar itself rather than leaving it uncharged.
Up to this point, we have been primarily concerned with demonstrating that the chief factor determining how readily an Ra is learned is what the Ra is. By now there should be no question about either the main effect - some Ras are enormously easier to learn than others - or that the effect is intrinsic in the responses involved. We have tried to show how the SSDR hypothesis provides an explicit interpretation of this effect, and have reviewed in a rather cursory manner a few illustrative experiments. Now we shall have to take a closer look at the experimental literature to test the viability of the SSDR hypothesis. The argument must also become a little more complicated because of the fact that the response requirement effect shows up not only as a main effect, i.e., in how fast learning occurs, it also shows up in a number of interactions. That is, the effects of other experimental treatments on avoidance learning also depend upon what response is required of S. The following sections will be concerned with the difficult question of what reinforces avoidance behavior. We will find that the contingencies which have historically been afforded the greatest importance, i.e., the escape contingency and the CS-termination contingency, appear to have rather limited generality, and that, in fact, their effectiveness appears to depend primarily upon what response is chosen as the Ra.