The Rescorla-Wagner Model of Classical Conditioning

In 1972, Robert Rescorla and Allen Wagner presented a mathematical model intended to account for several well-known phenomena of classical conditioning, including the acquisition and extinction of the conditioned response to a simple CS, conditioned inhibition, and phenomena of conditioning to a compound CS (overshadowing and blocking). Here I describe the basic elements of the Rescorla-Wagner model and illustrate its application in several areas.

The Basic Model

I'll start by defining the variables and parameters of the basic model that applies to conditioning a simple CS, such as a tone. The model attempts to predict something called the associative value of the CS, an internal state that normally determines the strength of the conditioned response (e.g., in salivary conditioning, the number of drops of saliva produced during CS presentation). The job of the model is to predict how the associative value of the CS changes over trials of CS presentation during conditioning, extinction, or other procedures. So for starters, we have two variables:

V = the current associative value of the CS

deltaV = the change in associative value of the CS during a trial.

After each trial, the new value of V will equal the old value of V plus the change in value:

Vnew = Vold + deltaV

The basic Rescorla-Wagner formula shows how V changes during each trial. I'll give the formula first and then describe what the various parameters are:

deltaV = alpha(beta)(lambda - V)

Alpha represents the relative salience of the CS, or roughly speaking, its attention-gettingness. It is a number that can vary between 0 and 1, where 0 indicates that the CS attracts no attention and 1 indicates that it attracts maximum attention. Salience can be manipulated by, for example, making the stimulus more or less intense, or making it vary rapidly as in a warbling tone or flashing light.

Beta represents the relative strength of the US, in terms of its ability to promote conditioning to the CS, and like alpha, can vary between 0 and 1. The closer beta is to 1.0, the greater the rate of conditioning (or extinction). Rescorla and Wagner describe it as a learning-rate parameter.

Finally, lambda represents the maximum associative value that can be conditioned to the CS under the conditions of the experiment. It reflects such things as the strength of the unconditioned response elicited by the US and the effect of any time lags between CS and US.

Given these definitions, the basic Rescorla-Wagner formula states that the change in the associative strength of the CS during a trial will depend directly on (a) the salience of the CS (alpha), (b) the strength of the US (beta), and (c) the difference between the maximum associative value of the CS and its current value (lambda - V).

To apply the basic Rescorla-Wagner model, we need to do three things. First, we need to set the values of our parameters, alpha, beta, and lambda. For example, we could model simple conditioning with a moderately salient CS by choosing alpha = .5 (this would make it half-way between no salience and maximum salience). Beta, the learning rate parameter, affects only the rate at which the value changes and not the basic shape of the curve, so we can choose a value that produces learning that is neither too slow nor too rapid, say, .2. Lambda can be set conveniently to 100, thus scaling the value of the CS as a percentage. Second, we need to set the initial value of the CS. For example, if the CS starts out as a neutral stimulus, you would assign it an initial value of zero. To model extinction, you might start it out fully conditioned, at a value of 100%. Third, we need to do the calculations, trial by trial, in order to see how the value of the CS changes during the course of the experiment.
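
The three steps just described can be sketched in a few lines of Python. This is only an illustration, not code from Rescorla and Wagner: the function name simulate_rw and its argument names are hypothetical, and the parameter values are the example values suggested above (alpha = .5, beta = .2, lambda = 100, initial V = 0).

```python
def simulate_rw(v0, alpha, beta, lam, n_trials):
    """Return the associative value V after each trial (hypothetical helper)."""
    values = []
    v = v0
    for _ in range(n_trials):
        delta_v = alpha * beta * (lam - v)  # deltaV = alpha(beta)(lambda - V)
        v = v + delta_v                     # Vnew = Vold + deltaV
        values.append(v)
    return values


# Step 1: set the parameters; Step 2: set the initial value;
# Step 3: do the calculations trial by trial.
acquisition = simulate_rw(v0=0.0, alpha=0.5, beta=0.2, lam=100.0, n_trials=5)
for trial, v in enumerate(acquisition, start=1):
    print(f"Trial {trial}: V = {v:.2f}%")
```

With these values, alpha times beta is .1, so each trial closes one-tenth of the remaining gap between V and lambda: V is 10% after Trial 1, 19% after Trial 2, and 27.1% after Trial 3.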

Simple Conditioning: Acquisition

For this model, we'll assume that the CS is initially neutral and has moderate salience. To make the illustration simple, we'll assume that the learning rate is at its maximum. Stated mathematically, we have

V = 0%

alpha = .5

beta = 1

lambda = 100%

Plugging these values into the basic formula gives:

deltaV = .5(1)(100 - 0) = .5(100) = 50%

Thus, on the first pairing of CS and US (Trial 1) and given our parameter values, the associative value of the CS has increased by 50%. Its new value after the trial is

V = 0 + 50% = 50%

On the second trial,

deltaV = .5(1)(100 - 50) = .5(50) = 25%

Thus, the associative strength of the CS has increased by 25%, yielding a new value of

V = 50% + 25% = 75%

On the third trial,

deltaV = .5(1)(100 - 75) = .5(25) = 12.5 %, and

V = 75 + 12.5 = 87.5%.

Notice that, given our parameters, on each trial, the value of the CS increases by half the remaining difference between its maximum possible value and its current value. Because there is less and less remaining difference after each subsequent trial, the rate at which the CS value increases is actually declining, and eventually there will be very little increase in value across trials as the value of the CS approaches its maximum value (lambda). A graph of these data showing V as a function of trials would present a line that rises steeply at first but gradually bends more and more horizontally as it approaches lambda. This particular type of curve is called an increasing, negative exponential function. Thus, the Rescorla-Wagner model predicts that the associative strength of the CS during acquisition will rise as a negative exponential function of trials toward an asymptotic value of lambda (in this case, 100%).
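
The shape of this curve can be written in closed form. Each trial multiplies the remaining gap between V and its maximum by (1 - alpha*beta), so for a CS that starts at V = 0 the value after n trials is 100(1 - (1 - alpha*beta)^n) when the maximum is 100%. That expression follows from the update rule but is not part of the original presentation. The short Python check below, with variable names chosen here purely for illustration, compares it against the trial-by-trial calculation using the acquisition parameters above.

```python
alpha, beta, lam = 0.5, 1.0, 100.0  # values from the worked acquisition example

v = 0.0            # the closed form below assumes the CS starts neutral
iterated = []      # V computed trial by trial
closed_form = []   # V computed directly from the trial number n
for n in range(1, 7):
    v += alpha * beta * (lam - v)
    iterated.append(v)
    closed_form.append(lam * (1 - (1 - alpha * beta) ** n))

for n, (a, b) in enumerate(zip(iterated, closed_form), start=1):
    print(f"Trial {n}: iterated = {a:.4f}%, closed form = {b:.4f}%")
```

The two columns agree on every trial, and the first three rows reproduce the 50%, 75%, and 87.5% worked out by hand.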

Simple Conditioning: Extinction

To model extinction, you can assume that the CS has already been fully conditioned, so that its initial value is 100%. We know that extinction (presentation of the CS without the US) results in the loss of the conditioned response, so the maximum value that the CS will approach over trials is 0%. Thus we should set V = 100 and lambda = 0.

I won't go through the computations here; suffice it to say that the value of the CS decreases toward lambda rapidly at first, but ever more slowly as the CS value approaches lambda at asymptote. This decreasing, negative exponential function is sometimes referred to as an exponential decay function.
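
For readers who would like to see the extinction computations, here is a minimal Python sketch. It assumes, for illustration only, the same alpha = .5 and beta = 1 used in the acquisition example; the text does not specify these values for extinction, and the variable names are hypothetical.

```python
alpha, beta = 0.5, 1.0   # assumed: same illustrative values as in acquisition
lam = 0.0                # no US is presented, so the asymptote is zero
v = 100.0                # the CS starts out fully conditioned

extinction = []
for trial in range(1, 6):
    delta_v = alpha * beta * (lam - v)  # negative, because V lies above the asymptote
    v += delta_v
    extinction.append(v)
    print(f"Trial {trial}: deltaV = {delta_v:.3f}%, V = {v:.3f}%")
```

Under these assumptions V falls by half its current value on each trial: 50%, 25%, 12.5%, 6.25%, 3.125%.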

Compound Conditioning

In compound conditioning, two or more stimuli are presented and withdrawn together, as a package, called a compound stimulus. The simple stimuli making up the compound stimulus are termed the elements of the compound. For example, a tone and light might be the elements of the compound stimulus, tone+light. When substituted for a simple CS during conditioning, the compound stimulus would be called a compound CS.

The Rescorla-Wagner model for compound conditioning looks very much like the model for simple conditioning, but allows for the addition of the elements of the compound, as well as the compound CS itself. Thus, there is a V, deltaV, and alpha for each element, plus a V for the compound. Subscripts are used to distinguish the variables and parameters of each stimulus. For a two-stimulus compound consisting of stimuli A and B, the following formulas are necessary:

deltaVA = alphaA (beta)(lambda - VAB)

deltaVB = alphaB (beta)(lambda - VAB)

VAB = VA + VB

The last formula is Rescorla and Wagner's solution to the problem of how to combine the values of the elements to produce a value for the compound. Lacking any good reason to do otherwise, they chose the simplest approach and assumed simply that the value of the compound would be the sum of the values of the elements.
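
One way to see how the element updates and the summation rule (VAB = VA + VB) fit together is a small Python sketch of a single reinforced compound trial. The function name compound_trial, the dictionary layout, and the parameter values in the usage lines are all assumptions made here for illustration.

```python
def compound_trial(values, saliences, beta, lam):
    """Apply one reinforced compound trial and return the updated element values.

    values and saliences are dicts keyed by element name (hypothetical layout).
    """
    v_compound = sum(values.values())   # VAB = VA + VB
    error = lam - v_compound            # the term shared by every element
    return {name: v + saliences[name] * beta * error
            for name, v in values.items()}


values = {"A": 0.0, "B": 0.0}        # both elements start neutral
saliences = {"A": 0.5, "B": 0.5}     # illustrative: equal saliences
values = compound_trial(values, saliences, beta=0.2, lam=100.0)
print(values, "compound =", sum(values.values()))
```

The point to notice is that every element is updated from the same error term, which depends on the value of the whole compound rather than on the element's own value. The phenomena described in the following sections all follow from that shared term.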

The interesting thing about a compound CS is that you can measure the strength of a conditioned response to the compound and to each of its elements separately. Doing so has revealed several surprising phenomena, including overshadowing and blocking. In the next sections, I describe these phenomena and how they are accounted for by the Rescorla-Wagner Model as developed for compound conditioning.

Compound Conditioning: Overshadowing

Overshadowing is sometimes observed after two neutral elements are combined to form a compound neutral stimulus and then the compound stimulus is converted into a compound CS by repeatedly pairing it with a US, such as food. The compound CS acts just like an ordinary simple CS during conditioning, but when the elements are tested separately, one element produces a stronger conditioned response than the other. For example, after reinforcing a compound CS consisting of tone+light, separate testing may reveal that the tone produces more salivation than the light. In that case the tone would be said to have overshadowed the light during conditioning.

In certain species, some stimuli naturally overshadow others, everything else being equal. In rats, for example, tones tend to overshadow lights, whereas in pigeons the reverse is true. This may be because rats depend more on their ears and noses, and pigeons on their eyes, when foraging for food, and thus tend to pay more attention to these inputs. But the "natural" order can be reversed by making the normally overshadowed stimulus more salient.

The Rescorla-Wagner model handles overshadowing by assigning an initially neutral value to each of the elements: VA = 0; VB = 0; VAB = VA + VB = 0 + 0 = 0. One of the two elements is then assigned a higher salience than the other, e.g., alphaA = .6, alphaB = .4. As training proceeds, the stimulus with the greater salience acquires a higher proportion of the associative strength than the stimulus with the lesser salience, and when the compound CS becomes fully conditioned, no further change can occur. In the end, the stimulus with the higher salience accounts for a greater proportion of the total associative value of the compound, and thus produces a stronger conditioned response. In other words, it overshadows the less salient stimulus.
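
The overshadowing account can be illustrated with a short Python sketch using the saliences given above. The learning rate (beta = .2), the asymptote of 100, and the number of trials are assumptions added here, since the text does not fix them for this case.

```python
alpha_a, alpha_b = 0.6, 0.4   # saliences from the text
beta, lam = 0.2, 100.0        # assumed values; the text does not fix them here
va, vb = 0.0, 0.0             # both elements start neutral

for trial in range(50):
    error = lam - (va + vb)   # shared error, using VAB = VA + VB
    va += alpha_a * beta * error
    vb += alpha_b * beta * error

print(f"VA = {va:.2f}%, VB = {vb:.2f}%, VAB = {va + vb:.2f}%")
```

Because both elements are driven by the same error term, they share the available associative value in proportion to their saliences. With these numbers the compound approaches 100% while A levels off near 60% and B near 40%.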

Compound Conditioning: Blocking

Blocking occurs when one of the elements of a compound CS begins already fully conditioned. The fully conditioned element is then paired with a neutral element and the compound CS thus formed is repeatedly forward-paired with a US as in ordinary conditioning. Despite the fact that, as part of the compound, the neutral stimulus has been paired many times with the US, subsequent testing of the elements reveals that the initially neutral stimulus of the compound has remained neutral. For example, if you fully condition salivation to a tone and then repeatedly pair the compound of tone+light with food, presenting the light alone will elicit no salivation, whereas the tone will continue to elicit strong salivation. In this case the tone is said to have blocked conditioning to the light.

You can model blocking with the Rescorla-Wagner model by assigning one element of the compound an initial value of zero (neutral stimulus) and the other element maximum conditioning (100%). Set the two saliences the same (e.g., .5 each). On the first trial, the value of the compound CS will initially be 0 + 100 = 100%. Each element will thus be changed in value during the trial by an amount equal to .5(1)(100 - 100) = 0. The same will happen on every trial, so the neutral stimulus will remain neutral and the fully conditioned stimulus will remain fully conditioned. The presence of the latter in the compound blocks all conditioning to the former during compound conditioning trials.
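
The blocking calculation described above can be reproduced in a few lines of Python. The variable names are hypothetical; the values are the ones given in the text (equal saliences of .5, beta = 1, an asymptote of 100, one element pretrained to 100% and the other neutral).

```python
alpha_a, alpha_b = 0.5, 0.5   # equal saliences, as in the text
beta, lam = 1.0, 100.0
va, vb = 100.0, 0.0           # A is pretrained to asymptote, B is neutral

for trial in range(1, 6):
    error = lam - (va + vb)   # 100 - 100 = 0 on every trial
    va += alpha_a * beta * error
    vb += alpha_b * beta * error
    print(f"Trial {trial}: VA = {va:.1f}%, VB = {vb:.1f}%")
```

Every trial prints VA = 100.0% and VB = 0.0%, because the error term is already zero before the neutral element has gained anything.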

Compound Conditioning: Overexpectation

A victory for the Rescorla-Wagner model was its prediction of a phenomenon called overexpectation. Imagine separately conditioning two simple CS's until both were fully conditioned. Now, for the first time, present the two together as a compound CS. What will happen? According to the Rescorla-Wagner model, the value of the new compound CS will be 100 + 100 = 200%. In other words, the associative value of the compound is predicted to be twice the maximum value that can be conditioned to either CS alone! This prediction was tested, and although a dog doesn't actually salivate twice as much to the compound as to the elements (there are limits, after all), it does salivate more to the compound than to the individual elements! This was a surprising outcome at the time. Moreover, the model predicts that, as the compound is paired with the US trial after trial, the associative value of the compound CS will actually decrease until it reaches the same maximum that a simple CS would sustain.
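
The predicted decline can be traced with a brief Python sketch. The saliences (.5 each), the learning rate (beta = .2), and the number of trials are assumptions chosen here so that the decline is gradual enough to follow; the text gives no parameter values for this case.

```python
alpha_a, alpha_b = 0.5, 0.5   # assumed equal saliences
beta, lam = 0.2, 100.0        # assumed learning rate; asymptote as in earlier examples
va, vb = 100.0, 100.0         # each CS was separately conditioned to asymptote

history = []                  # value of the compound at the start of each trial
for trial in range(50):
    v_compound = va + vb
    history.append(v_compound)
    error = lam - v_compound  # negative while the compound exceeds the asymptote
    va += alpha_a * beta * error
    vb += alpha_b * beta * error

print(f"Compound on the first three trials: {history[:3]}")
print(f"After training: VA = {va:.2f}%, VB = {vb:.2f}%, VAB = {va + vb:.2f}%")
```

Under these assumptions the compound starts at 200% and falls trial by trial toward 100%. Because the negative error is shared between the elements, each element ends at about 50%, below the value it held after being conditioned alone.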

A cognitive explanation for overexpectation would have it that the dog, on receiving both the tone and the light for the first time, might expect that it would receive the amount of food normally delivered following each separate stimulus, i.e., twice as much food as usual. The expectation of extra food leads to greater salivary output. Of course, whether this explanation is correct or not is a bit difficult to test, as we can't just ask the dog what it was thinking. However, this explanation may at least help you to recall what the overexpectation effect is.

Contextual Conditioning and the Simple CS

One of the new ideas Rescorla and Wagner proposed along with their theory is that even ordinary conditioning to a simple CS may actually be a case of compound conditioning. When the dog in Pavlov's laboratory heard a tone CS, its senses were also registering to greater or lesser degree the visual, auditory, olfactory, and other stimuli impinging on them at the time. These stimuli may enter into a compound with the CS the experimenter explicitly presents and affect the course of conditioning. Depending on their relative saliences, these various background or contextual stimuli would undergo increases in associative value along with the nominal CS with each pairing of CS and US. However, the contextual stimuli would also remain during the time between CS-US presentations (the intertrial interval), during which time they would not be further paired with the US. Consequently they would be expected to undergo some degree of extinction during the intertrial interval, whereas the nominal CS would not (because it is not present then). When this situation is modeled, the results indicate that the contextual stimuli initially acquire some degree of associative value, but as the CS gains in associative strength, the between-trials extinction of contextual stimuli eventually brings their value back to zero and the CS becomes fully conditioned. As it turns out, evidence does suggest that contextual stimuli participate in conditioning and during the early phases of training temporarily acquire some ability to elicit the conditioned response. Thus at least initially, the dog in Pavlov's laboratory might be expected to salivate not only to the CS used in training, but on entering the laboratory to the contextual stimuli of the lab itself.
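
A rough Python sketch of this situation is given below. It rests on several simplifying assumptions that are not in the text: the whole context is lumped into a single stimulus, each intertrial interval is treated as exactly one extinction step for the context alone, the same beta is used for reinforced and extinction steps, and the salience values are arbitrary.

```python
alpha_cs, alpha_ctx = 0.5, 0.2   # assumed saliences: CS more salient than context
beta, lam = 1.0, 100.0           # assumed; same beta reused for extinction steps
v_cs, v_ctx = 0.0, 0.0           # CS and context both start neutral
context_history = []             # context value at the end of each intertrial interval

for trial in range(100):
    # Reinforced trial: CS and context are both present along with the US.
    error = lam - (v_cs + v_ctx)
    v_cs += alpha_cs * beta * error
    v_ctx += alpha_ctx * beta * error
    # Intertrial interval: context alone with no US, so its asymptote is zero.
    v_ctx += alpha_ctx * beta * (0.0 - v_ctx)
    context_history.append(v_ctx)

print(f"Peak context value: {max(context_history):.2f}%")
print(f"Final: V_cs = {v_cs:.2f}%, V_ctx = {v_ctx:.2f}%")
```

Under these assumptions the context gains some associative value over the first few trials and then loses it again as the CS approaches 100%, which is the pattern described above. How quickly this happens depends on the assumed saliences and on how much extinction is allowed per intertrial interval.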

Evaluating the Rescorla-Wagner Theory

Despite its impressive successes, the Rescorla-Wagner model has failed a number of important tests in research conducted over the past 25 years. Consequently others, including Wagner himself, have offered alternative proposals. These newer alternatives have had their own successes and difficulties, however, and at present there is no entirely satisfactory theory that can handle all the extant data on classical conditioning. By succeeding as well as it did, the Rescorla-Wagner model served to stimulate a considerable amount of research on classical conditioning designed to evaluate the model and test its implications, and to offer the hope that a quantitatively precise model of conditioning may yet be possible. In this sense the model has been a rousing success, even if eventually it began to show its weaknesses.

I should note that the Rescorla-Wagner model was never designed to account for some features of classical conditioning, and thus was always recognized as only a starting point toward a more inclusive theory. A major parameter left out of the model is time. All changes take place as a function of trials, and although trials follow one another in time, such parameters as the length of a CS presentation, time between US presentations, or duration of a trace interval are not explicitly included in the Rescorla-Wagner model. As these factors have strong effects on the course of conditioning, their absence from the model was always a serious limitation.

Written by Bruce B. Abbott, Ph.D.
Psychology Department
Indiana University-Purdue University Fort Wayne