Basic Terms and Procedure
Thorndike's "Puzzlebox" Experiments
Instrumental or operant conditioning was first studied experimentally by Edward L. Thorndike in what came to be known as the "puzzlebox" experiments. In Thorndike's procedure, the experimental subjects (cats or dogs) were observed as they learned how to operate one or more latch mechanisms that would release the door to the puzzlebox, allowing the subject to escape the box and thereby gain access to some food that was available outside. What Thorndike observed was that the subjects engaged in more-or-less random behavior within the puzzlebox until one of those behaviors triggered the release of the latch, allowing the subject to escape. Upon being returned to the puzzlebox, the subject again appeared to engage in the same random mix of behaviors, but over a series of trials, those behavioral acts that had been immediately followed by latch-release occurred more and more frequently, displacing the other, ineffective behaviors until the animal was opening the latch almost immediately upon reentering the box.
Thorndike's Law of Effect
To describe this process, Thorndike used several terms that are defined below:
- The Situation -- A complex of stimuli identifying the context in which the behavior occurs. For Thorndike's subjects, these would be stimuli that identify the interior of a puzzlebox, and perhaps internal stimuli such as those arising from hunger.
- Responses -- Specific behavioral acts, such as pushing against one of the slats forming the walls of the puzzlebox, or swiping a paw at a loop of wire.
- Satisfying state of affairs -- Anything the subject willingly approaches, and resists being separated from.
- Discomfort, or an annoying state of affairs -- Anything the subject avoids or abandons.
Thorndike envisioned that the process of learning the correct response went something like this:
- Situational stimuli elicit (trigger) numerous behavioral acts (responses), either because they are instinctively produced by such perceived circumstances, or because such behaviors had been effective in similar circumstances in the subject's experience. Some responses are more likely than others to occur in the situation, others less likely.
- If a response is accompanied or immediately followed by a satisfying state of affairs (release from the box, access to food), then the response's "connection to" (association with) the situation is strengthened. This means that situational stimuli become better able to elicit this particular response. Thus, when the situation recurs (the subject is placed back in the box), that particular response is more likely to occur. (the "positive" law of effect)
- Over trials, each time the response is followed by a satisfying state of affairs, the connection of response to situation strengthens. Eventually it becomes the response most likely to be immediately elicited by the situation.
- If a response is accompanied or immediately followed by a discomforting or annoying state of affairs, the connection of response to situation is weakened, so that the response becomes less likely to occur in that situation. (the "negative" law of effect)
Notice that Thorndike, like Pavlov, assumed that learning involves the formation of associations via contiguity and repetition, and that the result is the creation of a new stimulus-response reflex, one based on the subject's personal history of experience rather than on inborn connections. The idea that all behavior can be reduced to stimulus-response reflexes has been called S-R psychology, and it became the dominant theoretical viewpoint in American psychology for at least the first half of the twentieth century.
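As a rough illustration of the strengthening and weakening process described in this section, the Python sketch below treats each situation-response connection as a single number and lets the situation elicit responses in proportion to those numbers. Everything numerical here is an assumption made for illustration: the response names, the starting strengths, the step size, the floor, and the proportional-choice rule are hypothetical, and Thorndike himself did not propose a numerical model.

```python
import random


def response_probability(strengths, response):
    """Chance that the situation elicits `response`, proportional to its strength."""
    return strengths[response] / sum(strengths.values())


def apply_law_of_effect(strengths, response, outcome, step=0.5, floor=0.1):
    """Adjust one situation-response connection after a response occurs.

    `step` and `floor` are illustrative assumptions, not values from Thorndike.
    """
    if outcome == "satisfying":
        # "Positive" law of effect: the connection is strengthened.
        strengths[response] += step
    elif outcome == "annoying":
        # "Negative" law of effect: the connection is weakened (never below floor).
        strengths[response] = max(floor, strengths[response] - step)
    # Any other outcome leaves the connection unchanged.


def run_puzzlebox(trials=30, seed=0):
    """Simulate repeated puzzlebox trials; only 'pull_loop' releases the latch."""
    rng = random.Random(seed)
    # Hypothetical response repertoire, all equally likely at the start.
    strengths = {"push_slat": 1.0, "scratch_floor": 1.0,
                 "vocalize": 1.0, "pull_loop": 1.0}
    attempts_per_trial = []
    for _ in range(trials):
        attempts = 0
        while True:
            attempts += 1
            responses = list(strengths)
            weights = [strengths[r] for r in responses]
            response = rng.choices(responses, weights=weights)[0]
            if response == "pull_loop":
                # Escape and food follow immediately: a satisfying state of affairs.
                apply_law_of_effect(strengths, response, "satisfying")
                break
        attempts_per_trial.append(attempts)
    return strengths, attempts_per_trial


if __name__ == "__main__":
    final_strengths, attempts = run_puzzlebox()
    print("attempts needed on each trial:", attempts)
    print("final P(pull_loop):", response_probability(final_strengths, "pull_loop"))
```

In this toy version the ineffective responses are never weakened; they are simply displaced because the effective response gains strength on every trial, which is one way to read the gradual change Thorndike observed. Passing `"annoying"` to `apply_law_of_effect` shows the weakening case.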
Basic Procedure for Demonstrating Instrumental Conditioning
The basic procedure demonstrating instrumental or operant conditioning can be described as follows:
- Specify some behavioral act (response) you wish to condition.
- Observe how often this act occurs prior to conditioning.
- Arrange for that behavior to be followed immediately by some specified consequence.
- Observe how often this act occurs while the specified consequence is in force.
- If the specified response increases in frequency, then the specified consequence is a reinforcer and the response has been reinforced.
A rather typical example of this procedure involves training hungry laboratory rats to press a lever for food reward. Initially, lever-pressing occurs at a low rate, basically as a side-effect of the rat's exploratory activities. However, each lever-press response is followed immediately by the delivery of a food pellet, which the rat soon discovers and eats, and very quickly the rate of lever-pressing increases. We can say that the pellets are serving as reinforcers of the lever-press response, and that the lever-press response has been reinforced.
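The logic of the procedure is a comparison of two response rates: one measured before the consequence is arranged and one measured while it is in force. The short Python sketch below shows that comparison only. The function names and the lever-press counts are hypothetical, and a real experiment would also need to decide how large and how reliable an increase must be before calling the consequence a reinforcer.

```python
def response_rate(response_count, minutes):
    """Responses per minute over an observation period."""
    return response_count / minutes


def is_reinforcer(baseline_rate, contingency_rate):
    """Operational test: did the response become more frequent
    while the consequence followed it?"""
    return contingency_rate > baseline_rate


if __name__ == "__main__":
    # Hypothetical counts for a lever-pressing rat (not data from any study).
    baseline = response_rate(response_count=4, minutes=20)      # before conditioning
    with_food = response_rate(response_count=120, minutes=20)   # pellet follows each press
    print("baseline presses/min:", baseline)
    print("presses/min with food contingency:", with_food)
    print("food pellet acted as a reinforcer:", is_reinforcer(baseline, with_food))
```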
Note that what is important in instrumental or operant conditioning is the consequence of the response. This consequence occurs because we have arranged a contingency or relationship between the occurrence of the response and the delivery of the reinforcer. Because the response is "instrumental" in obtaining the reinforcer, we call this type of conditioning instrumental conditioning. That response also "operates" on the environment, producing some sort of change, so this type of conditioning is sometimes also referred to as operant conditioning and the response as an operant.
Some Basic Terms Defined
I've used several new terms while describing the instrumental conditioning process above. In this section I provide definitions for these terms and some additional ones.
- Operant -- Some observable behavioral act, such as pressing a lever. (Equivalent to Thorndike's "response")
- Reinforcer -- A consequence of behavior that increases the probability of the behavior.
- Reinforcement -- this has three distinct meanings:
  - The procedure of arranging a contingency between an operant and a reinforcer. (Equivalent to Thorndike's "satisfying state of affairs")
  - The process whereby an operant increases in probability owing to its being followed by a reinforcer.
  - A synonym for reinforcer. (poor usage, in my opinion)
- Discriminative Stimulus -- this has two definitions:
  - A stimulus in the presence of which a response is reinforced. (Somewhat equivalent to Thorndike's "situation.")
  - A stimulus that sets the occasion for a response. (Such a stimulus indicates when a response will or will not tend to be reinforced.)
- Stimulus Control -- The effect of a discriminative stimulus on an operant: when the discriminative stimulus occurs, the operant becomes more probable.