Instrumental Conditioning

Basic Terms and Procedure

Thorndike's "Puzzlebox" Experiments

Instrumental or operant conditioning was first studied experimentally by Edward L. Thorndike in what came to be known as the "puzzlebox" experiments. In Thorndike's procedure, the experimental subjects (cats or dogs) were observed as they learned how to operate one or more latch mechanisms that would release the door to the puzzlebox, allowing the subject to escape the box and thereby gain access to some food that was available outside. Thorndike observed that the subjects engaged in more-or-less random behavior within the puzzlebox until one of those behaviors triggered the release of the latch, allowing the subject to escape. Upon being returned to the puzzlebox, the subject again appeared to engage in the same random mix of behaviors. Over the series of trials, however, the behavioral acts that had been immediately followed by latch-release occurred more and more frequently, displacing the other, ineffective behaviors, until the animal was opening the latch almost immediately upon reentering the box.
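The gradual shift from random behavior to quick escapes can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not Thorndike's own model: the function name run_puzzlebox, the count of eight possible behaviors, the starting weights, and the boost parameter are all hypothetical choices made for illustration. Each behavior has a weight that sets how likely it is to be tried, and only the behavior that is immediately followed by latch release gets its weight increased.

```python
import random


def run_puzzlebox(n_trials=30, n_behaviors=8, boost=1.0, seed=0):
    """Toy puzzlebox: return (latencies, weights) after n_trials escapes."""
    rng = random.Random(seed)
    weights = [1.0] * n_behaviors  # assumption: all behaviors start equally likely
    effective = 0                  # assumption: exactly one behavior releases the latch
    latencies = []
    for _ in range(n_trials):
        attempts = 0
        while True:
            attempts += 1
            # Pick one behavior at random, in proportion to its current weight.
            act = rng.choices(range(n_behaviors), weights=weights)[0]
            if act == effective:
                # The act immediately followed by latch release is strengthened.
                weights[act] += boost
                break
        latencies.append(attempts)
    return latencies, weights


if __name__ == "__main__":
    latencies, _ = run_puzzlebox()
    print("behaviors tried before escaping, trial by trial:", latencies)
```

In this sketch the number of behaviors tried before escape tends to fall across trials, because the effective behavior takes up a growing share of the total weight while the ineffective ones stay where they started.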

Thorndike's Law of Effect

To describe this process, Thorndike used several terms that are defined below:

Thorndike envisioned that the process of learning the correct response went something like this:

Notice that Thorndike, like Pavlov, assumed that learning involves the formation of associations via contiguity and repetition, and that the result is the creation of a new stimulus-response reflex, one based on the subject's personal history of experience rather than on inborn connections. The idea that all behavior can be reduced to stimulus-response reflexes has been called S-R psychology, and it became the dominant theoretical viewpoint in American psychology for at least the first half of the twentieth century.

Basic Procedure for Demonstrating Instrumental Conditioning

The basic procedure for demonstrating instrumental or operant conditioning can be described as follows:

A rather typical example of this procedure involves training hungry laboratory rats to press a lever for a food reward. Initially, lever-pressing occurs at a low rate, essentially as a side effect of the rat's exploratory activities. However, each lever-press response is followed immediately by the delivery of a food pellet, which the rat soon discovers and eats, and very quickly the rate of lever-pressing increases. We can say that the pellets are serving as reinforcers of the lever-press response, and that the lever-press response has been reinforced.
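The rise in lever-pressing rate in the rat example can be sketched numerically. The code below is a hypothetical illustration, not a model taken from the conditioning literature: the function name simulate_lever_pressing and the parameters p_start, gain, and p_max are assumptions chosen only to show the qualitative pattern. Time is divided into small steps, the rat presses with some probability on each step, and every press is followed by a pellet that nudges that probability upward.

```python
import random


def simulate_lever_pressing(n_sessions=10, steps_per_session=200,
                            p_start=0.02, gain=0.05, p_max=0.9, seed=1):
    """Return the number of lever presses counted in each session."""
    rng = random.Random(seed)
    p_press = p_start  # assumption: low baseline rate from exploratory activity
    presses_per_session = []
    for _ in range(n_sessions):
        presses = 0
        for _ in range(steps_per_session):
            if rng.random() < p_press:
                presses += 1
                # Every press is followed immediately by a pellet, which
                # strengthens pressing: move p_press a fraction of the
                # remaining distance toward the ceiling p_max.
                p_press += gain * (p_max - p_press)
        presses_per_session.append(presses)
    return presses_per_session


if __name__ == "__main__":
    print("presses per session:", simulate_lever_pressing())
```

Under these assumptions the first session typically contains far fewer presses than later sessions, which level off near the ceiling set by p_max.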

Note that what is important in instrumental or operant conditioning is the consequence of the response. This consequence occurs because we have arranged a contingency or relationship between the occurrence of the response and the delivery of the reinforcer. Because the response is "instrumental" in obtaining the reinforcer, we call this type of conditioning instrumental conditioning. That response also "operates" on the environment, producing some sort of change, so this type of conditioning is sometimes also referred to as operant conditioning and the response as an operant.

Some Basic Terms Defined

I've used several new terms while describing the instrumental conditioning process above. In this section I provide definitions for these terms and some additional ones.