Then one day, Skinner returned to the lab after an experimental session had finished and discovered that the relay actuating the feeder had become unreliable: only occasionally would one of the rat's lever-presses succeed in producing a pellet -- most went unreinforced. One might have expected the rate of responding on the lever to fall, but instead, it actually went up.
One immediate implication of this result was that Skinner could begin to schedule reinforcers like this on purpose and get by on far fewer pellets per session. But far more important was Skinner's realization that behavior could not be completely predicted simply from the knowledge that certain responses were being reinforced. The rate of responding and its pattern of change over time depend not only on what response is reinforced, but on the way in which reinforcers are scheduled -- on the schedule of reinforcement. With his usual thoroughness, Skinner began the job of determining the pattern of responding characteristic of various reinforcement schedules.
To examine how the rate of responding varies over time on different schedules, Skinner created a device called a cumulative recorder, which automatically produces a type of graph known as a cumulative record. In this graph, each response moves the pen upward one small notch, while the pen moves horizontally at a constant slow speed across the paper. At any given moment during the experimental session, the vertical position of the pen indicates the total number of responses produced since the session started, a number called the cumulative responses. The horizontal distance from the origin at the left edge of the paper indicates the number of minutes and seconds since the session began.
A steady rate of responding causes the pen to move upward at a steady rate while, at the same time, it moves horizontally across the paper at a constant rate. The combination of these two motions causes the pen to draw a line whose slope directly indicates the rate of responding -- steep slopes for high rates, shallow slopes for low rates, and a horizontal line if the response rate falls to zero. A laboratory rat that began the session responding at a high rate and then gradually slowed down would produce a line that began by rising steeply and then gradually curved toward the horizontal.
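The slope relationship can be sketched in a few lines of code. This is a minimal illustration only -- the function names and the response series are invented for the example, not drawn from Skinner's apparatus:

```python
# Build a cumulative record from response timestamps: each response
# steps the count up one notch, and the slope of the record over any
# window is the response rate in that window.

def cumulative_record(response_times):
    """Return (time, cumulative count) pairs, one per response."""
    return [(t, i + 1) for i, t in enumerate(response_times)]

def slope(record, t_start, t_end):
    """Responses per second between t_start and t_end: the slope of the record."""
    n = sum(1 for t, _ in record if t_start < t <= t_end)
    return n / (t_end - t_start)

# A subject responding twice per second for 10 s, then once every 2 s,
# yields a record that rises steeply and then flattens.
times = [0.5 * k for k in range(1, 21)] + [10.0 + 2.0 * k for k in range(1, 6)]
rec = cumulative_record(times)
print(slope(rec, 0, 10))   # steep early segment: 2.0 responses per second
print(slope(rec, 10, 20))  # shallow later segment: 0.5 responses per second
```

Reading the rate directly off the slope, rather than counting individual responses, is what made the cumulative record such a convenient summary of a whole session.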
Current evidence suggests that post-reinforcement pauses occur because of the delay between the resumption of responding and the eventual delivery of the reinforcer. With high ratios this delay can be considerable, and delay of reinforcement is known to weaken the effect of reinforcer delivery on the probability of the response. Following reinforcer delivery, the subject has the option of initiating a new ratio run or engaging in some other activity, which may provide its own inherent, immediate reinforcement. Because of the delay between initiating a ratio run and receiving the reinforcer, starting the run may be less attractive than these alternatives, so the subject engages in them first, creating the post-reinforcement pause.
As for the ratio run, it is generally true of ratio schedules that higher response rates yield higher rates of reinforcement. This builds in a positive feedback loop that tends to drive response rates to the highest levels the subject can comfortably sustain.
The absence of significant post-reinforcement pauses is a striking feature of variable ratio schedules, given how reliably such pauses appear on fixed ratio schedules. The lack of pauses appears to reflect the fact that occasionally even the very first response or two after reinforcement will yield another reinforcer, because the schedule includes a few very low ratios among the variable ratio sizes it provides. Consequently, initiation of a new ratio run is sometimes strongly reinforced by nearly immediate reinforcer delivery, so this behavior remains relatively high in probability even right after completion of the previous ratio. The steeply rising cumulative record reflects both the elimination of pauses and, as in fixed ratio schedules, the built-in direct relationship between rate of response and rate of reinforcement.
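The mixing of very low ratios into a variable ratio schedule can be illustrated with a small sampling sketch. The geometric sampler and the VR 20 value here are assumptions chosen for the example, not taken from any particular experiment:

```python
import random

def vr_requirement(mean_ratio, rng):
    """Sample one ratio requirement. Geometric sampling with success
    probability 1/mean_ratio yields requirements that average mean_ratio
    but occasionally equal 1, so the very first post-reinforcement
    response can itself produce the next reinforcer."""
    k = 1
    while rng.random() > 1.0 / mean_ratio:
        k += 1
    return k

rng = random.Random(0)
reqs = [vr_requirement(20, rng) for _ in range(10_000)]
print(sum(reqs) / len(reqs))                        # averages near 20: a VR 20 schedule
print(sum(1 for r in reqs if r == 1) / len(reqs))   # roughly 1 run in 20 needs only one response
```

On such a series, a subject that begins responding immediately after reinforcement is occasionally paid off at once, which is exactly the contingency the text identifies as eliminating the pause.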
The explanation for the scalloped pattern of responding on FI schedules is still a matter of controversy. One explanation begins by noting that subjects are not perfect at timing intervals. If they were, they could simply wait until the interval ended and then respond once to receive the reinforcer. Given some variability in their estimates of elapsed time, however, a subject using this strategy might wait too long to respond, reducing the overall rate of reinforcement. Responding too soon means wasted responses, but this may cost the subject relatively little compared with waiting too long. By beginning to respond early, subjects guarantee that they will be responding when the interval ends, and thus maximize the rate of reinforcement.
Another explanation refers to the effect of reinforcement delay on responding. In this view, responses occurring near the end of the interval, because they are soon followed by reinforcement, are strongly reinforced and can be expected to occur at a high rate. Responses occurring earlier in the interval are also reinforced, but at a greater delay and therefore less effectively, so the rates at which these earlier responses occur will be lower. This analysis thus predicts a gradually rising rate of responding as the interval elapses. However, it cannot easily account for the fact that the point in the interval at which the rate begins to rise is a constant proportion of the interval length.
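The delay-of-reinforcement account can be made concrete with a hyperbolic discount function, value = 1 / (1 + k * delay). This particular function, the k value, and the FI 60-s interval are illustrative assumptions for the sketch, not commitments of the analysis above:

```python
# Responses earlier in the interval face longer delays to reinforcement,
# so a delay-discounted reinforcer weights them less -- predicting a
# response rate that rises as the interval elapses.

def discounted_value(delay, k=0.5):
    """Value of a reinforcer delivered `delay` seconds after a response
    (hyperbolic discounting; k is an assumed discounting parameter)."""
    return 1.0 / (1.0 + k * delay)

interval = 60.0  # an FI 60-s schedule, chosen for illustration
for t in (10, 30, 50, 59):
    delay = interval - t  # time from a response at t until reinforcement
    print(t, round(discounted_value(delay), 3))
```

The printed values rise steadily toward the end of the interval, matching the rising limb of the scallop; note that nothing in this sketch produces the constant-proportion property that the text flags as a difficulty for the account.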
FI schedules are sometimes used in research on drug effects: any drug-induced disruption of the subject's ability to regulate its behavior in time will produce a corresponding disruption of the fixed-interval scallop.
The reason for the lower response rates on variable interval schedules is that on these schedules, the rate of reinforcement is almost independent of the rate of responding, given at least a moderate response rate. No matter how rapidly the subject responds, a reinforcer is not going to be delivered until after the interval currently being timed is finished. So response rates will rise to the point that reinforcers are being collected almost as soon as they become available, and no higher. Because the size of the next interval cannot be predicted, the subject has no choice but to keep responding at that rate until reinforcer delivery. As in variable ratio schedules, the return to responding is occasionally reinforced almost immediately (when the interval being timed is very short) so subjects tend not to pause after reinforcement but begin responding again as soon as the reinforcer has been collected.
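The near-independence of reinforcement rate from response rate can be checked with a minimal simulation. The exponential interval series, the session length, and the steady-rate subject are all simplifying assumptions made for this sketch:

```python
import math
import random

def reinforcers_per_session(response_rate, intervals, session=3600.0):
    """Count reinforcers collected by a subject responding steadily at
    `response_rate` (responses per second). A reinforcer becomes
    available when the current interval elapses; the next response
    collects it, which starts the timing of the next interval."""
    t, earned = 0.0, 0
    gap = 1.0 / response_rate
    for interval in intervals:
        setup = t + interval                # reinforcer becomes available
        t = math.ceil(setup / gap) * gap    # first response at or after setup
        if t > session:
            break
        earned += 1
    return earned

rng = random.Random(1)
intervals = [rng.expovariate(1 / 30.0) for _ in range(400)]  # a VI 30-s series
earned = {rate: reinforcers_per_session(rate, intervals) for rate in (0.2, 1.0, 5.0)}
print(earned)  # a 25-fold change in response rate barely changes reinforcers earned
```

Above a modest rate, extra responses mostly land while an interval is still being timed and go unreinforced, which is why the payoff curve flattens.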
The response rate sustained on variable interval schedules is highest when the average interval length is short and declines with increasing average interval size. For example, subjects will respond faster on a VI 10-s schedule than on a VI 30-s schedule.
Because reinforcement rate on VI schedules is relatively independent of response rate, and because response rates tend to be steady, VI schedules provide excellent baselines against which to evaluate the effects of drugs on behavior. For example, amphetamine increases response rates on VI schedules. Because the increased response rate produces little or no change in reinforcement rate, the response-rate effects of amphetamine are not confounded by any effect of response rate on rate of reinforcement.