Three-Term Contingency
The famous behavioral scientist B. F. Skinner
believed that, to analyze human and animal behavior experimentally, each behavioral act can be broken down into three
key parts. These three parts constitute his three-term contingency: discriminative stimulus, operant response, and reinforcer/punisher. The three-term contingency is fundamental to the study of operant conditioning.
To illustrate the operation of behavioral analysis,
the behavior of leaving class when the school day is over can be broken down into the parts of the three-term contingency.
The bell, which serves as the discriminative stimulus, is sounded at the end of the school day. When the bell rings, students
exit the classroom. Exiting the classroom is the operant response. Reinforcement of leaving the classroom at the proper time results from the other behaviors
in which students can engage now that the school day is over.
But, if the same behavior of exiting the classroom
occurs prior to the bell's ring (that is, in the absence of the discriminative stimulus), a student now faces punishment.
Punishment of leaving class early would result because this behavior violates school rules
and leads to a variety of adverse consequences, like staying after school.
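The three-term contingency amounts to an if-then rule linking antecedent, behavior, and consequence. The following Python sketch expresses the school-bell example in that form; the Contingency class and its field names are our own illustrative invention, not established notation.

    from dataclasses import dataclass

    # Sketch of the three-term contingency as an antecedent-behavior-
    # consequence record.
    @dataclass
    class Contingency:
        discriminative_stimulus: str   # S^D: sets the occasion
        operant_response: str          # R: the behavior
        consequence: str               # reinforcer or punisher

    leaving_on_time = Contingency(
        discriminative_stimulus="bell rings at the end of the day",
        operant_response="exit the classroom",
        consequence="reinforcement: access to after-school activities",
    )
    leaving_early = Contingency(
        discriminative_stimulus="no bell (S^D absent)",
        operant_response="exit the classroom",
        consequence="punishment: staying after school",
    )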
Discriminative Stimulus
A discriminative
stimulus influences the occurrence of an operant response because of the schedules of reinforcement or paradigms of reinforcement/punishment that are, or have been, associated with that response in its presence. Many authors further suggest
that discriminative stimuli provide information to the organism, allowing it to respond appropriately in the presence of different
stimuli. An observing response is sometimes necessary for presentation of the discriminative stimulus/stimuli.
For example, different
individuals can serve as discriminative stimuli in a joke-telling situation. The jokes that you tell your priest are probably
different from the jokes that you tell your best friend because of your past history of telling jokes to both people.
Experimentally,
we can observe the discrimination of stimuli by associating different discriminative stimuli with different schedules of reinforcement or paradigms of reinforcement/punishment.
For instance, in
the laboratory, a pigeon could be required in the presence of a steady chamber light to peck a key on a Fixed Interval schedule
to produce food, whereas it could be required in the presence of a blinking chamber light to pull a chain on a Variable Ratio
schedule to turn off a loud noise. The discriminative stimuli clarify the "rules of the game," making each prevailing three-term contingency unambiguous.
Operant Conditioning
Operant conditioning, also called instrumental
conditioning, is a method for modifying behavior (an operant) which utilizes contingencies between a discriminative stimulus, an operant response, and a reinforcer to change the probability of a response occurring again in that situation. This
method is based on Skinner's three-term contingency and it differs from the method of Pavlovian conditioning.
An everyday illustration of operant conditioning
involves training your dog to "shake" on command. Using the operant conditioning technique of shaping, you speak the command to "shake" (the discriminative stimulus) and then wait
until your dog moves one of his forepaws a bit (operant response). Following this behavior, you give your dog a tasty treat
(positive reinforcer). After demanding ever closer approximations to shaking your hand, your dog finally comes to perform
the desired response to the verbal command "shake."
Skinner is famous for the invention of the
Skinner box, an experimental apparatus which he designed to modify animal behavior within
an operant conditioning paradigm.
Shaping
Shaping modifies behavior by reinforcing behaviors that progressively approximate the target behavior (operant response). Shaping can be used to train organisms to perform behaviors that would rarely
if ever occur otherwise.
For example, to teach a child to write his
or her first name, you initially give praise for writing the first letter correctly. After the child has mastered that first
step, letter-by-letter you give praise until the entire name is correctly written.
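Viewed algorithmically, shaping is a loop: reinforce any response that meets the current criterion, then tighten the criterion toward the target. The simulation below is a rough sketch under simple assumptions of our own (responses vary normally around a habitual value, and reinforcement pulls that value toward the reinforced response); all numbers are arbitrary.

    import random

    # Sketch of shaping by successive approximation.
    target = 10.0          # the full desired response
    habit = 0.0            # where responding starts
    criterion = 1.0        # an easy first approximation

    for trial in range(300):
        response = random.gauss(habit, 1.0)     # behavior is variable
        if response >= criterion:               # close enough for now?
            habit += 0.3 * (response - habit)   # reinforcement shifts behavior
            criterion = min(target, criterion + 0.5)  # demand a bit more

    print(f"final habitual response: {habit:.1f} (target {target})")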
Operant Response
An operant response is a behavior that is modifiable
by its consequences. When behavior is modified by its consequences, the probability of that behavior occurring again may either
increase (in the case of reinforcement) or decrease (in the case of punishment).
For example, speeding through a red light may
lead to getting struck broadside by another vehicle. If this consequence follows such a response, then the likelihood of a
person's responding in the same way under similar conditions should drop significantly.
It is also possible for temporal or topographical
properties of behavior to be modified by reinforcement (in the case of response differentiation).
Free Operant
Once an operant response occurs, it may
be "free" or available to occur again without obstacle or delay. This would be the case, for example, of someone picking up
a stone from a rocky beach and skipping it across the water.
Other operants are only available for very
limited periods of time and cannot be freely repeated. This would be the case, for example, of someone wishing their friend
a happy birthday.
Reinforcement
Reinforcement is defined as a consequence that follows an operant response and increases (or attempts to increase) the likelihood of that response occurring
in the future.
Positive Reinforcement
In an attempt to increase the likelihood of
a behavior occurring in the future, an operant response is followed by the presentation of an appetitive stimulus. This is positive reinforcement.
If you stroke a cat's fur in a manner that
is pleasing to the cat, it will purr. The cat's purring may act as a positive reinforcer, causing you to stroke the cat's fur in the same manner in the future.
Negative Reinforcement
In an attempt to increase the likelihood of
a behavior occurring in the future, an operant response is followed by the removal of an aversive stimulus. This is negative reinforcement.
When a child says "please" and "thank you"
to his/her mother, the child may not have to engage in his/her dreaded chore of setting the table. Therefore, not having to
set the table will act as a negative reinforcer and increase the likelihood of the child saying "please" and "thank you" in the
future.
Reinforcer
A behavior (operant response) is sometimes more likely to occur in the future as a result of the consequences
that follow that behavior. Events that increase the likelihood of a behavior occurring in the future are called reinforcers.
Positive Reinforcer
A positive reinforcer is an appetitive event
whose presentation follows an operant response. The positive reinforcer increases the likelihood of that behavior occurring
again under the same circumstances.
Negative Reinforcer
A negative reinforcer is an aversive event
whose removal follows an operant response. The negative reinforcer increases the likelihood of that behavior occurring
again under the same circumstances.
Primary Reinforcer
A primary reinforcer is a reinforcer that is
biologically pre-established to act as reinforcement.
Food, water, and sex are all primary reinforcers
because they satisfy biological desires.
Conditioned Reinforcer
A conditioned reinforcer is a previously neutral
stimulus. If the neutral stimulus is paired with a primary reinforcer, it acquires the
same reinforcement properties associated with the primary reinforcer.
Money is a conditioned reinforcer. The actual
paper bills are not themselves reinforcing. However, the paper bills can be used to acquire primary reinforcers such as food,
water, and shelter. Therefore, the paper bills become reinforcers as a result of pairing them with the acquisition of food,
water, and shelter.
Punishment
Punishment is defined as a consequence that follows an operant response and decreases (or attempts to decrease) the likelihood of that response occurring
in the future.
Positive Punishment
In an attempt to decrease the likelihood of
a behavior occurring in the future, an operant response is followed by the presentation of an aversive stimulus. This is positive punishment.
If you stroke a cat's fur in a manner that
the cat finds unpleasant, the cat may attempt to bite you. Therefore, the presentation of the cat's bite will act as a positive punisher and decrease the likelihood that you will stroke the cat in that same manner
in the future.
Negative Punishment
In an attempt to decrease the likelihood of
a behavior occurring in the future, an operant response is followed by the removal of an appetitive stimulus. This is negative punishment.
When
a child "talks back" to his/her mother, the child may lose the privilege of watching her favorite television program. Therefore,
the loss of viewing privileges will act as a negative punisher and decrease the likelihood of the child talking back in the future
Punisher
A behavior (operant response) is sometimes less likely to occur in the future as a result of the consequences
that follow that behavior. Events that decrease the likelihood of a behavior occurring in the future are called punishers.
Positive Punisher
A positive punisher is an aversive event whose
presentation follows an operant response. The positive punisher decreases the likelihood of the behavior occurring again
under the same circumstances.
Negative Punisher
A negative punisher is an appetitive event
whose removal follows an operant response. The negative punisher decreases the likelihood of that behavior occurring again
under the same circumstances.
Schedules of Reinforcement
Schedules of reinforcement are the precise
rules that are used to present (or to remove) reinforcers (or punishers) following a specified operant behavior. These rules
are defined in terms of the time and/or the number of responses required in order to present (or to remove) a reinforcer (or
a punisher). Different schedules of reinforcement produce distinctive effects on operant behavior.
Interval Schedule
Interval schedules require a minimum amount
of time that must pass between successive reinforced responses (e.g., 5 minutes). Responses that are made before this time
has elapsed are not reinforced. Interval schedules may specify a fixed time period between reinforcers (Fixed Interval schedule)
or a variable time period between reinforcers (Variable Interval schedule).
Fixed Interval schedules produce an accelerated
rate of response as the time of reinforcement approaches. Students' visits to the university library show a decided increase
in rate as the time of final examinations approaches.
Variable Interval schedules produce a steady
rate of response. Presses of the "redial" button on the telephone are sustained at a steady rate when you are trying to reach
your parents and get a "busy" signal on the other end of the line.
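In code, an interval schedule is essentially a timer: the first response after the interval has elapsed produces the reinforcer, and the fixed and variable variants differ only in how the next interval is chosen. A rough Python sketch, with illustrative names and numbers:

    import random

    # Sketch of interval schedules: a response is reinforced only when the
    # programmed interval since the last reinforcer has elapsed.
    def fixed_interval(seconds):
        return lambda: seconds                    # FI: always the same

    def variable_interval(mean_seconds):
        return lambda: random.expovariate(1.0 / mean_seconds)  # VI: varies

    def run(next_interval, response_times):
        reinforced = []
        available_at = next_interval()
        for t in response_times:
            if t >= available_at:                 # interval has elapsed
                reinforced.append(t)
                available_at = t + next_interval()  # arm the next interval
        return reinforced

    responses = [i * 0.5 for i in range(240)]     # a response every 0.5 s
    print(len(run(fixed_interval(5.0), responses)))     # one per ~5 s
    print(len(run(variable_interval(5.0), responses)))  # similar overall rate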
Ratio Schedule
Ratio schedules require a certain number of
operant responses (e.g., 10 responses) to produce the next reinforcer. The required number of responses may be fixed from
one reinforcer to the next (Fixed Ratio schedule) or it may vary from one reinforcer to the next (Variable Ratio schedule).
Fixed Ratio schedules support a high rate of
response until a reinforcer is received, after which a discernible pause in responding may be seen, especially with large
ratios. Sales people who are paid on a "commission" basis may work feverishly to reach their sales quota, after which they
take a break from sales for a few days.
Variable Ratio schedules support a high and
steady rate of response. The power of this schedule of reinforcement is illustrated by the gambler who persistently inserts
coins and pulls the handle of a "one-armed bandit."
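A ratio schedule, by contrast, is a counter. The sketch below (again with illustrative names and numbers) reinforces every Nth response, with N held constant for FR and resampled after each reinforcer for VR:

    import random

    # Sketch of ratio schedules: reinforcement depends on a count of
    # responses, not on elapsed time.
    def run_ratio(n_responses, next_requirement):
        needed = next_requirement()
        count = reinforcers = 0
        for _ in range(n_responses):
            count += 1
            if count >= needed:               # ratio requirement met
                reinforcers += 1
                count = 0
                needed = next_requirement()   # FR: same N; VR: a new N
        return reinforcers

    print(run_ratio(1000, lambda: 10))                     # FR 10: exactly 100
    print(run_ratio(1000, lambda: random.randint(1, 19)))  # VR 10: about 100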
Extinction
A special and important schedule of reinforcement
is extinction, in which the reinforcement of a response is discontinued. Discontinuation of reinforcement leads to the progressive
decline in the occurrence of a previously reinforced response.
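Extinction can be illustrated with a simple error-correction learning rule (an illustrative assumption of ours, not Skinner's own formalism): response strength rises toward a maximum while reinforcement is delivered and declines progressively once it is discontinued.

    # Sketch of extinction: acquisition followed by discontinuation of
    # reinforcement. The learning rate (0.1) is arbitrary.
    strength = 0.0
    for trial in range(50):                    # acquisition: reinforced
        strength += 0.1 * (1.0 - strength)
    print(f"after acquisition: {strength:.2f}")
    for trial in range(1, 31):                 # extinction: reinforcement off
        strength += 0.1 * (0.0 - strength)
        if trial % 10 == 0:
            print(f"extinction trial {trial}: {strength:.2f}")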
ADDITIONAL TERMS, CONCEPTS & EXAMPLES
Observing response
An observing response is a response that leads
to exposure to a discriminative stimulus or discriminative stimuli. An observing response can also be viewed as a type
of attention.
For example, before you can take money from
your account at an ATM, you have to enter your personal identification number (PIN). Entering the PIN is the observing response
required to view the transaction choices.
Positive Patterning
The positive patterning procedure involves
presenting simultaneous compound stimuli that are paired with reinforcement (AB+) while withholding reinforcement when either stimulus is presented alone (A-,
B-). Positive patterning produces an increase in responding to the compound stimuli and a decrease in responding to each stimulus alone.
An example of positive patterning is present
in the daily medicinal routine of an asthmatic. To prevent an asthma attack, the person has to take two medicines: one that
opens the patient's lungs and another to enhance breathing capabilities. When the medicines are taken simultaneously, the
asthmatic's lungs are opened, breathing is improved, and therefore an asthma attack is prevented. However, if each medicine
is taken alone an asthma attack may still occur. As a result of the medicines preventing an attack when taken simultaneously,
the patient is more likely to take the drugs together than separately.
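One common way to model positive patterning (our illustration, not something claimed in the text above) is to give the compound its own configural cue: a purely elemental learner cannot respond more to AB than to A and B combined, but a learner that also carries an "AB" unit can. A Python sketch with arbitrary parameters:

    # Sketch of positive patterning (AB+ / A-, B-) using a shared-error
    # learner plus a configural cue present only when A and B co-occur.
    weights = {"A": 0.0, "B": 0.0, "AB": 0.0}

    def respond(cues):
        return sum(weights[c] for c in cues)

    def train(cues, outcome, rate=0.1):
        error = outcome - respond(cues)       # shared prediction error
        for c in cues:
            weights[c] += rate * error

    for _ in range(500):
        train(["A", "B", "AB"], 1.0)          # compound reinforced
        train(["A"], 0.0)                     # each element alone: nothing
        train(["B"], 0.0)
    print(round(respond(["A", "B", "AB"]), 2),   # near 1: strong responding
          round(respond(["A"]), 2),              # near 0
          round(respond(["B"]), 2))              # near 0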
Skinner Box
Prior to the work of Skinner, instrumental
learning was typically studied using a maze or a puzzle box. These settings are well suited to examining discrete trials or episodes of behavior, rather than the continuous stream of behavior. The Skinner Box is an experimental environment better suited to examining
the more natural flow of behavior. (The Skinner Box is also referred to as an operant conditioning chamber.)
A Skinner Box is often a small chamber that
is used to conduct operant conditioning research with animals. Within the chamber, there is usually a lever (for rats)
or a key (for pigeons) that an individual animal can operate to obtain food or water as a reinforcer. The chamber is connected to electronic equipment that records the animal's lever
pressing or key pecking, thus allowing for the precise quantification of behavior.
Discrete Trial
A discrete trial represents an isolated opportunity
for an organism to make a single operant response to a discriminative stimulus. Successive trials are separated by intertrial intervals during which no discriminative
stimuli are presented and operant responses are either precluded or are not reinforced.
Thorndike's puzzle box, the Skinner Box, and the T-maze are all apparatuses that can be used for experimental designs
involving discrete trials.
An example of a discrete trial procedure can
be illustrated with the Skinner Box. When the lever is inserted into the box, pressing it can deliver food to a hungry rat;
otherwise, the lever is retracted and no presses can produce food.
Relative Validity
If multiple discriminative stimuli are differentially correlated with reinforcement, then stimulus control may be stronger for those stimuli that more reliably signal the occurrence and
nonoccurrence of reinforcement. Such stimuli are often said to be more valid or diagnostic cues.
For example, suppose that in Case 1, compound
discriminative stimuli AX and BX are explicitly associated with reinforcement and extinction, respectively. We would expect behavioral control by the A and B elements to
exceed that by the X element.
Now, suppose that in Case 2, the compound discriminative
stimuli AX and BX are equally often associated with reinforcement and extinction. Here, we would expect that control by the
A and B elements would be similar to that by the X element.
These expectations have been borne out by behavioral
evidence in both human beings and laboratory animals. Critically, behavioral control by the X element in Case 1 is much lower
than in Case 2, despite the fact that the X element is equally often associated with reinforcement and extinction in each
case.
This relative validity effect shows
that the behavioral control that is exerted by one element of a compound discriminative stimulus (here X) depends on the relative
discriminative validity of other stimuli (here A and B) with which it occurs in compound.
Loosely speaking, organisms learn to attend
to the stimuli that are the most diagnostic of events of importance to them.
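The two cases can be simulated with a Rescorla-Wagner-style shared-error rule, a standard model of cue competition (the model choice is ours; the faster learning rate on reinforced than on nonreinforced trials is the conventional parameterization that yields the effect):

    # Sketch of relative validity with a shared-error (Rescorla-Wagner)
    # update rule. All parameters are illustrative.
    def train(trials, epochs=400, rate_plus=0.10, rate_minus=0.05):
        w = {"A": 0.0, "B": 0.0, "X": 0.0}
        for _ in range(epochs):
            for cues, outcome in trials:
                rate = rate_plus if outcome > 0 else rate_minus
                error = outcome - sum(w[c] for c in cues)
                for c in cues:
                    w[c] += rate * error
        return w

    # Case 1: AX always reinforced, BX never (A and B are the valid cues).
    case1 = train([(["A", "X"], 1.0), (["B", "X"], 0.0)])
    # Case 2: AX and BX each reinforced half the time (X is equally valid).
    case2 = train([(["A", "X"], 1.0), (["A", "X"], 0.0),
                   (["B", "X"], 1.0), (["B", "X"], 0.0)])
    print(round(case1["X"], 2), round(case2["X"], 2))
    # X ends lower in Case 1 (about 0.33) than in Case 2 (about 0.4),
    # even though X is reinforced equally often in both cases.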
Overshadowing
A discriminative stimulus presented alone may exert strong stimulus control over operant behavior. However, if that discriminative stimulus is accompanied by another, then stimulus
control by the first (or overshadowed) stimulus may be reduced or eliminated by the second (or overshadowing) stimulus.
For example, I might easily recognize my son
by the cowlick in his hair. But, I would be more likely to recognize him from his distinctive gait when he is walking in the
playground. We would say that control by his cowlick was overshadowed by control by his gait.
Response-Dependent Reinforcement
Response-dependent reinforcement requires the
organism to perform an operant response before receiving reinforcement.
An example of response-dependent reinforcement
involves a vending machine. If after inserting your money into the machine you don't push the proper button to deliver your
candy bar, then you will not receive the reinforcer.
Response-Independent Reinforcement
Response-independent reinforcement delivers
reinforcers to an organism regardless of its behavior (see yoked control).
Mass mailings of trial-size new products, like
a new cereal, provide a reinforcer to the recipient, regardless of the recipient's behavior.
Temporal Contiguity
Temporal contiguity occurs when two stimuli are experienced close together in time and, as a result, an association may be
formed. In Pavlovian conditioning the strength of the association between the conditioned stimulus (CS) and the unconditioned stimulus (US) is largely affected by temporal contiguity. In operant conditioning, the association
between the operant behavior and the reinforcer/punisher is also largely affected by temporal contiguity. Superstitious behavior occurs as a result of the temporal contiguity between a behavior and a reinforcer/punisher
that is independent of that behavior.
You utilize aspects of temporal contiguity
each time you make a causal judgment. For example, when your stomach becomes upset, you typically attempt to figure out why.
When you are trying to find the cause of your stomach ache, it is more likely that you will place the blame on the Chinese
food you ate earlier that afternoon, as opposed to the fast-food you ate the week before. This is because you ate the Chinese
food at a time that was closer to the occurrence of your stomach ache.
Blocking
If a discriminative stimulus (A) is first presented alone and is followed by reinforcement, then that stimulus may gain strong stimulus control over operant behavior. If that discriminative stimulus is later combined with a novel stimulus (X)
and the stimulus compound (AX) is followed by the same reinforcement, then little or no control may be exerted by the second
stimulus (X) when tests with it alone are conducted.
An additional comparison is critical to show
that stimulus control by X has been blocked by prior training with A. Here, only AX training is given. Stimulus control by
X alone in this comparison condition greatly exceeds that in the first condition.
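The standard Rescorla-Wagner account of this design can be captured in a few lines of Python (offered as our illustration; names and parameters are arbitrary). Prior training with A leaves no error for X to absorb during AX training, while the AX-only control shows what X would otherwise acquire:

    # Sketch of blocking with a shared-error (Rescorla-Wagner) update.
    def train(w, trials, epochs=300, rate=0.05):
        for _ in range(epochs):
            for cues, outcome in trials:
                error = outcome - sum(w[c] for c in cues)
                for c in cues:
                    w[c] += rate * error
        return w

    blocked = train({"A": 0.0, "X": 0.0}, [(["A"], 1.0)])       # A+ first
    blocked = train(blocked, [(["A", "X"], 1.0)])               # then AX+
    control = train({"A": 0.0, "X": 0.0}, [(["A", "X"], 1.0)])  # AX+ only
    print(round(blocked["X"], 2))   # near 0: control by X is blocked
    print(round(control["X"], 2))   # near 0.5: X shares control with A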
For example, the printed word "stop" on a black
sign might easily be able to control a motorist's braking response. But, after a long history of braking at red stop signs,
a motorist might very well speed past a black sign with "stop" printed on it. We would then say that the color of the stop
sign had blocked the lettering on it in controlling the motorist's braking behavior.
Yoked Control
Experimental control is essential in operant conditioning. Without baseline measures, such as the operant level, it is difficult to draw conclusions about the reasons for any observed changes
in operant response rates.
In the yoked control procedure, the rate of
responding by an experimental subject is compared to that by a control subject; the latter is yoked to the former in terms
of the receipt of reinforcement (or punishment). This comparison helps to confirm that changes in operant responding by the
experimental subject are due to the contingencies of reinforcement between its behavior and its consequences; in all other
respects, the two subjects receive reinforcers (or punishers) at the same times.
In the yoked control procedure, then, responding
by the experimental subject results in response-dependent reinforcement. In contrast, the control subject receives response-independent reinforcement.
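In code, the yoking arrangement amounts to copying the experimental subject's reinforcer times to the control subject. A minimal sketch (the FR 5 schedule and the response probability are arbitrary choices of ours):

    import random

    # Sketch of a yoked-control arrangement: the experimental subject's
    # responding on a fixed-ratio 5 schedule produces reinforcers; the
    # yoked control subject receives a reinforcer at the very same moments,
    # regardless of its own behavior.
    experimental_times, yoked_times = [], []
    count = 0
    for second in range(60):
        if random.random() < 0.8:           # experimental subject responds
            count += 1
            if count == 5:                  # FR 5 requirement met
                experimental_times.append(second)  # response-dependent
                yoked_times.append(second)         # response-independent copy
                count = 0
    print(experimental_times == yoked_times)       # True: identical timing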
For example, parents often give their child
a treat as a reinforcer for good grades. Frequently, if the child has a sibling, parents will give the sibling a gift as well
so that child will not feel ignored. The gift is not dependent on any previous behavior of the sibling.
ANOTHER TYPE OF REINFORCEMENT (LIKE A REFLEX)
Pavlovian Conditioning
Pavlovian conditioning is an important form
of learning that involves the pairing of stimuli independent of an organism's behavior. The key stimulus and response elements
of Pavlovian conditioning are:
Unconditioned stimulus
This type of stimulus
unconditionally elicits a response, also referred to as a respondent. For example, a puff of air to the cornea of the eye is an unconditioned stimulus
that produces a blinking response.
Unconditioned response
This type of response
occurs to an unconditioned stimulus without prior conditioning. The blinking response after a puff of air to the cornea of
the eye is an example of an unconditioned response.
Conditioned stimulus
A conditioned stimulus
in Pavlovian conditioning is an initially neutral stimulus that is paired with the unconditioned stimulus. For example, a
tone sounded just prior to the puff of air being delivered to the cornea of the eye. Without prior training, the tone does
not elicit an eye blink; however, after a number of tone-puff pairings, the tone alone comes to elicit the blinking response.
Conditioned response
A conditioned response
in Pavlovian conditioning is the response that the conditioned stimulus elicits after it has been repeatedly paired with an
unconditioned stimulus. The conditioned response may be similar in form to the unconditioned response. For example, the eye
blink to the tone conditioned stimulus may involve the same bodily musculature as the eye blink to the puff of air to the
cornea.
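The acquisition process just described can be sketched with a simple error-correction rule (a common formalization, not part of Pavlov's own description): each tone-puff pairing strengthens the tone's ability to elicit the blink.

    # Sketch of Pavlovian acquisition. The learning rate (0.2) is arbitrary.
    cs_strength = 0.0                              # tone starts neutral
    for pairing in range(1, 21):
        cs_strength += 0.2 * (1.0 - cs_strength)   # tone followed by puff
        if pairing % 5 == 0:
            print(f"after {pairing} pairings: CR strength {cs_strength:.2f}")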