Shaping Behaviour: A Guide to Operant Conditioning
Explore how actions are linked to consequences, a fundamental learning process most closely associated with B.F. Skinner. We learn to repeat acts that bring rewards and avoid those that bring unwanted results.
What is Operant Conditioning?
Operant conditioning, also known as instrumental conditioning and sometimes loosely described as stimulus-response learning, is a theory of learning that focuses on how the consequences of a voluntary behaviour influence the likelihood of that behaviour being repeated or stopped. Developed largely by B.F. Skinner, this theory suggests that behaviours are learned and modified through a system of rewards and punishments.
The core idea is simple: if a behaviour is followed by a desirable outcome (a reward), it is more likely to happen again in the future. Conversely, if a behaviour is followed by an undesirable outcome (a punishment), it is less likely to be repeated.
How Does It Work? The A-B-C Model
At its heart, operant conditioning follows a simple three-part model known as the A-B-Cs: Antecedent, Behaviour, and Consequence. This sequence explains how the environment triggers a behaviour, which is in turn shaped by what happens immediately after.
A: Antecedent
The environmental trigger or cue that comes before a behaviour.
Example: The school bell rings for lunch.
B: Behaviour
The person's voluntary response or action to the antecedent.
Example: A student lines up quietly.
C: Consequence
What happens immediately after the behaviour, which influences whether it will be repeated.
Example: The teacher praises the student.
The Four Pillars of Operant Conditioning
Positive Reinforcement
Adding something desirable to increase a behaviour.
Example 1: A child cleans their room and their parent gives them praise. The likelihood of them cleaning again increases.
Example 2: An employee exceeds their sales target and receives a bonus. They are motivated to continue performing at a high level.
Negative Reinforcement
Removing something undesirable to increase a behaviour.
Example 1: You fasten your seatbelt to stop the car's annoying beeping sound. You are more likely to buckle up to avoid the sound in the future.
Example 2: A student studies hard for an exam to avoid the anxiety of potentially failing. The removal of anxiety reinforces the studying behaviour.
Positive Punishment
Adding something undesirable to decrease a behaviour.
Example 1: A driver exceeds the speed limit and is given a speeding ticket. This addition of a fine decreases future speeding.
Example 2: A child touches a hot stove and feels pain. The painful sensation makes them less likely to touch the stove again.
Negative Punishment
Removing something desirable to decrease a behaviour.
Example 1: Two siblings fight over a toy, so a parent takes the toy away. The removal of the desired item decreases future fighting.
Example 2: A teenager misses their curfew, so their parents take away their phone privileges for a week. They are less likely to miss curfew again.
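The four categories above come from two yes/no questions: is something added or removed, and does the behaviour increase or decrease? The short Python sketch below encodes that two-by-two logic. The names `Stimulus`, `Effect`, and `classify` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Stimulus(Enum):
    """Illustrative: what happens to the stimulus after the behaviour."""
    ADDED = "added"
    REMOVED = "removed"


class Effect(Enum):
    """Illustrative: what happens to the future rate of the behaviour."""
    INCREASES = "increases"
    DECREASES = "decreases"


def classify(stimulus: Stimulus, effect: Effect) -> str:
    """Name the operant process for a given consequence."""
    # "Positive" means something is added; "negative" means something is removed.
    kind = "Positive" if stimulus is Stimulus.ADDED else "Negative"
    # Reinforcement increases a behaviour; punishment decreases it.
    process = "Reinforcement" if effect is Effect.INCREASES else "Punishment"
    return f"{kind} {process}"


# The seatbelt example: the beeping is removed and buckling up increases.
print(classify(Stimulus.REMOVED, Effect.INCREASES))
```

Note that "positive" and "negative" here describe adding or removing a stimulus, not whether the outcome is good or bad.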
Operant Conditioning in Behaviour Support
Behaviour Support Practitioners use operant conditioning within a Positive Behaviour Support (PBS) framework. The primary goal is not just to reduce behaviours of concern, but to teach new, functional skills that help individuals get their needs met in more appropriate ways. The focus is overwhelmingly on reinforcement-based strategies.
Key Technique: Differential Reinforcement
This involves reinforcing one behaviour while withholding reinforcement for another. It's a powerful way to teach replacement skills.
Differential Reinforcement of Alternative Behaviour (DRA)
Reinforcing a more appropriate, alternative behaviour.
Example: A child who screams to get a snack is taught to use a picture card to request it. When they use the card, they get the snack (reinforcement). Screaming is ignored (extinction).
Differential Reinforcement of Other Behaviour (DRO)
Reinforcing the *absence* of a behaviour for a set time.
Example: A person who engages in disruptive vocalisations receives praise at the end of every 5-minute period in which no vocalisations occurred, encouraging calm behaviour.
Differential Reinforcement of Incompatible Behaviour (DRI)
Reinforcing a behaviour that is physically impossible to perform at the same time as the target behaviour.
Example: A child who bites their nails is given a fidget toy and reinforced for using it, as they cannot bite their nails and use the toy simultaneously.
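The timing logic of DRO can be sketched in a few lines of Python. This is a minimal illustration, not a clinical tool. It assumes a "resetting" variant in which the timer restarts whenever the target behaviour occurs; the text above does not specify this detail, and the function name `dro_reinforcement_times` is hypothetical.

```python
from typing import List


def dro_reinforcement_times(
    behaviour_times: List[float],
    session_length: float,
    interval: float = 5.0,
) -> List[float]:
    """Return the times (in minutes) at which reinforcement is delivered.

    Assumption: a resetting DRO. Reinforcement is delivered whenever a full
    interval passes with no target behaviour; any occurrence of the
    behaviour restarts the interval from that moment.
    """
    deliveries: List[float] = []
    events = sorted(behaviour_times)
    i = 0
    interval_start = 0.0
    while interval_start + interval <= session_length:
        interval_end = interval_start + interval
        # Skip behaviours that happened before the current interval began.
        while i < len(events) and events[i] < interval_start:
            i += 1
        if i < len(events) and events[i] < interval_end:
            # The behaviour occurred: no reinforcement, restart the timer.
            interval_start = events[i]
            i += 1
        else:
            # A full interval without the behaviour: deliver reinforcement.
            deliveries.append(interval_end)
            interval_start = interval_end
    return deliveries


# One vocalisation at minute 7 of a 20-minute session, 5-minute intervals.
print(dro_reinforcement_times([7], session_length=20))
```

In this sketch the vocalisation at minute 7 delays the next praise until minute 12, five minutes after the behaviour last occurred.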
Common Application: Token Economies
Token economies are structured systems where individuals earn tokens for specific target behaviours. These tokens can then be exchanged for a choice of meaningful rewards ("backup reinforcers").
Example for a Person with a Disability:
An adult with an intellectual disability is learning independent living skills. They earn a token for each completed task on their chart (e.g., making their bed, preparing breakfast, packing their bag for their day program). After earning 10 tokens, they can choose to spend 30 minutes on their favourite hobby, like painting or listening to music.
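The bookkeeping behind a token chart like this one can be sketched in Python. The class name `TokenChart` and its methods are hypothetical, and the 10-token exchange cost is taken from the example above; a real programme would set these values with the person and their support team.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TokenChart:
    """Illustrative token chart: one token per completed task."""
    exchange_cost: int = 10  # tokens needed for a backup reinforcer
    tokens: int = 0
    completed: List[str] = field(default_factory=list)

    def complete_task(self, task: str) -> None:
        """Record a completed task and award one token."""
        self.completed.append(task)
        self.tokens += 1

    def can_exchange(self) -> bool:
        """True once enough tokens have been earned."""
        return self.tokens >= self.exchange_cost

    def exchange(self, reward: str) -> str:
        """Spend tokens on a chosen reward, or raise if too few are held."""
        if not self.can_exchange():
            raise ValueError(
                f"Need {self.exchange_cost} tokens, have {self.tokens}"
            )
        self.tokens -= self.exchange_cost
        return reward


chart = TokenChart()
for day in range(1, 6):
    chart.complete_task(f"day {day}: made bed")
    chart.complete_task(f"day {day}: prepared breakfast")

if chart.can_exchange():
    print("Reward chosen:", chart.exchange("30 minutes of painting"))
```

Tokens are only ever added for completed tasks in this sketch; nothing removes tokens as a penalty, which matches the reinforcement-based focus described above.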
Real-World Application: Supporting People with Disabilities
Behaviour Support Practitioners apply these principles in a highly structured and ethical way to enhance the quality of life for people with disabilities. The focus is always on positive reinforcement to teach new skills that serve the same purpose as a behaviour of concern.
Example: Fostering Independence in a Person with an Intellectual Disability
Scenario: An adult named John wants to do his own grocery shopping but gets overwhelmed by the noise and crowds. The noisy, crowded store (the antecedent) triggers anxiety, which leads to him leaving the store (the behaviour). The relief he feels from escaping the stressful situation negatively reinforces the behaviour of leaving, making it more likely to happen again.
Intervention: A practitioner breaks the shopping trip into small, manageable steps (e.g., getting a trolley, finding three items, going to the checkout).
Reinforcement: For each completed step, John receives immediate praise and a token. After collecting five tokens, he can buy a favourite magazine. This powerful positive reinforcement for staying and completing the task begins to outweigh the negative reinforcement of escaping. Over time, John gains the confidence and skills to shop independently.
Example: Developing Communication Skills in a Child with Autism
Scenario: Mia, a non-verbal child, hits her arm when she wants a toy she can't reach. Seeing the toy is the antecedent, and hitting is the behaviour. When a caregiver gives her the toy to calm her down, it inadvertently provides positive reinforcement for hitting.
Intervention (DRA): A practitioner first identifies that the *function* of the behaviour is to request something. They then teach Mia an alternative, more appropriate behaviour: tapping a picture of the toy on a communication board.
Reinforcement: When Mia taps the picture, she is immediately given the toy and praised enthusiastically (strong positive reinforcement). When she hits her arm, the behaviour is ignored (as far as is safe and possible), and the toy is not given. Mia quickly learns that tapping the picture is a faster and more effective way to communicate her needs.
Schedules of Reinforcement
Reinforcement doesn't have to happen every time. The pattern, or 'schedule', drastically affects learning speed and how resistant a behaviour is to extinction.
Schedule | Description | Example |
---|---|---|
Fixed-Ratio | Reinforcement after a specific number of responses. | A coffee shop gives a free drink after 10 purchases. |
Variable-Ratio | Reinforcement after an unpredictable number of responses. Highly motivating. | Playing a slot machine; a win could happen at any time. |
Fixed-Interval | Reinforcement for the first response after a specific time has passed. | Getting a weekly paycheck on a set day. |
Variable-Interval | Reinforcement for the first response after an unpredictable amount of time. | Your boss checking on your work progress at random times. |
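
The four schedules in the table can be expressed as small decision rules. The Python sketch below is one possible way to model them, under stated assumptions: the class names are hypothetical, time is a simulated clock passed in as a number, and the variable schedules draw their requirements from a uniform distribution around the chosen mean (real procedures may use other distributions).

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""

    def __init__(self, n: int):
        self.n = n
        self.count = 0

    def respond(self) -> bool:
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses averaging mean_n."""

    def __init__(self, mean_n: int, rng: random.Random):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self) -> int:
        # Assumption: uniform on 1 .. 2*mean_n - 1, so the average is mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self) -> bool:
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made after `interval` time units."""

    def __init__(self, interval: float):
        self.interval = interval
        self.available_at = interval

    def respond(self, now: float) -> bool:
        if now >= self.available_at:
            self.available_at = now + self.interval
            return True
        return False


class VariableInterval:
    """Reinforce the first response after an unpredictable wait."""

    def __init__(self, mean_interval: float, rng: random.Random):
        self.mean_interval = mean_interval
        self.rng = rng
        self.available_at = self._draw()

    def _draw(self) -> float:
        # Assumption: uniform on 0 .. 2*mean_interval, averaging mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now: float) -> bool:
        if now >= self.available_at:
            self.available_at = now + self._draw()
            return True
        return False


# The coffee-shop card: every 10th purchase earns a free drink.
card = FixedRatio(10)
free_drinks = sum(card.respond() for _ in range(30))
print(free_drinks)
```

On the ratio schedules, responding faster earns reinforcement sooner. On the interval schedules, only the first response after the wait is reinforced, so extra responses during the wait earn nothing.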