Psychologists define learning as a long-lasting change in behaviour as a result of experience. Classical and operant conditioning both lead to learning. What's the difference between them?

Classical conditioning was first described by Ivan Pavlov, and is the association of a stimulus with an involuntary response. It focuses on involuntary, automatic behaviours. Pavlov noticed that when a neutral stimulus repeatedly comes just before a reflex, the two become associated. He conducted an experiment in which he rang a bell before presenting dogs with food. When dogs see or smell food, they salivate. No one taught them to do this: it is an unconditioned response to an unconditioned stimulus. Of course, that is not how dogs would normally respond to hearing a bell ring; the bell is a neutral stimulus. However, Pavlov found that if he always rang a bell before presenting dogs with food, they eventually began to salivate as soon as they heard the bell, even when there was no food around. The bell had become a conditioned stimulus, and the dogs salivated as a conditioned response.

Operant conditioning, first described by B.F. Skinner, is the association of a voluntary behaviour with a consequence. Skinner found three types of environmental responses, or operants, that can follow a behaviour: reinforcers, punishers, and neutral operants. Reinforcers increase the probability of a behaviour recurring, punishers decrease that probability (weakening the behaviour), and neutral operants do neither.

Skinner put a rat in a box with a lever. On accidentally bumping the lever, the rat discovered that it would receive a food pellet. With this positive reinforcement, the rat learned to keep pressing the lever.

Negative reinforcers remove unpleasant stimuli. Skinner put a rat in a box which had a mild electric current that caused the rat discomfort. While wandering around the box, the rat accidentally hit a lever that turned the current off.
The rat learned to always press the lever once inside the box, something called escape learning. Skinner eventually also taught the rat to avoid the current altogether: a light came on just before the current, and the rat learned to press the lever as soon as it saw the light, preventing the current from being switched on in the first place. This is called avoidance learning.

Punishment weakens a behaviour by providing an aversive consequence. Just like reinforcement, it can occur through the addition or removal of a stimulus. For example, if a rat receives an electric shock when it pushes a button, it will avoid that button. Or, if you are an unfortunate raccoon that decides to wash its cotton candy before eating it, only to watch it dissolve before your very eyes, that is punishment through the removal of a positive stimulus. It must be noted that punished behaviour is not forgotten but suppressed: if the punishment is no longer present, the behaviour returns. Also, unlike reinforcement, punishment does not guide towards the desired behaviour; it only suppresses undesired behaviour.

Further experiments have been done with rats in the Skinner box. After a rat has been operantly conditioned and has learned to press a lever to receive a food pellet, what happens if the lever is pressed but no food pellet arrives? At first, the rat will keep pressing the lever, but eventually it will stop, and the behaviour is extinguished. Why press this thing without payment?

However, a rat can learn or unlearn a behaviour at different rates under different schedules of reinforcement. These rates are called the response rate (the rate at which the behaviour is repeated) and the extinction rate (how soon the behaviour stops once reinforcement ends). Let's see what happens with five reinforcement schedules. With continuous reinforcement, where every occurrence of the behaviour is reinforced, the response rate is slow and extinction is fast. With fixed ratio reinforcement, where positive reinforcement is offered after a set number of repetitions of a behaviour, the response rate is fast and extinction is medium.
For fixed interval reinforcement, where reinforcement is given after a fixed period of time so long as a quota of the behaviour has been fulfilled during that period, the response and extinction rates are medium. For variable ratio reinforcement, where the behaviour is reinforced after an unpredictable number of repetitions, the response rate is fast and extinction is slow. This is the equivalent of gambling. For variable interval reinforcement, where a reward is given if the quota is fulfilled within an unpredictable length of time, the response rate is again fast and extinction is slow.
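
To make the five schedules concrete, here is a rough Python sketch of when a food pellet would be delivered under each one. It only illustrates the delivery rules; it does not model the rat, so it says nothing about response or extinction rates. All function names and numbers are made up for illustration. The two interval schedules are modelled in the common textbook way (the first press made once the waiting time has passed is rewarded), which is an assumption about what "quota" means here, and the random ranges used for the variable schedules are likewise just one possible choice.

```python
import random

def continuous():
    """Continuous reinforcement: every press is rewarded."""
    return lambda t: True

def fixed_ratio(n):
    """Fixed ratio: every n-th press is rewarded."""
    presses = 0
    def on_press(t):
        nonlocal presses
        presses += 1
        if presses == n:
            presses = 0
            return True
        return False
    return on_press

def variable_ratio(mean_n, rng):
    """Variable ratio: reward after an unpredictable number of presses.
    Assumption: the number is drawn uniformly from 1 .. 2*mean_n - 1."""
    presses = 0
    needed = rng.randint(1, 2 * mean_n - 1)
    def on_press(t):
        nonlocal presses, needed
        presses += 1
        if presses >= needed:
            presses = 0
            needed = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_press

def fixed_interval(period):
    """Fixed interval (assumed rule): reward the first press made once
    `period` seconds have passed since the last reward."""
    last_reward = 0
    def on_press(t):
        nonlocal last_reward
        if t - last_reward >= period:
            last_reward = t
            return True
        return False
    return on_press

def variable_interval(mean_period, rng):
    """Variable interval: like fixed_interval, but the waiting time is
    redrawn (uniformly from 0 .. 2*mean_period) after every reward."""
    last_reward = 0
    wait = rng.uniform(0, 2 * mean_period)
    def on_press(t):
        nonlocal last_reward, wait
        if t - last_reward >= wait:
            last_reward = t
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False
    return on_press

def rewarded_presses(schedule, press_times):
    """Return the press times at which the schedule delivers a pellet."""
    return [t for t in press_times if schedule(t)]

press_times = list(range(1, 13))  # one lever press per second for 12 seconds
rng = random.Random(0)            # seeded so that runs are repeatable

print(rewarded_presses(continuous(), press_times))       # every press
print(rewarded_presses(fixed_ratio(3), press_times))     # [3, 6, 9, 12]
print(rewarded_presses(fixed_interval(4), press_times))  # [4, 8, 12]
print(rewarded_presses(variable_ratio(3, rng), press_times))
print(rewarded_presses(variable_interval(4, rng), press_times))
```

With the fixed schedules the rewarded presses fall in a regular pattern, while with the variable schedules the rat (or gambler) cannot tell which press will pay off.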