When psychologists talk about learning, they are referring to a relatively permanent change in knowledge or behavior that comes about as a result of experience. Experience is necessary for us to speak, read, write, add and subtract, ride a bicycle, or know how to charm a romantic partner. Regardless of your specific area of study, every field incorporates the concept of learning. Often what we learn makes us happier, healthier, and more successful; sometimes it does not. The beauty of adaptation by learning is that it is flexible. This means that each of us can learn to behave in ways that benefit rather than harm ourselves and others.
The question is: how does this learning take place? We will focus on four types of learning – habituation, classical conditioning, operant conditioning, and observational learning. What is common among all these types of learning is that they work under the principle of learning by association. The simplest form of learning is habituation – a tendency to become familiar with a stimulus merely as a result of repeated exposure. The first time it happens, a sudden loud noise or a blast of cold air has a startling effect on us and triggers an ‘orienting reflex’.
Among humans, the eyes widen, the eyebrows rise, muscles tighten, the heart beats faster, and brain-wave patterns indicate a heightened level of physiological arousal. On the second and third exposures to the stimulus, the effect is weakened. Then as we become acclimated or ‘habituated’ to the stimulus, the novelty wears off, the startle reaction disappears, and boredom sets in. Habituation is a primitive form of learning and is found among mammals, birds, fish, insects, and all other organisms. For example, sea snails reflexively withdraw their gills at the slightest touch. Then after repeated tactile stimulation, the response disappears.
Animals may also habituate to objects that naturally evoke fear after repeated and harmless exposures. When lab rats were presented with a cat collar smeared with a cat’s odor, they ran from it and hid. However, after several presentations the rats hid for decreasing amounts of time, eventually resembling control group rats exposed to an odorless collar. If you think about everyday life, numerous examples of habituation come to mind. People who move from a large city to the country or from a region of the world that is hot to one that is cold often need time to adjust to the sudden change in stimulation.
Once they do, the new environment seems less noisy or quiet, less hot or cold. Habituation also has important implications for the power of rewards to motivate us. Regardless of whether the rewarding stimulus is food, water, or money, it tends to lose impact, at least temporarily, with repeated use. Thus, you have to keep increasing the reward for its continued success (unless there is a pause in the rewarding cycle, which can ‘reset’ its value). In habituation, an organism learns from exposure that a certain stimulus is familiar.
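To make the pattern concrete, here is a toy sketch of habituation. The numbers and the decay rate are invented purely for illustration – no study in the text reports these values – but the shape is the point: each repeated exposure produces a smaller startle response than the one before.

```python
# Toy model of habituation: response magnitude shrinks by a fixed
# fraction with each repeated exposure to the same stimulus.
# (Illustrative numbers only; a pause in exposure would 'reset' the value.)

def startle(exposures, decay=0.5, initial=100.0):
    """Response magnitude after each of `exposures` presentations."""
    return [round(initial * decay**i, 1) for i in range(exposures)]

print(startle(5))  # [100.0, 50.0, 25.0, 12.5, 6.2]
```

The first presentation gets the full orienting reflex; by the fifth, the response has all but disappeared – the novelty has worn off.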
Over the years, however, psychologists have focused more on the ways in which we learn relationships between events. Thus, we’ll move to the latter three learning processes. Following Aristotle, modern philosophers and psychologists have long believed that the key to learning is association, a tendency to connect events that occur together in space and time. Enter Ivan Pavlov, a Russian physiologist (not psychologist!). After receiving his medical degree in 1882, he spent twenty years studying the digestive system and won a Nobel Prize for that research in 1904. Pavlov was the complete, dedicated scientist.
Rumor has it that he once reprimanded a lab assistant who was ten minutes late for an experiment because of street riots stemming from the Russian Revolution – saying ‘Next time there’s a revolution, get up earlier!’ Ironically, Pavlov’s most important contribution was the result of an incidental discovery. In studying the digestive system, he strapped dogs in a harness, placed different types of food in their mouths, and measured the flow of saliva through a tube surgically inserted in their cheek. But there was a ‘problem’: After repeated sessions, the dogs would begin to salivate before the food was actually put in their mouths.
In fact, they would drool at the mere sight of food, the dish it was placed in, the assistant who brought it, or even the sound of the assistant’s approaching footsteps. Pavlov saw these ‘psychic secretions’ as a nuisance, so he tried to eliminate the problem by sneaking up on the dogs without warning. He soon realized, however, that he had stumbled on a very basic form of learning, what we call classical conditioning, and Pavlov devoted the rest of his life to studying it. To examine classical conditioning systematically, Pavlov needed to control the delivery of food, often a dry meat powder (yum!), as well as the events that preceded it. The animals did not have to be trained or ‘conditioned’ to salivate. The salivary reflex is an innate unconditioned response that is naturally set off by food in the mouth, an unconditioned stimulus. There are numerous unconditioned stimulus-response connections. Tap your knee in the right spot with a rubber mallet and your leg will jerk. Blow a puff of air in your eye and you’ll blink. In each case, the stimulus automatically elicits the response. No experience is necessary.
Using the salivary reflex as a starting point, Pavlov sought to determine whether dogs could be trained by association to respond to a ‘neutral’ stimulus – one that does not naturally elicit a response. To find out, he conducted an experiment in which he repeatedly rang a bell before placing food in the dog’s mouth. Bell, food. Bell, food. After a series of these paired events, the dog started to salivate to the sound alone. Because the bell, which was initially a neutral stimulus, came to elicit the response through its association with food, it became a conditioned stimulus, and salivation became a conditioned response.
With this experiment as a model, Pavlov and others trained dogs to salivate in response to buzzers, ticking metronomes, tuning forks, odors, lights, colored objects, and a touch on the leg. Because of the decades of research and the multitude of psychologists who became involved in research on classical conditioning, we now know about four basic principles of classical conditioning. Classical conditioning seldom springs full blown after a single pairing of the CS and the UCS. Usually, it takes some number of paired trials for the initial learning, or acquisition, of a CR.
In Pavlov’s experiments, the dogs did not salivate the first time they heard the bell. However, the CR increases rapidly over the first few pairings, until the ‘learning curve’ peaks and levels off. The acquisition of a classically conditioned response is influenced by various factors. The most critical are the order and timing of the presentation. In general, conditioning is quicker when the CS (the bell) precedes the onset of the UCS (the food) – a procedure called forward conditioning. Ideally, the CS should precede the UCS by about half a second and the two should overlap somewhat in time.
When the onset of the UCS is delayed, conditioning takes longer and the conditioned response is weaker. And that’s an important point – that the learning of a CR can be stronger or weaker and can take more or less time. In the acquisition phase of classical conditioning, a CR comes to be elicited by a neutral stimulus that is paired with a UCS. But what happens to the CR when the UCS is removed? Would a dog continue to salivate to a bell if the bell is no longer followed by food? The answer is no. If the CS is presented often enough without the UCS, it eventually loses its response-eliciting power.
This apparent reversal of learning is called extinction. After an animal is conditioned to respond to a particular CS, other stimuli will often evoke the same response. In Pavlov’s experiments, the dogs salivated not only to the original tone but also to other tones that were similar but not identical to the CS. Other researchers have made the same observation. What they have learned is that the more similar the stimulus to the CS, the more likely it is to evoke a conditioned response. This tendency to respond to stimuli other than the original CS is called stimulus generalization.
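The acquisition and extinction curves described above can be mimicked with a very simple toy model (my own illustrative simplification, not Pavlov’s data): on each paired trial, the strength of the CR moves a fixed fraction of the way toward its maximum; on each CS-alone trial, it decays back toward zero.

```python
# Toy "learning curve" model: each paired trial moves CR strength a fixed
# fraction toward a maximum; each unpaired (CS-alone) trial moves it back
# toward zero. Rates and values are invented for illustration.

def run_trials(strength, target, rate, n_trials):
    """Move CR strength toward `target` by `rate` on each trial."""
    history = []
    for _ in range(n_trials):
        strength += rate * (target - strength)
        history.append(round(strength, 3))
    return strength, history

# Acquisition: CS + UCS pairings (strength climbs toward 1.0).
s, acq = run_trials(strength=0.0, target=1.0, rate=0.3, n_trials=10)

# Extinction: CS alone (strength decays back toward 0.0).
s, ext = run_trials(strength=s, target=0.0, rate=0.3, n_trials=10)

print(acq)  # rises rapidly over the first few pairings, then levels off
print(ext)  # fades once the UCS is withheld
```

Notice that the curve is steepest at the start of acquisition and flattens as it nears its peak – exactly the ‘learning curve’ shape described above.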
Stimulus generalization can be useful because it enables us to apply what we learn to new, similar situations. But there are drawbacks. As illustrated by the child who is terrified of all animals because of one bad encounter with a barking dog, or the racist who assumes that ‘they’ are all alike, generalization is not always adaptive. Sometimes we need to distinguish between objects that are similar – a process of discrimination. Again, Pavlov was the first to demonstrate this process. He conditioned a dog to salivate in the presence of a black square (a CS) and then noticed that the response generalized to gray-colored squares.
Next, he conducted a series of conditioning trials in which the black square was followed by food while the gray one was not. The result: The dog continued to salivate only to the original CS. In a similar manner, the dog eventually learned to discriminate between the color black and darker shades of gray. Psychologists actually took little notice of Pavlov’s work until 1914, when John Watson described Pavlov’s work to a group of American psychologists. To demonstrate the relevance of the phenomenon to humans, Watson and his assistant conditioned an eleven-month-old boy named Albert to fear a white laboratory rat. ‘Little Albert’ (as he has come to be known) was a normal, healthy, well-developed infant. Like others his age, he was scared by loud noises but enjoyed playing with furry little animals. Enter John Watson. In a procedure modeled after Pavlov’s research, Watson presented Albert with a harmless white rat. Then just as the boy reached for the animal, Watson made a loud, crashing sound by banging a steel bar with a hammer, which caused the startled boy to jump and fall forward, burying his head in the mattress he was lying on. After seven repetitions of this event, the boy was terrified of the animal.
What’s worse, his fear generalized, leading him to burst into tears at the sight of a rabbit, a dog, a Santa Claus mask, and even a white fur coat. Thus, even with just a few trials, phobias can develop, which can spread to fears of similar items. I should point out that no ethics committee would ever allow a psychologist to engage in this kind of behavior today. We have much stricter rules about harming research participants now than we ever have. It is important to note that people acquire taste aversions, too – often with important practical implications.
Consider, for example, an unfortunate side effect of chemotherapy treatments for cancer patients. These drugs tend to cause nausea and vomiting. As a result, patients often become conditioned to react with disgust and a loss of appetite to food they had eaten hours before treatment. Thankfully, the principles of classical conditioning offer a solution to this problem. When cancer patients are fed a distinctive maple-flavored ice cream before each treatment, they acquire a taste aversion to that ice cream – which becomes a ‘scapegoat’ and protects other foods in the patient’s diet.
Still, many cancer patients who have undergone chemotherapy and survived report that they continue to feel nauseated, and sometimes vomit, in response to the sights, smells, and tastes that remind them of treatment – as much as twenty years later. Some fears are evolutionarily healthy for us – they keep us alive. Researchers have shown that conditioned fears of things like snakes, spiders, and bad-tasting substances are easier to teach to subjects, and tend to remain learned for longer periods of time, than do responses to neutral stimuli (and remember, these subjects did not initially have a fear of these stimuli).
It is healthy for us to be afraid of snakes, spiders, and bad-tasting substances as they may be indicative of danger – from a venomous bite or poison. It has also become clear to researchers that our social behaviors are susceptible to influence through classical conditioning. Think about JAWS…how do you know when the shark is coming? The music…that’s a cue that is widely used in TV and movies to manipulate the audience’s emotional state. Or think about the interlocking red and yellow circles used by Mastercard…they have now come to represent that company to such a degree that they no longer need to state who they are.
But more importantly, they have come to be seen as visual cues that lead us to spend money. In one study, college students who were asked to estimate how much money they would be willing to spend on various consumer products gave higher estimates when there was a credit card lying on the table in the testing room than when there was not. In a second study, which was conducted in a restaurant, diners were randomly given tip trays for payment that were either blank or had a major credit card logo on them. With 66 cash-paying customers in the sample, the credit-card tray elicited an increase in tipping from about 15 percent of the bill to 20.2 percent.

Researchers have also wondered about the conditioning effects of chemotherapy – which inhibits the growth of cancerous cells but also suppresses the immune system. With chemotherapy drugs always being given in the same room in the same hospital, they wondered whether it was possible, over time, for a patient’s immune system to be conditioned to react in advance to cues in the surrounding environment. What they found, in a study of women undergoing several chemotherapy treatments for ovarian cancer, is that the women’s immune systems were weakened as soon as they entered the hospital, before they were even treated.
Like Pavlov’s bell, the hospital setting had become a conditioned stimulus, thus triggering a maladaptive change in cellular activity. So, if the immune system can be weakened by conditioning, can it similarly be strengthened? Preliminary research on animals is showing some success. In human studies, too, researchers have found that after repeatedly pairing sweet sherbet and other neutral stimuli with shots of adrenaline (which has the unconditioned effect of increasing activity in certain types of immune cells), the sherbet flavor alone later triggered an increase in the immune response.
So classical conditioning may explain why people salivate at the smell of food, cringe at the sound of a dentist’s drill, or tremble at the sight of a flashing blue light in the rearview mirror. But it can’t explain how trainers at Sea World teach killer whales to jump through hoops, nor how we learn to make people laugh or behave in ways that earn love, praise, sympathy, or respect from others. This is all done through the process of operant conditioning. Operant conditioning functions on the principle that we associate a response with its consequence, and whether we repeat or avoid that behavior depends on whether the consequence is positive or negative.
If it gets us something we like (or removes something we don’t like), then we do it again. If it gets us something we don’t like (or removes something we do like), then we won’t do it again. This is the law of effect – that we will repeat those behaviors that are followed by positive outcomes and will quit those behaviors that are followed by negative outcomes. Now the main individual responsible for what we know about operant conditioning is B. F. Skinner…who we have mentioned several times now.
To study learning systematically, Skinner knew that he had to design an environment in which he controlled the organism’s response-outcome contingencies. So as a graduate student in 1930, he used an old ice chest to build a soundproof chamber equipped with a stimulus light, a response bar (for rats) or pecking key (for pigeons), a device that delivers dry food pellets or water, metal floor grids for delivery of electric shock, and an instrument outside the chamber that automatically records and tabulates the responses. This became known as the Skinner box. Next, Skinner introduced a new vocabulary.
To distinguish between active types of learning and Pavlov’s classical conditioning (which is more passive), Skinner coined the term operant conditioning. He also talked about different types of behavioral/response contingencies, which we’ll go into more detail on in a minute. They are: positive reinforcement, negative reinforcement, extinction, and punishment. To avoid speculating about an organism’s internal state, Skinner also used the term reinforcement instead of reward or satisfaction. Objectively defined, a reinforcer is any stimulus that increases the likelihood of a prior response.
A positive reinforcer strengthens a prior response through the presentation of a positive stimulus. In the Skinner box, the food that follows a bar press is a positive reinforcer. Even mild electrical stimulation to certain ‘pleasure centers’ of the brain, which releases the chemical neurotransmitter dopamine, has a satisfying effect and serves as a positive reinforcer. In contrast, a negative reinforcer strengthens a response through the removal of an aversive stimulus. In a Skinner box, the termination of a painful electric shock is a negative reinforcer.
Similarly, we learn to take aspirin to soften a headache, fasten our seatbelts to turn off the seatbelt dinger, and rock babies to sleep to stop them from crying. It’s important to keep straight the fact that positive and negative reinforcers both have the same effect: to strengthen a prior response. Skinner was quick to point out that punishment is not a form of negative reinforcement. Although the two are often confused, punishment has the opposite effect: It decreases, not increases, the likelihood of a prior response. There are two types of punishment.
A positive punisher weakens a response through the presentation of an aversive stimulus. Shocking a lab rat for pressing the response lever, scolding a child, locking a criminal behind bars, and boycotting a product all illustrate this form of punishment designed to weaken specific behaviors. In contrast, a negative punisher weakens behavior through the removal of a stimulus typically characterized as positive. Taking food away from a hungry rat and grounding a teenager by suspending driving privileges are two examples. Okay, so far everything seems straightforward.
But can you think of a problem – a potential flaw? We know that responses that produce reinforcement are repeated…what’s the weakness in this? Before the first food pellet, how does the animal come to press the bar? One possibility is that the response occurs naturally as the animal explores the cage. Skinner pointed to a second possibility: that the behavior is gradually shaped, or guided, by the reinforcement of responses that come closer and closer to the desired behavior. Imagine that you are trying to get a hungry rat to press the bar in a Skinner box. Where do you begin?
The rat has never been in this situation before, so it sniffs around, pokes its nose through the air holes, grooms itself, rears on its hind legs, and so on. At this point, you can wait for the target behavior to appear on its own, or you can speed up the process. If the rat turns toward the bar, you drop a food pellet into the cage. Reinforcement. If it steps toward the bar, you deliver another pellet. Reinforcement. If the rat moves closer or touches the bar, you deliver yet another pellet. Once the rat is hovering near the bar and pawing it, you withhold the next pellet until it presses it down, which triggers the feeder.
Before long, your subject is pressing the bar at a rapid pace. By reinforcing ‘successive approximations’ of the target response, you will have shaped a whole new behavior.

In classical conditioning, repeated presentation of the CS without the UCS causes the CR to gradually weaken and disappear. Extinction also occurs in operant conditioning. If you return your newly shaped rat to the Skinner box but disconnect the feeder from the response bar, you’ll find that after the rat presses the bar some number of times without reinforcement, the behavior will fade and become extinguished.
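The shaping procedure can be sketched as a simple rule (my own toy illustration, not Skinner’s actual apparatus): reinforce any response that comes closer to the target behavior than anything the animal has done before.

```python
# Toy sketch of shaping by successive approximation: a pellet is delivered
# whenever a response is closer to the target behavior than the best
# response seen so far. Responses are scored as distances from the target
# (0 = a full bar press); all numbers are invented for illustration.

def shape(responses, target):
    """Return the indices of responses that would earn a food pellet."""
    reinforced = []
    best = float("inf")
    for i, r in enumerate(responses):
        distance = abs(target - r)
        if distance < best:       # a new 'closest yet' approximation
            best = distance
            reinforced.append(i)  # deliver the pellet
    return reinforced

session = [9, 7, 8, 4, 6, 2, 5, 1, 0, 0]
print(shape(session, target=0))  # [0, 1, 3, 5, 7, 8]
```

Early on, almost any movement toward the bar earns a pellet; later, only responses nearer the full press do – and once the target is reached, only the press itself is reinforced.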
So you’ll stop putting money into a broken vending machine that doesn’t give you your Coke. Now, if a break in the action follows extinction, the old behavior will reappear…if you don’t use that vending machine for a month, you are likely to try again because of prior learning when the machine did work and reinforced your putting money into it. Some other principles of operant conditioning include: primary versus secondary reinforcers, discrimination, and chaining. Primary reinforcers are innately reinforcing stimuli, such as those that satisfy a biological need. Things like food, water, or relief from pain are primary reinforcers.
Secondary reinforcers (your book calls them conditioned reinforcers) are stimuli that gain their reinforcing power through their association with a primary reinforcer. They have power not in and of themselves but because they can get you a primary reinforcer. These are things like attention, praise, money, and good grades. They don’t satisfy a physical need, but because of them, you may be rewarded with primary reinforcers. Okay, so why don’t behaviors just keep occurring? Because reinforcements are often available in some situations but not others, it is adaptive to learn not only WHAT response to make, but also WHEN to make it.
If pecking a key produces food only when a green disk is lit, a pigeon may learn to discriminate and to respond on a selective basis. The green light is a discriminative stimulus that ‘sets the occasion’ for the behavior to be reinforced. Also, some reinforcers can lose their power over time to be rewarding, or the subject can become satiated on that particular reinforcer. Early in his research, Skinner would reinforce his animals on a continuous basis: Every bar press produced a food pellet. Then something happened.
At the time, Skinner had to make his own pellets by squeezing food paste through a pill machine and then waiting for them to dry. The process was time consuming. One Saturday, assessing how many he had left and how many he would need, he realized that unless he worked all day and all night, he would be out of pellets by Monday morning. Not wanting to spend his entire Saturday making pellets, he decided that not EVERY response had to be reinforced. He adjusted his apparatus so that the bar-press response would be reinforced on a partial basis – only once per minute.
Upon his return the next week, however, he found rolls of graph paper with response patterns that were different from anything he had seen before. From this experience, Skinner came to appreciate the powerful effects of ‘partial reinforcement’. So we’ve been talking about continuous reinforcement. This method works quickly but is subject to rapid extinction once you end it, so you pretty much have to keep it up all the time…that is time- and money-consuming, and isn’t all that realistic. We go through life experiencing some form of partial reinforcement for most behaviors.
In the situation I just described, reinforcement follows the first response made after a fixed interval of time has elapsed. In a fixed-interval schedule, the response produces food after each new minute; or reinforcement may be made available only after every two, ten, or fifteen minutes. The schedule is fixed by time, and it tends to produce a slow, ‘scalloped’ response pattern. After the animal learns that a certain amount of time must elapse, it pauses after each reinforcer and then responds at an accelerating rate until it nears the end of the cycle – which signals that the next reinforcement is available.
Most of you are on a fixed-interval schedule for studying – behaviors start off slow then pick up right before each exam, slowing again after each one. Once an animal learns what the fixed pattern is, it presses the bar only as it nears the end of each interval. To counter this lazy response pattern, Skinner tried varying the interval around an average. In other words, an interval may average one minute in length, but the actual timing of a reinforcement is unpredictable from one interval to the next – say, after fifty seconds, then two minutes, ten seconds, and one minute.
The result is a slow but steady, not scalloped, pattern of responses. In effect, teachers who give pop quizzes are using a variable-interval schedule to ensure that their students keep up with the reading rather than cram at the last minute. In the fixed-ratio situation, a reinforcer is administered after a fixed number of responses – say, every third response, or every fifth, or tenth. In a fixed-ratio schedule, the response-to-reinforcement ratio remains constant. Frequently, you see bursts of presses until the reinforcer appears, a brief pause, and then another burst. The result is a fast, steplike response pattern.
Frequent-flier programs, where you can earn a free flight after 25,000 miles of air travel, and CD clubs that offer a free CD after every fifth purchase, all operate on a fixed-ratio schedule. With the variable-ratio schedule, the reinforcement appears after some average number of responses is made – a number that varies randomly from one reinforcement to the next. Again, there is an average, but the actual number of responses needed to trigger reinforcement varies (and sometimes widely). Unable to predict which response will produce a food pellet, animals on a VR schedule respond at a constant high rate.
In one case, Skinner trained pigeons to peck a disk 10,000 times for a single food pellet! Slot machines and lotteries are rigged to pay off on a VR schedule, leading gamblers to deposit coins and purchase tickets at a furious, addictive pace. If you have ever tried to call by phone or over the Internet for tickets to a concert or sports event only to receive a busy signal, chances are you too kept trying with dogged persistence. The reason: When it comes to getting through, our efforts are reinforced, as with slot machines, on a variable-ratio schedule. In general, partial reinforcement schedules strengthen later resistance to extinction.
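The ratio schedules, at least, are easy to express as rules that decide, response by response, whether a reinforcer is delivered. The sketch below is my own toy illustration (the quota for the variable schedule is simply drawn uniformly around the mean), not a model from any source in the text.

```python
# Toy sketches of fixed-ratio and variable-ratio reinforcement schedules:
# each schedule is a function that, called once per response, reports
# whether that response earns a reinforcer.
import random

def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean, rng=random.Random(0)):
    """Reinforce after an unpredictable number of responses, averaging `mean`."""
    quota = rng.randint(1, 2 * mean - 1)
    count = 0
    def respond():
        nonlocal count, quota
        count += 1
        if count >= quota:
            count = 0
            quota = rng.randint(1, 2 * mean - 1)  # next quota is unpredictable
            return True
        return False
    return respond

fr5 = fixed_ratio(5)
hits = [i for i in range(1, 21) if fr5()]
print(hits)  # [5, 10, 15, 20] – every 5th response pays off
```

On the fixed schedule the payoffs are perfectly predictable, which is what invites the post-reinforcement pause; on the variable schedule no single response can be ruled out as the winning one, which is exactly why slot machines keep people pulling the lever.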
Classical and operant conditioning are two ways in which organisms adapt and learn from experience. But something is missing…forget the dogs, cats, rats, and pigeons that drool, press bars, peck keys, jump from shocks, and run mazes. What about the human learning of complex behavior? Don’t we sometimes learn without direct experience? Think about the first time you danced, drove a car, or programmed a VCR. Now imagine how slow and inefficient you would have been if those skills had to be acquired from scratch or ‘shaped’ through trial and error. Complex new behaviors can also be learned by watching and imitating others.
This form of learning occurs not only among humans but also in many types of animals. Many animals show new generations how to engage in certain behaviors – usually in feeding situations. Human infants also exhibit rudimentary forms of imitation. Research shows that they copy adults who stick their tongue out, use a certain hand to point, reach, or wave, and they also imitate other children their age. Imitation is adaptive: By observing their peers and adults, young members of a species learn to interact and develop the skills of past generations.
According to Albert Bandura, people learn by watching others. These others are called models and the process is known as observational learning. In a classic experiment to demonstrate this point, nursery-school children were exposed to a live adult model who behaved aggressively. In a typical session, the child would be sitting quietly, drawing a picture. From another part of the room, an adult approached a Bobo doll (an inflatable clownlike toy that is weighted on the bottom so that it pops back up whenever it is knocked down).
For ten minutes, the adult repeatedly abused the doll – sitting on it, pounding it with a hammer, kicking it, throwing balls at it, and yelling ‘Sock him in the nose! Kick him!’ After the outburst, the child was taken to a room filled with attractive toys but told that these toys were being saved for ‘the other children’. Frustrated, the child was then taken to a third room containing additional toys, including – you guessed it – a Bobo doll. At that point, the child was left alone and observed through a one-way mirror. What happened?
Compared to children exposed to a nonviolent model or to no model at all, those who had witnessed the aggressive display were far more likely to assault the doll. In fact, they often copied the model’s attack, action for action, and repeated the same abusive remarks, word for word. The children had acquired a whole new repertoire of aggressive behavior. More recent research confirms the point: Among children and adolescents, exposure to aggressive models on TV and in the movies triggers aggression – not just in the laboratory but in the classroom, in the playground, and other settings.
Observational learning can also have beneficial effects. In one study, snake phobics gained the courage to approach a live snake by first watching someone else do so – which is why models are often used in the treatment of phobias. In a second study, bystanders were more likely to help a stranded motorist or donate money to charity, two acts of generosity, if they had earlier observed someone else do the same. According to Bandura, observational learning is not a simple, automatic, reflexlike reaction to models.
Rather, it consists of two stages: acquisition and performance. He felt that a newly acquired response often remains ‘latent’ until the organism is motivated to perform it. He also noted that there are four steps involved in observational learning: attention, retention, reproduction, and motivation. Attention: To learn by observation, one must pay attention to the model’s behavior and to the consequences of that behavior. Due to their ability to command our attention, parents, teachers, political leaders, and TV celebrities are potentially effective models.
Retention: In order to model someone else’s behavior minutes, days, weeks, months, or even years later, one must recall what was observed. Accordingly, modeling is likely to occur when the behavior is memorable or when the observer thinks about or rehearses the behavior. Reproduction: Attention and memory are necessary conditions, but observers must also have the motor ability to reproduce the modeled behavior. As closely as you may watch, and as hard as you try, it is unlikely that you will be able to copy Michael Jordan’s graceful flight to the basket.
Motivation: People may pay attention to a model, recall the behavior, and have the ability to reproduce it – all laying a necessary foundation for modeling. Whether an observer takes action, however, is determined by his or her expectations for reinforcement – expectations that are based not only on personal experience but also on the experiences of others. This last point is important because it illustrates ‘vicarious’ reinforcement: that people are more likely to imitate models who are rewarded for their behavior and less likely to imitate those who are punished. Apparently, learning can occur without direct, firsthand experience.