Ahhh, that elusive skill! Shaping by successive approximation, aka “shaping” or “free shaping”, is really just training in my book. I don’t follow any rules about not luring or prompting, I use targets and props, and I help my learner as much as possible. I believe strongly that purely shaping a behavior with zero trainer interaction or “help” benefits only the trainer, by improving the trainer’s eye and precision. The learner does not benefit here; in fact, the learner is often frustrated and agitated when we do this. So when training Watson, I like to do these things to be certain he has a high success rate and little to no frustration:
Make the C the A
Behavior occurs in this order: antecedent (stuff that happens before, including the environment and any other prompts), behavior (the agenda behavior we are after), consequence (in our case, cookies, or positive reinforcement). I want to set up my “antecedents”, or my scenario/environment, in a way that makes the desired behavior highly likely, and I want to use a consequence (cookies) that actually feeds back into the antecedent, recreating the behavior loop all over again so I can reinforce.
Seem confusing as all hell? Sit tight, I’ll make it clear.
Here, my agenda behavior is all four feet on the purple platform. The ideal antecedent is for Watson to be opposite me, facing me, with the platform in between us, like this:
So I need to make the C become the A again, meaning I need to deliver my treat in such a way that we see that picture again at the start of each loop. Easy! I just throw my treat back behind him so that after eating it, I see that above picture again. Here is a video:
Select Approximations from Existing Behavior
Shaping by successive approximation means that we are selecting tiny slices of behavior to build a final behavior or behavior sequence. In layman’s terms, we are reinforcing pieces of the final thing we are after, until we get there, rather than trying to get the final behavior right off the bat. The way many of us learned to shape relies heavily on extinction bursts: the increase in behavior that comes from the frustration of not getting expected reinforcement. Again, layman’s terms: you wave your hand under a public restroom faucet and nothing happens. You do it again. You wave your hands faster. You change how you wave them. You wave them in different places in the sink. That increase in new behaviors is an extinction burst, and we all know that it does not feel good. What happens next is the faucet either turns on, or you go to a new sink. Inducing that burst of frustration in our learners in order to teach them isn’t fair, and it isn’t how I teach.
Here, I am shaping a pivot. Notice how often I am clicking. In the beginning any motion toward the trainer/training area is clickable. When he drops his head to look for food, I click when his head comes back up. Later, I don’t click until he returns to the platform. That’s about selecting approximations from current behavior, rather than expecting to just get the behavior right off the bat. Here is the video: