Hitting (or discussing) the Jackpot

Merriam-Webster (http://www.merriam-webster.com) defines the term “jackpot” as “an impressive often unexpected success or reward.”

I am going to define it for the purpose of this blog this way: a multiplied delivery of the reinforcer in use in a training session given in order to deliberately increase one response over another one. 

For instance, if I am teaching my dog to station on a platform and I prefer a down, but a sit will do, I might give one cookie for sits but a jackpot of several cookies in a row for downs. Alternatively, I could simply give one cookie for each approximation toward that station behavior and jackpot my dog once she has all four feet on the station. Trainers do this sort of thing all the time, but I am curious as to whether it is actually effective. Over a series of a few blog posts I am going to explore jackpots in deep nerdy detail. As with most subjects of any interest, the more I dug into this the more questions I had, which led to more answers and yet more questions. So bear with me, and let’s see about jackpots!

What could be wrong with a jackpot? 

Giving a dog bonus cookies might not be effective, but it can’t hurt, right? Or can it? Well, actually, it could.

In training, we want to maintain clean behavior loops. Alexandra Kurland calls this idea “loopy training” and you can read more about it here. Essentially, if we agree that clean training involves clear criteria and a high rate of reinforcement, then we can agree that our sessions should exist in a clean loop free of unwanted “junk” behaviors; a loop in which the road to reinforcement is short and well-lit for our learners. Hannah Branigan has already done a really nice job geeking out on this subject, and I suggest you check that out.

What does this loopy business have to do with jackpots? A jackpot inherently interrupts the loop. It interrupts efficiency and actually reduces rate of reinforcement. For instance, if I am teaching my dog to run across a target, and I want to select and jackpot anytime she hits the target with more than one foot, the session might look like this:

One paw target-click-cookie-one paw-click-cookie-three paws-click-cookie-cookie-cookie-cookie-cookie-one paw-click-cookie-two paws-click-cookie-cookie-cookie-cookie-cookie-one paw-click-cookie-three paws-click-cookie-cookie-cookie-cookie-cookie

So targeting was clicked and reinforced a total of 7 times, and 4 of those clicks, more than half, went to one-paw targets. If we say each event takes the same amount of time, then without jackpotting I could have clicked and reinforced several more targets in the same span, some of which may have been higher quality than just one paw. (Arguably I could have selected out those one-paw targets pretty quickly with differential reinforcement, too.)
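To make that arithmetic concrete, here is a minimal sketch that tallies the example session above. The timing model is an assumption made for illustration only: each target-plus-click counts as one time unit and each cookie delivered counts as one more. The variable names are hypothetical.

```python
# Paws on the target for each clicked repetition, read off the example session.
session = [1, 1, 3, 1, 2, 1, 3]

JACKPOT_COOKIES = 5  # cookies after a hit with more than one paw (as in the example)
SINGLE_COOKIES = 1   # cookies after a one-paw hit


def cookies_for(paws):
    """Return how many cookies one repetition earned in the example session."""
    return JACKPOT_COOKIES if paws > 1 else SINGLE_COOKIES


clicks = len(session)
one_paw_clicks = sum(1 for paws in session if paws == 1)
total_cookies = sum(cookies_for(paws) for paws in session)

# Assumption: one unit per target-plus-click, one unit per cookie delivered.
time_units = clicks + total_cookies

# Without jackpots every repetition costs two units (click + one cookie).
clicks_without_jackpot = time_units // 2

print(f"clicks with jackpots:    {clicks} ({one_paw_clicks} on one paw)")
print(f"cookies delivered:       {total_cookies}")
print(f"clicks without jackpots: {clicks_without_jackpot} in the same time")
```

Under these assumptions the same stretch of time holds 13 single-cookie repetitions instead of 7 jackpotted ones. A different timing model would change the exact number, but not the direction of the difference.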

What this means is that if jackpotting actually works to increase one response over another, then it might be worthwhile. But if it doesn’t, then it simply acts as a nuisance to your training plan; an interrupter that does nothing but slow learning down.

Hasn’t this already been studied?

Yes, with varied results. If you’ve got a study you think I haven’t read, please list it in the comments. I am only going to talk about one here, though I reviewed several.

One study found that jackpots do not increase single-response types of behaviors any more than normal reinforcement (meaning, rate of targeting does not increase following a jackpot) but that jackpots may be helpful in trying to increase one response over another when cues for both responses exist (if you jackpot the dog for hitting one target but only offer a single reinforcer for targeting the other, the jackpotted target gets hit more). If this is true, then we are wasting our energy jackpotting certain responses within a shaping session, but might do well to consider jackpots when trying to select for one behavior over another when both behaviors are occurring.

To put it simply, jackpotting “breakthroughs” in shaping a behavior (a common practice in training) is likely just interrupting your loop, while jackpotting the down on the pause table and only offering a single reinforcer for a stand or sit on this obstacle might be worth your time.

Most other studies I found explored whether or not dogs can actually ascertain quantities, again with mixed results. What comes through in the research is that yes, they can, when quantities are split up. Meaning, a pile of four cookies vs one cookie means little, while four cookies delivered one by one is clearly different from just one cookie delivered the same way.

What do I do when I have a question? I ask my dogs…

All of this led me to trying out a little experiment. Know that I am aware of the MANY flaws in what I did here. But I have learned that if we wait until something is perfect to share it, we wind up keeping everything to ourselves. So, here it goes:

I tried this on five dogs: four border collies and one Australian shepherd. They varied in age and in degree of clicker competency, but all are trained on a regular basis and keen on the click-then-treat contingency. I wanted to see if jackpotting one target would increase the likelihood that the dogs would choose that target over another identical target that only earned them one click followed by one treat. I hadn’t actually read the study I link to above before I carried this out.

With each dog the target on the left was introduced first and when they hit it they got one click followed by four treats.  The target on the right was presented by itself next, and each time they targeted it they got one click followed by one treat. Then, I presented both targets, and maintained this contingency.

Here’s Stig’s second try at this, which was done after some evaluating of my initial attempt:

 

The first thing that is really illuminated is that loss of efficiency I mentioned earlier. In the initial phase, in which I am showing the dog what the contingency will be, it takes Stig almost a minute to get clicked 10 times on the jackpot side, whereas it takes less than half that time to get clicked the same number of times on the single treat side.

Because of this I made a major error in my first sessions: I did not count clicks in the initial phase, and instead just worked each side for a comparable amount of time. What resulted was the right side getting clicked and treated about twice as often as the left. Due to the matching law, this skewed my initial results; most dogs had a preference for the right side when given the choice!
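As a rough illustration of why that imbalance matters, the strict form of the matching law predicts that the share of responses going to an option equals its share of the reinforcement earned. The sketch below assumes each click counts as one reinforcement event (not each treat), and the counts of 10 and 20 are hypothetical numbers chosen only to reflect the "about twice as often" ratio, not data from the sessions.

```python
def matching_share(reinforcers_a, reinforcers_b):
    """Strict matching law: predicted share of responses on option A."""
    total = reinforcers_a + reinforcers_b
    if total == 0:
        raise ValueError("need at least one reinforcer")
    return reinforcers_a / total


# Hypothetical warm-up history: the right side clicked twice as often as the left.
left_clicks = 10
right_clicks = 20

right_share = matching_share(right_clicks, left_clicks)
print(f"predicted share of choices on the right side: {right_share:.2f}")
```

With a 2:1 history, strict matching would put roughly two thirds of choices on the right side before the jackpot has any effect at all, which is why equalizing clicks in the initial phase matters.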

The second finding is that Stig chose the left side more often to begin with when both options were present, but then shifted back over to the right side toward the end of the session. He was given the choice between the two sides a total of 31 times, with three distinct trials (after I had tossed a reset cookie). He chose the left (jackpot) side 14 times, and he chose the right (single treat) side 17 times. So a very slight preference for the single treat side was revealed, and did not show up until late in the session.

Now let’s look at Felix:

 

Again, efficiency of the single treat model wins out in the opening phase, but what is interesting to me is Felix’s preference. In three trials he chooses the right (single treat) side every time (19 times). A clear preference for the single treat side, not the jackpot side.

When I do this over again there will be a timer for the final phase. Stig’s final phase wound up being slightly longer than Felix’s, with most of his right side choices happening toward the end.

Interestingly, Idgie’s preference wound up being split evenly; in two trials she chose left 8 times and right 8 times. Ghost’s preference was overwhelmingly skewed toward the right (single treat) with a whopping 84 hits on that target and only 13 on the jackpot side. Her session was the longest, with 5 trials.

Brink forced me into a cleaner presentation of the targets (he would not stop clawing my legs as he aimed for the target on the ground) that I will switch to for future data for all dogs. For this reason I think his data is probably the most accurate, and he still wound up just about split down the middle with 17 left side (jackpot) hits and 18 right side (single treat) hits. Here is his video:

 

 

So, what do we know?

What we know about jackpots is that we need to know more. While I do not claim that the experimenting I did in my basement with my dogs is conclusive by any means, I do think it’s interesting that the dogs that showed a clear preference actually preferred the single treat side. This makes me wonder about the inherent reinforcement of the click itself, and whether the SEEKING system is actually trumping the jackpot.
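For readers who want the numbers side by side, here is a small sketch that turns the counts reported above into the share of choices each dog made on the jackpot side. It only restates figures already given in this post; the names `results` and `jackpot_share` are just labels for this illustration.

```python
# (jackpot-side hits, single-treat-side hits) as reported for each dog above.
results = {
    "Stig": (14, 17),
    "Felix": (0, 19),
    "Idgie": (8, 8),
    "Ghost": (13, 84),
    "Brink": (17, 18),
}

jackpot_share = {}
for dog, (jackpot_hits, single_hits) in results.items():
    total = jackpot_hits + single_hits
    jackpot_share[dog] = jackpot_hits / total
    print(f"{dog:<6} {jackpot_hits:>3} of {total:>3} choices on the jackpot side "
          f"({jackpot_share[dog]:.0%})")

# Pooled totals are dominated by the longest session, so read them with care.
total_jackpot = sum(j for j, _ in results.values())
total_single = sum(s for _, s in results.values())
print(f"pooled: {total_jackpot} jackpot vs {total_single} single-treat choices")
```

No dog chose the jackpot side more than half the time, and the two dogs with a clear preference (Felix and Ghost) are the ones pulling the pooled count toward the single treat side.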

In any case, I am still very curious about jackpots and will be looking into this further in the coming weeks.

 


3 thoughts on “Hitting (or discussing) the Jackpot”

  1. Sarah, I wonder what the results would look like if jackpot was defined by quality instead of quantity, e.g. right side = 1 piece of kibble, left side = 1 piece of steak. Efficiency would not then be an issue.

    1. That is a question I have as well. I think the next follow-up will explore that, as well as assessing preference. We need a mechanism that tells us what the dog prefers other than just “he seems to like that better” before we can actually use quality as a jackpot.

  2. Do you think picking only working breeds, and high-energy herding dogs at that, had a part in the result of the faster target getting hit more? Just watching the videos I really wondered what other types of dogs would do. It also made me wonder about the value of jackpotting in shy or fearful dogs rather than bold ones. Very interesting stuff! Thanks for the time you put into this.
