00:00 - Rob Pizzola (Host)
Team records by referee: Chiefs 6-0, Ravens 0-3. Looks juicy, right? I'm Rob Pizzola, pro sports bettor, and this one for me is personal, because people lose real money when big accounts dress up a story as data. If you have a platform and you're pushing a 9-game referee trend like it's predicting the future, you either do not understand the numbers or you're counting on your audience not to ask questions. On Circles Off, we do the opposite. We test ideas, we measure what actually moves a game, and today I will show you why win-loss by referee is complete nonsense, how 6-0 streaks pop out of randomness, and what you should track instead if you want an edge on sides, totals and props. If trends like this are something you regularly value, stick around. I'll give you a simple weekly checklist and a better way to evaluate them. You will leave with tools, not vibes. Let's get into it. First, the sample problem. We're talking about six Chiefs games and three Ravens games across five seasons: new players, new coordinators, different opponents, different prices, different venues. Nothing is being held constant here. With samples this small, fake streaks are almost guaranteed. Quick sanity check: why do streaks happen in random data?
01:36
Grab a coin and flip it six times. Now do that a hundred times. You will typically see a 6-0 run or two and a 0-6 run or two, even though the coin has no hot hand. Football is noisier than a coin toss. A 6-0 record with referee X is not the story. The real question is: can we explain it with something besides luck? Remember, the Chiefs have Patrick Mahomes at quarterback. He's won almost 80% of his career starts. A team led by that type of quarterback is going to stack wins, regardless of who the official is. Second, we have something called confounders. Quarterbacks and other factors move outcomes far more than the officials. Patrick Mahomes versus a backup quarterback is a bigger lever than any referee. Price matters, baseline matters. If the Chiefs closed as a favorite in most of those games, the expected record before kickoff is already positive. If you're laying about four points on average, a fair six-game record is something like 4-2. So going 6-0 is not shocking. Any elite quarterback laying points is going to stack wins. Giving the ref credit for that is backwards thinking.
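As a rough check on the coin-flip and baseline points above, this minimal sketch computes how likely a 6-0 run is at different per-game win probabilities. The function names are hypothetical. The three win probabilities are illustrative stand-ins (a fair coin, a team expected to go about 4-2 in six games, and a quarterback who wins about 80% of his starts), and games are treated as independent, which is a simplification.

```python
def prob_perfect_run(win_prob: float, games: int = 6) -> float:
    """Chance of winning every game, assuming independent games."""
    return win_prob ** games


def expected_perfect_runs(win_prob: float, games: int = 6, trials: int = 100) -> float:
    """Expected number of perfect runs across `trials` separate sets of games."""
    return trials * prob_perfect_run(win_prob, games)


# Illustrative win probabilities only; none of these are fitted to real data.
scenarios = [
    ("coin flip", 0.50),
    ("roughly a 4-2 favorite", 4 / 6),
    ("roughly an 80% quarterback", 0.80),
]

for label, p in scenarios:
    print(f"{label}: P(6-0) = {prob_perfect_run(p):.3f}")

print(f"expected 6-0 runs in 100 coin-flip sets: {expected_perfect_runs(0.5):.2f}")
```

Under these assumptions a coin goes 6-0 about 1.6% of the time, while an 80% team does it about 26% of the time, so the quarterback alone is enough to make a perfect six-game record unremarkable.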
02:54
Third, we have data dredging. There are endless ways to carve this up: since 2020, home only, divisional, third down. If you hunt through enough teams, enough refs and enough time windows, you will always find something extreme. That's fishing, that's not finding. Fourth, outcome versus mechanism. This is the big one. Wins are the outcome, refereeing is the mechanism.
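The fishing point can be put in numbers with a small sketch. It assumes, purely for illustration, that every team/referee/window split is an independent six-game coin-flip record; real splits overlap and teams are not coins, so treat the output as a rough guide. The function name is hypothetical.

```python
def prob_any_extreme(num_splits: int, games: int = 6) -> float:
    """Chance that at least one of `num_splits` independent coin-flip
    records comes out perfect (all wins or all losses)."""
    p_extreme_one = 2 * 0.5 ** games  # all wins or all losses in one split
    return 1 - (1 - p_extreme_one) ** num_splits


# The more ways you slice the data, the more likely a "perfect" record appears by luck.
for splits in (10, 50, 200):
    print(f"{splits} splits searched: P(at least one 6-0 or 0-6) = {prob_any_extreme(splits):.2f}")
```

Searching a couple of hundred splits makes a perfect record somewhere close to a certainty under these assumptions, which is why a single 6-0 found after the fact says very little.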
03:22
If the claim is that one team is getting called for fewer third-down offensive penalties or a low DPI rate, then measure those things directly. How often does this crew throw those flags compared to the league average? What does that do to expected points on those plays? Team win-loss tells you nothing about the mechanism that you're selling. The bottom line here is win-loss by referee is basically a vibes chart. It looks sharp. It predicts nothing. It teaches bettors to chase stories instead of chasing edges. If you like myth-busting over clickbait, consider subscribing here to Circles Off so that more bettors can find us and dodge that bad info in the future. But instead of just telling you why I think this is bad strategy, I'd like to explain how I would use referee data without fooling myself.
04:12
Number one, count how often it happens, not who won. Go back to last season plus this season, maybe a little bit further back. For each crew, count the flags that matter, then divide by the number of chances. Keep it simple: how many times per 100 plays. Here are some examples: defensive pass interference per 100 pass plays, offensive holding per 100 plays, pre-snap penalties per 100 offensive snaps. This tells you how a crew behaves, not who happened to win the game. Number two, adjust for who is playing. Some teams are going to grab more. Some tackles hold more than others. A quick workaround is to compare the crew's numbers in games that did not involve today's teams. Or just note that this team already draws a lot of defensive pass interference so that you're not double counting it. If one team consistently commits more penalties than another team, it can simply mean that the team is undisciplined, not that the referee has it out for them.
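Steps one and two can be sketched as a per-100 rate with an optional filter that drops games involving today's teams. The `GameLog` record, its field names, and the sample numbers are all hypothetical; a real log would come from whatever play-by-play source you keep.

```python
from dataclasses import dataclass


@dataclass
class GameLog:
    crew: str        # referee crew identifier
    home: str        # home team
    away: str        # away team
    pass_plays: int  # pass plays in the game
    dpi_flags: int   # defensive pass interference flags in the game


def dpi_per_100(games, crew, exclude_teams=frozenset()):
    """DPI flags per 100 pass plays for one crew, skipping any game
    that involves a team in `exclude_teams`."""
    kept = [
        g for g in games
        if g.crew == crew and g.home not in exclude_teams and g.away not in exclude_teams
    ]
    passes = sum(g.pass_plays for g in kept)
    if passes == 0:
        raise ValueError("no pass plays left after filtering")
    return 100 * sum(g.dpi_flags for g in kept) / passes


# Made-up sample log, for illustration only.
sample = [
    GameLog("X", "A", "B", 70, 2),
    GameLog("X", "C", "D", 80, 1),
    GameLog("X", "E", "F", 50, 0),
    GameLog("Y", "A", "C", 60, 3),
]

print(dpi_per_100(sample, "X"))                       # every crew X game
print(dpi_per_100(sample, "X", exclude_teams={"A"}))  # crew X without team A's games
```

Comparing the filtered and unfiltered rates is one way to see whether a crew's number is really about the crew or mostly about one team it happened to work.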
05:15
Number three, think about impact and subjectivity. Not all flags are equal. Some are much more subjective than others. A deep pass interference or a roughing the passer call can swing a drive. A delay of game usually doesn't. A false start is basically an automatic call every time. Write these things down, star the big-swing, high-judgment flags, and weight those more. Number four, make sure that this all sticks. Check last season and this season. If it appears one year and then disappears the next, treat it as noise. Crews change, league emphasis changes. Only trust what is repeatable.
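The "does it stick" check in step four could be sketched like this. The function name and the `min_gap` threshold of 0.2 flags per 100 plays are arbitrary assumptions chosen for illustration, not values from the episode.

```python
def is_sticky(rate_last: float, rate_this: float, league_rate: float, min_gap: float = 0.2) -> bool:
    """True if the crew sits on the same side of the league average in both
    seasons, by at least `min_gap` flags per 100 plays each time."""
    gap_last = rate_last - league_rate
    gap_this = rate_this - league_rate
    same_side = gap_last * gap_this > 0
    return same_side and min(abs(gap_last), abs(gap_this)) >= min_gap


print(is_sticky(1.6, 1.5, league_rate=1.0))  # above average both years
print(is_sticky(1.6, 0.9, league_rate=1.0))  # flipped sides, so treat as noise
```

A two-season comparison is still a small sample, so this is a filter for throwing things out rather than proof that a tendency is real.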
05:59
Number five, keep your adjustments small. If a crew throws more defensive pass interference flags than average and both teams attack deep, yeah, maybe lean a little bit higher on the game total. If a crew nails offensive holding and one team's tackles are weak in that department, you can expect more stalled drives. That usually leans lower on the total. These are tiny nudges, not massive moves, but it is a way of incorporating referee data into your handicapping that actually makes sense. Number six, check yourself after the fact. Track your notes for a while. I keep a notebook. Did the small nudge help you beat the closing number or land closer to the result? If not, dial it back or just retire it completely.
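The notebook in step six can be kept as a simple log. This sketch is one hypothetical layout: the `TotalNote` fields and all sample numbers are invented, and "moved my projection closer to the closing total" is used as a rough stand-in for beating the closing number, which strictly depends on the price you actually bet.

```python
from dataclasses import dataclass


@dataclass
class TotalNote:
    base_total: float     # my projected total before the referee nudge
    nudged_total: float   # my projected total after the nudge
    closing_total: float  # market closing total
    final_points: int     # actual combined score


def closer_to_close(note: TotalNote) -> bool:
    """Did the nudge move my number toward where the market closed?"""
    return abs(note.nudged_total - note.closing_total) < abs(note.base_total - note.closing_total)


def closer_to_result(note: TotalNote) -> bool:
    """Did the nudge move my number toward the actual combined score?"""
    return abs(note.nudged_total - note.final_points) < abs(note.base_total - note.final_points)


def summarize(notes):
    """Return (share of nudges closer to close, share closer to result)."""
    n = len(notes)
    return (
        sum(closer_to_close(x) for x in notes) / n,
        sum(closer_to_result(x) for x in notes) / n,
    )


# Made-up notebook entries, for illustration only.
notebook = [
    TotalNote(45.0, 45.5, 46.0, 48),
    TotalNote(42.0, 41.5, 42.5, 38),
    TotalNote(50.0, 50.5, 49.5, 44),
]

vs_close, vs_result = summarize(notebook)
print(f"nudge moved toward close: {vs_close:.0%}, toward result: {vs_result:.0%}")
```

Three entries prove nothing either way. The point of the log is to have something to review once it holds enough games to judge whether the nudge should be dialed back or retired.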
06:45
Here's a simple worksheet example that anyone can build out. Take the last few seasons of data. For a given crew, count defensive pass interference flags and count pass plays in those games. Now turn it into DPI per 100 passes. If the league sits around one call per 100 and this crew is closer to 1.6 per 100, that might be a real tendency. Do the same thing for holding and pre-snap flags. Add a note for big-swing penalties as well, and for which ones are subjective, like roughing the passer, versus stuff that's automatic, like a false start. Now look at your matchup for this week. DPI-friendly crew, two vertical offenses? Yeah, maybe lean slightly higher on the total. Holding-heavy crew, and one team lives in the run game with shaky tackles? Expect more stalled drives. Lean a touch lower. But keep it simple. How often per 100 plays is enough to spot tendencies.
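The worksheet arithmetic, plus a quick check on whether a 1.6 versus 1.0 gap could be luck, might look like the sketch below. The 16 flags on 1,000 pass plays are made-up counts chosen to match the 1.6 per 100 figure, the function names are hypothetical, and the binomial model assumes every pass play is an independent chance at a flag, which is a simplification.

```python
from math import comb


def rate_per_100(flags: int, plays: int) -> float:
    """Flags per 100 plays."""
    return 100 * flags / plays


def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least
    k flags in n plays when the true per-play rate is p."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


# Hypothetical worksheet row for one crew.
crew_flags, crew_passes = 16, 1000
league_rate = 1.0  # league DPI per 100 passes, the example figure from the worksheet

crew_rate = rate_per_100(crew_flags, crew_passes)
luck = binom_tail(crew_flags, crew_passes, league_rate / 100)

print(f"crew: {crew_rate:.1f} per 100, league: {league_rate:.1f} per 100")
print(f"chance a league-average crew shows {crew_flags}+ flags on {crew_passes} passes: {luck:.3f}")
```

The second number is why the sample matters: the same 1.6 per 100 is far less convincing on a few hundred passes than on a few thousand.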
07:50
You can go to a larger sample as well if you want, but get fancy later, only if it's actually helping you. Now, I don't want to lie to you and convince you that this is going to be some light bulb moment and all of a sudden you're going to beat the NFL market. But this is a much better way to try to incorporate this type of data than to blindly tail some sort of dumb trend with zero context. If you want to apply it to any trend, that's easy. Before you trust any trend, just run it through this checklist.
08:23
Number one, sample size. Is it big enough to survive randomness? Number two, does it show up again? Is it something that seems repeatable from season to season? Number three, did you adjust for the obvious stuff already? The quarterback, home or away, injuries, weather, whatever it may be? Number four, is there a football reason for it? Mechanism, not just correlation? Number five, did you test it on new data? After you found it, did it still work later on? Number six, did you hunt for it across tons of teams, refs or time windows? If you did, you need stronger proof or you need a bigger effect. Number seven, is it big enough that you think it could matter in the line and potentially is not being accounted for right now? And number eight, can you use it? Is there a clear path to a side, a total, a prop bet? If it fails two or more of these in the checklist, I would probably just toss it. If it passes most, keep it, but bet it small and keep testing. Now let's take this checklist and let's apply it to the referee trend.
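The eight questions and the "fails two or more, toss it" rule can be written down as a small scoring helper. The key names and the verdict strings are hypothetical labels for the eight questions above; the pass/fail answers are still a judgment call you supply yourself.

```python
CHECKLIST = [
    "sample_size",  # 1. big enough to survive randomness
    "repeats",      # 2. shows up again season to season
    "adjusted",     # 3. adjusted for quarterback, home/away, injuries, weather
    "mechanism",    # 4. a football reason, not just correlation
    "new_data",     # 5. still worked on data seen after it was found
    "not_fished",   # 6. not hunted for across many teams, refs or windows
    "big_enough",   # 7. large enough to matter in the line
    "usable",       # 8. clear path to a side, total or prop
]


def verdict(results: dict) -> str:
    """Apply the rule: two or more fails means toss, otherwise keep small."""
    missing = [item for item in CHECKLIST if item not in results]
    if missing:
        raise ValueError(f"unanswered checklist items: {missing}")
    fails = sum(not results[item] for item in CHECKLIST)
    return "toss" if fails >= 2 else "keep, bet small, keep testing"


# Hypothetical trend that passes everything except the new-data test.
example = {item: True for item in CHECKLIST}
example["new_data"] = False
print(verdict(example))
```

Forcing every item to be answered before a verdict comes back is deliberate: it stops a trend from passing just because the awkward questions were skipped.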
09:36
Number one, sample size. That's a fail. Six for KC, three for Baltimore. Easy fail. Number two, does it show up again? Five seasons, different rosters, different schemes. I'd say that's a fail as well. Three, adjustments. Big-time fail. No control for Patrick Mahomes being the quarterback and the Chiefs being great. Four, is there a football reason for it? Another fail. The claim mentions third-down penalties and DPI, but the stat that they're showing is wins. Number five, new data test. Fail. Cherry-picked, not validated at all. Fishing risk: fail.
10:11
With this many refs, teams and windows, you're always going to find something that's 6-0 somewhere. The magnitude is unknown. We don't know if there's a point spread impact because it's not shown. And the usefulness is also a fail. What would have convinced me? Show me this crew's DPI per 100 passes versus the league over the last few seasons, the same for offensive holding on third down, and a short note on whether it still shows up this season. Then give me one or two ways it changes this particular matchup. Team win-loss by referee does none of that.
10:50
Now back to why this is personal. I care about this space and the people watching. Shiny 6-0 trends with no substance teach bad habits. They cost you money. If you have reach, you have responsibility. If you do not know the numbers, learn them. If you do, and you post this anyway, you are selling bait. On Circles Off, we measure what can move a game, we try to test it later and we only keep what survives. If this helped, consider subscribing. Hit that like button so more bettors can find it, and drop the worst handicapping trends you've seen in the comments down below. I'd like to have a laugh at a lot of these, because there are some bad ones out there. Trends are not inherently evil. Bad thinking is. Demand a reason. Demand that it shows up again. Demand that it matters. If a trend cannot clear those bars, it does not deserve your money. Thanks for watching.