Saturday, April 27, 2013

More on Zone% and Batting Order: AL vs. NL

Today I read a blog entry on FanGraphs that discussed zone% (the percentage of pitches a batter sees that are in the strike zone) and how it relates to a player's position in the lineup. The purpose of the post was to discuss the idea of protection (do hitters see more strikes when batting in front of good hitters?), but I think the author found something else much more interesting.

Below are his core results: zone% by lineup position in the National League and American League.

See what's weird?

Here they are in graph form:

In the National League, the leadoff hitter sees the fewest strikes, the 9th hitter sees the most strikes, and there's a steady upward trend in between. This kind of makes sense to me, since teams tend to (lol Mariners/Endy Chavez) put their best hitters early in the batting order.

In the American League (except for the 9th hitter) it's exactly the opposite!

Why might that be?

Are there consistent differences in the lineups between the AL and NL? The answer appears to be no. Baseball-Reference has splits by lineup position readily available, and I was able to calculate wOBA (a comprehensive measure of a batter's value) by lineup position for 2012:

Almost identical. The 9th spot in the NL is, of course, pitchers. The graph looks the same using OPS or OBP.

It looks a little different if you use iso (slugging percentage minus batting average, a measure of isolated power):

Looks like managers (in both leagues) like putting speedy contact guys in the 1 and 2 holes before bringing in the big bats.
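For reference, here is how iso falls out of ordinary counting stats. This is a minimal sketch assuming the standard definitions of batting average (hits per at-bat) and slugging percentage (total bases per at-bat); the function name and the stat line are made up for illustration and do not belong to any real player.

```python
def iso(singles, doubles, triples, homers, at_bats):
    """Isolated power: slugging percentage minus batting average."""
    hits = singles + doubles + triples + homers
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    avg = hits / at_bats          # batting average
    slg = total_bases / at_bats   # slugging percentage
    return slg - avg


# Made-up stat line, purely illustrative
print(round(iso(singles=100, doubles=30, triples=5, homers=20, at_bats=550), 3))
```

Singles cancel out of the subtraction, which is why iso only rewards extra-base hits.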

Speaking of speed, there is one stat for which the curves diverge some: steal %.

I would guess the divergence in steal % for 8/9 hitters is driven by:
  • AL managers putting their 'second leadoff' (i.e. can't hit, but at least he's not slow) player in the 9 hole
  • NL pitchers sacrificing a lot (lower incentive for the 7/8 hole hitter to steal)

So it looks like the only real difference in lineups between AL and NL is pitchers batting 9th (and sucking). It's possible there is actually a difference that I've missed, but failing that I can think of 3 other options:
  1. The presence of a pitcher in the nine hole has strategic strike-throwing ramifications all the way up the lineup
  2. Strike throwing strategy is very different between the AL and NL
  3. Dave's finding is the flukey artifact of a small sample

I started looking into number 2. I pulled in 2012 batting data (the FanGraphs dashboard with PITCHf/x plate discipline data tacked on, min 300 PAs) for both the AL and NL and looked for which metrics were well correlated with zone% (see notes 1 and 2 below).
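The comparison boils down to correlating each metric with zone%, one league at a time. Below is a rough sketch of that calculation, assuming plain Pearson correlation (the post does not say which coefficient was used). The column names and the numbers in `toy` are invented for illustration; they are not the 2012 data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlations_with(target, table):
    """Correlate every other column in `table` with the `target` column."""
    return {
        name: pearson(values, table[target])
        for name, values in table.items()
        if name != target
    }


# Toy numbers invented for illustration; NOT real player data.
toy = {
    "zone_pct":    [0.52, 0.50, 0.47, 0.45, 0.43],
    "iso":         [0.080, 0.110, 0.150, 0.200, 0.240],
    "contact_pct": [0.90, 0.86, 0.82, 0.77, 0.72],
}

for name, r in sorted(correlations_with("zone_pct", toy).items()):
    print(f"{name}: {r:+.2f}")
```

Running the same function once on the AL rows and once on the NL rows gives two columns of coefficients that can be compared metric by metric.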

The leagues appear similar. Power hitters, free swingers, and guys who strike out a lot get fewer strikes. Guys who are high-contact, high-speed (low-power), strong-fielding types (hi Brendan Ryan) get lots of strikes, because where are they gonna hit the ball anyway?

The only place I see a consistent difference between the leagues is in speed. Fast NL guys still tend to get more strikes, but less so than fast AL guys.

I'm going to stop now and think about what I've found so far. I'll dig further another day.

1. There is lots of collinearity here (or there would be if these were all explanatory variables in a regression), so don't read a high correlation with HR and a high correlation with iso as two separate pieces of information. As HRs go up, iso goes up, so of course the two will correlate similarly with everything else. The purpose is just to get a picture of the kinds of players seeing more strikes in the AL and NL.

2. Note the use of playerid as a negative control. It should obviously be uncorrelated with zone%, so there is definitely some noise in the data. See here for definitions of all the terms in the table.
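One way to put a rough size on that noise is to simulate how large a correlation two completely unrelated columns produce by chance at a given sample size. The sketch below is only an illustration of the negative-control idea, not something done in the analysis above; the function names, the number of trials, and the sample size of 150 are all assumptions.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def chance_correlation_band(n_players, trials=2000, seed=0):
    """Approximate 95th percentile of |r| for two unrelated random columns."""
    rng = random.Random(seed)  # seeded so the result is repeatable
    abs_rs = []
    for _ in range(trials):
        xs = [rng.random() for _ in range(n_players)]
        ys = [rng.random() for _ in range(n_players)]
        abs_rs.append(abs(pearson(xs, ys)))
    abs_rs.sort()
    return abs_rs[int(0.95 * trials)]


# 150 is a hypothetical sample size, not the actual number of qualifying hitters.
print(round(chance_correlation_band(150), 2))
```

A correlation that sits inside this band, like the one a meaningless ID column would show, is not worth reading much into.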

  1. This is awesome work. I would love to read more when you get the chance to continue your research. It's hard to believe there can be "noise" over such a large sample of pitches but .. you never know.