Tuesday, May 21, 2013

Crapcanalysis: Does fast matter? Part 2

Catching fire on the parade laps is not the best way to win or set fastest lap, but it's terrifically photogenic. (Murilee Martin photo)
Yesterday, we began an investigation of the relationship between fastest lap times and finishing position by taking a look at overall winners and at fastest lap setters from every 24 Hours of LeMons race from the 2012 season through the end of April 2013. We then expanded that to include the top 10 in each category.

Here's the brief recap:

Average fastest-lap rank of overall winners: 5.83
Average fastest-lap rank of all Top 10 finishers: 18.479

Average finishing position of fastest lap setters: 22.2
Average finishing position of 10 fastest lap setters: 24.608

Some readers will decry this as falling short of thoroughness on account of small sample size. Surely, since we're only looking at the superlatives in terms of finishing order and fastest laps, we're missing a larger pattern.

So we set out to plot the positions of every car that had completed a lap during the same time period. If you're curious, that's 2,276 data points.

Before we display what that plot looks like, we know that the plot will thin out because we're plotting by straight finishing position and fastest lap rank. This does not take into account number of entries in a given race. As you may or may not know, race attendance varies from 40 or so entries up to 170, so the slowest, worst entry in a 40-car field winds up buried in the plot.

[Editor's Note: Rambling ahead for four or five paragraphs, probably pointless in nature. Skip ahead to chart at your discretion.]

In a perfect world, we'd have plotted each point by a more objective factor. We actually designed this plot with the following coordinates but didn't use it:

X-axis = Factor of race's fastest lap (Team's fastest lap time / Race's fastest lap)
Y-axis = Factor of winner's lap total (Team's lap total / Winner's lap total)

This would have given general coordinates of something like (1.somethingorother, 0.somethingorother).
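For anyone tempted to try those factor-based coordinates, here's a rough sketch in Python. The `TeamResult` record, its field names and the sample numbers are all invented for illustration and do not come from any timing sheet. It also assumes the winner is simply the team with the most laps, which ignores penalty laps.

```python
from dataclasses import dataclass


@dataclass
class TeamResult:
    name: str          # hypothetical team label
    best_lap_s: float  # team's fastest lap, in seconds
    laps: int          # team's total laps completed


def factor_coords(results):
    """Map each team to (lap-time factor, lap-total factor) for one race."""
    race_fastest = min(r.best_lap_s for r in results)
    # Assumption: the winner is whoever completed the most laps.
    winner_laps = max(r.laps for r in results)
    return {
        r.name: (r.best_lap_s / race_fastest, r.laps / winner_laps)
        for r in results
    }


# Made-up race with three entries.
sample = [
    TeamResult("Fast Car", 100.0, 400),
    TeamResult("Steady Car", 110.0, 380),
    TeamResult("Broken Car", 105.0, 200),
]

for name, (x, y) in factor_coords(sample).items():
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

Because both factors are ratios within a single race, a 40-car field and a 170-car field would land on the same scale.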

So why didn't we use that system of factors? Two reasons:

(1) It was a bit more work and--despite the occasional evidence to the contrary--we're actually pretty lazy.
(2) Using actual finishing position and fastest lap rank makes it far easier to identify outliers using (x, y) coordinates and to find them so that we can give some anecdotal information for those outliers.

That said, if someone wants to do slightly more work than we did, we're quite willing to share our data so that someone with more motivation and mathematical ability can graph that.

But enough of our silly rambling. Here's the plot (Click to embiggen):

So what do you see in the data? Based on our elementary knowledge of statistics, we see some correlation between best laps and finishing position. The points of data make a fairly obvious line of relationship.

Statisticians who existed long before we did and who were clearly much more clever discovered a way to tell correlation between such things. This site was helpful to us and seemed slightly more accurate than Wikipedia. Using the formula given for correlation, we came up with a 0.63 factor of relationship between fastest lap time and finishing position.

A factor of 1.00 shows perfect positive correlation and a factor of 0.00 shows no correlation at all. With that in mind, we feel comfortable stating that there is a moderately strong correlation between fastest lap time and finishing order.
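The usual formula for this kind of thing is the Pearson correlation coefficient, which runs from -1 (perfect negative) through 0 to +1. Assuming that's the formula in play here, a minimal sketch looks like the following. The function name and the sample ranks are placeholders for illustration, not the real 2,276-point data set.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up (fastest-lap rank, finishing position) pairs for six cars.
lap_ranks = [1, 2, 3, 4, 5, 6]
finishes = [2, 1, 5, 3, 6, 4]
print(round(pearson_r(lap_ranks, finishes), 3))
```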

However, correlation does not mean causation. We don't know what "causes" higher finishing positions (maybe Tiger Blood), but we do know that faster lap times are more likely to be present on a high-finishing team.

Blah blah blah.

These are things that most people know and likely do not find interesting.

This plot (Click to embiggen) is the same as the plot above, except it has two red boxes superimposed on it. These were placed in a manner consistent with Heisenberg's Uncertainty Principle1, which states that "If you're uncertain about things, just make up some bullscheisse."

Well, not exactly. These two boxes roughly outline Actual Interesting Teams who lie outside what we've described above as falling into the corollary2. Through our highly scientific methods of shrugging and/or applying Heisenberg's Uncertainty Principle, we figure these boxes contain data points that represent, more or less, good stories: Slower cars who have finished well (lower, right-side box) and faster cars that have finished poorly (upper, left-side box).

And these are the teams that we're most interested in: the trundlers and lopers on one hand, the hella-hauling-assers and blow-uppers on the other.
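If you'd rather not eyeball the boxes, the same idea can be sketched in a few lines of Python. The box boundaries below are hypothetical numbers picked for illustration, not the ones drawn on the plot. The sample points are (fastest-lap rank, finishing position) pairs for teams discussed later in this post.

```python
# Hypothetical box boundaries; the real red boxes were placed by eye.
SLOW_MIN_LAP_RANK = 40   # "slow" means 40th-fastest lap or worse
GOOD_MAX_FINISH = 10     # "finished well" means Top 10 overall
FAST_MAX_LAP_RANK = 15   # "fast" means one of the 15 fastest laps
POOR_MIN_FINISH = 50     # "finished poorly" means 50th or worse


def classify(lap_rank, finish_pos):
    """Label a point as one of the two outlier boxes, or neither."""
    if lap_rank >= SLOW_MIN_LAP_RANK and finish_pos <= GOOD_MAX_FINISH:
        return "slow car, good finish"
    if lap_rank <= FAST_MAX_LAP_RANK and finish_pos >= POOR_MIN_FINISH:
        return "fast car, poor finish"
    return "near the trend"


# (fastest-lap rank, finishing position) as reported in this post.
points = {
    "The Blue Shells": (47, 3),
    "Sierra Auto Recycling": (73, 5),
    "Chump Ganasee Targee Racing": (83, 8),
    "BBQ Rubber Chicken Picatta": (1, 109),
}

for team, (lap_rank, finish_pos) in points.items():
    print(f"{team}: {classify(lap_rank, finish_pos)}")
```

Rectangular boxes keep the rule easy to explain, at the cost of ignoring field size.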

In this graph (Click to embiggen), we've had Excel calculate a trendline. The trendline appears very different from what one's eyes indicate the trendline should be. The reason? The concentration of data points below the line and closer to the Y-axis. Or something like that3.
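For reference, a straight-line trendline of the kind Excel draws is an ordinary least-squares fit. Here's a minimal sketch, assuming the linear trendline option is the one that was used. The function name and the sample points are made up for illustration.

```python
def linear_trendline(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points: x = fastest-lap rank, y = finishing position.
xs = [1, 5, 10, 20, 40, 80]
ys = [3, 2, 15, 30, 35, 60]
slope, intercept = linear_trendline(xs, ys)
print(f"finish = {slope:.2f} * lap_rank + {intercept:.2f}")
```

A least-squares line minimizes squared vertical distances, so a dense clump of points in one corner can pull it away from where the eye expects it.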

Let's look at some points of interest, which we've highlighted on the graph with letters.

A. As you might expect from yesterday's post, we find a very dense cluster of teams who finished in the Top 10 overall with one of the 10 fastest laps in the race. This again supports the notion that in order to win a LeMons race, you don't necessarily have to be THE fastest, but you're more likely to win if you're in the fastest 10 or so. This is intuitive, boring and we've covered it already, so let's move on.

 (Murilee Martin photo)
B. Here we see the two slowest winners since 2012 began: Bill Danger's Honda Accord was 20th fastest when it won the spring race at New Hampshire Motor Speedway, and California Mille was 21st fastest at The Ridge in August. The race at The Ridge was a strange beast in and of itself, but we'll talk about that in a later post.

Bill Danger, running an Accord, bested two much faster cars in the Booby Prize Nissan 200SX and the Keystone Kops Volvo 244. We dug through Specialty Timing's timecards from that NHMS race and the Danger car looks to have run the entire race on five stints, compared to what we estimated to be about seven for the Booby Prize car. The Kops were off track more often due to a couple of ill-timed black flags.

A deeper look at Bill Danger's lap times over 516 laps shows that, while they weren't faster than the other two cars, they consistently turned in lap times within a couple seconds of their best lap during all of the stints, meaning their drivers were all comparable in skill and all very consistent.

(The Rusty Hub photo)
C. Here's another Honda Accord, this one belonging to The Blue Shells at Gingerman Raceway in 2012, when their carbureted Class B heap plodded its way to third overall and an eight-lap class win over the much, much faster Joe Dirt Chevy Caprice. In a race with 62 starting cars, the Blue Shells' 1:51.2 ranked 47th fastest, which put it in the laptime company of LemonAid Racing's three-cylinder Geo Metro and an automatic Ford Tempo.

Again, when we look at Specialty Timing's Gingerman timecards, we see that the Blue Shells ran their race on five or so stints (it's a bit nebulous...all we have to go on is extra-long lap times and the knowledge that there was the overnight stoppage). Like Bill Danger's laptimes, the Blue Shells' drivers were all running about the same pace for the duration of their stints.

One more tie-in between Bill Danger and The Blue Shells: These curious outcomes occurred in consecutive races, just two weeks apart.

(Murilee Martin photo)
D. This is one of our favorite outliers because it took us a while to figure out how it happened. At Sears Point in 2012, Class B winners Sierra Auto Recycling took their Ford Crown Victoria all the way to fifth place overall in a 171-car field despite running only the 73rd-fastest lap. Care to hazard a guess as to how a gas-guzzling, not-really-fast Crown Vic finished fifth?

If you're one of the 800 or so drivers at Sears Point that weekend, it's unlikely that you've forgotten that much of the race was conducted during a deluge. Sierra Auto Recycling proved more than capable of managing a wet track. They sat in third place overall at the end of the first hour and then stuck near the top of the scoring tables for most of the weekend, using the class lead they built up during the rain to hold off challenges from the Oldsmember Olds Regency and the Uber Vogel: Hans Am Mercedes 190E when the track finally dried out Sunday.

E. Here we see a small cluster of Top 10 finishers whose fastest lap ranked around 50th in the field. Among those in this cluster are: Vermont Maple Runners and Team Farfrumwinnin at New Jersey Motorsports Park in 2012; Team LemonAid at Autobahn Country Club; Ghetto Motorsports at Summit Point; Howard Turkstra at Carolina Motorsports Park; and Communists R Us at Sears Point. None of these cars is what you'd consider fast, but they're obviously all capable of running a solid race regardless of lap times.

We have no idea why there would be a cluster here, to be honest. LemonAid's performance in this case is particularly astounding and an outlier within this group, given that only 59 cars started the race. Other than that, the other members of this cluster were in large fields.

(Murilee Martin)

F. Chump Ganasee Targee Racing are something of a double outlier with their performance at Thunderhill last year. Not only does the team campaign an Eagle Talon--historically one of the least durable crapcans of all time--they also finished eighth place overall despite ranking 83rd in fastest laps. The Talon, of course, represents an alliance between two brands that are largely allergic to staying together for a whole race: Mitsubishi and Chrysler. Despite this, the Ganasee squad has finished in the Top 10 three times.

(Murilee Martin photo)

G. This point comes from the "Fast Failures" section of outliers and it's the BBQ Rubber Chicken Picatta Volvo 940 from Sears Pointless 2013. The Picatta-teers have ditched the Swedish propulsion unit and dropped a ubiquitous Chevy Small-Block V8 under its hood. As is well documented, GM's small-block motor can really make a car go. But as is slightly less well known, the same motor tends to fail spectacularly from the torture test of crapcan racing. At Sears Point, the 940 was 1.3 seconds faster than the next-fastest car, Eyesore Racing's Miata, but it spent a good chunk of the weekend on jack stands to finish 109th place.

(Murilee Martin photo)
H. This is one of the most notorious cars in LeMons, a car that is respectably fast but was handicapped out of contention by 500 million penalty laps. Of course, this is the Fukushima Debris 2004 Mazda RX-8, which was 15th fastest in its rainy debut at Sears Point in 2012 but finished Dead Last (DFL) in the standings. Bringing a late-model sportscar to a crapcan race is a surefire way to end up in the Fast Failures quadrant of our plot.

I. Here's an interesting cluster of Top 15-ish cars that have finished right around 55th overall. There are too many to call out individually, but it's a curious bit, that. Any causal theories?

(Murilee Martin photo)

J. Here's an entry that fits the trendline perfectly: Team 4 Play's Miata at Chuckwalla in December 2012. The Miata finished in the bottom half of the field (78 of 114) but was 92nd fastest.

K. Look at this point all the way out there all on its own. The first person to tell us correctly which team and which race that was wins some prize that we'll make up on the spot.

[Editor's Note: Want to know about other data points? Leave a comment or email us and we'll add it to the list.]

We'll be back with a third part tomorrow to discuss one of the strangest races in the last two years.

Crapcanalysis: Does Fast Matter? Part 1

1 The astute among you may know Heisenberg's actual Uncertainty Principle, but for the unfamiliar, here's what it really is. If you get through all of that, pat yourself on the back for actually learning something today, because you probably haven't from anything we wrote.

2 This word almost certainly misused with extreme prejudice

3 We applied "Heisenberg" here, too.


  2. Hey Eric. I think this is a good piece. But how would the results look if the graphs were based on number of laps complete as a percentage of the winners number of laps. Finishing place, as far as statistics go, isn't good information because it's contingent on too many variables which have nothing to do with that team. Think of it this way - 20 spec Miatas with identical drivers have a 10 lap race with a LeMons type of start. They finish 1 to 20 but will all finish on the same lap which is more closely tied to their actual performance.

    1. Hi Rob, good to hear from you. I briefly address the shortcomings of using finishing position in the post (just above the first graph) and will probably actually compile a new graph with the method I describe.

      Mostly, I wanted a way that was easy to identify teams on the graph with a simple, all-integer coordinate system that relates to easily found information from timing sheets.

      There's another anecdotal post going up today or tomorrow and I already have plans for three (unwritten and thus-far unresearched) posts using this same data in the near future.

      As I'm not a professional statistician, I'm definitely always open to more detailed feedback and have gotten a lot of great stuff already from this post and from the ones I did in January.

      Thanks for reading!