Data
Historical and popular handicapping criteria, applied to the top 22 Kentucky Derby prospects, listed according to graded stakes earnings. The spreadsheet includes the complete 2003-2008 Derby fields and the top three finishers 1998-2008 for reference, and will be updated once more during Derby week, after all workouts are done and post positions have been drawn. Note: This year I’ve added two columns, one for “Started on dirt,” another for “Won on dirt,” for those concerned about the surface factor.
Any questions or suggestions? Please let me know in the comments (thanks for asking about when this would be up, Jeff). I’ll be returning to the spreadsheet, stats, and Derby handicapping next week.
4/21/09 Addition: Geno at Equispace has also been hard at work compiling data, and has posted a thorough spreadsheet that includes Beyer speed figures and dosage for the top Derby prospects.
Ray Paulick has posted a piece this morning on the possible expansion of the Jockey Club into the tote business that includes a bit on Equibase and its practice of locking all data up behind a paywall, unlike most major sports. “It’s short-term thinking,” says an executive quoted by Paulick. “If our objective in racing is for the horseplayers to win, we should do everything we can to help him, and increase the churn. That’s where the revenue for our business should come from, not from the statistics the horseplayer needs.” Heck, yes.
On the topic, here’s a bit from a post on June 5, 2008:
The Supreme Court squashed Major League Baseball’s attempt to maintain exclusive control of player statistics, turning down its appeal of an Eighth Circuit Court ruling that allowed fantasy baseball leagues to use the data without paying a licensing fee. “The information used in … fantasy baseball games is all readily available in the public domain,” said the appeals court, “and it would be strange law that a person would not have a First Amendment right to use information that is available to everyone.” Well, this is interesting … and most definitely relevant to the industry. Applied to racing, this ruling could be interpreted to mean that almost all data and statistics in the past performances and results charts are in the public domain (which makes it ridiculous that Equibase buries historical charts behind a paywall), but not presentation of the data or statistics [so no straight re-posting of PDF charts], or analysis derived using proprietary methods (such as speed figures).
CBSSports.com responded to the Supreme Court’s decision by launching a new site that makes available data for baseball, as well as football, basketball, hockey, and auto racing. I’d love to see a similar initiative in racing. As baseball stats wizard Bill James said,
People take information and build knowledge. When you give them new information they will create new knowledge, absolutely and without question.
Free data and historical stats, that’s the way to build the fan base.
“If you look back to 1990 and see what information was available and how it was made available, we’ve accomplished a lot,” Equibase president Hank Zeitlen tells Paulick, and that might be true, but it’s not enough.
“Turns out, that there is still huge unlocked potential, there is still a huge frustration that people have, because we haven’t got data on the web as data.”
In the TED talk embedded above, Tim Berners-Lee recalls inventing the WWW twenty years ago and observes that the web’s original purpose of linking documents together is evolving into one of linking data. (Think APIs, think of the potential for racing. Amazing, right? Try not to get too discouraged contemplating the current state of data distribution in the industry.)
Copyright © 2000-2023 by Jessica Chapel. All rights reserved.