As always, please help me with the bandwidth costs for this data. Just buy me a beer. Thanks,
2007-2009 Pitch F/X Database in MySQL
This year I went through the import scripts from Mike Fast and reworked them, and especially the MySQL database, to make the import much, much faster with indexes. It was taking over 5 minutes to import a 2009 game with all the previous data in the database, and now it takes 30 seconds at most. I fixed games that had data errors and made sure they imported. There is a lot of time and effort that goes into this import, from my brother bugging me, to scripting it, to my brother bugging me, to testing, to my brother bugging me. It really isn’t that bad, but sometimes it feels like it. The import is up to 151MB compressed, so we might have to look at splitting it up by year in the future. Ideas? From now on I will only be releasing one file for the Pitch F/X MySQL database import, named pbp2.sql. The reworked file is updated daily, and the download link is below.
I would also like to know when the 2009 data from Retrosheet is out so I can import it, and maybe rework the output. I have heard that people want it broken out into 10- or 20-year spans. I can do it; I would just like to know that people will use it before I do.
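For anyone wondering why indexes make that much difference, here is a minimal sketch. It uses SQLite as a stand-in for MySQL only because SQLite ships with Python, and the table and column names (pitches, game_id, and so on) are made up for illustration, not taken from pbp2.sql. It assumes the import does lookups by game, which is a guess about how the scripts work. Without an index the database has to scan every row; with one it can jump straight to the matching rows.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# Hypothetical table: names and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pitches (game_id TEXT, ab_id INTEGER, pitch_type TEXT, start_speed REAL)"
)
conn.executemany(
    "INSERT INTO pitches VALUES (?, ?, ?, ?)",
    [(f"gid_{g:04d}", ab, "FF", 92.0) for g in range(500) for ab in range(20)],
)

lookup = "SELECT * FROM pitches WHERE game_id = ?"

# Before the index exists, the planner has no choice but a full table scan.
plan_before = query_plan(conn, lookup, ("gid_0042",))

conn.execute("CREATE INDEX idx_pitches_game_id ON pitches (game_id)")

# After the index exists, the planner can search it instead of scanning.
plan_after = query_plan(conn, lookup, ("gid_0042",))

print("before:", plan_before)
print("after: ", plan_after)
```

In MySQL the same idea would be a CREATE INDEX statement on whichever columns the import looks rows up by, with EXPLAIN in front of the SELECT to check that the index is being used.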
Download Pitch F/X Database here
Here is how it is all done. I have 4 scripts that run each night:
1. hack_4day.pl
2. hack_pbp2.pl
3. 2009.pl
4. update_db_with_count.pl
All the scripts are available on the downloads page.
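As a rough sketch of what a nightly run could look like, the wrapper below runs the four scripts one after another and stops at the first failure, so a later step never runs against half-imported data. The script names come from the list above; running them in the numbered order, calling them through `perl` from the current directory, and the function name `run_nightly` are all assumptions for illustration, not a description of the actual setup.

```python
import subprocess
import sys

# Script names are from the post; running them in this order is an assumption.
NIGHTLY_SCRIPTS = (
    "hack_4day.pl",
    "hack_pbp2.pl",
    "2009.pl",
    "update_db_with_count.pl",
)


def run_nightly(scripts=NIGHTLY_SCRIPTS, run=subprocess.run):
    """Run each script in order and stop at the first failure.

    Returns the list of scripts that finished with exit code 0.
    The `run` parameter exists so the runner can be swapped out for testing.
    """
    completed = []
    for script in scripts:
        result = run(["perl", script])
        if result.returncode != 0:
            print(f"{script} failed with exit code {result.returncode}", file=sys.stderr)
            break
        completed.append(script)
    return completed
```

Calling `run_nightly()` from a scheduled job (cron, for example) would be one way to kick this off each night; comparing the returned list against `NIGHTLY_SCRIPTS` tells you whether the whole chain finished.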
#1 by Nick Steiner at December 6th, 2009
Darrell -
Retrosheet for 2009 is out. I just wanted to let you know of that, and that I, along with several other people I’ve corresponded with, would be interested in having it broken up into groups of 10 years.
Are you still going to be interested in doing this?
#2 by Darrell at December 22nd, 2009
Nick,
I have broken the database out into decades: 1950s, 1960s, …. I also have the big boy, and should be releasing this today. My brother slowed me down, made me spend a week fixing his house, and still wants the baseball data on time. What a butt. Anyway, I will let you know when this is done.
Thanks,
#3 by Josh at January 3rd, 2010
Darrell,
Thanks very much for providing this data in MySQL format. A huge time saver.
I d/l-ed the retrosheet database, but can’t get the pitch f/x link to work. I get an empty file.
Any chance you could check the link to make sure it’s working?
Thanks!
P.S. I donated gladly.
#4 by Darrell at January 4th, 2010
Josh,
I have fixed the Pitch F/X database download. It was probably me using the Pitch F/X export script to make the Retrosheet sql.gz file. Letting it run for even a split second overwrote the data. Oh well, sorry for the mistake.
Darrell
#5 by josh2 at January 4th, 2010
How do I get the .gz file into my SQLyog? It won’t execute. Sorry, I’m a rookie.
Joshua
#6 by nick at February 2nd, 2010
I think the file might still be empty, no?
#7 by Jeff Zimmerman at February 3rd, 2010
It is a large file, so it is not blank.
.gz files are compressed (gzip) files; you will need to uncompress it first.
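If you do not have a tool handy for that, here is a minimal Python sketch that uncompresses a .gz file next to the original. The function name `decompress_gz` and the file name in the usage note are only examples; use whatever name your download actually has.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src, dst=None):
    """Uncompress a gzip file; by default write next to it without the .gz suffix."""
    src = Path(src)
    if dst is None:
        if src.suffix != ".gz":
            raise ValueError(f"expected a .gz file, got {src.name}")
        dst = src.with_suffix("")  # e.g. something.sql.gz -> something.sql
    dst = Path(dst)
    # Copy in chunks so a large dump never has to fit in memory at once.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, `decompress_gz("pbp2.sql.gz")` would write `pbp2.sql` in the same folder (assuming that is what the download is called). The resulting .sql file is plain text that a MySQL client can then execute.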