Baseball Analytics

  • Comments posted to this topic are about the item Baseball Analytics

  • I'd love to have a go with R but it is too far down my list. I guess that I will benefit from the many articles and discussions posted here before I get that chance and for that I thank you all.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I'm the data manager for the baseball league I participate in each summer. This past season we used iScore Sport's app on a couple of Samsung tablets instead of using scorebooks. It generated relevant stats pretty quickly, but we only used it for batting statistics. We could also go through the play by play online, which was cool. If we had kept track of what position each person was playing, it would have given us defensive stats too. The year before, I would transcribe what was in the scorebooks and put the data into SQL Server. It gave me more flexibility than using an app, but man was it time consuming!

    I think that the analysis is a bit more fun since it's people I know. The top players definitely enjoy seeing their names on the league website.

  • I'm sure someone already has all the stats for all the sporting events in some database. But I think it would be fun building my own for my local college football team.

  • I enjoyed playing baseball too - we had an American guy coach some Physical Education sessions for us when at school for a couple of years, and I really liked playing it. Watching not so much I'll admit, if I'm going to watch a long and boring game let's make it cricket, the daddy of such.

    I'm in Gary's boat though, I'd love the chance to do something with R but it's just not coming up yet, perhaps sometime soon.

  • I'd recommend the book "Analyzing Baseball Data With R" as a good jumping off point. The book uses Sean Lahman's free baseball data set, available here:

    http://seanlahman.com/baseball-archive/statistics/

    The data is available in CSV, Access and MySQL format, so it takes a bit of manipulation to get it into SQL Server.
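    For anyone wondering what that manipulation might look like, here is a minimal sketch of loading a batting-style CSV into a SQL table. It is only an illustration under assumptions: the sample rows are made up, the column names (playerID, yearID, AB, H, HR) are assumed for the example, and Python's built-in sqlite3 stands in for SQL Server so the script runs anywhere. The helper names `to_int` and `load_batting` are hypothetical.

```python
import csv
import io
import sqlite3

# Made-up sample rows in the shape of a batting CSV; the column names
# are assumed here for illustration only.
SAMPLE_CSV = """playerID,yearID,AB,H,HR
player01,2014,500,150,20
player02,2014,450,110,
player01,2015,520,140,25
"""


def to_int(value):
    """Convert a CSV field to int, mapping empty strings to None (NULL)."""
    return int(value) if value.strip() else None


def load_batting(conn, csv_file):
    """Create a batting table and bulk-insert the CSV rows into it."""
    conn.execute(
        "CREATE TABLE batting ("
        "playerID TEXT NOT NULL, yearID INTEGER NOT NULL, "
        "AB INTEGER, H INTEGER, HR INTEGER)"
    )
    reader = csv.DictReader(csv_file)
    rows = [
        (
            r["playerID"],
            to_int(r["yearID"]),
            to_int(r["AB"]),
            to_int(r["H"]),
            to_int(r["HR"]),
        )
        for r in reader
    ]
    conn.executemany("INSERT INTO batting VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# sqlite3 in memory is a stand-in for a real SQL Server connection.
conn = sqlite3.connect(":memory:")
loaded = load_batting(conn, io.StringIO(SAMPLE_CSV))

# Career home run totals per player; a player with only NULLs sums to NULL.
totals = conn.execute(
    "SELECT playerID, SUM(HR) FROM batting GROUP BY playerID ORDER BY playerID"
).fetchall()
```

    Against a real SQL Server you would swap the connection for your driver of choice and adjust the column types; most of the fiddly part tends to be handling empty fields, which is what `to_int` is for.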

    I've taken R courses and used R at work, and this book has a decent introductory chapter before it dives deeper into the sabermetrics end of the pool.

    Greg

  • I use pandas and various other Python tools to do this sort of analysis. Like Buck Woody, I'm building a toolbox that will probably include R in the future. This is in addition to Excel, SQL Server and various other utilities.

    As for data sets, I'd rather grab some of the various climate, topographic, astronomy, census or demographic data available from the government and other organizations. I can usually find data that's interesting and applicable to similar work data.

  • gregory.price (11/4/2015)


    I'd recommend the book "Analyzing Baseball Data With R" as a good jumping off point. The book uses Sean Lahman's free baseball data set, available here:

    http://seanlahman.com/baseball-archive/statistics/

    The data is available in CSV, Access and MySQL format, so it takes a bit of manipulation to get it into SQL Server.

    I've taken R courses and used R at work, and this book has a decent introductory chapter before it dives deeper into the sabermetrics end of the pool.

    Greg

    Thanks, I'll check it out. I have the Lahman database in SQL already, so I can play.

  • Does this analysis take into account the facts that many losing teams trade away top players during the season to plan for their future? Additionally, does it account for whether teams 27 games out of first place take the risks to win at the end of the season? For example, some teams may shut down pitchers to keep them fresher for the next season.

  • petti1955 (11/4/2015)


    Does this analysis take into account the facts that many losing teams trade away top players during the season to plan for their future? Additionally, does it account for whether teams 27 games out of first place take the risks to win at the end of the season? For example, some teams may shut down pitchers to keep them fresher for the next season.

    If you think about it, isn't that the reason why?

    The hypothesis was made and then tested against the data. It never explained the reason but treated it as some kind of phenomenon. petti1955 has taken it a step further to explain why. Now we just need to prove it 😉

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I think this is a high level analysis that shows losing teams get worse. Why? That's for further analysis. Is it pitchers changing (traded, sent down, kids brought up) or other factors? We could dig in further, but it's a basic analysis.

    Since this is zero sum (I win, you lose), perhaps there are other reasons to look deeper into the factors behind the theory.
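    One rough way to start digging would be to compare each team's winning percentage in the first half of its schedule with the second half. The sketch below is not the analysis from the article, just an illustration: the game results are made up, and the function name `split_win_pct` and the tuple layout (team, game number, won) are assumptions.

```python
from collections import defaultdict

# Made-up results for illustration: (team, game_number, won).
GAMES = [
    ("A", 1, True), ("A", 2, True), ("A", 3, False), ("A", 4, True),
    ("B", 1, False), ("B", 2, True), ("B", 3, False), ("B", 4, False),
]


def split_win_pct(games):
    """Return {team: (first_half_pct, second_half_pct)} by game order."""
    by_team = defaultdict(list)
    for team, game_number, won in games:
        by_team[team].append((game_number, won))

    result = {}
    for team, played in by_team.items():
        played.sort()  # order each team's games by game number
        outcomes = [won for _, won in played]
        if len(outcomes) < 2:
            continue  # cannot split a single game into two halves
        mid = len(outcomes) // 2
        first, second = outcomes[:mid], outcomes[mid:]
        # True counts as 1 and False as 0, so sum() gives the win count.
        result[team] = (sum(first) / len(first), sum(second) / len(second))
    return result


splits = split_win_pct(GAMES)
```

    A drop from the first number to the second would match the "losing teams get worse" pattern, but it says nothing about why. Trades, call-ups and shut-down pitchers would need roster data on top of this.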

  • chrisn-585491 (11/4/2015)


    I use pandas and various other Python tools to do this sort of analysis.

    That's what I use with SQL Server. There is no need to use R when you can just run Python scripts that do pretty much the same analysis.

    Wish I could explore more with R with the new versions, but not easy to get that approved. 🙁
