R Dataset / Package plyr / baseball

Submitted by pmagunia on March 9, 2018 - 1:06 PM
Dataset License
GNU General Public License v2.0
Attachment Size
dataset-89446.csv 1.5 MB
Documentation

Yearly batting records for all major league baseball players

Description

This data frame contains batting statistics for a subset of players collected from http://www.baseball-databank.org/. There are a total of 21,699 records, covering 1,228 players from 1871 to 2007. Only players with more 15 seasons of play are included.

Usage

baseball

Format

A 21699 x 22 data frame

Variables

Variables:

  • id, unique player id

  • year, year of data

  • stint

  • team, team played for

  • lg, league

  • g, number of games

  • ab, number of times at bat

  • r, number of runs

  • h, hits, times reached base because of a batted, fair ball without error by the defense

  • X2b, hits on which the batter reached second base safely

  • X3b, hits on which the batter reached third base safely

  • hr, number of home runs

  • rbi, runs batted in

  • sb, stolen bases

  • cs, caught stealing

  • bb, base on balls (walk)

  • so, strike outs

  • ibb, intentional base on balls

  • hbp, hits by pitch

  • sh, sacrifice hits

  • sf, sacrifice flies

  • gidp, ground into double play

References

http://www.baseball-databank.org/

Examples

baberuth <- subset(baseball, id == "ruthba01")
baberuth$cyear <- baberuth$year - min(baberuth$year) + 1calculate_cyear <- function(df) {
  mutate(df,
    cyear = year - min(year),
    cpercent = cyear / (max(year) - min(year))
  )
}baseball <- ddply(baseball, .(id), calculate_cyear)
baseball <- subset(baseball, ab >= 25)model <- function(df) {
  lm(rbi / ab ~ cyear, data=df)
}
model(baberuth)
models <- dlply(baseball, .(id), model)
--

Dataset imported from https://www.r-project.org.

Documentation License
GNU General Public License v2.0

From Around the Site...

Title Authored on Content type
R Dataset / Package vcd / JointSports March 9, 2018 - 1:06 PM Dataset
R Dataset / Package Stat2Data / Titanic March 9, 2018 - 1:06 PM Dataset
R Dataset / Package KMsurv / tongue March 9, 2018 - 1:06 PM Dataset
R Dataset / Package psych / Tucker March 9, 2018 - 1:06 PM Dataset
R Dataset / Package Ecdat / Kakadu March 9, 2018 - 1:06 PM Dataset