Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Trouble reading multiple rds files at the same time and saving them to a single data frame

I have a for loop that builds the URLs of several rds files, and I need a way to read those files. I could read them one at a time and merge them manually with rbind, but I would prefer to end up with a single data frame directly.

YearsSeq <- seq(2010,2020,1)
for (year in YearsSeq) {
  Allrds <- paste0('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_', sprintf('%02d', YearsSeq), '.rds')
}

With the code above, Allrds holds one file URL per year, so that Allrds[1] points to the 2010 data, Allrds[2] to the 2011 data, etc.

Using readRDS(Allrds) doesn't work; it returns the error "Error in gzfile(file, "rb") : invalid 'description' argument".

Any help would be appreciated!!



1 Answer

You have a couple of issues here. First, readRDS reads one file at a time, so passing it the whole Allrds vector fails; an apply function such as lapply handles this better than a loop. Second, you cannot read an RDS file directly from a GitHub URL string; you need to open a connection with url().
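To see the first issue in isolation, here is a minimal offline sketch. It assumes two small local files as stand-ins for the season files; the `paths` vector and the toy columns are hypothetical and only serve the illustration.

```r
# Two tiny data frames saved as local .rds files
# (hypothetical stand-ins for the per-season files)
paths <- c(tempfile(fileext = ".rds"), tempfile(fileext = ".rds"))
saveRDS(data.frame(season = 2010, play_id = 1:2), paths[1])
saveRDS(data.frame(season = 2011, play_id = 1:3), paths[2])

# Passing the whole vector fails: readRDS() expects one file name
# or one connection, so the error message is captured here instead
vector_result <- tryCatch(readRDS(paths),
                          error = function(e) conditionMessage(e))

# Reading one element at a time works and returns a list of data frames
list_result <- lapply(paths, readRDS)
length(list_result)
```

Here `vector_result` ends up holding the error message rather than data, while `list_result` holds one data frame per file.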

So you can read into a list of data frames like this:

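# Read each season's file through a url() connection; the result is a list of data frames
base_url <- "https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_"
Allrds <- lapply(2010:2020, function(x) readRDS(url(paste0(base_url, x, ".rds"))))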

And you can bind into a single data frame like this:

Allrds <- do.call(rbind, Allrds)

# check size of data
dim(Allrds)

[1] 529480    340
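The binding step can be tried on a small scale without downloading anything. The sketch below uses a hypothetical two-element list named `seasons` with made-up columns; it only illustrates how do.call hands the list elements to rbind.

```r
# Toy list of data frames with identical columns (hypothetical data)
seasons <- list(
  data.frame(season = 2010, play_id = 1:2),
  data.frame(season = 2011, play_id = 1:3)
)

# do.call() passes each list element as a separate argument,
# so this is equivalent to rbind(seasons[[1]], seasons[[2]])
combined <- do.call(rbind, seasons)

# Rows add up across the inputs; columns stay the same
dim(combined)
```

Note that rbind on data frames matches columns by name, so this pattern assumes every file has the same set of columns.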

You will lose the year information when you rbind, but the data already contains a season column, so that is not an issue.
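If a set of files did not carry such a column, one option would be to tag each data frame with its year before binding. This is a sketch on toy data; the `plays` list and the `source_year` column name are hypothetical and not part of the nflfastR data.

```r
# Toy inputs: one data frame per year, without any year column (hypothetical)
years <- 2010:2011
plays <- list(
  data.frame(play_id = 1:2),
  data.frame(play_id = 1:3)
)

# Map() walks over the data frames and the years in parallel
# and adds the matching year to each data frame
tagged <- Map(function(df, yr) {
  df$source_year <- yr
  df
}, plays, years)

# Bind as before; every row now records which input it came from
combined <- do.call(rbind, tagged)
table(combined$source_year)
```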

