googledrive allows you to interact with files on Google Drive from R.
Install from CRAN:
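A minimal sketch of the usual setup, assuming the released CRAN version is the one you want: install.packages() is the standard R call for this, and library() then attaches the package for the examples that follow.

```r
# Install the released version from CRAN (one-time setup)
install.packages("googledrive")

# Attach the package so the drive_*() functions are available
library(googledrive)
```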
Most functions begin with the prefix drive_. Auto-completion is your friend.
The metadata for one or more Drive files is held in a dribble, a “Drive tibble”. This is a data frame with one row per file. A dribble is returned (and accepted) by almost every function in googledrive. It is designed to give people what they want (file name), track what the API wants (file id), and hold the metadata needed for general file operations.
googledrive is pipe-friendly and re-exports %>%, but does not require its use.
Here’s how to list the first 50 files you see in My Drive. You can expect to be sent to your browser here, to authenticate yourself and authorize the googledrive package to deal on your behalf with Google Drive.
    drive_find(n_max = 50)
    #> Auto-refreshing stale OAuth token.
    #> # A tibble: 50 x 3
    #>    name                    id                                drive_resource
    #>  * <chr>                   <chr>                             <list>
    #>  1 chicken-xyz.csv         0B0Gh-SuuA2nTVUZGclZiSzZ0bkE      <list >
    #>  2 chicken-rm.txt          0B0Gh-SuuA2nTT3dBbXd1ZWtvSkE      <list >
    #>  3 chicken.jpg             0B0Gh-SuuA2nTbEhtYnIzcFNfX3M      <list >
    #>  4 README-mirrors.csv      1LJlt-1emr662GV8WdEzddzsfqrt-Vg…  <list >
    #>  5 README-mirrors.csv      1PLXfempSnjpXbKVEXwMG5vBEnd-Fwm…  <list >
    #>  6 def                     0B0Gh-SuuA2nTRG5YWFVGaV8zbU0      <list >
    #>  7 abc                     0B0Gh-SuuA2nTT2NqTGdLVWFkcjA      <list >
    #>  8 folder1-level4          0B0Gh-SuuA2nTaTR6elE0TjZUUHM      <list >
    #>  9 folder1-level3          0B0Gh-SuuA2nTWktWeTB0ajVoQjQ      <list >
    #> 10 cranberry-TEST-drive-ls 1PM--xCb5axy5Uu9f6fDNjPAN2psRbQ…  <list >
    #> # ... with 40 more rows
You can narrow the query by specifying a pattern to match names against, or by specifying a file type: the type argument understands MIME types, file extensions, and a few human-friendly keywords.
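As a sketch of both options, the calls below search by name pattern and by file type. The pattern "chicken" and the keyword "spreadsheet" are illustrative choices, and the results depend entirely on what is in your own Drive; an authenticated session is assumed.

```r
library(googledrive)

# Keep only files whose names match a regular expression
drive_find(pattern = "chicken")

# Keep only Google Sheets, using a human-friendly type keyword
drive_find(type = "spreadsheet")
```

The two arguments can also be combined in a single call to narrow the listing further.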
Alternatively, you can refine the search using the q query parameter. Accepted search clauses can be found in the Google Drive API documentation. For example, to get all files with 'horsebean' somewhere in their full text (such as files based on the chickwts dataset!), do this:
    (files <- drive_find(q = "fullText contains 'horsebean'"))
    #> # A tibble: 8 x 3
    #>   name                             id                        drive_resource
    #> * <chr>                            <chr>                     <list>
    #> 1 chickwts                         0B0Gh-SuuA2nTN05CNjk3bG…  <list >
    #> 2 chickwts_gdoc-TEST-drive-publish 1BHmmAyclG4RQS7hOJpeKlI…  <list >
    #> 3 foobar                           1qoA3kr9DmSTtsG9hoicP7y…  <list >
    #> 4 foobar                           0B0Gh-SuuA2nTa01CaXZZOW…  <list >
    #> 5 chickwts-TEST-drive-publish      0B0Gh-SuuA2nTSjh6SElzSV…  <list >
    #> 6 chickwts_gdoc-TEST-drive-list    1lA8iYCyFyFi7T_qe7UrYyB…  <list >
    #> 7 chickwts_txt-TEST-drive-publish  0B0Gh-SuuA2nTaU1CRmR2ZG…  <list >
    #> 8 hadley-googledrive-tour          1CndPuuAlTGNJkyqqE-CNuw…  <list >
You generally want to store the result of a googledrive call, as we do with files above. files is a dribble with info on several files and can be used as the input for downstream calls. It can also be manipulated like a regular data frame at any point.
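For instance, ordinary data frame subsetting works on a dribble. The sketch below assumes an authenticated session and repeats the 'horsebean' query so that it stands on its own; the name filter "^chickwts" is only an illustration.

```r
library(googledrive)

# Same full-text query as above
files <- drive_find(q = "fullText contains 'horsebean'")

# Base R row subsetting: keep files whose name starts with "chickwts"
chickwts_files <- files[grepl("^chickwts", files$name), ]

# The subset is still usable as input downstream, e.g. to re-fetch by id
drive_get(as_id(chickwts_files))
```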
as_id() can be used to coerce various inputs into a marked vector of file ids. It works on file ids (for obvious reasons!), various forms of Drive URLs, and dribbles.
    ## x is assumed to be a one-row dribble for the Drive item "def",
    ## e.g. the result of an earlier drive_get() call
    ## let's retrieve same file by id (also a great way to force-refresh metadata)
    x$id
    #> [1] "0B0Gh-SuuA2nTRG5YWFVGaV8zbU0"
    drive_get(as_id(x$id))
    #> # A tibble: 1 x 3
    #>   name  id                           drive_resource
    #> * <chr> <chr>                        <list>
    #> 1 def   0B0Gh-SuuA2nTRG5YWFVGaV8zbU0 <list >
    drive_get(as_id(x))
    #> # A tibble: 1 x 3
    #>   name  id                           drive_resource
    #> * <chr> <chr>                        <list>
    #> 1 def   0B0Gh-SuuA2nTRG5YWFVGaV8zbU0 <list >
In general, googledrive functions that operate on files allow you to specify the file(s) by name/path, file id, or in a dribble. If it’s ambiguous, use as_id() to flag a character vector as holding Drive file ids as opposed to file paths. This function can also extract file ids from various URLs.
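The URL case can be sketched as follows. The URL here is an illustrative one, built from the id of the "def" item shown earlier rather than copied from a real browser session.

```r
library(googledrive)

# Illustrative Drive URL wrapping a file id (assumed form, not from the text above)
url <- "https://drive.google.com/open?id=0B0Gh-SuuA2nTRG5YWFVGaV8zbU0"

# as_id() extracts the id and marks the result as a Drive file id
id <- as_id(url)
id
```

The marked id can then be passed to functions such as drive_get(), where it will be treated as an id and not as a path.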
We can upload any file type.
    (chicken <- drive_upload(
      drive_example("chicken.csv"),
      "README-chicken.csv"
    ))
    #> Local file:
    #>   * /Users/jenny/resources/R/library/googledrive/extdata/chicken.csv
    #> uploaded into Drive file:
    #>   * README-chicken.csv: 1w9vv35y7pNE6q_wl3EYoh5ty_GO44gj0
    #> with MIME type:
    #>   * text/csv
    #> # A tibble: 1 x 3
    #>   name               id                                drive_resource
    #> * <chr>              <chr>                             <list>
    #> 1 README-chicken.csv 1w9vv35y7pNE6q_wl3EYoh5ty_GO44gj0 <list >
Notice that the file was uploaded as text/csv. Since this was a .csv document and we didn’t specify the type, googledrive guessed the MIME type. We can overrule this by using the type parameter to upload as a Google Spreadsheet. Let’s delete this file first.
    drive_rm(chicken)
    #> Files deleted:
    #>   * README-chicken.csv: 1w9vv35y7pNE6q_wl3EYoh5ty_GO44gj0

    ## example of using a dribble as input
    chicken_sheet <- drive_upload(
      drive_example("chicken.csv"),
      "README-chicken.csv",
      type = "spreadsheet"
    )
    #> Local file:
    #>   * /Users/jenny/resources/R/library/googledrive/extdata/chicken.csv
    #> uploaded into Drive file:
    #>   * README-chicken.csv: 12ZzU5hS7GFQdpJBoVzBz3p8Xie2uNmCuNJNUVemGF9k
    #> with MIME type:
    #>   * application/vnd.google-apps.spreadsheet
Versions of Google Documents, Sheets, and Presentations can be published online. You can check your publication status by running drive_reveal(..., "published"), which adds a logical column published and parks more detailed metadata in a revision_resource list-column. drive_publish() will publish your most recent version.
    (chicken_sheet <- drive_publish(chicken_sheet))
    #> Files now published:
    #>   * README-chicken.csv: 12ZzU5hS7GFQdpJBoVzBz3p8Xie2uNmCuNJNUVemGF9k
    #> # A tibble: 1 x 7
    #>   name    published shared id             drive_resource permissions_res…
    #> * <chr>   <lgl>     <lgl>  <chr>          <list>         <list>
    #> 1 README… TRUE      TRUE   12ZzU5hS7GFQd… <list >        <list >
    #> # ... with 1 more variable: revision_resource <list>
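The status check described above can be run at any time. This is a minimal sketch, assuming an authenticated session and that exactly one file named "README-chicken.csv" exists in your Drive; the file is looked up by name so that the snippet stands on its own.

```r
library(googledrive)

# Look the sheet up by name (assumes a single match in your Drive)
chicken_sheet <- drive_get("README-chicken.csv")

# Add the logical `published` column and the detailed revision metadata
status <- drive_reveal(chicken_sheet, "published")

# TRUE if the most recent version is published
status$published
```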
We can download files from Google Drive. Native Google file types (such as Google Documents, Google Sheets, and Google Slides) need to be exported to some conventional file type. There are reasonable defaults, or you can specify this explicitly via type or implicitly via the file extension in path. For example, if I would like to download the “538-star-wars-survey” Google Sheet as a .csv, I could pass type = "csv" to drive_download().
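A sketch of that call with an explicit type, assuming an authenticated session and a Google Sheet named "538-star-wars-survey" in your Drive. With no path given, the local file should take its name from the Drive file, with a .csv extension.

```r
library(googledrive)

# Export the Google Sheet as CSV by naming the type explicitly
drive_download("538-star-wars-survey", type = "csv")
```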
Alternatively, I could specify the type implicitly via the file extension in the path parameter.
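The path-based variant might look like the sketch below. The local file name is an illustrative choice; the same Google Sheet and an authenticated session are assumed.

```r
library(googledrive)

# The .csv extension in `path` determines the export type
drive_download(
  "538-star-wars-survey",
  path = "538-star-wars-survey.csv",
  overwrite = TRUE  # replace a local copy left by an earlier download
)
```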
Notice that I would also specify overwrite = TRUE in that call, in order to overwrite the local file previously saved.
Finally, you could just allow export to the default type. In the case of Google Sheets, this is an Excel workbook:
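A sketch of the default export, under the same assumptions (authenticated session, a Google Sheet named "538-star-wars-survey" in your Drive):

```r
library(googledrive)

# No type or path given: the Google Sheet is exported to its default type
drive_download("538-star-wars-survey")
```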
Downloading files that are not Google type files is even simpler, i.e. it does not require any conversion or type info.
    ## upload something we can download
    text_file <- drive_upload(drive_example("chicken.txt"), name = "text-file.txt")
    #> Local file:
    #>   * /Users/jenny/resources/R/library/googledrive/extdata/chicken.txt
    #> uploaded into Drive file:
    #>   * text-file.txt: 1l-jboSScpNnleEeJC1Hf3rv6bzwTcsnA
    #> with MIME type:
    #>   * text/plain

    ## download it and prove we got it
    drive_download("text-file.txt")
    #> File downloaded:
    #>   * text-file.txt
    #> Saved locally as:
    #>   * text-file.txt
    readLines("text-file.txt") %>% head()
    #> [1] "A chicken whose name was Chantecler"
    #> [2] "Clucked in iambic pentameter"
    #> [3] "It sat on a shelf, reading Song of Myself"
    #> [4] "And laid eggs with a perfect diameter."
    #> [5] ""
    #> [6] "—Richard Maxson"