File Identification • googledrive

The most natural way to identify a file is by its name. The Drive API, however, fundamentally identifies a file by its unique id. googledrive makes it easy to specify your file of interest by name at first, but then immediately capture that in a way that includes the file’s id. This facilitates downstream file operations.

googledrive holds information on Drive files in what we call a dribble, a Drive tibble. A tibble is the variant of the data frame that is used throughout the tidyverse. A googledrive dribble will have one row per file and is guaranteed to have these variables:

name: The file’s name.
id: The file’s unique id.
drive_resource: A list column containing Drive API metadata that is internally useful and possibly interesting to some users. The typical user will leave this variable unexamined. Just let it be.

Some functions add additional variables, but the three above are always present. drive_reveal(), for example, can add file paths, MIME types, trash status, and information about permissions and publishing. Use your usual techniques for data frame manipulation in order to isolate specific rows – files, in this case – that you want to operate on. For example, you can manipulate the dribble with dplyr verbs such as filter(), mutate(), arrange(), and slice().

How to get one or more files in a dribble

How do you get files into a dribble in the first place? Two main functions for this:

drive_find(): Similar to https://drive.google.com. Lists all your files or lets you narrow things down based on name or file properties.
drive_get(): Get files by name – file path, actually – or by id, including by URL.

drive_find()

Read the help for complete documentation but here are some of the many ways to call drive_find():

drive_find()
drive_find(n_max = 40)
drive_find(pattern = "chicken")
drive_find(type = "pdf")
drive_find(type = "folder")
drive_find(type = "spreadsheet")
drive_find(trashed = TRUE)
drive_find(q = "fullText contains 'project'")
drive_find(q = "modifiedTime > '2019-04-21T12:00:00'", order_by = "recency")
drive_find(q = c("starred = true", "visibility = 'anyoneWithLink'"))

drive_find() is for exhaustive file listing or filtering on file properties.

drive_get()

Read the help for complete documentation but here are how calls to drive_get() can look:

drive_get("i_am_a_file_name")
drive_get("i/am/a/deeply/buried/file.txt")
drive_get("i/am/a/folder/")
drive_get(c("i_am_a_file_name", "path/to/file"))
drive_get(as_id("abcdefghijklm"))
drive_get(as_id(c("abcdefghijklm", "nopqrstuvwxyz")))
drive_get(id = "abcdefghijklm")
drive_get(id = c("abcdefghijklm", "nopqrstuvwxyz"))
drive_get(as_id("https://docs.google.com/document/d/abcdefghijklm/edit"))

drive_get() is for targetted file fetching based on name, path, id, or URL.

Other handy functions

drive_reveal() adds bonus information to the dribble, either by excavating it from the drive_resource variable or by calling the Drive API. Use it on a dribble containing files of interest.

drive_reveal(files, "path")
drive_reveal(files, "trashed")
drive_reveal(files, "mime_type")
drive_reveal(files, "permissions")
drive_reveal(files, "published")

drive_ls() lists files below a specified folder. It’s a thin wrapper around drive_find(), so all those capabilities are available.

drive_ls("i/am/a/folder/", type = "spreadsheet")

drive_browse() will open a file in your browser.

drive_browse(i_am_a_dribble)
drive_browse("i_am_a_file_name")
drive_browse(as_id("abcdefghijklm"))