Search code examples
google-sheetsgoogle-sheets-formulaimportdata

Quota/limitations on importrange() and importdata() in Google Sheets?


I have one main Google Sheet that computers an approximately 10 column x 30 row worksheet. I have roughly 400 "client" Google Sheets (separate files) that pull that 10 x 30 worksheet in to use for data validation. Working in a school environment, the main sheet is readable/editable by a bunch of guidance counselors and academic deans, while the client sheets are accessible to individual students and their advisors (hence the multitude of sheets, to manage access).

The 10 x 30 worksheet is fairly static. It changes maybe once every couple of months, although yesterday it went through a series of rapid changes that precipitated my problem and thus my research. When it changes, those changes need to be reflected in all ~400 client sheets.

I had been using IMPORTRANGE(), but that stopped working yesterday. But not with the error described in the documentation regarding rate-limiting due to a large number of requests, but with one that suggests a total failure:

Error linking spreadsheets. The source spreadsheet has reached the maximum capacity for sharing and importing. Learn more

I read this as meaning "you've connected too many worksheets to this sheet, only the first n sheets will get data" (since that's what I'm seeing). But n is not well-defined, since this is a setup under development, and trashing (and then permanently deleting from the trash) old, disused client sheets changed nothing.

I experimented with dumping out the 10 x 30 worksheet into a CSV and using IMPORTDATA(), connecting the client sheets to that, hoping that that would trigger requests when the client sheets were opened, rather than when the main sheet changed. But that also maxed out with around 20 client sheets connected to it. To get Google Sheets to import the CSV from a file hosted in a shared drive, I invoked IMPORTDATA() thus:

=IMPORTDATA("https://drive.google.com/uc?export=download&id=<FILE_ID>")

Having read the documentation from Google on the limitations of the IMPORTRANGE() function and on optimizing data references, and having searched the web, I'm still not clear on how these (not 100% clearly documented) limitations affect my situation. I'm thinking of scripting the data pull to the client sheets next.

Any links to documentation (or empirical evidence) would be appreciated.


Solution

  • Every named user and group the spreadsheet is shared with, and every authorized importrange() access that reads the spreadsheet, counts as an access slot. The absolute limit on the number of shares from a spreadsheet is 600 access slots.

    There is currently (February 2024) no way to find out which daughter spreadsheets are reading data from a mother spreadsheet using importrange(), so it is not easy to discover how many access slots have been used that way. Further, an importrange() access slot does not get freed when you remove the formula from a daughter spreadsheet, nor when you delete a daughter spreadsheet. These are known issues, and Google is aware of them, so chances are that there will be a feature for discovering dependent spreadsheets and better manage access slots at some point. Until that happens, it makes sense to plan ahead and stay well below the limit. There are several approaches.

    The easiest approach is to use File > Share > Share with others to share the spreadsheet as "anyone with link" and remove all named user shares you can. Every named user you remove from the list frees up one access slot. The importrange() function can also read through an "anyone with link" share without using up an access slot, regardless of whether the share is "editor", "commenter" or "viewer". Note that this only applies to new importrange() accesses you create after changing sharing so it will not release any existing access slots. Also note that this option is different from "anyone at your domain with the link", so you cannot use this approach in a Google Workspace Domain where "anyone with link" has been disallowed.

    If there are many named user shares but you cannot share the file with "anyone with link", another approach is to share the spreadsheet with groups instead of individual users. If there are 200 named user shares, you can put these users in a group to free 199 access slots. Note that there is a separate limit of a maximum of 100 groups within the 600 access slots.

    The third approach is to use File > Share > Publish to web > Comma-separated values to get a link from where you can read with importdata() instead of importrange(). This does not put as much load on the mother spreadsheet, and no access slots are used up. Note again that that this only applies to new accesses — replacing an existing importrange() with importdata() will not release an access slot.

    The fourth approach is to chain spreadsheets so that a grandmother spreadsheet feeds a number of mother spreadsheets that in turn each feed a number of daughter spreadsheets. Each daughter uses importrange() to read from their mother, and each mother uses importrange() to read from the grandmother spreadsheet.

    When you already have an existing spreadsheet that is approaching the limit, you can start a chain by using File > Make a copy to create a new grandmother spreadsheet. Replace the data in the original spreadsheet with importrange() formulas that point to the new grandmother. The original spreadsheet becomes one of the mothers, and the existing daughters continue reading from that mother. You can then create additional mothers, and additional daughters that read from the new mothers, without adding load to the original spreadsheet.

    You should keep the number of shares well below 600 at every level of the chain:

    • to share to 1,000 daughters,  use 1 grandmother with say 20 mothers, each with say 50 daughters;

    • to share to 5,000 daughters, use 1 grandmother with say 50 mothers, each with say 100 daughters.

    To improve performance, you may also want to modify some of the existing importrange() formulas in existing daughter spreadsheets to refer to one of the new mothers instead of the original mother that is under heavier load than the others.

    There are many things that affect spreadsheet performance. See my optimization tips.