Tags: dataset, kaggle, google-colaboratory

Using Kaggle Datasets in Google Colab


Is it possible to use any datasets available via the Kaggle API in Google Colab? I see the Kaggle API is used in this Colab notebook, but it's unclear to me which datasets it provides access to.


Solution

  • Step-by-step --

    1. Create an API key in Kaggle.

      To do this, go to kaggle.com and open your user settings page.

    2. Next, scroll down to the API access section and click generate to create an API key. This downloads a file called kaggle.json to your computer. You'll use this file in Colab to access Kaggle datasets and competitions.

    3. Navigate to https://colab.research.google.com/.

    4. Upload your kaggle.json file using the following snippet in a code cell:

      from google.colab import files
      files.upload()

    5. Install the kaggle API using !pip install -q kaggle

    6. Move the kaggle.json file into ~/.kaggle, which is where the API client expects your token to be located:

      !mkdir -p ~/.kaggle
      !cp kaggle.json ~/.kaggle/

    7. Now you can access datasets using the client, e.g., !kaggle datasets list.
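Step 6 can also be done in Python rather than with shell commands. The following is a minimal sketch, not part of the Kaggle client: `install_kaggle_token` is a hypothetical helper name, and the permission change to owner-only is an added precaution that the original steps do not include. It assumes the token file contains the usual `username` and `key` fields.

```python
import json
import os
from pathlib import Path


def install_kaggle_token(token_path, config_dir=None):
    """Copy a kaggle.json token into the Kaggle config directory.

    Hypothetical helper for illustration; not part of the Kaggle client.
    """
    source = Path(token_path)
    credentials = json.loads(source.read_text())

    # Assumption: a downloaded token holds a "username" and a "key" entry.
    missing = {"username", "key"} - set(credentials)
    if missing:
        raise ValueError(f"{source} is missing fields: {sorted(missing)}")

    # Default to ~/.kaggle, the location named in step 6.
    if config_dir is not None:
        target_dir = Path(config_dir)
    else:
        target_dir = Path.home() / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / "kaggle.json"
    target.write_text(json.dumps(credentials))

    # Added precaution: make the file readable by the current user only.
    os.chmod(target, 0o600)
    return target
```

In a Colab cell, after `files.upload()` has placed kaggle.json in the working directory, calling `install_kaggle_token("kaggle.json")` should leave the token at ~/.kaggle/kaggle.json.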

    Here's a complete example notebook of the Colab portion of this process: https://colab.research.google.com/drive/1DofKEdQYaXmDWBzuResXWWvxhLgDeVyl

    This example shows uploading the kaggle.json file, installing the Kaggle API client, and using the client to download a dataset.
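For the download step, the sketch below wraps the `kaggle datasets download` command in Python. The function names are hypothetical, and `owner/dataset-name` stands in for whatever identifier `kaggle datasets list` reports; it is a placeholder, not a real dataset. It assumes the kaggle client is installed and the token is already in place.

```python
import subprocess


def build_download_command(dataset, dest=".", unzip=True):
    """Build the argument list for `kaggle datasets download`.

    Hypothetical helper; `dataset` is expected in "owner/dataset-name" form.
    """
    if dataset.count("/") != 1:
        raise ValueError("dataset must look like 'owner/dataset-name'")

    # -d selects the dataset, -p sets the download directory.
    command = ["kaggle", "datasets", "download", "-d", dataset, "-p", dest]
    if unzip:
        # --unzip extracts the downloaded archive.
        command.append("--unzip")
    return command


def download_dataset(dataset, dest=".", unzip=True):
    """Run the download; raises CalledProcessError if the client fails."""
    subprocess.run(build_download_command(dataset, dest, unzip), check=True)
```

Separating the command builder from the runner makes it possible to inspect the exact command before anything is downloaded, for example `build_download_command("owner/dataset-name", "data")`.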