Using Kaggle Datasets in Google Colab
You should be able to access any dataset on Kaggle via the API. In this example, only the competition datasets are being listed. To see the datasets you can access, run this command:
kaggle datasets list
You can also search for datasets by adding the -s flag followed by the search term you're interested in. For example, this gives you a list of datasets about dogs:
kaggle datasets list -s dogs
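If you would rather run the same listing from a Python cell than from a shell line, a minimal sketch could look like the one below. The helper names build_list_command and list_datasets are made up for illustration; the only thing they rely on is the kaggle datasets list command and its -s flag shown above, and list_datasets assumes the kaggle CLI is installed and an API token is already configured.

```python
import shutil
import subprocess
from typing import List, Optional


def build_list_command(search: Optional[str] = None) -> List[str]:
    """Build the argument list for `kaggle datasets list`, optionally with -s."""
    cmd = ["kaggle", "datasets", "list"]
    if search:
        # -s narrows the listing to datasets matching the search term
        cmd += ["-s", search]
    return cmd


def list_datasets(search: Optional[str] = None) -> str:
    """Run the command and return its text output.

    Assumption: the kaggle CLI is on PATH and a token is configured.
    """
    if shutil.which("kaggle") is None:
        raise RuntimeError("kaggle CLI not found; try `pip install kaggle` first")
    result = subprocess.run(
        build_list_command(search), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling print(list_datasets("dogs")) should then show the same table as the shell command, provided the assumptions above hold.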
You can find more information on the API and how to use it in the Kaggle API documentation.
Hope that helps! :)
Detailed approach:
1. On Kaggle, open your profile and go to My Account.
2. Scroll down until you find the Create New API Token option and click it. This downloads a file called kaggle.json.
3. Go to Colab and upload the kaggle.json file.
4. Install the Kaggle client by running !pip install kaggle
5. Create a new folder named .kaggle in your home directory, copy kaggle.json into it, and set read-write permissions for your user only.
6. Go to the Kaggle website and open the dataset you want to download. Click the three dots on the right-hand side of the screen, then click Copy API command.
7. Go back to Colab and paste the API command into a cell, prefixed with ! so that it runs as a shell command.
8. When you run !ls, you will see that the download is a zip file.
9. To unzip the file, run !unzip followed by the name of the zip file.
10. Now, when you run !ls again, you'll find that the CSV file has been extracted from the zip file.
11. To read the file, import pandas and call pd.read_csv on the extracted CSV file.
12. As you can see, the file has been read into Colab successfully.
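The unzip and read steps can also be done from Python instead of shell commands. The following is only a sketch using the standard zipfile module; extract_csvs is a hypothetical helper name, and the archive name in the usage note is a placeholder for whatever file your download produced.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_csvs(zip_path: str, dest_dir: str = ".") -> List[Path]:
    """Extract a downloaded archive and return the paths of its CSV files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    # Keep only the CSV members, sorted so the order is stable
    return sorted(dest / name for name in names if name.lower().endswith(".csv"))
```

With pandas imported as pd, something like pd.read_csv(extract_csvs("your-download.zip")[0]) would then load the first CSV, where your-download.zip stands in for the real file name.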
This downloads the Kaggle dataset into Google Colab, where you can perform analysis, build machine learning models, or train neural networks.
Happy Analysis!!!
Step-by-step --
1. Create an API key in Kaggle. To do this, go to kaggle.com and open your user settings page. Next, scroll down to the API access section and click generate to download an API key. This will download a file called kaggle.json to your computer. You'll use this file in Colab to access Kaggle datasets and competitions.
2. Navigate to https://colab.research.google.com/.
3. Upload your kaggle.json file using the following snippet in a code cell:
    from google.colab import files
    files.upload()
4. Install the kaggle API using !pip install -q kaggle
5. Move the kaggle.json file into ~/.kaggle, which is where the API client expects your token to be located:
    !mkdir -p ~/.kaggle
    !cp kaggle.json ~/.kaggle/
6. Now you can access datasets using the client, e.g., !kaggle datasets list
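For anyone who prefers to do the token setup in Python, here is a rough equivalent of the mkdir and cp commands. It also restricts the file to owner read/write, which matches the permissions advice in the earlier answer. The function name install_kaggle_token and its config_dir parameter are assumptions made for this sketch, not part of the Kaggle client.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def install_kaggle_token(token_path: str = "kaggle.json",
                         config_dir: Optional[str] = None) -> Path:
    """Copy the token into the Kaggle config directory and restrict access to it."""
    # Default to ~/.kaggle, where the API client looks for the token
    target_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # like: mkdir -p ~/.kaggle
    target = target_dir / "kaggle.json"
    shutil.copyfile(token_path, target)            # like: cp kaggle.json ~/.kaggle/
    os.chmod(target, 0o600)                        # owner read/write only
    return target
```

Running install_kaggle_token() right after the upload cell should leave the token where the client expects it.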
Here's a complete example notebook of the Colab portion of this process: https://colab.research.google.com/drive/1DofKEdQYaXmDWBzuResXWWvxhLgDeVyl
This example shows uploading the kaggle.json file, installing the Kaggle API client, and using the Kaggle client to download a dataset.