Download Data
Table of Contents
- Data Transfer using FileZilla
- Downloading using Wget
- Downloading using Curl and Wget
- Downloading Data from Kaggle
- Download Data from Instance to Local using SFTP/SCP
You can download data to your Jarvislabs.ai GPU/CPU-powered instances in several ways. Let's look at some options.
- FileZilla
- Wget (simple and fast)
- Curl and Wget
- Downloading from Kaggle
- SFTP Transfer from your local directory
Data Transfer using FileZilla
Check out the video below to learn how to transfer data between your local machine and your Jarvislabs.ai instance.
Downloading using Wget
Open a terminal from JupyterLab and use the wget command followed by the link to download any publicly available dataset.
# Example: Downloading the CIFAR10 dataset
wget https://s3.amazonaws.com/fast-ai-sample/cifar10.tgz
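For large archives it can help to make the download resumable and to unpack it into a dedicated folder. The sketch below assumes GNU wget and tar are available on the instance; the helper names `fetch` and `extract_tgz` and the `data` directory are illustrative and not part of Jarvislabs.ai.

```shell
# fetch URL [DIR]: download URL into DIR (default: current directory).
# -c resumes a partially downloaded file instead of starting over.
# -P sets the directory the file is saved into.
fetch() {
  local url="$1" dir="${2:-.}"
  mkdir -p "$dir"
  wget -c -P "$dir" "$url"
}

# extract_tgz ARCHIVE [DIR]: unpack a gzip-compressed tar archive into DIR.
extract_tgz() {
  local archive="$1" dir="${2:-.}"
  mkdir -p "$dir"
  tar -xzf "$archive" -C "$dir"
}

# Example, using the CIFAR10 link from above:
# fetch https://s3.amazonaws.com/fast-ai-sample/cifar10.tgz data
# extract_tgz data/cifar10.tgz data
```

If the connection drops, running the same `fetch` call again continues from where the download stopped, provided the server supports resuming.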
Downloading using Curl and Wget
You can download data from Kaggle datasets, Google Drive, or other data sources to your Jarvislabs.ai instance using the Curl and Wget extensions.
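Once you have a direct download link, a curl call along the following lines can fetch it from the instance terminal. This is a minimal sketch: `fetch_with_curl` is a hypothetical helper name, and the URL in the usage comment is a placeholder rather than a real dataset.

```shell
# fetch_with_curl URL OUTPUT: download URL and save it as OUTPUT.
#   --fail      return an error on HTTP failures instead of saving the error page
#   --location  follow redirects, which hosted download links often use
#   --output    write the response to the given file
fetch_with_curl() {
  local url="$1" out="$2"
  mkdir -p "$(dirname "$out")"
  curl --fail --location --output "$out" "$url"
}

# Example (placeholder URL):
# fetch_with_curl "https://example.com/dataset.zip" data/dataset.zip
```

Quoting the URL matters because download links often contain characters such as `&` and `?` that the shell would otherwise interpret.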
Downloading Data from Kaggle
Kaggle provides an API that allows you to:
- Download Kaggle datasets
- Upload datasets to Kaggle
- Make Kaggle submissions
For more information, check the Kaggle API documentation.
Install Kaggle API
Open the terminal and run:
pip install kaggle --upgrade
Setup Kaggle API
- Go to the Account tab of your Kaggle user profile.
- Select Create API Token to download kaggle.json, which contains your API credentials.
- Place this file in the location ~/.kaggle/kaggle.json.
If you uploaded the file using JupyterLab, use the following commands to move it to the required location and restrict its permissions:
mkdir -p ~/.kaggle
mv kaggle.json ~/.kaggle/
chmod 600 ~/.kaggle/kaggle.json
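Before running any Kaggle command, you may want to confirm that the token is where the steps above put it and that only your user can read it. The helper below is a sketch under two assumptions: `check_kaggle_token` is a hypothetical name, and `stat -c` is the GNU form found on Linux instances (it differs on macOS).

```shell
# check_kaggle_token [PATH]: verify the token file exists and has mode 600.
# PATH defaults to ~/.kaggle/kaggle.json.
check_kaggle_token() {
  local token="${1:-$HOME/.kaggle/kaggle.json}"
  if [ ! -f "$token" ]; then
    echo "missing: $token" >&2
    return 1
  fi
  local mode
  mode="$(stat -c '%a' "$token")"   # octal permissions, e.g. 600
  if [ "$mode" != "600" ]; then
    echo "permissions are $mode, expected 600: $token" >&2
    return 1
  fi
  echo "ok: $token"
}

# Example:
# check_kaggle_token
```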
Downloading Kaggle dataset
Open a terminal and run the command below, replacing feedback-prize-2021 with the name of the competition you are participating in.
kaggle competitions download -c feedback-prize-2021
Uploading Kaggle dataset
You can use the Kaggle API to upload datasets from your Jarvislabs.ai instance to Kaggle. First, generate a metadata file for the dataset folder:
kaggle datasets init -p /path/to/dataset
You will find a dataset-metadata.json file inside the dataset folder. Change the 'id' and 'title' values in that file, then create the dataset:
kaggle datasets create -p /path/to/dataset
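The metadata edit can also be scripted instead of done by hand. The sketch below writes the file directly; `write_dataset_metadata` is a hypothetical helper, the owner, slug, and title values are placeholders, and the CC0-1.0 license is only an example value that you should replace with the license that applies to your data. It does not escape quotes, so keep the title free of double-quote characters.

```shell
# write_dataset_metadata DIR OWNER SLUG TITLE
# Writes DIR/dataset-metadata.json with the 'id' and 'title' fields filled in.
# The id has the form <kaggle-username>/<dataset-slug>.
write_dataset_metadata() {
  local dir="$1" owner="$2" slug="$3" title="$4"
  cat > "$dir/dataset-metadata.json" <<EOF
{
  "title": "$title",
  "id": "$owner/$slug",
  "licenses": [{"name": "CC0-1.0"}]
}
EOF
}

# Example (placeholder values), followed by the create command from above:
# write_dataset_metadata /path/to/dataset my-username my-dataset "My Dataset"
# kaggle datasets create -p /path/to/dataset
```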
Update an existing dataset
kaggle datasets version -p /path/to/dataset -m "Updated data"
Download Data from Instance to Local using SFTP/SCP
SSH
First, connect to your instance from your local terminal.
To do that, you have to generate an SSH key in your JarvisLabs instance. Check out this document for more information.
After that, press the SSH button on your instance to copy the SSH string, which includes the port number and hostname of that instance.
The string will be in this format.
#example ssh string
ssh -o StrictHostKeyChecking=no -p <port_number> <username@hostname>
SFTP
Use the port number and hostname from the SSH string to start an SFTP session in the following format.
sftp -P <port_number> <username@hostname>
For example
sftp -P 11114 root@ssha.jarvislabs.ai
Download SFTP
Inside the SFTP session, use this command to securely transfer a file from your instance to your current local directory.
get /path/to/file
Upload SFTP
To upload a file from your local machine to the remote instance:
put /path/to/upload/file /path/to/upload/destination
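For repeated transfers, the same `get` and `put` commands can be placed in a batch file and run non-interactively with sftp's `-b` option. The sketch below only generates such a file; `write_sftp_batch` and the file name `fetch.batch` are hypothetical, and batch mode assumes key-based login since it cannot prompt for a password. Paths containing spaces would need extra quoting that this sketch does not handle.

```shell
# write_sftp_batch BATCH_FILE REMOTE_PATH LOCAL_DIR
# Writes an sftp batch file that downloads REMOTE_PATH into LOCAL_DIR
# and then closes the session.
write_sftp_batch() {
  local batch="$1" remote="$2" local_dir="$3"
  printf 'get %s %s\n' "$remote" "$local_dir" > "$batch"
  printf 'bye\n' >> "$batch"
}

# Example, reusing the port and host from the SFTP example above:
# write_sftp_batch fetch.batch /workspace/welcome.zip .
# sftp -P 11114 -b fetch.batch root@ssha.jarvislabs.ai
```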
SCP
You can also use SCP to copy files from your local machine to your Jarvislabs.ai instance and vice versa.
Download SCP
scp -P <portnumber> <username@hostname>:/workspace/welcome.zip /Users/<name>/Downloads/welcome.zip
Upload SCP
scp -P <portnumber> /Users/<name>/Downloads/welcome.zip <username@hostname>:/workspace/
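The commands above copy a single file. To copy a whole directory, scp needs the `-r` (recursive) option. The wrapper below is a sketch: `scp_download_dir` and the `DRY_RUN` switch are hypothetical conveniences, and `/workspace/results` is a placeholder directory name.

```shell
# scp_download_dir PORT REMOTE LOCAL_DIR
# Recursively copies a remote directory (REMOTE is user@host:/path) into LOCAL_DIR.
# Set DRY_RUN=1 to print the command instead of running it.
scp_download_dir() {
  local port="$1" remote="$2" local_dir="$3"
  if [ -n "${DRY_RUN:-}" ]; then
    echo scp -r -P "$port" "$remote" "$local_dir"
  else
    scp -r -P "$port" "$remote" "$local_dir"
  fi
}

# Example (same placeholders as the commands above):
# DRY_RUN=1 scp_download_dir <portnumber> <username@hostname>:/workspace/results /Users/<name>/Downloads/
```

Printing the command first is a cheap way to check the port and paths before starting a long transfer.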