
Why and how to use Google Colab

Whether you're looking to gain experience or you're already an expert data scientist, Google Colab can help boost ML and AI initiatives. Follow this tutorial to learn the basics.

For beginners looking to gain experience with machine learning and AI, it can be difficult to obtain access to huge data sets or vast computational power to handle workloads. One option to overcome this challenge is Google Colab, a free tool from Google that provides resources, such as GPUs, TPUs and Python libraries, to help you gain experience or further refine your skills.

Follow this tutorial to learn what Google Colab is and how to start using the tool.

What is Google Colab?

Google Colaboratory, or Colab, is an as-a-service version of Jupyter Notebook that enables you to write and execute Python code through your browser.

Jupyter Notebook is a free, open source creation from Project Jupyter. A Jupyter notebook is like an interactive laboratory notebook that includes not just notes and data, but also code that can manipulate the data. The code can be executed within the notebook, which, in turn, can capture the code output. Applications such as MATLAB and Mathematica pioneered this model, but unlike those applications, Jupyter is a browser-based web application.

Google Colab is built around Project Jupyter code and hosts Jupyter notebooks without requiring any local software installation. But while Jupyter notebooks support multiple languages, including Python, Julia and R, Colab currently only supports Python.

Colab notebooks are stored in a Google Drive account and can be shared with other users, similar to other Google Drive files. The notebooks also include an autosave feature, but they do not support simultaneous editing, so collaboration must be serial rather than parallel.

Colab is free, but it has limitations. Some types of code are prohibited, such as media serving and crypto mining. Available resources are also limited and vary depending on demand, though Google Colab offers a pro version with more reliable resourcing. Other cloud services based on Jupyter Notebook include Azure Notebooks from Microsoft and SageMaker Notebooks from Amazon.

The benefits of Google Colab

Enterprise data analysts and analytics developers can use Colab to work through data analytics and manipulation problems in collaboration. They can write, execute and revise core code in a tight loop, developing the documentation in Markdown format, LaTeX or HTML as they go.

Notebooks can include embedded images as part of the documentation or as generated output. In addition, you can copy finished analytics code, with documentation, into other platforms for production use once sufficiently tested and debugged.

Google Colab eliminates the need for complex configuration setup and installation, as it runs right in the browser. It also includes pre-installed Python libraries that require no setup to use.

How to use Colaboratory

To use Colaboratory, you must have a Google account.

On your first visit, you will see a Welcome To Colaboratory notebook with links to video introductions and basic information on how to use Colab.

Create a notebook

From the File menu, click New notebook to create a notebook.

After clicking on 'File' in the top-left corner, a drop-down will appear with the options 'New notebook,' 'Open notebook' and 'Upload notebook.'

If you are not yet logged in to a Google account, the system will prompt you to log in.

The notebook will by default have a generic name; click on the filename field to rename it.

The filename field is highlighted at the top-left corner of the page.
The file name has been changed from 'Untitled0' to 'MyFirstNotebook.'

The file type, IPYNB, is short for "IPython notebook" because IPython was the forerunner of Jupyter Notebook.

The interface allows you to insert various kinds of cells, mainly text and code. You can add them with the shortcut buttons under the menu bar or through the Insert menu.

The 'Insert' tab has been selected, and the drop-down displays the options to include 'Code cell,' 'Text cell' and 'Section header cell.'

Because notebooks are meant for sharing, there are accommodations throughout for structured documentation.

Code, debug, repeat

You can insert Python code to execute in a code cell. The code can be entirely standalone or imported from various Python libraries.

A notebook can be treated as a rolling log of work, in which earlier code snippets are no longer executed in favor of later ones, or as an evolving set of code blocks intended for ongoing execution. The Runtime menu offers execution options, such as Run all, Run before or Run the focused cell, to match either approach.

The 'Runtime' tab has been selected, and the drop-down displays the options to include 'Run all,' 'Run before,' 'Run the focused cell,' 'Run selection' and 'Run after.'

Each code cell has a run icon on the left edge, as shown above. You can type code into a cell and hit the run icon to execute it immediately.

The code cell displays the executed code.

If the code generates an error, the error output will appear beneath the cell. Correcting the problem and hitting run again replaces the error information with program output. In this example, the first line of code, in its own cell, imports the NumPy library, which provides the arange function. Colab has many common libraries pre-loaded for easy import into programs.
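The workflow described above can be sketched in a few lines. This is a minimal illustration, not the article's original cell contents: the variable names and the specific arange arguments are assumptions chosen for the example. The comments mark where each cell would begin in a notebook.

```python
# Cell 1: import NumPy, which Colab ships pre-installed.
import numpy as np

# Cell 2: using a name that was never defined raises an error, which Colab
# would print beneath the cell. Here it is caught so the sketch keeps running.
try:
    print(undefined_values)  # hypothetical typo for the variable defined below
except NameError as err:
    print(f"Error output: {err}")

# Cell 3: the corrected code. arange(start, stop, step) excludes the stop value.
values = np.arange(0, 10, 2)
squares = values ** 2
print(values)
print(squares.sum())
```

Because the import lives in its own cell, it only has to run once per session; later cells can reuse the np name as long as the runtime has not been restarted.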

A text cell provides basic rich text using Markdown formatting by default and allows for the insertion of images, HTML code and LaTeX formatting.

The utility bar in the text cell displays options such as making the text bold or italic.

As you add text on the left side of the text cell, the formatted output appears on the right.

The text box on the left displays the unformatted text, and the one on the right displays the formatted text.

Once you stop editing a block, only the final formatted version shows.

The final formatted version of the text in the text cell.

Incorporating data into the notebook

After getting comfortable with the interface and using it for initial test coding, you must eventually provide the code with data to analyze or otherwise manipulate.

Colab can mount a user's Google Drive to the VM hosting their notebook using a code cell.

The code cell displays the code that mounts Google Drive to the VM.

Once you hit run, Google will ask for permission to mount the drive.

A pop-up message from Google displays the question 'Permit this notebook to access your Google Drive files?' and the options to 'Connect to Google Drive' or 'No thanks.'

If you allow it to connect, you will then have access to the files in your Google Drive via the /my_drive path.
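A mounting cell might look like the sketch below. The mount point is an argument you choose; this example assumes /content/drive, a common choice, rather than the path mentioned above, and with that mount point the contents of My Drive appear under /content/drive/MyDrive. The mount_drive helper is hypothetical and exists only so the code also runs outside Colab, where the google.colab package is not available.

```python
from pathlib import Path

# Assumed mount point; any empty directory path on the VM can be used instead.
MOUNT_POINT = "/content/drive"


def mount_drive(mount_point: str = MOUNT_POINT) -> bool:
    """Mount Google Drive when running in Colab; return True if mounted."""
    try:
        from google.colab import drive  # only importable inside a Colab runtime
    except ImportError:
        print("Not running in Colab; skipping Drive mount.")
        return False
    drive.mount(mount_point)  # triggers the permission prompt
    return True


mounted = mount_drive()
if mounted:
    # List a few entries from the top level of My Drive.
    my_drive = Path(MOUNT_POINT) / "MyDrive"
    for entry in sorted(my_drive.iterdir())[:10]:
        print(entry.name)
```

Inside Colab, the two lines that matter are the import and the drive.mount call; the rest is scaffolding for the example.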

If you prefer not to grant access to your Drive space, you can instead upload files from your local machine or from any network file space mounted as a drive on it.

The code cell displays the imported files from the local machine.
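An upload cell could be sketched as follows. It relies on files.upload from the google.colab package, which opens a file picker in the browser and returns a dictionary mapping each file name to its contents as bytes. The upload_from_local wrapper is a hypothetical helper added so the sketch degrades gracefully outside Colab.

```python
def upload_from_local() -> dict:
    """Prompt for local files in Colab; return {filename: bytes}."""
    try:
        from google.colab import files  # only importable inside a Colab runtime
    except ImportError:
        print("Not running in Colab; nothing uploaded.")
        return {}
    return files.upload()  # opens a browser file picker and blocks until done


uploaded = upload_from_local()
for name, content in uploaded.items():
    print(f"{name}: {len(content)} bytes")
```

Uploaded files live on the notebook's VM, so they typically need to be uploaded again when a new session starts.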

With file access, many functions are available to read data in various ways. For example, importing the Pandas library gives access to functions such as read_csv and read_json.
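As a rough illustration of those two functions, the sketch below reads a small CSV and a small JSON document. The sample data is made up, and it is held in in-memory strings so the example runs anywhere; in a notebook you would more likely pass a file path from your mounted Drive or an uploaded file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for real files.
csv_text = "city,temp_c\nOslo,4.5\nCairo,29.0\n"
json_text = '[{"city": "Oslo", "temp_c": 4.5}, {"city": "Cairo", "temp_c": 29.0}]'

# Both readers accept a path or a file-like object and return a DataFrame.
csv_frame = pd.read_csv(io.StringIO(csv_text))
json_frame = pd.read_json(io.StringIO(json_text))

print(csv_frame)
print(json_frame["temp_c"].mean())
```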

Save and share

By default, Colab puts notebooks in a Colab Notebooks folder under My Drive in Google Drive.

A notebook file located in the Colab Notebooks folder in the user's Google Drive.

The File menu enables notebooks to be saved as named revisions in the version history, relocated using Move, or saved as a copy in Drive or GitHub. It also allows you to download and upload notebooks. Tools based on Jupyter provide broad compatibility, so you can create notebooks in one place and then upload and use them in another.

The 'File' tab is selected with the options to 'Locate in Drive,' 'Open in playground mode' and more.
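One reason notebooks move easily between tools is that an IPYNB file is a JSON document. The sketch below builds a minimal two-cell notebook by hand, writes it to a temporary folder and reads it back. It is a simplified illustration that assumes version 4 of the notebook format; notebooks saved by Colab or Jupyter carry more metadata than shown here, and the cell contents are invented for the example.

```python
import json
import tempfile
from pathlib import Path

# A minimal notebook: one text (Markdown) cell and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# MyFirstNotebook\n", "A short text cell."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello from a code cell')"],
        },
    ],
}

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "MyFirstNotebook.ipynb"
    path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")

    # Reading the file back shows it is ordinary JSON.
    loaded = json.loads(path.read_text(encoding="utf-8"))

cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```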

You can use the Share button in the upper right to grant other Google users access to the notebook and to copy links.

Google also provides example notebooks illustrating available resources, such as pre-trained image classifiers and language transformers, as well as addressing common business problems, such as working with BigQuery or performing time series analytics. It also provides links to introductory Python coding notebooks.
