Introduction to data visualization in Python

"In this module, we will introduce the basics of plotting in Python with some of the most commonly used packages, such as matplotlib and seaborn."


The estimated time to complete this training module is 3h.

The prerequisites to take this module are:

Contact Andréanne Proulx if you have questions about this module, or if you want to check that you have successfully completed all the exercises.


This module was presented by Jacob Vogel during the QLSC 612 course in 2020. The slides are available here.

The video of the presentation is available below (1h09):


  • Download the Jupyter notebook (save the raw version from GitHub), or start a new Jupyter notebook
  • Watch the video and run the cells in the notebook


For this next part, we will refer to the following notebook

As an example, we will use a phenotypic dataset from the ABIDE II consortium. This amazing international multi-site dataset contains data from individuals diagnosed with Autism Spectrum Disorder (ASD) and healthy controls. We will use a version of the phenotypic data from a single site (Kennedy Krieger Institute). Please download the dataset from the linked resource using your NITRC credentials. If you don't have an account, you can create one.

  1. Read through the notebook running all the cells
  2. Complete the exercises in the notebook

Exercise 1 Create a figure with a single axes and replot the second scatterplot to group by sex instead of dx_group.

   Set the figure size to 8 (width) x 5 (height)
   Use the colors red and gray
   Set the opacity of the points to 0.5
   Label the axes
   Add a legend
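
The matplotlib calls involved in Exercise 1 can be sketched as follows. This is only an illustration under stated assumptions, not the notebook's solution: the data are synthetic stand-ins rather than ABIDE II values, the choice of age and viq for the two axes is a guess at what the second scatterplot shows, and the sex labels "male" and "female" are hypothetical codings.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; omit inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption): NOT the real ABIDE II phenotypic values.
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "age": rng.uniform(8, 13, n),
    "viq": rng.normal(105, 15, n),
    "sex": ["male", "female"] * (n // 2),  # hypothetical category labels
})

# One figure with a single axes, 8 wide by 5 high (in inches).
fig, ax = plt.subplots(figsize=(8, 5))

# Draw one scatter call per sex so each group gets its own color and legend entry.
colors = {"male": "red", "female": "gray"}
for sex, group in df.groupby("sex"):
    ax.scatter(group["age"], group["viq"],
               color=colors[sex], alpha=0.5, label=sex)

ax.set_xlabel("Age (years)")
ax.set_ylabel("Verbal IQ (viq)")
ax.legend(title="sex")
```

Plotting each group in its own `ax.scatter` call is one simple way to obtain a legend with one entry per category. With the real data, the dictionary keys would need to match however sex is coded in the file.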

Exercise 2 Using a pairwise plot, compare the distributions of age, viq, and piq with respect to dx_group.

    Set a palette
    Set style to ticks
    Set context to paper
    Keep the dx_group variable itself from appearing as one of the plotted variables

Exercise 3 Using a violin plot, separate out viq as a function of sex and dx_group.

    Each dx_group should occupy one half of each violin
    The x-axis should reflect the different sex categories.
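
The split-violin layout described in Exercise 3 can be sketched with seaborn as follows. Again the data are synthetic stand-ins, and the labels used for sex and dx_group are hypothetical. The `split=True` option draws the two hue levels as the two halves of each violin.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; omit inside Jupyter
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (assumption): NOT the real ABIDE II phenotypic values.
rng = np.random.default_rng(2)
n = 80
df = pd.DataFrame({
    "viq": rng.normal(105, 15, n),
    "sex": ["male", "female"] * (n // 2),                       # hypothetical labels
    "dx_group": ["ASD"] * (n // 2) + ["control"] * (n // 2),    # hypothetical labels
})

# x= puts one violin per sex category; hue= with split=True places each
# dx_group on one half of that violin.
ax = sns.violinplot(data=df, x="sex", y="viq", hue="dx_group", split=True)
```

Splitting works most naturally when the hue variable has exactly two levels, as dx_group does in this sketch.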

Exercise 4 Play around and make an interactive plot using plotly and, if you have any, your own project data.

More resources