Zehao Xu

PhD, University of Waterloo

“Good diagrams clarify. Very good diagrams force the ideas upon the viewer. The best diagrams compellingly embody the ideas themselves.”

… Wayne Oldford (2003)

“The aim of interactive graphics is not to improve and polish a particular display till it conveys its message in an effective manner, but to use sets of displays to explore data sets and discover the information in them.”

… Unwin (1999)

Great northern diver

Zehao’s GitHub


I am Zehao Xu, a PhD student in Statistics at the University of Waterloo, supervised by Wayne Oldford. My research interests include data visualization, data analysis, interactive graphics, machine learning, and package development.

Currently, I work on the loon project, which was built by Prof. Oldford and Dr. Waddell.

Welcome to Loon Wonderland

In developing the loon package, I am mainly responsible for building new functions that better serve data scientists (e.g. loonGrob, facets, …) and for fixing small bugs.

I am also interested in building loon derivatives:

  • loon.ggplot, an R package that turns ggplot graphic data structures into interactive loon plots.
  • loon.shiny, which displays loon widgets in a shiny app.
  • loon.tourr, which provides tour mechanisms (e.g. grand tour, guided tour) in loon.


I also enjoy visualizing large data. A common question is: how large is large? Is a thousand large? A million? A billion? A trillion is certainly large! In R, most graphical systems can handle fewer than about 10 thousand data points (by “handle”, I mean that the graphic renders in a reasonable time). Beyond that, rendering time increases dramatically (don’t believe it? Try plot(rnorm(1e6))). Once the number of observations reaches 1 million, the R session may even be terminated. The rasterly package is built to visualize large data (even billions of points) in seconds.
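To give a feel for why rasterizing helps, here is a minimal base R sketch of the general idea: aggregate the points onto a fixed grid of cells first, then draw the grid as a single image, so the drawing cost depends on the grid size rather than on the number of observations. This is only an illustration under my own assumptions; it is not rasterly's implementation or API, and bin_counts() is a hypothetical helper written for this example.

```r
# A sketch of "aggregate first, then draw" using base R only.
# bin_counts() is a hypothetical helper, not a function from rasterly.
set.seed(1)
n <- 1e6
x <- rnorm(n)
y <- rnorm(n)

bin_counts <- function(x, y, nx = 200, ny = 200) {
  # Assign each point to a cell: integer codes in 1..nx and 1..ny.
  ix <- cut(x, breaks = nx, labels = FALSE)
  iy <- cut(y, breaks = ny, labels = FALSE)
  # Count the points in each cell and store them as an nx-by-ny matrix
  # (rows follow x, columns follow y).
  matrix(tabulate((iy - 1L) * nx + ix, nbins = nx * ny),
         nrow = nx, ncol = ny)
}

counts <- bin_counts(x, y)

# Draw 200 x 200 cells instead of one million individual points.
# log1p() keeps sparse cells visible next to dense ones.
image(log1p(counts), col = hcl.colors(64), axes = FALSE)
```

Comparing system.time(plot(x, y)) with the time taken by the two steps above is an easy way to see the difference on your own machine.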


Everyone loves ggplot. It provides materials (i.e. serialaxes objects) for visualizing high-dimensional data sets in ggplot:

  • Serialaxes coordinates (i.e., parallel or radial axis systems)

  • General glyphs (e.g., polygons, images) to appear in a scatterplot.

  • “More general” versions of geom_histogram and geom_density that can appear on serial axes.


  • Data Visualization
  • Data Analysis
  • Interactive Graphics
  • Machine Learning
  • Package Development


  • PhD in Statistics, 2017

    University of Waterloo

  • Master in Computational Science, 2016

    University of Waterloo

  • BSc in Statistics and Risk Management, 2012

    Southwestern University of Finance and Economics









May 2019 – Sep 2019 Montreal
Responsibilities include:

  • Package development (rasterly)
  • Code review


University of Waterloo

Sep 2017 – Present Waterloo



NLP with Disaster Tweets in R

Use natural language processing to explore whether a tweet announces a disaster.


It provides materials (i.e. serialaxes objects) for visualizing high-dimensional data in ggplot.


Exploratory interactive data visualization.


An add-on package for loon that converts ggplot2 plots to interactive loon plots, and vice versa.


Interactive loon widgets in a web shiny app.


Easily and Rapidly Generate Raster Image Data with Support for Plotly.js


Implement tour algorithms in the interactive graphical system loon.

Recent & Upcoming Talks

Interest Topics