PhD, University of Waterloo
“Good diagrams clarify. Very good diagrams force the ideas upon the viewer. The best diagrams compellingly embody the ideas themselves.”
… Wayne Oldford (2003)
“The aim of interactive graphics is not to improve and polish a particular display till it conveys its message in an effective manner, but to use sets of displays to explore data sets and discover the information in them.”
… Unwin (1999)
I am Zehao Xu, a PhD student in Statistics at the University of Waterloo, supervised by Wayne Oldford. My research interests include data visualization, data analysis, interactive graphics, machine learning, and package development.
Currently, I am dedicated to the loon project, which was built by Prof. Oldford and Dr. Waddell.
Welcome to Loon Wonderland
In loon development, I am mainly in charge of building new functions to better serve data scientists (e.g.
facets, …) and fixing small bugs.
I am also interested in building loon derivatives:
loonwidgets in a shiny app
I also enjoy working on large data visualization. A common question is: how large is “large”?
Is a thousand large?
Is a million large?
Is a billion large?
A trillion is large! In the R language, most graphical systems can handle fewer than 10 thousand data points (by “handle”, I mean the graphics can be rendered in a reasonable time). Beyond that, rendering time increases dramatically (don’t believe it? Try plot(rnorm(1e6))). Once the number of observations reaches 1 million, the R session may even be terminated. The package rasterly is built to visualize large data (even billions of points) in seconds.
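The rendering-time claim can be checked with a minimal base R sketch. It times plot() at increasing sizes on an off-screen PDF device (an assumption made so the code runs without a display; on-screen devices are typically slower), then bins a million points into a fixed pixel grid. The helper names time_plot and bin_counts are hypothetical, and the binning only illustrates the general aggregation idea; it is not rasterly's implementation.

```r
# Time a base-graphics scatterplot of n points on an off-screen PDF device.
time_plot <- function(n) {
  x <- rnorm(n)
  out <- tempfile(fileext = ".pdf")
  grDevices::pdf(out)
  on.exit({
    grDevices::dev.off()  # close the device even if plot() fails
    unlink(out)           # remove the temporary file
  })
  system.time(plot(x))[["elapsed"]]
}

sizes <- c(1e3, 1e4, 1e5)
timings <- data.frame(n = sizes,
                      seconds = vapply(sizes, time_plot, numeric(1)))
print(timings)

# Hypothetical helper: count points per grid cell, so the drawing cost depends
# on the grid size rather than on the number of observations.
bin_counts <- function(x, y, width = 200L, height = 200L) {
  ix <- cut(x, breaks = width, labels = FALSE)   # x cell index in 1..width
  iy <- cut(y, breaks = height, labels = FALSE)  # y cell index in 1..height
  cell <- (iy - 1L) * width + ix                 # column-major position
  matrix(tabulate(cell, nbins = width * height), nrow = width, ncol = height)
}

set.seed(1)
n <- 1e6
counts <- bin_counts(rnorm(n), rnorm(n))
# image(log1p(counts)) would draw 200 x 200 cells, however large n is.
```

Timings vary by machine and device, so no expected numbers are given; the point is the growth with n compared with the fixed cost of drawing the binned grid.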
ggplot. It provides materials (i.e. serialaxes objects) to visualize high-dimensional data in
Serialaxes coordinates (i.e., parallel or radial axis systems)
General glyphs (e.g., polygons, images) to appear in a scatterplot.
“More general” geom_histogram and geom_density to allow them to appear on serial axes.
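To illustrate what a parallel axis system looks like, here is a minimal base R sketch. It does not use the serialaxes objects described above; parallel_axes is a hypothetical helper written only for this example, and rescaling every variable to [0, 1] is an assumption about how the axes are aligned.

```r
# Hypothetical helper: draw each observation as a polyline across parallel axes.
parallel_axes <- function(data, groups = NULL) {
  scale01 <- function(v) {
    rng <- range(v)
    if (diff(rng) == 0) return(rep(0.5, length(v)))  # constant column: centre it
    (v - rng[1]) / diff(rng)
  }
  # n x p matrix, one rescaled column per variable
  scaled <- vapply(data, scale01, numeric(nrow(data)))
  cols <- if (is.null(groups)) 1L else as.integer(as.factor(groups))
  # After transposing, each column is one observation, so matplot() draws
  # one line per observation over the axis positions 1..p.
  matplot(t(scaled), type = "l", lty = 1, col = cols,
          xaxt = "n", xlab = "", ylab = "scaled value")
  axis(1, at = seq_len(ncol(scaled)), labels = colnames(scaled))
  invisible(scaled)
}
```

For example, parallel_axes(iris[1:4], iris$Species) draws the four iris measurements as parallel axes, coloured by species.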
PhD in Statistics, 2017
University of Waterloo
Master in Computational Science, 2016
University of Waterloo
BSc in Statistics and Risk Management, 2012
Southwestern University of Finance and Economics
Use natural language processing to explore whether a tweet announces a disaster.
It provides materials (i.e. serialaxes objects) to visualize high-dimensional data in
An add-on package to loon that converts
ggplot2 plots to
interactive loon plots, and vice versa.
R(2020 useR! postponed)
Loon.Tourr: Interactive Tour Techniques - SSC 2021