PhD, University of Waterloo
“Good diagrams clarify. Very good diagrams force the ideas upon the viewer. The best diagrams compellingly embody the ideas themselves.”
… Wayne Oldford (2003)
“The aim of interactive graphics is not to improve and polish a particular display till it conveys its message in an effective manner, but to use sets of displays to explore data sets and discover the information in them.”
… Unwin (1999)
I am Zehao Xu, a PhD student in Statistics at the University of Waterloo, supervised by Wayne Oldford. My research interests include data visualization, data analysis, interactive graphics, machine learning, and package development.
Currently, I work on the loon project, which was built by Prof. Oldford and Dr. Waddell.
Welcome to Loon Wonderland
In developing the loon package, I am mainly in charge of building new functions to better serve data scientists (e.g. loonGrob, facets, …) and fixing minor bugs.
I am also interested in building some loon derivatives:
ggplot2 plots to interactive loon plots, and vice versa
loon widgets in a shiny app
tour algorithms in loon
Rasterly
I also enjoy large data visualization. A common question: how large is large?
Is a thousand large?
Is a million large?
Is a billion large?
A trillion is large!
In R, most graphical systems can handle fewer than 10 thousand data points (by "handle", I mean that the graphics can be rendered in a reasonable time). Beyond that, the rendering time increases dramatically (Don’t believe it? Try plot(rnorm(1e6))). If the number of observations reaches 1 million, the R session may even be terminated. The rasterly package is built to visualize large data (even billions of points) in seconds.
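To illustrate why rasterizing helps, here is a minimal base R sketch of the general idea: aggregate the points into a fixed grid of counts, then draw the grid as one image instead of drawing a million individual points. This is not rasterly's API or its actual implementation; the helper `rasterize_counts` and the 200-by-200 grid size are assumptions made for illustration only.

```r
# Hypothetical helper, not part of rasterly: count points per cell of an nx-by-ny grid.
rasterize_counts <- function(x, y, nx = 200, ny = 200) {
  # cut() with a single number splits the data range into that many equal-width bins;
  # labels = FALSE returns the integer bin index of each point.
  ix <- cut(x, breaks = nx, labels = FALSE)
  iy <- cut(y, breaks = ny, labels = FALSE)
  # Combine (ix, iy) into one column-major cell index and count occurrences.
  cell <- (iy - 1) * nx + ix
  matrix(tabulate(cell, nbins = nx * ny), nrow = nx, ncol = ny)
}

set.seed(1)
n <- 1e6
x <- rnorm(n)
y <- rnorm(n)
grid <- rasterize_counts(x, y)

# Draw the grid as a single image; log1p() keeps sparse cells visible.
image(log1p(grid), useRaster = TRUE, axes = FALSE)
```

The drawing cost now depends on the number of grid cells rather than the number of observations, which is why the display stays responsive as the data grow.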
Ggmulti
Everyone loves ggplot. ggmulti provides tools (i.e. serialaxes objects) to visualize high-dimensional data in ggplot.
Serialaxes coordinates (i.e., parallel or radial axis systems)
General glyphs (e.g., polygons, images) to appear in a scatterplot.
“More general” geom_histogram and geom_density to allow them to appear on serial axes.
PhD in Statistics, 2017
University of Waterloo
Master in Computational Science, 2016
University of Waterloo
BSc in Statistics and Risk Management, 2012
Southwestern University of Finance and Economics
Use natural language processing to explore whether a tweet announces a disaster.
It provides tools (i.e. serialaxes objects) to visualize high-dimensional data in ggplot.
Exploratory interactive data visualization.
An add-on package to loon that converts ggplot2 plots to interactive loon plots, and vice versa.
Interactive loon widgets in a Shiny web app
Easily and Rapidly Generate Raster Image Data with Support for Plotly.js
Implement tour algorithms in the interactive graphical system loon.
R
(2020 useR! postponed)
Loon.Tourr: Interactive Tour Techniques - SSC 2021