Long Nguyen

Professor of Statistics, Department of Statistics, University of Michigan

Other affiliations:
Professor of Electrical Engineering and Computer Science (by courtesy)
Faculty member, Michigan Institute for Data Science
Long-term member, Vietnam Institute for Advanced Study in Mathematics

Email: xuanlong@umich.edu
Office: 461 West Hall, Phone: 734-763-3499, Fax: 734-763-4676

Mail Address: 439 West Hall, 1085 South University, Ann Arbor, MI 48109-1107


Prospective PhD students: Thank you for your interest, and please consider applying to Michigan. I apologize if you have not received a response to your enquiry about the program. Admission decisions are made by a graduate admissions committee; please see this link for further information.

Research interests

Nonparametric Bayesian statistics
Optimal transport and statistical inference
Machine learning and optimization
Hierarchical, mixture and graphical models
Spatiotemporal and functional data analysis
Stochastic, variational and geometric methods in statistical inference

Synopsis: Statistical inference and learning is the computational process of turning data into statistics, prediction and understanding. I work with richly structured data, such as those extracted from texts, images and other spatiotemporal signals.

I am particularly interested in a field of statistics known as Bayesian nonparametrics, which provides a fertile and powerful mathematical framework for the development of many computational and statistical modeling ideas. The spirit of Bayesian nonparametric statistics is to enable inferential procedures in which both the statistical modeling and the computational complexity may adapt to increasingly large and complex data patterns in a probabilistically graceful and effective way. In this framework, stochastic processes and random measures, along with latent variable models such as mixture, hierarchical and graphical models, figure prominently. In addition, my students and I seek to understand the interaction between statistical inference and the theory of optimal transport that arises naturally in the learning of complex hierarchical models and of spatiotemporal and functional patterns.

My motivation for all this came originally from an early and sustained interest in machine learning. A primary focus in our machine learning research is to develop more effective inference algorithms using variational, stochastic and geometric viewpoints.

Editorial boards (past or current)

My Vietnamese name is Nguyễn Xuân Long, so "XuanLong Nguyen" is used in my English publications. My first name is Long for short.

Students [selfie in deserted Niagara Falls in November'16] [on a Phở day, March 2018] [@Ashley's] [in the time of COVID-19, May 2020] [finally in-person, on Huron river, on a beautiful day of September 2021]

Current PhD students:

Sunrit Chakraborty
Trong Dat Do, joint with Jonathan Terhorst
Vincenzo Loffredo
Yidan Xu, joint with Yixin Wang
Yilei Zhang

Former PhD students and postdocs

Jiacheng Zhu PhD ME (Carnegie Mellon University) 2023; Postdoctoral fellow, MIT
Bach Viet Do PhD Stats 2023; Research data scientist at Ford Motor
Rayleigh Lei PhD Stats 2022; Postdoctoral fellow, University of Washington
Yun Wei PhD Math (AIM) 2020; Postdoctoral fellow, SAMSI and Duke University
Aritra Guha PhD Stats 2020; L. J. Savage doctoral dissertation award; now Senior researcher at AT&T Labs
Mikhail Yurochkin PhD Stats 2018; Research manager, MIT-IBM Watson AI Lab
Nhat Ho PhD Stats 2017; now Assistant Professor, University of Texas, Austin
Hossein Keshavarz Shenastaghi PhD Stats 2017; now Data scientist at Relational AI
Zhaoshi Meng PhD EECS 2014; Senior Researcher, Vicarious
Arash Ali Amini Postdoctoral fellow 2011--2014; now Associate Professor, UCLA
Vijay Manikandan Janakiraman PhD ME 2013; now Engineering manager at Meta
Jian Tang PhD CS (Peking University, visiting 2011--2013, postdoc: 2016--2017); now Associate Professor, Mila-Quebec AI Institute, HEC Montreal
Kohinoor Dasgupta PhD Stats 2012; now Director Biostatistics, Novartis India
Cen Guo PhD Stats 2012; Senior manager in Data Science at Apple

Master's students

Ziyi Song (AMDP), MS 2021, in Statistics PhD program, University of California, Irvine
Sijun Zhang (AMDP)
Jawad Mroueh, MS 2019
Bopeng Li, MS 2012; in Statistics PhD program, University of Michigan

Undergraduate honors thesis advisees

2019--2020: Jingyi Jia (graduate student at UM Statistics)
2018--2019: Yingsi Jian (graduate student at Harvard), Jiayue Lu (graduate student at Univ of Southern California)
2017--2018: Jiahui Ji (graduate student at UM Biostatistics), Zui Chen (graduate student at Parsons School of Design, NYC)

Visitors

Giuseppe Di Benedetto Visiting PhD student from Oxford University, March--May 2018
Federico Camerlenghi Postdoctoral visiting scholar; April--May 2016; Assistant Professor, University of Milano-Bicocca
Hyun-Chul Kim, Visiting scholar 2010--2011; Research Professor, Yonsei University, Korea

Some (not so up-to-date) collaborative projects and links

Summer school on Bayesian statistics and computation, VIASM and UEH, Ho Chi Minh City, July 13--22, 2023. Links to technical programs and various activities.
Learning from naturalistic driving encounters: Joint with Ding Zhao (mechanical engineering faculty at Carnegie Mellon University) and funded by Toyota Research Institute.
Music theory: Joint with music theorists at Michigan, Sam Mukherji, Áine Heneghan, Nathan Martin and Rene Rusch, and UM linguist Steven Abney.
Statistical Machine Learning reading group. This link contains a list of excellent papers discussed in a reading group formerly organized by a number of young(!) UM statisticians and machine learning researchers (2011--2016).
Real time CO2 data assimilation and anomaly detection project: Led by Anna Michalak Lab at Carnegie Institution for Science and Michigan team.
Big Data Summer Institute. Led by Bhramar Mukherjee at the University of Michigan. Exciting opportunity for computer science, mathematics and statistics undergraduates looking to find meaning in very large scale data.
Vietnam Institute for Advanced Study in Mathematics. An excellent place for mathematics and mathematical research in Hanoi.

Selected talk slides

Parameter estimation and interpretability in Bayesian mixture models. Keynote talk, 12th Bayesian Nonparametrics Conference, Oxford, June 2019.
Elements of data science. Summer School on Data Science, Vietnam Institute for Advanced Study in Mathematics, Hanoi and Ho Chi Minh, May 2017.
Multi-level clustering with contexts via hierarchical nonparametric Bayesian inference. Biostatistics Seminar, University of Michigan, October 2016.
Singularity structures and parameter estimation in finite mixture models. Workshop on Empirical Likelihood Methodology, National University of Singapore, June 2016.
Topic modeling with more confidence: a theory and some algorithms. Keynote talk, Pacific-Asia Knowledge Discovery and Data Mining Conference, Ho Chi Minh, May 2015.
Borrowing strength in hierarchical Bayes: convergence of the Dirichlet base measure. 9th Bayesian Nonparametrics Conference, Amsterdam, June 2013.
Convergence of latent mixing measures in finite and infinite mixture models. Bayesian Nonparametrics Workshop at ICERM, Providence, September 2012.
Clustering problems, mixture models and Bayesian nonparametrics. VIASM Summer School, Hanoi, July 2012. [Additional notes]
Message-passing sequential detection of multiple change points in networks. IEEE Symposium on Information Theory, Boston, July 2012.
Inference of functional clusters from non-functional data. Midwest Statistics Research Colloquium, Madison, March 2012.
Dirichlet labeling and hierarchical processes for clustering functional data. IMS-China Conference, Xi'an, July 2011.
Decentralized decision making with spatially distributed data. AI Seminar, University of Michigan, Oct 2009.
Surrogate loss functions, divergences and decentralized detection. Thesis Talk, UC Berkeley, May 2007.
Anomaly and sequential detection with time series data. Tutorial lectures given at Berkeley, 2006.