Past Workshops

The Institute for Social Research Population Dynamics and Health Program at the University of Michigan presents Data Integration in Surveys and Clinical Trials, a PDHP Workshop conducted by Michael Elliott and Yajuan Si, University of Michigan. Topics include: Overview of data integration, Using probability samples to improve the representativeness of non-probability samples, Using observational data to improve the generalizability of clinical trials, Computational methods and software for data integration in practice. September 24, 2024 - 9am-1pm EDT 1430 ISR-Thompson/Zoom Please visit pdhp.isr.umich.edu/workshops for more information

Data Integration in Surveys & Clinical Trials

September 24, 2024

Slides.

Workshop video is available here.

This hybrid (in-person and Zoom) workshop is geared toward analysts who are combining multiple data sources (for example, survey, clinical, and administrative data), primarily focusing on survey research and clinical trials. Attendees will learn the strengths and weaknesses of various types of data sources, and state-of-the-art techniques for integrating them, including using probability samples to improve the representativeness of non-probability samples and using observational data to improve the generalizability of clinical trials. Finally, the workshop will provide a tutorial on the latest computational methods and software for data integration in practice, including a graphical user interface applying the approach of multilevel regression and poststratification.

Topics covered:

  • Overview of data integration
  • Using probability samples to improve the representativeness of non-probability samples
  • Using observational data to improve the generalizability of clinical trials
  • Computational methods and software for data integration in practice

 


Generative AI For Practitioners, a PDHP Workshop conducted by Alexis Castellanos

Generative AI For Practitioners

August 23, 2024

Slides and example datasets.

Workshop video is available here.

This hybrid (in-person and Zoom) workshop provides attendees with a tutorial on using state-of-the-art generative AI platforms (such as ChatGPT), including real-world examples and tips for incorporating these tools into daily workflows. Attendees will also learn best practices for prompt engineering & refining responses, as well as the biases & limitations of generative AI systems (including an explanation of hallucinations and how they arise). The workshop will conclude with a discussion of future trends and applications in the field of generative AI.

Topics covered:

  • Tutorial on ChatGPT and other generative AI platforms
  • Prompt engineering & refining results
  • Biases & limitations of generative AI, including hallucinations and how they arise
  • Future trends and applications in this field

 


Creating Web Applications with R-Shiny, a PDHP Workshop conducted by Robert Ashmead

Creating Web Applications with R-Shiny

December 15, 2023

Slides, Lab Materials, & Software.

Workshop video is available here.

This hybrid (in-person and Zoom) workshop is geared toward R users from all disciplines, and will guide attendees on the creation and deployment of interactive web applications using R-Shiny. Attendees will receive hands-on practice creating a live R-Shiny web app, as well as tutorials on hosting & deploying their apps, and advanced features available to R-Shiny users.

Topics covered:

  • Introduction to the basics of R-Shiny
  • Guided creation of a live R-Shiny web application
  • Options for hosting and deploying R-Shiny applications
  • A tour of advanced features, including interactive maps, tables, and visualizations

 


Recent Advances in Difference-in-Differences, a PDHP Workshop conducted by Jonathan Roth

Recent Advances in Difference-in-Differences

July 28, 2023

Slides, Lab Materials, & Software (GitHub).

Workshop video is available here.

This half-day workshop builds upon our January workshop covering DiD (although that is not a prerequisite), and is geared toward researchers and data analysts of all skill levels. This workshop will cover the current state-of-the-art in the DiD field, including discussion around testing for pre-trends and new tools for research settings where parallel trends cannot be assumed, while also providing hands-on practice with the latest tools using R & Stata.

Topics covered:

  • Overview of the current state-of-the-art in DiD
  • Advantages and limitations of testing for pre-trends
  • New tools for settings where parallel trends cannot be assumed
  • Coding exercises using new tools in R and Stata

 


Difference-in-Differences, a PDHP Workshop conducted by Pedro H.C. Sant'Anna

Difference-in-Differences

January 31, 2023

Slides, Lab Materials, & Software.

Workshop video is available here.

This half-day workshop is geared toward researchers and data analysts of all skill levels, and covers all aspects of the difference-in-differences analytic approach, with an additional emphasis on recent advances in the field. Attendees also receive hands-on practice with difference-in-differences analysis using R and Stata.

Topics covered:

  • Deep dive into canonical Difference-in-Differences methods
  • Understanding limitations of linear two-way fixed effects regressions
  • Allowing for heterogeneous treatment effects when treatment timing varies
  • Recent advances in DiD
  • Hands-on practice with R and Stata using both real and simulated data

 


Applied Multilevel Models, a PDHP Workshop conducted by Ryan Walters

Applied Multilevel Models

November 30, 2022

Slides, Lab Materials, & Software.

Workshop video is available here.

Please join us as the PDHP workshop series resumes on November 30th with “Applied Multilevel Models”, presented by Ryan Walters of Creighton University. This half-day workshop is geared toward data analysts of all skill levels, and will cover practical applications of multilevel models (including terminology, notation, and interpretation of multilevel models). Attendees will also receive hands-on practice fitting multilevel models using Stata, SAS, SPSS, and R, using data from the American Community Survey. As always, this workshop is free and open to the public.

Topics covered:

  • An overview of the multilevel model, including terminology and notation
  • Understanding the difference between fixed and random effects
  • How to include, interpret, and evaluate predictor variables across levels of analysis
  • Hands-on practice with multilevel models in Stata, SAS, SPSS, and R, using data from the American Community Survey

 


Multiple Imputation in Practice, a PDHP Workshop conducted by Trivellore Raghunathan

Multiple Imputation in Practice

July 13, 2022

Slides, Lab Materials, & Software.

Workshop video is available here.

Please join us as the PDHP workshop series resumes on July 13 with “Multiple Imputation in Practice”, presented by Trivellore Raghunathan (Michigan Survey Research Center; Michigan School of Public Health; Michigan/Maryland Joint Program in Survey Methodology). This half-day workshop is geared toward data analysts from all fields and of all skill levels, and will cover practical applications of multiple imputation (including various data structures and patterns of missing data, as well as analysis of imputed data arising from complex survey designs). Attendees will also receive hands-on practice with multiple imputation via sequential regression multivariate imputation (commonly known as chained equations) using the IVEware software (in stand-alone form, and also as a plug-in for SAS, R, and Stata).

Topics include:

  • Definition of missing data, including patterns and mechanisms
  • Multiple imputation using sequential regression multivariate imputation (AKA chained equations)
  • Multiple imputation for complex survey designs and data structures
  • Hands-on practice using IVEware software (stand-alone and within SAS, R, and Stata)

 


Tools For Reproducible Research, a PDHP Workshop conducted by Alexandru Cernat

Tools For Reproducible Research

March 28, 2022

Slides, Lab Materials, & Software.

Workshop video is available here.

Please join us as we conduct a new PDHP workshop titled “Tools For Reproducible Research”, presented by Alexandru Cernat (associate professor of social statistics, University of Manchester). This half-day workshop will cover the main concepts of reproducible research as well as best practices in the field (including meta-analyses, pre-registration, and sensitivity analysis), mixing lecture with practical application. Attendees will also get hands-on practice with state-of-the-art tools of reproducible research, such as research project management using R/RStudio and version control using GitHub.

Topics include:

  • Challenges to social research such as publication bias and specification bias
  • Solutions to the reproducibility crisis: meta-analyses, pre-registration, and sensitivity analysis
  • Tools for better research workflows: project management (via Rprojects and the renv package), version control via GitHub, and dynamic documents (via git, usethis and Rmarkdown)

 


Sequence Analysis for Social Science, a PDHP Workshop conducted by Emanuela Struffolino and Anette Fasang

Sequence Analysis for Social Science

February 9, 2022

Slides, Lab Materials, & Software.

Workshop video is available here.

Please join us as we kick off a new season of PDHP workshops, with a workshop entitled “Sequence Analysis for Social Science”, presented by Anette E. Fasang (Humboldt Universität zu Berlin) and Emanuela Struffolino (University of Milan). Sequence analysis, originally developed in biology to analyze strings of DNA, has attracted increasing attention in the social sciences as a key tool for using longitudinal data to analyze life course processes, including labor market careers, transitions to adulthood, and family formation. This workshop covers the theoretical foundation of sequence analysis, basic descriptive tools, and the general workflow of sequence analysis. Hands-on examples using R will demonstrate the basic analytical steps using illustrative data on family and labor market trajectories.

Topics include:

  • The theoretical foundation of sequence analysis in the social sciences
  • Making informed choices when compiling sequences from a longitudinal dataset
  • Description and visualization of sequences using R
  • Using output from sequence analysis (e.g. distance matrices, measures of complexity) in further basic analysis (such as cluster analysis or regression)

 


Introduction to Multilevel Models, a PDHP Workshop conducted by Kris Preacher of Vanderbilt University

Introduction to Multilevel Models

August 19, 2021

Slides, Lab Materials, & Software.

PDHP resumes our 2021 workshop series on Thursday, August 19th, with a workshop entitled Introduction to Multilevel Models, presented by Dr. Kris Preacher of Vanderbilt University’s Quantitative Methods program (within the Department of Psychology and Human Development). This half-day workshop is geared toward data analysts and researchers of all levels, particularly those performing analysis on hierarchically clustered (nested) data using Mplus, R, or SPSS. Attendees will receive an introduction to the key concepts of multilevel models (appropriate settings for their use over standard statistical models, equation conventions, and interpretation), as well as hands-on practice implementing state-of-the-art features of MLM using popular statistical software packages.

Topics include:

  • Key concepts and motivation for MLM vs. standard statistical models
  • Estimating and plotting interaction effects
  • Implications of nested vs. cross-classified multilevel data
  • Power analysis for MLM using a general Monte Carlo technique

 


Sociogenomics and Polygenic Scores, a PDHP Workshop co-presented by Ben Domingue and Erin Ware

Sociogenomics & Polygenic Scores

March 16, 2021

Slides, Lab Materials, & Software.

PDHP begins our 2021 workshop series on March 16th, with a workshop entitled Sociogenomics & Polygenic Scores, co-presented by Ben Domingue of Stanford University’s Graduate School of Education and Erin Ware of the University of Michigan Population Neurodevelopment & Genetics Group. This half-day workshop is geared toward data analysts interested in combining social science and genetic analysis, and will provide information on the recent history of sociogenomics and a novel approach for examining gene-by-environment interactions, as well as hands-on practice with state-of-the-art techniques in the field (including creating polygenic scores from simulated plink data using a high-performance computing environment).

Topics include:

  • Recent history of sociogenomics
  • A novel approach for examining gene-by-environment interactions
  • Hands-on introduction to high-performance computing and genetic data types
  • Computation of polygenic scores using PRSice2 software

 


Principles of Text Analysis, a PDHP Workshop conducted by Patrick van Kessel

Principles of Text Analysis

November 18, 2020

Slides, Lab Materials, & Software.

PDHP resumes our 2020 workshop series on Nov. 18th, with a workshop entitled Principles of Text Analysis, presented by Patrick van Kessel, senior data scientist at Pew Research Center.  This half-day workshop is geared toward data analysts with unstructured text data (e.g. open-ended survey responses or web-curated text), and will provide a tutorial on cleaning, processing, and analyzing data from text-based sources using state-of-the-art text analytics techniques primarily using Python, with some examples also provided in R (experience with either of these languages is recommended but not required).

Topics include:

  • Preprocessing and cleaning messy text data
  • Feature extraction using TF-IDF vectorization
  • Text analytics techniques including topic modelling and unsupervised clustering methods
  • Software demonstration featuring the scikit-learn library for Python

Evidence-Based Data Visualization, a PDHP Workshop conducted by Dr. Audrey Michal

Evidence-Based Data Visualization

February 21, 2020

Slides, Lab Materials, & Software.

Workshop video is available here.

PDHP kicks off our 2020 workshop series on Feb. 21st, with a workshop entitled Evidence-Based Data Visualization, presented by Dr. Audrey Michal of the Michigan Department of Psychology. This half-day workshop will provide a general introduction to data visualization techniques, while introducing a unique evidence-based approach to data viz design (based on Dr. Michal’s research on visual routines in graph comprehension and interpretation), and different data visualization strategies for data exploration versus data explanation. Attendees will also get hands-on practice creating different types of data visualizations with R software, using ggplot2 and other state-of-the-art R packages. As always, this workshop is free and open to the public.

Topics include:

  • Introduction to data visualization and principles of data viz design
  • Evidence-based practices for data viz (from Dr. Michal’s research on graph interpretation)
  • Data viz strategies for data exploration vs. explanation
  • Hands-on practice creating different types of data visualizations using R’s ggplot2 package

A Practical Guide To Survey Weighting, a PDHP Workshop conducted by Richard Valliant

A Practical Guide To Survey Weighting

November 12, 2019

Slides, Lab Materials, & Software.

Workshop video is available here.

Please join us for the conclusion of the 2019 PDHP workshop series, as Richard Valliant (University of Michigan & University of Maryland Joint Program in Survey Methodology) presents “A Practical Guide To Survey Weighting”. This workshop will present a comprehensive guide to the design and creation of survey weights, including sampling weights, nonresponse adjustment, and calibration, as well as approaches for weighting non-probability samples.

Additional topics include:

  • Stochastic missingness & nonresponse adjustment
  • Calibration techniques including poststratification, raking, and GREG
  • Demonstration and hands-on practice using R and Stata

Machine Learning in Survey Research, a PDHP Workshop conducted by Adam Eck

Machine Learning in Survey Research

October 25, 2019

Slides, Lab Materials, & Software.

Workshop video is available here.

Please join instructor Adam Eck (assistant professor of computer science, Oberlin College), as he conducts a half-day workshop titled “Machine Learning in Survey Research”.  This workshop is designed for population/survey researchers and analysts of all skill levels, and will present an introduction to machine learning concepts and their applications to survey research (such as sample frame creation, respondent modelling, and open-ended response coding).

Topics include:

  • Introduction to machine learning and its applications to survey research
  • Decision trees and random forests
  • Deep learning and other neural network-based techniques
  • ML techniques to model respondent behaviors, assist with coding of open-ended responses, and more
  • Demonstration using R and Python

Design-Based Analysis of Survey Data, a PDHP Workshop conducted by Brady West

Design-Based Analysis of Survey Data

September 24, 2019

Slides, Lab Materials, & Software.

Workshop video is available here.

Please join instructor Brady T. West of the University of Michigan’s Program in Survey Methodology, as he conducts a half-day workshop titled “Design-Based Analysis of Survey Data”. This workshop is designed for survey data analysts of all skill levels, and will present theoretically appropriate methods of analyzing survey data collected from complex sample designs. Dr. West will also present the implications of incorrect analyses based on his research findings from a meta-analysis of analytic error, while also providing examples of proper design-based data analysis techniques using SAS and Stata. As always, this workshop is free and open to the public.

Topics include:

  • Overview of theoretically appropriate design-based analysis of survey data collected from complex samples
  • Case studies in analytic error (including findings from a meta-analysis of recent scientific publications), and the implications of using incorrect analysis methods
  • Appropriate use of survey weights and design-based methods of variance estimation for population inference related to descriptive parameters and regression models
  • Examples of proper design-based data analysis techniques using SAS and Stata (attendees are also welcome to ask about similar methods in other software packages)

Linear Regression With Linked Data, a PDHP Workshop conducted by Emanuel Ben-David and Martin Slawski

Linear Regression With Linked Data

Part 2 of a multi-part workshop series on record linkage

August 22, 2019

Slides, Lab Materials, & Software.

The PDHP workshop series resumes August 22nd with Part 2 of our ongoing Record Linkage series: Linear Regression With Linked Data.  This half-day workshop, conducted by Emanuel Ben-David (of the US Census Bureau’s Center for Statistical Research and Methodology) and Martin Slawski (of George Mason University), is geared toward population researchers, computational social scientists, statisticians, and data scientists of all experience levels.

Topics include:

  • Overview of record linkage and entity resolution
  • Impact of linkage error on regression analyses of linked data files
  • Linkage error adjustment and correction methods (including regression techniques and optimal matching)
  • Hands-on training and practice of these techniques using R software

An Introduction to Entity Resolution, a PDHP Workshop conducted by Rebecca C. Steorts

An Introduction to Entity Resolution

Part 1 of a multi-part workshop series on record linkage

July 10, 2019

Slides, Lab Materials, & Software.

The PDHP workshop series resumes July 10th with the first in a multi-part series of workshops on record linkage topics & techniques within social research. Please join Assistant Professor Rebecca C. Steorts, PhD, of Duke University’s Department of Statistical Science, as she presents An Introduction to Entity Resolution, a half-day workshop geared toward population researchers, computational social scientists, statisticians, and data scientists of all experience levels. This hands-on workshop will cover both the theory and practice of probabilistic entity resolution, while demonstrating state-of-the-art techniques using R software and Apache Spark.

Topics include:

  • Overview and introduction to entity resolution
  • Entity resolution fundamentals (record linkage, de-duplication, blocking, and computational gains)
  • Entity resolution evaluation metrics (including precision, reduction ratio, and robustness to tuning parameters)
  • Bayesian entity resolution models (including both parametric and nonparametric Bayesian mixture models)
  • Hands-on demonstration of state-of-the-art R packages (using blink) and computational gains (using Apache Spark)

Network Analysis: Overview & Applications to Population Science, a PDHP Workshop conducted by Ceren Budak and Daniel Romero

Network Analysis: Overview and Applications To Population Science

June 4, 2019

Slides, Lab Materials, & Software.

The PDHP workshop series resumes with our first workshop of the summer: Network Analysis: Overview and Applications To Population Science. Please join instructors Ceren Budak and Daniel Romero (both from U of M School of Information and formerly Microsoft Research) for a half-day workshop geared toward population researchers and data scientists of all experience levels.  The workshop features 2 hours of lecture (covering fundamental principles and theory of network analysis) followed by 2 hours of lab (simulation-based information diffusion within networks and optimal seed node selection), while exploring the connections between network analysis and social research.

Topics include:

  • Basic concepts of networks and network data
  • Measuring network properties such as centrality and node/edge importance
  • Various models of information diffusion and cascade effects
  • Network-based classification methods (including Random Walk and K-nearest neighbors)
  • Network simulation using Python
  • Impact of seed node selection on network properties

QMP/SMP Methodological Seminar

Winter 2019

This seminar is intended to facilitate collaboration among behavioral, social, health, and data scientists interested in the development and application of adaptive approaches to intervention (prevention, treatment, or policy), measurement, and data collection. The term ‘adaptation’, which is broadly defined as ‘the ability to change to suit different conditions’, has different meanings across different domains of behavioral, social, and health practice and research. The goal of this seminar is to bridge this gap by exploring and discussing how concepts, tools, and procedures used to inform or operationalize adaptation in one domain (e.g., intervention science) can be used to inform or operationalize adaptation in other domains (e.g., survey methodology), and how the various approaches can be used synergistically to advance precision medicine initiatives. For example, how can we improve health by combining ideas from the design of adaptive interventions, which use ongoing information about an individual or context to decide how to modify treatments over time, with ideas from responsive survey design, which uses ongoing information about an individual or context to decide how best to engage an individual in a research survey, and adaptive measurement, which focuses on efficient, low-burden approaches to measuring change in health constructs?


Total Survey Error: a Framework For High Quality Survey Design, a PDHP Workshop conducted by Brady West and Paul Schulz

Total Survey Error: a Framework For High Quality Survey Design

October 23, 2018

Slides, Lab Materials, & Software.

Instructors Brady T. West and Paul Schulz are kicking off the new PDHP workshop series with an overview of the Total Survey Error framework and its implications for survey research.  This half-day workshop is geared toward survey researchers of all types and experience levels, and will cover the design, implementation, and monitoring of survey data collections using the TSE paradigm as a guiding set of principles.  The workshop will use a mix of conceptual discussions and team exercises to explore both the underlying theory and real world applications of the TSE paradigm in survey research.

Topics include:

  • Sources of survey error
  • Quantifying and evaluating TSE in a data collection
  • Implications of TSE for study design
  • TSE reduction strategies
  • Linking TSE and Responsive / Adaptive Survey Design