Quantitative Methods

Advances in quantitative methodology have allowed us to broaden and refine the types of questions we can address.  These developments provide us with new and powerful ways to connect our theoretical models to empirical evidence.  Members of our faculty use a broad range of data collection and data analysis techniques.  Because our department features considerable strength in quantitative methods, we are able to offer several different options for students interested in building various levels of expertise.

A hallmark of graduate education in Sociology at Penn State is strong training in quantitative research methods. This emphasis is supported by the large number of department faculty with specialized training in quantitative methods, by the availability of twelve regularly offered quantitative methods graduate courses, and by the opportunity to earn a Certificate in Quantitative Methods.

In addition to the Department of Sociology quantitative methods program, a Doctoral Minor in Applied Statistics, awarded by the Department of Statistics, is available for more advanced training. Other programs and training opportunities on campus include the annual Clogg Lecture co-organized by the Departments of Sociology and Statistics, the Survey Research Center, the Quantitative Social Science Research Initiative organized by the Department of Political Science, the Methodology Center in the College of Health and Human Development, and statistical seminars and training sessions offered by the Statistics Core within the Population Research Institute (PRI).

Penn State is home to one of 19 Federal Statistical Research Data Centers (RDCs). These centers provide researchers with secure access to restricted economic, demographic, and health data collected by the U.S. Census Bureau, the National Center for Health Statistics, and the Agency for Healthcare Research and Quality. The center at Penn State is a vital resource for faculty and graduate student researchers in the fields of economics, business, demography, statistics, sociology, and health services.

The data available through the RDC are the microdata on individuals, families, business establishments, and firms collected in the surveys and censuses undertaken by these government agencies. Many of the demographic and health data sets contain geographic detail, fuller question responses, and variables that are not available in the public use versions of the surveys. The economic data available include longitudinal data on establishments, including detailed geographic information, and matched employee-employer data. Many of the data sets can be linked across surveys or with external data sources, providing a rich set of opportunities for researchers.

Faculty Research

Spatial Data Collection and Analysis

Improving the collection, measurement, and analysis of spatial data is a core focus of our quantitative researchers at Penn State.  In addressing spatial location, dynamics, uncertainty, and classification, faculty are opening new fields of inquiry about spatial dependence and space-as-social-structure.  Visualization techniques offer new tools for generating hypotheses and illustrating both continuity and segregation in spatial organization.

  • How can we design tools to collect detailed spatial data? (Yabiku)
  • In what ways can we conceptualize and measure human activity and mobility? (Matthews)
  • What do we mean by neighborhood or by community, and how can we measure them? (Matthews, Lee, Yabiku)
  • What are the ‘best practices’ in utilizing GPS or GIS information? How can these techniques be applied in developing regions of the world? (Matthews, Yabiku)
  • How can remote sensing and other non-traditional data sensors be integrated into sociological analysis? (Verdery, Graif)
  • How can we use technology, such as mobile phones and tablets, to develop innovative approaches to data collection?  (Yabiku, Verdery)
  • What will provide better estimates of exposures to features of the built and natural environments? (Daw, Matthews, Yabiku)
  • How can we design projects that can incorporate social, physical, and natural data in both historical and contemporary contexts? (Yabiku, Hardy)
  • How do we ensure that these diverse data sources can be integrated to test interdisciplinary hypotheses? (Yabiku, Verdery)

Biological and Genetic Data Collection and Analysis

Human behaviors and traits are shaped through the interaction of social, biological, and environmental influences. Adequately modeling these complex interactions with statistical methods is a substantial challenge that lies outside the training of most social researchers. Social science datasets such as the National Longitudinal Study of Adolescent to Adult Health, the Health and Retirement Study, the Wisconsin Longitudinal Study, and others have expanded beyond typical survey measures to collect genetic data in increasing detail, progressing from handfuls of genetic markers to genome-wide scans measuring millions of variants. They will undoubtedly soon incorporate data on every base pair in the genome and information on epigenetic signatures.

  • What is the interplay between social, environmental, and genetic determinants of key human outcomes, and how can we best model it? (Daw, Verdery)
  • How do we address the challenges posed by multiple testing and genetic confounding? (Daw)
  • What can biomarkers tell us about social experiences and how can we leverage these data to test sociological hypotheses? (Daw)
  • Can our understanding of gene-environment interplay be improved by constructing genetic indices for different phenotypes? (Daw)
  • How can we best use molecular genetic data to address nature/nurture questions? (Daw)

Network Data Collection and Analysis

Social network data reflect direct and indirect connections across people, families, organizations, or nations. The statistical and visual representations of these connections allow us to characterize how current systems are structured and how dynamic systems may be transformed. We may be interested in interpersonal relationships such as friendships or coauthorships, the transmission of information or of infectious disease, or how international migrants may be organized through community-based networks in both origin and destination countries.

  • How do organizations obtain access to valued resources without diluting the loyalties and identities of their members? (DellaPosta)
  • What methods should we use to improve the collection and analysis of theoretically important data about individuals' social networks and spatial contexts? (Verdery)
  • How can we understand larger social network structures from samples of social networks? (Verdery)
  • What can we learn from the study of the boundary-spanning activities of “brokers” who bridge gaps in social structure? (DellaPosta)
  • What are the structure and functions of American and other nations’ kinship networks? (Daw, Verdery)
  • How do long-run demographic changes affect social and kinship networks? (Verdery)

Longitudinal Models

Longitudinal data, also known as panel data, arise when social, demographic, and economic outcomes and behaviors are measured repeatedly over time on units such as individuals, families, organizations, or nations. Examples of such data include the National Longitudinal Surveys, the Panel Study of Income Dynamics, the Survey of Health, Ageing and Retirement in Europe, and the Health and Retirement Study. Meanwhile, cross-sectional surveys that collect information for a group of people at a single time may be administered annually or every few years. Such pooled cross-sectional data, including the 1974-2014 General Social Survey data and 1964-2016 Current Population Survey data, provide valuable information for understanding social and demographic changes. Our faculty are developing innovative methods for describing and investigating the temporal changes reflected in these data. Such analyses can provide important clues about the social and demographic factors that give rise to the observed trends and broader patterns of social change. We ask the following questions:

  • What are the distinct dimensions of time on which inequality manifests? (Luo, Hardy)
  • What can we learn by decomposing temporal components of changes, trends, and vital rates? (Luo, Hardy)
  • What are the social, economic, and demographic factors responsible for the changes in rates, such as the decline in men’s labor force participation and the increase in women’s labor force participation?  (Luo)
  • How can we determine causality from observational data? (Daw)
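
One simple way to separate the temporal components raised in the questions above is a Kitagawa-style decomposition, which splits the change in a crude rate between two periods into a part due to changing group-specific rates and a part due to changing population composition. The sketch below is illustrative only: the function name, the age groups, and all numbers are hypothetical, and it is not presented as the method used by the faculty named above.

```python
def kitagawa_decomposition(shares_1, rates_1, shares_2, rates_2):
    """Split the change in a crude rate into rate and composition effects.

    Each argument maps a group label to that group's population share
    (shares_*) or group-specific rate (rates_*) in period 1 or period 2.
    """
    groups = list(shares_1)

    # Crude rate in each period: share-weighted sum of group-specific rates.
    crude_1 = sum(shares_1[g] * rates_1[g] for g in groups)
    crude_2 = sum(shares_2[g] * rates_2[g] for g in groups)

    # Rate effect: change in group rates, weighted by the average share.
    rate_effect = sum(
        (rates_2[g] - rates_1[g]) * (shares_1[g] + shares_2[g]) / 2
        for g in groups
    )
    # Composition effect: change in shares, weighted by the average rate.
    composition_effect = sum(
        (shares_2[g] - shares_1[g]) * (rates_1[g] + rates_2[g]) / 2
        for g in groups
    )
    return {
        "total_change": crude_2 - crude_1,
        "rate_effect": rate_effect,
        "composition_effect": composition_effect,
    }


# Hypothetical labor force participation rates by age group in two periods.
decomposition = kitagawa_decomposition(
    shares_1={"25-54": 0.7, "55+": 0.3},
    rates_1={"25-54": 0.90, "55+": 0.40},
    shares_2={"25-54": 0.6, "55+": 0.4},
    rates_2={"25-54": 0.88, "55+": 0.45},
)
for name, value in decomposition.items():
    print(f"{name}: {value:+.4f}")
```

The two effects sum exactly to the total change, so in this made-up example the overall decline can be attributed mostly to the shift in age composition rather than to changes in the group-specific rates.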

New Types of Data for Social Research

As new data forms and analytical techniques become available, researchers must confront new ethical issues, particularly with Big Data and administrative records linkages. At the same time, sociological research has begun to examine groups that have hitherto been missed by traditional survey approaches and that must be sampled with new techniques. In addition to testing hypotheses with observed data, we can learn from simulated social systems. In these approaches, such as agent-based modeling and demographic micro-simulation, researchers specify the rules, structures, and actors in a simulated environment and test how outcomes vary as different parameters of these systems are altered. Sometimes these simulations are spatially explicit. Simulation methods also offer an important laboratory for testing the performance of new and traditional statistical methodologies and estimators. The simulations often use real-world, empirical data to populate the likely parameters of these systems.

  • How can we preserve the privacy and confidentiality of respondents as ever more data about them becomes available through administrative linkages? (Van Hook?)
  • How can the analysis of increasingly large amounts of digitized text complement traditional sociological research on individuals? (Felmlee?)
  • What unique ethical dilemmas are posed by experimental research in “field” settings rather than laboratories? (Gaddis)
  • Can new methods for survey sampling turn around declining response rates? (Verdery)
  • How can we sample groups, such as the homeless, undocumented migrants, or sex workers, that have historically been missed by traditional sampling techniques? (Verdery)
  • What statistical estimators should be used to cope with data sampled via non-traditional methods? (Verdery)
  • How can simulated social systems help to refine classic sociological theories about macrostructural organization? (Verdery, Daw)
  • Can research methods and statistical estimators be improved using computational and simulation models? (Luo, Verdery)
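
The idea of using simulation as a laboratory for estimators can be sketched in a few lines. The example below builds a hypothetical population in which a binary trait is more common among well-connected people, draws repeated samples with probability proportional to each person's number of contacts (an assumption standing in for link-tracing designs), and compares a naive sample proportion with an inverse-degree weighted proportion. All names, parameter values, and the sampling rule are assumptions made for illustration, not a description of the faculty's estimators.

```python
import random


def simulate_estimators(pop_size=5000, sample_size=500, replications=200, seed=42):
    """Compare naive and inverse-degree weighted prevalence estimates."""
    rng = random.Random(seed)

    # Hypothetical population: each agent has a degree (number of contacts)
    # between 1 and 20 and a binary trait that is more likely at high degree.
    degrees = [rng.randint(1, 20) for _ in range(pop_size)]
    traits = [1 if rng.random() < d / 25 else 0 for d in degrees]
    true_prevalence = sum(traits) / pop_size

    agents = list(range(pop_size))
    naive_estimates, weighted_estimates = [], []
    for _ in range(replications):
        # Assumed design: sample with replacement, proportional to degree.
        sample = rng.choices(agents, weights=degrees, k=sample_size)

        # Naive estimator: unweighted sample proportion.
        naive_estimates.append(sum(traits[i] for i in sample) / sample_size)

        # Weighted estimator: weight each draw by the inverse of its degree.
        weights = [1 / degrees[i] for i in sample]
        weighted_sum = sum(w * traits[i] for w, i in zip(weights, sample))
        weighted_estimates.append(weighted_sum / sum(weights))

    return {
        "true_prevalence": true_prevalence,
        "naive_bias": sum(naive_estimates) / replications - true_prevalence,
        "weighted_bias": sum(weighted_estimates) / replications - true_prevalence,
    }


result = simulate_estimators()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Because the sampling rule is known inside the simulation, the true prevalence is available for comparison. Under these assumptions the naive proportion overstates prevalence, while the weighted version stays close to the true value; changing the parameters shows how each estimator responds.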