Applications and Trends of Big-data and AI Methods in Behavioral and Social Sciences
The digital era has fostered research methods that can collect and analyze big data on human behavior for descriptive, correlational, and experimental studies in the behavioral and social sciences. A growing number of studies leverage the internet or smartphones to collect big data and adopt artificial intelligence (AI) methods, such as machine learning, to model those data accurately. As researchers have become more enthusiastic about these new approaches, they have also gradually learned the limitations of such big-data and AI methods. Compared to paper-based surveys and laboratory experiments, internet- or smartphone-based approaches to data collection often trade data quality for quantity because such data collection processes, despite being less constrained by space and time, are also less controlled by researchers. Similarly, compared to traditional statistical models, AI's algorithm-based approaches to data analysis often trade model simplicity for accuracy because such models, in order to capture complex regularities in data, are inevitably complex and hence less explainable. As a result of these relative advantages and disadvantages of big-data and AI methods, the trends of applying them in behavioral and social sciences in the 2010s are, to some extent, circular: the rise of big-data and AI methods led to a re-appreciation of traditional research methods and the subsequent development of hybrid approaches. To elaborate on the circularity, the present article reviews the relevant literature published between 2010 and 2019 from the perspectives of data collection, data analysis, and study reproducibility. Specifically, in terms of data collection, behavioral and social sciences were grounded in small data, grew interested in big data for their potential to test the universality of research findings, and then turned back to collecting relatively quality-assured small data.
In terms of data analysis, behavioral and social scientists developed theories predominantly using explanatory statistical models, were attracted to but at the same time perplexed by highly accurate predictive models based on machine learning, and then finally found ways of making predictive models explainable. In terms of study reproducibility, although the collection and analysis of big data held the promise of improving sample size, sample diversity, and thus the reproducibility of results and inferences in behavioral and social sciences, ironically the study methods themselves were becoming irreproducible, because the rapidly evolving cyber environments from which research data were gathered might have changed irreversibly, or because the technical threshold for repeating the same analysis was insurmountably high for most researchers in the field. How can behavioral and social scientists respond to the aforementioned changes and impacts brought about by big-data and AI methods? Based on foreseeable scientific and technological trajectories, we conclude that the hurdles to learning and applying big-data and AI methods will be lowered, and we therefore recommend that researchers integrate new and old methods, which are in fact complementary. These integrated approaches, such as aggregating big data from small studies for machine-learning analysis, will help researchers see not only the forest but also the trees and ultimately help advance behavioral and social sciences.