Leveraging ML techniques for social science research
(July 4, 2:00 PM - 5:30 PM)
The capabilities of machines are advancing rapidly: examples include ChatGPT's human-like reasoning and creativity, Copilot's capacity to act as our peer programmer, Facebook's facial recognition technology, and Google's AI and ML frameworks such as TensorFlow. With these advancements, researchers now have a large toolset for data-driven research and can generate insights that were previously infeasible. But how will these advancements change our identity as researchers and the nature of our research? Consider face recognition. These algorithms do not follow predetermined rules, derived from human understanding, about which pixel combinations make up a face. Instead, they use a vast dataset of labeled photos to estimate a function f(x) that predicts the presence y of a face from the pixels x. This approach resembles econometrics and raises important questions, which we will address in this workshop. Specifically, we will answer three questions: (a) Are these algorithms simply applying conventional methods to extensive and innovative datasets? (b) If they are new empirical tools, how do they relate to existing knowledge? (c) How can we as researchers incorporate these methods into our own studies?
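To make the idea of estimating f(x) from labeled examples concrete, here is a minimal sketch in plain Python. It is not a face recognizer: it assumes a hypothetical toy dataset of four "pixel" intensities per example, with label y = 1 when the two left pixels are brighter than the two right ones, and fits a logistic regression (the familiar logit model from econometrics) by gradient descent. All function names, the data-generating rule, and the hyperparameters are illustrative assumptions.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_toy_data(n, rng):
    """Hypothetical stand-in for labeled photos: 4 pixel intensities in [0, 1].

    The label rule (left half brighter than right half) is an assumption
    made only for this illustration.
    """
    data = []
    for _ in range(n):
        x = [rng.random() for _ in range(4)]
        y = 1 if (x[0] + x[1]) > (x[2] + x[3]) else 0
        data.append((x, y))
    return data


def predict_proba(w, b, x):
    """Estimated f(x): the probability that y = 1 given pixels x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fit_logistic(data, epochs=300, lr=0.5):
    """Fit weights by batch gradient descent on the logistic log-loss."""
    w = [0.0] * 4
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * 4
        grad_b = 0.0
        for x, y in data:
            err = predict_proba(w, b, x) - y  # gradient of log-loss w.r.t. the logit
            for j in range(4):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_toy_data(500, rng)
test = make_toy_data(200, rng)

w, b = fit_logistic(train)
correct = sum((predict_proba(w, b, x) >= 0.5) == (y == 1) for x, y in test)
accuracy = correct / len(test)
print("learned weights:", [round(wi, 2) for wi in w])
print("held-out accuracy:", accuracy)
```

The rule "left brighter than right" is never written into the model; it is recovered from labeled examples, which is the sense in which such algorithms learn f(x) rather than follow hand-coded rules.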
The first half of the workshop will be an interactive lecture on the background and implications of ML and AI techniques for behavioral research. The second half will be a hands-on exercise in which we develop a data-driven research question using these computational techniques. The aim is to see how these techniques help us conceptualize new constructs and uncover interesting insights.
Hands-on exercise: With the rise of generative AI technologies, the extent to which these algorithms (which are trained on human-created ground truths) can produce highly novel ideas that are useful for organizations remains an open question (Jago 2019, Amabile 2020). Additionally, with claims that these AI technologies can mimic human emotions and behavior (Raj et al. 2023), there is an increasing need to understand the implications for the future of software development: whether such AI technologies can become our peer programmers, and what their impact might be on the creativity and quality of the software produced.
As a first step towards answering these questions, this exercise will try to extract the perceptual attributes (such as novelty and usefulness) of a given sample of software code. A sample dataset of code commits from GitHub will be provided, and our task will be to computationally evaluate them using the ChatGPT API and study their antecedents.
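One possible shape for the evaluation step is sketched below in Python. It is only an illustration under stated assumptions: the prompt wording, the 1-to-7 rating scale, the JSON reply format, and every function name are hypothetical and not part of the workshop material. The call to the model is passed in as a function (`ask_model`), so the sketch runs offline with a stub; in the exercise that function would wrap a real ChatGPT API call.

```python
import json

# Perceptual attributes to rate; the two named in the exercise description.
ATTRIBUTES = ("novelty", "usefulness")


def build_prompt(commit_message, diff):
    """Compose a rating request for one commit (wording is an assumption)."""
    return (
        "Rate the following code commit on each attribute from 1 (low) to 7 (high).\n"
        f"Attributes: {', '.join(ATTRIBUTES)}\n"
        'Reply with a JSON object only, e.g. {"novelty": 4, "usefulness": 5}.\n\n'
        f"Commit message:\n{commit_message}\n\nDiff:\n{diff}\n"
    )


def parse_ratings(reply):
    """Pull the JSON object out of the model's reply and validate the scores."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply[start:end + 1])  # JSONDecodeError is a ValueError
    ratings = {}
    for attr in ATTRIBUTES:
        score = data.get(attr)
        if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 7:
            raise ValueError(f"invalid or missing score for {attr!r}: {score!r}")
        ratings[attr] = score
    return ratings


def rate_commit(commit_message, diff, ask_model):
    """Rate one commit; ask_model is any callable mapping a prompt to reply text."""
    return parse_ratings(ask_model(build_prompt(commit_message, diff)))


def fake_model(prompt):
    """Offline stub standing in for the real API call (returns a canned reply)."""
    return 'Here is my rating: {"novelty": 3, "usefulness": 6}'


ratings = rate_commit("Fix typo in README", "- teh\n+ the", fake_model)
print(ratings)
```

Keeping the model call behind a plain callable makes the parsing and validation logic testable without network access, and the validated scores can then be stored alongside commit metadata for the analysis of antecedents.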