BEGIN:VCALENDAR
VERSION:2.0
X-WR-CALNAME;VALUE=TEXT:Rohit Bhattacharya (Williams College)
PRODID:-//Harvard events data//EN
BEGIN:VEVENT
UID:event_1337071_0
SUMMARY:Rohit Bhattacharya (Williams College)
DESCRIPTION:<h3>Title</h3><p><span>Opportunities for Principled Use of AI for Causal Inference</span></p><h3>Abstract</h3><p><span>Modern causal inference theory has advanced significantly in handling challenges that underlie observational data. Most of this theory, however, assumes clean, structured data, whereas real-world data sources like electronic health records contain a mix of both structured measurements and unstructured information (e.g., clinical notes). In this context, state-of-the-art machine learning methods, particularly generative AI models trained on unstructured data, offer new opportunities for causal inference. While these tools are trained purely on associational tasks, I argue for some principled approaches to incorporating them into causal pipelines. Time permitting, I will present two concrete examples of this. The first concerns settings where information about unobserved confounding is captured in unstructured text data. The proposed method uses zero-shot models (e.g., large language models) to infer proxies from multiple instances of pre-treatment text and plugs them into the so-called proximal g-formula. I also briefly describe falsification heuristics and opportunities for sensitivity analysis for this method. The second example concerns settings involving high-dimensional treatments in the context of computational genomics. In this context, I describe how AlphaFold can be used as a form of interpretable dimension reduction for complex interventional queries involving mutations in protein sequences.</span></p>
LOCATION:CGIS Knafel Building, Room K354
STATUS:CONFIRMED
DTSTART:20251105T170000Z
DTEND:20251105T183000Z
END:VEVENT
END:VCALENDAR