Emergent abilities and grokking: Fundamental, Mirage, or both?

One of the lessons we have seen in language modeling is the power of scale. The original GPT paper of Radford et al. noted that at some point during training, the model “acquired” the ability to do sentiment analysis of a sentence X by predicting whether it is more likely to be followed by “very …

Replica Method for the Machine Learning Theorist: Part 2 of 2

Blake Bordelon, Haozhe Shan, Abdul Canatar, Boaz Barak, Cengiz Pehlevan. See part 1 of this series, and the pdf version of both parts. See also all seminar posts. In the previous post we described the replica method and outlined the analysis per this figure: Specifically, we reduced the task of evaluating the expectation …

Replica Method for the Machine Learning Theorist: Part 1 of 2

Blake Bordelon, Haozhe Shan, Abdul Canatar, Boaz Barak, Cengiz Pehlevan. [Boaz's note: Blake and Haozhe were students in the ML theory seminar this spring; in that seminar we touched on the replica method in the lecture on inference and statistical physics, but here Blake and Haozhe (with a little help from the rest of us) …

Towards a Theory of Generalization in Reinforcement Learning: guest lecture by Sham Kakade

Scribe notes by Hamza Chaudhry and Zhaolin Ren. Previous post: Natural Language Processing (guest lecture by Sasha Rush). Next post: TBD. See also all seminar posts and course webpage, as well as the video of the lecture. Lecture slides: original form: main / bandit analysis; annotated: main / bandit analysis. Sham Kakade is a professor in the …

Natural Language Processing (guest lecture by Sasha Rush)

Scribe notes by Benjamin Basseri and Richard Xu. Previous post: Inference and statistical physics. Next post: TBD. See also all seminar posts and course webpage. Alexander (Sasha) Rush is a professor at Cornell working in Deep Learning / NLP. He applies machine learning to problems of text generation, summarizing long documents, and interactions between …

Inference and statistical physics

Scribe notes by Franklyn Wang. Previous post: Robustness in train and test time. Next post: Natural Language Processing (guest lecture by Sasha Rush). See also all seminar posts and course webpage. Lecture slides (pdf) - lecture slides (PowerPoint with animation and annotation) - video. Digression: Frequentism vs Bayesianism. Before getting started, we'll discuss the difference …