An Opinionated Guide to Running an Introductory AI Safety Reading Group
I believe introductory AI Safety reading groups are some of the highest-impact programs you can run. Yet there isn’t a ton of time, energy, and intention being put into these programs. So I’m writing this post to distill what I have learned through the time, energy, and intention I’ve put into the reading group I lead.
Note: Throughout, I will largely discuss things in the context of university groups; some things generalize, some things don’t. If you want to discuss your program specifically, please reach out. I’m always happy to chat.
Who am I
I’m currently the director of MIT AI Alignment’s (MAIA) AI Safety Fundamentals Fellowship (AISF). Since starting, I have completely revamped the curriculum, created custom reading resources, changed some structural elements of the program, and overall spent a lot of time thinking about intro fellowships. Previously, I was a MATHCOUNTS coach (during which time I was selected to be Nebraska’s state team head coach), which gave me additional experience in curriculum design and theory of learning.
Purpose / theory of change
For any program you want to run, you should know your purpose or theory of change (ToC) for running it (these aren’t exactly the same, but I will use the terms somewhat interchangeably). The ToC for your intro reading group (hereafter referred to as IRG) should be closely tied to the ToC for your group as a whole.
In my eyes, there are two main ToCs for university groups. The default is “Find people who are already interested in AI Safety (AIS) and build their context and skills to get them into an AIS role.” If your group has enough capacity to go beyond this, you can adopt the second ToC: “Convince people AIS is important, then upskill them to get them into an AIS position.”
Both ToCs are totally valid, and the choice between them should come down to organizing capacity. Figure out which fits your org and apply it to your IRG. Knowing this ToC is crucial for having a good application process (if you have people apply). Otherwise, you will end up with participants who are poor fits for the program you are trying to run.
Curriculum
I think there are a few reasonable goals you may have for your IRG curriculum, and again, it depends on your org’s organizing capacity.
If you’re part of a large org (one that has the bandwidth to run many other upskilling programs), then your IRG should focus on the breadth of the field, so that your participants learn about all their possible opportunities. You can then run separate programs to upskill them in these subfields.
If you’re a somewhat smaller org (one that isn’t able to run many other upskilling programs), then your IRG should focus on upskilling your participants in whatever subfield you believe your org has a competitive advantage in.
If you’re somewhere in the middle, I would recommend running multiple different IRGs: technical, policy, strategy, etc.
Furthermore, if you feel your org is large enough to start trying to convince people AIS is important (in addition to the default of upskilling), some part of your IRG should be spent on this.
Another thing you may want to consider when preparing your curriculum is “What knowledge or context do I want future members to have?” Your IRG should be a funnel into membership, and it is the easiest time to give people context on topics that they might not attend a separate meeting for.
Regarding the last two points, at MAIA we have a lot of AISF fellows, and a fairly high share of them go on to become members. As a result, I edited the curriculum to include more arguments about the importance of AIS and some time spent on things like AI policy (which I believe is important context for all AIS researchers, but which most MIT students would skip a dedicated meeting on).
Format
Don’t overcomplicate things. Have your IRG participants show up at the same time, same place every week. You want your participants to see the IRG as just part of their weekly routine, as this will greatly help with attendance. Related to this, make sure people are enjoying the session (one of the easiest ways of doing this is having food, which is by far the most impactful thing you can spend money on for your IRG). To ensure people are actually enjoying their time, it is important to get consistent and frequent feedback. For context, AISF has attendance forms for each session with required feedback questions.
At MAIA, we have longer sessions (two hours every week) to provide time for fellows to read the resources in the session rather than asking them to read the resources on their own time. We do this for many reasons: (1) many people aren’t going to read the resources on their own time (and whether they do will vary with their interest and free time in that moment), (2) they don’t have a facilitator there to answer their questions, (3) they don’t have a facilitator there to frame the reading, and (4) even those who do read on their own won’t read as deeply as they would in a session. This is not to say you have to have reading time during the sessions, but I highly, highly recommend doing so.
Another thing to consider is your facilitator-to-participant ratio. This semester, AISF has cohorts of two facilitators and roughly eight to eleven fellows (though this varies some). The higher your facilitator-to-participant ratio (that is, the fewer participants per facilitator), the better. But you always have to weigh this against being able to support more participants (unless you are bottlenecked by the number of applications). There is no one-size-fits-all solution here, so my advice is simply to be very mindful of the tradeoffs you’re making when deciding how many people to accept.
You should also be cognizant of your reading-to-discussion ratio in your sessions. In general, if your goal is to upskill, it’s fine to have a bit more reading; if your goal is to convince people of the importance of AIS, discussion is much more impactful. I don’t have super strong opinions on the matter beyond this, as long as people enjoy your sessions. For context, AISF sessions are about one hour of reading and about one hour of discussion.
Related to discussions, you need to ensure that your facilitators understand the ToC just as well as you do. This is fairly easy when your IRG is small, as the people facilitating are usually the ones organizing, but it becomes harder as your IRG grows. For AISF, I lead a weekly two-hour prep session for facilitators where we go through the readings together and I discuss why I added them and how I want facilitators to lead the discussion (this also includes any narrative arcs I want them to build over the span of the session).
My last piece of advice would be: whatever you do, do it professionally and make your org look good. People like nice things, and if you have a well-run IRG with nice resources, it can make up for a lot of other shortcomings.
Resources
I have spent a lot of time making such resources for AISF, including weekly outlines, curated reading packets, and facilitator guides. I am more than willing to share these resources with others, but I haven’t figured out how I want to share them publicly. At some point, I will edit this post with links to the resources, but until then please just reach out to me and we can figure something out.
Feel free to give me any anonymous feedback you may have!