Plans for 2022 & Review of 2021

21 March 2022 by Stefan Torges

Summary

Mission: The Center on Long-Term Risk (CLR) works on addressing the worst-case risks from the development and deployment of advanced AI systems in order to reduce the worst risks of astronomical suffering (s-risks).
Research: We built better and more explicit models of future conflict situations, making our reasoning and conclusions in this area more legible and rigorous. We also have developed more considered views on AI timelines and potential backfire risks from our work. In total, we published twelve research reports, including a paper in the field of cooperative AI that was accepted at two NeurIPS workshops.
Grantmaking: We contributed to the scale-up of the field of Cooperative AI through our advising of the Center for Emerging Risk Research (CERR). Some of our staff helped set up the Cooperative AI Foundation (CAIF).
Community-building: We ran two s-risk intro seminars (with about fifteen participants each) and a three-month summer research fellowship (with fourteen participants).
Plans for 2022: We plan to continue our research on Cooperative AI, Evidential Cooperation in Large Worlds, AI Forecasting, and other topics. We will also continue to build the s-risk community and to advise the Center for Emerging Risk Research (CERR) on grantmaking.
Fundraising: We are accepting donations to diversify our funding pool and to expand our activities. You can donate here.
Hiring: We are hiring researchers and summer research fellows. You can find details here. The application deadline is February 27, 2022.

About us

Our goal is to reduce the worst risks of astronomical suffering (s-risks) from emerging technologies. To this end, we work on addressing the worst-case risks from the development and deployment of advanced AI systems. We are currently focused on conflict scenarios as well as technical and philosophical aspects of cooperation.

We have been based in London since late 2019. Our team is currently about fourteen full-time equivalents strong, with most of our employees working full-time.

Review of CLR in 2021

We review our work across organizational functions by combining a subjective assessment with a list of tangible outputs and activities. The assessments were written by senior staff members.

Overall subjective assessment

Guiding question: Have we made progress towards becoming a research group and community that will have an outsized impact on the research landscape and on actors relevant to reducing s-risk?

Across all dimensions, it seems to us that we are in a better position compared to last year:

We have made research progress as far as we can tell. We have better models of future conflict situations (a priority of ours), making our reasoning and conclusions in this area more legible and rigorous. We also have developed more considered views on AI timelines and potential backfire risks from our work. At the same time, it remains the case that our best candidate interventions for reducing s-risk are either indirect (such as doing more research or building capacity) or non-robust (suggesting we should do more research before implementing them).
We contributed to the scale-up of the field of Cooperative AI. We helped set up the Cooperative AI Foundation (CAIF) and the Foundations of Cooperative AI Lab (FOCAL). We published a related paper at two NeurIPS workshops and started collaborating with various people in the field.
The size of the community of people concerned about s-risks seems to have continued to increase and the people in it seem to have continued to advance in their careers. This is based on our impressions rather than hard data.
In terms of organizational setup, we also seem to be in an overall better position. While some key staff members departed, which is a loss for CLR, we were able to hire new staff with relevant and more specialized skill sets. We have also moved towards a new organizational structure that gives wide-ranging authority and responsibility to a group of Lead Researchers. We believe the new structure will help us to scale further.

Research

Subjective assessment

Guiding questions:

Have we made relevant research progress?
Has the research reached its target audience?
Have we received positive feedback from peers and our target audience?

Our continued work on better understanding the causes of conflict has progressed significantly. We have developed some initial internal tools (e.g., game-theoretic models) that will allow us to explore different conflict scenarios and their implications more rigorously. We expect this work to be helpful in (i) informing future work on prioritization and (ii) communicating about conflict dynamics to (and eliciting advice from) various important audiences, including those new to s-risk research, external longtermist researchers, and stakeholders at AI labs. This has already resulted in some fruitful conversations. By providing a set of tools for more “paradigmatic” research (in the form of game-theoretic models), this line of work also opens up more opportunities for people to contribute to s-risk research.

We made progress in our work on Cooperative AI. Our Normative Disagreement paper was accepted at two NeurIPS workshops (Cooperative AI and Strategic ML). We also began working on clarifying foundational Cooperative AI concepts, such as what it would mean to work towards differential progress on cooperation. We hope that this work will feed into work on benchmarks by the Cooperative AI Foundation (CAIF).

We made some progress in thinking about intervening on AI agents’ values in a coarse-grained way so that they at least bargain cooperatively, even if otherwise misaligned. While we had previously been aware of this intervention class, only this year did we start to name it as a distinct, potentially promising area for research and intervention and begin to work on developing and evaluating concrete interventions.

Some staff have started to explore frameworks from the literature on decision-making under deep uncertainty as well as their implications for our strategy. This was the result of research and extensive discussions about the potential for large unintended negative consequences from efforts to shape the long-term future.

We published some work on AI forecasting, which increased our own understanding of the topic and seems to have been well received by the wider community.

We made less progress than we had hoped or planned on better understanding the risks from malevolent actors because a key staff member in this area fell sick for most of the year.

Public outputs & activities

Staff publications

Alex Lyzhov: 'AI and Compute' trend isn't predictive of what is happening (LessWrong)
Daniel Kokotajlo: Fun with +12 OOMs of compute (LessWrong)
Daniel Kokotajlo: Birds, Brains, Planes, and AI: Against Appeals to the Complexity/Mysteriousness/Efficiency of the Brain (LessWrong)
Daniel Kokotajlo: Against GDP as a metric for timelines and takeoff speeds (LessWrong)
Daniel Kokotajlo: Taboo “Outside View” (LessWrong/EA Forum)
Daniel Kokotajlo & Ramana Kumar: P₂B: Plan to P₂B Better (LessWrong)
David Althaus & Daniel Kokotajlo: Incentivizing forecasting via social media (EA Forum)
Jesse Clifton: CLR’s recent work on multi-agent systems (Alignment Forum)
Jesse Clifton: Weak identifiability and its consequences in strategic settings (CLR Blog)
Jia Yuan Loke: Case studies of self-governance to reduce technology risk (EA Forum)
Julian Stastny, Maxime Riché, Alexander Lyzhov, Johannes Treutlein, Allan Dafoe, Jesse Clifton: Normative Disagreement as a Challenge for Cooperative AI (Cooperative AI workshop and the Strategic ML workshop at NeurIPS 2021)
Stefan Torges: AI coordination challenges for preventing AI conflict (CLR Blog)

Output from Summer Research Fellows & CLR Fund grantees

Jack Koch: Grokking the Intentional Stance (Alignment Forum)
Jack Koch: Integrating Three Models of (Human) Cognition (Alignment Forum)
Anthony DiGiovanni: A longtermist critique of “The expected value of extinction risk reduction is positive” (EA Forum)
Nisan Stiennon: My Take on Higher-order Game Theory (Alignment Forum)
Samuel Martin: Distinguishing AI takeover scenarios (Alignment Forum)
Samuel Martin: Investigating AI Takeover Scenarios (Alignment Forum)
Samuel Martin: Takeoff Speeds and Discontinuities (Alignment Forum)
Vojta Kovařík: Formalizing Objections Against Surrogate Goals (Alignment Forum; he did most of this work while doing a trial with CLR)

Community building

Subjective assessment

Guiding questions:

Have we increased the size of the community?
Have we created opportunities for in-person (and in-depth online) contact for people in our community?
Have we systematically coached community members in their careers and prioritization, making it more likely that they will do useful work going forward?
Have we kept existing members engaged?

Our overall impression is that of continued but modest growth and progress. The various programs and events that we ran seem to have engaged people we had not previously been aware of and deepened the engagement of some people we had already known. Our individual calls and meetings put some new people on our radar and impacted some meaningful career decisions (though our counterfactual influence is hard to assess since we don’t yet do systematic evaluations). To the extent that we can already assess the outcomes from grants of the CLR Fund, they seem to have resulted in some meaningful publications and activities.

Outputs & activities

Events & programs

In February and March, we ran two s-risk intro seminars with about fifteen participants each. The participant feedback was generally very positive. The average response to the question “How likely are you to recommend the Intro Seminar to a value-aligned friend or colleague?” was 4.7 out of 5 and 4.8 out of 5 respectively.

From the end of June until the end of October, we ran a Summer Research Fellowship with two cohorts of seven fellows each (Adrià Garriga-Alonso, Euan McLean, Francis Priestland, Gustavs Zilgalvis, Rory Svarc, Tom Shlomi, and Tristan Cook; Francis Rhys Ward, Jack Koch, Julia Karbing, Lewis Hammond, Megan Kinniment Williams, Nicolas Macé, and Sara Haxhia). Another fellow, Hadrien Pouget, spent three months at CLR during the spring.

The feedback on the fellowship was generally very positive. Among the fellows who responded to our survey, all answered the question “Are you glad that you participated in the fellowship?” with a 5 out of 5. The average response to the question “If the same program happened next year, would you recommend a friend (with similar background to you before the fellowship) to apply?” was 9.9 out of 10.

Three fellows also ended up joining our team in permanent positions.

Individual outreach

We conducted over seventy 1:1 calls and meetings with potentially promising people, including various office visits. (We don’t yet collect systematic feedback on these.)

CLR Fund

There were many changes to the fund management: Emery Cooper replaced Lukas Gloor; Stefan Torges replaced Jonas Vollmer; Tobias Baumann replaced Brian Tomasik; Chi Nguyen also joined as a fund manager.

We made the following grants in 2021 (more details here):

Samuel Martin: Research connecting multi-agent AI safety work to existential catastrophe scenarios (Distinguishing AI takeover scenarios, Investigating AI Takeover Scenarios, Takeoff Speeds and Discontinuities)
Nisan Stiennon: Research on what it means for agents to cooperate (My Take on Higher-order Game Theory)
University of Michigan: Research into empirical game-theoretic analysis under Michael Wellman
Caspar Oesterheld: Scholarship for graduate studies in computer science
Anonymous: Stipend for career reflection
Timothy Chan: Stipend for travel to Beijing

Community coordination

Two and a half years ago, we worked with Nick Beckstead from the Open Philanthropy Project to develop a set of communication guidelines for discussing astronomical stakes. In brief, Nick’s guidelines for the broader longtermist community recommend highlighting beliefs and priorities that are important to the s-risk-oriented community. Our guidelines for those focused on s-risks recommend communicating in a nuanced manner about pessimistic views of the long-term future. This includes considering whether to highlight moral cooperation and uncertainty, focusing more on practical questions where possible, and anticipating potential misunderstandings and misrepresentations.

We had originally planned to reassess the costs and benefits of these guidelines at the end of 2020. We ended up pushing this into 2021. After talking to staff at the Open Philanthropy Project, we decided to extend our commitment to the communication guidelines until at least the end of 2022. However, since we were not able to devote as many resources to this project as we would have liked to, we have planned a more thorough effort for this year.

Grant recommendations to CERR

Subjective assessment

Guiding question:

Have we recommended impactful grants?

As planned, we started advising the Center for Emerging Risk Research (CERR). CERR is a new nonprofit with the mission to improve the quality of life of future generations. Overall, we are ambivalent about our progress in this area. On the one hand, we are satisfied with the size and rigor of the grant recommendations that we made. On the other hand, we failed to make progress on systematic investigations of cause areas and promising interventions. Instead, we usually investigated opportunities that we learned about through our existing network.

Tangible outputs & activities

Based in part on our recommendations, CERR made the following investments or grants:

Anthropic, a recently announced AI startup led by Dario Amodei, formerly of OpenAI.
A commitment of $15 million to establish the Cooperative AI Foundation (CAIF), whose mission will be to grow the field of Cooperative AI. Since then, three of our staff have formed a transition team to set up the organization. More recently, Lewis Hammond, one of this year’s fellows and a DPhil student at Oxford, also joined this team.
A grant of $3 million to Carnegie Mellon University to establish the Foundations of Cooperative AI Lab, led by Vincent Conitzer.

Operational capacity

Guiding questions:

Have we maintained and improved effective systems to efficiently carry out operations procedures while managing organizational risks adequately?
Did we efficiently take care of important one-off operations tasks?

Our capacity in this area is roughly at the same level as at the beginning of the year due to staff turnover. That means it is not yet as high as we would like it to be, but we expect this to change over the coming months as our new hire gets used to their role. That being said, we are still able to maintain all the important functions of the organization and push forward vital changes in the operational setup of CLR.

General organizational health

Guiding question: Are we a healthy organization with an effective board, appropriate evaluation of our work, reliable policies and procedures, adequate financial reserves and reporting, and high morale?

Board

Members of the board: Tobias Baumann, Max Daniel, Ruairi Donnelly (chair), Chi Nguyen, Jonas Vollmer (replaced David Althaus in December)

The main role of the board is to decide CLR’s leadership and structure, to resolve organizational conflicts at the highest level, and to advise CLR leadership on important questions. Generally, CLR staff seem to agree that they have been effective in that role. There are, however, different views within the organization as to how well they resolved one incident in particular.

Evaluation function

We collect systematic feedback on big community-building and operations projects. We currently do not conduct any systematic evaluation of our research, especially from external peers. This is not ideal. We had already planned to address this in 2021 but failed to do so due to a lack of capacity. It is also generally a difficult problem to solve due to our idiosyncratic priorities.

Policies & guidelines

Overall, it is our impression that our policies are effective and cover the most relevant areas. However, it might always seem this way until we realize that we would have needed a policy for resolving a particular issue. For instance, we added two policies in response to a staff incident this year. So we plan to conduct a systematic review of our policies in 2022.

Financial health

Our budget increased substantially after our move to London from Berlin, primarily due to an increase in salaries resulting from higher costs of living.

Still, primarily due to the support of the Center for Emerging Risk Research (CERR), we are currently in a good financial position. However, without their continued support, we might face serious difficulties maintaining our operations at the current level.

Net asset estimate in early December 2021 (all figures in CHF; 1 CHF ≈ 1.09 USD ≈ 0.82 GBP):

Net assets: 2,073k
Budgeted average monthly operating expenses in 2022: 236k
Runway at budgeted 2022 expenditure: 8.8 months
CLR Fund: 597k (incl. receivables from donation partners not yet reflected in public balance)
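
As a rough check on the runway figure above, the following minimal Python sketch divides the stated net assets by the budgeted average monthly operating expenses. It assumes that runway is a simple ratio of these two numbers and uses only the net assets figure as stated; the report does not confirm how the figure was derived, and the function name runway_months is hypothetical.

```python
def runway_months(net_assets_k: float, monthly_expenses_k: float) -> float:
    """Months of operation covered by net assets at a constant monthly spend.

    Assumption (not confirmed by the report): runway is net assets divided
    by budgeted average monthly operating expenses.
    """
    if monthly_expenses_k <= 0:
        raise ValueError("monthly expenses must be positive")
    return net_assets_k / monthly_expenses_k


# Figures from the report, in thousands of CHF.
net_assets_k = 2073
monthly_expenses_k = 236

runway = runway_months(net_assets_k, monthly_expenses_k)
print(f"Runway: {runway:.1f} months")
```

Under these assumptions, the ratio rounds to the 8.8 months reported above.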

Morale

The monthly staff average for the question “How much do you currently enjoy being part of CLR?” was 7.7 this year (compared to 7.6 in 2020 and 8.0 in 2019). However, the response rate for this question in 2021 was low.

Plans for CLR in 2022

We are hoping to hire new permanent researchers. We are also currently hiring summer research fellows to join us temporarily. You can find the details to apply to both of these opportunities here. The application deadline is February 27, 2022.

Research

The current Research Leads at CLR are Jesse Clifton, Emery Cooper, and Daniel Kokotajlo. They set their own research priorities as well as those of the people on their team. Jesse Clifton leads the Causes of Conflict Research Group at CLR, while Emery Cooper currently works alone and Daniel Kokotajlo works with a research assistant.

Jesse Clifton

The current priorities for Jesse and his team are:

Producing more rigorous write-ups of CLR’s current understanding of agential s-risks and how they might be reduced. This will hopefully serve to inform and elicit feedback from external researchers and serve as better pedagogical material for those who are new to s-risk research;
Continuing their technical research on Cooperative AI, which includes follow-up work on normative disagreement; studying the possibilities for AI systems to more effectively cooperate in incomplete information settings via a high degree of mutual transparency (using ideas from open-source game theory); better understanding possible paths to the emergence of conflict-conducive preferences in AI systems; and exploring possibilities for Cooperative AI research using large language models;
Developing and evaluating potential interventions in the Cooperative AI space.

Emery Cooper

Emery’s current research priorities are:

Investigating the argument that Evidential Cooperation in Large Worlds (ECL) considerations ought to significantly influence the way we think about cause prioritization, along with potential cruxes and sensitivities to features of the world. This includes building on existing work on the details of these implications, and on what further work on this topic might be valuable, under different decision-theoretic assumptions;
Evaluating the importance of multi-agent interactions involving acausal reasoning for s-risk reducers, in order to guide prioritization between interventions that differentially improve either acausal or causal cooperation.

Daniel Kokotajlo

Daniel’s research priorities for 2022 are:

Investigating the likelihood that future agents will engage in Evidential Cooperation in Large Worlds (ECL) by default. This and related questions could be crucial considerations for reasoning about how the future will unfold. As a result, it could influence what our high-level priorities should be, what we think the main sources of s-risk are, and what our community-building and research strategies should be.
Continuing his projects across various other areas, most notably AI forecasting.

Community building

Our work in this area will continue largely along the lines of previous years since we are broadly satisfied with the outcomes it has been producing. We will continue to try to identify and advise people interested in s-risks through 1:1 calls and meetings. We will run an intro fellowship for effective altruists who are interested in learning more about our work. We will make grants through the CLR Fund, mostly to support individuals in our community. We will run a summer research fellowship to allow people to test their fit for research on our priorities.

There are two ways in which we could imagine changing or expanding our work. First, after revisiting our communication strategy, we might explore broader communication about our research and focus areas (e.g., through podcast appearances, EA Forum posts, public talks). Second, we might invest more effort into building community infrastructure and platforms of exchange (e.g., regular retreats or an internal forum for those working on reducing s-risks).

Grant recommendations to CERR

CLR staff will continue to advise CERR on their grantmaking.

Operations

We initiated various organizational change projects in 2021 that will require implementation effort by our operations team in 2022:

Supporting the change in leadership structure by establishing adequate organizational processes;
Finalizing the transfer of most of our activities from a Swiss nonprofit to a UK nonprofit;
Establishing a long-term solution for running the operations of the Cooperative AI Foundation.

In addition to a lot of business-as-usual work (e.g., accounting, office management, hiring logistics & onboarding, reporting), the operations team is considering the following projects:

Deliberating about whether to move offices or to open a second office in a different location;
Scaling up personal assistant & research assistant support for our research staff function;
Organizing the logistics of various research workshops.