Introduction

Overview

The following is a quantitative analysis of 97 interviews conducted in Feb-March 2022 with machine learning researchers, who were asked about their perceptions of artificial intelligence (AI) now and in the future, with particular focus on risks from advanced AI systems (imprecisely labeled “AGI” for brevity in the rest of this document). Of the interviewees, 92 were selected from NeurIPS or ICML 2021 submissions and 5 were outside recommendations. For each interviewee, a transcript was generated, and common responses were identified and tagged to support quantitative analysis. The transcripts, as well as a qualitative walkthrough of the interviews, are available at Interviews.

Findings Summary

Some key findings from our primary questions of interest (not discussing Demographics or “Split-By” subquestions):

  • Most participants (75%), at some point in the conversation, said that they thought humanity would achieve advanced AI (imprecisely labeled “AGI” for the rest of this summary) eventually, but their timelines to AGI varied (source). Within this group:
    • 32% thought it would happen in 0-50 years
    • 40% thought 50-200 years
    • 18% thought 200+ years
    • and 28% were quite uncertain, reporting a very wide range.
    • (These sum to more than 100% because several people endorsed multiple timelines over the course of the conversation.)
  • Among participants who thought humanity would never develop AGI (22%), the most commonly cited reason was that they couldn’t see AGI happening based on current progress in AI. (Source)
  • Participants were pretty split on whether they thought the alignment problem argument was valid. Some common reasons for disagreement were (source):
    1. A set of responses that included the idea that AI alignment problems would be solved over the normal course of AI development (caveat: this was a very heterogeneous tag).
    2. Pointing out that humans have alignment problems too (so the potential risk of the AI alignment problem is capped in some sense by how bad alignment problems are for humans).
    3. AI systems will be tested (and humans will catch issues and implement safeguards before systems are rolled out in the real world).
    4. The objective function will not be designed in a way that causes the alignment problem / dangerous consequences of the alignment problem to arise.
    5. Perfect alignment is not needed.
  • Participants were also pretty split on whether they thought the instrumental incentives argument was valid. The most common reasons for disagreement were that 1) the loss function of an AGI would not be designed such that instrumental incentives arise / pose a problem and 2) there would be oversight (by humans or other AI) to prevent this from happening. (Source)
  • Some participants brought up that they were more concerned about misuse of AI than AGI misalignment (n = 17), or that potential risk from AGI was less dangerous than other large-scale risks humanity faces (n = 11). (Source)
  • Of the 55 participants who were asked / had a response to this question, some (n = 13) were potentially interested in working on AI alignment research. (Caveat for bias: the interviewer was less likely to ask this question if the participant believed AGI would never happen and/or thought the alignment/instrumental arguments were invalid, so as to reduce participant frustration. This question also tended to be asked in later interviews rather than earlier ones.) Of those participants potentially interested in working on AI alignment research, almost all reported that they would need to learn more about the problem and/or would need a more specific research question or incentives to work on it. Those who were not interested reported feeling that it was not their problem to address (they had other research priorities, interests, skills, and positions), that they would need examples of risks from alignment problems and/or instrumental incentives within current systems to be interested in this work, or that they felt they were not at the forefront of such research and so would not be a good fit. (Source)
  • Most participants had heard of AI safety (76%) in some capacity (source); fewer had heard of AI alignment (41%) (source).
  • When participants were followed up with ~5-6 months after the interview, 51% reported the interview had a lasting effect on their beliefs (source), and 15% reported the interview caused them to take new action(s) at work (source).
  • Thinking the alignment problem argument was valid, or the instrumental incentives argument was valid, both tended to correlate with thinking AGI would happen at some point. The effect wasn’t symmetric: if participants thought these arguments were valid, they were quite likely to believe AGI would happen; if participants thought AGI would happen, it was still more likely that they thought these arguments were valid but the effect was less strong. (Source)

Tags

The tags were developed ad hoc, with the goal of describing common themes in the data. These tags are succinct and not described in detail. Thus, to get a sense for what the tags mean, please search the tag name in the Tagged-Quotes document, which lists most of the tags used (column 1) and attached quotes (column 2). (This document is also available in Interviews.)

Many of the tags are also rephrased and included in the walkthrough of the interviews.

Limitations

There are two large methodological weaknesses that should be kept in mind when interpreting the results. First, not every question was asked of every researcher. While some questions were just added later in the interview process, some questions were intentionally asked or avoided based on interviewer judgment of participant interest; questions particularly susceptible to this have an “About this variable” section below to describe the situation in more detail.

The second issue is with the tagging, which was somewhat haphazard. One person (not the interviewer) did the majority of the tagging, while another person (the interviewer) assisted and occasionally made corrections. Tagging was not blinded, and importantly, tags were not comprehensively double-checked by the interviewer. If anyone reading this document wishes to do a more systematic tagging of the raw data, we welcome this: much of the raw data is available on this website for analysis, and we’re happy to be contacted for further advice.

With these caveats in mind, we think there is much to be learned from a quantitative analysis of these interviews and present the full results below.

Note: All error bars represent standard error.
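For the proportion bars in this report, the standard error can be computed as the standard error of a sample proportion. A minimal sketch of that calculation (an illustrative assumption about the form used, not the authors' actual code; the function name is ours):

```python
import math

def se_proportion(k: int, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    p = k / n
    return math.sqrt(p * (1 - p) / n)

# e.g. 73 of 97 participants said AGI will happen at some point
print(round(se_proportion(73, 97), 3))  # -> 0.044
```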

About this Report

There are two versions of this report: one with interactive graphs, and one with static graphs. To access all of the features of this report, like hovering over graphs to see the number of participants in each category, you need to be using the interactive version. However, the static version loads significantly faster in a browser.

Demographics of Interviewees

Basic Demographics

Gender

genders Freq Perc
Female 8 8
Other 2 2
Male 87 90

Age

Proxy: Years from graduating undergrad + 22 years

Values present for 95/97 participants.

## mean: 31.3684210526316
## median: 30
## range: 19 - 56
## # with value of 0: 0

Location

Country of origin

Proxy: Undergrad country (Any country with only 1 participant got re-coded as ‘Other’)

Values present for 97/97 participants.

undergrad_country_simplified Freq
USA 27
Other 16
China 11
India 11
Canada 6
Germany 5
France 4
Italy 4
Iran 3
Israel 3
Taiwan 3
Turkey 2
UK 2

Current country of work

(Any country with only 1 participant got re-coded as ‘Other’)

Values present for 97/97 participants.

current_country_simplified Freq
USA 57
Other 10
Canada 9
UK 7
China 4
France 3
Switzerland 3
Germany 2
Israel 2

What area of AI?

Area of AI was evaluated in two ways: first, by asking the participant directly in the interview (Field1), and second, by looking up participants’ websites and Google Scholar interests (Field2). A comparison of Field1 and Field2 is located here. The comparison isn’t particularly close, so we usually include comparisons using both Field1 and Field2. We tend to think the Field2 labels (from Google Scholar and websites) are more accurate than Field1, because the data was a little more regular and the tagger was more experienced. We also tend to think Field2 has better external validity: for both Field1 and Field2, we ran a correlation between the proportion of participants in each field who found the alignment argument valid and the proportion who found the instrumental-incentives argument valid. This correlation was much higher for Field2 than for Field1. Given that we expect these two arguments to probe a similar construct, the higher correlation suggests better external validity for the Field2 grouping.
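The validity check above amounts to a Pearson correlation across fields between two vectors of per-field proportions. A sketch with hypothetical numbers (the proportions below are illustrative only, not the real data; the function name is ours):

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical per-field proportions finding each argument valid:
alignment_valid    = [0.65, 0.40, 0.80, 0.55]
instrumental_valid = [0.60, 0.45, 0.75, 0.50]
print(round(pearson_r(alignment_valid, instrumental_valid), 2))  # -> 0.97
```

A high correlation under one field grouping suggests the two arguments are measuring a shared construct more cleanly under that grouping.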

Field 1 (from interview response)

“Can you tell me about what area of AI you work on, in a few sentences?”

Values are present for 97/97 participants.

Note: “NLP” = natural language processing. “RL” = reinforcement learning. “vision” = computer vision. “neurocogsci” = neuroscience or cognitive science. “near-term AI safety” = AI safety generally and related areas (includes robustness, privacy, fairness). “long-term AI safety” = AI alignment and/or AI safety oriented at advanced AI systems.

Field 2 (from Google Scholar)

Note: “Near-term Safety and Related” included privacy, robustness, adversarial learning, security, interpretability, XAI, trustworthy AI, ethical AI, fairness, near-term AI safety, and long-term AI safety.

At least 1 field2 tag is present for 95/97 participants.

Sector (Academia vs. Industry)

sector_combined Freq Perc
academia 66 68
industry 21 22
academiaindustry 7 7
research_institute 3 3

Status / Experience

h-index

h-index values present for 87/97 participants.

Note: participants come from different fields, which tend to have different average h-index values.

One participant is a noticeable outlier (this person is not primarily in AI). Distribution of the remaining values…

## mean: 14.5232558139535
## median: 8
## range: 0 - 87
## # with value of 0: 1

Years of Experience

Proxy: years since they started their PhD. Anyone who has never begun a PhD is excluded from this measure (i.e. marked as NA).

Values present for 81/97 participants.

## mean: 8.33333333333333
## median: 6
## range: 0 - 29
## # with value of 0: 1

Professional Rank

“Status” in Feb 2022

(Any category with only 1 participant got re-coded as ‘Other’)

rank_simplified Freq
PhD Student 38
Other 15
Assistant Professor 10
Postdoc 8
Research Scientist 6
Masters 5
Full Professor 3
Senior Research Scientist 3
Software Engineer 3
Associate Professor 2
Research Staff 2
Undergraduate 2

Institution Rank

Participants’ institutions were determined from Google search. University rank was determined using the websites below (searched in fall 2022); industry size was determined mostly by searching company size on LinkedIn/Google.

Academia

University Ranking in CS (from U.S. News & World Report - lower number = better rank)
Values present for 69/73 academics.

## mean: 59.2753623188406
## median: 37
## range: 2 - 276
## # with value of 0: 0

University Ranking Overall (from U.S. News & World Report - lower number = better rank)
Values present for 72/73 academics.

## mean: 107.083333333333
## median: 60
## range: 1 - 1095
## # with value of 0: 0

Industry

indust_size Freq
under10_employees 2
10-100_employees 1
50-200_employees 1
200-500_employee_company 4
1k-10k_employees 1
10-50k_employees 4
50k+_employees 14
50k+_employees / under10_employees 1

Preliminary Attitudes

What motivates you?

“How did you come to work on this specific topic? What motivates you in your work (psychologically)?”

22/97 participants had some kind of response. This question was only included in earlier interviews (chronologically), before being removed from the standard question list. For example quotes, search the tag names in the Tagged-Quotes document.

Benefits

“What are you most excited about in AI, and what are you most worried about? (What are the biggest benefits or risks of AI?)” ← benefits part

89/97 participants had some kind of response. For example quotes, search the tag names in the Tagged-Quotes document.

Risks

“What are you most excited about in AI, and what are you most worried about? (What are the biggest benefits or risks of AI?)” ← risks part

95/97 participants had some kind of response. For example quotes, search the tag names in the Tagged-Quotes document.

Future

“In at least 50 years, what does the world look like?”

95/97 participants had some kind of response. For example quotes, search the tag names in the Tagged-Quotes document.

Primary ?s - Descriptives

When will we get AGI?

Note: “AGI” stands in for “advanced AI systems”, and is used for brevity

  • Example dialogue: “All right, now I’m going to give a spiel. So, people talk about the promise of AI, which can mean many things, but one of them is getting very general capable systems, perhaps with the cognitive capabilities to replace all current human jobs so you could have a CEO AI or a scientist AI, etcetera. And I usually think about this in the frame of the 2012: we have the deep learning revolution, we’ve got AlexNet, GPUs. 10 years later, here we are, and we’ve got systems like GPT-3 which have kind of weirdly emergent capabilities. They can do some text generation and some language translation and some code and some math. And one could imagine that if we continue pouring in all the human investment that we’re pouring into this like money, competition between nations, human talent, so much talent and training all the young people up, and if we continue to have algorithmic improvements at the rate we’ve seen and continue to have hardware improvements, so maybe we get optical computing or quantum computing, then one could imagine that eventually this scales to more of quite general systems, or maybe we hit a limit and we have to do a paradigm shift in order to get to the highly capable AI stage. Regardless of how we get there, my question is, do you think this will ever happen, and if so when?”

96/97 participants had some kind of response.

Some participants had both “will happen” and “won’t happen” tags (e.g. because they changed their response during the conversation) and are labeled as “both”.

Note: most of the graphs on this doc are not exclusive (same person can be represented in multiple bars), but the one below is. So each of the 97 participants is represented exactly once.

73/97 (75%) said at some point in the conversation that it will happen.

Among the 73 people who said at any point that it will happen…

Among the 30 people who said at any point that it won’t happen…

Split by Field

Visualizing AGI time horizon broken down by field is tricky, because participants could be tagged with multiple fields and with multiple time horizons. So if, say, someone in the Vision field was tagged with both ‘<50’ and ‘50-200’ time horizons, including both tags on a bar plot would give the impression that there were actually two people in Vision, one with each time horizon. This would result in an over-representation of people who had multiple tags (n = 21). Thus, for only the cases where we are examining time-horizon split by field, we simplified by assigning one time-horizon per participant: if they ever endorsed ‘wide range’, they were assigned ‘wide range’; otherwise, they were assigned whichever of their endorsed time horizons was the soonest.

The simplification above results in the following breakdown:

## whenAGIdata_simp_lowest
##    None/NA        <50     50-200       >200 wide range wonthappen 
##          4         19         24          9         20         21
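The assignment rule described above ('wide range' wins if ever endorsed; otherwise the soonest endorsed horizon) can be sketched as follows (tag names as in the table; this is an illustrative reimplementation, not the authors' actual code):

```python
def simplify_horizons(tags):
    """Collapse a participant's multiple time-horizon tags into one.

    Rule: 'wide range' wins if ever endorsed; otherwise take the
    soonest endorsed horizon ('wonthappen' only if nothing else).
    """
    order = ["<50", "50-200", ">200", "wonthappen"]  # soonest first
    if not tags:
        return "None/NA"
    if "wide range" in tags:
        return "wide range"
    return min(tags, key=order.index)

print(simplify_horizons(["<50", "50-200"]))       # -> <50
print(simplify_horizons(["wide range", ">200"]))  # -> wide range
```

Applied to the 15-tag breakdown in the table below, this rule reproduces the simplified counts above (e.g. the 2 participants tagged ‘<50 + wonthappen’ fold into ‘<50’).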

An alternative solution for those with multiple time-horizon tags would have been to assign each multi-tag case its own tag. We chose not to do this for the following graphs, in part because there would have been 15 timing tags, the breakdown of which is represented in the table below.

Var1 Freq
wonthappen 21
50-200 20
<50 16
wide range 10
>200 5
>200 + wonthappen 4
None/NA 4
wide range + 50-200 4
50-200 + wonthappen 3
wide range + <50 3
<50 + wonthappen 2
wide range + >200 2
<50 + 50-200 1
50-200 + >200 1
wide range + <50 + >200 1

Field 1 (from interview response)

The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘long.term.AI.safety’ category for whom we have an answer for the when-AGI question (which is 2 total participants), 100% of them said ‘<50’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.

Observation/summary: No one in NLP/translation, near-term safety, or interpretability/explainability endorsed a <50 year time horizon. Meanwhile, no one in long-term AI safety, neuro/cognitive science, or robotics just said AGI won’t happen. People in theory were somewhat more likely to give a wide range.

Field 2 (from Google Scholar)

The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘Deep.Learning’ category for whom we have an answer for the when-AGI question (which is 25 total participants), 28% of them said ‘<50’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.

Observation/summary: No one in NLP or Optimization endorsed a <50 year time horizon. Meanwhile, no one in Applications/Data Analysis or Inference just said AGI won’t happen. People in vision were somewhat more likely to say that AGI wouldn’t happen.

Split by Sector

The proportions below exclude people in research institutes. So, for all the people in the ‘wide range’ category (N=19), 79% of them are in academia and 21% of them are in industry. People in both sectors get counted for both (so if everyone in a category were in both sectors, it would show 100% academia and 100% industry). If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.
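Because sector membership is non-exclusive, the per-category proportions can sum to more than 100%. A minimal sketch of that counting scheme with hypothetical participants (illustrative only; the function name and numbers are ours):

```python
def sector_proportions(participants):
    """Proportion of a category's participants in each sector.

    Each participant is a set of sectors; someone in both sectors
    is counted toward both, so proportions can exceed 100% total.
    """
    n = len(participants)
    counts = {}
    for sectors in participants:
        for s in sectors:
            counts[s] = counts.get(s, 0) + 1
    return {s: round(100 * c / n) for s, c in counts.items()}

# Hypothetical category of 19 people, 2 of whom are in both sectors:
group = [{"academia"}] * 14 + [{"industry"}] * 3 + [{"academia", "industry"}] * 2
print(sector_proportions(group))  # -> {'academia': 84, 'industry': 26}
```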

Observation: Very roughly/noisily: as timelines get longer, a larger proportion of the participants fall into academia and a smaller proportion fall into industry… except for ‘won’t happen’.

Split by Age

Remember, age was estimated based on college graduation year

Observation: Not much going on here.

Split by h-index

For the graphs below, the interviewee with the outlier h-index value (>200) was removed.

Observation: People with closer time horizons seem to have higher h-indices.

Alignment Problem

“What do you think of the argument ‘highly intelligent systems will fail to optimize exactly what their designers intended them to, and this is dangerous’?”

  • Example dialogue: “Alright, so these next questions are about these highly intelligent systems. So imagine we have a CEO AI, and I’m like, ‘Alright, CEO AI, I wish for you to maximize profit, and try not to exploit people, and don’t run out of money, and try to avoid side effects.’ And this might be problematic, because currently we’re finding it technically challenging to translate human values, preferences and intentions into mathematical formulations that can be optimized by systems, and this might continue to be a problem in the future. So what do you think of the argument ‘Highly intelligent systems will fail to optimize exactly what their designers intended them to, and this is dangerous’?”

95/97 participants had some kind of response. For example quotes, search the tag names in the Tagged-Quotes document.

Among the 58 people who said at any point that it is invalid…

Split by Field

We simplify by saying that if someone ever said valid, their answer is marked valid; if someone gave any of the other responses but never said valid, they are marked invalid.

The simplification above results in the following breakdown:

## alignment_validity
## invalid.other       None/NA         valid 
##            40             2            55
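The collapsing rule above can be sketched as (labels as in the table; an illustrative reimplementation, not the authors' actual code):

```python
def simplify_validity(tags):
    """'valid' if the participant ever said valid; else 'invalid.other'.

    tags: the set of validity-related tags a participant received.
    """
    if not tags:
        return "None/NA"
    return "valid" if "valid" in tags else "invalid.other"

print(simplify_validity({"invalid", "valid"}))  # ever said valid -> valid
```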

Field 1 (from interview response)

The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘long.term.AI.safety’ category for whom we have an answer for the alignment problem (which is 2 total participants), 100% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.

Observation/summary: People in vision, NLP / translation, & deep learning were more likely to think the AI alignment arguments were invalid, with a >50% chance of not saying the arguments are valid. Meanwhile, people in RL, interpretability / explainability, robotics, & safety were pretty inclined (>60%) to say at some point that the argument was valid.

Field 2 (from Google Scholar)

The graph below shows the proportion of people (among those who had answers, so removing the “None.NA” responses from above) with each answer type within each field. So, for all the people in the ‘Deep.Learning’ category for whom we have an answer for the alignment problem (which is 26 total participants), 65% of them said ‘valid’. If you are using the interactive version (rather than the static version) of this report, hover over a bar to see the total participants in that category.

Observation/summary: People in Computing, NLP, Computer Vision, & Math or Theory were more likely to think the AI alignment arguments were invalid, with a >50% chance of not saying the arguments are valid. Meanwhile, people in Inference and Near-Term Safety and Related were very likely (>80%) to say at some point that the argument was valid.

Split by: Heard of AI alignment?

Specifically, split by the participants’ answer to the question “Heard of AI alignment?”, which is described below. (The interviewer manually went through and binarized participants’ responses for the question “Heard of AI alignment?”; we will use those binarized tags rather than the initial tags.)

Proportions…

Observation: People who had heard of AI alignment were a bit more likely to find the alignment argument valid than people who had not heard of AI alignment, but not by a huge margin.

There’s a subgroup of interest: those who had not heard of AI alignment before but thought the argument for it was valid. What fields (using field2) are these 30 people in?