Companies understand the AI systems they’re building, right…?
FLI Open Letter
“AI labs locked in an out-of-control race to develop and deploy ever more powerful digital minds that no one – not even their creators – can understand, predict, or reliably control.” (Future of Life Institute, March 2023)
This open letter was signed by over 27,000 signatories, including key AI figures such as Professors Stuart Russell (UC Berkeley, author of the standard textbook “Artificial Intelligence: A Modern Approach”), Yoshua Bengio (U. Montreal, Turing Award winner), Max Tegmark (MIT), and John J. Hopfield (Princeton), as well as AI CEOs including Emad Mostaque (Stability AI) and Connor Leahy (Conjecture), among others.
Conjecture CEO Connor Leahy, co-founder of EleutherAI (creators of GPT-J and GPT-NeoX)
"A traditional software system is you write code. So, you write code, a programmer writes code which solves a problem. You have a problem, you wanted to do something and you write the code to make it do that. A.I. is very different. A.I.s are not really written. They are more like grown. You have a sample of data of what you wanted to accomplish. You don't know how to solve the problem, you just have a description like or samples of the problem, and then you use huge supercomputers to crunch these numbers, to kind of like organically almost grow a program that solves these programs. And importantly, we have no idea how these programs work internally. They are complete black boxes. We don't understand at all how their internals work. This is an unsolved scientific problem and we do not know how to control these things." (CNN, May 2023)
OpenAI Co-founder and Chief Scientist Ilya Sutskever
On AIs not being programmed to do specific tasks:
Greenberg: “Even though you never program the system to write poetry, or translate languages, or do simple math problems, it learns how to do all of those things…”
Sutskever: “That's exactly right. I want to add another thing, which is, at the level of tasks we are talking about, directly programming your system to do these tasks is, basically, really impossible.” (Clearer Thinking Podcast, Oct 2022)
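As a present-day illustration of this point (ours, not from the podcast), the sketch below sends three unrelated tasks to a single general-purpose model through the OpenAI Python client; only the prompt changes, and no task-specific code is written for poetry, translation, or arithmetic. The model name and prompts are placeholder assumptions.

```python
# Illustrative sketch: one model, three unrelated tasks, no task-specific code.
# Assumes `pip install openai` and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

tasks = [
    "Write a two-line poem about the sea.",            # poetry
    "Translate 'Where is the library?' into French.",  # translation
    "What is 17 * 24?",                                # simple math
]

for prompt in tasks:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(prompt, "->", response.choices[0].message.content)
```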
“Our understanding of our models is quite rudimentary.” (Lunar Society Podcast, Mar 2023)
OpenAI Alignment Lead Jan Leike
“In the InstructGPT paper we found that our models generalized to follow instructions in non-English even though we almost exclusively trained on English data. We still don't know why. I wish someone would figure this out.” (Twitter, Feb 2023)
“Before we scramble to deeply integrate LLMs everywhere in the economy, can we pause and think whether it is wise to do so? This is quite immature technology and we don't understand how it works. If we're not careful we're setting ourselves up for a lot of correlated failures.” (Twitter, March 2023)
OpenAI Researcher Jeff Wu
On OpenAI’s recent attempt to automatically explain the behavior of individual neurons in language models: “Most of the explanations score quite poorly or don’t explain that much of the behavior of the actual neuron” (TechCrunch, May 2023)
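For context, the scoring Wu refers to works roughly as follows (a simplified sketch based on OpenAI’s published description of the method; the activations below are made up): GPT-4 proposes a natural-language explanation of a neuron, another model simulates what the neuron’s activations would be if that explanation were true, and the explanation is scored by how well the simulated activations match the real ones.

```python
import numpy as np

# Hypothetical data: a real neuron's activations over ten tokens, and the
# activations a simulator predicted from a proposed explanation
# (say, "fires on words related to water").
actual    = np.array([0.1, 0.0, 2.3, 0.2, 1.9, 0.0, 0.1, 2.1, 0.0, 0.3])
simulated = np.array([0.0, 0.2, 0.1, 0.1, 0.4, 0.0, 0.2, 2.2, 0.1, 0.2])

def explanation_score(actual: np.ndarray, simulated: np.ndarray) -> float:
    """Correlation between predicted and actual activations (1.0 = perfect)."""
    return float(np.corrcoef(actual, simulated)[0, 1])

print(f"score: {explanation_score(actual, simulated):.2f}")  # score: ~0.56
# A middling score like this is what Wu describes: the explanation
# captures only part of what the neuron actually does.
```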
Anthropic Co-founder Chris Olah (previously at OpenAI)
“The question that is crying out to be answered there is, ‘How is it that these models are doing these things that we don’t know how to do? … What in the wide world is going on inside these systems?’” (80,000 Hours, Aug 2021)
“Neural networks have been able to accomplish all of these tasks that no human knows how to write a computer program to do directly. We can’t write a computer program to go and classify images, but we can write a neural network to create a computer program that can classify images. We can’t go and write computer programs directly to go and translate text highly accurately, but we can train the neural network to go and translate texts much better than any program we could have written.” (80,000 Hours, Aug 2021)
“Right now, I guess probably the largest circuit that we’ve really carefully understood is at 50,000 parameters. And meanwhile, the largest language models are in the hundreds of billions of parameters.” (80,000 Hours, Aug 2021) Note: Some progress on interpretability has been made since 2021, but the largest models now have trillions of parameters. The gap between the largest carefully interpreted circuits and the largest models is still a factor of roughly a million.
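Taking the quote’s numbers at face value, the gap is simple to compute (GPT-3’s 175 billion parameters stand in here for “hundreds of billions”):

```python
interpreted_circuit = 50_000  # parameters, per Olah's 2021 estimate
gpt3 = 175_000_000_000        # GPT-3's parameter count

print(f"gap: {gpt3 / interpreted_circuit:,.0f}x")  # gap: 3,500,000x
```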
Google DeepMind Researcher Neel Nanda (previously at Anthropic)
“...a lot of fascinating scientific questions - how do models actually work? Are there fundamental principles and laws underlying them, or is it all an inscrutable mess?” (200 Open Problems in Mechanistic Interpretability, Dec 2022)
“It is a fact about today’s world that there exist computer programs like GPT-3 that can essentially speak English at a human level, but we have no idea how to write these programs in normal code.” (200 Open Problems in Mechanistic Interpretability, Dec 2022)