The Sweet Spot: Understanding Intelligence Delta Through Human Learning
What an Early 20th Century Soviet Psychologist Can Teach Us About AI Safety
Remember learning to ride a bike? You probably didn't start by hopping on a two-wheeler and zooming down the street. Instead, you might have begun with training wheels, or had a parent running alongside, holding the back of your seat. As you gained confidence, that support gradually disappeared until one day – perhaps without even realizing it – you were cycling all on your own.
This progression, from "can't do it alone" to "can do it with help" to "can do it independently," isn't just about bikes. It's a fundamental pattern in how humans learn, one that Soviet psychologist Lev Vygotsky called the "Zone of Proximal Development" (ZPD). And surprisingly, this nearly century-old theory about how children learn may prove relevant to one of AI's biggest challenges: keeping artificial intelligence both powerful and comprehensible to humans.
The Learning Sweet Spot
Vygotsky's insight was simple but profound: there's a sweet spot in learning where challenges are neither too easy nor too hard. Instead, they're just beyond what you can do alone, but achievable with the right support. Think of it like this:
Too Easy: Things you can already do independently (like walking)
Sweet Spot: Things you can do with help (like learning to ride that bike)
Too Hard: Things beyond reach even with help (like performing brain surgery... for now)
This "sweet spot" is where real magic of learning happens. It's where we're stretched but not broken, challenged but not overwhelmed. And here's where it gets interesting: this same principle might be relevant for managing our relationship with AI.
From Training Wheels to Training AI
When I was leading AI initiatives at Sama, working with companies like Tesla and Meta, I saw firsthand how this principle applies to artificial intelligence. Let me give you a concrete example.
Consider chess engines. In the 1990s, chess programs were strong but still comprehensible to grandmasters. Players could learn from them because the gap wasn't too wide. But today? Top chess engines are so far beyond human capability that their moves often seem alien. The gap has grown too large for optimal learning and collaboration.
The evolution of language models offers another interesting perspective. Early AI writing assistants complemented human writers by offering quick suggestions and alternative phrasings – not because they were necessarily better writers, but because they could rapidly generate alternatives that humans could thoughtfully evaluate and learn from. They operated in that "sweet spot" where their capabilities complemented human creativity and judgment.
As these models become more sophisticated, we face a different challenge. It's not about them becoming "too good" at writing – after all, great human writers can still teach novices. Rather, it's about their decision-making process becoming increasingly opaque. When a human mentor suggests a better way to phrase something, we can ask them to explain their reasoning and learn from their thought process. But with advanced language models, we risk creating systems that can generate text without being able to provide genuine insight into the craft of writing itself.
The challenge isn't that AI will become too good at writing to teach us – it's that we might build systems that can generate excellent prose without being able to illuminate the principles behind it. This would be like having a mentor who can show you perfect sentences but can't explain what makes them work.
Why This Matters (Even If You're Not an AI Researcher)
"But wait," you might think, "isn't the goal of AI to be as capable as possible? And couldn't a superintelligent AI simply adjust its communication to our level, much like how we can explain complex ideas to children?"
It's a compelling argument. After all, if one can explain quantum physics to a five-year-old (with sufficient simplification), shouldn't a superintelligent AI be able to break down its reasoning for us mere humans?
The reality is more complex. While an advanced AI might be able to explain its conclusions in simple terms, there are two crucial challenges we need to consider:
First, there's a fundamental difference between understanding an explanation and engaging in meaningful collaboration. Imagine a grandmaster who can explain chess moves to a beginner. That's helpful, but it's not the same as playing a game where both players are genuinely engaged in strategic thinking together. True learning and innovation often happen in that space of collaborative problem-solving, not just in one-way explanation.
Second, as AI systems become more sophisticated, their decision-making processes might become increasingly alien to human intuition – even if they can explain their conclusions simply. It's one thing to understand what an AI is telling us to do; it's another to comprehend how and why it reached those conclusions in a way that allows us to meaningfully participate in and sometimes challenge the decision-making process.
This isn't just about learning – it's about maintaining meaningful human agency in a world increasingly shaped by AI. When we keep AI within our Zone of Proximal Development:
We can actively participate in problem-solving rather than just following instructions
We maintain the ability to meaningfully oversee and direct AI development
We ensure AI augments human capabilities rather than relegating us to passive observers
The goal of the Intelligence Delta isn't to limit AI's potential – it's to ensure that as AI grows more capable, humans remain active participants in shaping our shared future rather than mere recipients of AI wisdom, no matter how well it's explained.
What This Means for AI Development
So how do we put this into practice? The Intelligence Delta framework suggests several practical approaches:
Focus on Narrow AI: Develop highly capable but specialized AI systems that humans can comprehend within their domain, rather than pushing immediately for artificial general intelligence.
Prioritize Explainability: Design AI systems that can "show their work" in ways humans can understand, even if it means trading some raw performance for comprehensibility.
Progressive Disclosure: Introduce AI capabilities gradually, allowing human understanding to develop alongside AI advancement.
Active Collaboration: Create interfaces and workflows where humans and AI systems can work together, each bringing their unique strengths to the table (a toy sketch of what such a workflow might look like follows this list).
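To make "Prioritize Explainability" and "Active Collaboration" slightly more concrete, here is a minimal, purely illustrative Python sketch of a human-in-the-loop workflow in which the AI must surface a rationale alongside every proposal and the human remains the final author. All of the names here (Suggestion, collaborate, toy_model) are hypothetical placeholders, not any real library's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Suggestion:
    """A hypothetical AI output that pairs a proposal with its rationale."""
    proposal: str
    rationale: str     # the "show your work" part
    confidence: float  # 0.0-1.0, a hint for how much scrutiny to apply


def collaborate(suggest: Callable[[str], Suggestion], task: str) -> str:
    """Keep the human in the loop: the AI proposes, the human decides.

    `suggest` stands in for any model call that can return both an answer
    and an explanation of how it got there.
    """
    s = suggest(task)
    print(f"Task:       {task}")
    print(f"Proposal:   {s.proposal}")
    print(f"Rationale:  {s.rationale}")
    print(f"Confidence: {s.confidence:.0%}")

    decision = input("Accept, edit, or reject? [a/e/r] ").strip().lower()
    if decision == "a":
        return s.proposal
    if decision == "e":
        return input("Your revised version: ")
    return ""  # rejected: the human remains the final author


if __name__ == "__main__":
    # A stand-in "model" so the sketch runs without any real AI behind it.
    def toy_model(task: str) -> Suggestion:
        return Suggestion(
            proposal="Use shorter sentences in the opening paragraph.",
            rationale="The first three sentences average over 30 words; "
                      "shorter openings tend to read more clearly.",
            confidence=0.7,
        )

    collaborate(toy_model, "Tighten the introduction of this essay.")
```

The point of the sketch is the shape of the interaction rather than the code itself: the system's output is only applied after a human has seen both the proposal and the reasoning behind it.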
The Path Forward
As AI capabilities continue to advance, maintaining this "sweet spot" of comprehensibility becomes both more challenging and more crucial. Just as Vygotsky's insights revolutionized our understanding of human learning, the Intelligence Delta framework might help us navigate the future of human-AI interaction and keep human agency active.
The goal isn't to limit AI's potential – it's to ensure that as AI grows more capable, humans grow alongside it enough to keep managing our future. After all, the best teacher isn't the one who knows the most; it's the one who can help us learn the most.
What do you think? How can we better design AI systems to stay within humanity's Zone of Proximal Development? Share your thoughts in the comments below.
This is the second post in a series exploring the Intelligence Delta framework as a tool for preserving human agency. Subscribe to follow along as we dive deeper into this crucial challenge of our time.
In an upcoming post I will explain why I think approaches like Mechanistic Interpretability and Representation Engineering may be fundamentally unable to scale as the capability gap widens.
I really enjoyed this post! The 'zone of proximal development' with AI seems really interesting to play with. A bunch of questions this raises for me:
1. Maintaining the Intelligence Delta while AI advances means we need to bootstrap and lift humans as much as possible, as fast as possible. What does it look like for a human to be 'lifted up' in this way? Three thoughts:
A) Baseline scientific and methodological knowledge. If we target AI towards doing science to authoritatively establish new knowledge and methods, it's clear how to integrate it into the human ecosystem. This is how we'd imagine things advancing without AI.
B) Trustworthy narrow AI tools. If we can lean on AI for certain narrow functions that operate at large scale but that we reasonably understand (automating scientific meta-analysis and medium-complexity software engineering, for example, though there are surely many more relevant tasks we could automate), then we can keep humans at the helm while advancing our knowledge faster using AI.
C) Human enhancement and BCI. I don't have anything enlightening to say on the topic, but one might imagine we'll soon hit the point where it seems worth considering, especially as we think about how to efficiently interface with trustworthy AI tools and how to raise humanity's ability to coordinate effectively.
2. For dealing with AGI, I think intelligence/epistemic enhancement will need to be done not just on the level of individual humans, but for all of society. What is the frontier of human knowledge as a whole, and how do we define _humanity's_ zone of proximal development? What kinds of institutions should we be building — if any — to represent the frontier of human agency and steer AI as it continues to develop?
3. Focusing on Narrow AI may not be tenable. The greatest advances have come from making the systems _more_ general, learning from more data and sharing this knowledge between all of their competencies. If we are to focus on Narrow AI, how can we steer the current paradigm in that direction while retaining its benefits? Or, alternatively, is it possible to constrain general AI systems to be more comprehensible in narrow domains?
Much to think about. Thank you for writing!