Reinforcement Learning Meets Robotics: Insights from Leslie Kaelbling's Research Journey

Introduction

Leslie Kaelbling, a distinguished roboticist and MIT professor recognized for her groundbreaking work in reinforcement learning, planning, and robot navigation, brings a unique perspective to artificial intelligence research. With accolades including the IJCAI Computers and Thought Award and leadership as founding editor-in-chief of the Journal of Machine Learning Research, Kaelbling's journey from philosophy to robotics offers valuable insights into the pursuit of embodied intelligence.

In this wide-ranging conversation with Lex Fridman, Kaelbling explores the fascinating intersection of philosophy, computer science, and robotics, sharing her materialist view of artificial intelligence while articulating the complex technical challenges involved in creating truly intelligent machines. Her thoughtful approach to AI research emphasizes pragmatic problem-solving over philosophical debates about consciousness, focusing instead on the concrete engineering challenges of making robots that work effectively in the real world.

The Journey from Philosophy to Robotics

Kaelbling's path to AI began in high school after reading Douglas Hofstadter's "Gödel, Escher, Bach," which introduced her to "the interestingness of primitives and combination and how you can make complex things out of simple parts." This early exposure to formal systems sparked her interest in the foundations of intelligence.

Surprisingly, Kaelbling's undergraduate degree was in philosophy at Stanford, not computer science. "There wasn't [a computer science undergraduate degree] at Stanford at the time," she explains. Her focus was on a specific branch of philosophy that included "logic, model theory, formal semantics of natural language" – a combination now formalized at Stanford as "symbolic systems," which she describes as "a perfect preparation for work in AI and computer science."

After completing a master's degree in computer science, Kaelbling's entry into robotics came through her first job at SRI International's AI lab:

"I got hired at SRI in their AI lab and they were building a robot. It was a kind of follow-on to Shakey, but all the Shakey people were not there anymore. So my job was to try to get this robot to do stuff, and that's really kind of what got me interested in robots."

This hands-on experience with "Flakey" (successor to the pioneering robot "Shakey") launched her into the practical challenges of robotics, where she had to navigate real-world problems without much prior robotics knowledge:

"I had zero background in robotics. I didn't know anything about control, I didn't know anything about sensors, so we reinvented a lot of wheels on the way to getting that robot to do stuff."

The Cycles of AI Research

Throughout the conversation, Kaelbling reflects on how AI research has evolved in cycles, with different approaches moving in and out of fashion:

"It oscillates. Things become fashionable and then they go out, and then something else becomes cool and that goes out. There's some interesting sociological process that actually drives a lot of what's going on."

She traces this oscillation from early cybernetics and control theory to symbolic AI and expert systems, noting how the field shifts not just in methodologies but in the problems it tries to solve:

"What's interesting is when a thing starts to not be working very well, it's not only do we change methods, we change problems. It's not like we have better ways of doing the problem the expert systems people were trying to do... we kind of give up on that problem and we switch to a different problem."

This historical perspective reveals how AI research communities adapt when faced with roadblocks, sometimes shelving problems until new approaches make them tractable again.

The Limitations of Expert Systems

When discussing the roadblocks encountered by symbolic AI and expert systems in the 1980s and 1990s, Kaelbling identifies a fundamental issue:

"The main roadblock I think was the idea that humans could articulate their knowledge effectively into some kind of logical statements. It's not just the cost or the effort, but just the capability of doing it."

She explains that while people might be able to provide post-hoc explanations for their decisions, these explanations often fail to capture how they actually arrive at those decisions:

"I think fundamentally the underlying problem was the assumption that people could articulate how and why they make their decisions. You can tell me a story about why you do stuff, but I'm not so sure that's the why."

This insight helps explain why pure symbolic approaches struggled with tasks that humans perform easily but can't explain, like visual perception or physical manipulation.

The Power of Abstractions

Despite her critiques of purely symbolic approaches, Kaelbling emphasizes the crucial role of abstractions in intelligent behavior:

"I do believe in abstractions. Abstractions are critical. You cannot reason at completely fine grain about everything in your life. You can't make a plan at the level of images and torques for getting a PhD."

She argues that effective intelligence requires reducing the complexity of problems through various forms of abstraction:

"How can you reduce the spaces and the horizon of the reasoning you have to do? The answer is abstraction: spatial abstraction, temporal abstraction... You talk about a room of your house instead of your pose. You talk about doing something during the afternoon instead of at 2:54."

This perspective helps bridge the gap between symbolic and non-symbolic approaches, suggesting that the key is finding the right level of abstraction for different aspects of a problem.

Decision-Making Under Uncertainty

A significant portion of the conversation explores Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs) – mathematical frameworks for decision-making under uncertainty that have been central to Kaelbling's research.

She explains MDPs as models where "I know completely the current state of my system" but have uncertainty about how actions will change the world. In contrast, POMDPs address the more realistic scenario where "we don't know the state" and must reason based on incomplete observations:

"We still kind of postulate that there exists a state... but we don't know the state. So then we have to think about how, given the history of actions I've taken and observations I've gotten, what do I think is going on in the world."

Kaelbling acknowledges that optimal planning for POMDPs is computationally intractable, but she sees this challenge as fundamental to AI research:

"Lots of people say 'I don't use POMDPs because they are intractable,' and I think that's a kind of very funny thing to say because the problem you have to solve is the problem you have to solve. If the problem you have to solve is intractable, that's what makes us AI people!"

Belief Space Planning

One of the most fascinating concepts Kaelbling discusses is "belief space" planning – the idea that robots should reason not just about the physical world but about their own beliefs about the world:

"Instead of thinking about what's the state of the world and trying to control that as a robot, I think about what is the space of belief I could have about the world. If I think of a belief as a probability distribution over ways the world could be, my control problem is actually the problem of controlling my beliefs."

This approach allows robots to make deliberate decisions about information gathering:

"I think about taking actions not just [for] what effect they'll have on the world outside, but what effect I'll have on my own understanding of the world outside. That might compel me to ask a question or look somewhere to gather information."

She gives the example of a human driver who occasionally looks over their shoulder – a deliberate information-gathering action that trades off with the cost of temporarily not looking forward. This kind of reasoning about uncertainty is crucial for robots navigating real-world environments.

Hierarchical Planning

Kaelbling also discusses hierarchical planning – breaking down complex, long-horizon tasks into manageable chunks. She illustrates this with everyday examples like planning a trip:

"People since probably reasoning began have thought about hierarchical reasoning. You might say, 'I have this long execution I have to do, but I can divide it into some segments abstractly.' Maybe I have to get out of the house, I have to get in the car, I have to drive, and so on."

What's particularly interesting is how humans make "leaps of faith" about parts of plans they haven't detailed yet:

"I always like to talk about walking through an airport. You can plan to go to New York and arrive at the airport and then find yourself in an office building later. You can't even tell me in advance what your plan is for walking through the airport partly because you're too lazy to think about it, but partly also because you just don't have the information."

This raises important questions about how robots can make similar predictions about their ability to accomplish sub-goals they haven't fully planned out.
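The airport example can be sketched as lazy refinement: plan over abstract segments, expand the ones you can, and carry the rest forward on faith. Everything below, the task names and the refinement table, is hypothetical, meant only to show the structure of the idea.

```python
# A minimal sketch of hierarchical planning with lazy refinement.
# All task names are hypothetical.

abstract_plan = ["leave_house", "drive_to_airport",
                 "walk_through_airport", "fly_to_new_york"]

refinements = {
    "leave_house": ["grab_bag", "lock_door"],
    "drive_to_airport": ["start_car", "follow_route"],
    # "walk_through_airport" is deliberately left unrefined: we
    # take a leap of faith that it can be worked out on arrival.
}

def expand(plan, refine):
    """Recursively expand abstract steps that have refinements;
    unrefined steps pass through as abstract commitments."""
    actions = []
    for step in plan:
        if step in refine:
            actions.extend(expand(refine[step], refine))
        else:
            actions.append(step)
    return actions
```

Calling `expand(abstract_plan, refinements)` yields a mostly concrete action sequence with `"walk_through_airport"` still abstract, which is exactly the commitment a robot must learn to make: predicting that a sub-goal is achievable without planning it in detail yet.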

The Path to Human-Level Intelligence

When asked about creating robots with human-level intelligence, Kaelbling takes a materialist stance while avoiding speculation about consciousness:

"I don't think much about consciousness. Even most philosophers who care about it will give you that you could have robots that are zombies – that behave like humans but are not conscious. At this moment, I would be happy enough with that."

She emphasizes that intelligence isn't a single monolithic capability but a collection of different reasoning systems:

"It seems probably clear to anybody that you can't all be this or all be that. Brains aren't all like this or all like that. They have different pieces and parts and substructure. I don't think that there's any good reason to think that there's going to be like one true algorithmic thing that's going to do the whole job."

Her current research focus is on finding the right balance between built-in structure and learning:

"To me, the thing that seems most compelling at the moment is this question of what to build in and what to learn. I think we're missing a bunch of ideas... People, don't you dare ask me how many years it's going to be till that happens, because I won't even participate in the conversation. I think we're missing ideas and I don't know how long it's going to take to find them."

Academic Publishing and Research Culture

As founder of the Journal of Machine Learning Research (JMLR), Kaelbling shares insights about academic publishing in AI. JMLR emerged when researchers became frustrated with traditional publishing models:

"There was a journal called Machine Learning which was owned by Kluwer, and I was on the editorial board. We would complain to Kluwer that it was too expensive for the libraries and that people couldn't publish, and we would really like to have some kind of relief on those fronts. They would always sympathize but not do anything. So we just decided to make a new journal."

The result was one of the first major open-access journals in the field – free both to publish in and to read. She notes that this was possible because "computer scientists are competent and autonomous in a way that many scientists in other fields aren't" when it comes to managing the technical aspects of publishing.

Kaelbling also expresses concern about the current fast-paced publication culture in AI:

"I think the horizon for researchers has gotten very short. Students want to publish a lot of papers, and there's value in that... but I'm worried that we're driving out people who would spend two years thinking about something. Back in my day, when we worked on our thesis, we did not publish papers. You did your thesis for years, you picked a hard problem, and you worked and chewed on it."

Optimism Tempered With Realism

Looking to the future of AI, Kaelbling sees inevitable cycles but with steady progress:

"I think the cycles are inevitable, but I think each time we get higher. The Deep Learning stuff has made deep and important improvements, and so the high-water mark is now higher, there's no question. But of course, I think people are overselling, and eventually investors and other people look around and say, 'Well, you're not quite delivering on this grand claim and that wild hypothesis.' So probably it's going to crash some amount, and then it's okay."

She emphasizes that as AI systems become more capable, questions of objective functions and value alignment become increasingly important:

"The idea that we're going to go from being people who engineer algorithms to being people who engineer objective functions – that's definitely going to happen, and that's going to change our thinking and methodology."

At its core, Kaelbling's approach to AI research remains deeply pragmatic. When asked about her favorite science fiction robot, she responds:

"I don't think I have a favorite robot from science fiction. I do research because it's fun, not because I care about what we produce."

This focus on the joy of the research process itself, rather than grand visions of artificial general intelligence, reflects Kaelbling's grounded approach to some of the most complex challenges in computer science.

Key Points

  1. The value of abstraction: Intelligence requires the ability to reason at multiple levels of abstraction, from detailed physical interactions to high-level conceptual planning.
  2. Belief space planning: Robots should reason not just about the physical world but about their own beliefs and uncertainties, enabling deliberate information-gathering actions.
  3. Balance between structure and learning: The most promising path toward more capable AI systems involves finding the right combination of built-in structure and learned capabilities, rather than relying exclusively on either approach.
  4. Computational intractability as the core AI challenge: Many real-world problems are computationally intractable when solved optimally, making approximations and heuristics essential to AI research.
  5. The oscillating nature of AI progress: The field has moved through cycles of different approaches falling in and out of favor, with each cycle reaching higher levels of capability despite occasional setbacks.
  6. The limits of human self-explanation: One fundamental challenge in AI is that humans often cannot articulate how they actually solve problems, making it difficult to directly encode human expertise.
  7. Research culture challenges: The current fast-paced publication culture may discourage the deep, long-term thinking needed to solve the hardest problems in artificial intelligence.

Leslie Kaelbling: Reinforcement Learning, Planning, and Robotics | Lex Fridman Podcast #15
https://www.youtube.com/watch?v=Er7Dy8rvqOc
