Physical AI: The 5 Levels of Robot Autonomy explained

Show notes

Dancing robots. Kung-fu moves. Humanoid acrobatics all over the feed. But does that mean Physical AI has actually arrived?

In this episode, Clemens, Principal Engineer at RobCo, shares what Physical AI really means for industrial automation and where the technology stands today.

You'll gain insights into:

  • why Large Language Models are just the starting point and what comes after
  • how robots are being taught today compared to five years ago
  • how RobCo approaches Physical AI in real manufacturing environments
  • where the technology stands today and what accuracy rates actually matter in practice

More about RobCo:
Website: https://www.rob.co
LinkedIn: https://www.linkedin.com/company/robco-therobotcompany/
Instagram: https://www.instagram.com/robco_therobotcompany/

Chapter markers
00:00 Intro
00:48 Where physical AI stands right now
01:41 From chatting to grabbing: the next AI leap
02:43 Why physical data is so hard to collect
04:35 What physical AI actually means at RobCo
06:16 Why 100 years of automation hit a wall
08:18 The five levels of robot autonomy
13:27 Hardware, software, data
15:15 Why end-to-end ownership changes everything
16:19 Teaching a robot in a few hundred moves
18:09 Why software turns a robot into a brain
19:07 Why modular beats fixed automation
22:04 Real use cases already running in factories
24:26 How many nines does a production line need?
28:20 The moment factories realize everything changed

Show transcript

00:00:00: So there's a lot of hype about dancing robots and you see all these acrobatics happening.

00:00:05: And they're doing kung fu moves, everything you see shows robots already moving in super crazy ways.

00:00:10: Does this mean humanoid robotics

00:00:13: and AI have already evolved, or not?

00:00:15: This is the topic we're trying to cover today, talking about physical AI especially, and we're talking with Clemens, Principal Engineer at RobCo.

00:00:25: I can only say welcome back!

00:00:29: Rob Talk, the autonomous robotics podcast.

00:00:33: Physical AI: no theory, just reality.

00:00:37: Hi Clemens, how are you doing?

00:00:40: Physical AI has been all over my newsfeed since I started researching it more and more, and I've come to see that this is such a huge upcoming economy.

00:00:49: It's really amazing.

00:00:59: Maybe let's dive into the first topic.

00:01:01: Where is the market standing on physical AI right now?

00:01:06: I think there are a ton of developments happening, lots of investments.

00:01:11: The promise still has to be delivered!

00:01:14: Yeah, absolutely. And obviously I'm super deep into AI stuff.

00:01:19: Right?!

00:01:20: I love how AI is developing; it has changed my life over the past three years. But I never saw that physical AI, especially the autonomy of robots, is being seen as a much larger market.

00:01:33: Well, I always understood that robots were going to be a big thing, right?

00:01:37: And automation was always the biggest thing, but I didn't understand that physical AI is... I don't know how you would put it, the next level after us using large language models.

00:01:48: How do you see that?

00:01:50: So, large language models: if you ask them questions about geography.

00:01:59: They know quite a few things, because in order to answer a question with text you need to understand, for example, where city A is compared to city B. So you can ask it: where is it?

00:02:11: Is it west or north? And this is the starting point of what you could call a world model: in a literal sense, somehow knowing how different pieces fit together.

00:02:27: But then, it's like, you cannot ask it:

00:02:30: Oh, can you grab that cup? and hand that task to a robot.

00:02:35: Because there is so much more to it!

00:02:39: I think the success of large language models has just... created this energy to say: hey, if we can solve language, we could also solve these problems in the physical world.

00:02:52: Yeah, makes sense. I heard a really interesting comparison.

00:02:54: With large language models, it's all about data, the Internet obviously being the available data for the model.

00:03:01: But with physical AI, it's not data you can collect in an instant, because it's like a collection of photons.

00:03:10: That's what I found very interesting.

00:03:14: Yeah, for text you can use a crawler and collect it, but photons carry much more information.

00:03:18: And I think that's one of the huge challenges in physical AI: to capture the surroundings around you, and then understand them so you're able to act upon them.

00:03:29: So it's photons, it's electrons, it's protons.

00:03:32: Basically, imagine how babies develop in their first year, starting, you know, with recognizing faces.

00:03:44: But by the end of the first year they are experts in the physical world, and they have maybe one thousand waking hours in that time.

00:03:55: And so apparently the brain is able to completely get around in the physical world.

00:04:02: And basically what we're trying to understand is: how can we make a machine do these kinds of things?

00:04:09: As an example: a three-month-old, if you show it a video of the Coyote and Road Runner.

00:04:17: And the Coyote runs over the canyon, so there's nothing underneath; for the baby that would be perfectly normal.

00:04:31: But a year later... you know?

00:04:33: ...you start laughing, because it has learned that this is very unusual.

00:04:37: And these kinds of predictions about the physical world are what you learn during the first year.

00:04:42: Very interesting, um.

00:04:44: What exactly is physical AI for RobCo

00:04:46: right now? Because obviously...

00:04:49: We don't want to talk only about cars and robots and everything.

00:04:51: We want to talk about the specific application you are really working on.

00:04:56: Yeah. So I mean, we're obviously working on manipulating the real world, right, with a robot arm.

00:05:03: The main purpose is, um, you want to change something.

00:05:07: So we're not working on running through the world, which is an interesting problem to solve.

00:05:13: But from our view, or our customers' view, it's not really the most important thing to solve. So for us, it is important to go from:

00:05:26: hey, we are programming a robot that is just very accurate in going to point A and then to point B, towards having it very reactive to the environment, with sensors in there so that it can understand its surroundings, and being able to teach it easily.

00:05:46: Either by showing it something, or by moving things around while recording, so the brain, if you will, learns these kinds of things.

00:06:00: Give me a little more insight into why this is so powerful, because what you just said feels normal to us, as we're already using chat interfaces which give great answers. But maybe explain how it was five years ago compared to now?

00:06:17: Why is programming with such incredible advancements so beneficial?

00:06:25: Well, I mean, we have been automating for over a hundred years, right?

00:06:34: And we've reached the point where most of the things that you can program

00:06:38: have been automated.

00:06:39: Yeah. You can turn processes into very regularized processes, and then you can automate a little bit more.

00:06:46: But now we're at a point where we need to be able to deal with slightly unstructured or semi-structured environments.

00:06:56: And then you get to a point where programming these things is just too complex.

00:07:01: You know, the amount of code that you would have to use explodes, because you have all these boundary conditions and so on. So you move to what are usually called data-driven approaches.

00:07:12: Very interesting!

00:07:14: For me,

00:07:15: I always think: okay, they build a big factory, you know, they automate the process, and then they program it.

00:07:20: And that works.

00:07:21: But obviously, even the feedback loop to integrate smaller changes to improve

00:07:25: the process of automation is much easier when you can teach a new step or an advancement.

00:07:34: I think that's something incredibly valuable, because you maybe don't need to re-install the machine; you can just teach it.

00:07:43: And obviously there's also an expert level needed for this, but it's much easier and more intuitive,

00:07:48: I think, than it was before.

00:07:51: You can do totally different workflows than before.

00:07:54: Yes, so we have this North Star: you should be able to just talk with a robot, or show it once what you want it to do.

00:08:03: We're not there yet.

00:08:06: We started as a company saying, hey, we want to make things simpler than others.

00:08:12: But it still requires quite a bit of expert knowledge, even, you know, with a UI and without coding and so on.

00:08:21: So we want to make it simpler and simpler to do these kinds of tasks, and that's what AI will enable, of

00:08:26: course.

00:08:28: Where are we today?

00:08:31: Which level are we sitting at?

00:08:32: Like, what is possible today, and what's realistically the next level?

00:08:37: So, I mean, last year, when we thought about how we should structure our work... we came up with five levels of autonomy.

00:08:48: So that's our North Star!

00:08:51: How do we get

00:08:52: there?!

00:08:53: And where have we been?

00:08:54: For us, Level One is still relatively classic robotics.

00:09:00: You have an algorithmic approach to be able to say: I want to move to point A. There might be an obstacle in the way, and I have an optimization problem for how to move there.

00:09:11: And so what we do is take the parts of the environment that we're interested in, for example the robot itself or certain things that could be in the way, and model them as a digital twin.

00:09:24: Also with some physics; you know, being able to write a robot controller requires a lot of understanding of the robot's physics.

00:09:33: So that's Level One, very useful.

00:09:36: Level Two would be: okay,

00:09:37: so now I'm taking in information.

00:09:40: I have a vision system for being able to recognize certain things in the environment. With that, obviously, comes the first learning-based approach: you can say, okay, identify certain objects, and then I want to parameterize the planner. These are things that are pretty established.

00:10:01: We have products around it in production at many customer sites, which are very reliable.

00:10:08: Now, what's the next step?

00:10:09: The next step is a bit bigger.

00:10:12: Now you don't take just a single photo, but have a whole feedback loop running.

00:10:18: So imagine maybe a couple of sensors or cameras.

00:10:22: You know where the robot is at all times, and now you can do something we call imitation learning, for example.

00:10:31: Which means that... you have a means to show the robot what to do, and there are different means for it.

00:10:39: Then you feed that through a training process.

00:10:43: It can imitate, but not in an algorithmic, rule-based sense; it can generalize to a certain extent and repeat these things.
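The imitation-learning idea described here (record demonstrations, train, then generalize instead of replaying rules) can be sketched in a few lines. This is a deliberately simplified toy illustration of the general technique, not RobCo's system; the linear "expert" policy and the observation layout are invented for the example:

```python
import numpy as np

# Toy imitation learning (behavior cloning): learn a policy from
# recorded (observation, action) demonstration pairs.
rng = np.random.default_rng(0)

# 200 fake demonstrations. Observation = [gripper position (2D), target (2D)];
# the "expert" always moves the gripper halfway toward the target.
obs = rng.uniform(-1, 1, size=(200, 4))
actions = 0.5 * (obs[:, 2:] - obs[:, :2])

# "Training": fit a linear policy, actions ≈ obs @ W, by least squares.
W, *_ = np.linalg.lstsq(obs, actions, rcond=None)

# The learned policy handles a state never seen in the demonstrations,
# rather than replaying a recorded trajectory rule by rule.
predicted = np.array([[0.2, -0.1, 0.8, 0.4]]) @ W
expected = 0.5 * (np.array([[0.8, 0.4]]) - np.array([[0.2, -0.1]]))
print(np.allclose(predicted, expected, atol=1e-6))  # True
```

Real systems replace the linear fit with a neural network trained on camera images and robot states, but the shape of the pipeline (demonstrate, train, generalize) is the same.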

00:10:54: And we put that into an otherwise normal flow of instructions that you would still set up the traditional way.

00:11:06: And what do you call Level Three?

00:11:09: Level Three is embedded, feedback-looped, learning-based movements and tasks that are part of a larger context.

00:11:20: And then Level Four would be: now you take all these programmed pieces out, so everything is completely learned, maybe imitated or tuned on the side with things like reinforcement learning, but it covers a complex workflow, and I don't have to program anything at all.

00:11:41: And I think this is... we set up all of these steps so that we don't have to solve this, you know, complete AI problem.

00:11:50: We can extract value at every step along the way, and each step blends seamlessly into the next.

00:11:57: Yeah, that's so interesting.

00:11:59: Do you already know when these levels will be available for customers, or how long it will take?

00:12:06: Because obviously, when we look at automation, some things, like the North Star, are I guess pretty far away.

00:12:14: Not unattainable, but it will take a little bit longer, whereas Level One and Level Two are already in action?

00:12:21: And is Level Three already in experimentation

00:12:26: mode?

00:12:27: Well we're experimenting with that right now.

00:12:30: Level Three and Level Four are variants of the same topic.

00:12:33: So... we build the necessary tools.

00:12:37: They are currently expert tools; we're looking at pilot customers to try these things out, do feasibility studies, and get something out towards the end of the year.

00:12:51: And then there is the interesting question: when will full Level Four come?

00:12:58: I think the whole industry makes bets on when this will happen, and we're making ours.

00:13:04: It could be pretty sudden; there could be a sudden jump in task success rates and suddenly it's all over the place, like it was with ChatGPT. True.

00:13:16: So we're taking our path here.

00:13:18: We think that by having software and hardware development combined, and a very clear target that we're trying to address, we're in a good state, and others are making other bets.

00:13:35: Amazing, I love it!

00:13:37: What are the challenges when you build these physical AI robots especially?

00:13:42: And what is your biggest challenge there?

00:13:48: So... you can't just write a specification, program it, do some debugging, and then it works.

00:13:57: It's more like you have three components: hardware, software, and data.

00:14:04: And in some sense these models are just representations of the data that you collect.

00:14:09: so we need to collect data in a smart way.

00:14:12: You need to collect more of it than you previously would, and navigating this whole field is what we're trying to do as well as possible.

00:14:23: Obviously, you need a lot of domain knowledge in the field, and to be able to take in all the current research that is presented at conferences and so on.

00:14:34: I think something you just mentioned was really interesting.

00:14:36: You said that robotics

00:14:38: is about programming and data.

00:14:42: RobCo has a very unique approach, because everything is owned by RobCo itself, so the ownership is an end-to-end process.

00:14:50: So you don't produce a machine and sell it off to someone; you really build the machine,

00:14:57: you build the software, collect the data, and install it at your client's facility where it's being used. And then obviously the feedback loop is much tighter.

00:15:06: But this is also a challenge, and something that is special.

00:15:10: How do you keep full ownership of innovation by keeping all production of robot parts in-house? How do you manage all that?

00:15:18: Because, especially while you're growing so fast, that's obviously not an easy task, I guess.

00:15:24: I think it's just the DNA of RobCo.

00:15:27: Even shortly after the start, our founders went to their first customer sites and iterated very quickly. And for us it's more like an information processing system.

00:15:39: So we need to understand requirements really deeply.

00:15:43: If you look at it from the outside, you know, in certain areas you just have no idea how the real world works.

00:15:49: True. And so being able to go in, but then also adapt, I think requires a full loop; otherwise you're missing out on something all the time.

00:16:00: Yeah, everything is lost in communication, right?

00:16:02: It's like this game kids used to play, I think it was called silent post.

00:16:06: I don't even know exactly what the word was, but you whisper into somebody's ear and the message changes along the way when you're not fully integrated.

00:16:17: But with RobCo, obviously, that is not a problem but more of an advantage, since you can get all of the data in and use it straight away.

00:16:28: How does physical AI really reduce the time from robot setup to productive deployment?

00:16:33: Because we already talked about the different levels.

00:16:36: Looking at this right now, how does physical AI reduce time to deployment with RobCo machines?

00:16:45: I mean, when we talk about the autonomy part:

00:16:51: We are already seeing that you can be much more flexible.

00:16:54: People can just upload, for example, a model of the piece they want to pick, and it all gets processed and deployed automatically.

00:17:05: So in this sense, on Level Two, we're enabling customers to do things without having to call somebody.

00:17:12: Now, Level Three is still something that we're experimenting with, but we are targeting the same thing.

00:17:22: Right now, obviously, we're looking at all of our different knobs and so on.

00:17:27: But in the end what you will do is set up your system.

00:17:32: You take for example a teaching device which looks very similar to a gripper.

00:17:38: And then you do the task

00:17:41: n times; today this number might be a couple of hundred times.

00:17:46: It will go down over time.

00:17:49: And then, you know, the big machine starts to rattle, and in the end you can start trying out the robot.

00:17:58: Then, as a next step, um... you watch how it does things and chime in, so you basically take over control when something is wrong and correct its mistakes.

00:18:10: That's very cool, right?

00:18:11: So this is what we're currently actively working on, and where we see progress over time.

00:18:16: How important in these phases is the software architecture,

00:18:20: compared to, obviously, the mechanical parts, which are super important to have,

00:18:27: you know, to make the arm or the mechanics really move?

00:18:30: But how important is the software?

00:18:34: Well, I think it turns a robot from something that executes commands very efficiently into a data processing system where you have a lot of

00:18:44: data intake as well,

00:18:46: yeah,

00:18:47: and then you funnel it such that it benefits the task that you have to do.

00:18:53: so um.

00:18:54: So just as an example: if you have three cameras, the amount of data generated per second is about one hundred megabytes.

00:19:01: crazy.

00:19:02: Um.

00:19:03: So, I mean, you need to compress it, you need to manage all of that, and then you need to make sense of it.
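The three-camera figure can be sanity-checked with a quick back-of-the-envelope calculation. The resolution, frame rate, and pixel encoding below are hypothetical assumptions chosen to illustrate the order of magnitude, not specifications stated in the episode:

```python
# Rough estimate of uncompressed bandwidth from three robot-mounted cameras.
# All parameters are assumed for illustration, not RobCo camera specs.
width, height = 1280, 720   # pixels per frame (720p)
bytes_per_pixel = 1.5       # e.g. a YUV 4:2:0-style raw encoding
fps = 25                    # frames per second
cameras = 3

bytes_per_second = width * height * bytes_per_pixel * fps * cameras
mb_per_second = bytes_per_second / 1e6
print(f"{mb_per_second:.2f} MB/s")  # 103.68 MB/s, roughly the 100 MB/s figure
```

Even modest cameras land near one hundred megabytes per second before compression, which is why the episode stresses compressing and managing the stream before the system can make sense of it.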

00:19:08: Yeah. Now that's the new challenge, a new task that the robot system has to be able to deal with.

00:19:14: Yeah, makes sense. When you're looking at the modular robot setup:

00:19:20: why does it play such a vital role in physical AI, and why is it much better suited for physical AI than classical, fixed automation systems?

00:19:32: So I think we are moving into a world where, being closer to a human, you only have advantages.

00:19:44: So for example, we're currently thinking of a bi-manual system, and the reason is simply that it's easier to learn from humans

00:19:59: when you are closer to the human shape.

00:20:02: We don't have to solve walking around, but you could do it.

00:20:06: Now what if you don't need two hands?

00:20:11: You can just deploy a one-armed robot.

00:20:13: What if you need slightly longer or shorter arms, because the task itself makes adaptations necessary for space reasons?

00:20:23: For example we can easily do that.

00:20:26: so I think that sets RobCo apart.

00:20:29: Also, the parts are serviceable, so it becomes more maintainable, and you can replace something in case something breaks.

00:20:37: In the end, we think that for industrial manufacturing this is a great starting position to be in.

00:20:44: Looking at how everything has evolved... for me it feels like physical AI, and AI in general, basically came out of nowhere.

00:20:52: Then it was just there and changed everything. But you've been in this space much longer.

00:20:57: How do you see the evolution in industrial production automation? Was it also a big jump?

00:21:06: It is definitely a big jump.

00:21:08: I mean, there were some approaches before in autonomous driving under the label of end-to-end learning.

00:21:16: Autonomous driving is in some sense a bit simpler because you can always learn from drivers.

00:21:22: But if you have a modern car today, it's a big data collection machine, and the steering wheel, gas, and brake are the feedback mechanisms. In robotics, it wasn't quite clear how to do that.

00:21:36: But then the big explosion came after ChatGPT and other LLMs came about, where people said, hey, wait a minute, we could use a very similar mechanism for robotics as well, and this way make this big leap in AI in robotics.

00:21:54: That was like two or three years ago.

00:21:57: Interesting, because sometimes it feels different from the outside, since you only see the result; but it's very interesting that it feels similar for you.

00:22:09: When we look at use cases around physical AI:

00:22:13: do you have any use cases that you're already serving today?

00:22:17: We're currently working on concrete customer problems where tasks are dull,

00:22:27: tasks where a traditional approach would be too expensive.

00:22:33: Very simple things.

00:22:33: Imagine you go to a store and buy a bucket of paint: what do you need to do to open the bucket?

00:22:43: It's a pretty finicky task.

00:22:45: But that is just one example where, if you can show it, the robot would be able to deal with all the intricacies of wobbling buckets.

00:23:00: So do you have more examples of how autonomy is already solving problems at your clients?

00:23:08: Autonomy.

00:23:09: I mean, the big thing is vision.

00:23:12: So we have a lot of clients who can, as I said, take the rigid parts they produce in their factory, upload a CAD model, and be able to deploy right away.

00:23:31: Other vision capabilities are that you can recognize, for example, 2D labels and boxes from the top.

00:23:39: You can pick them up and find their position.

00:23:44: These tasks are already at the state where they have the success rates or accuracy numbers that our customers need for their everyday work, and where this provides real value.

00:24:02: Interesting that you're talking about accuracy rates, because I have one follow-up question on that.

00:24:07: Accuracy,

00:24:08: for me, when I'm thinking of a production line, is like one hundred percent.

00:24:12: That's not really real life, right?

00:24:14: it's always something.

00:24:18: There are always variances; there's always a margin of error included.

00:24:24: How is it handled on a traditional line compared to, you know, one of the

00:24:30: more

00:24:31: advanced lines with physical AI or modular automation?

00:24:36: Yeah.

00:24:36: So, I mean, what really... the important part is: how many nines do you have in your task success rates?

00:24:46: And you never get to one hundred percent.

00:24:49: There are always some kinds of issues, and they can be solved by dealing with and tweaking the mechanics.

00:24:57: Now we're dealing with and tweaking models. And, I mean, for certain technologies which are already pretty mature, we can also correct, because we know what kinds of mistakes they make.

00:25:11: So for example, if you pick something and are maybe not accurate in the rotation to the last decimal point, then you can use a little fixture to correct

00:25:20: the rotation, so suddenly you get much closer to one hundred percent.

00:25:24: So these are the kinds of things that you learn when you work in the field on this

00:25:29: day to day. The physical AI problems I think we're looking at right now, from our baseline perspective, still have lower accuracy rates.

00:25:43: And then the question is: are there use cases where, even if you are lower than ninety-nine point X percent, you can already achieve value?

00:25:53: For example, every twentieth time a piece of cloth falls down.

00:25:59: You might still create value, and there's no damage done.

00:26:03: All you need to do is pick up the cloth every hour or so and put it back into the line, right?

00:26:09: You can still correct at certain times when this kind of thing happens.
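The cloth example translates into simple arithmetic: a "one in twenty" failure rate only means about one intervention per hour at a sufficiently slow cycle time. A back-of-the-envelope sketch, where the three-minute cycle time is a hypothetical assumption, not a figure from the episode:

```python
# How often does a 95% task success rate ("every twentieth time a piece
# of cloth falls down") actually interrupt a line?
success_rate = 0.95      # one failure every 20 attempts
cycle_time_s = 180       # assumed: one pick every 3 minutes (hypothetical)

attempts_per_hour = 3600 / cycle_time_s
failures_per_hour = attempts_per_hour * (1 - success_rate)
print(f"{failures_per_hour:.1f} failures per hour")  # 1.0 failures per hour

# Each extra nine cuts the intervention rate by an order of magnitude:
for rate in (0.95, 0.99, 0.999):
    print(f"{rate:.3f} success -> {attempts_per_hour * (1 - rate):.2f} failures/hour")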

00:26:14: The second thing is that we're trying to put these feedback mechanisms in, so the model gets better over time, either by direct human feedback or by learning from its own mistakes.

00:26:26: So everything becomes a learning-based system end-to-end.

00:26:30: That's

00:26:31: really interesting, because with code, obviously, you can fix something one hundred percent, basically, but it only fixes one very small edge case. Whereas with a model, you're fixing a lot of cases.

00:26:45: But then obviously the variance is still there, which is interesting, right?

00:26:50: You cover more by also allowing it a little margin of error.

00:26:56: It's very interesting, whereas the code has less margin of error but solves fewer problems.

00:27:02: Yes, and the challenge in the hardware also becomes slightly different.

00:27:07: Previously, we were working on robots that were deaf and blind.

00:27:12: So what do you need to do?

00:27:14: You basically make sure you're accurate to the micrometer in getting into a certain position, and then are able to do it a million times.

00:27:28: Now, with a feedback loop, you have, you know, different challenges.

00:27:32: You might want it to be reactive when you touch something, so your accuracy goes down a little bit, but you can correct for these errors, because you have the camera in and can see what's wrong.

00:27:44: So that means we're now trying to solve many more use cases than before.

00:27:56: There's basically endless imagination

00:27:58: you can apply now for robots and automation, and physical AI that has all the sensors.

00:28:05: And you can basically formulate totally new ways of doing things. Like, how do you handle that?

00:28:10: Because sometimes I'm still thinking in the old way, the way we've thought for the past ten or fifteen years of doing things.

00:28:18: But now it feels like endless possibility, right? How do you handle that?

00:28:26: Yeah.

00:28:27: How do you handle your thoughts?

00:28:29: I think we need to work with the scientific method, and that's basically... we have to be very careful how we spend our time.

00:28:41: So we somehow maximize the information gain per unit of time.

00:28:50: We look carefully at how to set up metrics, and how to improve what we're doing over time.

00:28:58: How do we select the experiments to be run?

00:29:02: And otherwise it's a huge... it's not only our effort.

00:29:07: There is a huge community of researchers around the world in different labs: industry labs, university labs.

00:29:13: currently I'm getting bombarded with news every day.

00:29:17: So the race is on. I don't think there will be a winner in the sense that whoever comes up with these things first

00:29:28: will win everything, like it is in software. We can make our point by making sure to solve real-world use cases and build trust with customers.

00:29:40: Very interesting!

00:29:42: Do you think there will be a moment when factories realize autonomy has fundamentally changed everything?

00:29:48: Do you see it coming already?

00:29:50: I think we will see it when, uh... we can empower our customers to do these things themselves, things they otherwise would have to get help for.

00:30:02: So you were just talking about the bombardment of news happening, and I'm pretty sure for everybody who's working in AI, and especially obviously when you're working in physical AI, the developments are happening so fast that nobody can keep up!

00:30:17: That connects to our newsletter,

00:30:19: and obviously to the podcast, where we always try to keep you up to date and really go into depth on some of the topics, which you get from, I would say, the Principal Engineer himself.

00:30:28: Really from the source.

00:30:30: This is our recommendation for staying on top of things, because otherwise you'll just be overwhelmed!

00:30:37: That's why we want to provide a great source for the industry as well.

00:30:40: So thank you, Clemens for being on the RobCo podcast today and giving us your insights.

00:30:46: And I can only ask everyone who's listening: subscribe, join, and come back next time when it's again the

00:31:00: RobCo Podcast. Thank you.