a field note on a shift that's already begun

The next wave of AI doesn't live on a screen.

Most of what AI has done so far is generate text, classify images, answer questions. The interesting part — the part that's hard, useful, and barely started — is the moment AI leaves the screen and starts affecting the physical world. This is a note on what that means, why it might matter, and who's working on it.

~14 min read · explainer · neutral framing · no pitch, you decide
01·what it is

A short definition, then a longer one.

short version

Physical AI is what you get when a model that can see, reason, or decide is placed inside something that can move.

The longer version needs a small history.

For most of the last decade, "AI" has meant software AI — algorithms that live on servers, take in data, and return predictions, text, images, or recommendations. ChatGPT, image classifiers, fraud detection, recommender systems. Useful, transformative, but fundamentally disembodied. They operate on representations of the world, not the world itself.

Classical robotics has been around far longer, but it worked in the opposite direction. Robots were precise, fast, and strong — but dumb. They did exactly one programmed task, in exactly one controlled environment, with exactly the input they were calibrated for. Change the lighting, rotate the part, swap the product line, and the whole thing fell apart.

Physical AI is the collision of those two traditions. Machines that sense the real world with cameras, lidar, force sensors — reason about what they're seeing with modern AI models — and act on the world with actuators, motors, wheels, grippers.

A warehouse robot that picks an unfamiliar package without being reprogrammed. A factory arm that learns a new assembly step from a 30-second demonstration. A drone that inspects a bridge and flags a crack the model has never seen before. A surgical assistant that adjusts in real time to a patient's anatomy. An autonomous tractor that weeds one row of crops differently from the next because the soil told it to.

None of this is speculation. All of it exists in 2026, at varying stages of maturity. What's new is that all of it is finally starting to work, at the same time, because three separate curves crossed: vision models got good enough, compute got small enough to fit on-device, and robotics hardware got cheap enough to deploy outside research labs.

02·how it's different

The gap between software AI and physical AI is bigger than it looks.

If you've built ML systems, you already know the software AI stack. The physical AI stack rhymes with it but has a completely different set of constraints. It's useful to see them side by side.

 
Input
  Software AI: clean text, labelled images, structured data.
  Physical AI: messy sensor streams — cameras, lidar, IMUs, force, depth — often noisy, often partial.

Environment
  Software AI: controlled; inference happens on a server.
  Physical AI: uncontrolled; lighting, dust, vibration, human presence, edge cases daily.

Failure cost
  Software AI: usually reversible; log it, retry, fix later.
  Physical AI: often physical; a dropped object, a collision, an injury, a burnt part.

Latency
  Software AI: seconds are fine; async is common.
  Physical AI: milliseconds; a control loop can't wait for the cloud.

Data
  Software AI: scrape, label, benchmark; abundant.
  Physical AI: record, simulate, collect from a real robot; scarce and expensive.

Deploy
  Software AI: ship a container to a cluster.
  Physical AI: flash firmware, calibrate sensors, ship hardware, service it.

This table is also why physical AI is hard — and why engineers who can work across it are uncommon. Software engineers usually don't know sensor fusion or control theory. Mechanical engineers usually don't know deep learning. The people who bridge these worlds get to work on problems that neither tribe can solve alone.
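To make the latency row concrete, here's a minimal sketch of the kind of fixed-rate loop that constraint forces. The 100 Hz rate and 50 ms staleness budget are illustrative, and all four callbacks (read_sensors, compute_command, apply_command, safe_stop) are hypothetical stand-ins for real robot code.

```python
import time

LOOP_HZ = 100                  # a typical inner control-loop rate
PERIOD = 1.0 / LOOP_HZ
STALE_AFTER = 0.05             # treat perception older than 50 ms as unusable

def control_loop(read_sensors, compute_command, apply_command, safe_stop):
    """Fixed-rate loop that never blocks on slow upstream inference."""
    last_percept, last_percept_time = None, float('-inf')
    while True:
        start = time.monotonic()
        percept = read_sensors()            # non-blocking; may return None
        if percept is not None:
            last_percept, last_percept_time = percept, start
        if start - last_percept_time > STALE_AFTER:
            apply_command(safe_stop())      # degrade safely, don't wait
        else:
            apply_command(compute_command(last_percept))
        # sleep out the remainder of the period to hold the rate
        time.sleep(max(0.0, PERIOD - (time.monotonic() - start)))
```

The design choice worth noticing: when perception is late, the robot does something safe instead of waiting. On a server you'd retry; on a machine in motion, you can't.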

03·why it might matter

Five problems that only physical AI can really solve.

A lot of writing about AI treats "AI for good" as marketing language. These five aren't marketing — they're problems that have resisted software solutions for decades, usually because software alone can't reach into the physical world to fix them.

01·healthcare
Surgical assistance in hospitals that don't have specialists.

Most complex surgeries happen in a few dozen cities worldwide. Tier-2 and tier-3 towns just don't have the specialists. Physical AI-assisted surgical systems — tele-operated or semi-autonomous — can let a surgeon in Mumbai perform a procedure with a robot in Warangal, or let a trained generalist attempt something that used to need a specialist. Not replacing doctors. Extending their reach.

impact: access to specialist care for hundreds of millions who currently have none
02·agriculture
Farming that uses a fraction of the pesticide and water it does today.

A camera on a tractor can tell a weed from a crop, spray one and not the other. Drones can identify plant disease at the leaf level, days before a human would see it. Soil sensors can tell you which part of a field needs irrigation and which doesn't. Individually, small improvements. At scale, less chemical runoff, less water waste, more yield from the same land — which matters a great deal on a planet expected to add well over a billion more people by 2050.

impact: food security on a warming planet
03·disaster response
Finding people in collapsed buildings, flooded areas, burning forests.

When a building collapses, the first 72 hours decide who survives. Humans can't safely search debris. Dogs help but don't scale. Small autonomous robots and drones can. The hard part isn't the hardware — it's navigating spaces no one mapped, under conditions no one simulated. That's a physical AI problem, not a software one.

impact: lives saved in the narrow window between disaster and rescue
04·manufacturing
Making things locally that currently have to be imported.

India imports around 70% of the electronic components inside its own assembled products. Not because of labour costs — because local factories can't yet deliver the precision and consistency that miniaturised parts require. Physical AI makes precision possible at scale. That changes which countries can manufacture what — and for a country that wants to be more than an assembly hub, that's a generational shift.

impact: economic sovereignty, local jobs, fewer brittle supply chains
05·the unsexy one
Work that damages human bodies.

Warehouse lifting. Repetitive assembly. Chemical handling. Underground mining. Deep-sea inspection. Millions of people do jobs that quietly ruin their backs, their lungs, their hearing, their long-term health. Physical AI doesn't eliminate those jobs overnight — but for the first time, there's a real path to removing humans from the most harmful slice of them. This one rarely shows up in investor decks. It should.

impact: quieter than the others, but possibly the largest in human terms
04·who works on it

It's not one discipline. It's a handshake between several.

One of the more interesting things about physical AI is how many different kinds of engineers are needed to ship a single working system. A warehouse robot that picks a package needs people who know perception, people who know motion planning, people who know mechanical design, people who know embedded systems, and people who know data collection. No one person covers all of that.

Here's a rough map of who does what. Most working engineers in this field specialise in one of these and talk fluently across two or three.

◇ perception

Makes machines see.

Computer vision, sensor fusion, point clouds, depth estimation. Often comes from a CS or ML background. Works in Python, PyTorch, OpenCV, ROS. Cares about benchmarks and edge cases in equal measure.

◈ motion

Makes machines move.

Control systems, motion planning, kinematics, trajectory optimisation. Usually comes from mechanical or mechatronics. Fluent in ROS, Gazebo, Isaac Sim, C++. Cares about stability, safety margins, and real-time guarantees.

◍ hardware

Makes the machine itself.

Mechanical design, embedded electronics, PCB layout, firmware, power systems. Comes from ME or ECE. Spends their week in CAD, at a soldering station, or debugging with a logic analyser. Cares about thermals and tolerances.

⌁ learning

Teaches the model.

Reinforcement learning, imitation learning, sim-to-real transfer, data pipelines. Often from ML research. Works in PyTorch, JAX, Isaac Lab. Cares about reward shaping, distribution shift, and the gap between simulation and reality.

◆ systems

Makes it all work as one thing.

Software architecture, middleware, deployment, observability. Often senior generalists. Fluent in ROS 2, DDS, containers, edge compute. Cares about reliability, logging, and the long tail of production bugs.

▲ safety & product

Makes it usable in the world.

HMI design, safety cases, regulatory compliance, field deployment, user research. Often the most underrated role. Cares about how a tired technician at 3am will actually use the system when things go wrong.

The most interesting career profile in physical AI is usually not someone who's narrowly one of the above. It's someone who picked one seriously and learned enough of the others to collaborate. A perception engineer who can read embedded code. A mechanical engineer who can train a small vision model. That shape of person is rare and extremely in demand.

05·the stack, simplified

Every physical AI system has these five layers.

You can skip this section if taxonomies bore you. But it's useful to see how the pieces fit — because each layer is a different engineering problem, with a different set of tools and a different pathway to learning it.

L5 · Application
What the user sees. A warehouse operator's dashboard, a surgeon's UI, a field technician's phone. Web, mobile, HMI. The part where your React skills matter in a field that's otherwise full of C++.

L4 · Intelligence
The decision-making brain. Vision-language models, planners, learned policies, LLMs that can reason about physical actions. PyTorch, JAX, Hugging Face. This is where most of the recent excitement lives.

L3 · Perception
Turning raw sensor streams into understanding. Object detection, SLAM, sensor fusion, depth estimation. OpenCV, PyTorch, specialised hardware. The layer that most hackathon projects touch.

L2 · Control & motion
Translating intent into physical action. PID, model-predictive control, motion planning, trajectory optimisation, ROS 2. C++, Python, real-time OS. The layer that keeps a robot safe.

L1 · Hardware
Motors, sensors, compute, power. The actual thing. Microcontrollers, PCBs, mechanical design, thermal management. Without this, nothing above it matters.

Most people enter this field through one layer and grow outward. If you already know L3-L4 from software AI, the interesting stretch is downward into L1-L2. If you come from mechanical or electronics, the stretch is upward. The fastest learners spend their first couple of years building small complete systems that cross every layer, even if each layer is done badly. You learn what matters by seeing how they interact.
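As a concrete taste of L2, here's a minimal sketch of the PID controller mentioned above. The gains, setpoint, and the encoder-at-100 Hz usage note are hypothetical — real values come from tuning against a specific actuator.

```python
class PID:
    """Textbook PID: output = kp*error + ki*sum(error*dt) + kd*d(error)/dt."""
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# hypothetical use: hold a joint at 90 degrees, reading an encoder at 100 Hz
pid = PID(kp=1.2, ki=0.1, kd=0.05, setpoint=90.0)
# effort = pid.update(encoder_angle_deg, dt=0.01)   # -> send to motor driver
```

Twenty lines, yet variants of this loop keep a great many real machines stable. The hard part is the tuning and the saturation handling, which this sketch omits.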

06·where it's happening in india

A surprising amount of it is in your postal code.

Physical AI in India was, until about 2022, mostly academic — IITs, a few research labs, a handful of startups. That's changed quickly. Hyderabad, Bengaluru, and a few industrial clusters in Tamil Nadu and Maharashtra now host companies doing genuinely frontier work. Most of them are hiring. Most of them, interestingly, are young — five-year-old companies rather than fifty-year-old ones.

Here's a partial map. Not a recommendation, not a ranking — just a sense of the landscape.

◍ hyderabad

Perceptyne Robots

Industrial humanoid robots for dexterous assembly. Taking on tasks that traditional automation can't — parts that need human-like touch.

◍ bengaluru

CynLr

Visual object intelligence for robots. Rebuilding the perception stack from first principles — inspired by how biological vision works.

◍ bengaluru

Ati Motors

Autonomous industrial vehicles — their Sherpa AMR runs in Tata, Hyundai, and Forbes plants globally. Deep work on real-world autonomy.

◍ noida + bengaluru

Addverb Technologies

Warehouse and industrial automation at scale. Mobile robots, sortation systems, full-stack integration. Large deployments, real customers.

◍ bengaluru

Niqo Robotics

Precision agriculture — targeted spraying that cuts pesticide use dramatically. Computer vision in dusty, sun-blasted, unforgiving environments.

◍ pan-india, global

NVIDIA India

Isaac Sim, Isaac ROS, Jetson platforms. The toolchain most of the above companies actually build on. Large India R&D presence.

There are dozens more — in surgical robotics, warehouse automation, drones, defence. The larger point is that physical AI stopped being a thing you have to move abroad to work on. The interesting labs are now ten minutes from a metro station in several Indian cities.

07·how anyone curious can start

A beginner's exploration path, not a curriculum.

If someone wanted to figure out whether this field interests them — not commit to it, just explore — here's a rough sequence that works. The order matters less than the pattern: learn a little, build a little, see if you still want to keep going.

01
Watch one robot work for an hour.

Not a demo reel — actual technical content. Boston Dynamics' engineering talks on YouTube. Figure AI walkthroughs. A CynLr demo video. Just to see what the current frontier looks like and what you find interesting. This is free research.

search: "Boston Dynamics Atlas explained", "CynLr visual intelligence demo"
02
Do the ROS 2 beginner tutorials.

ROS (Robot Operating System) is the middleware that much of modern robotics — research and a large share of industry — is built on. The official ROS 2 tutorials take about a weekend. You don't need a real robot — everything runs in simulation. This one weekend tells you whether the field feels boring or alive to you.

docs.ros.org/en/humble/Tutorials.html
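For a taste of what those tutorials build toward, here's roughly the first node you write — a minimal rclpy publisher. The topic name and message text are arbitrary, and it assumes a working ROS 2 installation (e.g. Humble).

```python
# minimal ROS 2 node: publishes a string on the 'chatter' topic at 2 Hz
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class Talker(Node):
    def __init__(self):
        super().__init__('talker')
        self.pub = self.create_publisher(String, 'chatter', 10)
        self.timer = self.create_timer(0.5, self.tick)   # fires every 0.5 s
        self.count = 0

    def tick(self):
        msg = String()
        msg.data = f'hello physical world #{self.count}'
        self.count += 1
        self.pub.publish(msg)

def main():
    rclpy.init()
    rclpy.spin(Talker())

if __name__ == '__main__':
    main()
```

Run it, then `ros2 topic echo /chatter` in a second terminal. Watching two processes talk over a topic is the core mental model everything else in ROS builds on.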
03
Simulate before you build.

NVIDIA's Isaac Sim and the older Gazebo both let you spawn a robot in a physics world and write code to control it. This is how serious physical AI work actually happens — simulate ten thousand times, then run once on hardware. The barrier to entry is a laptop with a decent GPU.

developer.nvidia.com/isaac-sim
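The simulate-many-times pattern, in miniature — a sketch using Gymnasium's CartPole as a stand-in for a full robot simulator (Isaac Sim and Gazebo expose richer but structurally similar loops). The hand-written policy is a placeholder where a learned one would go.

```python
# run many cheap simulated episodes before ever touching hardware
import gymnasium as gym

def policy(obs):
    # placeholder: push toward the side the pole is leaning
    return 0 if obs[2] < 0 else 1

env = gym.make("CartPole-v1")
episodes, total_return = 1000, 0.0
for _ in range(episodes):
    obs, _ = env.reset()
    done = False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        total_return += reward
        done = terminated or truncated
env.close()
print(f"mean return over {episodes} simulated episodes: {total_return / episodes:.1f}")
```

A thousand episodes run in seconds on a laptop. That ratio — thousands of cheap virtual trials per expensive physical one — is the entire economic logic of simulation-first robotics.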
04
Build one thing that bridges your software skills with something physical.

An Arduino or ESP32, a camera module, a motor, a Python script. Make the camera detect an object and the motor respond. This is how most physical AI engineers started — one tiny project that moves because code told it to. If it interests you, it'll show. If it doesn't, that's also a useful signal.

tooling: Arduino IDE, OpenCV, Python, under ₹3000 in parts
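One way that project might look — a sketch in which the serial port, baud rate, colour thresholds, and the one-byte motor protocol are all assumptions your Arduino firmware would have to match.

```python
# webcam -> detect a red object -> tell an Arduino to spin a motor
# assumptions: camera index 0, Arduino on /dev/ttyUSB0 at 9600 baud,
# firmware that reads b'1' as "run motor" and b'0' as "stop"
import cv2
import numpy as np
import serial

arduino = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # crude red mask; real projects tune these ranges per lighting
    mask = cv2.inRange(hsv, np.array([0, 120, 70]), np.array([10, 255, 255]))
    seen = cv2.countNonZero(mask) > 500     # arbitrary pixel threshold
    arduino.write(b'1' if seen else b'0')
    cv2.imshow('mask', mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
arduino.close()
cv2.destroyAllWindows()
```

The whole loop is under thirty lines, and it already contains the field's core experience: the code is the easy part; the lighting, the USB cable, and the motor driver are not.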
05
Read one paper that isn't about your current field.

Try RT-2 from DeepMind — a vision-language-action model that can follow natural language instructions on a real robot. It's accessible, short, and shows where the frontier of physical AI is actually moving. You don't need to understand every equation; you need to see the shape of the idea.

search: "RT-2 DeepMind paper"
06
Show up at a hackathon with a hardware track.

Smart India Hackathon sometimes has hardware problem statements. HackIndia, NASA Space Apps, Hack for Change occasionally do too. Even one weekend with real hardware, under time pressure, teaches more than a semester of theory. Especially if the team is a mix of ML people and mechanical people — that conversation itself is the education.

sih.gov.in · major student hackathons publish problem statements months in advance

Notice what's not on this list: enrolling in a paid course, memorising textbooks, doing a certification. Those can come later, if you find the field still interests you after step 6. The point of exploration is to find out whether the thing is alive for you — not to build a résumé line before you know.

08·honest notes

Things nobody puts on the brochure.

Every field has a version of itself that shows up in hype articles and a version of itself that shows up on a Wednesday at 3pm when something is broken. These are some of the things from the second version.

i
Physical AI pays less than pure software, at least at entry level.

A new graduate joining a strong software-AI role in 2026 might earn 1.5-2x what the same graduate earns at a physical AI startup. The gap closes in a few years because the skill is rarer, but the first two to three years are financially slower. This is worth knowing up front.

ii
Hardware is slower than software. Emotionally slower.

Code compiles in seconds. A botched PCB run takes two weeks and a courier. A burnt motor takes a new part, a resolder, and a calibration run. The feedback loops are longer. If you're the kind of person who thrives on rapid iteration, this will frustrate you until it teaches you something software never did — how to think carefully before you act.

iii
The gap between simulation and reality is real, and it hurts.

A policy that works perfectly in Isaac Sim can catastrophically fail on a real robot because the real world has friction the simulator didn't model. This is called the sim-to-real gap, and closing it is an active research problem. Expect your first few real-world deployments to humble you.

iv
Most of the day is debugging, not breakthroughs.

Twitter and YouTube show the highlight reel — a new humanoid walking, a robot folding laundry. The actual work is more like: "the left encoder is drifting by 2mm every hour and I don't know why." This is also true of software engineering, but physical AI has the added joy that the bug might be mechanical, electrical, or numerical, and narrowing it down takes hours.

v
The field rewards builders more than credential-collectors.

This one is genuine good news. In physical AI, a portfolio of small working systems — a line-following robot, a vision-guided arm, a drone that does one thing well — is worth more than a stack of certifications. Because the field is still young, the people hiring often trust "show me the robot" over "show me the transcript." If you already build things, you already have the right instincts.

vi
And none of this is urgent.

There's a whole genre of career writing that wants you to feel panicked — "this window closes soon," "don't miss out." Physical AI isn't that. If it interests you, explore it at your pace. If it doesn't, ignore it — there are ten other meaningful problems to work on in the next decade. The only bad path is making a decision you haven't actually thought about.

09·closing note

One thought, then you decide.

Most technology waves, looked at closely, are shifts in what gets to be intelligent. The personal computer brought intelligence to the desk. The internet brought intelligence to communication. Smartphones brought intelligence to the pocket. Cloud computing brought intelligence to infrastructure. Large language models brought intelligence to language itself.

Physical AI is what happens when intelligence starts showing up in the physical world — in tractors, surgical tools, manufacturing lines, rescue robots, homes, hospitals, farms.

It's not going to be the only important thing in the next decade. Software AI will keep improving. Biology, energy, space, materials — all have their own waves. But physical AI is unusual in that it requires a combination of disciplines that, until recently, didn't talk to each other much. Which means the people who can fluently work across them are rare, and the problems they get to work on are the kind that stay with you.

Whether any of this is interesting — that's not a question anyone else can answer. But the good news is that finding out is cheap. A weekend with ROS tutorials. An Arduino, a camera, a motor. One hackathon. If it lights you up, you'll know. If it doesn't, that's also fine. Either way, you'll have looked at it honestly, which is more than most people who'll be affected by it will ever do.

That's the whole note.