Instacart partners with Nvidia to turn smart carts into edge AI devices
Instacart is teaming with a leading next-gen artificial intelligence provider to turn its Caper Cart smart carts into a continuous learning system for brick-and-mortar stores.
The grocery technology company is integrating Nvidia AI technology into its Caper Cart smart cart platform. Caper Carts, which Instacart acquired in October 2021, are equipped with scales, sensors, touchscreens, and computer vision technology. The carts, which were upgraded in September 2023, use computer vision and AI to automatically identify items as they're placed in the cart.
Specifically, Caper Carts are equipped with basket-facing camera sensors, weights & measures-certified scales, location-tracking systems, outward-facing cameras, and an Nvidia Jetson edge AI device on each cart that runs advanced sensor fusion.
Across the entire Caper Cart install base, the smart carts capture millions of sensor inputs every day, such as which items are going in and out of the basket, how customers move through aisles, how physical interactions connect to purchase history, and shelf conditions.
How Caper Cart physical AI works
Thousands of Caper Carts are currently deployed across more than 100 cities, with deployments tripling year over year, according to Instacart. Two systems work in parallel to understand the basket from visual sensors: an edge encoder leveraging Nvidia Jetson for real-time feedback and cloud vision-language model (VLM) encoders for reasoning on visual windows of context. The two embedding feeds are fed into a shopping experience decoder to understand user actions, item information, location, and shelf information.
Caper Carts use two cameras to triangulate the exact location of one or more items in 3D space within the basket, enabling Instacart to track multiple items moving in different directions in real time. However, cameras can often be blocked or obscured, so the weight of the basket captured from the built-in scale is also critical to building an accurate understanding.
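The two-camera triangulation described above can be illustrated with the textbook stereo-vision relationship, in which depth is inversely proportional to the pixel disparity between the two views. The following is a minimal sketch, not Instacart's implementation: it assumes an idealized, rectified camera pair with identical focal lengths, and the `StereoRig` class, the `triangulate` function, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StereoRig:
    """Hypothetical rectified stereo pair (illustrative values only)."""
    focal_px: float    # focal length in pixels, assumed equal for both cameras
    baseline_m: float  # distance between the two camera centers, in meters
    cx: float          # principal point x-coordinate, in pixels
    cy: float          # principal point y-coordinate, in pixels


def triangulate(rig: StereoRig, x_left: float, x_right: float, y: float):
    """Recover a 3D point (meters, left-camera frame) from one matched pixel pair.

    Uses the standard pinhole relationship for rectified cameras:
    depth = focal_length * baseline / disparity.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        # Zero or negative disparity means the match is invalid or at infinity.
        raise ValueError("disparity must be positive for a valid match")
    z = rig.focal_px * rig.baseline_m / disparity
    x = (x_left - rig.cx) * z / rig.focal_px
    y_world = (y - rig.cy) * z / rig.focal_px
    return (x, y_world, z)


# Example: the same item appears 160 pixels apart in the two images.
rig = StereoRig(focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
point = triangulate(rig, x_left=400.0, x_right=240.0, y=240.0)
print(point)
```

A larger disparity means the item is closer to the cameras. When an item is hidden from one camera, no disparity can be measured, which is why a second signal such as basket weight is needed.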
Instacart's Caper Cart sensor fusion approach combines weight, location, and visual signals to deliver a more accurate understanding of a shopper’s basket. The system keeps learning. By incorporating Wi-Fi signals, magnetic fields, wheel encoders, and visuals from side-facing cameras, using Nvidia Jetson for visual retrieval at both the shelf level and the item level, Instacart obtains cart and item location at the same time.
The Caper Cart system then fuses the two data streams together, creating a single model for all stores that provides a view of where the cart is in the store as well as exactly what’s on the shelf, enabling accurate item recommendations for shoppers.
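The article does not say how Instacart combines scale and camera signals. As a rough illustration of the general idea behind sensor fusion, the sketch below re-weights visual candidates by how well each item's expected weight matches the change measured on the scale, using a simple Bayesian update with a Gaussian noise model. Every name here (`fuse_vision_and_weight`, the item labels, the catalog weights, and the `scale_sigma_g` noise value) is a hypothetical assumption chosen for the example.

```python
import math


def fuse_vision_and_weight(vision_scores, catalog_weights_g, weight_delta_g,
                           scale_sigma_g=5.0):
    """Combine camera confidence with a scale reading (illustrative only).

    vision_scores:     item -> confidence from the vision model (treated as a prior)
    catalog_weights_g: item -> expected item weight in grams (hypothetical catalog)
    weight_delta_g:    change in basket weight when the item was added, in grams
    scale_sigma_g:     assumed standard deviation of scale noise, in grams
    """
    unnormalized = {}
    for item, prior in vision_scores.items():
        # Gaussian likelihood of the measured weight given this candidate item.
        z = (weight_delta_g - catalog_weights_g[item]) / scale_sigma_g
        unnormalized[item] = prior * math.exp(-0.5 * z * z)

    total = sum(unnormalized.values())
    if total == 0.0:
        # The weight reading matches no candidate; fall back to vision alone.
        return dict(vision_scores)
    return {item: value / total for item, value in unnormalized.items()}


# Example: the camera is partly blocked and cannot tell two cans apart,
# but the scale registers 428 g, which is much closer to one of them.
posterior = fuse_vision_and_weight(
    vision_scores={"soup_can": 0.5, "bean_can": 0.5},
    catalog_weights_g={"soup_can": 300.0, "bean_can": 425.0},
    weight_delta_g=428.0,
)
print(max(posterior, key=posterior.get))
```

In this example the weight reading breaks a tie that vision alone could not resolve. A production system would also have to handle multiple simultaneous items, variable-weight produce, and location signals.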
The ‘grocery world’ model
Instacart says its vision is to build a “grocery world” model, a foundational AI system that understands how products relate to each other, how customers make decisions, and how stores operate physically and commercially.
Extending this layer, Instacart plans to build a system of AI expert agents across functions including catalog intelligence, inventory, store operations, recommendations, and logistics. Together, these agents will form an integrated retail reasoning system, optimizing decisions across online and in-store environments in real time.
"The future of retail isn’t physical or digital," David McIntosh, chief connected stores officer, Instacart, said in a corporate blog post. "It’s a unified experience fueled by this continuous learning system."
Instacart has begun offering an agentic shopping experience on Caper Carts through Cart Assistant, a solution designed to help customers find meal recommendations, build carts faster, and plan meals through a personalized experience. Kroger is one of the first retailers to pilot the solution.
As one example of the planned expert agents, store managers will be able to ask an agent which shelves need restocking based on foot traffic and sales, or use agents to automatically coordinate merchandising, optimize assortments, and personalize the customer shopping experience.
"Grocery is one of the most complex real-world AI challenges in retail, and we believe Physical AI is the way to truly digitize the store," said Azita Martin, VP and GM of retail and CPG at Nvidia. "Instacart’s approach – combining edge computing, accelerated AI infrastructure, and deep marketplace data – unifies online and in-store intelligence by processing signals at the edge and scaling intelligence in the cloud to lay the foundation for the next era of omnichannel retail."

