Training Physical AI on Video Game Clips
What I Read This Week: a summary of the content that I consumed this past week…
Caught My Eye…
1) Training Physical AI on Video Game Clips
On June 25, General Intuition raised $320 million at a $2.3 billion valuation in a round led by Khosla Ventures, with General Catalyst, Jeff Bezos, and Eric Schmidt participating. General Intuition trains AI to understand and act in the physical world by learning from video-game recordings.
Most videos teach a model what the world looks like. To interact with the physical world, robots also need to learn how their actions change it. Gameplay clips capture both: alongside each frame, they record what the player did, the exact button pressed, and the instant it happened. That pairing of image and action is the dataset an agent or a robot needs to learn cause and effect, and it barely exists outside games.
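To make the image-and-action pairing concrete, here is a minimal Python sketch of how timestamped button presses could be lined up with video frames. It is an illustration only, not General Intuition's or Medal's actual pipeline: the `Frame`, `Action`, and `Transition` types and the `pair_frames_with_actions` function are hypothetical names invented for this example.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Frame:
    t: float       # capture time in seconds
    pixels: bytes  # stand-in for real image data


@dataclass(frozen=True)
class Action:
    t: float     # time the button was pressed, in seconds
    button: str  # which button was pressed


@dataclass(frozen=True)
class Transition:
    before: Frame             # what the world looked like
    actions: Tuple[str, ...]  # what the player did next
    after: Frame              # what the world looked like afterwards


def pair_frames_with_actions(frames: List[Frame], actions: List[Action]) -> List[Transition]:
    """Attach to each consecutive frame pair the actions pressed in [before.t, after.t)."""
    frames = sorted(frames, key=lambda f: f.t)
    actions = sorted(actions, key=lambda a: a.t)
    times = [a.t for a in actions]
    transitions = []
    for before, after in zip(frames, frames[1:]):
        lo = bisect_left(times, before.t)  # first action at or after the "before" frame
        hi = bisect_left(times, after.t)   # first action at or after the "after" frame
        buttons = tuple(a.button for a in actions[lo:hi])
        transitions.append(Transition(before, buttons, after))
    return transitions


# Hypothetical clip: three frames 100 ms apart and three button presses.
frames = [Frame(0.0, b"f0"), Frame(0.1, b"f1"), Frame(0.2, b"f2")]
actions = [Action(0.03, "jump"), Action(0.12, "fire"), Action(0.15, "left")]
transitions = pair_frames_with_actions(frames, actions)

for tr in transitions:
    print(tr.before.t, tr.actions, tr.after.t)
```

Each `Transition` is one cause-and-effect example: a frame, the actions taken, and the frame that followed. Ordinary video supplies only the two frames, not the actions in between.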
The company was spun out of Medal, a clip-sharing app with 17 million monthly users and hundreds of millions of hours of recorded gameplay. That library served as the initial dataset for training General Intuition’s model.
General Intuition builds what it calls action models and world models: systems that learn how actions change the world and can generate new environments for practice. The company demonstrated that the same model that played 100 hours of Fortnite also controlled a robot moving around the office.
General Intuition co-founder Pim de Witte says that “text compresses the four dimensions of reality into a single dimension.” The company believes that embodied interaction, learned from video games, is the missing piece. Most of the $320 million will go toward rented compute to train the next version.
2) OpenAI Unveils Its First Custom Inference Chip
On June 24, OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom chip. It is built for AI inference, the process of running a model after it has already been trained. It is an ASIC (Application-Specific Integrated Circuit), meaning it is hard-wired for one job, unlike a general-purpose GPU.
It went from first design to production in nine months, the fastest cycle OpenAI and Broadcom know of for a chip this advanced. OpenAI used its own models to help design the chip: better models help design better chips, and better chips help run better models. Initial deployment is scheduled for the end of 2026.
A chip tuned to OpenAI’s own inference workloads can beat a general-purpose GPU on power and price. Early testing shows performance per watt well above that of current state-of-the-art parts. Broadcom and Celestica build the silicon, while OpenAI designs it around LLM fundamentals. This shows OpenAI clearly moving down the AI stack, expanding its moat by owning more of the infrastructure behind its core product.
A day later, IBM announced the world’s first sub-1-nanometer transistor chip. At that scale, features are 10,000 times smaller than a human red blood cell. The design can pack 100 billion transistors onto a chip the size of a fingernail. The breakthrough rests on a new transistor architecture called nanostack, a three-dimensional, nanosheet-based design. IBM also demonstrated a 40 percent scaling improvement in SRAM, the ultra-fast memory inside a processor. Because SRAM helps move data quickly inside the chip, expanding it could reduce bottlenecks in heavy computing tasks like AI. IBM sees a path to production within the next five years.
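As a quick sanity check on the red-blood-cell comparison, the arithmetic can be sketched in a few lines of Python. The 7.5-micrometer cell diameter is an assumption (a typical textbook figure, not a number from IBM's announcement), and all variable names are illustrative.

```python
# Assumption: a human red blood cell is roughly 7.5 micrometers across.
# Working in picometers keeps the arithmetic in exact integers.
RED_BLOOD_CELL_DIAMETER_PM = 7_500_000  # 7.5 micrometers = 7,500,000 pm
SCALE_FACTOR = 10_000                   # "10,000 times smaller", from the text

feature_pm = RED_BLOOD_CELL_DIAMETER_PM // SCALE_FACTOR
feature_nm = feature_pm / 1000

print(f"{feature_nm} nm")  # prints "0.75 nm"
```

Under that assumption the result is 0.75 nm, which is consistent with the "sub-1-nanometer" description.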
3) Meta Spends $900M to Fix WhatsApp Pay
CRED is an Indian fintech that began as an app for paying credit-card bills and now offers payments, lending, and savings to 17 million members. It handles more than 40% of India’s credit-card bill payments. On June 22, Meta invested $900 million for about 20% of the company at a $4.5 billion valuation, and hired CRED founder Kunal Shah to run WhatsApp worldwide, replacing the executive who had led it for seven years. CRED says Meta will not receive any of its customer data.
WhatsApp has more than 500 million users in India, but almost no share of the money they move. UPI, India’s instant bank-to-bank payment network, ran 23.2 billion transactions in May, and WhatsApp Pay handled only 0.65% of them. Meta has spent six years and more than $6.6 billion trying to change that. A national rule is meant to cap any single app at 30% of UPI volume, but its deadline keeps getting pushed back, most recently to December 2026. That has allowed the current industry leaders, PhonePe and Google Pay, to remain at 46% and 33%. If the cap is ever enforced, both would have to give up market share, creating an opening for WhatsApp Pay.
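The size of that opening can be estimated with some back-of-the-envelope arithmetic on the figures above. This Python sketch assumes all shares are measured against the same May volume and that the cap would act as a hard 30% ceiling. Both are simplifications, and the variable names are illustrative.

```python
# Figures from the text; shares are in basis points (1 bp = 0.01%)
# so that the arithmetic stays in exact integers.
MAY_UPI_TRANSACTIONS = 23_200_000_000
WHATSAPP_PAY_SHARE_BP = 65  # 0.65%
CAP_BP = 3000               # 30% cap
LEADER_SHARES_BP = {"PhonePe": 4600, "Google Pay": 3300}

# WhatsApp Pay's May transactions.
whatsapp_transactions = MAY_UPI_TRANSACTIONS * WHATSAPP_PAY_SHARE_BP // 10_000

# Share each leader would have to shed if the cap were enforced as a hard ceiling.
excess_bp = {app: max(0, share - CAP_BP) for app, share in LEADER_SHARES_BP.items()}
total_excess_bp = sum(excess_bp.values())
freed_transactions = MAY_UPI_TRANSACTIONS * total_excess_bp // 10_000

print(f"WhatsApp Pay in May: {whatsapp_transactions:,} transactions")
for app, bp in excess_bp.items():
    print(f"{app} would shed {bp / 100:.0f} percentage points")
print(f"Share freed: {total_excess_bp / 100:.0f} points, "
      f"about {freed_transactions:,} transactions a month")
```

On these assumptions, WhatsApp Pay handled about 150.8 million transactions in May, while an enforced cap would free up 19 percentage points of volume (16 from PhonePe, 3 from Google Pay), or roughly 4.4 billion transactions a month.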
Learn With My Friends and Me…
Socialists Sweep NYC, China Catches Up in Coding, AI Memory Crunch, Micron’s Blowout Quarter
Deep Dive: Curing Age At the Cellular Level
The top technology entrepreneurs of our generation are investing billions into reversing aging. Each is betting that the best way to reverse aging is with cellular reprogramming: the process of resetting...

The first part took me right back to 2016
There was a big wave of papers doing exactly this for self-driving cars, things like using GTA V footage for training data. Funny enough it was one of the things that pushed me toward ML when I went to uni.
Here's 2 papers I found to make sure I didn't hallucinate:
- https://arxiv.org/abs/1610.01983
and
- https://arxiv.org/abs/1608.02192
> models trained with game data and just 1/3 of the CamVid training set outperform models trained on the complete CamVid training set
Wish I could find it, I swear I remember seeing a video of one of those big companies like Waymo using their tech in GTA V. Super cool.