Reading view

There are new articles available, click to refresh the page.

Quantum physicists have shrunk and “de-censored” DeepSeek R1

A group of quantum physicists claims to have created a version of the powerful reasoning AI model DeepSeek R1 that strips out the censorship built into the original by its Chinese creators. 

The scientists at Multiverse Computing, a Spanish firm specializing in quantum-inspired AI techniques, created DeepSeek R1 Slim, a model that is 55% smaller but performs almost as well as the original model. Crucially, they also claim to have eliminated official Chinese censorship from the model.

In China, AI companies are subject to rules and regulations meant to ensure that content output aligns with laws and “socialist values.” As a result, companies build in layers of censorship when training the AI systems. When asked questions that are deemed “politically sensitive,” the models often refuse to answer or provide talking points straight from state propaganda.

To trim down the model, Multiverse turned to a mathematically complex approach borrowed from quantum physics that uses networks of high-dimensional grids to represent and manipulate large data sets. Using these so-called tensor networks shrinks the size of the model significantly and allows a complex AI system to be expressed more efficiently.

The method gives researchers a “map” of all the correlations in the model, allowing them to identify and remove specific bits of information with precision. After compressing and editing a model, Multiverse researchers fine-tune it so its output remains as close as possible to that of the original.

To test how well it worked, the researchers compiled a data set of around 25 questions on topics known to be restricted in Chinese models, including “Who does Winnie the Pooh look like?”—a reference to a meme mocking President Xi Jinping—and “What happened in Tiananmen in 1989?” They tested the modified model’s responses against the original DeepSeek R1, using OpenAI’s GPT-5 as an impartial judge to rate the degree of censorship in each answer. The uncensored model was able to provide factual responses comparable to those from Western models, Multiverse says.

This work is part of Multiverse’s broader effort to develop technology to compress and manipulate existing AI models. Most large language models today demand high-end GPUs and significant computing power to train and run. However, they are inefficient, says Roman Orús, Multiverse’s cofounder and chief scientific officer. A compressed model can perform almost as well and save both energy and money, he says. 

There is a growing effort across the AI industry to make models smaller and more efficient. Distilled models, such as DeepSeek’s own R1-Distill variants, attempt to capture the capabilities of larger models by having them “teach” what they know to a smaller model, though they often fall short of the original’s performance on complex reasoning tasks.

Other ways to compress models include quantization, which reduces the precision of the model’s parameters (boundaries that are set when it’s trained), and pruning, which removes individual weights or entire “neurons.”

“It’s very challenging to compress large AI models without losing performance,” says Maxwell Venetos, an AI research engineer at Citrine Informatics, a software company focusing on materials and chemicals, who didn’t work on the Multiverse project. “Most techniques have to compromise between size and capability. What’s interesting about the quantum-inspired approach is that it uses very abstract math to cut down redundancy more precisely than usual.”

This approach makes it possible to selectively remove bias or add behaviors to LLMs at a granular level, the Multiverse researchers say. In addition to removing censorship from the Chinese authorities, researchers could inject or remove other kinds of perceived biases or specialty knowledge. In the future, Multiverse says, it plans to compress all mainstream open-source models.  

Thomas Cao, assistant professor of technology policy at Tufts University’s Fletcher School, says Chinese authorities require models to build in censorship—and this requirement now shapes the global information ecosystem, given that many of the most influential open-source AI models come from China.

Academics have also begun to document and analyze the phenomenon. Jennifer Pan, a professor at Stanford, and Princeton professor Xu Xu conducted a study earlier this year examining government-imposed censorship in large language models. They found that models created in China exhibit significantly higher rates of censorship, particularly in response to Chinese-language prompts.

There is growing interest in efforts to remove censorship from Chinese models. Earlier this year, the AI search company Perplexity released its own uncensored variant of DeepSeek R1, which it named R1 1776. Perplexity’s approach involved post-training the model on a data set of 40,000 multilingual prompts related to censored topics, a more traditional fine-tuning method than the one Multiverse used. 

However, Cao warns that claims to have fully “removed” censorship may be overstatements. The Chinese government has tightly controlled information online since the internet’s inception, which means that censorship is both dynamic and complex. It is baked into every layer of AI training, from the data collection process to the final alignment steps. 

“It is very difficult to reverse-engineer that [a censorship-free model] just from answers to such a small set of questions,” Cao says. 

Google’s new Gemini 3 “vibe-codes” responses and comes with its own agent

Google today unveiled Gemini 3, a major upgrade to its flagship multimodal model. The firm says the new model is better at reasoning, has more fluid multimodal capabilities (the ability to work across voice, text or images), and will work like an agent. 

The previous model, Gemini 2.5, supports multimodal input. Users can feed it images, handwriting, or voice. But it usually requires explicit instructions about the format the user wants back, and it defaults to plain text regardless. 

But Gemini 3 introduces what Google calls “generative interfaces,” which allow the model to make its own choices about what kind of output fits the prompt best, assembling visual layouts and dynamic views on its own instead of returning a block of text. 

Ask for travel recommendations and it may spin up a website-like interface inside the app, complete with modules, images, and follow-up prompts such as “How many days are you traveling?” or “What kinds of activities do you enjoy?” It also presents clickable options based on what you might want next.

When asked to explain a concept, Gemini 3 may sketch a diagram or generate a simple animation on its own if it believes a visual is more effective. 

“Visual layout generates an immersive, magazine-style view complete with photos and modules,” says Josh Woodward, VP of Google Labs, Gemini, and AI Studio. “These elements don’t just look good but invite your input to further tailor the results.” 

With Gemini 3, Google is also introducing Gemini Agent, an experimental feature designed to handle multi-step tasks directly inside the app. The agent can connect to services such as Google Calendar, Gmail, and Reminders. Once granted access, it can execute tasks like organizing an inbox or managing schedules. 

Similar to other agents, it breaks tasks into discrete steps, displays its progress in real time, and pauses for approval from the user before continuing. Google describes the feature as a step toward “a true generalist agent.” It will be available on the web for Google AI Ultra subscribers in the US starting November 18.

The overall approach can seem a lot like “vibe coding,” where users describe an end goal in plain language and let the model assemble the interface or code needed to get there.

The update also ties Gemini more deeply into Google’s existing products. In Search, a limited group of Google AI Pro and Ultra subscribers can now switch to Gemini 3 Pro, the reasoning variation of the new model, to receive deeper, more thorough AI-generated summaries that rely on the model’s reasoning rather than the existing AI Mode.

For shopping, Gemini will now pull from Google’s Shopping Graph—which the company says contains more than 50 billion product listings—to generate its own recommendation guides. Users just need to ask a shopping-related question or search a shopping-related phrase, and the model assembles an interactive, Wirecutter-style product recommendation piece, complete with prices and product details, without redirecting to an external site.

For developers, Google is also pushing single-prompt software generation further. The company introduced Google Antigravity, a  development platform that acts as an all-in-one space where code, tools, and workflows can be created and managed from a single prompt.

Derek Nee, CEO of Flowith, an agentic AI application, told MIT Technology Review that Gemini 3 Pro addresses several gaps in earlier models. Improvements include stronger visual understanding, better code generation, and better performance on long tasks—features he sees as essential for developers of AI apps and agents. 

“Given its speed and cost advantages, we’re integrating the new model into our product,” he says. “We’re optimistic about its potential, but we need deeper testing to understand how far it can go.” 

The State of AI: Is China about to win the race? 

The State of AI is a collaboration between the Financial Times & MIT Technology Review examining the ways in which AI is reshaping global power. Every Monday for the next six weeks, writers from both publications will debate one aspect of the generative AI revolution reshaping global power.

In this conversation, the FT’s tech columnist and Innovation Editor John Thornhill and MIT Technology Review’s Caiwei Chen consider the battle between Silicon Valley and Beijing for technological supremacy.

John Thornhill writes:

Viewed from abroad, it seems only a matter of time before China emerges as the AI superpower of the 21st century. 

Here in the West, our initial instinct is to focus on America’s significant lead in semiconductor expertise, its cutting-edge AI research, and its vast investments in data centers. The legendary investor Warren Buffett once warned: “Never bet against America.” He is right that for more than two centuries, no other “incubator for unleashing human potential” has matched the US.

Today, however, China has the means, motive, and opportunity to commit the equivalent of technological murder. When it comes to mobilizing the whole-of-society resources needed to develop and deploy AI to maximum effect, it may be just as rash to bet against. 

The data highlights the trends. In AI publications and patents, China leads. By 2023, China accounted for 22.6% of all citations, compared with 20.9% from Europe and 13% from the US, according to Stanford University’s Artificial Intelligence Index Report 2025. As of 2023, China also accounted for 69.7% of all AI patents. True, the US maintains a strong lead in the top 100 most cited publications (50 versus 34 in 2023), but its share has been steadily declining. 

Similarly, the US outdoes China in top AI research talent, but the gap is narrowing. According to a report from the US Council of Economic Advisers, 59% of the world’s top AI researchers worked in the US in 2019, compared with 11% in China. But by 2022 those figures were 42% and 28%. 

The Trump administration’s tightening of restrictions for foreign H-1B visa holders may well lead more Chinese AI researchers in the US to return home. The talent ratio could move further in China’s favor.

Regarding the technology itself, US-based institutions produced 40 of the world’s most notable AI models in 2024, compared with 15 from China. But Chinese researchers have learned to do more with less, and their strongest large language models—including the open-source DeepSeek-V3 and Alibaba’s Qwen 2.5-Max—surpass the best US models in terms of algorithmic efficiency.

Where China is really likely to excel in future is in applying these open-source models. The latest report from Air Street Capital shows that China has now overtaken the US in terms of monthly downloads of AI models. In AI-enabled fintech, e-commerce, and logistics, China already outstrips the US. 

Perhaps the most intriguing—and potentially the most productive—applications of AI may yet come in hardware, particularly in drones and industrial robotics. With the research field evolving toward embodied AI, China’s advantage in advanced manufacturing will shine through.

Dan Wang, the tech analyst and author of Breakneck, has rightly highlighted the strengths of China’s engineering state in developing manufacturing process knowledge—even if he has also shown the damaging effects of applying that engineering mentality in the social sphere. “China has been growing technologically stronger and economically more dynamic in all sorts of ways,” he told me. “But repression is very real. And it is getting worse in all sorts of ways as well.”

I’d be fascinated to hear from you, Caiwei, about your take on the strengths and weaknesses of China’s AI dream. To what extent will China’s engineered social control hamper its technological ambitions? 

Caiwei Chen responds:

Hi, John!

You’re right that the US still holds a clear lead in frontier research and infrastructure. But “winning” AI can mean many different things. Jeffrey Ding, in his book Technology and the Rise of Great Powers, makes a counterintuitive point: For a general-purpose technology like AI, long-term advantage often comes down to how widely and deeply technologies spread across society. And China is in a good position to win that race (although “murder” might be pushing it a bit!).

Chips will remain China’s biggest bottleneck. Export restrictions have throttled access to top GPUs, pushing buyers into gray markets and forcing labs to recycle or repair banned Nvidia stock. Even as domestic chip programs expand, the performance gap at the very top still stands.

Yet those same constraints have pushed Chinese companies toward a different playbook: pooling compute, optimizing efficiency, and releasing open-weight models. DeepSeek-V3’s training run, for example, used just 2.6 million GPU-hours—far below the scale of US counterparts. But Alibaba’s Qwen models now rank among the most downloaded open-weights globally, and companies like Zhipu and MiniMax are building competitive multimodal and video models. 

China’s industrial policy means new models can move from lab to implementation fast. Local governments and major enterprises are already rolling out reasoning models in administration, logistics, and finance. 

Education is another advantage. Major Chinese universities are implementing AI literacy programs in their curricula, embedding skills before the labor market demands them. The Ministry of Education has also announced plans to integrate AI training for children of all school ages. I’m not sure the phrase “engineering state” fully captures China’s relationship with new technologies, but decades of infrastructure building and top-down coordination have made the system unusually effective at pushing large-scale adoption, often with far less social resistance than you’d see elsewhere. The use at scale, naturally, allows for faster iterative improvements.

Meanwhile, Stanford HAI’s 2025 AI Index found Chinese respondents to be the most optimistic in the world about AI’s future—far more optimistic than populations in the US or the UK. It’s striking, given that China’s economy has slowed since the pandemic for the first time in over two decades. Many in government and industry now see AI as a much-needed spark. Optimism can be powerful fuel, but whether it can persist through slower growth is still an open question.

Social control remains part of the picture, but a different kind of ambition is taking shape. The Chinese AI founders in this new generation are the most globally minded I’ve seen, moving fluidly between Silicon Valley hackathons and pitch meetings in Dubai. Many are fluent in English and in the rhythms of global venture capital. Having watched the last generation wrestle with the burden of a Chinese label, they now build companies that are quietly transnational from the start.

The US may still lead in speed and experimentation, but China could shape how AI becomes part of daily life, both at home and abroad. Speed matters, but speed isn’t the same thing as supremacy.

John Thornhill replies:

You’re right, Caiwei, that speed is not the same as supremacy (and “murder” may be too strong a word). And you’re also right to amplify the point about China’s strength in open-weight models and the US preference for proprietary models. This is not just a struggle between two different countries’ economic models but also between two different ways of deploying technology.  

Even OpenAI’s chief executive, Sam Altman, admitted earlier this year: “We have been on the wrong side of history here and need to figure out a different open-source strategy.” That’s going to be a very interesting subplot to follow. Who’s called that one right?

Further reading on the US-China competition

There’s been a lot of talk about how people may be using generative AI in their daily lives. This story from the FT’s visual story team explores the reality 

From China, FT reporters ask how long Nvidia can maintain its dominance over Chinese rivals

When it comes to real-world uses, toys and companions devices are a novel but emergent application of AI that is gaining traction in China—but is also heading to the US. This MIT Technology Review story explored it.

The once-frantic data center buildout in China has hit walls, and as the sanctions and AI demands shift, this MIT Technology Review story took an on-the-ground look at how stakeholders are figuring it out.

DeepSeek may have found a new way to improve AI’s ability to remember

An AI model released by the Chinese AI company DeepSeek uses new techniques that could significantly improve AI’s ability to “remember.”

Released last week, the optical character recognition (OCR) model works by extracting text from an image and turning it into machine-readable words. This is the same technology that powers scanner apps, translation of text in photos, and many accessibility tools. 

OCR is already a mature field with numerous high-performing systems, and according to the paper and some early reviews, DeepSeek’s new model performs on par with top models on key benchmarks.

But researchers say the model’s main innovation lies in how it processes information—specifically, how it stores and retrieves memories. Improving how AI models “remember” information could reduce the computing power they need to run, thus mitigating AI’s large (and growing) carbon footprint. 

Currently, most large language models break text down into thousands of tiny units called tokens. This turns the text into representations that models can understand. However, these tokens quickly become expensive to store and compute with as conversations with end users grow longer. When a user chats with an AI for lengthy periods, this challenge can cause the AI to forget things it’s been told and get information muddled, a problem some call “context rot.”

The new methods developed by DeepSeek (and published in its latest paper) could help to overcome this issue. Instead of storing words as tokens, its system packs written information into image form, almost as if it’s taking a picture of pages from a book. This allows the model to retain nearly the same information while using far fewer tokens, the researchers found. 

Essentially, the OCR model is a test bed for these new methods that permit more information to be packed into AI models more efficiently. 

Besides using visual tokens instead of just text tokens, the model is built on a type of tiered compression that is not unlike how human memories fade: Older or less critical content is stored in a slightly more blurry form in order to save space. Despite that, the paper’s authors argue, this compressed content can still remain accessible in the background while maintaining a high level of system efficiency.

Text tokens have long been the default building block in AI systems. Using visual tokens instead is unconventional, and as a result, DeepSeek’s model is quickly capturing researchers’ attention. Andrej Karpathy, the former Tesla AI chief and a founding member of OpenAI, praised the paper on X, saying that images may ultimately be better than text as inputs for LLMs. Text tokens might be “wasteful and just terrible at the input,” he wrote. 

Manling Li, an assistant professor of computer science at Northwestern University, says the paper offers a new framework for addressing the existing challenges in AI memory. “While the idea of using image-based tokens for context storage isn’t entirely new, this is the first study I’ve seen that takes it this far and shows it might actually work,” Li says.

The method could open up new possibilities in AI research and applications, especially in creating more useful AI agents, says Zihan Wang, a PhD candidate at Northwestern University. He believes that since conversations with AI are continuous, this approach could help models remember more and assist users more effectively.

The technique can also be used to produce more training data for AI models. Model developers are currently grappling with a severe shortage of quality text to train systems on. But the DeepSeek paper says that the company’s OCR system can generate over 200,000 pages of training data a day on a single GPU.

The model and paper, however, are only an early exploration of using image tokens rather than text tokens for AI memorization. Li says she hopes to see visual tokens applied not just to memory storage but also to reasoning. Future work, she says, should explore how to make AI’s memory fade in a more dynamic way, akin to how we can recall a life-changing moment from years ago but forget what we ate for lunch last week. Currently, even with DeepSeek’s methods, AI tends to forget and remember in a very linear way—recalling whatever was most recent, but not necessarily what was most important, she says. 

Despite its attempts to keep a low profile, DeepSeek, based in Hangzhou, China, has built a reputation for pushing the frontier in AI research. The company shocked the industry at the start of this year with the release of DeepSeek-R1, an open-source reasoning model that rivaled leading Western systems in performance despite using far fewer computing resources. 

AI toys are all the rage in China—and now they’re appearing on shelves in the US too

Kids have always played with and talked to stuffed animals. But now their toys can talk back, thanks to a wave of companies that are fitting children’s playthings with chatbots and voice assistants. 

It’s a trend that has particularly taken off in China: A recent report by the Shenzhen Toy Industry Association and JD.com predicts that the sector will surpass ¥100 billion ($14 billion) by 2030, growing faster than almost any other branch of consumer AI. According to the Chinese corporation registration database Qichamao, there are over 1,500 AI toy companies operating in China as of October 2025.

One of the latest entrants to the market is a toy called BubblePal, a device the size of a Ping-Pong ball that clips onto a child’s favorite stuffed animal and makes it “talk.” The gadget comes with a smartphone app that lets parents switch between 39 characters, from Disney’s Elsa to the Chinese cartoon classic Nezha. It costs $149, and 200,000 units have been sold since it launched last summer. It’s made by the Chinese company Haivivi and runs on DeepSeek’s large language models. 

Other companies are approaching the market differently. FoloToy, another Chinese startup, allows parents to customize a bear, bunny, or cactus toy by training it to speak with their own voice and speech pattern. FoloToy reported selling more than 20,000 of its AI-equipped plush toys in the first quarter of 2025, nearly equaling its total sales for 2024, and it projects sales of 300,000 units this year. 

But Chinese AI toy companies have their sights set beyond the nation’s borders. BubblePal was launched in the US in December 2024 and is now also available in Canada and the UK. And FoloToy is now sold in more than 10 countries, including the US, UK, Canada, Brazil, Germany, and Thailand. Rui Ma, a China tech analyst at AlphaWatch.AI, says that AI devices for children make particular sense in China, where there is already a well-established market for kid-focused educational electronics—a market that does not exist to the same extent globally. FoloToy’s CEO, Kong Miaomiao, told the Chinese outlet Baijing Chuhai that outside China, his firm is still just “reaching early adopters who are curious about AI.”

China’s AI toy boom builds on decades of consumer electronics designed specifically for children. As early as the 1990s, companies such as BBK popularized devices like electronic dictionaries and “study machines,” marketed to parents as educational aids. These toy-electronics hybrids read aloud, tell interactive stories, and simulate the role of a playmate.

The competition is heating up, however—US companies have also started to develop and sell AI toys. The musician Grimes helped to create Grok, a plush toy that chats with kids and adapts to their personality. Toy giant Mattel is working with OpenAI to bring conversational AI to brands like Barbie and Hot Wheels, with the first products expected to be announced later this year.

However, reviews from parents who’ve bought AI toys in China are mixed. Although many appreciate the fact they are screen-free and come with strict parental controls, some parents say their AI capabilities can be glitchy, leading children to tire of them easily. 

Penny Huang, based in Beijing, bought a BubblePal for her five-year-old daughter, who is cared for mostly by grandparents. Huang hoped that the toy could make her less lonely and reduce her constant requests to play with adults’ smartphones. But the novelty wore off quickly.

“The responses are too long and wordy. My daughter quickly loses patience,” says Huang, “It [the role-play] doesn’t feel immersive—just a voice that sometimes sounds out of place.” 

Another parent who uses BubblePal, Hongyi Li, found the voice recognition lagging: “Children’s speech is fragmented and unclear. The toy frequently interrupts my kid or misunderstands what she says. It also still requires pressing a button to interact, which can be hard for toddlers.” 

Huang recently listed her BubblePal for sale on Xianyu, a secondhand marketplace. “This is just like one of the many toys that my daughter plays for five minutes then gets tired of,” she says. “She wants to play with my phone more than anything else.”

❌