
Inside OpenAI’s big play for science 

26 January 2026 at 13:32

In the three years since ChatGPT’s explosive debut, OpenAI’s technology has upended a remarkable range of everyday activities at home, at work, in schools—anywhere people have a browser open or a phone out, which is everywhere.

Now OpenAI is making an explicit play for scientists. In October, the firm announced that it had launched a whole new team, called OpenAI for Science, dedicated to exploring how its large language models could help scientists and tweaking its tools to support them.

The last couple of months have seen a slew of social media posts and academic publications in which mathematicians, physicists, biologists, and others have described how LLMs (and OpenAI’s GPT-5 in particular) have helped them make a discovery or nudged them toward a solution they might otherwise have missed. In part, OpenAI for Science was set up to engage with this community.

And yet OpenAI is also late to the party. Google DeepMind, the rival firm behind groundbreaking scientific models such as AlphaFold and AlphaEvolve, has had an AI-for-science team for years. (When I spoke to Google DeepMind’s CEO and cofounder Demis Hassabis in 2023 about that team, he told me: “This is the reason I started DeepMind … In fact, it’s why I’ve worked my whole career in AI.”)

So why now? How does a push into science fit with OpenAI’s wider mission? And what exactly is the firm hoping to achieve?

I put these questions to Kevin Weil, a vice president at OpenAI who leads the new OpenAI for Science team, in an exclusive interview last week.

On mission

Weil is a product guy. He joined OpenAI a couple of years ago as chief product officer after being head of product at Twitter and Instagram. But he started out as a scientist. He got two-thirds of the way through a PhD in particle physics at Stanford University before ditching academia for the Silicon Valley dream. Weil is keen to highlight his pedigree: “I thought I was going to be a physics professor for the rest of my life,” he says. “I still read math books on vacation.”

Asked how OpenAI for Science fits with the firm’s existing lineup of white-collar productivity tools or the viral video app Sora, Weil recites the company mantra: “The mission of OpenAI is to try and build artificial general intelligence and, you know, make it beneficial for all of humanity.”

The impact on science of future versions of this technology could be amazing, he says: New medicines, new materials, new devices. “Think about it helping us understand the nature of reality, helping us think through open problems. Maybe the biggest, most positive impact we’re going to see from AGI will actually be from its ability to accelerate science.”

He adds, “With GPT-5, we saw that becoming possible.” 

As Weil tells it, LLMs are now good enough to be useful scientific collaborators, spitballing ideas, suggesting novel directions to explore, and finding fruitful parallels between a scientist’s question and obscure research papers published decades ago or in foreign languages.

That wasn’t the case a year or so ago. Since it announced its first reasoning model, o1, in December 2024, OpenAI has been pushing the envelope of what the technology can do. “You go back a few years and we were all collectively mind-blown that the models could get an 800 on the SAT,” says Weil.

But soon LLMs were acing math competitions and solving graduate-level physics problems. Last year, OpenAI and Google DeepMind both announced that their LLMs had achieved gold-medal-level performance at the International Mathematical Olympiad, one of the toughest math contests in the world. “These models are no longer just better than 90% of grad students,” says Weil. “They’re really at the frontier of human abilities.”

That’s a huge claim, and it comes with caveats. Still, there’s no doubt that GPT-5 is a big improvement on GPT-4 when it comes to complicated problem-solving. GPT-5 includes a so-called reasoning model, a type of LLM that can break down problems into multiple steps and work through them one by one. This technique has made LLMs far better at solving math and logic problems than they used to be. 

Measured against an industry benchmark known as GPQA, which includes more than 400 multiple-choice questions that test PhD-level knowledge in biology, physics, and chemistry, GPT-4 scores 39%, well below the human-expert baseline of around 70%. According to OpenAI, GPT-5.2 (the latest update to the model, released in December) scores 92%. 

Overhyped

The excitement is evident—and perhaps excessive. In October, senior figures at OpenAI, including Weil, boasted on X that GPT-5 had found solutions to several unsolved math problems. Mathematicians were quick to point out that in fact what GPT-5 appeared to have done was dig up existing solutions in old research papers, including at least one written in German. That was still useful, but it wasn’t the achievement OpenAI seemed to have claimed. Weil and his colleagues deleted their posts.

Now Weil is more careful. It is often enough to find answers that exist but have been forgotten, he says: “We collectively stand on the shoulders of giants, and if LLMs can kind of accumulate that knowledge so that we don’t spend time struggling on a problem that is already solved, that’s an acceleration all of its own.”

He plays down the idea that LLMs are about to come up with a game-changing new discovery. “I don’t think models are there yet,” he says. “Maybe they’ll get there. I’m optimistic that they will.”

But, he insists, that’s not the mission: “Our mission is to accelerate science. And I don’t think the bar for the acceleration of science is, like, Einstein-level reimagining of an entire field.”

For Weil, the question is this: “Does science actually happen faster because scientists plus models can do much more, and do it more quickly, than scientists alone? I think we’re already seeing that.”

In November, OpenAI published a series of anecdotal case studies contributed by scientists, both inside and outside the company, that illustrated how they had used GPT-5 and how it had helped. “Most of the cases were scientists that were already using GPT-5 directly in their research and had come to us one way or another saying, ‘Look at what I’m able to do with these tools,’” says Weil.

The key things that GPT-5 seems to be good at are finding references and connections to existing work that scientists were not aware of, which sometimes sparks new ideas; helping scientists sketch mathematical proofs; and suggesting ways for scientists to test hypotheses in the lab.  

“GPT-5.2 has read substantially every paper written in the last 30 years,” says Weil. “And it understands not just the field that a particular scientist is working in; it can bring together analogies from other, unrelated fields.”

“That’s incredibly powerful,” he continues. “You can always find a human collaborator in an adjacent field, but it’s difficult to find, you know, a thousand collaborators in all thousand adjacent fields that might matter. And in addition to that, I can work with the model late at night—it doesn’t sleep—and I can ask it 10 things in parallel, which is kind of awkward to do to a human.”

Solving problems

Most of the scientists OpenAI reached out to do back up Weil’s position.

Robert Scherrer, a professor of physics and astronomy at Vanderbilt University, had only played around with ChatGPT for fun (“I used it to rewrite the theme song for Gilligan’s Island in the style of Beowulf, which it did very well,” he tells me) until his Vanderbilt colleague Alex Lupsasca, a fellow physicist who now works at OpenAI, told him that GPT-5 had helped solve a problem he’d been working on.

Lupsasca gave Scherrer access to GPT-5 Pro, OpenAI’s $200-a-month premium subscription. “It managed to solve a problem that I and my graduate student could not solve despite working on it for several months,” says Scherrer.

It’s not perfect, he says: “GPT-5 still makes dumb mistakes. Of course, I do too, but the mistakes GPT-5 makes are even dumber.” And yet it keeps getting better, he says: “If current trends continue—and that’s a big if—I suspect that all scientists will be using LLMs soon.”

Derya Unutmaz, a professor of biology at the Jackson Laboratory, a nonprofit research institute, uses GPT-5 to brainstorm ideas, summarize papers, and plan experiments in his work studying the immune system. In the case study he shared with OpenAI, Unutmaz used GPT-5 to analyze an old data set that his team had previously looked at. The model came up with fresh insights and interpretations.  

“LLMs are already essential for scientists,” he says. “When you can complete analysis of data sets that used to take months, not using them is not an option anymore.”

Nikita Zhivotovskiy, a statistician at the University of California, Berkeley, says he has been using LLMs in his research since the first version of ChatGPT came out.

Like Scherrer, he finds LLMs most useful when they highlight unexpected connections between his own work and existing results he did not know about. “I believe that LLMs are becoming an essential technical tool for scientists, much like computers and the internet did before,” he says. “I expect a long-term disadvantage for those who do not use them.”

But he does not expect LLMs to make novel discoveries anytime soon. “I have seen very few genuinely fresh ideas or arguments that would be worth a publication on their own,” he says. “So far, they seem to mainly combine existing results, sometimes incorrectly, rather than produce genuinely new approaches.”

I also contacted a handful of scientists who are not connected to OpenAI.

Andy Cooper, a professor of chemistry at the University of Liverpool and director of the Leverhulme Research Centre for Functional Materials Design, is less enthusiastic. “We have not found, yet, that LLMs are fundamentally changing the way that science is done,” he says. “But our recent results suggest that they do have a place.”

Cooper is leading a project to develop a so-called AI scientist that can fully automate parts of the scientific workflow. He says that his team doesn’t use LLMs to come up with ideas. But the tech is starting to prove useful as part of a wider automated system where an LLM can help direct robots, for example.

“My guess is that LLMs might stick more in robotic workflows, at least initially, because I’m not sure that people are ready to be told what to do by an LLM,” says Cooper. “I’m certainly not.”

Making errors

LLMs may be becoming more and more useful, but caution is still advised. In December, Jonathan Oppenheim, a scientist who works on quantum mechanics, called out a mistake that made its way into a scientific journal. “OpenAI leadership are promoting a paper in Physics Letters B where GPT-5 proposed the main idea—possibly the first peer-reviewed paper where an LLM generated the core contribution,” Oppenheim posted on X. “One small problem: GPT-5’s idea tests the wrong thing.”

He continued: “GPT-5 was asked for a test that detects nonlinear theories. It provided a test that detects nonlocal ones. Related-sounding, but different. It’s like asking for a COVID test, and the LLM cheerfully hands you a test for chickenpox.”

It is clear that a lot of scientists are finding innovative and intuitive ways to engage with LLMs. It is also clear that the technology makes mistakes that can be so subtle even experts miss them.

Part of the problem is the way ChatGPT can flatter you into letting down your guard. As Oppenheim put it: “A core issue is that LLMs are being trained to validate the user, while science needs tools that challenge us.” In an extreme case, one individual (who was not a scientist) was persuaded by ChatGPT into thinking for months that he’d invented a new branch of mathematics.

Of course, Weil is well aware of the problem of hallucination. But he insists that newer models are hallucinating less and less. Even so, focusing on hallucination might be missing the point, he says.

“One of my teammates here, an ex math professor, said something that stuck with me,” says Weil. “He said: ‘When I’m doing research, if I’m bouncing ideas off a colleague, I’m wrong 90% of the time and that’s kind of the point. We’re both spitballing ideas and trying to find something that works.’”

“That’s actually a desirable place to be,” says Weil. “If you say enough wrong things and then somebody stumbles on a grain of truth and then the other person seizes on it and says, ‘Oh, yeah, that’s not quite right, but what if we—’ You gradually kind of find your trail through the woods.”

This is Weil’s core vision for OpenAI for Science. GPT-5 is good, but it is not an oracle. The value of this technology is in pointing people in new directions, not coming up with definitive answers, he says.

In fact, one of the things OpenAI is now looking at is making GPT-5 dial down its confidence when it delivers a response. Instead of saying Here’s the answer, it might tell scientists: Here’s something to consider.

“That’s actually something that we are spending a bunch of time on,” says Weil. “Trying to make sure that the model has some sort of epistemological humility.”

Another thing OpenAI is looking at is how to use GPT-5 to fact-check GPT-5. It’s often the case that if you feed one of GPT-5’s answers back into the model, it will pick it apart and highlight mistakes.

“You can kind of hook the model up as its own critic,” says Weil. “Then you can get a workflow where the model is thinking and then it goes to another model, and if that model finds things that it could improve, then it passes it back to the original model and says, ‘Hey, wait a minute—this part wasn’t right, but this part was interesting. Keep it.’ It’s almost like a couple of agents working together and you only see the output once it passes the critic.”
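
To make the workflow concrete, here is a rough sketch, in Python, of the kind of generate-then-critique loop Weil describes. The `ask_model` helper, the prompts, and the APPROVED convention are all hypothetical stand-ins for illustration; nothing here reflects OpenAI’s actual system.

```python
# A minimal sketch of a generate-then-critique loop of the kind Weil describes.
# ask_model() is a hypothetical stand-in for any chat-completion API call;
# nothing below reflects OpenAI's actual internal workflow.

def ask_model(prompt: str) -> str:
    """Placeholder: wire this to an LLM provider of your choice."""
    raise NotImplementedError

def answer_with_critic(question: str, max_rounds: int = 3) -> str:
    # First pass produces a draft answer.
    draft = ask_model(f"Answer this research question:\n{question}")
    for _ in range(max_rounds):
        # A second pass plays the skeptical reviewer.
        critique = ask_model(
            "Act as a critical reviewer. Point out errors or weak steps in the "
            f"answer below, or reply APPROVED if it holds up.\n\n{draft}"
        )
        if critique.strip().upper() == "APPROVED":
            break  # the critic found nothing to fix; release the answer
        # The original model revises, keeping what the critic liked.
        draft = ask_model(
            "Revise the answer to address the critique, keeping the parts "
            f"that were sound.\n\nQuestion: {question}\n\nAnswer: {draft}\n\n"
            f"Critique: {critique}"
        )
    return draft
```

In this sketch the user sees only the final draft, mirroring Weil’s point that the output is released once it passes the critic.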

What Weil is describing also sounds a lot like what Google DeepMind did with AlphaEvolve, a tool that wrapped the LLM Gemini inside a wider system that filtered out the good responses from the bad and fed them back in again to be improved on. Google DeepMind has used AlphaEvolve to solve several real-world problems.

OpenAI faces stiff competition from rival firms, whose own LLMs can do most, if not all, of the things it claims for its own models. If that’s the case, why should scientists use GPT-5 instead of Gemini or Anthropic’s Claude, families of models that are themselves improving every year? Ultimately, OpenAI for Science may be as much an effort to plant a flag in new territory as anything else. The real innovations are still to come.

“I think 2026 will be for science what 2025 was for software engineering,” says Weil. “At the beginning of 2025, if you were using AI to write most of your code, you were an early adopter. Whereas 12 months later, if you’re not using AI to write most of your code, you’re probably falling behind. We’re now seeing those same early flashes for science as we did for code.”

He continues: “I think that in a year, if you’re a scientist and you’re not heavily using AI, you’ll be missing an opportunity to increase the quality and pace of your thinking.”

Why chatbots are starting to check your age

26 January 2026 at 12:05

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

How do tech companies check if their users are kids?

This question has taken on new urgency recently thanks to growing concern about the dangers that can arise when children talk to AI chatbots. For years, Big Tech companies asked for birthdays (which one could make up) to avoid violating child privacy laws, but they weren’t required to moderate content accordingly. Two developments over the last week show how quickly things are changing in the US and how this issue is becoming a new battleground, even among parents and child-safety advocates.

In one corner is the Republican Party, which has supported laws passed in several states that require sites with adult content to verify users’ ages. Critics say this provides cover to block anything deemed “harmful to minors,” which could include sex education. Other states, like California, are coming after AI companies with laws to protect kids who talk to chatbots (by requiring them to verify who’s a kid). Meanwhile, President Trump is attempting to keep AI regulation a national issue rather than allowing states to make their own rules. Support for various bills in Congress is constantly in flux.

So what might happen? The debate is quickly moving away from whether age verification is necessary and toward who will be responsible for it. This responsibility is a hot potato that no company wants to hold.

In a blog post last Tuesday, OpenAI revealed that it plans to roll out automatic age prediction. In short, the company will apply a model that uses factors like the time of day, among others, to predict whether a person chatting is under 18. For those identified as teens or children, ChatGPT will apply filters to “reduce exposure” to content like graphic violence or sexual role-play. YouTube launched something similar last year. 

If you support age verification but are concerned about privacy, this might sound like a win. But there’s a catch. The system is not perfect, of course, so it could classify a child as an adult or vice versa. People who are wrongly labeled under 18 can verify their identity by submitting a selfie or government ID to a company called Persona. 

Selfie verifications have issues: They fail more often for people of color and those with certain disabilities. Sameer Hinduja, who co-directs the Cyberbullying Research Center, says the fact that Persona will need to hold millions of government IDs and masses of biometric data is another weak point. “When those get breached, we’ve exposed massive populations all at once,” he says. 

Hinduja instead advocates for device-level verification, where a parent specifies a child’s age when setting up the child’s phone for the first time. This information is then kept on the device and shared securely with apps and websites. 
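
To illustrate the idea, here is a rough sketch in Python of what such a device-level age signal might look like: a parent-set birth year that stays on the device, with apps able to query only a coarse age bracket. Every class and method name here is hypothetical; this is not a description of any real platform API.

```python
# Hypothetical sketch of device-level age verification: the parent stores the
# child's birth year once, and apps can query only a coarse bracket, never the
# raw birth date or any ID document. All names here are invented for illustration.

from datetime import date
from typing import Optional

class DeviceAgeStore:
    def __init__(self) -> None:
        self._birth_year: Optional[int] = None  # set once by a parent, kept on the device

    def set_birth_year(self, year: int) -> None:
        """Called by the parent during initial device setup."""
        self._birth_year = year

    def age_bracket(self) -> str:
        """What an app may query: a coarse bracket, not the birth date or an ID."""
        if self._birth_year is None:
            return "unknown"
        age = date.today().year - self._birth_year
        if age < 13:
            return "child"
        if age < 18:
            return "teen"
        return "adult"

# An app adjusts its content filters based on the bracket alone.
store = DeviceAgeStore()
store.set_birth_year(2013)
if store.age_bracket() in {"child", "teen"}:
    print("Applying stricter content filters for this session.")
```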

That’s more or less what Tim Cook, the CEO of Apple, recently lobbied US lawmakers to call for. Cook was fighting lawmakers who wanted to require app stores to verify ages, which would saddle Apple with lots of liability. 

More signals of where this is all headed will come on Wednesday, when the Federal Trade Commission—the agency that would be responsible for enforcing these new laws—is holding an all-day workshop on age verification. Apple’s head of government affairs, Nick Rossi, will be there. He’ll be joined by higher-ups in child safety at Google and Meta, as well as a company that specializes in marketing to children.

The FTC has become increasingly politicized under President Trump (his firing of the sole Democratic commissioner was struck down by a federal court, a decision that is now pending review by the US Supreme Court). In July, I wrote about signals that the agency is softening its stance toward AI companies. Indeed, in December, the FTC overturned a Biden-era ruling against an AI company that allowed people to flood the internet with fake product reviews, writing that it clashed with President Trump’s AI Action Plan.

Wednesday’s workshop may shed light on how partisan the FTC’s approach to age verification will be. Red states favor laws that require porn websites to verify ages (but critics warn this could be used to block a much wider range of content). Bethany Soye, a Republican state representative who is leading an effort to pass such a bill in her state of South Dakota, is scheduled to speak at the FTC meeting. The ACLU generally opposes laws requiring IDs to visit websites and has instead advocated for an expansion of existing parental controls.

While all this gets debated, though, AI has set the world of child safety on fire. We’re dealing with increased generation of child sexual abuse material, concerns (and lawsuits) about suicides and self-harm following chatbot conversations, and troubling evidence of kids’ forming attachments to AI companions. Colliding stances on privacy, politics, free expression, and surveillance will complicate any effort to find a solution. Write to me with your thoughts. 

The power of sound in a virtual world

In an era where business, education, and even casual conversations occur via screens, sound has become a differentiating factor. We obsess over lighting, camera angles, and virtual backgrounds, but how we sound can be just as critical to credibility, trust, and connection.

That’s the insight driving Erik Vaveris, vice president of product management and chief marketing officer at Shure, and Brian Scholl, director of the Perception & Cognition Laboratory at Yale University. Both see audio as more than a technical layer: It’s a human factor shaping how people perceive intelligence, trustworthiness, and authority in virtual settings.

“If you’re willing to take a little bit of time with your audio setup, you can really get across the full power of your message and the full power of who you are to your peers, to your employees, your boss, your suppliers, and of course, your customers,” says Vaveris.

Scholl’s research shows that poor audio quality can make a speaker seem less persuasive, less hireable, and even less credible.

“We know that [poor] sound doesn’t reflect the people themselves, but we really just can’t stop ourselves from having those impressions,” says Scholl. “We all understand intuitively that if we’re having difficulty being understood while we’re talking, then that’s bad. But we sort of think that as long as you can make out the words I’m saying, then that’s probably all fine. And this research showed in a somewhat surprising way, to a surprising degree, that this is not so.”

For organizations navigating hybrid work, training, and marketing, the stakes have become high.

Vaveris points out that the pandemic was a watershed moment for audio technology. As classrooms, boardrooms, and conferences shifted online almost overnight, demand accelerated for advanced noise suppression, echo cancellation, and AI-driven processing tools that make meetings more seamless. Today, machine learning algorithms can strip away keyboard clicks or reverberation and isolate a speaker’s voice in noisy environments. That clarity underpins the accuracy of AI meeting assistants that can step in to transcribe, summarize, and analyze discussions.

The implications are rippling across industries. Clearer audio levels the playing field for remote participants, enabling inclusive collaboration. It empowers executives and creators alike to produce broadcast-quality content from the comfort of their home office. And it offers companies new ways to build credibility with customers and employees without the costly overhead of traditional production.

Looking forward, the convergence of audio innovation and AI promises an even more dynamic landscape: from real-time captioning in your native language to audio filtering, to smarter meeting tools that capture not only what is said but how it’s said, and to technologies that disappear into the background while amplifying the human voice at the center.

“There’s a future out there where this technology can really be something that helps bring people together,” says Vaveris. “Now that we have so many years of history with the internet, we know there’s usually two sides to the coin of technology, but there’s definitely going to be a positive side to this, and I’m really looking forward to it.”

In a world increasingly mediated by screens, sound may prove to be the most powerful connector of all.

This episode of Business Lab is produced in partnership with Shure.

Full Transcript

Megan Tatum: From MIT Technology Review, I’m Megan Tatum, and this is Business Lab, the show that helps business leaders make sense of new technologies coming out of the lab and into the marketplace.

This episode is produced in partnership with Shure.

Our topic today is the power of sound. As our personal and professional lives become increasingly virtual, audio is emerging as an essential tool for everything from remote work to virtual conferences to virtual happy hour. While appearance is often top of mind in video conferencing and streaming, audio can be as or even more important, not only to effective communication, but potentially to brand equity for both the speaker and the company.

Two words for you: crystal clear.

My guests today are Erik Vaveris, VP of Product Management and Chief Marketing Officer at Shure, and Brian Scholl, Director of the Perception & Cognition Laboratory at Yale University.

Welcome, Erik and Brian.


Erik Vaveris: Thank you, Megan. And hello, Brian. Thrilled to be here today.

Brian Scholl: Good afternoon, everyone.

Megan: Fantastic. Thank you both so much for being here. Erik, let’s open with a bit of background. I imagine the pandemic changed the audio industry in some significant ways, given the pivot to our modern remote hybrid lifestyles. Could you talk a bit about that journey and some of the interesting audio advances that arose from that transformative shift?

Erik: Absolutely, Megan. That’s an interesting thing to think about now being here in 2025. And if you put yourself back in those moments in 2020, when things were fully shut down and everything was fully remote, the importance of audio quality became immediately obvious. As people adopted Zoom or Teams or platforms like that overnight, there were a lot of technical challenges that people experienced, but the importance of how they were presenting themselves to people via their audio quality was a bit less obvious. As Brian’s noted in a lot of the press that he’s received for his wonderful study, we know how we look on video. We can see ourselves back on the screen, but we don’t know how we sound to the people with whom we’re speaking.

If a meeting participant on the other side can manage to parse the words that you’re saying, they’re not likely to speak up and say, “Hey, I’m having a little bit of trouble hearing you.” They’ll just let the meeting continue. And if you don’t have a really strong level of audio quality, you’re asking the people that you’re talking with to devote way too much brainpower to just determining the words that you’re saying. And you’re going to be fatiguing to listen to. And your message won’t come across. In contrast, if you’re willing to take a little bit of time with your audio setup, you can really get across the full power of your message and the full power of who you are to your peers, to your employees, your boss, your suppliers, and of course your customers. Back in 2020, this very quickly became a marketing story that we had to tell immediately.

And I have to say, it’s so gratifying to see Brian’s research in the news because, to me, it was like, “Yes, this is what we’ve been experiencing. And this is what we’ve been trying to educate people about.” Having the real science to back it up means a lot. But from that, development on improvements to key audio processing algorithms accelerated across the whole AV industry.

I think, Megan and Brian, you probably remember hearing loud keyboard clicking when you were on calls and meetings, or people eating potato chips and things like that back on those calls. But you don’t hear that much today because most platforms have invested in AI-trained algorithms to remove undesirable noises. And I know we’re going to talk more about that later on.

But the other thing that happened, thankfully, as we got into the late spring and summer of 2020, was that educational institutions, especially universities, and also businesses realized that things were going to need to change quickly. Nothing was going to be the same. And universities realized that all classrooms were going to need hybrid capabilities for both remote students and students in the classroom. And that helped the market for professional AV equipment start to recover because we had been pretty much completely shut down in the earlier months. But that focus on hybrid meeting spaces of all types accelerated more investment and more R&D into making equipment and further developing those key audio processing algorithms for more and different types of spaces and use cases. And since then, we’ve really seen a proliferation of different types of unobtrusive audio capture devices based on arrays of microphones and the supporting signal processing behind them. And right now, machine-learning-trained signal processing is really the norm. And that all accelerated, unfortunately, because of the pandemic.

Megan: Yeah. Such an interesting period of change, as you say. And Brian, what did you observe and experience in academia during that time? How did that time period affect the work at your lab?

Brian: I’ll admit, Megan, I had never given a single thought to audio quality or anything like that, certainly until the pandemic hit. I was thrown into this, just like the rest of the world was. I don’t believe I’d ever had a single video conference with a student or with a class or anything like that before the pandemic hit. But in some ways, our experience in universities was quite extreme. I went on a Tuesday from teaching an in-person class with 300 students to being on Zoom with everyone suddenly on a Thursday. Business meetings come in all shapes and sizes. But this was quite extreme. This was a case where suddenly I’m talking to hundreds and hundreds of people over Zoom. And every single one of them knows exactly what I sound like, except for me, because I’m just speaking my normal voice and I have no idea how it’s being translated through all the different levels of technology.

I will say, part of the general rhetoric we have about the pandemic focuses on all the negatives and the lack of personal connection and nuance and the fact that we can’t see how everyone’s paying attention to each other. Our experience was a bit more mixed. I’ll just tell you one anecdote. Shortly after the pandemic started, I started teaching a seminar with about 20 students. And of course, this was still online. What I did is I just invited, for whatever topic we were discussing on any given day, I sent a note to whoever was the clear world leader in the study of whatever that topic was. I said, “Hey, don’t prepare a talk. You don’t have to answer any questions. But just come join us on Zoom and just participate in the conversation. The students will have read some of your work.”

Every single one of them said, “Let me check my schedule. Oh, I’m stuck at home for a year. Sure. I’d be happy to do that.” And that was quite a positive. The students got to meet a who’s who of cognitive science from this experience. And it’s true that there were all these technological difficulties, but that would never, ever have happened if we were teaching the class in real life. That would’ve just been way too much travel and airfare and hotel and scheduling and all of that. So, it was a mixed bag for us.

Megan: That’s fascinating.

Erik: Yeah. Megan, can I add?

Megan: Of course.

Erik: That is really interesting. And that’s such a cool idea. And it’s so wonderful that that worked out. I would say that working for a global company, we like to think that, “Oh, we’re all together. And we’re having these meetings. And we’re in the same room,” but the reality was we weren’t in the same room. And there hadn’t been enough attention paid to the people who were conferencing in speaking not their native language in a different time zone, maybe pretty deep into the evening, in some cases. And the remote work that everybody got thrown into immediately at the start of the pandemic did force everybody to start to think more about those types of interactions and put everybody on a level playing field.

And that was insightful. And that helped some people have stronger voices in the work that we were doing than they maybe did before. And it’s also led businesses really across the board, there’s a lot written about this, to be much more focused on making sure that participants from those who may be remote at home, may be in the office, may be in different offices, may be in different time zones, are all able to participate and collaborate on really a level playing field. And that is a positive. That’s a good thing.

Megan: Yeah. There are absolutely some positive side effects there, aren’t there? And it inspired you, Brian, to look at this more closely. And you’ve done a study that shows poor audio quality can actually affect the perception of listeners. So, I wonder what prompted the study, in particular. And what kinds of data did you gather? What methodology did you use?

Brian: Yeah. The motivation for this study was actually a real-world experience, just like we’ve been talking about. In addition to all of our classes moving online with no notice whatsoever, the same thing was true of our departmental faculty meetings. Very early on in the pandemic, we had one of these meetings. And we were talking about some contentious issue about hiring or whatever. And two of my colleagues, who I’d known very well and for many, many years, spoke up to offer their opinions. And one of these colleagues is someone who I’m very close with. We almost always see eye to eye. He was actually a former graduate student of mine once upon a time. And we almost always see eye to eye on things. He happened to be participating in that meeting from an old not-so-hot laptop. His audio quality had that sort of familiar tinny quality that we’re all familiar with. I could totally understand everything he was saying, but I found myself just being a little skeptical.

I didn’t find his points so compelling as usual. Meanwhile, I had another colleague, someone who I deeply respect, I’ve collaborated with, but we don’t always see eye to eye on these things. And he was participating in this first virtual faculty meeting from his home recording studio. Erik, I don’t know if his equipment would be up to your level or not, but he sounded better than real life. He sounded like he was all around us. And I found myself just sort of naturally agreeing with his points, which sort of was notable and a little surprising in that context. And so, we turned this into a study.

We played people a number of short audio clips, maybe like 30 seconds or so. And we had these being played in the context of very familiar situations and decisions. One of them might be like a hiring decision. You would have to listen to this person telling you why they think they might be a good fit for your job. And then afterwards, you had to make a simple judgment. It might be of a trait. How intelligent did that person seem? Or it might be a real-world decision like, “Hey, based on this, how likely would you be to pursue trying to hire them?” And critically, we had people listen to exactly the same sort of scripts, but with a little bit of work behind the scenes to affect the audio quality. In one case, the audio sounded crisp and clear. Recorded with a decent microphone. And here’s what it sounded like.

Audio Clip: After eight years in sales, I’m currently seeking a new challenge which will utilize my meticulous attention to detail and friendly professional manner. I’m an excellent fit for your company and will be an asset to your team as a senior sales manager.

Brian: Okay. Whatever you think of the content of that message, at least it’s nice and clear. Other subjects listened to exactly the same recording. But again, it had that sort of tinny quality that we’re all familiar with when people’s voices are filtered through a microphone or a recording setup that’s not so hot. That sounded like this.

Audio Clip: After eight years in sales, I’m currently seeking a new challenge which will utilize my meticulous attention to detail and friendly professional manner. I’m an excellent fit for your company and will be an asset to your team as a senior sales manager.

Brian: All right. Now, the thing that I hope you can get from that recording there is that although it clearly has this what we would call, as a technical term, a disfluent sound, it’s just a little harder to process, you are ultimately successful, right? Megan, Erik, you were able to understand the words in that second recording.

Megan: Yeah.

Erik: Mm-hmm.

Brian: And we made sure this was true for all of our subjects. We had them do word-for-word transcription after they made these judgments. And I’ll also just point out that this kind of manipulation clearly can’t be about the person themselves, right? You couldn’t make your voices sound like that in real world conversation if you tried. Voices just don’t do those sorts of things. Nevertheless, in a way that sort of didn’t make sense, that was kind of irrational because this couldn’t reflect the person, this affected all sorts of judgments about people.

So, people were judged to be about 8% less hirable. They were judged to be about 8% less intelligent. We also did this in other contexts. We did this in the context of dateability as if you were listening to a little audio clip from someone who was maybe interested in dating you, and then you had to make a judgment of how likely would you be to date this person. Same exact result. People were a little less datable when their audio was a little more tinny, even though they were completely understandable.

The experiment, the result that I thought was in some ways most striking is one of the clips was about someone who had been in a car accident. It was a little narrative about what had happened in the car accident. And they were talking as if to the insurance agent. They were saying, “Hey, it wasn’t my fault. This is what happened.” And afterwards, we simply had people make a natural intuitive judgment of how credible do you think the person’s story was. And when it was recorded with high-end audio, these messages were judged to be about 8% more credible in this context. So those are our experiments. What it shows really is something about the power of perception. We know that that sort of sound doesn’t reflect the people themselves, but we really just can’t stop ourselves from having those impressions made. And I don’t know about you guys, but, Erik, I think you’re right, that we all understand intuitively that if we’re having difficulty being understood while we’re talking, then that’s bad. But we sort of think that as long as you can make out the words I’m saying, then that’s probably all fine. And this research showed in a somewhat surprising way to a surprising degree that this is not so.

Megan: It’s absolutely fascinating.

Erik: Wow.

Megan: From an industry perspective, Erik, what are your thoughts on those study results? Did it surprise you as well?

Erik: No, like I said, I found it very, very gratifying because we invest a lot in trying to make sure that people understand the importance of quality audio, but we kind of come about that intuitively. Our entire company is audio people. So of course, we think that. And it’s our mission to help other people achieve those higher levels of audio in everything that they do, whether you’re a minister at a church or you’re teaching a class or you’re performing on stage. When I first saw the news about Brian’s study, I think it was the NPR article that just came up in one of my feeds. I read it and it made me feel like my life’s work has been validated to some extent. I wouldn’t say we were surprised by it, but it made a lot of sense to us. Let’s put it that way.

Megan: And how-

Brian: This is what we’re hearing. Oh, sorry. Megan, I was going to say this is what we’re hearing from a lot of the audio professionals as they’re saying, “Hey, you scientists, you finally caught up to us.” But of course-

Erik: I wouldn’t say it that way, Brian.

Brian: Erik, you’re in an unusual circumstance because you guys think about audio every day. When we’re on Zoom, look, I can see the little rectangle as well as you can. I can see exactly what I look like. I can check the lighting. I check my hair. We all do that every day. But I would say most people really, they use whatever microphone came with their setup, and never give a second thought to what they sound like because they don’t know what they sound like.

Megan: Yeah. Absolutely.

Erik: Absolutely.

Megan: Avoid listening to yourself back as well. I think that’s common. We don’t scrutinize audio as much as we should. I wonder, Erik, since the study came out, how are you seeing that research play out across industry? Can you talk a bit about the importance of strong, clear audio in today’s virtual world and the challenges that companies and employees are facing as well?

Erik: Yeah. Sure, Megan. That’s a great question. And studies kind of back this up, businesses understand that collaboration is the key to many things that we do. They know that that’s critical. And they are investing in making the experiences for the people at work better because of that knowledge, that intuitive understanding. But there are challenges. It can be expensive. You need solutions that people who are going to walk into a room or join a meeting on their personal device, that they’re motivated to use and that they can use because they’re simple. You also have to overcome the barriers to investment. We in the AV industry have had to look a lot at how can we bring down the overall cost of ownership of setting up AV technology because, as we’ve seen, the prices of everything that goes into making a product are not coming down.

Simplifying deployment and management is critical. Beyond just audio technology, IoT technology and cloud technology for IT teams to be able to easily deploy and manage classrooms across an entire university campus or conference rooms across a global enterprise are really, really critical. And those are quickly evolving. And integrations with more standard common IT tools are coming out. And that’s one area. Another thing is just for the end user, having the same user interface in each conference room that is familiar to everyone from their personal devices is also important. For many, many years, a lot of people had the experience where, “Hey, it’s time we’re going to actually do a conference meeting.” And you might have a few rooms in your company or in your office area that could do that. And you walk into the meeting room. And how long does it take you to actually get connected to the people you’re going to talk with?

There was always a joke that you’d have to spend the first 15 minutes of a meeting working all of that out. And that’s because the technology was fragmented and you had to do a lot of custom work to make that happen. But these days, I would say platforms like Zoom and Teams and Google and others are doing a really great job with this. If you have the latest and greatest in your meeting rooms and you know how to join from your own personal device, it’s basically the same experience. And that is streamlining the process for everyone. Bringing down the costs of owning it so that companies can get to those benefits to collaboration is kind of the key.

Megan: I was going to ask if we could dive a little deeper into that kind of audio quality, the technological advancements that AI has made possible, which you did touch on slightly there, Erik. What are the most significant advancements, in your view? And how are those impacting the ways we use audio and the things we can do with it?

Erik: Okay. Let me try to break that down into-

Megan: That’s a big question. Sorry.

Erik: … a couple different sections. Yeah. No, and one that’s just so exciting. Machine-learning-based digital signal processing, or DSP, is here and is the norm now. If you think about the beginning of telephones and teleconferencing, just going way back, one of the initial problems you had whenever you tried to get something out of a dedicated handset onto a table was echo. And I’m sure we’ve all heard that at some point in our life. You need to have a way to cancel echo. But by the way, you also want people to be able to speak at the same time on both ends of a call. You get to some of those very rudimentary things. Machine learning is really supercharging those algorithms to provide better performance with fewer trade-offs, fewer artifacts in the actual audio signal.

Noise reduction has come a long way. I mentioned earlier on, keyboard sounds and the sounds of people eating, and how you just don’t hear that anymore, at least I don’t when I’m on conference calls. But only a few years ago, that could be a major problem. The machine-learning-trained digital signal processing is in the market now and it’s doing a better job than ever in removing things that you don’t want from your sound. We have a new de-reverberation algorithm, so if you have a reverberant room with echoes and reflections that’s getting into the audio signal, that can degrade the experience there. We can remove that now. Another thing, the flip side of that is that there’s also a focus on isolating the sound that you do want and the signal that you do want.

Microsoft has rolled out a voice print feature in Teams that allows you, if you’re willing, to provide them with a sample of your voice. And then whenever you’re talking from your device, it will take out anything else that the microphone may be picking up so that even if you’re in a really noisy environment outdoors or, say, in an airport, the people that you’re speaking with are going to hear you and only you. And it’s pretty amazing as well. So those are some of the things that are happening today and are available today.

Another thing that’s emerged from all of this is we’ve been talking about how important audio quality is to the people participating in a discussion, the people speaking, the people listening, how everyone is perceived, but a new consumer, if you will, of audio in a discussion or a meeting has emerged, and that is in the form of the AI agent that can summarize meetings and create action plans, do those sorts of things. But for it to work, a clean transcription of what was said is already table stakes. It can’t be garbled. It can’t miss key things. It needs to get it word for word, sentence for sentence throughout the entire meeting. And the ability to attribute who said what to the meeting participants, even if they’re all in the same room, is quickly upon us. And the ability to detect and integrate sentiment and emotion of the participants is going to become very important as well for us to really get the full value out of those kinds of AI agents.

So audio quality is as important as ever for humans, as Brian notes, in some ways more important because this is now the normal way that we talk and meet, but it’s also critical for AI agents to work properly. And it’s different, right? It’s a different set of considerations. And there’s a lot of emerging thought and work that’s going into that as well. And boy, Megan, there’s so much more we could say about this beyond meetings and video conferences. AI tools to simplify the production process. And of course, there’s generative AI of music content. I know that’s beyond the scope of what we’re talking about. But it’s really pretty incredible when you look around at the work that’s happening and the capabilities that are emerging.

Megan: Yeah. Absolutely. Sounds like there are so many elements to consider and work going on. It’s all fascinating. Brian, what kinds of emerging capabilities and use cases around AI and audio quality are you seeing in your lab as well?

Brian: Yeah. Well, I’m sorry that Brian himself was not able to be here today, but I’m an AI agent.

Megan: You got me for a second there.

Brian: Just kidding. The fascinating thing that we’re seeing from the lab, from the study of people’s impressions is that all of this technology that Erik has described, when it works best, it’s completely invisible. Erik, I loved your point about not hearing potato chips being eaten or rain in the background or something like that. You’re totally right. I used to notice that all the time. I don’t think I’ve noticed that recently, but I also didn’t notice that I haven’t noticed that recently, right? It just kind of disappears. The interesting thing about these perceptual impressions, we’re constantly drawing intuitive conclusions about people based on how they sound. And that might be a good thing or a bad thing when we’re judging things like trustworthiness, for example, on the basis of a short audio clip.

But clearly, some of these things are valid, right? We can judge the size of someone or even of an animal based on how they sound, right? A chihuahua can’t make the sound of a lion. A lion can’t make the sound of a chihuahua. And that’s always been true because we’re producing audio signals that go right into each other’s ears. And now, of course, everything that Erik is talking about, that’s not true. It goes through all of these different layers of technology increasingly fueled by AI. But when that technology works the best way, it’s as if it isn’t there at all and we’re just hearing each other directly.

Erik: That’s the goal, right? That it’s seamless open communication and we don’t have to think about the technology anymore.

Brian: It’s a tough business to be in, I think, though, Erik, because people have to know what’s going on behind the surface in order to value it. Otherwise, we just expect it to work.

Erik: Well, that’s why we try to put the logo of our products on the side of them so they show up in the videos. But yeah, it’s a good point.

Brian: Very good. Very good.

Erik: Yeah.

Megan: And we’ve talked about virtual meetings and conversations quite a bit, but there’s also streamed and recorded content, which are increasingly important at work as well. I wondered, Erik, if you could talk a bit about how businesses are leveraging audio in new ways for things like marketing campaigns and internal upskilling and training and areas like that?

Erik: Yeah. Well, one of the things I think we’ve all seen in marketing is that not everything is a high production value commercial anymore. And there’s still a place for that, for sure. But people tend to trust influencers that they follow. People search on TikTok, on YouTube for topics. Those can be the place that they start. And as technology’s gotten more accessible, not just audio, but of course, the video technology too, content creators can produce satisfying content on their own or with just a couple of people with them. And Brian’s study shows that it doesn’t really matter what the origins of the content are for it to be compelling.

For the person delivering the message to be compelling, the audio quality does have to hit a certain level. But because the tools are simpler to use and you need less things to connect and pull together a decent production system, creator-driven content is becoming even more and more integral to a marketing campaign. And so not just what they maybe post on their Instagram page or post on LinkedIn, for example, but us as a brand being able to take that content and use that actually in paid media and things like that is all entirely possible because of the overall quality of the content. So that’s something that’s been a trend that’s been in process really, I would say, maybe since the advent of podcasts. But it’s been an evolution. And it’s come a long, long way.

Another thing, and this is really interesting, and this hits home personally, but I remember when I first entered the workforce, and I hope I’m not showing my age too badly here, but I remember the word processing department. And you would write down on a piece of paper, like a memo, and you would give it to the word processing department and somebody would type it up for you. That was a thing. And these days, we’re seeing actually more and more video production with audio, of course, transfer to the actual producers of the content.

In my company, at Shure, we make videos for different purposes to talk about different initiatives or product launches or things that we’re doing just for internal use. And right now, everybody, including our CEO, she makes these videos just at her own desk. She has a little software tool and she can show a PowerPoint and herself and speak to things. And with very, very limited amount of editing, you can put that out there. And I’ve seen friends and colleagues at other companies in very high-level roles just kind of doing their own production. Being able to buy a very high quality microphone with really advanced signal processing built right in, but just plug it in via USB and have it be handled as simply as any consumer device, has made it possible to do really very useful production where you are going to actually sound good and get your message across, but without having to make such a big production out of it, which is kind of cool.

Megan: Yeah. Really democratizes access to sort of creating high quality content, doesn’t it? And of course, no technology discussion is complete without a mention of return on investment, particularly nowadays. Erik, what are some ways companies can get returns on their audio tech investments as well? Where are the most common places you see cost savings?

Erik: Yeah. Well, we collaborated on a study with IDC Research. And they came up with some really interesting findings on this. And one of them was, no surprise, two-thirds or more of companies have taken action on improving their communication and collaboration technology, and even more have additional or initial investments still planned. But the ROI of those initiatives isn’t really tied to the initiative itself. It’s not like when you come out with a new product, you look at how that product performs, and that’s the driver of your ROI. The benefits of smoother collaboration come in the form of shorter meetings, more productive meetings, better decision-making, faster decision-making, stronger teamwork. And so to build an ROI model, what IDC concluded was that you have to build your model to account for those advantages really across the enterprise or across your university, or whatever it may be, and kind of up and down the different set of activities where they’re actually going to be utilized.

So that can be complex. Quantifying things can always be a challenge. But like I said, companies do seem to understand this. And I think that’s because, this is just my hunch, but because everybody, including the CEO and the CFO and the whole finance department, uses and benefits from collaboration technology too. Perhaps that’s one reason why the value is easier to convey. Even if they have not taken the time to articulate things like we’re doing here today, they know when a meeting is good and when it’s not good. And maybe that’s one of the things that’s helping companies to justify these investments. But it’s always tricky to do ROI on projects like that. But again, focusing on the broader benefits of collaboration and breaking it down into what it means for specific activities and types of meetings, I think, is the way to go about doing that.

Megan: Absolutely. And Brian, what kinds of advancements are you seeing in the lab that perhaps one day might contribute to those cost savings?

Brian: Well, I don’t know anything about cost savings, Megan. I’m a college professor. I live a pure life of the mind.

Megan: Of course.

Brian: ROI does not compute for me. No, I would say we are in an extremely exciting frontier right now because of AI and many different technologies. The studies that we talked about earlier, in one sense, they were broad. We explored many different traits from dating to hiring to credibility. And we isolated them in all sorts of ways we didn’t talk about. We showed that it wasn’t due to overall affect or pessimism or something like that. But in those studies, we really only tested one very particular set of dimensions along which an audio signal can vary, which is some sort of model of clarity. But in reality, the audio signal is so multi-dimensional. And as we’re getting more and more tools these days, we can not only change audio along the lines of clarity, as we’ve been talking about, but we can potentially manipulate it in all sorts of ways.

We’re very interested in pushing these studies forward and in exploring how people’s sort of brute impressions that they make are affected by all sorts of things. Meg and Erik, we walk around the world all the time making these judgments about people, right? You meet someone and you’re like, “Wow, I could really be friends with them. They seem like a great person.” And you know that you’re making that judgment, but you have no idea why, right? It just seems kind of intuitive. Well, in an audio signal, when you’re talking to someone, you can think of, “What if their signal is more bass heavy? What if it’s a little more treble heavy? What if we manipulate it in this way? In that way?”

When we talked about the faculty meeting that motivated this whole research program, I mentioned that my colleague, who was speaking from his home recording studio, didn’t just sound as clear as he does in real life. He sounded better than in real life. He sounded like he was all around us. What is the implication of that? I think there are so many different dimensions of an audio signal that we’re only now able to readily control and manipulate, and it’s going to be very exciting to see how all of these sorts of things impact our impressions of each other.

Megan: And there may be some overlap with this as well, but I wondered if we could close with a future-forward look, Brian. What are you looking forward to in emerging audio technology? What are some exciting opportunities on the horizon, perhaps related to what you were just talking about there?


Brian: Well, we’re interested in studying this from a scientific perspective. Erik, you talked about how things were when you started. When I started doing this science, we didn’t have a word processing department. We had a stone tablet department. But I hear tell that the current generation, when they send photos back and forth to each other, as a matter of course they apply all sorts of filters-

Erik: Oh, yes.

Brian: … to those video signals, or just photographic signals. We’re all familiar with that. That hasn’t quite happened with audio signals yet, but I think that’s coming as well. You can imagine recording yourself saying a little message and then filtering it this way or that way. And that’s going to become the Wild West in terms of the kinds of impressions we make on each other, especially if and when you don’t know that those filters have been operating in the first place.
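For the technically curious, here is a minimal sketch of the kind of voice re-filtering Scholl describes, using SciPy to push a recorded message toward a bass-heavy or treble-heavy version of itself. The file name, cutoff frequencies, and gain values are invented placeholders, not settings discussed in the episode.

```python
# Illustrative only: "message.wav" and all filter settings are hypothetical.
# Assumes a mono 16-bit PCM recording.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfilt

rate, audio = wavfile.read("message.wav")
audio = audio.astype(np.float64)

def shelve(signal, cutoff_hz, gain_db, rate, kind="low"):
    """Boost (or cut) everything below/above cutoff_hz by gain_db."""
    sos = butter(2, cutoff_hz, btype="lowpass" if kind == "low" else "highpass",
                 fs=rate, output="sos")
    band = sosfilt(sos, signal)            # isolate the shelved portion
    gain = 10 ** (gain_db / 20)
    return signal + (gain - 1) * band      # mix the scaled band back in

bass_heavy   = shelve(audio, 250,  +6, rate, kind="low")    # warmer, fuller voice
treble_heavy = shelve(audio, 3000, +6, rate, kind="high")   # brighter, crisper voice

for name, out in [("bass_heavy.wav", bass_heavy), ("treble_heavy.wav", treble_heavy)]:
    out = np.clip(out, -32768, 32767).astype(np.int16)
    wavfile.write(name, rate, out)
```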


Megan: That’s so interesting. Erik, what are you looking forward to in audio technology as well?


Erik: Well, I’m still thinking about what Brian said.


Megan: Yeah. That’s-


Erik: That’s very interesting.


Megan: It’s terrifying.


Erik: I have to go back again. I’ll go back to the past, maybe 15 to 20 years. And I remember at work, we had meeting rooms with the starfish phones in the middle of the table. And I remember that we would have international meetings with the partners who were selling our products in different countries, including Japan and China, as well as the people in our own company in those countries. We knew the time zone difference was bad. And we knew that English wasn’t their native language, and we tried to be as courteous as possible with written materials and things like that. But I went over to China, and I had to actually be on the other end of one of those calls. And I’m a native English speaker, or at least a native speaker of the Chicago dialect of American English. And really understanding how challenging it was for them to participate in those meetings just hit me right between the eyes.

We’ve come so far, which is wonderful. But I think of a scenario, and this is not far off, there are many companies working on this right now, where not only can you get real-time captioning in your native language, no matter what language the participant is speaking, but you can actually hear the speaker’s voice manipulated into your native language.

I’m never going to be a fluent Japanese or Chinese speaker, that’s for sure. But I love the thought that I could actually talk with people and they could understand me as though I were speaking their native language, and that they could communicate with me and I could understand them in the way that they want to be understood. I think there’s a future out there where this technology can really be something that helps bring people together. Now that we have so many years of history with the internet, we know there are usually two sides to the coin with any technology, but there’s definitely going to be a positive side to this, and I’m really looking forward to it.
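The pipeline Vaveris describes breaks down into three steps: speech recognition, translation, and synthesis in the original speaker’s voice. The skeleton below sketches that shape only; every function is a placeholder stub with a hard-coded return value, the names and example strings are invented, and no real speech or translation API is implied.

```python
# Hypothetical sketch of a real-time translation relay. All functions are stubs.

def transcribe(audio_chunk: bytes, source_lang: str) -> str:
    """Placeholder for a streaming speech-to-text step."""
    return "Hello, can you walk me through the Q3 forecast?"

def translate(text: str, source_lang: str, target_lang: str) -> str:
    """Placeholder for machine translation."""
    return "こんにちは、第3四半期の予測を説明していただけますか？"

def synthesize(text: str, voice_profile: str) -> bytes:
    """Placeholder for speech synthesis in the original speaker's voice."""
    return b"..."

def relay(audio_chunk: bytes, source_lang: str,
          listener_lang: str, voice_profile: str) -> tuple[str, bytes]:
    """One hop of the pipeline: returns a caption and dubbed audio for the listener."""
    text = transcribe(audio_chunk, source_lang)
    caption = translate(text, source_lang, listener_lang)
    dubbed = synthesize(caption, voice_profile)
    return caption, dubbed

caption, dubbed_audio = relay(b"...", source_lang="en",
                              listener_lang="ja", voice_profile="speaker-1")
print(caption)
```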

Megan: Gosh, that sounds absolutely fascinating. Thank you both so much for such an interesting discussion.

That was Erik Vaveris, the VP of product management and chief marketing officer at Shure, and Brian Scholl, director of the Perception & Cognition Laboratory at Yale University, whom I spoke with from Brighton in England.

That’s it for this episode of Business Lab. I’m your host, Megan Tatum. I’m a contributing editor at Insights, the custom publishing division of MIT Technology Review. We were founded in 1899 at the Massachusetts Institute of Technology. And you can find us in print, on the web, and at events each year around the world. For more information about us and the show, please check out our website at technologyreview.com.

This show is available wherever you get your podcasts. If you enjoyed this episode, we hope you’ll take a moment to rate and review us. Business Lab is a production of MIT Technology Review. And this episode was produced by Giro Studios. Thanks for listening.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff. It was researched, designed, and written entirely by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

The Download: why LLMs are like aliens, and the future of head transplants

26 January 2026 at 08:10

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Meet the new biologists treating LLMs like aliens  

How large is a large language model? We now coexist with machines so vast and so complicated that nobody quite understands what they are, how they work, or what they can really do—not even the people who build them.

That’s a problem. Even though nobody fully understands how it works—and thus exactly what its limitations might be—hundreds of millions of people now use this technology every day. 

To help overcome our ignorance, researchers are studying LLMs as if they were doing biology or neuroscience on vast living creatures—city-size xenomorphs that have appeared in our midst. And they’re discovering that large language models are even weirder than they thought. Read the full story.

—Will Douglas Heaven

This is our latest story to be turned into an MIT Technology Review Narrated podcast, which we publish each week on Spotify and Apple Podcasts. Just navigate to MIT Technology Review Narrated on either platform, and follow us to get all our new content as it’s released.

And mechanistic interpretability, the technique these researchers are using to try and understand AI models, is one of our 10 Breakthrough Technologies for 2026. Check out the rest of the list here!

Job titles of the future: Head-transplant surgeon

The Italian neurosurgeon Sergio Canavero has been preparing for a surgery that might never happen. His idea? Swap a sick person’s head—or perhaps just the brain—onto a younger, healthier body.

Canavero caused a stir in 2017 when he announced that a team he advised in China had exchanged heads between two corpses. But he never convinced skeptics that his technique could succeed—or persuaded them to believe his claim that a procedure on a live person was imminent.

Canavero may have withdrawn from the spotlight, but the idea of head transplants isn’t going away. Instead, he says, the concept has recently been getting a fresh look from life-extension enthusiasts and stealth Silicon Valley startups. Read the full story.

—Antonio Regalado

This story is from the latest print issue of MIT Technology Review magazine, which is all about exciting innovations. If you haven’t already, subscribe now to receive future issues once they land.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Big Tech is facing multiple high-profile social media addiction lawsuits 
Meta, TikTok and YouTube will face parents’ accusations in court this week. (WP $)
+ It’s the first time they’re defending against these claims before a jury in a court of law. (CNN)

2 Power prices are surging in the world’s largest data center hub
Virginia is struggling to meet record demand during a winter storm, partly because of data centers’ electricity use. (Reuters)
+ Why these kinds of violent storms are getting harder to forecast. (Vox)
+ AI is changing the grid. Could it help more than it harms? (MIT Technology Review)

3 TikTok has started collecting even more data on its users
Including precise information about their location. (Wired $)

4 ICE-watching groups are successfully fighting DHS efforts to unmask them
An anonymous account holder sued to block ICE from identifying them—and won. (Ars Technica)

5 A new wave of AI companies want to use AI to make AI better
The AI ouroboros is never-ending. (NYT $)
+ Is AI really capable of making bona fide scientific advancements? (Undark)
+ AI trained on AI garbage spits out AI garbage. (MIT Technology Review)

6 Iran is testing a two-tier internet
Meaning its current blackout could become permanent. (Rest of World)

7 Don’t believe the humanoid robot hype
Even a leading robot maker admits that at best, they’re only half as efficient as humans. (FT $)
+ Tesla wants to put its Optimus bipedal machine to work in its Austin factory. (Insider)
+ Why the humanoid workforce is running late. (MIT Technology Review)

8 AI is changing how manufacturers create new products
Including thinner chewing gum containers and new body wash odors. (WSJ $)
+ AI could make better beer. Here’s how. (MIT Technology Review)

9 New Jersey has had enough of e-bikes 🚲
But will other US states follow its lead? (The Verge)

10 Sci-fi writers are cracking down on AI
Human-produced works only, please. (TechCrunch)
+ San Diego Comic-Con was previously a safe space for AI-generated art. (404 Media)
+ Generative AI is reshaping South Korea’s webcomics industry. (MIT Technology Review)

Quote of the day

“Choosing American digital technology by default is too easy and must stop.”

—Nicolas Dufourcq, head of French state-owned investment bank Bpifrance, makes his case for why Big European companies should use European-made software as tensions with the US rise, the Wall Street Journal reports.

One more thing

The return of pneumatic tubes

Pneumatic tubes were once touted as something that would revolutionize the world. In science fiction, they were envisioned as a fundamental part of the future—even in dystopias like George Orwell’s 1984, where they help to deliver orders for the main character, Winston Smith, in his job rewriting history to fit the ruling party’s changing narrative.

In real life, the tubes were expected to transform several industries in the late 19th century through the mid-20th. For a while, the United States took up the systems with gusto.

But by the mid to late 20th century, use of the technology had largely fallen by the wayside, and pneumatic tubes became virtually obsolete. Except in hospitals. Read the full story.

—Vanessa Armstrong

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ You really can’t beat the humble jacket potato for a cheap, comforting meal. 
+ These tips might help you whenever anxiety strikes. ($)
+ There are some amazing photos in this year’s Capturing Ecology awards.
+ You can benefit from meditation any time, anywhere. Give it a go!

The Download: chatbots for health, and US fights over AI regulation

23 January 2026 at 08:07

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.


“Dr. Google” had its issues. Can ChatGPT Health do better?  

For the past two decades, there’s been a clear first step for anyone who starts experiencing new medical symptoms: Look them up online. The practice was so common that it gained the pejorative moniker “Dr. Google.” But times are changing, and many medical-information seekers are now using LLMs. According to OpenAI, 230 million people ask ChatGPT health-related queries each week.  

That’s the context around the launch of OpenAI’s new ChatGPT Health product, which debuted earlier this month. The big question is: Can the obvious risks of using AI for health-related queries be mitigated enough for them to be a net benefit? Read the full story.

—Grace Huckins

America’s coming war over AI regulation  

In the final weeks of 2025, the battle over regulating artificial intelligence in the US reached a boiling point. On December 11, after Congress failed twice to pass a law banning state AI laws, President Donald Trump signed a sweeping executive order seeking to handcuff states from regulating the booming industry.  

Instead, he vowed to work with Congress to establish a “minimally burdensome” national AI policy. The move marked a victory for tech titans, who have been marshaling multimillion-dollar war chests to oppose AI regulations, arguing that a patchwork of state laws would stifle innovation.

In 2026, the battleground will shift to the courts. While some states might back down from passing AI laws, others will charge ahead. Read our story about what’s on the horizon.

—Michelle Kim

This story is from MIT Technology Review’s What’s Next series of stories that look across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.  

Measles is surging in the US. Wastewater tracking could help.

This week marked a rather unpleasant anniversary: It’s a year since Texas reported a case of measles—the start of a significant outbreak that ended up spreading across multiple states. Since the start of January 2025, there have been over 2,500 confirmed cases of measles in the US. Three people have died. 

As vaccination rates drop and outbreaks continue, scientists have been experimenting with new ways to quickly identify new cases and prevent the disease from spreading. And they are starting to see some success with wastewater surveillance. Read the full story.

—Jessica Hamzelou 

This story is from The Checkup, our weekly newsletter giving you the inside track on all things health and biotech. Sign up to receive it in your inbox every Thursday.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 The US is dismantling itself
A foreign enemy could not invent a better chain of events to wreck its standing in the world. (Wired $)  
+ We need to talk about whether Donald Trump might be losing it.  (New Yorker $)

2 Big Tech is taking on more debt to fund its AI aspirations
And the bubble just keeps growing. (WP $)
+ Forget unicorns. 2026 is shaping up to be the year of the “hectocorn.” (The Guardian)
+ Everyone in tech agrees we’re in a bubble. They just can’t agree on what happens when it pops. (MIT Technology Review)

3 DOGE accessed even more personal data than we thought 
Even now, the Trump administration still can’t say how much data is at risk, or what it was used for. (NPR)

4 TikTok has finalized a deal to create a new US entity 
Ending years of uncertainty about its fate in America. (CNN)
+ Why China is the big winner out of all of this. (FT $)

5 The US is now officially out of the World Health Organization 
And it’s leaving behind nearly $300 million in bills unpaid. (Ars Technica)
+ The US withdrawal from the WHO will hurt us all. (MIT Technology Review)

6 AI-powered disinformation swarms pose a threat to democracy
A would-be autocrat could use them to persuade populations to accept cancelled elections or overturn results. (The Guardian)
+ The era of AI persuasion in elections is about to begin. (MIT Technology Review)

7 We’re about to start seeing more robots everywhere
But exactly what they’ll look like remains up for debate. (Vox $)
+ Chinese companies are starting to dominate entire sectors of AI and robotics. (MIT Technology Review)

8 Some people seem to be especially vulnerable to loneliness
If you’re ‘other-directed’, you could particularly benefit from less screentime. (New Scientist $)

9 This academic lost two years of work with a single click
TL;DR: Don’t rely on ChatGPT to store your data. (Nature)

10 How animals develop a sense of direction 🦇🧭
Their ‘internal compass’ seems to be informed by landmarks that help them form a mental map. (Quanta $)

Quote of the day

“The rate at which AI is progressing, I think we have AI that is smarter than any human this year, and no later than next year.”

—Elon Musk simply cannot resist the urge to make wild predictions at Davos, Wired reports. 

One more thing

Africa fights rising hunger by looking to foods of the past

After falling steadily for decades, the prevalence of global hunger is now on the rise—nowhere more so than in sub-Saharan Africa. 

Africa’s indigenous crops are often more nutritious and better suited to the hot and dry conditions that are becoming more prevalent, yet many have been neglected by science, which means they tend to be more vulnerable to diseases and pests and yield well below their theoretical potential.

Now the question is whether researchers, governments, and farmers can work together in a way that gets these crops onto plates and provides Africans from all walks of life with the energy and nutrition that they need to thrive, whatever climate change throws their way. Read the full story.

—Jonathan W. Rosen

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ The only thing I fancy dry this January is a martini. Here’s how to make one.
+ If you absolutely adore the Bic crystal pen, you might want this lamp
+ Cozy up with a nice long book this winter. ($)
+ Want to eat healthier? Slow down and tune out food ‘noise’. ($)

America’s coming war over AI regulation

23 January 2026 at 05:00

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

In the final weeks of 2025, the battle over regulating artificial intelligence in the US reached a boiling point. On December 11, after Congress failed twice to pass a law banning state AI laws, President Donald Trump signed a sweeping executive order seeking to handcuff states from regulating the booming industry. Instead, he vowed to work with Congress to establish a “minimally burdensome” national AI policy, one that would position the US to win the global AI race. The move marked a qualified victory for tech titans, who have been marshaling multimillion-dollar war chests to oppose AI regulations, arguing that a patchwork of state laws would stifle innovation.

In 2026, the battleground will shift to the courts. While some states might back down from passing AI laws, others will charge ahead, buoyed by mounting public pressure to protect children from chatbots and rein in power-hungry data centers. Meanwhile, dueling super PACs bankrolled by tech moguls and AI-safety advocates will pour tens of millions into congressional and state elections to seat lawmakers who champion their competing visions for AI regulation. 

Trump’s executive order directs the Department of Justice to establish a task force that sues states whose AI laws clash with his vision for light-touch regulation. It also directs the Department of Commerce to starve states of federal broadband funding if their AI laws are “onerous.” In practice, the order may target a handful of laws in Democratic states, says James Grimmelmann, a law professor at Cornell Law School. “The executive order will be used to challenge a smaller number of provisions, mostly relating to transparency and bias in AI, which tend to be more liberal issues,” Grimmelmann says.

For now, many states aren’t flinching. On December 19, New York’s governor, Kathy Hochul, signed the Responsible AI Safety and Education (RAISE) Act, a landmark law requiring AI companies to publish the protocols used to ensure the safe development of their AI models and report critical safety incidents. On January 1, California debuted the nation’s first frontier AI safety law, SB 53—which the RAISE Act was modeled on—aimed at preventing catastrophic harms such as biological weapons or cyberattacks. While both laws were watered down from earlier iterations to survive bruising industry lobbying, they struck a rare, if fragile, compromise between tech giants and AI safety advocates.

If Trump targets these hard-won laws, Democratic states like California and New York will likely take the fight to court. Republican states like Florida with vocal champions for AI regulation might follow suit. Trump could face an uphill battle. “The Trump administration is stretching itself thin with some of its attempts to effectively preempt [legislation] via executive action,” says Margot Kaminski, a law professor at the University of Colorado Law School. “It’s on thin ice.”

But Republican states that are anxious to stay off Trump’s radar or can’t afford to lose federal broadband funding for their sprawling rural communities might retreat from passing or enforcing AI laws. Win or lose in court, the chaos and uncertainty could chill state lawmaking. Paradoxically, the Democratic states that Trump wants to rein in—armed with big budgets and emboldened by the optics of battling the administration—may be the least likely to budge.

In lieu of state laws, Trump promises to create a federal AI policy with Congress. But the gridlocked and polarized body won’t be delivering a bill this year. In July, the Senate killed a moratorium on state AI laws that had been inserted into a tax bill, and in November, the House scrapped an encore attempt in a defense bill. In fact, Trump’s bid to strong-arm Congress with an executive order may sour any appetite for a bipartisan deal. 

The executive order “has made it harder to pass responsible AI policy by hardening a lot of positions, making it a much more partisan issue,” says Brad Carson, a former Democratic congressman from Oklahoma who is building a network of super PACs backing candidates who support AI regulation. “It hardened Democrats and created incredible fault lines among Republicans,” he says. 

While AI accelerationists in Trump’s orbit—AI and crypto czar David Sacks among them—champion deregulation, populist MAGA firebrands like Steve Bannon warn of rogue superintelligence and mass unemployment. In response to Trump’s executive order, Republican state attorneys general signed a bipartisan letter urging the FCC not to supersede state AI laws.

With Americans increasingly anxious about how AI could harm mental health, jobs, and the environment, public demand for regulation is growing. If Congress stays paralyzed, states will be the only ones acting to keep the AI industry in check. In 2025, state legislators introduced more than 1,000 AI bills, and nearly 40 states enacted over 100 laws, according to the National Conference of State Legislatures.

Efforts to protect children from chatbots may inspire rare consensus. On January 7, Google and Character Technologies, a startup behind the companion chatbot Character.AI, settled several lawsuits with families of teenagers who killed themselves after interacting with the bot. Just a day later, the Kentucky attorney general sued Character Technologies, alleging that the chatbots drove children to suicide and other forms of self-harm. OpenAI and Meta face a barrage of similar suits. Expect more to pile up this year. Without AI laws on the books, it remains to be seen how product liability laws and free speech doctrines apply to these novel dangers. “It’s an open question what the courts will do,” says Grimmelmann. 

While litigation brews, states will move to pass child safety laws, which are exempt from Trump’s proposed ban on state AI laws. On January 9, OpenAI inked a deal with a former foe, the child-safety advocacy group Common Sense Media, to back a ballot initiative in California called the Parents & Kids Safe AI Act, setting guardrails around how chatbots interact with children. The measure proposes requiring AI companies to verify users’ age, offer parental controls, and undergo independent child-safety audits. If passed, it could be a blueprint for states across the country seeking to crack down on chatbots. 

Fueled by widespread backlash against data centers, states will also try to regulate the resources needed to run AI. That means bills requiring data centers to report on their power and water use and foot their own electricity bills. If AI starts to displace jobs at scale, labor groups might float AI bans in specific professions. A few states concerned about the catastrophic risks posed by AI may pass safety bills mirroring SB 53 and the RAISE Act. 

Meanwhile, tech titans will continue to use their deep pockets to crush AI regulations. Leading the Future, a super PAC backed by OpenAI president Greg Brockman and the venture capital firm Andreessen Horowitz, will try to elect candidates who endorse unfettered AI development to Congress and state legislatures. They’ll follow the crypto industry’s playbook for electing allies and writing the rules. To counter this, super PACs funded by Public First, an organization run by Carson and former Republican congressman Chris Stewart of Utah, will back candidates advocating for AI regulation. We might even see a handful of candidates running on anti-AI populist platforms.

In 2026, the slow, messy process of American democracy will grind on. And the rules written in state capitals could decide how the most disruptive technology of our generation develops far beyond America’s borders, for years to come.

Measles is surging in the US. Wastewater tracking could help.

23 January 2026 at 05:00

This week marked a rather unpleasant anniversary: It’s a year since Texas reported a case of measles—the start of a significant outbreak that ended up spreading across multiple states. Since the start of January 2025, there have been over 2,500 confirmed cases of measles in the US. Three people have died.

As vaccination rates drop and outbreaks continue, scientists have been experimenting with new ways to quickly identify new cases and prevent the disease from spreading. And they are starting to see some success with wastewater surveillance.

After all, wastewater contains saliva, urine, feces, shed skin, and more. You could consider it a rich biological sample. Wastewater analysis helped scientists understand how covid was spreading during the pandemic. It’s early days, but it is starting to help us get a handle on measles.

Globally, there has been some progress toward eliminating measles, largely thanks to vaccination efforts. Such efforts led to an 88% drop in measles deaths between 2000 and 2024, according to the World Health Organization. It estimates that “nearly 59 million lives have been saved by the measles vaccine” since 2000.

Still, an estimated 95,000 people died from measles in 2024 alone—most of them young children. And cases are surging in Europe, Southeast Asia, and the Eastern Mediterranean region.

Last year, the US saw the highest levels of measles in decades. The country is on track to lose its measles elimination status—a sorry fate that met Canada in November after the country recorded over 5,000 cases in a little over a year.

Public health efforts to contain the spread of measles—which is incredibly contagious—typically involve clinical monitoring in health-care settings, along with vaccination campaigns. But scientists have started looking to wastewater, too.

Along with various bodily fluids, we all shed viruses and bacteria into wastewater, whether that’s through brushing our teeth, showering, or using the toilet. The idea of looking for these pathogens in wastewater to track diseases has been around for a while, but things really kicked into gear during the covid-19 pandemic, when scientists found that the coronavirus responsible for the disease was shed in feces.

This led Marlene Wolfe of Emory University and Alexandria Boehm of Stanford University to establish WastewaterSCAN, an academic-led program developed to analyze wastewater samples across the US. Covid was just the beginning, says Wolfe. “Over the years we have worked to expand what can be monitored,” she says.

Two years ago, for a previous edition of the Checkup, Wolfe told Cassandra Willyard that wastewater surveillance of measles was “absolutely possible,” as the virus is shed in urine. The hope was that this approach could shed light on measles outbreaks in a community, even if members of that community weren’t able to access health care and receive an official diagnosis. And that it could highlight when and where public health officials needed to act to prevent measles from spreading. Evidence that it worked as an effective public health measure was, at the time, scant.

Since then, she and her colleagues have developed a test to identify measles RNA. They trialed it at two wastewater treatment plants in Texas between December 2024 and May 2025. At each site, the team collected samples two or three times a week and tested them for measles RNA.

Over that period, the team found measles RNA in 10.5% of the samples they collected, as reported in a preprint paper posted on medRxiv in July and currently under review at a peer-reviewed journal. The first detection came a week before the first case of measles was officially confirmed in the area. That’s promising—it suggests that wastewater surveillance might pick up measles cases early, giving public health officials a head start in efforts to limit any outbreaks.

There are more promising results from a team in Canada. Mike McKay and Ryland Corchis-Scott at the University of Windsor in Ontario and their colleagues have also been testing wastewater samples for measles RNA. Between February and November 2025, the team collected samples from a wastewater treatment facility serving over 30,000 people in Leamington, Ontario. 

These wastewater tests are somewhat limited—even if they do pick up measles, they won’t tell you who has measles, where exactly infections are occurring, or even how many people are infected. McKay and his colleagues have begun to make some progress here. In addition to monitoring the large wastewater plant, the team used tampons to soak up wastewater from a hospital lateral sewer.

They then compared their measles test results with the number of clinical cases in that hospital. This gave them some idea of the virus’s “shedding rate.” When they applied this to the data collected from the Leamington wastewater treatment facility, the team got estimates of measles cases that were much higher than the figures officially reported. 

Their findings track with the opinions of local health officials (who estimate that the true number of cases during the outbreak was around five to 10 times higher than the confirmed case count), the team members wrote in a paper published on medRxiv a couple of weeks ago.
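As a rough illustration of how that kind of estimate works, the arithmetic reduces to calibrating a per-case shedding rate where the case count is known, then dividing the community-wide wastewater signal by it. The numbers below are invented for illustration; they are not figures from the Windsor team’s paper.

```python
# Illustrative only: the shedding rate and measurements are invented placeholders.

# Step 1: calibrate a per-case "shedding rate" where clinical counts are known
# (for example, a hospital sewer with its own confirmed case count).
hospital_rna_copies_per_day = 2.0e9     # measles RNA recovered per day (hypothetical)
hospital_confirmed_cases = 4            # cases known to be in that building
shedding_rate = hospital_rna_copies_per_day / hospital_confirmed_cases

# Step 2: apply that rate to the community treatment plant,
# where only the wastewater signal is observed.
plant_rna_copies_per_day = 1.5e10       # hypothetical community-wide signal
estimated_cases = plant_rna_copies_per_day / shedding_rate

reported_cases = 12                     # hypothetical officially confirmed count
print(f"Estimated infections: {estimated_cases:.0f} "
      f"(~{estimated_cases / reported_cases:.1f}x the reported count)")
```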

There will always be limits to wastewater surveillance. “We’re looking at the pool of waste of an entire community, so it’s very hard to pull in information about individual infections,” says Corchis-Scott.

Wolfe also acknowledges that “we have a lot to learn about how we can best use the tools so they are useful.” But her team at WastewaterSCAN has been testing wastewater across the US for measles since May last year. And their findings are published online and shared with public health officials.

In some cases, the findings are already helping inform the response to measles. “We’ve seen public health departments act on this data,” says Wolfe. Some have issued alerts, or increased vaccination efforts in those areas, for example. “[We’re at] a point now where we really see public health departments, clinicians, [and] families using that information to help keep themselves and their communities safe,” she says.

McKay says his team has stopped testing for measles because the Ontario outbreak “has been declared over.” He says testing would restart if and when a single new case of measles is confirmed in the region, but he also thinks that his research makes a strong case for maintaining a wastewater surveillance system for measles.

McKay wonders if this approach might help Canada regain its measles elimination status. “It’s sort of like [we’re] a pariah now,” he says. If his approach can help limit measles outbreaks, it could be “a nice tool for public health in Canada to [show] we’ve got our act together.”

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

“Dr. Google” had its issues. Can ChatGPT Health do better?

22 January 2026 at 12:38

For the past two decades, there’s been a clear first step for anyone who starts experiencing new medical symptoms: Look them up online. The practice was so common that it gained the pejorative moniker “Dr. Google.” But times are changing, and many medical-information seekers are now using LLMs. According to OpenAI, 230 million people ask ChatGPT health-related queries each week. 

That’s the context around the launch of OpenAI’s new ChatGPT Health product, which debuted earlier this month. It landed at an inauspicious time: Two days earlier, the news website SFGate had broken the story of Sam Nelson, a teenager who died of an overdose last year after extensive conversations with ChatGPT about how best to combine various drugs. In the wake of both pieces of news, multiple journalists questioned the wisdom of relying for medical advice on a tool that could cause such extreme harm.

Though ChatGPT Health lives in a separate sidebar tab from the rest of ChatGPT, it isn’t a new model. It’s more like a wrapper that provides one of OpenAI’s preexisting models with guidance and tools it can use to provide health advice—including some that allow it to access a user’s electronic medical records and fitness app data, if granted permission. There’s no doubt that ChatGPT and other large language models can make medical mistakes, and OpenAI emphasizes that ChatGPT Health is intended as an additional support, rather than a replacement for one’s doctor. But when doctors are unavailable or unable to help, people will turn to alternatives. 
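In software terms, a wrapper of this sort is roughly a fixed system prompt plus a set of data sources the model may draw on only with the user’s permission. The sketch below illustrates that general pattern; the prompt, tool names, and record formats are invented for illustration and are not OpenAI’s actual design.

```python
# Generic sketch of a "wrapper" around an existing chat model: a fixed system
# prompt plus permissioned data sources. Illustrative only; nothing here
# reflects OpenAI's real prompts, tools, or record formats.
from dataclasses import dataclass, field
from typing import Callable

HEALTH_SYSTEM_PROMPT = (
    "You are a health information assistant. You are not a clinician; "
    "encourage users to seek medical care when appropriate."
)

@dataclass
class HealthWrapper:
    model_call: Callable[[list[dict]], str]           # any chat-completion function
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)
    consented: set[str] = field(default_factory=set)   # sources the user has enabled

    def register_tool(self, name: str, fn: Callable[..., str], consented: bool):
        self.tools[name] = fn
        if consented:
            self.consented.add(name)

    def ask(self, question: str) -> str:
        # Only consented sources are injected into the model's context.
        context = [f"[{name}] {fn()}" for name, fn in self.tools.items()
                   if name in self.consented]
        messages = [
            {"role": "system", "content": HEALTH_SYSTEM_PROMPT},
            {"role": "user", "content": "\n".join(context + [question])},
        ]
        return self.model_call(messages)

# Usage with a stand-in model and fake data sources:
wrapper = HealthWrapper(model_call=lambda msgs: "(model response here)")
wrapper.register_tool("medical_records", lambda: "2024: mild asthma diagnosis", consented=True)
wrapper.register_tool("fitness_app", lambda: "avg 4,200 steps/day last month", consented=False)
print(wrapper.ask("Should I be worried about shortness of breath when jogging?"))
```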

Some doctors see LLMs as a boon for medical literacy. The average patient might struggle to navigate the vast landscape of online medical information—and, in particular, to distinguish high-quality sources from polished but factually dubious websites—but LLMs can do that job for them, at least in theory. Treating patients who had searched for their symptoms on Google required “a lot of attacking patient anxiety [and] reducing misinformation,” says Marc Succi, an associate professor at Harvard Medical School and a practicing radiologist. But now, he says, “you see patients with a college education, a high school education, asking questions at the level of something an early med student might ask.”

The release of ChatGPT Health, and Anthropic’s subsequent announcement of new health integrations for Claude, indicate that the AI giants are increasingly willing to acknowledge and encourage health-related uses of their models. Such uses certainly come with risks, given LLMs’ well-documented tendencies to agree with users and make up information rather than admit ignorance. 

But those risks also have to be weighed against potential benefits. There’s an analogy here to autonomous vehicles: When policymakers consider whether to allow Waymo in their city, the key metric is not whether its cars are ever involved in accidents but whether they cause less harm than the status quo of relying on human drivers. If Dr. ChatGPT is an improvement over Dr. Google—and early evidence suggests it may be—it could potentially lessen the enormous burden of medical misinformation and unnecessary health anxiety that the internet has created.

Pinning down the effectiveness of a chatbot such as ChatGPT or Claude for consumer health, however, is tricky. “It’s exceedingly difficult to evaluate an open-ended chatbot,” says Danielle Bitterman, the clinical lead for data science and AI at the Mass General Brigham health-care system. Large language models score well on medical licensing examinations, but those exams use multiple-choice questions that don’t reflect how people use chatbots to look up medical information.

Sirisha Rambhatla, an assistant professor of management science and engineering at the University of Waterloo, attempted to close that gap by evaluating how GPT-4o responded to licensing exam questions when it did not have access to a list of possible answers. Medical experts who evaluated the responses scored only about half of them as entirely correct. But multiple-choice exam questions are designed to be tricky, so that the answer options don’t entirely give the answer away, and they’re still a pretty distant approximation of the sort of thing a user would type into ChatGPT.

A different study, which tested GPT-4o on more realistic prompts submitted by human volunteers, found that it answered medical questions correctly about 85% of the time. When I spoke with Amulya Yadav, an associate professor at Pennsylvania State University who runs the Responsible AI for Social Emancipation Lab and led the study, he made it clear that he wasn’t personally a fan of patient-facing medical LLMs. But he freely admits that, technically speaking, they seem up to the task—after all, he says, human doctors misdiagnose patients 10% to 15% of the time. “If I look at it dispassionately, it seems that the world is gonna change, whether I like it or not,” he says.

For people seeking medical information online, Yadav says, LLMs do seem to be a better choice than Google. Succi, the radiologist, also concluded that LLMs can be a better alternative to web search when he compared GPT-4’s responses to questions about common chronic medical conditions with the information presented in Google’s knowledge panel, the information box that sometimes appears on the right side of the search results.

Since Yadav’s and Succi’s studies appeared online, in the first half of 2025, OpenAI has released multiple new versions of GPT, and it’s reasonable to expect that GPT-5.2 would perform even better than its predecessors. But the studies do have important limitations: They focus on straightforward, factual questions, and they examine only brief interactions between users and chatbots or web search tools. Some of the weaknesses of LLMs—most notably their sycophancy and tendency to hallucinate—might be more likely to rear their heads in more extensive conversations and with people who are dealing with more complex problems. Reeva Lederman, a professor at the University of Melbourne who studies technology and health, notes that patients who don’t like the diagnosis or treatment recommendations that they receive from a doctor might seek out another opinion from an LLM—and the LLM, if it’s sycophantic, might encourage them to reject their doctor’s advice.

Some studies have found that LLMs will hallucinate and exhibit sycophancy in response to health-related prompts. For example, one study showed that GPT-4 and GPT-4o will happily accept and run with incorrect drug information included in a user’s question. In another, GPT-4o frequently concocted definitions for fake syndromes and lab tests mentioned in the user’s prompt. Given the abundance of medically dubious diagnoses and treatments floating around the internet, these patterns of LLM behavior could contribute to the spread of medical misinformation, particularly if people see LLMs as trustworthy.

OpenAI has reported that the GPT-5 series of models is markedly less sycophantic and less prone to hallucination than its predecessors, so the results of these studies might not apply to ChatGPT Health. The company also evaluated the model that powers ChatGPT Health on its responses to health-specific questions, using its publicly available HealthBench benchmark. HealthBench rewards models that express uncertainty when appropriate, recommend that users seek medical attention when necessary, and refrain from causing users unnecessary stress by telling them their condition is more serious than it truly is. It’s reasonable to assume that the model underlying ChatGPT Health exhibited those behaviors in testing, though Bitterman notes that some of the prompts in HealthBench were generated by LLMs, not users, which could limit how well the benchmark translates to the real world.
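HealthBench-style evaluation is, as we understand it, rubric based: each response is checked against criteria that carry point values (including negative values for harmful behavior), and the score is the share of available positive points earned. The toy function below illustrates that kind of aggregation; the criteria, point values, and judgments are invented, and the real benchmark relies on physician-written rubrics checked by a grader model rather than hand-set booleans.

```python
# Simplified sketch of rubric-style grading in the spirit of HealthBench.
# The criteria and the graded response below are invented examples.

def rubric_score(criteria_met: dict[str, bool], points: dict[str, int]) -> float:
    earned = sum(points[c] for c, met in criteria_met.items() if met)
    max_positive = sum(p for p in points.values() if p > 0)
    return max(0.0, min(1.0, earned / max_positive))

points = {
    "expresses appropriate uncertainty": 3,
    "recommends seeking care for red-flag symptoms": 5,
    "avoids alarmist framing": 2,
    "states a definitive diagnosis without enough information": -4,  # penalty
}

# A hypothetical grading of one model response:
criteria_met = {
    "expresses appropriate uncertainty": True,
    "recommends seeking care for red-flag symptoms": True,
    "avoids alarmist framing": False,
    "states a definitive diagnosis without enough information": False,
}
print(f"Rubric score: {rubric_score(criteria_met, points):.2f}")  # 0.80
```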

An LLM that avoids alarmism seems like a clear improvement over systems that have people convincing themselves they have cancer after a few minutes of browsing. And as large language models, and the products built around them, continue to develop, whatever advantage Dr. ChatGPT has over Dr. Google will likely grow. The introduction of ChatGPT Health is certainly a move in that direction: By looking through your medical records, ChatGPT can potentially gain far more context about your specific health situation than could be included in any Google search, although numerous experts have cautioned against giving ChatGPT that access for privacy reasons.

Even if ChatGPT Health and other new tools do represent a meaningful improvement over Google searches, they could still conceivably have a negative effect on health overall. Much as automated vehicles, even if they are safer than human-driven cars, might still prove a net negative if they encourage people to use public transit less, LLMs could undermine users’ health if they induce people to rely on the internet instead of human doctors, even if they do increase the quality of health information available online.

Lederman says that this outcome is plausible. In her research, she has found that members of online communities centered on health tend to put their trust in users who express themselves well, regardless of the validity of the information they are sharing. Because ChatGPT communicates like an articulate person, some people might trust it too much, potentially to the exclusion of their doctor. But LLMs are certainly no replacement for a human doctor—at least not yet.

Dispatch from Davos: hot air, big egos and cold flexes

By: Mat Honan
22 January 2026 at 11:39

This story first appeared in The Debrief, our subscriber-only newsletter about the biggest news in tech by Mat Honan, Editor in Chief. Subscribe to read the next edition as soon as it lands.

It’s supposed to be frigid in Davos this time of year. Part of the charm is seeing the world’s elite tromp through the streets in respectable suits and snow boots. But this year it’s positively balmy, with highs in the mid-30s Fahrenheit, or a little over 1°C. Conditions when I flew out of New York were colder, and definitely snowier. I’m told this is due to something called a föhn, a dry, warm wind that’s been blowing across the Alps. 

I’m no meteorologist, but it’s true that there is a lot of hot air here. 

On Wednesday, President Donald Trump arrived in Davos to address the assembly, and held forth for more than 90 minutes, weaving his way through remarks about the economy, Greenland, windmills, Switzerland, Rolexes, Venezuela, and drug prices. It was a talk lousy with gripes, grievances and outright falsehoods. 

One small example: Trump made a big deal of claiming that China, despite being the world leader in manufacturing windmill componentry, doesn’t actually use them for energy generation itself. In fact, it is the world leader in generation, as well. 

I did not get to watch this spectacle from the room itself. Sad! 

By the time I got to the Congress Hall where the address was taking place, there was already a massive scrum of people jostling to get in. 

I had just wrapped up moderating a panel on “the intelligent co-worker,” i.e., AI agents in the workplace. I was really excited for this one, as the speakers represented a diverse cross-section of the AI ecosystem. Christoph Schweizer, CEO of BCG, had the macro strategic view; Enrique Lores, CEO of HP, could speak to both hardware and large enterprises; Workera CEO Kian Katanforoosh had the inside view on workforce training and transformation; Manjul Shah, CEO of Hippocratic AI, addressed working in the high-stakes field of healthcare; and Kate Kallot, CEO of Amini AI, gave perspective on the global south, and Africa in particular. 

Interestingly, most of the panel shied away from using the term co-worker, and some even rejected the term agent. But the view they painted was definitely one of humans working alongside AI and augmenting what’s possible. Shah, for example, talked about having agents call 16,000 people in Texas during a heat wave to perform health and safety checks. It was a great discussion. You can watch the whole thing here.

But by the time it let out, the push of people outside the Congress Hall was already too thick for me to get in. In fact I couldn’t even get into a nearby overflow room. I did make it into a third overflow room, but getting in meant navigating my way through a mass of people, so jammed in tight together that it reminded me of being at a Turnstile concert. 

The speech blew way past its allotted time, and I had to step out early to get to yet another discussion. Walking through the halls while Trump spoke was a truly surreal experience. He had completely captured the attention of the gathered global elite. I don’t think I saw a single person not staring at a laptop, or phone, or iPad, all watching the same video. 

Trump is speaking again on Thursday in a previously unscheduled address to announce his Board of Peace. As is (I heard) Elon Musk. So it’s shaping up to be another big day for elite attention capture. 

I should say, though, there are elites, and then there are elites. And there are all sorts of ways of sorting out who is who. Your badge color is one of them. I have a white participant badge, because I was moderating panels. This gets you in pretty much anywhere and is therefore its own sort of status symbol. Where you’re staying is another. I’m in Klosters, a neighboring town that’s a 40-minute train ride away from the Congress Centre. Not so elite. 

There are more subtle ways of status sorting, too. Yesterday I learned that when people ask if this is your first time at Davos, it’s sometimes meant as a way of trying to figure out how important you are. If you’re any kind of big deal, you’ve probably been coming for years. 

But the best one I’ve yet encountered happened when I made small talk with the woman sitting next to me as I changed back into my snow boots. It turned out that, like me, she lived in California–at least part time. “But I don’t think I’ll stay there much longer,” she said, “due to the new tax law.” This was just an ice cold flex. 

Because California’s newly proposed tax legislation? It only targets billionaires. 

Welcome to Davos.

The Download: Yann LeCun’s new venture, and lithium’s on the rise

22 January 2026 at 08:10

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.


Yann LeCun’s new venture is a contrarian bet against large language models    

Yann LeCun is a Turing Award recipient and a top AI researcher, but he has long been a contrarian figure in the tech world. He believes that the industry’s current obsession with large language models is wrong-headed and will ultimately fail to solve many pressing problems.  

Instead, he thinks we should be betting on world models—a different type of AI that accurately reflects the dynamics of the real world. Perhaps it’s no surprise, then, that he recently left Meta, where he had served as chief scientist for FAIR (Fundamental AI Research), the company’s influential research lab that he founded. 

LeCun sat down with MIT Technology Review in an exclusive online interview from his Paris apartment to discuss his new venture, life after Meta, the future of artificial intelligence, and why he thinks the industry is chasing the wrong ideas. Read the full interview

—Caiwei Chen

Why 2026 is a hot year for lithium

—Casey Crownhart

In 2026, I’m going to be closely watching the price of lithium.

If you’re not in the habit of obsessively tracking commodity markets, I certainly don’t blame you. (Though the news lately definitely makes the case that minerals can have major implications for global politics and the economy.)

But lithium is worthy of a close look right now. The metal is crucial for lithium-ion batteries used in phones and laptops, electric vehicles, and large-scale energy storage arrays on the grid. 

Prices have been on quite the roller coaster over the last few years, and they’re ticking up again. What happens next could have big implications for mining and battery technology. Read the full story.

This story first appeared in The Spark, our newsletter all about the tech we can use to combat the climate crisis. Sign up to receive it in your inbox every Wednesday.  

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Trump has climbed down from his plan for the US to take Greenland 
To the relief of many across Europe. (BBC)
+ Trump says he’s agreed a deal to access Greenland’s rare earths. Experts say that’s ‘bonkers.’ (CNN)
+ European leaders are feeling flummoxed about what’s going on. (FT $)

2 Apple is reportedly developing a wearable AI pin
It’s still in the very early stages—but this could be a huge deal if it makes it to launch. (The Information $)
+ It’s also planning to revamp Siri and turn it into an AI chatbot. (Bloomberg $)
+ Are we ready to trust AI with our bodies? (MIT Technology Review)

3 CEOs say AI saves people time. Their employees disagree.
Many even say that it’s currently dragging down their productivity. (WSJ $)
+ The AI boom will increase US carbon emissions—but it doesn’t have to. (Wired $)
+ Let’s also not forget that large language models remain a security nightmare. (IEEE Spectrum)

4 This chart shows how measles cases are exploding in America
They’ve hit a 30-year high, with the US on track to lose its ‘elimination status.’ (Axios $)
+ Things are poised to get even worse this year. (Wired $)

5 Your first humanoid robot coworker will almost definitely be Chinese
But will it be truly useful? That’s the even bigger question. (Wired $)
+ Nvidia CEO Jensen Huang says Europe could do more to compete in robotics and AI. (CNBC)

6 Bezos’ Blue Origin is about to compete with Starlink
It plans to send the first ‘TeraWave’ satellites into space next year. (Reuters $)
+ On the ground in Ukraine’s largest Starlink repair shop. (MIT Technology Review)

7 Trump’s family made $1.4 billion off crypto last year 
Move along, no conflicts of interest to see here. (Bloomberg $)

8 Comic-Con has banned AI art
After an artist-led backlash last week. (404 Media)
+ Hundreds of creatives are warning against an AI future built on ‘theft on a grand scale’. (The Verge $)

9 What it’s like living without a smartphone for a month
Potentially blissful for you, but probably a bit annoying for everyone else. (The Guardian)
+ Why teens with ADHD are particularly vulnerable to the perils of social media. (Nature)

10 Elon Musk is feuding with a budget airline 
The airline is winning, in case you wondered. (WP $)

Quote of the day

“I wouldn’t edit anything about Donald Trump, because the man makes me insane.”

—Wikipedia founder Jimmy Wales tells Wired why he’s steering clear of the US President’s page.   

One more thing

How electricity could help tackle a surprising climate villain

Cement hides in plain sight—it’s used to build everything from roads and buildings to dams and basement floors. But it’s also a climate threat. Cement production accounts for more than 7% of global carbon dioxide emissions—more than sectors like aviation, shipping, or landfills.

One solution to this climate catastrophe might be coursing through the pipes at Sublime Systems. The startup is developing an entirely new way to make cement. Instead of heating crushed-up rocks in lava-hot kilns, Sublime’s technology zaps them in water with electricity, kicking off chemical reactions that form the main ingredients in its cement.

But it faces huge challenges: competing with established industry players, and persuading builders to use its materials in the first place. Read the full story.

—Casey Crownhart

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Earth may be a garbage fire, but space is beautiful
+ Do you know how to tie your shoelaces up properly? Are you sure?!
+ I defy British readers not to feel a pang of nostalgia at these crisp packets.
+ Going to bed around the same time every night seems to be a habit worth adopting. ($)

Why 2026 is a hot year for lithium

22 January 2026 at 06:00

In 2026, I’m going to be closely watching the price of lithium.

If you’re not in the habit of obsessively tracking commodity markets, I certainly don’t blame you. (Though the news lately definitely makes the case that minerals can have major implications for global politics and the economy.)

But lithium is worthy of a close look right now.

The metal is crucial for lithium-ion batteries used in phones and laptops, electric vehicles, and large-scale energy storage arrays on the grid. Prices have been on quite the roller coaster over the last few years, and they’re ticking up again after a low period. What happens next could have big implications for mining and battery technology.

Before we look ahead, let’s take a quick trip down memory lane. In 2020, global EV sales started to really take off, driving up demand for the lithium used in their batteries. Because of that growing demand and a limited supply, prices shot up dramatically, with lithium carbonate going from under $10 per kilogram to a high of roughly $70 per kilogram in just two years.

And the tech world took notice. During those high points, there was a ton of interest in developing alternative batteries that didn’t rely on lithium. I was writing about sodium-based batteries, iron-air batteries, and even experimental ones that were made with plastic.

Researchers and startups were also hunting for alternative ways to get lithium, including battery recycling and processing methods like direct lithium extraction (more on this in a moment).

But soon, prices crashed back down to earth. We saw lower-than-expected demand for EVs in the US, and developers ramped up mining and processing to meet demand. Through late 2024 and 2025, lithium carbonate was back around $10 a kilogram again. Avoiding lithium or finding new ways to get it suddenly looked a lot less crucial.

That brings us to today: lithium prices are ticking up again. So far, it’s nowhere close to the dramatic rise we saw a few years ago, but analysts are watching closely. Strong EV growth in China is playing a major role—EVs still make up about 75% of battery demand today. But growth in stationary storage, batteries for the grid, is also contributing to rising demand for lithium in both China and the US.

Higher prices could create new opportunities. The possibilities include alternative battery chemistries, specifically sodium-ion batteries, says Evelina Stoikou, head of battery technologies and supply chains at BloombergNEF. (I’ll note here that we recently named sodium-ion batteries to our 2026 list of 10 Breakthrough Technologies.)

It’s not just batteries, though. Another industry that could see big changes from a lithium price swing: extraction.

Today, most lithium is mined from rocks, largely in Australia, before being shipped to China for processing. There’s a growing effort to process the mineral in other places, though, as countries try to create their own lithium supply chains. Tesla recently confirmed that it’s started production at its lithium refinery in Texas, which broke ground in 2023. We could see more investment in processing plants outside China if prices continue to climb.

This could also be a key year for direct lithium extraction, as Katie Brigham wrote in a recent story for Heatmap. That technology uses chemical or electrochemical processes to extract lithium from brine (salty water that’s usually sourced from salt lakes or underground reservoirs), quickly and cheaply. Companies including Lilac Solutions, Standard Lithium, and Rio Tinto are all making plans or starting construction on commercial facilities this year in the US and Argentina. 

If there’s anything I’ve learned from following batteries and minerals over the past few years, it’s that predicting the future is impossible. But if you’re looking for tea leaves to read, lithium prices deserve a look. 

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Yann LeCun’s new venture is a contrarian bet against large language models  

22 January 2026 at 05:00

Yann LeCun is a Turing Award recipient and a top AI researcher, but he has long been a contrarian figure in the tech world. He believes that the industry’s current obsession with large language models is wrong-headed and will ultimately fail to solve many pressing problems. 

Instead, he thinks we should be betting on world models—a different type of AI that accurately reflects the dynamics of the real world. He is also a staunch advocate for open-source AI and criticizes the closed approach of frontier labs like OpenAI and Anthropic. 

Perhaps it’s no surprise, then, that he recently left Meta, where he had served as chief scientist for FAIR (Fundamental AI Research), the company’s influential research lab that he founded. Meta has struggled to gain much traction with its open-source AI model Llama and has seen internal shake-ups, including its controversial investment in Scale AI. 

LeCun sat down with MIT Technology Review in an exclusive online interview from his Paris apartment to discuss his new venture, life after Meta, the future of artificial intelligence, and why he thinks the industry is chasing the wrong ideas. 

Both the questions and answers below have been edited for clarity and brevity.

You’ve just announced a new company, Advanced Machine Intelligence (AMI).  Tell me about the big ideas behind it.

It is going to be a global company, but headquartered in Paris. You pronounce it “ami”—it means “friend” in French. I am excited. There is a very high concentration of talent in Europe, but it is not always given a proper environment to flourish. And there is certainly a huge demand from the industry and governments for a credible frontier AI company that is neither Chinese nor American. I think that is going to be to our advantage.

So an ambitious alternative to the US-China binary we currently have. What made you want to pursue that third path?

Well, there are sovereignty issues for a lot of countries, and they want some control over AI. What I’m advocating is that AI is going to become a platform, and most platforms tend to become open-source. Unfortunately, that’s not really the direction the American industry is taking. Right? As the competition increases, they feel like they have to be secretive. I think that is a strategic mistake.

It’s certainly true for OpenAI, which went from very open to very closed, and Anthropic has always been closed. Google was sort of a little open. And then Meta, we’ll see. My sense is that it’s not going in a positive direction at this moment.

Simultaneously, China has completely embraced this open approach. So all the leading open-source AI platforms are Chinese, and the result is that academia and startups outside of the US have basically embraced Chinese models. There’s nothing wrong with that—you know, Chinese models are good. Chinese engineers and scientists are great. But if there is a future in which our entire information diet is mediated by AI assistants, and the choice is either English-speaking models produced by proprietary companies closely tied to the US or Chinese models that may be open-source but need to be fine-tuned before they will answer questions about Tiananmen Square in 1989—you know, it’s not a very pleasant future. 

They [the future models] should be able to be fine-tuned by anyone and produce a very high diversity of AI assistants, with different linguistic abilities, value systems, political biases, and centers of interest. You need a high diversity of assistants for the same reason that you need a high diversity of press. 

That is certainly a compelling pitch. How are investors responding to it so far?

They really like it. A lot of venture capitalists are very much in favor of this idea of open source, because they know that a lot of small startups really rely on open-source models. They don’t have the means to train their own model, and it’s kind of dangerous for them strategically to embrace a proprietary model.

You recently left Meta. What’s your view on the company and Mark Zuckerberg’s leadership? There’s a perception that Meta has fumbled its AI advantage.

I think FAIR [LeCun’s lab at Meta] was extremely successful in the research part. Where Meta was less successful is in picking up on that research and pushing it into practical technology and products. Mark made some choices that he thought were the best for the company. I may not have agreed with all of them. For example, the robotics group at FAIR was let go, which I think was a strategic mistake. But I’m not the director of FAIR. People make decisions rationally, and there’s no reason to be upset.

So, no bad blood? Could Meta be a future client for AMI?

Meta might be our first client! We’ll see. The work we are doing is not in direct competition. Our focus on world models for the physical world is very different from their focus on generative AI and LLMs.

You were working on AI long before LLMs became a mainstream approach. But since ChatGPT broke out, LLMs have become almost synonymous with AI.

Yes, and we are going to change that. The public face of AI, perhaps, is mostly LLMs and chatbots of various types. But the latest ones of those are not pure LLMs. They are an LLM plus a lot of other things, like perception systems and code that solves particular problems. So we are increasingly going to see LLMs acting as a kind of orchestrator within larger systems.

Beyond LLMs, there is a lot of AI behind the scenes that runs a big chunk of our society. There are driver-assistance systems in cars, systems that speed up MRI imaging, algorithms that drive social media—that’s all AI. 

You have been vocal in arguing that LLMs can only get us so far. Do you think LLMs are overhyped these days? Can you summarize to our readers why you believe that LLMs are not enough?

There is a sense in which they have not been overhyped, which is that they are extremely useful to a lot of people, particularly if you write text, do research, or write code. LLMs manipulate language really well. But people have had this illusion, or delusion, that it is a matter of time until we can scale them up to having human-level intelligence, and that is simply false.

The truly difficult part is understanding the real world. This is the Moravec Paradox (a phenomenon observed by the computer scientist Hans Moravec in 1988): What’s easy for us, like perception and navigation, is hard for computers, and vice versa. LLMs are limited to the discrete world of text. They can’t truly reason or plan, because they lack a model of the world. They can’t predict the consequences of their actions. This is why we don’t have a domestic robot that is as agile as a house cat, or a truly autonomous car.

We are going to have AI systems that have humanlike and human-level intelligence, but they’re  not going to be built on LLMs, and it’s not going to happen next year or two years from now. It’s going to take a while. There are major conceptual breakthroughs that have to happen before we have AI systems that have human-level intelligence. And that is what I’ve been working on. And this company, AMI Labs, is focusing on the next generation.

And your solution is world models and JEPA architecture (JEPA, or “joint embedding predictive architecture,” is a learning framework that trains AI models to understand the world, created by LeCun while he was at Meta). What’s the elevator pitch?

The world is unpredictable. If you try to build a generative model that predicts every detail of the future, it will fail.  JEPA is not generative AI. It is a system that learns to represent videos really well. The key is to learn an abstract representation of the world and make predictions in that abstract space, ignoring the details you can’t predict. That’s what JEPA does. It learns the underlying rules of the world from observation, like a baby learning about gravity. This is the foundation for common sense, and it’s the key to building truly intelligent systems that can reason and plan in the real world. The most exciting work so far on this is coming from academia, not the big industrial labs stuck in the LLM world.
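To make that idea concrete, here is a minimal sketch in Python (using PyTorch) of the joint-embedding idea as LeCun describes it: encode the present and the future, predict the future only in the learned representation space, and never try to reconstruct raw pixels. Every module size, the toy data, and the training step are illustrative assumptions, not AMI’s or Meta’s actual code.

```python
# Minimal sketch of the joint-embedding predictive idea (JEPA) described above.
# All sizes, names, and the toy data are illustrative assumptions, not any lab's real code.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps raw observations (think video frames) into an abstract representation."""
    def __init__(self, obs_dim=64, latent_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))

    def forward(self, x):
        return self.net(x)

class Predictor(nn.Module):
    """Predicts the representation of the future from the representation of the present."""
    def __init__(self, latent_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))

    def forward(self, z):
        return self.net(z)

encoder, predictor, target_encoder = Encoder(), Predictor(), Encoder()
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3)

# Toy "video": the next observation is a shifted copy of the current one plus noise
# that no model could ever predict exactly.
current = torch.randn(32, 64)
future = current.roll(shifts=1, dims=1) + 0.1 * torch.randn(32, 64)

# The key point: the loss lives in the abstract representation space, not in pixel space,
# so the unpredictable low-level detail (the noise) never has to be reconstructed.
optimizer.zero_grad()
z_now = encoder(current)
with torch.no_grad():
    z_future = target_encoder(future)  # in practice, often a slowly updated copy of the encoder
loss = nn.functional.mse_loss(predictor(z_now), z_future)
loss.backward()
optimizer.step()
```

In practice, JEPA-style systems add extra machinery (such as slowly updated target encoders and regularization) to keep the representations from collapsing; the sketch only shows where the prediction and the loss live.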

The lack of non-text data has been a problem in taking AI systems further in understanding the physical world. JEPA is trained on videos. What other kinds of data will you be using?

Our systems will be trained on video, audio, and sensor data of all kinds—not just text. We are working with various modalities, from the position of a robot arm to lidar data to audio. I’m also involved in a project using JEPA to model complex physical and clinical phenomena. 

What are some of the concrete, real-world applications you envision for world models?

The applications are vast. Think about complex industrial processes where you have thousands of sensors, like in a jet engine, a steel mill, or a chemical factory. There is no technique right now to build a complete, holistic model of these systems. A world model could learn this from the sensor data and predict how the system will behave. Or think of smart glasses that can watch what you’re doing, identify your actions, and then predict what you’re going to do next to assist you. This is what will finally make agentic systems reliable. An agentic system that is supposed to take actions in the world cannot work reliably unless it has a world model to predict the consequences of its actions. Without it, the system will inevitably make mistakes. This is the key to unlocking everything from truly useful domestic robots to Level 5 autonomous driving.

Humanoid robots are all the rage recently, especially ones built by companies from China. What’s your take?

There are all these brute-force ways to get around the limitations of learning systems, which require inordinate amounts of training data to do anything. So the secret of all the companies getting robots to do kung fu or dance is that those routines are all planned in advance. But frankly, nobody—absolutely nobody—knows how to make those robots smart enough to be useful. Take my word for it. 

You need an enormous amount of tele-operation training data for every single task, and when the environment changes a little bit, it doesn’t generalize very well. What this tells us is that we are missing something very big. The reason a 17-year-old can learn to drive in 20 hours is that they already know a lot about how the world behaves. If we want a generally useful domestic robot, we need systems to have a good understanding of the physical world. That’s not going to happen until we have good world models and planning.

There’s a growing sentiment that it’s becoming harder to do foundational AI research in academia because of the massive computing resources required. Do you think the most important innovations will now come from industry?

No. LLMs are now technology development, not research. It’s true that it’s very difficult for academics to play an important role there because of the requirements for computation, data access, and engineering support. But it’s a product now. It’s not something academia should even be interested in. It’s like speech recognition in the early 2010s—it was a solved problem, and the progress was in the hands of industry. 

What academia should be working on is long-term objectives that go beyond the capabilities of current systems. That’s why I tell people in universities: Don’t work on LLMs. There is no point. You’re not going to be able to rival what’s going on in industry. Work on something else. Invent new techniques. The breakthroughs are not going to come from scaling up LLMs. The most exciting work on world models is coming from academia, not the big industrial labs. The whole idea of using attention circuits in neural nets came out of the University of Montreal. That research paper started the whole revolution. Now that the big companies are closing up, the breakthroughs are going to slow down. Academia needs access to computing resources, but they should be focused on the next big thing, not on refining the last one.

You wear many hats: professor, researcher, educator, public thinker … Now you just took on a new one. What is that going to look like for you?

I am going to be the executive chairman of the company, and Alex LeBrun [a former colleague from Meta AI] will be the CEO. It’s going to be LeCun and LeBrun—it’s nice if you pronounce it the French way.

I am going to keep my position at NYU. I teach one class per year, and I have PhD students and postdocs, so I am going to stay based in New York. But I go to Paris pretty often because of my lab. 

Does that mean that you won’t be very hands-on?

Well, there are two ways to be hands-on. One is to manage people day to day, and the other is to actually get your hands dirty in research projects, right? 

I can do management, but I don’t like doing it. That is not my mission in life. My mission is really to make science and technology progress as far as we can, to inspire other people to work on things that are interesting, and then to contribute to those things. That has been my role at Meta for the last seven years. I founded FAIR and led it for four to five years. I kind of hated being a director. I am not good at this career management thing. I’m much more of a visionary and a scientist.

What makes Alex LeBrun the right fit?

Alex is a serial entrepreneur; he’s built three successful AI companies. The first he sold to Microsoft; the second to Facebook, where he was head of the engineering division of FAIR in Paris. He then left to create Nabla, a very successful company in the health-care space. When I offered him the chance to join me in this effort, he accepted almost immediately. He has the experience to build the company, allowing me to focus on science and technology. 

You’re headquartered in Paris. Where else do you plan to have offices?

We are a global company. There’s going to be an office in North America.

New York, hopefully?

New York is great. That’s where I am, right? And it’s not Silicon Valley. Silicon Valley is a bit of a monoculture.

What about Asia? I’m guessing Singapore, too?

Probably, yeah. I’ll let you guess. 

And how are you attracting talent?

We don’t have any issue recruiting. There are a lot of people in the AI research community who think the future of AI is in world models. Those people, regardless of pay package, will be motivated to come work for us because they believe in the technological future we are building. We’ve already recruited people from places like OpenAI, Google DeepMind, and xAI.

I heard that Saining Xie, a prominent researcher from NYU and Google DeepMind, might be joining you as chief scientist. Any comments?

Saining is a brilliant researcher. I have a lot of admiration for him. I hired him twice already. I hired him at FAIR, and I convinced my colleagues at NYU that we should hire him there. Let’s just say I have a lot of respect for him.

When will you be ready to share more details about AMI Labs, like financial backing or other core members?

Soon—in February, maybe. I’ll let you know.

Rethinking AI’s future in an augmented workplace

There are many paths AI evolution could take. On one end of the spectrum, AI is dismissed as a marginal fad, another bubble fueled by notoriety and misallocated capital. On the other end, it’s cast as a dystopian force, destined to eliminate jobs on a large scale and destabilize economies. Markets oscillate between skepticism and the fear of missing out, while the technology itself evolves quickly and investment dollars flow at a rate not seen in decades. 

All the while, many of today’s financial and economic thought leaders hold to the consensus that the financial landscape will stay the same as it has been for the last several years. Two years ago, Joseph Davis, global chief economist at Vanguard, and his team felt the same but wanted to ground their perspective on AI in a deeper foundation of history and data. Based on a proprietary data set covering the last 130 years, Davis and his team developed a new framework, the Vanguard Megatrends Model, from research that suggested a more nuanced path than either extreme: AI has the potential to be a general-purpose technology that lifts productivity, reshapes industries, and augments human work rather than displacing it. In short, AI will be neither marginal nor dystopian. 

“Our findings suggest that the continuation of the status quo, the basic expectation of most economists, is actually the least likely outcome,” Davis says. “We project that AI will have an even greater effect on productivity than the personal computer did. And we project that a scenario where AI transforms the economy is far more likely than one where AI disappoints and fiscal deficits dominate. The latter would likely lead to slower economic growth, higher inflation, and increased interest rates.”

Implications for business leaders and workers

Davis does not sugar-coat it, however. Although AI promises economic growth and productivity, it will be disruptive, especially for business leaders and workers in knowledge sectors. “AI is likely to be the most disruptive technology to alter the nature of our work since the personal computer,” says Davis. “Those of a certain age might recall how the broad availability of PCs remade many jobs. It didn’t eliminate jobs as much as it allowed people to focus on higher value activities.” 

The team’s framework allowed them to examine AI automation risks across more than 800 occupations. The research indicated that while the potential for job loss exists in upwards of 20% of occupations as a result of AI-driven automation, the majority of jobs—likely four out of five—will see a mixture of innovation and automation. Workers’ time will increasingly shift to higher-value and uniquely human tasks. 

This introduces the idea that AI could serve as a copilot to various roles, performing repetitive tasks and generally assisting with responsibilities. Davis argues that traditional economic models often underestimate the potential of AI because they fail to examine the deeper structural effects of technological change. “Most approaches for thinking about future growth, such as GDP, don’t adequately account for AI,” he explains. “They fail to link short-term variations in productivity with the three dimensions of technological change: automation, augmentation, and the emergence of new industries.” Automation enhances worker productivity by handling routine tasks; augmentation allows technology to act as a copilot, amplifying human skills; and the emergence of new industries opens up new sources of growth.

Implications for the economy 

Ironically, Davis’s research suggests that a reason for the relatively low productivity growth in recent years may be a lack of automation. Despite a decade of rapid innovation in digital and automation technologies, productivity growth has lagged since the 2008 financial crisis, hitting 50-year lows. This appears to support the view that AI’s impact will be marginal. But Davis believes that automation has been adopted in the wrong places. “What surprised me most was how little automation there has been in services like finance, health care, and education,” he says. “Outside of manufacturing, automation has been very limited. That’s been holding back growth for at least two decades.” The services sector accounts for more than 60% of US GDP and 80% of the workforce and has experienced some of the lowest productivity growth. It is here, Davis argues, that AI will make the biggest difference.

One of the biggest challenges facing the economy is demographics, as the Baby Boomer generation retires, immigration slows, and birth rates decline. These demographic headwinds reinforce the need for technological acceleration. “There are concerns about AI being dystopian and causing massive job loss, but we’ll soon have too few workers, not too many,” Davis says. “Economies like the US, Japan, China, and those across Europe will need to step up automation as their populations age.” 

For example, consider nursing, a profession in which empathy and human presence are irreplaceable. AI has already shown the potential to augment rather than automate in this field, streamlining data entry in electronic health records and helping nurses reclaim time for patient care. Davis estimates that these tools could increase nursing productivity by as much as 20% by 2035, a crucial gain as health-care systems adapt to ageing populations and rising demand. “In our most likely scenario, AI will offset demographic pressures. Within five to seven years, AI’s ability to automate portions of work will be roughly equivalent to adding 16 million to 17 million workers to the US labor force,” Davis says. “That’s essentially the same as if everyone turning 65 over the next five years decided not to retire.” He projects that more than 60% of occupations, including nurses, family physicians, high school teachers, pharmacists, human resource managers, and insurance sales agents, will benefit from AI as an augmentation tool. 

Implications for all investors 

As AI technology spreads, the strongest performers in the stock market won’t be its producers, but its users. “That makes sense, because general-purpose technologies enhance productivity, efficiency, and profitability across entire sectors,” says Davis. This adoption of AI is broadening the range of sensible investment options, which means diversifying beyond technology stocks might be appropriate, as reflected in Vanguard’s Economic and Market Outlook for 2026. “As that happens, the benefits move beyond places like Silicon Valley or Boston and into industries that apply the technology in transformative ways.” And history shows that early adopters of new technologies reap the greatest productivity rewards. “We’re clearly in the experimentation phase of learning by doing,” says Davis. “Those companies that encourage and reward experimentation will capture the most value from AI.” 

Looking globally, Davis sees the United States and China as significantly ahead in the AI race. “It’s a virtual dead heat,” he says. “That tells me the competition between the two will remain intense.” But other economies, especially those with low automation rates and large service sectors, like Japan, Europe, and Canada, could also see significant benefits. “If AI is truly going to be transformative, three sectors stand out: health care, education, and finance,” says Davis. “For AI to live up to its potential, it must fundamentally reshape these industries, which face high costs and rising demand for better, faster, more personalized services.”

However, Davis says Vanguard is more bullish on AI’s potential to transform the economy than it was just a year ago, especially since that transformation requires application well beyond Silicon Valley. “When I speak to business leaders, I remind them that this transformation hasn’t happened yet,” says Davis. “It’s their investment and innovation that will determine whether it does.”

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff. It was researched, designed, and written by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

Everyone wants AI sovereignty. No one can truly have it.

By: Cathy Li
21 January 2026 at 09:00

Governments plan to pour $1.3 trillion into AI infrastructure by 2030 to invest in “sovereign AI,” on the premise that countries should be in control of their own AI capabilities. The funds include financing for domestic data centers, locally trained models, independent supply chains, and national talent pipelines. This is a response to real shocks: covid-era supply chain breakdowns, rising geopolitical tensions, and the war in Ukraine.  

But the pursuit of absolute autonomy is running into reality. AI supply chains are irreducibly global: Chips are designed in the US and manufactured in East Asia; models are trained on data sets drawn from multiple countries; applications are deployed across dozens of jurisdictions.  

If sovereignty is to remain meaningful, it must shift from a defensive model of self-reliance to a vision that emphasizes the concept of orchestration, balancing national autonomy with strategic partnership. 

Why infrastructure-first strategies hit walls 

A November survey by Accenture found that 62% of European organizations are now seeking sovereign AI solutions, driven primarily by geopolitical anxiety rather than technical necessity. That figure rises to 80% in Denmark and 72% in Germany. The European Union has appointed its first Commissioner for Tech Sovereignty. 

This year, $475 billion is flowing into AI data centers globally. In the United States, AI data centers accounted for roughly one-fifth of GDP growth in the second quarter of 2025. But the obstacle for other nations hoping to follow suit isn’t just money. It’s energy and physics. Global data center capacity is projected to hit 130 gigawatts by 2030, and for every $1 billion spent on these facilities, $125 million is needed for electricity networks. More than $750 billion in planned investment is already facing grid delays. 

And it’s also talent. Researchers and entrepreneurs are mobile, drawn to ecosystems with access to capital, competitive wages, and rapid innovation cycles. Infrastructure alone won’t attract or retain world-class talent.  

What works: An orchestrated sovereignty

What nations need isn’t sovereignty through isolation but through specialization and orchestration. This means choosing which capabilities you build, which you pursue through partnership, and where you can genuinely lead in shaping the global AI landscape. 

The most successful AI strategies don’t try to replicate Silicon Valley; they identify specific advantages and build partnerships around them. 

Singapore offers a model. Rather than seeking to duplicate massive infrastructure, it invested in governance frameworks, digital-identity platforms, and applications of AI in logistics and finance, areas where it can realistically compete. 

Israel shows a different path. Its strength lies in a dense network of startups and military-adjacent research institutions delivering outsize influence despite the country’s small size. 

South Korea is instructive too. While it has national champions like Samsung and Naver, these firms still partner with Microsoft and Nvidia on infrastructure. That’s deliberate collaboration reflecting strategic oversight, not dependence.  

Even China, despite its scale and ambition, cannot secure full-stack autonomy. Its reliance on global research networks, on foreign lithography equipment (such as the extreme ultraviolet systems needed to manufacture advanced chips), and on foreign GPU architectures shows the limits of techno-nationalism. 

The pattern is clear: Nations that specialize and partner strategically can outperform those trying to do everything alone. 

Three ways to align ambition with reality 

1.  Measure added value, not inputs.  

Sovereignty isn’t how many petaflops you own. It’s how many lives you improve and how fast the economy grows. Real sovereignty is the ability to innovate in support of national priorities such as productivity, resilience, and sustainability while maintaining freedom to shape governance and standards.  

Nations should track the use of AI in health care and monitor how the technology’s adoption correlates with manufacturing productivity, patent citations, and international research collaborations. The goal is to ensure that AI ecosystems generate inclusive and lasting economic and social value.  

2. Cultivate a strong AI innovation ecosystem. 

Build infrastructure, but also build the ecosystem around it: research institutions, technical education, entrepreneurship support, and public-private talent development. Infrastructure without skilled talent and vibrant networks cannot deliver a lasting competitive advantage.   

3. Build global partnerships.  

Strategic partnerships enable nations to pool resources, lower infrastructure costs, and access complementary expertise. Singapore’s work with global cloud providers and the EU’s collaborative research programs show how nations advance capabilities faster through partnership than through isolation. Rather than competing to set dominant standards, nations should collaborate on interoperable frameworks for transparency, safety, and accountability.  

What’s at stake 

Overinvesting in independence fragments markets and slows cross-border innovation, which is the foundation of AI progress. When strategies focus too narrowly on control, they sacrifice the agility needed to compete. 

The cost of getting this wrong isn’t just wasted capital—it’s a decade of falling behind. Nations that double down on infrastructure-first strategies risk ending up with expensive data centers running yesterday’s models, while competitors that choose strategic partnerships iterate faster, attract better talent, and shape the standards that matter. 

The winners will be those who define sovereignty not as separation but as participation plus leadership—choosing who they depend on, where they build, and which global rules they shape. Strategic interdependence may feel less satisfying than independence, but it is real, it is achievable, and it will separate the leaders from the followers over the next decade. 

The age of intelligent systems demands intelligent strategies—ones that measure success not by infrastructure owned, but by problems solved. Nations that embrace this shift won’t just participate in the AI economy; they’ll shape it. That’s sovereignty worth pursuing. 

Cathy Li is head of the Centre for AI Excellence at the World Economic Forum.

The Download: Trump at Davos, and AI scientists

21 January 2026 at 08:10

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

All anyone wants to talk about at Davos is AI and Donald Trump

—Mat Honan, MIT Technology Review’s editor in chief 

At Davos this year Trump is dominating all the side conversations. There are lots of little jokes. Nervous laughter. Outright anger. Fear in the eyes. It’s wild. The US president is due to speak here today, amid threats of seizing Greenland and fears that he’s about to permanently fracture the NATO alliance.

But Trump isn’t the only game in town—everyone’s also talking about AI. Read Mat’s story to find out more.

This subscriber-only story appeared first in The Debrief, Mat’s weekly newsletter about the biggest stories in tech. Sign up here to get the next one in your inbox, and subscribe if you haven’t already!

The UK government is backing AI that can run its own lab experiments

A number of startups and university teams that are building “AI scientists” to design and run experiments in the lab, including robot biologists and chemists, have just won extra funding from the UK government agency that funds moonshot R&D.  

The competition, set up by ARIA (the Advanced Research and Invention Agency), gives a clear sense of how fast this technology is moving: The agency received 245 proposals from research teams that are already building tools capable of automating increasing amounts of lab work. Read the full story to learn more. 

—Will Douglas Heaven 

Everyone wants AI sovereignty. No one can truly have it.

—Cathy Li is head of the Centre for AI Excellence at the World Economic Forum

Governments plan to pour $1.3 trillion into AI infrastructure by 2030 to invest in “sovereign AI,” with the premise being that countries should be in control of their own AI capabilities. The funds include financing for domestic data centers, locally trained models, independent supply chains, and national talent pipelines. 

This is a response to real shocks: covid-era supply chain breakdowns, rising geopolitical tensions, and the war in Ukraine. But the pursuit of absolute autonomy is running into reality: AI supply chains are irreducibly global. If sovereignty is to remain meaningful, it must shift from defensive self-reliance to a vision that balances national autonomy with strategic partnership. Read the full story.

Here’s how extinct DNA could help us in the present—and the future

Thanks to genetic science, gene editing, and techniques like cloning, it’s now possible to move DNA through time, studying genetic information in ancient remains and then re-creating it in the bodies of modern beings. And that, scientists say, offers new ways to try to help endangered species, engineer new plants that resist climate change, or even create new human medicines.  

Read more about why genetic resurrection is one of our 10 Breakthrough Technologies this year, and check out the rest of the list.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 The White House wants Americans to embrace AI
It faces an uphill battle—the US public is mostly pretty gloomy about AI’s impact. (WP $) 
+ What’s next for AI in 2026. (MIT Technology Review)

2 The UN says we’re entering an “era of water bankruptcy” 
And it’s set to affect the vast majority of us on the planet. (Reuters $)
+ Water shortages are fueling the protests in Iran. (Undark)
+ This Nobel Prize–winning chemist dreams of making water from thin air. (MIT Technology Review)

3 How is US science faring after a year of Trump?
Not that well, after proposed budget cuts amounting to $32 billion. (Nature $)
+ The foundations of America’s prosperity are being dismantled. (MIT Technology Review)

4 We need to talk about the early career AI jobs crisis 
Young people are graduating and finding there simply aren’t any roles for them. (NY Mag $)
+ AI companies are fighting to win over teachers. (Axios $)
+ Chinese universities want students to use more AI, not less. (MIT Technology Review)

5 The AI boyfriend business is booming in China
And it’s mostly geared towards Gen Z women. (Wired $)
+ It’s surprisingly easy to stumble into a relationship with an AI chatbot. (MIT Technology Review)

6 Snap has settled a social media addiction lawsuit ahead of a trial 
However, the other defendants, including Meta, TikTok, and YouTube, are still fighting it. (BBC)
+ A new study is going to examine the effects of restricting social media for children. (The Guardian)

7 Here are some of the best ideas of this century so far
From smartphones to HIV drugs, the pace of progress has been dizzying. (New Scientist $)

8 Robots may be on the cusp of becoming very capable
Until now, their role in the world of work has been limited. AI could radically change that. (FT $)
+ Why the humanoid workforce is running late. (MIT Technology Review)

9 Scientists are racing to put a radio telescope on the moon 
If they succeed, it will be able to ‘hear’ all the way back to over 13 billion years ago, just 380,000 years after the big bang. (IEEE Spectrum)
+ Inside the quest to map the universe with mysterious bursts of radio energy. (MIT Technology Review)

10 It turns out cows can use tools
What will we discover next? Flying pigs?! (Futurism)

Quote of the day

“We’re still staggering along, but I don’t know for how much longer. I don’t have the energy any more.”

—A researcher at the National Oceanic and Atmospheric Administration tells Nature they and their colleagues are exhausted by the Trump administration’s attacks on science.  

One more thing

Palmer Luckey on the Pentagon’s future of mixed reality

Palmer Luckey has, in some ways, come full circle.  

His first experience with virtual-reality headsets was as a teenage lab technician at a defense research center in Southern California, studying their potential to curb PTSD symptoms in veterans. He then built Oculus, sold it to Facebook for $2 billion, left Facebook after a highly public ousting, and founded Anduril, which focuses on drones, cruise missiles, and other AI-enhanced technologies for the US Department of Defense. The company is now valued at $14 billion.

Now Luckey is redirecting his energy again, to headsets for the military. He spoke to MIT Technology Review about his plans. Read the full interview.

—James O’Donnell

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ I want to skip around every single one of these beautiful gardens.
+ Your friends help you live longer. Isn’t that nice of them?!
+ Brb, just buying a pharaoh headdress for my cat.
+ Consider this your annual reminder that you don’t need a gym membership or fancy equipment to get fitter.

All anyone wants to talk about at Davos is AI and Donald Trump

By: Mat Honan
21 January 2026 at 06:20

This story first appeared in The Debrief, our subscriber-only newsletter about the biggest news in tech by Mat Honan, Editor in Chief. Subscribe to read the next edition as soon as it lands.

Hello from the World Economic Forum annual meeting in Davos, Switzerland. I’ve been here for two days now, attending meetings, speaking on panels, and basically trying to talk to anyone I can. And as far as I can tell, the only things anyone wants to talk about are AI and Trump. 

Davos is physically defined by the Congress Center, where the official WEF sessions take place, and the Promenade, a street running through the center of the town lined with various “houses”—mostly retailers that are temporarily converted into meeting hubs for various corporate or national sponsors. So there is a Ukraine House, a Brazil House, Saudi House, and yes, a USA House (more on that tomorrow). There are a handful of media houses from the likes of CNBC and the Wall Street Journal. Some houses are devoted to specific topics; for example, there’s one for science and another for AI. 

But like everything else in 2026, the Promenade is dominated by tech companies. At one point I realized that literally everything I could see, in a spot where the road bends a bit, was a tech company house. Palantir, Workday, Infosys, Cloudflare, C3.ai. Maybe this should go without saying, but their presence, both in the houses and on the various stages and parties and platforms here at the World Economic Forum, really drove home to me how utterly and completely tech has captured the global economy. 

While the houses host events and serve as networking hubs, the big show is inside the Congress Center. On Tuesday morning, I kicked off my official Davos experience there by moderating a panel with the CEOs of Accenture, Aramco, Royal Philips, and Visa. The topic was scaling up AI within organizations. All of these leaders represented companies that have gone from pilot projects to large internal implementations. It was, for me, a fascinating conversation. You can watch the whole thing here, but my takeaway was that while there are plenty of stories about AI being overhyped (including from us), it is certainly having substantive effects at large companies.  

Aramco CEO Amin Nasser, for example, described how that company has found $3 billion to $5 billion in cost savings by improving the efficiency of its operations. Royal Philips CEO Roy Jakobs described how it was allowing health-care practitioners to spend more time with patients by doing things such as automated note-taking. (This really resonated with me, as my wife is a pediatrics nurse, and for decades now I’ve heard her talk about how much of her time is devoted to charting.) And Visa CEO Ryan McInerney talked about his company’s push into agentic commerce and the way that will play out for consumers, small businesses, and the global payments industry. 

To elaborate a little on that point, McInerney painted a picture of commerce where agents won’t just shop for things you ask them to, which will be basically step one, but will eventually be able to shop for things based on your preferences and previous spending patterns. This could be your regular grocery shopping, or even a vacation getaway. That’s going to require a lot of trust and authentication to protect both merchants and consumers, but it is clear that the steps into agentic commerce we saw in 2025 were just baby ones. There are much bigger ones coming for 2026. (Coincidentally, I had a discussion with a senior executive from Mastercard on Monday, who made several of the same points.) 

But the thing that really resonated with me from the panel was a comment from Accenture CEO Julie Sweet, who has a view not only of her own large org but across a spectrum of companies: “It’s hard to trust something until you understand it.” 

I felt that neatly summed up where we are as a society with AI. 

Clearly, other people feel the same. Before the official start of the conference I was at AI House for a panel. The place was packed. There was a consistent, massive line to get in, and once inside, I literally had to muscle my way through the crowd. Everyone wanted to get in. Everyone wanted to talk about AI. 

(A quick aside on what I was doing there: I sat on a panel called “Creativity and Identity in the Age of Memes and Deepfakes,” led by Atlantic CEO Nicholas Thompson; it featured the artist Emi Kusano, who works with AI, and Duncan Crabtree-Ireland, the chief negotiator for SAG-AFTRA, who has been at the center of a lot of the debates about AI in the film and gaming industries. I’m not going to spend much time describing it because I’m already running long, but it was a rip-roarer of a panel. Check it out.)

And, okay. Sigh. Donald Trump. 

The president is due here Wednesday, amid threats of seizing Greenland and fears that he’s about to permanently fracture the NATO alliance. While AI is all over the stages, Trump is dominating all the side conversations. There are lots of little jokes. Nervous laughter. Outright anger. Fear in the eyes. It’s wild. 

These conversations are also starting to spill out into the public. Just after my panel on Tuesday, I headed to a pavilion outside the main hall in the Congress Center. I saw someone coming down the stairs with a small entourage, who was suddenly mobbed by cameras and phones. 

Moments earlier in the same spot, the press had been surrounding David Beckham, shouting questions at him. So I was primed for it to be another celebrity—after all, captains of industry were everywhere you looked. I mean, I had just bumped into Eric Schmidt, who was literally standing in line in front of me at the coffee bar. Davos is weird. 

But in fact, it was Gavin Newsom, the governor of California, who is increasingly seen as the leading voice of the Democratic opposition to President Trump, and a likely contender, or even front-runner, in the race to replace him. Because I live in San Francisco I’ve encountered Newsom many times, dating back to his early days as a city supervisor before he was even mayor. I’ve rarely, rarely, seen him quite so worked up as he was on Tuesday. 

Among other things, he called Trump a narcissist who follows “the law of the jungle, the rule of Don” and compared him to a T-Rex, saying, “You mate with him or he devours you.” And he was just as harsh on the world leaders, many of whom are gathered in Davos, calling them “pathetic” and saying he should have brought knee pads for them. 

Yikes.

There was more of this sentiment, if in more measured tones, from Canadian prime minister Mark Carney during his address at Davos. While I missed his remarks, they had people talking. “If we’re not at the table, we’re on the menu,” he argued. 

Reimagining ERP for the agentic AI era

The story of enterprise resource planning (ERP) is really a story of businesses learning to organize themselves around the latest, greatest technology of the times. In the 1960s through the ’80s, mainframes, material requirements planning (MRP), and manufacturing resource planning (MRP II) brought core business data from file cabinets to centralized systems. Client-server architectures defined the ’80s and ’90s, taking digitization mainstream during the internet’s infancy. And in the 21st century, as work moved beyond the desktop, SaaS and cloud ushered in flexible access and elastic infrastructure.

The rise of composability and agentic AI marks yet another dawn—and an apt one for the nascent intelligence age. Composable architectures let organizations assemble capabilities from multiple systems in a mix-and-match fashion, so they can swap vendor lock-in for an à la carte portfolio of fit-for-purpose modules. On top of that architectural shift, agentic AI enables coordination across systems that weren’t originally designed to talk to one another.
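As a rough illustration of what that orchestration layer might look like in code (a sketch only; the module and method names below are invented for the example and do not correspond to any vendor’s product), an agent can coordinate two independently chosen modules that were never designed to talk to each other:

```python
# Sketch of an agentic orchestration layer coordinating two independently chosen modules.
# The module and method names are invented for the example, not any vendor's product.
class FinanceModule:
    def open_invoice(self, supplier: str, amount: float) -> str:
        return f"INV/{supplier}/{amount:.2f}"

class LogisticsModule:
    def schedule_delivery(self, supplier: str) -> str:
        return f"DELIVERY/{supplier}"

class ProcurementAgent:
    """Coordinates a multi-step purchase across systems that don't know about each other."""

    def __init__(self, finance: FinanceModule, logistics: LogisticsModule):
        self.finance = finance
        self.logistics = logistics

    def purchase(self, supplier: str, amount: float) -> dict:
        # Each step hits a different fit-for-purpose module; the agent owns the workflow.
        invoice = self.finance.open_invoice(supplier, amount)
        delivery = self.logistics.schedule_delivery(supplier)
        return {"invoice": invoice, "delivery": delivery}

agent = ProcurementAgent(FinanceModule(), LogisticsModule())
print(agent.purchase("Acme Metals", 1250.0))
```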

Early indicators suggest that AI-enabled ERP will yield meaningful performance gains: One 2024 study found that organizations implementing AI-driven ERP solutions stand to gain around a 30% boost in user satisfaction and a 25% lift in productivity; another suggested that AI-driven ERP can lead to processing time savings of up to 45%, as well as improvements in decision accuracy to the tune of 60%.

These dual advancements deliver what previous ERP eras fell short of providing: freedom to innovate outside of vendor roadmaps, capacity for rapid iteration, and true interoperability across all critical functions. This shift signals the end of monolithic dependency as well as a once-in-a-generation opportunity for early movers to gain a competitive edge.

Key takeaways include:

  • Enterprises are moving away from monolithic ERP vendor upgrades in favor of modular architectures that allow them to change or modernize components independently while keeping a stable core for essential transactions.
  • Agentic AI is a timely complement to composability, functioning as a UX and orchestration layer that can coordinate workflows across disparate systems and turn multi-step processes into automated, cross-platform operations.
  • These dual shifts are finally enabling technology architecture to organize around the business, instead of the business around the ERP. Companies can modernize by reconfiguring and extending what they already have, rather than relying on ERP-centric upgrades.

Download the report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff. It was researched, designed, and written by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

The era of agentic chaos and how data will save us

20 January 2026 at 10:00

AI agents are moving beyond coding assistants and customer service chatbots into the operational core of the enterprise. The ROI is promising, but autonomy without alignment is a recipe for chaos. Business leaders need to lay the essential foundations now.

The agent explosion is coming

Agents are independently handling end-to-end processes across lead generation, supply chain optimization, customer support, and financial reconciliation. A mid-sized organization could easily run 4,000 agents, each making decisions that affect revenue, compliance, and customer experience. 

The transformation toward an agent-driven enterprise is inevitable. The economic benefits are too significant to ignore, and the potential is becoming a reality faster than most predicted. The problem? Most businesses and their underlying infrastructure are not prepared for this shift. Early adopters have found unlocking AI initiatives at scale to be extremely challenging. 

The reliability gap that’s holding AI back

Companies are investing heavily in AI, but the returns aren’t materializing. According to recent research from Boston Consulting Group, 60% of companies report minimal revenue and cost gains despite substantial investment. The leaders, however, reported revenue increases five times as large and cost reductions three times as large as the rest. Clearly, there is a massive premium for being a leader. 

What separates the leaders from the pack isn’t how much they’re spending or which models they’re using. Before scaling AI deployment, these “future-built” companies put critical data infrastructure capabilities in place. They invested in the foundational work that enables AI to function reliably. 

A framework for agent reliability: The four quadrants

To understand how and where enterprise AI can fail, consider four critical quadrants: models, tools, context, and governance.

Take a simple example: an agent that orders you pizza. The model interprets your request (“get me a pizza”). The tool executes the action (calling the Domino’s or Pizza Hut API). Context provides personalization (you tend to order pepperoni on Friday nights at 7pm). Governance validates the outcome (did the pizza actually arrive?). 

Each dimension represents a potential failure point:

  • Models: The underlying AI systems that interpret prompts, generate responses, and make predictions
  • Tools: The integration layer that connects AI to enterprise systems, such as APIs, protocols, and connectors 
  • Context: The information agents need to understand the full business picture before making decisions, including customer histories, product catalogs, and supply chain networks
  • Governance: The policies, controls, and processes that ensure data quality, security, and compliance

This framework helps diagnose where reliability gaps emerge. When an enterprise agent fails, which quadrant is the problem? Is the model misunderstanding intent? Are the tools unavailable or broken? Is the context incomplete or contradictory? Or is there no mechanism to verify that the agent did what it was supposed to do?
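A toy version of the pizza example makes the four failure points easy to see. This is a sketch under obvious simplifying assumptions: the function names are invented, and a real agent would call an actual LLM and delivery API rather than the stand-ins here.

```python
# Toy walk-through of the four quadrants using the pizza example above.
# The function names and the simulated API call are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Order:
    item: str
    delivered: bool = False

def model_interpret(prompt: str) -> str:
    """Quadrant 1, the model: turn a natural-language request into an intent."""
    return "order_pizza" if "pizza" in prompt.lower() else "unknown"

def context_personalize(intent: str, history: dict) -> str:
    """Quadrant 3, context: enrich the intent with what the business knows about the user."""
    return history.get("usual_order", "plain cheese pizza") if intent == "order_pizza" else ""

def tool_execute(item: str) -> Order:
    """Quadrant 2, tools: call the external system that actually places the order."""
    return Order(item=item, delivered=True)  # a real agent would call a delivery API here

def governance_validate(order: Order) -> bool:
    """Quadrant 4, governance: verify the outcome before declaring success."""
    return order.delivered and bool(order.item)

history = {"usual_order": "pepperoni pizza"}  # e.g. Friday-night ordering habits
intent = model_interpret("Get me a pizza")
item = context_personalize(intent, history)
order = tool_execute(item)
print("success" if governance_validate(order) else "failure: trace which quadrant broke")
```

When an agent misbehaves, tracing which of these four steps produced the wrong value is exactly the diagnosis the framework calls for.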

Why this is a data problem, not a model problem

The temptation is to think that reliability will simply improve as models improve. And indeed, model capability is advancing exponentially. The cost of inference has dropped nearly 900-fold in three years, hallucination rates are on the decline, and AI’s capacity to perform long tasks doubles every six months.

Tooling is also accelerating. Integration frameworks like the Model Context Protocol (MCP) make it dramatically easier to connect agents with enterprise systems and APIs.

If models are powerful and tools are maturing, then what is holding back adoption?

To borrow from James Carville, “It is the data, stupid.” The root cause of most misbehaving agents is misaligned, inconsistent, or incomplete data.

Enterprises have accumulated data debt over decades. Acquisitions, custom systems, departmental tools, and shadow IT have left data scattered across silos that rarely agree. Support systems do not match what is in marketing systems. Supplier data is duplicated across finance, procurement, and logistics. Locations have multiple representations depending on the source.

Drop a few agents into this environment, and they will perform wonderfully at first, because each one is given a curated set of systems to call. Add more agents and the cracks grow, as each one builds its own fragment of truth.

This dynamic has played out before. When business intelligence became self-serve, everyone started creating dashboards. Productivity soared, but reports failed to match one another. Now imagine that phenomenon not in static dashboards but in AI agents that can take action. With agents, data inconsistency produces real business consequences, not just debates among departments.

Companies that build unified context and robust governance can deploy thousands of agents with confidence, knowing they’ll work together coherently and comply with business rules. Companies that skip this foundational work will watch their agents produce contradictory results, violate policies, and ultimately erode trust faster than they create value.

Leverage agentic AI without the chaos 

The question for enterprises centers on organizational readiness. Will your company prepare the data foundation needed to make agent transformation work? Or will you spend years debugging agents, one issue at a time, forever chasing problems that originate in infrastructure you never built?

Autonomous agents are already transforming how work gets done. But the enterprise will only experience the upside if those systems operate from the same truth. This ensures that when agents reason, plan, and act, they do so based on accurate, consistent, and up-to-date information. 

The companies generating value from AI today have built on fit-for-purpose data foundations. They recognized early that in an agentic world, data functions as essential infrastructure. A solid data foundation is what turns experimentation into dependable operations.

At Reltio, the focus is on building that foundation. The Reltio data management platform unifies core data from across the enterprise, giving every agent immediate access to the same business context. This unified approach enables enterprises to move faster, act smarter, and unlock the full value of AI.

Agents will define the future of the enterprise. Context intelligence will determine who leads it.

For leaders navigating this next wave of transformation, see Reltio’s practical guide:
Unlocking Agentic AI: A Business Playbook for Data Readiness. Get your copy now to learn how real-time context becomes the decisive advantage in the age of intelligence. 

The UK government is backing AI that can run its own lab experiments

20 January 2026 at 08:28

A number of startups and university teams that are building “AI scientists” to design and run experiments in the lab, including robot biologists and chemists, have just won extra funding from the UK government agency that supports moonshot R&D. The competition, set up by ARIA (the Advanced Research and Invention Agency), gives a clear sense of how fast this technology is moving: The agency received 245 proposals from researchers who are already developing tools that can automate significant stretches of lab work.

ARIA defines an AI scientist as a system that can run an entire scientific workflow, coming up with hypotheses, designing and running experiments to test those hypotheses, and then analyzing the results. In many cases, the system may then feed those results back into itself and run the loop again and again. Human scientists become overseers, coming up with the initial research questions and then letting the AI scientist get on with the grunt work.
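In code, the workflow ARIA describes is essentially a closed loop. The sketch below is a deliberately simplified illustration, not any funded team’s system: the ideation and lab-execution steps are stand-in functions where a real system would call an LLM and an automated lab.

```python
# Simplified closed-loop "AI scientist" workflow: hypothesize, run, analyze, repeat.
# propose_hypothesis and design_and_run_experiment are stand-ins (assumptions);
# a real system would call an LLM and an automated lab at those points.
import random

def propose_hypothesis(findings: list) -> str:
    """Stand-in for an LLM ideation step that reads the findings gathered so far."""
    return f"hypothesis_{len(findings) + 1}"

def design_and_run_experiment(hypothesis: str) -> float:
    """Stand-in for an automated lab run; returns a measured outcome for the hypothesis."""
    return random.random()

def analyze(hypothesis: str, result: float) -> str:
    """Turn the raw measurement into a finding the next iteration can build on."""
    verdict = "supported" if result > 0.5 else "not supported"
    return f"{hypothesis}: {verdict} (score={result:.2f})"

findings = []
for _ in range(5):  # the loop a human scientist oversees rather than runs by hand
    hypothesis = propose_hypothesis(findings)
    result = design_and_run_experiment(hypothesis)
    findings.append(analyze(hypothesis, result))

print("\n".join(findings))
```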

“There are better uses for a PhD student than waiting around in a lab until 3 a.m. to make sure an experiment is run to the end,” says Ant Rowstron, ARIA’s chief technology officer. 

ARIA picked 12 projects to fund from the 245 proposals, doubling the amount of funding it had intended to allocate because of the large number and high quality of submissions. Half the teams are from the UK; the rest are from the US and Europe. Some of the teams are from universities, some from industry. Each will get around £500,000 (around $675,000) to cover nine months’ work. At the end of that time, they should be able to demonstrate that their AI scientist was able to come up with novel findings.

Winning teams include Lila Sciences, a US company that is building what it calls an AI nano-scientist—a system that will design and run experiments to discover the best ways to compose and process quantum dots, which are nanometer-scale semiconductor particles used in medical imaging, solar panels, and QLED TVs.

“We are using the funds and time to prove a point,” says Rafa Gómez-Bombarelli, chief science officer for physical sciences at Lila: “The grant lets us design a real AI robotics loop around a focused scientific problem, generate evidence that it works, and document the playbook so others can reproduce and extend it.”

Another team, from the University of Liverpool, UK, is building a robot chemist, which runs multiple experiments at once and uses a vision language model to help troubleshoot when the robot makes an error.

And a startup based in London, still in stealth mode, is developing an AI scientist called ThetaWorld, which is using LLMs to design experiments on the physical and chemical interactions that are important for the performance of batteries. Those experiments will then be run in an automated lab.

Taking the temperature

Compared with the £5 million projects spanning two or three years that ARIA usually funds, £500,000 is small change. But that was the idea, says Rowstron: It’s an experiment on ARIA’s part too. By funding a range of projects for a short amount of time, the agency is taking the temperature at the cutting edge to determine how, and how fast, the way science is done is changing. What it learns will become the baseline for funding future large-scale projects.

Rowstron acknowledges there’s a lot of hype, especially now that most of the top AI companies have teams focused on science. When results are shared by press release and not peer review, it can be hard to know what the technology can and can’t do. “That’s always a challenge for a research agency trying to fund the frontier,” he says. “To do things at the frontier, we’ve got to know what the frontier is.”

For now, the cutting edge involves agentic systems calling up other existing tools on the fly. “They’re running things like large language models to do the ideation, and then they use other models to do optimization and run experiments,” says Rowstron. “And then they feed the results back round.”

Rowstron sees the technology stacked in tiers. At the bottom are AI tools designed by humans for humans, such as AlphaFold. These tools let scientists leapfrog slow and painstaking parts of the scientific pipeline but can still require many months of lab work to verify results. The idea of an AI scientist is to automate that work too.  

An AI scientist sits in a layer above those human-made tools and calls on them as needed, says Rowstron. “There’s a point in time—and I don’t think it’s a decade away—where that AI scientist layer says, ‘I need a tool and it doesn’t exist,’ and it will actually create an AlphaFold kind of tool just on the way to figuring out how to solve another problem. That whole bottom zone will just be automated.”

Going off the rails

But we’re not there yet, he says. All the projects ARIA is now funding involve systems that call on existing tools rather than spin up new ones.

There are also unsolved problems with agentic systems in general, which limit how long they can run by themselves without going off the rails and making errors. For example, a study titled “Why LLMs aren’t scientists yet,” posted online last week by researchers at Lossfunk, an AI lab based in India, reports that in an experiment to get LLM agents to run a scientific workflow to completion, the system failed three out of four times. According to the researchers, the reasons the LLMs broke down included “deviation from original research specifications toward simpler, more familiar solutions” and “overexcitement that declares success despite obvious failures.”

“Obviously, at the moment these tools are still fairly early in their cycle and these things might plateau,” says Rowstron. “I’m not expecting them to win a Nobel Prize.”

“But there is a world where some of these tools will force us to operate so much quicker,” he continues. “And if we end up in that world, it’s super important for us to be ready.”

The Download: digitizing India, and scoring embryos

20 January 2026 at 08:13

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

The man who made India digital isn’t done yet

Nandan Nilekani can’t stop trying to push India into the future. He began nearly 30 years ago, masterminding an ongoing experiment in technological state capacity that started with Aadhaar—the world’s largest digital identity system.

Using Aadhaar as the bedrock, Nilekani and people working with him went on to build a sprawling collection of free, interoperating online tools that add up to nothing less than a digital infrastructure for society, covering government services, banking, and health care. They offer convenience and access that would be eye-popping in wealthy countries a tenth of India’s size. 

At 70 years old, Nilekani should be retired. But he has a few more ideas. Read our profile to learn about what he’s set his sights on next.

—Edd Gent

Embryo scoring is slowly becoming more mainstream

Many Americans agree that it’s acceptable to screen embryos for severe genetic diseases. Far fewer say it’s okay to test for characteristics related to a future child’s appearance, behavior, or intelligence. But a few startups are now advertising what they claim is a way to do just that.

This new kind of testing—which can cost up to $50,000—is incredibly controversial. Nevertheless, the practice has grown popular in Silicon Valley, and it’s becoming more widely available to everyone. Read the full story

—Julia Black

Embryo scoring is one of our 10 Breakthrough Technologies this year. Check out what else made the list, and scroll down to vote for the technology you think deserves the 11th slot.

Five AI predictions for 2026

What will surprise us most about AI in 2026?

Tune in at 12:30 p.m. today to hear me, our senior AI editor Will Douglas Heaven, and senior AI reporter James O’Donnell discuss our “5 AI Predictions for 2026.” This special LinkedIn Live event will explore the trends that are poised to transform the next twelve months of AI. The conversation will also offer a first glimpse at EmTech AI 2026, MIT Technology Review’s longest-running AI event for business leadership. Sign up to join us later today!

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Europe is trying to build its own DeepSeek
That’s been a goal for a while, but US hostility is making those efforts newly urgent. (Wired $)
Plenty of Europeans want to wean themselves off US technology. That’s easier said than done. (New Scientist $)
DeepSeek may have found a new way to improve AI’s ability to remember. (MIT Technology Review $)

2 Ship-tracking data shows China is creating massive floating barriers
The maneuvers show that Beijing can now rapidly muster large numbers of boats in disputed seas. (NYT $)
Quantum navigation could solve the military’s GPS jamming problem. (MIT Technology Review)

3 The AI bubble risks disrupting the global economy, says the IMF
But it’s hard to see anyone pumping the brakes any time soon. (FT $)
British politicians say the UK is being exposed to ‘serious harm’ by AI risks. (The Guardian)
What even is the AI bubble? (MIT Technology Review)

4 Cryptocurrencies are dying in record numbers
In an era of one-off joke coins and pump-and-dump scams, that’s surely a good thing. (Gizmodo)
President Trump has pardoned a lot of people who’ve committed financial crimes. (NBC)

5 Threads has more global daily mobile users than X now
And once-popular alternative Bluesky barely even makes the charts. (Forbes)

6 The UK is considering banning under-16s from social media
Just weeks after a similar ban took effect in Australia. (BBC)

7 You can burn yourself out with AI coding agents 
They could be set to make experienced programmers busier than ever before. (Ars Technica)
Why Anthropic’s Claude Code is taking the AI world by storm. (WSJ $)
AI coding is now everywhere. But not everyone is convinced. (MIT Technology Review)

8 Some tech billionaires are leaving California 👋
Not all, though—the founders of Nvidia and Airbnb say they’ll stay and pay the 5% wealth tax. (WP $)
Tech bosses’ support for Trump is paying off for them big time. (FT $)

9 Matt Damon says Netflix tells directors to repeat movie plots
To accommodate all the people using their phones. (NME)

10 Why more people are going analog in 2026 🧶
Crafting, reading, and other screen-free hobbies are on the rise. (CNN)
Dumbphones are becoming popular too—but it’s worth thinking hard before you switch. (Wired $)

Quote of the day

“It may sound like American chauvinism…and it is. We’re done apologising about that.”

—Thomas Dans, a Trump appointee who heads the US Arctic Research Commission, tells the FT his boss is deadly serious about acquiring Greenland. 

One more thing

Inside the fierce, messy fight over “healthy” sugar tech

On the outskirts of Charlottesville, Virginia, a new kind of sugar factory is taking shape. The facility is being developed by a startup called Bonumose. It uses a processed corn product called maltodextrin that is found in many junk foods and is calorically similar to table sugar (sucrose). 

But for Bonumose, maltodextrin isn’t an ingredient—it’s a raw material. When it’s poured into the company’s bioreactors, what emerges is tagatose. Found naturally in small concentrations in fruit, some grains, and milk, tagatose is nearly as sweet as sucrose but apparently has only around half the calories, along with wider health benefits.

Bonumose’s process originated in a company spun out of the Virginia Tech lab of Yi-Heng “Percival” Zhang. When MIT Technology Review spoke to Zhang, he was sitting alone in an empty lab in Tianjin, China, after serving a two-year sentence of supervised release in Virginia for conspiracy to defraud the US government, making false statements, and obstruction of justice. If sugar is the new oil, the global battle to control it has already begun. Read the full story

—Mark Harris

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Paul Mescal just keeps getting cooler.
+ Make this year calmer with these evidence-backed tips. ($)
+ I can confirm that Lumie wake-up lamps really are worth it (and no one paid me to say so!).
+ There are some real gems in Green Day bassist Mike Dirnt’s list of favorite albums.
