Today we will take a look at WhatsApp forensics. WhatsApp is one of those apps that is both private and routine for many users. People treat chats as private conversations, and because the app feels comfortable, they often share things there that they would not say on public social networks. That's why WhatsApp is so critical for digital forensics. The app stores conversations, media, timestamps, group membership information and metadata that can help reconstruct events, identify contacts and corroborate timelines in criminal and cyber investigations.
At Hackers-Arise we offer professional digital forensics services that support cybercrime investigations and fraud examinations. The goal of WhatsApp forensics is to find reliable evidence. The data recovered from a device can show who communicated with whom, when messages were sent and received, what media was exchanged, and often which account owned the device. That information is used to link suspects and verify statements. Combined with location artifacts, it can also help map movements, giving investigators and prosecutors evidence they can trust.
You will see how WhatsApp keeps its data on different platforms and what those files contain.
WhatsApp Artifacts on Android Devices
On Android, WhatsApp stores most of its private application data inside the device's user data area. In a typical layout you will find the app's files under a path such as /data/data/com.whatsapp/ (or equivalently /data/user/0/com.whatsapp/ on many devices). Those directories are not normally accessible without elevated privileges. To read them directly you will usually need superuser (root) access on the device or a physical dump of the file system obtained through lawful and technically appropriate means. If you do not have root or a physical image, your options are restricted to logical backups or other extraction methods which may not expose the private WhatsApp databases.
Source: Group-IB
Two files deserve immediate attention on Android: wa.db and msgstore.db. Both are SQLite databases and together they form the core of WhatsApp evidence.
wa.db is the contacts database. It lists the WhatsApp user's contacts and typically contains phone numbers, display names, status strings, timestamps for when contacts were created or changed, and other registration metadata. You will usually open the file with a SQLite browser or query it with sqlite3 to inspect tables. The key tables investigators look for are the table that stores contact records (often named wa_contacts or similar), sqlite_sequence which holds auto-increment counts and gives you a sense of scale, and android_metadata which contains localization info such as the app language.
In short, wa.db is the address book for WhatsApp. It has names, numbers and a little context for each contact.
msgstore.db is the message store. This database contains sent and received messages, timestamps, message status, sender and receiver identifiers, and references to media files. In many WhatsApp versions you will find tables that include a general information table (often named sqlite_sequence), a full-text index table for message content (message_fts_content or similar), the main messages table which usually contains the message body and metadata, messages_thumbnails which catalogs images and their timestamps, and a chat_list table that stores conversation entries.
Be aware that WhatsApp evolves and field names change between versions. Newer schema versions may include extra fields such as media_enc_hash, edit_version, or payment_transaction_id. Always inspect the schema before you rely on a specific field name.
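One way to inspect the schema before relying on a field name is to list every table and its columns. The following is a minimal Python sketch, assuming you are working on a copy of the database; the function name describe_schema is hypothetical, and the file is opened read-only so the copy is not altered.

```python
import sqlite3


def describe_schema(db_path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    # mode=ro opens the file read-only, so the evidence copy is never modified.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
        schema = {}
        for table in tables:
            quoted = table.replace('"', '""')
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
            cols = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table] = [col[1] for col in cols]
        return schema
    finally:
        conn.close()
```

Calling describe_schema on a copy of msgstore.db and checking whether a column such as media_enc_hash or edit_version is actually present lets you adapt queries to the schema version in front of you.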
On many Android devices WhatsApp also keeps encrypted backups in a public storage location, typically under /data/media/0/WhatsApp/Databases/ (the virtual SD card) or /mnt/sdcard/WhatsApp/Databases/ for physical SD cards. Those backup files look like msgstore.db.cryptXX, where XX indicates the cryptographic scheme version.
The msgstore.db.cryptXX files are an encrypted copy of msgstore.db intended for device backups. To decrypt them you need a cryptographic key that WhatsApp stores privately on the device, usually somewhere like /data/data/com.whatsapp/files/. Without that key, those encrypted backups are not readable.
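Before attempting any decryption it helps to know which backup files exist and which scheme version each one uses. The sketch below only inventories files in an extracted Databases folder and reads the version from the extension; it does not decrypt anything. The function name is hypothetical, and the pattern also accepts names with extra text between "msgstore" and ".db" (for example dated copies), which is an assumption rather than something stated above.

```python
import re
from pathlib import Path

# Matches msgstore.db.cryptXX; the ".*" tolerates extra text before ".db" (assumption).
CRYPT_RE = re.compile(r"^msgstore.*\.db\.crypt(\d+)$")


def find_encrypted_backups(databases_dir):
    """Return sorted (filename, crypt_version) pairs found in a Databases folder."""
    results = []
    for entry in Path(databases_dir).iterdir():
        match = CRYPT_RE.match(entry.name)
        if entry.is_file() and match:
            results.append((entry.name, int(match.group(1))))
    return sorted(results)
```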
Other important Android files and directories to examine include the preferences and registration XMLs in /data/data/com.whatsapp/shared_prefs/. The file com.whatsapp_preferences.xml often contains profile details and configuration values. A fragment of such a file may show the phone number associated with the account, the app version, a profile message such as "Hey there! I am using WhatsApp." and the account display name. The registration.RegisterPhone.xml file typically contains registration metadata like the phone number and regional format.
The axolotl.db file in /data/data/com.whatsapp/databases/ holds cryptographic keys (used in the Signal/Double Ratchet protocol implementation) and account identification data. chatsettings.db contains app settings. Logs are kept under /data/data/com.whatsapp/files/Logs/ and may include whatsapp.log as well as compressed rotated backups like whatsapp-YYYY-MM-DD.1.log.gz. These logs can reveal app activity and errors that may be useful for timing or troubleshooting analysis.
Media is often stored in the media tree on internal or external storage:
/data/media/0/WhatsApp/Media/WhatsApp Images/ for images,
/data/media/0/WhatsApp/Media/WhatsApp Voice Notes/ for voice messages (usually Opus format),
and similarly named folders for other media types: WhatsApp Audio, WhatsApp Video, and WhatsApp Profile Photos.
Within the app's private area you may also find cached profile pictures under /data/data/com.whatsapp/cache/Profile Pictures/ and avatar thumbnails under /data/data/com.whatsapp/files/Avatars/. Some avatar thumbnails use a .j extension while actually being JPEG files. Always validate file signatures rather than trusting extensions.
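Signature validation can be as simple as comparing the first bytes of a file against known magic numbers. This is a minimal Python sketch that recognizes only JPEG and PNG; the function name sniff_image_type is hypothetical, and a real examination would normally rely on a fuller signature database.

```python
JPEG_MAGIC = b"\xff\xd8\xff"        # start-of-image marker found in JPEG files
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"    # fixed 8-byte PNG signature


def sniff_image_type(path):
    """Return 'jpeg', 'png' or None based on the file header, ignoring the extension."""
    with open(path, "rb") as fh:
        header = fh.read(8)
    if header.startswith(JPEG_MAGIC):
        return "jpeg"
    if header.startswith(PNG_MAGIC):
        return "png"
    return None
```

Run against an avatar thumbnail with a .j extension, a result of "jpeg" confirms the file is an ordinary JPEG despite its name.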
If the device uses an SD card, a WhatsApp directory at the card's root may store copies of shared files (/mnt/sdcard/WhatsApp/.Share/), a trash folder for deleted content (/mnt/sdcard/WhatsApp/.trash/), and the Databases subdirectory with encrypted backups and media subfolders mirroring those on internal storage. The presence of deleted files or .trash folders can be a fruitful source of recovered media.
A key complication on Android is manufacturer or custom-ROM behavior. Some vendors add features that change where app data is stored. For example, certain Xiaomi phones implement a "Second Space" feature that creates a second user workspace. WhatsApp in the second workspace stores its data under a different user ID path such as /data/user/10/com.whatsapp/databases/wa.db rather than the usual /data/user/0/com.whatsapp/databases/wa.db.
Because these layouts evolve and change, you need to validate the actual paths on the target device rather than assuming standard locations.
WhatsApp Artifacts on iOS Devices
On iOS, WhatsApp tends to centralize its data into a few places and is commonly accessible via device backups. The main application database is often ChatStorage.sqlite located under a shared group container such as /private/var/mobile/Applications/group.net.whatsapp.WhatsApp.shared/ (some forensic tools display this as AppDomainGroup-group.net.whatsapp.WhatsApp.shared).
Within ChatStorage.sqlite the most informative tables are often ZWAMESSAGE, which stores message records, and ZWAMEDIAITEM, which stores metadata for attachments and media items. Other tables like ZWAPROFILEPUSHNAME and ZWAPROFILEPICTUREITEM map WhatsApp identifiers to display names and avatars. The table Z_PRIMARYKEY typically contains general database metadata such as record counts.
iOS also places supporting files in the group container. BackedUpKeyValue.sqlite can contain cryptographic keys and data useful for identifying account ownership. ContactsV2.sqlite stores contact details: names, phone numbers, profile statuses and WhatsApp IDs. A simple text file like consumer_version may hold the app version, and current_wallpaper.jpg (or wallpaper in older versions) contains the background image used in WhatsApp chats. The blockedcontacts.dat file lists blocked numbers, and pw.dat can hold an encrypted password. Preference plists such as net.whatsapp.WhatsApp.plist or group.net.whatsapp.WhatsApp.shared.plist store profile settings.
Media thumbnails, avatars and message media are stored under paths like /private/var/mobile/Applications/group.net.whatsapp.WhatsApp.shared/Media/Profile/ and /private/var/mobile/Applications/group.net.whatsapp.WhatsApp.shared/Message/Media/. WhatsApp logs, for example calls.log and calls.backup.log, often survive in the Documents or Library/Logs folders and can help establish call activity.
Because iOS devices are frequently backed up through iTunes or Finder, you can often extract WhatsApp artefacts from a device backup rather than needing a full file system image. If the backup is unencrypted and complete, it may include the ChatStorage.sqlite file and associated media. If the backup is encrypted you will need the backup password or legal access methods to decrypt it. In practice, many investigators create a forensic backup and then examine the WhatsApp databases with a SQLite viewer or a specialized forensic tool that understands WhatsApp schema differences across versions.
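When working from an iTunes or Finder backup folder, files are not stored under their original names. The sketch below rests on an assumption not stated above: that the backup names each file with the SHA-1 hash of "domain-relativePath" and shards files into folders named after the first two hex characters of that hash, as in the commonly documented layout of recent backups. The function names, the example backup root, and the relative path "ChatStorage.sqlite" are illustrative; verify the result against the backup's own manifest on your evidence.

```python
import hashlib
from pathlib import Path


def backup_file_id(domain, relative_path):
    """SHA-1 of 'domain-relativePath' (assumed naming scheme inside the backup)."""
    return hashlib.sha1(f"{domain}-{relative_path}".encode("utf-8")).hexdigest()


def backup_file_path(backup_root, domain, relative_path):
    """Build the expected on-disk location of a file inside a backup folder."""
    file_id = backup_file_id(domain, relative_path)
    # Assumed layout: <backup_root>/<first two hex chars>/<full hash>
    return Path(backup_root) / file_id[:2] / file_id


# Domain string as displayed by some forensic tools for the WhatsApp group container.
WHATSAPP_DOMAIN = "AppDomainGroup-group.net.whatsapp.WhatsApp.shared"
print(backup_file_path("/cases/001/backup", WHATSAPP_DOMAIN, "ChatStorage.sqlite"))
```

This only locates the file. If the backup is encrypted, the located file still has to be decrypted with the backup password before a SQLite viewer can open it.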
Practical Notes For Beginners
From the databases and media files described above you can recover contact lists, full or partial chat histories, timestamps in epoch format (commonly Unix epoch in milliseconds on Android), message status (sent, delivered, read), media filenames and hashes, group membership, profile names and avatars, blocked contacts, and even application logs and version metadata. Together these artifacts help establish who communicated with whom, when messages were exchanged, whether media were transferred, and which accounts were configured on the device.
For beginners, a few practical cautions are important to keep in mind. First, always operate on forensic images or copies of extracted files. Do not work directly on the live device unless you are performing an approved, controlled acquisition and you have documented every action. Second, use reliable forensic tools to open SQLite databases. If you are parsing fields manually, confirm timestamp formats and time zones. Third, encrypted backups require the device's key to decrypt. The key is usually stored in the private application area on Android, and without it you cannot decode the .cryptXX files. Fourth, deleted chats and files are not always gone, as databases may leave records or media may remain in caches or on external storage. Yet recovery is never guaranteed and depends on many factors including the time since deletion and subsequent device activity.
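As an illustration of the timestamp caution, here is a small Python sketch that converts a Unix epoch value in milliseconds into a timezone-aware UTC datetime. It assumes the field really is Unix epoch milliseconds, which you should confirm for the specific database and column; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(ms):
    """Convert Unix epoch milliseconds to a timezone-aware UTC datetime."""
    # timedelta arithmetic avoids float rounding on large millisecond values.
    return UNIX_EPOCH + timedelta(milliseconds=ms)


print(epoch_ms_to_utc(1700000000000).isoformat())  # 2023-11-14T22:13:20+00:00
```

Keeping values in UTC and converting to local time only in the report makes it easier to explain any offset between the device time zone and the examiner's.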
When you review message tables, map the message ID fields to media references carefully. Many WhatsApp versions use separate tables for media items where the actual file is referenced by a media ID or filename. Thumbnail tables and media directories will help you reconstruct the link between a textual message and the file that accompanied it. Pay attention to the presence of additional fields in newer app versions. These may contain payment IDs, edit history or encryption metadata. Adapt your queries accordingly.
Finally, because WhatsApp and operating systems change over time, always inspect the schema and file timestamps on the specific evidence you have. Do not assume field names or paths are identical between devices or app versions. Keep a list of the paths and filenames you find so you can reproduce your process and explain it in reports.
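A path list is easier to reproduce if it is generated rather than typed. The following Python sketch walks an extracted folder and records each file's relative path and size. Adding a SHA-256 hash per file is an assumption on top of the advice above, included because hashes let you show later that a file has not changed; the function name inventory is hypothetical.

```python
import hashlib
from pathlib import Path


def inventory(root):
    """Return a sorted list of (relative_path, size_bytes, sha256_hex) for files under root."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so large media files do not need to fit in memory.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        rows.append((path.relative_to(root).as_posix(),
                     path.stat().st_size,
                     digest.hexdigest()))
    return rows
```

The resulting rows can be written to a CSV file and attached to the report alongside the tool versions used.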
Summary
WhatsApp forensics is a rich discipline. On Android the primary artifacts are the wa.db contacts database, the msgstore.db message store and encrypted backups such as msgstore.db.cryptXX, together with media directories, preference XMLs and cryptographic key material in the app private area. On iOS the main artifacts are ChatStorage.sqlite and a few supporting files in the app group container, which may also be present in a device backup. To retrieve and interpret these artifacts you must have appropriate access to the device or an image and know where to look for the app files on the device you are examining. You also need to be comfortable inspecting SQLite databases and be prepared to decrypt backups where necessary.
If this kind of work interests you and you want to understand how real mobile investigations are carried out, you can also join our three-day mobile forensics course. The training walks you through the essentials of Android and iOS, explains how evidence is stored on modern devices, and shows you how investigators extract and analyze that data during real cases. You will work with practical labs that involve hidden apps, encrypted communication, and devices that may have been rooted or tampered with.
Terry Gerton: Well, I want to talk about your new book that talks about the U.S. campaign against the Sinaloa cartel and its Chinese chemical suppliers. This tells a story that a lot of people don't know. So begin by filling us in on the background.
Jake Braun: Sure. So I initially got the idea to write this when I was sitting with some HSI, Homeland Security Investigations folks, at a retreat that we were doing in Crystal City, actually talking about fentanyl. And one of the guys there starts going into the takedown of El Chapo. And it's just a fascinating story. And I had no idea how much HSI was involved in this. Obviously, the DEA was super involved as well. And he goes into all the wiretapping they were doing and working with the Mexican Marines and all this stuff to get them. And so he mentioned this group called the TCIUs, the Transnational Criminal Units that HSI pays. These are elite in Mexico's case, Mexican law enforcement officials that are on the U.S. payroll and kick down doors for our brave men and women out there. So went down to meet some of them and we sat at the top of the Sofitel Hotel, which is right next to the U.S.-Mexico embassy. It's where all the government people sit, and we're sitting there with these agents and their HSI handlers and it's like a rooftop thing. We're drinking beers and by the end of the night, doing shots of tequila and everything. And these guys are showing me pictures in their phones of like them taking down these huge Sinaloa cartel groups, and they've got like guys with balaclavas all handcuffed next to these helicopters and so on. And talking about the shootouts they're in and everything else and I was like, "Oh my God, somebody's got to tell this story." And so I just kind of started writing down what we were doing every week and eventually it turned into this book. But really, there's kind of three main pieces to it. One is really just an assessment of HSI and really just what they've become as an organization. And at least at that time, really just fascinating everything that they've done in the last 15 years or so since they were stood up. But then also that fentanyl is not a redux of the crack cocaine epidemic.
Most people who are taking fentanyl don't know they're taking it. So it really is more like a mass poisoning than anything else. And then finally, as I came to find out when I was with HSI and their TCIUs and so on, just the complete transformation from a corporate perspective that the Sinaloa cartel has gone through over the last decade or so and how that is so responsible for what we're facing with fentanyl today. So it was really a fascinating journey for me and hopefully, I've been able to pull back the curtain and for folks and add some interesting color to make it a cool kind of thriller type story while also going into some really kind of heavy topics.
Terry Gerton: Well, let's take those three that you mentioned and sort them in order because this operation that you describe is really an unusual collaboration across agencies and across countries. What surprised you most about how that team was formed and how it operated?
Jake Braun: Well, it was really interesting in the sense that for most of history, law enforcement has looked at criminal organizations from kind of a kingpin strategy, right? It's like in Chicago, where I'm from, they go in and they take down Al Capone and like help decapitates the mob here and everything else. Well, Sinaloa's been around for over a century. They can outfight the government in parts of Mexico and they're as big as a Fortune 50 company. We've taken out almost every head of the cartel they've ever had, and they're stronger today than they've ever been. And so we started putting together a counter network approach, looking at it from a counterterrorism perspective, the way we took out ISIS or al-Qaida as a network, as opposed to trying to just take out bin Laden or one of the terrorist leaders, but trying to go after the network. And that really required a whole-of-government approach. So it wasn't just HSI or DEA. I mean, they were in many ways the tip of the spear, but we had massive involvement from the intelligence community, the military from an intel perspective, obviously DEA, other parts of DOJ, and nearly every part of DHS, whether it be CBP, Coast Guard, Intel and Analysis, et cetera. And so the meetings we had on this, it was really a cast of everybody and anybody who had worked in the War on Terror because it was really kind of the same approach that we took to stand up this operation against Sinaloa. And by the way, in the first year after we launched the effort, which we launched in '23, fentanyl fatalities went down by 37% in 2024. So we think it's working and the current administration, I think, has picked up the many places where we left off and, hopefully, we'll see the deaths further decline in coming years.
Terry Gerton: I'm speaking with Jake Braun. He's the executive director of the Cyber Policy Initiative at the University of Chicago Harris School of Public Policy. Well, let's come back to that for a minute, and that's your second point. You mentioned fentanyl is really not so much a drug as it is a mass poisoning, but also the impact of this operation, reducing fentanyl deaths by about 40%. What are the key takeaways from those points? How should they impact what we're thinking about in terms of national policy?
Jake Braun: Sure. So first off, people might view saying it's a mass poisoning as somewhat hyperbolic, but I really don't believe it is. And this is something else that I really did not know until I started working on this. Almost everybody, even drug users, avoids fentanyl like the plague. But the way they wind up dying from fentanyl almost always is that it is cut into something else that they're taking. Now sometimes it's cut into other drugs, which of course folks shouldn't be doing, but they also don't deserve to die for it. Oftentimes though, it's cut into fake prescription pills that folks are given from a friend or there's these horrible stories about a kid who's studying for finals in college and they want to take an Adderall or a Xanax or something like that and they take one from a friend thinking it's real. Oftentimes, the friend thought it was real, too. And it turns out it has fentanyl in it and they die from one dose. That's where this is this again is not the crack cocaine epidemic. People are dying who don't even know that they're taking these drugs. And from a public policy perspective, I think that requires a very different approach for how we inform potential victims to not take the drug. It can't just be like, "Hey, don't take fentanyl." Nobody's trying to take fentanyl. It's you can't really take anything that you don't know exactly where it came from, even prescription pills. Not prescription pills you get from a pharmacy, but from a friend or a colleague or whatever. So that's one major difference in how public policy needs to really think through how to address this. When it comes to the kind of counternetwork approach that we took and looking at Sinaloa, what again was so fascinating to me that I did not know going into this was that Sinaloa has completely changed its business model in the last decade. So it was an essentially a Fortune 50 company that had two main commodities that sold marijuana and cocaine.
Well, marijuana, we've mostly legalized in the country and even in states where it's not legal, they're getting it from another state that is, generally not the Sinaloa cartel. And cocaine, which used to be incredibly popular back in the 80s, about 7% of the population reported doing it in any given month, it's now down to 0.3% of the population is doing cocaine. So it's like if you went to McDonald's and said, "Oh, guess what? Nobody is going to buy your hamburgers and french fries anymore." I mean, what would they do? So what Sinaloa did is they've taken over the migration trade. I mean, you cannot cross the border in the United States or into the United States or Mexico unless you pay Sinaloa or their main rival, CJNG. That is a big shift. That is not the way migration worked years ago. And then separately, since cocaine and marijuana aren't making money for them anymore, they figured out how to both cut fentanyl into the drugs they have to increase their margins. But also they got into this illicit prescription drug market and of course they did that right at the heels of us weaning the population off of OxyContin and other drugs that had plagued society for well over a decade. And they filled that void, which was something that a space they were not in before. And that's made this so much more tragic is their entrance into the illicit prescription drug market.
Terry Gerton: What is the implication of Sinaloa's realignment on U.S. operations in the Caribbean right now on our counter drug operations?
Jake Braun: Well, I think that they've largely moved, they and the other cartels have largely moved a lot of their operations out of the Caribbean because it's easier for them to smuggle things across the border via tunnels, drones and so on and so forth. There's still some, don't get me wrong, but most of what they're doing is not in the Caribbean. That being said, they have really dramatically stepped up their efforts to try and get fentanyl into the country from any vantage point, including the Caribbean. I think that what's critically important with stopping what they're doing is to really focus specifically on fentanyl, because no administration, Democrat, Republican, Libertarian, Green Party, no administration will ever end criminality. That has been around since humans have existed. It's not going to stop. But we could end fentanyl and I think if we were able to turn up the heat so high and really just put our boot on the throat of Sinaloa the way we did on al-Qaida and ISIS, they would stop selling fentanyl because they could sell all the other stuff they do, and we'd relegate this back to normal cops and robbers the way we have before with all the other illicit things they do like racketeering and prostitution and other drugs and so on, things far less deadly than fentanyl. But without a real direct focus on fentanyl, I don't see a world in which kind of a broader approach is really going to end this one issue. And the idea that we're going to end the Sinaloa cartel in general, them or rivals will come in and take their place later. But again, if we focus narrowly on fentanyl, I think we could end this epidemic in the United States. And there and there's a moment for this right now. I think the president has shut down the border and or at least shut down illegal crossings. He fulfilled his top campaign promise basically already.
And so we're in a moment now where they really could turn their attention to specifically stopping this horrible epidemic that's killing so many people.
Terry Gerton: That sounds like a policy recommendation.
Glassine envelopes used to package fentanyl pills or fentanyl powder are displayed at a Drug Enforcement Administration (DEA) research laboratory on Tuesday, April 29, 2025, in Northern Virginia. (AP Photo/Mark Schiefelbein)
When the Palestinian stem-cell scientist Jacob Hanna was stopped while entering the US last May, airport customs agents took him aside and held him for hours in "secondary," a back office where you don't have your passport and can't use your phone. There were two young Russian women and a candy machine in the room with him. Hanna, who has a trim beard and glasses and holds an Israeli passport, accepted the scrutiny. "It's almost like you are under arrest, but in a friendly way," he says. He agreed to turn over his phone and social media for inspection.
"They said, 'You have the right to refuse,'" he recalls, "and I said, 'No, no, it's an open book.'"
The agents scrolling through his feeds would have learned that Hanna is part of Israel's small Arab Christian minority, a nonbinary LGBTQ-rights advocate, and an outspoken critic of the Gaza occupation, who uses his social media accounts to post images of atrocities and hold up a mirror to scientific colleagues including those at the Weizmann Institute of Science, the pure-science powerhouse where he works, Israel's version of Caltech or Rockefeller University. In his luggage, they would have found his keffiyeh, or traditional headscarf, which Hanna last year vowed to wear at lecture podiums on his many trips abroad.
Hanna had been stopped before; he knew the routine. Anything to declare? Any biological samples? But this time the agents' questions touched on a specific new topic: embryos.
Weeks earlier, a Harvard University researcher had been arrested for having frog embryos in her luggage and sent to a detention center in Louisiana. Hanna didn't have any specimens from his lab, but if he had, it would have been surprisingly hard to say what they were. That's because his lab specializes in creating synthetic embryo models, structures that resemble real embryos but don't involve sperm, eggs, or fertilization.
Instead of relying on the same old recipe biology has followed for a billion years, give or take, Hanna is coaxing the beginnings of animal bodies directly from stem cells. Join these cells together in the right way, and they will spontaneously attempt to organize into an embryo, a feat that's opening up the earliest phases of development to scientific scrutiny and may lead to a new source of tissue for transplant medicine.
In 2022, working with mice, Hanna reported he'd used the technique to produce synthetic embryos with beating hearts and neural folds, growing them inside small jars connected to a gas mixer, a type of artificial womb. The next year, he repeated the trick using human cells. This time the structures were not so far developed, still spherical in shape. Nonetheless, they were incredibly realistic mimics of a two-week-old human embryo, including cells destined to form the placenta.
These sorts of models aren't yet the same as embryos. It's rare that they form correctly (it takes a hundred tries to make one), and they skip past normal steps before popping into existence. Yet to scientists like the French biologist Denis Duboule, Hanna's creations are "entirely astonishing and very disturbing." Soon, Duboule expects, it could be difficult to distinguish between a real human embryo (the kind with legal protections) and one conjured from stem cells.
Hanna is the vanguard of a wider movement that's fusing advanced methods in genetics, stem-cell biology, and still-primitive artificial wombs to create bodies where they've never grown before: outside the uterus. Joining the chase are researchers at Caltech, the University of Cambridge, and Rockefeller in New York, as well as a growing cadre of startup companies with commercial aims. There's Renewal Bio, a startup Hanna cofounded, which hopes to grow synthetic embryos as a source of youthful replacement cells, such as bits of liver or even eggs. In Europe, Dawn Bio has started placing a type of embryo model called a blastoid on uterine tissue. That will light up a pregnancy test and could, the company thinks, provide new insights into IVF medicine. Patent offices in the US and Europe are seeing a flood of claims as universities grasp for exclusive commercial control over these new types of beings.
Jacob Hanna leads a team at the Weizmann Institute of Science in Rehovot, Israel, that is studying how to create embryos without using sperm, eggs, or fertilization. He's cofounded a startup company, Renewal Bio, that has plans to use these synthetic embryo models as bioprinters to produce youthful tissue, but ethical questions surround the project.
AHMAD GHARABLI/GETTY IMAGES
Hanna declined a request to discuss his research for this story. But for the last three years, MIT Technology Review has followed Hanna across online presentations, lecture halls, and two in-person ethics meetings, both organized by the Global Observatory for Genome Editing, a public consultation project where he agreed to engage with religious scholars, bioethicists, and other experts. What emerged is a remarkable picture of a scientist working at a Nobel Prize level but whose research, though approved by his institution, raises serious long-term ethical questions.
Exactly how far Hanna has taken his models of the human embryo is an open question. According to public comments from Renewal Bio, the answer is at least 28 days. But it's possibly further. One scientist in contact with the company said he thought they'd reached close to day 40, a point where you would see the beginning of eyes and budding limbs. Renewal did not respond to a request for comment.
But even if he hasn't gotten that far yet, Hanna intends to. His team is "trying to make entities at more advanced stages; depending on the goal, it could be day 30 in development, day 40, or day 70," he told an audience last May in Cambridge, Massachusetts, where he'd traveled to join a panel discussion involving religious scholars and social scientists at the Global Observatory's annual summit. The more advanced versions would be similar in size and development to a fetus in the third month of pregnancy.
O. Carter Snead, a bioethicist from the University of Notre Dame who led the panel featuring Hanna, approached me afterward to ask if I'd heard what the scientist had said. Snead was surprised that Hanna had so frankly disclosed his goals and that no one had objected, or maybe even grasped what it meant. Perhaps, Snead thinks, this technology won't sink in until people can see it with their own eyes. "If you had one of these spinning bottles with something that looked like a human fetus inside it, I think you'd get people's attention," he says. "That's going to be like, whoa, what are we doing?"
Snead, a Catholic who sits on a panel that advises the Vatican, also was not comforted by Hanna's plan to make sure his models, if they advance to later stages of development, will pass ethical scrutiny. That plan involves blocking the formation of the head, brain, or perhaps heart of the synthetic structures, by means including genetic modification. If there's no brain, Hanna's reasoning goes, there's no awareness, no person, and no foul. Just a clump of organs.
Snead says that's not the same standard of humanity he knows, which treats all humans the same, regardless of their intellectual capacity or anything else. "What is considered human? Who is considered human?" wonders Snead. "It's who's in and who's out. There is a dramatic consequence of being in versus out of the boundaries of humanity."
The beginnings of bodies
Each of us (me, you the reader, and Jacob Hanna) started as a fertilized egg, a single cell that's able to divide and dynamically carry out a program to build a complete body with all its organs and billions of specialized cells. Science has long sought ways to seize on that dramatic potential. A first step came in the 1990s, when scientists were able to isolate powerful stem cells from five-day-old embryos created through in vitro fertilization, and keep them growing in their labs. These embryonic stem cells had the inherent potential to become any other type of cell. If they could be directed in the lab to form, for example, neurons or the insulin-making cells that diabetics need, that would open up a way to treat disease using cell transplants.
A side-by-side comparison of synthetic (left) and natural (right) mouse embryos shows similar formation of the brain and heart.
But these lab recipes are often unsuccessful, which explains the general lack of new stem-cell treatments. "The sad truth is that over 25 years that we've been working on this problem, there are about 10 cell types you make that have reasonable function," says Chad Cowan, chief scientific officer of the stem-cell company Century Therapeutics. If we think of the body as a car, he explains, "we've got only spark plugs. We maybe have some tires." The body's most potent blood-forming cells in particular "never appear," according to Cowan, even though biotech companies have spent millions trying to make them.
It turns out, though, that stem cells retain a natural urge to work together. Scientists began to notice that, when left alone, the cells would join into blobs, tubes, and cavities, some of which resembled parts of an embryo.
Early versions of these structures were crude, even just a swirling film of cells on a glass slide. But each year, they have grown more realistic. By 2023, Hanna was describing what he called a "bona fide" human embryo model that was "fully integrated," with all the major parts arranged in an architecture that was hard to distinguish from the real thing.
His company, Renewal, plans to use these synthetic embryos as a kind of "bioprinter," producing medically valuable cells in cases where other methods have failed. This could be particularly valuable if the synthetic embryos are a perfect match with a patient's DNA. And that's possible too: These days reprogramming anyone's skin cells into stem cells is easily done. Hanna has tried it on himself, transforming his own cells into synthetic embryos.
Hanna's research, and that of other groups, has at times collided with a powerful scientific body called the International Society for Stem Cell Research, or ISSCR, a self-governance organization that sets boundaries about what research can and can't be published and what terminology to use. That's to shield scientists from sensational headlines, public backlash, or the reach of actual regulators.
The organization has taken a particularly categorical position on structures made from stem cells, saying they are mere "models." According to a statement it fired off in 2023, "embryo models are neither synthetic nor embryos," and, it added, they "cannot and will not develop to the equivalent of postnatal stage human."
Many scientists, including Hanna, agree no one should ever try to make a stem-cell baby. But he is fairly certain these structures will become more realistic and can grow further. In fact, that may be the real test of what an embryo is: whether it can dynamically keep reaching new stages of development, especially organogenesis, or the first emergence of organs. The language in the ISSCR statement, he complained, was "brainwashing."
Replacement parts
Most of the commercial projects involving synthetic embryos are doomed to a short and fitful life as the technology proves too difficult or undeveloped. But the idea isn't going away. Instead, there are signals it's getting bigger, and weirder. In an editorial published in March by MIT Technology Review, a group of Stanford scientists put forward a proposal for what they called "bodyoids," arguing that stem cells and artificial wombs may lead to an "unlimited source" of nonsentient human bodies for use in drug research or as organ donors. One of its authors, Henry Greely, among the foremost bioethicists in the US, posted on Bluesky that even though the idea gives him "some creeps," he added his name because he feels it is plausible enough to need discussion, and "soon."
These sorts of plans, real or rumored, have gotten the attention of the stem-cell police, the ISSCR. This June, an ethics committee led by Amander Clark, a fetal specialist at UCLA and a past president of the society, wrote that it had become aware of "commercial and other groups raising the possibility of building an embryo in vitro" and bringing it to viability inside "artificial systems." Though the ISSCR had previously decreed that embryo models "cannot and will not" develop to term, it now declared efforts aiming at viability "unsafe and unethical," placing them in a "prohibited" category. It added that the ban would cover "any purpose: reproductive, research, or commercial."
Blurred boundaries
Clark and her colleagues are right that, for the foreseeable future, no one is going to decant a full-term baby out of a bottle. That's still science fiction. But there's a pressing issue that needs to be dealt with right now. And that's what to do about synthetic embryo models that develop just part of the way, say for a few weeks, or months, as Hanna proposes.
Because right now, hardly any laws or policies apply to synthetic embryos. One reason is their unnatural origin: Because these entities don't start with conception and grow in labs, most existing laws won't cover them. That includes the Fetus Farming Prohibition Act, legislation passed unanimously in 2006 by the US Congress, which sought to prevent anyone from growing a fetus for its organs. But that law references "a human pregnancy" and a "uterus," and there would be neither if a synthetic embryo were grown in a mechanical vessel.
Another policy under pressure is the "14-day rule," a widely employed convention that natural embryos should not be grown longer than two weeks in the lab. Though it's a mostly arbitrary stopping point, it's been convenient for laboratory scientists to know where their limit is. But that rule isn't being applied to the embryo models. For instance, even though the United Kingdom has a 14-day rule enshrined in law, that legislation doesn't define what an embryo is. To scientists working on models, that's a critical loophole. If the structures aren't considered true embryos, then the rule doesn't apply.
Last year, the University of Cambridge, in the UK, described the situation as a "grey area" and said it "has left scientists and research organisations uncertain about the acceptable boundaries of their work, both legally and ethically."
Researchers at the university, which is a hot spot for human embryo models, have been working with one that has advanced features, including beating heart cells. But the appearance of distinctive features under their microscopes is unsettling, even to scientists. "I was scared, honestly," Jitesh Neupane, who led that work, told the Guardian in 2023. "I had to look down and look back again."
That particular stem-cell model isn't complete: it entirely lacks placenta cells and a brain. So it's not a real embryo. But it could get ever trickier to insist the models don't count, given the accelerating race to make them more realistic. To Duboule, scientists are caught in a "fool's paradox" and a "rather unstable situation."
Even incomplete models raise the question of where to draw the line. Should you stop when it can feel pain? When it's just too human-looking for comfort? Scientific leaders may soon have to decide if there are "morally significant" human features, like hands or a face, that should be avoided, whether the structure has a brain or not. "I personally think there should be regulation, and many in the field believe this too," says Alejandro De Los Angeles, a stem-cell biologist affiliated with the University of Central Florida.
Hanna says he has all the necessary approvals in Israel to carry his work forward. But he also worries that the ground rules could change. "I'm almost the only one [in Israel] doing these kinds of experiments, and I always live in fear that I might find myself embroiled in some kind of a scandal," he says. "Things can shift very quickly for political reasons."
And his statements about the situation in Gaza have made him a target. He's gotten voicemails wondering why a Weizmann professor is so sympathetic to Palestine, and once when he returned from a trip, someone had tucked an Israeli army beret into the door handle of his car. Last year, he says, political opponents even went after his science by filing a complaint that his research was illegal.
What is clear is that Hanna, who is gregarious and attentive, has worked to cultivate a large group of friends and allies, including religious authorities, all part of a campaign to explain the science and hear out other views. He says he got a perfect grade in a bioethics class with a rabbi, conferenced with a priest from his hometown in Galilee, and even paid his respects to an Orthodox professor at a conservative hospital in Jerusalem. "It was unofficial. I didn't have to get a permit from him," Hanna says. "But ... what does he think? Can I get him on board? Do I get a different opinion?"
"I really do think it's admirable that he is willing to ask these hard questions about what it is that he's doing. I think that makes him different," says Snead. "But if you are cynical, you could ask if his focus on the ethical dimension of this is more of a branding exercise." Perhaps, Snead says, it's a way to market the structures as the "green, sustainable alternative to embryos."
A heartbeat in a jar
To admirers, Hanna is a doctor and researcher "heads above the rest," according to Eli Adashi, the former dean of Brown University's medical school. "He's very unusual, very special, and is making major discoveries that can't be ignored," Adashi says. "He's one of those unusually talented people that exceed the capacity of us mortals, and it all emanates from a town in Galilee that no one knows exists."
While it is something of a rarity for a Palestinian to rise so high in Israel's ivory tower, in reality Hanna has an elite background: he's from a family of MDs, and an uncle, Nabil Hanna, co-developed the first antibody drug for cancer, the blockbuster rituximab.
Since the October 7 attack on Israel by Hamas, Israel has been at war in Gaza, and Hanna's team has felt the effects. One young scientist dropped his pipette to don an IDF uniform. Another trainee, who is from Gaza, had a brother and other family members struck dead by an Israeli missile that hit near a church where people were sheltering. Then, this June, an Iranian ballistic missile hit the grounds of the Weizmann Institute, shattering windows and walls and sending Hanna's students scrambling to save research.
Despite delays in his research due to the ongoing conflict, Hanna's ideas and technologies are being exported, and emulated. One place to see a version of the artificial womb is at the Janelia Research Campus, in Virginia, where one of Hanna's former students, Alejandro Aguilera Castrejón, now operates a lab of his own. Aguilera Castrejón, for whom science was a ticket out of the poor outskirts of Mexico City, has tattoos from his wrists to his elbows; the newest depicts a hydra, a sea polyp noted for being able to regenerate itself from a few cells.
During a visit in June, Aguilera Castrejón flipped aside a black cover to reveal the incubator: a metal wheel that slowly turned, gently agitating jars filled with blood serum. Inside one, a mouse embryo drifted, a tiny, translucent shape, curved like a comma. Then, awesomely, a red-colored blob expanded in its center. A heartbeat.
That day, it was a normal mouse embryo in the jar; it had been transferred there to see how far it would grow. Aguilera Castrejón has the goal of eventually birthing a mouse from an incubator, a process called ectogenesis. But the stem-cell embryos don't grow as well or as long, he says. The problem isn't just the challenge of growing them in culture jars. There's probably some kind of fundamental disorganization. They aren't entirely normal, not yet true embryos.
A rotating bioreactor, developed in Israel, is used to grow synthetic embryos in small jars of blood serum.
Aguilera Castrejón, who spent eight years at Weizmann contributing to Hanna's research, is skeptical that the human version of the technology is ready for commercialization. For one thing, it's inefficient. In every 100 attempts to make a synthetic embryo, the desired structure will form only once or twice. The rest are disorganized blobs, closer to "huevos fritos" than real embryos, he says. "I do think the human embryo model will go further, but it could take years," he adds.
In Aguilera Castrejón's view, Hanna is well placed to lead that work. One reason is that Israel offers a relatively permissive environment, and so does Jewish thought. In the Talmud, the embryo is considered "mere water" until the 40th day. Plus, Hanna is already successful. "Some people aren't allowed to do it. And some people want to do it, but they can't," says Aguilera Castrejón. "Jacob wants to make it as realistic as possible and go as far as possible; that is his aim. He's very ambitious and wants to tackle very big things people don't dare to do. He really wants to do something big. His main aim is always to grow them as far as you can."
The first payoff of a technology for mimicking embryos this way is a new view of the unfolding human no one has ever had before. Real human embryos are rarely seen at the early stages, since they're inside the womb, and at four or five weeks, many people don't even know they're pregnant. It's been a black box. But synthetic models of the embryo can be made in the thousands (depending on the type), studied closely, inspected with modern microscopes, and subjected to dyes and genetic engineering tools, all while they're still alive. Add a known toxic chemical that causes birth defects, like thalidomide, and you can closely trace the effects. "Since we don't have a way to peer into the uterus, this allows us to watch things as if they are intrauterine but are not," says Adashi, the former Brown dean and a fertility doctor.
What's more, a synthetic embryo may be able to make cells correctly, just as a real one does, and make all types at once, expanding on the limited few that scientists can create from stem cells today. While not all embryonic material is useful to medicine, the blood-forming cells in an embryo are known to be particularly potent. In mice, they can be extracted and multiplied, and if transplanted into a mouse subjected to lethal radiation, they will save it.
Hanna imagines a cancer patient who needs a bone marrow transplant but can't find a match. Could blood-forming cells be harvested from, say, 100 or 500 embryo-stage clones of that person, providing perfectly matched tissue?
In his cost-benefit analysis, he believes the chance to save lives outweighs the moral risk of growing embryo models for a month, which is about how long it takes for key blood cells to form. At that stage, says Hanna, he thinks "there is still no personification of the embryo" and it's permissible to use them in research.
Young everything
Hanna cofounded Renewal in 2022 with Omri Amirav-Drory, a venture capitalist whose fund, NFX, raised about $9 million for the company and purchased rights to Weizmann patents. The startup's idea is to create synthetic embryos from the cells of patients, allowing them to grow for weeks or months to produce what Amirav-Drory calls "perfect cells" for transplant. That is because the synthetic structure, as a clone, would contain "young, genetically identical everything."
Speaking at an event for tech futurists last year near San Francisco, Amirav-Drory flashed a picture of pregnancy tests used on the synthetic embryos. "We even went to CVS," he said, "and by day eight it's already triggering a pregnancy test. So it's alive."
Amirav-Drory is a fan of Peter F. Hamilton, the science fiction author whose Commonwealth series features a society where space colonists transfer their minds into cloned bodies, attaining second lives. And he's pitched Hanna's technology along related lines, as a new type of longevity medicine based on replacing old cells with young ones. He is convinced Hanna's work is "magic" that's sure to win a Nobel.
But he knows the startup has both technical and ethical challenges. The technical challenge is that once the synthetic embryos reach a certain size and age, the incubator can't support them any longer. That's because they lack a blood supply and need to absorb oxygen and nutrients from their surroundings; they starve once they get too big. One idea being considered is to add a feeding tube, but that involves microsurgery and isn't easily scalable. The ethical issue is also age related: The more developed they become, the more they will be recognizably human, with the beginnings of organs and small, webbed fingers and toes. "No one has a problem with day 14, but the further we go, the further it looks like a baby, and we get into trouble. So how do we solve that?" Amirav-Drory asked a different audience, in Menlo Park.
The solution, so far, is a neural knockout: genetic changes made to the embryoids so they don't develop a brain. The group has already tried out the concept on mice, removing a gene called LIM-1. That yielded a headless mouse, which looks a bit like a pink thumb, except with little claws and a tail. Those mice won't live after birth, but they can develop in the womb. "We got synthetic mouse embryos growing with no head, with no brain," Amirav-Drory said in Menlo Park. "It's just to show you where we can go to solve both technical and ethical issues."
The idea of brain removal is a surprisingly active area of research, suggesting that it's no sideshow. Working with mice, for example, Nakauchi's team at Stanford is currently testing several different genetic changes to see if they can consistently yield an animal with no brain or head, but whose other tissues are normal. "The importance of getting rid of the head is all ethical. It just means we can make all these bodies and organ structures without having to cross ethical lines or harm sentient living beings," says Carsten Charlesworth, a researcher in Nakauchi's lab. He says the group is working toward a "genetic software package" it can add to mouse embryos to create a "reproducible phenotype."
It may seem surprising that a technique designed to call forth a living being from stem cells is, simultaneously, being paired with a tactic to diminish that being. To Douglas Kysar, a professor at Yale Law School, that's part of a broader trend toward what he calls "life that is not life," which includes innovations like lab-grown meat. In the areas of animal-rights law Kysar studies, commercial biotech projects have begun to explore what he terms "disenhancement" and "disengineering." That is the use of genetics to reduce the capacity of animals to suffer, feel pain, or have conscious experience at all, typically as part of a program to increase the efficiency and ethics of food production.
For humans, of course, the worry around genetic engineering is usually that it will be used for enhancement: creating a baby with advantages. It's much harder to think of examples where genetic disenhancements get pointed at the human embryo. John Evans, who co-directs the Institute of Applied Ethics at the University of California, San Diego, told me he can think of one, in literature. Hanna's plans remind him of Bokanovsky's Process, the fictional method of producing clones of different intelligence levels in the 1932 novel Brave New World.
That may not be a complete turnoff to investors. Lately, the plots of science fiction dystopias (Jurassic Park, Gattaca) seem to be getting repurposed at hot biotech properties. There's Colossal, the company that wants to re-create extinct animals. Aguilera Castrejón says he's already had a high-dollar offer to pack up his academic lab and join a startup company that wants to build an artificial womb. And when Hanna was at the Global Observatory meeting near Boston earlier this year, he was being shadowed by Matt Krisiloff, CEO of the Silicon Valley company Conception, which was set up to try to manufacture human eggs in the lab and has funding from OpenAI leader Sam Altman.
Eggs are another cell type that has proved difficult to generate from a stem cell in the lab. But a growing fetus will form millions of immature egg cells. So just imagine: Someone too old to conceive gives some blood, which is converted into stem cells and then into a clone, from which the fetal gonad is dissected. Maybe the reproductive cells found there could be matured further in the lab. Or maybe those young and perfectly matched ovaries (her ovaries, really, not anyone else's) could be returned to her body to finish developing. A fertility expert, David Albertini, told me it might just be possible.
During the ethics meeting he traveled to the US in May to attend, Hanna participated on a panel whose topic was "sources of moral authority." Hanna's authority comes from the possible benefits the science of synthetic embryos may bring. But he also wields his moral credibility. Early in his remarks, Hanna had framed the whole matter in a way that made worrying about what's in the petri dish start to sound silly. Wearing a keffiyeh around his shoulders, he said: "I'd like to start and, you know, just remind everyone, unfortunately, that there is a genocide ongoing right now in Gaza, where children are being starved intentionally. And it is relevant, because we're sitting here and we're discussing human dignity, we're discussing the status of an embryo, and we're discussing the status of a fetus. But what about the life of the children, and adults, and innocent adults? How does it relate?"
Consider, if you will, the translucent blob in the eye of a microscope: a human blastocyst, the biological specimen that emerges just five days or so after a fateful encounter between egg and sperm. This bundle of cells, about the size of a grain of sand pulled from a powdery white Caribbean beach, contains the coiled potential of a future life: 46 chromosomes, thousands of genes, and roughly six billion base pairs of DNA, an instruction manual to assemble a one-of-a-kind human.
Now imagine a laser pulse snipping a hole in the blastocyst's outermost shell so a handful of cells can be suctioned up by a microscopic pipette. This is the moment, thanks to advances in genetic sequencing technology, when it becomes possible to read virtually that entire instruction manual.
An emerging field of science seeks to use the analysis pulled from that procedure to predict what kind of a person that embryo might become. Some parents turn to these tests to avoid passing on devastating genetic disorders that run in their families. A much smaller group, driven by dreams of Ivy League diplomas or attractive, well-behaved offspring, are willing to pay tens of thousands of dollars to optimize for intelligence, appearance, and personality. Some of the most eager early boosters of this technology are members of the Silicon Valley elite, including tech billionaires like Elon Musk, Peter Thiel, and Coinbase CEO Brian Armstrong.
But customers of the companies emerging to provide it to the public may not be getting what they're paying for. Genetics experts have been highlighting the potential deficiencies of this testing for years. A 2021 paper by members of the European Society of Human Genetics said, "No clinical research has been performed to assess its diagnostic effectiveness in embryos. Patients need to be properly informed on the limitations of this use." And a paper published this May in the Journal of Clinical Medicine echoed this concern and expressed particular reservations about screening for psychiatric disorders and non-disease-related traits: "Unfortunately, no clinical research has to date been published comprehensively evaluating the effectiveness of this strategy [of predictive testing]. Patient awareness regarding the limitations of this procedure is paramount."
Moreover, the assumptions underlying some of this work (that how a person turns out is the product not of privilege or circumstance but of innate biology) have made these companies a political lightning rod.
As this niche technology begins to make its way toward the mainstream, scientists and ethicists are racing to confront the implications for our social contract, for future generations, and for our very understanding of what it means to be human.
Preimplantation genetic testing (PGT), while still relatively rare, is not new. Since the 1990s, parents undergoing in vitro fertilization have been able to access a number of genetic tests before choosing which embryo to use. A type known as PGT-M can detect single-gene disorders like cystic fibrosis, sickle cell anemia, and Huntington's disease. PGT-A can ascertain the sex of an embryo and identify chromosomal abnormalities that can lead to conditions like Down syndrome or reduce the chances that an embryo will implant successfully in the uterus. PGT-SR helps parents avoid embryos with issues such as duplicated or missing segments of the chromosome.
Those tests all identify clear-cut genetic problems that are relatively easy to detect, but most of the genetic instruction manual included in an embryo is written in far more nuanced code. In recent years, a fledgling market has sprung up around a new, more advanced version of the testing process called PGT-P: preimplantation genetic testing for polygenic disorders (and, some claim, traits), that is, outcomes determined by the elaborate interaction of hundreds or thousands of genetic variants.
In 2020, the first baby selected using PGT-P was born. While the exact figure is unknown, estimates put the number of children who have now been born with the aid of this technology in the hundreds. As the technology is commercialized, that number is likely to grow.
Embryo selection is less like a build-a-baby workshop and more akin to a store where parents can shop for their future children from several available models, complete with stat cards indicating their predispositions.
A handful of startups, armed with tens of millions of dollars of Silicon Valley cash, have developed proprietary algorithms to compute these stats, analyzing vast numbers of genetic variants and producing a "polygenic risk score" that shows the probability of an embryo developing a variety of complex traits.
For the last five years or so, two companies, Genomic Prediction and Orchid, have dominated this small landscape, focusing their efforts on disease prevention. But more recently, two splashy new competitors have emerged: Nucleus Genomics and Herasight, which have rejected the more cautious approach of their predecessors and waded into the controversial territory of genetic testing for intelligence. (Nucleus also offers tests for a wide variety of other behavioral and appearance-related traits.)
The practical limitations of polygenic risk scores are substantial. For starters, there is still a lot we don't understand about the complex gene interactions driving polygenic traits and disorders. And the biobank data sets they are based on tend to overwhelmingly represent individuals with Western European ancestry, making it more difficult to generate reliable scores for patients from other backgrounds. These scores also lack the full context of environment, lifestyle, and the myriad other factors that can influence a person's characteristics. And while polygenic risk scores can be effective at detecting large, population-level trends, their predictive abilities drop significantly when the sample size is as tiny as a single batch of embryos that share much of the same DNA.
But beyond questions of whether evidence supports the technology's effectiveness, critics of the companies selling it accuse them of reviving a disturbing ideology: eugenics, or the belief that selective breeding can be used to improve humanity. Indeed, some of the voices who have been most confident that these methods can successfully predict nondisease traits have made startling claims about natural genetic hierarchies and innate racial differences.
What everyone can agree on, though, is that this new wave of technology is helping to inflame a centuries-old debate over nature versus nurture.
The term "eugenics" was coined in 1883 by a British anthropologist and statistician named Sir Francis Galton, inspired in part by the work of his cousin Charles Darwin. He derived it from a Greek word meaning "good in stock, hereditarily endowed with noble qualities."
Some of modern history's darkest chapters have been built on Galton's legacy, from the Holocaust to the forced sterilization laws that affected certain groups in the United States well into the 20th century. Modern science has demonstrated the many logical and empirical problems with Galton's methodology. (For starters, he counted vague concepts like "eminence," as well as infections like syphilis and tuberculosis, as heritable phenotypes, meaning characteristics that result from the interaction of genes and environment.)
Yet even today, Galton's influence lives on in the field of behavioral genetics, which investigates the genetic roots of psychological traits. Starting in the 1960s, researchers in the US began to revisit one of Galton's favorite methods: twin studies. Many of these studies, which analyzed pairs of identical and fraternal twins to try to determine which traits were heritable and which resulted from socialization, were funded by the US government. The most well-known of these, the Minnesota Twin Study, also accepted grants from the Pioneer Fund, a now defunct nonprofit that had promoted eugenics and "race betterment" since its founding in 1937.
The nature-versus-nurture debate hit a major inflection point in 2003, when the Human Genome Project was declared complete. After 13 years and at a cost of nearly $3 billion, an international consortium of thousands of researchers had sequenced 92% of the human genome for the first time.
Today, the cost of sequencing a genome can be as low as $600, and one company says it will soon drop even further. This dramatic reduction has made it possible to build massive DNA databases like the UK Biobank and the National Institutes of Health's All of Us, each containing genetic data from more than half a million volunteers. Resources like these have enabled researchers to conduct genome-wide association studies, or GWASs, which identify correlations between genetic variants and human traits by analyzing single-nucleotide polymorphisms (SNPs), the most common form of genetic variation between individuals. The findings from these studies serve as a reference point for developing polygenic risk scores.
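To make the idea concrete: a polygenic score is commonly described as a weighted sum, in which each variant's effect size from a GWAS is multiplied by the number of copies of that variant a person carries, and the products are added up. The Python sketch below illustrates only that arithmetic. The SNP names, weights, and genotypes are invented for illustration, and it should not be read as the proprietary method of any company mentioned in this story.

```python
# Hypothetical per-allele effect sizes, as a GWAS might report them.
# The SNP identifiers and numbers are made up for illustration only.
EFFECT_SIZES = {"snp_a": 0.12, "snp_b": -0.05, "snp_c": 0.30}


def polygenic_score(allele_counts, effect_sizes):
    """Sum effect size times allele count (0, 1, or 2) over every weighted SNP.

    SNPs missing from allele_counts are treated as zero copies.
    """
    return sum(
        weight * allele_counts.get(snp, 0)
        for snp, weight in effect_sizes.items()
    )


# Two invented genotypes: copies of each scored allele carried by a sample.
sample_one = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
sample_two = {"snp_a": 0, "snp_b": 2, "snp_c": 1}

score_one = polygenic_score(sample_one, EFFECT_SIZES)
score_two = polygenic_score(sample_two, EFFECT_SIZES)
print(round(score_one, 2), round(score_two, 2))
```

With only three variants the two invented samples land very close together, which hints at why small differences between closely related genomes are hard to interpret. Real scores draw on far more variants and additional statistical adjustments that this sketch leaves out.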
Most GWASs have focused on disease prevention and personalized medicine. But in 2011, a group of medical researchers, social scientists, and economists launched the Social Science Genetic Association Consortium (SSGAC) to investigate the genetic basis of complex social and behavioral outcomes. One of the phenotypes they focused on was the level of education people reached.
"It was a bit of a phenotype of convenience," explains Patrick Turley, an economist and member of the steering committee at SSGAC, given that educational attainment is routinely recorded in surveys when genetic data is collected. Still, it was "clear that genes play some role," he says. "And trying to understand what that role is, I think, is really interesting." He adds that social scientists can also use genetic data to try to better "understand the role that is due to nongenetic pathways."
The work immediately stirred feelings of discomfort, not least among the consortium’s own members, who feared that they might unintentionally help reinforce racism, inequality, and genetic determinism.
It’s also created quite a bit of discomfort in some political circles, says Kathryn Paige Harden, a psychologist and behavioral geneticist at the University of Texas in Austin, who says she has spent much of her career making the unpopular argument to fellow liberals that genes are relevant predictors of social outcomes.
Harden thinks a strength of those on the left is their ability to recognize “that bodies are different from each other in a way that matters.” Many are generally willing to allow that any number of traits, from addiction to obesity, are genetically influenced. Yet, she says, heritable cognitive ability seems to be “beyond the pale for us to integrate as a source of difference that impacts our life.”
Harden believes that genes matter for our understanding of traits like intelligence, and that this should help shape progressive policymaking. She gives the example of an education department seeking policy interventions to improve math scores in a given school district. If a polygenic risk score is “as strongly correlated with their school grades” as family income is, she says of the students in such a district, then “does deliberately not collecting that [genetic] information, or not knowing about it, make your research harder [and] your inferences worse?”
To Harden, persisting with this strategy of avoidance for fear of encouraging eugenicists is a mistake. If “insisting that IQ is a myth and genes have nothing to do with it was going to be successful at neutralizing eugenics,” she says, “it would’ve won by now.”
Part of the reason these ideas are so taboo in many circles is that today’s debate around genetic determinism is still deeply infused with Galton’s ideas, and has become a particular fixation among the online right.
Harden, though, warns against discounting the work of an entire field because of a few noisy neoreactionaries. “I think there can be this idea that technology is giving rise to the terrible racism,” she says. The truth, she believes, is that “the racism has preexisted any of this technology.”
In 2019, a company called Genomic Prediction began to offer the first preimplantation polygenic testing that had ever been made commercially available. With its LifeView Embryo Health Score, prospective parents are able to assess their embryos’ predisposition to genetically complex health problems like cancer, diabetes, and heart disease. Pricing for the service starts at $3,500. Genomic Prediction uses a technique called an SNP array, which targets specific sites in the genome where common variants occur. The results are then cross-checked against GWASs that show correlations between genetic variants and certain diseases.
Four years later, a company named Orchid began offering a competing test. Orchid’s Whole Genome Embryo Report distinguished itself by claiming to sequence more than 99% of an embryo’s genome, allowing it to detect novel mutations and, the company says, diagnose rare diseases more accurately. For $2,500 per embryo, parents can access polygenic risk scores for 12 disorders, including schizophrenia, breast cancer, and hypothyroidism.
Orchid was founded by a woman named Noor Siddiqui. Before getting undergraduate and graduate degrees from Stanford, she was awarded the Thiel fellowship (a $200,000 grant given to young entrepreneurs willing to work on their ideas instead of going to college) back when she was a teenager, in 2012. This set her up to attract attention from members of the tech elite as both customers and financial backers. Her company has raised $16.5 million to date from investors like Ethereum founder Vitalik Buterin, former Coinbase CTO Balaji Srinivasan, and Armstrong, the Coinbase CEO.
In August Siddiqui made the controversial suggestion that parents who choose not to use genetic testing might be considered irresponsible. “Just be honest: you’re okay with your kid potentially suffering for life so you can feel morally superior …” she wrote on X.
Americans have varied opinions on the emerging technology. In 2024, a group of bioethicists surveyed 1,627 US adults to determine attitudes toward a variety of polygenic testing criteria. A large majority approved of testing for physical health conditions like cancer, heart disease, and diabetes. Screening for mental health disorders, like depression, OCD, and ADHD, drew a more mixed, but still positive, response. Appearance-related traits, like skin color, baldness, and height, received less approval as something to test for.
Intelligence was among the most contentious traits, unsurprising given the way it has been weaponized throughout history and the lack of cultural consensus on how it should even be defined. (In many countries, intelligence testing for embryos is heavily regulated; in the UK, the practice is banned outright.) In the 2024 survey, 36.9% of respondents approved of preimplantation genetic testing for intelligence, 40.5% disapproved, and 22.6% said they were uncertain.
Despite the disagreement, intelligence has been among the traits most talked about as targets for testing. From early on, Genomic Prediction says, it began receiving inquiries “from all over the world” about testing for intelligence, according to Diego Marin, the company’s head of global business development and scientific affairs.
At one time, the company offered a predictor for what it called “intellectual disability.” After some backlash questioning both the predictive capacity and the ethics of these scores, the company discontinued the feature. “Our mission and vision of this company is not to improve [a baby], but to reduce risk for disease,” Marin told me. “When it comes to traits about IQ or skin color or height or something that’s cosmetic and doesn’t really have a connotation of a disease, then we just don’t invest in it.”
Orchid, on the other hand, does test for genetic markers associated with intellectual disability and developmental delay. But that may not be all. According to one employee of the company, who spoke on the condition of anonymity, intelligence testing is also offered to “high-roller” clients. According to this employee, another source close to the company, and reporting in the Washington Post, Musk used Orchid’s services in the conception of at least one of the children he shares with the tech executive Shivon Zilis. (Orchid, Musk, and Zilis did not respond to requests for comment.)
I met Kian Sadeghi, the 25-year-old founder of New York-based Nucleus Genomics, on a sweltering July afternoon in his SoHo office. Slight and kinetic, Sadeghi spoke at a machine-gun pace, pausing only occasionally to ask if I was keeping up.
Sadeghi had modified his first organism, a sample of brewer’s yeast, at the age of 16. As a high schooler in 2016, he was taking a course on CRISPR-Cas9 at a Brooklyn laboratory when he fell in love with the “beautiful depth” of genetics. Just a few years later, he dropped out of college to build “a better 23andMe.”
His company targets what you might call the application layer of PGT-P, accepting data from IVF clinics (and even from the competitors mentioned in this story) and running its own computational analysis.
“Unlike a lot of the other testing companies, we’re software first, and we’re consumer first,” Sadeghi told me. “It’s not enough to give someone a polygenic score. What does that mean? How do you compare them? There’s so many really hard design problems.”
Like its competitors, Nucleus calculates its polygenic risk scores by comparing an individual’s genetic data with trait-associated variants identified in large GWASs, providing statistically informed predictions.
Nucleus provides two displays of a patient’s results: a Z-score, plotted from −4 to 4, which explains the risk of a certain trait relative to a population with similar genetic ancestry (for example, if Embryo #3 has a 2.1 Z-score for breast cancer, its risk is higher than average), and an absolute risk score, which includes relevant clinical factors (Embryo #3 has a minuscule actual risk of breast cancer, given that it is male).
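To make the arithmetic behind a polygenic score and its Z-score concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not any company's actual method: the function names, the three effect sizes, and the four-person reference panel are all hypothetical, and real scores draw on far more variants, with weights taken from published GWASs.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies) by per-variant effect size."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def z_score(raw, reference_scores):
    """Standardize a raw score against the scores of a reference population."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference scores have no spread")
    return (raw - mu) / sigma


# Hypothetical effect sizes for three variants (made up for illustration).
weights = [0.5, -0.25, 1.0]

# Hypothetical reference panel: allele dosages for four people of similar ancestry.
reference_panel = [[0, 1, 0], [1, 1, 1], [2, 0, 1], [1, 2, 0]]
reference_scores = [polygenic_score(p, weights) for p in reference_panel]

# Hypothetical embryo genotype at the same three variants.
embryo_raw = polygenic_score([2, 0, 2], weights)
embryo_z = z_score(embryo_raw, reference_scores)

print(f"raw score: {embryo_raw:.2f}, Z-score: {embryo_z:.2f}")
```

A Z-score like this only ranks an embryo relative to the reference panel. Turning it into an absolute risk requires additional inputs, such as sex and baseline disease prevalence, which the sketch leaves out.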
The real difference between Nucleus and its competitors lies in the breadth of what it claims to offer clients. On its sleek website, prospective parents can sort through more than 2,000 possible diseases, as well as traits from eye color to IQ. Access to the Nucleus Embryo platform costs $8,999, while the company’s new IVF+ offering, which includes one IVF cycle with a partner clinic, embryo screening for up to 20 embryos, and concierge services throughout the process, starts at $24,999.
Its promises are remarkably bold. The company claims to be able to forecast a propensity for anxiety, ADHD, insomnia, and other mental issues. It says you can see which of your embryos are more likely to have alcohol dependence, which are more likely to be left-handed, and which might end up with severe acne or seasonal allergies. (Nevertheless, at the time of writing, the embryo-screening platform provided this disclaimer: “DNA is not destiny. Genetics can be a helpful tool for choosing an embryo, but it’s not a guarantee. Genetic research is still in it’s [sic] infancy, and there’s still a lot we don’t know about how DNA shapes who we are.”)
To people accustomed to sleep trackers, biohacking supplements, and glucose monitoring, taking advantage of Nucleus’s options might seem like a no-brainer. To anyone who welcomes a bit of serendipity in their life, this level of perceived control may be disconcerting to say the least.
Sadeghi likes to frame his arguments in terms of personal choice. “Maybe you want your baby to have blue eyes versus green eyes,” he told a small audience at Nucleus Embryo’s June launch event. “That is up to the liberty of the parents.”
On the official launch day, Sadeghi spent hours gleefully sparring with X users who accused him of practicing eugenics. He rejects the term, favoring instead “genetic optimization,” though it seems he wasn’t too upset about the free viral marketing. “This week we got five million impressions on Twitter,” he told a crowd at the launch event, to a smattering of applause. (In an email to MIT Technology Review, Sadeghi wrote, “The history of eugenics is one of coercion and discrimination by states and institutions; what Nucleus does is the opposite: genetic forecasting that empowers individuals to make informed decisions.”)
Nucleus has raised more than $36 million from investors like Srinivasan, Alexis Ohanian’s venture capital firm Seven Seven Six, and Thiel’s Founders Fund. (Like Siddiqui, Sadeghi was a recipient of a Thiel fellowship when he dropped out of college; a representative for Thiel did not respond to a request for comment for this story.) Sadeghi has even poached Genomic Prediction’s cofounder Nathan Treff, who is now Nucleus’s chief clinical officer.
Sadeghi’s real goal is to build a one-stop shop for every possible application of genetic sequencing technology, from genealogy to precision medicine to genetic engineering. He names a handful of companies providing these services, with a combined market cap in the billions. “Nucleus is collapsing all five of these companies into one,” he says. “We are not an IVF testing company. We are a genetic stack.”
This spring, I elbowed my way into a packed hotel bar in the Flatiron district, where over a hundred people had gathered to hear a talk called “How to create SUPERBABIES.” The event was part of New York’s Deep Tech Week, so I expected to meet a smattering of biotech professionals and investors. Instead, I was surprised to encounter a diverse and curious group of creatives, software engineers, students, and prospective parents, many of whom had come with no previous knowledge of the subject.
The speaker that evening was Jonathan Anomaly, a soft-spoken political philosopher whose didactic tone betrays his years as a university professor.
Some of Anomaly’s academic work has focused on developing theories of rational behavior. At Duke and the University of Pennsylvania, he led introductory courses on game theory, ethics, and collective action problems as well as bioethics, digging into thorny questions about abortion, vaccines, and euthanasia. But perhaps no topic has interested him so much as the emerging field of genetic enhancement.
In 2018, in a bioethics journal, Anomaly published a paper with the intentionally provocative title “Defending Eugenics.” He sought to distinguish what he called “positive eugenics” (noncoercive methods aimed at increasing traits that “promote individual and social welfare”) from the so-called “negative eugenics” we know from our history books.
Anomaly likes to argue that embryo selection isn’t all that different from practices we already take for granted. Don’t believe two cousins should be allowed to have children? Perhaps you’re a eugenicist, he contends. Your friend who picked out a six-foot-two Harvard grad from a binder of potential sperm donors? Same logic.
His hiring at the University of Pennsylvania in 2019 caused outrage among some students, who accused him of “racial essentialism.” In 2020, Anomaly left academia, lamenting that “American universities had become an intellectual prison.”
A few years later, Anomaly joined a nascent PGT-P company named Herasight, which was promising to screen for IQ.
At the end of July, the company officially emerged from stealth mode. A representative told me that most of the money raised so far is from angel investors, including Srinivasan, who also invested in Orchid and Nucleus. According to the launch announcement on X, Herasight has screened “hundreds of embryos” for private customers and is beginning to offer its first publicly available consumer product, a polygenic assessment that claims to detect an embryo’s likelihood of developing 17 diseases.
Their marketing materials boast predictive abilities 122% better than Orchid’s and 193% better than Genomic Prediction’s for this set of diseases. (“Herasight is comparing their current predictor to models we published over five years ago,” Genomic Prediction responded in a statement. “Our team is confident our predictors are world-class and are not exceeded in quality by any other lab.”)
The company did not include comparisons with Nucleus, pointing to the “absence of published performance validations” by that company and claiming it represented a case where “marketing outpaces science.” (“Nucleus is known for world-class science and marketing, and we understand why that’s frustrating to our competitors,” a representative from the company responded in a comment.)
Herasight also emphasized new advances in “within-family validation” (making sure that the scores are not merely picking up shared environmental factors by comparing their performance between unrelated people to their performance between siblings) and “cross-ancestry accuracy” (improving the accuracy of scores for people outside the European ancestry groups where most of the biobank data is concentrated). The representative explained that pricing varies by customer and the number of embryos tested, but it can reach $50,000.
Herasight tests for just one non-disease-related trait: intelligence. For a couple who produce 10 embryos, it claims it can detect an IQ spread of about 15 points, from the lowest-scoring embryo to the highest. The representative says the company plans to release a detailed white paper on its IQ predictor in the future.
The day of Herasight’s launch, Musk responded to the company announcement: “Cool.” Meanwhile, a Danish researcher named Emil Kirkegaard, whose research has largely focused on IQ differences between racial groups, boosted the company to his nearly 45,000 followers on X (as well as in a Substack blog), writing, “Proper embryo selection just landed.” Kirkegaard has in fact supported Anomaly’s work for years; he’s posted about him on X and recommended his 2020 book Creating Future People, which he called a “biotech eugenics advocacy book,” adding: “Naturally, I agree with this stuff!”
When it comes to traits that Anomaly believes are genetically encoded, intelligence, which he claimed in his talk is about 75% heritable, is just the tip of the iceberg. He has also spoken about the heritability of empathy, impulse control, violence, passivity, religiosity, and political leanings.
Anomaly concedes there are limitations to the kinds of relative predictions that can be made from a small batch of embryos. But he believes we’re only at the dawn of what he likes to call the “reproductive revolution.” At his talk, he pointed to a technology currently in development at a handful of startups: in vitro gametogenesis. IVG aims to create sperm or egg cells in a laboratory using adult stem cells, genetically reprogrammed from cells found in a sample of skin or blood. In theory, this process could allow a couple to quickly produce a practically unlimited number of embryos to analyze for preferred traits. Anomaly predicted this technology could be ready to use on humans within eight years.
SELMAN DESIGN
“I doubt the FDA will allow it immediately. That’s what places like Próspera are for,” he said, referring to the so-called “startup city” in Honduras, where scientists and entrepreneurs can conduct medical experiments free from the kinds of regulatory oversight they’d encounter in the US.
“You might have a moral intuition that this is wrong,” said Anomaly, “but when it’s discovered that elites are doing it privately … the dominoes are going to fall very, very quickly.” The coming “evolutionary arms race,” he claimed, will “change the moral landscape.”
He added that some of those elites are his own customers: “I could already name names, but I won’t do it.”
After Anomaly’s talk was over, I spoke with a young photographer who told me he was hoping to pursue a master’s degree in theology. He came to the event, he told me, to reckon with the ethical implications of playing God. “Technology is sending us toward an Old-to-New-Testament transition moment, where we have to decide what parts of religion still serve us,” he said soberly.
Criticisms of polygenic testing tend to fall into two camps: skepticism about the tests’ effectiveness and concerns about their ethics. “On one hand,” says Turley from the Social Science Genetic Association Consortium, “you have arguments saying ‘This isn’t going to work anyway, and the reason it’s bad is because we’re tricking parents, which would be a problem.’ And on the other hand, they say, ‘Oh, this is going to work so well that it’s going to lead to enormous inequalities in society.’ It’s just funny to see. Sometimes these arguments are being made by the same people.”
One of those people is Sasha Gusev, who runs a quantitative genetics lab at the Dana-Farber Cancer Institute. A vocal critic of PGT-P for embryo selection, he also often engages in online debates with the far-right accounts promoting race science on X.
Gusev is one of many professionals in his field who believe that because of numerous confounding socioeconomic factors (for example, childhood nutrition, geography, personal networks, and parenting styles) there isn’t much point in trying to trace outcomes like educational attainment back to genetics, particularly not as a way to prove that there’s a genetic basis for IQ.
He adds, “I think there’s a real risk in moving toward a society where you see genetics and ‘genetic endowments’ as the drivers of people’s behavior and as a ceiling on their outcomes and their capabilities.”
Gusev thinks there is real promise for this technology in clinical settings among specific adult populations. For adults identified as having high polygenic risk scores for cancer and cardiovascular disease, he argues, a combination of early screening and intervention could be lifesaving. But when it comes to the preimplantation testing currently on the market, he thinks there are significant limitations, and few regulatory measures or long-term validation methods to check the promises companies are making. He fears that giving these services too much attention could backfire.
“These reckless, overpromised, and oftentimes just straight-up manipulative embryo selection applications are a risk for the credibility and the utility of these clinical tools,” he says.
Many IVF patients have also had strong reactions to publicity around PGT-P. When the New York Times published an opinion piece about Orchid in the spring, angry parents took to Reddit to rant. One user posted, “For people who dont [sic] know why other types of testing are necessary or needed this just makes IVF people sound like we want to create ‘perfect’ babies, while we just want (our) healthy babies.”
Still, others defended the need for a conversation. “When could technologies like this change the mission from helping infertile people have healthy babies to eugenics?” one Redditor posted. “It’s a fine line to walk and an important discussion to have.”
Some PGT-P proponents, like Kirkegaard and Anomaly, have argued that policy decisions should more explicitly account for genetic differences. In a series of blog posts following the 2024 presidential election, under the header “Make science great again,” Kirkegaard called for ending affirmative action laws, legalizing race-based hiring discrimination, and removing restrictions on data sets like the NIH’s All of Us biobank that prevent researchers like him from using the data for race science. Anomaly has criticized social welfare policies for putting a finger on the scale to “punish the high-IQ people.”
Indeed, the notion of genetic determinism has gained some traction among loyalists to President Donald Trump.
In October 2024, Trump himself made a campaign stop on the conservative radio program The Hugh Hewitt Show. He began a rambling answer about immigration and homicide statistics. “A murderer, I believe this, it’s in their genes. And we got a lot of bad genes in our country right now,” he told the host.
Gusev believes that while embryo selection won’t have much impact on individual outcomes, the intellectual framework endorsed by many PGT-P advocates could have dire social consequences.
“If you just think of the differences that we observe in society as being cultural, then you help people out. You give them better schooling, you give them better nutrition and education, and they’re able to excel,” he says. “If you think of these differences as being strongly innate, then you can fool yourself into thinking that there’s nothing that can be done and people just are what they are at birth.”
For the time being, there are no plans for longitudinal studies to track actual outcomes for the humans these companies have helped bring into the world. Harden, the behavioral geneticist from UT Austin, suspects that 25 years down the line, adults who were once embryos selected on the basis of polygenic risk scores are “going to end up with the same question that we all have.” They will look at their life and wonder, “What would’ve had to change for it to be different?”
Julia Black is a Brooklyn-based features writer and a reporter in residence at Omidyar Network. She has previously worked for Business Insider, Vox, The Information, and Esquire.
It’s the 25th of June and I’m shivering in my lab-issued underwear in Fort Worth, Texas. Libby Cowgill, an anthropologist in a furry parka, has wheeled me and my cot into a metal-walled room set to 40 °F. A loud fan pummels me from above and siphons the dregs of my body heat through the cot’s mesh from below. A large respirator fits snug over my nose and mouth. The device tracks carbon dioxide in my exhales, a proxy for how my metabolism speeds up or slows down throughout the experiment. Eventually Cowgill will remove my respirator to slip a wire-thin metal temperature probe several pointy inches into my nose.
Cowgill and a graduate student quietly observe me from the corner of their so-called “climate chamber.” Just a few hours earlier I’d sat beside them to observe as another volunteer, a 24-year-old personal trainer, endured the cold. Every few minutes, they measured his skin temperature with a thermal camera, his core temperature with a wireless pill, and his blood pressure and other metrics that hinted at how his body handles extreme cold. He lasted almost an hour without shivering; when my turn comes, I shiver aggressively on the cot for nearly an hour straight.
I’m visiting Texas to learn about this experiment on how different bodies respond to extreme climates. “What’s the record for fastest to shiver so far?” I jokingly ask Cowgill as she tapes biosensing devices to my chest and legs. After I exit the cold, she surprises me: “You, believe it or not, were not the worst person we’ve ever seen.”
Climate change forces us to reckon with the knotty science of how our bodies interact with the environment.
Cowgill is a 40-something anthropologist at the University of Missouri who powerlifts and teaches CrossFit in her spare time. She’s small and strong, with dark bangs and geometric tattoos. Since 2022, she’s spent the summers at the University of North Texas Health Science Center tending to these uncomfortable experiments. Her team hopes to revamp the science of thermoregulation.
While we know in broad strokes how people thermoregulate, the science of keeping warm or cool is mottled with blind spots. “We have the general picture. We don’t have a lot of the specifics for vulnerable groups,” says Kristie Ebi, an epidemiologist with the University of Washington who has studied heat and health for over 30 years. “How does thermoregulation work if you’ve got heart disease?”
“Epidemiologists have particular tools that they’re applying for this question,” Ebi continues. “But we do need more answers from other disciplines.”
Climate change is subjecting vulnerable people to temperatures that push their limits. In 2023, about 47,000 heat-related deaths are believed to have occurred in Europe. Researchers estimate that climate change could add an extra 2.3 million European heat deaths this century. That’s heightened the stakes for solving the mystery of just what happens to bodies in extreme conditions.
Extreme temperatures already threaten large stretches of the world. Populations across the Middle East, Asia, and sub-Saharan Africa regularly face highs beyond widely accepted levels of human heat tolerance. Swaths of the southern US, northern Europe, and Asia now also endure unprecedented lows: The 2021 Texas freeze killed at least 246 people, and a 2023 polar vortex sank temperatures in China’s northernmost city to a hypothermic record of −63.4 °F.
This change is here, and more is coming. Climate scientists predict that limiting emissions can prevent lethal extremes from encroaching elsewhere. But if emissions keep course, fierce heat and even cold will reach deeper into every continent. About 2.5 billion people in the world’s hottest places don’t have air-conditioning. When people do, it can make outdoor temperatures even worse, intensifying the heat island effect in dense cities. And neither AC nor radiators are much help when heat waves and cold snaps capsize the power grid.
COURTESY OF MAX G. LEVY
“You, believe it or not, were not the worst person we’ve ever seen,” the author was told after enduring Cowgill’s “climate chamber.”
Through experiments like Cowgill’s, researchers around the world are revising rules about when extremes veer from uncomfortable to deadly. Their findings change how we should think about the limits of hot and cold, and how to survive in a new world.
Embodied change
Archaeologists have known for some time that we once braved colder temperatures than anyone previously imagined. Humans pushed into Eurasia and North America well before the last glacial period ended about 11,700 years ago. We were the only hominins to make it out of this era. Neanderthals, Denisovans, and Homo floresiensis all went extinct. We don’t know for certain what killed those species. But we do know that humans survived thanks to protection from clothing, large social networks, and physiological flexibility. Human resilience to extreme temperature is baked into our bodies, behavior, and genetic code. We wouldn’t be here without it.
“Our bodies are constantly in communication with the environment,” says Cara Ocobock, an anthropologist at the University of Notre Dame who studies how we expend energy in extreme conditions. She has worked closely with Finnish reindeer herders and Wyoming mountaineers.
But the relationship between bodies and temperature is surprisingly still a mystery to scientists. In 1847, the anatomist Carl Bergmann observed that animal species grow larger in cold climates. The zoologist Joel Asaph Allen noted in 1877 that cold-dwellers had shorter appendages. Then there’s the nose thing: In the 1920s, the British anthropologist Arthur Thomson theorized that people in cold places have relatively long, narrow noses, the better to heat and humidify the air they take in. These theories stemmed from observations of animals like bears and foxes, and others that followed stemmed from studies comparing the bodies of cold-accustomed Indigenous populations with white male control groups. Some, like those having to do with optimization of surface area, do make sense: It seems reasonable that a tall, thin body increases the amount of skin available to dump excess heat. The problem is, scientists have never actually tested this stuff in humans.
Some of what we know about temperature tolerance thus far comes from century-old race science or assumptions that anatomy controls everything. But science has evolved. Biology has matured. Childhood experiences, lifestyles, fat cells, and wonky biochemical feedback loops can contribute to a picture of the body as more malleable than anything imagined before. And that’s prompting researchers to change how they study it.
“If you take someone who’s super long and lanky and lean and put them in a cold climate, are they gonna burn more calories to stay warm than somebody who’s short and broad?” Ocobock says. “No one’s looked at that.”
Ocobock and Cowgill teamed up with Scott Maddux and Elizabeth Cho at the Center for Anatomical Sciences at the University of North Texas Health Fort Worth. All four are biological anthropologists who have also puzzled over whether the rules Bergmann, Allen, and Thomson proposed are actually true.
For the past four years, the team has been studying how factors like metabolism, fat, sweat, blood flow, and personal history control thermoregulation.
Your native climate, for example, may influence how you handle temperature extremes. In a unique study of mortality statistics from 1980s Milan, Italians raised in warm southern Italy were more likely to survive heat waves in the northern part of the country.
Similar trends have appeared in cold climes. Researchers often measure cold tolerance by a person’s “brown adipose,” a type of fat that is specialized for generating heat (unlike white fat, which primarily stores energy). Brown fat is a cold adaptation because it delivers heat without the mechanism of shivering. Studies have linked it to living in cold climates, particularly at young ages. Wouter van Marken Lichtenbelt, the physiologist at Maastricht University who with colleagues discovered brown fat in adults, has shown that this tissue can further activate with cold exposure and even help regulate blood sugar and influence how the body burns other fat.
That adaptability served as an early clue for the Texas team. They want to know how a person’s response to hot and cold correlates with height, weight, and body shape. What is the difference, Maddux asks, between “a male who’s 6 foot 6 and weighs 240 pounds” and someone else in the same environment “who’s 4 foot 10 and weighs 89 pounds”? But the team also wondered if shape was only part of the story.
Their multi-year experiment uses tools that anthropologists couldnβt have imagined a century agoβdevices that track metabolism in real time and analyze genetics. Each participant gets a CT scan (measuring body shape), a DEXA scan (estimating percentages of fat and muscle), high-resolution 3D scans, and DNA analysis from saliva to examine ancestry genetically.Β
Volunteers lie on a cot in underwear, as I did, for about 45 minutes in each climate condition, all on separate days. Thereβs dry cold, around 40 Β°F, akin to braving a walk-in refrigerator. Then dry heat and humid heat: 112 Β°F with 15% humidity and 98Β Β°F with 85% humidity. They call it βgoing to Vegasβ and βgoing to Houston,β says Cowgill. The chamber session is long enough to measure an effect, but short enough to be safe.Β
Before I traveled to Texas, Cowgill told me she suspected the old rules would fall. Studies linking temperature tolerance to race and ethnicity, for example, seemed tenuous because biological anthropologists today reject the concept of distinct races. Itβs a false premise, she told me: βNo one in biological anthropology would argue that human beings do not vary across the globeβthatβs obvious to anyone with eyes. [But] you canβt draw sharp borders around populations.βΒ
She added, βI think thereβs a substantial possibility that we spend four years testing this and find out that really, limb length, body mass, surface area [β¦] are not the primary things that are predicting how well you do in cold and heat.βΒ
Adaptable to a degree
In July 1995, a week-long heat wave pushed Chicago above 100 °F, killing roughly 500 people. Thirty years later, Ollie Jay, a physiologist at the University of Sydney, can duplicate the conditions of that exceptionally humid heat wave in a climate chamber at his laboratory.
"We can simulate the Chicago heat wave of '95. The Paris heat wave of 2003. The heat wave [in early July of this year] in Europe," Jay says. "As long as we've got the temperature and humidity information, we can re-create those conditions."
"Everybody has quite an intimate experience of feeling hot, so we've got 8 billion experts on how to keep cool," he says. Yet our internal sense of when heat turns deadly is unreliable. Even professional athletes overseen by experienced medics have died after missing dangerous warning signs. And little research has been done to explore how vulnerable populations such as elderly people, those with heart disease, and low-income communities with limited access to cooling respond to extreme heat.
Jay's team researches the most effective strategies for surviving it. He lambastes air-conditioning, saying it demands so much energy that it can aggravate climate change in "a vicious cycle." Instead, he has monitored people's vital signs while they use fans and skin mists to endure three hours in humid and dry heat. In results published last year, his research found that fans reduced cardiovascular strain by 86% for people with heart disease in the type of humid heat familiar in Chicago.
Dry heat was a different story. In that simulation, fans not only didn't help but actually doubled the rate at which core temperatures rose in healthy older people.
Heat kills. But not without a fight. Your body must keep its internal temperature in a narrow window flanking 98 °F by less than two degrees. The simple fact that you're alive means you are producing heat. Your body needs to export that heat without amassing much more. The nervous system relaxes narrow blood vessels along your skin. Your heart rate increases, propelling more warm blood to your extremities and away from your organs. You sweat. And when that sweat evaporates, it carries a torrent of body heat away with it.
This thermoregulatory response can be trained. Studies by van Marken Lichtenbelt have shown that exposure to mild heat increases sweat capacity, decreases blood pressure, and drops resting heart rate. Long-term studies based on Finnish saunas suggest similar correlations.
The body may adapt protectively to cold, too. In this case, body heat is your lifeline. Shivering and exercise help keep bodies warm. So can clothing. Cardiovascular deaths are thought to spike in cold weather. But people more adapted to cold seem better able to reroute their blood flow in ways that keep their organs warm without dropping their temperature too many degrees in their extremities.
Earlier this year, the biological anthropologist Stephanie B. Levy (no relation) reported that New Yorkers who experienced lower average temperatures had more productive brown fat, adding evidence for the idea that the inner workings of our bodies adjust to the climate throughout the year and perhaps even throughout our lives. "Do our bodies hold a biological memory of past seasons?" Levy wonders. "That's still an open question. There's some work in rodent models to suggest that that's the case."
Although people clearly acclimatize with enough strenuous exposures to either cold or heat, Jay says, "you reach a ceiling." Consider sweat: Heat exposure can increase the amount you sweat only until your skin is completely saturated. It's a nonnegotiable physical limit. Any additional sweat just means leaking water without carrying away any more heat. "I've heard people say we'll just find a way of evolving out of this, we'll biologically adapt," Jay says. "Unless we're completely changing our body shape, then that's not going to happen."
And body shape may not even sway thermoregulation as much as previously believed. The subject I observed, a personal trainer, appeared outwardly adapted for cold: his broad shoulders didn't even fit in a single CT scan image. Cowgill supposed that this muscle mass insulated him. When he emerged from his session in the 40 °F environment, though, he had finally started shivering, and intensely. The researchers covered him in a heated blanket. He continued shivering. Driving to lunch over an hour later in a hot car, he still mentioned feeling cold. An hour after that, a finger prick drew no blood, a sign that blood vessels in his extremities remained constricted. His body temperature fell about half a degree C in the cold session (a significant drop), and his wider build did not appear to shield him from the cold as well as my involuntary shivering protected me.
I asked Cowgill if perhaps there is no such thing as being uniquely predisposed to hot or cold. "Absolutely," she said.
A hot mess
So if body shape doesn't tell us much about how a person maintains body temperature, and acclimation also runs into limits, then how do we determine how hot is too hot?
In 2010 two climate change researchers, Steven Sherwood and Matthew Huber, argued that regions around the world become uninhabitable at wet-bulb temperatures of 35 °C, or 95 °F. (Wet-bulb measurements are a way to combine air temperature and relative humidity.) Above 35 °C, a person simply wouldn't be able to dissipate heat quickly enough. But it turns out that their estimate was too optimistic.
Researchers "ran with" that number for a decade, says Daniel Vecellio, a bioclimatologist at the University of Nebraska, Omaha. "But the number had never been actually empirically tested." In 2021 a Pennsylvania State University physiologist, W. Larry Kenney, worked with Vecellio and others to test wet-bulb limits in a climate chamber. Kenney's lab investigates which combinations of temperature, humidity, and time push a person's body over the edge.
Not long after, the researchers came up with their own wet-bulb limit of human tolerance: below 31 °C in warm, humid conditions for the youngest cohort, people in their thermoregulatory prime. Their research suggests that a day reaching 98 °F and 65% humidity, for example, poses danger in a matter of hours, even for healthy people.
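To make the wet-bulb figures concrete, here is a minimal Python sketch of how air temperature and relative humidity can be folded into a single number. It uses Stull's 2011 empirical approximation, a swapped-in convenience formula and not necessarily the method used by the researchers quoted here. It assumes sea-level pressure and is only meant for roughly 5% to 99% humidity and -20 to 50 °C. The function names are hypothetical.

```python
import math


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def wet_bulb_stull_c(air_temp_c: float, rel_humidity_pct: float) -> float:
    """Approximate wet-bulb temperature in deg C (Stull 2011 empirical fit).

    Assumption: standard sea-level pressure. Inputs are air temperature
    in deg C and relative humidity in percent (for example, 65 for 65%).
    """
    t = air_temp_c
    rh = rel_humidity_pct
    return (
        t * math.atan(0.151977 * math.sqrt(rh + 8.313659))
        + math.atan(t + rh)
        - math.atan(rh - 1.676331)
        + 0.00391838 * rh ** 1.5 * math.atan(0.023101 * rh)
        - 4.686035
    )


if __name__ == "__main__":
    # The example day mentioned in the text: 98 deg F at 65% relative humidity.
    tw = wet_bulb_stull_c(fahrenheit_to_celsius(98.0), 65.0)
    print(f"Approximate wet-bulb temperature: {tw:.1f} C")
```

Under these assumptions, the 98 °F, 65% humidity day works out to a wet-bulb temperature of roughly 31 °C, right around the limit reported for young adults and well short of the 35 °C figure proposed in 2010.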
Photo: Cowgill and her colleagues Elizabeth Cho (top) and Scott Maddux prepare graduate student Joanna Bui for a "room-temperature test." Credit: Justin Clemons.
In 2023, Vecellio and Huber teamed up, combining the growing arsenal of lab data with state-of-the-art climate simulations to predict where heat and humidity most threatened global populations: first the Middle East and South Asia, then sub-Saharan Africa and eastern China. And assuming that warming reaches 3 to 4 °C over preindustrial levels this century, as predicted, parts of North America, South America, and northern and central Australia will be next.
Last June, Vecellio, Huber, and Kenney co-published an article revising the limits that Huber had proposed in 2010. "Why not 35 °C?" explained why the human limits have turned out to be lower than expected. Those initial estimates overlooked the fact that our skin temperature can quickly jump above 101 °F in hot weather, for example, making it harder to dump internal heat.
The Penn State team has published deep dives on how heat tolerance changes with sex and age. Older participants' wet-bulb limits wound up being even lower (between 27 and 28 °C in warm, humid conditions) and varied more from person to person than they did in young people. "The conditions that we experience now, especially here in North America and Europe, places like that, are well below the limits that we found in our research," Vecellio says. "We know that heat kills now."
What this fast-growing body of research suggests, Vecellio stresses, is that you can't define heat risk by just one or two numbers. Last year, he and researchers at Arizona State University pulled up the hottest 10% of hours between 2005 and 2020 for each of 96 US cities. They wanted to compare recent heat-health research with historical weather data for a new perspective: How frequently is it so hot that people's bodies can't compensate for it? Over 88% of those "hot hours" met that criterion for people in full sun. In the shade, most of those heat waves became meaningfully less dangerous.
"There's really almost no one who 'needs' to die in a heat wave," says Ebi, the epidemiologist. "We have the tools. We have the understanding. Essentially all [those] deaths are preventable."
More than a number
A year after visiting Texas, I called Cowgill to hear what she was thinking after four summers of chamber experiments. She told me that the only rule about hot and cold she currently stands behind is ... well, none.
She recalled a recent participant, the smallest man in the study, weighing 114 pounds. "He shivered like a leaf on a tree," Cowgill says. Normally, a strong shiverer warms up quickly. Core temperature may even climb a little. "This [guy] was just shivering and shivering and shivering and not getting any warmer," she says. She doesn't know why this happened. "Every time I think I get a picture of what's going on in there, we'll have one person come in and just kind of be a complete exception to the rule," she says, adding that you can't just gloss over how much human bodies vary inside and out.
The same messiness complicates physiology studies.
Jay looks to embrace bodily complexities by improving physiological simulations of heat and the human strain it causes. He's piloted studies that input a person's activity level and type of clothing to predict core temperature, dehydration, and cardiovascular strain based on the particular level of heat. One can then estimate the person's risk on the basis of factors like age and health. He's also working on physiological models to identify vulnerable groups, inform early-warning systems ahead of heat waves, and possibly advise cities on whether interventions like fans and mists can help protect residents. "Heat is an all-of-society issue," Ebi says. Officials could better prepare the public for cold snaps this way too.
"Death is not the only thing we're concerned about," Jay adds. Extreme temperatures bring morbidity and sickness and strain hospital systems: "There's all these community-level impacts that we're just completely missing."
Climate change forces us to reckon with the knotty science of how our bodies interact with the environment. Predicting the health effects is a big and messy matter.
The first wave of answers from Fort Worth will materialize next year. The researchers will analyze thermal images to crunch data on brown fat. They'll resolve whether, as Cowgill suspects, your body shape may not sway temperature tolerance as much as previously assumed. "Human variation is the rule," she says, "not the exception."
Max G. Levy is an independent journalist who writes about chemistry, public health, and the environment.
Be honest: Have you ever looked up someone from your childhood on social media with the sole intention of seeing how they've aged?
One of my colleagues, who shall remain nameless, certainly has. He recently shared a photo of a former classmate. "Can you believe we're the same age?" he asked, with a hint of glee in his voice. A relative also delights in this pastime. "Wow, she looks like an old woman," she'll say when looking at a picture of someone she has known since childhood. The years certainly are kinder to some of us than others.
But wrinkles and gray hairs aside, it can be difficult to know how well (or poorly) someone's body is truly aging, under the hood. A person who develops age-related diseases earlier in life, or has other biological changes associated with aging (such as elevated cholesterol or markers of inflammation), might be considered "biologically older" than a similar-age person who doesn't have those changes. Some 80-year-olds will be weak and frail, while others are fit and active.
Doctors have long used functional tests that measure their patients' strength or the distance they can walk, for example, or simply "eyeball" them to guess whether they look fit enough to survive some treatment regimen, says Tamir Chandra, who studies aging at the Mayo Clinic.
But over the past decade, scientists have been uncovering new methods of looking at the hidden ways our bodies are aging. What they've found is changing our understanding of aging itself.
"Aging clocks" are new scientific tools that can measure how our organs are wearing out, giving us insight into our mortality and health. They hint at our biological age. While chronological age is simply how many birthdays we've had, biological age is meant to reflect something deeper. It measures how our bodies are handling the passing of time and, perhaps, lets us know how much more of it we have left. And while you can't change your chronological age, you just might be able to influence your biological age.
The science is still new, and few experts in the field (some of whom affectionately refer to it as "clock world") would argue that an aging clock can definitively reveal an individual's biological age.
But their work is revealing that aging clocks can offer so much more than an insta-brag, a snake-oil pitch, or even just an eye-catching number. In fact, they are helping scientists unravel some of the deepest mysteries in biology: Why do we age? How do we age? When does aging begin? What does it even mean to age?
Ultimately, and most importantly, they might soon tell us whether we can reverse the whole process.
Clocks kick off
The way your genes work can change. Molecules called methyl groups can attach to DNA, controlling the way genes make proteins. This process is called methylation, and it can potentially occur at millions of points along the genome. These epigenetic markers, as they are known, can switch genes on or off, or increase or decrease how much protein they make. They're not part of our DNA, but they influence how it works.
In 2011, Steve Horvath, then a biostatistician at the University of California, Los Angeles, took part in a study that was looking for links between sexual orientation and these epigenetic markers. Steve is straight; he says his twin brother, Markus, who also volunteered, is gay.
That study didn't find a link between DNA methylation and sexual orientation. But when Horvath looked at the data, he noticed a different trend: a very strong link between age and methylation at around 88 points on the genome. He once told me he fell off his chair when he saw it.
Many of the affected genes had already been linked to age-related brain and cardiovascular diseases, but it wasn't clear how methylation might be related to those diseases.
In 2013, Horvath collected methylation data from 8,000 tissue and cell samples to create what he called the Horvath clock, essentially a mathematical model that could estimate age on the basis of DNA methylation at 353 points on the genome. From a tissue sample, it was able to detect a person's age within a range of 2.9 years.
That clock changed everything. Its publication in 2013 marked the birth of "clock world." To some, the possibilities were almost endless. If a model could work out what average aging looks like, it could potentially estimate whether someone was aging unusually fast or slowly. It could transform medicine and fast-track the search for an anti-aging drug. It could help us understand what aging is, and why it happens at all.
The epigenetic clock was a success story in "a field that, frankly, doesn't have a lot of success stories," says João Pedro de Magalhães, who researches aging at the University of Birmingham, UK.
It took a few years, but as more aging researchers heard about the clock, they began incorporating it into their research and even developing their own clocks. Horvath became a bit of a celebrity. Scientists started asking for selfies with him at conferences, he says. Some researchers even made T-shirts bearing the front page of his 2013 paper.
Some of the many other aging clocks developed since have become notable in their own right. Examples include the PhenoAge clock, which incorporates health data such as blood cell counts and signs of inflammation along with methylation, and the Dunedin Pace of Aging clock, which tells you how quickly or slowly a person is aging rather than pointing to a specific age. Many of the clocks measure methylation, but some look at other variables, such as proteins in blood or certain carbohydrate molecules that attach to such proteins.
Today, there are hundreds or even thousands of clocks out there, says Chiara Herzog, who researches aging at King's College London and is a member of the Biomarkers of Aging Consortium. Everyone has a favorite. Horvath himself favors his GrimAge clock, which was named after the Grim Reaper because it is designed to predict time to death.
That clock was trained on data collected from people who were monitored for decades, many of whom died in that period. Horvath won't use it to tell people when they might die of old age, he stresses, saying that it wouldn't be ethical. Instead, it can be used to deliver a biological age that hints at how long a person might expect to live. Someone who is 50 but has a GrimAge of 60 can assume that, compared with the average 50-year-old, they might be a bit closer to the end.
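The arithmetic behind a reading like that can be sketched in a few lines of Python. This is a toy illustration only: it assumes a clock is a simple weighted sum of methylation levels, and the three site names, the weights, and the intercept are invented for the example (a real clock is fitted to thousands of samples and uses hundreds of sites). It shows how an estimated biological age is compared with a chronological one.

```python
def estimate_biological_age(methylation, weights, intercept):
    """Toy clock: intercept plus a weighted sum of methylation levels (0 to 1)."""
    return intercept + sum(weights[site] * methylation[site] for site in weights)


def age_gap(biological_age, chronological_age):
    """Positive values mean biologically older than the calendar age."""
    return biological_age - chronological_age


# Hypothetical model parameters, made up purely for illustration.
TOY_WEIGHTS = {"site_a": 40.0, "site_b": -25.0, "site_c": 30.0}
TOY_INTERCEPT = 20.0

# Hypothetical methylation fractions measured in one person's sample.
sample = {"site_a": 0.75, "site_b": 0.2, "site_c": 0.5}

if __name__ == "__main__":
    bio_age = estimate_biological_age(sample, TOY_WEIGHTS, TOY_INTERCEPT)
    print(f"Estimated biological age: {bio_age:.0f}")
    print(f"Gap versus a chronological age of 50: {age_gap(bio_age, 50):+.0f} years")
```

With these made-up numbers the toy clock returns 60, a gap of 10 years for a 50-year-old. That mirrors the example above without saying anything about how GrimAge itself is computed.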
GrimAge is not perfect. While it can strongly predict time to death given the health trajectory someone is on, no aging clock can predict if someone will start smoking or get a divorce (which generally speeds aging) or suddenly take up running (which can generally slow it). "People are complicated," Horvath tells MIT Technology Review. "There's a huge error bar."
But accuracy is a challenge for all aging clocks. Part of the problem lies in how they were designed. Most of the clocks were trained to link age with methylation. The best clocks will deliver an estimate that reflects how far a person's biology deviates from the average. Aging clocks are still judged on how well they can predict a person's chronological age, but you don't want them to be too close, says Lucas Paulo de Lima Camillo, head of machine learning at Shift Bioscience, who was awarded $10,000 by the Biomarkers of Aging Consortium for developing a clock that could estimate age within a range of 2.55 years.
None of the clocks are precise enough to predict the biological age of a single person. Putting the same biological sample through five different clocks will give you five wildly different results.
LEON EDLER
"There's this paradox," says Camillo. If a clock is really good at predicting chronological age, that's all it will tell you, and it probably won't reveal much about your biological age. No one needs an aging clock to tell them how many birthdays they've had. Camillo says he's noticed that when the clocks get too close to "perfect" age prediction, they actually become less accurate at predicting mortality.
Therein lies the other central issue for scientists who develop and use aging clocks: What is the thing they are really measuring? It is a difficult question for a field whose members notoriously fail to agree on the basics. (Everything from the definition of aging to how it occurs and why is up for debate among the experts.)
They do agree that aging is incredibly complex. A methylation-based aging clock might tell you about how that collection of chemical markers compares across individuals, but at best, it's only giving you an idea of their "epigenetic age," says Chandra. There are probably plenty of other biological markers that might reveal other aspects of aging, he says: "None of the clocks measure everything."
We don't know why some methyl groups appear or disappear with age, either. Are these changes causing damage? Or are they a by-product of it? Are the epigenetic patterns seen in a 90-year-old a sign of deterioration? Or have they been responsible for keeping that person alive into very old age?
To make matters even more complicated, two different clocks can give similar answers by measuring methylation at entirely different regions of the genome. No one knows why, or which regions might be the best ones to focus on.
"The biomarkers have this black-box quality," says Jesse Poganik at Brigham and Women's Hospital in Boston. "Some of them are probably causal, some of them may be adaptive ... and some of them may just be neutral": either "there's no reason for them not to happen" or "they just happen by random chance."
Even the same clock can give you different answers if you put a sample through it more than once. "They're not yet individually predictive," says Herzog. "We don't know what [a clock result] means for a person, [or if] they're more or less likely to develop disease."
And it's why plenty of aging researchers, even those who regularly use the clocks in their work, haven't bothered to measure their own epigenetic age. "Let's say I do a clock and it says that my biological age ... is five years older than it should be," says Magalhães. "So what?" He shrugs. "I don't see much point in it."
You might think this lack of clarity would make aging clocks pretty useless in a clinical setting. But plenty of clinics are offering them anyway. Some longevity clinics are more careful, and will regularly test their patients with a range of clocks, noting their results and tracking them over time. Others will simply offer an estimate of biological age as part of a longevity treatment package.
And then there are the people who use aging clocks to sell supplements. While no drug or supplement has been definitively shown to make people live longer, that hasn't stopped the lightly regulated wellness industry from pushing "treatments" that range from lotions to herbal pills all the way through to stem-cell injections.
Some of these people come to aging meetings. I was in the audience at an event when one CEO took to the stage to claim he had reversed his own biological age by 18 years, thanks to the supplement he was selling. Tom Weldon of Ponce de Leon Health told us his gray hair was turning brown. His biological age was supposedly reversing so rapidly that he had reached "longevity escape velocity."
But if the people who buy his supplements expect some kind of Benjamin Button effect, they might be disappointed. His company hasn't yet conducted a randomized controlled trial to demonstrate any anti-aging effects of that supplement, called Rejuvant. Weldon says that such a trial would take years and cost millions of dollars, and that he'd "have to increase the price of our product more than four times" to pay for one. (The company has so far tested the active ingredient in mice and carried out a provisional trial in people.)
More generally, Horvath says he "gets a bad taste in [his] mouth" when people use the clocks to sell products and "make a quick buck." But he thinks that most of those sellers have genuine faith in both the clocks and their products. "People truly believe their own nonsense," he says. "They are so passionate about what they discovered, they fall into this trap of believing [their] own prejudices."
The accuracy of the clocks is at a level that makes them useful for research, but not for individual predictions. Even if a clock did tell someone they were five years younger than their chronological age, that wouldn't necessarily mean the person could expect to live five years longer, says Magalhães. "The field of aging has long been a rich ground for snake-oil salesmen and hype," he says. "It comes with the territory." (Weldon, for his part, says Rejuvant is the only product that has "clinically meaningful" claims.)
In any case, Magalhães adds that he thinks any publicity is better than no publicity.
And there's the rub. Most people in the longevity field seem to have mixed feelings about the trendiness of aging clocks and how they are being used. They'll agree that the clocks aren't ready for consumer prime time, but they tend to appreciate the attention. Longevity research is expensive, after all. With a surge in funding and an explosion in the number of biotech companies working on longevity, aging scientists are hopeful that innovation and progress will follow.
So they want to be sure that the reputation of aging clocks doesn't end up being tarnished by association. Because while influencers and supplement sellers are using their "biological ages" to garner attention, scientists are now using these clocks to make some remarkable discoveries. Discoveries that are changing the way we think about aging.
How to be young again
Two little mice lie side by side, anesthetized and unconscious, as Jim White prepares his scalpel. The animals are of the same breed but look decidedly different. One is a youthful three-month-old, its fur thick, black, and glossy. By comparison, the second mouse, a 20-month-old, looks a little the worse for wear. Its fur is graying and patchy. Its whiskers are short, and it generally looks kind of frail.
But the two mice are about to have a lot more in common. White, with some help from a colleague, makes incisions along the side of each mouse's body and into the upper part of an arm and leg on the same side. He then carefully stitches the two animals together: membranes, fascia, and skin.
The procedure takes around an hour, and the mice are then roused from their anesthesia. At first, the two still-groggy animals pull away from each other. But within a few days, they seem to have accepted that they now share their bodies. Soon their circulatory systems will fuse, and the animals will share a blood flow too.
White, who studies aging at Duke University, has been stitching mice together for years; he has performed this strange procedure, known as heterochronic parabiosis, more than a hundred times. And he's seen a curious phenomenon occur. The older mice appear to benefit from the arrangement. They seem to get younger.
Experiments with heterochronic parabiosis have been performed for decades, but typically scientists keep the mice attached to each other for only a few weeks, says White. In their experiment, he and his colleagues left the mice attached for three months, equivalent to around 10 human years. The team then carefully separated the animals to assess how each of them had fared. "You'd think that they'd want to separate immediately," says White. "But when you detach them ... they kind of follow each other around."
The most striking result of that experiment was that the older mice who had been attached to a younger mouse ended up living longer than other mice of a similar age. "[They lived] around 10% longer, but [they] also maintained a lot of [their] function," says White. They were more active and maintained their strength for longer, he adds.
When his colleagues, including Poganik, applied aging clocks to the mice, they found that their epigenetic ages were lower than expected. "The young circulation slowed aging in the old mice," says White. The effect seemed to last, too, at least for a little while. "It preserved that youthful state for longer than we expected," he says.
The young mice went the other way and appeared biologically older, both while they were attached to the old mice and shortly after they were detached. But in their case, the effect seemed to be short-lived, says White: "The young mice went back to being young again."
To White, this suggests that something about the "youthful state" might be programmed in some way. That perhaps it is written into our DNA. Maybe we don't have to go through the biological process of aging.
This gets at a central debate in the aging field: What is aging, and why does it happen? Some believe it's simply a result of accumulated damage. Some believe that the aging process is programmed; just as we grow limbs, develop a brain, reach puberty, and experience menopause, we are destined to deteriorate. Others think programs that play an important role in our early development just turn out to be harmful later in life by chance. And there are some scientists who agree with all of the above.
White's theory is that being old is just "a loss of youth," he says. If that's the case, there's a silver lining: Knowing how youth is lost might point toward a way to somehow regain it, perhaps by restoring those youthful programs in some way.
Dogs and dolphins
Horvath's eponymous clock was developed by measuring methylation in DNA samples taken from tissues around the body. It seems to represent aging in all these tissues, which is why Horvath calls it a pan-tissue clock. Given that our organs are thought to age differently, it was remarkable that a single clock could measure aging in so many of them.
But Horvath had ambitious plans for an even more universal clock: a pan-species model that could measure aging in all mammals. He started out, in 2017, with an email campaign that involved asking hundreds of scientists around the world to share samples of tissues from animals they had worked with. He tried zoos, too.
βI learned that people had spent careers collecting [animal] tissues,β he says. βThey had freezers full of [them].β Amenable scientists would ship those frozen tissues, or just DNA, to Horvathβs lab in California, where he would use them to train a new model.
Horvath says he initially set out to profile 30 different species. But he ended up receiving around 15,000 samples from 200 scientists, representing 348 species, including everything from dogs to dolphins. Could a single clock really predict age in all of them?
"I truly felt it would fail," says Horvath. "But it turned out that I was completely wrong." He and his colleagues developed a clock that assessed methylation at 36,000 locations on the genome. The result, which was published in 2023 as the pan-mammalian clock, can estimate the age of any mammal and even the maximum lifespan of the species. The data set is open to anyone who wants to download it, he adds: "I hope people will mine the data to find the secret of how to extend a healthy lifespan."
The pan-mammalian clock suggests that there is something universal about aging: not just that all mammals experience it in a similar way, but that a similar set of genetic or epigenetic factors might be responsible for it.
Comparisons between mammals also support the idea that the slower methylation changes occur, the longer the lifespan of the animal, says Nelly Olova, an epigeneticist who researches aging at the University of Edinburgh in the UK. "DNA methylation slowly erodes with age," she says. "We still have the instructions in place, but they become a little messier." The research in different mammals suggests that cells can take only so much change before they stop functioning.
"There's a finite amount of change that the cell can tolerate," she says. "If the instructions become too messy and noisy ... it cannot support life."
Olova has been investigating exactly when aging clocks first begin to tick; in other words, the point at which aging starts. Clocks are trained on data from volunteers by matching the patterns of methylation on their DNA to their chronological age. The trained clocks are then typically used to estimate the biological age of adults. But they can also be used on samples from children. Or babies. They can be used to work out the biological age of cells that make up embryos.
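The training step described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the method used by Horvath or Olova: the three methylation sites, their baselines and slopes, and the noise level are all invented, the "volunteers" are simulated, and plain least squares stands in for the penalized regression that published clocks typically rely on.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def train_clock(methylation, ages):
    """Fit age = intercept + sum(weight_i * methylation_i) by least squares."""
    rows = [[1.0] + list(sample) for sample in methylation]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * age for r, age in zip(rows, ages)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_age(weights, sample):
    """Apply a trained clock to one methylation profile."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], sample))


def simulate_sample(age, rng):
    """Hypothetical data: methylation fraction at three made-up sites."""
    baselines = [0.20, 0.35, 0.80]
    slopes = [0.004, 0.003, -0.005]  # two sites gain methylation, one loses it
    return [b + s * age + rng.gauss(0, 0.01) for b, s in zip(baselines, slopes)]


rng = random.Random(0)

# "Volunteers" with known chronological ages.
train_ages = [rng.uniform(20, 80) for _ in range(200)]
train_data = [simulate_sample(age, rng) for age in train_ages]
clock = train_clock(train_data, train_ages)

# Held-out samples: compare the clock's estimate with the true age.
test_ages = [rng.uniform(20, 80) for _ in range(50)]
errors = [abs(predict_age(clock, simulate_sample(age, rng)) - age)
          for age in test_ages]
mean_abs_error = sum(errors) / len(errors)
print(f"mean absolute error: {mean_abs_error:.2f} years")
```

Once the weights are fitted, `predict_age` can be pointed at any profile, including one from much younger cells than the clock was trained on, which is the kind of extrapolation the embryo work depends on.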
In her research, Olova used adult skin cells, which (thanks to Nobel Prize-winning research in the 2000s) can be "reprogrammed" back to a state resembling that of the pluripotent stem cells found in embryos. When Olova and her colleagues used a "partial reprogramming" approach to take cells close to that state, they found that the closer they got to the entirely reprogrammed state, the "younger" the cells were.
It was around 20 days after the cells had been reprogrammed into stem cells that they reached the biological age of zero according to the clock used, says Olova. "It was a bit surreal," she says. "The pluripotent cells measure as minus 0.5; they're slightly below zero."
Vadim Gladyshev, a prominent aging researcher at Harvard University, has since proposed that the same negative level of aging might apply to embryos. After all, some kind of rejuvenation happens during the early stages of embryo formation: an aged egg cell and an aged sperm cell somehow create a brand-new cell. The slate is wiped clean.
Gladyshev calls this point "ground zero." He posits that it's reached sometime during the "mid-embryonic state." At this point, aging begins. And so does "organismal life," he argues. "It's interesting how this coincides with philosophical questions about when life starts," says Olova.
Some have argued that life begins when sperm meets egg, while others have suggested that the point when embryonic cells start to form some kind of unified structure is what counts. The ground zero point is when the body plan is set out and cells begin to organize accordingly, she says. "Before that, it's just a bunch of cells."
This doesn't mean that life begins at the embryonic state, but it does suggest that this is when aging begins, perhaps as the result of "a generational clearance of damage," says Poganik.
It is early days (no pun intended) for this research, and the science is far from settled. But knowing when aging begins could help inform attempts to rewind the clock. If scientists can pinpoint an ideal biological age for cells, perhaps they can find ways to get old cells back to that state. There might be a way to slow aging once cells reach a certain biological age, too.
"Presumably, there may be opportunities for targeting aging before ... you're full of gray hair," says Poganik. "It could mean that there is an ideal window for intervention which is much earlier than our current geriatrics-based approach."
When young meets old
When White first started stitching mice together, he would sit and watch them for hours. "I was like, look at them go! They're together, and they don't even care!" he says. Since then, he's learned a few tricks. He tends to work with female mice, for instance; the males tend to bicker and nip at each other, he says. The females, on the other hand, seem to get on well.
The effect their partnership appears to have on their biological ages, if only temporarily, is among the ways aging clocks are helping us understand that biological age is plastic to some degree. White and his colleagues have also found, for instance, that stress seems to increase biological age, but that the effect can be reversed once the stress stops. Both pregnancy and covid-19 infections have a similar reversible effect.
Poganik wonders if this finding might have applications for human organ transplants. Perhaps there's a way to measure the biological age of an organ before it is transplanted and somehow rejuvenate organs before surgery.
But new data from aging clocks suggests that this might be more complicated than it sounds. Poganik and his colleagues have been using methylation clocks to measure the biological age of samples taken from recently transplanted hearts in living people.Β
Young hearts do well in older bodies, but the biological age of these organs eventually creeps up to match that of their recipient. The same is true for older hearts in younger bodies, says Poganik, who has not yet published his findings. "After a few months, the tissue may assimilate the biological age of the organism," he says.
If that's the case, the benefits of young organs might be short-lived. It also suggests that scientists working on ways to rejuvenate individual organs may need to focus their anti-aging efforts on more systemic means of rejuvenation, for example, stem cells that repopulate the blood. Reprogramming these cells to a youthful state, perhaps one a little closer to "ground zero," might be the way to go.
Whole-body rejuvenation might be some way off, but scientists are still hopeful that aging clocks might help them find a way to reverse aging in people.
"We have the machinery to reset our epigenetic clock to a more youthful state," says White. "That means we have the ability to turn the clock backwards."
The story is a collaboration between MIT Technology Review and Aventine, a non-profit research foundation that creates and supports content about how technology and science are changing the way we live.
It's not often you get a text about the robustness of your immune system, but that's what popped up on my phone last spring. Sent by John Tsang, an immunologist at Yale, the text came after his lab had put my blood through a mind-boggling array of newfangled tests. The result (think of it as a full-body, high-resolution CT scan of my immune system) would reveal more about the state of my health than any test I had ever taken. And it could potentially tell me far more than I wanted to know.
"David," the text read, "you are the red dot."
Tsang was referring to an image he had attached to the text that showed a graph with a scattering of black dots representing other people whose immune systems had been evaluated, and a lone red one. There also was a score: 0.35.
I had no idea what any of this meant.
The red dot was the culmination of an immuno-quest I had begun on an autumn afternoon a few months earlier, when a postdoc in Tsang's lab drew several vials of my blood. It was also a significant milestone in a decades-long journey I've taken as a journalist covering life sciences and medicine. Over the years, I've offered myself up as a human guinea pig for hundreds of tests promising new insights into my health and mortality. In 2001, I was one of the first humans to have my DNA sequenced. Soon after, in the early 2000s, researchers tapped into my proteome, the proteins circulating in my blood. Then came assessments of my microbiome, metabolome, and much more. I have continued to test-drive the latest protocols and devices, amassing tens of terabytes of data on myself, and I've reported on the results in dozens of articles and a book called Experimental Man. Over time, the tests have gotten better and more informative, but no test I had previously taken promised to deliver results more comprehensive or closer to revealing the truth about my underlying state of health than what John Tsang was offering.
It also was not lost on me that I'm now 20-plus years older than I was when I took those first tests. Back in my 40s, I was ridiculously healthy. Since then, I've been battered by various pathogens, stresses, and injuries, including two bouts of covid and long covid. And, well, life.
But I'd kept my apprehensions to myself as Tsang, a slim, perpetually smiling man who directs the Yale Center for Systems and Engineering Immunology, invited me into his office in New Haven to introduce me to something called the human immunome.
Made up of 1.8 trillion cells and trillions more proteins, metabolites, mRNA, and other biomolecules, every person's immunome is different, and it is constantly changing. It's shaped by our DNA, past illnesses, the air we have breathed, the food we have eaten, our age, and the traumas and stresses we have experienced: in short, everything we have ever been exposed to physically and emotionally. Right now, your immune system is hard at work identifying and fending off viruses and rogue cells that threaten to turn cancerous, or maybe already have. And it is doing an excellent job of it all, or not, depending on how healthy it happens to be at this particular moment.
Yet as critical as the immunome is to each of us, this universe of cells and molecules has remained largely beyond the reach of modern medicine: a vast yet inaccessible operating system that powerfully influences everything from our vulnerability to viruses and cancer to how well we age to whether we tolerate certain foods better than others.
Now, thanks to a slew of new technologies and to scientists like Tsang, who is on the Steering Committee of the Chan Zuckerberg Biohub New York, understanding this vital and mysterious system is within our grasp, paving the way for powerful new tools and tests to help us better assess, diagnose and treat diseases.
Already, new research is revealing patterns in the ways our bodies respond to stress and disease. Scientists are creating contrasting portraits of weak and robust immunomes, portraits that someday, it's hoped, could offer new insights into patient care and perhaps detect illnesses before symptoms appear. There are plans afoot to deploy this knowledge and technology on a global scale, which would enable scientists to observe the effects of climate, geography, and countless other factors on the immunome. The results could transform what it means to be healthy and how we identify and treat disease.
It all begins with a test that can tell you whether your immune system is healthy or not.
Reading the immunome
Sitting in his office last fall, Tsang, a systems immunologist whose expertise combines computer science and immunology, began my tutorial in immunomics by introducing me to a study that he and his team wrote up in a 2024 paper published in Nature Medicine. It described the results of measurements made on blood samples taken from 270 subjects, tests similar to the ones Tsang's team would be running on me. In the study, Tsang and his colleagues looked at the immune systems of 228 patients diagnosed with a variety of genetic disorders and a control group of 42 healthy people.
To help me visualize what my results might look like, Tsang opened his laptop to reveal several colorful charts from the study, punctuated by black dots representing each person evaluated. The results reminded me vaguely of abstract paintings by Joan Miró. But in place of colorful splotches, whirls, and circles were an assortment of scatter plots, Gantt charts, and heat maps tinted in greens, blues, oranges, and purples.
It all looked like gibberish to me.
Luckily, Tsang was willing to serve as my guide. Flashing his perpetually patient smile, he explained that these colorful jumbles depicted what his team had uncovered about each subject after taking blood samples and assessing the details of how well their immune cells, proteins, mRNA, and other immune system components were doing their job.
The results placed people, represented by the individual dots, on a left-to-right continuum, ranging from those with unhealthy immunomes on the left to those with healthy immunomes on the right. Background colors, meanwhile, were used to identify people with different medical conditions affecting their immune systems. For example, olive-green indicated those with auto-immune disorders; orange backgrounds were designated for individuals with no known disease history. Tsang said he and his team would be placing me on a similar graph after they finished analyzing my blood.
Tsang's measurements go significantly beyond what can be discerned from the handful of immune biomarkers that people routinely get tested for today. "The main immune cell panel typically ordered by a physician is called a CBC differential," he told me. CBC, which stands for "complete blood count," is a decades-old type of analysis that counts levels of red blood cells, hemoglobin, and basic immune cell types (neutrophils, lymphocytes, monocytes, basophils, and eosinophils). Changes in these levels can indicate whether a person's immune system might be reacting to a virus or other infection, cancer, or something else. Other blood tests, like one that looks for elevated levels of C-reactive protein, which can indicate inflammation associated with heart disease, are more specific than the CBC. But they still rely on blunt counting, in this case of certain proteins.
Tsang's assessment, by contrast, tests up to a million cells, proteins, mRNA and immune biomolecules, significantly more than the CBC and others. His protocol is designed to paint a more holistic portrait of a person's immune system not only by counting cells and molecules but also by assessing their interactions. The CBC "doesn't tell me as a physician what the cells being counted are doing," says Rachel Sparks, a clinical immunologist who was the lead author of the Nature Medicine study and is now a translational medicine physician with the drug giant AstraZeneca. "I just know that there are more neutrophils than normal, which may or may not indicate that they're behaving badly. We now have technology that allows us to see at a granular level what a cell is actually doing when a virus appears: how it's changing and reacting."
Such breakthroughs have been made possible thanks to a raft of new and improved technologies that have evolved over the past decade, allowing scientists like Tsang and Sparks to explore the intricacies of the immunome with newfound precision. These include devices that can count myriad different types of cells and biomolecules, as well as advanced sequencers that identify and characterize DNA, RNA, proteins, and other molecules. There are now instruments that also can measure thousands of changes and reactions that occur inside a single immune cell as it reacts to a virus or other threat.
Tsang and Sparks's team used data generated by such measurements to identify and characterize a series of signals distinctive to unhealthy immune systems. Then they used the presence or absence of these signals to create a numerical assessment of the health of a person's immunome, a score they call an "immune health metric," or IHM.
To make sense of the crush of data being collected, Tsang's team used machine-learning algorithms that correlated the results of the many measurements with a patient's known health status and age. They also used AI to compare their findings with immune system data collected elsewhere. All this allowed them to determine and validate an IHM score for each person, and to place it on their spectrum, identifying that person as healthy or not.
It all came together for the first time with the publication of the Nature Medicine paper, in which Tsang and his colleagues reported the results from testing multiple immune variables in the 270 subjects. They also announced a remarkable discovery: Patients with different kinds of diseases reacted with similar disruptions to their immunomes. For instance, many showed a lower level of the aptly named natural killer immune cells, regardless of what they were suffering from. Critically, the immune profiles of those with diagnosed diseases tended to look very different from those belonging to the outwardly healthy people in the study. And, as expected, immune health declined in the older patients.
But then the results got really interesting. In a few cases, the immune systems of unhealthy and healthy people looked similar, with some people appearing near the "healthy" area of the chart even though they were known to have diseases. Most likely this was because their symptoms were in remission and not causing an immune reaction at the moment when their blood was drawn, Tsang told me.
In other cases, people without a known disease showed up on the chart closer to those who were known to be sick. "Some of these people who appear to be in good health are overlapping with pathology that traditional metrics can't spot," says Tsang, whose Nature Medicine paper reported that roughly half the healthy individuals in the study had IHM scores that overlapped with those of people known to be sick. Either these seemingly healthy people had normal immune systems that were busy fending off, say, a passing virus, or their immune systems had been impacted by aging and the vicissitudes of life. Or, potentially more worrisome, they were harboring an illness or stress that was not yet making them ill but might do so eventually.
These findings have obvious implications for medicine. Spotting a low immune score in a seemingly healthy person could make it possible to identify and start treating an illness before symptoms appear, diseases worsen, or tumors grow and metastasize. IHM-style evaluations could also provide clues as to why some people respond differently to viruses like the one that causes covid, and why vaccines, which are designed to activate a healthy immune system, might not work as well in people whose immune systems are compromised.
"One of the more surprising things about the last pandemic was that all sorts of random younger people who seemed very healthy got sick and then they were gone," says Mark Davis, a Stanford immunologist who helped pioneer the science being developed in labs like Tsang's. "Some had underlying conditions like obesity and diabetes, but some did not. So the question is, could we have pointed out that something was off with these folks' immune systems? Could we have diagnosed that and warned people to take extra precautions?"
Tsang's IHM test is designed to answer a simple question: What is the relative health of your immune system? But there are other assessments being developed to provide more detailed information on how the body is doing. Tsang's own team is working on a panel of additional scores aimed at getting finer detail on specific immune conditions. These include a test that measures the health of a person's bone marrow, which makes immune cells. "If you have a bone marrow stress or inflammatory condition in the bone marrow, you could have lower capacity to produce cells, which will be reflected by this score," he says. Another detailed metric will measure protein levels to predict how a person will respond to a virus.
Tsang hopes that an IHM-style test will one day be part of a standard physical exam, a snapshot of a patient's immune system that could inform care. For instance, has a period of intense stress compromised the immune system, making it less able to fend off this season's flu? Will someone's score predict a better or worse response to a vaccine or a cancer drug? How does a person's immune system change with age?
Or, as I anxiously wondered while waiting to learn my own score, will the results reveal an underlying disorder or disease, silently ticking away until it shows itself?
Toward a human immunome project
The quest to create advanced tests like the IHM for the immune system began more than 15 years ago, when scientists like Mark Davis became frustrated with a field in which research, primarily in mice, was focused mostly on individual immune cells and proteins. In 2007 he launched the Stanford Human Immune Monitoring Center, one of the first efforts to conceptualize the human immunome as a holistic, body-wide network in human beings. Speaking by Zoom from his office in Palo Alto, California, Davis told me that the effort had spawned other projects, including a landmark twin study showing that a lot of immune variation is not genetic, as the then-prevailing theory held, but is heavily influenced by environmental factors, a major shift in scientists' understanding.
Davis and others also laid the groundwork for tests like John Tsang's by discovering how a T cell, among the most common and important immune players, can recognize pathogens, cancerous cells, and other threats, triggering defensive measures that can include destroying the threat. This and other discoveries have revealed many of the basic mechanics of how immune cells work, says Davis, "but there's still a lot we have to learn."
One researcher working with Davis in those early days was Shai Shen-Orr, who is now director of the Zimin Institute for AI Solutions in Healthcare at the Technion-Israel Institute of Technology, based in Haifa, Israel. (He's also a frequent collaborator with Tsang.) Shen-Orr, like Tsang, is a systems immunologist. He recalls that in 2007, when he was a postdoc in Davis's lab, immunologists had identified around 100 cell types and a similar number of cytokines, proteins that act as messengers in the immune system. But they weren't able to measure them simultaneously, which limited visibility into how the immune system works as a whole. Today, Shen-Orr says, immunologists can measure hundreds of cell types and thousands of proteins and watch them interact.
Shen-Orr's current lab has developed its own version of an immunome test that he calls IMM-AGE (short for "immune age"), the basics of which were published in a 2019 paper in Nature Medicine. IMM-AGE looks at the composition of people's immune systems: how many of each type of immune cell they have and how these numbers change as they age. His team has used this information primarily to ascertain a person's risk of heart disease.
Shen-Orr also has been a vociferous advocate for expanding the pool of test samples, which now come mostly from Americans and Europeans. "We need to understand why different people in different environments react differently and how that works," he says. "We also need to test a lot more people, maybe millions."
Tsang has seen why a limited sample size can pose problems. In 2013, he says, researchers at the National Institutes of Health came up with a malaria vaccine that was effective for almost everyone who got it during clinical trials conducted in Maryland. "But in Africa," he says, "it only worked for about 25% of the people." He attributes this to the significant differences in genetics, diet, climate, and other environmental factors that cause people's immunomes to develop differently. "Why?" he asks. "What exactly was different about the immune systems in Maryland and Tanzania? That's what we need to understand so we can design personalized vaccines and treatments."
For several years, Tsang and Shen-Orr have advocated going global with testing, "but there has been resistance," Shen-Orr says. "Look, medicine is conservative and moves slowly, and the technology is expensive and labor intensive." They finally got the audience they needed at a 2022 conference in La Jolla, California, convened by the Human Immunome Project, or HIP. (The organization was originally founded in 2016 to create more effective vaccines but had recently changed its name to emphasize a pivot from just vaccines to the wider field of immunome science.) It was in La Jolla that they met HIP's then-new chairperson, Jane Metcalfe, a cofounder of Wired magazine, who saw what was at stake.
"We've got all of these advanced molecular immunological profiles being developed," she said, "but we can't begin to predict the breadth of immune system variability if we're only testing small numbers of people in Palo Alto or Tel Aviv. And that's when the big aha moment struck us that we need sites everywhere to collect that information so we can build proper computer models and a predictive understanding of the human immune system."
Following that meeting, HIP created a new scientific plan, with Tsang and Shen-Orr as chief science officers. The group set an ambitious goal of raising around $3 billion over the next 10 years, a goal Tsang and Metcalfe say will be met by working in conjunction with a broad network of public and private supporters. Cutbacks in federal funding for biomedical research in the US may limit funds from this traditional source, but HIP plans to work with government agencies outside the US too, with the goal of creating a comprehensive global immunological database.
HIP's plan is to first develop a pilot version based on Tsang's test, which it will call the Immune Monitoring Kit, to test a few thousand people in Africa, Australia, East Asia, Europe, the US, and Israel. The initial effort, according to Metcalfe, is expected to begin by the end of the year.
After that, HIP would like to expand to some 150 sites around the world, eventually assessing about 250,000 people and collecting a vast cache of data and insights that Tsang believes will profoundly affect, even revolutionize, clinical medicine, public health, and drug development.
My immune health metric score is ...
As HIP develops its pilot study to take on the world, John Tsang, for better or worse, has added one more North American Caucasian male to the small number of people who have received an IHM score to date. That would be me.
It took a long time to get my score, but Tsang didn't leave me hanging once he pinged me the red dot. "We plotted you with other participants who are clinically quite healthy," he texted, referring to a cluster of black dots on the grid he had sent, although he cautioned that the group I'm being compared with includes only a few dozen people. "Higher IHM means better immune health," he wrote, referring to my 0.35 score, which he described as a number on an arbitrary scale. "As you can see, your IHM is right in the middle of a bunch of people 20 years younger."
This was a relief, given that our immune system, like so many other bodily functions, declines with age, though obviously at different rates. Yet I also felt a certain disappointment. To be honest, I had expected more granular detail after having a million or so cells and markers tested, like perhaps some insights on why I got long covid (twice) and others didn't. Tsang and other scientists are working on ways to extract more specific information from the tests. Still, he insists that the single score itself is a powerful tool to understand the general state of our immunomes, indicating the absence or presence of underlying health issues that might not be revealed in traditional testing.
I asked Tsang what my score meant for my future. "Your score is always changing depending on what you're exposed to and due to age," he said, adding that the IHM is still so new that it's hard to know exactly what the score means until researchers do more work, and until HIP can evaluate and compare thousands or hundreds of thousands of people. They also need to keep testing me over time to see how my immune system changes as it's exposed to new perturbations and stresses.
For now, I'm left with a simple number. Though it tells me little about the detailed workings of my immune system, the good news is that it raises no red flags. My immune system, it turns out, is pretty healthy.
A few days after receiving my score from Tsang, I heard from Shen-Orr about more results. Tsang had shared my data with Shen-Orr's lab so that it could run the IMM-AGE protocol on my immunome and provide me with another score to worry about. Shen-Orr's result put the age of my immune system at around 57, still 10 years younger than my true age.
The coming age of the immunome
Shai Shen-Orr imagines a day when people will be able to check their advanced IHM and IMM-AGE scores, or their HIP Immune Monitoring Kit score, on an app after a blood draw, the way they now check health data such as heart rate and blood pressure. Jane Metcalfe talks about linking IHM-type measurements and analyses with rising global temperatures and steamier days and nights to study how global warming might affect the immune system of, say, a newborn or a pregnant woman. "This could be plugged into other people's models and really help us understand the effects of pollution, nutrition, or climate change on human health," she says.
Other clues could also be on the horizon. "At some point we'll have IHM scores that can provide data on who will be most affected by a virus during a pandemic," Tsang says. Maybe that will help researchers engineer an immune system response that shuts down the virus before it spreads. He says it's possible to run a test like that now, but it remains experimental and will take years to fully develop, test for safety and accuracy, and establish standards and protocols for use as a tool of global public health. "These things take a long time," he says.
The same goes for bringing IHM-style tests into the exam room, so doctors like Rachel Sparks can use the results to help treat their patients. “I think in 10 years, with some effort, we really could have something useful,” says Stanford’s Mark Davis. Sparks agrees. “I think by then I’ll be able to use this much more granular understanding of what the immune system is doing at the cellular level in my patients,” she says. “And hopefully we could target our therapies more directly to those cells or pathways that are contributing to disease.”
Personally, I’ll wait for more details with a mix of impatience, curiosity, and at least a hint of concern. I wonder what more the immune circuitry deep inside me might reveal about whether I’m healthy at this very moment, or will be tomorrow, or next month, or years from now.
Something is rotten in the city of Nunapitchuk. In recent years, a crack has formed in the middle of a house. Sewage has leached into the earth. Soil has eroded around buildings, leaving them perched atop precarious lumps of dirt. There are eternal puddles. And mold. The ground can feel squishy, sodden.
This small town in northern Alaska is experiencing a sometimes overlooked consequence of climate change: thawing permafrost. And Nunapitchuk is far from the only Arctic town to find itself in such a predicament.
Permafrost, which lies beneath about 15% of the land in the Northern Hemisphere, is defined as ground that has remained frozen for at least two years. Historically, much of the world’s permafrost has remained solid and stable for far longer, allowing people to build whole towns atop it. But as the planet warms, a process that is happening more rapidly near the poles than at more temperate latitudes, permafrost is thawing and causing a host of infrastructural and environmental problems.
Now scientists think they may be able to use satellite data to delve deep beneath the ground’s surface and get a better understanding of how the permafrost thaws, and which areas might be most severely affected because they had more ice to start with. Clues from the short-term behavior of those especially icy areas, seen from space, could portend future problems.
Using information gathered both from space and on the ground, they are working with affected communities to anticipate whether a house’s foundation will crack, and whether it is worth mending that crack or is better to start over in a new house on a stable hilltop. These scientists’ permafrost predictions are already helping communities like Nunapitchuk make those tough calls.
But it’s not just civilian homes that are at risk. One of the top US intelligence agencies, the National Geospatial-Intelligence Agency (NGA), is also interested in understanding permafrost better. That’s because the same problems that plague civilians in the high north also plague military infrastructure, at home and abroad. The NGA is, essentially, an organization full of space spies: people who analyze data from surveillance satellites and make sense of it for the country’s national security apparatus.
Understanding the potential instabilities of the Alaskan military infrastructure (which includes radar stations that watch for intercontinental ballistic missiles, as well as military bases and National Guard posts) is key to keeping those facilities in good working order and planning for their strengthened future. Understanding the potential permafrost weaknesses that could affect the infrastructure of countries like Russia and China, meanwhile, affords what insiders might call “situational awareness” about competitors.
The work to understand this thawing will only become more relevant, for civilians and their governments alike, as the world continues to warm.
The ground beneath
If you live much below the Arctic Circle, you probably don’t think a lot about permafrost. But it affects you no matter where you call home.
In addition to the infrastructural consequences for real towns like Nunapitchuk, thawing permafrost contains sequestered carbon, twice as much as currently inhabits the atmosphere. As the permafrost thaws, the process can release greenhouse gases into the atmosphere. That release can cause a feedback loop: Warmer temperatures thaw permafrost, which releases greenhouse gases, which warms the air more, which then... you get it.
The microbes themselves, along with previously trapped heavy metals, are also set dangerously free.
For many years, researchers’ primary options for understanding some of these freeze-thaw changes involved hands-on, on-the-ground surveys. But in the late 2000s, Kevin Schaefer, currently a senior scientist at the Cooperative Institute for Research in Environmental Sciences at the University of Colorado Boulder, started to investigate a less labor-intensive idea: using radar systems aboard satellites to survey the ground beneath.
This idea implanted itself in his brain in 2009, when he traveled to a place called Toolik Lake, southwest of the oilfields of Prudhoe Bay in Alaska. One day, after hours of drilling sample cores out of the ground to study permafrost, he was relaxing in the Quonset hut, chatting with colleagues. They began to discuss how space-based radar could potentially detect how the land sinks and heaves back up as temperatures change.
Huh, he thought. Yes, radar probably could do that.
Scientists call the ground right above permafrost the active layer. The water in this layer of soil contracts and expands with the seasons: during the summer, the ice suffusing the soil melts and the resulting decrease in volume causes the ground to dip. During the winter, the water freezes and expands, bulking the active layer back up. Radar can help measure that height difference, which is usually around one to five centimeters.
Schaefer realized that he could use radar to measure the ground elevation at the start and end of the thaw. The electromagnetic waves that bounce back at those two times would have traveled slightly different distances. That difference would reveal the tiny shift in elevation over the seasons and would allow him to estimate how much water had thawed and refrozen in the active layer and how far below the surface the thaw had extended.
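The arithmetic behind that idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method from Schaefer's paper: the function names, the uniform water fraction, and the sample numbers are hypothetical, and real radar processing involves corrections that are left out here.

```python
import math

RHO_WATER = 1000.0  # kg/m^3, liquid water
RHO_ICE = 917.0     # kg/m^3, ice


def phase_to_displacement(delta_phase_rad, wavelength_m):
    """Convert a radar phase difference between two passes into ground motion.

    The signal travels to the ground and back, so a full 2*pi phase cycle
    corresponds to half a wavelength of movement along the line of sight.
    Sign conventions vary between processors; this returns a magnitude-style value.
    """
    return wavelength_m * delta_phase_rad / (4.0 * math.pi)


def thaw_depth_from_subsidence(subsidence_m, water_fraction):
    """Estimate how deep the thaw reached from seasonal subsidence.

    Assumption (hypothetical, for illustration): the thawed soil holds a
    uniform volumetric water fraction at every depth. Ice occupies about 9%
    more volume than the water it melts into, so each meter of thawed soil
    sinks by water_fraction * (RHO_WATER - RHO_ICE) / RHO_ICE meters.
    """
    shrink_per_unit_water = (RHO_WATER - RHO_ICE) / RHO_ICE
    return subsidence_m / (water_fraction * shrink_per_unit_water)


if __name__ == "__main__":
    # Hypothetical inputs: a C-band radar (about 5.55 cm wavelength) and a
    # phase change chosen to represent 2 cm of seasonal subsidence.
    wavelength = 0.0555
    phase = 4.0 * math.pi * 0.02 / wavelength
    dip = phase_to_displacement(phase, wavelength)
    print(f"subsidence: {dip:.3f} m")
    print(f"estimated thaw depth: {thaw_depth_from_subsidence(dip, 0.4):.2f} m")
```

The second function shows why a dip of a few centimeters is informative: because the ice-to-water volume change is small, even a modest surface drop implies a much thicker layer of thawed, water-bearing soil.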
With radar, Schaefer realized, scientists could cover a lot more literal ground, with less effort and at lower cost.
“It took us two years to figure out how to write a paper on it,” he says; no one had ever made those measurements before. He and colleagues presented the idea at the 2010 meeting of the American Geophysical Union and published a paper in 2012 detailing the method, using it to estimate the thickness of the active layer on Alaska’s North Slope.
When they did, they helped start a new subfield that grew as large-scale data sets started to become available around 5 to 10 years ago, says Roger Michaelides, a geophysicist at Washington University in St. Louis and a collaborator of Schaefer’s. Researchers’ efforts were aided by the growth in space radar systems and smaller, cheaper satellites.
With the availability of global data sets (sometimes for free, from government-run satellites like the European Space Agency’s Sentinel) and targeted observations from commercial companies like Iceye, permafrost studies are moving from bespoke regional analyses to more automated, large-scale monitoring and prediction.
The remote view
Simon Zwieback, a geospatial and environmental expert at the University of Alaska Fairbanks, sees the consequences of thawing permafrost firsthand every day. His office overlooks a university parking lot, a corner of which is fenced off to keep cars and pedestrians from falling into a brand-new sinkhole. That area of asphalt had been slowly sagging for more than a year, but over a week or two this spring, it finally started to collapse inward.
[Photo: Kevin Schaefer stands on top of a melting layer of ice near the Alaskan pipeline on the North Slope of Alaska. Courtesy of Kevin Schaefer.]
The new remote research methods are a large-scale version of Zwieback taking in the view from his window. Researchers look at the ground and measure how its height changes as ice thaws and refreezes. The approach can cover wide swaths of land, but it involves making assumptions about what’s going on below the surface: namely, how much ice suffuses the soil in the active layer and permafrost. Thawing areas with relatively low ice content could mimic thinner layers with more ice. And it’s important to differentiate the two, since more ice in the permafrost means more potential instability.
To check that they’re on the right track, scientists have historically had to go out into the field. But a few years ago, Zwieback started to explore a way to make better and deeper estimates of ice content using the available remote sensing data. Finding a way to make those kinds of measurements on a large scale was more than an academic exercise: Areas of what he calls “excess ice” are most liable to cause instability at the surface. “In order to plan in these environments, we really need to know how much ice there is, or where those locations are that are rich in ice,” he says.
Zwieback, who did his undergraduate and graduate studies in Switzerland and Austria, wasn’t always so interested in permafrost, or so deeply affected by it. But in 2014, when he was a doctoral student in environmental engineering, he joined an environmental field campaign in Siberia, at the Lena River Delta, which resembles a gigantic piece of coral fanning out into the Arctic Ocean. Zwieback was near a town called Tiksi, one of the world’s northernmost settlements. It’s a military outpost and starting point for expeditions to the North Pole, featuring an abandoned plane near the ocean. Its Soviet-era concrete buildings sometimes bring it to the front page of the r/UrbanHell subreddit.
Here, Zwieback saw part of the coastline collapse, exposing almost pure ice. It looked like a subterranean glacier, but it was permafrost. “That really had an indelible impact on me,” he says.
Later, as a doctoral student in Zurich and postdoc in Canada, he used his radar skills to understand the rapid changes that the activity of permafrost impressed upon the landscape.
And now, with his job in Fairbanks and his ideas about the use of radar sensing, he has done work funded by the NGA, which has an open Arctic data portal.
In his Arctic research, Zwieback started with the approach underlying most radar permafrost studies: looking at the ground’s seasonal subsidence and heave. “But that’s something that happens very close to the surface,” he says. “It doesn’t really tell us about these long-term destabilizing effects,” he adds.
In warmer summers, he thought, subtle clues would emerge that could indicate how much ice is buried deeper down.
For example, he expected those warmer-than-average periods to exaggerate the amount of change seen on the surface, making it easier to tell which areas are ice-rich. Land that was particularly dense with ice would dip more than it “should,” a precursor of bigger dips to come.
The first step, then, was to measure subsidence directly, as usual. But from there, Zwieback developed an algorithm to ingest data about the subsidence over time, as measured by radar, and other environmental information, like the temperatures at each measurement. He then created a digital model of the land that allowed him to adjust the simulated amount of ground ice and determine when it matched the subsidence seen in the real world. With that, researchers could infer the amount of ice beneath.
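The general pattern described here, adjusting a simulated quantity until the model reproduces the observations, can be sketched as a simple grid search. The Python below is a toy illustration only and is not Zwieback's algorithm: the forward model, the function names, and every number in it are hypothetical assumptions chosen to keep the example small.

```python
def simulate_subsidence(excess_ice_fraction, thaw_depths_m, usual_depth_m):
    """Toy forward model (hypothetical): extra subsidence appears only when the
    thaw front pushes below its usual depth into ice-rich ground, and scales
    with the fraction of that ground that is excess ice."""
    return [
        excess_ice_fraction * max(0.0, depth - usual_depth_m)
        for depth in thaw_depths_m
    ]


def fit_excess_ice(observed_m, thaw_depths_m, usual_depth_m, steps=101):
    """Try evenly spaced ice fractions from 0 to 1 and keep the one whose
    simulated subsidence has the smallest squared misfit to the observations."""
    best_fraction = 0.0
    best_misfit = float("inf")
    for i in range(steps):
        fraction = i / (steps - 1)
        simulated = simulate_subsidence(fraction, thaw_depths_m, usual_depth_m)
        misfit = sum((s - o) ** 2 for s, o in zip(simulated, observed_m))
        if misfit < best_misfit:
            best_fraction, best_misfit = fraction, misfit
    return best_fraction


if __name__ == "__main__":
    # Hypothetical thaw depths for an average, a warm, and a very warm summer.
    thaw_depths = [0.50, 0.55, 0.60]
    usual_depth = 0.50
    # Synthetic "observations" generated with a known ice fraction of 0.30.
    observed = simulate_subsidence(0.30, thaw_depths, usual_depth)
    print(fit_excess_ice(observed, thaw_depths, usual_depth))
```

The toy model also mirrors the reasoning in the text: in an average year the simulated dip is zero whatever the ice content, so only the warmer-than-usual summers carry information about the ice deeper down.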
Next, he made maps of that ice that could potentially be useful to engineers, whether they were planning a new subdivision or, as his funders might be, keeping watch on a military airfield.
“What was new in my work was to look at these much shorter periods and use them to understand specific aspects of this whole system, and specifically how much ice there is deep down,” Zwieback says.
The NGA, which has also funded Schaefer’s work, did not respond to an initial request for comment but did later provide feedback for fact-checking. It removed an article on its website about Zwieback’s grant and its application to agency interests around the time that the current presidential administration began to ban mention of climate change in federal research. But the thawing earth is of keen concern.
To start, the US has significant military infrastructure in Alaska: It’s home to six military bases and 49 National Guard posts, as well as 21 missile-detecting radar sites. Most are vulnerable to thaw now or in the near future, given that 85% of the state is on permafrost.
Beyond American borders, the broader north is in a state of tension. Russia’s relations with Northern Europe are icy. Its invasion of Ukraine has left those countries fearing that they too could be invaded, prompting Sweden and Finland, for instance, to join NATO. The US has threatened takeovers of Greenland and Canada. And China, which has shipping and resource ambitions for the region, is jockeying to surpass the US as the premier superpower.
Permafrost plays a role in the situation. “As knowledge has expanded, so has the understanding that thawing permafrost can affect things NGA cares about, including the stability of infrastructure in Russia and China,” read the NGA article. Permafrost covers 60% of Russia, and thaws have affected more than 40% of buildings in northern Russia already, according to statements from the country’s minister of natural resources in 2021. Experts say critical infrastructure like roads and pipelines is at risk, along with military installations. That could weaken both Russia’s strategic position and the security of its residents. In China, meanwhile, according to a report from the Council on Strategic Risks, important moving parts like the Qinghai-Tibet Railway, “which allows Beijing to more quickly move military personnel near contested areas of the Indian border,” are susceptible to ground thaw, as are oil and gas pipelines linking Russia and China.
In the field
Any permafrost analysis that relies on data from space requires verification on Earth. The hope is that remote methods will become reliable enough to use on their own, but while they’re being developed, researchers must still get their hands muddy with more straightforward and longer-tested physical methods. Some use a network called Circumpolar Active Layer Monitoring, which has existed since 1991, incorporating active-layer data from hundreds of measurement sites across the Northern Hemisphere.
Sometimes, that data comes from people physically probing an area; other sites use tubes permanently inserted into the ground, filled with a liquid that indicates freezing; still others use underground cables that measure soil temperature. Some researchers, like Schaefer, lug ground-penetrating radar systems around the tundra. He’s taken his system to around 50 sites and made more than 200,000 measurements of the active layer.
The field-ready ground-penetrating radar comes in a big box, the size of a steamer trunk, that emits radio pulses. These pulses bounce off the bottom of the active layer, or the top of the permafrost. In this case, the timing of that reflection reveals how thick the active layer is. With handles designed for humans, Schaefer’s team drags this box around the Arctic’s boggier areas.
The box floats. “I do not,” he says. He has vivid memories of tromping through wetlands, his legs pushing straight down through the muck, his body sinking up to his hips.
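How the timing of a radar reflection turns into a thickness can be shown with a short Python sketch. It is an illustration under stated assumptions rather than a description of Schaefer's instrument: the function name and the sample values are hypothetical, and in the field the soil's relative permittivity has to be estimated or calibrated instead of assumed.

```python
import math

SPEED_OF_LIGHT_M_PER_NS = 0.299792458  # meters per nanosecond in a vacuum


def active_layer_thickness(two_way_time_ns, relative_permittivity):
    """Convert a radar echo delay into the depth of the reflecting boundary.

    Radio waves slow down in soil by a factor of sqrt(relative permittivity).
    The pulse travels down and back, so the one-way depth is half the
    distance covered during the measured two-way travel time.
    """
    velocity = SPEED_OF_LIGHT_M_PER_NS / math.sqrt(relative_permittivity)
    return velocity * two_way_time_ns / 2.0


if __name__ == "__main__":
    # Hypothetical values: a 20 ns echo in wet soil with relative permittivity 25.
    print(f"{active_layer_thickness(20.0, 25.0):.2f} m")
```

Because wet soil slows the pulse far more than dry soil, the same echo delay can correspond to quite different depths, which is one reason the permittivity assumption matters.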
[Photo: Andy Parsekian and Kevin Schaefer haul a ground-penetrating radar unit through the tundra near Utqiagvik. Courtesy of Kevin Schaefer.]
Zwieback also needs to verify what he infers from his space data. And so in 2022, he went to the Toolik Field Station, a National Science Foundation-funded ecology research facility along the Dalton Highway and adjacent to Schaefer’s Toolik Lake. This road, which goes from Fairbanks up to the Arctic Ocean, is colloquially called the Haul Road; it was made famous in the TV show Ice Road Truckers. From this access point, Zwieback’s team needed to get deep samples of soil whose ice content could be analyzed in the lab.
Every day, two teams would drive along the Dalton Highway to get close to their field sites. Slamming their car doors, they would unload and hop on snow machines to travel the final distance. Often they would see musk oxen, looking like bison that never cut their hair. The grizzlies were also interested in these oxen, and in the nearby caribou.
At the sites they could reach, they took out a corer, a long, tubular piece of equipment driven by a gas engine, meant to drill deep into the ground. Zwieback or a teammate pressed it into the earth. The barrel’s two blades rotated, slicing a cylinder about five feet down to ensure that their samples went deep enough to generate data that can be compared with the measurements made from space. Then they pulled up and extracted the cylinder, a sausage of earth and ice.
All day every day for a week, they gathered cores that matched up with the pixels in radar images taken from space. In those cores, the ice was apparent to the eye. But Zwieback didn’t want anecdata. “We want to get a number,” he says.
So he and his team would pack their soil cylinders back to the lab. There they sliced them into segments and measured their volume, in both their frozen and their thawed form, to see how well the measured ice content matched estimates from the space-based algorithm.
The initial validation, which took months, demonstrated the value of using satellites for permafrost work. The ice profiles that Zwieback’s algorithm inferred from the satellite data matched measurements in the lab down to about 1.1 feet, and farther in a warm year, with some uncertainty near the surface and deeper into the permafrost.
Whereas it cost tens of thousands of dollars to fly in on a helicopter, drive in a car, and switch to a snowmobile to ultimately sample a small area using your hands, only to have to continue the work at home, the team needed just a few hundred dollars to run the algorithm on satellite data that was free and publicly available.
Michaelides, who is familiar with Zwieback’s work, agrees that estimating excess ice content is key to making infrastructural decisions, and that historical methods of sussing it out have been costly in all senses. Zwieback’s method of using late-summer clues to infer what’s going on at that depth “is a very exciting idea,” he says, and the results “demonstrate that there is considerable promise for this approach.”
He notes, though, that using space-based radar to understand the thawing ground is complicated: Ground ice content, soil moisture, and vegetation can differ even within a single pixel that a satellite can pick out. “To be clear, this limitation is not unique to Simon’s work,” Michaelides says; it affects all space-radar methods. There is also excess ice below even where Zwieback’s algorithm can probe, something the labor-intensive on-ground methods can pick up that still can’t be seen from space.
Mapping out the future
After Zwieback did his fieldwork, NGA decided to do its own. The agency’s attempt to independently validate his work (in Prudhoe Bay, Utqiagvik, and Fairbanks) was part of a project it called Frostbyte.
Its partners in that project (the Army’s Cold Regions Research and Engineering Laboratory and Los Alamos National Laboratory) declined requests for interviews. As far as Zwieback knows, they’re still analyzing data.
But the intelligence community isn’t the only group interested in research like Zwieback’s. He also works with Arctic residents, reaching out to rural Alaskan communities where people are trying to make decisions about whether to relocate or where to build safely. “They typically can’t afford to do expensive coring,” he says. “So the idea is to make these data available to them.”
[Photo: Zwieback and his team haul their gear out to gather data from drilled core samples, a process which can be arduous and costly. Credit: Andrew Johnson.]
Schaefer is also trying to bridge the gap between his science and the people it affects. Through a company called Weather Stream, he is helping communities identify risks to infrastructure before anything collapses, so they can take preventative action.
Making such connections has always been a key concern for Erin Trochim, a geospatial scientist at the University of Alaska Fairbanks. As a researcher who works not just on permafrost but also on policy, she’s seen radar science progress massively in recent years, without commensurate advances on the ground.
For instance, it’s still hard for residents in her town of Fairbanks (or anywhere) to know if there’s permafrost on their property at all, unless they’re willing to do expensive drilling. She’s encountered this problem, still unsolved, on property she owns. And if an expert can’t figure it out, non-experts hardly stand a chance. “It’s just frustrating when a lot of this information that we know from the science side, and [that’s] trickled through the engineering side, hasn’t really translated into the on-the-ground construction,” she says.
There is a group, though, trying to turn that trickle into a flood: Permafrost Pathways, a venture that launched with a $41 million grant through the TED Audacious Project. In concert with affected communities, including Nunapitchuk, it is building a data-gathering network on the ground, and combining information from that network with satellite data and local knowledge to help understand permafrost thaw and develop adaptation strategies.
“I think about it often as if you got a diagnosis of a disease,” says Sue Natali, the head of the project. “It’s terrible, but it’s also really great, because when you know what your problem is and what you’re dealing with, it’s only then that you can actually make a plan to address it.”
And the communities Permafrost Pathways works with are making plans. Nunapitchuk has decided to relocate, and the town and the research group have collaboratively surveyed the proposed new location: a higher spot on hardpacked sand. Permafrost Pathways scientists were able to help validate the stability of the new site, and prove to policymakers that this stability would extend into the future.
Radar helps with that in part, Natali says, because unlike other satellite detectors, it penetrates clouds. “In Alaska, it’s extremely cloudy,” she says. “So other data sets have been very, very challenging. Sometimes we get one image per year.”
And so radar data, and algorithms like Zwieback’s that help scientists and communities make sense of that data, dig up deeper insight into what’s going on beneath northerners’ feet, and how to step forward on firmer ground.
When Kenneth Wehr started managing the Greenlandic-language version of Wikipedia four years ago, his first act was to delete almost everything. It had to go, he thought, if it had any chance of surviving.
Wehr, who’s 26, isn’t from Greenland (he grew up in Germany), but he had become obsessed with the island, an autonomous Danish territory, after visiting as a teenager. He’d spent years writing obscure Wikipedia articles in his native tongue on virtually everything to do with it. He even ended up moving to Copenhagen to study Greenlandic, a language spoken by some 57,000 mostly Indigenous Inuit people scattered across dozens of far-flung Arctic villages.
The Greenlandic-language edition was added to Wikipedia around 2003, just a few years after the site launched in English. By the time Wehr took its helm nearly 20 years later, hundreds of Wikipedians had contributed to it and had collectively written some 1,500 articles totaling tens of thousands of words. It seemed to be an impressive vindication of the crowdsourcing approach that has made Wikipedia the go-to source for information online, demonstrating that it could work even in the unlikeliest places.
There was only one problem: The Greenlandic Wikipedia was a mirage.
Virtually every single article had been published by people who did not actually speak the language. Wehr, who now teaches Greenlandic in Denmark, speculates that perhaps only one or two Greenlanders had ever contributed. But what worried him most was something else: Over time, he had noticed that a growing number of articles appeared to be copy-pasted into Wikipedia by people using machine translators. They were riddled with elementary mistakes, from grammatical blunders to meaningless words to more significant inaccuracies, like an entry that claimed Canada had only 41 inhabitants. Other pages sometimes contained random strings of letters spat out by machines that were unable to find suitable Greenlandic words to express themselves.
“It might have looked Greenlandic to [the authors], but they had no way of knowing,” complains Wehr.
“Sentences wouldn’t make sense at all, or they would have obvious errors,” he adds. “AI translators are really bad at Greenlandic.”
What Wehr describes is not unique to the Greenlandic edition.
Wikipedia is the most ambitious multilingual project after the Bible: There are editions in over 340 languages, and a further 400 even more obscure ones are being developed and tested. Many of these smaller editions have been swamped with automatically translated content as AI has become increasingly accessible. Volunteers working on four African languages, for instance, estimated to MIT Technology Review that between 40% and 60% of articles in their Wikipedia editions were uncorrected machine translations. And after auditing the Wikipedia edition in Inuktitut, an Indigenous language close to Greenlandic that’s spoken in Canada, MIT Technology Review estimates that more than two-thirds of pages containing more than several sentences feature portions created this way.
This is beginning to cause a wicked problem. AI systems, from Google Translate to ChatGPT, learn to “speak” new languages by scraping huge quantities of text from the internet. Wikipedia is sometimes the largest source of online linguistic data for languages with few speakers, so any errors on those pages, grammatical or otherwise, can poison the wells that AI is expected to draw from. That can make the models’ translation of these languages particularly error-prone, which creates a sort of linguistic doom loop as people continue to add more and more poorly translated Wikipedia pages using those tools, and AI models continue to train from poorly translated pages. It’s a complicated problem, but it boils down to a simple concept: Garbage in, garbage out.
“These models are built on raw data,” says Kevin Scannell, a former professor of computer science at Saint Louis University who now builds computer software tailored for endangered languages. “They will try and learn everything about a language from scratch. There is no other input. There are no grammar books. There are no dictionaries. There is nothing other than the text that is inputted.”
There isn’t perfect data on the scale of this problem, particularly because a lot of AI training data is kept confidential and the field continues to evolve rapidly. But back in 2020, Wikipedia was estimated to make up more than half the training data that was fed into AI models translating some languages spoken by millions across Africa, including Malagasy, Yoruba, and Shona. In 2022, a research team from Germany that looked into what data could be obtained by online scraping even found that Wikipedia was the sole easily accessible source of online linguistic data for 27 under-resourced languages.
This could have significant repercussions in cases where Wikipedia is poorly written, potentially pushing the most vulnerable languages on Earth toward the precipice as future generations begin to turn away from them.
“Wikipedia will be reflected in the AI models for these languages,” says Trond Trosterud, a computational linguist at the University of Tromsø in Norway, who has been raising the alarm about the potentially harmful outcomes of badly run Wikipedia editions for years. “I find it hard to imagine it will not have consequences. And, of course, the more dominant position that Wikipedia has, the worse it will be.”
Use responsibly
Automation has been built into Wikipedia since the very earliest days. Bots keep the platform operational: They repair broken links, fix bad formatting, and even correct spelling mistakes. These repetitive and mundane tasks can be automated away with little problem. There is even an army of bots that scurry around generating short articles about rivers, cities, or animals by slotting their names into formulaic phrases. They have generally made the platform better.
But AI is different. Anybody can use it to cause massive damage with a few clicks.
Wikipedia has managed the onset of the AI era better than many other websites. It has not been flooded with AI bots or disinformation, as social media has been. It largely retains the innocence that characterized the earlier internet age. Wikipedia is open and free for anyone to use, edit, and pull from, and it’s run by the very same community it serves. It is transparent and easy to use. But community-run platforms live and die on the size of their communities. English has triumphed, while Greenlandic has sunk.
“We need good Wikipedians. This is something that people take for granted. It is not magic,” says Amir Aharoni, a member of the volunteer Language Committee, which oversees requests to open or close Wikipedia editions. “If you use machine translation responsibly, it can be efficient and useful. Unfortunately, you cannot trust all people to use it responsibly.”
Trosterud has studied the behavior of users on small Wikipedia editions and says AI has empowered a subset that he terms “Wikipedia hijackers.” These users can range widely, from naive teenagers creating pages about their hometowns or their favorite YouTubers to well-meaning Wikipedians who think that by creating articles in minority languages they are in some way “helping” those communities.
“The problem with them nowadays is that they are armed with Google Translate,” Trosterud says, adding that this is allowing them to produce much longer and more plausible-looking content than they ever could before: “Earlier they were armed only with dictionaries.”
This has effectively industrialized the acts of destruction, which affect vulnerable languages most, since AI translations are typically far less reliable for them. There can be lots of different reasons for this, but a meaningful part of the issue is the relatively small amount of source text that is available online. And sometimes models struggle to identify a language because it is similar to others, or because some, including Greenlandic and most Native American languages, have structures that make them badly suited to the way most machine translation systems work. (Wehr notes that in Greenlandic most words are agglutinative, meaning they are built by attaching prefixes and suffixes to stems. As a result, many words are extremely context specific and can express ideas that in other languages would take a full sentence.)
Research produced by Google before a major expansion of Google Translate rolled out three years ago found that translation systems for lower-resourced languages were generally of a lower quality than those for better-resourced ones. Researchers found, for example, that their model would often mistranslate basic nouns across languages, including the names of animals and colors. (In a statement to MIT Technology Review, Google wrote that it is “committed to meeting a high standard of quality for all 249 languages” it supports “by rigorously testing and improving [its] systems, particularly for languages that may have limited public text resources on the web.”)
Wikipedia itself offers a built-in editing tool called Content Translate, which allows users to automatically translate articles from one language to another, the idea being that this will save time by preserving the references and fiddly formatting of the originals. But it piggybacks on external machine translation systems, so it’s largely plagued by the same weaknesses as other machine translators, a problem that the Wikimedia Foundation says is hard to solve. It’s up to each edition’s community to decide whether this tool is allowed, and some have decided against it. (Notably, English-language Wikipedia has largely banned its use, claiming that some 95% of articles created using Content Translate failed to meet an acceptable standard without significant additional work.) But it’s at least easy to tell when the program has been used; Content Translate adds a tag on the Wikipedia back end.
Other AI programs can be harder to monitor. Still, many Wikipedia editors I spoke with said that once their languages were added to major online translation tools, they noticed a corresponding spike in the frequency with which poor, likely machine-translated pages were created.
Some Wikipedians using AI to translate content do occasionally admit that they do not speak the target languages. They may see themselves as providing smaller communities with rough-cut articles that speakers can then fix, essentially following the same model that has worked well for more active Wikipedia editions.
But once error-filled pages are produced in small languages, there is usually not an army of knowledgeable people who speak those languages standing ready to improve them. There are few readers of these editions, and sometimes not a single regular editor.
Yuet Man Lee, a Canadian teacher in his 20s, says that he used a mix of Google Translate and ChatGPT to translate a handful of articles that he had written for the English Wikipedia into Inuktitut, thinking it'd be nice to pitch in and help a smaller Wikipedia community. He says he added a note to one saying that it was only a rough translation. "I did not think that anybody would notice [the article]," he explains. "If you put something out there on the smaller Wikipedias, most of the time nobody does."
But at the same time, he says, he still thought "someone might see it and fix it up," adding that he had wondered whether the Inuktitut translation that the AI systems generated was grammatically correct. Nobody has touched the article since he created it.
Lee, who teaches social sciences in Vancouver and first started editing entries in the English Wikipedia a decade ago, says that users familiar with more active Wikipedias can fall victim to this mindset, which he terms a "bigger-Wikipedia arrogance": When they try to contribute to smaller Wikipedia editions, they assume that others will come along to fix their mistakes. It can sometimes work. Lee says he had previously contributed several articles to Wikipedia in Tatar, a language spoken by several million people mainly in Russia, and at least one of those was eventually corrected. But the Inuktitut Wikipedia is, by comparison, a "barren wasteland."
He emphasizes that his intentions had been good: He wanted to add more articles to an Indigenous Canadian Wikipedia. "I am now thinking that it may have been a bad idea. I did not consider that I could be contributing to a recursive loop," he says. "It was about trying to get content out there, out of curiosity and for fun, without properly thinking about the consequences."
"Totally, completely no future"
Wikipedia is a project that is driven by wide-eyed optimism. Editing can be a thankless task, involving weeks spent bickering with faceless, pseudonymous people, but devotees put in hours of unpaid labor because of a commitment to a higher cause. It is this commitment that drives many of the regular small-language editors I spoke with. They all feared what would happen if garbage continued to appear on their pages.
Abdulkadir Abdulkadir, a 26-year-old agricultural planner who spoke with me over a crackling phone call from a busy roadside in northern Nigeria, said that he spends three hours every day fiddling with entries in his native Fulfulde, a language used mainly by pastoralists and farmers across the Sahel. "But the work is too much," he said.
Abdulkadir sees an urgent need for the Fulfulde Wikipedia to work properly. He has been suggesting it as one of the few online resources for farmers in remote villages, potentially offering information on which seeds or crops might work best for their fields in a language they can understand. If you give them a machine-translated article, Abdulkadir told me, then it could "easily harm them," as the information will probably not be translated correctly into Fulfulde.
Google Translate, for instance, says the Fulfulde word for January means June, while ChatGPT says it's August or September. The programs also suggest the Fulfulde word for "harvest" means "fever" or "well-being," among other possibilities.
Abdulkadir said he had recently been forced to correct an article about cowpeas, a foundational cash crop across much of Africa, after discovering that it was largely illegible.
If someone wants to create pages on the Fulfulde Wikipedia, Abdulkadir said, they should be translated manually. Otherwise, "whoever will read your articles will [not] be able to get even basic knowledge," he tells these Wikipedians. Nevertheless, he estimates that some 60% of articles are still uncorrected machine translations. Abdulkadir told me that unless something important changes with how AI systems learn and are deployed, then the outlook for Fulfulde looks bleak. "It is going to be terrible, honestly," he said. "Totally, completely no future."
Across the country from Abdulkadir, Lucy Iwuala contributes to Wikipedia in Igbo, a language spoken by several million people in southeastern Nigeria. "The harm has already been done," she told me, opening the two most recently created articles. Both had been automatically translated via Wikipedia's Content Translate and contained so many mistakes that she said it would have given her a headache to continue reading them. "There are some terms that have not even been translated. They are still in English," she pointed out. She recognized the username that had created the pages as a serial offender. "This one even includes letters that are not used in the Igbo language," she said.
Iwuala began regularly contributing to Wikipedia three years ago out of concern that Igbo was being displaced by English. It is a worry that is common to many who are active on smaller Wikipedia editions. "This is my culture. This is who I am," she told me. "That is the essence of it all: to ensure that you are not erased."
Iwuala, who now works as a professional translator between English and Igbo, said the users doing the most damage are inexperienced and see AI translations as a way to quickly increase the profile of the Igbo Wikipedia. She often finds herself having to explain at online edit-a-thons she organizes, or over email to various error-prone editors, that the results can be the exact opposite, pushing users away: "You will be discouraged and you will no longer want to visit this place. You will just abandon it and go back to the English Wikipedia."
These fears are echoed by Noah Ha'alilio Solomon, an assistant professor of Hawaiian language at the University of Hawai'i. He reports that some 35% of words on some pages in the Hawaiian Wikipedia are incomprehensible. "If this is the Hawaiian that is going to exist online, then it will do more harm than anything else," he says.
Hawaiian, which was teetering on the verge of extinction several decades ago, has been undergoing a recovery effort led by Indigenous activists and academics. Seeing such poor Hawaiian on such a widely used platform as Wikipedia is upsetting to Ha'alilio Solomon.
"It is painful, because it reminds us of all the times that our culture and language has been appropriated," he says. "We have been fighting tooth and nail in an uphill climb for language revitalization. There is nothing easy about that, and this can add extra impediments. People are going to think that this is an accurate representation of the Hawaiian language."
The consequences of all these Wikipedia errors can quickly become clear. AI translators that have undoubtedly ingested these pages in their training data are now assisting in the production, for instance, of error-strewn AI-generated books aimed at learners of languages as diverse as Inuktitut and Cree, Indigenous languages spoken in Canada, and Manx, a small Celtic language spoken on the Isle of Man. Many of these have been popping up for sale on Amazon. "It was just complete nonsense," says Richard Compton, a linguist at the University of Quebec in Montreal, of a volume he reviewed that had purported to be an introductory phrasebook for Inuktitut.
Rather than making minority languages more accessible, AI is now creating an ever expanding minefield for students and speakers of those languages to navigate. "It is a slap in the face," Compton says. He worries that younger generations in Canada, hoping to learn languages in communities that have fought uphill battles against discrimination to pass on their heritage, might turn to online tools such as ChatGPT or phrasebooks on Amazon and simply make matters worse. "It is fraud," he says.
A race against time
According to UNESCO, a language is declared extinct every two weeks. But whether the Wikimedia Foundation, which runs Wikipedia, has an obligation to the languages used on its platform is an open question. When I spoke to Runa Bhattacharjee, a senior director at the foundation, she said that it was up to the individual communities to make decisions about what content they wanted to exist on their Wikipedia. "Ultimately, the responsibility really lies with the community to see that there is no vandalism or unwanted activity, whether through machine translation or other means," she said. Usually, Bhattacharjee added, editions were considered for closure only if a specific complaint was raised about them.
But if there is no active community, how can an edition be fixed or even have a complaint raised?
Bhattacharjee explained that the Wikimedia Foundation sees its role in such cases as about maintaining the Wikipedia platform in case someone comes along to revive it: "It is the space that we provide for them to grow and develop. That is where we are at."
Inari Saami, spoken in a single remote community in northern Finland, is a poster child for how people can take good advantage of Wikipedia. The language was headed toward extinction four decades ago; there were only four children who spoke it. Their parents created the Inari Saami Language Association in a last-ditch bid to keep it going. The efforts worked. There are now several hundred speakers, schools that use Inari Saami as a medium of instruction, and 6,400 Wikipedia articles in the language, each one copy-edited by a fluent speaker.
This success highlights how Wikipedia can indeed provide small and determined communities with a unique vehicle to promote their languages' preservation. "We don't care about quantity. We care about quality," says Fabrizio Brecciaroli, a member of the Inari Saami Language Association. "We are planning to use Wikipedia as a repository for the written language. We need to provide tools that can be used by the younger generations. It is important for them to be able to use Inari Saami digitally."
This has been such a success that Wikipedia has been integrated into the curriculum at the Inari Saami-speaking schools, Brecciaroli adds. He fields phone calls from teachers asking him to write up simple pages on topics from tornadoes to Saami folklore. Wikipedia has even offered a way to introduce words into Inari Saami. "We have to make up new words all the time," Brecciaroli says. "Young people need them to speak about sports, politics, and video games. If they are unsure how to say something, they now check Wikipedia."
Wikipedia is a monumental intellectual experiment. What's happening with Inari Saami suggests that with maximum care, it can work in smaller languages. "The ultimate goal is to make sure that Inari Saami survives," Brecciaroli says. "It might be a good thing that there isn't a Google Translate in Inari Saami."
That may be true, though large language models like ChatGPT can be made to translate phrases into languages that more traditional machine translation tools do not offer. Brecciaroli told me that ChatGPT isn't great in Inari Saami but that the quality varies significantly depending on what you ask it to do; if you ask it a question in the language, then the answer will be filled with words from Finnish and even words it invents. But if you ask it something in English, Finnish, or Italian and then ask it to reply in Inari Saami, it will perform better.
In light of all this, creating as much high-quality content online as can possibly be written becomes a race against time. "ChatGPT only needs a lot of words," Brecciaroli says. "If we keep putting good material in, then sooner or later, we will get something out. That is the hope." This is an idea supported by multiple linguists I spoke with: that it may be possible to end the "garbage in, garbage out" cycle. (OpenAI, which operates ChatGPT, did not respond to a request for comment.)
Still, the overall problem is likely to grow and grow, since many languages are not as lucky as Inari Saami, and their AI translators will most likely be trained on more and more AI slop. Wehr, unfortunately, seems far less optimistic about the future of his beloved Greenlandic.
Since deleting much of the Greenlandic-language Wikipedia, he has spent years trying to recruit speakers to help him revive it. He has appeared in Greenlandic media and made social media appeals. But he hasn't gotten much of a response; he says it has been demoralizing.
"There is nobody in Greenland who is interested in this, or who wants to contribute," he says. "There is completely no point in it, and that is why it should be closed."
Late last year, he began a process requesting that the Wikipedia Language Committee shut down the Greenlandic-language edition. Months of bitter debate followed between dozens of Wikipedia bureaucrats; some seemed to be surprised that a superficially healthy-seeming edition could be gripped by so many problems.
Then, earlier this month, Wehr's proposal was accepted: Greenlandic Wikipedia is set to be shuttered, and any articles that remain will be moved into the Wikipedia Incubator, where new language editions are tested and built. Among the reasons cited by the Language Committee is the use of AI tools, which have "frequently produced nonsense that could misrepresent the language."
Nevertheless, it may be too late: mistakes in Greenlandic already seem to have become embedded in machine translators. If you prompt either Google Translate or ChatGPT to do something as simple as count to 10 in proper Greenlandic, neither program can deliver.
Jacob Judah is an investigative journalist based in London.
Trend™ Research analyzed a campaign distributing Atomic macOS Stealer (AMOS), a malware family targeting macOS users. Attackers disguise the malware as "cracked" versions of legitimate apps, luring users into installation.
By Gary Miliefsky, Publisher, Cyber Defense Magazine. Every year, Black Hat showcases not just the latest innovations and products from the cybersecurity industry but also the presence of major government...
The Trend Micro™ Managed Detection and Response team uncovered a threat campaign orchestrated by an active group, Water Curse. The threat actor exploits GitHub, one of the most trusted platforms for open-source software, as a delivery channel for weaponized repositories.
We have detected a new tactic involving fake CAPTCHA pages that trick users into executing harmful commands in Windows. This scheme uses disguised files sent via phishing and other malicious methods.
Trend Micro™ Managed XDR assisted in an investigation of a B2B BEC attack that unveiled an entangled mesh woven by the threat actor with the help of a compromised server, ensnaring three business partners in a scheme that spanned days. This article features investigation insights, a proposed incident timeline, and recommended security practices.
In this blog entry, we discuss how the Black Basta and Cactus ransomware groups utilized the BackConnect malware to maintain persistent control and exfiltrate sensitive data from compromised machines.
The Managed XDR team investigated a sophisticated campaign distributing Lumma Stealer through GitHub, where attackers leveraged the platform's release infrastructure to deliver malware such as SectopRAT, Vidar, and Cobeacon.
Our research shows how attackers use platforms like YouTube to spread fake installers via trusted hosting services, employing encryption to evade detection and steal sensitive browser data.
In this blog entry, Trend Micro's Managed XDR team discusses their investigation into how the latest variant of NodeStealer is delivered through spear-phishing attacks, potentially leading to malware execution, data theft, and the exfiltration of sensitive information via Telegram.
In this blog entry, we discuss a social engineering attack that tricked the victim into installing a remote access tool, triggering DarkGate malware activities and an attempted C&C connection.