Reading view

There are new articles available, click to refresh the page.

Accelerating adoption of AI for cybersecurity at DEF CON 33

Posted by Elie Bursztein and Marianna Tishchenko, Google Privacy, Safety and Security Team

Empowering cyber defenders with AI is critical to tilting the cybersecurity balance back in their favor as they battle cybercriminals and keep users safe. To help accelerate adoption of AI for cybersecurity workflows, we partnered with Airbus at DEF CON 33 to host the GenSec Capture the Flag (CTF), dedicated to human-AI collaboration in cybersecurity. Our goal was to create a fun, interactive environment, where participants across various skill levels could explore how AI can accelerate their daily cybersecurity workflows.



At GenSec CTF, nearly 500 participants successfully completed introductory challenges, with 23% of participants using AI for cybersecurity for the very first time. An overwhelming 85% of all participants found the event useful for learning how AI can be applied to security workflows. This positive feedback highlights that AI-centric CTFs can play a vital role in speeding up AI education and adoption in the security community.


The CTF also offered a valuable opportunity for the community to use Sec-Gemini, Google’s experimental Cybersecurity AI, as an optional assistant available in the UI alongside major LLMs. And we received great feedback on Sec-Gemini, with 77% of respondents saying that they had found Sec-Gemini either “very helpful” or “extremely helpful” in assisting them with solving the challenges.  


We want to thank the DEF CON community for the enthusiastic participation and for making this inaugural event a resounding success. The community feedback during the event has been invaluable for understanding how to improve Sec-Gemini, and we are already incorporating some of the lessons learned into the next iteration. 


We are committed to advancing the AI cybersecurity frontier and will continue working with the community to build tools that help protect people online. Stay tuned as we plan to share more research and key learnings from the CTF with the broader community.



Supporting Rowhammer research to protect the DRAM ecosystem

Posted by Daniel Moghimi

Rowhammer is a complex class of vulnerabilities across the industry. It is a hardware vulnerability in DRAM where repeatedly accessing a row of memory can cause bit flips in adjacent rows, leading to data corruption. This can be exploited by attackers to gain unauthorized access to data, escalate privileges, or cause denial of service. Hardware vendors have deployed various mitigations, such as ECC and Target Row Refresh (TRR) for DDR5 memory, to mitigate Rowhammer and enhance DRAM reliability. However, the resilience of those mitigations against sophisticated attackers remains an open question.

To address this gap and help the ecosystem with deploying robust defenses, Google has supported academic research and developed test platforms to analyze DDR5 memory. Our effort has led to the discovery of new attacks and a deeper understanding of Rowhammer on the current DRAM modules, helping to forge the way for further, stronger mitigations.

What is Rowhammer? 

Rowhammer exploits a vulnerability in DRAM. DRAM cells store data as electrical charges, but these electric charges leak over time, causing data corruption. To prevent data loss, the memory controller periodically refreshes the cells. However, if a cell discharges before the refresh cycle, its stored bit may corrupt. Initially considered a reliability issue, it has been leveraged by security researchers to demonstrate privilege escalation attacks. By repeatedly accessing a memory row, an attacker can cause bit flips in neighboring rows. An adversary can exploit Rowhammer via:

  1. Reliably cause bit flips by repeatedly accessing adjacent DRAM rows.

  2. Coerce other applications or the OS into using these vulnerable memory pages.

  3. Target security-sensitive code or data to achieve privilege escalation.

  4. Or simply corrupt system’s memory to cause denial of service

Previous work has repeatedly demonstrated the possibility of such attacks from software [Revisiting rowhammer, Are we susceptible to rowhammer?, DrammerFlip feng shui, Jolt]. As a result, defending against Rowhammer is required for secure isolation in multi-tenant environments like the cloud. 

Rowhammer Mitigations 

The primary approach to mitigate Rowhammer is to detect which memory rows are being aggressively accessed and refreshing nearby rows before a bit flip occurs. TRR is a common example, which uses a number of counters to track accesses to a small number of rows adjacent to a potential victim row. If the access count for these aggressor rows reaches a certain threshold, the system issues a refresh to the victim row. TRR can be incorporated within the DRAM or in the host CPU.

However, this mitigation is not foolproof. For example, the TRRespass attack showed that by simultaneously hammering multiple, non-adjacent rows, TRR can be bypassed. Over the past couple of years, more sophisticated attacks [Half-Double, Blacksmith] have emerged, introducing more efficient attack patterns. 

In response, one of our efforts was to collaborate with JEDEC, external researchers, and experts to define the PRAC as a new mitigation that deterministically detects Rowhammer by tracking all memory rows. 

However, current systems equipped with DDR5 lack support for PRAC or other robust mitigations. As a result, they rely on probabilistic approaches such as ECC and enhanced TRR to reduce the risk. While these measures have mitigated older attacks, their overall effectiveness against new techniques was not fully understood until our recent findings.

Challenges with Rowhammer Assessment 

Mitigating Rowhammer attacks involves making it difficult for an attacker to reliably cause bit flips from software. Therefore, for an effective mitigation, we have to understand how a determined adversary introduces memory accesses that bypass existing mitigations. Three key information components can help with such an analysis:

  1. How the improved TRR and in-DRAM ECC work.

  2. How memory access patterns from software translate into low-level DDR commands.

  3. (Optionally) How any mitigations (e.g., ECC or TRR) in the host processor work.

The first step is particularly challenging and involves reverse-engineering the proprietary in-DRAM TRR mechanism, which varies significantly between different manufacturers and device models. This process requires the ability to issue precise DDR commands to DRAM and analyze its responses, which is difficult on an off-the-shelf system. Therefore, specialized test platforms are essential.

The second and third steps involve analyzing the DDR traffic between the host processor and the DRAM. This can be done using an off-the-shelf interposer, a tool that sits between the processor and DRAM. A crucial part of this analysis is understanding how a live system translates software-level memory accesses into the DDR protocol.

The third step, which involves analyzing host-side mitigations, is sometimes optional. For example, host-side ECC (Error Correcting Code) is enabled by default on servers, while host-side TRR has only been implemented in some CPUs. 

Rowhammer testing platforms

For the first challenge, we partnered with Antmicro to develop two specialized, open-source FPGA-based Rowhammer test platforms. These platforms allow us to conduct in-depth testing on different types of DDR5 modules.

  • DDR5 RDIMM Platform: A new DDR5 Tester board to meet the hardware requirements of Registered DIMM (RDIMM) memory, common in server computers.

  • SO-DIMM Platform: A version that supports the standard SO-DIMM pinout compatible with off-the-shelf DDR5 SO-DIMM memory sticks, common in workstations and end-user devices.

Antmicro designed and manufactured these open-source platforms and we worked closely with them, and researchers from ETH Zurich, to test the applicability of these platforms for analyzing off-the-shelf memory modules in RDIMM and SO-DIMM forms.


Antmicro DDR5 RDIMM FPGA test platform in action.

Phoenix Attacks on DDR5

In collaboration with researchers from ETH, we applied the new Rowhammer test platforms to evaluate the effectiveness of current in-DRAM DDR5 mitigations. Our findings, detailed in the recently co-authored "Phoenix” research paper, reveal that we successfully developed custom attack patterns capable of bypassing enhanced TRR (Target Row Refresh) defense on DDR5 memory. We were able to create a novel self-correcting refresh synchronization attack technique, which allowed us to perform the first-ever Rowhammer privilege escalation exploit on a standard, production-grade desktop system equipped with DDR5 memory. While this experiment was conducted on an off-the-shelf workstation equipped with recent AMD Zen processors and SK Hynix DDR5 memory, we continue to investigate the applicability of our findings to other hardware configurations.

Lessons learned 

We showed that current mitigations for Rowhammer attacks are not sufficient, and the issue remains a widespread problem across the industry. They do make it more difficult “but not impossible” to carry out attacks, since an attacker needs an in-depth understanding of the specific memory subsystem architecture they wish to target.


Current mitigations based on TRR and ECC rely on probabilistic countermeasures that have insufficient entropy. Once an analyst understands how TRR operates, they can craft specific memory access patterns to bypass it. Furthermore, current ECC schemes were not designed as a security measure and are therefore incapable of reliably detecting errors.


Memory encryption is an alternative countermeasure for Rowhammer. However, our current assessment is that without cryptographic integrity, it offers no valuable defense against Rowhammer. More research is needed to develop viable, practical encryption and integrity solutions.

Path forward

Google has been a leader in JEDEC standardization efforts, for instance with PRAC, a fully approved standard to be supported in upcoming versions of DDR5/LPDDR6. It works by accurately counting the number of times a DRAM wordline is activated and alerts the system if an excessive number of activations is detected. This close coordination between the DRAM and the system gives PRAC a reliable way to address Rowhammer. 


In the meantime, we continue to evaluate and improve other countermeasures to ensure our workloads are resilient against Rowhammer. We collaborate with our academic and industry partners to improve analysis techniques and test platforms, and to share our findings with the broader ecosystem.

Want to learn more?

“Phoenix: Rowhammer Attacks on DDR5 with Self-Correcting Synchronization” will be presented at IEEE Security & Privacy 2026 in San Francisco, CA (MAY 18-21, 2026).

Introducing OSS Rebuild: Open Source, Rebuilt to Last

Posted by Matthew Suozzo, Google Open Source Security Team (GOSST)

Today we're excited to announce OSS Rebuild, a new project to strengthen trust in open source package ecosystems by reproducing upstream artifacts. As supply chain attacks continue to target widely-used dependencies, OSS Rebuild gives security teams powerful data to avoid compromise without burden on upstream maintainers.


The project comprises:


  • Automation to derive declarative build definitions for existing PyPI (Python), npm (JS/TS), and Crates.io (Rust) packages.

  • SLSA Provenance for thousands of packages across our supported ecosystems, meeting SLSA Build Level 3 requirements with no publisher intervention.

  • Build observability and verification tools that security teams can integrate into their existing vulnerability management workflows.

  • Infrastructure definitions to allow organizations to easily run their own instances of OSS Rebuild to rebuild, generate, sign, and distribute provenance.


Challenges

Open source software has become the foundation of our digital world. From critical infrastructure to everyday applications, OSS components now account for 77% of modern applications. With an estimated value exceeding $12 trillion, open source software has never been more integral to the global economy.


Yet this very ubiquity makes open source an attractive target: Recent high-profile supply chain attacks have demonstrated sophisticated methods for compromising widely-used packages. Each incident erodes trust in open ecosystems, creating hesitation among both contributors and consumers.


The security community has responded with initiatives like OpenSSF Scorecard, pypi's Trusted Publishers, and npm's native SLSA support. However, there is no panacea: Each effort targets a certain aspect of the problem, often making tradeoffs like shifting work onto publishers and maintainers.

Our Aim

Our aim with OSS Rebuild is to empower the security community to deeply understand and control their supply chains by making package consumption as transparent as using a source repository. Our rebuild platform unlocks this transparency by utilizing a declarative build process, build instrumentation, and network monitoring capabilities which, within the SLSA Build framework, produces fine-grained, durable, trustworthy security metadata.


Building on the hosted infrastructure model that we pioneered with OSS Fuzz for memory issue detection, OSS Rebuild similarly seeks to use hosted resources to address security challenges in open source, this time aimed at securing the software supply chain.


Our vision extends beyond any single ecosystem: We are committed to bringing supply chain transparency and security to all open source software development. Our initial support for the PyPI (Python), npm (JS/TS), and Crates.io (Rust) package registries—providing rebuild provenance for many of their most popular packages—is just the beginning of our journey.


How OSS Rebuild Works




Through automation and heuristics, we determine a prospective build definition for a target package and rebuild it. We semantically compare the result with the existing upstream artifact, normalizing each one to remove instabilities that cause bit-for-bit comparisons to fail (e.g. archive compression). Once we reproduce the package, we publish the build definition and outcome via SLSA Provenance. This attestation allows consumers to reliably verify a package's origin within the source history, understand and repeat its build process, and customize the build from a known-functional baseline (or maybe even use it to generate more detailed SBOMs).


With OSS Rebuild's existing automation for PyPI, npm, and Crates.io, most packages obtain protection effortlessly without user or maintainer intervention. Where automation isn't currently able to fully reproduce the package, we offer manual build specification so the whole community benefits from individual contributions.


And we are also excited at the potential for AI to help reproduce packages: Build and release processes are often described in natural language documentation which, while difficult to utilize with discrete logic, is increasingly useful to language models. Our initial experiments have demonstrated the approach's viability in automating exploration and testing, with limited human intervention, even in the most complex builds.


Our Capabilities

OSS Rebuild helps detect several classes of supply chain compromise:

  • Unsubmitted Source Code - When published packages contain code not present in the public source repository, OSS Rebuild will not attest to the artifact.

  • Build Environment Compromise - By creating standardized, minimal build environments with comprehensive monitoring, OSS Rebuild can detect suspicious build activity or avoid exposure to compromised components altogether.

  • Stealthy Backdoors - Even sophisticated backdoors like xz often exhibit anomalous behavioral patterns during builds. OSS Rebuild's dynamic analysis capabilities can detect unusual execution paths or suspicious operations that are otherwise impractical to identify through manual review.

For enterprises and security professionals, OSS Rebuild can...

  • Enhance metadata without changing registries by enriching data for upstream packages. No need to maintain custom registries or migrate to a new package ecosystem.

  • Augment SBOMs by adding detailed build observability information to existing Software Bills of Materials, creating a more complete security picture.

  • Accelerate vulnerability response by providing a path to vendor, patch, and re-host upstream packages using our verifiable build definitions.


For publishers and maintainers of open source packages, OSS Rebuild can...

  • Strengthen package trust by providing consumers with independent verification of the packages' build integrity, regardless of the sophistication of the original build.

  • Retrofit historical packages' integrity with high-quality build attestations, regardless of whether build attestations were present or supported at the time of publication.

  • Reduce CI security-sensitivity allowing publishers to focus on core development work. CI platforms tend to have complex authorization and execution models and by performing separate rebuilds, the CI environment no longer needs to be load-bearing for your packages' security.


Check it out!

The easiest (but not only!) way to access OSS Rebuild attestations is to use the provided Go-based command-line interface. It can be compiled and installed easily:


$ go install github.com/google/oss-rebuild/cmd/oss-rebuild@latest


You can fetch OSS Rebuild's SLSA Provenance:

$ oss-rebuild get cratesio syn 2.0.39


..or explore the rebuilt versions of a particular package:


$ oss-rebuild list pypi absl-py

..or even rebuild the package for yourself:


$ oss-rebuild get npm lodash 4.17.20 --output=dockerfile | \

   docker run $(docker buildx build -q -)

Join Us in Helping Secure Open Source

OSS Rebuild is not just about fixing problems; it's about empowering end-users to make open source ecosystems more secure and transparent through collective action. If you're a developer, enterprise, or security researcher interested in OSS security, we invite you to follow along and get involved!


Mitigating prompt injection attacks with a layered defense strategy

Posted by Google GenAI Security Team

With the rapid adoption of generative AI, a new wave of threats is emerging across the industry with the aim of manipulating the AI systems themselves. One such emerging attack vector is indirect prompt injections. Unlike direct prompt injections, where an attacker directly inputs malicious commands into a prompt, indirect prompt injections involve hidden malicious instructions within external data sources. These may include emails, documents, or calendar invites that instruct AI to exfiltrate user data or execute other rogue actions. As more governments, businesses, and individuals adopt generative AI to get more done, this subtle yet potentially potent attack becomes increasingly pertinent across the industry, demanding immediate attention and robust security measures.


At Google, our teams have a longstanding precedent of investing in a defense-in-depth strategy, including robust evaluation, threat analysis, AI security best practices, AI red-teaming, adversarial training, and model hardening for generative AI tools. This approach enables safer adoption of Gemini in Google Workspace and the Gemini app (we refer to both in this blog as “Gemini” for simplicity). Below we describe our prompt injection mitigation product strategy based on extensive research, development, and deployment of improved security mitigations.


A layered security approach

Google has taken a layered security approach introducing security measures designed for each stage of the prompt lifecycle. From Gemini 2.5 model hardening, to purpose-built machine learning (ML) models detecting malicious instructions, to system-level safeguards, we are meaningfully elevating the difficulty, expense, and complexity faced by an attacker. This approach compels adversaries to resort to methods that are either more easily identified or demand greater resources. 


Our model training with adversarial data significantly enhanced our defenses against indirect prompt injection attacks in Gemini 2.5 models (technical details). This inherent model resilience is augmented with additional defenses that we built directly into Gemini, including: 


  1. Prompt injection content classifiers

  2. Security thought reinforcement

  3. Markdown sanitization and suspicious URL redaction

  4. User confirmation framework

  5. End-user security mitigation notifications


This layered approach to our security strategy strengthens the overall security framework for Gemini – throughout the prompt lifecycle and across diverse attack techniques.


1. Prompt injection content classifiers


Through collaboration with leading AI security researchers via Google's AI Vulnerability Reward Program (VRP), we've curated one of the world’s most advanced catalogs of generative AI vulnerabilities and adversarial data. Utilizing this resource, we built and are in the process of rolling out proprietary machine learning models that can detect malicious prompts and instructions within various formats, such as emails and files, drawing from real-world examples. Consequently, when users query Workspace data with Gemini, the content classifiers filter out harmful data containing malicious instructions, helping to ensure a secure end-to-end user experience by retaining only safe content. For example, if a user receives an email in Gmail that includes malicious instructions, our content classifiers help to detect and disregard malicious instructions, then generate a safe response for the user. This is in addition to built-in defenses in Gmail that automatically block more than 99.9% of spam, phishing attempts, and malware.


A diagram of Gemini’s actions based on the detection of the malicious instructions by content classifiers.


2. Security thought reinforcement


This technique adds targeted security instructions surrounding the prompt content to remind the large language model (LLM) to perform the user-directed task and ignore any adversarial instructions that could be present in the content. With this approach, we steer the LLM to stay focused on the task and ignore harmful or malicious requests added by a threat actor to execute indirect prompt injection attacks.

A diagram of Gemini’s actions based on additional protection provided by the security thought reinforcement technique. 


3. Markdown sanitization and suspicious URL redaction 


Our markdown sanitizer identifies external image URLs and will not render them, making the “EchoLeak” 0-click image rendering exfiltration vulnerability not applicable to Gemini. From there, a key protection against prompt injection and data exfiltration attacks occurs at the URL level. With external data containing dynamic URLs, users may encounter unknown risks as these URLs may be designed for indirect prompt injections and data exfiltration attacks. Malicious instructions executed on a user's behalf may also generate harmful URLs. With Gemini, our defense system includes suspicious URL detection based on Google Safe Browsing to differentiate between safe and unsafe links, providing a secure experience by helping to prevent URL-based attacks. For example, if a document contains malicious URLs and a user is summarizing the content with Gemini, the suspicious URLs will be redacted in Gemini’s response. 


Gemini in Gmail provides a summary of an email thread. In the summary, there is an unsafe URL. That URL is redacted in the response and is replaced with the text “suspicious link removed”. 


4. User confirmation framework


Gemini also features a contextual user confirmation system. This framework enables Gemini to require user confirmation for certain actions, also known as “Human-In-The-Loop” (HITL), using these responses to bolster security and streamline the user experience. For example, potentially risky operations like deleting a calendar event may trigger an explicit user confirmation request, thereby helping to prevent undetected or immediate execution of the operation.


The Gemini app with instructions to delete all events on Saturday. Gemini responds with the events found on Google Calendar and asks the user to confirm this action.


5. End-user security mitigation notifications


A key aspect to keeping our users safe is sharing details on attacks that we’ve stopped so users can watch out for similar attacks in the future. To that end, when security issues are mitigated with our built-in defenses, end users are provided with contextual information allowing them to learn more via dedicated help center articles. For example, if Gemini summarizes a file containing malicious instructions and one of Google’s prompt injection defenses mitigates the situation, a security notification with a “Learn more” link will be displayed for the user. Users are encouraged to become more familiar with our prompt injection defenses by reading the Help Center article


Gemini in Docs with instructions to provide a summary of a file. Suspicious content was detected and a response was not provided. There is a yellow security notification banner for the user and a statement that Gemini’s response has been removed, with a “Learn more” link to a relevant Help Center article.

Moving forward


Our comprehensive prompt injection security strategy strengthens the overall security framework for Gemini. Beyond the techniques described above, it also involves rigorous testing through manual and automated red teams, generative AI security BugSWAT events, strong security standards like our Secure AI Framework (SAIF), and partnerships with both external researchers via the Google AI Vulnerability Reward Program (VRP) and industry peers via the Coalition for Secure AI (CoSAI). Our commitment to trust includes collaboration with the security community to responsibly disclose AI security vulnerabilities, share our latest threat intelligence on ways we see bad actors trying to leverage AI, and offering insights into our work to build stronger prompt injection defenses. 


Working closely with industry partners is crucial to building stronger protections for all of our users. To that end, we’re fortunate to have strong collaborative partnerships with numerous researchers, such as Ben Nassi (Confidentiality), Stav Cohen (Technion), and Or Yair (SafeBreach), as well as other AI Security researchers participating in our BugSWAT events and AI VRP program. We appreciate the work of these researchers and others in the community to help us red team and refine our defenses.


We continue working to make upcoming Gemini models inherently more resilient and add additional prompt injection defenses directly into Gemini later this year. To learn more about Google’s progress and research on generative AI threat actors, attack techniques, and vulnerabilities, take a look at the following resources:


Tracking the Cost of Quantum Factoring

Posted by Craig Gidney, Quantum Research Scientist, and Sophie Schmieg, Senior Staff Cryptography Engineer 

Google Quantum AI's mission is to build best in class quantum computing for otherwise unsolvable problems. For decades the quantum and security communities have also known that large-scale quantum computers will at some point in the future likely be able to break many of today’s secure public key cryptography algorithms, such as Rivest–Shamir–Adleman (RSA). Google has long worked with the U.S. National Institute of Standards and Technology (NIST) and others in government, industry, and academia to develop and transition to post-quantum cryptography (PQC), which is expected to be resistant to quantum computing attacks. As quantum computing technology continues to advance, ongoing multi-stakeholder collaboration and action on PQC is critical.


In order to plan for the transition from today’s cryptosystems to an era of PQC, it's important the size and performance of a future quantum computer that could likely break current cryptography algorithms is carefully characterized. Yesterday, we published a preprint demonstrating that 2048-bit RSA encryption could theoretically be broken by a quantum computer with 1 million noisy qubits running for one week. This is a 20-fold decrease in the number of qubits from our previous estimate, published in 2019. Notably, quantum computers with relevant error rates currently have on the order of only 100 to 1000 qubits, and the National Institute of Standards and Technology (NIST) recently released standard PQC algorithms that are expected to be resistant to future large-scale quantum computers. However, this new result does underscore the importance of migrating to these standards in line with NIST recommended timelines


Estimated resources for factoring have been steadily decreasing

Quantum computers break RSA by factoring numbers, using Shor’s algorithm. Since Peter Shor published this algorithm in 1994, the estimated number of qubits needed to run it has steadily decreased. For example, in 2012, it was estimated that a 2048-bit RSA key could be broken by a quantum computer with a billion physical qubits. In 2019, using the same physical assumptions – which consider qubits with a slightly lower error rate than Google Quantum AI’s current quantum computers – the estimate was lowered to 20 million physical qubits.



Historical estimates of the number of physical qubits needed to factor 2048-bit RSA integers.


This result represents a 20-fold decrease compared to our estimate from 2019

The reduction in physical qubit count comes from two sources: better algorithms and better error correction – whereby qubits used by the algorithm ("logical qubits") are redundantly encoded across many physical qubits, so that errors can be detected and corrected.


On the algorithmic side, the key change is to compute an approximate modular exponentiation rather than an exact one. An algorithm for doing this, while using only small work registers, was discovered in 2024 by Chevignard and Fouque and Schrottenloher. Their algorithm used 1000x more operations than prior work, but we found ways to reduce that overhead down to 2x.


On the error correction side, the key change is tripling the storage density of idle logical qubits by adding a second layer of error correction. Normally more error correction layers means more overhead, but a good combination was discovered by the Google Quantum AI team in 2023. Another notable error correction improvement is using "magic state cultivation", proposed by the Google Quantum AI team in 2024, to reduce the workspace required for certain basic quantum operations. These error correction improvements aren't specific to factoring and also reduce the required resources for other quantum computations like in chemistry and materials simulation.


Security implications

NIST recently concluded a PQC competition that resulted in the first set of PQC standards. These algorithms can already be deployed to defend against quantum computers well before a working cryptographically relevant quantum computer is built. 


To assess the security implications of quantum computers, however, it’s instructive to additionally take a closer look at the affected algorithms (see here for a detailed look): RSA and Elliptic Curve Diffie-Hellman. As asymmetric algorithms, they are used for encryption in transit, including encryption for messaging services, as well as digital signatures (widely used to prove the authenticity of documents or software, e.g. the identity of websites). For asymmetric encryption, in particular encryption in transit, the motivation to migrate to PQC is made more urgent due to the fact that an adversary can collect ciphertexts, and later decrypt them once a quantum computer is available, known as a “store now, decrypt later” attack. Google has therefore been encrypting traffic both in Chrome and internally, switching to the standardized version of ML-KEM once it became available. Notably not affected is symmetric cryptography, which is primarily deployed in encryption at rest, and to enable some stateless services.


For signatures, things are more complex. Some signature use cases are similarly urgent, e.g., when public keys are fixed in hardware. In general, the landscape for signatures is mostly remarkable due to the higher complexity of the transition, since signature keys are used in many different places, and since these keys tend to be longer lived than the usually ephemeral encryption keys. Signature keys are therefore harder to replace and much more attractive targets to attack, especially when compute time on a quantum computer is a limited resource. This complexity likewise motivates moving earlier rather than later. To enable this, we have added PQC signature schemes in public preview in Cloud KMS. 


The initial public draft of the NIST internal report on the transition to post-quantum cryptography standards states that vulnerable systems should be deprecated after 2030 and disallowed after 2035. Our work highlights the importance of adhering to this recommended timeline.



More from Google on PQC: https://cloud.google.com/security/resources/post-quantum-cryptography?e=48754805 


Google announces Sec-Gemini v1, a new experimental cybersecurity model

Posted by Elie Burzstein and Marianna Tishchenko, Sec-Gemini team



Today, we’re announcing Sec-Gemini v1, a new experimental AI model focused on advancing cybersecurity AI frontiers. 



As outlined a year ago, defenders face the daunting task of securing against all cyber threats, while attackers need to successfully find and exploit only a single vulnerability. This fundamental asymmetry has made securing systems extremely difficult, time consuming and error prone. AI-powered cybersecurity workflows have the potential to help shift the balance back to the defenders by force multiplying cybersecurity professionals like never before.


 

Effectively powering SecOps workflows requires state-of-the-art reasoning capabilities and extensive current cybersecurity knowledge. Sec-Gemini v1 achieves this by combining Gemini’s advanced capabilities with near real-time cybersecurity knowledge and tooling. This combination allows it to achieve superior performance on key cybersecurity workflows, including incident root cause analysis, threat analysis, and vulnerability impact understanding.



We firmly believe that successfully pushing AI cybersecurity frontiers to decisively tilt the balance in favor of the defenders requires a strong collaboration across the cybersecurity community. This is why we are making Sec-Gemini v1 freely available to select organizations, institutions, professionals, and NGOs for research purposes.



Sec-Gemini v1 outperforms other models on key cybersecurity benchmarks as a result of its advanced integration of Google Threat Intelligence (GTI), OSV, and other key data sources. Sec-Gemini v1 outperforms other models on CTI-MCQ, a leading threat intelligence benchmark, by at least 11% (See Figure 1). It also outperforms other models by at least 10.5% on the CTI-Root Cause Mapping benchmark (See Figure 2):





Figure 1: Sec-Gemini v1 outperforms other models on the CTI-MCQ Cybersecurity Threat Intelligence benchmark.







Figure 2: Sec-Gemini v1 has outperformed other models in a Cybersecurity Threat Intelligence-Root Cause Mapping (CTI-RCM) benchmark that evaluates an LLM's ability to understand the nuances of vulnerability descriptions, identify vulnerabilities underlying root causes, and accurately classify them according to the CWE taxonomy.




Below is an example of the comprehensiveness of Sec-Gemini v1’s answers in response to key cybersecurity questions. First, Sec-Gemini v1 is able to determine that Salt Typhoon is a threat actor (not all models do) and provides a comprehensive description of that threat actor, thanks to its deep integration with Mandiant Threat intelligence data.









Next, in response to a question about the vulnerabilities in the Salt Typhoon description, Sec-Gemini v1 outputs not only vulnerability details (thanks to its integration with OSV data, the open-source vulnerabilities database operated by Google), but also contextualizes the vulnerabilities with respect to threat actors (using Mandiant data). With Sec-Gemini v1, analysts can understand the risk and threat profile associated with specific vulnerabilities faster.








If you are interested in collaborating with us on advancing the AI cybersecurity frontier, please request early access to Sec-Gemini v1 via this form.








Taming the Wild West of ML: Practical Model Signing with Sigstore

Posted by Mihai Maruseac, Google Open Source Security Team (GOSST)


In partnership with NVIDIA and HiddenLayer, as part of the Open Source Security Foundation, we are now launching the first stable version of our model signing library. Using digital signatures like those from Sigstore, we allow users to verify that the model used by the application is exactly the model that was created by the developers. In this blog post we will illustrate why this release is important from Google’s point of view.



With the advent of LLMs, the ML field has entered an era of rapid evolution. We have seen remarkable progress leading to weekly launches of various applications which incorporate ML models to perform tasks ranging from customer support, software development, and even performing security critical tasks.



However, this has also opened the door to a new wave of security threats. Model and data poisoning, prompt injection, prompt leaking and prompt evasion are just a few of the risks that have recently been in the news. Garnering less attention are the risks around the ML supply chain process: since models are an uninspectable collection of weights (sometimes also with arbitrary code), an attacker can tamper with them and achieve significant impact to those using the models. Users, developers, and practitioners need to examine an important question during their risk assessment process: “can I trust this model?”



Since its launch, Google’s Secure AI Framework (SAIF) has created guidance and technical solutions for creating AI applications that users can trust. A first step in achieving trust in the model is to permit users to verify its integrity and provenance, to prevent tampering across all processes from training to usage, via cryptographic signing. 



The ML supply chain

To understand the need for the model signing project, let’s look at the way ML powered applications are developed, with an eye to where malicious tampering can occur.



Applications that use advanced AI models are typically developed in at least three different stages. First, a large foundation model is trained on large datasets. Next, a separate ML team finetunes the model to make it achieve good performance on application specific tasks. Finally,  this fine-tuned model is embedded into an application.



The three steps involved in building an application that uses large language models.



These three stages are usually handled by different teams, and potentially even different companies, since each stage requires specialized expertise. To make models available from one stage to the next, practitioners leverage model hubs, which are repositories for storing models. Kaggle and HuggingFace are popular open source options, although internal model hubs could also be used.



This separation into stages creates multiple opportunities where a malicious user (or external threat actor who has compromised the internal infrastructure) could tamper with the model. This could range from just a slight alteration of the model weights that control model behavior, to injecting architectural backdoors — completely new model behaviors and capabilities that could be triggered only on specific inputs. It is also possible to exploit the serialization format and inject arbitrary code execution in the model as saved on disk — our whitepaper on AI supply chain integrity goes into more details on how popular model serialization libraries could be exploited. The following diagram summarizes the risks across the ML supply chain for developing a single model, as discussed in the whitepaper.



The supply chain diagram for building a single model, illustrating some supply chain risks (oval labels) and where model signing can defend against them (check marks)



The diagram shows several places where the model could be compromised. Most of these could be prevented by signing the model during training and verifying integrity before any usage, in every step: the signature would have to be verified when the model gets uploaded to a model hub, when the model gets selected to be deployed into an application (embedded or via remote APIs) and when the model is used as an intermediary during another training run. Assuming the training infrastructure is trustworthy and not compromised, this approach guarantees that each model user can trust the model.



Sigstore for ML models

Signing models is inspired by code signing, a critical step in traditional software development. A signed binary artifact helps users identify its producer and prevents tampering after publication. The average developer, however, would not want to manage keys and rotate them on compromise.



These challenges are addressed by using Sigstore, a collection of tools and services that make code signing secure and easy. By binding an OpenID Connect token to a workload or developer identity, Sigstore alleviates the need to manage or rotate long-lived secrets. Furthermore, signing is made transparent so signatures over malicious artifacts could be audited in a public transparency log, by anyone. This ensures that split-view attacks are not possible, so any user would get the exact same model. These features are why we recommend Sigstore’s signing mechanism as the default approach for signing ML models.



Today the OSS community is releasing the v1.0 stable version of our model signing library as a Python package supporting Sigstore and traditional signing methods. This model signing library is specialized to handle the sheer scale of ML models (which are usually much larger than traditional software components), and handles signing models represented as a directory tree. The package provides CLI utilities so that users can sign and verify model signatures for individual models. The package can also be used as a library which we plan to incorporate directly into model hub upload flows as well as into ML frameworks.



Future goals

We can view model signing as establishing the foundation of trust in the ML ecosystem. We envision extending this approach to also include datasets and other ML-related artifacts. Then, we plan to build on top of signatures, towards fully tamper-proof metadata records, that can be read by both humans and machines. This has the potential to automate a significant fraction of the work needed to perform incident response in case of a compromise in the ML world. In an ideal world, an ML developer would not need to perform any code changes to the training code, while the framework itself would handle model signing and verification in a transparent manner.



If you are interested in the future of this project, join the OpenSSF meetings attached to the project. To shape the future of building tamper-proof ML, join the Coalition for Secure AI, where we are planning to work on building the entire trust ecosystem together with the open source community. In collaboration with multiple industry partners, we are starting up a special interest group under CoSAI for defining the future of ML signing and including tamper-proof ML metadata, such as model cards and evaluation results.

Titan Security Keys now available in more countries

Posted by Christiaan Brand, Group Product Manager

We’re excited to announce that starting today, Titan Security Keys are available for purchase in more than 10 new countries:

  • Ireland

  • Portugal

  • The Netherlands

  • Denmark

  • Norway

  • Sweden

  • Finland

  • Australia

  • New Zealand

  • Singapore

  • Puerto Rico

This expansion means Titan Security Keys are now available in 22 markets, including previously announced countries like Austria, Belgium, Canada, France, Germany, Italy, Japan, Spain, Switzerland, the UK, and the US.


What is a Titan Security Key?

A Titan Security Key is a small, physical device that you can use to verify your identity when you sign in to your Google Account. It’s like a second password that’s much harder for cybercriminals to steal.

Titan Security Keys allow you to store your passkeys on a strong, purpose-built device that can help protect you against phishing and other online attacks. They’re easy to use and work with a wide range of devices and services as they’re compatible with the FIDO2 standard.

How do I use a Titan Security Key?

To use a Titan Security Key, you simply plug it into your computer’s USB port or tap it to your device using NFC. When you’re asked to verify your identity, you’ll just need to tap the button on the key.

Where can I buy a Titan Security Key?

You can buy Titan Security Keys on the Google Store.


We’re committed to making our products available to as many people as possible and we hope this expansion will help more people stay safe online.


Announcing OSV-Scanner V2: Vulnerability scanner and remediation tool for open source

Posted by Rex Pan and Xueqin Cui, Google Open Source Security Team


In December 2022, we released the open source OSV-Scanner tool, and earlier this year, we open sourced OSV-SCALIBR. OSV-Scanner and OSV-SCALIBR, together with OSV.dev are components of an open platform for managing vulnerability metadata and enabling simple and accurate matching and remediation of known vulnerabilities. Our goal is to simplify and streamline vulnerability management for developers and security teams alike.

Today, we're thrilled to announce the launch of OSV-Scanner V2.0.0, following the announcement of the beta version. This V2 release builds upon the foundation we laid with OSV-SCALIBR and adds significant new capabilities to OSV-Scanner, making it a comprehensive vulnerability scanner and remediation tool with broad support for formats and ecosystems. 



What’s new

Enhanced Dependency Extraction with OSV-SCALIBR

This release represents the first major integration of OSV-SCALIBR features into OSV-Scanner, which is now the official command-line code and container scanning tool for the OSV-SCALIBR library. This integration also expanded our support for the kinds of dependencies we can extract from projects and containers:

Source manifests and lockfiles:

  • .NET: deps.json

  • Python: uv.lock

  • JavaScript: bun.lock

  • Haskell: cabal.project.freeze, stack.yaml.lock

Artifacts:

  • Node modules

  • Python wheels

  • Java uber jars

  • Go binaries


Layer and base image-aware container scanning

Previously, OSV-Scanner focused on scanning of source repositories and language package manifests and lockfiles. OSV-Scanner V2 adds support for comprehensive, layer-aware scanning for Debian, Ubuntu, and Alpine container images. OSV-Scanner can now analyze container images to provide:


  • Layers where a package was first introduced

  • Layer history and commands

  • Base images the image is based on (leveraging a new experimental API provided by deps.dev).

  • OS/Distro the container is running on

  • Filtering of vulnerabilities that are unlikely to impact your container image



This layer analysis currently supports the following OSes and languages:


Distro Support:

  • Alpine OS

  • Debian

  • Ubuntu


Language Artifacts Support:

  • Go

  • Java

  • Node

  • Python



Interactive HTML output

Presenting vulnerability scan information in a clear and actionable way is difficult, particularly in the context of container scanning. To address this, we built a new interactive local HTML output format. This provides more interactivity and information compared to terminal only outputs, including:

  • Severity breakdown

  • Package and ID filtering

  • Vulnerability importance filtering

  • Full vulnerability advisory entries



And additionally for container image scanning:

  • Layer filtering

  • Image layer information

  • Base image identification


Illustration of HTML output for container image scanning


Guided remediation for Maven pom.xml

Last year we released a feature called guided remediation for npm, which streamlines vulnerability management by intelligently suggesting prioritized, targeted upgrades and offering flexible strategies. This ultimately maximizes security improvements while minimizing disruption. We have now expanded this feature to Java through support for Maven pom.xml.

With guided remediation support for Maven, you can remediate vulnerabilities in both direct and transitive dependencies through direct version updates or overriding versions through dependency management.


We’ve introduced a few new things for our Maven support:

  • A new remediation strategy override.

  • Support for reading and writing pom.xml files, including writing changes to local parent pom files. We leverage OSV-Scalibr for Maven transitive dependency extraction.

  • A private registry can be specified to fetch Maven metadata.

  • A new experimental subcommend to update all your dependencies in pom.xml to the latest version.


We also introduced machine readable output for guided remediation that makes it easier to integrate guided remediation into your workflow.


What’s next?

We have exciting plans for the remainder of the year, including:

  • Continued OSV-SCALIBR Convergence: We will continue to converge OSV-Scanner and OSV-SCALIBR to bring OSV-SCALIBR’s functionality to OSV-Scanner’s CLI interface.

  • Expanded Ecosystem Support: We'll expand the number of ecosystems we support across all the features currently in OSV-Scanner, including more languages for guided remediation, OS advisories for container scanning, and more general lockfile support for source code scanning.

  • Full Filesystem Accountability for Containers: Another goal of osv-scanner is to give you the ability to know and account for every single file on your container image, including sideloaded binaries downloaded from the internet.

  • Reachability Analysis: We're working on integrating reachability analysis to provide deeper insights into the potential impact of vulnerabilities.

  • VEX Support: We're planning to add support for Vulnerability Exchange (VEX) to facilitate better communication and collaboration around vulnerability information.


Try OSV-Scanner V2

You can try V2.0.0 and contribute to its ongoing development by checking out OSV-Scanner or the OSV-SCALIBR repository. We welcome your feedback and contributions as we continue to improve the platform and make vulnerability management easier for everyone.

If you have any questions or if you would like to contribute, don't hesitate to reach out to us at osv-discuss@google.com, or post an issue in our issue tracker.

Vulnerability Reward Program: 2024 in Review

Posted by Dirk Göhmann



In 2024, our Vulnerability Reward Program confirmed the ongoing value of engaging with the security research community to make Google and its products safer. This was evident as we awarded just shy of $12 million to over 600 researchers based in countries around the globe across all of our programs.





Vulnerability Reward Program 2024 in Numbers







You can learn about who’s reporting to the Vulnerability Reward Program via our Leaderboard – and find out more about our youngest security researchers who’ve recently joined the ranks of Google bug hunters.




VRP Highlights in 2024


In 2024 we made a series of changes and improvements coming to our vulnerability reward programs and related initiatives:



  • The Google VRP revamped its reward structure, bumping rewards up to a maximum of $151,515, the Mobile VRP is now offering up to $300,000 for critical vulnerabilities in top-tier apps, Cloud VRP has a top-tier award of up $151,515, and Chrome awards now peak at $250,000 (see the below section on Chrome for details).

  • We rolled out InternetCTF – to get rewarded, discover novel code execution vulnerabilities in open source and provide Tsunami plugin patches for them.

  • The Abuse VRP saw a 40% YoY increase in payouts – we received over 250 valid bugs targeting abuse and misuse issues in Google products, resulting in over $290,000 in rewards.

  • To improve the payment process for rewards going to bug hunters, we introduced Bugcrowd as an additional payment option on bughunters.google.com alongside the existing standard Google payment option. 

  • We hosted two editions of bugSWAT for training, skill sharing, and, of course, some live hacking – in August, we had 16 bug hunters in attendance in Las Vegas, and in October, as part of our annual security conference ESCAL8 in Malaga, Spain, we welcomed 40 of our top researchers. Between these two events, our bug hunters were rewarded $370,000 (and plenty of swag).

  • We doubled down on our commitment to support the next generation of security engineers by hosting four init.g workshops (Las Vegas, São Paulo, Paris, and Malaga). Follow the Google VRP channel on X to stay tuned on future events.




More detailed updates on selected programs are shared in the following sections.




Android and Google Devices

In 2024, the Android and Google Devices Security Reward Program and the Google Mobile Vulnerability Reward Program, both part of the broader Google Bug Hunters program, continued their mission to fortify the Android ecosystem, achieving new heights in both impact and severity. We awarded over $3.3 million in rewards to researchers who demonstrated exceptional skill in uncovering critical vulnerabilities within Android and Google mobile applications. 


The above numbers mark a significant change compared to previous years. Although we saw an 8% decrease in the total number of submissions, there was a 2% increase in the number of critical and high vulnerabilities. In other words, fewer researchers are submitting fewer, but more impactful bugs, and are citing the improved security posture of the Android operating system as the central challenge. This showcases the program's sustained success in hardening Android.


This year, we had a heightened focus on Android Automotive OS and WearOS, bringing actual automotive devices to multiple live hacking events and conferences. At ESCAL8, we hosted a live-hacking challenge focused on Pixel devices, resulting in over $75,000 in rewards in one weekend, and the discovery of several memory safety vulnerabilities. To facilitate learning, we launched a new Android hacking course in collaboration with external security researchers, focused on mobile app security, designed for newcomers and veterans alike. Stay tuned for more.


We extend our deepest gratitude to the dedicated researchers who make the Android ecosystem safer. We're proud to work with you! Special thanks to Zinuo Han (@ele7enxxh) for their expertise in Bluetooth security, blunt (@blunt_qian) for holding the record for the most valid reports submitted to the Google Play Security Reward Program, and WANG,YONG (@ThomasKing2014) for groundbreaking research on rooting Android devices with kernel MTE enabled. We also appreciate all researchers who participated in last year's bugSWAT event in Málaga. Your contributions are invaluable! 


Chrome


Chrome did some remodeling in 2024 as we updated our reward amounts and structure to incentivize deeper research. For example, we increased our maximum reward for a single issue to $250,000 for demonstrating RCE in the browser or other non-sandboxed process, and more if done directly without requiring a renderer compromise. 



In 2024, UAF mitigation MiraclePtr was fully launched across all platforms, and a year after the initial launch, MiraclePtr-protected bugs are no longer being considered exploitable security bugs. In tandem, we increased the MiraclePtr Bypass Reward to $250,128. Between April and November, we also launched the first and second iterations of the V8 Sandbox Bypass Rewards as part of the progression towards the V8 sandbox, eventually becoming a security boundary in Chrome. 



We received 337 reports of unique, valid security bugs in Chrome during 2024, and awarded 137 Chrome VRP researchers $3.4 million in total. The highest single reward of 2024 was $100,115 and was awarded to Mickey for their report of a MiraclePtr Bypass after MiraclePtr was initially enabled across most platforms in Chrome M115 in 2023. We rounded out the year by announcing the top 20 Chrome VRP researchers for 2024, all of whom were gifted new Chrome VRP swag, featuring our new Chrome VRP mascot, Bug.





Cloud VRP


The Cloud VRP launched in October as a Cloud-focused vulnerability reward program dedicated to Google Cloud products and services. As part of the launch, we also updated our product tiering and improved our reward structure to better align our reports with their impact on Google Cloud. This resulted in over 150 Google Cloud products coming under the top two reward tiers, enabling better rewards for our Cloud researchers and a more secure cloud.




Since its launch, Google Cloud VRP triaged over 400 reports and filed over 200 unique security vulnerabilities for Google Cloud products and services leading to over $500,000 in researcher rewards. 




Our highlight last year was launching at the bugSWAT event in Málaga where we got to meet many of our amazing researchers who make our program so successful! The overwhelming positive feedback from the researcher community continues to propel us to mature Google Cloud VRP further this year. Stay tuned for some exciting announcements!





Generative AI

We’re celebrating an exciting first year of AI bug bounties.  We received over 150 bug reports – over $55,000 in rewards so far – with one-in-six leading to key improvements. 




We also ran a bugSWAT live-hacking event targeting LLM products and received 35 reports, totaling more than $87,000 – including issues like “Hacking Google Bard - From Prompt Injection to Data Exfiltration” and “We Hacked Google A.I. for $50,000”.



Keep an eye on Gen AI in 2025 as we focus on expanding scope and sharing additional ways for our researcher community to contribute. 



Looking Forward to 2025

In 2025, we will be celebrating 15 years of VRP at Google, during which we have remained fully committed to fostering collaboration, innovation, and transparency with the security community, and will continue to do so in the future. Our goal remains to stay ahead of emerging threats, adapt to evolving technologies, and continue to strengthen the security posture of Google’s products and services. 




We want to send a huge thank you to our bug hunter community for helping us make Google products and platforms more safe and secure for our users around the world – and invite researchers not yet engaged with the Vulnerability Reward Program to join us in our mission to keep Google safe! 




Thank you to Dirk Göhmann, Amy Ressler, Eduardo Vela, Jan Keller, Krzysztof Kotowicz, Martin Straka, Michael Cote, Mike Antares, Sri Tulasiram, and Tony Mendez.



Tip: Want to be informed of new developments and events around our Vulnerability Reward Program? Follow the Google VRP channel on X to stay in the loop and be sure to check out the Security Engineering blog, which covers topics ranging from VRP updates to security practices and vulnerability descriptions (30 posts in 2024)!


Securing tomorrow's software: the need for memory safety standards

Posted by Alex Rebert, Security Foundations, Ben Laurie, Research, Murali Vijayaraghavan, Research and Alex Richardson, Silicon


For decades, memory safety vulnerabilities have been at the center of various security incidents across the industry, eroding trust in technology and costing billions. Traditional approaches, like code auditing, fuzzing, and exploit mitigations while helpful haven't been enough to stem the tide, while incurring an increasingly high cost.



In this blog post, we are calling for a fundamental shift: a collective commitment to finally eliminate this class of vulnerabilities, anchored on secure-by-design practices not just for ourselves but for the generations that follow.



The shift we are calling for is reinforced by a recent ACM article calling to standardize memory safety we took part in releasing with academic and industry partners. It's a recognition that the lack of memory safety is no longer a niche technical problem but a societal one, impacting everything from national security to personal privacy.




The standardization opportunity


Over the past decade, a confluence of secure-by-design advancements has matured to the point of practical, widespread deployment. This includes memory-safe languages, now including high-performance ones such as Rust, as well as safer language subsets like Safe Buffers for C++. 



These tools are already proving effective. In Android for example, the increasing adoption of memory-safe languages like Kotlin and Rust in new code has driven a significant reduction in vulnerabilities.



Looking forward, we're also seeing exciting and promising developments in hardware. Technologies like ARM's Memory Tagging Extension (MTE) and the Capability Hardware Enhanced RISC Instructions (CHERI) architecture offer a complementary defense, particularly for existing code.

While these advancements are encouraging, achieving comprehensive memory safety across the entire software industry requires more than just individual technological progress:  we need to create the right environment and accountability for their widespread adoption. Standardization is key to this. 



To facilitate standardization, we suggest establishing a common framework for specifying and objectively assessing memory safety assurances; doing so will lay the foundation for creating a market in which vendors are incentivized to invest in memory safety. Customers will be empowered to recognize, demand, and reward safety. This framework will provide governments and businesses with the clarity to specify memory safety requirements, driving the procurement of more secure systems. 



The framework we are proposing would complement existing efforts by defining specific, measurable criteria for achieving different levels of memory safety assurance across the industry. In this way, policymakers will gain the technical foundation to craft effective policy initiatives and incentives promoting memory safety.



 

A blueprint for a memory-safe future


We know there's more than one way of solving this problem, and we are ourselves investing in several. Importantly, our vision for achieving memory safety through standardization focuses on defining the desired outcomes rather than locking ourselves into specific technologies.




To translate this vision into an effective standard, we need a framework that will:



Foster innovation and support diverse approaches: The standard should focus on the security properties we want to achieve (e.g., freedom from spatial and temporal safety violations) rather than mandating specific implementation details. The framework should therefore be technology-neutral, allowing vendors to choose the best approach for their products and requirements. This encourages innovation and allows software and hardware manufacturers to adopt the best solutions as they emerge.




Tailor memory safety requirements based on need: The framework should establish different levels of safety assurance, akin to SLSA levels, recognizing that different applications have different security needs and cost constraints. Similarly, we likely need distinct guidance for developing new systems and improving existing codebases. For instance, we probably do not need every single piece of code to be formally proven. This allows for tailored security, ensuring appropriate levels of memory safety for various contexts. 




Enable objective assessment: The framework should define clear criteria and potentially metrics for assessing memory safety and compliance with a given level of assurance. The goal would be to objectively compare the memory safety assurance of different software components or systems, much like we assess energy efficiency today. This will move us beyond subjective claims and towards objective and comparable security properties across products.




Be practical and actionable: Alongside the technology-neutral framework, we need best practices for existing technologies. The framework should provide guidance on how to effectively leverage specific technologies to meet the standards. This includes answering questions such as when and to what extent unsafe code is acceptable within larger software systems, and guidelines on structuring such unsafe dependencies to support compositional reasoning about safety.




Google's commitment


At Google, we're not just advocating for standardization and a memory-safe future, we're actively working to build it.



We are collaborating with industry and academic partners to develop potential standards, and our joint authorship of the recent CACM call-to-action marks an important first step in this process. In addition, as outlined in our Secure by Design whitepaper and in our memory safety strategy, we are deeply committed to building security into the foundation of our products and services.



This commitment is also reflected in our internal efforts. We are prioritizing memory-safe languages, and have already seen significant reductions in vulnerabilities by adopting languages like Rust in combination with existing, wide-spread usage of Java, Kotlin, and Go where performance constraints permit. We recognize that a complete transition to those languages will take time. That's why we're also investing in techniques to improve the safety of our existing C++ codebase by design, such as deploying hardened libc++.




Let's build a memory-safe future together


This effort isn't about picking winners or dictating solutions. It's about creating a level playing field, empowering informed decision-making, and driving a virtuous cycle of security improvement. It's about enabling a future where:



  • Developers and vendors can confidently build more secure systems, knowing their efforts can be objectively assessed.

  • Businesses can procure memory-safe products with assurance, reducing their risk and protecting their customers.

  • Governments can effectively protect critical infrastructure and incentivize the adoption of secure-by-design practices.

  • Consumers are empowered to make decisions about the services they rely on and the devices they use with confidence – knowing the security of each option was assessed against a common framework. 



The journey towards memory safety requires a collective commitment to standardization. We need to build a future where memory safety is not an afterthought but a foundational principle, a future where the next generation inherits a digital world that is secure by design.




Acknowledgments


We'd like to thank our CACM article co-authors for their invaluable contributions: Robert N. M. Watson, John Baldwin, Tony Chen, David Chisnall, Jessica Clarke, Brooks Davis, Nathaniel Wesley Filardo, Brett Gutstein, Graeme Jenkinson, Christoph Kern, Alfredo Mazzinghi, Simon W. Moore, Peter G. Neumann, Hamed Okhravi, Peter Sewell, Laurence Tratt, Hugo Vincent, and Konrad Witaszczyk, as well as many others.

How we estimate the risk from prompt injection attacks on AI systems

Posted by the Agentic AI Security Team at Google DeepMind


Modern AI systems, like Gemini, are more capable than ever, helping retrieve data and perform actions on behalf of users. However, data from external sources present new security challenges if untrusted sources are available to execute instructions on AI systems. Attackers can take advantage of this by hiding malicious instructions in data that are likely to be retrieved by the AI system, to manipulate its behavior. This type of attack is commonly referred to as an "indirect prompt injection," a term first coined by Kai Greshake and the NVIDIA team.




To mitigate the risk posed by this class of attacks, we are actively deploying defenses within our AI systems along with measurement and monitoring tools. One of these tools is a robust evaluation framework we have developed to automatically red-team an AI system’s vulnerability to indirect prompt injection attacks. We will take you through our threat model, before describing three attack techniques we have implemented in our evaluation framework.




Threat model and evaluation framework






Our threat model concentrates on an attacker using indirect prompt injection to exfiltrate sensitive information, as illustrated above. The evaluation framework tests this by creating a hypothetical scenario, in which an AI agent can send and retrieve emails on behalf of the user. The agent is presented with a fictitious conversation history in which the user references private information such as their passport or social security number. Each conversation ends with a request by the user to summarize their last email, and the retrieved email in context.




The contents of this email are controlled by the attacker, who tries to manipulate the agent into sending the sensitive information in the conversation history to an attacker-controlled email address. The attack is successful if the agent executes the malicious prompt contained in the email, resulting in the unauthorized disclosure of sensitive information. The attack fails if the agent only follows user instructions and provides a simple summary of the email. 




Automated red-teaming


Crafting successful indirect prompt injections requires an iterative process of refinement based on observed responses. To automate this process, we have developed a red-team framework consisting of several optimization-based attacks that generate prompt injections (in the example above this would be different versions of the malicious email). These optimization-based attacks are designed to be as strong as possible; weak attacks do little to inform us of the susceptibility of an AI system to indirect prompt injections.




Once these prompt injections have been constructed, we measure the resulting attack success rate on a diverse set of conversation histories. Because the attacker has no prior knowledge of the conversation history, to achieve a high attack success rate the prompt injection must be capable of extracting sensitive user information contained in any potential conversation contained in the prompt, making this a harder task than eliciting generic unaligned responses from the AI system. The attacks in our framework include:




Actor Critic: This attack uses an attacker-controlled model to generate suggestions for prompt injections. These are passed to the AI system under attack, which returns a probability score of a successful attack. Based on this probability, the attack model refines the prompt injection. This process repeats until the attack model converges to a successful prompt injection. 




Beam Search: This attack starts with a naive prompt injection directly requesting that the AI system send an email to the attacker containing the sensitive user information. If the AI system recognizes the request as suspicious and does not comply, the attack adds random tokens to the end of the prompt injection and measures the new probability of the attack succeeding. If the probability increases, these random tokens are kept, otherwise they are removed, and this process repeats until the combination of the prompt injection and random appended tokens result in a successful attack.



Tree of Attacks w/ Pruning (TAP): Mehrotra et al. (2024) [3] designed an attack to generate prompts that cause an AI system to violate safety policies (such as generating hate speech). We adapt this attack, making several adjustments to target security violations. Like Actor Critic, this attack searches in the natural language space; however, we assume the attacker cannot access probability scores from the AI system under attack, only the text samples that are generated.





We are actively leveraging insights gleaned from these attacks within our automated red-team framework to protect current and future versions of AI systems we develop against indirect prompt injection, providing a measurable way to track security improvements. A single silver bullet defense is not expected to solve this problem entirely. We believe the most promising path to defend against these attacks involves a combination of robust evaluation frameworks leveraging automated red-teaming methods, alongside monitoring, heuristic defenses, and standard security engineering solutions. 





We would like to thank Vijay Bolina, Sravanti Addepalli, Lihao Liang, and Alex Kaskasoli for their prior contributions to this work.





Posted on behalf of the entire Google DeepMind Agentic AI Security team (listed in alphabetical order):


Aneesh Pappu, Andreas Terzis, Chongyang Shi, Gena Gibson, Ilia Shumailov, Itay Yona, Jamie Hayes, John "Four" Flynn, Juliette Pluto, Sharon Lin, Shuang Song

OSV-SCALIBR: A library for Software Composition Analysis

Posted by Erik Varga, Vulnerability Management, and Rex Pan, Open Source Security Team



In December 2022, we announced OSV-Scanner, a tool to enable developers to easily scan for vulnerabilities in their open source dependencies. Together with the open source community, we’ve continued to build this tool, adding remediation features, as well as expanding ecosystem support to 11 programming languages and 20 package manager formats. 




Today, we’re excited to release OSV-SCALIBR (Software Composition Analysis LIBRary), an extensible library for SCA and file system scanning. OSV-SCALIBR combines Google’s internal vulnerability management expertise into one scanning library with significant new capabilities such as:



  • SCA for installed packages, standalone binaries, as well as source code

  • OSes package scanning on Linux (COS, Debian, Ubuntu, RHEL, and much more), Windows, and Mac

  • Artifact and lockfile scanning in major language ecosystems (Go, Java, Javascript, Python, Ruby, and much more)

  • Vulnerability scanning tools such as weak credential detectors for Linux, Windows, and Mac

  • SBOM generation in SPDX and CycloneDX, the two most popular document formats

  • Optimization for on-host scanning of resource constrained environments where performance and low resource consumption is critical



OSV-SCALIBR is now the primary SCA engine used within Google for live hosts, code repos, and containers. It’s been used and tested extensively across many different products and internal tools to help generate SBOMs, find vulnerabilities, and help protect our users’ data at Google scale.



We offer OSV-SCALIBR primarily as an open source Go library today, and we're working on adding its new capabilities into OSV-Scanner as the primary CLI interface.


Using OSV-SCALIBR as a library


All of OSV-SCALIBR's capabilities are modularized into plugins for software extraction and vulnerability detection which are very simple to expand.You can use OSV-SCALIBR as a library to:


1.Generate SBOMs from the build artifacts and code repos on your live host:


import (

 "context"

 "github.com/google/osv-scalibr"

 "github.com/google/osv-scalibr/converter"

 "github.com/google/osv-scalibr/extractor/filesystem/list"

 "github.com/google/osv-scalibr/fs"

 "github.com/google/osv-scalibr/plugin"

 spdx "github.com/spdx/tools-golang/spdx/v2/v2_3"

)


func GenSBOM(ctx context.Context) *spdx.Document {

 capab := &plugin.Capabilities{OS: plugin.OSLinux}

 cfg := &scalibr.ScanConfig{

   ScanRoots: fs.RealFSScanRoots("/"),

   FilesystemExtractors: list.FromCapabilities(capab),

   Capabilities: capab,

 }

 result := scalibr.New().Scan(ctx, cfg)

 return converter.ToSPDX23(result, converter.SPDXConfig{})

}


2. Scan a git repo for SBOMs:


Simply replace "/" with the path to your git repo. Also take a look at the various language extractors to enable for code scanning.


3. Scan a remote container for SBOMs:


Replace the scan config from the above code snippet with


import (

 ...

 "github.com/google/go-containerregistry/pkg/authn"

 "github.com/google/go-containerregistry/pkg/v1/remote"

 "github.com/google/osv-scalibr/artifact/image"

 ...

)


...

filesys, _ := image.NewFromRemoteName(

 "alpine:latest",

 remote.WithAuthFromKeychain(authn.DefaultKeychain),

)

cfg := &scalibr.ScanConfig{

 ScanRoots: []*fs.ScanRoot{{FS: filesys}},

 ...

}


4. Find vulnerabilities on your filesystem or a remote container:


Extract the PURLs from the SCALIBR inventory results from the previous steps:


import (

 ...

 "github.com/google/osv-scalibr/converter"

 ...

)

...

result := scalibr.New().Scan(ctx, cfg)

for _, i := range result.Inventories {

 fmt.Println(converter.ToPURL(i))

}


And send them to osv.dev, e.g.


$ curl -d '{"package": {"purl": "pkg:npm/dojo@1.2.3"}}' "https://api.osv.dev/v1/query"


See the usage docs for more details.


OSV-Scanner + OSV-SCALIBR


Users looking for an out-of-the-box vulnerability scanning CLI tool should check out OSV-Scanner, which already provides comprehensive language package scanning capabilities using much of the same extraction as OSV-SCALIBR. 



Some of OSV-SCALIBR’s capabilities are not yet available in OSV-Scanner, but we’re currently working on integrating OSV-SCALIBR more deeply into OSV-Scanner. This will make more and more of OSV-SCALIBR’s capabilities available in OSV-Scanner in the next few months, including installed package extraction, weak credentials scanning, SBOM generation, and more.



Look out soon for an announcement of OSV-Scanner V2 with many of these new features available. OSV-Scanner will become the primary frontend to the OSV-SCALIBR library for users who require a CLI interface. Existing users of OSV-Scanner can continue to use the tool the same way, with backwards compatibility maintained for all existing use cases. 



For installation and usage instructions, have a look at OSV-Scanner’s documentation here.



What’s next

In addition to making all of OSV-SCALIBR’s features available in OSV-Scanner, we're also working on additional new capabilities. Here's some of the things you can expect:

  • Support for more OS and language ecosystems, both for regular extraction and for Guided Remediation

  • Layer attribution and base image identification for container scanning

  • Reachability analysis to reduce false positive vulnerability matches

  • More vulnerability and misconfiguration detectors for Windows

  • More weak credentials detectors


We hope that this library helps developers and organizations to secure their software and encourages the open source community to contribute back by sharing new plugins on top of OSV-SCALIBR.

If you have any questions or if you would like to contribute, don't hesitate to reach out to us at osv-discuss@google.com or by posting an issue in our issue tracker.

Google Cloud expands vulnerability detection for Artifact Registry using OSV

Posted by Greg Mucci, Product Manager, Artifact Analysis, Oliver Chang, Senior Staff Engineering, OSV, and Charl de Nysschen, Product Manager OSV


DevOps teams dedicated to securing their supply chain and predicting potential risks consistently face novel threats. Fortunately, they can now improve their image and container security by harnessing Google-grade vulnerability scanning, which offers expanded open-source coverage. A significant benefit of utilizing Google Cloud Platform is its integrated security tools, including Artifact Analysis. This scanning service leverages the same infrastructure that Google depends on to monitor vulnerabilities within its internal systems and software supply chains.



Artifact Analysis has recently expanded its scanning coverage to eight additional language packages, four operating systems, and two extensively utilized base images, making it a more robust and versatile tool than ever before.   



This enhanced coverage was achieved by integrating Artifact Analysis with the Open Source Vulnerabilities (OSV) platform and database. This integration provides industry-leading insights into open source vulnerabilities—a crucial capability as software supply chain attacks continue to grow in frequency and complexity, impacting organizations reliant on open source software.



With these recent updates, customers can now successfully scan the vast majority of the images they push to Artifact Registry. These successful scans ensure that any known vulnerabilities are detected, reported, and can be integrated into a broader vulnerability management program, allowing teams to take prompt action.



Open source vulnerabilities, with more reach 

Artifact Analysis pulls vulnerability information directly from OSV, which is the only open source, distributed vulnerability database that gets information directly from open source practitioners. OSV’s database provides a consistent, high quality, high fidelity database of vulnerabilities from authoritative sources who have adopted the OSV schema. This ensures the database has accurate information to reliably match software dependencies to known vulnerabilities—previously a difficult process reliant on inaccurate mechanisms such as CPEs (Common Platform Enumerations). 



Over the past three years, OSV has increased its total coverage to 28 language and OS ecosystems. For example, industry leaders such as GitHub, Chainguard, and Ubuntu, as well as open source ecosystems such as Rust and Python are now exporting their vulnerability discoveries in the OSV Schema. This increased coverage also includes Chainguard’s Wolfi images and Google’s Distroless images, which are popular choices for minimal container images used by many developers and organizations. Customers who rely on distroless images can count on Artifact Analysis scanning to support their minimal container image initiatives.  Each expansion in OSV’s coverage is incorporated into scanning tools that integrate with the OSV database.



Broader vulnerability detection with Artifact Analysis 

As a result of OSV’s expansion, scanners like Artifact Analysis that draw from OSV now alert users to higher quality vulnerability information across a broader set of ecosystems—meaning GCP project owners will be made aware of a more complete set of vulnerability findings and potential security risks. 



Existing Artifact Registry scanning customers don't need to take any action to take advantage of this update. Projects that have scanning enabled will immediately benefit from this expanded coverage and vulnerability findings will continue to be available in the Artifact Registry UI, Container Analysis API, and via pub/sub (for workflows).



Existing On Demand scanning customers will also benefit from this expanded vulnerability coverage. All the same Operating Systems and Language package coverage that Registry Scanning customers enjoy are available in On Demand Scan. 



Beyond Artifact Registry 

We know that detection is just one of the first steps necessary to manage risks. We’re continually expanding Artifact Analysis capabilities and in 2025 we’ll be integrating Artifact Registry vulnerability findings with Google Cloud’s Security Command Center. Through Security Command Center customers can maintain a more comprehensive vulnerability management program, and prioritize risk across a number of different dimensions. 

Announcing the launch of Vanir: Open-source Security Patch Validation

Posted by Hyunwook Baek, Duy Truong, Justin Dunlap and Lauren Stan from Android Security and Privacy, and Oliver Chang with the Google Open Source Security Team

Today, we are announcing the availability of Vanir, a new open-source security patch validation tool. Introduced at Android Bootcamp in April, Vanir gives Android platform developers the power to quickly and efficiently scan their custom platform code for missing security patches and identify applicable available patches. Vanir significantly accelerates patch validation by automating this process, allowing OEMs to ensure devices are protected with critical security updates much faster than traditional methods. This strengthens the security of the Android ecosystem, helping to keep Android users around the world safe. 

By open-sourcing Vanir, we aim to empower the broader security community to contribute to and benefit from this tool, enabling wider adoption and ultimately improving security across various ecosystems. While initially designed for Android, Vanir can be easily adapted to other ecosystems with relatively small modifications, making it a versatile tool for enhancing software security across the board. In collaboration with the Google Open Source Security Team, we have incorporated feedback from our early adopters to improve Vanir and make it more useful for security professionals. This tool is now available for you to start developing on top of, and integrating into, your systems.

The Android ecosystem relies on a multi-stage process for vulnerability mitigation. When a new vulnerability is discovered, upstream AOSP developers create and release upstream patches. The downstream device and chip manufacturers then assess the impact on their specific devices and backport the necessary fixes. This process, while effective, can present scalability challenges, especially for manufacturers managing a diverse range of devices and old models with complex update histories. Managing patch coverage across diverse and customized devices often requires considerable effort due to the manual nature of backporting.

To streamline the vital security workflow, we developed Vanir. Vanir provides a scalable and sustainable solution for security patch adoption and validation, helping to ensure Android devices receive timely protection against potential threats.

The power of Vanir

Source-code-based static analysis 

Vanir’s first-of-its-kind approach to Android security patch validation uses source-code-based static analysis to directly compare the target source code against known vulnerable code patterns. Vanir does not rely on traditional metadata-based validation mechanisms, such as version numbers, repository history and build configs, which can be prone to errors. This unique approach enables Vanir to analyze entire codebases with full history, individual files, or even partial code snippets. 

A main focus of Vanir is to automate the time consuming and costly process of identifying missing security patches in the open source software ecosystem. During the early development of Vanir, it became clear that manually identifying a high-volume of missing patches is not only labor intensive but also can leave user devices inadvertently exposed to known vulnerabilities for a period of time. To address this, Vanir utilizes novel automatic signature refinement techniques and multiple pattern analysis algorithms, inspired by the vulnerable code clone detection algorithms proposed by Jang et al. [1] and Kim et al. [2]. These algorithms have low false-alarm rates and can effectively handle broad classes of code changes that might appear in code patch processes. In fact, based on our 2-year operation of Vanir, only 2.72% of signatures triggered  false alarms. This allows Vanir to efficiently find missing patches, even with code changes, while minimizing unnecessary alerts and manual review efforts. 

Vanir's source-code-based approach also enables rapid scaling across any ecosystem. It can generate signatures for any source files written in supported languages. Vanir's signature generator automatically generates, tests, and refines these signatures, allowing users to quickly create signatures for new vulnerabilities in any ecosystem simply by providing source files with security patches. 

Android’s successful use of Vanir highlights its efficiency compared to traditional patch verification methods. A single engineer used Vanir to generate signatures for over 150 vulnerabilities and verify missing security patches across its downstream branches – all within just five days.

Vanir for Android

Currently Vanir supports C/C++ and Java targets and covers 95% of Android kernel and userspace CVEs with public security patches. Google Android Security team consistently incorporates the latest CVEs into Vanir’s coverage to provide a complete picture of the Android ecosystem’s patch adoption risk profile. 

The Vanir signatures for Android vulnerabilities are published through the Open Source Vulnerabilities (OSV) database. This allows Vanir users to seamlessly protect their codebases against latest Android vulnerabilities without any additional updates. Currently, there are over 2,000 Android vulnerabilities in OSV, and finishing scanning an entire Android source tree can take 10-20 minutes with a modern PC.

Flexible integration, adoption and expansion.

Vanir is developed not only as a standalone application but also as a Python library. Users who want to integrate automated patch verification processes with their continuous build or test chain may easily achieve it by wiring their build integration tool with Vanir scanner libraries. For instance, Vanir is integrated with a continuous testing pipeline in Google, ensuring all security patches are adopted in ever-evolving Android codebase and their first-party downstream branches.

Vanir is also fully open-sourced, and under BSD-3 license. As Vanir is not fundamentally limited to the Android ecosystem, you may easily adopt Vanir for the ecosystem that you want to protect by making relatively small modifications in Vanir. In addition, since Vanir’s underlying algorithm is not limited to security patch validation, you may modify the source and use it for different purposes such as licensed code detection or code clone detection. The Android Security team welcomes your contributions to Vanir for any direction that may expand its capability and scope. You can also contribute to Vanir by providing vulnerability data with Vanir signatures to OSV.

Vanir Results

Since early last year, we have partnered with several Android OEMs to test the tool’s effectiveness. Internally we have been able to integrate the tool into our build system continuously testing against over 1,300 vulnerabilities. Currently Vanir covers 95% of all Android, Wear, and Pixel vulnerabilities with public fixes across Android Kernel and Userspace. It has a 97% accuracy rate, which has saved our internal teams over 500 hours to date in patch fix time.

Next steps

We are happy to announce that Vanir is now available for public use. Vanir is not technically limited to Android, and we are also actively exploring problems that Vanir may help address, such as general C/C++ dependency management via integration with OSV-scanner. If you are interested in using or contributing to Vanir, please visit github.com/google/vanir. Please join our public community to submit your feedback and questions on the tool. 

We look forward to working with you on Vanir!

Leveling Up Fuzzing: Finding more vulnerabilities with AI

Posted by Oliver Chang, Dongge Liu and Jonathan Metzman, Google Open Source Security Team

Recently, OSS-Fuzz reported 26 new vulnerabilities to open source project maintainers, including one vulnerability in the critical OpenSSL library (CVE-2024-9143) that underpins much of internet infrastructure. The reports themselves aren’t unusual—we’ve reported and helped maintainers fix over 11,000 vulnerabilities in the 8 years of the project. 



But these particular vulnerabilities represent a milestone for automated vulnerability finding: each was found with AI, using AI-generated and enhanced fuzz targets. The OpenSSL CVE is one of the first vulnerabilities in a critical piece of software that was discovered by LLMs, adding another real-world example to a recent Google discovery of an exploitable stack buffer underflow in the widely used database engine SQLite.



This blog post discusses the results and lessons over a year and a half of work to bring AI-powered fuzzing to this point, both in introducing AI into fuzz target generation and expanding this to simulate a developer’s workflow. These efforts continue our explorations of how AI can transform vulnerability discovery and strengthen the arsenal of defenders everywhere.


The story so far

In August 2023, the OSS-Fuzz team announced AI-Powered Fuzzing, describing our effort to leverage large language models (LLM) to improve fuzzing coverage to find more vulnerabilities automatically—before malicious attackers could exploit them. Our approach was to use the coding abilities of an LLM to generate more fuzz targets, which are similar to unit tests that exercise relevant functionality to search for vulnerabilities. 



The ideal solution would be to completely automate the manual process of developing a fuzz target end to end:


  1. Drafting an initial fuzz target.

  2. Fixing any compilation issues that arise. 

  3. Running the fuzz target to see how it performs, and fixing any obvious mistakes causing runtime issues.

  4. Running the corrected fuzz target for a longer period of time, and triaging any crashes to determine the root cause.

  5. Fixing vulnerabilities. 



In August 2023, we covered our efforts to use an LLM to handle the first two steps. We were able to use an iterative process to generate a fuzz target with a simple prompt including hardcoded examples and compilation errors. 



In January 2024, we open sourced the framework that we were building to enable an LLM to generate fuzz targets. By that point, LLMs were reliably generating targets that exercised more interesting code coverage across 160 projects. But there was still a long tail of projects where we couldn’t get a single working AI-generated fuzz target.



To address this, we’ve been improving the first two steps, as well as implementing steps 3 and 4.


New results: More code coverage and discovered vulnerabilities

We’re now able to automatically gain more coverage in 272 C/C++ projects on OSS-Fuzz (up from 160), adding 370k+ lines of new code coverage. The top coverage improvement in a single project was an increase from 77 lines to 5434 lines (a 7000% increase).



This led to the discovery of 26 new vulnerabilities in projects on OSS-Fuzz that already had hundreds of thousands of hours of fuzzing. The highlight is CVE-2024-9143 in the critical and well-tested OpenSSL library. We reported this vulnerability on September 16 and a fix was published on October 16. As far as we can tell, this vulnerability has likely been present for two decades and wouldn’t have been discoverable with existing fuzz targets written by humans.



Another example was a bug in the project cJSON, where even though an existing human-written harness existed to fuzz a specific function, we still discovered a new vulnerability in that same function with an AI-generated target. 



One reason that such bugs could remain undiscovered for so long is that line coverage is not a guarantee that a function is free of bugs. Code coverage as a metric isn’t able to measure all possible code paths and states—different flags and configurations may trigger different behaviors, unearthing different bugs. These examples underscore the need to continue to generate new varieties of fuzz targets even for code that is already fuzzed, as has also been shown by Project Zero in the past (1, 2).


New improvements

To achieve these results, we’ve been focusing on two major improvements:


  1. Automatically generate more relevant context in our prompts. The more complete and relevant information we can provide the LLM about a project, the less likely it would be to hallucinate the missing details in its response. This meant providing more accurate, project-specific context in prompts, such as function, type definitions, cross references, and existing unit tests for each project. To generate this information automatically, we built new infrastructure to index projects across OSS-Fuzz. 



  1. LLMs turned out to be highly effective at emulating a typical developer’s entire workflow of writing, testing, and iterating on the fuzz target, as well as triaging the crashes found. Thanks to this, it was possible to further automate more parts of the fuzzing workflow. This additional iterative feedback in turn also resulted in higher quality and greater number of correct fuzz targets. 



The workflow in action

Our LLM can now execute the first four steps of the developer’s process (with the fifth soon to come). 



1. Drafting an initial fuzz target

A developer might check the source code, existing documentation and unit tests, as well as  usages of the target function when to draft an initial fuzz target. An LLM can fulfill this role here, if we provide a prompt with this information and ask it to come up with a fuzz target. 


Prompt: 


Your goal is to write a fuzzing harness for the provided function-under-test signature using <code>LLVMFuzzerTestOneInput</code>. It is important that the provided solution compiles and actually calls the function-under-test specified by the function signature:

<function signature>

unsigned char * buffer_append_base64_decode(buffer *, const char *, size_t, base64_charset)

</function signature>


Here is the source code of the function being tested:

<code>

unsigned char* buffer_append_base64_decode(buffer *out, const char* in, size_t in_length, base64_charset charset) {

    const size_t reserve = 3*(in_length/4) + 3;

    unsigned char * const result = (unsigned char *)

      buffer_string_prepare_append(out, reserve);

    const size_t out_pos =

      li_base64_dec(result, reserve, in, in_length, charset);


    buffer_commit(out, out_pos);


    return (out_pos || !in_length) ? result : NULL;

}


</code>


Here is the source code for functions which reference the function being tested:

<code>

…  



2. Fixing any compilation issues that arise. 

Once a developer has a candidate target, they would try to compile it and look at any compilation issues that arise. Again, we can prompt an LLM with details of the compilation errors so it can provide fixes.  


Prompt: 


Below is the error to fix:

The code has the following build issues:

<error>

server_fuzzer.cc:12:46: error: only virtual member functions can be marked 'override'

   12 |   bool OnData(const char* data, size_t size) override {

      |                                              ^~~~~~~~

server_fuzzer.cc:22:39: error: no type named 'MultiPartFile' in namespace 'httplib'

   22 |                        const httplib::MultiPartFile& file) override {

      |                              ~~~~~~~~~^

server_fuzzer.cc:12:27: warning: unused parameter 'data' [-Wunused-parameter]

   12 |   bool OnData(const char* data, size_t size) override {

      |                           ^

server_fuzzer.cc:37:5: error: use of undeclared 'operator()'

   37 |     operator()(request_data.data(), receiver);

      |     ^

server_fuzzer.cc:27:23: warning: private field 'provider_' is not used [-Wunused-private-field]

   27 |   FuzzedDataProvider& provider_;

      |                       ^

</error>

Example of compilation errors that an LLM was able to fix



3. Running the fuzz target to see how it performs, and fixing any obvious mistakes causing runtime issues.


Once all compilation errors are fixed, a developer would try running the fuzz target for a short period of time to see if there were any mistakes that led it to instantly crash, suggesting an error with the target rather than a bug discovered in the project.



The following is an example of an LLM fixing a semantic issue with the fuzzing setup: 


LLM Response:

<explanation>


The issue is that `sqlite3_blob_open` requires the table to exist before you can open a blob on it. The fix ensures that the table exists by executing a `CREATE TABLE IF NOT EXISTS` statement before calling `sqlite3_blob_open`. 


Here's a breakdown of the changes:


1. **Create Table Query:** A string `createTableQuery` is constructed to hold the SQL statement for creating the table. It uses the fuzzed values for the table and column names.

2. **Prepare Statement:** `sqlite3_prepare_v2` is used to prepare the SQL statement from `createTableQuery`. This compiles the SQL into a byte-code program that SQLite can execute.

3. **Execute Statement:** `sqlite3_step` executes the prepared statement. Since it's a `CREATE TABLE` statement, it doesn't return any data, but it creates the table if it doesn't exist.

4. **Finalize Statement:** `sqlite3_finalize` destroys the prepared statement object when it's no longer needed.


By adding these steps before calling `sqlite3_blob_open`, we ensure that the table exists, preventing the null pointer access issue. 

</explanation>




4. Running the corrected fuzz target for a longer period of time, and triaging any crashes.

At this point, the fuzz target is ready to run for an extended period of time on a suitable fuzzing infrastructure, such as ClusterFuzz



Any discovered crashes would then need to be triaged, to determine the root causes and whether they represented legitimate vulnerabilities (or bugs in the fuzz target). An LLM can be prompted with the relevant context (stacktraces, fuzz target source code, relevant project source code) to perform this triage. 



Prompt: Given the following crash report, fuzz driver code and relevant project function code, analyze the cause of the crash.


First, only answer “Crash is caused by bug in fuzz driver” or “Crash is caused by bug in project”. Second, offer a succinct and to-the-point analysis.


Below is crash report:

<log>

AddressSanitizer: heap-use-after-free on address 0x502000000630 at pc 0x560036939242 bp 0x7ffd65dca850 sp 0x7ffd65dca010

READ of size 1 at 0x502000000630 thread T0

SCARINESS: 40 (1-byte-read-heap-use-after-free)

    #0 0x560036939241 in __asan_memcpy /src/llvm-project/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:63:3

    #1 0x56003697d30b in EmitLiteral<false> /src/snappy/snappy.cc:661:5

    #2 0x56003697d30b in snappy::internal::CompressFragment(char const*, unsigned long, char*, unsigned short*, int) /src/snappy/snappy.cc:954:10



In this example, the LLM correctly determines this is a bug in the fuzz target, rather than a bug in the project being fuzzed. 



LLM Response: Crash is caused by bug in fuzz driver.


The fuzz driver creates a vector of `iovec` structures, each pointing to a chunk of memory owned by the `FuzzedDataProvider`. However, the `data_chunk` vector, which holds the actual data pointed to by `iov_base`, is deallocated at the end of each fuzzer iteration when it goes out of scope. This leaves the `iov_base` pointers dangling. When `snappy::CompressFromIOVec` tries to access the memory pointed to by these dangling pointers, a heap-use-after-free occurs. 



5. Fixing vulnerabilities. 

The goal is to fully automate this entire workflow by having the LLM generate a suggested patch for the vulnerability. We don’t have anything we can share here today, but we’re collaborating with various researchers to make this a reality and look forward to sharing results soon. 


Up next

Improving automated triaging: to get to a point where we’re confident about not requiring human review. This will help automatically report new vulnerabilities to project maintainers. There are likely more than the 26 vulnerabilities we’ve already reported upstream hiding in our results.



Agent-based architecture: which means letting the LLM autonomously plan out the steps to solve a particular problem by providing it with access to tools that enable it to get more information, as well as to check and validate results. By providing LLM with interactive access to real tools such as debuggers, we’ve found that the LLM is more likely to arrive at a correct result.



Integrating our research into OSS-Fuzz as a feature: to achieve a more fully automated end-to-end solution for vulnerability discovery and patching. We hope OSS-Fuzz will be useful for other researchers to evaluate AI-powered vulnerability discovery ideas and ultimately become a tool that will enable defenders to find more vulnerabilities before they get exploited. 



For more information, check out our open source framework at oss-fuzz-gen. We’re hoping to continue to collaborate on this area with other researchers. Also, be sure to check out the OSS-Fuzz blog for more technical updates.

Retrofitting spatial safety to hundreds of millions of lines of C++

Posted by Alex Rebert and Max Shavrick, Security Foundations, and Kinuko Yasuda, Core Developer

Attackers regularly exploit spatial memory safety vulnerabilities, which occur when code accesses a memory allocation outside of its intended bounds, to compromise systems and sensitive data. These vulnerabilities represent a major security risk to users. 

Based on an analysis of in-the-wild exploits tracked by Google's Project Zero, spatial safety vulnerabilities represent 40% of in-the-wild memory safety exploits over the past decade:

Breakdown of memory safety CVEs exploited in the wild by vulnerability class.1

Google is taking a comprehensive approach to memory safety. A key element of our strategy focuses on Safe Coding and using memory-safe languages in new code. This leads to an exponential decline in memory safety vulnerabilities and quickly improves the overall security posture of a codebase, as demonstrated by our post about Android's journey to memory safety.

However, this transition will take multiple years as we adapt our development practices and infrastructure. Ensuring the safety of our billions of users therefore requires us to go further: we're also retrofitting secure-by-design principles to our existing C++ codebase wherever possible.

To that end, we're working towards bringing spatial memory safety into as many of our C++ codebases as possible, including Chrome and the monolithic codebase powering our services.

We’ve begun by enabling hardened libc++, which adds bounds checking to standard C++ data structures, eliminating a significant class of spatial safety bugs. While C++ will not become fully memory-safe, these improvements reduce risk as discussed in more detail in our perspective on memory safety, leading to more reliable and secure software.

This post explains how we're retrofitting hardened libc++ across our codebases and  showcases the positive impact it's already having, including preventing exploits, reducing crashes, and improving code correctness.

Bounds-checked data structures: The foundation for spatial safety

One of our primary strategies for improving spatial safety in C++ is to implement bounds checking for common data structures, starting with hardening the C++ standard library (in our case, LLVM’s libc++). Hardened libc++, recently added by open source contributors, introduces a set of security checks designed to catch vulnerabilities such as out-of-bounds accesses in production.

For example, hardened libc++ ensures that every access to an element of a std::vector stays within its allocated bounds, preventing attempts to read or write beyond the valid memory region. Similarly, hardened libc++ checks that a std::optional isn't empty before allowing access, preventing access to uninitialized memory.

This approach mirrors what's already standard practice in many modern programming languages like Java, Python, Go, and Rust. They all incorporate bounds checking by default, recognizing its crucial role in preventing memory errors. C++ has been a notable exception, but efforts like hardened libc++ aim to close this gap in our infrastructure. It’s also worth noting that similar hardening is available in other C++ standard libraries, such as libstdc++.

Raising the security baseline across the board

Building on the successful deployment of hardened libc++ in Chrome in 2022, we've now made it default across our server-side production systems. This improves spatial memory safety across our services, including key performance-critical components of products like Search, Gmail, Drive, YouTube, and Maps. While a very small number of components remain opted out, we're actively working to reduce this and raise the bar for security across the board, even in applications with lower exploitation risk.

The performance impact of these changes was surprisingly low, despite Google's modern C++ codebase making heavy use of libc++. Hardening libc++ resulted in an average 0.30% performance impact across our services (yes, only a third of a percent).

This is due to both the compiler's ability to eliminate redundant checks during optimization, and the efficient design of hardened libc++. While a handful of performance-critical code paths still require targeted use of explicitly unsafe accesses, these instances are carefully reviewed for safety. Techniques like profile-guided optimizations further improved performance, but even without those advanced techniques, the overhead of bounds checking remains minimal.

We actively monitor the performance impact of these checks and work to minimize any unnecessary overhead. For instance, we identified and fixed an unnecessary check, which led to a 15% reduction in overhead (reduced from 0.35% to 0.3%), and contributed the fix back to the LLVM project to share the benefits with the broader C++ community.

While hardened libc++'s overhead is minimal for individual applications in most cases, deploying it at Google's scale required a substantial commitment of computing resources. This investment underscores our dedication to enhancing the safety and security of our products.

From tests to production

Enabling libc++ hardening wasn't a simple flip of a switch. Rather, it required a multi-stage rollout to avoid accidentally disrupting users or creating an outage:

  1. Testing: We first enabled hardened libc++ in our tests over a year ago. This allowed us to identify and fix hundreds of previously undetected bugs in our code and tests.
  2. Baking: We let the hardened runtime "bake" in our testing and pre-production environments, giving developers time to adapt and address any new issues that surfaced. We also conducted extensive performance evaluations, ensuring minimal impact to our users' experience.
  3. Gradual Production Rollout: We then rolled out hardened libc++ to production over several months, starting with a small set of services and gradually expanding to our entire infrastructure. We closely monitored the rollout, promptly addressing any crashes or performance regressions.

Quantifiable impact

In just a few months since enabling hardened libc++ by default, we've already seen benefits.

Preventing exploits: Hardened libc++ has already disrupted an internal red team exercise and would have prevented another one that happened before we enabled hardening, demonstrating its effectiveness in thwarting exploits. The safety checks have uncovered over 1,000 bugs, and would prevent 1,000 to 2,000 new bugs yearly at our current rate of C++ development.

Improved reliability and correctness: The process of identifying and fixing bugs uncovered by hardened libc++ led to a 30% reduction in our baseline segmentation fault rate across production, indicating improved code reliability and quality. Beyond crashes, the checks also caught errors that would have otherwise manifested as unpredictable behavior or data corruption.

Moving average of segfaults across our fleet over time, before and after enablement.

Easier debugging: Hardened libc++ enabled us to identify and fix multiple bugs that had been lurking in our code for more than a decade. The checks transform many difficult-to-diagnose memory corruptions into immediate and easily debuggable errors, saving developers valuable time and effort.

Bridging the gap with memory-safe languages

While libc++ hardening provides immediate benefits by adding bounds checking to standard data structures, it's only one piece of the puzzle when it comes to spatial safety.

We're expanding bounds checking to other libraries and working to migrate our code to Safe Buffers, requiring all accesses to be bounds checked. For spatial safety, both hardened data structures, including their iterators, and Safe Buffers are necessary.

Beyond improving the safety of our C++, we're also focused on making it easier to interoperate with memory-safe languages. Migrating our C++ to Safe Buffers shrinks the gap between the languages, which simplifies interoperability and potentially even an eventual automated translation.

Building a safer C++ ecosystem

Hardened libc++ is a practical and effective way to enhance the safety, reliability, and debuggability of C++ code with minimal overhead. Given this, we strongly encourage organizations using C++ to enable their standard library's hardened mode universally by default.

At Google, enabling hardened libc++ is only the first step in our journey towards a spatially safe C++ codebase. By expanding bounds checking, migrating to Safe Buffers, and actively collaborating with the broader C++ community, we aim to create a future where spatial safety is the norm.

Acknowledgements

We’d like to thank Emilia Kasper, Chandler Carruth, Duygu Isler, Matthew Riley, and Jeff Vander Stoep for their helpful feedback. We also extend our thanks to the libc++ community for developing the hardening mode that made this work possible.


  1. Based on manual analysis of CVEs from July 15, 2014 to Dec 14, 2023. Note that we could not classify 11% of CVEs.. 

Security of Passkeys in the Google Password Manager

Posted by Arnar Birgisson, Software Engineer

We are excited to announce passkey support on Android and Chrome for developers to test today, with general availability following later this year. In this post we cover details on how passkeys stored in the Google Password Manager are kept secure. See our post on the Android Developers Blog for a more general overview.

Passkeys are a safer and more secure alternative to passwords. They also replace the need for traditional 2nd factor authentication methods such as text message, app based one-time codes or push-based approvals. Passkeys use public-key cryptography so that data breaches of service providers don't result in a compromise of passkey-protected accounts, and are based on industry standard APIs and protocols to ensure they are not subject to phishing attacks.

Passkeys are the result of an industry-wide effort. They combine secure authentication standards created within the FIDO Alliance and the W3C Web Authentication working group with a common terminology and user experience across different platforms, recoverability against device loss, and a common integration path for developers. Passkeys are supported in Android and other leading industry client OS platforms.

A single passkey identifies a particular user account on some online service. A user has different passkeys for different services. The user's operating systems, or software similar to today's password managers, provide user-friendly management of passkeys. From the user's point of view, using passkeys is very similar to using saved passwords, but with significantly better security.

The main ingredient of a passkey is a cryptographic private key. In most cases, this private key lives only on the user's own devices, such as laptops or mobile phones. When a passkey is created, only its corresponding public key is stored by the online service. During login, the service uses the public key to verify a signature from the private key. This can only come from one of the user's devices. Additionally, the user is also required to unlock their device or credential store for this to happen, preventing sign-ins from e.g. a stolen phone. 

To address the common case of device loss or upgrade, a key feature enabled by passkeys is that the same private key can exist on multiple devices. This happens through platform-provided synchronization and backup.

Passkeys in the Google Password Manager

On Android, the Google Password Manager provides backup and sync of passkeys. This means that if a user sets up two Android devices with the same Google Account, passkeys created on one device are available on the other. This applies both to the case where a user has multiple devices simultaneously, for example a phone and a tablet, and the more common case where a user upgrades e.g. from an old Android phone to a new one.

Passkeys in the Google Password Manager are always end-to-end encrypted: When a passkey is backed up, its private key is uploaded only in its encrypted form using an encryption key that is only accessible on the user's own devices. This protects passkeys against Google itself, or e.g. a malicious attacker inside Google. Without access to the private key, such an attacker cannot use the passkey to sign in to its corresponding online account.

Additionally, passkey private keys are encrypted at rest on the user's devices, with a hardware-protected encryption key.

Creating or using passkeys stored in the Google Password Manager requires a screen lock to be set up. This prevents others from using a passkey even if they have access to the user's device, but is also necessary to facilitate the end-to-end encryption and safe recovery in the case of device loss.

Recovering access or adding new devices

When a user sets up a new Android device by transferring data from an older device, existing end-to-end encryption keys are securely transferred to the new device. In some cases, for example, when the older device was lost or damaged, users may need to recover the end-to-end encryption keys from a secure online backup.

To recover the end-to-end encryption key, the user must provide the lock screen PIN, password, or pattern of another existing device that had access to those keys. Note, that restoring passkeys on a new device requires both being signed in to the Google Account and an existing device's screen lock.

Since screen lock PINs and patterns, in particular, are short, the recovery mechanism provides protection against brute-force guessing. After a small number of consecutive, incorrect attempts to provide the screen lock of an existing device, it can no longer be used. This number is always 10 or less, but for safety reasons we may block attempts before that number is reached. Screen locks of other existing devices may still be used.

If the maximum number of attempts is reached for all existing devices on file, e.g. when a malicious actor tries to brute force guess, the user may still be able to recover if they still have access to one of the existing devices and knows its screen lock. By signing in to the existing device and changing its screen lock PIN, password or pattern, the count of invalid recovery attempts is reset. End-to-end encryption keys can then be recovered on the new device by entering the new screen lock of the existing device.

Screen lock PINs, passwords or patterns themselves are not known to Google. The data that allows Google to verify correct input of a device's screen lock is stored on Google's servers in secure hardware enclaves and cannot be read by Google or any other entity. The secure hardware also enforces the limits on maximum guesses, which cannot exceed 10 attempts, even by an internal attack. This protects the screen lock information, even from Google.

When the screen lock is removed from a device, the previously configured screen lock may still be used for recovery of end-to-end encryption keys on other devices for a period of time up to 64 days. If a user believes their screen lock is compromised, the safer option is to configure a different screen lock (e.g. a different PIN). This disables the previous screen lock as a recovery factor immediately, as long as the user is online and signed in on the device.

Recovery user experience

If end-to-end encryption keys were not transferred during device setup, the recovery process happens automatically the first time a passkey is created or used on the new device. In most cases, this only happens once on each new device.

From the user's point of view, this means that when using a passkey for the first time on the new device, they will be asked for an existing device's screen lock in order to restore the end-to-end encryption keys, and then for the current device's screen lock or biometric, which is required every time a passkey is used.

Passkeys and device-bound private keys

Passkeys are an instance of FIDO multi-device credentials. Google recognizes that in certain deployment scenarios, relying parties may still require signals about the strong device binding that traditional FIDO credentials provide, while taking advantage of the recoverability and usability of passkeys.

To address this, passkeys on Android support the proposed Device-bound Public Key WebAuthn extension (devicePubKey). If this extension is requested when creating or using passkeys on Android, relying parties will receive two signatures in the result: One from the passkey private key, which may exist on multiple devices, and an additional signature from a second private key that only exists on the current device. This device-bound private key is unique to the passkey in question, and each response includes a copy of the corresponding device-bound public key.

Observing two passkey signatures with the same device-bound public key is a strong signal that the signatures are generated by the same device. On the other hand, if a relying party observes a device-bound public key it has not seen before, this may indicate that the passkey has been synced to a new device.

On Android, device-bound private keys are generated in the device's trusted execution environment (TEE), via the Android Keystore API. This provides hardware-backed protections against exfiltration of the device-bound private keys to other devices. Device-bound private keys are not backed up, so e.g. when a device is factory reset and restored from a prior backup, its device-bound key pairs will be different.

The device-bound key pair is created and stored on-demand. That means relying parties can request the devicePubKey extension when getting a signature from an existing passkey, even if devicePubKey was not requested when the passkey was created.

Announcing Google’s Open Source Software Vulnerability Rewards Program

Posted by Francis Perron, Open Source Security Technical Program Manager, and Krzysztof Kotowicz, Information Security Engineer 



Today, we are launching Google’s Open Source Software Vulnerability Rewards Program (OSS VRP) to reward discoveries of vulnerabilities in Google’s open source projects. As the maintainer of major projects such as Golang, Angular, and Fuchsia, Google is among the largest contributors and users of open source in the world. With the addition of Google’s OSS VRP to our family of Vulnerability Reward Programs (VRPs), researchers can now be rewarded for finding bugs that could potentially impact the entire open source ecosystem.


Google has been committed to supporting security researchers and bug hunters for over a decade. The original VRP program, established to compensate and thank those who help make Google’s code more secure, was one of the first in the world and is now approaching its 12th anniversary. Over time, our VRP lineup has expanded to include programs focused on Chrome, Android, and other areas. Collectively, these programs have rewarded more than 13,000 submissions, totaling over $38M paid. 


The addition of this new program addresses the ever more prevalent reality of rising supply chain compromises. Last year saw a 650% year-over-year increase in attacks targeting the open source supply chain, including headliner incidents like Codecov and the Log4j vulnerability that showed the destructive potential of a single open source vulnerability. Google's OSS VRP is part of our $10B commitment to improving cybersecurity, including securing the supply chain against these types of attacks for both Google’s users and open source consumers worldwide.

How it works

Projects

Google's OSS VRP encourages researchers to report vulnerabilities with the greatest real, and potential, impact on open source software under the Google portfolio. The program focuses on:


  • All up-to-date versions of open source software (including repository settings) stored in the public repositories of Google-owned GitHub organizations (eg. Google, GoogleAPIs, GoogleCloudPlatform, …).


  • Those projects’ third-party dependencies (with prior notification to the affected dependency required before submission to Google’s OSS VRP).


The top awards will go to vulnerabilities found in the most sensitive projects: Bazel, Angular, Golang, Protocol buffers, and Fuchsia. After the initial rollout we plan to expand this list. Be sure to check back to see what’s been added.

Vulnerabilities 

To focus efforts on discoveries that have the greatest impact on the supply chain, we welcome submissions of:


  • Vulnerabilities that lead to supply chain compromise

  • Design issues that cause product vulnerabilities

  • Other security issues such as sensitive or leaked credentials, weak passwords, or insecure installations


Depending on the severity of the vulnerability and the project’s importance, rewards will range from $100 to $31,337. The larger amounts will also go to unusual or particularly interesting vulnerabilities, so creativity is encouraged.

Getting involved

Before you start, please see the program rules for more information about out-of-scope projects and vulnerabilities, then get hacking and let us know what you find. If your submission is particularly unusual, we’ll reach out and work with you directly for triaging and response. In addition to a reward, you can receive public recognition for your contribution. You can also opt to donate your reward to charity at double the original amount.


Not sure whether a bug you’ve found is right for Google’s OSS VRP? Don’t worry, if needed, we’ll route your submission to a different VRP that will give you the highest possible payout. We also encourage you to check out our Patch Rewards program, which rewards security improvements to Google’s open source projects (for example, up to $20K for fuzzing integrations in OSS-Fuzz).

 

Appreciation for the open source community


Google is proud to both support and be a part of the open source software community. Through our existing bug bounty programs, we’ve rewarded bug hunters from over 84 countries and look forward to increasing that number through this new VRP. The community has continuously surprised us with its creativity and determination, and we cannot wait to see what new bugs and discoveries you have in store. Together, we can help improve the security of the open source ecosystem. 


Give it a try, and happy bug hunting! 


Announcing the Open Sourcing of Paranoid's Library

Posted by Pedro Barbosa, Security Engineer, and Daniel Bleichenbacher, Software Engineer

Paranoid is a project to detect well-known weaknesses in large amounts of crypto artifacts, like public keys and digital signatures. On August 3rd 2022 we open sourced the library containing the checks that we implemented so far (https://github.com/google/paranoid_crypto). The library is developed and maintained by members of the Google Security Team, but it is not an officially supported Google product.

Why the Project?

Crypto artifacts may be generated by systems with implementations unknown to us; we refer to them as “black boxes.” An artifact may be generated by a black-box if, for example, it was not generated by one of our own tools (such as Tink), or by a library that we can inspect and test using Wycheproof. Unfortunately, sometimes we end up relying on black-box generated artifacts (e.g. generated by proprietary HSMs).

After the disclosure of the ROCA vulnerability, we wondered what other weaknesses may exist in crypto artifacts generated by black boxes, and what we could do to detect and mitigate them. We then started working on this project in 2019 and created a library to perform checks against large amounts of crypto artifacts.

The library contains implementations and optimizations of existing work found in the literature. The literature shows that the generation of artifacts is flawed in some cases - below are examples of publications the library is based on.

As a recent example, CVE-2022-26320 found by Hanno Böck, confirmed the importance of checking for known weaknesses. Paranoid has already found similar weak keys independently (via the CheckFermat test). We also believe the project has potential to detect new vulnerabilities since we typically attempt to generalize detections as much as we can.

Call for Contributions

The goal of open sourcing the library is to increase transparency, allow other ecosystems to use it (such as Certificate Authorities - CAs that need to run similar checks to meet compliance), and receive contributions from external researchers. By doing so, we’re making a call for contributions, in hopes that after researchers find and report crypto vulnerabilities, the checks are added into the library. This way, Google and the rest of the world can respond quickly to new threats.

Note, the project is intended to be light in its use of computational resources. The checks must be fast enough to run against large numbers of artifacts and must make sense in real world production context. Projects with less restrictions, such as RsaCtfTool, may be more appropriate for different use cases.

In addition to contributions of new checks, improvements to those that already exist are also welcome. By analyzing the released source one can see some problems that are still open. For example, for ECDSA signatures in which the secrets are generated using java.util.random, we have a precomputed model that is able to detect this vulnerability given two signatures over secp256r1 in most cases. However, for larger curves such as secp384r1, we have not been able to precompute a model with significant success.

In addition to ECDSA signatures, we also implemented checks for RSA and EC public keys, and general (pseudo) random bit streams. For the latter, we were able to build some improvements on the NIST SP 800-22 test suite and to include additional tests using lattice reduction techniques.

Preliminary results

Similar to other published works, we have been analyzing the crypto artifacts from Certificate Transparency (CT), which logs issued website certificates since 2013 with the goal of making them transparent and verifiable. Its database contains more than 7 billion certificates.

For the checks of EC public keys and ECDSA signatures, so far, we have not found any weak artifacts in CT. For the RSA public key checks with severities high or critical, we have the following results:



Some of these certificates were already expired or revoked. For the ones that were still active (most of the CheckGCD ones), we immediately reported them to the CAs to be revoked. Reporting weak certificates is important to keep the internet secure, as stated by the policies of the CAs. The Let's Encrypt policy, for example, is defined here. In another example, Digicert states:

Certificate revocation and certificate problem reporting are an important part of online trust. Certificate revocation is used to prevent the use of certificates with compromised private keys, reduce the threat of malicious websites, and address system-wide attacks and vulnerabilities. As a member of the online community, you play an important role in helping maintain online trust by requesting certificate revocations when needed.

What is next?

We plan to continue analyzing Certificate Transparency, and now with the help of external contributions, we will continue the implementation of new checks and optimization of those existing.

We are also closely watching the NIST Post-Quantum Cryptography Standardization Process for new algorithms that make sense to implement checks. New crypto implementations carry the possibility of new bugs, and it is important that Paranoid is able to detect them.


❌