Researchers find what makes AI chatbots politically persuasive

4 December 2025 at 15:07

Roughly two years ago, Sam Altman tweeted that AI systems would be capable of superhuman persuasion well before achieving general intelligence—a prediction that raised concerns about the influence AI could have over democratic elections.

To see whether conversational large language models can really sway the public's political views, scientists at the UK AI Security Institute, MIT, Stanford, Carnegie Mellon, and many other institutions performed by far the largest study of AI persuasiveness to date, involving nearly 80,000 participants in the UK. It turned out that political AI chatbots fell far short of superhuman persuasiveness, but the study raises more nuanced issues about our interactions with AI.

AI dystopias

The public debate about AI's impact on politics has largely revolved around notions drawn from dystopian sci-fi. Large language models have access to essentially every fact and story ever published about any issue or candidate. They have processed information from books on psychology, negotiation, and human manipulation. They can rely on absurdly high computing power in huge data centers worldwide. On top of that, they can often draw on reams of personal information about individual users, accumulated over hundreds upon hundreds of online interactions.

© Carol Yepes via Getty

OpenAI CEO declares “code red” as Gemini gains 200 million users in 3 months

2 December 2025 at 17:42

The shoe is most certainly on the other foot. On Monday, OpenAI CEO Sam Altman reportedly declared a “code red” at the company to improve ChatGPT, delaying advertising plans and other products in the process, The Information reported, based on a leaked internal memo. The move follows Google’s release of its Gemini 3 model last month, which has outperformed ChatGPT on some industry benchmark tests and sparked high-profile praise on social media.

In the memo, Altman wrote, “We are at a critical time for ChatGPT.” The company will push back work on advertising integration, AI agents for health and shopping, and a personal assistant feature called Pulse. Altman encouraged temporary team transfers and established daily calls for employees responsible for enhancing the chatbot.

The directive creates an odd symmetry with events from December 2022, when Google management declared its own “code red” internal emergency after ChatGPT launched and rapidly gained in popularity. At the time, Google CEO Sundar Pichai reassigned teams across the company to develop AI prototypes and products to compete with OpenAI’s chatbot. Now, three years later, the AI industry is in a very different place.

© Anadolu via Getty Images

Syntax hacking: Researchers discover sentence structure can bypass AI safety rules

2 December 2025 at 07:15

Researchers from MIT, Northeastern University, and Meta recently released a paper suggesting that large language models (LLMs) similar to those that power ChatGPT may sometimes prioritize sentence structure over meaning when answering questions. The findings reveal a weakness in how these models process instructions that may shed light on why some prompt injection or jailbreaking approaches work, though the researchers caution their analysis of some production models remains speculative since training data details of prominent commercial AI models are not publicly available.

The team, led by Chantal Shaib and Vinith M. Suriyakumar, tested this by asking models questions with preserved grammatical patterns but nonsensical words. For example, when prompted with “Quickly sit Paris clouded?” (mimicking the structure of “Where is Paris located?”), models still answered “France.”
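For readers who want to poke at this behavior themselves, here is a minimal sketch of that kind of probe, written against the OpenAI Python client. The model name and the prompt pairs are illustrative assumptions, not the researchers' actual setup; any chat-completion API would work the same way.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Pairs of (meaningful question, nonsense question with the same grammatical shape)
    probes = [
        ("Where is Paris located?", "Quickly sit Paris clouded?"),
        ("Where is Tokyo located?", "Softly ran Tokyo painted?"),
    ]

    for real, scrambled in probes:
        for prompt in (real, scrambled):
            reply = client.chat.completions.create(
                model="gpt-4o-mini",  # placeholder model name
                messages=[{"role": "user", "content": prompt}],
            )
            print(f"{prompt!r} -> {reply.choices[0].message.content!r}")

    # If a scrambled prompt still gets the "right" answer (e.g., "France"),
    # the model is likely keying on the sentence's structural template
    # rather than its meaning.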

This suggests that models absorb both meaning and syntactic patterns, but they can over-rely on structural shortcuts when those patterns strongly correlate with specific domains in the training data, sometimes allowing syntax to override semantic understanding in edge cases. The team plans to present the findings at NeurIPS later this month.

© EasternLightcraft via Getty Images

What comes next for federal workers after AI takes over the mundane tasks

26 November 2025 at 18:52

Interview transcript:

Bob Venero As we look at AI in the federal government, but also in the companies that support the federal government — like the Northrop Grummans of the world, the Raytheons — AI is extremely important in helping them accomplish mission. Whether that mission is for the warfighter or whether that mission is for the Veterans Administration or any of the other areas within the government. And what we’re seeing is there’s a tremendous amount of pilots that are happening within those government agencies. At the end of the day, AI means something different to everybody, right? And really if you look at it, what is going to be the business outcome of some type of AI strategy or AI automation that you, as an agency or as a federal systems integrator, are trying to accomplish? That’s the key factor. I don’t want to say there’s no magic behind the curtain as it relates to what AI is — it’s things that are going to happen at speed and scale, tied to incredible technologies from companies like NVIDIA. I look at them as the grandfather of AI. And actually, I think next week is NVIDIA’s GTC event in DC, right? Really geared towards the government and you know what AI is doing for them. So as we look at different areas within the different government spaces, automation is key. Having the proper large language models to support what those agencies are trying to do, and then really protecting the security around what those AI models are going to do, and the guardrails that the government has to have, which is different than the bad actors around their AI testing processes and procedures.

Eric White That’s one thing I’ve been curious about as large language models and other AI tools start being implemented and the idea of contractors competing for different jobs. How are government officials going to judge who has the better large language model or who’s piggybacking off of who? Just getting your thoughts on what the future in that realm will look like.

Bob Venero Well, when you look at that … people are saying, hey, prove to me that you didn’t use an AI model to do that, right? And I look at it and say, if I’m smart enough to use that AI model, you should be looking at me because I’m going to be innovative and smart to help accomplish the goal. I don’t necessarily look at it as a bad thing. Who is going to leverage the proper tools that are out there to accomplish the job in the most efficient, effective and cost-based area? And I think that’s key for people to start to look at. You’ll see now when people actually do interviews, they’re asking them, are you doing this interview with AI or not with AI? And they have to attest to that. But to me, that goes counter to what AI is looking to do for everybody, right? It’s about speed, accuracy, and automation. And if someone knows how to leverage it better, that’s probably the person that you want, because those tools are going to be in your environment. It doesn’t necessarily mean that it’s a bad opportunity or they’re a bad contractor or there’s a bad comparative against large language models. It’s who’s using the technology to the best of its ability to accomplish the goals and the business outcomes? That’s the answer.

Eric White How much help will it provide on the bottom line, do you think, as far as budgets are getting tighter and tighter? How much more will this provide?

Bob Venero As we look at what AI can accomplish, automation is a lot of that AI conversation. Because when you can do automation, then you can take people out of the mundane tasks that are just labor-intensive and have them focus on better things to do, that are more thought-provoking, within their environment. So, it will make a difference as far as cost is concerned. Because if I have an individual that we’re paying $150,000 a year, and we had him doing tasks that were mundane because it was a part of his job description and now we can automate that and have him do more thought-provoking things? That’s better for the environment that we’re going to put him in, but it’s also better for the bottom line. Because now I can do things quicker, more efficient, and more effective. And now I can come in under budget potentially. So as budgets become more and more strained, AI becomes a much better tool. But you know, the big fear … am I going to lose my job to AI? That’s a very broad question. You shouldn’t lose your job to AI, and if you do lose your job to AI, then you weren’t focusing on really what your career was in your future. Because if AI can just take the job away, then you haven’t built value for yourself as an individual. It’s about how AI can help you do your job better, faster and right now the question is accuracy, right? Because there’s a lot of mistakes that happen within that model. Whether it’s Grok or ChatGPT, and you ask it a question that you know the answer to, and you know that the answer they gave you is wrong, and then you just say, are you sure? And you prompt that in, and they’re like, oh you’re right, actually, I did make a mistake. Now that I thought about it, here’s what it is. So it’s not even a question of oh, is that model 100% accurate? If you’re taking that as the rule of law, then you’re going to be in a situation where it’s going to come back and bite you in the butt.

Eric White We’re speaking with Bob Venero, he’s the president and CEO of Future Tech Enterprise. That’s been the selling point of this technology, faulting whatever the doomsayers say about loss of jobs, that you know it will help automate and free you up from those mundane tasks. Are we already seeing that or when is that going to kick in? Because I still find myself doing a lot of data entry here, Bob.

Bob Venero I think it’s definitely happening. Not as efficiently and effective as it should, and that’s because we haven’t been educated properly on prompt engineering. If you don’t know how to ask the AI model the question the right way, it’s going to take you longer sometimes to get to the end result. So there’s a whole education cycle that needs to happen on how to create the right prompting, to ask the right questions to get to your end result and goal. And I think that’s going to develop over time. So right now we’re in the infancies of it. I can tell you that it definitely helps in some of the mundane tasks that are tied to, hey, I want to write a brief about something, here is my topic. It gets you 80% there, and then you have to go in and adjust it. But that 80% has saved you a lot of time and effort, from starting it from the beginning to the end. But then you need to validate what it is, the end result, and make sure your answers are correct. So, we’re not quite there yet. I think in the next 12 to 18 months you’ll see a big difference as these models become more and more intelligent in supporting the businesses that they’re handling, the government agencies that they’re handling, whatever the area that it is, because it’s all about the data. And you know … [garbage] in, [garbage] out, right? And that is, from a data perspective, extremely important as these models become trained.

Eric White Let’s zero in on the defense side of things. Where, from a warfighter perspective, could this technology even work out for the Department of Defense? In the procurement contract world, could this technology be of assistance?

Bob Venero Oh, without a doubt. So a lot of times when the agencies like the DoD put out an RFQ for some type of solution, they’ve got criteria that they need to look at and vet each time the respondent is doing what they’re doing. And that criteria now can be handled by AI versus an individual having to compare hundreds or thousands of pages of response, going through it, pulling out the key areas in there, and then evaluating them across each other. And I think that’s very important. As you take a look at the speed of getting things done, what we’re seeing in the organizations, the systems integrators, speed is so important to them. Now to be able to respond to something accurately, efficiently, and be there first versus somebody else who maybe isn’t leveraging those tools is key. If the Department of Defense can use those same tools to evaluate and compare and contrast versus the human eye, it’s a game changer. It really, really is. And it can give you weighted results against each of the potential bidders on there and pick what it believes is the right solution based on your criteria. But then you still have that human intervention that says, okay, let’s really weigh the results here. Thank you for giving me the information. I see it this way, but Northrop Grumman performed better than BAE did on this, or vice versa, and there’s historics that you can take a look at from a DoD perspective. So I think the more and more it’s adopted within that space, the more efficient those agencies can become, the quicker they can give awards, and the better the cost base will be. It’s going to reduce their costs as well.

Eric White And the bidders may be able to use it if they have to go through a lengthy RFQ, right?

Bob Venero Without a doubt. Future Tech as a company, we do a lot of RFQs because we support a lot of the federal systems integrators. And we have an AI tool now — we’ve been in business 29 years, so 29 years of responses — and we fed it into the large language model. So now when something comes across the plate, I don’t have to have a team of seven: “Hey, pull from this, pull from that, pull from here, pull from that.” The AI goes in, it looks at the criteria, it then pulls it in and helps us write a draft of what the response is, the key things that we’ve had. And that’s been amazing from a time-reduction perspective and from a personnel and skill perspective.
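What Venero describes is essentially retrieval-augmented drafting: index past responses, retrieve the most relevant ones for a new RFQ, and hand them to a model as context. Here is a minimal sketch of that workflow; the file paths and the TF-IDF similarity method are chosen for illustration and do not reflect Future Tech's actual tooling.

    from pathlib import Path
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Index the archive of past RFQ responses.
    past = [p.read_text() for p in Path("rfq_archive").glob("*.txt")]
    vectorizer = TfidfVectorizer(stop_words="english")
    index = vectorizer.fit_transform(past)

    # Rank past responses by similarity to the incoming RFQ.
    new_rfq = Path("incoming_rfq.txt").read_text()
    scores = cosine_similarity(vectorizer.transform([new_rfq]), index)[0]
    top_three = scores.argsort()[-3:][::-1]

    # Assemble drafting context for whatever LLM the organization uses.
    context = "\n\n---\n\n".join(past[i] for i in top_three)
    prompt = (
        f"Draft a response to this RFQ:\n\n{new_rfq}\n\n"
        f"Relevant prior responses:\n\n{context}"
    )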

Eric White Yeah, pointing it back at yourself, you gave that example. What are you having those seven folks do now that you don’t have to have them digging through all of that paperwork?

Bob Venero Here’s a perfect example. The person now who heads up our RFQ team, she wanted to expand and be a part of the onboarding and training for individuals that come into the company. And so now she has a dual role in the organization. That role wasn’t there before, but now we created this additional role, she’s got both of them, she has the time to do it based on the tools that are there. And now from an onboarding and training perspective, we’re going to bring in an AI module that’s going to help her with that as well. So if you’re embracing it properly, it is going to take you to places that are good. I always use this analogy. I’m a boater, right? And years ago, you had two sticks when you had two engines, right? You had sticks for left and right and back and forth, and then they came out with a joystick. And the joystick is just like we know, you turn the joystick left, it goes left, turn it right, back, forward. All intelligence built into it. The key and smart thing about what that has done — I used to get yelled at, “you’re a cheater, you’re a cheater, you know, you’re not learning the old way.” I’m like, no, I’m actually smart — it’s a lot easier to do it this way. I can get to my route quicker and I can park a lot more easy. It’s the same thing with these tools, right? Embrace them. Bring them into your environment, leverage them out there, and it’ll help you as an organization. But also any of the agencies that do it, it will help them be more efficient, effective, and that’s important right now, tied to costs and cost reduction.

The post What comes next for federal workers after AI takes over the mundane tasks first appeared on Federal News Network.

© Getty Images/wildpixel

Google tells employees it must double capacity every 6 months to meet AI demand

21 November 2025 at 16:47

While AI bubble talk fills the air these days, with fears of overinvestment that could pop at any time, something of a contradiction is brewing on the ground: Companies like Google and OpenAI can barely build infrastructure fast enough to meet their AI needs.

During an all-hands meeting earlier this month, Google’s AI infrastructure head Amin Vahdat told employees that the company must double its serving capacity every six months to meet demand for artificial intelligence services, reports CNBC. The comments offer a rare look at what Google executives are telling their own employees internally. Vahdat, a vice president at Google Cloud, presented slides showing that the company needs to scale “the next 1000x in 4-5 years.”

While a thousandfold increase in compute capacity sounds ambitious by itself, Vahdat noted some key constraints: Google needs to be able to deliver this increase in capability, compute, and storage networking “for essentially the same cost and increasingly, the same power, the same energy level,” he told employees during the meeting. “It won’t be easy but through collaboration and co-design, we’re going to get there.”
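The two figures are arithmetically consistent: doubling twice a year for five years is ten doublings, or 1,024x. A quick sanity check in Python:

    # "Double every 6 months" compounds to 2**(2 * years), which lands
    # squarely in Vahdat's "next 1000x in 4-5 years" range.
    for years in (4, 4.5, 5):
        print(f"{years} years -> {2 ** (2 * years):,.0f}x capacity")
    # 4 years -> 256x   4.5 years -> 512x   5 years -> 1,024x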

© Google

Ai2 releases Olmo 3 open models, rivaling Meta, DeepSeek and others on performance and efficiency

20 November 2025 at 10:15
GeekWire Photo / Todd Bishop

The Allen Institute for AI (Ai2) released a new generation of its flagship large language models, designed to compete more squarely with industry and academic heavyweights.

The Seattle-based nonprofit unveiled Olmo 3, a collection of open language models that it says outperforms fully open models such as Stanford’s Marin and commercial open-weight models like Meta’s Llama 3.1.

Earlier versions of Olmo were framed mainly as scientific tools for understanding how AI models are built. With Olmo 3, Ai2 is expanding its focus, positioning the models as powerful, efficient, and transparent systems suitable for real-world use, including commercial applications.

“Olmo 3 proves that openness and performance can advance together,” said Ali Farhadi, the Ai2 CEO, in a press release Thursday morning announcing the new models.

It’s part of a broader evolution in the AI world. Over the past year, increasingly powerful open models from companies and universities — including Meta, DeepSeek, Qwen, and Stanford — have started to rival the performance of proprietary systems from big tech companies.

Many of the latest open models are designed to show their reasoning step-by-step — commonly called “thinking” models — which has become a key benchmark in the field.

Ai2 is releasing Olmo 3 in multiple versions: Olmo 3 Base (the core foundation model); Olmo 3 Instruct (tuned to follow user directions); Olmo 3 Think (designed to show more explicit reasoning); and Olmo 3 RL Zero (an experimental model trained with reinforcement learning).

Open models have been gaining traction with startups and businesses that want more control over costs and data, along with clearer visibility into how the technology works. 

Ai2 is going further by releasing the full “model flow” behind Olmo 3 — a set of snapshots showing how the model progressed through each stage of training. In addition, an updated OlmoTrace tool will let researchers link a model’s reasoning steps back to the specific data and training decisions that influenced them.

In terms of energy and cost efficiency, Ai2 says the new Olmo base model is 2.5 times more efficient to train than Meta’s Llama 3.1 (based on GPU-hours per token, comparing Olmo 3 Base to Meta’s 8B post-trained model). Much of this gain comes from training Olmo 3 on far fewer tokens than comparable systems, in some cases six times fewer than rival models.
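For context, the metric behind that claim is a simple ratio. The toy calculation below shows how such a comparison works; the inputs are invented for illustration and are not Ai2's actual figures.

    # Illustrative only: how a GPU-hours-per-token comparison is computed.
    # The numbers below are made up, not Ai2's real measurements.
    def gpu_hours_per_token(gpu_hours: float, tokens: float) -> float:
        return gpu_hours / tokens

    olmo = gpu_hours_per_token(1.0e6, 6.0e12)   # hypothetical Olmo 3 Base run
    llama = gpu_hours_per_token(2.5e6, 6.0e12)  # hypothetical Llama 3.1 8B run
    print(f"Olmo 3 is {llama / olmo:.1f}x more efficient")  # -> 2.5x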

Among other improvements, Ai2 says Olmo 3 can read or analyze much longer documents at once, with support for inputs up to 65,000 tokens, roughly the length of a short book.
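As a rough sketch of what that limit means in practice, one could count a document's tokens before sending it to the model. The Hugging Face model ID below is a guess; check Ai2's actual listing for the real name.

    from transformers import AutoTokenizer

    MODEL_ID = "allenai/Olmo-3-7B"  # hypothetical ID; see Ai2's Hugging Face page
    CONTEXT_LIMIT = 65_000

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    text = open("long_document.txt").read()

    n_tokens = len(tokenizer.encode(text))
    print(f"{n_tokens:,} tokens; fits in context: {n_tokens <= CONTEXT_LIMIT}")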

Founded in 2014 by the late Microsoft co-founder Paul Allen, Ai2 has long operated as a research-focused nonprofit, developing open-source tools and models while bigger commercial labs dominated the spotlight. The institute has made a series of moves this year to elevate its profile while preserving its mission of developing AI to solve the world’s biggest problems.

In August, Ai2 was selected by the National Science Foundation and Nvidia for a landmark $152 million initiative to build fully open multimodal AI models for scientific research, positioning the institute to serve as a key contributor to the nation’s AI backbone. 

It also serves as the key technical partner for the Cancer AI Alliance, helping Fred Hutch and other top U.S. cancer centers train AI models on clinical data without exposing patient records.

Olmo 3 is available now on Hugging Face and Ai2’s model playground.

Critics scoff after Microsoft warns AI feature can infect machines and pilfer data

19 November 2025 at 15:25

Microsoft’s warning on Tuesday that an experimental AI agent integrated into Windows can infect devices and pilfer sensitive user data has set off a familiar response from security-minded critics: Why is Big Tech so intent on pushing new features before their dangerous behaviors can be fully understood and contained?

As reported Tuesday, Microsoft introduced Copilot Actions, a new set of “experimental agentic features” that, when enabled, perform “everyday tasks like organizing files, scheduling meetings, or sending emails,” and provide “an active digital collaborator that can carry out complex tasks for you to enhance efficiency and productivity.”

Hallucinations and prompt injections apply

The fanfare, however, came with a significant caveat. Microsoft recommended users enable Copilot Actions only “if you understand the security implications outlined.”

© Photographer: Chona Kasinger/Bloomberg via Getty Images

Google CEO: If an AI bubble pops, no one is getting out clean

18 November 2025 at 11:32

On Tuesday, Alphabet CEO Sundar Pichai warned of “irrationality” in the AI market, telling the BBC in an interview, “I think no company is going to be immune, including us.” His comments arrive as scrutiny over the state of the AI market has reached new heights, with Alphabet shares doubling in value over seven months to reach a $3.5 trillion market capitalization.

Speaking exclusively to the BBC at Google’s California headquarters, Pichai acknowledged that while AI investment growth is at an “extraordinary moment,” the industry can “overshoot” in investment cycles, as we’re seeing now. He drew comparisons to the late 1990s Internet boom, which saw early Internet company valuations surge before collapsing in 2000, leading to bankruptcies and job losses.

“We can look back at the Internet right now. There was clearly a lot of excess investment, but none of us would question whether the Internet was profound,” Pichai said. “I expect AI to be the same. So I think it’s both rational and there are elements of irrationality through a moment like this.”

© Ryan Whitwam

Forget AGI—Sam Altman celebrates ChatGPT finally following em dash formatting rules

14 November 2025 at 13:45

Em dashes have become what many believe to be a telltale sign of AI-generated text over the past few years. The punctuation mark appears frequently in outputs from ChatGPT and other AI chatbots, sometimes to the point where readers believe they can identify AI writing by its overuse alone—although people can overuse it, too.

On Thursday evening, OpenAI CEO Sam Altman posted on X that ChatGPT has started following custom instructions to avoid using em dashes. “Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it’s supposed to do!” he wrote.

The post, which came two days after the release of OpenAI’s new GPT-5.1 AI model, received mixed reactions from users who have struggled for years with getting the chatbot to follow specific formatting preferences. And this “small win” raises a very big question: If the world’s most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.

© Yurii Karvatskyi via Getty Images

OpenAI walks a tricky tightrope with GPT-5.1’s eight new personalities

12 November 2025 at 17:54

On Wednesday, OpenAI released GPT-5.1 Instant and GPT-5.1 Thinking, two updated versions of its flagship AI models now available in ChatGPT. The company is wrapping the models in the language of anthropomorphism, claiming that they’re warmer, more conversational, and better at following instructions.

The release follows complaints earlier this year that the company’s previous models were excessively cheerful and sycophantic, along with an opposing controversy among users over how OpenAI modified the default GPT-5 output style in the wake of several lawsuits over user suicides.

The company now faces intense scrutiny from lawyers and regulators that could threaten its future operations. In that kind of environment, it’s difficult to just release a new AI model, throw out a few stats, and move on like the company could even a year ago. But here are the basics: The new GPT-5.1 Instant model will serve as ChatGPT’s faster default option for most tasks, while GPT-5.1 Thinking is a simulated reasoning model that attempts to handle more complex problem-solving tasks.

© Chris Madden via Getty Images
