Newsroom Robots
How Open-Source AI Puts Newsrooms Back in the Driver’s Seat: In Conversation with Florent Daudens

What if the future of journalism isn’t locked behind the paywalls of big tech companies, but freely available to every newsroom willing to embrace it?

Too often, the conversation around AI in newsrooms centers on big tech products like OpenAI’s ChatGPT or Google’s Gemini. These are powerful tools, no doubt, but they come with caveats: mainly cost, limited transparency, and little to no control over where your data ends up.

But there’s another world of AI rapidly evolving in parallel, and it might be journalism’s best path forward: open-source AI.

In this episode of Newsroom Robots, I welcomed back Florent Daudens for his second appearance on the podcast. He first joined us last year while leading AI innovation at Radio-Canada. Since then, he’s taken on a new role as Press Lead at Hugging Face, the open-source AI platform. Described as “the GitHub of AI,” Hugging Face hosts over 1.5 million models used by 7 million people daily.

Here are three key insights I took away from our conversation:

1️⃣ Open-Source AI Lets Newsrooms Reclaim Control from Big Tech

Here’s the fundamental difference: When you use a tool like ChatGPT, your data is sent to OpenAI’s servers, processed by their systems, and governed by their privacy policies, with all the potential risks of third-party data exposure. With open-source AI, you download the model (think of it like downloading any other file) and run it entirely on your own computer or servers. That means greater control, better privacy, and fewer surprises.

"Each time you're running an open source model on your device, it is free to download, it's free to run, you don't pay any cost," Florent emphasized. For newsrooms handling sensitive investigations, confidential sources, or unpublished content, this local processing eliminates major security risks entirely.

But the advantages go far beyond privacy. Open-source models offer full transparency: you can see how they were trained, what data was used, and what biases might be built in. Proprietary models like ChatGPT are black boxes — you have no idea what’s happening under the hood. With open-source models, you can inspect every component and modify them to fit your needs.

Perhaps most importantly, you’re never locked into a vendor’s decisions. When OpenAI changes ChatGPT’s capabilities or pricing, you have no choice but to adapt. With open-source alternatives, you control the entire pipeline. “You can switch models in one line of code,” Florent said — meaning newsrooms can experiment freely, compare results transparently, and never worry about outside companies changing the rules.
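
To make the "one line of code" point concrete, here is a minimal sketch, assuming Python with Hugging Face's transformers library and PyTorch installed. The model ID is only an illustrative pick (it was not named in the episode), and the helper names `build_generator` and `summarize` are hypothetical. The weights are downloaded once; after that, the text you feed in is processed on your own machine.

```python
# Assumption: this model ID is an illustrative choice, not one from the episode.
# Swapping models means changing this one line to another Hub model ID.
MODEL_ID = "HuggingFaceTB/SmolLM2-135M-Instruct"


def build_generator(model_id: str = MODEL_ID):
    """Load a local text-generation pipeline for the given model ID."""
    # Imported lazily so the heavy library is only loaded when a model is needed.
    # Requires: pip install transformers torch
    from transformers import pipeline

    return pipeline("text-generation", model=model_id)


def summarize(generator, text: str, max_new_tokens: int = 60) -> str:
    """Ask the model for a one-sentence summary of `text`."""
    prompt = f"Summarize in one sentence:\n{text}\nSummary:"
    # return_full_text=False returns only the newly generated text, not the prompt.
    result = generator(prompt, max_new_tokens=max_new_tokens, return_full_text=False)
    return result[0]["generated_text"].strip()
```

In use, `summarize(build_generator(), draft_text)` would run the default model, and `summarize(build_generator("another-org/another-model"), draft_text)` would run a different one (that second ID is a placeholder). Output quality from a model this small will vary, so treat this as a way to compare candidates rather than a finished workflow.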

2️⃣ Smaller Models, Greener Footprints

I’ve noticed that one of the most significant barriers to AI adoption in newsrooms isn’t just technical; it’s ethical. Many journalists are wary of AI because of its massive environmental footprint, and they’re right to be concerned. Florent pointed to research from Hugging Face showing that every time you generate an image with ChatGPT, it consumes about as much energy as fully charging an iPhone. Now scale that across millions of users making thousands of requests every day, and the carbon impact becomes staggering.

Florent offered some practical advice on how newsrooms can reduce their AI footprint. First, he emphasized the importance of asking a simple but critical question: “Is AI actually needed here?” He explained, “The news industry has focused on productivity as a first step with AI and that’s okay. But sometimes you have to ask: do we really need AI for this specific task?” In other words, just because you can use AI doesn’t always mean you should.

Florent’s second recommendation focused on where newsrooms can gain the most efficiency: swapping out large foundational models for smaller, task-specific ones. “What I’m seeing,” he said, “is that more often than not, the news industry is using big foundational models for tasks that could easily be handled by smaller, specialized models.” This is where the small-model revolution becomes crucial — delivering faster results, lower costs, and a far lighter environmental footprint.

A big foundational model is something like GPT-4o or o3 — the kinds you access through ChatGPT. But thanks to the small-model revolution, compressed models are now achieving remarkable capabilities while running entirely on local devices. Florent spoke about a 250-million-parameter vision model — smaller than most smartphone apps — that can shot-list an entire video in minutes. And when you run a model like that locally, you're using a computer that's already powered on for other work — no need for extra servers or transmitting data across the internet.
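
For a rough sense of scale, here is a back-of-the-envelope sketch of how much storage the weights of a 250-million-parameter model would need. The bytes-per-parameter figures are the standard sizes for common numeric precisions; which precision the model Florent described actually ships in is not stated in the episode, so treat the precision choices as assumptions. The estimate covers weights only, not the extra memory used while the model runs.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_size_mb(num_params: int, precision: str) -> float:
    """Approximate size of the model weights in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1_000_000


# 250 million parameters, as in the vision model mentioned above.
for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_size_mb(250_000_000, precision):,.0f} MB")
```

The takeaway is that the same model can range from about a gigabyte to roughly an eighth of that depending on how aggressively its weights are compressed, which is why small models fit comfortably on a laptop.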

Compression advances are accelerating fast. Models that once needed massive server farms are now being distilled into lightweight versions that run smoothly on everyday laptops, all while retaining most of their core capabilities.

3️⃣ Preparing for the Post-Website Era

The most existential threat to journalism’s business model may be the rise of AI agents that eliminate the need for audiences to visit news websites altogether. These autonomous systems can browse the internet, read articles, synthesize information, and deliver personalized news briefings, all without users ever seeing a headline, ad, or paywall.

“You can ask your agents to retrieve information for you. They’ll browse the website, extract the content and you’ll never see the UX, the UI, or the ads,” Florent explained, describing a reality that’s already taking shape with tools like ChatGPT’s search feature and Perplexity’s AI-powered search.

This shift threatens to obliterate traditional revenue models built on page views, time-on-site, and ad impressions. When readers can ask their AI assistant for the latest news and get a full briefing without ever visiting a news site, what happens to digital advertising? And if agents can instantly synthesize information from multiple sources, why would anyone click through to a single article?

But this disruption also opens up new opportunities for newsrooms willing to rethink their relationship with audiences. Instead of competing with AI agents, the most forward-thinking outlets may need to start building for them: creating content and experiences that work whether consumed by humans or by machines acting on their behalf. In this new landscape, the winners won’t just own the content; they’ll own the experience.

🎧 Listen to the full conversation with Florent Daudens on Apple Podcasts, Spotify or other major podcast platforms.
