Streaming infrastructure has scaled dramatically over the past decade. The encoding layer underneath it, in many operations, has not kept pace.
Most platforms are still running architectures built around always-on transcoding pipelines, pre-encoded rendition libraries stored across every bitrate and format combination, and CPU-heavy infrastructure provisioned for peak load regardless of actual demand. It is a model that works. But working is not the same as being efficient, and at scale, the difference between the two shows up directly in cost, energy consumption and operational complexity.
For many streaming service providers, the operational challenge is no longer just scale. It is flexibility. Sports, pop-up channels, regional broadcasts, FAST services, and temporary live events all create demand patterns that are difficult to predict in advance. Providers increasingly need the ability to launch additional events on demand without permanently allocating encoding capacity that may sit idle most of the time.
That reality is pushing interest toward architectures that can defer encoding and packaging until they are required: when a viewer session begins, when audience demand increases, or when a specific rendition is requested. Instead of continuously generating and storing every possible output variant, the infrastructure becomes capable of responding dynamically to real consumption patterns.
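The deferred model above can be sketched in a few lines. This is an illustrative toy, not any vendor's API: a just-in-time rendition cache that runs the (expensive) encode only on the first request for a given variant, then serves the stored output to later sessions. The `fake_encode` function stands in for a real transcode job.

```python
# Illustrative sketch of just-in-time rendition encoding (hypothetical
# names, not a real streaming API): encode a variant only when a viewer
# first requests it, then reuse the cached result.
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class JitRenditionCache:
    """Defer encoding until a rendition is actually requested."""
    encode: Callable[[str, str], bytes]   # encoder backend, e.g. a VPU or CPU job
    _store: Dict[Tuple[str, str], bytes] = field(default_factory=dict)
    hits: int = 0      # requests served from cache
    misses: int = 0    # requests that triggered an encode

    def get(self, asset: str, rendition: str) -> bytes:
        key = (asset, rendition)
        if key not in self._store:
            # First request for this variant triggers the expensive encode...
            self.misses += 1
            self._store[key] = self.encode(asset, rendition)
        else:
            # ...every later session reuses the stored output.
            self.hits += 1
        return self._store[key]


# Toy stand-in for a real transcode job.
def fake_encode(asset: str, rendition: str) -> bytes:
    return f"{asset}@{rendition}".encode()


cache = JitRenditionCache(encode=fake_encode)
cache.get("match-final", "1080p")   # encoded on demand
cache.get("match-final", "1080p")   # served from cache
cache.get("match-final", "720p")    # second rendition, encoded only now
print(cache.misses, cache.hits)     # prints: 2 1
```

The payoff at scale is in the ratio: three requests cost two encodes here, and in production the gap widens as popular variants are requested thousands of times while unpopular ones are never encoded at all.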
At production scale, that shift matters. It changes how compute resources are allocated, how storage is consumed, how redundancy is planned, and ultimately how efficiently a streaming platform can operate under fluctuating demand.
The industry knows this. What is less clear, for many teams, is where to start.
The Gap Between Design and Production
Encoding decisions are made early. Their consequences show up late.
A pipeline that looks efficient at the design stage can quietly become the most expensive line item in a streaming operation once it is live and under load. Costs drift. Infrastructure sized for expected demand struggles with real demand. The trade-offs only become visible after deployment, and by that point, redesigning from scratch is rarely an option. The work becomes about understanding what is actually happening inside the system and making targeted, informed decisions from there.
That requires a different kind of conversation than the industry usually has.
Streaming Architecture: Encoding Efficiency at Scale with Scalstrm and NETINT
On May 22, the day after Streaming Tech Sweden, Scalstrm and NETINT are hosting Streaming Architecture: Encoding Efficiency at Scale at Hotel at Six, Music Lounge in Stockholm. It is a free, independent, full-day session for streaming engineers and platform teams, capped at 40 attendees.
The morning starts with breakfast at 09:00, moving into technical sessions from 09:30. Leo Nieto of NETINT opens by connecting encoding decisions to broader architectural outcomes, setting the stage for what follows. Craig Butlin then breaks down why encoding has become the dominant cost driver in streaming workflows at scale, and where software-based approaches begin to struggle.
From Scalstrm, Dominique Vosters presents Where Streaming Infrastructure Is Actually Heading, examining the shift toward orchestrated, resource-aware infrastructure and the operational and economic challenges teams face as workloads become more dynamic and less predictable. Pontus Eklöf follows with From Pipelines to Systems: Orchestrating Encoding at Scale, a session diving into how modern orchestration approaches enable better utilisation, automation and cost control, with real-world examples of workflows designed to adapt dynamically to demand.
The morning closes with a panel discussion at 12:00 bringing all four speakers together to debate what a cost-efficient streaming stack actually looks like in 2026.
The afternoon splits into two tracks. The engineering track runs focused technical deep dives on integration, pipeline design and adaptive architectures. The business track offers 1:1 architecture clinics: 15-minute sessions where teams can bring their actual pipeline and get direct, practical feedback. Ten slots are available and must be booked in advance.
Teams who want their architecture discussed live can submit a short overview in advance. A limited number will be selected for live review.
The VPU Conversation That Needs to Happen
VPUs (video processing units) have gained significant attention for good reason. Built specifically for video encoding workloads, they deliver far greater throughput and energy efficiency than general-purpose CPUs in high-volume environments.
Deployments using Scalstrm’s VPU-integrated architecture have recorded up to 50% lower cost per channel, a 75% smaller server footprint and 80% lower power consumption compared to CPU-based pipelines.
But hardware alone does not guarantee results. Performance depends on the software layer, from session management and workload allocation to pipeline design. This session explores that gap through real deployment models and practical trade-offs, beyond vendor claims.
Join Scalstrm and NETINT in Stockholm
The day is built for streaming platform engineers, video infrastructure architects and teams operating encoding workflows at scale. Seating is limited to 40 attendees. Attendance is complimentary, lunch included.
Streaming Architecture: Encoding Efficiency at Scale | Friday, May 22, 2026 | Hotel at Six, Music Lounge, Stockholm | The day after Streaming Tech Sweden.
- Reserve your spot (40 places available)
- Book a 1:1 Architecture Clinic (10 slots only)