P.1204.1 vs. PEVQ-S: Comparing Two Approaches to Video Quality Monitoring

Video streaming today is dominated by large OTT services like Netflix, YouTube, Disney+, and Prime Video, delivered over networks that nobody owns end to end. If you are an ISP, mobile operator, benchmarking provider, regulator, or drive test vendor, you share the same job: figuring out how well those services actually work for your users, across different access networks, locations, devices, and times of day.

Choosing a video quality measurement solution for that job means looking hard at both the model itself and the delivery architecture it assumes. At AVEQ, our Surfmeter Mobile SDK is built on ITU-T Rec. P.1204.1, the most recent video quality standard from ITU-T, and we believe it is the only solution that fits the modern OTT reality. This article compares P.1204.1 directly with another model commonly cited for the job, PEVQ-S from OPTICOM, to help you navigate the current market.

Different Model Architectures

As a third-party measurement vendor, you have no control over the source video and no reference against which you can compare a received copy. That constraint has shaped how the industry thinks about objective video quality measurement, and two families of solutions dominate the market:

Full- or reduced-reference models assume you have both a known-good source and a received copy, and can run pixel-by-pixel comparisons between them. PEVQ-S™, marketed by OPTICOM, is one of the better-known examples in the space; it is an extension of ITU-T Rec. J.247, first published in 2008.
Bitstream- and metadata-based models estimate the perceived quality directly from the encoding parameters of whatever stream the user is receiving, with no reference required. The recently published ITU-T Rec. P.1204.1 from 2025 belongs firmly in this second family, and it is the engine behind the video MOS that powers our Surfmeter Mobile SDK.

Both families try to answer the same question about the user’s experienced video quality, but they do so in ways that suit very different deployment contexts.
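To make the first family concrete, here is a minimal sketch of a pixel-by-pixel comparison. It uses PSNR, a much simpler full-reference metric than PEVQ, purely as a stand-in: it is not the PEVQ algorithm, and the tiny frames are invented for illustration. The point is only that the function cannot be called without a reference frame.

```python
import math


def psnr(reference, received, max_value=255):
    """Peak signal-to-noise ratio (dB) between two equally sized luma frames.

    Frames are lists of rows of integer pixel values. PSNR is used here only
    as a simple stand-in for a full-reference metric; it is NOT PEVQ.
    """
    if len(reference) != len(received) or any(
        len(ref_row) != len(rec_row) for ref_row, rec_row in zip(reference, received)
    ):
        raise ValueError("frames must have identical dimensions")

    squared_error = 0
    pixel_count = 0
    for ref_row, rec_row in zip(reference, received):
        for ref_px, rec_px in zip(ref_row, rec_row):
            squared_error += (ref_px - rec_px) ** 2
            pixel_count += 1
    if pixel_count == 0:
        raise ValueError("frames must not be empty")

    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


# Invented 2x2 frames: the "source" and a slightly distorted received copy.
reference_frame = [[100, 110], [120, 130]]
received_frame = [[101, 108], [120, 133]]
print(f"PSNR: {psnr(reference_frame, received_frame):.2f} dB")
```

A third party measuring Netflix or YouTube has only the equivalent of `received_frame`, which is why this whole class of metric is hard to apply there.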

At a Glance

The table below summarizes the main differences that the rest of the article unpacks.

Aspect	P.1204.1 (AVEQ Surfmeter)	PEVQ-S (OPTICOM)
Architecture	No-reference, runs on device	Hybrid: server-side reference analysis plus client probe
Standardization	ITU-T P.1204.1, 2025	ITU-T J.247, 2008; J.343, 2014
Source access needed	✅ No reference required	❌ Needs a reference
Supported codecs	✅ H.264, H.265/HEVC, VP9; AV1 with public extension	❌ H.264/HEVC, per J.247 scope and proprietary extensions
Resolutions	✅ Up to 2160p (4K UHD)	❌ Up to HD, per OPTICOM product page
OTT catalogue coverage	✅ Any content on any service	❌ YouTube only, via a fixed OPTICOM test clip
Session MOS with stalling	✅ Yes, through the P.1203 framework	❌ Partial, proprietary
Publicly validated performance	✅ Yes, Pearson correlation 0.89–0.94, peer-reviewed basis	❌ Limited, partly proprietary

What PEVQ-S Is and What It Was Built For

According to OPTICOM’s publicly available PEVQ-S product page, PEVQ-S is a hybrid video quality architecture. It splits the work into two blocks: a server-side Content & Media Stream Analysis block that runs OPTICOM’s full-reference PEVQ algorithm against the studio master, and a client-side probe that handles transmission and presentation quality. The two halves are combined into a final MOS.

The full-reference half of PEVQ-S has a long standardization track. The original PEVQ algorithm became part of ITU-T Rec. J.247 in 2008 as a pure pixel-domain full-reference model for video telephony, IPTV, and streaming video. The hybrid PEVQ-S architecture was later standardized under ITU-T Rec. J.343 in 2014, and OPTICOM reports that PEVQ-S scored top in the VQEG benchmark of the same year.

In terms of scope, OPTICOM lists supported screen resolutions from QCIF up to HD, with full backward compatibility to J.247 for standardized 10-second sequences. The product page does not list 4K/UHD as a supported viewing size, nor does it list recent codec families such as VP9 or AV1. That is unsurprising given that the underlying standards were finalized in 2008 and 2014, before these technologies reached mass OTT deployment. PEVQ-S itself is also partly proprietary, and OPTICOM does not publish the kind of detailed performance numbers that are available for open-source and newer ITU-T standardized models. A recent third-party study confirms that PEVQ-S can be reasonably accurate, but only over a small set of YouTube clips and a narrow operating range.

To compute a full PEVQ-S score, somebody has to have access to the source video. That is simple enough when the streaming server is yours, but it is much harder if you are an outside party measuring how well a third-party OTT service performs across a network you do not control. In practice, PEVQ-S in an OTT context is deployed against a fixed test clip that OPTICOM uploaded to YouTube many years ago. Because that clip is not popular-tier content, YouTube has never re-encoded it with the most modern ladders or the newest codec generations, so what PEVQ-S measures is a frozen corner of the YouTube catalog that nobody really watches. Services where you cannot upload your own clip in the first place, like Netflix, are out of scope by construction.

… And What Is P.1204.1?

ITU-T Rec. P.1204.1 was approved in October 2025 by ITU-T Study Group 12, developed in collaboration with the Video Quality Experts Group (VQEG). It is the newest member of the P.1204 family, a Mode 0 video quality model that predicts video quality purely from metadata about the delivered stream. The required inputs are the kind of information that our probe already has at hand:

video bitrate
video frame rate
video coding resolution and display resolution
codec (H.264, H.265/HEVC, VP9, AV1)
segment duration and device type

That is the entire input list. There is no decoding, no pixel comparison, no frame alignment, and no reference video needed.
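As a rough illustration of how small that input is, the list above fits into a single record per segment. The sketch below is an assumption for illustration only: the class name, field names, units, and codec labels are hypothetical and are not taken from the Recommendation or from the Surfmeter SDK. The range checks reflect the limits stated in this article (up to 2160p and 60 fps).

```python
from dataclasses import dataclass
from typing import Tuple

# Codec labels are hypothetical; the article lists H.264, H.265/HEVC, VP9, AV1.
SUPPORTED_CODECS = {"h264", "hevc", "vp9", "av1"}


@dataclass(frozen=True)
class SegmentMetadata:
    """Hypothetical container for the metadata of one streamed segment."""

    bitrate_kbps: float
    frame_rate: float
    coding_resolution: Tuple[int, int]   # (width, height) as encoded
    display_resolution: Tuple[int, int]  # (width, height) as shown
    codec: str
    duration_s: float
    device_type: str                     # free-form here, e.g. "mobile"

    def __post_init__(self):
        if self.bitrate_kbps <= 0:
            raise ValueError("bitrate must be positive")
        if not 0 < self.frame_rate <= 60:
            raise ValueError("frame rate must be in (0, 60] fps")
        if not 0 < self.coding_resolution[1] <= 2160:
            raise ValueError("coding height must be in (0, 2160]")
        if self.codec not in SUPPORTED_CODECS:
            raise ValueError(f"unsupported codec: {self.codec}")
        if self.duration_s <= 0:
            raise ValueError("segment duration must be positive")


segment = SegmentMetadata(
    bitrate_kbps=4500.0,
    frame_rate=30.0,
    coding_resolution=(1920, 1080),
    display_resolution=(1920, 1080),
    codec="vp9",
    duration_s=5.0,
    device_type="mobile",
)
print(segment)
```

Everything in this record can be read from the player or the manifest, so no frame ever has to be decoded to fill it.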

The technical core of P.1204.1 is the AVQBits|M0 model developed by TU Ilmenau, published as peer-reviewed scientific work and trained against publicly available data such as the AVT-VQDB-UHD-1 subjective database, along with several others. P.1204.1 supports resolutions up to 2160p (4K UHD) and frame rates up to 60 fps. It produces both an overall quality score per segment (5–10 seconds) and a video quality score per second, suitable for diagnostics or for integration into longer-session quality using the P.1203 framework.

The reason ITU-T published the recommendation in the first place is that it performs well. On the AVT-VQDB-UHD-1 database used for training and validation, P.1204.1 achieves a Pearson correlation of 0.890 with subjective MOS at an RMSE of 0.499. The standard reports, among others, a Pearson correlation of up to 0.92 against the Twitch dataset. Even more interesting is what happens outside the training scope: independent research has measured a Pearson correlation of 0.94 between P.1204.1-based scores and subjective ratings from a Viasat satellite streaming database, even though satellite streaming was not part of the original training material. That kind of out-of-distribution generalization is a strong indicator of how soundly the underlying model is built, and it is exactly the kind of public, peer-reviewed evidence that PEVQ-S does not provide.
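For readers less familiar with these two figures, the sketch below shows how a Pearson correlation and an RMSE are computed between model predictions and subjective MOS. The five score pairs are invented for illustration and do not come from any of the databases mentioned above. Validation studies may also apply a mapping step before computing these statistics, which is omitted here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean squared error between two equally long sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Invented example values on the 1-5 MOS scale (NOT real measurement data).
subjective_mos = [1.8, 2.6, 3.4, 4.1, 4.6]
predicted_mos = [2.1, 2.4, 3.7, 3.9, 4.4]

print(f"Pearson: {pearson(subjective_mos, predicted_mos):.3f}")
print(f"RMSE:    {rmse(subjective_mos, predicted_mos):.3f}")
```

Correlation shows how well the model ranks conditions, while RMSE shows how far its scores sit from the subjective ratings in MOS units, which is why the two are usually reported together.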

P.1204.1 is also a drop-in replacement for the Pv (video quality) module of the broader ITU-T Rec. P.1203 framework. Combined with P.1203’s audio and integration modules, you get a single end-to-end MOS for an entire streaming session, including stalling, quality switching, and longer sessions through our sliding-window extension. The open-source reference implementation of P.1203, which we co-maintain, is used widely across academia and industry.

Computational Footprint and Deployment

PEVQ-S splits its computation across two physical locations. The full-reference Content & Media Stream Analysis runs at the server side; OPTICOM describes this as a “one-time calculation” against the studio master. The client probe at the device side is lightweight and contributes the transmission and presentation quality. So while the smartphone-side footprint can be small, the system as a whole still requires server infrastructure with access to the source content in order to actually produce a score, and the two pieces have to be deployed and kept in sync.

P.1204.1, by contrast, is a single self-contained component. The real engineering work behind the AVEQ product is years of experience in measuring third-party video services and extracting the right information from the video player. Our implementation of P.1204.1 is small enough to ship in an easy-to-install SDK, runs entirely on the client, and finishes in milliseconds. It can run on a mobile phone, fully offline, with no backend dependency and no pixel processing.

Where It Sits in Your Monitoring Stack

PEVQ-S is a two-piece architecture: a server-side full-reference analysis block plus a client-side probe. It measures video alone, with no audio. It does incorporate rebuffering and stalling, but the equations are not published because that part of the model is proprietary.

P.1204.1 can be used as a drop-in replacement for the Pv module of P.1203, increasing its applicability to modern streaming solutions. That means you immediately get more than a video MOS for a single segment: you also get audio quality, audiovisual quality, the impact of stalling and rebuffering, and a single integral score for sessions of one to five minutes, extended to longer sessions through AVEQ’s sliding-window work. The whole framework runs in one place, typically directly on the device that is watching the video, and there is nothing else to deploy or coordinate. Our overview of ITU-T video quality models explains how these pieces work together.
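The general idea of scoring a long session in windows can be sketched as follows. This is a generic illustration under stated assumptions, not AVEQ's actual sliding-window method: the function name, the 300-second window (chosen to match the five-minute upper bound mentioned above), and the 60-second step are all hypothetical.

```python
def sliding_windows(session_duration_s, window_s=300.0, step_s=60.0):
    """Return (start, end) time windows in seconds covering a whole session.

    Sessions no longer than one window yield a single window. Longer sessions
    yield overlapping windows, with the last one aligned to the session end.
    Window and step sizes are illustrative assumptions.
    """
    if session_duration_s <= 0 or window_s <= 0 or step_s <= 0:
        raise ValueError("durations must be positive")
    if session_duration_s <= window_s:
        return [(0.0, float(session_duration_s))]

    windows = []
    start = 0.0
    while start + window_s < session_duration_s:
        windows.append((start, start + window_s))
        start += step_s
    # Final window is aligned to the end so the tail of the session is covered.
    windows.append((float(session_duration_s - window_s), float(session_duration_s)))
    return windows


# A 7-minute session split into overlapping 5-minute windows.
for start, end in sliding_windows(420):
    print(f"score window {start:.0f}s to {end:.0f}s")
```

Each window would then be scored on its own with the per-second video scores and stalling events that fall inside it, giving one session-level score per window.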

Why P.1204.1 Is the Right Choice for Modern OTT Monitoring

We believe that PEVQ-S had its moment. The full-reference engine was standardized in 2008 as J.247, the hybrid architecture followed in 2014 as J.343, and the last VQEG benchmark is from the same year. That was a different Internet. 4K had not arrived in any meaningful volume, VP9 was still finding its footing, AV1 did not exist, and the dominant deployment model for premium video was an operator-managed IPTV head-end where the encoder, the CDN, and the player were all under one roof. Video often travelled over unreliable transport, so packet loss was a real source of visible degradation. Inside that world, deploying PEVQ-S was an obvious choice.

That world has changed, and benchmarking providers and drive test vendors have very different requirements today. The video is whatever Netflix or YouTube decides to ship, the codec is whatever the service picked for that session, and the resolution can go up to 4K60. Content is streamed over reliable transport using HTTP adaptive streaming, so network problems surface as quality switches and stalling rather than as packet-loss artifacts. P.1204.1 was built for this world. It is openly validated, both inside the standard itself and independently against datasets like the Viasat satellite database, where it consistently lands in the 0.89–0.94 Pearson-correlation range against real subjective ratings.

Through our work with TU Ilmenau, P.1204.1 also supports AV1, the codec that powers a large and growing share of YouTube and Netflix traffic. It fits directly into the P.1203 framework as the Pv module, so the same engine that scores a single segment also produces a session MOS that accounts for stalling and quality switching. Its footprint is small enough that storage, CPU, and battery on a mobile probe stay essentially untouched.

That is the engineering choice behind our entire Surfmeter suite, and the reason the Surfmeter Mobile Quality SDK can drop into any Android app and start producing video MOS values from live OTT sessions over real networks, with nothing else to deploy. If you want to see what that looks like in practice, get in touch.

Note: PEVQ™ and PEVQ-S™ are trademarks of OPTICOM GmbH. We refer to them in this article for the sole purpose of nominative comparison between two different methodologies for video quality measurement. All technical statements about PEVQ-S above are taken from publicly available OPTICOM product documentation and from the published ITU-T Recommendations J.247 and J.343. AVEQ is not affiliated with, endorsed by, or sponsored by OPTICOM.
