How to Evaluate the Quality of Experience Associated With Video Streaming
October 7, 2020
In a previous blog, we saw how the streaming experience is shaping the video industry. We now want to highlight how this notion of Quality of Experience (QoE) can be captured and analyzed, which is key to taking corrective steps to improve it.
What to measure
Video QoE can be assessed through the measurement of three different families of metrics:
- The startup time: The time it takes for the requested service to be launched, after end-users have selected it from their portal.
- The number, frequency and duration of freezes: While a freeze lasts, end-users cannot see the content.
- Video layers and changes: The video layers that are being displayed, the frequency of the switches between them and the average bitrate of the video session.
To assess the impact of these elements on end-user behavior, it is important to correlate them with the completion rate of the program being viewed. In other words, to what extent was the experience so bad that it led the user to stop watching the service?
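The three metric families and the completion rate can be pictured as a single per-session record. The following is a minimal sketch with hypothetical field names, not Smartlib's actual data model:

```python
from dataclasses import dataclass

@dataclass
class SessionQoE:
    # Hypothetical per-session record covering the three metric families
    startup_time_ms: int       # play request -> first displayed frame
    freeze_count: int          # number of stalls during playback
    freeze_duration_ms: int    # total time spent stalled
    avg_bitrate_kbps: int      # average bitrate of the video session
    layer_switches: int        # number of quality-layer changes
    watched_ms: int            # how long the user actually watched
    content_ms: int            # full duration of the program

    def completion_rate(self) -> float:
        """Fraction of the program actually watched (0.0 to 1.0)."""
        if self.content_ms == 0:
            return 0.0
        return min(1.0, self.watched_ms / self.content_ms)

session = SessionQoE(startup_time_ms=1200, freeze_count=3,
                     freeze_duration_ms=4500, avg_bitrate_kbps=3200,
                     layer_switches=5, watched_ms=600_000, content_ms=2_400_000)
print(session.completion_rate())  # 0.25: the viewer gave up a quarter of the way in
```

A low completion rate combined with poor values in the other fields is the signal that bad QoE drove the viewer away.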
How to measure
To accurately measure these metrics, we need to collect information at the player level. However, each player has its own parameters to catch events happening during the viewing process. It is, therefore, necessary to have a library in the application, specific to each player, that is used to transform those parameters into common metrics.
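One way to picture this per-player translation layer is an adapter that maps each player's native event names onto common metrics. The class and event names below are purely illustrative, not the real Smartlib or player APIs:

```python
class PlayerAdapter:
    """Hypothetical base class: one adapter per player translates
    player-specific events into a common metric vocabulary."""
    def normalize(self, raw_event: dict) -> dict:
        raise NotImplementedError

class ExoPlayerAdapter(PlayerAdapter):
    # Illustrative mapping only; not the actual ExoPlayer callback names
    EVENT_MAP = {
        "onRenderedFirstFrame": "first_frame",
        "onPlaybackStalled": "stall_start",
        "onPlaybackResumed": "stall_end",
    }

    def normalize(self, raw_event: dict) -> dict:
        return {
            "metric": self.EVENT_MAP.get(raw_event["type"], "unknown"),
            "timestamp_ms": raw_event["ts"],
        }

adapter = ExoPlayerAdapter()
print(adapter.normalize({"type": "onRenderedFirstFrame", "ts": 1234}))
# {'metric': 'first_frame', 'timestamp_ms': 1234}
```

With one adapter per player, the analytics back end only ever sees the common metric names, regardless of which player produced them.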
At Broadpeak, we have developed Smartlib, which allows us to gather all the information necessary from the players.
With a few lines of code, Smartlib can be added to any end-user application. It comes with a built-in mechanism to collect player metrics and send this information to Broadpeak’s analytics solution.
The startup time is obtained by computing the elapsed time between the play request and the display of the first video content frame. The information about freezes is obtained by observing the playback position and capturing stall dates.
The adaptive path among layers is obtained by listening to the player events associated with layer switches and bit rate evolution.
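The two derivations above (startup time from the play request and first frame, freezes from stall start/end events) can be sketched as a small monitor object. The callback names are hypothetical:

```python
class PlaybackMonitor:
    """Minimal sketch of deriving startup time and freezes from
    player callbacks; method names are illustrative assumptions."""

    def __init__(self):
        self.play_requested_at = None
        self.startup_time_ms = None
        self.stalls = []          # list of (start_ms, end_ms) pairs
        self._stall_start = None

    def on_play_requested(self, now_ms: int):
        self.play_requested_at = now_ms

    def on_first_frame(self, now_ms: int):
        # Startup time = elapsed time between play request and first frame
        self.startup_time_ms = now_ms - self.play_requested_at

    def on_stall_start(self, now_ms: int):
        self._stall_start = now_ms

    def on_stall_end(self, now_ms: int):
        self.stalls.append((self._stall_start, now_ms))
        self._stall_start = None

# Simulated session: play at t=0, first frame at t=800, one 1.5 s freeze
m = PlaybackMonitor()
m.on_play_requested(0)
m.on_first_frame(800)
m.on_stall_start(5000)
m.on_stall_end(6500)
print(m.startup_time_ms, m.stalls)  # 800 [(5000, 6500)]
```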
The computed metrics are sent to the BkA100 analytics server on a regular basis to be displayed.
The frequency at which measurements are made has an important impact on how the information is perceived. However, high accuracy implies a large volume of generated data. It is, therefore, important to find a trade-off between precision about when an event occurred and the quantity of information stored.
Aggregating the information is key to reducing the required volume while keeping the details that are useful.
For example, the average startup time per player provides a good indication of the performance of the different players involved in the video delivery solution. A single playback produces verbose logs spread across the multiple servers involved in the delivery; aggregating them into a homogeneous video session data set makes it possible to assess the QoE of the delivered content and drill down into the details for further analysis.
How to represent
The way the information is presented in the GUI is crucial if we want to exploit it and trigger actions. Many graphical components exist (e.g., pie charts, histograms, time-based graphs, lists), and each serves a different purpose. It is, therefore, key to associate each piece of information with the most suitable way of displaying it, so that the exploitable elements it conveys immediately draw the attention.
Timelines are used to visualize the evolution of a metric over time. For example, they are well suited to showing the progress of concurrent video sessions during a given period.
Histograms are efficient for plotting the frequency of score occurrences, like the distribution of the QoE Viewer Experience score across the delivered video sessions.
Maps are used when the data refers to a location, such as video session creation per country or region.
Rankings are used to sort large data collections in ascending or descending order and to display top/worst lists. They are well suited, for instance, to ranking video sessions by HTTP status code, or ranking client IPs by number of errors to highlight the viewers most impacted by video delivery errors.
Gauges are used to show a single value within a given scale, for example the QoE Viewer Experience score computed for a user or a group of users.
Radar charts can quickly highlight which metric is degrading the Viewer Experience score, helping to understand the main reason for bad QoE.
The Viewer Experience score is based on startup times, number of freezes, freeze frequency, average bit rate and layer switches to highlight the evolution over time of the QoE for delivered video sessions. This score must be customizable based on the QoE items that are the most important for each operator.
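A customizable composite score like the one described above could be sketched as a weighted sum of per-metric sub-scores, where each operator supplies its own weights. This is a hypothetical formula for illustration only, not Broadpeak's actual scoring method:

```python
def viewer_experience_score(subscores, weights=None):
    """Illustrative composite QoE score: each sub-score is on a 0-100
    scale and the operator can override the default weights."""
    # Default weights are arbitrary placeholders, not a recommendation
    weights = weights or {"startup": 0.30, "freezes": 0.30,
                          "bitrate": 0.25, "switches": 0.15}
    return sum(subscores[name] * w for name, w in weights.items())

subscores = {"startup": 90, "freezes": 60, "bitrate": 80, "switches": 70}
print(viewer_experience_score(subscores))  # 75.5
```

An operator who cares most about freezes could simply pass a weight set that emphasizes the `freezes` sub-score, without changing the rest of the pipeline.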
Pie charts are used to visualize data within a limited set of categories, such as the percentage of delivered video sessions per service type (i.e., VOD, live, catch-up, start-over) or device type (smartphone, TV, tablet, set-top box).
A good analysis also requires that the collected data be represented from different angles.
Information related to a specific customer or session is relevant when a support request is made. A session tracker view fulfils this objective by displaying all the relevant events on a timeline.
All this information can be combined in dedicated dashboards that must be customizable based on the QoE items that are the most important for each operator.
But this is not enough. The possibility of grouping the indicators by different parameters is key to investigating the root cause of errors. This is where the list of filters that can be applied to the graphics is most useful.
How to exploit
The analytics information is useless if it does not trigger any action. It is, therefore, important to be able to set thresholds on different QoE parameters, in particular those that will trigger an alert. The alert will come as a notification in the GUI, an SMS or an e-mail.
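The threshold mechanism described above can be sketched as a simple check over a window of aggregated metrics; the threshold values and metric names here are illustrative assumptions:

```python
# Illustrative thresholds an operator might configure
THRESHOLDS = {"startup_ms": 4000, "freeze_count": 5}

def check_alerts(window_metrics):
    """Return a human-readable alert for every QoE metric that crosses
    its configured threshold; dispatch (GUI notification, SMS, e-mail)
    would consume this list."""
    return [
        f"{name} = {window_metrics[name]} exceeds {limit}"
        for name, limit in THRESHOLDS.items()
        if window_metrics.get(name, 0) > limit
    ]

print(check_alerts({"startup_ms": 5200, "freeze_count": 2}))
# ['startup_ms = 5200 exceeds 4000']
```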
Anomalies can also be identified by superimposing two identical graphics taken at different times.
Finally, the combination of filters can help identify the root cause of an error, as in the scenario described below.
A use case
An operator received multiple calls from a frustrated customer complaining about the start-up times and wanted to understand the root cause of the issues.
The operator can first look at the distribution of created sessions per video server over time.
A fairly even distribution is observed across three streaming servers. Looking for startup time issues, the operator can apply a filter to isolate the number of sessions with a start-up time greater than 4 seconds.
As a result, the operator can see that the third video server is the one mostly involved with these problematic sessions. However, no monitoring tool has shown any system or streaming capacity issues on this server.
Another factor might be involved, and it could be a correlated issue between the streaming server and another parameter such as the end-users’ device operating system, device type, player version or with a type of video service delivered.
The operator can look at the typical distribution of created sessions over the day. Most of the sessions here are consumed by Android smartphones. The rest are shared among tablets and smart TVs running tvOS, iOS and Android. Of these sessions, 52% are live vs. 48% VOD. Finally, the sessions are split 27%, 33% and 40% across three video player versions.
Now the operator can apply the same filter to isolate the number of sessions with a start-up time greater than 4 seconds. The number of created sessions expectedly decreases on all charts. Most importantly, the operator can observe a significant change in the distribution of sessions per player version.
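The drill-down described in this use case amounts to filtering on a startup-time threshold and counting the remaining sessions per dimension (server, player version, device type, and so on). A minimal sketch, with illustrative field names and sample data:

```python
from collections import Counter

def slow_session_breakdown(sessions, threshold_ms=4000, dimension="player_version"):
    """Keep only sessions whose startup exceeded the threshold,
    then count them per chosen dimension (server, player version, ...)."""
    slow = [s for s in sessions if s["startup_ms"] > threshold_ms]
    return Counter(s[dimension] for s in slow)

# Sample data echoing the scenario above (values are invented)
sessions = [
    {"server": "VSRV_03", "player_version": "2.6.0", "startup_ms": 6500},
    {"server": "VSRV_03", "player_version": "2.6.0", "startup_ms": 5100},
    {"server": "VSRV_01", "player_version": "2.7.1", "startup_ms": 900},
    {"server": "VSRV_02", "player_version": "2.5.2", "startup_ms": 1200},
]
print(slow_session_breakdown(sessions))
# Counter({'2.6.0': 2})
print(slow_session_breakdown(sessions, dimension="server"))
# Counter({'VSRV_03': 2})
```

Running the same breakdown across several dimensions is what exposes a correlation, such as slow sessions clustering on one server and one player version.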
The cause is a combination of factors: the drop in quality results from a degraded behavior when the VSRV_03 streaming server sends its content to version 2.6.0 of the player. Looking at the configuration of both the server and the player version will allow the operator to take corrective steps to improve QoE.
This type of issue can only be identified in a few clicks if the analytics tool gathers feedback from both the end-user side and the CDN and server side.
Streaming Video Alliance
Broadpeak is an active member of the Streaming Video Alliance, where it contributes to the QoE group, whose aim is to identify which streaming video metrics to gather, establish guidelines on how to calculate them, and create best practices for implementing a system that captures QoE and other measurement data.