Simplifying the Management of Subtitles in an IP Video World
August 27, 2020
A major challenge facing broadcasters and pay-TV operators migrating to IP is how to manage subtitles without requiring extra encoding. This blog will look into why subtitles are difficult to process in the IP video world and what unique solutions Broadpeak brings to the table, in particular for operators in Latin America.
The Complexity of Subtitles
Today, there is a clear shift from QAM to IP services, starting with VOD OTT offerings and extending to live TV. Two trends stand out. Cable operators, who used to deliver broadcast TV on their networks, are now deploying pure OTT (IPTV) services on fiber (GPON). In addition, small ISPs and new players are launching pure OTT offerings.
IP deployments are mostly based on pure OTT using DASH and HLS, with some minor use of the HSS format, mainly for web browsers, TVs, and Microsoft Xbox. While it might not seem like it, subtitles consisting of one or two rows of text have often been a hassle. Subtitles are a critical feature, especially when broadcasting English content to non-English-speaking audiences. In some regions, subtitles are even a legal requirement for the hard-of-hearing public. Technically, between the synchronization with audio/video, the placement of the subtitles over the video, and the treatment of special characters, subtitles have been a source of headaches since the earliest days of broadcasting.
One reason subtitles are difficult is that both video formats and subtitle formats have changed over time. For OTT formats, subtitles are managed as text files that are downloaded alongside the video chunks and read easily by the player. Let’s take a look at the various specifications:
- HLS specifies only a text format, WebVTT. Even in version 7, where IMSC-1 is introduced, only the text profile is supported.
- DASH specifies a text format in TTML as well as SMPTE-TT, which is an extension of TTML that allows bitmap images.
- Microsoft HTTP Smooth Streaming (HSS) specifies a text format, TTML (DFXP profile). Although it is still possible to carry a TTML image profile in HSS, this is not part of the standard and is not supported by all HSS players.
- Closed Captions (CEA 608/708) are embedded in the video, so the role of the origin packager is to pass through the SEI messages. In some cases, the origin packager will also signal the captions in the manifest (HLS and DASH).
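To make the manifest signaling in the last item concrete, here is a minimal sketch of how embedded closed captions can be declared in an HLS master playlist with an `EXT-X-MEDIA` tag of type `CLOSED-CAPTIONS`. The function name, the default group ID, and the sample language values are hypothetical and chosen for illustration only; this is not a description of how any particular packager builds its manifests.

```python
import re

def hls_closed_caption_tag(instream_id: str, language: str, name: str,
                           group_id: str = "cc") -> str:
    """Build an EXT-X-MEDIA line announcing captions carried inside the video.

    HLS allows INSTREAM-ID values CC1..CC4 (CEA-608 channels) and
    SERVICE1..SERVICE63 (CEA-708 services). No URI is given, because the
    captions travel in the video stream itself.
    """
    if not re.fullmatch(r"CC[1-4]|SERVICE([1-9]|[1-5][0-9]|6[0-3])", instream_id):
        raise ValueError(f"invalid INSTREAM-ID: {instream_id}")
    return (
        f'#EXT-X-MEDIA:TYPE=CLOSED-CAPTIONS,GROUP-ID="{group_id}",'
        f'NAME="{name}",LANGUAGE="{language}",INSTREAM-ID="{instream_id}"'
    )

# Example: announce a Spanish caption channel carried as CEA-608 channel 1.
print(hls_closed_caption_tag("CC1", "es", "Español"))
```

The variant streams that carry these captions would then reference the same group through their `CLOSED-CAPTIONS` attribute.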
DVB subtitles and SCTE-27 subtitles became popular in the broadcast world of live TV as they are images (literally bitmap images) with the subtitles already “burned.” These images can be decoded easily by broadcast set-top boxes used by cable and satellite operators.
They eliminate potential interoperability issues during playback, especially with special characters like the “ñ” in Spanish and the “~” in Portuguese. These feeds are still in place and are also used in the headend for new OTT services. Many operators in LATAM still receive video content from big content providers with DVB and/or SCTE-27 subtitles. This poses a problem because there is no native way to display those subtitles in HLS, which is the dominant format in most deployments.
How to Fix the Problem
At Broadpeak, we have been working with our customers to develop solutions to this conundrum.
HLS solution: For HLS, the first way of solving the problem is to use SMPTE-TT bitmaps. As noted above, this is not supported natively by the protocol, so it requires an adaptation on the player side. This has been done both with app/web players and with native STB players. The main drawback is that it is a proprietary solution that depends on the packager vendor’s implementation and on the player’s ability to support it. Several implementations exist in the field, which makes using SMPTE-TT in HLS a nightmare in terms of interoperability.
The second solution is to use OCR (Optical Character Recognition) to transcode the DVB/SCTE-27 bitmap subtitles into an OTT-friendly text format such as WebVTT. This is achieved using our origin packager without adding any extra modules or products. It has proven to be a very popular solution for customers that do not want to adapt their applications/players, as well as for those using a native iOS player.
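To show what the text output of such a conversion looks like, the sketch below serializes already-recognized subtitle cues into a WebVTT document. It does not perform OCR and is not Broadpeak’s implementation; the `Cue` type and the function names are hypothetical. WebVTT files are UTF-8 encoded, which is why characters such as “ñ” survive without special handling.

```python
from dataclasses import dataclass

@dataclass
class Cue:
    start: float  # cue start time in seconds
    end: float    # cue end time in seconds
    text: str     # recognized subtitle text (hypothetical OCR result)

def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"

def to_webvtt(cues) -> str:
    """Serialize cues as a WebVTT document: header, then one block per cue."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        lines.append(cue.text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)

print(to_webvtt([Cue(1.0, 2.5, "¿Dónde está el niño?")]))
```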
DASH solution: The standard solution is to use SMPTE-TT images. We can also use OCR to produce TTML text subtitles.
VOD solution: For VOD, the most widely deployed source formats are text based. The most popular one is TTML. However, some broadcasters have long used EBU-STL as their subtitle format, which is more complex, allowing the use of colors and formatting, background color, and text deviation, and is useful for true captioning.
In the case of VOD, Broadpeak supports TTML and STL subtitles as well as the simpler but more widely used SRT format. We also have some customers who need to support SMPTE-TT subtitles, which leads us to use OCR, as for live.
Our packager can ingest these formats for VOD files and produce the following outputs:
- For HLS, WebVTT (if the input is SMPTE-TT, we also use OCR)
- For DASH, stpp (TTML, SMPTE-TT, or TTML through OCR)
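The input/output combinations above can be summarized as a small routing function. This is only an illustrative sketch of the decision logic described in this section: the function name, the `text_only` flag, and the returned dictionary are hypothetical and do not correspond to an actual packager API.

```python
TEXT_INPUTS = {"TTML", "STL", "SRT"}  # text-based VOD source formats
IMAGE_INPUTS = {"SMPTE-TT"}           # bitmap-based VOD source format

def plan_vod_subtitles(input_format: str, protocol: str,
                       text_only: bool = False) -> dict:
    """Return the output subtitle format and whether OCR is needed.

    text_only is a hypothetical switch: for DASH, it requests TTML text
    (through OCR) instead of passing SMPTE-TT images through.
    """
    fmt, proto = input_format.upper(), protocol.upper()
    if fmt not in TEXT_INPUTS | IMAGE_INPUTS:
        raise ValueError(f"unsupported input format: {input_format}")
    is_image = fmt in IMAGE_INPUTS
    if proto == "HLS":
        # HLS output is always WebVTT, so bitmap input must go through OCR.
        return {"output": "WebVTT", "ocr": is_image}
    if proto == "DASH":
        if is_image and not text_only:
            return {"output": "SMPTE-TT (stpp)", "ocr": False}
        return {"output": "TTML (stpp)", "ocr": is_image}
    raise ValueError(f"unsupported protocol: {protocol}")

print(plan_vod_subtitles("SMPTE-TT", "HLS"))
```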
How Do We Do OCR?
OCR is a feature embedded in our Origin Packager (BkS350), without the need for additional modules.
The OCR processing can be divided into three steps:
- Image processing: this involves cropping, boxing, luminance transform, and changing the background and the borders to ease the recognition process
- Character recognition
- Language processing with deep learning using long short-term memory networks
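As a rough illustration of the first step only, the sketch below applies a luminance transform, a threshold, and a crop to the text bounding box on a tiny bitmap. It is a simplified stand-in, not Broadpeak’s pipeline: the function names, the representation of the image as nested lists of RGB tuples, the Rec. 601 luma weights, and the threshold of 128 are all assumptions made for the example.

```python
def luminance(rgb):
    """Convert an (r, g, b) pixel to luma using Rec. 601 weights (assumed)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def binarize(image, threshold=128):
    """Map each pixel to 1 (bright, likely text) or 0 (background)."""
    return [[1 if luminance(px) >= threshold else 0 for px in row]
            for row in image]

def crop_to_text(mask):
    """Crop a 0/1 mask to the smallest box containing every 1 pixel."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return []  # nothing recognizable in this bitmap
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return [row[cols[0]:cols[-1] + 1] for row in mask[rows[0]:rows[-1] + 1]]

BLACK, WHITE = (0, 0, 0), (255, 255, 255)
bitmap = [
    [BLACK, BLACK, BLACK, BLACK],
    [BLACK, WHITE, WHITE, BLACK],
    [BLACK, BLACK, BLACK, BLACK],
]
print(crop_to_text(binarize(bitmap)))
```

The cleaned, cropped mask is the kind of input a character recognition stage would then consume.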
Why Broadpeak’s Subtitle Solution is Unique
Broadpeak’s BkS350 Origin Packager includes an innovative just-in-time feature that packages and encrypts video content in the most popular ABR formats, including Apple HTTP Live Streaming (HLS), MPEG-DASH, HSS, and the latest CMAF low-latency protocols. The BkS350 supports a wide variety of audio, video, and subtitle formats and offers seamless integration with major DRM vendors. Combining just-in-time packaging and a built-in cache mechanism, the BkS350 reduces the need for encoding and storage resources and provides high throughput capacity, generating significant savings. Since the BkS350 is able to deliver content to any off-net or on-net CDN, you can have your content delivered through the CDN of your choice and use a combination of delivery infrastructures.
Broadpeak is also starting to deploy projects using the DASH CMAF Low Latency profile, allowing latencies very close to IPTV deployments (MPEG-TS). DASH-CMAF specifies a new subtitle format in IMSC-1, which can be used for both image and text subtitles. Our BkS350 packager supports this format. Additionally, our solution supports SMPTE-TT subtitles for DASH-CMAF, which is important since we have seen that some players are already used to and prefer using this format.
To ensure low-latency streaming, it’s important to define the chunk size of the subtitles. We generally use video chunks of 200 ms, but DASH-IF recommends using chunks of at least 1 s for subtitles. Broadpeak also recommends using multicast ABR, which is the only way to guarantee the bandwidth and suppress the risk introduced by the lower buffers of such small chunks. Not only will multicast ABR allow a constant flow of traffic (independent of the number of simultaneous users), but it will also be more controlled, as UDP/RTP traffic can generally be prioritized in the network.
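As a quick arithmetic check of the sizing above, the sketch below computes how many video chunks a single subtitle chunk spans. The function name is hypothetical, and the requirement that the subtitle chunk be a whole multiple of the video chunk duration is an assumption made for the example, not something stated by DASH-IF or Broadpeak.

```python
def video_chunks_per_subtitle_chunk(video_chunk_ms: int = 200,
                                    subtitle_chunk_ms: int = 1000) -> int:
    """Return how many video chunks fit in one subtitle chunk."""
    if subtitle_chunk_ms < 1000:
        # Reflects the recommendation of at least 1 s for subtitle chunks.
        raise ValueError("subtitle chunks shorter than 1 s are not recommended")
    if subtitle_chunk_ms % video_chunk_ms != 0:
        # Assumption for this sketch: chunk boundaries should line up.
        raise ValueError("subtitle chunk is not a multiple of the video chunk")
    return subtitle_chunk_ms // video_chunk_ms

# With 200 ms video chunks and 1 s subtitle chunks: 1000 / 200 = 5.
print(video_chunks_per_subtitle_chunk())
```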
Real-World Success in Latin America
Our BkS350 has seen huge success in the real world for IP subtitling, especially with leading operators in Latin America. In one case, the BkS350 eliminated the need for the operator to purchase an additional module for subtitle preparation and management.
Another Latin American customer we work with was receiving DVB subtitles. Before Broadpeak could support OCR, they had to burn the subtitles into the video, which made language selection and subtitle disabling impossible. In another instance, one of our customers had already acquired a VOD catalog from broadcasters. Out of the catalog, 30% used SMPTE-TT subtitles. It was not possible to re-encode the whole catalog. With Broadpeak’s technology, the customer was able to resolve this issue and target new platforms, such as the Safari web browser.
We’d love to talk with you about how the BkS350 solution can benefit your subtitling needs.