The Technologies Behind Taobao Live Streaming

Basics of Audio-visual Technology



Live Streaming Technology

Streaming Media Protocols

Stream Ingestion and Stream Pulling


Stream Pulling

Definition of Demux

Demux Process


I-frame, B-frame, and P-frame

  • The I-frame (intra-coded frame) is a keyframe. It is encoded with intra-frame prediction only, so its data can be decoded independently of any other frame. The I-frame is typically the first frame in a Group of Pictures (GOP), a group of successive frames in an encoded video stream. Because it depends on no other frame, the I-frame serves as a random access reference point and is compressed much like a still image.
  • The B-frame is a bidirectionally predicted frame. This frame is predicted based on a previous I-frame or P-frame and a subsequent I-frame or P-frame. During decoding, the data of the current B-frame is combined with both the previously cached picture and the decoded subsequent picture to generate the final picture, so the subsequent reference frame must be decoded before the B-frame.
  • The P-frame is a forward predictive frame. It is predicted from a previous I-frame or P-frame. During encoding, the data of a P-frame is compressed by removing the temporal redundancy it shares with previously encoded frames in the image sequence.
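The dependencies described above can be sketched in a few lines of Python. This is a simplified model, not how any particular codec works: it assumes each P-frame references only the nearest earlier I-frame or P-frame, each B-frame references only the nearest I-frame or P-frame on either side, and the GOP is written as a string of frame types in display order. The function names are hypothetical.

```python
from typing import Dict, List


def reference_frames(gop: str) -> Dict[int, List[int]]:
    """Map each frame index (display order) to the frames it is predicted from."""
    if not gop or gop[0] != "I":
        raise ValueError("this sketch assumes the GOP starts with an I-frame")
    # I-frames and P-frames can be used as references ("anchors").
    anchors = [i for i, kind in enumerate(gop) if kind in "IP"]
    refs: Dict[int, List[int]] = {}
    for i, kind in enumerate(gop):
        earlier = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if kind == "I":
            refs[i] = []                          # decodable on its own
        elif kind == "P":
            refs[i] = earlier[-1:]                # nearest previous anchor
        elif kind == "B":
            refs[i] = earlier[-1:] + later[:1]    # nearest anchor on each side
        else:
            raise ValueError(f"unknown frame type: {kind!r}")
    return refs


def decode_order(gop: str) -> List[int]:
    """Return display-order indices in an order that satisfies all references."""
    refs = reference_frames(gop)
    decoded: List[int] = []
    pending = list(range(len(gop)))
    while pending:
        # Pick the earliest pending frame whose references are all decoded.
        ready = next(i for i in pending if all(r in decoded for r in refs[i]))
        decoded.append(ready)
        pending.remove(ready)
    return decoded


if __name__ == "__main__":
    print(reference_frames("IBBP"))
    print(decode_order("IBBPBBP"))
```

In this model the P-frame at display position 3 is decoded before the two B-frames that precede it on screen, which is why decode order and display order differ whenever B-frames are present.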



  • Decoding Timestamp (DTS) is used to tell a player when to decode the data of a frame.
  • Presentation Timestamp (PTS) is used to tell a player when to display the data of a frame.
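As an illustration of how the two timestamps interact, the following minimal sketch models a short stream that contains B-frames. The `Packet` class, the helper names, and the timestamp values (in arbitrary ticks) are assumptions made for this example and are not taken from any real container format or player.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    kind: str  # "I", "P", or "B"
    dts: int   # when the player should decode the frame
    pts: int   # when the player should display the frame


def is_well_formed(stream: List[Packet]) -> bool:
    """Check that DTS never decreases in stream order and no frame is shown before it is decoded."""
    dts_ok = all(a.dts <= b.dts for a, b in zip(stream, stream[1:]))
    pts_ok = all(p.pts >= p.dts for p in stream)
    return dts_ok and pts_ok


def presentation_order(stream: List[Packet]) -> List[Packet]:
    """Packets arrive in decode (DTS) order; the player displays them in PTS order."""
    return sorted(stream, key=lambda p: p.pts)


# Stream order is decode order: the P-frame is decoded before the two
# B-frames that reference it, but it is displayed after them.
stream = [
    Packet("I", dts=0, pts=1),
    Packet("P", dts=1, pts=4),
    Packet("B", dts=2, pts=2),
    Packet("B", dts=3, pts=3),
]

if __name__ == "__main__":
    print(is_well_formed(stream))
    print([p.kind for p in presentation_order(stream)])
```

When a stream has no B-frames, decode order and display order coincide, so in this model the DTS and PTS sequences would be in the same order.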



Web Media Technologies






Open-source Products and Frameworks in Use

  • JavaScript Players: FFmpeg is compiled to WebAssembly (Wasm) to implement JavaScript players in web browsers or to extend other audio-visual capabilities of web browsers.
  • Node Module Fluent-FFmpeg: This Node.js module wraps the complex commands of FFmpeg in a simpler API. It can be used, for example, when handling uploaded files and processing video streams. For more information, visit the fluent-ffmpeg website.

