MPEG

CASE STUDY


NAME : TANMAY MEHTA

COURSE : MBA TECH

BRANCH : TELECOM

ROLL NO : 527

PREFACE

The acronym MPEG stands for Moving Picture Experts Group, which worked to generate the specifications under ISO, the International Organization for Standardization, and IEC, the International Electrotechnical Commission. What is commonly referred to as "MPEG video" actually consists at the present time of two finalized standards, MPEG-1 and MPEG-2, with a third standard, MPEG-4, in the process of being finalized. The MPEG-1 and MPEG-2 standards are similar in basic concepts: both are based on motion-compensated, block-based transform coding techniques, while MPEG-4 deviates from these more traditional approaches in its usage of software image construct descriptors, for target bit-rates in the very low range, below 64 kbit/s. Because MPEG-1 and MPEG-2 are finalized standards and are both presently being utilized in a large number of applications, this case study concentrates on compression techniques relating only to these two standards. As for MPEG-3, it was originally anticipated that this standard would cover HDTV applications, but it was found that minor extensions to the MPEG-2 standard would suffice for this higher bit-rate, higher-resolution application, so work on a separate MPEG-3 standard was abandoned.

CONTENTS

*Introduction
*History
*Video Compression
*Video Quality
*MPEG
*MPEG Standards
*MPEG Video Compression Technology
*MPEG Specification
*System
-Elementary Stream
-System Clock Reference
-Program Streams
-Presentation Time Stamps
-Decoding Time Stamps
-Multiplexing

*Video
-Resolution
-Bitrate
-I frame
-P frame
-B frame
-D frame
-Macroblock
-Motion Vectors

*Audio
*Illustration 1 : 32 sub band filter bank
*Illustration 2 : FFT analysis
*Discrete Cosine Transform
*Important Tables
*Applications
*References

INTRODUCTION

MPEG video compression is used in many current and emerging products. It is at the heart of digital television set-top boxes, DSS, HDTV decoders, DVD players, video conferencing, Internet video, and other applications. These applications benefit from video compression in that they may require less storage space for archived video information, less bandwidth for the transmission of the video information from one point to another, or a combination of both. Besides the fact that it works well in a wide variety of applications, a large part of its popularity is that it is defined in two finalized international standards, with a third standard currently in the definition process. It is the purpose of this case study to introduce the basics of MPEG video compression, from both an encoding and a decoding perspective.

HISTORY

Modeled on the successful collaborative approach and the compression technologies developed by the Joint Photographic Experts Group and CCITT's Experts Group on Telephony (creators of the JPEG image compression standard and the H.261 standard for video conferencing respectively) the Moving Picture Experts Group (MPEG) working group was established in January 1988. MPEG was formed to address the need for standard video and audio formats, and build on H.261 to get better quality through the use of more complex encoding methods.
Development of the MPEG-1 standard began in May 1988. 14 video and 14 audio codec proposals were submitted by individual companies and institutions for evaluation. The codecs were extensively tested for computational complexity and subjective (human perceived) quality, at data rates of 1.5 Mbit/s. This specific bitrate was chosen for transmission over T-1/E-1 lines and as the approximate data rate of audio CDs. The codecs that excelled in this testing were utilized as the basis for the standard and refined further, with additional features and other improvements being incorporated in the process.
After 20 meetings of the full group in various cities around the world, and 4½ years of development and testing, the final standard (for parts 1-3) was approved in early November 1992 and published a few months later. The reported completion date of the MPEG-1 standard varies greatly: a largely complete draft standard was produced in September 1990, and from that point on, only minor changes were introduced. The draft standard was publicly available for purchase. The standard was finished with the 6 November 1992 meeting. The Berkeley Plateau Multimedia Research Group developed an MPEG-1 decoder in November 1992. In July 1990, before the first draft of the MPEG-1 standard had even been written, work began on a second standard, MPEG-2, intended to extend MPEG-1 technology to provide full broadcast-quality video (as per CCIR 601) at high bitrates (3-15 Mbit/s), and support for interlaced video. Due in part to the similarity between the two codecs, the MPEG-2 standard includes full backwards compatibility with MPEG-1 video, so any MPEG-2 decoder can play MPEG-1 videos.
Notably, the MPEG-1 standard very strictly defines the bitstream and decoder function, but does not define how MPEG-1 encoding is to be performed. This means that MPEG-1 coding efficiency can vary drastically depending on the encoder used, and generally means that newer encoders perform significantly better than their predecessors. The first three parts (Systems, Video and Audio) of ISO/IEC 11172 were published in August 1993.

VIDEO COMPRESSION
Video compression refers to reducing the quantity of data used to represent digital video images, and is a combination of spatial image compression and temporal motion compensation. Video compression is an example of the concept of source coding in information theory. In the applications this case study deals with, compressed video effectively reduces the bandwidth required to transmit video via terrestrial broadcast, cable TV, or satellite TV services.

VIDEO QUALITY

Most video compression is lossy: it operates on the premise that much of the data present before compression is not necessary for achieving good perceptual quality. For example, DVDs use a video coding standard called MPEG-2 that can compress around two hours of video data by 15 to 30 times, while still producing a picture quality that is generally considered high for standard-definition video. Video compression is a trade-off between disk space, video quality, and the cost of hardware required to decompress the video in a reasonable time. However, if the video is overcompressed in a lossy manner, visible (and sometimes distracting) artifacts can appear.

Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These pixel groups or blocks of pixels are compared from one frame to the next and the video compression codec (encode/decode scheme) sends only the differences within those blocks. This works extremely well if the video has no motion. A still frame of text, for example, can be repeated with very little transmitted data. In areas of video with more motion, more pixels change from one frame to the next. When more pixels change, the video compression scheme must send more data to keep up with the larger number of pixels that are changing.
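
A minimal sketch of this block-differencing idea in Python (an illustration only; a real MPEG codec adds motion compensation, transform coding and entropy coding on top of it):

```python
import numpy as np

BLOCK = 16  # macroblock edge length in pixels

def changed_blocks(prev, curr, threshold=0.0):
    """Return (row, col, block) for each 16x16 macroblock that differs
    between two equally sized grayscale frames: only these would be
    transmitted. A sketch of the differencing idea, not a real codec."""
    updates = []
    h, w = curr.shape
    for y in range(0, h, BLOCK):
        for x in range(0, w, BLOCK):
            a = prev[y:y + BLOCK, x:x + BLOCK].astype(int)
            b = curr[y:y + BLOCK, x:x + BLOCK].astype(int)
            if np.abs(b - a).mean() > threshold:
                updates.append((y, x, curr[y:y + BLOCK, x:x + BLOCK]))
    return updates

# A still frame needs no updates; a small moving object touches one block.
prev = np.zeros((64, 64), dtype=np.uint8)
curr = prev.copy()
curr[20:30, 20:30] = 255
print(len(changed_blocks(prev, prev)), len(changed_blocks(prev, curr)))  # 0 1
```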
VIDEO COMPRESSION TECHNOLOGY

At its most basic level, compression is performed when an input video stream is analyzed and information that is indiscernible to the viewer is discarded. Each event is then assigned a code: commonly occurring events are assigned codes with few bits, while rare events receive codes with more bits. These steps are commonly called signal analysis, quantization and variable-length encoding respectively. There are four common methods for compression: discrete cosine transform (DCT), vector quantization (VQ), fractal compression, and discrete wavelet transform (DWT).
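
A toy illustration of the variable-length encoding step; the event names and codes below are invented for the example, not MPEG's actual Huffman-style tables:

```python
# Invented prefix-free codebook: frequent events get short codes.
codebook = {"zero_run": "0", "small_coeff": "10",
            "large_coeff": "110", "escape": "111"}
events = ["zero_run", "zero_run", "small_coeff", "zero_run", "large_coeff"]
bitstream = "".join(codebook[e] for e in events)
print(bitstream)   # '00100110' -> 8 bits for 5 events
```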
Discrete cosine transform is a lossy compression algorithm that samples an image at regular intervals, analyzes the frequency components present in the sample, and discards those frequencies which do not affect the image as the human eye perceives it. DCT is the basis of standards such as JPEG, MPEG, H.261, and H.263.

Vector quantization is a lossy compression that looks at an array of data, instead of individual values. It can then generalize what it sees, compressing redundant data, while at the same time retaining the desired object or data stream's original intent.

Fractal compression is a form of VQ and is also a lossy compression. Compression is performed by locating self-similar sections of an image, then using a fractal algorithm to generate the sections.

Like DCT, discrete wavelet transform mathematically transforms an image into frequency components. The process is performed on the entire image, which differs from methods such as DCT that work on smaller pieces of the desired data. The result is a hierarchical representation of an image, where each layer represents a frequency band.
MPEG
MOVING PICTURE EXPERTS GROUP


The Moving Picture Experts Group (MPEG) was formed by the ISO to set standards for audio and video compression and transmission. It was established in 1988 and its first meeting was in May 1988 in Ottawa, Canada. As of late 2005, MPEG had grown to include approximately 350 members per meeting from various industries, universities, and research institutions. MPEG's official designation is ISO/IEC JTC1/SC29 WG11 - Coding of moving pictures and audio.

STANDARDS

The MPEG standards consist of different Parts. Each part covers a certain aspect of the whole specification. The standards also specify Profiles and Levels. Profiles are intended to define a set of tools that are available, and Levels define the range of appropriate values for the properties associated with them. MPEG has standardized the following compression formats and ancillary standards:

• MPEG-1 (1993): Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s (ISO/IEC 11172). The first MPEG compression standard for audio and video. It was basically designed to allow moving pictures and sound to be encoded into the bitrate of a Compact Disc. It is used on Video CD and SVCD and can be used for low-quality video on DVD Video. It was used in digital satellite/cable TV services before MPEG-2 became widespread. To meet the low bitrate requirement, MPEG-1 downsamples the images and uses picture rates of only 24-30 Hz, resulting in moderate quality. It includes the popular Layer 3 (MP3) audio compression format.

• MPEG-2 (1995): Generic coding of moving pictures and associated audio information (ISO/IEC 13818). Transport, video and audio standards for broadcast-quality television. The MPEG-2 standard was considerably broader in scope and of wider appeal, supporting interlacing and high definition. MPEG-2 is considered important because it has been chosen as the compression scheme for over-the-air digital television (ATSC, DVB and ISDB), digital satellite TV services like Dish Network, digital cable television signals, SVCD, DVD Video and Blu-ray.

• MPEG-3: MPEG-3 dealt with standardizing scalable and multi-resolution compression and was intended for HDTV compression, but it was found to be redundant and was merged with MPEG-2; as a result there is no MPEG-3 standard. MPEG-3 is not to be confused with MP3, which is MPEG-1 Audio Layer 3.

• MPEG-4 (1998): Coding of audio-visual objects (ISO/IEC 14496). MPEG-4 uses further coding tools with additional complexity to achieve higher compression factors than MPEG-2. In addition to more efficient coding of video, MPEG-4 moves closer to computer graphics applications. In more complex profiles, the MPEG-4 decoder effectively becomes a rendering processor and the compressed bitstream describes three-dimensional shapes and surface texture. MPEG-4 also provides Intellectual Property Management and Protection (IPMP), which provides the facility to use proprietary technologies to manage and protect content, like digital rights management. Several new higher-efficiency video standards (newer than MPEG-2 Video) are also included.
In addition, the following standards, while not sequential advances to the video encoding standard as with MPEG-1 through MPEG-4, are referred to by similar notation:

• MPEG-7 (2002): Multimedia content description interface (ISO/IEC 15938).

• MPEG-21 (2001): Multimedia framework (MPEG-21) (ISO/IEC 21000). MPEG describes this standard as a multimedia framework, and it provides for intellectual property management and protection.
Moreover, more recently than the standards above, MPEG has started a series of application-oriented international standards; each of these holds multiple MPEG technologies for a given area of application. (For example, MPEG-A includes a number of technologies on multimedia application formats.)

• MPEG-A (2007): Multimedia application format (MPEG-A) (ISO/IEC 23000). (e.g. purpose for multimedia application formats, MPEG music player application format, MPEG photo player application format and others)

• MPEG-B (2006): MPEG systems technologies (ISO/IEC 23001). (e.g. Binary MPEG format for XML, Fragment Request Units, Bitstream Syntax Description Language (BSDL) and others)

• MPEG-C (2006): MPEG video technologies (ISO/IEC 23002). (e.g. accuracy requirements for implementation of integer-output 8x8 inverse discrete cosine transform and others)

• MPEG-D (2007): MPEG audio technologies (ISO/IEC 23003). (e.g. MPEG Surround and two parts under development: SAOC - Spatial Audio Object Coding and USAC - Unified Speech and Audio Coding)

• MPEG-E (2007): Multimedia Middleware (ISO/IEC 23004) (a.k.a. M3W). (e.g. architecture, multimedia application programming interface (API), component model and others)

MPEG VIDEO COMPRESSION TECHNIQUE

A MPEG "film" is a sequence of three kinds of frames:
|[pic] |The I-frames are intra coded, |
| |i.e. they can be reconstructed |
| |without any reference to other |
| |frames. The P-frames are forward|
| |predicted from the last I-frame |
| |or P-frame, i.e. it is |
| |impossible to reconstruct them |
| |without the data of another |
| |frame (I or P). The B-frames are|
| |both, forward predicted and |
| |backward predicted from the |
| |last/next I-frame or P-frame, |
| |i.e. there are two other frames |
| |necessary to reconstruct them. |
| |P-frames and B-frames are |
| |referred to as inter coded |
| |frames. |

As an example, a frame sequence displayed as I B B B P B B B P is transferred in the order I P B B B P B B B. The only task of the decoder is to reorder the reconstructed frames; to support this, an ascending frame number comes with each frame.
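
A minimal sketch of that reordering step in Python, assuming each frame arrives tagged with a hypothetical ascending display number:

```python
# Transmission order from the example above, paired with hypothetical
# ascending display numbers: the decoder's reordering job is a sort.
transmitted = [("I", 0), ("P", 4), ("B", 1), ("B", 2), ("B", 3),
               ("P", 8), ("B", 5), ("B", 6), ("B", 7)]

display = sorted(transmitted, key=lambda frame: frame[1])
print([f"{kind}{num}" for kind, num in display])
# ['I0', 'B1', 'B2', 'B3', 'P4', 'B5', 'B6', 'B7', 'P8']
```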

What does "prediction" mean?

Imagine an I-frame showing a triangle on a white background. A following P-frame shows the same triangle but at another position. Prediction means to supply a motion vector which declares how to move the triangle on the I-frame to obtain the triangle in the P-frame. This motion vector is part of the MPEG stream and is divided into a horizontal and a vertical part. These parts can be positive or negative: a positive value means motion to the right or motion downwards, respectively; a negative value means motion to the left or motion upwards, respectively. The parts of the motion vector lie in the range -64 ... +63, so the referred area can be up to 64 pixels away in each direction.

But this model assumes that every change between frames can be expressed as a simple displacement of pixels, and that isn't always true. Consider a rectangle that is shifted and also rotated by 5° to the right: a simple displacement of the rectangle will cause a prediction error. Therefore the MPEG stream contains a matrix for compensating this prediction error.

Thus, the reconstruction of inter coded frames proceeds in two steps:
1. Application of the motion vector to the referred frame;
2. Adding the prediction error compensation to the result.

The prediction error compensation requires fewer bytes than the whole frame, because the parts where the prediction was already correct are zero and can be discarded from the MPEG stream. Furthermore, the DCT compression (see later in this chapter) is applied to the prediction error, which decreases its memory size.
Note also that the two additions involved have different meanings: the first adds the motion vector to the x- and y-coordinates of each pixel, while the second adds an error value to the color value of the appropriate pixel.
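
The two-step reconstruction might be sketched in Python as follows (a simplified illustration: it treats a whole 16x16 block with a single full-pel vector and ignores the DCT coding of the residual):

```python
import numpy as np

def reconstruct_block(anchor, mv, residual, y, x, size=16):
    """Step 1: fetch the prediction by applying the motion vector
    (dy, dx) to the block's coordinates in the referred (anchor) frame.
    Step 2: add the transmitted prediction-error matrix pixel by pixel.
    Positive dx is motion to the right, positive dy motion downwards,
    so the prediction comes from the position *before* the move."""
    dy, dx = mv
    predicted = anchor[y - dy:y - dy + size, x - dx:x - dx + size].astype(int)
    return np.clip(predicted + residual, 0, 255).astype(np.uint8)

# Content moved 3 px right and 1 px down; the residual then corrects a
# single pixel that pure displacement predicted wrongly.
anchor = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
residual = np.zeros((16, 16), dtype=int)
residual[0, 0] = 5
block = reconstruct_block(anchor, (1, 3), residual, y=16, x=16)
print(block.shape)   # (16, 16)
```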

What if some parts move to the left and others to the right?

The motion vector isn't valid for the whole frame. Instead, the frame is divided into macroblocks of 16x16 pixels, and every macroblock has its own motion vector. Of course, this does not avoid contradictory motion, but it minimizes its probability. And if contradictory motion occurs? One of the greatest misunderstandings of the MPEG compression technique is to assume that all macroblocks of P-frames are predicted. If the prediction error is too big, the coder can decide to intra-code a macroblock. Similarly, the macroblocks in B-frames can be forward predicted, backward predicted, forward and backward predicted, or intra-coded.

Every macroblock contains 4 luminance blocks and 2 chrominance blocks, and every block has a dimension of 8x8 values. The luminance blocks contain information about the brightness of every pixel in the macroblock, while the chrominance blocks contain color information. Because of certain properties of the human eye it isn't necessary to give color information for every pixel; instead, 4 pixels are related to one color value. This color value is divided into two parts: the first is in the Cb color block, the second in the Cr color block. Depending on the kind of macroblock, the blocks contain pixel information or prediction-error information as mentioned above. In any case the information is compressed using the discrete cosine transform (DCT).

MPEG SPECIFICATION

Part 1: Systems

Part 1 of the MPEG-1 standard covers systems, and is defined in ISO/IEC-11172-1.
MPEG-1 Systems specifies the logical layout and methods used to store the encoded audio, video, and other data into a standard bitstream, and to maintain synchronization between the different contents. This file format is specifically designed for storage on media, and transmission over data channels, that are considered relatively reliable. Only limited error protection is defined by the standard, and small errors in the bitstream may cause noticeable defects.

Elementary streams

Elementary streams (ES) are the raw bitstreams of MPEG-1 audio and video, output by an encoder. These files can be distributed on their own, such as is the case with MP3 files.
Additionally, elementary streams can be made more robust by packetizing them, i.e., dividing them into independent chunks, and adding a cyclic redundancy check (CRC) checksum to each segment for error detection. This is the Packetized Elementary Stream (PES) structure.
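
A sketch of the packetization idea, assuming fixed-size chunks and a trailing CRC-32 checksum; the real PES packet header layout is considerably more elaborate than this:

```python
import zlib

def packetize(es_bytes, chunk_size=2048):
    """Split a raw elementary stream into fixed-size chunks and append
    a CRC-32 checksum to each, so a decoder can detect corruption.
    The chunk size and checksum placement are illustrative only."""
    packets = []
    for i in range(0, len(es_bytes), chunk_size):
        chunk = es_bytes[i:i + chunk_size]
        packets.append(chunk + zlib.crc32(chunk).to_bytes(4, "big"))
    return packets

packets = packetize(b"\x00" * 5000)
print(len(packets), len(packets[0]))   # 3 packets; first is 2048 + 4 bytes
```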
System Clock Reference (SCR) is a timing value stored in a 33-bit header of each ES, at a frequency/precision of 90 kHz, with an extra 9-bit extension that stores additional timing data with a precision of 27 MHz. These are inserted by the encoder, derived from the system time clock (STC). Simultaneously encoded audio and video streams will not have identical SCR values, however, due to buffering, encoding, jitter, and other delay.

Program streams

Program Streams (PS) are concerned with combining multiple packetized elementary streams (usually just one audio and video PES) into a single stream, ensuring simultaneous delivery, and maintaining synchronization. The PS structure is known as a multiplex, or a container format.
Presentation time stamps (PTS) exist in PS to correct the inevitable disparity between audio and video SCR values (time-base correction). 90 kHz PTS values in the PS header tell the decoder which video SCR values match which audio SCR values. PTS determines when to display a portion of an MPEG program, and is also used by the decoder to determine when data can be discarded from the buffer. Either video or audio will be delayed by the decoder until the corresponding segment of the other arrives and can be decoded.
Decoding Time Stamps (DTS), additionally, are required because of B-frames. With B-frames in the video stream, adjacent frames have to be encoded and decoded out of order (re-ordered frames). DTS is quite similar to PTS, but instead of just handling sequential frames, it contains the proper time-stamps to tell the decoder when to decode and display the next B-frame (types of frames are explained below), ahead of its anchor (P- or I-) frame. Without B-frames in the video, PTS and DTS values are identical.
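
A small illustration with hypothetical 90 kHz timestamps for the transmitted sequence I P B B (3600 ticks per frame at 25 fps), showing how DTS follows decode order while PTS follows display order:

```python
# The P-frame is decoded before the two B-frames that are displayed
# before it, so its DTS is smaller than its PTS.
TICKS = 3600
frames = [("I", 0, 0),   # (type, decode order, presentation order)
          ("P", 1, 3),
          ("B", 2, 1),
          ("B", 3, 2)]
for kind, dec, pres in frames:
    print(f"{kind}: DTS={dec * TICKS:6d}  PTS={pres * TICKS:6d}")
```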

Multiplexing

To generate the PS, the multiplexer will interleave the (two or more) packetized elementary streams. This is done so the packets of the simultaneous streams can be transferred over the same channel and are guaranteed to both arrive at the decoder at precisely the same time. This is a case of time-division multiplexing.

Part 2: Video

Part 2 of the MPEG-1 standard covers video and is defined in ISO/IEC-11172-2. The design was heavily influenced by H.261.
MPEG-1 Video exploits perceptual compression methods to significantly reduce the data rate required by a video stream. It reduces or completely discards information in certain frequencies and areas of the picture that the human eye has limited ability to fully perceive. It also utilizes effective methods to exploit temporal (over time) and spatial (across a picture) redundancy common in video, to achieve better data compression than would be possible otherwise.

Color space


Example of 4:2:0 subsampling. The two overlapping center circles represent chroma blue and chroma red (color) pixels, while the 4 outside circles represent the luma (brightness).

Before encoding video to MPEG-1, the color-space is transformed to Y'CbCr (Y'=Luma, Cb=Chroma Blue, Cr=Chroma Red). Luma (brightness, resolution) is stored separately from chroma (color, hue, phase) and even further separated into red and blue components. The chroma is also subsampled to 4:2:0, meaning it is decimated by one half vertically and one half horizontally, to just one quarter the resolution of the video.[1]
Because the human eye is much less sensitive to small changes in color than in brightness, chroma subsampling is a very effective way to reduce the amount of video data that needs to be compressed. On videos with fine detail (high spatial complexity) this can manifest as chroma aliasing artifacts. Compared to other digital compression artifacts, this issue seems to be very rarely a source of annoyance.
Because of subsampling, Y'CbCr video must always be stored using even dimensions (divisible by 2), otherwise chroma mismatch will occur, and it will appear as if the color is ahead of, or behind the rest of the video, much like a shadow.
Y'CbCr is often inaccurately called YUV, a term which properly applies only in the domain of analog video signals. Similarly, the terms luminance and chrominance are often used instead of the (more accurate) terms luma and chroma.
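
A sketch of these two operations, using BT.601-style conversion coefficients (an assumption made here for illustration; the standard itself only defines the coded Y'CbCr data):

```python
import numpy as np

def rgb_to_ycbcr_420(rgb):
    """Convert an RGB frame (H x W x 3, H and W even) to Y'CbCr and
    subsample the chroma to 4:2:0 by averaging each 2x2 block.
    BT.601-style coefficients are assumed for this sketch."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128.0
    h, w = y.shape
    # 4:2:0: one chroma sample per 2x2 luma block -> quarter resolution.
    cb = cb.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    cr = cr.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, cb, cr

frame = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
y, cb, cr = rgb_to_ycbcr_420(frame)
print(y.shape, cb.shape, cr.shape)   # (32, 32) (16, 16) (16, 16)
```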

Resolution/Bitrate

MPEG-1 supports resolutions up to 4095×4095 (12-bits), and bitrates up to 100 Mbit/s.
MPEG-1 videos are most commonly seen using Source Input Format (SIF) resolution: 352x240, 352x288, or 320x240. These low resolutions, combined with a bitrate less than 1.5 Mbit/s, make up what is known as a constrained parameters bitstream (CPB), later renamed the "Low Level" (LL) profile in MPEG-2. This is the minimum video specifications any decoder should be able to handle, to be considered MPEG-1 compliant. This was selected to provide a good balance between quality and performance, allowing the use of reasonably inexpensive hardware of the time.

Frame/picture/block types

MPEG-1 has several frame/picture types that serve different purposes. The most important, yet simplest, are I-frames.

I-frames

I-frame is an abbreviation for Intra-frame, so called because such frames can be decoded independently of any other frames. They may also be known as I-pictures, or keyframes due to their somewhat similar function to the key frames used in animation. I-frames can be considered effectively identical to baseline JPEG images.
High-speed seeking through an MPEG-1 video is only possible to the nearest I-frame. When cutting a video it is not possible to start playback of a segment of video before the first I-frame in the segment (at least not without computationally-intensive re-encoding). For this reason, I-frame-only MPEG videos are used in editing applications.
I-frame only compression is very fast, but produces very large file sizes: a factor of 3× (or more) larger than normally encoded MPEG-1 video, depending on how temporally complex a specific video is. I-frame only MPEG-1 video is very similar to MJPEG video. So much so that very high-speed and theoretically lossless conversion can be made from one format to the other, provided a couple of restrictions (color space and quantization matrix) are followed in the creation of the bitstream.

P-frames

P-frame is an abbreviation for Predicted-frame. They may also be called forward-predicted frames, or inter-frames.
P-frames exist to improve compression by exploiting the temporal (over time) redundancy in a video. P-frames store only the difference in image from the frame (either an I-frame or P-frame) immediately preceding it (this reference frame is also called the anchor frame).
The difference between a P-frame and its anchor frame is calculated using motion vectors on each macroblock of the frame (see below). Such motion vector data will be embedded in the P-frame for use by the decoder.

B-frames

B-frame stands for bidirectional-frame. They may also be known as backwards-predicted frames or B-pictures. B-frames are quite similar to P-frames, except they can make predictions using both the previous and future frames (i.e. two anchor frames).
It is therefore necessary for the player to first decode the next I- or P- anchor frame sequentially after the B-frame, before the B-frame can be decoded and displayed. This makes B-frames very computationally complex, requires larger data buffers, and causes an increased delay on both decoding and encoding. This also necessitates the decoding time stamps (DTS) feature in the container/system stream (see above). As such, B-frames have long been the subject of much controversy; they are often avoided in videos, and are sometimes not fully supported by hardware decoders.
No other frames are predicted from a B-frame. Because of this, a very low bitrate B-frame can be inserted, where needed, to help control the bitrate. If this was done with a P-frame, future P-frames would be predicted from it and would lower the quality of the entire sequence.

D-frames

MPEG-1 has a unique frame type not found in later video standards. D-frames or DC-pictures are independent images (intra-frames) that have been encoded DC-only and hence are very low quality. D-frames are never referenced by I-, P- or B- frames. D-frames are only used for fast previews of video, for instance when seeking through a video at high speed. Given moderately higher-performance decoding equipment, this feature can be approximated by decoding I-frames instead, which provides higher-quality previews without D-frames taking up space in the stream while contributing nothing to normal-playback quality.

Macroblocks

MPEG-1 operates on video in a series of 8x8 blocks for quantization. However, because chroma (color) is subsampled by a factor of 4, each pair of (red and blue) chroma blocks corresponds to 4 different luma blocks. This set of 6 blocks, with a resolution of 16x16, is called a macroblock.
A macroblock is the smallest independent unit of (color) video. Motion vectors (see below) operate solely at the macroblock level.

Motion vectors

To decrease the amount of temporal redundancy in a video, only blocks that change are updated (up to the maximum GOP size). This is known as conditional replenishment. However, this is not very effective by itself. Movement of the objects, and/or the camera, may result in large portions of the frame needing to be updated, even though only the position of the previously encoded objects has changed. Through motion estimation the encoder can compensate for this movement and remove a large amount of redundant information.
The encoder compares the current frame with adjacent parts of the video from the anchor frame (previous I- or P- frame) in a diamond pattern, up to an (encoder-specific) predefined radius limit from the area of the current macroblock. If a match is found, only the direction and distance (i.e. the vector of the motion) from the previous video area to the current macroblock need to be encoded into the inter-frame (P- or B- frame). The reverse of this process, performed by the decoder to reconstruct the picture, is called motion compensation.
Motion vectors record the distance between two areas on screen in pixels (also called pels). MPEG-1 video uses a motion vector (MV) precision of one half of one pixel, or half-pel. The finer the precision of the MVs, the more accurate the match is likely to be, and the more efficient the compression. There are trade-offs to higher precision, however: finer MVs result in larger data size, as larger numbers must be stored in the frame for every single MV; coding complexity increases, as rising levels of interpolation on the macroblock are required for both the encoder and decoder; and there are diminishing returns (minimal gains) with higher-precision MVs. Half-pel was chosen as the ideal trade-off.
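
A sketch of the underlying block-matching idea, using an exhaustive full-pel search and the sum of absolute differences (SAD) as the match criterion; real encoders use smarter search patterns such as the diamond pattern mentioned above, and refine the result to half-pel by interpolation:

```python
import numpy as np

def find_motion_vector(anchor, curr, y, x, size=16, radius=8):
    """Exhaustive full-pel block matching: search a square window of
    the anchor frame for the area with the smallest SAD against the
    current macroblock. The returned (dy, dx) locates the best match
    in the anchor relative to the current block; the content's
    apparent motion is its negation."""
    target = curr[y:y + size, x:x + size].astype(int)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + size > anchor.shape[0] \
                    or xx + size > anchor.shape[1]:
                continue
            candidate = anchor[yy:yy + size, xx:xx + size].astype(int)
            sad = np.abs(candidate - target).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad

anchor = np.zeros((64, 64), dtype=np.uint8)
anchor[10:26, 10:26] = 200      # a 16x16 object in the anchor frame
curr = np.zeros((64, 64), dtype=np.uint8)
curr[12:28, 13:29] = 200        # the same object moved down 2, right 3
print(find_motion_vector(anchor, curr, y=12, x=13))   # ((-2, -3), 0)
```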

Part 3: Audio

Part 3 of the MPEG-1 standard covers audio and is defined in ISO/IEC-11172-3.
MPEG-1 Audio utilizes psychoacoustics to significantly reduce the data rate required by an audio stream. It reduces or completely discards certain parts of the audio that the human ear can't hear, either because they are in frequencies where the ear has limited sensitivity, or are masked by other (typically louder) sounds.


Visualization of the 32 sub-band filter bank used by MPEG-1 Audio, showing the disparity between the equal band-size of MP2 and the varying width of critical bands ("barks").
The 32 sub-band filter bank returns 32 amplitude coefficients, one for each equal-sized frequency band/segment of the audio, which is about 700 Hz wide (depending on the audio's sampling frequency). The encoder then utilizes the psychoacoustic model to determine which sub-bands contain audio information that is less important, and so, where quantization will be inaudible, or at least much less noticeable.
Example FFT analysis on an audio wave sample.
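
As a rough stand-in for the analysis step (the real MP1/MP2 filter bank is a 512-tap polyphase structure, not an FFT, so this is an approximation of the idea only), the following sketch splits a block of samples into 32 equal-width bands and measures each band's amplitude:

```python
import numpy as np

def subband_amplitudes(samples, bands=32):
    """Split one block of audio samples into `bands` equal-width
    frequency bands and return each band's mean spectral amplitude.
    An FFT approximation of the filter bank, for illustration only."""
    spectrum = np.abs(np.fft.rfft(samples))
    edges = np.linspace(0, len(spectrum), bands + 1).astype(int)
    return np.array([spectrum[a:b].mean()
                     for a, b in zip(edges[:-1], edges[1:])])

# At 44.1 kHz each of the 32 bands is ~689 Hz wide, so a 1 kHz tone
# should dominate band 1 (689-1378 Hz).
t = np.arange(1024) / 44100.0
amps = subband_amplitudes(np.sin(2 * np.pi * 1000.0 * t))
print(int(np.argmax(amps)))   # -> 1
```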
DISCRETE COSINE TRANSFORM

In general, neighboring pixels within an image tend to be highly correlated. As such, it is desirable to use an invertible transform to concentrate randomness into fewer, decorrelated parameters. The Discrete Cosine Transform (DCT) has been shown to be near optimal for a large class of images in energy concentration and decorrelation. The DCT decomposes the signal into underlying spatial frequencies, which then allow further processing techniques to reduce the precision of the DCT coefficients consistent with the Human Visual System (HVS) model.
The DCT/IDCT transform operations for an 8x8 block are described by Equations 1 and 2 respectively, where $f(x,y)$ is a pixel (or prediction-error) value, $F(u,v)$ is a transform coefficient, and $C(k) = 1/\sqrt{2}$ for $k = 0$, $C(k) = 1$ otherwise:

$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}$$

Equation 1: Forward Discrete Cosine Transform

$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}$$

Equation 2: Inverse Discrete Cosine Transform
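
A direct, unoptimized Python evaluation of Equation 1 (practical codecs use fast factorized forms of the same transform):

```python
import numpy as np

N = 8

def dct2(block):
    """Literal evaluation of Equation 1 on an NxN block (O(N^4);
    shown for clarity, not speed)."""
    def c(k):
        return 1.0 / np.sqrt(2.0) if k == 0 else 1.0
    out = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x, y]
                          * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                          * np.cos((2 * y + 1) * v * np.pi / (2 * N)))
            out[u, v] = 0.25 * c(u) * c(v) * s
    return out

# A flat block concentrates all of its energy in the DC term F(0,0):
coeffs = dct2(np.full((N, N), 128.0))
print(round(coeffs[0, 0]))        # 1024 (= 8 * 128)
ac = coeffs.copy(); ac[0, 0] = 0.0
print(np.abs(ac).max() < 1e-9)    # True: every AC coefficient is ~0
```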

An interesting summary of typical hardware requirements is given in the following table:

|Video profile |Typical decoder transistor count |Total DRAM |DRAM bus width, speed |
|MPEG-1 CPB    |0.4-0.75 million                 |4 Mb       |16 bits, 80 ns        |
|MPEG-1 601    |0.8-1.1 million                  |16 Mb      |64 bits, 80 ns        |
|MPEG-2 MP@ML  |0.9-1.5 million                  |16 Mb      |64 bits, 80 ns        |
|MPEG-2 MP@HL  |2.0-3.0 million                  |64 Mb      |N/A                   |

MPEG FILE FORMAT SUMMARY
|Type                     |Audio/video data storage |
|Colors                   |Up to 24 bits (4:2:0 Y'CbCr color space) |
|Compression              |DCT and block-based scheme with motion compensation |
|Maximum image size       |4095x4095, 30 frames/second |
|Multiple images per file |Yes (multiple program multiplexing) |
|Numerical format         |N/A |
|Originator               |Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) |
|Platform                 |All |
|Supporting applications  |Xing Technologies MPEG player, others |

Usage
Stores an MPEG-encoded data stream on a digital storage medium. MPEG is used to encode audio, video, text, and graphical data within a single, synchronized data stream.
Comments
MPEG-1 is a finalized standard in wide use. MPEG-2 is also finalized and continues to be extended for a wider base of applications. Products making practical use of the MPEG standards are still relatively few, but this is changing.

APPLICATIONS

• Most popular computer software for video playback includes MPEG-1 decoding, in addition to any other supported formats.
• The popularity of MP3 audio has established a massive installed base of hardware that can play back MPEG-1 Audio (all three layers).
• "Virtually all digital audio devices" can play back MPEG-1 Audio.[30] Many millions have been sold to date.
• Before MPEG-2 became widespread, many digital satellite/cable TV services used MPEG-1 exclusively.[9][19]
• The widespread popularity of MPEG-2 with broadcasters means MPEG-1 is playable by most digital cable and satellite set-top boxes, and digital disc and tape players, due to backwards compatibility.
• MPEG-1 is the exclusive video and audio format used on Video CD (VCD), the first consumer digital video format, and still a very popular format around the world.
• The Super Video CD standard, based on VCD, uses MPEG-1 audio exclusively, as well as MPEG-2 video.
• The DVD-Video format uses MPEG-2 video primarily, but MPEG-1 support is explicitly defined in the standard.
• The DVD-Video standard originally required MPEG-1 Layer II audio for PAL countries, but was changed to allow AC-3/Dolby Digital-only discs. MPEG-1 Layer II audio is still allowed on DVDs, although newer extensions to the format, like MPEG Multichannel, are rarely supported.
• Most DVD players also support Video CD and MP3 CD playback, which use MPEG-1.
• The international Digital Video Broadcasting (DVB) standard primarily uses MPEG-1 Layer II audio, and MPEG-2 video.
• The international Digital Audio Broadcasting (DAB) standard uses MPEG-1 Layer II audio exclusively, due to MP2's especially high quality, modest decoder performance requirements, and tolerance of errors.

REFERENCES

-"Elements of Data Compression" by
ADAM DROZDEK

-"Introduction to Data Compression" by
KHALID SAYOOD

-Wikipedia

-Encarta
