Masking refers to the psychovisual or psychoacoustic effect in which certain signal characteristics make certain types of coding noise invisible or inaudible.  In the aural case, the predominant masking effect is frequency related.  Visually, masking is mainly spatial, occurring in the vicinity of high-contrast edges.  In both cases there are temporal effects as well.

The challenge is to design a variable bit rate system in which the noise introduced at low bit rates falls in the frequency, spatial, or temporal regions in which the desired signal masks the presence of noise.
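
As a rough illustration of steering noise into masked regions, the sketch below spends a fixed bit budget across frequency bands, always giving the next bit to the band whose noise is most audible above its masking threshold.  It is a minimal sketch under stated assumptions, not a description of any particular coder: the function name allocate_bits, the per-band levels, and the rule of thumb that each extra bit lowers quantization noise by about 6 dB are all illustrative choices.

```python
import heapq

# Assumption: each added bit of resolution lowers quantization noise by ~6.02 dB.
DB_PER_BIT = 6.02

def allocate_bits(signal_db, mask_db, total_bits):
    """Greedy bit allocation by noise-to-mask ratio (hypothetical helper).

    signal_db: per-band signal level in dB; with zero bits the coding noise
               is assumed to be as large as the signal itself.
    mask_db:   per-band masking threshold in dB.
    Returns the number of bits assigned to each band.
    """
    bits = [0] * len(signal_db)
    # Max-heap on noise-to-mask ratio (negated, since heapq is a min-heap).
    heap = [(-(s - m), band) for band, (s, m) in enumerate(zip(signal_db, mask_db))]
    heapq.heapify(heap)
    for _ in range(total_bits):
        neg_nmr, band = heapq.heappop(heap)
        if -neg_nmr <= 0:
            # The worst band is already masked, so every band is; stop early.
            break
        bits[band] += 1
        heapq.heappush(heap, (neg_nmr + DB_PER_BIT, band))
    return bits

# Two bands at the same level; the second has a much lower masking threshold,
# so noise there is more audible and it receives most of the bits.
example = allocate_bits([60.0, 60.0], [50.0, 20.0], total_bits=6)
```

When the budget is small, the noise that remains is left in the bands where the masking threshold is highest, which is the behavior the design goal asks for.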

In audio, for example, a pure tone will mask energy at higher frequencies and, to a lesser extent, at lower frequencies.  The amount of masking decreases as the noise gets farther in frequency from the masking tone.  In addition, this masking effect does not disappear instantaneously after the tone is removed, but persists for a short time.
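
The behavior just described can be captured in a toy model: the masking threshold is highest near the tone, falls off more slowly toward higher frequencies than toward lower ones, and decays once the tone stops.  The sketch below is only illustrative.  The function name masking_threshold_db and every numeric default (slopes per octave, the offset below the masker level, the decay rate) are assumed placeholder values, not measured psychoacoustic data.

```python
import math

def masking_threshold_db(masker_hz, masker_db, probe_hz, ms_since_offset=0.0,
                         up_slope_db_per_oct=10.0,    # assumed: gentle falloff above the tone
                         down_slope_db_per_oct=27.0,  # assumed: steep falloff below the tone
                         offset_db=10.0,              # assumed gap between tone level and threshold
                         decay_db_per_ms=0.5):        # assumed decay after the tone is removed
    """Toy estimate of the level (dB) below which noise at probe_hz is masked."""
    # Distance between probe and masker, in octaves (positive = probe is higher).
    octaves = math.log2(probe_hz / masker_hz)
    if octaves >= 0:
        drop = up_slope_db_per_oct * octaves
    else:
        drop = down_slope_db_per_oct * (-octaves)
    # Masking persists after the tone ends but fades with time.
    return masker_db - offset_db - drop - decay_db_per_ms * ms_since_offset

# An 80 dB tone at 1 kHz masks noise an octave above more than noise an octave below.
above = masking_threshold_db(1000.0, 80.0, 2000.0)
below = masking_threshold_db(1000.0, 80.0, 500.0)
# 20 ms after the tone is removed, the threshold has dropped but not vanished.
later = masking_threshold_db(1000.0, 80.0, 2000.0, ms_since_offset=20.0)
```

A linear decay in dB is used here purely for simplicity; a real model would be fitted to listening-test data.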

Visually, high-contrast edges mask random noise.  There are also temporal masking effects for step transitions, which depend to some degree on the polarity of the temporal transition (black to white or white to black).  Thus, the video frame following a scene change does not have to be rendered with the same accuracy as a continuous still frame.

