In addition to the ISO reference documentation ($120 from ANSI in the US), here are some less detailed but more informative articles.
You can find links to lots of documentation at www.mp3-tech.org's programmers' corner.
NDTC Speech and Audio Coding Page contains a detailed and very well written review article on psycho-acoustics.
Davis Pan has a couple of useful overviews of MPEG audio encoding.
Several papers by Frank Baumgarte. The paper by Baumgarte, Ferekidis and Fuchs describes an alternative psycho-acoustic model for MPEG encoding.
A complete (non-MPEG) MDCT-based audio encoder (MUS420 class project). The associated paper gives some good information on audio encoding.
A good paper on mid/side stereo encoding:
Johnston and Ferreira,
Sum-Difference Stereo Transform Coding, Proc. IEEE ICASSP (1992) p 569-571.
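For orientation, the basic sum/difference (mid/side) transform that this kind of coding builds on can be sketched as below. This is only the plain, lossless channel transform under an assumed 1/2 scaling, not the method of the paper itself; some codecs scale by 1/sqrt(2) instead. The function names are illustrative.

```python
def ms_encode(left, right):
    """Convert left/right sample lists to mid/side (assumed 1/2 scaling)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

When the two channels are nearly identical, the side signal is close to zero and is cheap to encode, which is the motivation for the transform.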
Much of MPEG-2 AAC can also be used in MP3:
Bosi et al., "ISO/IEC MPEG-2 AAC", J. Audio Eng. Soc. 45 (1997) p 789-814.
And the original MPEG1 reference:
Brandenburg & Stoll, "ISO-MPEG-1 Audio: A Generic Standard for Coding of High-Quality Digital Audio", J. Audio Eng. Soc. 42 (1994) p 780-792.
Some original papers on the psycho-acoustics used by MPEG:
Johnston, "Transform Coding of Audio Signals Using Perceptual Noise Criteria", IEEE J. Selected Areas Communications, (1988).
Brandenburg and Johnston,
"Second Generation Perceptual Audio coding: The Hybrid Coder", AES 89th
Convention, 1990.
Shape of masking curve for tonal sounds (used in ISO model 2):
Schroeder, Atal & Hall, "Optimizing digital speech coders by exploiting masking properties of the human ear", JASA Vol.66 N°6 (1979) p 1647-1652
Tonality estimation used in ISO model 1, ATH shape (sometimes incorrectly referred to as the Painter & Spanias formula):
Terhardt, Stoll & Seewann, "Algorithm for extraction of pitch and pitch salience from complex tonal signals", JASA Vol.71 N°3 (1982) p 679-688
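The ATH (absolute threshold of hearing) shape mentioned here is commonly written as a closed-form approximation in dB SPL, with frequency in kHz. A minimal sketch, assuming that widely quoted form; the function name is illustrative and the constants are the ones usually cited, not taken from any particular encoder source:

```python
import math


def ath_db_spl(freq_hz: float) -> float:
    """Approximate absolute threshold of hearing (dB SPL) at freq_hz in Hz."""
    f = freq_hz / 1000.0  # the approximation is expressed in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)
```

The curve dips to its minimum near 3 to 4 kHz, where the ear is most sensitive, and rises steeply toward both low and very high frequencies.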
Shape of masking curve depending upon stimulus intensity:
Sporer & Brandenburg, "Constraints of filter banks used for perceptual measurement", JAES Vol.43 N°3 (1995) p 107-115