| 1 | <pre>
|
|---|
| 2 | DRAFT TIFF Technical Note #2 17-Mar-95
|
|---|
| 3 | ============================
|
|---|
| 4 |
|
|---|
| 5 | This Technical Note describes serious problems that have been found in
|
|---|
| 6 | TIFF 6.0's design for embedding JPEG-compressed data in TIFF (Section 22
|
|---|
| 7 | of the TIFF 6.0 spec of 3 June 1992). A replacement TIFF/JPEG
|
|---|
| 8 | specification is given. Some corrections to Section 21 are also given.
|
|---|
| 9 |
|
|---|
| 10 | To permit TIFF implementations to continue to read existing files, the 6.0
|
|---|
| 11 | JPEG fields and tag values will remain reserved indefinitely. However,
|
|---|
| 12 | TIFF writers are strongly discouraged from using the 6.0 JPEG design. It
|
|---|
| 13 | is expected that the next full release of the TIFF specification will not
|
|---|
| 14 | describe the old design at all, except to note that certain tag numbers
|
|---|
| 15 | are reserved. The existing Section 22 will be replaced by the
|
|---|
| 16 | specification text given in the second part of this Tech Note.
|
|---|
| 17 |
|
|---|
| 18 |
|
|---|
| 19 | Problems in TIFF 6.0 JPEG
|
|---|
| 20 | =========================
|
|---|
| 21 |
|
|---|
| 22 | Abandoning a published spec is not a step to be taken lightly. This
|
|---|
| 23 | section summarizes the reasons that have forced this decision.
|
|---|
| 24 | TIFF 6.0's JPEG design suffers from design errors and limitations,
|
|---|
| 25 | ambiguities, and unnecessary complexity.
|
|---|
| 26 |
|
|---|
| 27 |
|
|---|
| 28 | Design errors and limitations
|
|---|
| 29 | -----------------------------
|
|---|
| 30 |
|
|---|
| 31 | The fundamental design error in the existing Section 22 is that JPEG's
|
|---|
| 32 | various tables and parameters are broken out as separate fields which the
|
|---|
| 33 | TIFF control logic must manage. This is bad software engineering: that
|
|---|
| 34 | information should be treated as private to the JPEG codec
|
|---|
| 35 | (compressor/decompressor). Worse, the fields themselves are specified
|
|---|
| 36 | without sufficient thought for future extension and without regard to
|
|---|
| 37 | well-established TIFF conventions. Here are some of the significant
|
|---|
| 38 | problems:
|
|---|
| 39 |
|
|---|
| 40 | * The JPEGxxTable fields do not store the table data directly in the
|
|---|
| 41 | IFD/field structure; rather, the fields hold pointers to information
|
|---|
| 42 | elsewhere in the file. This requires special-purpose code to be added to
|
|---|
| 43 | *every* TIFF-manipulating application, whether it needs to decode JPEG
|
|---|
| 44 | image data or not. Even a trivial TIFF editor, for example a program to
|
|---|
| 45 | add an ImageDescription field to a TIFF file, must be explicitly aware of
|
|---|
| 46 | the internal structure of the JPEG-related tables, or else it will probably
|
|---|
| 47 | break the file. Every other auxiliary field in the TIFF spec contains
|
|---|
| 48 | data, not pointers, and can be copied or relocated by standard code that
|
|---|
| 49 | doesn't know anything about the particular field. This is a crucial
|
|---|
| 50 | property of the TIFF format that must not be given up.
|
|---|
| 51 |
|
|---|
| 52 | * To manipulate these fields, the TIFF control logic is required to know a
|
|---|
| 53 | great deal about JPEG details, for example such arcana as how to compute
|
|---|
| 54 | the length of a Huffman code table --- the length is not supplied in the
|
|---|
| 55 | field structure and can only be found by inspecting the table contents.
|
|---|
| 56 | This is again a violation of good software practice. Moreover, it will
|
|---|
| 57 | prevent easy adoption of future JPEG extensions that might change these
|
|---|
| 58 | low-level details.
|
|---|
| 59 |
|
|---|
| 60 | * The design neglects the fact that baseline JPEG codecs support only two
|
|---|
| 61 | sets of Huffman tables: it specifies a separate table for each color
|
|---|
| 62 | component. This implies that encoders must waste space (by storing
|
|---|
| 63 | duplicate Huffman tables) or else violate the well-founded TIFF convention
|
|---|
| 64 | that prohibits duplicate pointers. Furthermore, baseline decoders must
|
|---|
| 65 | test to find out which tables are identical, a waste of time and code
|
|---|
| 66 | space.
|
|---|
| 67 |
|
|---|
| 68 | * The JPEGInterchangeFormat field also violates TIFF's proscription against
|
|---|
| 69 | duplicate pointers: the normal strip/tile pointers are expected to point
|
|---|
| 70 | into the larger data area pointed to by JPEGInterchangeFormat. All TIFF
|
|---|
| 71 | editing applications must be specifically aware of this relationship, since
|
|---|
| 72 | they must maintain it or else delete the JPEGInterchangeFormat field. The
|
|---|
| 73 | JPEGxxTables fields are also likely to point into the JPEGInterchangeFormat
|
|---|
| 74 | area, creating additional pointer relationships that must be maintained.
|
|---|
| 75 |
|
|---|
| 76 | * The JPEGQTables field is fixed at a byte per table entry; there is no
|
|---|
| 77 | way to support 16-bit quantization values. This is a serious impediment
|
|---|
| 78 | to extending TIFF to use 12-bit JPEG.
|
|---|
| 79 |
|
|---|
| 80 | * The 6.0 design cannot support using different quantization tables in
|
|---|
| 81 | different strips/tiles of an image (so as to encode some areas at higher
|
|---|
| 82 | quality than others). Furthermore, since quantization tables are tied
|
|---|
| 83 | one-for-one to color components, the design cannot support table switching
|
|---|
| 84 | options that are likely to be added in future JPEG revisions.
|
|---|
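To make the second bullet above concrete: the length of a Huffman table can only be found by summing its code counts. The following is a minimal Python sketch, assuming the table uses the same layout as a table inside a JPEG DHT marker (16 count bytes followed by one value byte per code). The function name `huffman_table_length` and the sample bytes are hypothetical, not taken from the 6.0 spec.

```python
def huffman_table_length(data: bytes, offset: int) -> int:
    """Return the byte length of a Huffman table starting at `offset`.

    Assumed layout: 16 count bytes (number of codes of length 1..16),
    followed by one value byte per code.
    """
    counts = data[offset:offset + 16]
    if len(counts) != 16:
        raise ValueError("truncated Huffman table header")
    return 16 + sum(counts)


# Hypothetical table: 1 code of length 2 and 3 codes of length 3.
counts = bytes([0, 1, 3] + [0] * 13)
table = counts + bytes([0x00, 0x01, 0x02, 0x03])
print(huffman_table_length(table, 0))  # 16 count bytes + 4 values = 20
```

Even a program that only wants to relocate such a table must run this computation, which is the kind of JPEG-specific knowledge the bullet objects to.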
| 85 |
|
|---|
| 86 |
|
|---|
| 87 | Ambiguities
|
|---|
| 88 | -----------
|
|---|
| 89 |
|
|---|
| 90 | Several incompatible interpretations are possible for 6.0's treatment of
|
|---|
| 91 | JPEG restart markers:
|
|---|
| 92 |
|
|---|
| 93 | * It is unclear whether restart markers must be omitted at TIFF segment
|
|---|
| 94 | (strip/tile) boundaries, or whether they are optional.
|
|---|
| 95 |
|
|---|
| 96 | * It is unclear whether the segment size is required to be chosen as
|
|---|
| 97 | a multiple of the specified restart interval (if any); perhaps the
|
|---|
| 98 | JPEG codec is supposed to be reset at each segment boundary as if
|
|---|
| 99 | there were a restart marker there, even if the boundary does not fall
|
|---|
| 100 | at a multiple of the nominal restart interval.
|
|---|
| 101 |
|
|---|
| 102 | * The spec fails to address the question of restart marker numbering:
|
|---|
| 103 | do the numbers begin again within each segment, or not?
|
|---|
| 104 |
|
|---|
| 105 | That last point is particularly nasty. If we make numbering begin again
|
|---|
| 106 | within each segment, we give up the ability to impose a TIFF strip/tile
|
|---|
| 107 | structure on an existing JPEG datastream with restarts (which was clearly a
|
|---|
| 108 | goal of Section 22's authors). But the other choice interferes with random
|
|---|
| 109 | access to the image segments: a reader must compute the first restart
|
|---|
| 110 | number to be expected within a segment, and must have a way to reset its
|
|---|
| 111 | JPEG decoder to expect a nonzero restart number first. This may not even
|
|---|
| 112 | be possible with some JPEG chips.
|
|---|
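The computation a reader would face under the "continuous numbering" interpretation can be sketched as follows. This is only an illustration under stated assumptions: restart numbers cycle through RST0..RST7, the segment starts exactly on a restart-interval boundary, and the marker at the boundary itself is not counted as part of the segment. The function name and its parameters are hypothetical.

```python
def first_restart_number(segment_start_mcu: int, restart_interval: int) -> int:
    """Index n of the first RSTn marker expected inside a segment.

    segment_start_mcu: number of MCUs that precede the segment in the
                       original, unsegmented datastream.
    restart_interval:  MCUs per restart interval (the DRI value).
    """
    if restart_interval <= 0:
        raise ValueError("restart interval must be positive")
    if segment_start_mcu % restart_interval != 0:
        raise ValueError("segment does not start on a restart boundary")
    # Restart markers are numbered modulo 8 (RST0 .. RST7).
    return (segment_start_mcu // restart_interval) % 8


print(first_restart_number(0, 4))    # 0: first segment starts with RST0
print(first_restart_number(12, 4))   # 3: RST0..RST2 were used earlier
```

A decoder would also need some way to be told this starting number, which is the capability the paragraph above doubts that every JPEG chip has.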
| 113 |
|
|---|
| 114 | The tile height restriction found on page 104 contradicts Section 15's
|
|---|
| 115 | general description of tiles. For an image that is not vertically
|
|---|
| 116 | downsampled, page 104 specifies a tile height of one MCU or 8 pixels; but
|
|---|
| 117 | Section 15 requires tiles to be a multiple of 16 pixels high.
|
|---|
| 118 |
|
|---|
| 119 | This Tech Note does not attempt to resolve these ambiguities, so
|
|---|
| 120 | implementations that follow the 6.0 design should be aware that
|
|---|
| 121 | inter-application compatibility problems are likely to arise.
|
|---|
| 122 |
|
|---|
| 123 |
|
|---|
| 124 | Unnecessary complexity
|
|---|
| 125 | ----------------------
|
|---|
| 126 |
|
|---|
| 127 | The 6.0 design creates problems for implementations that need to keep the
|
|---|
| 128 | JPEG codec separate from the TIFF control logic --- for example, consider
|
|---|
| 129 | using a JPEG chip that was not designed specifically for TIFF. JPEG codecs
|
|---|
| 130 | generally want to produce or consume a standard ISO JPEG datastream, not
|
|---|
| 131 | just raw compressed data. (If they were to handle raw data, a separate
|
|---|
| 132 | out-of-band mechanism would be needed to load tables into the codec.)
|
|---|
| 133 | With such a codec, the TIFF control logic must parse JPEG markers emitted
|
|---|
| 134 | by the codec to create the TIFF table fields (when writing) or synthesize
|
|---|
| 135 | JPEG markers from the TIFF fields to feed the codec (when reading). This
|
|---|
| 136 | means that the control logic must know a great deal more about JPEG details
|
|---|
| 137 | than we would like. The parsing and reconstruction of the markers also
|
|---|
| 138 | represents a fair amount of unnecessary work.
|
|---|
| 139 |
|
|---|
| 140 | Quite a few implementors have proposed writing "TIFF/JPEG" files in which
|
|---|
| 141 | a standard JPEG datastream is simply dumped into the file and pointed to
|
|---|
| 142 | by JPEGInterchangeFormat. To avoid parsing the JPEG datastream, they
|
|---|
| 143 | suggest not writing the JPEG auxiliary fields (JPEGxxTables etc) nor even
|
|---|
| 144 | the basic TIFF strip/tile data pointers. This approach is incompatible
|
|---|
| 145 | with implementations that handle the full TIFF 6.0 JPEG design, since they
|
|---|
| 146 | will expect to find strip/tile pointers and auxiliary fields. Indeed, this
|
|---|
| 147 | is arguably not TIFF at all, since *all* TIFF-reading applications expect
|
|---|
| 148 | to find strip or tile pointers. A subset implementation that is not
|
|---|
| 149 | upward-compatible with the full spec is clearly unacceptable. However,
|
|---|
| 150 | the frequency with which this idea has come up makes it clear that
|
|---|
| 151 | implementors find the existing Section 22 too complex.
|
|---|
| 152 |
|
|---|
| 153 |
|
|---|
| 154 | Overview of the solution
|
|---|
| 155 | ========================
|
|---|
| 156 |
|
|---|
| 157 | To solve these problems, we adopt a new design for embedding
|
|---|
| 158 | JPEG-compressed data in TIFF files. The new design uses only complete,
|
|---|
| 159 | uninterpreted ISO JPEG datastreams, so it should be much more forgiving of
|
|---|
| 160 | extensions to the ISO standard. It should also be far easier to implement
|
|---|
| 161 | using unmodified JPEG codecs.
|
|---|
| 162 |
|
|---|
| 163 | To reduce overhead in multi-segment TIFF files, we allow JPEG overhead
|
|---|
| 164 | tables to be stored just once in a JPEGTables auxiliary field. This
|
|---|
| 165 | feature does not violate the integrity of the JPEG datastreams, because it
|
|---|
| 166 | uses the notions of "tables-only datastreams" and "abbreviated image
|
|---|
| 167 | datastreams" as defined by the ISO standard.
|
|---|
| 168 |
|
|---|
| 169 | To prevent confusion with the old design, the new design is given a new
|
|---|
| 170 | Compression tag value, Compression=7. Readers that need to handle
|
|---|
| 171 | existing 6.0 JPEG files may read both old and new files, interpreting old
|
|---|
| 172 | files under the 6.0 spec as they did before. Compression tag value 6
|
|---|
| 173 | and the field tag numbers defined by 6.0 section 22 will remain reserved
|
|---|
| 174 | indefinitely, even though detailed descriptions of them will be dropped
|
|---|
| 175 | from future editions of the TIFF specification.
|
|---|
| 176 |
|
|---|
| 177 |
|
|---|
| 178 | Replacement TIFF/JPEG specification
|
|---|
| 179 | ===================================
|
|---|
| 180 |
|
|---|
| 181 | [This section of the Tech Note is expected to replace Section 22 in the
|
|---|
| 182 | next release of the TIFF specification.]
|
|---|
| 183 |
|
|---|
| 184 | This section describes TIFF compression scheme 7, a high-performance
|
|---|
| 185 | compression method for continuous-tone images.
|
|---|
| 186 |
|
|---|
| 187 | Introduction
|
|---|
| 188 | ------------
|
|---|
| 189 |
|
|---|
| 190 | This TIFF compression method uses the international standard for image
|
|---|
| 191 | compression ISO/IEC 10918-1, usually known as "JPEG" (after the original
|
|---|
| 192 | name of the standards committee, Joint Photographic Experts Group). JPEG
|
|---|
| 193 | is a joint ISO/CCITT standard for compression of continuous-tone images.
|
|---|
| 194 |
|
|---|
| 195 | The JPEG committee decided that because of the broad scope of the standard,
|
|---|
| 196 | no one algorithmic procedure was able to satisfy the requirements of all
|
|---|
| 197 | applications. Instead, the JPEG standard became a "toolkit" of multiple
|
|---|
| 198 | algorithms and optional capabilities. Individual applications may select
|
|---|
| 199 | a subset of the JPEG standard that meets their requirements.
|
|---|
| 200 |
|
|---|
| 201 | The most important distinction among the JPEG processes is between lossy
|
|---|
| 202 | and lossless compression. Lossy compression methods provide high
|
|---|
| 203 | compression but allow only approximate reconstruction of the original
|
|---|
| 204 | image. JPEG's lossy processes allow the encoder to trade off compressed
|
|---|
| 205 | file size against reconstruction fidelity over a wide range. Typically,
|
|---|
| 206 | 10:1 or more compression of full-color data can be obtained while keeping
|
|---|
| 207 | the reconstructed image visually indistinguishable from the original. Much
|
|---|
| 208 | higher compression ratios are possible if a low-quality reconstructed image
|
|---|
| 209 | is acceptable. Lossless compression provides exact reconstruction of the
|
|---|
| 210 | source data, but the achievable compression ratio is much lower than for
|
|---|
| 211 | the lossy processes; JPEG's rather simple lossless process typically
|
|---|
| 212 | achieves around 2:1 compression of full-color data.
|
|---|
| 213 |
|
|---|
| 214 | The most widely implemented JPEG subset is the "baseline" JPEG process.
|
|---|
| 215 | This provides lossy compression of 8-bit-per-channel data. Optional
|
|---|
| 216 | extensions include 12-bit-per-channel data, arithmetic entropy coding for
|
|---|
| 217 | better compression, and progressive/hierarchical representations. The
|
|---|
| 218 | lossless process is an independent algorithm that has little in
|
|---|
| 219 | common with the lossy processes.
|
|---|
| 220 |
|
|---|
| 221 | It should be noted that the optional arithmetic-coding extension is subject
|
|---|
| 222 | to several US and Japanese patents. To avoid patent problems, use of
|
|---|
| 223 | arithmetic coding processes in TIFF files intended for inter-application
|
|---|
| 224 | interchange is discouraged.
|
|---|
| 225 |
|
|---|
| 226 | All of the JPEG processes are useful only for "continuous tone" data,
|
|---|
| 227 | in which the difference between adjacent pixel values is usually small.
|
|---|
| 228 | Low-bit-depth source data is not appropriate for JPEG compression, nor
|
|---|
| 229 | are palette-color images good candidates. The JPEG processes work well
|
|---|
| 230 | on grayscale and full-color data.
|
|---|
| 231 |
|
|---|
| 232 | Describing the JPEG compression algorithms in sufficient detail to permit
|
|---|
| 233 | implementation would require more space than we have here. Instead, we
|
|---|
| 234 | refer the reader to the References section.
|
|---|
| 235 |
|
|---|
| 236 |
|
|---|
| 237 | What data is being compressed?
|
|---|
| 238 | ------------------------------
|
|---|
| 239 |
|
|---|
| 240 | In lossy JPEG compression, it is customary to convert color source data
|
|---|
| 241 | to YCbCr and then downsample it before JPEG compression. This gives
|
|---|
| 242 | 2:1 data compression with hardly any visible image degradation, and it
|
|---|
| 243 | permits additional space savings within the JPEG compression step proper.
|
|---|
| 244 | However, these steps are not considered part of the ISO JPEG standard.
|
|---|
| 245 | The ISO standard is "color blind": it accepts data in any color space.
|
|---|
| 246 |
|
|---|
| 247 | For TIFF purposes, the JPEG compression tag is considered to represent the
|
|---|
| 248 | ISO JPEG compression standard only. The ISO standard is applied to the
|
|---|
| 249 | same data that would be stored in the TIFF file if no compression were
|
|---|
| 250 | used. Therefore, if color conversion or downsampling is used, it must
|
|---|
| 251 | be reflected in the regular TIFF fields; these steps are not considered to
|
|---|
| 252 | be implicit in the JPEG compression tag value. PhotometricInterpretation
|
|---|
| 253 | and related fields shall describe the color space actually stored in the
|
|---|
| 254 | file. With the TIFF 6.0 field definitions, downsampling is permissible
|
|---|
| 255 | only for YCbCr data, and it must correspond to the YCbCrSubSampling field.
|
|---|
| 256 | (Note that the default value for this field is 2,2, not 1,1; so the default
|
|---|
| 257 | for YCbCr is to apply downsampling!) It is likely that future versions of TIFF
|
|---|
| 258 | will provide additional PhotometricInterpretation values and a more general
|
|---|
| 259 | way of defining subsampling, so as to allow more flexibility in
|
|---|
| 260 | JPEG-compressed files. But that issue is not addressed in this Tech Note.
|
|---|
| 261 |
|
|---|
| 262 | Implementors should note that many popular JPEG codecs
|
|---|
| 263 | (compressor/decompressors) provide automatic color conversion and
|
|---|
| 264 | downsampling, so that the application may supply full-size RGB data which
|
|---|
| 265 | is nonetheless converted to downsampled YCbCr. This is an implementation
|
|---|
| 266 | convenience which does not excuse the TIFF control layer from its
|
|---|
| 267 | responsibility to know what is really going on. The
|
|---|
| 268 | PhotometricInterpretation and subsampling fields written to the file must
|
|---|
| 269 | describe what is actually in the file.
|
|---|
| 270 |
|
|---|
| 271 | A JPEG-compressed TIFF file will typically have PhotometricInterpretation =
|
|---|
| 272 | YCbCr and YCbCrSubSampling = [2,1] or [2,2], unless the source data was
|
|---|
| 273 | grayscale or CMYK.
|
|---|
| 274 |
|
|---|
| 275 |
|
|---|
| 276 | Basic representation of JPEG-compressed images
|
|---|
| 277 | ----------------------------------------------
|
|---|
| 278 |
|
|---|
| 279 | JPEG compression works in either strip-based or tile-based TIFF files.
|
|---|
| 280 | Rather than repeating "strip or tile" constantly, we will use the term
|
|---|
| 281 | "segment" to mean either a strip or a tile.
|
|---|
| 282 |
|
|---|
| 283 | When the Compression field has the value 7, each image segment contains
|
|---|
| 284 | a complete JPEG datastream which is valid according to the ISO JPEG
|
|---|
| 285 | standard (ISO/IEC 10918-1). Any sequential JPEG process can be used,
|
|---|
| 286 | including lossless JPEG, but progressive and hierarchical processes are not
|
|---|
| 287 | supported. Since JPEG is useful only for continuous-tone images, the
|
|---|
| 288 | PhotometricInterpretation of the image shall not be 3 (palette color) nor
|
|---|
| 289 | 4 (transparency mask). The bit depth of the data is also restricted as
|
|---|
| 290 | specified below.
|
|---|
| 291 |
|
|---|
| 292 | Each image segment in a JPEG-compressed TIFF file shall contain a valid
|
|---|
| 293 | JPEG datastream according to the ISO JPEG standard's rules for
|
|---|
| 294 | interchange-format or abbreviated-image-format data. The datastream shall
|
|---|
| 295 | contain a single JPEG frame storing that segment of the image. The
|
|---|
| 296 | required JPEG markers within a segment are:
|
|---|
| 297 | SOI (must appear at very beginning of segment)
|
|---|
| 298 | SOFn
|
|---|
| 299 | SOS (one for each scan, if there is more than one scan)
|
|---|
| 300 | EOI (must appear at very end of segment)
|
|---|
| 301 | The actual compressed data follows SOS; it may contain RSTn markers if DRI
|
|---|
| 302 | is used.
|
|---|
| 303 |
|
|---|
| 304 | Additional JPEG "tables and miscellaneous" markers may appear between SOI
|
|---|
| 305 | and SOFn, between SOFn and SOS, and before each subsequent SOS if there is
|
|---|
| 306 | more than one scan. These markers include:
|
|---|
| 307 | DQT
|
|---|
| 308 | DHT
|
|---|
| 309 | DAC (not to appear unless arithmetic coding is used)
|
|---|
| 310 | DRI
|
|---|
| 311 | APPn (shall be ignored by TIFF readers)
|
|---|
| 312 | COM (shall be ignored by TIFF readers)
|
|---|
| 313 | DNL markers shall not be used in TIFF files. Readers should abort if any
|
|---|
| 314 | other marker type is found, especially the JPEG reserved markers;
|
|---|
| 315 | occurrence of such a marker is likely to indicate a JPEG extension.
|
|---|
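The marker rules above can be checked with a short scan of the segment header. The following Python sketch is illustrative only: it walks the marker segments from SOI up to the first SOS and rejects anything outside the permitted set. The function name `header_markers` is hypothetical, the set of accepted SOFn codes is an assumption (SOF0, SOF1, SOF3, SOF9, SOF11), and optional 0xFF fill bytes before markers are not handled.

```python
import struct

SOI, SOS = 0xD8, 0xDA
# Assumed SOFn set: SOF0, SOF1, SOF3, plus SOF9 and SOF11 for arithmetic coding.
SOF_ALLOWED = {0xC0, 0xC1, 0xC3, 0xC9, 0xCB}
# DQT, DHT, DAC, DRI, COM, and APP0..APP15.
TABLES_MISC = {0xDB, 0xC4, 0xCC, 0xDD, 0xFE} | set(range(0xE0, 0xF0))


def header_markers(stream: bytes) -> list:
    """Return the marker codes from SOI up to and including the first SOS.

    Raises ValueError for a missing SOI/EOI or for any marker type that
    is not permitted (for example DNL, 0xDC).
    """
    if stream[:2] != b"\xff\xd8":
        raise ValueError("segment must begin with SOI")
    if stream[-2:] != b"\xff\xd9":
        raise ValueError("segment must end with EOI")
    found = [SOI]
    pos = 2
    while True:
        if pos + 4 > len(stream) or stream[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = stream[pos + 1]
        if code not in SOF_ALLOWED | TABLES_MISC | {SOS}:
            raise ValueError("unsupported marker 0xFF%02X" % code)
        found.append(code)
        if code == SOS:
            return found
        # Marker segment lengths are 2-byte MSB-first values that
        # include the length field itself.
        (length,) = struct.unpack(">H", stream[pos + 2:pos + 4])
        pos += 2 + length
```

A fuller reader would go on to parse the entropy-coded data after SOS, where only RSTn markers (and further scans) may occur.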
| 316 |
|
|---|
| 317 | The tables/miscellaneous markers may appear in any order. Readers are
|
|---|
| 318 | cautioned that although the SOFn marker refers to DQT tables, JPEG does not
|
|---|
| 319 | require those tables to precede the SOFn, only the SOS. Missing-table
|
|---|
| 320 | checks should be made when SOS is reached.
|
|---|
| 321 |
|
|---|
| 322 | If no JPEGTables field is used, then each image segment shall be a complete
|
|---|
| 323 | JPEG interchange datastream. Each segment must define all the tables it
|
|---|
| 324 | references. To allow readers to decode segments in any order, no segment
|
|---|
| 325 | may rely on tables being carried over from a previous segment.
|
|---|
| 326 |
|
|---|
| 327 | When a JPEGTables field is used, image segments may omit tables that have
|
|---|
| 328 | been specified in the JPEGTables field. Further details appear below.
|
|---|
| 329 |
|
|---|
| 330 | The SOFn marker shall be of type SOF0 for strict baseline JPEG data, of
|
|---|
| 331 | type SOF1 for non-baseline lossy JPEG data, or of type SOF3 for lossless
|
|---|
| 332 | JPEG data. (SOF9 or SOF11 would be used for arithmetic coding.) All
|
|---|
| 333 | segments of a JPEG-compressed TIFF image shall use the same JPEG
|
|---|
| 334 | compression process, in particular the same SOFn type.
|
|---|
| 335 |
|
|---|
| 336 | The data precision field of the SOFn marker shall agree with the TIFF
|
|---|
| 337 | BitsPerSample field. (Note that when PlanarConfiguration=1, this implies
|
|---|
| 338 | that all components must have the same BitsPerSample value; when
|
|---|
| 339 | PlanarConfiguration=2, different components could have different bit
|
|---|
| 340 | depths.) For SOF0 only precision 8 is permitted; for SOF1, precision 8 or
|
|---|
| 341 | 12 is permitted; for SOF3, precisions 2 to 16 are permitted.
|
|---|
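The precision rule above amounts to a small lookup. A minimal Python sketch follows; the names `ALLOWED_PRECISIONS` and `precision_ok` are hypothetical, and `bits_per_sample` is assumed to be the single BitsPerSample value that applies to the components stored in the segment.

```python
# Permitted SOFn data precisions, as listed in the paragraph above.
ALLOWED_PRECISIONS = {
    0: {8},                 # SOF0: strict baseline
    1: {8, 12},             # SOF1: non-baseline lossy
    3: set(range(2, 17)),   # SOF3: lossless, precisions 2..16
}


def precision_ok(sof_type: int, precision: int, bits_per_sample: int) -> bool:
    """Check an SOFn precision against the SOF type and BitsPerSample."""
    allowed = ALLOWED_PRECISIONS.get(sof_type)
    if allowed is None:
        raise ValueError("SOF type %d is not covered by this sketch" % sof_type)
    return precision in allowed and precision == bits_per_sample


print(precision_ok(0, 8, 8))     # True
print(precision_ok(0, 12, 12))   # False: baseline allows only 8 bits
```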
| 342 |
|
|---|
| 343 | The image dimensions given in the SOFn marker shall agree with the logical
|
|---|
| 344 | dimensions of that particular strip or tile. For strip images, the SOFn
|
|---|
| 345 | image width shall equal ImageWidth and the height shall equal RowsPerStrip,
|
|---|
| 346 | except in the last strip; its SOFn height shall equal the number of rows
|
|---|
| 347 | remaining in the ImageLength. (In other words, no padding data is counted
|
|---|
| 348 | in the SOFn dimensions.) For tile images, each SOFn shall have width
|
|---|
| 349 | TileWidth and height TileLength; adding and removing any padding needed in
|
|---|
| 350 | the edge tiles is the concern of some higher level of the TIFF software.
|
|---|
| 351 | (The dimensional rules are slightly different when PlanarConfiguration=2,
|
|---|
| 352 | as described below.)
|
|---|
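For strip images with PlanarConfiguration=1, the SOFn dimensions described above can be computed as in the following Python sketch. The function name and parameter names are hypothetical; the parameters correspond to the ImageWidth, ImageLength, and RowsPerStrip fields.

```python
def strip_sof_dimensions(image_width: int, image_length: int,
                         rows_per_strip: int, strip_index: int):
    """Return (width, height) expected in the SOFn marker of one strip."""
    strips = -(-image_length // rows_per_strip)   # ceiling division
    if not 0 <= strip_index < strips:
        raise IndexError("strip index out of range")
    rows_before = strip_index * rows_per_strip
    # The last strip holds only the rows that remain; no padding is counted.
    height = min(rows_per_strip, image_length - rows_before)
    return image_width, height


print(strip_sof_dimensions(100, 40, 16, 0))   # (100, 16)
print(strip_sof_dimensions(100, 40, 16, 2))   # (100, 8): last strip
```

A RowsPerStrip value larger than ImageLength yields a single strip whose SOFn height equals ImageLength.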
| 353 |
|
|---|
| 354 | The ISO JPEG standard only permits images up to 65535 pixels in width or
|
|---|
| 355 | height, due to 2-byte fields in the SOFn markers. In TIFF, this limits
|
|---|
| 356 | the size of an individual JPEG-compressed strip or tile, but the total
|
|---|
| 357 | image size can be greater.
|
|---|
| 358 |
|
|---|
| 359 | The number of components in the JPEG datastream shall equal SamplesPerPixel
|
|---|
| 360 | for PlanarConfiguration=1, and shall be 1 for PlanarConfiguration=2. The
|
|---|
| 361 | components shall be stored in the same order as they are described at the
|
|---|
| 362 | TIFF field level. (This applies both to their order in the SOFn marker,
|
|---|
| 363 | and to the order in which they are scanned if multiple JPEG scans are
|
|---|
| 364 | used.) The component ID bytes are arbitrary so long as each component
|
|---|
| 365 | within an image segment is given a distinct ID. To avoid any possible
|
|---|
| 366 | confusion, we require that all segments of a TIFF image use the same ID
|
|---|
| 367 | code for a given component.
|
|---|
| 368 |
|
|---|
| 369 | In PlanarConfiguration 1, the sampling factors given in SOFn markers shall
|
|---|
| 370 | agree with the sampling factors defined by the related TIFF fields (or with
|
|---|
| 371 | the default values that are specified in the absence of those fields).
|
|---|
| 372 |
|
|---|
| 373 | When DCT-based JPEG is used in a strip TIFF file, RowsPerStrip is required
|
|---|
| 374 | to be a multiple of 8 times the largest vertical sampling factor, i.e., a
|
|---|
| 375 | multiple of the height of an interleaved MCU. (For simplicity of
|
|---|
| 376 | specification, we require this even if the data is not actually
|
|---|
| 377 | interleaved.) For example, if YCbCrSubSampling = [2,2] then RowsPerStrip
|
|---|
| 378 | must be a multiple of 16. An exception to this rule is made for
|
|---|
| 379 | single-strip images (RowsPerStrip >= ImageLength): the exact value of
|
|---|
| 380 | RowsPerStrip is unimportant in that case. This rule ensures that no data
|
|---|
| 381 | padding is needed at the bottom of a strip, except perhaps the last strip.
|
|---|
| 382 | Any padding required at the right edge of the image, or at the bottom of
|
|---|
| 383 | the last strip, is expected to occur internally to the JPEG codec.
|
|---|
| 384 |
|
|---|
| 385 | When DCT-based JPEG is used in a tiled TIFF file, TileLength is required
|
|---|
| 386 | to be a multiple of 8 times the largest vertical sampling factor, i.e.,
|
|---|
| 387 | a multiple of the height of an interleaved MCU; and TileWidth is required
|
|---|
| 388 | to be a multiple of 8 times the largest horizontal sampling factor, i.e.,
|
|---|
| 389 | a multiple of the width of an interleaved MCU. (For simplicity of
|
|---|
| 390 | specification, we require this even if the data is not actually
|
|---|
| 391 | interleaved.) All edge padding required will therefore occur in the course
|
|---|
| 392 | of normal TIFF tile padding; it is not special to JPEG.
|
|---|
| 393 |
|
|---|
| 394 | Lossless JPEG does not impose these constraints on strip and tile sizes,
|
|---|
| 395 | since it is not DCT-based.
|
|---|
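The strip and tile size rules of the last three paragraphs can be summarized in a short Python sketch. The function names are hypothetical; `max_h` and `max_v` stand for the largest horizontal and vertical sampling factors, and `lossless` marks a non-DCT (SOF3) file, which is exempt from the constraints.

```python
def strip_rows_valid(rows_per_strip: int, image_length: int,
                     max_v: int, lossless: bool = False) -> bool:
    """RowsPerStrip must be a multiple of the interleaved MCU height."""
    if lossless or rows_per_strip >= image_length:
        return True   # lossless JPEG and single-strip images are exempt
    return rows_per_strip % (8 * max_v) == 0


def tile_size_valid(tile_width: int, tile_length: int,
                    max_h: int, max_v: int, lossless: bool = False) -> bool:
    """TileWidth and TileLength must be multiples of the MCU width and height."""
    if lossless:
        return True
    return tile_width % (8 * max_h) == 0 and tile_length % (8 * max_v) == 0


# YCbCrSubSampling = [2,2]: strips must be a multiple of 16 rows.
print(strip_rows_valid(16, 100, 2))   # True
print(strip_rows_valid(8, 100, 2))    # False
```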
| 396 |
|
|---|
| 397 | Note that within JPEG datastreams, multibyte values appear in the MSB-first
|
|---|
| 398 | order specified by the JPEG standard, regardless of the byte ordering of
|
|---|
| 399 | the surrounding TIFF file.
|
|---|
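As an illustration of the byte-order point, the following Python sketch reads the fixed part of an SOFn marker segment using an explicit big-endian format, independent of the TIFF file's byte order. The function name `parse_sof_header` is hypothetical; the field layout (2-byte length, 1-byte precision, 2-byte height, 2-byte width, 1-byte component count) is that of the JPEG frame header.

```python
import struct


def parse_sof_header(segment: bytes):
    """Parse the fixed part of an SOFn segment.

    `segment` starts right after the 2-byte marker code, that is, at the
    length field. Returns (precision, height, width, components).
    """
    # ">" forces MSB-first decoding no matter how the TIFF file is ordered.
    _length, precision, height, width, components = struct.unpack(
        ">HBHHB", segment[:8])
    return precision, height, width, components


# Hypothetical header: 8-bit data, 16 rows, 100 columns, 3 components.
sof = b"\x00\x11\x08\x00\x10\x00\x64\x03"
print(parse_sof_header(sof))   # (8, 16, 100, 3)
```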
| 400 |
|
|---|
| 401 |
|
|---|
| 402 | JPEGTables field
|
|---|
| 403 | ----------------
|
|---|
| 404 |
|
|---|
| 405 | The only auxiliary TIFF field added for Compression=7 is the optional
|
|---|
| 406 | JPEGTables field. The purpose of JPEGTables is to predefine JPEG
|
|---|
| 407 | quantization and/or Huffman tables for subsequent use by JPEG image
|
|---|
| 408 | segments. When this is done, these rather bulky tables need not be
|
|---|
| 409 | duplicated in each segment, thus saving space and processing time.
|
|---|
| 410 | JPEGTables may be used even in a single-segment file, although there is no
|
|---|
| 411 | space savings in that case.
|
|---|
| 412 |
|
|---|
| 413 | JPEGTables:
|
|---|
| 414 | Tag = 347 (15B.H)
|
|---|
| 415 | Type = UNDEFINED
|
|---|
| 416 | N = number of bytes in tables datastream, typically a few hundred
|
|---|
| 417 | JPEGTables provides default JPEG quantization and/or Huffman tables which
|
|---|
| 418 | are used whenever a segment datastream does not contain its own tables, as
|
|---|
| 419 | specified below.
|
|---|
| 420 |
|
|---|
| 421 | Notice that the JPEGTables field is required to have type code UNDEFINED,
|
|---|
| 422 | not type code BYTE. This is to cue readers that expanding individual bytes
|
|---|
| 423 | to short or long integers is not appropriate. A TIFF reader will generally
|
|---|
| 424 | need to store the field value as an uninterpreted byte sequence until it is
|
|---|
| 425 | fed to the JPEG decoder.
|
|---|
| 426 |
|
|---|
| 427 | Multibyte quantities within the tables follow the ISO JPEG convention of
|
|---|
| 428 | MSB-first storage, regardless of the byte ordering of the surrounding TIFF
|
|---|
| 429 | file.
|
|---|
| 430 |
|
|---|
| 431 | When the JPEGTables field is present, it shall contain a valid JPEG
|
|---|
| 432 | "abbreviated table specification" datastream. This datastream shall begin
|
|---|
| 433 | with SOI and end with EOI. It may contain zero or more JPEG "tables and
|
|---|
| 434 | miscellaneous" markers, namely:
|
|---|
| 435 | DQT
|
|---|
| 436 | DHT
|
|---|
| 437 | DAC (not to appear unless arithmetic coding is used)
|
|---|
| 438 | DRI
|
|---|
| 439 | APPn (shall be ignored by TIFF readers)
|
|---|
| 440 | COM (shall be ignored by TIFF readers)
|
|---|
| 441 | Since JPEG defines the SOI marker to reset the DAC and DRI state, these two
|
|---|
| 442 | markers' values cannot be carried over into any image datastream, and thus
|
|---|
| 443 | they are effectively no-ops in the JPEGTables field. To avoid confusion,
|
|---|
| 444 | it is recommended that writers not place DAC or DRI markers in JPEGTables.
|
|---|
| 445 | However, readers must properly skip over them if they appear.
|
|---|
| 446 |
|
|---|
| 447 | When JPEGTables is present, readers shall load the table specifications
|
|---|
| 448 | contained in JPEGTables before processing image segment datastreams.
|
|---|
| 449 | Image segments may simply refer to these preloaded tables without defining
|
|---|
| 450 | them. An image segment can still define and use its own tables, subject to
|
|---|
| 451 | the restrictions below.
|
|---|
| 452 |
|
|---|
| 453 | An image segment may not redefine any table defined in JPEGTables. (This
|
|---|
| 454 | restriction is imposed to allow readers to process image segments in random
|
|---|
| 455 | order without having to reload JPEGTables between segments.) Therefore, use
|
|---|
| 456 | of JPEGTables divides the available table slots into two groups: "global"
|
|---|
| 457 | slots are defined in JPEGTables and may be used but not redefined by
|
|---|
| 458 | segments; "local" slots are available for local definition and use in each
|
|---|
| 459 | segment. To permit random access, a segment may not reference any local
|
|---|
| 460 | tables that it does not itself define.
|
|---|
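The "global" versus "local" slot rule can be checked mechanically. The Python sketch below is a rough illustration, not a complete validator: it collects the quantization and Huffman table slots defined by DQT and DHT markers in a datastream header, then reports any slot that a segment redefines. The function names are hypothetical, fill bytes and malformed lengths are not handled, and the walk stops at the first SOS or EOI.

```python
import struct


def defined_slots(stream: bytes) -> set:
    """Return the set of table slots defined before the first SOS/EOI.

    Slots are tuples such as ("Q", 0), ("DC", 1), or ("AC", 0).
    """
    slots = set()
    pos = 2                                   # skip SOI
    while pos + 2 <= len(stream):
        code = stream[pos + 1]
        if code in (0xDA, 0xD9):              # SOS or EOI ends the header
            break
        (length,) = struct.unpack(">H", stream[pos + 2:pos + 4])
        body = stream[pos + 4:pos + 2 + length]
        i = 0
        if code == 0xDB:                      # DQT: one or more tables
            while i < len(body):
                pq, tq = body[i] >> 4, body[i] & 0x0F
                slots.add(("Q", tq))
                i += 1 + (128 if pq else 64)  # 16-bit or 8-bit entries
        elif code == 0xC4:                    # DHT: one or more tables
            while i < len(body):
                tc, th = body[i] >> 4, body[i] & 0x0F
                slots.add(("AC" if tc else "DC", th))
                i += 17 + sum(body[i + 1:i + 17])
        pos += 2 + length
    return slots


def redefined_slots(jpeg_tables: bytes, segment: bytes) -> set:
    """Slots defined in both JPEGTables and a segment (should be empty)."""
    return defined_slots(jpeg_tables) & defined_slots(segment)
```

A writer could run such a check on each segment before emitting it; an empty result means the segment defines only local slots.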
| 461 |
|
|---|
| 462 |
|
|---|
| 463 | Special considerations for PlanarConfiguration 2
|
|---|
| 464 | ------------------------------------------------
|
|---|
| 465 |
|
|---|
| 466 | In PlanarConfiguration 2, each image segment contains data for only one
|
|---|
| 467 | color component. To avoid confusing the JPEG codec, we wish the segments
|
|---|
| 468 | to look like valid single-channel (i.e., grayscale) JPEG datastreams. This
|
|---|
| 469 | means that different rules must be used for the SOFn parameters.
|
|---|
| 470 |
|
|---|
| 471 | In PlanarConfiguration 2, the dimensions given in the SOFn of a subsampled
|
|---|
| 472 | component shall be scaled down by the sampling factors compared to the SOFn
|
|---|
| 473 | dimensions that would be used in PlanarConfiguration 1. This is necessary
|
|---|
| 474 | to match the actual number of samples stored in that segment, so that the
|
|---|
| 475 | JPEG codec doesn't complain about too much or too little data. In strip
|
|---|
| 476 | TIFF files the computed dimensions may need to be rounded up to the next
|
|---|
| 477 | integer; in tiled files, the restrictions on tile size make this case
|
|---|
| 478 | impossible.
|
|---|
| 479 |
|
|---|
| 480 | Furthermore, all SOFn sampling factors shall be given as 1. (This is
|
|---|
| 481 | merely to avoid confusion, since the sampling factors in a single-channel
|
|---|
| 482 | JPEG datastream have no real effect.)
|
|---|
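
As an illustration of the two rules above, the sketch below computes the SOFn parameters for one PlanarConfiguration 2 segment. The function and argument names are hypothetical. It assumes the caller already knows the dimensions that the SOFn would carry in PlanarConfiguration 1, and it rounds up as described for strip files.

```python
def planar2_sofn_params(seg_width, seg_height, h_sub, v_sub, is_subsampled):
    """SOFn parameters for a single-component PlanarConfiguration 2 segment.

    seg_width, seg_height: SOFn dimensions that PlanarConfiguration 1
        would use for this segment (in luminance samples).
    h_sub, v_sub: the two YCbCrSubSampling values.
    is_subsampled: True for a subsampled (chroma) component.
    """
    if is_subsampled:
        # Ceiling division: round up to the next integer.
        width = -(-seg_width // h_sub)
        height = -(-seg_height // v_sub)
    else:
        width, height = seg_width, seg_height
    # All SOFn sampling factors are given as 1 in PlanarConfiguration 2.
    return {"width": width, "height": height, "h_factor": 1, "v_factor": 1}
```

For example, a final strip of 101 x 15 luminance samples with YCbCrSubSampling = [2,2] gives a chroma segment whose SOFn declares 51 x 8 samples.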
| 483 |
|
|---|
| 484 | Any downsampling will need to happen externally to the JPEG codec, since
|
|---|
| 485 | JPEG sampling factors are defined with reference to the full-precision
|
|---|
| 486 | component. In PlanarConfiguration 2, the JPEG codec will be working on
|
|---|
| 487 | only one component at a time and thus will have no reference component to
|
|---|
| 488 | downsample against.
|
|---|
| 489 |
|
|---|
| 490 |
|
|---|
| 491 | Minimum requirements for TIFF/JPEG
|
|---|
| 492 | ----------------------------------
|
|---|
| 493 |
|
|---|
| 494 | ISO JPEG is a large and complex standard; most implementations support only
|
|---|
| 495 | a subset of it. Here we define a "core" subset of TIFF/JPEG which readers
|
|---|
| 496 | must support to claim TIFF/JPEG compatibility. For maximum
|
|---|
| 497 | cross-application compatibility, we recommend that writers confine
|
|---|
| 498 | themselves to this subset unless there is very good reason to do otherwise.
|
|---|
| 499 |
|
|---|
| 500 | Use the ISO baseline JPEG process: 8-bit data precision, Huffman coding,
|
|---|
| 501 | with no more than 2 DC and 2 AC Huffman tables. Note that this implies
|
|---|
| 502 | BitsPerSample = 8 for each component. We recommend deviating from baseline
|
|---|
| 503 | JPEG only if 12-bit data precision or lossless coding is required.
|
|---|
| 504 |
|
|---|
| 505 | Use no subsampling (all JPEG sampling factors = 1) for color spaces other
|
|---|
| 506 | than YCbCr. (This is, in fact, required with the TIFF 6.0 field
|
|---|
| 507 | definitions, but may not be so in future revisions.) For YCbCr, use one of
|
|---|
| 508 | the following choices:
|
|---|
| 509 | YCbCrSubSampling field     JPEG sampling factors
|
|---|
| 510 | 1,1                        1h1v, 1h1v, 1h1v
|
|---|
| 511 | 2,1                        2h1v, 1h1v, 1h1v
|
|---|
| 512 | 2,2 (default value)        2h2v, 1h1v, 1h1v
|
|---|
| 513 | We recommend that RGB source data be converted to YCbCr for best compression
|
|---|
| 514 | results. Other source data colorspaces should probably be left alone.
|
|---|
| 515 | Minimal readers need not support JPEG images with colorspaces other than
|
|---|
| 516 | YCbCr and grayscale (PhotometricInterpretation = 6 or 1).
|
|---|
| 517 |
|
|---|
| 518 | A minimal reader also need not support JPEG YCbCr images with nondefault
|
|---|
| 519 | values of YCbCrCoefficients or YCbCrPositioning, nor with values of
|
|---|
| 520 | ReferenceBlackWhite other than [0,255,128,255,128,255]. (These values
|
|---|
| 521 | correspond to the RGB<=>YCbCr conversion specified by JFIF, which is widely
|
|---|
| 522 | implemented in JPEG codecs.)
|
|---|
| 523 |
|
|---|
| 524 | Writers are reminded that a ReferenceBlackWhite field *must* be included
|
|---|
| 525 | when PhotometricInterpretation is YCbCr, because the default
|
|---|
| 526 | ReferenceBlackWhite values are inappropriate for YCbCr.
|
|---|
| 527 |
|
|---|
| 528 | If any subsampling is used, PlanarConfiguration=1 is preferred to avoid the
|
|---|
| 529 | possibly confusing requirements of PlanarConfiguration=2. In any case,
|
|---|
| 530 | readers are not required to support PlanarConfiguration=2.
|
|---|
| 531 |
|
|---|
| 532 | If possible, use a single interleaved scan in each image segment. This is
|
|---|
| 533 | not legal JPEG if there are more than 4 SamplesPerPixel or if the sampling
|
|---|
| 534 | factors are such that more than 10 blocks would be needed per MCU; in that
|
|---|
| 535 | case, use a separate scan for each component. (The recommended color
|
|---|
| 536 | spaces and sampling factors will not run into that restriction, so a
|
|---|
| 537 | minimal reader need not support more than one scan per segment.)
|
|---|
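
The single-scan restriction above reduces to a small arithmetic test. This sketch is illustrative only and the function name is an assumption. It counts blocks per MCU as the sum of h*v over the components of the scan, and treats a one-component scan as always acceptable, since such a scan is non-interleaved and its MCU is a single block.

```python
def can_use_single_interleaved_scan(sampling_factors):
    """Return True if all components fit in one scan.

    sampling_factors: list of (h, v) JPEG sampling factors, one pair
    per component, in component order.
    """
    if len(sampling_factors) == 1:
        # A one-component scan is non-interleaved; the limit does not apply.
        return True
    blocks_per_mcu = sum(h * v for h, v in sampling_factors)
    # At most 4 components per scan and at most 10 blocks per MCU.
    return len(sampling_factors) <= 4 and blocks_per_mcu <= 10
```

The recommended YCbCr choices need 3, 4, and 6 blocks per MCU respectively, so none of them comes near the limit.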
| 538 |
|
|---|
| 539 | To claim TIFF/JPEG compatibility, readers shall support multiple-strip TIFF
|
|---|
| 540 | files and the optional JPEGTables field; it is not acceptable to read only
|
|---|
| 541 | single-datastream files. Support for tiled TIFF files is strongly
|
|---|
| 542 | recommended but not required.
|
|---|
| 543 |
|
|---|
| 544 |
|
|---|
| 545 | Other recommendations for implementors
|
|---|
| 546 | --------------------------------------
|
|---|
| 547 |
|
|---|
| 548 | The TIFF tag Compression=7 guarantees only that the compressed data is
|
|---|
| 549 | represented as ISO JPEG datastreams. Since JPEG is a large and evolving
|
|---|
| 550 | standard, readers should apply careful error checking to the JPEG markers
|
|---|
| 551 | to ensure that the compression process is within their capabilities. In
|
|---|
| 552 | particular, to avoid being confused by future extensions to the JPEG
|
|---|
| 553 | standard, it is important to abort if unknown marker codes are seen.
|
|---|
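
A reader might apply the marker check described above along the following lines. This is a sketch under stated assumptions, not a complete validator: the set of accepted markers models a hypothetical baseline-only reader, the names are invented for illustration, and only the markers between SOI and the first SOS are examined.

```python
import struct

SOS = 0xDA

# Markers accepted by a hypothetical baseline-only reader (an assumption).
# APPn and COM are accepted here only so that they can be skipped.
KNOWN = {0xC0: "SOF0", 0xC4: "DHT", 0xDB: "DQT", 0xDD: "DRI",
         0xDA: "SOS", 0xFE: "COM"}
KNOWN.update({0xE0 + n: f"APP{n}" for n in range(16)})


class UnsupportedJPEG(Exception):
    """Raised when the datastream is outside the reader's capabilities."""


def check_header_markers(data):
    """Walk the marker segments from SOI to the first SOS.

    Returns the list of marker names seen; raises UnsupportedJPEG on
    any marker code that is not in KNOWN.
    """
    if data[:2] != b"\xff\xd8":
        raise UnsupportedJPEG("datastream does not start with SOI")
    pos, seen = 2, []
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise UnsupportedJPEG(f"expected a marker at offset {pos}")
        code = data[pos + 1]
        if code == 0xFF:
            # Optional fill byte preceding a marker.
            pos += 1
            continue
        if code not in KNOWN:
            raise UnsupportedJPEG(f"unknown or unsupported marker 0xFF{code:02X}")
        seen.append(KNOWN[code])
        if code == SOS:
            return seen
        # The length field counts itself but not the two marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
    raise UnsupportedJPEG("no SOS marker found")
```

Because all segments are required to use the same JPEG process, running such a check on one segment (plus JPEGTables, if present) is normally enough to decide whether the image can be handled.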
| 554 |
|
|---|
| 555 | The point of requiring that all image segments use the same JPEG process is
|
|---|
| 556 | to ensure that a reader need check only one segment to determine whether it
|
|---|
| 557 | can handle the image. For example, consider a TIFF reader that has access
|
|---|
| 558 | to fast but restricted JPEG hardware, as well as a slower, more general
|
|---|
| 559 | software implementation. It is desirable to check only one image segment
|
|---|
| 560 | to find out whether the fast hardware can be used. Thus, writers should
|
|---|
| 561 | try to ensure that all segments of an image look as much "alike" as
|
|---|
| 562 | possible: there should be no variation in scan layout, use of options such
|
|---|
| 563 | as DRI, etc. Ideally, segments will be processed identically except
|
|---|
| 564 | perhaps for using different local quantization or entropy-coding tables.
|
|---|
| 565 |
|
|---|
| 566 | Writers should avoid including "noise" JPEG markers (COM and APPn markers).
|
|---|
| 567 | Standard TIFF fields provide a better way to transport any non-image data.
|
|---|
| 568 | Some JPEG codecs may change behavior if they see an APPn marker they
|
|---|
| 569 | think they understand; since the TIFF spec requires these markers to be
|
|---|
| 570 | ignored, this behavior is undesirable.
|
|---|
| 571 |
|
|---|
| 572 | It is possible to convert an interchange-JPEG file (e.g., a JFIF file) to
|
|---|
| 573 | TIFF simply by dropping the interchange datastream into a single strip.
|
|---|
| 574 | (However, designers are reminded that the TIFF spec discourages huge
|
|---|
| 575 | strips; splitting the image is somewhat more work but may give better
|
|---|
| 576 | results.) Conversion from TIFF to interchange JPEG is more complex. A
|
|---|
| 577 | strip-based TIFF/JPEG file can be converted fairly easily if all strips use
|
|---|
| 578 | identical JPEG tables and no RSTn markers: just delete the overhead markers
|
|---|
| 579 | and insert RSTn markers between strips. Converting tiled images is harder,
|
|---|
| 580 | since the data will usually not be in the right order (unless the tiles are
|
|---|
| 581 | only one MCU high). This can still be done losslessly, but it will require
|
|---|
| 582 | undoing and redoing the entropy coding so that the DC coefficient
|
|---|
| 583 | differences can be updated.
|
|---|
| 584 |
|
|---|
| 585 | There is no default value for JPEGTables: standard TIFF files must define all
|
|---|
| 586 | tables that they reference. For some closed systems in which many files will
|
|---|
| 587 | have identical tables, it might make sense to define a default JPEGTables
|
|---|
| 588 | value to avoid actually storing the tables. Better still, define a
|
|---|
| 589 | private field selecting one of N default JPEGTables settings, so as to allow
|
|---|
| 590 | for future expansion. Either of these must be regarded as a private
|
|---|
| 591 | extension that will render the files unreadable by other applications.
|
|---|
| 592 |
|
|---|
| 593 |
|
|---|
| 594 | References
|
|---|
| 595 | ----------
|
|---|
| 596 |
|
|---|
| 597 | [1] Wallace, Gregory K. "The JPEG Still Picture Compression Standard",
|
|---|
| 598 | Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44.
|
|---|
| 599 |
|
|---|
| 600 | This is the best short technical introduction to the JPEG algorithms.
|
|---|
| 601 | It is a good overview but does not provide sufficiently detailed
|
|---|
| 602 | information to write an implementation.
|
|---|
| 603 |
|
|---|
| 604 | [2] Pennebaker, William B. and Mitchell, Joan L. "JPEG Still Image Data
|
|---|
| 605 | Compression Standard", Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1.
|
|---|
| 606 | 638pp.
|
|---|
| 607 |
|
|---|
| 608 | This textbook is by far the most complete exposition of JPEG in existence.
|
|---|
| 609 | It includes the full text of the ISO JPEG standards (DIS 10918-1 and draft
|
|---|
| 610 | DIS 10918-2). No would-be JPEG implementor should be without it.
|
|---|
| 611 |
|
|---|
| 612 | [3] ISO/IEC IS 10918-1, "Digital Compression and Coding of Continuous-tone
|
|---|
| 613 | Still Images, Part 1: Requirements and guidelines", February 1994.
|
|---|
| 614 | ISO/IEC DIS 10918-2, "Digital Compression and Coding of Continuous-tone
|
|---|
| 615 | Still Images, Part 2: Compliance testing", final approval expected 1994.
|
|---|
| 616 |
|
|---|
| 617 | These are the official standards documents. Note that the Pennebaker and
|
|---|
| 618 | Mitchell textbook is likely to be cheaper and more useful than the official
|
|---|
| 619 | standards.
|
|---|
| 620 |
|
|---|
| 621 |
|
|---|
| 622 | Changes to Section 21: YCbCr Images
|
|---|
| 623 | ===================================
|
|---|
| 624 |
|
|---|
| 625 | [This section of the Tech Note clarifies section 21 to make clear the
|
|---|
| 626 | interpretation of image dimensions in a subsampled image. Furthermore,
|
|---|
| 627 | the section is changed to allow the original image dimensions not to be
|
|---|
| 628 | multiples of the sampling factors. This change is necessary to support use
|
|---|
| 629 | of JPEG compression on odd-size images.]
|
|---|
| 630 |
|
|---|
| 631 | Add the following paragraphs to the Section 21 introduction (p. 89),
|
|---|
| 632 | just after the paragraph beginning "When a Class Y image is subsampled":
|
|---|
| 633 |
|
|---|
| 634 | In a subsampled image, it is understood that all TIFF image
|
|---|
| 635 | dimensions are measured in terms of the highest-resolution
|
|---|
| 636 | (luminance) component. In particular, ImageWidth, ImageLength,
|
|---|
| 637 | RowsPerStrip, TileWidth, TileLength, XResolution, and YResolution
|
|---|
| 638 | are measured in luminance samples.
|
|---|
| 639 |
|
|---|
| 640 | RowsPerStrip, TileWidth, and TileLength are constrained so that
|
|---|
| 641 | there are an integral number of samples of each component in a
|
|---|
| 642 | complete strip or tile. However, ImageWidth/ImageLength are not
|
|---|
| 643 | constrained. If an odd-size image is to be converted to subsampled
|
|---|
| 644 | format, the writer should pad the source data to a multiple of the
|
|---|
| 645 | sampling factors by replication of the last column and/or row, then
|
|---|
| 646 | downsample. The number of luminance samples actually stored in the
|
|---|
| 647 | file will be a multiple of the sampling factors. Conversely,
|
|---|
| 648 | readers must ignore any extra data (outside the specified image
|
|---|
| 649 | dimensions) after upsampling.
|
|---|
| 650 |
|
|---|
| 651 | When PlanarConfiguration=2, each strip or tile covers the same
|
|---|
| 652 | image area despite subsampling; that is, the total number of strips
|
|---|
| 653 | or tiles in the image is the same for each component. Therefore
|
|---|
| 654 | strips or tiles of the subsampled components contain fewer samples
|
|---|
| 655 | than strips or tiles of the luminance component.
|
|---|
| 656 |
|
|---|
| 657 | If there are extra samples per pixel (see field ExtraSamples),
|
|---|
| 658 | these data channels have the same number of samples as the
|
|---|
| 659 | luminance component.
|
|---|
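
[Illustration only; not part of the text to be added to Section 21.] The writer-side padding rule in the paragraphs above might be sketched as follows. The function names are hypothetical, a component is modeled as a list of rows of integer samples, and the box-average filter is an assumption, since the text does not prescribe a particular downsampling filter.

```python
def pad_by_replication(rows, h_sub, v_sub):
    """Pad a component to multiples of the sampling factors.

    The last column and/or row is replicated as needed.
    """
    height, width = len(rows), len(rows[0])
    padded_w = -(-width // h_sub) * h_sub    # round up to a multiple
    padded_h = -(-height // v_sub) * v_sub
    out = [row + [row[-1]] * (padded_w - width) for row in rows]
    out += [list(out[-1]) for _ in range(padded_h - height)]
    return out


def downsample_box(rows, h_sub, v_sub):
    """Average each h_sub x v_sub block (filter choice is an assumption)."""
    out = []
    for y in range(0, len(rows), v_sub):
        out_row = []
        for x in range(0, len(rows[0]), h_sub):
            block = [rows[y + dy][x + dx]
                     for dy in range(v_sub) for dx in range(h_sub)]
            # Rounded integer average of the block.
            out_row.append((sum(block) + len(block) // 2) // len(block))
        out.append(out_row)
    return out
```

A 3 x 3 chroma plane with YCbCrSubSampling = [2,2] is padded to 4 x 4 and then reduced to 2 x 2. A reader reverses the process by upsampling and discarding everything outside ImageWidth x ImageLength.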
| 660 |
|
|---|
| 661 | Rewrite the YCbCrSubSampling field description (pp. 91-92) as follows
|
|---|
| 662 | (largely to eliminate possibly misleading references to
|
|---|
| 663 | ImageWidth/ImageLength of the subsampled components):
|
|---|
| 664 |
|
|---|
| 665 | (first paragraph unchanged)
|
|---|
| 666 |
|
|---|
| 667 | The two elements of this field are defined as follows:
|
|---|
| 668 |
|
|---|
| 669 | Short 0: ChromaSubsampleHoriz:
|
|---|
| 670 |
|
|---|
| 671 | 1 = there are equal numbers of luma and chroma samples horizontally.
|
|---|
| 672 |
|
|---|
| 673 | 2 = there are twice as many luma samples as chroma samples
|
|---|
| 674 | horizontally.
|
|---|
| 675 |
|
|---|
| 676 | 4 = there are four times as many luma samples as chroma samples
|
|---|
| 677 | horizontally.
|
|---|
| 678 |
|
|---|
| 679 | Short 1: ChromaSubsampleVert:
|
|---|
| 680 |
|
|---|
| 681 | 1 = there are equal numbers of luma and chroma samples vertically.
|
|---|
| 682 |
|
|---|
| 683 | 2 = there are twice as many luma samples as chroma samples
|
|---|
| 684 | vertically.
|
|---|
| 685 |
|
|---|
| 686 | 4 = there are four times as many luma samples as chroma samples
|
|---|
| 687 | vertically.
|
|---|
| 688 |
|
|---|
| 689 | ChromaSubsampleVert shall always be less than or equal to
|
|---|
| 690 | ChromaSubsampleHoriz. Note that Cb and Cr have the same sampling
|
|---|
| 691 | ratios.
|
|---|
| 692 |
|
|---|
| 693 | In a strip TIFF file, RowsPerStrip is required to be an integer
|
|---|
| 694 | multiple of ChromaSubsampleVert (unless RowsPerStrip >=
|
|---|
| 695 | ImageLength, in which case its exact value is unimportant).
|
|---|
| 696 | If ImageWidth and ImageLength are not multiples of
|
|---|
| 697 | ChromaSubsampleHoriz and ChromaSubsampleVert respectively, then the
|
|---|
| 698 | source data shall be padded to the next integer multiple of these
|
|---|
| 699 | values before downsampling.
|
|---|
| 700 |
|
|---|
| 701 | In a tiled TIFF file, TileWidth must be an integer multiple of
|
|---|
| 702 | ChromaSubsampleHoriz and TileLength must be an integer multiple of
|
|---|
| 703 | ChromaSubsampleVert. Padding will occur to tile boundaries.
|
|---|
| 704 |
|
|---|
| 705 | The default values of this field are [ 2,2 ]. Thus, YCbCr data is
|
|---|
| 706 | downsampled by default!
|
|---|
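
[Illustration only; not part of the rewritten field description.] The layout constraints stated in the field description above can be collected into one check. This is a minimal sketch with hypothetical names. It assumes a file is either tiled (both tile dimensions given) or stripped (rows_per_strip given).

```python
def check_subsampling_layout(h_sub, v_sub, image_length,
                             rows_per_strip=None,
                             tile_width=None, tile_length=None):
    """Return a list of constraint violations for a subsampled YCbCr image.

    h_sub, v_sub: ChromaSubsampleHoriz and ChromaSubsampleVert.
    """
    problems = []
    if h_sub not in (1, 2, 4) or v_sub not in (1, 2, 4):
        return ["YCbCrSubSampling values must be 1, 2, or 4"]
    if v_sub > h_sub:
        problems.append("ChromaSubsampleVert exceeds ChromaSubsampleHoriz")
    if tile_width is not None and tile_length is not None:
        if tile_width % h_sub:
            problems.append("TileWidth is not a multiple of ChromaSubsampleHoriz")
        if tile_length % v_sub:
            problems.append("TileLength is not a multiple of ChromaSubsampleVert")
    elif rows_per_strip is not None:
        # The exact value is unimportant when one strip covers the image.
        if rows_per_strip < image_length and rows_per_strip % v_sub:
            problems.append("RowsPerStrip is not a multiple of ChromaSubsampleVert")
    return problems
```

ImageWidth and ImageLength are deliberately not checked, since they are unconstrained and the writer pads the source data instead.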
| 707 | </pre>
|
|---|