stb_image.h
Go to the documentation of this file.
00001 /* stb_image - v2.06 - public domain image loader - http://nothings.org/stb_image.h
00002                                      no warranty implied; use at your own risk
00003 
00004    Do this:
00005       #define STB_IMAGE_IMPLEMENTATION
00006    before you include this file in *one* C or C++ file to create the implementation.
00007 
00008    // i.e. it should look like this:
00009    #include ...
00010    #include ...
00011    #include ...
00012    #define STB_IMAGE_IMPLEMENTATION
00013    #include "stb_image.h"
00014 
00015    You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
00016    And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
00017 
00018 
00019    QUICK NOTES:
00020       Primarily of interest to game developers and other people who can
00021           avoid problematic images and only need the trivial interface
00022 
00023       JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
00024       PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
00025 
00026       TGA (not sure what subset, if a subset)
00027       BMP non-1bpp, non-RLE
00028       PSD (composited view only, no extra channels)
00029 
00030       GIF (*comp always reports as 4-channel)
00031       HDR (radiance rgbE format)
00032       PIC (Softimage PIC)
00033       PNM (PPM and PGM binary only)
00034 
00035       - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
00036       - decode from arbitrary I/O callbacks
00037       - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
00038 
00039    Full documentation under "DOCUMENTATION" below.
00040 
00041 
00042    Revision 2.00 release notes:
00043 
00044       - Progressive JPEG is now supported.
00045 
00046       - PPM and PGM binary formats are now supported, thanks to Ken Miller.
00047 
00048       - x86 platforms now make use of SSE2 SIMD instructions for
00049         JPEG decoding, and ARM platforms can use NEON SIMD if requested.
00050         This work was done by Fabian "ryg" Giesen. SSE2 is used by
00051         default, but NEON must be enabled explicitly; see docs.
00052 
00053         With other JPEG optimizations included in this version, we see
00054         2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
00055         on a JPEG on an ARM machine, relative to previous versions of this
00056         library. The same results will not obtain for all JPGs and for all
00057         x86/ARM machines. (Note that progressive JPEGs are significantly
00058         slower to decode than regular JPEGs.) This doesn't mean that this
00059         is the fastest JPEG decoder in the land; rather, it brings it
00060         closer to parity with standard libraries. If you want the fastest
00061         decode, look elsewhere. (See "Philosophy" section of docs below.)
00062 
00063         See final bullet items below for more info on SIMD.
00064 
00065       - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
00066         the memory allocator. Unlike other STBI libraries, these macros don't
00067         support a context parameter, so if you need to pass a context in to
00068         the allocator, you'll have to store it in a global or a thread-local
00069         variable.
00070 
00071       - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
00072         STBI_NO_LINEAR.
00073             STBI_NO_HDR:     suppress implementation of .hdr reader format
00074             STBI_NO_LINEAR:  suppress high-dynamic-range light-linear float API
00075 
00076       - You can suppress implementation of any of the decoders to reduce
00077         your code footprint by #defining one or more of the following
00078         symbols before creating the implementation.
00079 
00080             STBI_NO_JPEG
00081             STBI_NO_PNG
00082             STBI_NO_BMP
00083             STBI_NO_PSD
00084             STBI_NO_TGA
00085             STBI_NO_GIF
00086             STBI_NO_HDR
00087             STBI_NO_PIC
00088             STBI_NO_PNM   (.ppm and .pgm)
00089 
00090       - You can request *only* certain decoders and suppress all other ones
00091         (this will be more forward-compatible, as addition of new decoders
00092         doesn't require you to disable them explicitly):
00093 
00094             STBI_ONLY_JPEG
00095             STBI_ONLY_PNG
00096             STBI_ONLY_BMP
00097             STBI_ONLY_PSD
00098             STBI_ONLY_TGA
00099             STBI_ONLY_GIF
00100             STBI_ONLY_HDR
00101             STBI_ONLY_PIC
00102             STBI_ONLY_PNM   (.ppm and .pgm)
00103 
00104          Note that you can define multiples of these, and you will get all
00105          of them ("only x" and "only y" is interpreted to mean "only x&y").
00106 
00107        - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
00108          want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
00109 
00110       - Compilation of all SIMD code can be suppressed with
00111             #define STBI_NO_SIMD
00112         It should not be necessary to disable SIMD unless you have issues
00113         compiling (e.g. using an x86 compiler which doesn't support SSE
00114         intrinsics or that doesn't support the method used to detect
00115         SSE2 support at run-time), and even those can be reported as
00116         bugs so I can refine the built-in compile-time checking to be
00117         smarter.
00118 
00119       - The old STBI_SIMD system which allowed installing a user-defined
00120         IDCT etc. has been removed. If you need this, don't upgrade. My
00121         assumption is that almost nobody was doing this, and those who
00122         were will find the built-in SIMD more satisfactory anyway.
00123 
00124       - RGB values computed for JPEG images are slightly different from
00125         previous versions of stb_image. (This is due to using less
00126         integer precision in SIMD.) The C code has been adjusted so
00127         that the same RGB values will be computed regardless of whether
00128         SIMD support is available, so your app should always produce
00129         consistent results. But these results are slightly different from
00130         previous versions. (Specifically, about 3% of available YCbCr values
00131         will compute different RGB results from pre-1.49 versions by +-1;
00132         most of the deviating values are one smaller in the G channel.)
00133 
00134       - If you must produce consistent results with previous versions of
00135         stb_image, #define STBI_JPEG_OLD and you will get the same results
00136         you used to; however, you will not get the SIMD speedups for
00137         the YCbCr-to-RGB conversion step (although you should still see
00138         significant JPEG speedup from the other changes).
00139 
00140         Please note that STBI_JPEG_OLD is a temporary feature; it will be
00141         removed in future versions of the library. It is only intended for
00142         near-term back-compatibility use.
00143 
00144 
00145    Latest revision history:
00146       2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
00147       2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
00148       2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
00149       2.03  (2015-04-12) additional corruption checking
00150                          stbi_set_flip_vertically_on_load
00151                          fix NEON support; fix mingw support
00152       2.02  (2015-01-19) fix incorrect assert, fix warning
00153       2.01  (2015-01-17) fix various warnings
00154       2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
00155       2.00  (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
00156                          progressive JPEG
00157                          PGM/PPM support
00158                          STBI_MALLOC,STBI_REALLOC,STBI_FREE
00159                          STBI_NO_*, STBI_ONLY_*
00160                          GIF bugfix
00161       1.48  (2014-12-14) fix incorrectly-named assert()
00162       1.47  (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
00163                          optimize PNG
00164                          fix bug in interlaced PNG with user-specified channel count
00165 
00166    See end of file for full revision history.
00167 
00168 
00169  ============================    Contributors    =========================
00170 
00171  Image formats                                Bug fixes & warning fixes
00172     Sean Barrett (jpeg, png, bmp)                Marc LeBlanc
00173     Nicolas Schulz (hdr, psd)                    Christpher Lloyd
00174     Jonathan Dummer (tga)                        Dave Moore
00175     Jean-Marc Lienher (gif)                      Won Chun
00176     Tom Seddon (pic)                             the Horde3D community
00177     Thatcher Ulrich (psd)                        Janez Zemva
00178     Ken Miller (pgm, ppm)                        Jonathan Blow
00179                                                  Laurent Gomila
00180                                                  Aruelien Pocheville
00181  Extensions, features                            Ryamond Barbiero
00182     Jetro Lauha (stbi_info)                      David Woo
00183     Martin "SpartanJ" Golini (stbi_info)         Martin Golini
00184     James "moose2000" Brown (iPhone PNG)         Roy Eltham
00185     Ben "Disch" Wenger (io callbacks)            Luke Graham
00186     Omar Cornut (1/2/4-bit PNG)                  Thomas Ruf
00187     Nicolas Guillemot (vertical flip)            John Bartholomew
00188                                                  Ken Hamada
00189  Optimizations & bugfixes                        Cort Stratton
00190     Fabian "ryg" Giesen                          Blazej Dariusz Roszkowski
00191     Arseny Kapoulkine                            Thibault Reuille
00192                                                  Paul Du Bois
00193                                                  Guillaume George
00194   If your name should be here but                Jerry Jansson
00195   isn't, let Sean know.                          Hayaki Saito
00196                                                  Johan Duparc
00197                                                  Ronny Chevalier
00198                                                  Michal Cichon
00199                                                  Tero Hanninen
00200                                                  Sergio Gonzalez
00201                                                  Cass Everitt
00202                                                  Engin Manap
00203                                                  Martins Mozeiko
00204                                                  Joseph Thomson
00205                                                  Phil Jordan
00206 
00207 License:
00208    This software is in the public domain. Where that dedication is not
00209    recognized, you are granted a perpetual, irrevocable license to copy
00210    and modify this file however you want.
00211 
00212 */
00213 
00214 #ifndef STBI_INCLUDE_STB_IMAGE_H
00215 #define STBI_INCLUDE_STB_IMAGE_H
00216 
00217 // DOCUMENTATION
00218 //
00219 // Limitations:
00220 //    - no 16-bit-per-channel PNG
00221 //    - no 12-bit-per-channel JPEG
00222 //    - no JPEGs with arithmetic coding
00223 //    - no 1-bit BMP
00224 //    - GIF always returns *comp=4
00225 //
00226 // Basic usage (see HDR discussion below for HDR usage):
00227 //    int x,y,n;
00228 //    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
00229 //    // ... process data if not NULL ...
00230 //    // ... x = width, y = height, n = # 8-bit components per pixel ...
00231 //    // ... replace '0' with '1'..'4' to force that many components per pixel
00232 //    // ... but 'n' will always be the number that it would have been if you said 0
00233 //    stbi_image_free(data)
00234 //
00235 // Standard parameters:
00236 //    int *x       -- outputs image width in pixels
00237 //    int *y       -- outputs image height in pixels
00238 //    int *comp    -- outputs # of image components in image file
00239 //    int req_comp -- if non-zero, # of image components requested in result
00240 //
00241 // The return value from an image loader is an 'unsigned char *' which points
00242 // to the pixel data, or NULL on an allocation failure or if the image is
00243 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
00244 // with each pixel consisting of N interleaved 8-bit components; the first
00245 // pixel pointed to is top-left-most in the image. There is no padding between
00246 // image scanlines or between pixels, regardless of format. The number of
00247 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
00248 // If req_comp is non-zero, *comp has the number of components that _would_
00249 // have been output otherwise. E.g. if you set req_comp to 4, you will always
00250 // get RGBA output, but you can check *comp to see if it's trivially opaque
00251 // because e.g. there were only 3 channels in the source image.
00252 //
00253 // An output image with N components has the following components interleaved
00254 // in this order in each pixel:
00255 //
00256 //     N=#comp     components
00257 //       1           grey
00258 //       2           grey, alpha
00259 //       3           red, green, blue
00260 //       4           red, green, blue, alpha
00261 //
00262 // If image loading fails for any reason, the return value will be NULL,
00263 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
00264 // can be queried for an extremely brief, end-user unfriendly explanation
00265 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
00266 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
00267 // more user-friendly ones.
00268 //
00269 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
00270 //
00271 // ===========================================================================
00272 //
00273 // Philosophy
00274 //
00275 // stb libraries are designed with the following priorities:
00276 //
00277 //    1. easy to use
00278 //    2. easy to maintain
00279 //    3. good performance
00280 //
00281 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
00282 // and for best performance I may provide less-easy-to-use APIs that give higher
00283 // performance, in addition to the easy to use ones. Nevertheless, it's important
00284 // to keep in mind that from the standpoint of you, a client of this library,
00285 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
00286 //
00287 // Some secondary priorities arise directly from the first two, some of which
00288 // make more explicit reasons why performance can't be emphasized.
00289 //
00290 //    - Portable ("ease of use")
00291 //    - Small footprint ("easy to maintain")
00292 //    - No dependencies ("ease of use")
00293 //
00294 // ===========================================================================
00295 //
00296 // I/O callbacks
00297 //
00298 // I/O callbacks allow you to read from arbitrary sources, like packaged
00299 // files or some other source. Data read from callbacks are processed
00300 // through a small internal buffer (currently 128 bytes) to try to reduce
00301 // overhead.
00302 //
00303 // The three functions you must define are "read" (reads some bytes of data),
00304 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
00305 //
00306 // ===========================================================================
00307 //
00308 // SIMD support
00309 //
00310 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
00311 // supported by the compiler. For ARM Neon support, you must explicitly
00312 // request it.
00313 //
00314 // (The old do-it-yourself SIMD API is no longer supported in the current
00315 // code.)
00316 //
00317 // On x86, SSE2 will automatically be used when available based on a run-time
00318 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
00319 // the typical path is to have separate builds for NEON and non-NEON devices
00320 // (at least this is true for iOS and Android). Therefore, the NEON support is
00321 // toggled by a build flag: define STBI_NEON to get NEON loops.
00322 //
00323 // The output of the JPEG decoder is slightly different from versions where
00324 // SIMD support was introduced (that is, for versions before 1.49). The
00325 // difference is only +-1 in the 8-bit RGB channels, and only on a small
00326 // fraction of pixels. You can force the pre-1.49 behavior by defining
00327 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
00328 // and hence cost some performance.
00329 //
00330 // If for some reason you do not want to use any of SIMD code, or if
00331 // you have issues compiling it, you can disable it entirely by
00332 // defining STBI_NO_SIMD.
00333 //
00334 // ===========================================================================
00335 //
00336 // HDR image support   (disable by defining STBI_NO_HDR)
00337 //
00338 // stb_image now supports loading HDR images in general, and currently
00339 // the Radiance .HDR file format, although the support is provided
00340 // generically. You can still load any file through the existing interface;
00341 // if you attempt to load an HDR file, it will be automatically remapped to
00342 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
00343 // both of these constants can be reconfigured through this interface:
00344 //
00345 //     stbi_hdr_to_ldr_gamma(2.2f);
00346 //     stbi_hdr_to_ldr_scale(1.0f);
00347 //
00348 // (note, do not use _inverse_ constants; stbi_image will invert them
00349 // appropriately).
00350 //
00351 // Additionally, there is a new, parallel interface for loading files as
00352 // (linear) floats to preserve the full dynamic range:
00353 //
00354 //    float *data = stbi_loadf(filename, &x, &y, &n, 0);
00355 //
00356 // If you load LDR images through this interface, those images will
00357 // be promoted to floating point values, run through the inverse of
00358 // constants corresponding to the above:
00359 //
00360 //     stbi_ldr_to_hdr_scale(1.0f);
00361 //     stbi_ldr_to_hdr_gamma(2.2f);
00362 //
00363 // Finally, given a filename (or an open file or memory block--see header
00364 // file for details) containing image data, you can query for the "most
00365 // appropriate" interface to use (that is, whether the image is HDR or
00366 // not), using:
00367 //
00368 //     stbi_is_hdr(char *filename);
00369 //
00370 // ===========================================================================
00371 //
00372 // iPhone PNG support:
00373 //
00374 // By default we convert iphone-formatted PNGs back to RGB, even though
00375 // they are internally encoded differently. You can disable this conversion
00376 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
00377 // you will always just get the native iphone "format" through (which
00378 // is BGR stored in RGB).
00379 //
00380 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
00381 // pixel to remove any premultiplied alpha *only* if the image file explicitly
00382 // says there's premultiplied data (currently only happens in iPhone images,
00383 // and only if iPhone convert-to-rgb processing is on).
00384 //
00385 
00386 
00387 #ifndef STBI_NO_STDIO
00388 #include <stdio.h>
00389 #endif // STBI_NO_STDIO
00390 
00391 #define STBI_VERSION 1
00392 
00393 enum
00394 {
00395    STBI_default = 0, // only used for req_comp
00396 
00397    STBI_grey       = 1,
00398    STBI_grey_alpha = 2,
00399    STBI_rgb        = 3,
00400    STBI_rgb_alpha  = 4
00401 };
00402 
00403 typedef unsigned char stbi_uc;
00404 
00405 #ifdef __cplusplus
00406 extern "C" {
00407 #endif
00408 
00409 #ifdef STB_IMAGE_STATIC
00410 #define STBIDEF static
00411 #else
00412 #define STBIDEF extern
00413 #endif
00414 
00416 //
00417 // PRIMARY API - works on images of any type
00418 //
00419 
00420 //
00421 // load image by filename, open file, or memory buffer
00422 //
00423 
00424 typedef struct
00425 {
00426    int      (*read)  (void *user,char *data,int size);   // fill 'data' with 'size' bytes.  return number of bytes actually read
00427    void     (*skip)  (void *user,int n);                 // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
00428    int      (*eof)   (void *user);                       // returns nonzero if we are at end of file/data
00429 } stbi_io_callbacks;
00430 
00431 STBIDEF stbi_uc *stbi_load               (char              const *filename,           int *x, int *y, int *comp, int req_comp);
00432 STBIDEF stbi_uc *stbi_load_from_memory   (stbi_uc           const *buffer, int len   , int *x, int *y, int *comp, int req_comp);
00433 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk  , void *user, int *x, int *y, int *comp, int req_comp);
00434 
00435 #ifndef STBI_NO_STDIO
00436 STBIDEF stbi_uc *stbi_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
00437 // for stbi_load_from_file, file pointer is left pointing immediately after image
00438 #endif
00439 
00440 #ifndef STBI_NO_LINEAR
00441    STBIDEF float *stbi_loadf                 (char const *filename,           int *x, int *y, int *comp, int req_comp);
00442    STBIDEF float *stbi_loadf_from_memory     (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
00443    STBIDEF float *stbi_loadf_from_callbacks  (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
00444 
00445    #ifndef STBI_NO_STDIO
00446    STBIDEF float *stbi_loadf_from_file  (FILE *f,                int *x, int *y, int *comp, int req_comp);
00447    #endif
00448 #endif
00449 
00450 #ifndef STBI_NO_HDR
00451    STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma);
00452    STBIDEF void   stbi_hdr_to_ldr_scale(float scale);
00453 #endif
00454 
00455 #ifndef STBI_NO_LINEAR
00456    STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma);
00457    STBIDEF void   stbi_ldr_to_hdr_scale(float scale);
00458 #endif // STBI_NO_HDR
00459 
00460 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
00461 STBIDEF int    stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
00462 STBIDEF int    stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
00463 #ifndef STBI_NO_STDIO
00464 STBIDEF int      stbi_is_hdr          (char const *filename);
00465 STBIDEF int      stbi_is_hdr_from_file(FILE *f);
00466 #endif // STBI_NO_STDIO
00467 
00468 
00469 // get a VERY brief reason for failure
00470 // NOT THREADSAFE
00471 STBIDEF const char *stbi_failure_reason  (void);
00472 
00473 // free the loaded image -- this is just free()
00474 STBIDEF void     stbi_image_free      (void *retval_from_stbi_load);
00475 
00476 // get image dimensions & components without fully decoding
00477 STBIDEF int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
00478 STBIDEF int      stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
00479 
00480 #ifndef STBI_NO_STDIO
00481 STBIDEF int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
00482 STBIDEF int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
00483 
00484 #endif
00485 
00486 
00487 
00488 // for image formats that explicitly notate that they have premultiplied alpha,
00489 // we just return the colors as stored in the file. set this flag to force
00490 // unpremultiplication. results are undefined if the unpremultiply overflow.
00491 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
00492 
00493 // indicate whether we should process iphone images back to canonical format,
00494 // or just pass them through "as-is"
00495 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
00496 
00497 // flip the image vertically, so the first pixel in the output array is the bottom left
00498 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
00499 
00500 // ZLIB client - used by PNG, available for other purposes
00501 
00502 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
00503 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
00504 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
00505 STBIDEF int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
00506 
00507 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
00508 STBIDEF int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
00509 
00510 
00511 #ifdef __cplusplus
00512 }
00513 #endif
00514 
00515 //
00516 //
00518 #endif // STBI_INCLUDE_STB_IMAGE_H
00519 
00520 #ifdef STB_IMAGE_IMPLEMENTATION
00521 
00522 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
00523   || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
00524   || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
00525   || defined(STBI_ONLY_ZLIB)
00526    #ifndef STBI_ONLY_JPEG
00527    #define STBI_NO_JPEG
00528    #endif
00529    #ifndef STBI_ONLY_PNG
00530    #define STBI_NO_PNG
00531    #endif
00532    #ifndef STBI_ONLY_BMP
00533    #define STBI_NO_BMP
00534    #endif
00535    #ifndef STBI_ONLY_PSD
00536    #define STBI_NO_PSD
00537    #endif
00538    #ifndef STBI_ONLY_TGA
00539    #define STBI_NO_TGA
00540    #endif
00541    #ifndef STBI_ONLY_GIF
00542    #define STBI_NO_GIF
00543    #endif
00544    #ifndef STBI_ONLY_HDR
00545    #define STBI_NO_HDR
00546    #endif
00547    #ifndef STBI_ONLY_PIC
00548    #define STBI_NO_PIC
00549    #endif
00550    #ifndef STBI_ONLY_PNM
00551    #define STBI_NO_PNM
00552    #endif
00553 #endif
00554 
00555 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
00556 #define STBI_NO_ZLIB
00557 #endif
00558 
00559 
00560 #include <stdarg.h>
00561 #include <stddef.h> // ptrdiff_t on osx
00562 #include <stdlib.h>
00563 #include <string.h>
00564 
00565 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
00566 #include <math.h>  // ldexp
00567 #endif
00568 
00569 #ifndef STBI_NO_STDIO
00570 #include <stdio.h>
00571 #endif
00572 
00573 #ifndef STBI_ASSERT
00574 #include <assert.h>
00575 #define STBI_ASSERT(x) assert(x)
00576 #endif
00577 
00578 
00579 #ifndef _MSC_VER
00580    #ifdef __cplusplus
00581    #define stbi_inline inline
00582    #else
00583    #define stbi_inline
00584    #endif
00585 #else
00586    #define stbi_inline __forceinline
00587 #endif
00588 
00589 
00590 #ifdef _MSC_VER
00591 typedef unsigned short stbi__uint16;
00592 typedef   signed short stbi__int16;
00593 typedef unsigned int   stbi__uint32;
00594 typedef   signed int   stbi__int32;
00595 #else
00596 #include <stdint.h>
00597 typedef uint16_t stbi__uint16;
00598 typedef int16_t  stbi__int16;
00599 typedef uint32_t stbi__uint32;
00600 typedef int32_t  stbi__int32;
00601 #endif
00602 
00603 // should produce compiler error if size is wrong
00604 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
00605 
00606 #ifdef _MSC_VER
00607 #define STBI_NOTUSED(v)  (void)(v)
00608 #else
00609 #define STBI_NOTUSED(v)  (void)sizeof(v)
00610 #endif
00611 
00612 #ifdef _MSC_VER
00613 #define STBI_HAS_LROTL
00614 #endif
00615 
00616 #ifdef STBI_HAS_LROTL
00617    #define stbi_lrot(x,y)  _lrotl(x,y)
00618 #else
00619    #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
00620 #endif
00621 
00622 #if defined(STBI_MALLOC) && defined(STBI_FREE) && defined(STBI_REALLOC)
00623 // ok
00624 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC)
00625 // ok
00626 #else
00627 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC."
00628 #endif
00629 
00630 #ifndef STBI_MALLOC
00631 #define STBI_MALLOC(sz)    malloc(sz)
00632 #define STBI_REALLOC(p,sz) realloc(p,sz)
00633 #define STBI_FREE(p)       free(p)
00634 #endif
00635 
00636 // x86/x64 detection
00637 #if defined(__x86_64__) || defined(_M_X64)
00638 #define STBI__X64_TARGET
00639 #elif defined(__i386) || defined(_M_IX86)
00640 #define STBI__X86_TARGET
00641 #endif
00642 
00643 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
00644 // NOTE: not clear do we actually need this for the 64-bit path?
00645 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
00646 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
00647 // this is just broken and gcc are jerks for not fixing it properly
00648 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
00649 #define STBI_NO_SIMD
00650 #endif
00651 
00652 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
00653 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
00654 //
00655 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
00656 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
00657 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
00658 // simultaneously enabling "-mstackrealign".
00659 //
00660 // See https://github.com/nothings/stb/issues/81 for more information.
00661 //
00662 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
00663 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
00664 #define STBI_NO_SIMD
00665 #endif
00666 
00667 #if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
00668 #define STBI_SSE2
00669 #include <emmintrin.h>
00670 
00671 #ifdef _MSC_VER
00672 
00673 #if _MSC_VER >= 1400  // not VC6
00674 #include <intrin.h> // __cpuid
00675 static int stbi__cpuid3(void)
00676 {
00677    int info[4];
00678    __cpuid(info,1);
00679    return info[3];
00680 }
00681 #else
00682 static int stbi__cpuid3(void)
00683 {
00684    int res;
00685    __asm {
00686       mov  eax,1
00687       cpuid
00688       mov  res,edx
00689    }
00690    return res;
00691 }
00692 #endif
00693 
00694 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
00695 
00696 static int stbi__sse2_available()
00697 {
00698    int info3 = stbi__cpuid3();
00699    return ((info3 >> 26) & 1) != 0;
00700 }
00701 #else // assume GCC-style if not VC++
00702 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
00703 
00704 static int stbi__sse2_available()
00705 {
00706 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
00707    // GCC 4.8+ has a nice way to do this
00708    return __builtin_cpu_supports("sse2");
00709 #else
00710    // portable way to do this, preferably without using GCC inline ASM?
00711    // just bail for now.
00712    return 0;
00713 #endif
00714 }
00715 #endif
00716 #endif
00717 
00718 // ARM NEON
00719 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
00720 #undef STBI_NEON
00721 #endif
00722 
00723 #ifdef STBI_NEON
00724 #include <arm_neon.h>
00725 // assume GCC or Clang on ARM targets
00726 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
00727 #endif
00728 
00729 #ifndef STBI_SIMD_ALIGN
00730 #define STBI_SIMD_ALIGN(type, name) type name
00731 #endif
00732 
00734 //
00735 //  stbi__context struct and start_xxx functions
00736 
00737 // stbi__context structure is our basic context used by all images, so it
00738 // contains all the IO context, plus some basic image information
00739 typedef struct
00740 {
00741    stbi__uint32 img_x, img_y;
00742    int img_n, img_out_n;
00743 
00744    stbi_io_callbacks io;
00745    void *io_user_data;
00746 
00747    int read_from_callbacks;
00748    int buflen;
00749    stbi_uc buffer_start[128];
00750 
00751    stbi_uc *img_buffer, *img_buffer_end;
00752    stbi_uc *img_buffer_original;
00753 } stbi__context;
00754 
00755 
00756 static void stbi__refill_buffer(stbi__context *s);
00757 
00758 // initialize a memory-decode context
00759 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
00760 {
00761    s->io.read = NULL;
00762    s->read_from_callbacks = 0;
00763    s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
00764    s->img_buffer_end = (stbi_uc *) buffer+len;
00765 }
00766 
00767 // initialize a callback-based context
00768 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
00769 {
00770    s->io = *c;
00771    s->io_user_data = user;
00772    s->buflen = sizeof(s->buffer_start);
00773    s->read_from_callbacks = 1;
00774    s->img_buffer_original = s->buffer_start;
00775    stbi__refill_buffer(s);
00776 }
00777 
00778 #ifndef STBI_NO_STDIO
00779 
00780 static int stbi__stdio_read(void *user, char *data, int size)
00781 {
00782    return (int) fread(data,1,size,(FILE*) user);
00783 }
00784 
00785 static void stbi__stdio_skip(void *user, int n)
00786 {
00787    fseek((FILE*) user, n, SEEK_CUR);
00788 }
00789 
00790 static int stbi__stdio_eof(void *user)
00791 {
00792    return feof((FILE*) user);
00793 }
00794 
00795 static stbi_io_callbacks stbi__stdio_callbacks =
00796 {
00797    stbi__stdio_read,
00798    stbi__stdio_skip,
00799    stbi__stdio_eof,
00800 };
00801 
00802 static void stbi__start_file(stbi__context *s, FILE *f)
00803 {
00804    stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
00805 }
00806 
00807 //static void stop_file(stbi__context *s) { }
00808 
00809 #endif // !STBI_NO_STDIO
00810 
00811 static void stbi__rewind(stbi__context *s)
00812 {
00813    // conceptually rewind SHOULD rewind to the beginning of the stream,
00814    // but we just rewind to the beginning of the initial buffer, because
00815    // we only use it after doing 'test', which only ever looks at at most 92 bytes
00816    s->img_buffer = s->img_buffer_original;
00817 }
00818 
00819 #ifndef STBI_NO_JPEG
00820 static int      stbi__jpeg_test(stbi__context *s);
00821 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00822 static int      stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
00823 #endif
00824 
00825 #ifndef STBI_NO_PNG
00826 static int      stbi__png_test(stbi__context *s);
00827 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00828 static int      stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
00829 #endif
00830 
00831 #ifndef STBI_NO_BMP
00832 static int      stbi__bmp_test(stbi__context *s);
00833 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00834 static int      stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
00835 #endif
00836 
00837 #ifndef STBI_NO_TGA
00838 static int      stbi__tga_test(stbi__context *s);
00839 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00840 static int      stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
00841 #endif
00842 
00843 #ifndef STBI_NO_PSD
00844 static int      stbi__psd_test(stbi__context *s);
00845 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00846 static int      stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
00847 #endif
00848 
00849 #ifndef STBI_NO_HDR
00850 static int      stbi__hdr_test(stbi__context *s);
00851 static float   *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00852 static int      stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
00853 #endif
00854 
00855 #ifndef STBI_NO_PIC
00856 static int      stbi__pic_test(stbi__context *s);
00857 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00858 static int      stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
00859 #endif
00860 
00861 #ifndef STBI_NO_GIF
00862 static int      stbi__gif_test(stbi__context *s);
00863 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00864 static int      stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
00865 #endif
00866 
00867 #ifndef STBI_NO_PNM
00868 static int      stbi__pnm_test(stbi__context *s);
00869 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
00870 static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
00871 #endif
00872 
00873 // this is not threadsafe
00874 static const char *stbi__g_failure_reason;
00875 
00876 STBIDEF const char *stbi_failure_reason(void)
00877 {
00878    return stbi__g_failure_reason;
00879 }
00880 
00881 static int stbi__err(const char *str)
00882 {
00883    stbi__g_failure_reason = str;
00884    return 0;
00885 }
00886 
00887 static void *stbi__malloc(size_t size)
00888 {
00889     return STBI_MALLOC(size);
00890 }
00891 
00892 // stbi__err - error
00893 // stbi__errpf - error returning pointer to float
00894 // stbi__errpuc - error returning pointer to unsigned char
00895 
00896 #ifdef STBI_NO_FAILURE_STRINGS
00897    #define stbi__err(x,y)  0
00898 #elif defined(STBI_FAILURE_USERMSG)
00899    #define stbi__err(x,y)  stbi__err(y)
00900 #else
00901    #define stbi__err(x,y)  stbi__err(x)
00902 #endif
00903 
00904 #define stbi__errpf(x,y)   ((float *) (stbi__err(x,y)?NULL:NULL))
00905 #define stbi__errpuc(x,y)  ((unsigned char *) (stbi__err(x,y)?NULL:NULL))
00906 
00907 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
00908 {
00909    STBI_FREE(retval_from_stbi_load);
00910 }
00911 
00912 #ifndef STBI_NO_LINEAR
00913 static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
00914 #endif
00915 
00916 #ifndef STBI_NO_HDR
00917 static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp);
00918 #endif
00919 
00920 static int stbi__vertically_flip_on_load = 0;
00921 
00922 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
00923 {
00924     stbi__vertically_flip_on_load = flag_true_if_should_flip;
00925 }
00926 
00927 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
00928 {
00929    #ifndef STBI_NO_JPEG
00930    if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
00931    #endif
00932    #ifndef STBI_NO_PNG
00933    if (stbi__png_test(s))  return stbi__png_load(s,x,y,comp,req_comp);
00934    #endif
00935    #ifndef STBI_NO_BMP
00936    if (stbi__bmp_test(s))  return stbi__bmp_load(s,x,y,comp,req_comp);
00937    #endif
00938    #ifndef STBI_NO_GIF
00939    if (stbi__gif_test(s))  return stbi__gif_load(s,x,y,comp,req_comp);
00940    #endif
00941    #ifndef STBI_NO_PSD
00942    if (stbi__psd_test(s))  return stbi__psd_load(s,x,y,comp,req_comp);
00943    #endif
00944    #ifndef STBI_NO_PIC
00945    if (stbi__pic_test(s))  return stbi__pic_load(s,x,y,comp,req_comp);
00946    #endif
00947    #ifndef STBI_NO_PNM
00948    if (stbi__pnm_test(s))  return stbi__pnm_load(s,x,y,comp,req_comp);
00949    #endif
00950 
00951    #ifndef STBI_NO_HDR
00952    if (stbi__hdr_test(s)) {
00953       float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
00954       return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
00955    }
00956    #endif
00957 
00958    #ifndef STBI_NO_TGA
00959    // test tga last because it's a crappy test!
00960    if (stbi__tga_test(s))
00961       return stbi__tga_load(s,x,y,comp,req_comp);
00962    #endif
00963 
00964    return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
00965 }
00966 
00967 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
00968 {
00969    unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
00970 
00971    if (stbi__vertically_flip_on_load && result != NULL) {
00972       int w = *x, h = *y;
00973       int depth = req_comp ? req_comp : *comp;
00974       int row,col,z;
00975       stbi_uc temp;
00976 
00977       // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
00978       for (row = 0; row < (h>>1); row++) {
00979          for (col = 0; col < w; col++) {
00980             for (z = 0; z < depth; z++) {
00981                temp = result[(row * w + col) * depth + z];
00982                result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
00983                result[((h - row - 1) * w + col) * depth + z] = temp;
00984             }
00985          }
00986       }
00987    }
00988 
00989    return result;
00990 }
00991 
00992 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
00993 {
00994    if (stbi__vertically_flip_on_load && result != NULL) {
00995       int w = *x, h = *y;
00996       int depth = req_comp ? req_comp : *comp;
00997       int row,col,z;
00998       float temp;
00999 
01000       // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
01001       for (row = 0; row < (h>>1); row++) {
01002          for (col = 0; col < w; col++) {
01003             for (z = 0; z < depth; z++) {
01004                temp = result[(row * w + col) * depth + z];
01005                result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
01006                result[((h - row - 1) * w + col) * depth + z] = temp;
01007             }
01008          }
01009       }
01010    }
01011 }
01012 
01013 
01014 #ifndef STBI_NO_STDIO
01015 
01016 static FILE *stbi__fopen(char const *filename, char const *mode)
01017 {
01018    FILE *f;
01019 #if defined(_MSC_VER) && _MSC_VER >= 1400
01020    if (0 != fopen_s(&f, filename, mode))
01021       f=0;
01022 #else
01023    f = fopen(filename, mode);
01024 #endif
01025    return f;
01026 }
01027 
01028 
01029 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
01030 {
01031    FILE *f = stbi__fopen(filename, "rb");
01032    unsigned char *result;
01033    if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
01034    result = stbi_load_from_file(f,x,y,comp,req_comp);
01035    fclose(f);
01036    return result;
01037 }
01038 
01039 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
01040 {
01041    unsigned char *result;
01042    stbi__context s;
01043    stbi__start_file(&s,f);
01044    result = stbi__load_flip(&s,x,y,comp,req_comp);
01045    if (result) {
01046       // need to 'unget' all the characters in the IO buffer
01047       fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
01048    }
01049    return result;
01050 }
01051 #endif //!STBI_NO_STDIO
01052 
01053 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
01054 {
01055    stbi__context s;
01056    stbi__start_mem(&s,buffer,len);
01057    return stbi__load_flip(&s,x,y,comp,req_comp);
01058 }
01059 
01060 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
01061 {
01062    stbi__context s;
01063    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
01064    return stbi__load_flip(&s,x,y,comp,req_comp);
01065 }
01066 
01067 #ifndef STBI_NO_LINEAR
01068 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
01069 {
01070    unsigned char *data;
01071    #ifndef STBI_NO_HDR
01072    if (stbi__hdr_test(s)) {
01073       float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
01074       if (hdr_data)
01075          stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
01076       return hdr_data;
01077    }
01078    #endif
01079    data = stbi__load_flip(s, x, y, comp, req_comp);
01080    if (data)
01081       return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
01082    return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
01083 }
01084 
01085 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
01086 {
01087    stbi__context s;
01088    stbi__start_mem(&s,buffer,len);
01089    return stbi__loadf_main(&s,x,y,comp,req_comp);
01090 }
01091 
01092 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
01093 {
01094    stbi__context s;
01095    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
01096    return stbi__loadf_main(&s,x,y,comp,req_comp);
01097 }
01098 
01099 #ifndef STBI_NO_STDIO
01100 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
01101 {
01102    float *result;
01103    FILE *f = stbi__fopen(filename, "rb");
01104    if (!f) return stbi__errpf("can't fopen", "Unable to open file");
01105    result = stbi_loadf_from_file(f,x,y,comp,req_comp);
01106    fclose(f);
01107    return result;
01108 }
01109 
01110 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
01111 {
01112    stbi__context s;
01113    stbi__start_file(&s,f);
01114    return stbi__loadf_main(&s,x,y,comp,req_comp);
01115 }
01116 #endif // !STBI_NO_STDIO
01117 
01118 #endif // !STBI_NO_LINEAR
01119 
01120 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
01121 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
01122 // reports false!
01123 
01124 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
01125 {
01126    #ifndef STBI_NO_HDR
01127    stbi__context s;
01128    stbi__start_mem(&s,buffer,len);
01129    return stbi__hdr_test(&s);
01130    #else
01131    STBI_NOTUSED(buffer);
01132    STBI_NOTUSED(len);
01133    return 0;
01134    #endif
01135 }
01136 
01137 #ifndef STBI_NO_STDIO
01138 STBIDEF int      stbi_is_hdr          (char const *filename)
01139 {
01140    FILE *f = stbi__fopen(filename, "rb");
01141    int result=0;
01142    if (f) {
01143       result = stbi_is_hdr_from_file(f);
01144       fclose(f);
01145    }
01146    return result;
01147 }
01148 
01149 STBIDEF int      stbi_is_hdr_from_file(FILE *f)
01150 {
01151    #ifndef STBI_NO_HDR
01152    stbi__context s;
01153    stbi__start_file(&s,f);
01154    return stbi__hdr_test(&s);
01155    #else
01156    return 0;
01157    #endif
01158 }
01159 #endif // !STBI_NO_STDIO
01160 
01161 STBIDEF int      stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
01162 {
01163    #ifndef STBI_NO_HDR
01164    stbi__context s;
01165    stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
01166    return stbi__hdr_test(&s);
01167    #else
01168    return 0;
01169    #endif
01170 }
01171 
01172 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
01173 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
01174 
01175 #ifndef STBI_NO_LINEAR
01176 STBIDEF void   stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
01177 STBIDEF void   stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
01178 #endif
01179 
01180 STBIDEF void   stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
01181 STBIDEF void   stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
01182 
01183 
01185 //
01186 // Common code used by all image loaders
01187 //
01188 
01189 enum
01190 {
01191    STBI__SCAN_load=0,
01192    STBI__SCAN_type,
01193    STBI__SCAN_header
01194 };
01195 
01196 static void stbi__refill_buffer(stbi__context *s)
01197 {
01198    int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
01199    if (n == 0) {
01200       // at end of file, treat same as if from memory, but need to handle case
01201       // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
01202       s->read_from_callbacks = 0;
01203       s->img_buffer = s->buffer_start;
01204       s->img_buffer_end = s->buffer_start+1;
01205       *s->img_buffer = 0;
01206    } else {
01207       s->img_buffer = s->buffer_start;
01208       s->img_buffer_end = s->buffer_start + n;
01209    }
01210 }
01211 
01212 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
01213 {
01214    if (s->img_buffer < s->img_buffer_end)
01215       return *s->img_buffer++;
01216    if (s->read_from_callbacks) {
01217       stbi__refill_buffer(s);
01218       return *s->img_buffer++;
01219    }
01220    return 0;
01221 }
01222 
01223 stbi_inline static int stbi__at_eof(stbi__context *s)
01224 {
01225    if (s->io.read) {
01226       if (!(s->io.eof)(s->io_user_data)) return 0;
01227       // if feof() is true, check if buffer = end
01228       // special case: we've only got the special 0 character at the end
01229       if (s->read_from_callbacks == 0) return 1;
01230    }
01231 
01232    return s->img_buffer >= s->img_buffer_end;
01233 }
01234 
01235 static void stbi__skip(stbi__context *s, int n)
01236 {
01237    if (n < 0) {
01238       s->img_buffer = s->img_buffer_end;
01239       return;
01240    }
01241    if (s->io.read) {
01242       int blen = (int) (s->img_buffer_end - s->img_buffer);
01243       if (blen < n) {
01244          s->img_buffer = s->img_buffer_end;
01245          (s->io.skip)(s->io_user_data, n - blen);
01246          return;
01247       }
01248    }
01249    s->img_buffer += n;
01250 }
01251 
01252 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
01253 {
01254    if (s->io.read) {
01255       int blen = (int) (s->img_buffer_end - s->img_buffer);
01256       if (blen < n) {
01257          int res, count;
01258 
01259          memcpy(buffer, s->img_buffer, blen);
01260 
01261          count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
01262          res = (count == (n-blen));
01263          s->img_buffer = s->img_buffer_end;
01264          return res;
01265       }
01266    }
01267 
01268    if (s->img_buffer+n <= s->img_buffer_end) {
01269       memcpy(buffer, s->img_buffer, n);
01270       s->img_buffer += n;
01271       return 1;
01272    } else
01273       return 0;
01274 }
01275 
01276 static int stbi__get16be(stbi__context *s)
01277 {
01278    int z = stbi__get8(s);
01279    return (z << 8) + stbi__get8(s);
01280 }
01281 
01282 static stbi__uint32 stbi__get32be(stbi__context *s)
01283 {
01284    stbi__uint32 z = stbi__get16be(s);
01285    return (z << 16) + stbi__get16be(s);
01286 }
01287 
01288 static int stbi__get16le(stbi__context *s)
01289 {
01290    int z = stbi__get8(s);
01291    return z + (stbi__get8(s) << 8);
01292 }
01293 
01294 static stbi__uint32 stbi__get32le(stbi__context *s)
01295 {
01296    stbi__uint32 z = stbi__get16le(s);
01297    return z + (stbi__get16le(s) << 16);
01298 }
01299 
01300 #define STBI__BYTECAST(x)  ((stbi_uc) ((x) & 255))  // truncate int to byte without warnings
01301 
01302 
01304 //
01305 //  generic converter from built-in img_n to req_comp
01306 //    individual types do this automatically as much as possible (e.g. jpeg
01307 //    does all cases internally since it needs to colorspace convert anyway,
01308 //    and it never has alpha, so very few cases ). png can automatically
01309 //    interleave an alpha=255 channel, but falls back to this for other cases
01310 //
01311 //  assume data buffer is malloced, so malloc a new one and free that one
01312 //  only failure mode is malloc failing
01313 
01314 static stbi_uc stbi__compute_y(int r, int g, int b)
01315 {
01316    return (stbi_uc) (((r*77) + (g*150) +  (29*b)) >> 8);
01317 }
01318 
01319 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
01320 {
01321    int i,j;
01322    unsigned char *good;
01323 
01324    if (req_comp == img_n) return data;
01325    STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
01326 
01327    good = (unsigned char *) stbi__malloc(req_comp * x * y);
01328    if (good == NULL) {
01329       STBI_FREE(data);
01330       return stbi__errpuc("outofmem", "Out of memory");
01331    }
01332 
01333    for (j=0; j < (int) y; ++j) {
01334       unsigned char *src  = data + j * x * img_n   ;
01335       unsigned char *dest = good + j * x * req_comp;
01336 
01337       #define COMBO(a,b)  ((a)*8+(b))
01338       #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
01339       // convert source image with img_n components to one with req_comp components;
01340       // avoid switch per pixel, so use switch per scanline and massive macros
01341       switch (COMBO(img_n, req_comp)) {
01342          CASE(1,2) dest[0]=src[0], dest[1]=255; break;
01343          CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
01344          CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
01345          CASE(2,1) dest[0]=src[0]; break;
01346          CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
01347          CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
01348          CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
01349          CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
01350          CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
01351          CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
01352          CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
01353          CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
01354          default: STBI_ASSERT(0);
01355       }
01356       #undef CASE
01357    }
01358 
01359    STBI_FREE(data);
01360    return good;
01361 }
01362 
01363 #ifndef STBI_NO_LINEAR
01364 static float   *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
01365 {
01366    int i,k,n;
01367    float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
01368    if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
01369    // compute number of non-alpha components
01370    if (comp & 1) n = comp; else n = comp-1;
01371    for (i=0; i < x*y; ++i) {
01372       for (k=0; k < n; ++k) {
01373          output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
01374       }
01375       if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
01376    }
01377    STBI_FREE(data);
01378    return output;
01379 }
01380 #endif
01381 
01382 #ifndef STBI_NO_HDR
01383 #define stbi__float2int(x)   ((int) (x))
01384 static stbi_uc *stbi__hdr_to_ldr(float   *data, int x, int y, int comp)
01385 {
01386    int i,k,n;
01387    stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
01388    if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
01389    // compute number of non-alpha components
01390    if (comp & 1) n = comp; else n = comp-1;
01391    for (i=0; i < x*y; ++i) {
01392       for (k=0; k < n; ++k) {
01393          float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
01394          if (z < 0) z = 0;
01395          if (z > 255) z = 255;
01396          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
01397       }
01398       if (k < comp) {
01399          float z = data[i*comp+k] * 255 + 0.5f;
01400          if (z < 0) z = 0;
01401          if (z > 255) z = 255;
01402          output[i*comp + k] = (stbi_uc) stbi__float2int(z);
01403       }
01404    }
01405    STBI_FREE(data);
01406    return output;
01407 }
01408 #endif
01409 
01411 //
01412 //  "baseline" JPEG/JFIF decoder
01413 //
01414 //    simple implementation
01415 //      - doesn't support delayed output of y-dimension
01416 //      - simple interface (only one output format: 8-bit interleaved RGB)
01417 //      - doesn't try to recover corrupt jpegs
01418 //      - doesn't allow partial loading, loading multiple at once
01419 //      - still fast on x86 (copying globals into locals doesn't help x86)
01420 //      - allocates lots of intermediate memory (full size of all components)
01421 //        - non-interleaved case requires this anyway
01422 //        - allows good upsampling (see next)
01423 //    high-quality
01424 //      - upsampled channels are bilinearly interpolated, even across blocks
01425 //      - quality integer IDCT derived from IJG's 'slow'
01426 //    performance
01427 //      - fast huffman; reasonable integer IDCT
01428 //      - some SIMD kernels for common paths on targets with SSE2/NEON
01429 //      - uses a lot of intermediate memory, could cache poorly
01430 
01431 #ifndef STBI_NO_JPEG
01432 
01433 // huffman decoding acceleration
01434 #define FAST_BITS   9  // larger handles more cases; smaller stomps less cache
01435 
01436 typedef struct
01437 {
01438    stbi_uc  fast[1 << FAST_BITS];
01439    // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
01440    stbi__uint16 code[256];
01441    stbi_uc  values[256];
01442    stbi_uc  size[257];
01443    unsigned int maxcode[18];
01444    int    delta[17];   // old 'firstsymbol' - old 'firstcode'
01445 } stbi__huffman;
01446 
01447 typedef struct
01448 {
01449    stbi__context *s;
01450    stbi__huffman huff_dc[4];
01451    stbi__huffman huff_ac[4];
01452    stbi_uc dequant[4][64];
01453    stbi__int16 fast_ac[4][1 << FAST_BITS];
01454 
01455 // sizes for components, interleaved MCUs
01456    int img_h_max, img_v_max;
01457    int img_mcu_x, img_mcu_y;
01458    int img_mcu_w, img_mcu_h;
01459 
01460 // definition of jpeg image component
01461    struct
01462    {
01463       int id;
01464       int h,v;
01465       int tq;
01466       int hd,ha;
01467       int dc_pred;
01468 
01469       int x,y,w2,h2;
01470       stbi_uc *data;
01471       void *raw_data, *raw_coeff;
01472       stbi_uc *linebuf;
01473       short   *coeff;   // progressive only
01474       int      coeff_w, coeff_h; // number of 8x8 coefficient blocks
01475    } img_comp[4];
01476 
01477    stbi__uint32   code_buffer; // jpeg entropy-coded buffer
01478    int            code_bits;   // number of valid bits
01479    unsigned char  marker;      // marker seen while filling entropy buffer
01480    int            nomore;      // flag if we saw a marker so must stop
01481 
01482    int            progressive;
01483    int            spec_start;
01484    int            spec_end;
01485    int            succ_high;
01486    int            succ_low;
01487    int            eob_run;
01488 
01489    int scan_n, order[4];
01490    int restart_interval, todo;
01491 
01492 // kernels
01493    void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
01494    void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
01495    stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
01496 } stbi__jpeg;
01497 
01498 static int stbi__build_huffman(stbi__huffman *h, int *count)
01499 {
01500    int i,j,k=0,code;
01501    // build size list for each symbol (from JPEG spec)
01502    for (i=0; i < 16; ++i)
01503       for (j=0; j < count[i]; ++j)
01504          h->size[k++] = (stbi_uc) (i+1);
01505    h->size[k] = 0;
01506 
01507    // compute actual symbols (from jpeg spec)
01508    code = 0;
01509    k = 0;
01510    for(j=1; j <= 16; ++j) {
01511       // compute delta to add to code to compute symbol id
01512       h->delta[j] = k - code;
01513       if (h->size[k] == j) {
01514          while (h->size[k] == j)
01515             h->code[k++] = (stbi__uint16) (code++);
01516          if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
01517       }
01518       // compute largest code + 1 for this size, preshifted as needed later
01519       h->maxcode[j] = code << (16-j);
01520       code <<= 1;
01521    }
01522    h->maxcode[j] = 0xffffffff;
01523 
01524    // build non-spec acceleration table; 255 is flag for not-accelerated
01525    memset(h->fast, 255, 1 << FAST_BITS);
01526    for (i=0; i < k; ++i) {
01527       int s = h->size[i];
01528       if (s <= FAST_BITS) {
01529          int c = h->code[i] << (FAST_BITS-s);
01530          int m = 1 << (FAST_BITS-s);
01531          for (j=0; j < m; ++j) {
01532             h->fast[c+j] = (stbi_uc) i;
01533          }
01534       }
01535    }
01536    return 1;
01537 }
01538 
01539 // build a table that decodes both magnitude and value of small ACs in
01540 // one go.
01541 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
01542 {
01543    int i;
01544    for (i=0; i < (1 << FAST_BITS); ++i) {
01545       stbi_uc fast = h->fast[i];
01546       fast_ac[i] = 0;
01547       if (fast < 255) {
01548          int rs = h->values[fast];
01549          int run = (rs >> 4) & 15;
01550          int magbits = rs & 15;
01551          int len = h->size[fast];
01552 
01553          if (magbits && len + magbits <= FAST_BITS) {
01554             // magnitude code followed by receive_extend code
01555             int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
01556             int m = 1 << (magbits - 1);
01557             if (k < m) k += (-1 << magbits) + 1;
01558             // if the result is small enough, we can fit it in fast_ac table
01559             if (k >= -128 && k <= 127)
01560                fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
01561          }
01562       }
01563    }
01564 }
01565 
01566 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
01567 {
01568    do {
01569       int b = j->nomore ? 0 : stbi__get8(j->s);
01570       if (b == 0xff) {
01571          int c = stbi__get8(j->s);
01572          if (c != 0) {
01573             j->marker = (unsigned char) c;
01574             j->nomore = 1;
01575             return;
01576          }
01577       }
01578       j->code_buffer |= b << (24 - j->code_bits);
01579       j->code_bits += 8;
01580    } while (j->code_bits <= 24);
01581 }
01582 
01583 // (1 << n) - 1
01584 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
01585 
01586 // decode a jpeg huffman value from the bitstream
01587 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
01588 {
01589    unsigned int temp;
01590    int c,k;
01591 
01592    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
01593 
01594    // look at the top FAST_BITS and determine what symbol ID it is,
01595    // if the code is <= FAST_BITS
01596    c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
01597    k = h->fast[c];
01598    if (k < 255) {
01599       int s = h->size[k];
01600       if (s > j->code_bits)
01601          return -1;
01602       j->code_buffer <<= s;
01603       j->code_bits -= s;
01604       return h->values[k];
01605    }
01606 
01607    // naive test is to shift the code_buffer down so k bits are
01608    // valid, then test against maxcode. To speed this up, we've
01609    // preshifted maxcode left so that it has (16-k) 0s at the
01610    // end; in other words, regardless of the number of bits, it
01611    // wants to be compared against something shifted to have 16;
01612    // that way we don't need to shift inside the loop.
01613    temp = j->code_buffer >> 16;
01614    for (k=FAST_BITS+1 ; ; ++k)
01615       if (temp < h->maxcode[k])
01616          break;
01617    if (k == 17) {
01618       // error! code not found
01619       j->code_bits -= 16;
01620       return -1;
01621    }
01622 
01623    if (k > j->code_bits)
01624       return -1;
01625 
01626    // convert the huffman code to the symbol id
01627    c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
01628    STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
01629 
01630    // convert the id to a symbol
01631    j->code_bits -= k;
01632    j->code_buffer <<= k;
01633    return h->values[c];
01634 }
01635 
01636 // bias[n] = (-1<<n) + 1
01637 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
01638 
01639 // combined JPEG 'receive' and JPEG 'extend', since baseline
01640 // always extends everything it receives.
01641 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
01642 {
01643    unsigned int k;
01644    int sgn;
01645    if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
01646 
01647    sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
01648    k = stbi_lrot(j->code_buffer, n);
01649    STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
01650    j->code_buffer = k & ~stbi__bmask[n];
01651    k &= stbi__bmask[n];
01652    j->code_bits -= n;
01653    return k + (stbi__jbias[n] & ~sgn);
01654 }
01655 
01656 // get some unsigned bits
01657 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
01658 {
01659    unsigned int k;
01660    if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
01661    k = stbi_lrot(j->code_buffer, n);
01662    j->code_buffer = k & ~stbi__bmask[n];
01663    k &= stbi__bmask[n];
01664    j->code_bits -= n;
01665    return k;
01666 }
01667 
01668 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
01669 {
01670    unsigned int k;
01671    if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
01672    k = j->code_buffer;
01673    j->code_buffer <<= 1;
01674    --j->code_bits;
01675    return k & 0x80000000;
01676 }
01677 
01678 // given a value that's at position X in the zigzag stream,
01679 // where does it appear in the 8x8 matrix coded as row-major?
01680 static stbi_uc stbi__jpeg_dezigzag[64+15] =
01681 {
01682     0,  1,  8, 16,  9,  2,  3, 10,
01683    17, 24, 32, 25, 18, 11,  4,  5,
01684    12, 19, 26, 33, 40, 48, 41, 34,
01685    27, 20, 13,  6,  7, 14, 21, 28,
01686    35, 42, 49, 56, 57, 50, 43, 36,
01687    29, 22, 15, 23, 30, 37, 44, 51,
01688    58, 59, 52, 45, 38, 31, 39, 46,
01689    53, 60, 61, 54, 47, 55, 62, 63,
01690    // let corrupt input sample past end
01691    63, 63, 63, 63, 63, 63, 63, 63,
01692    63, 63, 63, 63, 63, 63, 63
01693 };
01694 
01695 // decode one 64-entry block--
01696 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
01697 {
01698    int diff,dc,k;
01699    int t;
01700 
01701    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
01702    t = stbi__jpeg_huff_decode(j, hdc);
01703    if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
01704 
01705    // 0 all the ac values now so we can do it 32-bits at a time
01706    memset(data,0,64*sizeof(data[0]));
01707 
01708    diff = t ? stbi__extend_receive(j, t) : 0;
01709    dc = j->img_comp[b].dc_pred + diff;
01710    j->img_comp[b].dc_pred = dc;
01711    data[0] = (short) (dc * dequant[0]);
01712 
01713    // decode AC components, see JPEG spec
01714    k = 1;
01715    do {
01716       unsigned int zig;
01717       int c,r,s;
01718       if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
01719       c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
01720       r = fac[c];
01721       if (r) { // fast-AC path
01722          k += (r >> 4) & 15; // run
01723          s = r & 15; // combined length
01724          j->code_buffer <<= s;
01725          j->code_bits -= s;
01726          // decode into unzigzag'd location
01727          zig = stbi__jpeg_dezigzag[k++];
01728          data[zig] = (short) ((r >> 8) * dequant[zig]);
01729       } else {
01730          int rs = stbi__jpeg_huff_decode(j, hac);
01731          if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
01732          s = rs & 15;
01733          r = rs >> 4;
01734          if (s == 0) {
01735             if (rs != 0xf0) break; // end block
01736             k += 16;
01737          } else {
01738             k += r;
01739             // decode into unzigzag'd location
01740             zig = stbi__jpeg_dezigzag[k++];
01741             data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
01742          }
01743       }
01744    } while (k < 64);
01745    return 1;
01746 }
01747 
01748 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
01749 {
01750    int diff,dc;
01751    int t;
01752    if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
01753 
01754    if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
01755 
01756    if (j->succ_high == 0) {
01757       // first scan for DC coefficient, must be first
01758       memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
01759       t = stbi__jpeg_huff_decode(j, hdc);
01760       diff = t ? stbi__extend_receive(j, t) : 0;
01761 
01762       dc = j->img_comp[b].dc_pred + diff;
01763       j->img_comp[b].dc_pred = dc;
01764       data[0] = (short) (dc << j->succ_low);
01765    } else {
01766       // refinement scan for DC coefficient
01767       if (stbi__jpeg_get_bit(j))
01768          data[0] += (short) (1 << j->succ_low);
01769    }
01770    return 1;
01771 }
01772 
01773 // @OPTIMIZE: store non-zigzagged during the decode passes,
01774 // and only de-zigzag when dequantizing
01775 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
01776 {
01777    int k;
01778    if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
01779 
01780    if (j->succ_high == 0) {
01781       int shift = j->succ_low;
01782 
01783       if (j->eob_run) {
01784          --j->eob_run;
01785          return 1;
01786       }
01787 
01788       k = j->spec_start;
01789       do {
01790          unsigned int zig;
01791          int c,r,s;
01792          if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
01793          c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
01794          r = fac[c];
01795          if (r) { // fast-AC path
01796             k += (r >> 4) & 15; // run
01797             s = r & 15; // combined length
01798             j->code_buffer <<= s;
01799             j->code_bits -= s;
01800             zig = stbi__jpeg_dezigzag[k++];
01801             data[zig] = (short) ((r >> 8) << shift);
01802          } else {
01803             int rs = stbi__jpeg_huff_decode(j, hac);
01804             if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
01805             s = rs & 15;
01806             r = rs >> 4;
01807             if (s == 0) {
01808                if (r < 15) {
01809                   j->eob_run = (1 << r);
01810                   if (r)
01811                      j->eob_run += stbi__jpeg_get_bits(j, r);
01812                   --j->eob_run;
01813                   break;
01814                }
01815                k += 16;
01816             } else {
01817                k += r;
01818                zig = stbi__jpeg_dezigzag[k++];
01819                data[zig] = (short) (stbi__extend_receive(j,s) << shift);
01820             }
01821          }
01822       } while (k <= j->spec_end);
01823    } else {
01824       // refinement scan for these AC coefficients
01825 
01826       short bit = (short) (1 << j->succ_low);
01827 
01828       if (j->eob_run) {
01829          --j->eob_run;
01830          for (k = j->spec_start; k <= j->spec_end; ++k) {
01831             short *p = &data[stbi__jpeg_dezigzag[k]];
01832             if (*p != 0)
01833                if (stbi__jpeg_get_bit(j))
01834                   if ((*p & bit)==0) {
01835                      if (*p > 0)
01836                         *p += bit;
01837                      else
01838                         *p -= bit;
01839                   }
01840          }
01841       } else {
01842          k = j->spec_start;
01843          do {
01844             int r,s;
01845             int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
01846             if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
01847             s = rs & 15;
01848             r = rs >> 4;
01849             if (s == 0) {
01850                if (r < 15) {
01851                   j->eob_run = (1 << r) - 1;
01852                   if (r)
01853                      j->eob_run += stbi__jpeg_get_bits(j, r);
01854                   r = 64; // force end of block
01855                } else {
01856                   // r=15 s=0 should write 16 0s, so we just do
01857                   // a run of 15 0s and then write s (which is 0),
01858                   // so we don't have to do anything special here
01859                }
01860             } else {
01861                if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
01862                // sign bit
01863                if (stbi__jpeg_get_bit(j))
01864                   s = bit;
01865                else
01866                   s = -bit;
01867             }
01868 
01869             // advance by r
01870             while (k <= j->spec_end) {
01871                short *p = &data[stbi__jpeg_dezigzag[k++]];
01872                if (*p != 0) {
01873                   if (stbi__jpeg_get_bit(j))
01874                      if ((*p & bit)==0) {
01875                         if (*p > 0)
01876                            *p += bit;
01877                         else
01878                            *p -= bit;
01879                      }
01880                } else {
01881                   if (r == 0) {
01882                      *p = (short) s;
01883                      break;
01884                   }
01885                   --r;
01886                }
01887             }
01888          } while (k <= j->spec_end);
01889       }
01890    }
01891    return 1;
01892 }
01893 
01894 // take a -128..127 value and stbi__clamp it and convert to 0..255
01895 stbi_inline static stbi_uc stbi__clamp(int x)
01896 {
01897    // trick to use a single test to catch both cases
01898    if ((unsigned int) x > 255) {
01899       if (x < 0) return 0;
01900       if (x > 255) return 255;
01901    }
01902    return (stbi_uc) x;
01903 }
01904 
01905 #define stbi__f2f(x)  ((int) (((x) * 4096 + 0.5)))
01906 #define stbi__fsh(x)  ((x) << 12)
01907 
01908 // derived from jidctint -- DCT_ISLOW
01909 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
01910    int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
01911    p2 = s2;                                    \
01912    p3 = s6;                                    \
01913    p1 = (p2+p3) * stbi__f2f(0.5411961f);       \
01914    t2 = p1 + p3*stbi__f2f(-1.847759065f);      \
01915    t3 = p1 + p2*stbi__f2f( 0.765366865f);      \
01916    p2 = s0;                                    \
01917    p3 = s4;                                    \
01918    t0 = stbi__fsh(p2+p3);                      \
01919    t1 = stbi__fsh(p2-p3);                      \
01920    x0 = t0+t3;                                 \
01921    x3 = t0-t3;                                 \
01922    x1 = t1+t2;                                 \
01923    x2 = t1-t2;                                 \
01924    t0 = s7;                                    \
01925    t1 = s5;                                    \
01926    t2 = s3;                                    \
01927    t3 = s1;                                    \
01928    p3 = t0+t2;                                 \
01929    p4 = t1+t3;                                 \
01930    p1 = t0+t3;                                 \
01931    p2 = t1+t2;                                 \
01932    p5 = (p3+p4)*stbi__f2f( 1.175875602f);      \
01933    t0 = t0*stbi__f2f( 0.298631336f);           \
01934    t1 = t1*stbi__f2f( 2.053119869f);           \
01935    t2 = t2*stbi__f2f( 3.072711026f);           \
01936    t3 = t3*stbi__f2f( 1.501321110f);           \
01937    p1 = p5 + p1*stbi__f2f(-0.899976223f);      \
01938    p2 = p5 + p2*stbi__f2f(-2.562915447f);      \
01939    p3 = p3*stbi__f2f(-1.961570560f);           \
01940    p4 = p4*stbi__f2f(-0.390180644f);           \
01941    t3 += p1+p4;                                \
01942    t2 += p2+p3;                                \
01943    t1 += p2+p4;                                \
01944    t0 += p1+p3;
01945 
01946 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
01947 {
01948    int i,val[64],*v=val;
01949    stbi_uc *o;
01950    short *d = data;
01951 
01952    // columns
01953    for (i=0; i < 8; ++i,++d, ++v) {
01954       // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
01955       if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
01956            && d[40]==0 && d[48]==0 && d[56]==0) {
01957          //    no shortcut                 0     seconds
01958          //    (1|2|3|4|5|6|7)==0          0     seconds
01959          //    all separate               -0.047 seconds
01960          //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
01961          int dcterm = d[0] << 2;
01962          v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
01963       } else {
01964          STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
01965          // constants scaled things up by 1<<12; let's bring them back
01966          // down, but keep 2 extra bits of precision
01967          x0 += 512; x1 += 512; x2 += 512; x3 += 512;
01968          v[ 0] = (x0+t3) >> 10;
01969          v[56] = (x0-t3) >> 10;
01970          v[ 8] = (x1+t2) >> 10;
01971          v[48] = (x1-t2) >> 10;
01972          v[16] = (x2+t1) >> 10;
01973          v[40] = (x2-t1) >> 10;
01974          v[24] = (x3+t0) >> 10;
01975          v[32] = (x3-t0) >> 10;
01976       }
01977    }
01978 
01979    for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
01980       // no fast case since the first 1D IDCT spread components out
01981       STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
01982       // constants scaled things up by 1<<12, plus we had 1<<2 from first
01983       // loop, plus horizontal and vertical each scale by sqrt(8) so together
01984       // we've got an extra 1<<3, so 1<<17 total we need to remove.
01985       // so we want to round that, which means adding 0.5 * 1<<17,
01986       // aka 65536. Also, we'll end up with -128 to 127 that we want
01987       // to encode as 0..255 by adding 128, so we'll add that before the shift
01988       x0 += 65536 + (128<<17);
01989       x1 += 65536 + (128<<17);
01990       x2 += 65536 + (128<<17);
01991       x3 += 65536 + (128<<17);
01992       // tried computing the shifts into temps, or'ing the temps to see
01993       // if any were out of range, but that was slower
01994       o[0] = stbi__clamp((x0+t3) >> 17);
01995       o[7] = stbi__clamp((x0-t3) >> 17);
01996       o[1] = stbi__clamp((x1+t2) >> 17);
01997       o[6] = stbi__clamp((x1-t2) >> 17);
01998       o[2] = stbi__clamp((x2+t1) >> 17);
01999       o[5] = stbi__clamp((x2-t1) >> 17);
02000       o[3] = stbi__clamp((x3+t0) >> 17);
02001       o[4] = stbi__clamp((x3-t0) >> 17);
02002    }
02003 }
02004 
02005 #ifdef STBI_SSE2
02006 // sse2 integer IDCT. not the fastest possible implementation but it
02007 // produces bit-identical results to the generic C version so it's
02008 // fully "transparent".
02009 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
02010 {
02011    // This is constructed to match our regular (generic) integer IDCT exactly.
02012    __m128i row0, row1, row2, row3, row4, row5, row6, row7;
02013    __m128i tmp;
02014 
02015    // dot product constant: even elems=x, odd elems=y
02016    #define dct_const(x,y)  _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
02017 
02018    // out(0) = c0[even]*x + c0[odd]*y   (c0, x, y 16-bit, out 32-bit)
02019    // out(1) = c1[even]*x + c1[odd]*y
02020    #define dct_rot(out0,out1, x,y,c0,c1) \
02021       __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
02022       __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
02023       __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
02024       __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
02025       __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
02026       __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
02027 
02028    // out = in << 12  (in 16-bit, out 32-bit)
02029    #define dct_widen(out, in) \
02030       __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
02031       __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
02032 
02033    // wide add
02034    #define dct_wadd(out, a, b) \
02035       __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
02036       __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
02037 
02038    // wide sub
02039    #define dct_wsub(out, a, b) \
02040       __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
02041       __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
02042 
02043    // butterfly a/b, add bias, then shift by "s" and pack
02044    #define dct_bfly32o(out0, out1, a,b,bias,s) \
02045       { \
02046          __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
02047          __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
02048          dct_wadd(sum, abiased, b); \
02049          dct_wsub(dif, abiased, b); \
02050          out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
02051          out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
02052       }
02053 
02054    // 8-bit interleave step (for transposes)
02055    #define dct_interleave8(a, b) \
02056       tmp = a; \
02057       a = _mm_unpacklo_epi8(a, b); \
02058       b = _mm_unpackhi_epi8(tmp, b)
02059 
02060    // 16-bit interleave step (for transposes)
02061    #define dct_interleave16(a, b) \
02062       tmp = a; \
02063       a = _mm_unpacklo_epi16(a, b); \
02064       b = _mm_unpackhi_epi16(tmp, b)
02065 
02066    #define dct_pass(bias,shift) \
02067       { \
02068          /* even part */ \
02069          dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
02070          __m128i sum04 = _mm_add_epi16(row0, row4); \
02071          __m128i dif04 = _mm_sub_epi16(row0, row4); \
02072          dct_widen(t0e, sum04); \
02073          dct_widen(t1e, dif04); \
02074          dct_wadd(x0, t0e, t3e); \
02075          dct_wsub(x3, t0e, t3e); \
02076          dct_wadd(x1, t1e, t2e); \
02077          dct_wsub(x2, t1e, t2e); \
02078          /* odd part */ \
02079          dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
02080          dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
02081          __m128i sum17 = _mm_add_epi16(row1, row7); \
02082          __m128i sum35 = _mm_add_epi16(row3, row5); \
02083          dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
02084          dct_wadd(x4, y0o, y4o); \
02085          dct_wadd(x5, y1o, y5o); \
02086          dct_wadd(x6, y2o, y5o); \
02087          dct_wadd(x7, y3o, y4o); \
02088          dct_bfly32o(row0,row7, x0,x7,bias,shift); \
02089          dct_bfly32o(row1,row6, x1,x6,bias,shift); \
02090          dct_bfly32o(row2,row5, x2,x5,bias,shift); \
02091          dct_bfly32o(row3,row4, x3,x4,bias,shift); \
02092       }
02093 
02094    __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
02095    __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
02096    __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
02097    __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
02098    __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
02099    __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
02100    __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
02101    __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
02102 
02103    // rounding biases in column/row passes, see stbi__idct_block for explanation.
02104    __m128i bias_0 = _mm_set1_epi32(512);
02105    __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
02106 
02107    // load
02108    row0 = _mm_load_si128((const __m128i *) (data + 0*8));
02109    row1 = _mm_load_si128((const __m128i *) (data + 1*8));
02110    row2 = _mm_load_si128((const __m128i *) (data + 2*8));
02111    row3 = _mm_load_si128((const __m128i *) (data + 3*8));
02112    row4 = _mm_load_si128((const __m128i *) (data + 4*8));
02113    row5 = _mm_load_si128((const __m128i *) (data + 5*8));
02114    row6 = _mm_load_si128((const __m128i *) (data + 6*8));
02115    row7 = _mm_load_si128((const __m128i *) (data + 7*8));
02116 
02117    // column pass
02118    dct_pass(bias_0, 10);
02119 
02120    {
02121       // 16bit 8x8 transpose pass 1
02122       dct_interleave16(row0, row4);
02123       dct_interleave16(row1, row5);
02124       dct_interleave16(row2, row6);
02125       dct_interleave16(row3, row7);
02126 
02127       // transpose pass 2
02128       dct_interleave16(row0, row2);
02129       dct_interleave16(row1, row3);
02130       dct_interleave16(row4, row6);
02131       dct_interleave16(row5, row7);
02132 
02133       // transpose pass 3
02134       dct_interleave16(row0, row1);
02135       dct_interleave16(row2, row3);
02136       dct_interleave16(row4, row5);
02137       dct_interleave16(row6, row7);
02138    }
02139 
02140    // row pass
02141    dct_pass(bias_1, 17);
02142 
02143    {
02144       // pack
02145       __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
02146       __m128i p1 = _mm_packus_epi16(row2, row3);
02147       __m128i p2 = _mm_packus_epi16(row4, row5);
02148       __m128i p3 = _mm_packus_epi16(row6, row7);
02149 
02150       // 8bit 8x8 transpose pass 1
02151       dct_interleave8(p0, p2); // a0e0a1e1...
02152       dct_interleave8(p1, p3); // c0g0c1g1...
02153 
02154       // transpose pass 2
02155       dct_interleave8(p0, p1); // a0c0e0g0...
02156       dct_interleave8(p2, p3); // b0d0f0h0...
02157 
02158       // transpose pass 3
02159       dct_interleave8(p0, p2); // a0b0c0d0...
02160       dct_interleave8(p1, p3); // a4b4c4d4...
02161 
02162       // store
02163       _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
02164       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
02165       _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
02166       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
02167       _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
02168       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
02169       _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
02170       _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
02171    }
02172 
02173 #undef dct_const
02174 #undef dct_rot
02175 #undef dct_widen
02176 #undef dct_wadd
02177 #undef dct_wsub
02178 #undef dct_bfly32o
02179 #undef dct_interleave8
02180 #undef dct_interleave16
02181 #undef dct_pass
02182 }
02183 
02184 #endif // STBI_SSE2
02185 
02186 #ifdef STBI_NEON
02187 
02188 // NEON integer IDCT. should produce bit-identical
02189 // results to the generic C version.
02190 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
02191 {
02192    int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
02193 
02194    int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
02195    int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
02196    int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
02197    int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
02198    int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
02199    int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
02200    int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
02201    int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
02202    int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
02203    int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
02204    int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
02205    int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
02206 
02207 #define dct_long_mul(out, inq, coeff) \
02208    int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
02209    int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
02210 
02211 #define dct_long_mac(out, acc, inq, coeff) \
02212    int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
02213    int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
02214 
02215 #define dct_widen(out, inq) \
02216    int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
02217    int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
02218 
02219 // wide add
02220 #define dct_wadd(out, a, b) \
02221    int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
02222    int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
02223 
02224 // wide sub
02225 #define dct_wsub(out, a, b) \
02226    int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
02227    int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
02228 
02229 // butterfly a/b, then shift using "shiftop" by "s" and pack
02230 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
02231    { \
02232       dct_wadd(sum, a, b); \
02233       dct_wsub(dif, a, b); \
02234       out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
02235       out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
02236    }
02237 
02238 #define dct_pass(shiftop, shift) \
02239    { \
02240       /* even part */ \
02241       int16x8_t sum26 = vaddq_s16(row2, row6); \
02242       dct_long_mul(p1e, sum26, rot0_0); \
02243       dct_long_mac(t2e, p1e, row6, rot0_1); \
02244       dct_long_mac(t3e, p1e, row2, rot0_2); \
02245       int16x8_t sum04 = vaddq_s16(row0, row4); \
02246       int16x8_t dif04 = vsubq_s16(row0, row4); \
02247       dct_widen(t0e, sum04); \
02248       dct_widen(t1e, dif04); \
02249       dct_wadd(x0, t0e, t3e); \
02250       dct_wsub(x3, t0e, t3e); \
02251       dct_wadd(x1, t1e, t2e); \
02252       dct_wsub(x2, t1e, t2e); \
02253       /* odd part */ \
02254       int16x8_t sum15 = vaddq_s16(row1, row5); \
02255       int16x8_t sum17 = vaddq_s16(row1, row7); \
02256       int16x8_t sum35 = vaddq_s16(row3, row5); \
02257       int16x8_t sum37 = vaddq_s16(row3, row7); \
02258       int16x8_t sumodd = vaddq_s16(sum17, sum35); \
02259       dct_long_mul(p5o, sumodd, rot1_0); \
02260       dct_long_mac(p1o, p5o, sum17, rot1_1); \
02261       dct_long_mac(p2o, p5o, sum35, rot1_2); \
02262       dct_long_mul(p3o, sum37, rot2_0); \
02263       dct_long_mul(p4o, sum15, rot2_1); \
02264       dct_wadd(sump13o, p1o, p3o); \
02265       dct_wadd(sump24o, p2o, p4o); \
02266       dct_wadd(sump23o, p2o, p3o); \
02267       dct_wadd(sump14o, p1o, p4o); \
02268       dct_long_mac(x4, sump13o, row7, rot3_0); \
02269       dct_long_mac(x5, sump24o, row5, rot3_1); \
02270       dct_long_mac(x6, sump23o, row3, rot3_2); \
02271       dct_long_mac(x7, sump14o, row1, rot3_3); \
02272       dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
02273       dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
02274       dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
02275       dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
02276    }
02277 
02278    // load
02279    row0 = vld1q_s16(data + 0*8);
02280    row1 = vld1q_s16(data + 1*8);
02281    row2 = vld1q_s16(data + 2*8);
02282    row3 = vld1q_s16(data + 3*8);
02283    row4 = vld1q_s16(data + 4*8);
02284    row5 = vld1q_s16(data + 5*8);
02285    row6 = vld1q_s16(data + 6*8);
02286    row7 = vld1q_s16(data + 7*8);
02287 
02288    // add DC bias
02289    row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
02290 
02291    // column pass
02292    dct_pass(vrshrn_n_s32, 10);
02293 
02294    // 16bit 8x8 transpose
02295    {
02296 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
02297 // whether compilers actually get this is another story, sadly.
02298 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
02299 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
02300 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
02301 
02302       // pass 1
02303       dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
02304       dct_trn16(row2, row3);
02305       dct_trn16(row4, row5);
02306       dct_trn16(row6, row7);
02307 
02308       // pass 2
02309       dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
02310       dct_trn32(row1, row3);
02311       dct_trn32(row4, row6);
02312       dct_trn32(row5, row7);
02313 
02314       // pass 3
02315       dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
02316       dct_trn64(row1, row5);
02317       dct_trn64(row2, row6);
02318       dct_trn64(row3, row7);
02319 
02320 #undef dct_trn16
02321 #undef dct_trn32
02322 #undef dct_trn64
02323    }
02324 
02325    // row pass
02326    // vrshrn_n_s32 only supports shifts up to 16, we need
02327    // 17. so do a non-rounding shift of 16 first then follow
02328    // up with a rounding shift by 1.
02329    dct_pass(vshrn_n_s32, 16);
02330 
02331    {
02332       // pack and round
02333       uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
02334       uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
02335       uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
02336       uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
02337       uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
02338       uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
02339       uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
02340       uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
02341 
02342       // again, these can translate into one instruction, but often don't.
02343 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
02344 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
02345 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
02346 
02347       // sadly can't use interleaved stores here since we only write
02348       // 8 bytes to each scan line!
02349 
02350       // 8x8 8-bit transpose pass 1
02351       dct_trn8_8(p0, p1);
02352       dct_trn8_8(p2, p3);
02353       dct_trn8_8(p4, p5);
02354       dct_trn8_8(p6, p7);
02355 
02356       // pass 2
02357       dct_trn8_16(p0, p2);
02358       dct_trn8_16(p1, p3);
02359       dct_trn8_16(p4, p6);
02360       dct_trn8_16(p5, p7);
02361 
02362       // pass 3
02363       dct_trn8_32(p0, p4);
02364       dct_trn8_32(p1, p5);
02365       dct_trn8_32(p2, p6);
02366       dct_trn8_32(p3, p7);
02367 
02368       // store
02369       vst1_u8(out, p0); out += out_stride;
02370       vst1_u8(out, p1); out += out_stride;
02371       vst1_u8(out, p2); out += out_stride;
02372       vst1_u8(out, p3); out += out_stride;
02373       vst1_u8(out, p4); out += out_stride;
02374       vst1_u8(out, p5); out += out_stride;
02375       vst1_u8(out, p6); out += out_stride;
02376       vst1_u8(out, p7);
02377 
02378 #undef dct_trn8_8
02379 #undef dct_trn8_16
02380 #undef dct_trn8_32
02381    }
02382 
02383 #undef dct_long_mul
02384 #undef dct_long_mac
02385 #undef dct_widen
02386 #undef dct_wadd
02387 #undef dct_wsub
02388 #undef dct_bfly32o
02389 #undef dct_pass
02390 }
02391 
02392 #endif // STBI_NEON
02393 
02394 #define STBI__MARKER_none  0xff
02395 // if there's a pending marker from the entropy stream, return that
02396 // otherwise, fetch from the stream and get a marker. if there's no
02397 // marker, return 0xff, which is never a valid marker value
02398 static stbi_uc stbi__get_marker(stbi__jpeg *j)
02399 {
02400    stbi_uc x;
02401    if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
02402    x = stbi__get8(j->s);
02403    if (x != 0xff) return STBI__MARKER_none;
02404    while (x == 0xff)
02405       x = stbi__get8(j->s);
02406    return x;
02407 }
02408 
02409 // in each scan, we'll have scan_n components, and the order
02410 // of the components is specified by order[]
02411 #define STBI__RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)
02412 
02413 // after a restart interval, stbi__jpeg_reset the entropy decoder and
02414 // the dc prediction
02415 static void stbi__jpeg_reset(stbi__jpeg *j)
02416 {
02417    j->code_bits = 0;
02418    j->code_buffer = 0;
02419    j->nomore = 0;
02420    j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
02421    j->marker = STBI__MARKER_none;
02422    j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
02423    j->eob_run = 0;
02424    // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
02425    // since we don't even allow 1<<30 pixels
02426 }
02427 
02428 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
02429 {
02430    stbi__jpeg_reset(z);
02431    if (!z->progressive) {
02432       if (z->scan_n == 1) {
02433          int i,j;
02434          STBI_SIMD_ALIGN(short, data[64]);
02435          int n = z->order[0];
02436          // non-interleaved data, we just need to process one block at a time,
02437          // in trivial scanline order
02438          // number of blocks to do just depends on how many actual "pixels" this
02439          // component has, independent of interleaved MCU blocking and such
02440          int w = (z->img_comp[n].x+7) >> 3;
02441          int h = (z->img_comp[n].y+7) >> 3;
02442          for (j=0; j < h; ++j) {
02443             for (i=0; i < w; ++i) {
02444                int ha = z->img_comp[n].ha;
02445                if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
02446                z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
02447                // every data block is an MCU, so countdown the restart interval
02448                if (--z->todo <= 0) {
02449                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
02450                   // if it's NOT a restart, then just bail, so we get corrupt data
02451                   // rather than no data
02452                   if (!STBI__RESTART(z->marker)) return 1;
02453                   stbi__jpeg_reset(z);
02454                }
02455             }
02456          }
02457          return 1;
02458       } else { // interleaved
02459          int i,j,k,x,y;
02460          STBI_SIMD_ALIGN(short, data[64]);
02461          for (j=0; j < z->img_mcu_y; ++j) {
02462             for (i=0; i < z->img_mcu_x; ++i) {
02463                // scan an interleaved mcu... process scan_n components in order
02464                for (k=0; k < z->scan_n; ++k) {
02465                   int n = z->order[k];
02466                   // scan out an mcu's worth of this component; that's just determined
02467                   // by the basic H and V specified for the component
02468                   for (y=0; y < z->img_comp[n].v; ++y) {
02469                      for (x=0; x < z->img_comp[n].h; ++x) {
02470                         int x2 = (i*z->img_comp[n].h + x)*8;
02471                         int y2 = (j*z->img_comp[n].v + y)*8;
02472                         int ha = z->img_comp[n].ha;
02473                         if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
02474                         z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
02475                      }
02476                   }
02477                }
02478                // after all interleaved components, that's an interleaved MCU,
02479                // so now count down the restart interval
02480                if (--z->todo <= 0) {
02481                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
02482                   if (!STBI__RESTART(z->marker)) return 1;
02483                   stbi__jpeg_reset(z);
02484                }
02485             }
02486          }
02487          return 1;
02488       }
02489    } else {
02490       if (z->scan_n == 1) {
02491          int i,j;
02492          int n = z->order[0];
02493          // non-interleaved data, we just need to process one block at a time,
02494          // in trivial scanline order
02495          // number of blocks to do just depends on how many actual "pixels" this
02496          // component has, independent of interleaved MCU blocking and such
02497          int w = (z->img_comp[n].x+7) >> 3;
02498          int h = (z->img_comp[n].y+7) >> 3;
02499          for (j=0; j < h; ++j) {
02500             for (i=0; i < w; ++i) {
02501                short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
02502                if (z->spec_start == 0) {
02503                   if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
02504                      return 0;
02505                } else {
02506                   int ha = z->img_comp[n].ha;
02507                   if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
02508                      return 0;
02509                }
02510                // every data block is an MCU, so countdown the restart interval
02511                if (--z->todo <= 0) {
02512                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
02513                   if (!STBI__RESTART(z->marker)) return 1;
02514                   stbi__jpeg_reset(z);
02515                }
02516             }
02517          }
02518          return 1;
02519       } else { // interleaved
02520          int i,j,k,x,y;
02521          for (j=0; j < z->img_mcu_y; ++j) {
02522             for (i=0; i < z->img_mcu_x; ++i) {
02523                // scan an interleaved mcu... process scan_n components in order
02524                for (k=0; k < z->scan_n; ++k) {
02525                   int n = z->order[k];
02526                   // scan out an mcu's worth of this component; that's just determined
02527                   // by the basic H and V specified for the component
02528                   for (y=0; y < z->img_comp[n].v; ++y) {
02529                      for (x=0; x < z->img_comp[n].h; ++x) {
02530                         int x2 = (i*z->img_comp[n].h + x);
02531                         int y2 = (j*z->img_comp[n].v + y);
02532                         short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
02533                         if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
02534                            return 0;
02535                      }
02536                   }
02537                }
02538                // after all interleaved components, that's an interleaved MCU,
02539                // so now count down the restart interval
02540                if (--z->todo <= 0) {
02541                   if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
02542                   if (!STBI__RESTART(z->marker)) return 1;
02543                   stbi__jpeg_reset(z);
02544                }
02545             }
02546          }
02547          return 1;
02548       }
02549    }
02550 }
02551 
02552 static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
02553 {
02554    int i;
02555    for (i=0; i < 64; ++i)
02556       data[i] *= dequant[i];
02557 }
02558 
02559 static void stbi__jpeg_finish(stbi__jpeg *z)
02560 {
02561    if (z->progressive) {
02562       // dequantize and idct the data
02563       int i,j,n;
02564       for (n=0; n < z->s->img_n; ++n) {
02565          int w = (z->img_comp[n].x+7) >> 3;
02566          int h = (z->img_comp[n].y+7) >> 3;
02567          for (j=0; j < h; ++j) {
02568             for (i=0; i < w; ++i) {
02569                short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
02570                stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
02571                z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
02572             }
02573          }
02574       }
02575    }
02576 }
02577 
02578 static int stbi__process_marker(stbi__jpeg *z, int m)
02579 {
02580    int L;
02581    switch (m) {
02582       case STBI__MARKER_none: // no marker found
02583          return stbi__err("expected marker","Corrupt JPEG");
02584 
02585       case 0xDD: // DRI - specify restart interval
02586          if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
02587          z->restart_interval = stbi__get16be(z->s);
02588          return 1;
02589 
02590       case 0xDB: // DQT - define quantization table
02591          L = stbi__get16be(z->s)-2;
02592          while (L > 0) {
02593             int q = stbi__get8(z->s);
02594             int p = q >> 4;
02595             int t = q & 15,i;
02596             if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
02597             if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
02598             for (i=0; i < 64; ++i)
02599                z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
02600             L -= 65;
02601          }
02602          return L==0;
02603 
02604       case 0xC4: // DHT - define huffman table
02605          L = stbi__get16be(z->s)-2;
02606          while (L > 0) {
02607             stbi_uc *v;
02608             int sizes[16],i,n=0;
02609             int q = stbi__get8(z->s);
02610             int tc = q >> 4;
02611             int th = q & 15;
02612             if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
02613             for (i=0; i < 16; ++i) {
02614                sizes[i] = stbi__get8(z->s);
02615                n += sizes[i];
02616             }
02617             L -= 17;
02618             if (tc == 0) {
02619                if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
02620                v = z->huff_dc[th].values;
02621             } else {
02622                if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
02623                v = z->huff_ac[th].values;
02624             }
02625             for (i=0; i < n; ++i)
02626                v[i] = stbi__get8(z->s);
02627             if (tc != 0)
02628                stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
02629             L -= n;
02630          }
02631          return L==0;
02632    }
02633    // check for comment block or APP blocks
02634    if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
02635       stbi__skip(z->s, stbi__get16be(z->s)-2);
02636       return 1;
02637    }
02638    return 0;
02639 }
02640 
02641 // after we see SOS
02642 static int stbi__process_scan_header(stbi__jpeg *z)
02643 {
02644    int i;
02645    int Ls = stbi__get16be(z->s);
02646    z->scan_n = stbi__get8(z->s);
02647    if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
02648    if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
02649    for (i=0; i < z->scan_n; ++i) {
02650       int id = stbi__get8(z->s), which;
02651       int q = stbi__get8(z->s);
02652       for (which = 0; which < z->s->img_n; ++which)
02653          if (z->img_comp[which].id == id)
02654             break;
02655       if (which == z->s->img_n) return 0; // no match
02656       z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
02657       z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
02658       z->order[i] = which;
02659    }
02660 
02661    {
02662       int aa;
02663       z->spec_start = stbi__get8(z->s);
02664       z->spec_end   = stbi__get8(z->s); // should be 63, but might be 0
02665       aa = stbi__get8(z->s);
02666       z->succ_high = (aa >> 4);
02667       z->succ_low  = (aa & 15);
02668       if (z->progressive) {
02669          if (z->spec_start > 63 || z->spec_end > 63  || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
02670             return stbi__err("bad SOS", "Corrupt JPEG");
02671       } else {
02672          if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
02673          if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
02674          z->spec_end = 63;
02675       }
02676    }
02677 
02678    return 1;
02679 }
02680 
02681 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
02682 {
02683    stbi__context *s = z->s;
02684    int Lf,p,i,q, h_max=1,v_max=1,c;
02685    Lf = stbi__get16be(s);         if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
02686    p  = stbi__get8(s);            if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
02687    s->img_y = stbi__get16be(s);   if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
02688    s->img_x = stbi__get16be(s);   if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
02689    c = stbi__get8(s);
02690    if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG");    // JFIF requires
02691    s->img_n = c;
02692    for (i=0; i < c; ++i) {
02693       z->img_comp[i].data = NULL;
02694       z->img_comp[i].linebuf = NULL;
02695    }
02696 
02697    if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
02698 
02699    for (i=0; i < s->img_n; ++i) {
02700       z->img_comp[i].id = stbi__get8(s);
02701       if (z->img_comp[i].id != i+1)   // JFIF requires
02702          if (z->img_comp[i].id != i)  // some version of jpegtran outputs non-JFIF-compliant files!
02703             return stbi__err("bad component ID","Corrupt JPEG");
02704       q = stbi__get8(s);
02705       z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
02706       z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
02707       z->img_comp[i].tq = stbi__get8(s);  if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
02708    }
02709 
02710    if (scan != STBI__SCAN_load) return 1;
02711 
02712    if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
02713 
02714    for (i=0; i < s->img_n; ++i) {
02715       if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
02716       if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
02717    }
02718 
02719    // compute interleaved mcu info
02720    z->img_h_max = h_max;
02721    z->img_v_max = v_max;
02722    z->img_mcu_w = h_max * 8;
02723    z->img_mcu_h = v_max * 8;
02724    z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
02725    z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
02726 
02727    for (i=0; i < s->img_n; ++i) {
02728       // number of effective pixels (e.g. for non-interleaved MCU)
02729       z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
02730       z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
02731       // to simplify generation, we'll allocate enough memory to decode
02732       // the bogus oversized data from using interleaved MCUs and their
02733       // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
02734       // discard the extra data until colorspace conversion
02735       z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
02736       z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
02737       z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
02738 
02739       if (z->img_comp[i].raw_data == NULL) {
02740          for(--i; i >= 0; --i) {
02741             STBI_FREE(z->img_comp[i].raw_data);
02742             z->img_comp[i].data = NULL;
02743          }
02744          return stbi__err("outofmem", "Out of memory");
02745       }
02746       // align blocks for idct using mmx/sse
02747       z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
02748       z->img_comp[i].linebuf = NULL;
02749       if (z->progressive) {
02750          z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
02751          z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
02752          z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
02753          z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
02754       } else {
02755          z->img_comp[i].coeff = 0;
02756          z->img_comp[i].raw_coeff = 0;
02757       }
02758    }
02759 
02760    return 1;
02761 }
02762 
02763 // use comparisons since in some cases we handle more than one case (e.g. SOF)
02764 #define stbi__DNL(x)         ((x) == 0xdc)
02765 #define stbi__SOI(x)         ((x) == 0xd8)
02766 #define stbi__EOI(x)         ((x) == 0xd9)
02767 #define stbi__SOF(x)         ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
02768 #define stbi__SOS(x)         ((x) == 0xda)
02769 
02770 #define stbi__SOF_progressive(x)   ((x) == 0xc2)
02771 
02772 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
02773 {
02774    int m;
02775    z->marker = STBI__MARKER_none; // initialize cached marker to empty
02776    m = stbi__get_marker(z);
02777    if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
02778    if (scan == STBI__SCAN_type) return 1;
02779    m = stbi__get_marker(z);
02780    while (!stbi__SOF(m)) {
02781       if (!stbi__process_marker(z,m)) return 0;
02782       m = stbi__get_marker(z);
02783       while (m == STBI__MARKER_none) {
02784          // some files have extra padding after their blocks, so ok, we'll scan
02785          if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
02786          m = stbi__get_marker(z);
02787       }
02788    }
02789    z->progressive = stbi__SOF_progressive(m);
02790    if (!stbi__process_frame_header(z, scan)) return 0;
02791    return 1;
02792 }
02793 
02794 // decode image to YCbCr format
02795 static int stbi__decode_jpeg_image(stbi__jpeg *j)
02796 {
02797    int m;
02798    for (m = 0; m < 4; m++) {
02799       j->img_comp[m].raw_data = NULL;
02800       j->img_comp[m].raw_coeff = NULL;
02801    }
02802    j->restart_interval = 0;
02803    if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
02804    m = stbi__get_marker(j);
02805    while (!stbi__EOI(m)) {
02806       if (stbi__SOS(m)) {
02807          if (!stbi__process_scan_header(j)) return 0;
02808          if (!stbi__parse_entropy_coded_data(j)) return 0;
02809          if (j->marker == STBI__MARKER_none ) {
02810             // handle 0s at the end of image data from IP Kamera 9060
02811             while (!stbi__at_eof(j->s)) {
02812                int x = stbi__get8(j->s);
02813                if (x == 255) {
02814                   j->marker = stbi__get8(j->s);
02815                   break;
02816                } else if (x != 0) {
02817                   return stbi__err("junk before marker", "Corrupt JPEG");
02818                }
02819             }
02820             // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
02821          }
02822       } else {
02823          if (!stbi__process_marker(j, m)) return 0;
02824       }
02825       m = stbi__get_marker(j);
02826    }
02827    if (j->progressive)
02828       stbi__jpeg_finish(j);
02829    return 1;
02830 }
02831 
02832 // static jfif-centered resampling (across block boundaries)
02833 
02834 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
02835                                     int w, int hs);
02836 
02837 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
02838 
02839 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
02840 {
02841    STBI_NOTUSED(out);
02842    STBI_NOTUSED(in_far);
02843    STBI_NOTUSED(w);
02844    STBI_NOTUSED(hs);
02845    return in_near;
02846 }
02847 
02848 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
02849 {
02850    // need to generate two samples vertically for every one in input
02851    int i;
02852    STBI_NOTUSED(hs);
02853    for (i=0; i < w; ++i)
02854       out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
02855    return out;
02856 }
02857 
02858 static stbi_uc*  stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
02859 {
02860    // need to generate two samples horizontally for every one in input
02861    int i;
02862    stbi_uc *input = in_near;
02863 
02864    if (w == 1) {
02865       // if only one sample, can't do any interpolation
02866       out[0] = out[1] = input[0];
02867       return out;
02868    }
02869 
02870    out[0] = input[0];
02871    out[1] = stbi__div4(input[0]*3 + input[1] + 2);
02872    for (i=1; i < w-1; ++i) {
02873       int n = 3*input[i]+2;
02874       out[i*2+0] = stbi__div4(n+input[i-1]);
02875       out[i*2+1] = stbi__div4(n+input[i+1]);
02876    }
02877    out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
02878    out[i*2+1] = input[w-1];
02879 
02880    STBI_NOTUSED(in_far);
02881    STBI_NOTUSED(hs);
02882 
02883    return out;
02884 }
02885 
02886 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
02887 
02888 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
02889 {
02890    // need to generate 2x2 samples for every one in input
02891    int i,t0,t1;
02892    if (w == 1) {
02893       out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
02894       return out;
02895    }
02896 
02897    t1 = 3*in_near[0] + in_far[0];
02898    out[0] = stbi__div4(t1+2);
02899    for (i=1; i < w; ++i) {
02900       t0 = t1;
02901       t1 = 3*in_near[i]+in_far[i];
02902       out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
02903       out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
02904    }
02905    out[w*2-1] = stbi__div4(t1+2);
02906 
02907    STBI_NOTUSED(hs);
02908 
02909    return out;
02910 }
02911 
02912 #if defined(STBI_SSE2) || defined(STBI_NEON)
02913 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
02914 {
02915    // need to generate 2x2 samples for every one in input
02916    int i=0,t0,t1;
02917 
02918    if (w == 1) {
02919       out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
02920       return out;
02921    }
02922 
02923    t1 = 3*in_near[0] + in_far[0];
02924    // process groups of 8 pixels for as long as we can.
02925    // note we can't handle the last pixel in a row in this loop
02926    // because we need to handle the filter boundary conditions.
02927    for (; i < ((w-1) & ~7); i += 8) {
02928 #if defined(STBI_SSE2)
02929       // load and perform the vertical filtering pass
02930       // this uses 3*x + y = 4*x + (y - x)
02931       __m128i zero  = _mm_setzero_si128();
02932       __m128i farb  = _mm_loadl_epi64((__m128i *) (in_far + i));
02933       __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
02934       __m128i farw  = _mm_unpacklo_epi8(farb, zero);
02935       __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
02936       __m128i diff  = _mm_sub_epi16(farw, nearw);
02937       __m128i nears = _mm_slli_epi16(nearw, 2);
02938       __m128i curr  = _mm_add_epi16(nears, diff); // current row
02939 
02940       // horizontal filter works the same based on shifted vers of current
02941       // row. "prev" is current row shifted right by 1 pixel; we need to
02942       // insert the previous pixel value (from t1).
02943       // "next" is current row shifted left by 1 pixel, with first pixel
02944       // of next block of 8 pixels added in.
02945       __m128i prv0 = _mm_slli_si128(curr, 2);
02946       __m128i nxt0 = _mm_srli_si128(curr, 2);
02947       __m128i prev = _mm_insert_epi16(prv0, t1, 0);
02948       __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
02949 
02950       // horizontal filter, polyphase implementation since it's convenient:
02951       // even pixels = 3*cur + prev = cur*4 + (prev - cur)
02952       // odd  pixels = 3*cur + next = cur*4 + (next - cur)
02953       // note the shared term.
02954       __m128i bias  = _mm_set1_epi16(8);
02955       __m128i curs = _mm_slli_epi16(curr, 2);
02956       __m128i prvd = _mm_sub_epi16(prev, curr);
02957       __m128i nxtd = _mm_sub_epi16(next, curr);
02958       __m128i curb = _mm_add_epi16(curs, bias);
02959       __m128i even = _mm_add_epi16(prvd, curb);
02960       __m128i odd  = _mm_add_epi16(nxtd, curb);
02961 
02962       // interleave even and odd pixels, then undo scaling.
02963       __m128i int0 = _mm_unpacklo_epi16(even, odd);
02964       __m128i int1 = _mm_unpackhi_epi16(even, odd);
02965       __m128i de0  = _mm_srli_epi16(int0, 4);
02966       __m128i de1  = _mm_srli_epi16(int1, 4);
02967 
02968       // pack and write output
02969       __m128i outv = _mm_packus_epi16(de0, de1);
02970       _mm_storeu_si128((__m128i *) (out + i*2), outv);
02971 #elif defined(STBI_NEON)
02972       // load and perform the vertical filtering pass
02973       // this uses 3*x + y = 4*x + (y - x)
02974       uint8x8_t farb  = vld1_u8(in_far + i);
02975       uint8x8_t nearb = vld1_u8(in_near + i);
02976       int16x8_t diff  = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
02977       int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
02978       int16x8_t curr  = vaddq_s16(nears, diff); // current row
02979 
02980       // horizontal filter works the same based on shifted vers of current
02981       // row. "prev" is current row shifted right by 1 pixel; we need to
02982       // insert the previous pixel value (from t1).
02983       // "next" is current row shifted left by 1 pixel, with first pixel
02984       // of next block of 8 pixels added in.
02985       int16x8_t prv0 = vextq_s16(curr, curr, 7);
02986       int16x8_t nxt0 = vextq_s16(curr, curr, 1);
02987       int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
02988       int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
02989 
02990       // horizontal filter, polyphase implementation since it's convenient:
02991       // even pixels = 3*cur + prev = cur*4 + (prev - cur)
02992       // odd  pixels = 3*cur + next = cur*4 + (next - cur)
02993       // note the shared term.
02994       int16x8_t curs = vshlq_n_s16(curr, 2);
02995       int16x8_t prvd = vsubq_s16(prev, curr);
02996       int16x8_t nxtd = vsubq_s16(next, curr);
02997       int16x8_t even = vaddq_s16(curs, prvd);
02998       int16x8_t odd  = vaddq_s16(curs, nxtd);
02999 
03000       // undo scaling and round, then store with even/odd phases interleaved
03001       uint8x8x2_t o;
03002       o.val[0] = vqrshrun_n_s16(even, 4);
03003       o.val[1] = vqrshrun_n_s16(odd,  4);
03004       vst2_u8(out + i*2, o);
03005 #endif
03006 
03007       // "previous" value for next iter
03008       t1 = 3*in_near[i+7] + in_far[i+7];
03009    }
03010 
03011    t0 = t1;
03012    t1 = 3*in_near[i] + in_far[i];
03013    out[i*2] = stbi__div16(3*t1 + t0 + 8);
03014 
03015    for (++i; i < w; ++i) {
03016       t0 = t1;
03017       t1 = 3*in_near[i]+in_far[i];
03018       out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
03019       out[i*2  ] = stbi__div16(3*t1 + t0 + 8);
03020    }
03021    out[w*2-1] = stbi__div4(t1+2);
03022 
03023    STBI_NOTUSED(hs);
03024 
03025    return out;
03026 }
03027 #endif
03028 
03029 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
03030 {
03031    // resample with nearest-neighbor
03032    int i,j;
03033    STBI_NOTUSED(in_far);
03034    for (i=0; i < w; ++i)
03035       for (j=0; j < hs; ++j)
03036          out[i*hs+j] = in_near[i];
03037    return out;
03038 }
03039 
03040 #ifdef STBI_JPEG_OLD
03041 // this is the same YCbCr-to-RGB calculation that stb_image has used
03042 // historically before the algorithm changes in 1.49
03043 #define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
03044 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
03045 {
03046    int i;
03047    for (i=0; i < count; ++i) {
03048       int y_fixed = (y[i] << 16) + 32768; // rounding
03049       int r,g,b;
03050       int cr = pcr[i] - 128;
03051       int cb = pcb[i] - 128;
03052       r = y_fixed + cr*float2fixed(1.40200f);
03053       g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
03054       b = y_fixed                            + cb*float2fixed(1.77200f);
03055       r >>= 16;
03056       g >>= 16;
03057       b >>= 16;
03058       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
03059       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
03060       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
03061       out[0] = (stbi_uc)r;
03062       out[1] = (stbi_uc)g;
03063       out[2] = (stbi_uc)b;
03064       out[3] = 255;
03065       out += step;
03066    }
03067 }
03068 #else
03069 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
03070 // to make sure the code produces the same results in both SIMD and scalar
03071 #define float2fixed(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)
03072 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
03073 {
03074    int i;
03075    for (i=0; i < count; ++i) {
03076       int y_fixed = (y[i] << 20) + (1<<19); // rounding
03077       int r,g,b;
03078       int cr = pcr[i] - 128;
03079       int cb = pcb[i] - 128;
03080       r = y_fixed +  cr* float2fixed(1.40200f);
03081       g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
03082       b = y_fixed                               +   cb* float2fixed(1.77200f);
03083       r >>= 20;
03084       g >>= 20;
03085       b >>= 20;
03086       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
03087       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
03088       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
03089       out[0] = (stbi_uc)r;
03090       out[1] = (stbi_uc)g;
03091       out[2] = (stbi_uc)b;
03092       out[3] = 255;
03093       out += step;
03094    }
03095 }
03096 #endif
03097 
03098 #if defined(STBI_SSE2) || defined(STBI_NEON)
03099 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
03100 {
03101    int i = 0;
03102 
03103 #ifdef STBI_SSE2
03104    // step == 3 is pretty ugly on the final interleave, and i'm not convinced
03105    // it's useful in practice (you wouldn't use it for textures, for example).
03106    // so just accelerate step == 4 case.
03107    if (step == 4) {
03108       // this is a fairly straightforward implementation and not super-optimized.
03109       __m128i signflip  = _mm_set1_epi8(-0x80);
03110       __m128i cr_const0 = _mm_set1_epi16(   (short) ( 1.40200f*4096.0f+0.5f));
03111       __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
03112       __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
03113       __m128i cb_const1 = _mm_set1_epi16(   (short) ( 1.77200f*4096.0f+0.5f));
03114       __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
03115       __m128i xw = _mm_set1_epi16(255); // alpha channel
03116 
03117       for (; i+7 < count; i += 8) {
03118          // load
03119          __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
03120          __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
03121          __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
03122          __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
03123          __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
03124 
03125          // unpack to short (and left-shift cr, cb by 8)
03126          __m128i yw  = _mm_unpacklo_epi8(y_bias, y_bytes);
03127          __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
03128          __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
03129 
03130          // color transform
03131          __m128i yws = _mm_srli_epi16(yw, 4);
03132          __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
03133          __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
03134          __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
03135          __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
03136          __m128i rws = _mm_add_epi16(cr0, yws);
03137          __m128i gwt = _mm_add_epi16(cb0, yws);
03138          __m128i bws = _mm_add_epi16(yws, cb1);
03139          __m128i gws = _mm_add_epi16(gwt, cr1);
03140 
03141          // descale
03142          __m128i rw = _mm_srai_epi16(rws, 4);
03143          __m128i bw = _mm_srai_epi16(bws, 4);
03144          __m128i gw = _mm_srai_epi16(gws, 4);
03145 
03146          // back to byte, set up for transpose
03147          __m128i brb = _mm_packus_epi16(rw, bw);
03148          __m128i gxb = _mm_packus_epi16(gw, xw);
03149 
03150          // transpose to interleave channels
03151          __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
03152          __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
03153          __m128i o0 = _mm_unpacklo_epi16(t0, t1);
03154          __m128i o1 = _mm_unpackhi_epi16(t0, t1);
03155 
03156          // store
03157          _mm_storeu_si128((__m128i *) (out + 0), o0);
03158          _mm_storeu_si128((__m128i *) (out + 16), o1);
03159          out += 32;
03160       }
03161    }
03162 #endif
03163 
03164 #ifdef STBI_NEON
03165    // in this version, step=3 support would be easy to add. but is there demand?
03166    if (step == 4) {
03167       // this is a fairly straightforward implementation and not super-optimized.
03168       uint8x8_t signflip = vdup_n_u8(0x80);
03169       int16x8_t cr_const0 = vdupq_n_s16(   (short) ( 1.40200f*4096.0f+0.5f));
03170       int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
03171       int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
03172       int16x8_t cb_const1 = vdupq_n_s16(   (short) ( 1.77200f*4096.0f+0.5f));
03173 
03174       for (; i+7 < count; i += 8) {
03175          // load
03176          uint8x8_t y_bytes  = vld1_u8(y + i);
03177          uint8x8_t cr_bytes = vld1_u8(pcr + i);
03178          uint8x8_t cb_bytes = vld1_u8(pcb + i);
03179          int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
03180          int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
03181 
03182          // expand to s16
03183          int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
03184          int16x8_t crw = vshll_n_s8(cr_biased, 7);
03185          int16x8_t cbw = vshll_n_s8(cb_biased, 7);
03186 
03187          // color transform
03188          int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
03189          int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
03190          int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
03191          int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
03192          int16x8_t rws = vaddq_s16(yws, cr0);
03193          int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
03194          int16x8_t bws = vaddq_s16(yws, cb1);
03195 
03196          // undo scaling, round, convert to byte
03197          uint8x8x4_t o;
03198          o.val[0] = vqrshrun_n_s16(rws, 4);
03199          o.val[1] = vqrshrun_n_s16(gws, 4);
03200          o.val[2] = vqrshrun_n_s16(bws, 4);
03201          o.val[3] = vdup_n_u8(255);
03202 
03203          // store, interleaving r/g/b/a
03204          vst4_u8(out, o);
03205          out += 8*4;
03206       }
03207    }
03208 #endif
03209 
03210    for (; i < count; ++i) {
03211       int y_fixed = (y[i] << 20) + (1<<19); // rounding
03212       int r,g,b;
03213       int cr = pcr[i] - 128;
03214       int cb = pcb[i] - 128;
03215       r = y_fixed + cr* float2fixed(1.40200f);
03216       g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
03217       b = y_fixed                             +   cb* float2fixed(1.77200f);
03218       r >>= 20;
03219       g >>= 20;
03220       b >>= 20;
03221       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
03222       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
03223       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
03224       out[0] = (stbi_uc)r;
03225       out[1] = (stbi_uc)g;
03226       out[2] = (stbi_uc)b;
03227       out[3] = 255;
03228       out += step;
03229    }
03230 }
03231 #endif
03232 
03233 // set up the kernels
03234 static void stbi__setup_jpeg(stbi__jpeg *j)
03235 {
03236    j->idct_block_kernel = stbi__idct_block;
03237    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
03238    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
03239 
03240 #ifdef STBI_SSE2
03241    if (stbi__sse2_available()) {
03242       j->idct_block_kernel = stbi__idct_simd;
03243       #ifndef STBI_JPEG_OLD
03244       j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
03245       #endif
03246       j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
03247    }
03248 #endif
03249 
03250 #ifdef STBI_NEON
03251    j->idct_block_kernel = stbi__idct_simd;
03252    #ifndef STBI_JPEG_OLD
03253    j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
03254    #endif
03255    j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
03256 #endif
03257 }
03258 
03259 // clean up the temporary component buffers
03260 static void stbi__cleanup_jpeg(stbi__jpeg *j)
03261 {
03262    int i;
03263    for (i=0; i < j->s->img_n; ++i) {
03264       if (j->img_comp[i].raw_data) {
03265          STBI_FREE(j->img_comp[i].raw_data);
03266          j->img_comp[i].raw_data = NULL;
03267          j->img_comp[i].data = NULL;
03268       }
03269       if (j->img_comp[i].raw_coeff) {
03270          STBI_FREE(j->img_comp[i].raw_coeff);
03271          j->img_comp[i].raw_coeff = 0;
03272          j->img_comp[i].coeff = 0;
03273       }
03274       if (j->img_comp[i].linebuf) {
03275          STBI_FREE(j->img_comp[i].linebuf);
03276          j->img_comp[i].linebuf = NULL;
03277       }
03278    }
03279 }
03280 
03281 typedef struct
03282 {
03283    resample_row_func resample;
03284    stbi_uc *line0,*line1;
03285    int hs,vs;   // expansion factor in each axis
03286    int w_lores; // horizontal pixels pre-expansion
03287    int ystep;   // how far through vertical expansion we are
03288    int ypos;    // which pre-expansion row we're on
03289 } stbi__resample;
03290 
03291 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
03292 {
03293    int n, decode_n;
03294    z->s->img_n = 0; // make stbi__cleanup_jpeg safe
03295 
03296    // validate req_comp
03297    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
03298 
03299    // load a jpeg image from whichever source, but leave in YCbCr format
03300    if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
03301 
03302    // determine actual number of components to generate
03303    n = req_comp ? req_comp : z->s->img_n;
03304 
03305    if (z->s->img_n == 3 && n < 3)
03306       decode_n = 1;
03307    else
03308       decode_n = z->s->img_n;
03309 
03310    // resample and color-convert
03311    {
03312       int k;
03313       unsigned int i,j;
03314       stbi_uc *output;
03315       stbi_uc *coutput[4];
03316 
03317       stbi__resample res_comp[4];
03318 
03319       for (k=0; k < decode_n; ++k) {
03320          stbi__resample *r = &res_comp[k];
03321 
03322          // allocate line buffer big enough for upsampling off the edges
03323          // with upsample factor of 4
03324          z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
03325          if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
03326 
03327          r->hs      = z->img_h_max / z->img_comp[k].h;
03328          r->vs      = z->img_v_max / z->img_comp[k].v;
03329          r->ystep   = r->vs >> 1;
03330          r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
03331          r->ypos    = 0;
03332          r->line0   = r->line1 = z->img_comp[k].data;
03333 
03334          if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
03335          else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
03336          else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
03337          else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
03338          else                               r->resample = stbi__resample_row_generic;
03339       }
03340 
03341       // can't error after this so, this is safe
03342       output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
03343       if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
03344 
03345       // now go ahead and resample
03346       for (j=0; j < z->s->img_y; ++j) {
03347          stbi_uc *out = output + n * z->s->img_x * j;
03348          for (k=0; k < decode_n; ++k) {
03349             stbi__resample *r = &res_comp[k];
03350             int y_bot = r->ystep >= (r->vs >> 1);
03351             coutput[k] = r->resample(z->img_comp[k].linebuf,
03352                                      y_bot ? r->line1 : r->line0,
03353                                      y_bot ? r->line0 : r->line1,
03354                                      r->w_lores, r->hs);
03355             if (++r->ystep >= r->vs) {
03356                r->ystep = 0;
03357                r->line0 = r->line1;
03358                if (++r->ypos < z->img_comp[k].y)
03359                   r->line1 += z->img_comp[k].w2;
03360             }
03361          }
03362          if (n >= 3) {
03363             stbi_uc *y = coutput[0];
03364             if (z->s->img_n == 3) {
03365                z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
03366             } else
03367                for (i=0; i < z->s->img_x; ++i) {
03368                   out[0] = out[1] = out[2] = y[i];
03369                   out[3] = 255; // not used if n==3
03370                   out += n;
03371                }
03372          } else {
03373             stbi_uc *y = coutput[0];
03374             if (n == 1)
03375                for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
03376             else
03377                for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
03378          }
03379       }
03380       stbi__cleanup_jpeg(z);
03381       *out_x = z->s->img_x;
03382       *out_y = z->s->img_y;
03383       if (comp) *comp  = z->s->img_n; // report original components, not output
03384       return output;
03385    }
03386 }
03387 
03388 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
03389 {
03390    stbi__jpeg j;
03391    j.s = s;
03392    stbi__setup_jpeg(&j);
03393    return load_jpeg_image(&j, x,y,comp,req_comp);
03394 }
03395 
03396 static int stbi__jpeg_test(stbi__context *s)
03397 {
03398    int r;
03399    stbi__jpeg j;
03400    j.s = s;
03401    stbi__setup_jpeg(&j);
03402    r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
03403    stbi__rewind(s);
03404    return r;
03405 }
03406 
03407 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
03408 {
03409    if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
03410       stbi__rewind( j->s );
03411       return 0;
03412    }
03413    if (x) *x = j->s->img_x;
03414    if (y) *y = j->s->img_y;
03415    if (comp) *comp = j->s->img_n;
03416    return 1;
03417 }
03418 
03419 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
03420 {
03421    stbi__jpeg j;
03422    j.s = s;
03423    return stbi__jpeg_info_raw(&j, x, y, comp);
03424 }
03425 #endif
03426 
03427 // public domain zlib decode    v0.2  Sean Barrett 2006-11-18
03428 //    simple implementation
03429 //      - all input must be provided in an upfront buffer
03430 //      - all output is written to a single output buffer (can malloc/realloc)
03431 //    performance
03432 //      - fast huffman
03433 
03434 #ifndef STBI_NO_ZLIB
03435 
03436 // fast-way is faster to check than jpeg huffman, but slow way is slower
03437 #define STBI__ZFAST_BITS  9 // accelerate all cases in default tables
03438 #define STBI__ZFAST_MASK  ((1 << STBI__ZFAST_BITS) - 1)
03439 
03440 // zlib-style huffman encoding
03441 // (jpegs packs from left, zlib from right, so can't share code)
03442 typedef struct
03443 {
03444    stbi__uint16 fast[1 << STBI__ZFAST_BITS];
03445    stbi__uint16 firstcode[16];
03446    int maxcode[17];
03447    stbi__uint16 firstsymbol[16];
03448    stbi_uc  size[288];
03449    stbi__uint16 value[288];
03450 } stbi__zhuffman;
03451 
03452 stbi_inline static int stbi__bitreverse16(int n)
03453 {
03454   n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
03455   n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
03456   n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
03457   n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
03458   return n;
03459 }
03460 
03461 stbi_inline static int stbi__bit_reverse(int v, int bits)
03462 {
03463    STBI_ASSERT(bits <= 16);
03464    // to bit reverse n bits, reverse 16 and shift
03465    // e.g. 11 bits, bit reverse and shift away 5
03466    return stbi__bitreverse16(v) >> (16-bits);
03467 }
03468 
03469 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
03470 {
03471    int i,k=0;
03472    int code, next_code[16], sizes[17];
03473 
03474    // DEFLATE spec for generating codes
03475    memset(sizes, 0, sizeof(sizes));
03476    memset(z->fast, 0, sizeof(z->fast));
03477    for (i=0; i < num; ++i)
03478       ++sizes[sizelist[i]];
03479    sizes[0] = 0;
03480    for (i=1; i < 16; ++i)
03481       if (sizes[i] > (1 << i))
03482          return stbi__err("bad sizes", "Corrupt PNG");
03483    code = 0;
03484    for (i=1; i < 16; ++i) {
03485       next_code[i] = code;
03486       z->firstcode[i] = (stbi__uint16) code;
03487       z->firstsymbol[i] = (stbi__uint16) k;
03488       code = (code + sizes[i]);
03489       if (sizes[i])
03490          if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
03491       z->maxcode[i] = code << (16-i); // preshift for inner loop
03492       code <<= 1;
03493       k += sizes[i];
03494    }
03495    z->maxcode[16] = 0x10000; // sentinel
03496    for (i=0; i < num; ++i) {
03497       int s = sizelist[i];
03498       if (s) {
03499          int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
03500          stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
03501          z->size [c] = (stbi_uc     ) s;
03502          z->value[c] = (stbi__uint16) i;
03503          if (s <= STBI__ZFAST_BITS) {
03504             int k = stbi__bit_reverse(next_code[s],s);
03505             while (k < (1 << STBI__ZFAST_BITS)) {
03506                z->fast[k] = fastv;
03507                k += (1 << s);
03508             }
03509          }
03510          ++next_code[s];
03511       }
03512    }
03513    return 1;
03514 }
03515 
03516 // zlib-from-memory implementation for PNG reading
03517 //    because PNG allows splitting the zlib stream arbitrarily,
03518 //    and it's annoying structurally to have PNG call ZLIB call PNG,
03519 //    we require PNG read all the IDATs and combine them into a single
03520 //    memory buffer
03521 
03522 typedef struct
03523 {
03524    stbi_uc *zbuffer, *zbuffer_end;
03525    int num_bits;
03526    stbi__uint32 code_buffer;
03527 
03528    char *zout;
03529    char *zout_start;
03530    char *zout_end;
03531    int   z_expandable;
03532 
03533    stbi__zhuffman z_length, z_distance;
03534 } stbi__zbuf;
03535 
03536 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
03537 {
03538    if (z->zbuffer >= z->zbuffer_end) return 0;
03539    return *z->zbuffer++;
03540 }
03541 
03542 static void stbi__fill_bits(stbi__zbuf *z)
03543 {
03544    do {
03545       STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
03546       z->code_buffer |= stbi__zget8(z) << z->num_bits;
03547       z->num_bits += 8;
03548    } while (z->num_bits <= 24);
03549 }
03550 
03551 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
03552 {
03553    unsigned int k;
03554    if (z->num_bits < n) stbi__fill_bits(z);
03555    k = z->code_buffer & ((1 << n) - 1);
03556    z->code_buffer >>= n;
03557    z->num_bits -= n;
03558    return k;
03559 }
03560 
03561 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
03562 {
03563    int b,s,k;
03564    // not resolved by fast table, so compute it the slow way
03565    // use jpeg approach, which requires MSbits at top
03566    k = stbi__bit_reverse(a->code_buffer, 16);
03567    for (s=STBI__ZFAST_BITS+1; ; ++s)
03568       if (k < z->maxcode[s])
03569          break;
03570    if (s == 16) return -1; // invalid code!
03571    // code size is s, so:
03572    b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
03573    STBI_ASSERT(z->size[b] == s);
03574    a->code_buffer >>= s;
03575    a->num_bits -= s;
03576    return z->value[b];
03577 }
03578 
03579 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
03580 {
03581    int b,s;
03582    if (a->num_bits < 16) stbi__fill_bits(a);
03583    b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
03584    if (b) {
03585       s = b >> 9;
03586       a->code_buffer >>= s;
03587       a->num_bits -= s;
03588       return b & 511;
03589    }
03590    return stbi__zhuffman_decode_slowpath(a, z);
03591 }
03592 
03593 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
03594 {
03595    char *q;
03596    int cur, limit;
03597    z->zout = zout;
03598    if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
03599    cur   = (int) (z->zout     - z->zout_start);
03600    limit = (int) (z->zout_end - z->zout_start);
03601    while (cur + n > limit)
03602       limit *= 2;
03603    q = (char *) STBI_REALLOC(z->zout_start, limit);
03604    if (q == NULL) return stbi__err("outofmem", "Out of memory");
03605    z->zout_start = q;
03606    z->zout       = q + cur;
03607    z->zout_end   = q + limit;
03608    return 1;
03609 }
03610 
03611 static int stbi__zlength_base[31] = {
03612    3,4,5,6,7,8,9,10,11,13,
03613    15,17,19,23,27,31,35,43,51,59,
03614    67,83,99,115,131,163,195,227,258,0,0 };
03615 
03616 static int stbi__zlength_extra[31]=
03617 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
03618 
03619 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
03620 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
03621 
03622 static int stbi__zdist_extra[32] =
03623 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
03624 
03625 static int stbi__parse_huffman_block(stbi__zbuf *a)
03626 {
03627    char *zout = a->zout;
03628    for(;;) {
03629       int z = stbi__zhuffman_decode(a, &a->z_length);
03630       if (z < 256) {
03631          if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
03632          if (zout >= a->zout_end) {
03633             if (!stbi__zexpand(a, zout, 1)) return 0;
03634             zout = a->zout;
03635          }
03636          *zout++ = (char) z;
03637       } else {
03638          stbi_uc *p;
03639          int len,dist;
03640          if (z == 256) {
03641             a->zout = zout;
03642             return 1;
03643          }
03644          z -= 257;
03645          len = stbi__zlength_base[z];
03646          if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
03647          z = stbi__zhuffman_decode(a, &a->z_distance);
03648          if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
03649          dist = stbi__zdist_base[z];
03650          if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
03651          if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
03652          if (zout + len > a->zout_end) {
03653             if (!stbi__zexpand(a, zout, len)) return 0;
03654             zout = a->zout;
03655          }
03656          p = (stbi_uc *) (zout - dist);
03657          if (dist == 1) { // run of one byte; common in images.
03658             stbi_uc v = *p;
03659             if (len) { do *zout++ = v; while (--len); }
03660          } else {
03661             if (len) { do *zout++ = *p++; while (--len); }
03662          }
03663       }
03664    }
03665 }
03666 
03667 static int stbi__compute_huffman_codes(stbi__zbuf *a)
03668 {
03669    static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
03670    stbi__zhuffman z_codelength;
03671    stbi_uc lencodes[286+32+137];//padding for maximum single op
03672    stbi_uc codelength_sizes[19];
03673    int i,n;
03674 
03675    int hlit  = stbi__zreceive(a,5) + 257;
03676    int hdist = stbi__zreceive(a,5) + 1;
03677    int hclen = stbi__zreceive(a,4) + 4;
03678 
03679    memset(codelength_sizes, 0, sizeof(codelength_sizes));
03680    for (i=0; i < hclen; ++i) {
03681       int s = stbi__zreceive(a,3);
03682       codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
03683    }
03684    if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
03685 
03686    n = 0;
03687    while (n < hlit + hdist) {
03688       int c = stbi__zhuffman_decode(a, &z_codelength);
03689       if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
03690       if (c < 16)
03691          lencodes[n++] = (stbi_uc) c;
03692       else if (c == 16) {
03693          c = stbi__zreceive(a,2)+3;
03694          memset(lencodes+n, lencodes[n-1], c);
03695          n += c;
03696       } else if (c == 17) {
03697          c = stbi__zreceive(a,3)+3;
03698          memset(lencodes+n, 0, c);
03699          n += c;
03700       } else {
03701          STBI_ASSERT(c == 18);
03702          c = stbi__zreceive(a,7)+11;
03703          memset(lencodes+n, 0, c);
03704          n += c;
03705       }
03706    }
03707    if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
03708    if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
03709    if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
03710    return 1;
03711 }
03712 
03713 static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
03714 {
03715    stbi_uc header[4];
03716    int len,nlen,k;
03717    if (a->num_bits & 7)
03718       stbi__zreceive(a, a->num_bits & 7); // discard
03719    // drain the bit-packed data into header
03720    k = 0;
03721    while (a->num_bits > 0) {
03722       header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
03723       a->code_buffer >>= 8;
03724       a->num_bits -= 8;
03725    }
03726    STBI_ASSERT(a->num_bits == 0);
03727    // now fill header the normal way
03728    while (k < 4)
03729       header[k++] = stbi__zget8(a);
03730    len  = header[1] * 256 + header[0];
03731    nlen = header[3] * 256 + header[2];
03732    if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
03733    if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
03734    if (a->zout + len > a->zout_end)
03735       if (!stbi__zexpand(a, a->zout, len)) return 0;
03736    memcpy(a->zout, a->zbuffer, len);
03737    a->zbuffer += len;
03738    a->zout += len;
03739    return 1;
03740 }
03741 
03742 static int stbi__parse_zlib_header(stbi__zbuf *a)
03743 {
03744    int cmf   = stbi__zget8(a);
03745    int cm    = cmf & 15;
03746    /* int cinfo = cmf >> 4; */
03747    int flg   = stbi__zget8(a);
03748    if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
03749    if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
03750    if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
03751    // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
03752    return 1;
03753 }
03754 
03755 // @TODO: should statically initialize these for optimal thread safety
03756 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
03757 static void stbi__init_zdefaults(void)
03758 {
03759    int i;   // use <= to match clearly with spec
03760    for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
03761    for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
03762    for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
03763    for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;
03764 
03765    for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
03766 }
03767 
03768 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
03769 {
03770    int final, type;
03771    if (parse_header)
03772       if (!stbi__parse_zlib_header(a)) return 0;
03773    a->num_bits = 0;
03774    a->code_buffer = 0;
03775    do {
03776       final = stbi__zreceive(a,1);
03777       type = stbi__zreceive(a,2);
03778       if (type == 0) {
03779          if (!stbi__parse_uncomperssed_block(a)) return 0;
03780       } else if (type == 3) {
03781          return 0;
03782       } else {
03783          if (type == 1) {
03784             // use fixed code lengths
03785             if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
03786             if (!stbi__zbuild_huffman(&a->z_length  , stbi__zdefault_length  , 288)) return 0;
03787             if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance,  32)) return 0;
03788          } else {
03789             if (!stbi__compute_huffman_codes(a)) return 0;
03790          }
03791          if (!stbi__parse_huffman_block(a)) return 0;
03792       }
03793    } while (!final);
03794    return 1;
03795 }
03796 
03797 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
03798 {
03799    a->zout_start = obuf;
03800    a->zout       = obuf;
03801    a->zout_end   = obuf + olen;
03802    a->z_expandable = exp;
03803 
03804    return stbi__parse_zlib(a, parse_header);
03805 }
03806 
03807 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
03808 {
03809    stbi__zbuf a;
03810    char *p = (char *) stbi__malloc(initial_size);
03811    if (p == NULL) return NULL;
03812    a.zbuffer = (stbi_uc *) buffer;
03813    a.zbuffer_end = (stbi_uc *) buffer + len;
03814    if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
03815       if (outlen) *outlen = (int) (a.zout - a.zout_start);
03816       return a.zout_start;
03817    } else {
03818       STBI_FREE(a.zout_start);
03819       return NULL;
03820    }
03821 }
03822 
03823 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
03824 {
03825    return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
03826 }
03827 
03828 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
03829 {
03830    stbi__zbuf a;
03831    char *p = (char *) stbi__malloc(initial_size);
03832    if (p == NULL) return NULL;
03833    a.zbuffer = (stbi_uc *) buffer;
03834    a.zbuffer_end = (stbi_uc *) buffer + len;
03835    if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
03836       if (outlen) *outlen = (int) (a.zout - a.zout_start);
03837       return a.zout_start;
03838    } else {
03839       STBI_FREE(a.zout_start);
03840       return NULL;
03841    }
03842 }
03843 
03844 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
03845 {
03846    stbi__zbuf a;
03847    a.zbuffer = (stbi_uc *) ibuffer;
03848    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
03849    if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
03850       return (int) (a.zout - a.zout_start);
03851    else
03852       return -1;
03853 }
03854 
03855 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
03856 {
03857    stbi__zbuf a;
03858    char *p = (char *) stbi__malloc(16384);
03859    if (p == NULL) return NULL;
03860    a.zbuffer = (stbi_uc *) buffer;
03861    a.zbuffer_end = (stbi_uc *) buffer+len;
03862    if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
03863       if (outlen) *outlen = (int) (a.zout - a.zout_start);
03864       return a.zout_start;
03865    } else {
03866       STBI_FREE(a.zout_start);
03867       return NULL;
03868    }
03869 }
03870 
03871 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
03872 {
03873    stbi__zbuf a;
03874    a.zbuffer = (stbi_uc *) ibuffer;
03875    a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
03876    if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
03877       return (int) (a.zout - a.zout_start);
03878    else
03879       return -1;
03880 }
03881 #endif
03882 
03883 // public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
03884 //    simple implementation
03885 //      - only 8-bit samples
03886 //      - no CRC checking
03887 //      - allocates lots of intermediate memory
03888 //        - avoids problem of streaming data between subsystems
03889 //        - avoids explicit window management
03890 //    performance
03891 //      - uses stb_zlib, a PD zlib implementation with fast huffman decoding
03892 
03893 #ifndef STBI_NO_PNG
03894 typedef struct
03895 {
03896    stbi__uint32 length;
03897    stbi__uint32 type;
03898 } stbi__pngchunk;
03899 
03900 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
03901 {
03902    stbi__pngchunk c;
03903    c.length = stbi__get32be(s);
03904    c.type   = stbi__get32be(s);
03905    return c;
03906 }
03907 
03908 static int stbi__check_png_header(stbi__context *s)
03909 {
03910    static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
03911    int i;
03912    for (i=0; i < 8; ++i)
03913       if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
03914    return 1;
03915 }
03916 
03917 typedef struct
03918 {
03919    stbi__context *s;
03920    stbi_uc *idata, *expanded, *out;
03921 } stbi__png;
03922 
03923 
03924 enum {
03925    STBI__F_none=0,
03926    STBI__F_sub=1,
03927    STBI__F_up=2,
03928    STBI__F_avg=3,
03929    STBI__F_paeth=4,
03930    // synthetic filters used for first scanline to avoid needing a dummy row of 0s
03931    STBI__F_avg_first,
03932    STBI__F_paeth_first
03933 };
03934 
03935 static stbi_uc first_row_filter[5] =
03936 {
03937    STBI__F_none,
03938    STBI__F_sub,
03939    STBI__F_none,
03940    STBI__F_avg_first,
03941    STBI__F_paeth_first
03942 };
03943 
03944 static int stbi__paeth(int a, int b, int c)
03945 {
03946    int p = a + b - c;
03947    int pa = abs(p-a);
03948    int pb = abs(p-b);
03949    int pc = abs(p-c);
03950    if (pa <= pb && pa <= pc) return a;
03951    if (pb <= pc) return b;
03952    return c;
03953 }
03954 
03955 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
03956 
03957 // create the png data from post-deflated data
03958 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
03959 {
03960    stbi__context *s = a->s;
03961    stbi__uint32 i,j,stride = x*out_n;
03962    stbi__uint32 img_len, img_width_bytes;
03963    int k;
03964    int img_n = s->img_n; // copy it into a local for later
03965 
03966    STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
03967    a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
03968    if (!a->out) return stbi__err("outofmem", "Out of memory");
03969 
03970    img_width_bytes = (((img_n * x * depth) + 7) >> 3);
03971    img_len = (img_width_bytes + 1) * y;
03972    if (s->img_x == x && s->img_y == y) {
03973       if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
03974    } else { // interlaced:
03975       if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
03976    }
03977 
03978    for (j=0; j < y; ++j) {
03979       stbi_uc *cur = a->out + stride*j;
03980       stbi_uc *prior = cur - stride;
03981       int filter = *raw++;
03982       int filter_bytes = img_n;
03983       int width = x;
03984       if (filter > 4)
03985          return stbi__err("invalid filter","Corrupt PNG");
03986 
03987       if (depth < 8) {
03988          STBI_ASSERT(img_width_bytes <= x);
03989          cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
03990          filter_bytes = 1;
03991          width = img_width_bytes;
03992       }
03993 
03994       // if first row, use special filter that doesn't sample previous row
03995       if (j == 0) filter = first_row_filter[filter];
03996 
03997       // handle first byte explicitly
03998       for (k=0; k < filter_bytes; ++k) {
03999          switch (filter) {
04000             case STBI__F_none       : cur[k] = raw[k]; break;
04001             case STBI__F_sub        : cur[k] = raw[k]; break;
04002             case STBI__F_up         : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
04003             case STBI__F_avg        : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
04004             case STBI__F_paeth      : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
04005             case STBI__F_avg_first  : cur[k] = raw[k]; break;
04006             case STBI__F_paeth_first: cur[k] = raw[k]; break;
04007          }
04008       }
04009 
04010       if (depth == 8) {
04011          if (img_n != out_n)
04012             cur[img_n] = 255; // first pixel
04013          raw += img_n;
04014          cur += out_n;
04015          prior += out_n;
04016       } else {
04017          raw += 1;
04018          cur += 1;
04019          prior += 1;
04020       }
04021 
04022       // this is a little gross, so that we don't switch per-pixel or per-component
04023       if (depth < 8 || img_n == out_n) {
04024          int nk = (width - 1)*img_n;
04025          #define CASE(f) \
04026              case f:     \
04027                 for (k=0; k < nk; ++k)
04028          switch (filter) {
04029             // "none" filter turns into a memcpy here; make that explicit.
04030             case STBI__F_none:         memcpy(cur, raw, nk); break;
04031             CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
04032             CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
04033             CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
04034             CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
04035             CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
04036             CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
04037          }
04038          #undef CASE
04039          raw += nk;
04040       } else {
04041          STBI_ASSERT(img_n+1 == out_n);
04042          #define CASE(f) \
04043              case f:     \
04044                 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
04045                    for (k=0; k < img_n; ++k)
04046          switch (filter) {
04047             CASE(STBI__F_none)         cur[k] = raw[k]; break;
04048             CASE(STBI__F_sub)          cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
04049             CASE(STBI__F_up)           cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
04050             CASE(STBI__F_avg)          cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
04051             CASE(STBI__F_paeth)        cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
04052             CASE(STBI__F_avg_first)    cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
04053             CASE(STBI__F_paeth_first)  cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
04054          }
04055          #undef CASE
04056       }
04057    }
04058 
04059    // we make a separate pass to expand bits to pixels; for performance,
04060    // this could run two scanlines behind the above code, so it won't
04061    // intefere with filtering but will still be in the cache.
04062    if (depth < 8) {
04063       for (j=0; j < y; ++j) {
04064          stbi_uc *cur = a->out + stride*j;
04065          stbi_uc *in  = a->out + stride*j + x*out_n - img_width_bytes;
04066          // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
04067          // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
04068          stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
04069 
04070          // note that the final byte might overshoot and write more data than desired.
04071          // we can allocate enough data that this never writes out of memory, but it
04072          // could also overwrite the next scanline. can it overwrite non-empty data
04073          // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
04074          // so we need to explicitly clamp the final ones
04075 
04076          if (depth == 4) {
04077             for (k=x*img_n; k >= 2; k-=2, ++in) {
04078                *cur++ = scale * ((*in >> 4)       );
04079                *cur++ = scale * ((*in     ) & 0x0f);
04080             }
04081             if (k > 0) *cur++ = scale * ((*in >> 4)       );
04082          } else if (depth == 2) {
04083             for (k=x*img_n; k >= 4; k-=4, ++in) {
04084                *cur++ = scale * ((*in >> 6)       );
04085                *cur++ = scale * ((*in >> 4) & 0x03);
04086                *cur++ = scale * ((*in >> 2) & 0x03);
04087                *cur++ = scale * ((*in     ) & 0x03);
04088             }
04089             if (k > 0) *cur++ = scale * ((*in >> 6)       );
04090             if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
04091             if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
04092          } else if (depth == 1) {
04093             for (k=x*img_n; k >= 8; k-=8, ++in) {
04094                *cur++ = scale * ((*in >> 7)       );
04095                *cur++ = scale * ((*in >> 6) & 0x01);
04096                *cur++ = scale * ((*in >> 5) & 0x01);
04097                *cur++ = scale * ((*in >> 4) & 0x01);
04098                *cur++ = scale * ((*in >> 3) & 0x01);
04099                *cur++ = scale * ((*in >> 2) & 0x01);
04100                *cur++ = scale * ((*in >> 1) & 0x01);
04101                *cur++ = scale * ((*in     ) & 0x01);
04102             }
04103             if (k > 0) *cur++ = scale * ((*in >> 7)       );
04104             if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
04105             if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
04106             if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
04107             if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
04108             if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
04109             if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
04110          }
04111          if (img_n != out_n) {
04112             // insert alpha = 255
04113             stbi_uc *cur = a->out + stride*j;
04114             int i;
04115             if (img_n == 1) {
04116                for (i=x-1; i >= 0; --i) {
04117                   cur[i*2+1] = 255;
04118                   cur[i*2+0] = cur[i];
04119                }
04120             } else {
04121                STBI_ASSERT(img_n == 3);
04122                for (i=x-1; i >= 0; --i) {
04123                   cur[i*4+3] = 255;
04124                   cur[i*4+2] = cur[i*3+2];
04125                   cur[i*4+1] = cur[i*3+1];
04126                   cur[i*4+0] = cur[i*3+0];
04127                }
04128             }
04129          }
04130       }
04131    }
04132 
04133    return 1;
04134 }
04135 
04136 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
04137 {
04138    stbi_uc *final;
04139    int p;
04140    if (!interlaced)
04141       return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
04142 
04143    // de-interlacing
04144    final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
04145    for (p=0; p < 7; ++p) {
04146       int xorig[] = { 0,4,0,2,0,1,0 };
04147       int yorig[] = { 0,0,4,0,2,0,1 };
04148       int xspc[]  = { 8,8,4,4,2,2,1 };
04149       int yspc[]  = { 8,8,8,4,4,2,2 };
04150       int i,j,x,y;
04151       // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
04152       x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
04153       y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
04154       if (x && y) {
04155          stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
04156          if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
04157             STBI_FREE(final);
04158             return 0;
04159          }
04160          for (j=0; j < y; ++j) {
04161             for (i=0; i < x; ++i) {
04162                int out_y = j*yspc[p]+yorig[p];
04163                int out_x = i*xspc[p]+xorig[p];
04164                memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
04165                       a->out + (j*x+i)*out_n, out_n);
04166             }
04167          }
04168          STBI_FREE(a->out);
04169          image_data += img_len;
04170          image_data_len -= img_len;
04171       }
04172    }
04173    a->out = final;
04174 
04175    return 1;
04176 }
04177 
04178 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
04179 {
04180    stbi__context *s = z->s;
04181    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
04182    stbi_uc *p = z->out;
04183 
04184    // compute color-based transparency, assuming we've
04185    // already got 255 as the alpha value in the output
04186    STBI_ASSERT(out_n == 2 || out_n == 4);
04187 
04188    if (out_n == 2) {
04189       for (i=0; i < pixel_count; ++i) {
04190          p[1] = (p[0] == tc[0] ? 0 : 255);
04191          p += 2;
04192       }
04193    } else {
04194       for (i=0; i < pixel_count; ++i) {
04195          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
04196             p[3] = 0;
04197          p += 4;
04198       }
04199    }
04200    return 1;
04201 }
04202 
04203 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
04204 {
04205    stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
04206    stbi_uc *p, *temp_out, *orig = a->out;
04207 
04208    p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
04209    if (p == NULL) return stbi__err("outofmem", "Out of memory");
04210 
04211    // between here and free(out) below, exitting would leak
04212    temp_out = p;
04213 
04214    if (pal_img_n == 3) {
04215       for (i=0; i < pixel_count; ++i) {
04216          int n = orig[i]*4;
04217          p[0] = palette[n  ];
04218          p[1] = palette[n+1];
04219          p[2] = palette[n+2];
04220          p += 3;
04221       }
04222    } else {
04223       for (i=0; i < pixel_count; ++i) {
04224          int n = orig[i]*4;
04225          p[0] = palette[n  ];
04226          p[1] = palette[n+1];
04227          p[2] = palette[n+2];
04228          p[3] = palette[n+3];
04229          p += 4;
04230       }
04231    }
04232    STBI_FREE(a->out);
04233    a->out = temp_out;
04234 
04235    STBI_NOTUSED(len);
04236 
04237    return 1;
04238 }
04239 
04240 static int stbi__unpremultiply_on_load = 0;
04241 static int stbi__de_iphone_flag = 0;
04242 
04243 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
04244 {
04245    stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
04246 }
04247 
04248 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
04249 {
04250    stbi__de_iphone_flag = flag_true_if_should_convert;
04251 }
04252 
04253 static void stbi__de_iphone(stbi__png *z)
04254 {
04255    stbi__context *s = z->s;
04256    stbi__uint32 i, pixel_count = s->img_x * s->img_y;
04257    stbi_uc *p = z->out;
04258 
04259    if (s->img_out_n == 3) {  // convert bgr to rgb
04260       for (i=0; i < pixel_count; ++i) {
04261          stbi_uc t = p[0];
04262          p[0] = p[2];
04263          p[2] = t;
04264          p += 3;
04265       }
04266    } else {
04267       STBI_ASSERT(s->img_out_n == 4);
04268       if (stbi__unpremultiply_on_load) {
04269          // convert bgr to rgb and unpremultiply
04270          for (i=0; i < pixel_count; ++i) {
04271             stbi_uc a = p[3];
04272             stbi_uc t = p[0];
04273             if (a) {
04274                p[0] = p[2] * 255 / a;
04275                p[1] = p[1] * 255 / a;
04276                p[2] =  t   * 255 / a;
04277             } else {
04278                p[0] = p[2];
04279                p[2] = t;
04280             }
04281             p += 4;
04282          }
04283       } else {
04284          // convert bgr to rgb
04285          for (i=0; i < pixel_count; ++i) {
04286             stbi_uc t = p[0];
04287             p[0] = p[2];
04288             p[2] = t;
04289             p += 4;
04290          }
04291       }
04292    }
04293 }
04294 
04295 #define STBI__PNG_TYPE(a,b,c,d)  (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
04296 
04297 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
04298 {
04299    stbi_uc palette[1024], pal_img_n=0;
04300    stbi_uc has_trans=0, tc[3];
04301    stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
04302    int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
04303    stbi__context *s = z->s;
04304 
04305    z->expanded = NULL;
04306    z->idata = NULL;
04307    z->out = NULL;
04308 
04309    if (!stbi__check_png_header(s)) return 0;
04310 
04311    if (scan == STBI__SCAN_type) return 1;
04312 
04313    for (;;) {
04314       stbi__pngchunk c = stbi__get_chunk_header(s);
04315       switch (c.type) {
04316          case STBI__PNG_TYPE('C','g','B','I'):
04317             is_iphone = 1;
04318             stbi__skip(s, c.length);
04319             break;
04320          case STBI__PNG_TYPE('I','H','D','R'): {
04321             int comp,filter;
04322             if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
04323             first = 0;
04324             if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
04325             s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
04326             s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
04327             depth = stbi__get8(s);  if (depth != 1 && depth != 2 && depth != 4 && depth != 8)  return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
04328             color = stbi__get8(s);  if (color > 6)         return stbi__err("bad ctype","Corrupt PNG");
04329             if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
04330             comp  = stbi__get8(s);  if (comp) return stbi__err("bad comp method","Corrupt PNG");
04331             filter= stbi__get8(s);  if (filter) return stbi__err("bad filter method","Corrupt PNG");
04332             interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
04333             if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
04334             if (!pal_img_n) {
04335                s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
04336                if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
04337                if (scan == STBI__SCAN_header) return 1;
04338             } else {
04339                // if paletted, then pal_n is our final components, and
04340                // img_n is # components to decompress/filter.
04341                s->img_n = 1;
04342                if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
04343                // if SCAN_header, have to scan to see if we have a tRNS
04344             }
04345             break;
04346          }
04347 
04348          case STBI__PNG_TYPE('P','L','T','E'):  {
04349             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
04350             if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
04351             pal_len = c.length / 3;
04352             if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
04353             for (i=0; i < pal_len; ++i) {
04354                palette[i*4+0] = stbi__get8(s);
04355                palette[i*4+1] = stbi__get8(s);
04356                palette[i*4+2] = stbi__get8(s);
04357                palette[i*4+3] = 255;
04358             }
04359             break;
04360          }
04361 
04362          case STBI__PNG_TYPE('t','R','N','S'): {
04363             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
04364             if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
04365             if (pal_img_n) {
04366                if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
04367                if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
04368                if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
04369                pal_img_n = 4;
04370                for (i=0; i < c.length; ++i)
04371                   palette[i*4+3] = stbi__get8(s);
04372             } else {
04373                if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
04374                if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
04375                has_trans = 1;
04376                for (k=0; k < s->img_n; ++k)
04377                   tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
04378             }
04379             break;
04380          }
04381 
04382          case STBI__PNG_TYPE('I','D','A','T'): {
04383             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
04384             if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
04385             if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
04386             if ((int)(ioff + c.length) < (int)ioff) return 0;
04387             if (ioff + c.length > idata_limit) {
04388                stbi_uc *p;
04389                if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
04390                while (ioff + c.length > idata_limit)
04391                   idata_limit *= 2;
04392                p = (stbi_uc *) STBI_REALLOC(z->idata, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
04393                z->idata = p;
04394             }
04395             if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
04396             ioff += c.length;
04397             break;
04398          }
04399 
04400          case STBI__PNG_TYPE('I','E','N','D'): {
04401             stbi__uint32 raw_len, bpl;
04402             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
04403             if (scan != STBI__SCAN_load) return 1;
04404             if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
04405             // initial guess for decoded data size to avoid unnecessary reallocs
04406             bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
04407             raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
04408             z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
04409             if (z->expanded == NULL) return 0; // zlib should set error
04410             STBI_FREE(z->idata); z->idata = NULL;
04411             if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
04412                s->img_out_n = s->img_n+1;
04413             else
04414                s->img_out_n = s->img_n;
04415             if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
04416             if (has_trans)
04417                if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
04418             if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
04419                stbi__de_iphone(z);
04420             if (pal_img_n) {
04421                // pal_img_n == 3 or 4
04422                s->img_n = pal_img_n; // record the actual colors we had
04423                s->img_out_n = pal_img_n;
04424                if (req_comp >= 3) s->img_out_n = req_comp;
04425                if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
04426                   return 0;
04427             }
04428             STBI_FREE(z->expanded); z->expanded = NULL;
04429             return 1;
04430          }
04431 
04432          default:
04433             // if critical, fail
04434             if (first) return stbi__err("first not IHDR", "Corrupt PNG");
04435             if ((c.type & (1 << 29)) == 0) {
04436                #ifndef STBI_NO_FAILURE_STRINGS
04437                // not threadsafe
04438                static char invalid_chunk[] = "XXXX PNG chunk not known";
04439                invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
04440                invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
04441                invalid_chunk[2] = STBI__BYTECAST(c.type >>  8);
04442                invalid_chunk[3] = STBI__BYTECAST(c.type >>  0);
04443                #endif
04444                return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
04445             }
04446             stbi__skip(s, c.length);
04447             break;
04448       }
04449       // end of PNG chunk, read and skip CRC
04450       stbi__get32be(s);
04451    }
04452 }
04453 
04454 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
04455 {
04456    unsigned char *result=NULL;
04457    if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
04458    if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
04459       result = p->out;
04460       p->out = NULL;
04461       if (req_comp && req_comp != p->s->img_out_n) {
04462          result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
04463          p->s->img_out_n = req_comp;
04464          if (result == NULL) return result;
04465       }
04466       *x = p->s->img_x;
04467       *y = p->s->img_y;
04468       if (n) *n = p->s->img_out_n;
04469    }
04470    STBI_FREE(p->out);      p->out      = NULL;
04471    STBI_FREE(p->expanded); p->expanded = NULL;
04472    STBI_FREE(p->idata);    p->idata    = NULL;
04473 
04474    return result;
04475 }
04476 
04477 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
04478 {
04479    stbi__png p;
04480    p.s = s;
04481    return stbi__do_png(&p, x,y,comp,req_comp);
04482 }
04483 
04484 static int stbi__png_test(stbi__context *s)
04485 {
04486    int r;
04487    r = stbi__check_png_header(s);
04488    stbi__rewind(s);
04489    return r;
04490 }
04491 
04492 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
04493 {
04494    if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
04495       stbi__rewind( p->s );
04496       return 0;
04497    }
04498    if (x) *x = p->s->img_x;
04499    if (y) *y = p->s->img_y;
04500    if (comp) *comp = p->s->img_n;
04501    return 1;
04502 }
04503 
04504 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
04505 {
04506    stbi__png p;
04507    p.s = s;
04508    return stbi__png_info_raw(&p, x, y, comp);
04509 }
04510 #endif
04511 
04512 // Microsoft/Windows BMP image
04513 
04514 #ifndef STBI_NO_BMP
04515 static int stbi__bmp_test_raw(stbi__context *s)
04516 {
04517    int r;
04518    int sz;
04519    if (stbi__get8(s) != 'B') return 0;
04520    if (stbi__get8(s) != 'M') return 0;
04521    stbi__get32le(s); // discard filesize
04522    stbi__get16le(s); // discard reserved
04523    stbi__get16le(s); // discard reserved
04524    stbi__get32le(s); // discard data offset
04525    sz = stbi__get32le(s);
04526    r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
04527    return r;
04528 }
04529 
04530 static int stbi__bmp_test(stbi__context *s)
04531 {
04532    int r = stbi__bmp_test_raw(s);
04533    stbi__rewind(s);
04534    return r;
04535 }
04536 
04537 
04538 // returns 0..31 for the highest set bit
04539 static int stbi__high_bit(unsigned int z)
04540 {
04541    int n=0;
04542    if (z == 0) return -1;
04543    if (z >= 0x10000) n += 16, z >>= 16;
04544    if (z >= 0x00100) n +=  8, z >>=  8;
04545    if (z >= 0x00010) n +=  4, z >>=  4;
04546    if (z >= 0x00004) n +=  2, z >>=  2;
04547    if (z >= 0x00002) n +=  1, z >>=  1;
04548    return n;
04549 }
04550 
04551 static int stbi__bitcount(unsigned int a)
04552 {
04553    a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
04554    a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
04555    a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
04556    a = (a + (a >> 8)); // max 16 per 8 bits
04557    a = (a + (a >> 16)); // max 32 per 8 bits
04558    return a & 0xff;
04559 }
04560 
04561 static int stbi__shiftsigned(int v, int shift, int bits)
04562 {
04563    int result;
04564    int z=0;
04565 
04566    if (shift < 0) v <<= -shift;
04567    else v >>= shift;
04568    result = v;
04569 
04570    z = bits;
04571    while (z < 8) {
04572       result += v >> z;
04573       z += bits;
04574    }
04575    return result;
04576 }
04577 
04578 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
04579 {
04580    stbi_uc *out;
04581    unsigned int mr=0,mg=0,mb=0,ma=0, fake_a=0;
04582    stbi_uc pal[256][4];
04583    int psize=0,i,j,compress=0,width;
04584    int bpp, flip_vertically, pad, target, offset, hsz;
04585    if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
04586    stbi__get32le(s); // discard filesize
04587    stbi__get16le(s); // discard reserved
04588    stbi__get16le(s); // discard reserved
04589    offset = stbi__get32le(s);
04590    hsz = stbi__get32le(s);
04591    if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
04592    if (hsz == 12) {
04593       s->img_x = stbi__get16le(s);
04594       s->img_y = stbi__get16le(s);
04595    } else {
04596       s->img_x = stbi__get32le(s);
04597       s->img_y = stbi__get32le(s);
04598    }
04599    if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
04600    bpp = stbi__get16le(s);
04601    if (bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
04602    flip_vertically = ((int) s->img_y) > 0;
04603    s->img_y = abs((int) s->img_y);
04604    if (hsz == 12) {
04605       if (bpp < 24)
04606          psize = (offset - 14 - 24) / 3;
04607    } else {
04608       compress = stbi__get32le(s);
04609       if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
04610       stbi__get32le(s); // discard sizeof
04611       stbi__get32le(s); // discard hres
04612       stbi__get32le(s); // discard vres
04613       stbi__get32le(s); // discard colorsused
04614       stbi__get32le(s); // discard max important
04615       if (hsz == 40 || hsz == 56) {
04616          if (hsz == 56) {
04617             stbi__get32le(s);
04618             stbi__get32le(s);
04619             stbi__get32le(s);
04620             stbi__get32le(s);
04621          }
04622          if (bpp == 16 || bpp == 32) {
04623             mr = mg = mb = 0;
04624             if (compress == 0) {
04625                if (bpp == 32) {
04626                   mr = 0xffu << 16;
04627                   mg = 0xffu <<  8;
04628                   mb = 0xffu <<  0;
04629                   ma = 0xffu << 24;
04630                   fake_a = 1; // @TODO: check for cases like alpha value is all 0 and switch it to 255
04631                   STBI_NOTUSED(fake_a);
04632                } else {
04633                   mr = 31u << 10;
04634                   mg = 31u <<  5;
04635                   mb = 31u <<  0;
04636                }
04637             } else if (compress == 3) {
04638                mr = stbi__get32le(s);
04639                mg = stbi__get32le(s);
04640                mb = stbi__get32le(s);
04641                // not documented, but generated by photoshop and handled by mspaint
04642                if (mr == mg && mg == mb) {
04643                   // ?!?!?
04644                   return stbi__errpuc("bad BMP", "bad BMP");
04645                }
04646             } else
04647                return stbi__errpuc("bad BMP", "bad BMP");
04648          }
04649       } else {
04650          STBI_ASSERT(hsz == 108 || hsz == 124);
04651          mr = stbi__get32le(s);
04652          mg = stbi__get32le(s);
04653          mb = stbi__get32le(s);
04654          ma = stbi__get32le(s);
04655          stbi__get32le(s); // discard color space
04656          for (i=0; i < 12; ++i)
04657             stbi__get32le(s); // discard color space parameters
04658          if (hsz == 124) {
04659             stbi__get32le(s); // discard rendering intent
04660             stbi__get32le(s); // discard offset of profile data
04661             stbi__get32le(s); // discard size of profile data
04662             stbi__get32le(s); // discard reserved
04663          }
04664       }
04665       if (bpp < 16)
04666          psize = (offset - 14 - hsz) >> 2;
04667    }
04668    s->img_n = ma ? 4 : 3;
04669    if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
04670       target = req_comp;
04671    else
04672       target = s->img_n; // if they want monochrome, we'll post-convert
04673    out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
04674    if (!out) return stbi__errpuc("outofmem", "Out of memory");
04675    if (bpp < 16) {
04676       int z=0;
04677       if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
04678       for (i=0; i < psize; ++i) {
04679          pal[i][2] = stbi__get8(s);
04680          pal[i][1] = stbi__get8(s);
04681          pal[i][0] = stbi__get8(s);
04682          if (hsz != 12) stbi__get8(s);
04683          pal[i][3] = 255;
04684       }
04685       stbi__skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
04686       if (bpp == 4) width = (s->img_x + 1) >> 1;
04687       else if (bpp == 8) width = s->img_x;
04688       else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
04689       pad = (-width)&3;
04690       for (j=0; j < (int) s->img_y; ++j) {
04691          for (i=0; i < (int) s->img_x; i += 2) {
04692             int v=stbi__get8(s),v2=0;
04693             if (bpp == 4) {
04694                v2 = v & 15;
04695                v >>= 4;
04696             }
04697             out[z++] = pal[v][0];
04698             out[z++] = pal[v][1];
04699             out[z++] = pal[v][2];
04700             if (target == 4) out[z++] = 255;
04701             if (i+1 == (int) s->img_x) break;
04702             v = (bpp == 8) ? stbi__get8(s) : v2;
04703             out[z++] = pal[v][0];
04704             out[z++] = pal[v][1];
04705             out[z++] = pal[v][2];
04706             if (target == 4) out[z++] = 255;
04707          }
04708          stbi__skip(s, pad);
04709       }
04710    } else {
04711       int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
04712       int z = 0;
04713       int easy=0;
04714       stbi__skip(s, offset - 14 - hsz);
04715       if (bpp == 24) width = 3 * s->img_x;
04716       else if (bpp == 16) width = 2*s->img_x;
04717       else /* bpp = 32 and pad = 0 */ width=0;
04718       pad = (-width) & 3;
04719       if (bpp == 24) {
04720          easy = 1;
04721       } else if (bpp == 32) {
04722          if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
04723             easy = 2;
04724       }
04725       if (!easy) {
04726          if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
04727          // right shift amt to put high bit in position #7
04728          rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
04729          gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
04730          bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
04731          ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
04732       }
04733       for (j=0; j < (int) s->img_y; ++j) {
04734          if (easy) {
04735             for (i=0; i < (int) s->img_x; ++i) {
04736                unsigned char a;
04737                out[z+2] = stbi__get8(s);
04738                out[z+1] = stbi__get8(s);
04739                out[z+0] = stbi__get8(s);
04740                z += 3;
04741                a = (easy == 2 ? stbi__get8(s) : 255);
04742                if (target == 4) out[z++] = a;
04743             }
04744          } else {
04745             for (i=0; i < (int) s->img_x; ++i) {
04746                stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
04747                int a;
04748                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
04749                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
04750                out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
04751                a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
04752                if (target == 4) out[z++] = STBI__BYTECAST(a);
04753             }
04754          }
04755          stbi__skip(s, pad);
04756       }
04757    }
04758    if (flip_vertically) {
04759       stbi_uc t;
04760       for (j=0; j < (int) s->img_y>>1; ++j) {
04761          stbi_uc *p1 = out +      j     *s->img_x*target;
04762          stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
04763          for (i=0; i < (int) s->img_x*target; ++i) {
04764             t = p1[i], p1[i] = p2[i], p2[i] = t;
04765          }
04766       }
04767    }
04768 
04769    if (req_comp && req_comp != target) {
04770       out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
04771       if (out == NULL) return out; // stbi__convert_format frees input on failure
04772    }
04773 
04774    *x = s->img_x;
04775    *y = s->img_y;
04776    if (comp) *comp = s->img_n;
04777    return out;
04778 }
04779 #endif
04780 
04781 // Targa Truevision - TGA
04782 // by Jonathan Dummer
04783 #ifndef STBI_NO_TGA
04784 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
04785 {
04786     int tga_w, tga_h, tga_comp;
04787     int sz;
04788     stbi__get8(s);                   // discard Offset
04789     sz = stbi__get8(s);              // color type
04790     if( sz > 1 ) {
04791         stbi__rewind(s);
04792         return 0;      // only RGB or indexed allowed
04793     }
04794     sz = stbi__get8(s);              // image type
04795     // only RGB or grey allowed, +/- RLE
04796     if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
04797     stbi__skip(s,9);
04798     tga_w = stbi__get16le(s);
04799     if( tga_w < 1 ) {
04800         stbi__rewind(s);
04801         return 0;   // test width
04802     }
04803     tga_h = stbi__get16le(s);
04804     if( tga_h < 1 ) {
04805         stbi__rewind(s);
04806         return 0;   // test height
04807     }
04808     sz = stbi__get8(s);               // bits per pixel
04809     // only RGB or RGBA or grey allowed
04810     if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
04811         stbi__rewind(s);
04812         return 0;
04813     }
04814     tga_comp = sz;
04815     if (x) *x = tga_w;
04816     if (y) *y = tga_h;
04817     if (comp) *comp = tga_comp / 8;
04818     return 1;                   // seems to have passed everything
04819 }
04820 
04821 static int stbi__tga_test(stbi__context *s)
04822 {
04823    int res;
04824    int sz;
04825    stbi__get8(s);      //   discard Offset
04826    sz = stbi__get8(s);   //   color type
04827    if ( sz > 1 ) return 0;   //   only RGB or indexed allowed
04828    sz = stbi__get8(s);   //   image type
04829    if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0;   //   only RGB or grey allowed, +/- RLE
04830    stbi__get16be(s);      //   discard palette start
04831    stbi__get16be(s);      //   discard palette length
04832    stbi__get8(s);         //   discard bits per palette color entry
04833    stbi__get16be(s);      //   discard x origin
04834    stbi__get16be(s);      //   discard y origin
04835    if ( stbi__get16be(s) < 1 ) return 0;      //   test width
04836    if ( stbi__get16be(s) < 1 ) return 0;      //   test height
04837    sz = stbi__get8(s);   //   bits per pixel
04838    if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) )
04839       res = 0;
04840    else
04841       res = 1;
04842    stbi__rewind(s);
04843    return res;
04844 }
04845 
04846 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
04847 {
04848    //   read in the TGA header stuff
04849    int tga_offset = stbi__get8(s);
04850    int tga_indexed = stbi__get8(s);
04851    int tga_image_type = stbi__get8(s);
04852    int tga_is_RLE = 0;
04853    int tga_palette_start = stbi__get16le(s);
04854    int tga_palette_len = stbi__get16le(s);
04855    int tga_palette_bits = stbi__get8(s);
04856    int tga_x_origin = stbi__get16le(s);
04857    int tga_y_origin = stbi__get16le(s);
04858    int tga_width = stbi__get16le(s);
04859    int tga_height = stbi__get16le(s);
04860    int tga_bits_per_pixel = stbi__get8(s);
04861    int tga_comp = tga_bits_per_pixel / 8;
04862    int tga_inverted = stbi__get8(s);
04863    //   image data
04864    unsigned char *tga_data;
04865    unsigned char *tga_palette = NULL;
04866    int i, j;
04867    unsigned char raw_data[4];
04868    int RLE_count = 0;
04869    int RLE_repeating = 0;
04870    int read_next_pixel = 1;
04871 
04872    //   do a tiny bit of precessing
04873    if ( tga_image_type >= 8 )
04874    {
04875       tga_image_type -= 8;
04876       tga_is_RLE = 1;
04877    }
04878    /* int tga_alpha_bits = tga_inverted & 15; */
04879    tga_inverted = 1 - ((tga_inverted >> 5) & 1);
04880 
04881    //   error check
04882    if ( //(tga_indexed) ||
04883       (tga_width < 1) || (tga_height < 1) ||
04884       (tga_image_type < 1) || (tga_image_type > 3) ||
04885       ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
04886       (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
04887       )
04888    {
04889       return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
04890    }
04891 
04892    //   If I'm paletted, then I'll use the number of bits from the palette
04893    if ( tga_indexed )
04894    {
04895       tga_comp = tga_palette_bits / 8;
04896    }
04897 
04898    //   tga info
04899    *x = tga_width;
04900    *y = tga_height;
04901    if (comp) *comp = tga_comp;
04902 
04903    tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
04904    if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
04905 
04906    // skip to the data's starting position (offset usually = 0)
04907    stbi__skip(s, tga_offset );
04908 
04909    if ( !tga_indexed && !tga_is_RLE) {
04910       for (i=0; i < tga_height; ++i) {
04911          int y = tga_inverted ? tga_height -i - 1 : i;
04912          stbi_uc *tga_row = tga_data + y*tga_width*tga_comp;
04913          stbi__getn(s, tga_row, tga_width * tga_comp);
04914       }
04915    } else  {
04916       //   do I need to load a palette?
04917       if ( tga_indexed)
04918       {
04919          //   any data to skip? (offset usually = 0)
04920          stbi__skip(s, tga_palette_start );
04921          //   load the palette
04922          tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_palette_bits / 8 );
04923          if (!tga_palette) {
04924             STBI_FREE(tga_data);
04925             return stbi__errpuc("outofmem", "Out of memory");
04926          }
04927          if (!stbi__getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
04928             STBI_FREE(tga_data);
04929             STBI_FREE(tga_palette);
04930             return stbi__errpuc("bad palette", "Corrupt TGA");
04931          }
04932       }
04933       //   load the data
04934       for (i=0; i < tga_width * tga_height; ++i)
04935       {
04936          //   if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
04937          if ( tga_is_RLE )
04938          {
04939             if ( RLE_count == 0 )
04940             {
04941                //   yep, get the next byte as a RLE command
04942                int RLE_cmd = stbi__get8(s);
04943                RLE_count = 1 + (RLE_cmd & 127);
04944                RLE_repeating = RLE_cmd >> 7;
04945                read_next_pixel = 1;
04946             } else if ( !RLE_repeating )
04947             {
04948                read_next_pixel = 1;
04949             }
04950          } else
04951          {
04952             read_next_pixel = 1;
04953          }
04954          //   OK, if I need to read a pixel, do it now
04955          if ( read_next_pixel )
04956          {
04957             //   load however much data we did have
04958             if ( tga_indexed )
04959             {
04960                //   read in 1 byte, then perform the lookup
04961                int pal_idx = stbi__get8(s);
04962                if ( pal_idx >= tga_palette_len )
04963                {
04964                   //   invalid index
04965                   pal_idx = 0;
04966                }
04967                pal_idx *= tga_bits_per_pixel / 8;
04968                for (j = 0; j*8 < tga_bits_per_pixel; ++j)
04969                {
04970                   raw_data[j] = tga_palette[pal_idx+j];
04971                }
04972             } else
04973             {
04974                //   read in the data raw
04975                for (j = 0; j*8 < tga_bits_per_pixel; ++j)
04976                {
04977                   raw_data[j] = stbi__get8(s);
04978                }
04979             }
04980             //   clear the reading flag for the next pixel
04981             read_next_pixel = 0;
04982          } // end of reading a pixel
04983 
04984          // copy data
04985          for (j = 0; j < tga_comp; ++j)
04986            tga_data[i*tga_comp+j] = raw_data[j];
04987 
04988          //   in case we're in RLE mode, keep counting down
04989          --RLE_count;
04990       }
04991       //   do I need to invert the image?
04992       if ( tga_inverted )
04993       {
04994          for (j = 0; j*2 < tga_height; ++j)
04995          {
04996             int index1 = j * tga_width * tga_comp;
04997             int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
04998             for (i = tga_width * tga_comp; i > 0; --i)
04999             {
05000                unsigned char temp = tga_data[index1];
05001                tga_data[index1] = tga_data[index2];
05002                tga_data[index2] = temp;
05003                ++index1;
05004                ++index2;
05005             }
05006          }
05007       }
05008       //   clear my palette, if I had one
05009       if ( tga_palette != NULL )
05010       {
05011          STBI_FREE( tga_palette );
05012       }
05013    }
05014 
05015    // swap RGB
05016    if (tga_comp >= 3)
05017    {
05018       unsigned char* tga_pixel = tga_data;
05019       for (i=0; i < tga_width * tga_height; ++i)
05020       {
05021          unsigned char temp = tga_pixel[0];
05022          tga_pixel[0] = tga_pixel[2];
05023          tga_pixel[2] = temp;
05024          tga_pixel += tga_comp;
05025       }
05026    }
05027 
05028    // convert to target component count
05029    if (req_comp && req_comp != tga_comp)
05030       tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
05031 
05032    //   the things I do to get rid of an error message, and yet keep
05033    //   Microsoft's C compilers happy... [8^(
05034    tga_palette_start = tga_palette_len = tga_palette_bits =
05035          tga_x_origin = tga_y_origin = 0;
05036    //   OK, done
05037    return tga_data;
05038 }
05039 #endif
05040 
05041 // *************************************************************************************************
05042 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
05043 
05044 #ifndef STBI_NO_PSD
05045 static int stbi__psd_test(stbi__context *s)
05046 {
05047    int r = (stbi__get32be(s) == 0x38425053);
05048    stbi__rewind(s);
05049    return r;
05050 }
05051 
05052 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
05053 {
05054    int   pixelCount;
05055    int channelCount, compression;
05056    int channel, i, count, len;
05057    int w,h;
05058    stbi_uc *out;
05059 
05060    // Check identifier
05061    if (stbi__get32be(s) != 0x38425053)   // "8BPS"
05062       return stbi__errpuc("not PSD", "Corrupt PSD image");
05063 
05064    // Check file type version.
05065    if (stbi__get16be(s) != 1)
05066       return stbi__errpuc("wrong version", "Unsupported version of PSD image");
05067 
05068    // Skip 6 reserved bytes.
05069    stbi__skip(s, 6 );
05070 
05071    // Read the number of channels (R, G, B, A, etc).
05072    channelCount = stbi__get16be(s);
05073    if (channelCount < 0 || channelCount > 16)
05074       return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
05075 
05076    // Read the rows and columns of the image.
05077    h = stbi__get32be(s);
05078    w = stbi__get32be(s);
05079 
05080    // Make sure the depth is 8 bits.
05081    if (stbi__get16be(s) != 8)
05082       return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 bit");
05083 
05084    // Make sure the color mode is RGB.
05085    // Valid options are:
05086    //   0: Bitmap
05087    //   1: Grayscale
05088    //   2: Indexed color
05089    //   3: RGB color
05090    //   4: CMYK color
05091    //   7: Multichannel
05092    //   8: Duotone
05093    //   9: Lab color
05094    if (stbi__get16be(s) != 3)
05095       return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
05096 
05097    // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
05098    stbi__skip(s,stbi__get32be(s) );
05099 
05100    // Skip the image resources.  (resolution, pen tool paths, etc)
05101    stbi__skip(s, stbi__get32be(s) );
05102 
05103    // Skip the reserved data.
05104    stbi__skip(s, stbi__get32be(s) );
05105 
05106    // Find out if the data is compressed.
05107    // Known values:
05108    //   0: no compression
05109    //   1: RLE compressed
05110    compression = stbi__get16be(s);
05111    if (compression > 1)
05112       return stbi__errpuc("bad compression", "PSD has an unknown compression format");
05113 
05114    // Create the destination image.
05115    out = (stbi_uc *) stbi__malloc(4 * w*h);
05116    if (!out) return stbi__errpuc("outofmem", "Out of memory");
05117    pixelCount = w*h;
05118 
05119    // Initialize the data to zero.
05120    //memset( out, 0, pixelCount * 4 );
05121 
05122    // Finally, the image data.
05123    if (compression) {
05124       // RLE as used by .PSD and .TIFF
05125       // Loop until you get the number of unpacked bytes you are expecting:
05126       //     Read the next source byte into n.
05127       //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
05128       //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
05129       //     Else if n is 128, noop.
05130       // Endloop
05131 
05132       // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
05133       // which we're going to just skip.
05134       stbi__skip(s, h * channelCount * 2 );
05135 
05136       // Read the RLE data by channel.
05137       for (channel = 0; channel < 4; channel++) {
05138          stbi_uc *p;
05139 
05140          p = out+channel;
05141          if (channel >= channelCount) {
05142             // Fill this channel with default data.
05143             for (i = 0; i < pixelCount; i++, p += 4)
05144                *p = (channel == 3 ? 255 : 0);
05145          } else {
05146             // Read the RLE data.
05147             count = 0;
05148             while (count < pixelCount) {
05149                len = stbi__get8(s);
05150                if (len == 128) {
05151                   // No-op.
05152                } else if (len < 128) {
05153                   // Copy next len+1 bytes literally.
05154                   len++;
05155                   count += len;
05156                   while (len) {
05157                      *p = stbi__get8(s);
05158                      p += 4;
05159                      len--;
05160                   }
05161                } else if (len > 128) {
05162                   stbi_uc   val;
05163                   // Next -len+1 bytes in the dest are replicated from next source byte.
05164                   // (Interpret len as a negative 8-bit int.)
05165                   len ^= 0x0FF;
05166                   len += 2;
05167                   val = stbi__get8(s);
05168                   count += len;
05169                   while (len) {
05170                      *p = val;
05171                      p += 4;
05172                      len--;
05173                   }
05174                }
05175             }
05176          }
05177       }
05178 
05179    } else {
05180       // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
05181       // where each channel consists of an 8-bit value for each pixel in the image.
05182 
05183       // Read the data by channel.
05184       for (channel = 0; channel < 4; channel++) {
05185          stbi_uc *p;
05186 
05187          p = out + channel;
05188          if (channel > channelCount) {
05189             // Fill this channel with default data.
05190             for (i = 0; i < pixelCount; i++, p += 4)
05191                *p = channel == 3 ? 255 : 0;
05192          } else {
05193             // Read the data.
05194             for (i = 0; i < pixelCount; i++, p += 4)
05195                *p = stbi__get8(s);
05196          }
05197       }
05198    }
05199 
05200    if (req_comp && req_comp != 4) {
05201       out = stbi__convert_format(out, 4, req_comp, w, h);
05202       if (out == NULL) return out; // stbi__convert_format frees input on failure
05203    }
05204 
05205    if (comp) *comp = 4;
05206    *y = h;
05207    *x = w;
05208 
05209    return out;
05210 }
05211 #endif
05212 
05213 // *************************************************************************************************
05214 // Softimage PIC loader
05215 // by Tom Seddon
05216 //
05217 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
05218 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
05219 
05220 #ifndef STBI_NO_PIC
05221 static int stbi__pic_is4(stbi__context *s,const char *str)
05222 {
05223    int i;
05224    for (i=0; i<4; ++i)
05225       if (stbi__get8(s) != (stbi_uc)str[i])
05226          return 0;
05227 
05228    return 1;
05229 }
05230 
05231 static int stbi__pic_test_core(stbi__context *s)
05232 {
05233    int i;
05234 
05235    if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
05236       return 0;
05237 
05238    for(i=0;i<84;++i)
05239       stbi__get8(s);
05240 
05241    if (!stbi__pic_is4(s,"PICT"))
05242       return 0;
05243 
05244    return 1;
05245 }
05246 
05247 typedef struct
05248 {
05249    stbi_uc size,type,channel;
05250 } stbi__pic_packet;
05251 
05252 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
05253 {
05254    int mask=0x80, i;
05255 
05256    for (i=0; i<4; ++i, mask>>=1) {
05257       if (channel & mask) {
05258          if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
05259          dest[i]=stbi__get8(s);
05260       }
05261    }
05262 
05263    return dest;
05264 }
05265 
05266 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
05267 {
05268    int mask=0x80,i;
05269 
05270    for (i=0;i<4; ++i, mask>>=1)
05271       if (channel&mask)
05272          dest[i]=src[i];
05273 }
05274 
05275 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
05276 {
05277    int act_comp=0,num_packets=0,y,chained;
05278    stbi__pic_packet packets[10];
05279 
05280    // this will (should...) cater for even some bizarre stuff like having data
05281     // for the same channel in multiple packets.
05282    do {
05283       stbi__pic_packet *packet;
05284 
05285       if (num_packets==sizeof(packets)/sizeof(packets[0]))
05286          return stbi__errpuc("bad format","too many packets");
05287 
05288       packet = &packets[num_packets++];
05289 
05290       chained = stbi__get8(s);
05291       packet->size    = stbi__get8(s);
05292       packet->type    = stbi__get8(s);
05293       packet->channel = stbi__get8(s);
05294 
05295       act_comp |= packet->channel;
05296 
05297       if (stbi__at_eof(s))          return stbi__errpuc("bad file","file too short (reading packets)");
05298       if (packet->size != 8)  return stbi__errpuc("bad format","packet isn't 8bpp");
05299    } while (chained);
05300 
05301    *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
05302 
05303    for(y=0; y<height; ++y) {
05304       int packet_idx;
05305 
05306       for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
05307          stbi__pic_packet *packet = &packets[packet_idx];
05308          stbi_uc *dest = result+y*width*4;
05309 
05310          switch (packet->type) {
05311             default:
05312                return stbi__errpuc("bad format","packet has bad compression type");
05313 
05314             case 0: {//uncompressed
05315                int x;
05316 
05317                for(x=0;x<width;++x, dest+=4)
05318                   if (!stbi__readval(s,packet->channel,dest))
05319                      return 0;
05320                break;
05321             }
05322 
05323             case 1://Pure RLE
05324                {
05325                   int left=width, i;
05326 
05327                   while (left>0) {
05328                      stbi_uc count,value[4];
05329 
05330                      count=stbi__get8(s);
05331                      if (stbi__at_eof(s))   return stbi__errpuc("bad file","file too short (pure read count)");
05332 
05333                      if (count > left)
05334                         count = (stbi_uc) left;
05335 
05336                      if (!stbi__readval(s,packet->channel,value))  return 0;
05337 
05338                      for(i=0; i<count; ++i,dest+=4)
05339                         stbi__copyval(packet->channel,dest,value);
05340                      left -= count;
05341                   }
05342                }
05343                break;
05344 
05345             case 2: {//Mixed RLE
05346                int left=width;
05347                while (left>0) {
05348                   int count = stbi__get8(s), i;
05349                   if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (mixed read count)");
05350 
05351                   if (count >= 128) { // Repeated
05352                      stbi_uc value[4];
05353                      int i;
05354 
05355                      if (count==128)
05356                         count = stbi__get16be(s);
05357                      else
05358                         count -= 127;
05359                      if (count > left)
05360                         return stbi__errpuc("bad file","scanline overrun");
05361 
05362                      if (!stbi__readval(s,packet->channel,value))
05363                         return 0;
05364 
05365                      for(i=0;i<count;++i, dest += 4)
05366                         stbi__copyval(packet->channel,dest,value);
05367                   } else { // Raw
05368                      ++count;
05369                      if (count>left) return stbi__errpuc("bad file","scanline overrun");
05370 
05371                      for(i=0;i<count;++i, dest+=4)
05372                         if (!stbi__readval(s,packet->channel,dest))
05373                            return 0;
05374                   }
05375                   left-=count;
05376                }
05377                break;
05378             }
05379          }
05380       }
05381    }
05382 
05383    return result;
05384 }
05385 
05386 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
05387 {
05388    stbi_uc *result;
05389    int i, x,y;
05390 
05391    for (i=0; i<92; ++i)
05392       stbi__get8(s);
05393 
05394    x = stbi__get16be(s);
05395    y = stbi__get16be(s);
05396    if (stbi__at_eof(s))  return stbi__errpuc("bad file","file too short (pic header)");
05397    if ((1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
05398 
05399    stbi__get32be(s); //skip `ratio'
05400    stbi__get16be(s); //skip `fields'
05401    stbi__get16be(s); //skip `pad'
05402 
05403    // intermediate buffer is RGBA
05404    result = (stbi_uc *) stbi__malloc(x*y*4);
05405    memset(result, 0xff, x*y*4);
05406 
05407    if (!stbi__pic_load_core(s,x,y,comp, result)) {
05408       STBI_FREE(result);
05409       result=0;
05410    }
05411    *px = x;
05412    *py = y;
05413    if (req_comp == 0) req_comp = *comp;
05414    result=stbi__convert_format(result,4,req_comp,x,y);
05415 
05416    return result;
05417 }
05418 
05419 static int stbi__pic_test(stbi__context *s)
05420 {
05421    int r = stbi__pic_test_core(s);
05422    stbi__rewind(s);
05423    return r;
05424 }
05425 #endif
05426 
05427 // *************************************************************************************************
05428 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
05429 
05430 #ifndef STBI_NO_GIF
05431 typedef struct
05432 {
05433    stbi__int16 prefix;
05434    stbi_uc first;
05435    stbi_uc suffix;
05436 } stbi__gif_lzw;
05437 
05438 typedef struct
05439 {
05440    int w,h;
05441    stbi_uc *out;                 // output buffer (always 4 components)
05442    int flags, bgindex, ratio, transparent, eflags;
05443    stbi_uc  pal[256][4];
05444    stbi_uc lpal[256][4];
05445    stbi__gif_lzw codes[4096];
05446    stbi_uc *color_table;
05447    int parse, step;
05448    int lflags;
05449    int start_x, start_y;
05450    int max_x, max_y;
05451    int cur_x, cur_y;
05452    int line_size;
05453 } stbi__gif;
05454 
05455 static int stbi__gif_test_raw(stbi__context *s)
05456 {
05457    int sz;
05458    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
05459    sz = stbi__get8(s);
05460    if (sz != '9' && sz != '7') return 0;
05461    if (stbi__get8(s) != 'a') return 0;
05462    return 1;
05463 }
05464 
05465 static int stbi__gif_test(stbi__context *s)
05466 {
05467    int r = stbi__gif_test_raw(s);
05468    stbi__rewind(s);
05469    return r;
05470 }
05471 
05472 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
05473 {
05474    int i;
05475    for (i=0; i < num_entries; ++i) {
05476       pal[i][2] = stbi__get8(s);
05477       pal[i][1] = stbi__get8(s);
05478       pal[i][0] = stbi__get8(s);
05479       pal[i][3] = transp == i ? 0 : 255;
05480    }
05481 }
05482 
05483 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
05484 {
05485    stbi_uc version;
05486    if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
05487       return stbi__err("not GIF", "Corrupt GIF");
05488 
05489    version = stbi__get8(s);
05490    if (version != '7' && version != '9')    return stbi__err("not GIF", "Corrupt GIF");
05491    if (stbi__get8(s) != 'a')                return stbi__err("not GIF", "Corrupt GIF");
05492 
05493    stbi__g_failure_reason = "";
05494    g->w = stbi__get16le(s);
05495    g->h = stbi__get16le(s);
05496    g->flags = stbi__get8(s);
05497    g->bgindex = stbi__get8(s);
05498    g->ratio = stbi__get8(s);
05499    g->transparent = -1;
05500 
05501    if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
05502 
05503    if (is_info) return 1;
05504 
05505    if (g->flags & 0x80)
05506       stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
05507 
05508    return 1;
05509 }
05510 
05511 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
05512 {
05513    stbi__gif g;
05514    if (!stbi__gif_header(s, &g, comp, 1)) {
05515       stbi__rewind( s );
05516       return 0;
05517    }
05518    if (x) *x = g.w;
05519    if (y) *y = g.h;
05520    return 1;
05521 }
05522 
05523 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
05524 {
05525    stbi_uc *p, *c;
05526 
05527    // recurse to decode the prefixes, since the linked-list is backwards,
05528    // and working backwards through an interleaved image would be nasty
05529    if (g->codes[code].prefix >= 0)
05530       stbi__out_gif_code(g, g->codes[code].prefix);
05531 
05532    if (g->cur_y >= g->max_y) return;
05533 
05534    p = &g->out[g->cur_x + g->cur_y];
05535    c = &g->color_table[g->codes[code].suffix * 4];
05536 
05537    if (c[3] >= 128) {
05538       p[0] = c[2];
05539       p[1] = c[1];
05540       p[2] = c[0];
05541       p[3] = c[3];
05542    }
05543    g->cur_x += 4;
05544 
05545    if (g->cur_x >= g->max_x) {
05546       g->cur_x = g->start_x;
05547       g->cur_y += g->step;
05548 
05549       while (g->cur_y >= g->max_y && g->parse > 0) {
05550          g->step = (1 << g->parse) * g->line_size;
05551          g->cur_y = g->start_y + (g->step >> 1);
05552          --g->parse;
05553       }
05554    }
05555 }
05556 
05557 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
05558 {
05559    stbi_uc lzw_cs;
05560    stbi__int32 len, code;
05561    stbi__uint32 first;
05562    stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
05563    stbi__gif_lzw *p;
05564 
05565    lzw_cs = stbi__get8(s);
05566    if (lzw_cs > 12) return NULL;
05567    clear = 1 << lzw_cs;
05568    first = 1;
05569    codesize = lzw_cs + 1;
05570    codemask = (1 << codesize) - 1;
05571    bits = 0;
05572    valid_bits = 0;
05573    for (code = 0; code < clear; code++) {
05574       g->codes[code].prefix = -1;
05575       g->codes[code].first = (stbi_uc) code;
05576       g->codes[code].suffix = (stbi_uc) code;
05577    }
05578 
05579    // support no starting clear code
05580    avail = clear+2;
05581    oldcode = -1;
05582 
05583    len = 0;
05584    for(;;) {
05585       if (valid_bits < codesize) {
05586          if (len == 0) {
05587             len = stbi__get8(s); // start new block
05588             if (len == 0)
05589                return g->out;
05590          }
05591          --len;
05592          bits |= (stbi__int32) stbi__get8(s) << valid_bits;
05593          valid_bits += 8;
05594       } else {
05595          stbi__int32 code = bits & codemask;
05596          bits >>= codesize;
05597          valid_bits -= codesize;
05598          // @OPTIMIZE: is there some way we can accelerate the non-clear path?
05599          if (code == clear) {  // clear code
05600             codesize = lzw_cs + 1;
05601             codemask = (1 << codesize) - 1;
05602             avail = clear + 2;
05603             oldcode = -1;
05604             first = 0;
05605          } else if (code == clear + 1) { // end of stream code
05606             stbi__skip(s, len);
05607             while ((len = stbi__get8(s)) > 0)
05608                stbi__skip(s,len);
05609             return g->out;
05610          } else if (code <= avail) {
05611             if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
05612 
05613             if (oldcode >= 0) {
05614                p = &g->codes[avail++];
05615                if (avail > 4096)        return stbi__errpuc("too many codes", "Corrupt GIF");
05616                p->prefix = (stbi__int16) oldcode;
05617                p->first = g->codes[oldcode].first;
05618                p->suffix = (code == avail) ? p->first : g->codes[code].first;
05619             } else if (code == avail)
05620                return stbi__errpuc("illegal code in raster", "Corrupt GIF");
05621 
05622             stbi__out_gif_code(g, (stbi__uint16) code);
05623 
05624             if ((avail & codemask) == 0 && avail <= 0x0FFF) {
05625                codesize++;
05626                codemask = (1 << codesize) - 1;
05627             }
05628 
05629             oldcode = code;
05630          } else {
05631             return stbi__errpuc("illegal code in raster", "Corrupt GIF");
05632          }
05633       }
05634    }
05635 }
05636 
05637 static void stbi__fill_gif_background(stbi__gif *g)
05638 {
05639    int i;
05640    stbi_uc *c = g->pal[g->bgindex];
05641    // @OPTIMIZE: write a dword at a time
05642    for (i = 0; i < g->w * g->h * 4; i += 4) {
05643       stbi_uc *p  = &g->out[i];
05644       p[0] = c[2];
05645       p[1] = c[1];
05646       p[2] = c[0];
05647       p[3] = c[3];
05648    }
05649 }
05650 
05651 // this function is designed to support animated gifs, although stb_image doesn't support it
05652 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
05653 {
05654    int i;
05655    stbi_uc *old_out = 0;
05656 
05657    if (g->out == 0) {
05658       if (!stbi__gif_header(s, g, comp,0))     return 0; // stbi__g_failure_reason set by stbi__gif_header
05659       g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
05660       if (g->out == 0)                      return stbi__errpuc("outofmem", "Out of memory");
05661       stbi__fill_gif_background(g);
05662    } else {
05663       // animated-gif-only path
05664       if (((g->eflags & 0x1C) >> 2) == 3) {
05665          old_out = g->out;
05666          g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
05667          if (g->out == 0)                   return stbi__errpuc("outofmem", "Out of memory");
05668          memcpy(g->out, old_out, g->w*g->h*4);
05669       }
05670    }
05671 
05672    for (;;) {
05673       switch (stbi__get8(s)) {
05674          case 0x2C: /* Image Descriptor */
05675          {
05676             stbi__int32 x, y, w, h;
05677             stbi_uc *o;
05678 
05679             x = stbi__get16le(s);
05680             y = stbi__get16le(s);
05681             w = stbi__get16le(s);
05682             h = stbi__get16le(s);
05683             if (((x + w) > (g->w)) || ((y + h) > (g->h)))
05684                return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
05685 
05686             g->line_size = g->w * 4;
05687             g->start_x = x * 4;
05688             g->start_y = y * g->line_size;
05689             g->max_x   = g->start_x + w * 4;
05690             g->max_y   = g->start_y + h * g->line_size;
05691             g->cur_x   = g->start_x;
05692             g->cur_y   = g->start_y;
05693 
05694             g->lflags = stbi__get8(s);
05695 
05696             if (g->lflags & 0x40) {
05697                g->step = 8 * g->line_size; // first interlaced spacing
05698                g->parse = 3;
05699             } else {
05700                g->step = g->line_size;
05701                g->parse = 0;
05702             }
05703 
05704             if (g->lflags & 0x80) {
05705                stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
05706                g->color_table = (stbi_uc *) g->lpal;
05707             } else if (g->flags & 0x80) {
05708                for (i=0; i < 256; ++i)  // @OPTIMIZE: stbi__jpeg_reset only the previous transparent
05709                   g->pal[i][3] = 255;
05710                if (g->transparent >= 0 && (g->eflags & 0x01))
05711                   g->pal[g->transparent][3] = 0;
05712                g->color_table = (stbi_uc *) g->pal;
05713             } else
05714                return stbi__errpuc("missing color table", "Corrupt GIF");
05715 
05716             o = stbi__process_gif_raster(s, g);
05717             if (o == NULL) return NULL;
05718 
05719             if (req_comp && req_comp != 4)
05720                o = stbi__convert_format(o, 4, req_comp, g->w, g->h);
05721             return o;
05722          }
05723 
05724          case 0x21: // Comment Extension.
05725          {
05726             int len;
05727             if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
05728                len = stbi__get8(s);
05729                if (len == 4) {
05730                   g->eflags = stbi__get8(s);
05731                   stbi__get16le(s); // delay
05732                   g->transparent = stbi__get8(s);
05733                } else {
05734                   stbi__skip(s, len);
05735                   break;
05736                }
05737             }
05738             while ((len = stbi__get8(s)) != 0)
05739                stbi__skip(s, len);
05740             break;
05741          }
05742 
05743          case 0x3B: // gif stream termination code
05744             return (stbi_uc *) s; // using '1' causes warning on some compilers
05745 
05746          default:
05747             return stbi__errpuc("unknown code", "Corrupt GIF");
05748       }
05749    }
05750 }
05751 
05752 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
05753 {
05754    stbi_uc *u = 0;
05755    stbi__gif g;
05756    memset(&g, 0, sizeof(g));
05757 
05758    u = stbi__gif_load_next(s, &g, comp, req_comp);
05759    if (u == (stbi_uc *) s) u = 0;  // end of animated gif marker
05760    if (u) {
05761       *x = g.w;
05762       *y = g.h;
05763    }
05764 
05765    return u;
05766 }
05767 
05768 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
05769 {
05770    return stbi__gif_info_raw(s,x,y,comp);
05771 }
05772 #endif
05773 
05774 // *************************************************************************************************
05775 // Radiance RGBE HDR loader
05776 // originally by Nicolas Schulz
05777 #ifndef STBI_NO_HDR
05778 static int stbi__hdr_test_core(stbi__context *s)
05779 {
05780    const char *signature = "#?RADIANCE\n";
05781    int i;
05782    for (i=0; signature[i]; ++i)
05783       if (stbi__get8(s) != signature[i])
05784          return 0;
05785    return 1;
05786 }
05787 
05788 static int stbi__hdr_test(stbi__context* s)
05789 {
05790    int r = stbi__hdr_test_core(s);
05791    stbi__rewind(s);
05792    return r;
05793 }
05794 
05795 #define STBI__HDR_BUFLEN  1024
05796 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
05797 {
05798    int len=0;
05799    char c = '\0';
05800 
05801    c = (char) stbi__get8(z);
05802 
05803    while (!stbi__at_eof(z) && c != '\n') {
05804       buffer[len++] = c;
05805       if (len == STBI__HDR_BUFLEN-1) {
05806          // flush to end of line
05807          while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
05808             ;
05809          break;
05810       }
05811       c = (char) stbi__get8(z);
05812    }
05813 
05814    buffer[len] = 0;
05815    return buffer;
05816 }
05817 
05818 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
05819 {
05820    if ( input[3] != 0 ) {
05821       float f1;
05822       // Exponent
05823       f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
05824       if (req_comp <= 2)
05825          output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
05826       else {
05827          output[0] = input[0] * f1;
05828          output[1] = input[1] * f1;
05829          output[2] = input[2] * f1;
05830       }
05831       if (req_comp == 2) output[1] = 1;
05832       if (req_comp == 4) output[3] = 1;
05833    } else {
05834       switch (req_comp) {
05835          case 4: output[3] = 1; /* fallthrough */
05836          case 3: output[0] = output[1] = output[2] = 0;
05837                  break;
05838          case 2: output[1] = 1; /* fallthrough */
05839          case 1: output[0] = 0;
05840                  break;
05841       }
05842    }
05843 }
05844 
05845 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
05846 {
05847    char buffer[STBI__HDR_BUFLEN];
05848    char *token;
05849    int valid = 0;
05850    int width, height;
05851    stbi_uc *scanline;
05852    float *hdr_data;
05853    int len;
05854    unsigned char count, value;
05855    int i, j, k, c1,c2, z;
05856 
05857 
05858    // Check identifier
05859    if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
05860       return stbi__errpf("not HDR", "Corrupt HDR image");
05861 
05862    // Parse header
05863    for(;;) {
05864       token = stbi__hdr_gettoken(s,buffer);
05865       if (token[0] == 0) break;
05866       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
05867    }
05868 
05869    if (!valid)    return stbi__errpf("unsupported format", "Unsupported HDR format");
05870 
05871    // Parse width and height
05872    // can't use sscanf() if we're not using stdio!
05873    token = stbi__hdr_gettoken(s,buffer);
05874    if (strncmp(token, "-Y ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
05875    token += 3;
05876    height = (int) strtol(token, &token, 10);
05877    while (*token == ' ') ++token;
05878    if (strncmp(token, "+X ", 3))  return stbi__errpf("unsupported data layout", "Unsupported HDR format");
05879    token += 3;
05880    width = (int) strtol(token, NULL, 10);
05881 
05882    *x = width;
05883    *y = height;
05884 
05885    if (comp) *comp = 3;
05886    if (req_comp == 0) req_comp = 3;
05887 
05888    // Read data
05889    hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
05890 
05891    // Load image data
05892    // image data is stored as some number of sca
05893    if ( width < 8 || width >= 32768) {
05894       // Read flat data
05895       for (j=0; j < height; ++j) {
05896          for (i=0; i < width; ++i) {
05897             stbi_uc rgbe[4];
05898            main_decode_loop:
05899             stbi__getn(s, rgbe, 4);
05900             stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
05901          }
05902       }
05903    } else {
05904       // Read RLE-encoded data
05905       scanline = NULL;
05906 
05907       for (j = 0; j < height; ++j) {
05908          c1 = stbi__get8(s);
05909          c2 = stbi__get8(s);
05910          len = stbi__get8(s);
05911          if (c1 != 2 || c2 != 2 || (len & 0x80)) {
05912             // not run-length encoded, so we have to actually use THIS data as a decoded
05913             // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
05914             stbi_uc rgbe[4];
05915             rgbe[0] = (stbi_uc) c1;
05916             rgbe[1] = (stbi_uc) c2;
05917             rgbe[2] = (stbi_uc) len;
05918             rgbe[3] = (stbi_uc) stbi__get8(s);
05919             stbi__hdr_convert(hdr_data, rgbe, req_comp);
05920             i = 1;
05921             j = 0;
05922             STBI_FREE(scanline);
05923             goto main_decode_loop; // yes, this makes no sense
05924          }
05925          len <<= 8;
05926          len |= stbi__get8(s);
05927          if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
05928          if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
05929 
05930          for (k = 0; k < 4; ++k) {
05931             i = 0;
05932             while (i < width) {
05933                count = stbi__get8(s);
05934                if (count > 128) {
05935                   // Run
05936                   value = stbi__get8(s);
05937                   count -= 128;
05938                   for (z = 0; z < count; ++z)
05939                      scanline[i++ * 4 + k] = value;
05940                } else {
05941                   // Dump
05942                   for (z = 0; z < count; ++z)
05943                      scanline[i++ * 4 + k] = stbi__get8(s);
05944                }
05945             }
05946          }
05947          for (i=0; i < width; ++i)
05948             stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
05949       }
05950       STBI_FREE(scanline);
05951    }
05952 
05953    return hdr_data;
05954 }
05955 
05956 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
05957 {
05958    char buffer[STBI__HDR_BUFLEN];
05959    char *token;
05960    int valid = 0;
05961 
05962    if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
05963        stbi__rewind( s );
05964        return 0;
05965    }
05966 
05967    for(;;) {
05968       token = stbi__hdr_gettoken(s,buffer);
05969       if (token[0] == 0) break;
05970       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
05971    }
05972 
05973    if (!valid) {
05974        stbi__rewind( s );
05975        return 0;
05976    }
05977    token = stbi__hdr_gettoken(s,buffer);
05978    if (strncmp(token, "-Y ", 3)) {
05979        stbi__rewind( s );
05980        return 0;
05981    }
05982    token += 3;
05983    *y = (int) strtol(token, &token, 10);
05984    while (*token == ' ') ++token;
05985    if (strncmp(token, "+X ", 3)) {
05986        stbi__rewind( s );
05987        return 0;
05988    }
05989    token += 3;
05990    *x = (int) strtol(token, NULL, 10);
05991    *comp = 3;
05992    return 1;
05993 }
05994 #endif // STBI_NO_HDR
05995 
05996 #ifndef STBI_NO_BMP
05997 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
05998 {
05999    int hsz;
06000    if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') {
06001        stbi__rewind( s );
06002        return 0;
06003    }
06004    stbi__skip(s,12);
06005    hsz = stbi__get32le(s);
06006    if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) {
06007        stbi__rewind( s );
06008        return 0;
06009    }
06010    if (hsz == 12) {
06011       *x = stbi__get16le(s);
06012       *y = stbi__get16le(s);
06013    } else {
06014       *x = stbi__get32le(s);
06015       *y = stbi__get32le(s);
06016    }
06017    if (stbi__get16le(s) != 1) {
06018        stbi__rewind( s );
06019        return 0;
06020    }
06021    *comp = stbi__get16le(s) / 8;
06022    return 1;
06023 }
06024 #endif
06025 
06026 #ifndef STBI_NO_PSD
06027 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
06028 {
06029    int channelCount;
06030    if (stbi__get32be(s) != 0x38425053) {
06031        stbi__rewind( s );
06032        return 0;
06033    }
06034    if (stbi__get16be(s) != 1) {
06035        stbi__rewind( s );
06036        return 0;
06037    }
06038    stbi__skip(s, 6);
06039    channelCount = stbi__get16be(s);
06040    if (channelCount < 0 || channelCount > 16) {
06041        stbi__rewind( s );
06042        return 0;
06043    }
06044    *y = stbi__get32be(s);
06045    *x = stbi__get32be(s);
06046    if (stbi__get16be(s) != 8) {
06047        stbi__rewind( s );
06048        return 0;
06049    }
06050    if (stbi__get16be(s) != 3) {
06051        stbi__rewind( s );
06052        return 0;
06053    }
06054    *comp = 4;
06055    return 1;
06056 }
06057 #endif
06058 
06059 #ifndef STBI_NO_PIC
06060 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
06061 {
06062    int act_comp=0,num_packets=0,chained;
06063    stbi__pic_packet packets[10];
06064 
06065    stbi__skip(s, 92);
06066 
06067    *x = stbi__get16be(s);
06068    *y = stbi__get16be(s);
06069    if (stbi__at_eof(s))  return 0;
06070    if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
06071        stbi__rewind( s );
06072        return 0;
06073    }
06074 
06075    stbi__skip(s, 8);
06076 
06077    do {
06078       stbi__pic_packet *packet;
06079 
06080       if (num_packets==sizeof(packets)/sizeof(packets[0]))
06081          return 0;
06082 
06083       packet = &packets[num_packets++];
06084       chained = stbi__get8(s);
06085       packet->size    = stbi__get8(s);
06086       packet->type    = stbi__get8(s);
06087       packet->channel = stbi__get8(s);
06088       act_comp |= packet->channel;
06089 
06090       if (stbi__at_eof(s)) {
06091           stbi__rewind( s );
06092           return 0;
06093       }
06094       if (packet->size != 8) {
06095           stbi__rewind( s );
06096           return 0;
06097       }
06098    } while (chained);
06099 
06100    *comp = (act_comp & 0x10 ? 4 : 3);
06101 
06102    return 1;
06103 }
06104 #endif
06105 
06106 // *************************************************************************************************
06107 // Portable Gray Map and Portable Pixel Map loader
06108 // by Ken Miller
06109 //
06110 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
06111 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
06112 //
06113 // Known limitations:
06114 //    Does not support comments in the header section
06115 //    Does not support ASCII image data (formats P2 and P3)
06116 //    Does not support 16-bit-per-channel
06117 
06118 #ifndef STBI_NO_PNM
06119 
06120 static int      stbi__pnm_test(stbi__context *s)
06121 {
06122    char p, t;
06123    p = (char) stbi__get8(s);
06124    t = (char) stbi__get8(s);
06125    if (p != 'P' || (t != '5' && t != '6')) {
06126        stbi__rewind( s );
06127        return 0;
06128    }
06129    return 1;
06130 }
06131 
06132 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
06133 {
06134    stbi_uc *out;
06135    if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
06136       return 0;
06137    *x = s->img_x;
06138    *y = s->img_y;
06139    *comp = s->img_n;
06140 
06141    out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
06142    if (!out) return stbi__errpuc("outofmem", "Out of memory");
06143    stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
06144 
06145    if (req_comp && req_comp != s->img_n) {
06146       out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
06147       if (out == NULL) return out; // stbi__convert_format frees input on failure
06148    }
06149    return out;
06150 }
06151 
06152 static int      stbi__pnm_isspace(char c)
06153 {
06154    return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
06155 }
06156 
06157 static void     stbi__pnm_skip_whitespace(stbi__context *s, char *c)
06158 {
06159    while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
06160       *c = (char) stbi__get8(s);
06161 }
06162 
06163 static int      stbi__pnm_isdigit(char c)
06164 {
06165    return c >= '0' && c <= '9';
06166 }
06167 
06168 static int      stbi__pnm_getinteger(stbi__context *s, char *c)
06169 {
06170    int value = 0;
06171 
06172    while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
06173       value = value*10 + (*c - '0');
06174       *c = (char) stbi__get8(s);
06175    }
06176 
06177    return value;
06178 }
06179 
06180 static int      stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
06181 {
06182    int maxv;
06183    char c, p, t;
06184 
06185    stbi__rewind( s );
06186 
06187    // Get identifier
06188    p = (char) stbi__get8(s);
06189    t = (char) stbi__get8(s);
06190    if (p != 'P' || (t != '5' && t != '6')) {
06191        stbi__rewind( s );
06192        return 0;
06193    }
06194 
06195    *comp = (t == '6') ? 3 : 1;  // '5' is 1-component .pgm; '6' is 3-component .ppm
06196 
06197    c = (char) stbi__get8(s);
06198    stbi__pnm_skip_whitespace(s, &c);
06199 
06200    *x = stbi__pnm_getinteger(s, &c); // read width
06201    stbi__pnm_skip_whitespace(s, &c);
06202 
06203    *y = stbi__pnm_getinteger(s, &c); // read height
06204    stbi__pnm_skip_whitespace(s, &c);
06205 
06206    maxv = stbi__pnm_getinteger(s, &c);  // read max value
06207 
06208    if (maxv > 255)
06209       return stbi__err("max value > 255", "PPM image not 8-bit");
06210    else
06211       return 1;
06212 }
06213 #endif
06214 
06215 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
06216 {
06217    #ifndef STBI_NO_JPEG
06218    if (stbi__jpeg_info(s, x, y, comp)) return 1;
06219    #endif
06220 
06221    #ifndef STBI_NO_PNG
06222    if (stbi__png_info(s, x, y, comp))  return 1;
06223    #endif
06224 
06225    #ifndef STBI_NO_GIF
06226    if (stbi__gif_info(s, x, y, comp))  return 1;
06227    #endif
06228 
06229    #ifndef STBI_NO_BMP
06230    if (stbi__bmp_info(s, x, y, comp))  return 1;
06231    #endif
06232 
06233    #ifndef STBI_NO_PSD
06234    if (stbi__psd_info(s, x, y, comp))  return 1;
06235    #endif
06236 
06237    #ifndef STBI_NO_PIC
06238    if (stbi__pic_info(s, x, y, comp))  return 1;
06239    #endif
06240 
06241    #ifndef STBI_NO_PNM
06242    if (stbi__pnm_info(s, x, y, comp))  return 1;
06243    #endif
06244 
06245    #ifndef STBI_NO_HDR
06246    if (stbi__hdr_info(s, x, y, comp))  return 1;
06247    #endif
06248 
06249    // test tga last because it's a crappy test!
06250    #ifndef STBI_NO_TGA
06251    if (stbi__tga_info(s, x, y, comp))
06252        return 1;
06253    #endif
06254    return stbi__err("unknown image type", "Image not of any known type, or corrupt");
06255 }
06256 
06257 #ifndef STBI_NO_STDIO
06258 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
06259 {
06260     FILE *f = stbi__fopen(filename, "rb");
06261     int result;
06262     if (!f) return stbi__err("can't fopen", "Unable to open file");
06263     result = stbi_info_from_file(f, x, y, comp);
06264     fclose(f);
06265     return result;
06266 }
06267 
06268 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
06269 {
06270    int r;
06271    stbi__context s;
06272    long pos = ftell(f);
06273    stbi__start_file(&s, f);
06274    r = stbi__info_main(&s,x,y,comp);
06275    fseek(f,pos,SEEK_SET);
06276    return r;
06277 }
06278 #endif // !STBI_NO_STDIO
06279 
06280 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
06281 {
06282    stbi__context s;
06283    stbi__start_mem(&s,buffer,len);
06284    return stbi__info_main(&s,x,y,comp);
06285 }
06286 
06287 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
06288 {
06289    stbi__context s;
06290    stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
06291    return stbi__info_main(&s,x,y,comp);
06292 }
06293 
06294 #endif // STB_IMAGE_IMPLEMENTATION
06295 
06296 /*
06297    revision history:
06298       2.06  (2015-04-19) fix bug where PSD returns wrong '*comp' value
06299       2.05  (2015-04-19) fix bug in progressive JPEG handling, fix warning
06300       2.04  (2015-04-15) try to re-enable SIMD on MinGW 64-bit
06301       2.03  (2015-04-12) extra corruption checking (mmozeiko)
06302                          stbi_set_flip_vertically_on_load (nguillemot)
06303                          fix NEON support; fix mingw support
06304       2.02  (2015-01-19) fix incorrect assert, fix warning
06305       2.01  (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
06306       2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
06307       2.00  (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
06308                          progressive JPEG (stb)
06309                          PGM/PPM support (Ken Miller)
06310                          STBI_MALLOC,STBI_REALLOC,STBI_FREE
06311                          GIF bugfix -- seemingly never worked
06312                          STBI_NO_*, STBI_ONLY_*
06313       1.48  (2014-12-14) fix incorrectly-named assert()
06314       1.47  (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
06315                          optimize PNG (ryg)
06316                          fix bug in interlaced PNG with user-specified channel count (stb)
06317       1.46  (2014-08-26)
06318               fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
06319       1.45  (2014-08-16)
06320               fix MSVC-ARM internal compiler error by wrapping malloc
06321       1.44  (2014-08-07)
06322               various warning fixes from Ronny Chevalier
06323       1.43  (2014-07-15)
06324               fix MSVC-only compiler problem in code changed in 1.42
06325       1.42  (2014-07-09)
06326               don't define _CRT_SECURE_NO_WARNINGS (affects user code)
06327               fixes to stbi__cleanup_jpeg path
06328               added STBI_ASSERT to avoid requiring assert.h
06329       1.41  (2014-06-25)
06330               fix search&replace from 1.36 that messed up comments/error messages
06331       1.40  (2014-06-22)
06332               fix gcc struct-initialization warning
06333       1.39  (2014-06-15)
06334               fix to TGA optimization when req_comp != number of components in TGA;
06335               fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
06336               add support for BMP version 5 (more ignored fields)
06337       1.38  (2014-06-06)
06338               suppress MSVC warnings on integer casts truncating values
06339               fix accidental rename of 'skip' field of I/O
06340       1.37  (2014-06-04)
06341               remove duplicate typedef
06342       1.36  (2014-06-03)
06343               convert to header file single-file library
06344               if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
06345       1.35  (2014-05-27)
06346               various warnings
06347               fix broken STBI_SIMD path
06348               fix bug where stbi_load_from_file no longer left file pointer in correct place
06349               fix broken non-easy path for 32-bit BMP (possibly never used)
06350               TGA optimization by Arseny Kapoulkine
06351       1.34  (unknown)
06352               use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
06353       1.33  (2011-07-14)
06354               make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
06355       1.32  (2011-07-13)
06356               support for "info" function for all supported filetypes (SpartanJ)
06357       1.31  (2011-06-20)
06358               a few more leak fixes, bug in PNG handling (SpartanJ)
06359       1.30  (2011-06-11)
06360               added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
06361               removed deprecated format-specific test/load functions
06362               removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
06363               error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
06364               fix inefficiency in decoding 32-bit BMP (David Woo)
06365       1.29  (2010-08-16)
06366               various warning fixes from Aurelien Pocheville
06367       1.28  (2010-08-01)
06368               fix bug in GIF palette transparency (SpartanJ)
06369       1.27  (2010-08-01)
06370               cast-to-stbi_uc to fix warnings
06371       1.26  (2010-07-24)
06372               fix bug in file buffering for PNG reported by SpartanJ
06373       1.25  (2010-07-17)
06374               refix trans_data warning (Won Chun)
06375       1.24  (2010-07-12)
06376               perf improvements reading from files on platforms with lock-heavy fgetc()
06377               minor perf improvements for jpeg
06378               deprecated type-specific functions so we'll get feedback if they're needed
06379               attempt to fix trans_data warning (Won Chun)
06380       1.23    fixed bug in iPhone support
06381       1.22  (2010-07-10)
06382               removed image *writing* support
06383               stbi_info support from Jetro Lauha
06384               GIF support from Jean-Marc Lienher
06385               iPhone PNG-extensions from James Brown
06386               warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
06387       1.21    fix use of 'stbi_uc' in header (reported by jon blow)
06388       1.20    added support for Softimage PIC, by Tom Seddon
06389       1.19    bug in interlaced PNG corruption check (found by ryg)
06390       1.18  (2008-08-02)
06391               fix a threading bug (local mutable static)
06392       1.17    support interlaced PNG
06393       1.16    major bugfix - stbi__convert_format converted one too many pixels
06394       1.15    initialize some fields for thread safety
06395       1.14    fix threadsafe conversion bug
06396               header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
06397       1.13    threadsafe
06398       1.12    const qualifiers in the API
06399       1.11    Support installable IDCT, colorspace conversion routines
06400       1.10    Fixes for 64-bit (don't use "unsigned long")
06401               optimized upsampling by Fabian "ryg" Giesen
06402       1.09    Fix format-conversion for PSD code (bad global variables!)
06403       1.08    Thatcher Ulrich's PSD code integrated by Nicolas Schulz
06404       1.07    attempt to fix C++ warning/errors again
06405       1.06    attempt to fix C++ warning/errors again
06406       1.05    fix TGA loading to return correct *comp and use good luminance calc
06407       1.04    default float alpha is 1, not 255; use 'void *' for stbi_image_free
06408       1.03    bugfixes to STBI_NO_STDIO, STBI_NO_HDR
06409       1.02    support for (subset of) HDR files, float interface for preferred access to them
06410       1.01    fix bug: possible bug in handling right-side up bmps... not sure
06411               fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
06412       1.00    interface to zlib that skips zlib header
06413       0.99    correct handling of alpha in palette
06414       0.98    TGA loader by lonesock; dynamically add loaders (untested)
06415       0.97    jpeg errors on too large a file; also catch another malloc failure
06416       0.96    fix detection of invalid v value - particleman@mollyrocket forum
06417       0.95    during header scan, seek to markers in case of padding
06418       0.94    STBI_NO_STDIO to disable stdio usage; rename all #defines the same
06419       0.93    handle jpegtran output; verbose errors
06420       0.92    read 4,8,16,24,32-bit BMP files of several formats
06421       0.91    output 24-bit Windows 3.0 BMP files
06422       0.90    fix a few more warnings; bump version number to approach 1.0
06423       0.61    bugfixes due to Marc LeBlanc, Christopher Lloyd
06424       0.60    fix compiling as c++
06425       0.59    fix warnings: merge Dave Moore's -Wall fixes
06426       0.58    fix bug: zlib uncompressed mode len/nlen was wrong endian
06427       0.57    fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
06428       0.56    fix bug: zlib uncompressed mode len vs. nlen
06429       0.55    fix bug: restart_interval not initialized to 0
06430       0.54    allow NULL for 'int *comp'
06431       0.53    fix bug in png 3->4; speedup png decoding
06432       0.52    png handles req_comp=3,4 directly; minor cleanup; jpeg comments
06433       0.51    obey req_comp requests, 1-component jpegs return as 1-component,
06434               on 'test' only check type, not whether we support this variant
06435       0.50  (2006-11-19)
06436               first released version
06437 */


rail_object_detector
Author(s):
autogenerated on Sat Jun 8 2019 20:26:30