stb_image.h
Go to the documentation of this file.
1 /* stb_image - v2.06 - public domain image loader - http://nothings.org/stb_image.h
2  no warranty implied; use at your own risk
3 
4  Do this:
5  #define STB_IMAGE_IMPLEMENTATION
6  before you include this file in *one* C or C++ file to create the implementation.
7 
8  // i.e. it should look like this:
9  #include ...
10  #include ...
11  #include ...
12  #define STB_IMAGE_IMPLEMENTATION
13  #include "stb_image.h"
14 
15  You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
16  And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc,realloc,free
17 
18 
19  QUICK NOTES:
20  Primarily of interest to game developers and other people who can
21  avoid problematic images and only need the trivial interface
22 
23  JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
24  PNG 1/2/4/8-bit-per-channel (16 bpc not supported)
25 
26  TGA (not sure what subset, if a subset)
27  BMP non-1bpp, non-RLE
28  PSD (composited view only, no extra channels)
29 
30  GIF (*comp always reports as 4-channel)
31  HDR (radiance rgbE format)
32  PIC (Softimage PIC)
33  PNM (PPM and PGM binary only)
34 
35  - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
36  - decode from arbitrary I/O callbacks
37  - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)
38 
39  Full documentation under "DOCUMENTATION" below.
40 
41 
42  Revision 2.00 release notes:
43 
44  - Progressive JPEG is now supported.
45 
46  - PPM and PGM binary formats are now supported, thanks to Ken Miller.
47 
48  - x86 platforms now make use of SSE2 SIMD instructions for
49  JPEG decoding, and ARM platforms can use NEON SIMD if requested.
50  This work was done by Fabian "ryg" Giesen. SSE2 is used by
51  default, but NEON must be enabled explicitly; see docs.
52 
53  With other JPEG optimizations included in this version, we see
54  2x speedup on a JPEG on an x86 machine, and a 1.5x speedup
55  on a JPEG on an ARM machine, relative to previous versions of this
56  library. The same results will not obtain for all JPGs and for all
57  x86/ARM machines. (Note that progressive JPEGs are significantly
58  slower to decode than regular JPEGs.) This doesn't mean that this
59  is the fastest JPEG decoder in the land; rather, it brings it
60  closer to parity with standard libraries. If you want the fastest
61  decode, look elsewhere. (See "Philosophy" section of docs below.)
62 
63  See final bullet items below for more info on SIMD.
64 
65  - Added STBI_MALLOC, STBI_REALLOC, and STBI_FREE macros for replacing
66  the memory allocator. Unlike other STBI libraries, these macros don't
67  support a context parameter, so if you need to pass a context in to
68  the allocator, you'll have to store it in a global or a thread-local
69  variable.
70 
71  - Split existing STBI_NO_HDR flag into two flags, STBI_NO_HDR and
72  STBI_NO_LINEAR.
73  STBI_NO_HDR: suppress implementation of .hdr reader format
74  STBI_NO_LINEAR: suppress high-dynamic-range light-linear float API
75 
76  - You can suppress implementation of any of the decoders to reduce
77  your code footprint by #defining one or more of the following
78  symbols before creating the implementation.
79 
80  STBI_NO_JPEG
81  STBI_NO_PNG
82  STBI_NO_BMP
83  STBI_NO_PSD
84  STBI_NO_TGA
85  STBI_NO_GIF
86  STBI_NO_HDR
87  STBI_NO_PIC
88  STBI_NO_PNM (.ppm and .pgm)
89 
90  - You can request *only* certain decoders and suppress all other ones
91  (this will be more forward-compatible, as addition of new decoders
92  doesn't require you to disable them explicitly):
93 
94  STBI_ONLY_JPEG
95  STBI_ONLY_PNG
96  STBI_ONLY_BMP
97  STBI_ONLY_PSD
98  STBI_ONLY_TGA
99  STBI_ONLY_GIF
100  STBI_ONLY_HDR
101  STBI_ONLY_PIC
102  STBI_ONLY_PNM (.ppm and .pgm)
103 
104  Note that you can define multiples of these, and you will get all
105  of them ("only x" and "only y" is interpreted to mean "only x&y").
106 
107  - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
108  want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
109 
110  - Compilation of all SIMD code can be suppressed with
111  #define STBI_NO_SIMD
112  It should not be necessary to disable SIMD unless you have issues
113  compiling (e.g. using an x86 compiler which doesn't support SSE
114  intrinsics or that doesn't support the method used to detect
115  SSE2 support at run-time), and even those can be reported as
116  bugs so I can refine the built-in compile-time checking to be
117  smarter.
118 
119  - The old STBI_SIMD system which allowed installing a user-defined
120  IDCT etc. has been removed. If you need this, don't upgrade. My
121  assumption is that almost nobody was doing this, and those who
122  were will find the built-in SIMD more satisfactory anyway.
123 
124  - RGB values computed for JPEG images are slightly different from
125  previous versions of stb_image. (This is due to using less
126  integer precision in SIMD.) The C code has been adjusted so
127  that the same RGB values will be computed regardless of whether
128  SIMD support is available, so your app should always produce
129  consistent results. But these results are slightly different from
130  previous versions. (Specifically, about 3% of available YCbCr values
131  will compute different RGB results from pre-1.49 versions by +-1;
132  most of the deviating values are one smaller in the G channel.)
133 
134  - If you must produce consistent results with previous versions of
135  stb_image, #define STBI_JPEG_OLD and you will get the same results
136  you used to; however, you will not get the SIMD speedups for
137  the YCbCr-to-RGB conversion step (although you should still see
138  significant JPEG speedup from the other changes).
139 
140  Please note that STBI_JPEG_OLD is a temporary feature; it will be
141  removed in future versions of the library. It is only intended for
142  near-term back-compatibility use.
143 
144 
145  Latest revision history:
146  2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
147  2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
148  2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
149  2.03 (2015-04-12) additional corruption checking
150  stbi_set_flip_vertically_on_load
151  fix NEON support; fix mingw support
152  2.02 (2015-01-19) fix incorrect assert, fix warning
153  2.01 (2015-01-17) fix various warnings
154  2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
155  2.00 (2014-12-25) optimize JPEG, including x86 SSE2 & ARM NEON SIMD
156  progressive JPEG
157  PGM/PPM support
158  STBI_MALLOC,STBI_REALLOC,STBI_FREE
159  STBI_NO_*, STBI_ONLY_*
160  GIF bugfix
161  1.48 (2014-12-14) fix incorrectly-named assert()
162  1.47 (2014-12-14) 1/2/4-bit PNG support (both grayscale and paletted)
163  optimize PNG
164  fix bug in interlaced PNG with user-specified channel count
165 
166  See end of file for full revision history.
167 
168 
169  ============================ Contributors =========================
170 
171  Image formats Bug fixes & warning fixes
172  Sean Barrett (jpeg, png, bmp) Marc LeBlanc
173  Nicolas Schulz (hdr, psd) Christpher Lloyd
174  Jonathan Dummer (tga) Dave Moore
175  Jean-Marc Lienher (gif) Won Chun
176  Tom Seddon (pic) the Horde3D community
177  Thatcher Ulrich (psd) Janez Zemva
178  Ken Miller (pgm, ppm) Jonathan Blow
179  Laurent Gomila
180  Aruelien Pocheville
181  Extensions, features Ryamond Barbiero
182  Jetro Lauha (stbi_info) David Woo
183  Martin "SpartanJ" Golini (stbi_info) Martin Golini
184  James "moose2000" Brown (iPhone PNG) Roy Eltham
185  Ben "Disch" Wenger (io callbacks) Luke Graham
186  Omar Cornut (1/2/4-bit PNG) Thomas Ruf
187  Nicolas Guillemot (vertical flip) John Bartholomew
188  Ken Hamada
189  Optimizations & bugfixes Cort Stratton
190  Fabian "ryg" Giesen Blazej Dariusz Roszkowski
191  Arseny Kapoulkine Thibault Reuille
192  Paul Du Bois
193  Guillaume George
194  If your name should be here but Jerry Jansson
195  isn't, let Sean know. Hayaki Saito
196  Johan Duparc
197  Ronny Chevalier
198  Michal Cichon
199  Tero Hanninen
200  Sergio Gonzalez
201  Cass Everitt
202  Engin Manap
203  Martins Mozeiko
204  Joseph Thomson
205  Phil Jordan
206 
207 License:
208  This software is in the public domain. Where that dedication is not
209  recognized, you are granted a perpetual, irrevocable license to copy
210  and modify this file however you want.
211 
212 */
213 
214 #ifndef STBI_INCLUDE_STB_IMAGE_H
215 #define STBI_INCLUDE_STB_IMAGE_H
216 
217 // DOCUMENTATION
218 //
219 // Limitations:
220 // - no 16-bit-per-channel PNG
221 // - no 12-bit-per-channel JPEG
222 // - no JPEGs with arithmetic coding
223 // - no 1-bit BMP
224 // - GIF always returns *comp=4
225 //
226 // Basic usage (see HDR discussion below for HDR usage):
227 // int x,y,n;
228 // unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
229 // // ... process data if not NULL ...
230 // // ... x = width, y = height, n = # 8-bit components per pixel ...
231 // // ... replace '0' with '1'..'4' to force that many components per pixel
232 // // ... but 'n' will always be the number that it would have been if you said 0
233 // stbi_image_free(data)
234 //
235 // Standard parameters:
236 // int *x -- outputs image width in pixels
237 // int *y -- outputs image height in pixels
238 // int *comp -- outputs # of image components in image file
239 // int req_comp -- if non-zero, # of image components requested in result
240 //
241 // The return value from an image loader is an 'unsigned char *' which points
242 // to the pixel data, or NULL on an allocation failure or if the image is
243 // corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
244 // with each pixel consisting of N interleaved 8-bit components; the first
245 // pixel pointed to is top-left-most in the image. There is no padding between
246 // image scanlines or between pixels, regardless of format. The number of
247 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
248 // If req_comp is non-zero, *comp has the number of components that _would_
249 // have been output otherwise. E.g. if you set req_comp to 4, you will always
250 // get RGBA output, but you can check *comp to see if it's trivially opaque
251 // because e.g. there were only 3 channels in the source image.
252 //
253 // An output image with N components has the following components interleaved
254 // in this order in each pixel:
255 //
256 // N=#comp components
257 // 1 grey
258 // 2 grey, alpha
259 // 3 red, green, blue
260 // 4 red, green, blue, alpha
261 //
262 // If image loading fails for any reason, the return value will be NULL,
263 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
264 // can be queried for an extremely brief, end-user unfriendly explanation
265 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
266 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
267 // more user-friendly ones.
268 //
269 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
270 //
271 // ===========================================================================
272 //
273 // Philosophy
274 //
275 // stb libraries are designed with the following priorities:
276 //
277 // 1. easy to use
278 // 2. easy to maintain
279 // 3. good performance
280 //
281 // Sometimes I let "good performance" creep up in priority over "easy to maintain",
282 // and for best performance I may provide less-easy-to-use APIs that give higher
283 // performance, in addition to the easy to use ones. Nevertheless, it's important
284 // to keep in mind that from the standpoint of you, a client of this library,
285 // all you care about is #1 and #3, and stb libraries do not emphasize #3 above all.
286 //
287 // Some secondary priorities arise directly from the first two, some of which
288 // make more explicit reasons why performance can't be emphasized.
289 //
290 // - Portable ("ease of use")
291 // - Small footprint ("easy to maintain")
292 // - No dependencies ("ease of use")
293 //
294 // ===========================================================================
295 //
296 // I/O callbacks
297 //
298 // I/O callbacks allow you to read from arbitrary sources, like packaged
299 // files or some other source. Data read from callbacks are processed
300 // through a small internal buffer (currently 128 bytes) to try to reduce
301 // overhead.
302 //
303 // The three functions you must define are "read" (reads some bytes of data),
304 // "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
305 //
306 // ===========================================================================
307 //
308 // SIMD support
309 //
310 // The JPEG decoder will try to automatically use SIMD kernels on x86 when
311 // supported by the compiler. For ARM Neon support, you must explicitly
312 // request it.
313 //
314 // (The old do-it-yourself SIMD API is no longer supported in the current
315 // code.)
316 //
317 // On x86, SSE2 will automatically be used when available based on a run-time
318 // test; if not, the generic C versions are used as a fall-back. On ARM targets,
319 // the typical path is to have separate builds for NEON and non-NEON devices
320 // (at least this is true for iOS and Android). Therefore, the NEON support is
321 // toggled by a build flag: define STBI_NEON to get NEON loops.
322 //
323 // The output of the JPEG decoder is slightly different from versions where
324 // SIMD support was introduced (that is, for versions before 1.49). The
325 // difference is only +-1 in the 8-bit RGB channels, and only on a small
326 // fraction of pixels. You can force the pre-1.49 behavior by defining
327 // STBI_JPEG_OLD, but this will disable some of the SIMD decoding path
328 // and hence cost some performance.
329 //
330 // If for some reason you do not want to use any of SIMD code, or if
331 // you have issues compiling it, you can disable it entirely by
332 // defining STBI_NO_SIMD.
333 //
334 // ===========================================================================
335 //
336 // HDR image support (disable by defining STBI_NO_HDR)
337 //
338 // stb_image now supports loading HDR images in general, and currently
339 // the Radiance .HDR file format, although the support is provided
340 // generically. You can still load any file through the existing interface;
341 // if you attempt to load an HDR file, it will be automatically remapped to
342 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
343 // both of these constants can be reconfigured through this interface:
344 //
345 // stbi_hdr_to_ldr_gamma(2.2f);
346 // stbi_hdr_to_ldr_scale(1.0f);
347 //
348 // (note, do not use _inverse_ constants; stbi_image will invert them
349 // appropriately).
350 //
351 // Additionally, there is a new, parallel interface for loading files as
352 // (linear) floats to preserve the full dynamic range:
353 //
354 // float *data = stbi_loadf(filename, &x, &y, &n, 0);
355 //
356 // If you load LDR images through this interface, those images will
357 // be promoted to floating point values, run through the inverse of
358 // constants corresponding to the above:
359 //
360 // stbi_ldr_to_hdr_scale(1.0f);
361 // stbi_ldr_to_hdr_gamma(2.2f);
362 //
363 // Finally, given a filename (or an open file or memory block--see header
364 // file for details) containing image data, you can query for the "most
365 // appropriate" interface to use (that is, whether the image is HDR or
366 // not), using:
367 //
368 // stbi_is_hdr(char *filename);
369 //
370 // ===========================================================================
371 //
372 // iPhone PNG support:
373 //
374 // By default we convert iphone-formatted PNGs back to RGB, even though
375 // they are internally encoded differently. You can disable this conversion
376 // by by calling stbi_convert_iphone_png_to_rgb(0), in which case
377 // you will always just get the native iphone "format" through (which
378 // is BGR stored in RGB).
379 //
380 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
381 // pixel to remove any premultiplied alpha *only* if the image file explicitly
382 // says there's premultiplied data (currently only happens in iPhone images,
383 // and only if iPhone convert-to-rgb processing is on).
384 //
385 
386 
387 #ifndef STBI_NO_STDIO
388 #include <stdio.h>
389 #endif // STBI_NO_STDIO
390 
391 #define STBI_VERSION 1
392 
393 enum
394 {
395  STBI_default = 0, // only used for req_comp
396 
399  STBI_rgb = 3,
401 };
402 
403 typedef unsigned char stbi_uc;
404 
405 #ifdef __cplusplus
406 extern "C" {
407 #endif
408 
409 #ifdef STB_IMAGE_STATIC
410 #define STBIDEF static
411 #else
412 #define STBIDEF extern
413 #endif
414 
416 //
417 // PRIMARY API - works on images of any type
418 //
419 
420 //
421 // load image by filename, open file, or memory buffer
422 //
423 
424 typedef struct
425 {
426  int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
427  void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
428  int (*eof) (void *user); // returns nonzero if we are at end of file/data
430 
431 STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *comp, int req_comp);
432 STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *comp, int req_comp);
433 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *comp, int req_comp);
434 
435 #ifndef STBI_NO_STDIO
436 STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
437 // for stbi_load_from_file, file pointer is left pointing immediately after image
438 #endif
439 
440 #ifndef STBI_NO_LINEAR
441  STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *comp, int req_comp);
442  STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
443  STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp);
444 
445  #ifndef STBI_NO_STDIO
446  STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *comp, int req_comp);
447  #endif
448 #endif
449 
450 #ifndef STBI_NO_HDR
451  STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
452  STBIDEF void stbi_hdr_to_ldr_scale(float scale);
453 #endif
454 
455 #ifndef STBI_NO_LINEAR
456  STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
457  STBIDEF void stbi_ldr_to_hdr_scale(float scale);
458 #endif // STBI_NO_HDR
459 
460 // stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
461 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
463 #ifndef STBI_NO_STDIO
464 STBIDEF int stbi_is_hdr (char const *filename);
465 STBIDEF int stbi_is_hdr_from_file(FILE *f);
466 #endif // STBI_NO_STDIO
467 
468 
469 // get a VERY brief reason for failure
470 // NOT THREADSAFE
471 STBIDEF const char *stbi_failure_reason (void);
472 
473 // free the loaded image -- this is just free()
474 STBIDEF void stbi_image_free (void *retval_from_stbi_load);
475 
476 // get image dimensions & components without fully decoding
477 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
478 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
479 
480 #ifndef STBI_NO_STDIO
481 STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
482 STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
483 
484 #endif
485 
486 
487 
488 // for image formats that explicitly notate that they have premultiplied alpha,
489 // we just return the colors as stored in the file. set this flag to force
490 // unpremultiplication. results are undefined if the unpremultiply overflow.
491 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
492 
493 // indicate whether we should process iphone images back to canonical format,
494 // or just pass them through "as-is"
495 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
496 
497 // flip the image vertically, so the first pixel in the output array is the bottom left
498 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);
499 
500 // ZLIB client - used by PNG, available for other purposes
501 
502 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
503 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
504 STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
505 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
506 
507 STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
508 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
509 
510 
511 #ifdef __cplusplus
512 }
513 #endif
514 
515 //
516 //
518 #endif // STBI_INCLUDE_STB_IMAGE_H
519 
520 #ifdef STB_IMAGE_IMPLEMENTATION
521 
522 #if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
523  || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
524  || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
525  || defined(STBI_ONLY_ZLIB)
526  #ifndef STBI_ONLY_JPEG
527  #define STBI_NO_JPEG
528  #endif
529  #ifndef STBI_ONLY_PNG
530  #define STBI_NO_PNG
531  #endif
532  #ifndef STBI_ONLY_BMP
533  #define STBI_NO_BMP
534  #endif
535  #ifndef STBI_ONLY_PSD
536  #define STBI_NO_PSD
537  #endif
538  #ifndef STBI_ONLY_TGA
539  #define STBI_NO_TGA
540  #endif
541  #ifndef STBI_ONLY_GIF
542  #define STBI_NO_GIF
543  #endif
544  #ifndef STBI_ONLY_HDR
545  #define STBI_NO_HDR
546  #endif
547  #ifndef STBI_ONLY_PIC
548  #define STBI_NO_PIC
549  #endif
550  #ifndef STBI_ONLY_PNM
551  #define STBI_NO_PNM
552  #endif
553 #endif
554 
555 #if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)
556 #define STBI_NO_ZLIB
557 #endif
558 
559 
560 #include <stdarg.h>
561 #include <stddef.h> // ptrdiff_t on osx
562 #include <stdlib.h>
563 #include <string.h>
564 
565 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
566 #include <math.h> // ldexp
567 #endif
568 
569 #ifndef STBI_NO_STDIO
570 #include <stdio.h>
571 #endif
572 
573 #ifndef STBI_ASSERT
574 #include <assert.h>
575 #define STBI_ASSERT(x) assert(x)
576 #endif
577 
578 
579 #ifndef _MSC_VER
580  #ifdef __cplusplus
581  #define stbi_inline inline
582  #else
583  #define stbi_inline
584  #endif
585 #else
586  #define stbi_inline __forceinline
587 #endif
588 
589 
590 #ifdef _MSC_VER
591 typedef unsigned short stbi__uint16;
592 typedef signed short stbi__int16;
593 typedef unsigned int stbi__uint32;
594 typedef signed int stbi__int32;
595 #else
596 #include <stdint.h>
597 typedef uint16_t stbi__uint16;
598 typedef int16_t stbi__int16;
599 typedef uint32_t stbi__uint32;
600 typedef int32_t stbi__int32;
601 #endif
602 
603 // should produce compiler error if size is wrong
604 typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];
605 
606 #ifdef _MSC_VER
607 #define STBI_NOTUSED(v) (void)(v)
608 #else
609 #define STBI_NOTUSED(v) (void)sizeof(v)
610 #endif
611 
612 #ifdef _MSC_VER
613 #define STBI_HAS_LROTL
614 #endif
615 
616 #ifdef STBI_HAS_LROTL
617  #define stbi_lrot(x,y) _lrotl(x,y)
618 #else
619  #define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (32 - (y))))
620 #endif
621 
622 #if defined(STBI_MALLOC) && defined(STBI_FREE) && defined(STBI_REALLOC)
623 // ok
624 #elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC)
625 // ok
626 #else
627 #error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC."
628 #endif
629 
630 #ifndef STBI_MALLOC
631 #define STBI_MALLOC(sz) malloc(sz)
632 #define STBI_REALLOC(p,sz) realloc(p,sz)
633 #define STBI_FREE(p) free(p)
634 #endif
635 
636 // x86/x64 detection
637 #if defined(__x86_64__) || defined(_M_X64)
638 #define STBI__X64_TARGET
639 #elif defined(__i386) || defined(_M_IX86)
640 #define STBI__X86_TARGET
641 #endif
642 
643 #if defined(__GNUC__) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET)) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
644 // NOTE: not clear do we actually need this for the 64-bit path?
645 // gcc doesn't support sse2 intrinsics unless you compile with -msse2,
646 // (but compiling with -msse2 allows the compiler to use SSE2 everywhere;
647 // this is just broken and gcc are jerks for not fixing it properly
648 // http://www.virtualdub.org/blog/pivot/entry.php?id=363 )
649 #define STBI_NO_SIMD
650 #endif
651 
652 #if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
653 // Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
654 //
655 // 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
656 // Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
657 // As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
658 // simultaneously enabling "-mstackrealign".
659 //
660 // See https://github.com/nothings/stb/issues/81 for more information.
661 //
662 // So default to no SSE2 on 32-bit MinGW. If you've read this far and added
663 // -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.
664 #define STBI_NO_SIMD
665 #endif
666 
667 #if !defined(STBI_NO_SIMD) && defined(STBI__X86_TARGET)
668 #define STBI_SSE2
669 #include <emmintrin.h>
670 
671 #ifdef _MSC_VER
672 
673 #if _MSC_VER >= 1400 // not VC6
674 #include <intrin.h> // __cpuid
675 static int stbi__cpuid3(void)
676 {
677  int info[4];
678  __cpuid(info,1);
679  return info[3];
680 }
681 #else
682 static int stbi__cpuid3(void)
683 {
684  int res;
685  __asm {
686  mov eax,1
687  cpuid
688  mov res,edx
689  }
690  return res;
691 }
692 #endif
693 
694 #define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name
695 
696 static int stbi__sse2_available()
697 {
698  int info3 = stbi__cpuid3();
699  return ((info3 >> 26) & 1) != 0;
700 }
701 #else // assume GCC-style if not VC++
702 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
703 
704 static int stbi__sse2_available()
705 {
706 #if defined(__GNUC__) && (__GNUC__ * 100 + __GNUC_MINOR__) >= 408 // GCC 4.8 or later
707  // GCC 4.8+ has a nice way to do this
708  return __builtin_cpu_supports("sse2");
709 #else
710  // portable way to do this, preferably without using GCC inline ASM?
711  // just bail for now.
712  return 0;
713 #endif
714 }
715 #endif
716 #endif
717 
718 // ARM NEON
719 #if defined(STBI_NO_SIMD) && defined(STBI_NEON)
720 #undef STBI_NEON
721 #endif
722 
723 #ifdef STBI_NEON
724 #include <arm_neon.h>
725 // assume GCC or Clang on ARM targets
726 #define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))
727 #endif
728 
729 #ifndef STBI_SIMD_ALIGN
730 #define STBI_SIMD_ALIGN(type, name) type name
731 #endif
732 
734 //
735 // stbi__context struct and start_xxx functions
736 
737 // stbi__context structure is our basic context used by all images, so it
738 // contains all the IO context, plus some basic image information
739 typedef struct
740 {
741  stbi__uint32 img_x, img_y;
742  int img_n, img_out_n;
743 
745  void *io_user_data;
746 
747  int read_from_callbacks;
748  int buflen;
749  stbi_uc buffer_start[128];
750 
751  stbi_uc *img_buffer, *img_buffer_end;
752  stbi_uc *img_buffer_original;
753 } stbi__context;
754 
755 
756 static void stbi__refill_buffer(stbi__context *s);
757 
758 // initialize a memory-decode context
759 static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
760 {
761  s->io.read = NULL;
762  s->read_from_callbacks = 0;
763  s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
764  s->img_buffer_end = (stbi_uc *) buffer+len;
765 }
766 
767 // initialize a callback-based context
768 static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
769 {
770  s->io = *c;
771  s->io_user_data = user;
772  s->buflen = sizeof(s->buffer_start);
773  s->read_from_callbacks = 1;
774  s->img_buffer_original = s->buffer_start;
775  stbi__refill_buffer(s);
776 }
777 
778 #ifndef STBI_NO_STDIO
779 
780 static int stbi__stdio_read(void *user, char *data, int size)
781 {
782  return (int) fread(data,1,size,(FILE*) user);
783 }
784 
785 static void stbi__stdio_skip(void *user, int n)
786 {
787  fseek((FILE*) user, n, SEEK_CUR);
788 }
789 
790 static int stbi__stdio_eof(void *user)
791 {
792  return feof((FILE*) user);
793 }
794 
795 static stbi_io_callbacks stbi__stdio_callbacks =
796 {
797  stbi__stdio_read,
798  stbi__stdio_skip,
799  stbi__stdio_eof,
800 };
801 
802 static void stbi__start_file(stbi__context *s, FILE *f)
803 {
804  stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);
805 }
806 
807 //static void stop_file(stbi__context *s) { }
808 
809 #endif // !STBI_NO_STDIO
810 
811 static void stbi__rewind(stbi__context *s)
812 {
813  // conceptually rewind SHOULD rewind to the beginning of the stream,
814  // but we just rewind to the beginning of the initial buffer, because
815  // we only use it after doing 'test', which only ever looks at at most 92 bytes
816  s->img_buffer = s->img_buffer_original;
817 }
818 
819 #ifndef STBI_NO_JPEG
820 static int stbi__jpeg_test(stbi__context *s);
821 static stbi_uc *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
822 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);
823 #endif
824 
825 #ifndef STBI_NO_PNG
826 static int stbi__png_test(stbi__context *s);
827 static stbi_uc *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
828 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
829 #endif
830 
831 #ifndef STBI_NO_BMP
832 static int stbi__bmp_test(stbi__context *s);
833 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
834 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);
835 #endif
836 
837 #ifndef STBI_NO_TGA
838 static int stbi__tga_test(stbi__context *s);
839 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
840 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);
841 #endif
842 
843 #ifndef STBI_NO_PSD
844 static int stbi__psd_test(stbi__context *s);
845 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
846 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
847 #endif
848 
849 #ifndef STBI_NO_HDR
850 static int stbi__hdr_test(stbi__context *s);
851 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
852 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);
853 #endif
854 
855 #ifndef STBI_NO_PIC
856 static int stbi__pic_test(stbi__context *s);
857 static stbi_uc *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
858 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);
859 #endif
860 
861 #ifndef STBI_NO_GIF
862 static int stbi__gif_test(stbi__context *s);
863 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
864 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);
865 #endif
866 
867 #ifndef STBI_NO_PNM
868 static int stbi__pnm_test(stbi__context *s);
869 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp);
870 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
871 #endif
872 
873 // this is not threadsafe
874 static const char *stbi__g_failure_reason;
875 
876 STBIDEF const char *stbi_failure_reason(void)
877 {
878  return stbi__g_failure_reason;
879 }
880 
881 static int stbi__err(const char *str)
882 {
883  stbi__g_failure_reason = str;
884  return 0;
885 }
886 
887 static void *stbi__malloc(size_t size)
888 {
889  return STBI_MALLOC(size);
890 }
891 
892 // stbi__err - error
893 // stbi__errpf - error returning pointer to float
894 // stbi__errpuc - error returning pointer to unsigned char
895 
896 #ifdef STBI_NO_FAILURE_STRINGS
897  #define stbi__err(x,y) 0
898 #elif defined(STBI_FAILURE_USERMSG)
899  #define stbi__err(x,y) stbi__err(y)
900 #else
901  #define stbi__err(x,y) stbi__err(x)
902 #endif
903 
904 #define stbi__errpf(x,y) ((float *) (stbi__err(x,y)?nullptr:nullptr))
905 #define stbi__errpuc(x,y) ((unsigned char *) (stbi__err(x,y)?nullptr:nullptr))
906 
907 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
908 {
909  STBI_FREE(retval_from_stbi_load);
910 }
911 
912 #ifndef STBI_NO_LINEAR
913 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
914 #endif
915 
916 #ifndef STBI_NO_HDR
917 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
918 #endif
919 
920 static int stbi__vertically_flip_on_load = 0;
921 
922 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
923 {
924  stbi__vertically_flip_on_load = flag_true_if_should_flip;
925 }
926 
927 static unsigned char *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
928 {
929  #ifndef STBI_NO_JPEG
930  if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp);
931  #endif
932  #ifndef STBI_NO_PNG
933  if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp);
934  #endif
935  #ifndef STBI_NO_BMP
936  if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp);
937  #endif
938  #ifndef STBI_NO_GIF
939  if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp);
940  #endif
941  #ifndef STBI_NO_PSD
942  if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp);
943  #endif
944  #ifndef STBI_NO_PIC
945  if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp);
946  #endif
947  #ifndef STBI_NO_PNM
948  if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp);
949  #endif
950 
951  #ifndef STBI_NO_HDR
952  if (stbi__hdr_test(s)) {
953  float *hdr = stbi__hdr_load(s, x,y,comp,req_comp);
954  return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
955  }
956  #endif
957 
958  #ifndef STBI_NO_TGA
959  // test tga last because it's a crappy test!
960  if (stbi__tga_test(s))
961  return stbi__tga_load(s,x,y,comp,req_comp);
962  #endif
963 
964  return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
965 }
966 
967 static unsigned char *stbi__load_flip(stbi__context *s, int *x, int *y, int *comp, int req_comp)
968 {
969  unsigned char *result = stbi__load_main(s, x, y, comp, req_comp);
970 
971  if (stbi__vertically_flip_on_load && result != NULL) {
972  int w = *x, h = *y;
973  int depth = req_comp ? req_comp : *comp;
974  int row,col,z;
975  stbi_uc temp;
976 
977  // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
978  for (row = 0; row < (h>>1); row++) {
979  for (col = 0; col < w; col++) {
980  for (z = 0; z < depth; z++) {
981  temp = result[(row * w + col) * depth + z];
982  result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
983  result[((h - row - 1) * w + col) * depth + z] = temp;
984  }
985  }
986  }
987  }
988 
989  return result;
990 }
991 
992 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
993 {
994  if (stbi__vertically_flip_on_load && result != NULL) {
995  int w = *x, h = *y;
996  int depth = req_comp ? req_comp : *comp;
997  int row,col,z;
998  float temp;
999 
1000  // @OPTIMIZE: use a bigger temp buffer and memcpy multiple pixels at once
1001  for (row = 0; row < (h>>1); row++) {
1002  for (col = 0; col < w; col++) {
1003  for (z = 0; z < depth; z++) {
1004  temp = result[(row * w + col) * depth + z];
1005  result[(row * w + col) * depth + z] = result[((h - row - 1) * w + col) * depth + z];
1006  result[((h - row - 1) * w + col) * depth + z] = temp;
1007  }
1008  }
1009  }
1010  }
1011 }
1012 
1013 
1014 #ifndef STBI_NO_STDIO
1015 
1016 static FILE *stbi__fopen(char const *filename, char const *mode)
1017 {
1018  FILE *f;
1019 #if defined(_MSC_VER) && _MSC_VER >= 1400
1020  if (0 != fopen_s(&f, filename, mode))
1021  f=0;
1022 #else
1023  f = fopen(filename, mode);
1024 #endif
1025  return f;
1026 }
1027 
1028 
1029 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1030 {
1031  FILE *f = stbi__fopen(filename, "rb");
1032  unsigned char *result;
1033  if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1034  result = stbi_load_from_file(f,x,y,comp,req_comp);
1035  fclose(f);
1036  return result;
1037 }
1038 
1039 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1040 {
1041  unsigned char *result;
1042  stbi__context s;
1043  stbi__start_file(&s,f);
1044  result = stbi__load_flip(&s,x,y,comp,req_comp);
1045  if (result) {
1046  // need to 'unget' all the characters in the IO buffer
1047  fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1048  }
1049  return result;
1050 }
1051 #endif
1052 
1053 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1054 {
1055  stbi__context s;
1056  stbi__start_mem(&s,buffer,len);
1057  return stbi__load_flip(&s,x,y,comp,req_comp);
1058 }
1059 
1060 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1061 {
1062  stbi__context s;
1063  stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1064  return stbi__load_flip(&s,x,y,comp,req_comp);
1065 }
1066 
1067 #ifndef STBI_NO_LINEAR
1068 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1069 {
1070  unsigned char *data;
1071  #ifndef STBI_NO_HDR
1072  if (stbi__hdr_test(s)) {
1073  float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp);
1074  if (hdr_data)
1075  stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1076  return hdr_data;
1077  }
1078  #endif
1079  data = stbi__load_flip(s, x, y, comp, req_comp);
1080  if (data)
1081  return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1082  return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1083 }
1084 
1085 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1086 {
1087  stbi__context s;
1088  stbi__start_mem(&s,buffer,len);
1089  return stbi__loadf_main(&s,x,y,comp,req_comp);
1090 }
1091 
1092 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1093 {
1094  stbi__context s;
1095  stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1096  return stbi__loadf_main(&s,x,y,comp,req_comp);
1097 }
1098 
1099 #ifndef STBI_NO_STDIO
1100 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1101 {
1102  float *result;
1103  FILE *f = stbi__fopen(filename, "rb");
1104  if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1105  result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1106  fclose(f);
1107  return result;
1108 }
1109 
1110 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1111 {
1112  stbi__context s;
1113  stbi__start_file(&s,f);
1114  return stbi__loadf_main(&s,x,y,comp,req_comp);
1115 }
1116 #endif // !STBI_NO_STDIO
1117 
1118 #endif // !STBI_NO_LINEAR
1119 
1120 // these is-hdr-or-not is defined independent of whether STBI_NO_LINEAR is
1121 // defined, for API simplicity; if STBI_NO_LINEAR is defined, it always
1122 // reports false!
1123 
1125 {
1126  #ifndef STBI_NO_HDR
1127  stbi__context s;
1128  stbi__start_mem(&s,buffer,len);
1129  return stbi__hdr_test(&s);
1130  #else
1131  STBI_NOTUSED(buffer);
1132  STBI_NOTUSED(len);
1133  return 0;
1134  #endif
1135 }
1136 
1137 #ifndef STBI_NO_STDIO
1138 STBIDEF int stbi_is_hdr (char const *filename)
1139 {
1140  FILE *f = stbi__fopen(filename, "rb");
1141  int result=0;
1142  if (f) {
1143  result = stbi_is_hdr_from_file(f);
1144  fclose(f);
1145  }
1146  return result;
1147 }
1148 
1149 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1150 {
1151  #ifndef STBI_NO_HDR
1152  stbi__context s;
1153  stbi__start_file(&s,f);
1154  return stbi__hdr_test(&s);
1155  #else
1156  return 0;
1157  #endif
1158 }
1159 #endif // !STBI_NO_STDIO
1160 
1161 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1162 {
1163  #ifndef STBI_NO_HDR
1164  stbi__context s;
1165  stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1166  return stbi__hdr_test(&s);
1167  #else
1168  return 0;
1169  #endif
1170 }
1171 
1172 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1173 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1174 
1175 #ifndef STBI_NO_LINEAR
1176 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
1177 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1178 #endif
1179 
1180 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
1181 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1182 
1183 
1185 //
1186 // Common code used by all image loaders
1187 //
1188 
1189 enum
1190 {
1191  STBI__SCAN_load=0,
1192  STBI__SCAN_type,
1193  STBI__SCAN_header
1194 };
1195 
1196 static void stbi__refill_buffer(stbi__context *s)
1197 {
1198  int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1199  if (n == 0) {
1200  // at end of file, treat same as if from memory, but need to handle case
1201  // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1202  s->read_from_callbacks = 0;
1203  s->img_buffer = s->buffer_start;
1204  s->img_buffer_end = s->buffer_start+1;
1205  *s->img_buffer = 0;
1206  } else {
1207  s->img_buffer = s->buffer_start;
1208  s->img_buffer_end = s->buffer_start + n;
1209  }
1210 }
1211 
1212 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1213 {
1214  if (s->img_buffer < s->img_buffer_end)
1215  return *s->img_buffer++;
1216  if (s->read_from_callbacks) {
1217  stbi__refill_buffer(s);
1218  return *s->img_buffer++;
1219  }
1220  return 0;
1221 }
1222 
1223 stbi_inline static int stbi__at_eof(stbi__context *s)
1224 {
1225  if (s->io.read) {
1226  if (!(s->io.eof)(s->io_user_data)) return 0;
1227  // if feof() is true, check if buffer = end
1228  // special case: we've only got the special 0 character at the end
1229  if (s->read_from_callbacks == 0) return 1;
1230  }
1231 
1232  return s->img_buffer >= s->img_buffer_end;
1233 }
1234 
1235 static void stbi__skip(stbi__context *s, int n)
1236 {
1237  if (n < 0) {
1238  s->img_buffer = s->img_buffer_end;
1239  return;
1240  }
1241  if (s->io.read) {
1242  int blen = (int) (s->img_buffer_end - s->img_buffer);
1243  if (blen < n) {
1244  s->img_buffer = s->img_buffer_end;
1245  (s->io.skip)(s->io_user_data, n - blen);
1246  return;
1247  }
1248  }
1249  s->img_buffer += n;
1250 }
1251 
1252 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1253 {
1254  if (s->io.read) {
1255  int blen = (int) (s->img_buffer_end - s->img_buffer);
1256  if (blen < n) {
1257  int res, count;
1258 
1259  memcpy(buffer, s->img_buffer, blen);
1260 
1261  count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1262  res = (count == (n-blen));
1263  s->img_buffer = s->img_buffer_end;
1264  return res;
1265  }
1266  }
1267 
1268  if (s->img_buffer+n <= s->img_buffer_end) {
1269  memcpy(buffer, s->img_buffer, n);
1270  s->img_buffer += n;
1271  return 1;
1272  } else
1273  return 0;
1274 }
1275 
1276 static int stbi__get16be(stbi__context *s)
1277 {
1278  int z = stbi__get8(s);
1279  return (z << 8) + stbi__get8(s);
1280 }
1281 
1282 static stbi__uint32 stbi__get32be(stbi__context *s)
1283 {
1284  stbi__uint32 z = stbi__get16be(s);
1285  return (z << 16) + stbi__get16be(s);
1286 }
1287 
1288 static int stbi__get16le(stbi__context *s)
1289 {
1290  int z = stbi__get8(s);
1291  return z + (stbi__get8(s) << 8);
1292 }
1293 
1294 static stbi__uint32 stbi__get32le(stbi__context *s)
1295 {
1296  stbi__uint32 z = stbi__get16le(s);
1297  return z + (stbi__get16le(s) << 16);
1298 }
1299 
1300 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1301 
1302 
1304 //
1305 // generic converter from built-in img_n to req_comp
1306 // individual types do this automatically as much as possible (e.g. jpeg
1307 // does all cases internally since it needs to colorspace convert anyway,
1308 // and it never has alpha, so very few cases ). png can automatically
1309 // interleave an alpha=255 channel, but falls back to this for other cases
1310 //
1311 // assume data buffer is malloced, so malloc a new one and free that one
1312 // only failure mode is malloc failing
1313 
1314 static stbi_uc stbi__compute_y(int r, int g, int b)
1315 {
1316  return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
1317 }
1318 
1319 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1320 {
1321  int i,j;
1322  unsigned char *good;
1323 
1324  if (req_comp == img_n) return data;
1325  STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1326 
1327  good = (unsigned char *) stbi__malloc(req_comp * x * y);
1328  if (good == NULL) {
1329  STBI_FREE(data);
1330  return stbi__errpuc("outofmem", "Out of memory");
1331  }
1332 
1333  for (j=0; j < (int) y; ++j) {
1334  unsigned char *src = data + j * x * img_n ;
1335  unsigned char *dest = good + j * x * req_comp;
1336 
1337  #define COMBO(a,b) ((a)*8+(b))
1338  #define CASE(a,b) case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1339  // convert source image with img_n components to one with req_comp components;
1340  // avoid switch per pixel, so use switch per scanline and massive macros
1341  switch (COMBO(img_n, req_comp)) {
1342  CASE(1,2) dest[0]=src[0], dest[1]=255; break;
1343  CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1344  CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
1345  CASE(2,1) dest[0]=src[0]; break;
1346  CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
1347  CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
1348  CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
1349  CASE(3,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1350  CASE(3,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
1351  CASE(4,1) dest[0]=stbi__compute_y(src[0],src[1],src[2]); break;
1352  CASE(4,2) dest[0]=stbi__compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
1353  CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
1354  default: STBI_ASSERT(0);
1355  }
1356  #undef CASE
1357  }
1358 
1359  STBI_FREE(data);
1360  return good;
1361 }
1362 
1363 #ifndef STBI_NO_LINEAR
1364 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1365 {
1366  int i,k,n;
1367  float *output = (float *) stbi__malloc(x * y * comp * sizeof(float));
1368  if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1369  // compute number of non-alpha components
1370  if (comp & 1) n = comp; else n = comp-1;
1371  for (i=0; i < x*y; ++i) {
1372  for (k=0; k < n; ++k) {
1373  output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1374  }
1375  if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
1376  }
1377  STBI_FREE(data);
1378  return output;
1379 }
1380 #endif
1381 
1382 #ifndef STBI_NO_HDR
1383 #define stbi__float2int(x) ((int) (x))
1384 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1385 {
1386  int i,k,n;
1387  stbi_uc *output = (stbi_uc *) stbi__malloc(x * y * comp);
1388  if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1389  // compute number of non-alpha components
1390  if (comp & 1) n = comp; else n = comp-1;
1391  for (i=0; i < x*y; ++i) {
1392  for (k=0; k < n; ++k) {
1393  float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1394  if (z < 0) z = 0;
1395  if (z > 255) z = 255;
1396  output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1397  }
1398  if (k < comp) {
1399  float z = data[i*comp+k] * 255 + 0.5f;
1400  if (z < 0) z = 0;
1401  if (z > 255) z = 255;
1402  output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1403  }
1404  }
1405  STBI_FREE(data);
1406  return output;
1407 }
1408 #endif
1409 
1411 //
1412 // "baseline" JPEG/JFIF decoder
1413 //
1414 // simple implementation
1415 // - doesn't support delayed output of y-dimension
1416 // - simple interface (only one output format: 8-bit interleaved RGB)
1417 // - doesn't try to recover corrupt jpegs
1418 // - doesn't allow partial loading, loading multiple at once
1419 // - still fast on x86 (copying globals into locals doesn't help x86)
1420 // - allocates lots of intermediate memory (full size of all components)
1421 // - non-interleaved case requires this anyway
1422 // - allows good upsampling (see next)
1423 // high-quality
1424 // - upsampled channels are bilinearly interpolated, even across blocks
1425 // - quality integer IDCT derived from IJG's 'slow'
1426 // performance
1427 // - fast huffman; reasonable integer IDCT
1428 // - some SIMD kernels for common paths on targets with SSE2/NEON
1429 // - uses a lot of intermediate memory, could cache poorly
1430 
1431 #ifndef STBI_NO_JPEG
1432 
1433 // huffman decoding acceleration
1434 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1435 
1436 typedef struct
1437 {
1438  stbi_uc fast[1 << FAST_BITS];
1439  // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1440  stbi__uint16 code[256];
1441  stbi_uc values[256];
1442  stbi_uc size[257];
1443  unsigned int maxcode[18];
1444  int delta[17]; // old 'firstsymbol' - old 'firstcode'
1445 } stbi__huffman;
1446 
1447 typedef struct
1448 {
1449  stbi__context *s;
1450  stbi__huffman huff_dc[4];
1451  stbi__huffman huff_ac[4];
1452  stbi_uc dequant[4][64];
1453  stbi__int16 fast_ac[4][1 << FAST_BITS];
1454 
1455 // sizes for components, interleaved MCUs
1456  int img_h_max, img_v_max;
1457  int img_mcu_x, img_mcu_y;
1458  int img_mcu_w, img_mcu_h;
1459 
1460 // definition of jpeg image component
1461  struct
1462  {
1463  int id;
1464  int h,v;
1465  int tq;
1466  int hd,ha;
1467  int dc_pred;
1468 
1469  int x,y,w2,h2;
1470  stbi_uc *data;
1471  void *raw_data, *raw_coeff;
1472  stbi_uc *linebuf;
1473  short *coeff; // progressive only
1474  int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1475  } img_comp[4];
1476 
1477  stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1478  int code_bits; // number of valid bits
1479  unsigned char marker; // marker seen while filling entropy buffer
1480  int nomore; // flag if we saw a marker so must stop
1481 
1482  int progressive;
1483  int spec_start;
1484  int spec_end;
1485  int succ_high;
1486  int succ_low;
1487  int eob_run;
1488 
1489  int scan_n, order[4];
1490  int restart_interval, todo;
1491 
1492 // kernels
1493  void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1494  void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1495  stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
1496 } stbi__jpeg;
1497 
1498 static int stbi__build_huffman(stbi__huffman *h, int *count)
1499 {
1500  int i,j,k=0,code;
1501  // build size list for each symbol (from JPEG spec)
1502  for (i=0; i < 16; ++i)
1503  for (j=0; j < count[i]; ++j)
1504  h->size[k++] = (stbi_uc) (i+1);
1505  h->size[k] = 0;
1506 
1507  // compute actual symbols (from jpeg spec)
1508  code = 0;
1509  k = 0;
1510  for(j=1; j <= 16; ++j) {
1511  // compute delta to add to code to compute symbol id
1512  h->delta[j] = k - code;
1513  if (h->size[k] == j) {
1514  while (h->size[k] == j)
1515  h->code[k++] = (stbi__uint16) (code++);
1516  if (code-1 >= (1 << j)) return stbi__err("bad code lengths","Corrupt JPEG");
1517  }
1518  // compute largest code + 1 for this size, preshifted as needed later
1519  h->maxcode[j] = code << (16-j);
1520  code <<= 1;
1521  }
1522  h->maxcode[j] = 0xffffffff;
1523 
1524  // build non-spec acceleration table; 255 is flag for not-accelerated
1525  memset(h->fast, 255, 1 << FAST_BITS);
1526  for (i=0; i < k; ++i) {
1527  int s = h->size[i];
1528  if (s <= FAST_BITS) {
1529  int c = h->code[i] << (FAST_BITS-s);
1530  int m = 1 << (FAST_BITS-s);
1531  for (j=0; j < m; ++j) {
1532  h->fast[c+j] = (stbi_uc) i;
1533  }
1534  }
1535  }
1536  return 1;
1537 }
1538 
1539 // build a table that decodes both magnitude and value of small ACs in
1540 // one go.
1541 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
1542 {
1543  int i;
1544  for (i=0; i < (1 << FAST_BITS); ++i) {
1545  stbi_uc fast = h->fast[i];
1546  fast_ac[i] = 0;
1547  if (fast < 255) {
1548  int rs = h->values[fast];
1549  int run = (rs >> 4) & 15;
1550  int magbits = rs & 15;
1551  int len = h->size[fast];
1552 
1553  if (magbits && len + magbits <= FAST_BITS) {
1554  // magnitude code followed by receive_extend code
1555  int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
1556  int m = 1 << (magbits - 1);
1557  if (k < m) k += (-1 << magbits) + 1;
1558  // if the result is small enough, we can fit it in fast_ac table
1559  if (k >= -128 && k <= 127)
1560  fast_ac[i] = (stbi__int16) ((k << 8) + (run << 4) + (len + magbits));
1561  }
1562  }
1563  }
1564 }
1565 
1566 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
1567 {
1568  do {
1569  int b = j->nomore ? 0 : stbi__get8(j->s);
1570  if (b == 0xff) {
1571  int c = stbi__get8(j->s);
1572  if (c != 0) {
1573  j->marker = (unsigned char) c;
1574  j->nomore = 1;
1575  return;
1576  }
1577  }
1578  j->code_buffer |= b << (24 - j->code_bits);
1579  j->code_bits += 8;
1580  } while (j->code_bits <= 24);
1581 }
1582 
1583 // (1 << n) - 1
1584 static stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
1585 
1586 // decode a jpeg huffman value from the bitstream
1587 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
1588 {
1589  unsigned int temp;
1590  int c,k;
1591 
1592  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1593 
1594  // look at the top FAST_BITS and determine what symbol ID it is,
1595  // if the code is <= FAST_BITS
1596  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1597  k = h->fast[c];
1598  if (k < 255) {
1599  int s = h->size[k];
1600  if (s > j->code_bits)
1601  return -1;
1602  j->code_buffer <<= s;
1603  j->code_bits -= s;
1604  return h->values[k];
1605  }
1606 
1607  // naive test is to shift the code_buffer down so k bits are
1608  // valid, then test against maxcode. To speed this up, we've
1609  // preshifted maxcode left so that it has (16-k) 0s at the
1610  // end; in other words, regardless of the number of bits, it
1611  // wants to be compared against something shifted to have 16;
1612  // that way we don't need to shift inside the loop.
1613  temp = j->code_buffer >> 16;
1614  for (k=FAST_BITS+1 ; ; ++k)
1615  if (temp < h->maxcode[k])
1616  break;
1617  if (k == 17) {
1618  // error! code not found
1619  j->code_bits -= 16;
1620  return -1;
1621  }
1622 
1623  if (k > j->code_bits)
1624  return -1;
1625 
1626  // convert the huffman code to the symbol id
1627  c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
1628  STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
1629 
1630  // convert the id to a symbol
1631  j->code_bits -= k;
1632  j->code_buffer <<= k;
1633  return h->values[c];
1634 }
1635 
1636 // bias[n] = (-1<<n) + 1
1637 static int const stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
1638 
1639 // combined JPEG 'receive' and JPEG 'extend', since baseline
1640 // always extends everything it receives.
1641 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
1642 {
1643  unsigned int k;
1644  int sgn;
1645  if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1646 
1647  sgn = (stbi__int32)j->code_buffer >> 31; // sign bit is always in MSB
1648  k = stbi_lrot(j->code_buffer, n);
1649  STBI_ASSERT(n >= 0 && n < (int) (sizeof(stbi__bmask)/sizeof(*stbi__bmask)));
1650  j->code_buffer = k & ~stbi__bmask[n];
1651  k &= stbi__bmask[n];
1652  j->code_bits -= n;
1653  return k + (stbi__jbias[n] & ~sgn);
1654 }
1655 
1656 // get some unsigned bits
1657 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
1658 {
1659  unsigned int k;
1660  if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
1661  k = stbi_lrot(j->code_buffer, n);
1662  j->code_buffer = k & ~stbi__bmask[n];
1663  k &= stbi__bmask[n];
1664  j->code_bits -= n;
1665  return k;
1666 }
1667 
1668 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
1669 {
1670  unsigned int k;
1671  if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
1672  k = j->code_buffer;
1673  j->code_buffer <<= 1;
1674  --j->code_bits;
1675  return k & 0x80000000;
1676 }
1677 
1678 // given a value that's at position X in the zigzag stream,
1679 // where does it appear in the 8x8 matrix coded as row-major?
1680 static stbi_uc stbi__jpeg_dezigzag[64+15] =
1681 {
1682  0, 1, 8, 16, 9, 2, 3, 10,
1683  17, 24, 32, 25, 18, 11, 4, 5,
1684  12, 19, 26, 33, 40, 48, 41, 34,
1685  27, 20, 13, 6, 7, 14, 21, 28,
1686  35, 42, 49, 56, 57, 50, 43, 36,
1687  29, 22, 15, 23, 30, 37, 44, 51,
1688  58, 59, 52, 45, 38, 31, 39, 46,
1689  53, 60, 61, 54, 47, 55, 62, 63,
1690  // let corrupt input sample past end
1691  63, 63, 63, 63, 63, 63, 63, 63,
1692  63, 63, 63, 63, 63, 63, 63
1693 };
1694 
1695 // decode one 64-entry block--
1696 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi_uc *dequant)
1697 {
1698  int diff,dc,k;
1699  int t;
1700 
1701  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1702  t = stbi__jpeg_huff_decode(j, hdc);
1703  if (t < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1704 
1705  // 0 all the ac values now so we can do it 32-bits at a time
1706  memset(data,0,64*sizeof(data[0]));
1707 
1708  diff = t ? stbi__extend_receive(j, t) : 0;
1709  dc = j->img_comp[b].dc_pred + diff;
1710  j->img_comp[b].dc_pred = dc;
1711  data[0] = (short) (dc * dequant[0]);
1712 
1713  // decode AC components, see JPEG spec
1714  k = 1;
1715  do {
1716  unsigned int zig;
1717  int c,r,s;
1718  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1719  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1720  r = fac[c];
1721  if (r) { // fast-AC path
1722  k += (r >> 4) & 15; // run
1723  s = r & 15; // combined length
1724  j->code_buffer <<= s;
1725  j->code_bits -= s;
1726  // decode into unzigzag'd location
1727  zig = stbi__jpeg_dezigzag[k++];
1728  data[zig] = (short) ((r >> 8) * dequant[zig]);
1729  } else {
1730  int rs = stbi__jpeg_huff_decode(j, hac);
1731  if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1732  s = rs & 15;
1733  r = rs >> 4;
1734  if (s == 0) {
1735  if (rs != 0xf0) break; // end block
1736  k += 16;
1737  } else {
1738  k += r;
1739  // decode into unzigzag'd location
1740  zig = stbi__jpeg_dezigzag[k++];
1741  data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
1742  }
1743  }
1744  } while (k < 64);
1745  return 1;
1746 }
1747 
1748 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
1749 {
1750  int diff,dc;
1751  int t;
1752  if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1753 
1754  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1755 
1756  if (j->succ_high == 0) {
1757  // first scan for DC coefficient, must be first
1758  memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
1759  t = stbi__jpeg_huff_decode(j, hdc);
1760  diff = t ? stbi__extend_receive(j, t) : 0;
1761 
1762  dc = j->img_comp[b].dc_pred + diff;
1763  j->img_comp[b].dc_pred = dc;
1764  data[0] = (short) (dc << j->succ_low);
1765  } else {
1766  // refinement scan for DC coefficient
1767  if (stbi__jpeg_get_bit(j))
1768  data[0] += (short) (1 << j->succ_low);
1769  }
1770  return 1;
1771 }
1772 
1773 // @OPTIMIZE: store non-zigzagged during the decode passes,
1774 // and only de-zigzag when dequantizing
1775 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
1776 {
1777  int k;
1778  if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
1779 
1780  if (j->succ_high == 0) {
1781  int shift = j->succ_low;
1782 
1783  if (j->eob_run) {
1784  --j->eob_run;
1785  return 1;
1786  }
1787 
1788  k = j->spec_start;
1789  do {
1790  unsigned int zig;
1791  int c,r,s;
1792  if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
1793  c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
1794  r = fac[c];
1795  if (r) { // fast-AC path
1796  k += (r >> 4) & 15; // run
1797  s = r & 15; // combined length
1798  j->code_buffer <<= s;
1799  j->code_bits -= s;
1800  zig = stbi__jpeg_dezigzag[k++];
1801  data[zig] = (short) ((r >> 8) << shift);
1802  } else {
1803  int rs = stbi__jpeg_huff_decode(j, hac);
1804  if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1805  s = rs & 15;
1806  r = rs >> 4;
1807  if (s == 0) {
1808  if (r < 15) {
1809  j->eob_run = (1 << r);
1810  if (r)
1811  j->eob_run += stbi__jpeg_get_bits(j, r);
1812  --j->eob_run;
1813  break;
1814  }
1815  k += 16;
1816  } else {
1817  k += r;
1818  zig = stbi__jpeg_dezigzag[k++];
1819  data[zig] = (short) (stbi__extend_receive(j,s) << shift);
1820  }
1821  }
1822  } while (k <= j->spec_end);
1823  } else {
1824  // refinement scan for these AC coefficients
1825 
1826  short bit = (short) (1 << j->succ_low);
1827 
1828  if (j->eob_run) {
1829  --j->eob_run;
1830  for (k = j->spec_start; k <= j->spec_end; ++k) {
1831  short *p = &data[stbi__jpeg_dezigzag[k]];
1832  if (*p != 0)
1833  if (stbi__jpeg_get_bit(j))
1834  if ((*p & bit)==0) {
1835  if (*p > 0)
1836  *p += bit;
1837  else
1838  *p -= bit;
1839  }
1840  }
1841  } else {
1842  k = j->spec_start;
1843  do {
1844  int r,s;
1845  int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
1846  if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
1847  s = rs & 15;
1848  r = rs >> 4;
1849  if (s == 0) {
1850  if (r < 15) {
1851  j->eob_run = (1 << r) - 1;
1852  if (r)
1853  j->eob_run += stbi__jpeg_get_bits(j, r);
1854  r = 64; // force end of block
1855  } else {
1856  // r=15 s=0 should write 16 0s, so we just do
1857  // a run of 15 0s and then write s (which is 0),
1858  // so we don't have to do anything special here
1859  }
1860  } else {
1861  if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
1862  // sign bit
1863  if (stbi__jpeg_get_bit(j))
1864  s = bit;
1865  else
1866  s = -bit;
1867  }
1868 
1869  // advance by r
1870  while (k <= j->spec_end) {
1871  short *p = &data[stbi__jpeg_dezigzag[k++]];
1872  if (*p != 0) {
1873  if (stbi__jpeg_get_bit(j))
1874  if ((*p & bit)==0) {
1875  if (*p > 0)
1876  *p += bit;
1877  else
1878  *p -= bit;
1879  }
1880  } else {
1881  if (r == 0) {
1882  *p = (short) s;
1883  break;
1884  }
1885  --r;
1886  }
1887  }
1888  } while (k <= j->spec_end);
1889  }
1890  }
1891  return 1;
1892 }
1893 
1894 // take a -128..127 value and stbi__clamp it and convert to 0..255
1895 stbi_inline static stbi_uc stbi__clamp(int x)
1896 {
1897  // trick to use a single test to catch both cases
1898  if ((unsigned int) x > 255) {
1899  if (x < 0) return 0;
1900  if (x > 255) return 255;
1901  }
1902  return (stbi_uc) x;
1903 }
1904 
1905 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
1906 #define stbi__fsh(x) ((x) << 12)
1907 
1908 // derived from jidctint -- DCT_ISLOW
1909 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
1910  int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
1911  p2 = s2; \
1912  p3 = s6; \
1913  p1 = (p2+p3) * stbi__f2f(0.5411961f); \
1914  t2 = p1 + p3*stbi__f2f(-1.847759065f); \
1915  t3 = p1 + p2*stbi__f2f( 0.765366865f); \
1916  p2 = s0; \
1917  p3 = s4; \
1918  t0 = stbi__fsh(p2+p3); \
1919  t1 = stbi__fsh(p2-p3); \
1920  x0 = t0+t3; \
1921  x3 = t0-t3; \
1922  x1 = t1+t2; \
1923  x2 = t1-t2; \
1924  t0 = s7; \
1925  t1 = s5; \
1926  t2 = s3; \
1927  t3 = s1; \
1928  p3 = t0+t2; \
1929  p4 = t1+t3; \
1930  p1 = t0+t3; \
1931  p2 = t1+t2; \
1932  p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
1933  t0 = t0*stbi__f2f( 0.298631336f); \
1934  t1 = t1*stbi__f2f( 2.053119869f); \
1935  t2 = t2*stbi__f2f( 3.072711026f); \
1936  t3 = t3*stbi__f2f( 1.501321110f); \
1937  p1 = p5 + p1*stbi__f2f(-0.899976223f); \
1938  p2 = p5 + p2*stbi__f2f(-2.562915447f); \
1939  p3 = p3*stbi__f2f(-1.961570560f); \
1940  p4 = p4*stbi__f2f(-0.390180644f); \
1941  t3 += p1+p4; \
1942  t2 += p2+p3; \
1943  t1 += p2+p4; \
1944  t0 += p1+p3;
1945 
1946 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
1947 {
1948  int i,val[64],*v=val;
1949  stbi_uc *o;
1950  short *d = data;
1951 
1952  // columns
1953  for (i=0; i < 8; ++i,++d, ++v) {
1954  // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
1955  if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
1956  && d[40]==0 && d[48]==0 && d[56]==0) {
1957  // no shortcut 0 seconds
1958  // (1|2|3|4|5|6|7)==0 0 seconds
1959  // all separate -0.047 seconds
1960  // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
1961  int dcterm = d[0] << 2;
1962  v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
1963  } else {
1964  STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
1965  // constants scaled things up by 1<<12; let's bring them back
1966  // down, but keep 2 extra bits of precision
1967  x0 += 512; x1 += 512; x2 += 512; x3 += 512;
1968  v[ 0] = (x0+t3) >> 10;
1969  v[56] = (x0-t3) >> 10;
1970  v[ 8] = (x1+t2) >> 10;
1971  v[48] = (x1-t2) >> 10;
1972  v[16] = (x2+t1) >> 10;
1973  v[40] = (x2-t1) >> 10;
1974  v[24] = (x3+t0) >> 10;
1975  v[32] = (x3-t0) >> 10;
1976  }
1977  }
1978 
1979  for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
1980  // no fast case since the first 1D IDCT spread components out
1981  STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
1982  // constants scaled things up by 1<<12, plus we had 1<<2 from first
1983  // loop, plus horizontal and vertical each scale by sqrt(8) so together
1984  // we've got an extra 1<<3, so 1<<17 total we need to remove.
1985  // so we want to round that, which means adding 0.5 * 1<<17,
1986  // aka 65536. Also, we'll end up with -128 to 127 that we want
1987  // to encode as 0..255 by adding 128, so we'll add that before the shift
1988  x0 += 65536 + (128<<17);
1989  x1 += 65536 + (128<<17);
1990  x2 += 65536 + (128<<17);
1991  x3 += 65536 + (128<<17);
1992  // tried computing the shifts into temps, or'ing the temps to see
1993  // if any were out of range, but that was slower
1994  o[0] = stbi__clamp((x0+t3) >> 17);
1995  o[7] = stbi__clamp((x0-t3) >> 17);
1996  o[1] = stbi__clamp((x1+t2) >> 17);
1997  o[6] = stbi__clamp((x1-t2) >> 17);
1998  o[2] = stbi__clamp((x2+t1) >> 17);
1999  o[5] = stbi__clamp((x2-t1) >> 17);
2000  o[3] = stbi__clamp((x3+t0) >> 17);
2001  o[4] = stbi__clamp((x3-t0) >> 17);
2002  }
2003 }
2004 
2005 #ifdef STBI_SSE2
2006 // sse2 integer IDCT. not the fastest possible implementation but it
2007 // produces bit-identical results to the generic C version so it's
2008 // fully "transparent".
2009 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2010 {
2011  // This is constructed to match our regular (generic) integer IDCT exactly.
2012  __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2013  __m128i tmp;
2014 
2015  // dot product constant: even elems=x, odd elems=y
2016  #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2017 
2018  // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2019  // out(1) = c1[even]*x + c1[odd]*y
2020  #define dct_rot(out0,out1, x,y,c0,c1) \
2021  __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2022  __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2023  __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2024  __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2025  __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2026  __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2027 
2028  // out = in << 12 (in 16-bit, out 32-bit)
2029  #define dct_widen(out, in) \
2030  __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2031  __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2032 
2033  // wide add
2034  #define dct_wadd(out, a, b) \
2035  __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2036  __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2037 
2038  // wide sub
2039  #define dct_wsub(out, a, b) \
2040  __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2041  __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2042 
2043  // butterfly a/b, add bias, then shift by "s" and pack
2044  #define dct_bfly32o(out0, out1, a,b,bias,s) \
2045  { \
2046  __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2047  __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2048  dct_wadd(sum, abiased, b); \
2049  dct_wsub(dif, abiased, b); \
2050  out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2051  out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2052  }
2053 
2054  // 8-bit interleave step (for transposes)
2055  #define dct_interleave8(a, b) \
2056  tmp = a; \
2057  a = _mm_unpacklo_epi8(a, b); \
2058  b = _mm_unpackhi_epi8(tmp, b)
2059 
2060  // 16-bit interleave step (for transposes)
2061  #define dct_interleave16(a, b) \
2062  tmp = a; \
2063  a = _mm_unpacklo_epi16(a, b); \
2064  b = _mm_unpackhi_epi16(tmp, b)
2065 
2066  #define dct_pass(bias,shift) \
2067  { \
2068  /* even part */ \
2069  dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2070  __m128i sum04 = _mm_add_epi16(row0, row4); \
2071  __m128i dif04 = _mm_sub_epi16(row0, row4); \
2072  dct_widen(t0e, sum04); \
2073  dct_widen(t1e, dif04); \
2074  dct_wadd(x0, t0e, t3e); \
2075  dct_wsub(x3, t0e, t3e); \
2076  dct_wadd(x1, t1e, t2e); \
2077  dct_wsub(x2, t1e, t2e); \
2078  /* odd part */ \
2079  dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2080  dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2081  __m128i sum17 = _mm_add_epi16(row1, row7); \
2082  __m128i sum35 = _mm_add_epi16(row3, row5); \
2083  dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2084  dct_wadd(x4, y0o, y4o); \
2085  dct_wadd(x5, y1o, y5o); \
2086  dct_wadd(x6, y2o, y5o); \
2087  dct_wadd(x7, y3o, y4o); \
2088  dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2089  dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2090  dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2091  dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2092  }
2093 
2094  __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2095  __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2096  __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2097  __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2098  __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2099  __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2100  __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2101  __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2102 
2103  // rounding biases in column/row passes, see stbi__idct_block for explanation.
2104  __m128i bias_0 = _mm_set1_epi32(512);
2105  __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2106 
2107  // load
2108  row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2109  row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2110  row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2111  row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2112  row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2113  row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2114  row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2115  row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2116 
2117  // column pass
2118  dct_pass(bias_0, 10);
2119 
2120  {
2121  // 16bit 8x8 transpose pass 1
2122  dct_interleave16(row0, row4);
2123  dct_interleave16(row1, row5);
2124  dct_interleave16(row2, row6);
2125  dct_interleave16(row3, row7);
2126 
2127  // transpose pass 2
2128  dct_interleave16(row0, row2);
2129  dct_interleave16(row1, row3);
2130  dct_interleave16(row4, row6);
2131  dct_interleave16(row5, row7);
2132 
2133  // transpose pass 3
2134  dct_interleave16(row0, row1);
2135  dct_interleave16(row2, row3);
2136  dct_interleave16(row4, row5);
2137  dct_interleave16(row6, row7);
2138  }
2139 
2140  // row pass
2141  dct_pass(bias_1, 17);
2142 
2143  {
2144  // pack
2145  __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2146  __m128i p1 = _mm_packus_epi16(row2, row3);
2147  __m128i p2 = _mm_packus_epi16(row4, row5);
2148  __m128i p3 = _mm_packus_epi16(row6, row7);
2149 
2150  // 8bit 8x8 transpose pass 1
2151  dct_interleave8(p0, p2); // a0e0a1e1...
2152  dct_interleave8(p1, p3); // c0g0c1g1...
2153 
2154  // transpose pass 2
2155  dct_interleave8(p0, p1); // a0c0e0g0...
2156  dct_interleave8(p2, p3); // b0d0f0h0...
2157 
2158  // transpose pass 3
2159  dct_interleave8(p0, p2); // a0b0c0d0...
2160  dct_interleave8(p1, p3); // a4b4c4d4...
2161 
2162  // store
2163  _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2164  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2165  _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2166  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2167  _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2168  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2169  _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2170  _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2171  }
2172 
2173 #undef dct_const
2174 #undef dct_rot
2175 #undef dct_widen
2176 #undef dct_wadd
2177 #undef dct_wsub
2178 #undef dct_bfly32o
2179 #undef dct_interleave8
2180 #undef dct_interleave16
2181 #undef dct_pass
2182 }
2183 
2184 #endif // STBI_SSE2
2185 
2186 #ifdef STBI_NEON
2187 
2188 // NEON integer IDCT. should produce bit-identical
2189 // results to the generic C version.
2190 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2191 {
2192  int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2193 
2194  int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2195  int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2196  int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2197  int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2198  int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2199  int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2200  int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2201  int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2202  int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2203  int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2204  int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2205  int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2206 
2207 #define dct_long_mul(out, inq, coeff) \
2208  int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2209  int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2210 
2211 #define dct_long_mac(out, acc, inq, coeff) \
2212  int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2213  int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2214 
2215 #define dct_widen(out, inq) \
2216  int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2217  int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2218 
2219 // wide add
2220 #define dct_wadd(out, a, b) \
2221  int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2222  int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2223 
2224 // wide sub
2225 #define dct_wsub(out, a, b) \
2226  int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2227  int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2228 
2229 // butterfly a/b, then shift using "shiftop" by "s" and pack
2230 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2231  { \
2232  dct_wadd(sum, a, b); \
2233  dct_wsub(dif, a, b); \
2234  out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2235  out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2236  }
2237 
2238 #define dct_pass(shiftop, shift) \
2239  { \
2240  /* even part */ \
2241  int16x8_t sum26 = vaddq_s16(row2, row6); \
2242  dct_long_mul(p1e, sum26, rot0_0); \
2243  dct_long_mac(t2e, p1e, row6, rot0_1); \
2244  dct_long_mac(t3e, p1e, row2, rot0_2); \
2245  int16x8_t sum04 = vaddq_s16(row0, row4); \
2246  int16x8_t dif04 = vsubq_s16(row0, row4); \
2247  dct_widen(t0e, sum04); \
2248  dct_widen(t1e, dif04); \
2249  dct_wadd(x0, t0e, t3e); \
2250  dct_wsub(x3, t0e, t3e); \
2251  dct_wadd(x1, t1e, t2e); \
2252  dct_wsub(x2, t1e, t2e); \
2253  /* odd part */ \
2254  int16x8_t sum15 = vaddq_s16(row1, row5); \
2255  int16x8_t sum17 = vaddq_s16(row1, row7); \
2256  int16x8_t sum35 = vaddq_s16(row3, row5); \
2257  int16x8_t sum37 = vaddq_s16(row3, row7); \
2258  int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2259  dct_long_mul(p5o, sumodd, rot1_0); \
2260  dct_long_mac(p1o, p5o, sum17, rot1_1); \
2261  dct_long_mac(p2o, p5o, sum35, rot1_2); \
2262  dct_long_mul(p3o, sum37, rot2_0); \
2263  dct_long_mul(p4o, sum15, rot2_1); \
2264  dct_wadd(sump13o, p1o, p3o); \
2265  dct_wadd(sump24o, p2o, p4o); \
2266  dct_wadd(sump23o, p2o, p3o); \
2267  dct_wadd(sump14o, p1o, p4o); \
2268  dct_long_mac(x4, sump13o, row7, rot3_0); \
2269  dct_long_mac(x5, sump24o, row5, rot3_1); \
2270  dct_long_mac(x6, sump23o, row3, rot3_2); \
2271  dct_long_mac(x7, sump14o, row1, rot3_3); \
2272  dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2273  dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2274  dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2275  dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2276  }
2277 
2278  // load
2279  row0 = vld1q_s16(data + 0*8);
2280  row1 = vld1q_s16(data + 1*8);
2281  row2 = vld1q_s16(data + 2*8);
2282  row3 = vld1q_s16(data + 3*8);
2283  row4 = vld1q_s16(data + 4*8);
2284  row5 = vld1q_s16(data + 5*8);
2285  row6 = vld1q_s16(data + 6*8);
2286  row7 = vld1q_s16(data + 7*8);
2287 
2288  // add DC bias
2289  row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2290 
2291  // column pass
2292  dct_pass(vrshrn_n_s32, 10);
2293 
2294  // 16bit 8x8 transpose
2295  {
2296 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2297 // whether compilers actually get this is another story, sadly.
2298 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2299 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2300 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2301 
2302  // pass 1
2303  dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2304  dct_trn16(row2, row3);
2305  dct_trn16(row4, row5);
2306  dct_trn16(row6, row7);
2307 
2308  // pass 2
2309  dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2310  dct_trn32(row1, row3);
2311  dct_trn32(row4, row6);
2312  dct_trn32(row5, row7);
2313 
2314  // pass 3
2315  dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2316  dct_trn64(row1, row5);
2317  dct_trn64(row2, row6);
2318  dct_trn64(row3, row7);
2319 
2320 #undef dct_trn16
2321 #undef dct_trn32
2322 #undef dct_trn64
2323  }
2324 
2325  // row pass
2326  // vrshrn_n_s32 only supports shifts up to 16, we need
2327  // 17. so do a non-rounding shift of 16 first then follow
2328  // up with a rounding shift by 1.
2329  dct_pass(vshrn_n_s32, 16);
2330 
2331  {
2332  // pack and round
2333  uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2334  uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2335  uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2336  uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2337  uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2338  uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2339  uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2340  uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2341 
2342  // again, these can translate into one instruction, but often don't.
2343 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2344 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2345 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2346 
2347  // sadly can't use interleaved stores here since we only write
2348  // 8 bytes to each scan line!
2349 
2350  // 8x8 8-bit transpose pass 1
2351  dct_trn8_8(p0, p1);
2352  dct_trn8_8(p2, p3);
2353  dct_trn8_8(p4, p5);
2354  dct_trn8_8(p6, p7);
2355 
2356  // pass 2
2357  dct_trn8_16(p0, p2);
2358  dct_trn8_16(p1, p3);
2359  dct_trn8_16(p4, p6);
2360  dct_trn8_16(p5, p7);
2361 
2362  // pass 3
2363  dct_trn8_32(p0, p4);
2364  dct_trn8_32(p1, p5);
2365  dct_trn8_32(p2, p6);
2366  dct_trn8_32(p3, p7);
2367 
2368  // store
2369  vst1_u8(out, p0); out += out_stride;
2370  vst1_u8(out, p1); out += out_stride;
2371  vst1_u8(out, p2); out += out_stride;
2372  vst1_u8(out, p3); out += out_stride;
2373  vst1_u8(out, p4); out += out_stride;
2374  vst1_u8(out, p5); out += out_stride;
2375  vst1_u8(out, p6); out += out_stride;
2376  vst1_u8(out, p7);
2377 
2378 #undef dct_trn8_8
2379 #undef dct_trn8_16
2380 #undef dct_trn8_32
2381  }
2382 
2383 #undef dct_long_mul
2384 #undef dct_long_mac
2385 #undef dct_widen
2386 #undef dct_wadd
2387 #undef dct_wsub
2388 #undef dct_bfly32o
2389 #undef dct_pass
2390 }
2391 
2392 #endif // STBI_NEON
2393 
2394 #define STBI__MARKER_none 0xff
2395 // if there's a pending marker from the entropy stream, return that
2396 // otherwise, fetch from the stream and get a marker. if there's no
2397 // marker, return 0xff, which is never a valid marker value
2398 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2399 {
2400  stbi_uc x;
2401  if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2402  x = stbi__get8(j->s);
2403  if (x != 0xff) return STBI__MARKER_none;
2404  while (x == 0xff)
2405  x = stbi__get8(j->s);
2406  return x;
2407 }
2408 
2409 // in each scan, we'll have scan_n components, and the order
2410 // of the components is specified by order[]
2411 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
2412 
2413 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2414 // the dc prediction
2415 static void stbi__jpeg_reset(stbi__jpeg *j)
2416 {
2417  j->code_bits = 0;
2418  j->code_buffer = 0;
2419  j->nomore = 0;
2420  j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
2421  j->marker = STBI__MARKER_none;
2422  j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2423  j->eob_run = 0;
2424  // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
2425  // since we don't even allow 1<<30 pixels
2426 }
2427 
2428 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2429 {
2430  stbi__jpeg_reset(z);
2431  if (!z->progressive) {
2432  if (z->scan_n == 1) {
2433  int i,j;
2434  STBI_SIMD_ALIGN(short, data[64]);
2435  int n = z->order[0];
2436  // non-interleaved data, we just need to process one block at a time,
2437  // in trivial scanline order
2438  // number of blocks to do just depends on how many actual "pixels" this
2439  // component has, independent of interleaved MCU blocking and such
2440  int w = (z->img_comp[n].x+7) >> 3;
2441  int h = (z->img_comp[n].y+7) >> 3;
2442  for (j=0; j < h; ++j) {
2443  for (i=0; i < w; ++i) {
2444  int ha = z->img_comp[n].ha;
2445  if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2446  z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2447  // every data block is an MCU, so countdown the restart interval
2448  if (--z->todo <= 0) {
2449  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2450  // if it's NOT a restart, then just bail, so we get corrupt data
2451  // rather than no data
2452  if (!STBI__RESTART(z->marker)) return 1;
2453  stbi__jpeg_reset(z);
2454  }
2455  }
2456  }
2457  return 1;
2458  } else { // interleaved
2459  int i,j,k,x,y;
2460  STBI_SIMD_ALIGN(short, data[64]);
2461  for (j=0; j < z->img_mcu_y; ++j) {
2462  for (i=0; i < z->img_mcu_x; ++i) {
2463  // scan an interleaved mcu... process scan_n components in order
2464  for (k=0; k < z->scan_n; ++k) {
2465  int n = z->order[k];
2466  // scan out an mcu's worth of this component; that's just determined
2467  // by the basic H and V specified for the component
2468  for (y=0; y < z->img_comp[n].v; ++y) {
2469  for (x=0; x < z->img_comp[n].h; ++x) {
2470  int x2 = (i*z->img_comp[n].h + x)*8;
2471  int y2 = (j*z->img_comp[n].v + y)*8;
2472  int ha = z->img_comp[n].ha;
2473  if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2474  z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2475  }
2476  }
2477  }
2478  // after all interleaved components, that's an interleaved MCU,
2479  // so now count down the restart interval
2480  if (--z->todo <= 0) {
2481  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2482  if (!STBI__RESTART(z->marker)) return 1;
2483  stbi__jpeg_reset(z);
2484  }
2485  }
2486  }
2487  return 1;
2488  }
2489  } else {
2490  if (z->scan_n == 1) {
2491  int i,j;
2492  int n = z->order[0];
2493  // non-interleaved data, we just need to process one block at a time,
2494  // in trivial scanline order
2495  // number of blocks to do just depends on how many actual "pixels" this
2496  // component has, independent of interleaved MCU blocking and such
2497  int w = (z->img_comp[n].x+7) >> 3;
2498  int h = (z->img_comp[n].y+7) >> 3;
2499  for (j=0; j < h; ++j) {
2500  for (i=0; i < w; ++i) {
2501  short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2502  if (z->spec_start == 0) {
2503  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2504  return 0;
2505  } else {
2506  int ha = z->img_comp[n].ha;
2507  if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
2508  return 0;
2509  }
2510  // every data block is an MCU, so countdown the restart interval
2511  if (--z->todo <= 0) {
2512  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2513  if (!STBI__RESTART(z->marker)) return 1;
2514  stbi__jpeg_reset(z);
2515  }
2516  }
2517  }
2518  return 1;
2519  } else { // interleaved
2520  int i,j,k,x,y;
2521  for (j=0; j < z->img_mcu_y; ++j) {
2522  for (i=0; i < z->img_mcu_x; ++i) {
2523  // scan an interleaved mcu... process scan_n components in order
2524  for (k=0; k < z->scan_n; ++k) {
2525  int n = z->order[k];
2526  // scan out an mcu's worth of this component; that's just determined
2527  // by the basic H and V specified for the component
2528  for (y=0; y < z->img_comp[n].v; ++y) {
2529  for (x=0; x < z->img_comp[n].h; ++x) {
2530  int x2 = (i*z->img_comp[n].h + x);
2531  int y2 = (j*z->img_comp[n].v + y);
2532  short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
2533  if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
2534  return 0;
2535  }
2536  }
2537  }
2538  // after all interleaved components, that's an interleaved MCU,
2539  // so now count down the restart interval
2540  if (--z->todo <= 0) {
2541  if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2542  if (!STBI__RESTART(z->marker)) return 1;
2543  stbi__jpeg_reset(z);
2544  }
2545  }
2546  }
2547  return 1;
2548  }
2549  }
2550 }
2551 
2552 static void stbi__jpeg_dequantize(short *data, stbi_uc *dequant)
2553 {
2554  int i;
2555  for (i=0; i < 64; ++i)
2556  data[i] *= dequant[i];
2557 }
2558 
2559 static void stbi__jpeg_finish(stbi__jpeg *z)
2560 {
2561  if (z->progressive) {
2562  // dequantize and idct the data
2563  int i,j,n;
2564  for (n=0; n < z->s->img_n; ++n) {
2565  int w = (z->img_comp[n].x+7) >> 3;
2566  int h = (z->img_comp[n].y+7) >> 3;
2567  for (j=0; j < h; ++j) {
2568  for (i=0; i < w; ++i) {
2569  short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
2570  stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
2571  z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2572  }
2573  }
2574  }
2575  }
2576 }
2577 
2578 static int stbi__process_marker(stbi__jpeg *z, int m)
2579 {
2580  int L;
2581  switch (m) {
2582  case STBI__MARKER_none: // no marker found
2583  return stbi__err("expected marker","Corrupt JPEG");
2584 
2585  case 0xDD: // DRI - specify restart interval
2586  if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
2587  z->restart_interval = stbi__get16be(z->s);
2588  return 1;
2589 
2590  case 0xDB: // DQT - define quantization table
2591  L = stbi__get16be(z->s)-2;
2592  while (L > 0) {
2593  int q = stbi__get8(z->s);
2594  int p = q >> 4;
2595  int t = q & 15,i;
2596  if (p != 0) return stbi__err("bad DQT type","Corrupt JPEG");
2597  if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
2598  for (i=0; i < 64; ++i)
2599  z->dequant[t][stbi__jpeg_dezigzag[i]] = stbi__get8(z->s);
2600  L -= 65;
2601  }
2602  return L==0;
2603 
2604  case 0xC4: // DHT - define huffman table
2605  L = stbi__get16be(z->s)-2;
2606  while (L > 0) {
2607  stbi_uc *v;
2608  int sizes[16],i,n=0;
2609  int q = stbi__get8(z->s);
2610  int tc = q >> 4;
2611  int th = q & 15;
2612  if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
2613  for (i=0; i < 16; ++i) {
2614  sizes[i] = stbi__get8(z->s);
2615  n += sizes[i];
2616  }
2617  L -= 17;
2618  if (tc == 0) {
2619  if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
2620  v = z->huff_dc[th].values;
2621  } else {
2622  if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
2623  v = z->huff_ac[th].values;
2624  }
2625  for (i=0; i < n; ++i)
2626  v[i] = stbi__get8(z->s);
2627  if (tc != 0)
2628  stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
2629  L -= n;
2630  }
2631  return L==0;
2632  }
2633  // check for comment block or APP blocks
2634  if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
2635  stbi__skip(z->s, stbi__get16be(z->s)-2);
2636  return 1;
2637  }
2638  return 0;
2639 }
2640 
2641 // after we see SOS
2642 static int stbi__process_scan_header(stbi__jpeg *z)
2643 {
2644  int i;
2645  int Ls = stbi__get16be(z->s);
2646  z->scan_n = stbi__get8(z->s);
2647  if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
2648  if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
2649  for (i=0; i < z->scan_n; ++i) {
2650  int id = stbi__get8(z->s), which;
2651  int q = stbi__get8(z->s);
2652  for (which = 0; which < z->s->img_n; ++which)
2653  if (z->img_comp[which].id == id)
2654  break;
2655  if (which == z->s->img_n) return 0; // no match
2656  z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
2657  z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
2658  z->order[i] = which;
2659  }
2660 
2661  {
2662  int aa;
2663  z->spec_start = stbi__get8(z->s);
2664  z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
2665  aa = stbi__get8(z->s);
2666  z->succ_high = (aa >> 4);
2667  z->succ_low = (aa & 15);
2668  if (z->progressive) {
2669  if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
2670  return stbi__err("bad SOS", "Corrupt JPEG");
2671  } else {
2672  if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
2673  if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
2674  z->spec_end = 63;
2675  }
2676  }
2677 
2678  return 1;
2679 }
2680 
2681 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
2682 {
2683  stbi__context *s = z->s;
2684  int Lf,p,i,q, h_max=1,v_max=1,c;
2685  Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
2686  p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
2687  s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
2688  s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
2689  c = stbi__get8(s);
2690  if (c != 3 && c != 1) return stbi__err("bad component count","Corrupt JPEG"); // JFIF requires
2691  s->img_n = c;
2692  for (i=0; i < c; ++i) {
2693  z->img_comp[i].data = NULL;
2694  z->img_comp[i].linebuf = NULL;
2695  }
2696 
2697  if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
2698 
2699  for (i=0; i < s->img_n; ++i) {
2700  z->img_comp[i].id = stbi__get8(s);
2701  if (z->img_comp[i].id != i+1) // JFIF requires
2702  if (z->img_comp[i].id != i) // some version of jpegtran outputs non-JFIF-compliant files!
2703  return stbi__err("bad component ID","Corrupt JPEG");
2704  q = stbi__get8(s);
2705  z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
2706  z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
2707  z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
2708  }
2709 
2710  if (scan != STBI__SCAN_load) return 1;
2711 
2712  if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
2713 
2714  for (i=0; i < s->img_n; ++i) {
2715  if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
2716  if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
2717  }
2718 
2719  // compute interleaved mcu info
2720  z->img_h_max = h_max;
2721  z->img_v_max = v_max;
2722  z->img_mcu_w = h_max * 8;
2723  z->img_mcu_h = v_max * 8;
2724  z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
2725  z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
2726 
2727  for (i=0; i < s->img_n; ++i) {
2728  // number of effective pixels (e.g. for non-interleaved MCU)
2729  z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
2730  z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
2731  // to simplify generation, we'll allocate enough memory to decode
2732  // the bogus oversized data from using interleaved MCUs and their
2733  // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
2734  // discard the extra data until colorspace conversion
2735  z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
2736  z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
2737  z->img_comp[i].raw_data = stbi__malloc(z->img_comp[i].w2 * z->img_comp[i].h2+15);
2738 
2739  if (z->img_comp[i].raw_data == NULL) {
2740  for(--i; i >= 0; --i) {
2741  STBI_FREE(z->img_comp[i].raw_data);
2742  z->img_comp[i].data = NULL;
2743  }
2744  return stbi__err("outofmem", "Out of memory");
2745  }
2746  // align blocks for idct using mmx/sse
2747  z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
2748  z->img_comp[i].linebuf = NULL;
2749  if (z->progressive) {
2750  z->img_comp[i].coeff_w = (z->img_comp[i].w2 + 7) >> 3;
2751  z->img_comp[i].coeff_h = (z->img_comp[i].h2 + 7) >> 3;
2752  z->img_comp[i].raw_coeff = STBI_MALLOC(z->img_comp[i].coeff_w * z->img_comp[i].coeff_h * 64 * sizeof(short) + 15);
2753  z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
2754  } else {
2755  z->img_comp[i].coeff = 0;
2756  z->img_comp[i].raw_coeff = 0;
2757  }
2758  }
2759 
2760  return 1;
2761 }
2762 
2763 // use comparisons since in some cases we handle more than one case (e.g. SOF)
2764 #define stbi__DNL(x) ((x) == 0xdc)
2765 #define stbi__SOI(x) ((x) == 0xd8)
2766 #define stbi__EOI(x) ((x) == 0xd9)
2767 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
2768 #define stbi__SOS(x) ((x) == 0xda)
2769 
2770 #define stbi__SOF_progressive(x) ((x) == 0xc2)
2771 
2772 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
2773 {
2774  int m;
2775  z->marker = STBI__MARKER_none; // initialize cached marker to empty
2776  m = stbi__get_marker(z);
2777  if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
2778  if (scan == STBI__SCAN_type) return 1;
2779  m = stbi__get_marker(z);
2780  while (!stbi__SOF(m)) {
2781  if (!stbi__process_marker(z,m)) return 0;
2782  m = stbi__get_marker(z);
2783  while (m == STBI__MARKER_none) {
2784  // some files have extra padding after their blocks, so ok, we'll scan
2785  if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
2786  m = stbi__get_marker(z);
2787  }
2788  }
2789  z->progressive = stbi__SOF_progressive(m);
2790  if (!stbi__process_frame_header(z, scan)) return 0;
2791  return 1;
2792 }
2793 
2794 // decode image to YCbCr format
2795 static int stbi__decode_jpeg_image(stbi__jpeg *j)
2796 {
2797  int m;
2798  for (m = 0; m < 4; m++) {
2799  j->img_comp[m].raw_data = NULL;
2800  j->img_comp[m].raw_coeff = NULL;
2801  }
2802  j->restart_interval = 0;
2803  if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
2804  m = stbi__get_marker(j);
2805  while (!stbi__EOI(m)) {
2806  if (stbi__SOS(m)) {
2807  if (!stbi__process_scan_header(j)) return 0;
2808  if (!stbi__parse_entropy_coded_data(j)) return 0;
2809  if (j->marker == STBI__MARKER_none ) {
2810  // handle 0s at the end of image data from IP Kamera 9060
2811  while (!stbi__at_eof(j->s)) {
2812  int x = stbi__get8(j->s);
2813  if (x == 255) {
2814  j->marker = stbi__get8(j->s);
2815  break;
2816  } else if (x != 0) {
2817  return stbi__err("junk before marker", "Corrupt JPEG");
2818  }
2819  }
2820  // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
2821  }
2822  } else {
2823  if (!stbi__process_marker(j, m)) return 0;
2824  }
2825  m = stbi__get_marker(j);
2826  }
2827  if (j->progressive)
2828  stbi__jpeg_finish(j);
2829  return 1;
2830 }
2831 
2832 // static jfif-centered resampling (across block boundaries)
2833 
2834 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
2835  int w, int hs);
2836 
2837 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
2838 
2839 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2840 {
2841  STBI_NOTUSED(out);
2842  STBI_NOTUSED(in_far);
2843  STBI_NOTUSED(w);
2844  STBI_NOTUSED(hs);
2845  return in_near;
2846 }
2847 
2848 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2849 {
2850  // need to generate two samples vertically for every one in input
2851  int i;
2852  STBI_NOTUSED(hs);
2853  for (i=0; i < w; ++i)
2854  out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
2855  return out;
2856 }
2857 
2858 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2859 {
2860  // need to generate two samples horizontally for every one in input
2861  int i;
2862  stbi_uc *input = in_near;
2863 
2864  if (w == 1) {
2865  // if only one sample, can't do any interpolation
2866  out[0] = out[1] = input[0];
2867  return out;
2868  }
2869 
2870  out[0] = input[0];
2871  out[1] = stbi__div4(input[0]*3 + input[1] + 2);
2872  for (i=1; i < w-1; ++i) {
2873  int n = 3*input[i]+2;
2874  out[i*2+0] = stbi__div4(n+input[i-1]);
2875  out[i*2+1] = stbi__div4(n+input[i+1]);
2876  }
2877  out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
2878  out[i*2+1] = input[w-1];
2879 
2880  STBI_NOTUSED(in_far);
2881  STBI_NOTUSED(hs);
2882 
2883  return out;
2884 }
2885 
2886 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
2887 
2888 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2889 {
2890  // need to generate 2x2 samples for every one in input
2891  int i,t0,t1;
2892  if (w == 1) {
2893  out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2894  return out;
2895  }
2896 
2897  t1 = 3*in_near[0] + in_far[0];
2898  out[0] = stbi__div4(t1+2);
2899  for (i=1; i < w; ++i) {
2900  t0 = t1;
2901  t1 = 3*in_near[i]+in_far[i];
2902  out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
2903  out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
2904  }
2905  out[w*2-1] = stbi__div4(t1+2);
2906 
2907  STBI_NOTUSED(hs);
2908 
2909  return out;
2910 }
2911 
2912 #if defined(STBI_SSE2) || defined(STBI_NEON)
2913 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
2914 {
2915  // need to generate 2x2 samples for every one in input
2916  int i=0,t0,t1;
2917 
2918  if (w == 1) {
2919  out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
2920  return out;
2921  }
2922 
2923  t1 = 3*in_near[0] + in_far[0];
2924  // process groups of 8 pixels for as long as we can.
2925  // note we can't handle the last pixel in a row in this loop
2926  // because we need to handle the filter boundary conditions.
2927  for (; i < ((w-1) & ~7); i += 8) {
2928 #if defined(STBI_SSE2)
2929  // load and perform the vertical filtering pass
2930  // this uses 3*x + y = 4*x + (y - x)
2931  __m128i zero = _mm_setzero_si128();
2932  __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
2933  __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
2934  __m128i farw = _mm_unpacklo_epi8(farb, zero);
2935  __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
2936  __m128i diff = _mm_sub_epi16(farw, nearw);
2937  __m128i nears = _mm_slli_epi16(nearw, 2);
2938  __m128i curr = _mm_add_epi16(nears, diff); // current row
2939 
2940  // horizontal filter works the same based on shifted vers of current
2941  // row. "prev" is current row shifted right by 1 pixel; we need to
2942  // insert the previous pixel value (from t1).
2943  // "next" is current row shifted left by 1 pixel, with first pixel
2944  // of next block of 8 pixels added in.
2945  __m128i prv0 = _mm_slli_si128(curr, 2);
2946  __m128i nxt0 = _mm_srli_si128(curr, 2);
2947  __m128i prev = _mm_insert_epi16(prv0, t1, 0);
2948  __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
2949 
2950  // horizontal filter, polyphase implementation since it's convenient:
2951  // even pixels = 3*cur + prev = cur*4 + (prev - cur)
2952  // odd pixels = 3*cur + next = cur*4 + (next - cur)
2953  // note the shared term.
2954  __m128i bias = _mm_set1_epi16(8);
2955  __m128i curs = _mm_slli_epi16(curr, 2);
2956  __m128i prvd = _mm_sub_epi16(prev, curr);
2957  __m128i nxtd = _mm_sub_epi16(next, curr);
2958  __m128i curb = _mm_add_epi16(curs, bias);
2959  __m128i even = _mm_add_epi16(prvd, curb);
2960  __m128i odd = _mm_add_epi16(nxtd, curb);
2961 
2962  // interleave even and odd pixels, then undo scaling.
2963  __m128i int0 = _mm_unpacklo_epi16(even, odd);
2964  __m128i int1 = _mm_unpackhi_epi16(even, odd);
2965  __m128i de0 = _mm_srli_epi16(int0, 4);
2966  __m128i de1 = _mm_srli_epi16(int1, 4);
2967 
2968  // pack and write output
2969  __m128i outv = _mm_packus_epi16(de0, de1);
2970  _mm_storeu_si128((__m128i *) (out + i*2), outv);
2971 #elif defined(STBI_NEON)
2972  // load and perform the vertical filtering pass
2973  // this uses 3*x + y = 4*x + (y - x)
2974  uint8x8_t farb = vld1_u8(in_far + i);
2975  uint8x8_t nearb = vld1_u8(in_near + i);
2976  int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
2977  int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
2978  int16x8_t curr = vaddq_s16(nears, diff); // current row
2979 
2980  // horizontal filter works the same based on shifted vers of current
2981  // row. "prev" is current row shifted right by 1 pixel; we need to
2982  // insert the previous pixel value (from t1).
2983  // "next" is current row shifted left by 1 pixel, with first pixel
2984  // of next block of 8 pixels added in.
2985  int16x8_t prv0 = vextq_s16(curr, curr, 7);
2986  int16x8_t nxt0 = vextq_s16(curr, curr, 1);
2987  int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
2988  int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
2989 
2990  // horizontal filter, polyphase implementation since it's convenient:
2991  // even pixels = 3*cur + prev = cur*4 + (prev - cur)
2992  // odd pixels = 3*cur + next = cur*4 + (next - cur)
2993  // note the shared term.
2994  int16x8_t curs = vshlq_n_s16(curr, 2);
2995  int16x8_t prvd = vsubq_s16(prev, curr);
2996  int16x8_t nxtd = vsubq_s16(next, curr);
2997  int16x8_t even = vaddq_s16(curs, prvd);
2998  int16x8_t odd = vaddq_s16(curs, nxtd);
2999 
3000  // undo scaling and round, then store with even/odd phases interleaved
3001  uint8x8x2_t o;
3002  o.val[0] = vqrshrun_n_s16(even, 4);
3003  o.val[1] = vqrshrun_n_s16(odd, 4);
3004  vst2_u8(out + i*2, o);
3005 #endif
3006 
3007  // "previous" value for next iter
3008  t1 = 3*in_near[i+7] + in_far[i+7];
3009  }
3010 
3011  t0 = t1;
3012  t1 = 3*in_near[i] + in_far[i];
3013  out[i*2] = stbi__div16(3*t1 + t0 + 8);
3014 
3015  for (++i; i < w; ++i) {
3016  t0 = t1;
3017  t1 = 3*in_near[i]+in_far[i];
3018  out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3019  out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3020  }
3021  out[w*2-1] = stbi__div4(t1+2);
3022 
3023  STBI_NOTUSED(hs);
3024 
3025  return out;
3026 }
3027 #endif
3028 
3029 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3030 {
3031  // resample with nearest-neighbor
3032  int i,j;
3033  STBI_NOTUSED(in_far);
3034  for (i=0; i < w; ++i)
3035  for (j=0; j < hs; ++j)
3036  out[i*hs+j] = in_near[i];
3037  return out;
3038 }
3039 
3040 #ifdef STBI_JPEG_OLD
3041 // this is the same YCbCr-to-RGB calculation that stb_image has used
3042 // historically before the algorithm changes in 1.49
3043 #define float2fixed(x) ((int) ((x) * 65536 + 0.5))
3044 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3045 {
3046  int i;
3047  for (i=0; i < count; ++i) {
3048  int y_fixed = (y[i] << 16) + 32768; // rounding
3049  int r,g,b;
3050  int cr = pcr[i] - 128;
3051  int cb = pcb[i] - 128;
3052  r = y_fixed + cr*float2fixed(1.40200f);
3053  g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
3054  b = y_fixed + cb*float2fixed(1.77200f);
3055  r >>= 16;
3056  g >>= 16;
3057  b >>= 16;
3058  if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3059  if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3060  if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3061  out[0] = (stbi_uc)r;
3062  out[1] = (stbi_uc)g;
3063  out[2] = (stbi_uc)b;
3064  out[3] = 255;
3065  out += step;
3066  }
3067 }
3068 #else
3069 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3070 // to make sure the code produces the same results in both SIMD and scalar
3071 #define float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3072 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3073 {
3074  int i;
3075  for (i=0; i < count; ++i) {
3076  int y_fixed = (y[i] << 20) + (1<<19); // rounding
3077  int r,g,b;
3078  int cr = pcr[i] - 128;
3079  int cb = pcb[i] - 128;
3080  r = y_fixed + cr* float2fixed(1.40200f);
3081  g = y_fixed + (cr*-float2fixed(0.71414f)) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3082  b = y_fixed + cb* float2fixed(1.77200f);
3083  r >>= 20;
3084  g >>= 20;
3085  b >>= 20;
3086  if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3087  if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3088  if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3089  out[0] = (stbi_uc)r;
3090  out[1] = (stbi_uc)g;
3091  out[2] = (stbi_uc)b;
3092  out[3] = 255;
3093  out += step;
3094  }
3095 }
3096 #endif
3097 
3098 #if defined(STBI_SSE2) || defined(STBI_NEON)
3099 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3100 {
3101  int i = 0;
3102 
3103 #ifdef STBI_SSE2
3104  // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3105  // it's useful in practice (you wouldn't use it for textures, for example).
3106  // so just accelerate step == 4 case.
3107  if (step == 4) {
3108  // this is a fairly straightforward implementation and not super-optimized.
3109  __m128i signflip = _mm_set1_epi8(-0x80);
3110  __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3111  __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3112  __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3113  __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3114  __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3115  __m128i xw = _mm_set1_epi16(255); // alpha channel
3116 
3117  for (; i+7 < count; i += 8) {
3118  // load
3119  __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3120  __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3121  __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3122  __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3123  __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3124 
3125  // unpack to short (and left-shift cr, cb by 8)
3126  __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3127  __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3128  __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3129 
3130  // color transform
3131  __m128i yws = _mm_srli_epi16(yw, 4);
3132  __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3133  __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3134  __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3135  __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3136  __m128i rws = _mm_add_epi16(cr0, yws);
3137  __m128i gwt = _mm_add_epi16(cb0, yws);
3138  __m128i bws = _mm_add_epi16(yws, cb1);
3139  __m128i gws = _mm_add_epi16(gwt, cr1);
3140 
3141  // descale
3142  __m128i rw = _mm_srai_epi16(rws, 4);
3143  __m128i bw = _mm_srai_epi16(bws, 4);
3144  __m128i gw = _mm_srai_epi16(gws, 4);
3145 
3146  // back to byte, set up for transpose
3147  __m128i brb = _mm_packus_epi16(rw, bw);
3148  __m128i gxb = _mm_packus_epi16(gw, xw);
3149 
3150  // transpose to interleave channels
3151  __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3152  __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3153  __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3154  __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3155 
3156  // store
3157  _mm_storeu_si128((__m128i *) (out + 0), o0);
3158  _mm_storeu_si128((__m128i *) (out + 16), o1);
3159  out += 32;
3160  }
3161  }
3162 #endif
3163 
3164 #ifdef STBI_NEON
3165  // in this version, step=3 support would be easy to add. but is there demand?
3166  if (step == 4) {
3167  // this is a fairly straightforward implementation and not super-optimized.
3168  uint8x8_t signflip = vdup_n_u8(0x80);
3169  int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3170  int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3171  int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3172  int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3173 
3174  for (; i+7 < count; i += 8) {
3175  // load
3176  uint8x8_t y_bytes = vld1_u8(y + i);
3177  uint8x8_t cr_bytes = vld1_u8(pcr + i);
3178  uint8x8_t cb_bytes = vld1_u8(pcb + i);
3179  int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3180  int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3181 
3182  // expand to s16
3183  int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3184  int16x8_t crw = vshll_n_s8(cr_biased, 7);
3185  int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3186 
3187  // color transform
3188  int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3189  int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3190  int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3191  int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3192  int16x8_t rws = vaddq_s16(yws, cr0);
3193  int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3194  int16x8_t bws = vaddq_s16(yws, cb1);
3195 
3196  // undo scaling, round, convert to byte
3197  uint8x8x4_t o;
3198  o.val[0] = vqrshrun_n_s16(rws, 4);
3199  o.val[1] = vqrshrun_n_s16(gws, 4);
3200  o.val[2] = vqrshrun_n_s16(bws, 4);
3201  o.val[3] = vdup_n_u8(255);
3202 
3203  // store, interleaving r/g/b/a
3204  vst4_u8(out, o);
3205  out += 8*4;
3206  }
3207  }
3208 #endif
3209 
3210  for (; i < count; ++i) {
3211  int y_fixed = (y[i] << 20) + (1<<19); // rounding
3212  int r,g,b;
3213  int cr = pcr[i] - 128;
3214  int cb = pcb[i] - 128;
3215  r = y_fixed + cr* float2fixed(1.40200f);
3216  g = y_fixed + cr*-float2fixed(0.71414f) + ((cb*-float2fixed(0.34414f)) & 0xffff0000);
3217  b = y_fixed + cb* float2fixed(1.77200f);
3218  r >>= 20;
3219  g >>= 20;
3220  b >>= 20;
3221  if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3222  if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3223  if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3224  out[0] = (stbi_uc)r;
3225  out[1] = (stbi_uc)g;
3226  out[2] = (stbi_uc)b;
3227  out[3] = 255;
3228  out += step;
3229  }
3230 }
3231 #endif
3232 
3233 // set up the kernels
3234 static void stbi__setup_jpeg(stbi__jpeg *j)
3235 {
3236  j->idct_block_kernel = stbi__idct_block;
3237  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3238  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3239 
3240 #ifdef STBI_SSE2
3241  if (stbi__sse2_available()) {
3242  j->idct_block_kernel = stbi__idct_simd;
3243  #ifndef STBI_JPEG_OLD
3244  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3245  #endif
3246  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3247  }
3248 #endif
3249 
3250 #ifdef STBI_NEON
3251  j->idct_block_kernel = stbi__idct_simd;
3252  #ifndef STBI_JPEG_OLD
3253  j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3254  #endif
3255  j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3256 #endif
3257 }
3258 
3259 // clean up the temporary component buffers
3260 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3261 {
3262  int i;
3263  for (i=0; i < j->s->img_n; ++i) {
3264  if (j->img_comp[i].raw_data) {
3265  STBI_FREE(j->img_comp[i].raw_data);
3266  j->img_comp[i].raw_data = NULL;
3267  j->img_comp[i].data = NULL;
3268  }
3269  if (j->img_comp[i].raw_coeff) {
3270  STBI_FREE(j->img_comp[i].raw_coeff);
3271  j->img_comp[i].raw_coeff = 0;
3272  j->img_comp[i].coeff = 0;
3273  }
3274  if (j->img_comp[i].linebuf) {
3275  STBI_FREE(j->img_comp[i].linebuf);
3276  j->img_comp[i].linebuf = NULL;
3277  }
3278  }
3279 }
3280 
3281 typedef struct
3282 {
3283  resample_row_func resample;
3284  stbi_uc *line0,*line1;
3285  int hs,vs; // expansion factor in each axis
3286  int w_lores; // horizontal pixels pre-expansion
3287  int ystep; // how far through vertical expansion we are
3288  int ypos; // which pre-expansion row we're on
3289 } stbi__resample;
3290 
3291 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3292 {
3293  int n, decode_n;
3294  z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3295 
3296  // validate req_comp
3297  if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3298 
3299  // load a jpeg image from whichever source, but leave in YCbCr format
3300  if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3301 
3302  // determine actual number of components to generate
3303  n = req_comp ? req_comp : z->s->img_n;
3304 
3305  if (z->s->img_n == 3 && n < 3)
3306  decode_n = 1;
3307  else
3308  decode_n = z->s->img_n;
3309 
3310  // resample and color-convert
3311  {
3312  int k;
3313  unsigned int i,j;
3314  stbi_uc *output;
3315  stbi_uc *coutput[4];
3316 
3317  stbi__resample res_comp[4];
3318 
3319  for (k=0; k < decode_n; ++k) {
3320  stbi__resample *r = &res_comp[k];
3321 
3322  // allocate line buffer big enough for upsampling off the edges
3323  // with upsample factor of 4
3324  z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3325  if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3326 
3327  r->hs = z->img_h_max / z->img_comp[k].h;
3328  r->vs = z->img_v_max / z->img_comp[k].v;
3329  r->ystep = r->vs >> 1;
3330  r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3331  r->ypos = 0;
3332  r->line0 = r->line1 = z->img_comp[k].data;
3333 
3334  if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3335  else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3336  else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3337  else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3338  else r->resample = stbi__resample_row_generic;
3339  }
3340 
3341  // can't error after this so, this is safe
3342  output = (stbi_uc *) stbi__malloc(n * z->s->img_x * z->s->img_y + 1);
3343  if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3344 
3345  // now go ahead and resample
3346  for (j=0; j < z->s->img_y; ++j) {
3347  stbi_uc *out = output + n * z->s->img_x * j;
3348  for (k=0; k < decode_n; ++k) {
3349  stbi__resample *r = &res_comp[k];
3350  int y_bot = r->ystep >= (r->vs >> 1);
3351  coutput[k] = r->resample(z->img_comp[k].linebuf,
3352  y_bot ? r->line1 : r->line0,
3353  y_bot ? r->line0 : r->line1,
3354  r->w_lores, r->hs);
3355  if (++r->ystep >= r->vs) {
3356  r->ystep = 0;
3357  r->line0 = r->line1;
3358  if (++r->ypos < z->img_comp[k].y)
3359  r->line1 += z->img_comp[k].w2;
3360  }
3361  }
3362  if (n >= 3) {
3363  stbi_uc *y = coutput[0];
3364  if (z->s->img_n == 3) {
3365  z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3366  } else
3367  for (i=0; i < z->s->img_x; ++i) {
3368  out[0] = out[1] = out[2] = y[i];
3369  out[3] = 255; // not used if n==3
3370  out += n;
3371  }
3372  } else {
3373  stbi_uc *y = coutput[0];
3374  if (n == 1)
3375  for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
3376  else
3377  for (i=0; i < z->s->img_x; ++i) *out++ = y[i], *out++ = 255;
3378  }
3379  }
3380  stbi__cleanup_jpeg(z);
3381  *out_x = z->s->img_x;
3382  *out_y = z->s->img_y;
3383  if (comp) *comp = z->s->img_n; // report original components, not output
3384  return output;
3385  }
3386 }
3387 
3388 static unsigned char *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
3389 {
3390  stbi__jpeg j;
3391  j.s = s;
3392  stbi__setup_jpeg(&j);
3393  return load_jpeg_image(&j, x,y,comp,req_comp);
3394 }
3395 
3396 static int stbi__jpeg_test(stbi__context *s)
3397 {
3398  int r;
3399  stbi__jpeg j;
3400  j.s = s;
3401  stbi__setup_jpeg(&j);
3402  r = stbi__decode_jpeg_header(&j, STBI__SCAN_type);
3403  stbi__rewind(s);
3404  return r;
3405 }
3406 
3407 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
3408 {
3409  if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
3410  stbi__rewind( j->s );
3411  return 0;
3412  }
3413  if (x) *x = j->s->img_x;
3414  if (y) *y = j->s->img_y;
3415  if (comp) *comp = j->s->img_n;
3416  return 1;
3417 }
3418 
3419 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
3420 {
3421  stbi__jpeg j;
3422  j.s = s;
3423  return stbi__jpeg_info_raw(&j, x, y, comp);
3424 }
3425 #endif
3426 
3427 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
3428 // simple implementation
3429 // - all input must be provided in an upfront buffer
3430 // - all output is written to a single output buffer (can malloc/realloc)
3431 // performance
3432 // - fast huffman
3433 
3434 #ifndef STBI_NO_ZLIB
3435 
3436 // fast-way is faster to check than jpeg huffman, but slow way is slower
3437 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
3438 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
3439 
3440 // zlib-style huffman encoding
3441 // (jpegs packs from left, zlib from right, so can't share code)
3442 typedef struct
3443 {
3444  stbi__uint16 fast[1 << STBI__ZFAST_BITS];
3445  stbi__uint16 firstcode[16];
3446  int maxcode[17];
3447  stbi__uint16 firstsymbol[16];
3448  stbi_uc size[288];
3449  stbi__uint16 value[288];
3450 } stbi__zhuffman;
3451 
3452 stbi_inline static int stbi__bitreverse16(int n)
3453 {
3454  n = ((n & 0xAAAA) >> 1) | ((n & 0x5555) << 1);
3455  n = ((n & 0xCCCC) >> 2) | ((n & 0x3333) << 2);
3456  n = ((n & 0xF0F0) >> 4) | ((n & 0x0F0F) << 4);
3457  n = ((n & 0xFF00) >> 8) | ((n & 0x00FF) << 8);
3458  return n;
3459 }
3460 
3461 stbi_inline static int stbi__bit_reverse(int v, int bits)
3462 {
3463  STBI_ASSERT(bits <= 16);
3464  // to bit reverse n bits, reverse 16 and shift
3465  // e.g. 11 bits, bit reverse and shift away 5
3466  return stbi__bitreverse16(v) >> (16-bits);
3467 }
3468 
3469 static int stbi__zbuild_huffman(stbi__zhuffman *z, stbi_uc *sizelist, int num)
3470 {
3471  int i,kl=0;
3472  int code, next_code[16], sizes[17];
3473 
3474  // DEFLATE spec for generating codes
3475  memset(sizes, 0, sizeof(sizes));
3476  memset(z->fast, 0, sizeof(z->fast));
3477  for (i=0; i < num; ++i)
3478  ++sizes[sizelist[i]];
3479  sizes[0] = 0;
3480  for (i=1; i < 16; ++i)
3481  if (sizes[i] > (1 << i))
3482  return stbi__err("bad sizes", "Corrupt PNG");
3483  code = 0;
3484  for (i=1; i < 16; ++i) {
3485  next_code[i] = code;
3486  z->firstcode[i] = (stbi__uint16) code;
3487  z->firstsymbol[i] = (stbi__uint16) kl;
3488  code = (code + sizes[i]);
3489  if (sizes[i])
3490  if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
3491  z->maxcode[i] = code << (16-i); // preshift for inner loop
3492  code <<= 1;
3493  kl += sizes[i];
3494  }
3495  z->maxcode[16] = 0x10000; // sentinel
3496  for (i=0; i < num; ++i) {
3497  int s = sizelist[i];
3498  if (s) {
3499  int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
3500  stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
3501  z->size [c] = (stbi_uc ) s;
3502  z->value[c] = (stbi__uint16) i;
3503  if (s <= STBI__ZFAST_BITS) {
3504  int kll = stbi__bit_reverse(next_code[s],s);
3505  while (kll < (1 << STBI__ZFAST_BITS)) {
3506  z->fast[kll] = fastv;
3507  kll += (1 << s);
3508  }
3509  }
3510  ++next_code[s];
3511  }
3512  }
3513  return 1;
3514 }
3515 
3516 // zlib-from-memory implementation for PNG reading
3517 // because PNG allows splitting the zlib stream arbitrarily,
3518 // and it's annoying structurally to have PNG call ZLIB call PNG,
3519 // we require PNG read all the IDATs and combine them into a single
3520 // memory buffer
3521 
3522 typedef struct
3523 {
3524  stbi_uc *zbuffer, *zbuffer_end;
3525  int num_bits;
3526  stbi__uint32 code_buffer;
3527 
3528  char *zout;
3529  char *zout_start;
3530  char *zout_end;
3531  int z_expandable;
3532 
3533  stbi__zhuffman z_length, z_distance;
3534 } stbi__zbuf;
3535 
3536 stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
3537 {
3538  if (z->zbuffer >= z->zbuffer_end) return 0;
3539  return *z->zbuffer++;
3540 }
3541 
3542 static void stbi__fill_bits(stbi__zbuf *z)
3543 {
3544  do {
3545  STBI_ASSERT(z->code_buffer < (1U << z->num_bits));
3546  z->code_buffer |= stbi__zget8(z) << z->num_bits;
3547  z->num_bits += 8;
3548  } while (z->num_bits <= 24);
3549 }
3550 
3551 stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
3552 {
3553  unsigned int k;
3554  if (z->num_bits < n) stbi__fill_bits(z);
3555  k = z->code_buffer & ((1 << n) - 1);
3556  z->code_buffer >>= n;
3557  z->num_bits -= n;
3558  return k;
3559 }
3560 
3561 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
3562 {
3563  int b,s,k;
3564  // not resolved by fast table, so compute it the slow way
3565  // use jpeg approach, which requires MSbits at top
3566  k = stbi__bit_reverse(a->code_buffer, 16);
3567  for (s=STBI__ZFAST_BITS+1; ; ++s)
3568  if (k < z->maxcode[s])
3569  break;
3570  if (s == 16) return -1; // invalid code!
3571  // code size is s, so:
3572  b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
3573  STBI_ASSERT(z->size[b] == s);
3574  a->code_buffer >>= s;
3575  a->num_bits -= s;
3576  return z->value[b];
3577 }
3578 
3579 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
3580 {
3581  int b,s;
3582  if (a->num_bits < 16) stbi__fill_bits(a);
3583  b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
3584  if (b) {
3585  s = b >> 9;
3586  a->code_buffer >>= s;
3587  a->num_bits -= s;
3588  return b & 511;
3589  }
3590  return stbi__zhuffman_decode_slowpath(a, z);
3591 }
3592 
3593 static int stbi__zexpand(stbi__zbuf *z, char *zout, int n) // need to make room for n bytes
3594 {
3595  char *q;
3596  int cur, limit;
3597  z->zout = zout;
3598  if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
3599  cur = (int) (z->zout - z->zout_start);
3600  limit = (int) (z->zout_end - z->zout_start);
3601  while (cur + n > limit)
3602  limit *= 2;
3603  q = (char *) STBI_REALLOC(z->zout_start, limit);
3604  if (q == NULL) return stbi__err("outofmem", "Out of memory");
3605  z->zout_start = q;
3606  z->zout = q + cur;
3607  z->zout_end = q + limit;
3608  return 1;
3609 }
3610 
3611 static int stbi__zlength_base[31] = {
3612  3,4,5,6,7,8,9,10,11,13,
3613  15,17,19,23,27,31,35,43,51,59,
3614  67,83,99,115,131,163,195,227,258,0,0 };
3615 
3616 static int stbi__zlength_extra[31]=
3617 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
3618 
3619 static int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
3620 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
3621 
3622 static int stbi__zdist_extra[32] =
3623 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
3624 
3625 static int stbi__parse_huffman_block(stbi__zbuf *a)
3626 {
3627  char *zout = a->zout;
3628  for(;;) {
3629  int z = stbi__zhuffman_decode(a, &a->z_length);
3630  if (z < 256) {
3631  if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
3632  if (zout >= a->zout_end) {
3633  if (!stbi__zexpand(a, zout, 1)) return 0;
3634  zout = a->zout;
3635  }
3636  *zout++ = (char) z;
3637  } else {
3638  stbi_uc *p;
3639  int len,dist;
3640  if (z == 256) {
3641  a->zout = zout;
3642  return 1;
3643  }
3644  z -= 257;
3645  len = stbi__zlength_base[z];
3646  if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
3647  z = stbi__zhuffman_decode(a, &a->z_distance);
3648  if (z < 0) return stbi__err("bad huffman code","Corrupt PNG");
3649  dist = stbi__zdist_base[z];
3650  if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
3651  if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
3652  if (zout + len > a->zout_end) {
3653  if (!stbi__zexpand(a, zout, len)) return 0;
3654  zout = a->zout;
3655  }
3656  p = (stbi_uc *) (zout - dist);
3657  if (dist == 1) { // run of one byte; common in images.
3658  stbi_uc v = *p;
3659  if (len) { do *zout++ = v; while (--len); }
3660  } else {
3661  if (len) { do *zout++ = *p++; while (--len); }
3662  }
3663  }
3664  }
3665 }
3666 
3667 static int stbi__compute_huffman_codes(stbi__zbuf *a)
3668 {
3669  static stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
3670  stbi__zhuffman z_codelength;
3671  stbi_uc lencodes[286+32+137];//padding for maximum single op
3672  stbi_uc codelength_sizes[19];
3673  int i,n;
3674 
3675  int hlit = stbi__zreceive(a,5) + 257;
3676  int hdist = stbi__zreceive(a,5) + 1;
3677  int hclen = stbi__zreceive(a,4) + 4;
3678 
3679  memset(codelength_sizes, 0, sizeof(codelength_sizes));
3680  for (i=0; i < hclen; ++i) {
3681  int s = stbi__zreceive(a,3);
3682  codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
3683  }
3684  if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
3685 
3686  n = 0;
3687  while (n < hlit + hdist) {
3688  int c = stbi__zhuffman_decode(a, &z_codelength);
3689  if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
3690  if (c < 16)
3691  lencodes[n++] = (stbi_uc) c;
3692  else if (c == 16) {
3693  c = stbi__zreceive(a,2)+3;
3694  memset(lencodes+n, lencodes[n-1], c);
3695  n += c;
3696  } else if (c == 17) {
3697  c = stbi__zreceive(a,3)+3;
3698  memset(lencodes+n, 0, c);
3699  n += c;
3700  } else {
3701  STBI_ASSERT(c == 18);
3702  c = stbi__zreceive(a,7)+11;
3703  memset(lencodes+n, 0, c);
3704  n += c;
3705  }
3706  }
3707  if (n != hlit+hdist) return stbi__err("bad codelengths","Corrupt PNG");
3708  if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
3709  if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
3710  return 1;
3711 }
3712 
3713 static int stbi__parse_uncomperssed_block(stbi__zbuf *a)
3714 {
3715  stbi_uc header[4];
3716  int len,nlen,k;
3717  if (a->num_bits & 7)
3718  stbi__zreceive(a, a->num_bits & 7); // discard
3719  // drain the bit-packed data into header
3720  k = 0;
3721  while (a->num_bits > 0) {
3722  header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
3723  a->code_buffer >>= 8;
3724  a->num_bits -= 8;
3725  }
3726  STBI_ASSERT(a->num_bits == 0);
3727  // now fill header the normal way
3728  while (k < 4)
3729  header[k++] = stbi__zget8(a);
3730  len = header[1] * 256 + header[0];
3731  nlen = header[3] * 256 + header[2];
3732  if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
3733  if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
3734  if (a->zout + len > a->zout_end)
3735  if (!stbi__zexpand(a, a->zout, len)) return 0;
3736  memcpy(a->zout, a->zbuffer, len);
3737  a->zbuffer += len;
3738  a->zout += len;
3739  return 1;
3740 }
3741 
3742 static int stbi__parse_zlib_header(stbi__zbuf *a)
3743 {
3744  int cmf = stbi__zget8(a);
3745  int cm = cmf & 15;
3746  /* int cinfo = cmf >> 4; */
3747  int flg = stbi__zget8(a);
3748  if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
3749  if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
3750  if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
3751  // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
3752  return 1;
3753 }
3754 
3755 // @TODO: should statically initialize these for optimal thread safety
3756 static stbi_uc stbi__zdefault_length[288], stbi__zdefault_distance[32];
3757 static void stbi__init_zdefaults(void)
3758 {
3759  int i; // use <= to match clearly with spec
3760  for (i=0; i <= 143; ++i) stbi__zdefault_length[i] = 8;
3761  for ( ; i <= 255; ++i) stbi__zdefault_length[i] = 9;
3762  for ( ; i <= 279; ++i) stbi__zdefault_length[i] = 7;
3763  for ( ; i <= 287; ++i) stbi__zdefault_length[i] = 8;
3764 
3765  for (i=0; i <= 31; ++i) stbi__zdefault_distance[i] = 5;
3766 }
3767 
3768 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
3769 {
3770  int final, type;
3771  if (parse_header)
3772  if (!stbi__parse_zlib_header(a)) return 0;
3773  a->num_bits = 0;
3774  a->code_buffer = 0;
3775  do {
3776  final = stbi__zreceive(a,1);
3777  type = stbi__zreceive(a,2);
3778  if (type == 0) {
3779  if (!stbi__parse_uncomperssed_block(a)) return 0;
3780  } else if (type == 3) {
3781  return 0;
3782  } else {
3783  if (type == 1) {
3784  // use fixed code lengths
3785  if (!stbi__zdefault_distance[31]) stbi__init_zdefaults();
3786  if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , 288)) return 0;
3787  if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
3788  } else {
3789  if (!stbi__compute_huffman_codes(a)) return 0;
3790  }
3791  if (!stbi__parse_huffman_block(a)) return 0;
3792  }
3793  } while (!final);
3794  return 1;
3795 }
3796 
3797 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
3798 {
3799  a->zout_start = obuf;
3800  a->zout = obuf;
3801  a->zout_end = obuf + olen;
3802  a->z_expandable = exp;
3803 
3804  return stbi__parse_zlib(a, parse_header);
3805 }
3806 
3807 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
3808 {
3809  stbi__zbuf a;
3810  char *p = (char *) stbi__malloc(initial_size);
3811  if (p == NULL) return NULL;
3812  a.zbuffer = (stbi_uc *) buffer;
3813  a.zbuffer_end = (stbi_uc *) buffer + len;
3814  if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
3815  if (outlen) *outlen = (int) (a.zout - a.zout_start);
3816  return a.zout_start;
3817  } else {
3818  STBI_FREE(a.zout_start);
3819  return NULL;
3820  }
3821 }
3822 
3823 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
3824 {
3825  return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
3826 }
3827 
3828 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
3829 {
3830  stbi__zbuf a;
3831  char *p = (char *) stbi__malloc(initial_size);
3832  if (p == NULL) return NULL;
3833  a.zbuffer = (stbi_uc *) buffer;
3834  a.zbuffer_end = (stbi_uc *) buffer + len;
3835  if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
3836  if (outlen) *outlen = (int) (a.zout - a.zout_start);
3837  return a.zout_start;
3838  } else {
3839  STBI_FREE(a.zout_start);
3840  return NULL;
3841  }
3842 }
3843 
3844 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
3845 {
3846  stbi__zbuf a;
3847  a.zbuffer = (stbi_uc *) ibuffer;
3848  a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3849  if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
3850  return (int) (a.zout - a.zout_start);
3851  else
3852  return -1;
3853 }
3854 
3855 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
3856 {
3857  stbi__zbuf a;
3858  char *p = (char *) stbi__malloc(16384);
3859  if (p == NULL) return NULL;
3860  a.zbuffer = (stbi_uc *) buffer;
3861  a.zbuffer_end = (stbi_uc *) buffer+len;
3862  if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
3863  if (outlen) *outlen = (int) (a.zout - a.zout_start);
3864  return a.zout_start;
3865  } else {
3866  STBI_FREE(a.zout_start);
3867  return NULL;
3868  }
3869 }
3870 
3871 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
3872 {
3873  stbi__zbuf a;
3874  a.zbuffer = (stbi_uc *) ibuffer;
3875  a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
3876  if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
3877  return (int) (a.zout - a.zout_start);
3878  else
3879  return -1;
3880 }
3881 #endif
3882 
3883 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
3884 // simple implementation
3885 // - only 8-bit samples
3886 // - no CRC checking
3887 // - allocates lots of intermediate memory
3888 // - avoids problem of streaming data between subsystems
3889 // - avoids explicit window management
3890 // performance
3891 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
3892 
3893 #ifndef STBI_NO_PNG
3894 typedef struct
3895 {
3896  stbi__uint32 length;
3897  stbi__uint32 type;
3898 } stbi__pngchunk;
3899 
3900 static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
3901 {
3902  stbi__pngchunk c;
3903  c.length = stbi__get32be(s);
3904  c.type = stbi__get32be(s);
3905  return c;
3906 }
3907 
3908 static int stbi__check_png_header(stbi__context *s)
3909 {
3910  static stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
3911  int i;
3912  for (i=0; i < 8; ++i)
3913  if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
3914  return 1;
3915 }
3916 
3917 typedef struct
3918 {
3919  stbi__context *s;
3920  stbi_uc *idata, *expanded, *out;
3921 } stbi__png;
3922 
3923 
3924 enum {
3925  STBI__F_none=0,
3926  STBI__F_sub=1,
3927  STBI__F_up=2,
3928  STBI__F_avg=3,
3929  STBI__F_paeth=4,
3930  // synthetic filters used for first scanline to avoid needing a dummy row of 0s
3931  STBI__F_avg_first,
3932  STBI__F_paeth_first
3933 };
3934 
3935 static stbi_uc first_row_filter[5] =
3936 {
3937  STBI__F_none,
3938  STBI__F_sub,
3939  STBI__F_none,
3940  STBI__F_avg_first,
3941  STBI__F_paeth_first
3942 };
3943 
3944 static int stbi__paeth(int a, int b, int c)
3945 {
3946  int p = a + b - c;
3947  int pa = abs(p-a);
3948  int pb = abs(p-b);
3949  int pc = abs(p-c);
3950  if (pa <= pb && pa <= pc) return a;
3951  if (pb <= pc) return b;
3952  return c;
3953 }
3954 
3955 static stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
3956 
3957 // create the png data from post-deflated data
3958 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
3959 {
3960  stbi__context *s = a->s;
3961  stbi__uint32 i,j,stride = x*out_n;
3962  stbi__uint32 img_len, img_width_bytes;
3963  int k;
3964  int img_n = s->img_n; // copy it into a local for later
3965 
3966  STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
3967  a->out = (stbi_uc *) stbi__malloc(x * y * out_n); // extra bytes to write off the end into
3968  if (!a->out) return stbi__err("outofmem", "Out of memory");
3969 
3970  img_width_bytes = (((img_n * x * depth) + 7) >> 3);
3971  img_len = (img_width_bytes + 1) * y;
3972  if (s->img_x == x && s->img_y == y) {
3973  if (raw_len != img_len) return stbi__err("not enough pixels","Corrupt PNG");
3974  } else { // interlaced:
3975  if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
3976  }
3977 
3978  for (j=0; j < y; ++j) {
3979  stbi_uc *cur = a->out + stride*j;
3980  stbi_uc *prior = cur - stride;
3981  int filter = *raw++;
3982  int filter_bytes = img_n;
3983  int width = x;
3984  if (filter > 4)
3985  return stbi__err("invalid filter","Corrupt PNG");
3986 
3987  if (depth < 8) {
3988  STBI_ASSERT(img_width_bytes <= x);
3989  cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
3990  filter_bytes = 1;
3991  width = img_width_bytes;
3992  }
3993 
3994  // if first row, use special filter that doesn't sample previous row
3995  if (j == 0) filter = first_row_filter[filter];
3996 
3997  // handle first byte explicitly
3998  for (k=0; k < filter_bytes; ++k) {
3999  switch (filter) {
4000  case STBI__F_none : cur[k] = raw[k]; break;
4001  case STBI__F_sub : cur[k] = raw[k]; break;
4002  case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4003  case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4004  case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4005  case STBI__F_avg_first : cur[k] = raw[k]; break;
4006  case STBI__F_paeth_first: cur[k] = raw[k]; break;
4007  }
4008  }
4009 
4010  if (depth == 8) {
4011  if (img_n != out_n)
4012  cur[img_n] = 255; // first pixel
4013  raw += img_n;
4014  cur += out_n;
4015  prior += out_n;
4016  } else {
4017  raw += 1;
4018  cur += 1;
4019  prior += 1;
4020  }
4021 
4022  // this is a little gross, so that we don't switch per-pixel or per-component
4023  if (depth < 8 || img_n == out_n) {
4024  int nk = (width - 1)*img_n;
4025  #define CASE(f) \
4026  case f: \
4027  for (k=0; k < nk; ++k)
4028  switch (filter) {
4029  // "none" filter turns into a memcpy here; make that explicit.
4030  case STBI__F_none: memcpy(cur, raw, nk); break;
4031  CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); break;
4032  CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4033  CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); break;
4034  CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); break;
4035  CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); break;
4036  CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); break;
4037  }
4038  #undef CASE
4039  raw += nk;
4040  } else {
4041  STBI_ASSERT(img_n+1 == out_n);
4042  #define CASE(f) \
4043  case f: \
4044  for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
4045  for (k=0; k < img_n; ++k)
4046  switch (filter) {
4047  CASE(STBI__F_none) cur[k] = raw[k]; break;
4048  CASE(STBI__F_sub) cur[k] = STBI__BYTECAST(raw[k] + cur[k-out_n]); break;
4049  CASE(STBI__F_up) cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4050  CASE(STBI__F_avg) cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-out_n])>>1)); break;
4051  CASE(STBI__F_paeth) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
4052  CASE(STBI__F_avg_first) cur[k] = STBI__BYTECAST(raw[k] + (cur[k-out_n] >> 1)); break;
4053  CASE(STBI__F_paeth_first) cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-out_n],0,0)); break;
4054  }
4055  #undef CASE
4056  }
4057  }
4058 
4059  // we make a separate pass to expand bits to pixels; for performance,
4060  // this could run two scanlines behind the above code, so it won't
4061  // intefere with filtering but will still be in the cache.
4062  if (depth < 8) {
4063  for (j=0; j < y; ++j) {
4064  stbi_uc *cur = a->out + stride*j;
4065  stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
4066  // unpack 1/2/4-bit into a 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
4067  // png guarante byte alignment, if width is not multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4068  stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4069 
4070  // note that the final byte might overshoot and write more data than desired.
4071  // we can allocate enough data that this never writes out of memory, but it
4072  // could also overwrite the next scanline. can it overwrite non-empty data
4073  // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4074  // so we need to explicitly clamp the final ones
4075 
4076  if (depth == 4) {
4077  for (k=x*img_n; k >= 2; k-=2, ++in) {
4078  *cur++ = scale * ((*in >> 4) );
4079  *cur++ = scale * ((*in ) & 0x0f);
4080  }
4081  if (k > 0) *cur++ = scale * ((*in >> 4) );
4082  } else if (depth == 2) {
4083  for (k=x*img_n; k >= 4; k-=4, ++in) {
4084  *cur++ = scale * ((*in >> 6) );
4085  *cur++ = scale * ((*in >> 4) & 0x03);
4086  *cur++ = scale * ((*in >> 2) & 0x03);
4087  *cur++ = scale * ((*in ) & 0x03);
4088  }
4089  if (k > 0) *cur++ = scale * ((*in >> 6) );
4090  if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4091  if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4092  } else if (depth == 1) {
4093  for (k=x*img_n; k >= 8; k-=8, ++in) {
4094  *cur++ = scale * ((*in >> 7) );
4095  *cur++ = scale * ((*in >> 6) & 0x01);
4096  *cur++ = scale * ((*in >> 5) & 0x01);
4097  *cur++ = scale * ((*in >> 4) & 0x01);
4098  *cur++ = scale * ((*in >> 3) & 0x01);
4099  *cur++ = scale * ((*in >> 2) & 0x01);
4100  *cur++ = scale * ((*in >> 1) & 0x01);
4101  *cur++ = scale * ((*in ) & 0x01);
4102  }
4103  if (k > 0) *cur++ = scale * ((*in >> 7) );
4104  if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4105  if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4106  if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4107  if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4108  if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4109  if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4110  }
4111  if (img_n != out_n) {
4112  // insert alpha = 255
4113  stbi_uc *curl = a->out + stride*j;
4114  int il;
4115  if (img_n == 1) {
4116  for (il=x-1; il >= 0; --il) {
4117  curl[il*2+1] = 255;
4118  curl[il*2+0] = curl[il];
4119  }
4120  } else {
4121  STBI_ASSERT(img_n == 3);
4122  for (il=x-1; il >= 0; --il) {
4123  curl[il*4+3] = 255;
4124  curl[il*4+2] = curl[il*3+2];
4125  curl[il*4+1] = curl[il*3+1];
4126  curl[il*4+0] = curl[il*3+0];
4127  }
4128  }
4129  }
4130  }
4131  }
4132 
4133  return 1;
4134 }
4135 
4136 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4137 {
4138  stbi_uc *final;
4139  int p;
4140  if (!interlaced)
4141  return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4142 
4143  // de-interlacing
4144  final = (stbi_uc *) stbi__malloc(a->s->img_x * a->s->img_y * out_n);
4145  for (p=0; p < 7; ++p) {
4146  int xorig[] = { 0,4,0,2,0,1,0 };
4147  int yorig[] = { 0,0,4,0,2,0,1 };
4148  int xspc[] = { 8,8,4,4,2,2,1 };
4149  int yspc[] = { 8,8,8,4,4,2,2 };
4150  int i,j,x,y;
4151  // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4152  x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4153  y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4154  if (x && y) {
4155  stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4156  if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4157  STBI_FREE(final);
4158  return 0;
4159  }
4160  for (j=0; j < y; ++j) {
4161  for (i=0; i < x; ++i) {
4162  int out_y = j*yspc[p]+yorig[p];
4163  int out_x = i*xspc[p]+xorig[p];
4164  memcpy(final + out_y*a->s->img_x*out_n + out_x*out_n,
4165  a->out + (j*x+i)*out_n, out_n);
4166  }
4167  }
4168  STBI_FREE(a->out);
4169  image_data += img_len;
4170  image_data_len -= img_len;
4171  }
4172  }
4173  a->out = final;
4174 
4175  return 1;
4176 }
4177 
4178 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4179 {
4180  stbi__context *s = z->s;
4181  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4182  stbi_uc *p = z->out;
4183 
4184  // compute color-based transparency, assuming we've
4185  // already got 255 as the alpha value in the output
4186  STBI_ASSERT(out_n == 2 || out_n == 4);
4187 
4188  if (out_n == 2) {
4189  for (i=0; i < pixel_count; ++i) {
4190  p[1] = (p[0] == tc[0] ? 0 : 255);
4191  p += 2;
4192  }
4193  } else {
4194  for (i=0; i < pixel_count; ++i) {
4195  if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4196  p[3] = 0;
4197  p += 4;
4198  }
4199  }
4200  return 1;
4201 }
4202 
4203 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4204 {
4205  stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4206  stbi_uc *p, *temp_out, *orig = a->out;
4207 
4208  p = (stbi_uc *) stbi__malloc(pixel_count * pal_img_n);
4209  if (p == NULL) return stbi__err("outofmem", "Out of memory");
4210 
4211  // between here and free(out) below, exitting would leak
4212  temp_out = p;
4213 
4214  if (pal_img_n == 3) {
4215  for (i=0; i < pixel_count; ++i) {
4216  int n = orig[i]*4;
4217  p[0] = palette[n ];
4218  p[1] = palette[n+1];
4219  p[2] = palette[n+2];
4220  p += 3;
4221  }
4222  } else {
4223  for (i=0; i < pixel_count; ++i) {
4224  int n = orig[i]*4;
4225  p[0] = palette[n ];
4226  p[1] = palette[n+1];
4227  p[2] = palette[n+2];
4228  p[3] = palette[n+3];
4229  p += 4;
4230  }
4231  }
4232  STBI_FREE(a->out);
4233  a->out = temp_out;
4234 
4235  STBI_NOTUSED(len);
4236 
4237  return 1;
4238 }
4239 
4240 static int stbi__unpremultiply_on_load = 0;
4241 static int stbi__de_iphone_flag = 0;
4242 
4243 STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
4244 {
4245  stbi__unpremultiply_on_load = flag_true_if_should_unpremultiply;
4246 }
4247 
4248 STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
4249 {
4250  stbi__de_iphone_flag = flag_true_if_should_convert;
4251 }
4252 
4253 static void stbi__de_iphone(stbi__png *z)
4254 {
4255  stbi__context *s = z->s;
4256  stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4257  stbi_uc *p = z->out;
4258 
4259  if (s->img_out_n == 3) { // convert bgr to rgb
4260  for (i=0; i < pixel_count; ++i) {
4261  stbi_uc t = p[0];
4262  p[0] = p[2];
4263  p[2] = t;
4264  p += 3;
4265  }
4266  } else {
4267  STBI_ASSERT(s->img_out_n == 4);
4268  if (stbi__unpremultiply_on_load) {
4269  // convert bgr to rgb and unpremultiply
4270  for (i=0; i < pixel_count; ++i) {
4271  stbi_uc a = p[3];
4272  stbi_uc t = p[0];
4273  if (a) {
4274  p[0] = p[2] * 255 / a;
4275  p[1] = p[1] * 255 / a;
4276  p[2] = t * 255 / a;
4277  } else {
4278  p[0] = p[2];
4279  p[2] = t;
4280  }
4281  p += 4;
4282  }
4283  } else {
4284  // convert bgr to rgb
4285  for (i=0; i < pixel_count; ++i) {
4286  stbi_uc t = p[0];
4287  p[0] = p[2];
4288  p[2] = t;
4289  p += 4;
4290  }
4291  }
4292  }
4293 }
4294 
4295 #define STBI__PNG_TYPE(a,b,c,d) (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
4296 
4297 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
4298 {
4299  stbi_uc palette[1024], pal_img_n=0;
4300  stbi_uc has_trans=0, tc[3];
4301  stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
4302  int first=1,k,interlace=0, color=0, depth=0, is_iphone=0;
4303  stbi__context *s = z->s;
4304 
4305  z->expanded = NULL;
4306  z->idata = NULL;
4307  z->out = NULL;
4308 
4309  if (!stbi__check_png_header(s)) return 0;
4310 
4311  if (scan == STBI__SCAN_type) return 1;
4312 
4313  for (;;) {
4314  stbi__pngchunk c = stbi__get_chunk_header(s);
4315  switch (c.type) {
4316  case STBI__PNG_TYPE('C','g','B','I'):
4317  is_iphone = 1;
4318  stbi__skip(s, c.length);
4319  break;
4320  case STBI__PNG_TYPE('I','H','D','R'): {
4321  int comp,filter;
4322  if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
4323  first = 0;
4324  if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
4325  s->img_x = stbi__get32be(s); if (s->img_x > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4326  s->img_y = stbi__get32be(s); if (s->img_y > (1 << 24)) return stbi__err("too large","Very large image (corrupt?)");
4327  depth = stbi__get8(s); if (depth != 1 && depth != 2 && depth != 4 && depth != 8) return stbi__err("1/2/4/8-bit only","PNG not supported: 1/2/4/8-bit only");
4328  color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
4329  if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
4330  comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
4331  filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
4332  interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
4333  if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
4334  if (!pal_img_n) {
4335  s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
4336  if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
4337  if (scan == STBI__SCAN_header) return 1;
4338  } else {
4339  // if paletted, then pal_n is our final components, and
4340  // img_n is # components to decompress/filter.
4341  s->img_n = 1;
4342  if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
4343  // if SCAN_header, have to scan to see if we have a tRNS
4344  }
4345  break;
4346  }
4347 
4348  case STBI__PNG_TYPE('P','L','T','E'): {
4349  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4350  if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
4351  pal_len = c.length / 3;
4352  if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
4353  for (i=0; i < pal_len; ++i) {
4354  palette[i*4+0] = stbi__get8(s);
4355  palette[i*4+1] = stbi__get8(s);
4356  palette[i*4+2] = stbi__get8(s);
4357  palette[i*4+3] = 255;
4358  }
4359  break;
4360  }
4361 
4362  case STBI__PNG_TYPE('t','R','N','S'): {
4363  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4364  if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
4365  if (pal_img_n) {
4366  if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
4367  if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
4368  if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
4369  pal_img_n = 4;
4370  for (i=0; i < c.length; ++i)
4371  palette[i*4+3] = stbi__get8(s);
4372  } else {
4373  if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
4374  if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
4375  has_trans = 1;
4376  for (k=0; k < s->img_n; ++k)
4377  tc[k] = (stbi_uc) (stbi__get16be(s) & 255) * stbi__depth_scale_table[depth]; // non 8-bit images will be larger
4378  }
4379  break;
4380  }
4381 
4382  case STBI__PNG_TYPE('I','D','A','T'): {
4383  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4384  if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
4385  if (scan == STBI__SCAN_header) { s->img_n = pal_img_n; return 1; }
4386  if ((int)(ioff + c.length) < (int)ioff) return 0;
4387  if (ioff + c.length > idata_limit) {
4388  stbi_uc *p;
4389  if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
4390  while (ioff + c.length > idata_limit)
4391  idata_limit *= 2;
4392  p = (stbi_uc *) STBI_REALLOC(z->idata, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
4393  z->idata = p;
4394  }
4395  if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
4396  ioff += c.length;
4397  break;
4398  }
4399 
4400  case STBI__PNG_TYPE('I','E','N','D'): {
4401  stbi__uint32 raw_len, bpl;
4402  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4403  if (scan != STBI__SCAN_load) return 1;
4404  if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
4405  // initial guess for decoded data size to avoid unnecessary reallocs
4406  bpl = (s->img_x * depth + 7) / 8; // bytes per line, per component
4407  raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
4408  z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
4409  if (z->expanded == NULL) return 0; // zlib should set error
4410  STBI_FREE(z->idata); z->idata = NULL;
4411  if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
4412  s->img_out_n = s->img_n+1;
4413  else
4414  s->img_out_n = s->img_n;
4415  if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, depth, color, interlace)) return 0;
4416  if (has_trans)
4417  if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
4418  if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
4419  stbi__de_iphone(z);
4420  if (pal_img_n) {
4421  // pal_img_n == 3 or 4
4422  s->img_n = pal_img_n; // record the actual colors we had
4423  s->img_out_n = pal_img_n;
4424  if (req_comp >= 3) s->img_out_n = req_comp;
4425  if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
4426  return 0;
4427  }
4428  STBI_FREE(z->expanded); z->expanded = NULL;
4429  return 1;
4430  }
4431 
4432  default:
4433  // if critical, fail
4434  if (first) return stbi__err("first not IHDR", "Corrupt PNG");
4435  if ((c.type & (1 << 29)) == 0) {
4436  #ifndef STBI_NO_FAILURE_STRINGS
4437  // not threadsafe
4438  static char invalid_chunk[] = "XXXX PNG chunk not known";
4439  invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
4440  invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
4441  invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
4442  invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
4443  #endif
4444  return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
4445  }
4446  stbi__skip(s, c.length);
4447  break;
4448  }
4449  // end of PNG chunk, read and skip CRC
4450  stbi__get32be(s);
4451  }
4452 }
4453 
4454 static unsigned char *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp)
4455 {
4456  unsigned char *result=NULL;
4457  if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
4458  if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
4459  result = p->out;
4460  p->out = NULL;
4461  if (req_comp && req_comp != p->s->img_out_n) {
4462  result = stbi__convert_format(result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
4463  p->s->img_out_n = req_comp;
4464  if (result == NULL) return result;
4465  }
4466  *x = p->s->img_x;
4467  *y = p->s->img_y;
4468  if (n) *n = p->s->img_out_n;
4469  }
4470  STBI_FREE(p->out); p->out = NULL;
4471  STBI_FREE(p->expanded); p->expanded = NULL;
4472  STBI_FREE(p->idata); p->idata = NULL;
4473 
4474  return result;
4475 }
4476 
4477 static unsigned char *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4478 {
4479  stbi__png p;
4480  p.s = s;
4481  return stbi__do_png(&p, x,y,comp,req_comp);
4482 }
4483 
4484 static int stbi__png_test(stbi__context *s)
4485 {
4486  int r;
4487  r = stbi__check_png_header(s);
4488  stbi__rewind(s);
4489  return r;
4490 }
4491 
4492 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
4493 {
4494  if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
4495  stbi__rewind( p->s );
4496  return 0;
4497  }
4498  if (x) *x = p->s->img_x;
4499  if (y) *y = p->s->img_y;
4500  if (comp) *comp = p->s->img_n;
4501  return 1;
4502 }
4503 
4504 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
4505 {
4506  stbi__png p;
4507  p.s = s;
4508  return stbi__png_info_raw(&p, x, y, comp);
4509 }
4510 #endif
4511 
4512 // Microsoft/Windows BMP image
4513 
4514 #ifndef STBI_NO_BMP
4515 static int stbi__bmp_test_raw(stbi__context *s)
4516 {
4517  int r;
4518  int sz;
4519  if (stbi__get8(s) != 'B') return 0;
4520  if (stbi__get8(s) != 'M') return 0;
4521  stbi__get32le(s); // discard filesize
4522  stbi__get16le(s); // discard reserved
4523  stbi__get16le(s); // discard reserved
4524  stbi__get32le(s); // discard data offset
4525  sz = stbi__get32le(s);
4526  r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
4527  return r;
4528 }
4529 
4530 static int stbi__bmp_test(stbi__context *s)
4531 {
4532  int r = stbi__bmp_test_raw(s);
4533  stbi__rewind(s);
4534  return r;
4535 }
4536 
4537 
4538 // returns 0..31 for the highest set bit
4539 static int stbi__high_bit(unsigned int z)
4540 {
4541  int n=0;
4542  if (z == 0) return -1;
4543  if (z >= 0x10000) n += 16, z >>= 16;
4544  if (z >= 0x00100) n += 8, z >>= 8;
4545  if (z >= 0x00010) n += 4, z >>= 4;
4546  if (z >= 0x00004) n += 2, z >>= 2;
4547  if (z >= 0x00002) n += 1, z >>= 1;
4548  return n;
4549 }
4550 
4551 static int stbi__bitcount(unsigned int a)
4552 {
4553  a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
4554  a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
4555  a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
4556  a = (a + (a >> 8)); // max 16 per 8 bits
4557  a = (a + (a >> 16)); // max 32 per 8 bits
4558  return a & 0xff;
4559 }
4560 
4561 static int stbi__shiftsigned(int v, int shift, int bits)
4562 {
4563  int result;
4564  int z=0;
4565 
4566  if (shift < 0) v <<= -shift;
4567  else v >>= shift;
4568  result = v;
4569 
4570  z = bits;
4571  while (z < 8) {
4572  result += v >> z;
4573  z += bits;
4574  }
4575  return result;
4576 }
4577 
4578 static stbi_uc *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4579 {
4580  stbi_uc *out;
4581  unsigned int mr=0,mg=0,mb=0,ma=0, fake_a=0;
4582  stbi_uc pal[256][4];
4583  int psize=0,i,j,compress=0,width;
4584  int bpp, flip_vertically, pad, target, offset, hsz;
4585  if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
4586  stbi__get32le(s); // discard filesize
4587  stbi__get16le(s); // discard reserved
4588  stbi__get16le(s); // discard reserved
4589  offset = stbi__get32le(s);
4590  hsz = stbi__get32le(s);
4591  if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
4592  if (hsz == 12) {
4593  s->img_x = stbi__get16le(s);
4594  s->img_y = stbi__get16le(s);
4595  } else {
4596  s->img_x = stbi__get32le(s);
4597  s->img_y = stbi__get32le(s);
4598  }
4599  if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
4600  bpp = stbi__get16le(s);
4601  if (bpp == 1) return stbi__errpuc("monochrome", "BMP type not supported: 1-bit");
4602  flip_vertically = ((int) s->img_y) > 0;
4603  s->img_y = abs((int) s->img_y);
4604  if (hsz == 12) {
4605  if (bpp < 24)
4606  psize = (offset - 14 - 24) / 3;
4607  } else {
4608  compress = stbi__get32le(s);
4609  if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
4610  stbi__get32le(s); // discard sizeof
4611  stbi__get32le(s); // discard hres
4612  stbi__get32le(s); // discard vres
4613  stbi__get32le(s); // discard colorsused
4614  stbi__get32le(s); // discard max important
4615  if (hsz == 40 || hsz == 56) {
4616  if (hsz == 56) {
4617  stbi__get32le(s);
4618  stbi__get32le(s);
4619  stbi__get32le(s);
4620  stbi__get32le(s);
4621  }
4622  if (bpp == 16 || bpp == 32) {
4623  mr = mg = mb = 0;
4624  if (compress == 0) {
4625  if (bpp == 32) {
4626  mr = 0xffu << 16;
4627  mg = 0xffu << 8;
4628  mb = 0xffu << 0;
4629  ma = 0xffu << 24;
4630  fake_a = 1; // @TODO: check for cases like alpha value is all 0 and switch it to 255
4631  STBI_NOTUSED(fake_a);
4632  } else {
4633  mr = 31u << 10;
4634  mg = 31u << 5;
4635  mb = 31u << 0;
4636  }
4637  } else if (compress == 3) {
4638  mr = stbi__get32le(s);
4639  mg = stbi__get32le(s);
4640  mb = stbi__get32le(s);
4641  // not documented, but generated by photoshop and handled by mspaint
4642  if (mr == mg && mg == mb) {
4643  // ?!?!?
4644  return stbi__errpuc("bad BMP", "bad BMP");
4645  }
4646  } else
4647  return stbi__errpuc("bad BMP", "bad BMP");
4648  }
4649  } else {
4650  STBI_ASSERT(hsz == 108 || hsz == 124);
4651  mr = stbi__get32le(s);
4652  mg = stbi__get32le(s);
4653  mb = stbi__get32le(s);
4654  ma = stbi__get32le(s);
4655  stbi__get32le(s); // discard color space
4656  for (i=0; i < 12; ++i)
4657  stbi__get32le(s); // discard color space parameters
4658  if (hsz == 124) {
4659  stbi__get32le(s); // discard rendering intent
4660  stbi__get32le(s); // discard offset of profile data
4661  stbi__get32le(s); // discard size of profile data
4662  stbi__get32le(s); // discard reserved
4663  }
4664  }
4665  if (bpp < 16)
4666  psize = (offset - 14 - hsz) >> 2;
4667  }
4668  s->img_n = ma ? 4 : 3;
4669  if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
4670  target = req_comp;
4671  else
4672  target = s->img_n; // if they want monochrome, we'll post-convert
4673  out = (stbi_uc *) stbi__malloc(target * s->img_x * s->img_y);
4674  if (!out) return stbi__errpuc("outofmem", "Out of memory");
4675  if (bpp < 16) {
4676  int z=0;
4677  if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
4678  for (i=0; i < psize; ++i) {
4679  pal[i][2] = stbi__get8(s);
4680  pal[i][1] = stbi__get8(s);
4681  pal[i][0] = stbi__get8(s);
4682  if (hsz != 12) stbi__get8(s);
4683  pal[i][3] = 255;
4684  }
4685  stbi__skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
4686  if (bpp == 4) width = (s->img_x + 1) >> 1;
4687  else if (bpp == 8) width = s->img_x;
4688  else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
4689  pad = (-width)&3;
4690  for (j=0; j < (int) s->img_y; ++j) {
4691  for (i=0; i < (int) s->img_x; i += 2) {
4692  int v=stbi__get8(s),v2=0;
4693  if (bpp == 4) {
4694  v2 = v & 15;
4695  v >>= 4;
4696  }
4697  out[z++] = pal[v][0];
4698  out[z++] = pal[v][1];
4699  out[z++] = pal[v][2];
4700  if (target == 4) out[z++] = 255;
4701  if (i+1 == (int) s->img_x) break;
4702  v = (bpp == 8) ? stbi__get8(s) : v2;
4703  out[z++] = pal[v][0];
4704  out[z++] = pal[v][1];
4705  out[z++] = pal[v][2];
4706  if (target == 4) out[z++] = 255;
4707  }
4708  stbi__skip(s, pad);
4709  }
4710  } else {
4711  int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
4712  int z = 0;
4713  int easy=0;
4714  stbi__skip(s, offset - 14 - hsz);
4715  if (bpp == 24) width = 3 * s->img_x;
4716  else if (bpp == 16) width = 2*s->img_x;
4717  else /* bpp = 32 and pad = 0 */ width=0;
4718  pad = (-width) & 3;
4719  if (bpp == 24) {
4720  easy = 1;
4721  } else if (bpp == 32) {
4722  if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
4723  easy = 2;
4724  }
4725  if (!easy) {
4726  if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
4727  // right shift amt to put high bit in position #7
4728  rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
4729  gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
4730  bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
4731  ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
4732  }
4733  for (j=0; j < (int) s->img_y; ++j) {
4734  if (easy) {
4735  for (i=0; i < (int) s->img_x; ++i) {
4736  unsigned char a;
4737  out[z+2] = stbi__get8(s);
4738  out[z+1] = stbi__get8(s);
4739  out[z+0] = stbi__get8(s);
4740  z += 3;
4741  a = (easy == 2 ? stbi__get8(s) : 255);
4742  if (target == 4) out[z++] = a;
4743  }
4744  } else {
4745  for (i=0; i < (int) s->img_x; ++i) {
4746  stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
4747  int a;
4748  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
4749  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
4750  out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
4751  a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
4752  if (target == 4) out[z++] = STBI__BYTECAST(a);
4753  }
4754  }
4755  stbi__skip(s, pad);
4756  }
4757  }
4758  if (flip_vertically) {
4759  stbi_uc t;
4760  for (j=0; j < (int) s->img_y>>1; ++j) {
4761  stbi_uc *p1 = out + j *s->img_x*target;
4762  stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
4763  for (i=0; i < (int) s->img_x*target; ++i) {
4764  t = p1[i], p1[i] = p2[i], p2[i] = t;
4765  }
4766  }
4767  }
4768 
4769  if (req_comp && req_comp != target) {
4770  out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
4771  if (out == NULL) return out; // stbi__convert_format frees input on failure
4772  }
4773 
4774  *x = s->img_x;
4775  *y = s->img_y;
4776  if (comp) *comp = s->img_n;
4777  return out;
4778 }
4779 #endif
4780 
4781 // Targa Truevision - TGA
4782 // by Jonathan Dummer
4783 #ifndef STBI_NO_TGA
4784 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
4785 {
4786  int tga_w, tga_h, tga_comp;
4787  int sz;
4788  stbi__get8(s); // discard Offset
4789  sz = stbi__get8(s); // color type
4790  if( sz > 1 ) {
4791  stbi__rewind(s);
4792  return 0; // only RGB or indexed allowed
4793  }
4794  sz = stbi__get8(s); // image type
4795  // only RGB or grey allowed, +/- RLE
4796  if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
4797  stbi__skip(s,9);
4798  tga_w = stbi__get16le(s);
4799  if( tga_w < 1 ) {
4800  stbi__rewind(s);
4801  return 0; // test width
4802  }
4803  tga_h = stbi__get16le(s);
4804  if( tga_h < 1 ) {
4805  stbi__rewind(s);
4806  return 0; // test height
4807  }
4808  sz = stbi__get8(s); // bits per pixel
4809  // only RGB or RGBA or grey allowed
4810  if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) {
4811  stbi__rewind(s);
4812  return 0;
4813  }
4814  tga_comp = sz;
4815  if (x) *x = tga_w;
4816  if (y) *y = tga_h;
4817  if (comp) *comp = tga_comp / 8;
4818  return 1; // seems to have passed everything
4819 }
4820 
4821 static int stbi__tga_test(stbi__context *s)
4822 {
4823  int res;
4824  int sz;
4825  stbi__get8(s); // discard Offset
4826  sz = stbi__get8(s); // color type
4827  if ( sz > 1 ) return 0; // only RGB or indexed allowed
4828  sz = stbi__get8(s); // image type
4829  if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0; // only RGB or grey allowed, +/- RLE
4830  stbi__get16be(s); // discard palette start
4831  stbi__get16be(s); // discard palette length
4832  stbi__get8(s); // discard bits per palette color entry
4833  stbi__get16be(s); // discard x origin
4834  stbi__get16be(s); // discard y origin
4835  if ( stbi__get16be(s) < 1 ) return 0; // test width
4836  if ( stbi__get16be(s) < 1 ) return 0; // test height
4837  sz = stbi__get8(s); // bits per pixel
4838  if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) )
4839  res = 0;
4840  else
4841  res = 1;
4842  stbi__rewind(s);
4843  return res;
4844 }
4845 
4846 static stbi_uc *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
4847 {
4848  // read in the TGA header stuff
4849  int tga_offset = stbi__get8(s);
4850  int tga_indexed = stbi__get8(s);
4851  int tga_image_type = stbi__get8(s);
4852  int tga_is_RLE = 0;
4853  int tga_palette_start = stbi__get16le(s);
4854  int tga_palette_len = stbi__get16le(s);
4855  int tga_palette_bits = stbi__get8(s);
4856  int tga_x_origin = stbi__get16le(s);
4857  int tga_y_origin = stbi__get16le(s);
4858  int tga_width = stbi__get16le(s);
4859  int tga_height = stbi__get16le(s);
4860  int tga_bits_per_pixel = stbi__get8(s);
4861  int tga_comp = tga_bits_per_pixel / 8;
4862  int tga_inverted = stbi__get8(s);
4863  // image data
4864  unsigned char *tga_data;
4865  unsigned char *tga_palette = NULL;
4866  int i, j;
4867  unsigned char raw_data[4];
4868  int RLE_count = 0;
4869  int RLE_repeating = 0;
4870  int read_next_pixel = 1;
4871 
4872  // do a tiny bit of precessing
4873  if ( tga_image_type >= 8 )
4874  {
4875  tga_image_type -= 8;
4876  tga_is_RLE = 1;
4877  }
4878  /* int tga_alpha_bits = tga_inverted & 15; */
4879  tga_inverted = 1 - ((tga_inverted >> 5) & 1);
4880 
4881  // error check
4882  if ( //(tga_indexed) ||
4883  (tga_width < 1) || (tga_height < 1) ||
4884  (tga_image_type < 1) || (tga_image_type > 3) ||
4885  ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
4886  (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
4887  )
4888  {
4889  return NULL; // we don't report this as a bad TGA because we don't even know if it's TGA
4890  }
4891 
4892  // If I'm paletted, then I'll use the number of bits from the palette
4893  if ( tga_indexed )
4894  {
4895  tga_comp = tga_palette_bits / 8;
4896  }
4897 
4898  // tga info
4899  *x = tga_width;
4900  *y = tga_height;
4901  if (comp) *comp = tga_comp;
4902 
4903  tga_data = (unsigned char*)stbi__malloc( (size_t)tga_width * tga_height * tga_comp );
4904  if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
4905 
4906  // skip to the data's starting position (offset usually = 0)
4907  stbi__skip(s, tga_offset );
4908 
4909  if ( !tga_indexed && !tga_is_RLE) {
4910  for (i=0; i < tga_height; ++i) {
4911  int yl = tga_inverted ? tga_height -i - 1 : i;
4912  stbi_uc *tga_row = tga_data + yl*tga_width*tga_comp;
4913  stbi__getn(s, tga_row, tga_width * tga_comp);
4914  }
4915  } else {
4916  // do I need to load a palette?
4917  if ( tga_indexed)
4918  {
4919  // any data to skip? (offset usually = 0)
4920  stbi__skip(s, tga_palette_start );
4921  // load the palette
4922  tga_palette = (unsigned char*)stbi__malloc( tga_palette_len * tga_palette_bits / 8 );
4923  if (!tga_palette) {
4924  STBI_FREE(tga_data);
4925  return stbi__errpuc("outofmem", "Out of memory");
4926  }
4927  if (!stbi__getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 )) {
4928  STBI_FREE(tga_data);
4929  STBI_FREE(tga_palette);
4930  return stbi__errpuc("bad palette", "Corrupt TGA");
4931  }
4932  }
4933  // load the data
4934  for (i=0; i < tga_width * tga_height; ++i)
4935  {
4936  // if I'm in RLE mode, do I need to get a RLE stbi__pngchunk?
4937  if ( tga_is_RLE )
4938  {
4939  if ( RLE_count == 0 )
4940  {
4941  // yep, get the next byte as a RLE command
4942  int RLE_cmd = stbi__get8(s);
4943  RLE_count = 1 + (RLE_cmd & 127);
4944  RLE_repeating = RLE_cmd >> 7;
4945  read_next_pixel = 1;
4946  } else if ( !RLE_repeating )
4947  {
4948  read_next_pixel = 1;
4949  }
4950  } else
4951  {
4952  read_next_pixel = 1;
4953  }
4954  // OK, if I need to read a pixel, do it now
4955  if ( read_next_pixel )
4956  {
4957  // load however much data we did have
4958  if ( tga_indexed )
4959  {
4960  // read in 1 byte, then perform the lookup
4961  int pal_idx = stbi__get8(s);
4962  if ( pal_idx >= tga_palette_len )
4963  {
4964  // invalid index
4965  pal_idx = 0;
4966  }
4967  pal_idx *= tga_bits_per_pixel / 8;
4968  for (j = 0; j*8 < tga_bits_per_pixel; ++j)
4969  {
4970  raw_data[j] = tga_palette[pal_idx+j];
4971  }
4972  } else
4973  {
4974  // read in the data raw
4975  for (j = 0; j*8 < tga_bits_per_pixel; ++j)
4976  {
4977  raw_data[j] = stbi__get8(s);
4978  }
4979  }
4980  // clear the reading flag for the next pixel
4981  read_next_pixel = 0;
4982  } // end of reading a pixel
4983 
4984  // copy data
4985  for (j = 0; j < tga_comp; ++j)
4986  tga_data[i*tga_comp+j] = raw_data[j];
4987 
4988  // in case we're in RLE mode, keep counting down
4989  --RLE_count;
4990  }
4991  // do I need to invert the image?
4992  if ( tga_inverted )
4993  {
4994  for (j = 0; j*2 < tga_height; ++j)
4995  {
4996  int index1 = j * tga_width * tga_comp;
4997  int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
4998  for (i = tga_width * tga_comp; i > 0; --i)
4999  {
5000  unsigned char temp = tga_data[index1];
5001  tga_data[index1] = tga_data[index2];
5002  tga_data[index2] = temp;
5003  ++index1;
5004  ++index2;
5005  }
5006  }
5007  }
5008  // clear my palette, if I had one
5009  if ( tga_palette != NULL )
5010  {
5011  STBI_FREE( tga_palette );
5012  }
5013  }
5014 
5015  // swap RGB
5016  if (tga_comp >= 3)
5017  {
5018  unsigned char* tga_pixel = tga_data;
5019  for (i=0; i < tga_width * tga_height; ++i)
5020  {
5021  unsigned char temp = tga_pixel[0];
5022  tga_pixel[0] = tga_pixel[2];
5023  tga_pixel[2] = temp;
5024  tga_pixel += tga_comp;
5025  }
5026  }
5027 
5028  // convert to target component count
5029  if (req_comp && req_comp != tga_comp)
5030  tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
5031 
5032  // the things I do to get rid of an error message, and yet keep
5033  // Microsoft's C compilers happy... [8^(
5034  tga_palette_start = tga_palette_len = tga_palette_bits =
5035  tga_x_origin = tga_y_origin = 0;
5036  // OK, done
5037  return tga_data;
5038 }
5039 #endif
5040 
5041 // *************************************************************************************************
5042 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
5043 
5044 #ifndef STBI_NO_PSD
5045 static int stbi__psd_test(stbi__context *s)
5046 {
5047  int r = (stbi__get32be(s) == 0x38425053);
5048  stbi__rewind(s);
5049  return r;
5050 }
5051 
5052 static stbi_uc *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5053 {
5054  int pixelCount;
5055  int channelCount, compression;
5056  int channel, i, count, len;
5057  int w,h;
5058  stbi_uc *out;
5059 
5060  // Check identifier
5061  if (stbi__get32be(s) != 0x38425053) // "8BPS"
5062  return stbi__errpuc("not PSD", "Corrupt PSD image");
5063 
5064  // Check file type version.
5065  if (stbi__get16be(s) != 1)
5066  return stbi__errpuc("wrong version", "Unsupported version of PSD image");
5067 
5068  // Skip 6 reserved bytes.
5069  stbi__skip(s, 6 );
5070 
5071  // Read the number of channels (R, G, B, A, etc).
5072  channelCount = stbi__get16be(s);
5073  if (channelCount < 0 || channelCount > 16)
5074  return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
5075 
5076  // Read the rows and columns of the image.
5077  h = stbi__get32be(s);
5078  w = stbi__get32be(s);
5079 
5080  // Make sure the depth is 8 bits.
5081  if (stbi__get16be(s) != 8)
5082  return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 bit");
5083 
5084  // Make sure the color mode is RGB.
5085  // Valid options are:
5086  // 0: Bitmap
5087  // 1: Grayscale
5088  // 2: Indexed color
5089  // 3: RGB color
5090  // 4: CMYK color
5091  // 7: Multichannel
5092  // 8: Duotone
5093  // 9: Lab color
5094  if (stbi__get16be(s) != 3)
5095  return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
5096 
5097  // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
5098  stbi__skip(s,stbi__get32be(s) );
5099 
5100  // Skip the image resources. (resolution, pen tool paths, etc)
5101  stbi__skip(s, stbi__get32be(s) );
5102 
5103  // Skip the reserved data.
5104  stbi__skip(s, stbi__get32be(s) );
5105 
5106  // Find out if the data is compressed.
5107  // Known values:
5108  // 0: no compression
5109  // 1: RLE compressed
5110  compression = stbi__get16be(s);
5111  if (compression > 1)
5112  return stbi__errpuc("bad compression", "PSD has an unknown compression format");
5113 
5114  // Create the destination image.
5115  out = (stbi_uc *) stbi__malloc(4 * w*h);
5116  if (!out) return stbi__errpuc("outofmem", "Out of memory");
5117  pixelCount = w*h;
5118 
5119  // Initialize the data to zero.
5120  //memset( out, 0, pixelCount * 4 );
5121 
5122  // Finally, the image data.
5123  if (compression) {
5124  // RLE as used by .PSD and .TIFF
5125  // Loop until you get the number of unpacked bytes you are expecting:
5126  // Read the next source byte into n.
5127  // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
5128  // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
5129  // Else if n is 128, noop.
5130  // Endloop
5131 
5132  // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
5133  // which we're going to just skip.
5134  stbi__skip(s, h * channelCount * 2 );
5135 
5136  // Read the RLE data by channel.
5137  for (channel = 0; channel < 4; channel++) {
5138  stbi_uc *p;
5139 
5140  p = out+channel;
5141  if (channel >= channelCount) {
5142  // Fill this channel with default data.
5143  for (i = 0; i < pixelCount; i++, p += 4)
5144  *p = (channel == 3 ? 255 : 0);
5145  } else {
5146  // Read the RLE data.
5147  count = 0;
5148  while (count < pixelCount) {
5149  len = stbi__get8(s);
5150  if (len == 128) {
5151  // No-op.
5152  } else if (len < 128) {
5153  // Copy next len+1 bytes literally.
5154  len++;
5155  count += len;
5156  while (len) {
5157  *p = stbi__get8(s);
5158  p += 4;
5159  len--;
5160  }
5161  } else if (len > 128) {
5162  stbi_uc val;
5163  // Next -len+1 bytes in the dest are replicated from next source byte.
5164  // (Interpret len as a negative 8-bit int.)
5165  len ^= 0x0FF;
5166  len += 2;
5167  val = stbi__get8(s);
5168  count += len;
5169  while (len) {
5170  *p = val;
5171  p += 4;
5172  len--;
5173  }
5174  }
5175  }
5176  }
5177  }
5178 
5179  } else {
5180  // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
5181  // where each channel consists of an 8-bit value for each pixel in the image.
5182 
5183  // Read the data by channel.
5184  for (channel = 0; channel < 4; channel++) {
5185  stbi_uc *p;
5186 
5187  p = out + channel;
5188  if (channel > channelCount) {
5189  // Fill this channel with default data.
5190  for (i = 0; i < pixelCount; i++, p += 4)
5191  *p = channel == 3 ? 255 : 0;
5192  } else {
5193  // Read the data.
5194  for (i = 0; i < pixelCount; i++, p += 4)
5195  *p = stbi__get8(s);
5196  }
5197  }
5198  }
5199 
5200  if (req_comp && req_comp != 4) {
5201  out = stbi__convert_format(out, 4, req_comp, w, h);
5202  if (out == NULL) return out; // stbi__convert_format frees input on failure
5203  }
5204 
5205  if (comp) *comp = 4;
5206  *y = h;
5207  *x = w;
5208 
5209  return out;
5210 }
5211 #endif
5212 
5213 // *************************************************************************************************
5214 // Softimage PIC loader
5215 // by Tom Seddon
5216 //
5217 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
5218 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
5219 
5220 #ifndef STBI_NO_PIC
5221 static int stbi__pic_is4(stbi__context *s,const char *str)
5222 {
5223  int i;
5224  for (i=0; i<4; ++i)
5225  if (stbi__get8(s) != (stbi_uc)str[i])
5226  return 0;
5227 
5228  return 1;
5229 }
5230 
5231 static int stbi__pic_test_core(stbi__context *s)
5232 {
5233  int i;
5234 
5235  if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
5236  return 0;
5237 
5238  for(i=0;i<84;++i)
5239  stbi__get8(s);
5240 
5241  if (!stbi__pic_is4(s,"PICT"))
5242  return 0;
5243 
5244  return 1;
5245 }
5246 
5247 typedef struct
5248 {
5249  stbi_uc size,type,channel;
5250 } stbi__pic_packet;
5251 
5252 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
5253 {
5254  int mask=0x80, i;
5255 
5256  for (i=0; i<4; ++i, mask>>=1) {
5257  if (channel & mask) {
5258  if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
5259  dest[i]=stbi__get8(s);
5260  }
5261  }
5262 
5263  return dest;
5264 }
5265 
5266 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
5267 {
5268  int mask=0x80,i;
5269 
5270  for (i=0;i<4; ++i, mask>>=1)
5271  if (channel&mask)
5272  dest[i]=src[i];
5273 }
5274 
5275 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
5276 {
5277  int act_comp=0,num_packets=0,y,chained;
5278  stbi__pic_packet packets[10];
5279 
5280  // this will (should...) cater for even some bizarre stuff like having data
5281  // for the same channel in multiple packets.
5282  do {
5283  stbi__pic_packet *packet;
5284 
5285  if (num_packets==sizeof(packets)/sizeof(packets[0]))
5286  return stbi__errpuc("bad format","too many packets");
5287 
5288  packet = &packets[num_packets++];
5289 
5290  chained = stbi__get8(s);
5291  packet->size = stbi__get8(s);
5292  packet->type = stbi__get8(s);
5293  packet->channel = stbi__get8(s);
5294 
5295  act_comp |= packet->channel;
5296 
5297  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
5298  if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
5299  } while (chained);
5300 
5301  *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
5302 
5303  for(y=0; y<height; ++y) {
5304  int packet_idx;
5305 
5306  for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
5307  stbi__pic_packet *packet = &packets[packet_idx];
5308  stbi_uc *dest = result+y*width*4;
5309 
5310  switch (packet->type) {
5311  default:
5312  return stbi__errpuc("bad format","packet has bad compression type");
5313 
5314  case 0: {//uncompressed
5315  int x;
5316 
5317  for(x=0;x<width;++x, dest+=4)
5318  if (!stbi__readval(s,packet->channel,dest))
5319  return 0;
5320  break;
5321  }
5322 
5323  case 1://Pure RLE
5324  {
5325  int left=width, i;
5326 
5327  while (left>0) {
5328  stbi_uc count,value[4];
5329 
5330  count=stbi__get8(s);
5331  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
5332 
5333  if (count > left)
5334  count = (stbi_uc) left;
5335 
5336  if (!stbi__readval(s,packet->channel,value)) return 0;
5337 
5338  for(i=0; i<count; ++i,dest+=4)
5339  stbi__copyval(packet->channel,dest,value);
5340  left -= count;
5341  }
5342  }
5343  break;
5344 
5345  case 2: {//Mixed RLE
5346  int left=width;
5347  while (left>0) {
5348  int count = stbi__get8(s), i;
5349  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
5350 
5351  if (count >= 128) { // Repeated
5352  stbi_uc value[4];
5353  int il;
5354 
5355  if (count==128)
5356  count = stbi__get16be(s);
5357  else
5358  count -= 127;
5359  if (count > left)
5360  return stbi__errpuc("bad file","scanline overrun");
5361 
5362  if (!stbi__readval(s,packet->channel,value))
5363  return 0;
5364 
5365  for(il=0;il<count;++il, dest += 4)
5366  stbi__copyval(packet->channel,dest,value);
5367  } else { // Raw
5368  ++count;
5369  if (count>left) return stbi__errpuc("bad file","scanline overrun");
5370 
5371  for(i=0;i<count;++i, dest+=4)
5372  if (!stbi__readval(s,packet->channel,dest))
5373  return 0;
5374  }
5375  left-=count;
5376  }
5377  break;
5378  }
5379  }
5380  }
5381  }
5382 
5383  return result;
5384 }
5385 
5386 static stbi_uc *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp)
5387 {
5388  stbi_uc *result;
5389  int i, x,y;
5390 
5391  for (i=0; i<92; ++i)
5392  stbi__get8(s);
5393 
5394  x = stbi__get16be(s);
5395  y = stbi__get16be(s);
5396  if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
5397  if ((1 << 28) / x < y) return stbi__errpuc("too large", "Image too large to decode");
5398 
5399  stbi__get32be(s); //skip `ratio'
5400  stbi__get16be(s); //skip `fields'
5401  stbi__get16be(s); //skip `pad'
5402 
5403  // intermediate buffer is RGBA
5404  result = (stbi_uc *) stbi__malloc(x*y*4);
5405  memset(result, 0xff, x*y*4);
5406 
5407  if (!stbi__pic_load_core(s,x,y,comp, result)) {
5408  STBI_FREE(result);
5409  result=0;
5410  }
5411  *px = x;
5412  *py = y;
5413  if (req_comp == 0) req_comp = *comp;
5414  result=stbi__convert_format(result,4,req_comp,x,y);
5415 
5416  return result;
5417 }
5418 
5419 static int stbi__pic_test(stbi__context *s)
5420 {
5421  int r = stbi__pic_test_core(s);
5422  stbi__rewind(s);
5423  return r;
5424 }
5425 #endif
5426 
5427 // *************************************************************************************************
5428 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
5429 
5430 #ifndef STBI_NO_GIF
5431 typedef struct
5432 {
5433  stbi__int16 prefix;
5434  stbi_uc first;
5435  stbi_uc suffix;
5436 } stbi__gif_lzw;
5437 
5438 typedef struct
5439 {
5440  int w,h;
5441  stbi_uc *out; // output buffer (always 4 components)
5442  int flags, bgindex, ratio, transparent, eflags;
5443  stbi_uc pal[256][4];
5444  stbi_uc lpal[256][4];
5445  stbi__gif_lzw codes[4096];
5446  stbi_uc *color_table;
5447  int parse, step;
5448  int lflags;
5449  int start_x, start_y;
5450  int max_x, max_y;
5451  int cur_x, cur_y;
5452  int line_size;
5453 } stbi__gif;
5454 
5455 static int stbi__gif_test_raw(stbi__context *s)
5456 {
5457  int sz;
5458  if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
5459  sz = stbi__get8(s);
5460  if (sz != '9' && sz != '7') return 0;
5461  if (stbi__get8(s) != 'a') return 0;
5462  return 1;
5463 }
5464 
5465 static int stbi__gif_test(stbi__context *s)
5466 {
5467  int r = stbi__gif_test_raw(s);
5468  stbi__rewind(s);
5469  return r;
5470 }
5471 
5472 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
5473 {
5474  int i;
5475  for (i=0; i < num_entries; ++i) {
5476  pal[i][2] = stbi__get8(s);
5477  pal[i][1] = stbi__get8(s);
5478  pal[i][0] = stbi__get8(s);
5479  pal[i][3] = transp == i ? 0 : 255;
5480  }
5481 }
5482 
5483 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
5484 {
5485  stbi_uc version;
5486  if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
5487  return stbi__err("not GIF", "Corrupt GIF");
5488 
5489  version = stbi__get8(s);
5490  if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
5491  if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
5492 
5493  stbi__g_failure_reason = "";
5494  g->w = stbi__get16le(s);
5495  g->h = stbi__get16le(s);
5496  g->flags = stbi__get8(s);
5497  g->bgindex = stbi__get8(s);
5498  g->ratio = stbi__get8(s);
5499  g->transparent = -1;
5500 
5501  if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
5502 
5503  if (is_info) return 1;
5504 
5505  if (g->flags & 0x80)
5506  stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
5507 
5508  return 1;
5509 }
5510 
5511 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
5512 {
5513  stbi__gif g;
5514  if (!stbi__gif_header(s, &g, comp, 1)) {
5515  stbi__rewind( s );
5516  return 0;
5517  }
5518  if (x) *x = g.w;
5519  if (y) *y = g.h;
5520  return 1;
5521 }
5522 
5523 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
5524 {
5525  stbi_uc *p, *c;
5526 
5527  // recurse to decode the prefixes, since the linked-list is backwards,
5528  // and working backwards through an interleaved image would be nasty
5529  if (g->codes[code].prefix >= 0)
5530  stbi__out_gif_code(g, g->codes[code].prefix);
5531 
5532  if (g->cur_y >= g->max_y) return;
5533 
5534  p = &g->out[g->cur_x + g->cur_y];
5535  c = &g->color_table[g->codes[code].suffix * 4];
5536 
5537  if (c[3] >= 128) {
5538  p[0] = c[2];
5539  p[1] = c[1];
5540  p[2] = c[0];
5541  p[3] = c[3];
5542  }
5543  g->cur_x += 4;
5544 
5545  if (g->cur_x >= g->max_x) {
5546  g->cur_x = g->start_x;
5547  g->cur_y += g->step;
5548 
5549  while (g->cur_y >= g->max_y && g->parse > 0) {
5550  g->step = (1 << g->parse) * g->line_size;
5551  g->cur_y = g->start_y + (g->step >> 1);
5552  --g->parse;
5553  }
5554  }
5555 }
5556 
5557 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
5558 {
5559  stbi_uc lzw_cs;
5560  stbi__int32 len, code;
5561  stbi__uint32 first;
5562  stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
5563  stbi__gif_lzw *p;
5564 
5565  lzw_cs = stbi__get8(s);
5566  if (lzw_cs > 12) return NULL;
5567  clear = 1 << lzw_cs;
5568  first = 1;
5569  codesize = lzw_cs + 1;
5570  codemask = (1 << codesize) - 1;
5571  bits = 0;
5572  valid_bits = 0;
5573  for (code = 0; code < clear; code++) {
5574  g->codes[code].prefix = -1;
5575  g->codes[code].first = (stbi_uc) code;
5576  g->codes[code].suffix = (stbi_uc) code;
5577  }
5578 
5579  // support no starting clear code
5580  avail = clear+2;
5581  oldcode = -1;
5582 
5583  len = 0;
5584  for(;;) {
5585  if (valid_bits < codesize) {
5586  if (len == 0) {
5587  len = stbi__get8(s); // start new block
5588  if (len == 0)
5589  return g->out;
5590  }
5591  --len;
5592  bits |= (stbi__int32) stbi__get8(s) << valid_bits;
5593  valid_bits += 8;
5594  } else {
5595  stbi__int32 codel = bits & codemask;
5596  bits >>= codesize;
5597  valid_bits -= codesize;
5598  // @OPTIMIZE: is there some way we can accelerate the non-clear path?
5599  if (codel == clear) { // clear code
5600  codesize = lzw_cs + 1;
5601  codemask = (1 << codesize) - 1;
5602  avail = clear + 2;
5603  oldcode = -1;
5604  first = 0;
5605  } else if (codel == clear + 1) { // end of stream code
5606  stbi__skip(s, len);
5607  while ((len = stbi__get8(s)) > 0)
5608  stbi__skip(s,len);
5609  return g->out;
5610  } else if (codel <= avail) {
5611  if (first) return stbi__errpuc("no clear code", "Corrupt GIF");
5612 
5613  if (oldcode >= 0) {
5614  p = &g->codes[avail++];
5615  if (avail > 4096) return stbi__errpuc("too many codes", "Corrupt GIF");
5616  p->prefix = (stbi__int16) oldcode;
5617  p->first = g->codes[oldcode].first;
5618  p->suffix = (codel == avail) ? p->first : g->codes[codel].first;
5619  } else if (codel == avail)
5620  return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5621 
5622  stbi__out_gif_code(g, (stbi__uint16) codel);
5623 
5624  if ((avail & codemask) == 0 && avail <= 0x0FFF) {
5625  codesize++;
5626  codemask = (1 << codesize) - 1;
5627  }
5628 
5629  oldcode = codel;
5630  } else {
5631  return stbi__errpuc("illegal code in raster", "Corrupt GIF");
5632  }
5633  }
5634  }
5635 }
5636 
5637 static void stbi__fill_gif_background(stbi__gif *g)
5638 {
5639  int i;
5640  stbi_uc *c = g->pal[g->bgindex];
5641  // @OPTIMIZE: write a dword at a time
5642  for (i = 0; i < g->w * g->h * 4; i += 4) {
5643  stbi_uc *p = &g->out[i];
5644  p[0] = c[2];
5645  p[1] = c[1];
5646  p[2] = c[0];
5647  p[3] = c[3];
5648  }
5649 }
5650 
5651 // this function is designed to support animated gifs, although stb_image doesn't support it
5652 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp)
5653 {
5654  int i;
5655  stbi_uc *old_out = 0;
5656 
5657  if (g->out == 0) {
5658  if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
5659  g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5660  if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5661  stbi__fill_gif_background(g);
5662  } else {
5663  // animated-gif-only path
5664  if (((g->eflags & 0x1C) >> 2) == 3) {
5665  old_out = g->out;
5666  g->out = (stbi_uc *) stbi__malloc(4 * g->w * g->h);
5667  if (g->out == 0) return stbi__errpuc("outofmem", "Out of memory");
5668  memcpy(g->out, old_out, g->w*g->h*4);
5669  }
5670  }
5671 
5672  for (;;) {
5673  switch (stbi__get8(s)) {
5674  case 0x2C: /* Image Descriptor */
5675  {
5676  stbi__int32 x, y, w, h;
5677  stbi_uc *o;
5678 
5679  x = stbi__get16le(s);
5680  y = stbi__get16le(s);
5681  w = stbi__get16le(s);
5682  h = stbi__get16le(s);
5683  if (((x + w) > (g->w)) || ((y + h) > (g->h)))
5684  return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
5685 
5686  g->line_size = g->w * 4;
5687  g->start_x = x * 4;
5688  g->start_y = y * g->line_size;
5689  g->max_x = g->start_x + w * 4;
5690  g->max_y = g->start_y + h * g->line_size;
5691  g->cur_x = g->start_x;
5692  g->cur_y = g->start_y;
5693 
5694  g->lflags = stbi__get8(s);
5695 
5696  if (g->lflags & 0x40) {
5697  g->step = 8 * g->line_size; // first interlaced spacing
5698  g->parse = 3;
5699  } else {
5700  g->step = g->line_size;
5701  g->parse = 0;
5702  }
5703 
5704  if (g->lflags & 0x80) {
5705  stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
5706  g->color_table = (stbi_uc *) g->lpal;
5707  } else if (g->flags & 0x80) {
5708  for (i=0; i < 256; ++i) // @OPTIMIZE: stbi__jpeg_reset only the previous transparent
5709  g->pal[i][3] = 255;
5710  if (g->transparent >= 0 && (g->eflags & 0x01))
5711  g->pal[g->transparent][3] = 0;
5712  g->color_table = (stbi_uc *) g->pal;
5713  } else
5714  return stbi__errpuc("missing color table", "Corrupt GIF");
5715 
5716  o = stbi__process_gif_raster(s, g);
5717  if (o == NULL) return NULL;
5718 
5719  if (req_comp && req_comp != 4)
5720  o = stbi__convert_format(o, 4, req_comp, g->w, g->h);
5721  return o;
5722  }
5723 
5724  case 0x21: // Comment Extension.
5725  {
5726  int len;
5727  if (stbi__get8(s) == 0xF9) { // Graphic Control Extension.
5728  len = stbi__get8(s);
5729  if (len == 4) {
5730  g->eflags = stbi__get8(s);
5731  stbi__get16le(s); // delay
5732  g->transparent = stbi__get8(s);
5733  } else {
5734  stbi__skip(s, len);
5735  break;
5736  }
5737  }
5738  while ((len = stbi__get8(s)) != 0)
5739  stbi__skip(s, len);
5740  break;
5741  }
5742 
5743  case 0x3B: // gif stream termination code
5744  return (stbi_uc *) s; // using '1' causes warning on some compilers
5745 
5746  default:
5747  return stbi__errpuc("unknown code", "Corrupt GIF");
5748  }
5749  }
5750 }
5751 
5752 static stbi_uc *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5753 {
5754  stbi_uc *u = 0;
5755  stbi__gif g;
5756  memset(&g, 0, sizeof(g));
5757 
5758  u = stbi__gif_load_next(s, &g, comp, req_comp);
5759  if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
5760  if (u) {
5761  *x = g.w;
5762  *y = g.h;
5763  }
5764 
5765  return u;
5766 }
5767 
5768 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
5769 {
5770  return stbi__gif_info_raw(s,x,y,comp);
5771 }
5772 #endif
5773 
5774 // *************************************************************************************************
5775 // Radiance RGBE HDR loader
5776 // originally by Nicolas Schulz
5777 #ifndef STBI_NO_HDR
5778 static int stbi__hdr_test_core(stbi__context *s)
5779 {
5780  const char *signature = "#?RADIANCE\n";
5781  int i;
5782  for (i=0; signature[i]; ++i)
5783  if (stbi__get8(s) != signature[i])
5784  return 0;
5785  return 1;
5786 }
5787 
5788 static int stbi__hdr_test(stbi__context* s)
5789 {
5790  int r = stbi__hdr_test_core(s);
5791  stbi__rewind(s);
5792  return r;
5793 }
5794 
5795 #define STBI__HDR_BUFLEN 1024
5796 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
5797 {
5798  int len=0;
5799  char c = '\0';
5800 
5801  c = (char) stbi__get8(z);
5802 
5803  while (!stbi__at_eof(z) && c != '\n') {
5804  buffer[len++] = c;
5805  if (len == STBI__HDR_BUFLEN-1) {
5806  // flush to end of line
5807  while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
5808  ;
5809  break;
5810  }
5811  c = (char) stbi__get8(z);
5812  }
5813 
5814  buffer[len] = 0;
5815  return buffer;
5816 }
5817 
5818 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
5819 {
5820  if ( input[3] != 0 ) {
5821  float f1;
5822  // Exponent
5823  f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
5824  if (req_comp <= 2)
5825  output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
5826  else {
5827  output[0] = input[0] * f1;
5828  output[1] = input[1] * f1;
5829  output[2] = input[2] * f1;
5830  }
5831  if (req_comp == 2) output[1] = 1;
5832  if (req_comp == 4) output[3] = 1;
5833  } else {
5834  switch (req_comp) {
5835  case 4: output[3] = 1; /* fallthrough */
5836  case 3: output[0] = output[1] = output[2] = 0;
5837  break;
5838  case 2: output[1] = 1; /* fallthrough */
5839  case 1: output[0] = 0;
5840  break;
5841  }
5842  }
5843 }
5844 
5845 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
5846 {
5847  char buffer[STBI__HDR_BUFLEN];
5848  char *token;
5849  int valid = 0;
5850  int width, height;
5851  stbi_uc *scanline;
5852  float *hdr_data;
5853  int len;
5854  unsigned char count, value;
5855  int i, j, k, c1,c2, z;
5856 
5857 
5858  // Check identifier
5859  if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
5860  return stbi__errpf("not HDR", "Corrupt HDR image");
5861 
5862  // Parse header
5863  for(;;) {
5864  token = stbi__hdr_gettoken(s,buffer);
5865  if (token[0] == 0) break;
5866  if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
5867  }
5868 
5869  if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
5870 
5871  // Parse width and height
5872  // can't use sscanf() if we're not using stdio!
5873  token = stbi__hdr_gettoken(s,buffer);
5874  if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
5875  token += 3;
5876  height = (int) strtol(token, &token, 10);
5877  while (*token == ' ') ++token;
5878  if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
5879  token += 3;
5880  width = (int) strtol(token, NULL, 10);
5881 
5882  *x = width;
5883  *y = height;
5884 
5885  if (comp) *comp = 3;
5886  if (req_comp == 0) req_comp = 3;
5887 
5888  // Read data
5889  hdr_data = (float *) stbi__malloc(height * width * req_comp * sizeof(float));
5890 
5891  // Load image data
5892  // image data is stored as some number of sca
5893  if ( width < 8 || width >= 32768) {
5894  // Read flat data
5895  for (j=0; j < height; ++j) {
5896  for (i=0; i < width; ++i) {
5897  stbi_uc rgbe[4];
5898  main_decode_loop:
5899  stbi__getn(s, rgbe, 4);
5900  stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
5901  }
5902  }
5903  } else {
5904  // Read RLE-encoded data
5905  scanline = NULL;
5906 
5907  for (j = 0; j < height; ++j) {
5908  c1 = stbi__get8(s);
5909  c2 = stbi__get8(s);
5910  len = stbi__get8(s);
5911  if (c1 != 2 || c2 != 2 || (len & 0x80)) {
5912  // not run-length encoded, so we have to actually use THIS data as a decoded
5913  // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
5914  stbi_uc rgbe[4];
5915  rgbe[0] = (stbi_uc) c1;
5916  rgbe[1] = (stbi_uc) c2;
5917  rgbe[2] = (stbi_uc) len;
5918  rgbe[3] = (stbi_uc) stbi__get8(s);
5919  stbi__hdr_convert(hdr_data, rgbe, req_comp);
5920  i = 1;
5921  j = 0;
5922  STBI_FREE(scanline);
5923  goto main_decode_loop; // yes, this makes no sense
5924  }
5925  len <<= 8;
5926  len |= stbi__get8(s);
5927  if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
5928  if (scanline == NULL) scanline = (stbi_uc *) stbi__malloc(width * 4);
5929 
5930  for (k = 0; k < 4; ++k) {
5931  i = 0;
5932  while (i < width) {
5933  count = stbi__get8(s);
5934  if (count > 128) {
5935  // Run
5936  value = stbi__get8(s);
5937  count -= 128;
5938  for (z = 0; z < count; ++z)
5939  scanline[i++ * 4 + k] = value;
5940  } else {
5941  // Dump
5942  for (z = 0; z < count; ++z)
5943  scanline[i++ * 4 + k] = stbi__get8(s);
5944  }
5945  }
5946  }
5947  for (i=0; i < width; ++i)
5948  stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
5949  }
5950  STBI_FREE(scanline);
5951  }
5952 
5953  return hdr_data;
5954 }
5955 
5956 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
5957 {
5958  char buffer[STBI__HDR_BUFLEN];
5959  char *token;
5960  int valid = 0;
5961 
5962  if (strcmp(stbi__hdr_gettoken(s,buffer), "#?RADIANCE") != 0) {
5963  stbi__rewind( s );
5964  return 0;
5965  }
5966 
5967  for(;;) {
5968  token = stbi__hdr_gettoken(s,buffer);
5969  if (token[0] == 0) break;
5970  if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
5971  }
5972 
5973  if (!valid) {
5974  stbi__rewind( s );
5975  return 0;
5976  }
5977  token = stbi__hdr_gettoken(s,buffer);
5978  if (strncmp(token, "-Y ", 3)) {
5979  stbi__rewind( s );
5980  return 0;
5981  }
5982  token += 3;
5983  *y = (int) strtol(token, &token, 10);
5984  while (*token == ' ') ++token;
5985  if (strncmp(token, "+X ", 3)) {
5986  stbi__rewind( s );
5987  return 0;
5988  }
5989  token += 3;
5990  *x = (int) strtol(token, NULL, 10);
5991  *comp = 3;
5992  return 1;
5993 }
5994 #endif // STBI_NO_HDR
5995 
5996 #ifndef STBI_NO_BMP
5997 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
5998 {
5999  int hsz;
6000  if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') {
6001  stbi__rewind( s );
6002  return 0;
6003  }
6004  stbi__skip(s,12);
6005  hsz = stbi__get32le(s);
6006  if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) {
6007  stbi__rewind( s );
6008  return 0;
6009  }
6010  if (hsz == 12) {
6011  *x = stbi__get16le(s);
6012  *y = stbi__get16le(s);
6013  } else {
6014  *x = stbi__get32le(s);
6015  *y = stbi__get32le(s);
6016  }
6017  if (stbi__get16le(s) != 1) {
6018  stbi__rewind( s );
6019  return 0;
6020  }
6021  *comp = stbi__get16le(s) / 8;
6022  return 1;
6023 }
6024 #endif
6025 
6026 #ifndef STBI_NO_PSD
6027 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
6028 {
6029  int channelCount;
6030  if (stbi__get32be(s) != 0x38425053) {
6031  stbi__rewind( s );
6032  return 0;
6033  }
6034  if (stbi__get16be(s) != 1) {
6035  stbi__rewind( s );
6036  return 0;
6037  }
6038  stbi__skip(s, 6);
6039  channelCount = stbi__get16be(s);
6040  if (channelCount < 0 || channelCount > 16) {
6041  stbi__rewind( s );
6042  return 0;
6043  }
6044  *y = stbi__get32be(s);
6045  *x = stbi__get32be(s);
6046  if (stbi__get16be(s) != 8) {
6047  stbi__rewind( s );
6048  return 0;
6049  }
6050  if (stbi__get16be(s) != 3) {
6051  stbi__rewind( s );
6052  return 0;
6053  }
6054  *comp = 4;
6055  return 1;
6056 }
6057 #endif
6058 
6059 #ifndef STBI_NO_PIC
6060 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
6061 {
6062  int act_comp=0,num_packets=0,chained;
6063  stbi__pic_packet packets[10];
6064 
6065  stbi__skip(s, 92);
6066 
6067  *x = stbi__get16be(s);
6068  *y = stbi__get16be(s);
6069  if (stbi__at_eof(s)) return 0;
6070  if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
6071  stbi__rewind( s );
6072  return 0;
6073  }
6074 
6075  stbi__skip(s, 8);
6076 
6077  do {
6078  stbi__pic_packet *packet;
6079 
6080  if (num_packets==sizeof(packets)/sizeof(packets[0]))
6081  return 0;
6082 
6083  packet = &packets[num_packets++];
6084  chained = stbi__get8(s);
6085  packet->size = stbi__get8(s);
6086  packet->type = stbi__get8(s);
6087  packet->channel = stbi__get8(s);
6088  act_comp |= packet->channel;
6089 
6090  if (stbi__at_eof(s)) {
6091  stbi__rewind( s );
6092  return 0;
6093  }
6094  if (packet->size != 8) {
6095  stbi__rewind( s );
6096  return 0;
6097  }
6098  } while (chained);
6099 
6100  *comp = (act_comp & 0x10 ? 4 : 3);
6101 
6102  return 1;
6103 }
6104 #endif
6105 
6106 // *************************************************************************************************
6107 // Portable Gray Map and Portable Pixel Map loader
6108 // by Ken Miller
6109 //
6110 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
6111 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
6112 //
6113 // Known limitations:
6114 // Does not support comments in the header section
6115 // Does not support ASCII image data (formats P2 and P3)
6116 // Does not support 16-bit-per-channel
6117 
6118 #ifndef STBI_NO_PNM
6119 
6120 static int stbi__pnm_test(stbi__context *s)
6121 {
6122  char p, t;
6123  p = (char) stbi__get8(s);
6124  t = (char) stbi__get8(s);
6125  if (p != 'P' || (t != '5' && t != '6')) {
6126  stbi__rewind( s );
6127  return 0;
6128  }
6129  return 1;
6130 }
6131 
6132 static stbi_uc *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp)
6133 {
6134  stbi_uc *out;
6135  if (!stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n))
6136  return 0;
6137  *x = s->img_x;
6138  *y = s->img_y;
6139  *comp = s->img_n;
6140 
6141  out = (stbi_uc *) stbi__malloc(s->img_n * s->img_x * s->img_y);
6142  if (!out) return stbi__errpuc("outofmem", "Out of memory");
6143  stbi__getn(s, out, s->img_n * s->img_x * s->img_y);
6144 
6145  if (req_comp && req_comp != s->img_n) {
6146  out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
6147  if (out == NULL) return out; // stbi__convert_format frees input on failure
6148  }
6149  return out;
6150 }
6151 
6152 static int stbi__pnm_isspace(char c)
6153 {
6154  return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
6155 }
6156 
6157 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
6158 {
6159  while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
6160  *c = (char) stbi__get8(s);
6161 }
6162 
6163 static int stbi__pnm_isdigit(char c)
6164 {
6165  return c >= '0' && c <= '9';
6166 }
6167 
6168 static int stbi__pnm_getinteger(stbi__context *s, char *c)
6169 {
6170  int value = 0;
6171 
6172  while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
6173  value = value*10 + (*c - '0');
6174  *c = (char) stbi__get8(s);
6175  }
6176 
6177  return value;
6178 }
6179 
6180 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
6181 {
6182  int maxv;
6183  char c, p, t;
6184 
6185  stbi__rewind( s );
6186 
6187  // Get identifier
6188  p = (char) stbi__get8(s);
6189  t = (char) stbi__get8(s);
6190  if (p != 'P' || (t != '5' && t != '6')) {
6191  stbi__rewind( s );
6192  return 0;
6193  }
6194 
6195  *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
6196 
6197  c = (char) stbi__get8(s);
6198  stbi__pnm_skip_whitespace(s, &c);
6199 
6200  *x = stbi__pnm_getinteger(s, &c); // read width
6201  stbi__pnm_skip_whitespace(s, &c);
6202 
6203  *y = stbi__pnm_getinteger(s, &c); // read height
6204  stbi__pnm_skip_whitespace(s, &c);
6205 
6206  maxv = stbi__pnm_getinteger(s, &c); // read max value
6207 
6208  if (maxv > 255)
6209  return stbi__err("max value > 255", "PPM image not 8-bit");
6210  else
6211  return 1;
6212 }
6213 #endif
6214 
6215 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
6216 {
6217  #ifndef STBI_NO_JPEG
6218  if (stbi__jpeg_info(s, x, y, comp)) return 1;
6219  #endif
6220 
6221  #ifndef STBI_NO_PNG
6222  if (stbi__png_info(s, x, y, comp)) return 1;
6223  #endif
6224 
6225  #ifndef STBI_NO_GIF
6226  if (stbi__gif_info(s, x, y, comp)) return 1;
6227  #endif
6228 
6229  #ifndef STBI_NO_BMP
6230  if (stbi__bmp_info(s, x, y, comp)) return 1;
6231  #endif
6232 
6233  #ifndef STBI_NO_PSD
6234  if (stbi__psd_info(s, x, y, comp)) return 1;
6235  #endif
6236 
6237  #ifndef STBI_NO_PIC
6238  if (stbi__pic_info(s, x, y, comp)) return 1;
6239  #endif
6240 
6241  #ifndef STBI_NO_PNM
6242  if (stbi__pnm_info(s, x, y, comp)) return 1;
6243  #endif
6244 
6245  #ifndef STBI_NO_HDR
6246  if (stbi__hdr_info(s, x, y, comp)) return 1;
6247  #endif
6248 
6249  // test tga last because it's a crappy test!
6250  #ifndef STBI_NO_TGA
6251  if (stbi__tga_info(s, x, y, comp))
6252  return 1;
6253  #endif
6254  return stbi__err("unknown image type", "Image not of any known type, or corrupt");
6255 }
6256 
6257 #ifndef STBI_NO_STDIO
6258 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
6259 {
6260  FILE *f = stbi__fopen(filename, "rb");
6261  int result;
6262  if (!f) return stbi__err("can't fopen", "Unable to open file");
6263  result = stbi_info_from_file(f, x, y, comp);
6264  fclose(f);
6265  return result;
6266 }
6267 
6268 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
6269 {
6270  int r;
6271  stbi__context s;
6272  long pos = ftell(f);
6273  stbi__start_file(&s, f);
6274  r = stbi__info_main(&s,x,y,comp);
6275  fseek(f,pos,SEEK_SET);
6276  return r;
6277 }
6278 #endif // !STBI_NO_STDIO
6279 
6280 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
6281 {
6282  stbi__context s;
6283  stbi__start_mem(&s,buffer,len);
6284  return stbi__info_main(&s,x,y,comp);
6285 }
6286 
6287 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
6288 {
6289  stbi__context s;
6290  stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
6291  return stbi__info_main(&s,x,y,comp);
6292 }
6293 
6294 #endif // STB_IMAGE_IMPLEMENTATION
6295 
6296 /*
6297  revision history:
6298  2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
6299  2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
6300  2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
6301  2.03 (2015-04-12) extra corruption checking (mmozeiko)
6302  stbi_set_flip_vertically_on_load (nguillemot)
6303  fix NEON support; fix mingw support
6304  2.02 (2015-01-19) fix incorrect assert, fix warning
6305  2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
6306  2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
6307  2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
6308  progressive JPEG (stb)
6309  PGM/PPM support (Ken Miller)
6310  STBI_MALLOC,STBI_REALLOC,STBI_FREE
6311  GIF bugfix -- seemingly never worked
6312  STBI_NO_*, STBI_ONLY_*
6313  1.48 (2014-12-14) fix incorrectly-named assert()
6314  1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
6315  optimize PNG (ryg)
6316  fix bug in interlaced PNG with user-specified channel count (stb)
6317  1.46 (2014-08-26)
6318  fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
6319  1.45 (2014-08-16)
6320  fix MSVC-ARM internal compiler error by wrapping malloc
6321  1.44 (2014-08-07)
6322  various warning fixes from Ronny Chevalier
6323  1.43 (2014-07-15)
6324  fix MSVC-only compiler problem in code changed in 1.42
6325  1.42 (2014-07-09)
6326  don't define _CRT_SECURE_NO_WARNINGS (affects user code)
6327  fixes to stbi__cleanup_jpeg path
6328  added STBI_ASSERT to avoid requiring assert.h
6329  1.41 (2014-06-25)
6330  fix search&replace from 1.36 that messed up comments/error messages
6331  1.40 (2014-06-22)
6332  fix gcc struct-initialization warning
6333  1.39 (2014-06-15)
6334  fix to TGA optimization when req_comp != number of components in TGA;
6335  fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
6336  add support for BMP version 5 (more ignored fields)
6337  1.38 (2014-06-06)
6338  suppress MSVC warnings on integer casts truncating values
6339  fix accidental rename of 'skip' field of I/O
6340  1.37 (2014-06-04)
6341  remove duplicate typedef
6342  1.36 (2014-06-03)
6343  convert to header file single-file library
6344  if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
6345  1.35 (2014-05-27)
6346  various warnings
6347  fix broken STBI_SIMD path
6348  fix bug where stbi_load_from_file no longer left file pointer in correct place
6349  fix broken non-easy path for 32-bit BMP (possibly never used)
6350  TGA optimization by Arseny Kapoulkine
6351  1.34 (unknown)
6352  use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
6353  1.33 (2011-07-14)
6354  make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
6355  1.32 (2011-07-13)
6356  support for "info" function for all supported filetypes (SpartanJ)
6357  1.31 (2011-06-20)
6358  a few more leak fixes, bug in PNG handling (SpartanJ)
6359  1.30 (2011-06-11)
6360  added ability to load files via callbacks to accomidate custom input streams (Ben Wenger)
6361  removed deprecated format-specific test/load functions
6362  removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
6363  error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
6364  fix inefficiency in decoding 32-bit BMP (David Woo)
6365  1.29 (2010-08-16)
6366  various warning fixes from Aurelien Pocheville
6367  1.28 (2010-08-01)
6368  fix bug in GIF palette transparency (SpartanJ)
6369  1.27 (2010-08-01)
6370  cast-to-stbi_uc to fix warnings
6371  1.26 (2010-07-24)
6372  fix bug in file buffering for PNG reported by SpartanJ
6373  1.25 (2010-07-17)
6374  refix trans_data warning (Won Chun)
6375  1.24 (2010-07-12)
6376  perf improvements reading from files on platforms with lock-heavy fgetc()
6377  minor perf improvements for jpeg
6378  deprecated type-specific functions so we'll get feedback if they're needed
6379  attempt to fix trans_data warning (Won Chun)
6380  1.23 fixed bug in iPhone support
6381  1.22 (2010-07-10)
6382  removed image *writing* support
6383  stbi_info support from Jetro Lauha
6384  GIF support from Jean-Marc Lienher
6385  iPhone PNG-extensions from James Brown
6386  warning-fixes from Nicolas Schulz and Janez Zemva (i.stbi__err. Janez (U+017D)emva)
6387  1.21 fix use of 'stbi_uc' in header (reported by jon blow)
6388  1.20 added support for Softimage PIC, by Tom Seddon
6389  1.19 bug in interlaced PNG corruption check (found by ryg)
6390  1.18 (2008-08-02)
6391  fix a threading bug (local mutable static)
6392  1.17 support interlaced PNG
6393  1.16 major bugfix - stbi__convert_format converted one too many pixels
6394  1.15 initialize some fields for thread safety
6395  1.14 fix threadsafe conversion bug
6396  header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
6397  1.13 threadsafe
6398  1.12 const qualifiers in the API
6399  1.11 Support installable IDCT, colorspace conversion routines
6400  1.10 Fixes for 64-bit (don't use "unsigned long")
6401  optimized upsampling by Fabian "ryg" Giesen
6402  1.09 Fix format-conversion for PSD code (bad global variables!)
6403  1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
6404  1.07 attempt to fix C++ warning/errors again
6405  1.06 attempt to fix C++ warning/errors again
6406  1.05 fix TGA loading to return correct *comp and use good luminance calc
6407  1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
6408  1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
6409  1.02 support for (subset of) HDR files, float interface for preferred access to them
6410  1.01 fix bug: possible bug in handling right-side up bmps... not sure
6411  fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
6412  1.00 interface to zlib that skips zlib header
6413  0.99 correct handling of alpha in palette
6414  0.98 TGA loader by lonesock; dynamically add loaders (untested)
6415  0.97 jpeg errors on too large a file; also catch another malloc failure
6416  0.96 fix detection of invalid v value - particleman@mollyrocket forum
6417  0.95 during header scan, seek to markers in case of padding
6418  0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
6419  0.93 handle jpegtran output; verbose errors
6420  0.92 read 4,8,16,24,32-bit BMP files of several formats
6421  0.91 output 24-bit Windows 3.0 BMP files
6422  0.90 fix a few more warnings; bump version number to approach 1.0
6423  0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
6424  0.60 fix compiling as c++
6425