* Tiff -- A port of Oleg Kiselyov's tiff code

Both the code and this documentation are derived from Oleg's files, cf.
http://okmij.org/ftp/Scheme/binary-io.html#tiff

This distribution comes with two family-friendly tiff files in sunterlib/
scsh/tiff/.


** Package Dependencies

TIFF's structures depend on structures from these other sunterlib projects:

    krims
    sequences

                                  *


** Reading TIFF files

[ Oleg's announcement.  Main changes to the announced library: modularisation
  and fake srfi-4 uniform vectors.  (The source files list the changes.)    ]

From posting-system@google.com Wed Oct  8 01:24:13 2003
Date: Tue, 7 Oct 2003 18:24:12 -0700
From: oleg@pobox.com (oleg@pobox.com)
Newsgroups: comp.lang.scheme
Subject: [ANN] Reading TIFF files
Message-ID: <7eb8ac3e.0310071724.59bffe62@posting.google.com>
Status: OR

This is to announce a Scheme library to read and analyze TIFF image
files. We can use the library to obtain the dimensions of a TIFF
image; the image name and description; the resolution and other
meta-data. We can then load a pixel matrix or a colormap table. An
accompanying tiff-prober program prints out the TIFF dictionary in a
raw and polished formats.

     http://pobox.com/~oleg/ftp/Scheme/lib/tiff.scm
        dependencies: util.scm, char-encoding.scm, myenv.scm
     http://pobox.com/~oleg/ftp/Scheme/tests/vtiff.scm
        see also: gnu-head-sm.tif in the same directory
     http://pobox.com/~oleg/ftp/Scheme/tiff-prober.scm

Features:
   - The library handles TIFF files written in both endian formats
   - A TIFF directory is treated somewhat as a SRFI-44 immutable
     dictionary collection. Only the most basic SRFI-44 methods are
     implemented, including the left fold iterator and the get method.
   - An extensible tag dictionary translates between symbolic tag
     names and numeric ones. Ditto for tag values.
   - A tag dictionary for all TIFF 6 standard tags and values comes
     with the library. A user can add the definitions of
     his private tags.
   - The library handles TIFF directory values of types:
     (signed/unsigned) byte, short, long, rational; ASCII strings.
   - A particular care is taken to properly handle values whose
     total size is no more than 4 bytes.
   - Array values (including the image matrix) are returned as
     uniform vectors (SRFI-4)
   - Values are read lazily. If you are only interested in the
     dimensions of an image, the image matrix itself will not be loaded.


Here's the result of running tiff-prober on the image of the GNU head
(converted from JPEG to TIFF by xv). I hope I won't have any copyright
problems with using and distributing that image.

Analyzing TIFF file tests/gnu-head-sm.tif...
There are 15 entries in the TIFF directory
they are
TIFFTAG:IMAGEWIDTH, count 1, type short, value-offset 129 (0x81)
TIFFTAG:IMAGELENGTH, count 1, type short, value-offset 122 (0x7A)
TIFFTAG:BITSPERSAMPLE, count 1, type short, value-offset 8 (0x8)
TIFFTAG:COMPRESSION, count 1, type short, value-offset 1 (0x1)
TIFFTAG:PHOTOMETRIC, count 1, type short, value-offset 1 (0x1)
TIFFTAG:IMAGEDESCRIPTION, count 29, type ascii str, value-offset 15932 (0x3E3C)
TIFFTAG:STRIPOFFSETS, count 1, type long, value-offset 8 (0x8)
TIFFTAG:ORIENTATION, count 1, type short, value-offset 1 (0x1)
TIFFTAG:SAMPLESPERPIXEL, count 1, type short, value-offset 1 (0x1)
TIFFTAG:ROWSPERSTRIP, count 1, type short, value-offset 122 (0x7A)
TIFFTAG:STRIPBYTECOUNTS, count 1, type long, value-offset 15738 (0x3D7A)
TIFFTAG:XRESOLUTION, count 1, type rational, value-offset 15962 (0x3E5A)
TIFFTAG:YRESOLUTION, count 1, type rational, value-offset 15970 (0x3E62)
TIFFTAG:PLANARCONFIG, count 1, type short, value-offset 1 (0x1)
TIFFTAG:RESOLUTIONUNIT, count 1, type short, value-offset 2 (0x2)

image width:    129
image height:   122
image depth:    8
document name:  *NOT SPECIFIED*
image description:
  JPEG:gnu-head-sm.jpg 129x122
time stamp:     *NOT SPECIFIED*
compression:    NONE

In particular, the dump of the tiff directory is produced by the
following line of code
        (print-tiff-directory tiff-dict (current-output-port))
To determine the width of the image, we do
        (tiff-directory-get tiff-dict 'TIFFTAG:IMAGEWIDTH not-spec)
To determine the compression (as a symbol) we evaluate
        (tiff-directory-get-as-symbol tiff-dict 'TIFFTAG:COMPRESSION not-spec)

If an image directory contains private tags, they will be printed like
the following:

private tag 33009, count 1, type signed long, value-offset 16500000 (0xFBC520)
private tag 33010, count 1, type signed long, value-offset 4294467296
                                                                 (0xFFF85EE0)

A user may supply a dictionary of his private tags and enjoy
the automatic translation from symbolic to numerical tag names.

The validation code vtiff.scm includes a function
test-reading-pixel-matrix that demonstrates loading a pixel matrix of
an image in an u8vector. The code can handle a single or multiple
strips.

Portability: the library itself, tiff.scm, relies on the following
extensions to R5RS: uniform vectors (SRFI-4); ascii->char function
(which is on many systems just integer->char); trivial define-macro
(which can be easily re-written into syntax-rules); let*-values
(SRFI-11); records (SRFI-9). Actually, the code uses Gambit's native
define-structures, which can be easily re-written into SRFI-9
records. The Scheme system should be able to represent the full range
of 32-bit integers and should support rationals.

The most problematic extension is an endian port. The TIFF library
assumes the existence of a data structure with the following
operations
   endian-port-set-bigendian!::   EPORT -> UNSPECIFIED
   endian-port-set-littlendian!:: EPORT -> UNSPECIFIED
   endian-port-read-int1:: EPORT -> UINTEGER (byte)
   endian-port-read-int2:: EPORT -> UINTEGER
   endian-port-read-int4:: EPORT -> UINTEGER
   endian-port-setpos:: EPORT INTEGER -> UNSPECIFIED

The library uses solely these methods to access the input port. The
endian port can be implemented in a R5RS Scheme system if we assume
that the composition of char->integer and read-char yields a byte and
if we read the whole file into a string or a u8vector
(SRFI-4). Obviously, there are times when such a solution is not
satisfactory. Therefore, tiff-prober and the validation code
vtiff.scm rely on a Gambit-specific code. All major Scheme systems can
implement endian ports in a similar vein -- alas, each in its own
particular way.



** Endian ports
from structure endian.

We rely on an ENDIAN-PORT, a port with the following operations:
  endian-port-set-bigendian!::   EPORT -> UNSPECIFIED
  endian-port-set-littlendian!:: EPORT -> UNSPECIFIED
  endian-port-read-int1:: EPORT -> UINTEGER (byte)
  endian-port-read-int2:: EPORT -> UINTEGER
  endian-port-read-int4:: EPORT -> UINTEGER
  endian-port-setpos:: EPORT INTEGER -> UNSPECIFIED

  close-endian-port:: EPORT -> UNSPECIFIED
  make-endian-port:: INPORT BOOLEAN -> EPORT
    The boolean argument sets the endianness of the resulting endian-port:
  true means most significant byte first (big-endian).  After having wrapped
  the INPORT in the EPORT, you should no longer manipulate the INPORT
  directly.
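To illustrate what the endianness setting changes, here is a minimal
sketch in portable R5RS Scheme.  It is independent of the endian structure:
the helper bytes->uint is hypothetical and not part of sunterlib.  It
combines bytes, given in file order, into one unsigned integer, which is
presumably what endian-port-read-int2 and endian-port-read-int4 do with
the bytes they read.

```scheme
;; Hypothetical helper, not part of sunterlib: combine BYTES (a list of
;; integers 0..255, in the order they appear in the file) into one
;; unsigned integer.  MSB-FIRST? selects the big-endian (#t) or the
;; little-endian (#f) interpretation.
(define (bytes->uint bytes msb-first?)
  (let loop ((bs (if msb-first? bytes (reverse bytes)))
             (acc 0))
    (if (null? bs)
        acc
        (loop (cdr bs) (+ (* 256 acc) (car bs))))))

;; The two bytes #x01 #x02 read as a 16-bit integer:
(bytes->uint '(1 2) #t)        ; big-endian:    #x0102 = 258
(bytes->uint '(1 2) #f)        ; little-endian: #x0201 = 513
```

The same byte sequence thus yields different integers depending on the
byte order, which is why the port has to know it before any
multi-byte read.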



** Tiff
in structures TIFF and TIFFLET.  TIFFLET exports a survival package of
bindings:
  read-tiff-file, print-tiff-directory, tiff-directory-get(-as-symbol).
Refined needs will require TIFF.

*** TIFF tags: codes and values

A tag dictionary record, tagdict, helps translate between
tag symbols and their numerical values.

tagdict-get-by-name TAGDICT TAG-NAME => INT
  where TAG-NAME is a symbol.
Translate a symbolic representation of a TIFF tag into a numeric
representation.
An error is raised if the lookup fails.

tagdict-get-by-num TAGDICT INT => TAG-NAME or #f
  Translate from a numeric tag value to a symbolic representation,
if it exists. Return #f otherwise.

tagdict-tagval-get-by-name TAGDICT TAG-NAME VAL-NAME => INT
  where VAL-NAME is a symbol.
Translate from the symbolic representation of a value associated
with TAG-NAME in the TIFF directory into the numeric representation.
An error is raised if the lookup fails.

tagdict-tagval-get-by-num TAGDICT TAG-NAME INT => VAL-NAME or #f
  Translate from a numeric value associated with TAG-NAME in the TIFF
directory to a symbolic representation, if it exists. Return #f
otherwise.

make-tagdict ((TAG-NAME INT (VAL-NAME . INT) ...) ...)
  Build a tag dictionary.

tagdict? TAGDICT -> BOOL

tagdict-add-all DEST-DICT SRC-DICT -> DEST-DICT
  Join two dictionaries.

tiff-standard-tagdict : TAGDICT
  The variable tiff-standard-tagdict is initialized to the dictionary
of standard TIFF tags (which you may look up in the first section above
or in the source, tiff.scm).

Usage scenario:
   (tagdict-get-by-name  tiff-standard-tagdict 'TIFFTAG:IMAGEWIDTH) => 256
   (tagdict-get-by-num   tiff-standard-tagdict 256) => 'TIFFTAG:IMAGEWIDTH
   (tagdict-tagval-get-by-name tiff-standard-tagdict
      'TIFFTAG:COMPRESSION 'LZW) => 5
   (tagdict-tagval-get-by-num  tiff-standard-tagdict
      'TIFFTAG:COMPRESSION 5) => 'LZW

   (define extended-tagdict
      (tagdict-add-all tiff-standard-tagdict
         (make-tagdict
           '((WAupper_left_lat 33004)
             (WAhemisphere 33003 (North . 1) (South . 2))))))


*** TIFF directory entry

A descriptor of a TIFF "item", which can be image data, document description,
time stamp, etc., depending on the tag. Thus an entry has the following
structure:
 unsigned short tag;
 unsigned short type;          // data type: byte, short word, etc.
 unsigned long  count;         // number of items; length in spec
 unsigned long  val_offset;    // byte offset to field data

The values associated with each entry are disjoint and may appear anywhere
in the file (so long as they are placed on a word boundary).

Note: if the value takes 4 bytes or fewer, it is placed in the offset
field to save space.  If the value takes fewer than 4 bytes, it is
*left*-justified in the offset field.
Note that it is always *left*-justified (stored in the lower-numbered bytes)
no matter what the byte order (big- or little-endian) is!
Here's the precise quote from the TIFF 6.0 specification:
"To save time and space the Value Offset contains the Value instead of
pointing to the Value if and only if the Value fits into 4 bytes. If
the Value is shorter than 4 bytes, it is left-justified within the
4-byte Value Offset, i.e., stored in the lower-numbered
bytes. Whether the Value fits within 4 bytes is determined by the Type
and Count of the field."
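The left-justification rule can be sketched in portable R5RS Scheme as
follows.  This is an illustration only, not the code used by the library:
the helpers bytes->uint, take-bytes and inline-short are hypothetical
names, and the sample bytes are constructed by hand for an image width
of 129.

```scheme
;; Hypothetical helpers, not part of sunterlib.

;; Combine a list of bytes (in file order) into an unsigned integer.
(define (bytes->uint bytes msb-first?)
  (let loop ((bs (if msb-first? bytes (reverse bytes)))
             (acc 0))
    (if (null? bs)
        acc
        (loop (cdr bs) (+ (* 256 acc) (car bs))))))

;; Return the first N elements of the list LST.
(define (take-bytes lst n)
  (if (= n 0)
      '()
      (cons (car lst) (take-bytes (cdr lst) (- n 1)))))

;; Decode a single SHORT stored inline in the 4-byte value-offset field
;; FIELD (a list of 4 bytes in file order).  The value occupies the two
;; lower-numbered bytes in both byte orders.
(define (inline-short field msb-first?)
  (bytes->uint (take-bytes field 2) msb-first?))

;; A width of 129 as it would appear in a big-endian file ...
(inline-short '(0 129 0 0) #t)    ; => 129
;; ... and in a little-endian file.
(inline-short '(129 0 0 0) #f)    ; => 129
;; Reading the whole big-endian field as one 4-byte integer would give
;; the wrong answer:
(bytes->uint '(0 129 0 0) #t)     ; => 8454144, not 129
```

In a little-endian file, reading all four bytes happens to give the right
number, so a reader that ignores the rule fails only on big-endian files.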

tiff-dir-entry? TIFF-DIR-ENTRY => BOOLEAN
tiff-dir-entry-tag TIFF-DIR-ENTRY => INTEGER
tiff-dir-entry-type TIFF-DIR-ENTRY => INTEGER
tiff-dir-entry-count TIFF-DIR-ENTRY => INTEGER
tiff-dir-entry-val-offset TIFF-DIR-ENTRY => INTEGER
tiff-dir-entry-value TIFF-DIR-ENTRY => VALUE

print-tiff-dir-entry TIFF-DIR-ENTRY TAGDICT OPORT -> UNSPECIFIED
  Print the contents of TIFF-DIR-ENTRY onto the output port OPORT
using TAGDICT to convert tag identifiers to symbolic names.


*** TIFF Image File Directory

A TIFF directory is a collection of TIFF directory entries. The entries
are sorted in ascending order by tag.
Note: a TIFF file can contain more than one directory (chained together).
We handle only the first one.

We treat a TIFF image directory somewhat as an ordered, immutable
dictionary collection, see SRFI-44.

tiff-directory? VALUE => BOOLEAN
tiff-directory-size TIFF-DIRECTORY => INTEGER
tiff-directory-empty? TIFF-DIRECTORY => BOOLEAN

tiff-directory-get TIFF-DIRECTORY KEY [ABSENCE-THUNK] => VALUE
  KEY can be either a symbol or an integer.
If the lookup fails, ABSENCE-THUNK, if given, is evaluated and its value
is returned. If ABSENCE-THUNK is omitted, the return value on failure
is #f.

tiff-directory-get-as-symbol TIFF-DIRECTORY KEY [ABSENCE-THUNK] => VALUE
  KEY must be a symbol.
If possible, VALUE is returned as a symbol, as translated
by the tagdict.

tiff-directory-fold-left TIFF-DIRECTORY FOLD-FUNCTION SEED-VALUE
            ... => seed-value ...
  The fold function receives a tiff-directory-entry as a value.

read-tiff-file:: EPORT [PRIVATE-TAGDICT] -> TIFF-DIRECTORY
print-tiff-directory:: TIFF-DIRECTORY OPORT -> UNSPECIFIED
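Putting the documented procedures together, a session might look like the
sketch below.  It uses only the signatures listed in this file and has not
been checked against the sources.  It assumes a scsh REPL in which the
sunterlib package definitions are loaded and the structures have been
opened (e.g. with ,open tifflet endian).  The procedure show-tiff-summary,
the thunk not-spec and the file name are hypothetical.

```scheme
;; Sketch only: assumes scsh with sunterlib, after  ,open tifflet endian
;; show-tiff-summary and not-spec are illustrative names, not library
;; bindings.

;; Thunk evaluated by tiff-directory-get when a tag is absent.
(define (not-spec) '*not-specified*)

(define (show-tiff-summary file-name)
  ;; #t asks for most-significant-byte-first reading.  The library is
  ;; documented as handling both byte orders, so this initial choice
  ;; presumably does not matter (an assumption).
  (let* ((eport (make-endian-port (open-input-file file-name) #t))
         (tiff-dict (read-tiff-file eport)))
    (print-tiff-directory tiff-dict (current-output-port))
    (display "width:       ")
    (display (tiff-directory-get tiff-dict 'TIFFTAG:IMAGEWIDTH not-spec))
    (newline)
    (display "compression: ")
    (display (tiff-directory-get-as-symbol
              tiff-dict 'TIFFTAG:COMPRESSION not-spec))
    (newline)
    ;; Values are read lazily, so the port is closed only after the
    ;; last lookup.
    (close-endian-port eport)))

;; Example call with a hypothetical file in the working directory.
(show-tiff-summary "gnu-head-sm.tif")
```

Closing the endian port last is deliberate: because values are read
lazily, a lookup after closing may no longer be able to reach the file.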



** Usage example: tiff prober

The scripts probe-tiff and, equivalently, tiff-prober.scm read a TIFF file
and print out its directory (as well as values of a few "important" tags).
The scripts (or script headers) assume that the executable scsh resides
in /usr/local/bin, and that the environment variable SCSH_LIB_DIRS lists
the sunterlib directory with the config file sunterlib.scm; cf. the reference
manual about "Running Scsh\Scsh command-line switches\Switches".

Usage
        probe-tiff tiff-file1 ...
or
        tiff-prober.scm tiff-file1 ...

Structure tiff-prober exports the entry point for the scripts:

tiff-prober ARGV => UNSPECIFIED
Call, for instance, (tiff-prober '("foo" "bsp.tiff")).


** Validating the library

The validating code is in sunterlib/scsh/tiff/vtiff.scm and assumes that
the tiffed GNU logo sunterlib/tiff/gnu-head-sm.tif resides in the working
directory.  In that situation you may go
                                          ,in tiff-testbed  ; and
                                          ,load vtiff.scm
Alternatively, make sure that the environment variable SCSH_LIB_DIRS lists
the directory with sunterlib.scm (just as for the tiff prober, see above)
and run vtiff.scm as a script.

                                   oOo