What a lovely subject! This is a response to a question raised on the Records Management Association of Australasia (RMAA) Listserv in June 2002.
On Wed, 26 Jun 2002 10:22:41 +1000, Mark de Berg asked:
What is the best format to save the images in, that will minimise the file size and reduce the ability for users to alter the image?
Scanners capture every document as a raster image: a non-intelligent digital bitmap made up of dots on an X and Y grid, which produces a reasonable, if not exact, facsimile of the original, depending on the resolution selected or available. A blank A4 page scanned in black and white at 300 dots per inch [DPI] is roughly a 1 MB file UNCOMPRESSED. NOTE: In practice, scanned images are almost always compressed in some fashion. An average typed A4 page should create a .tiff file of around 35 – 50 KB per image at 200 to 300 DPI resolution.
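The uncompressed figure can be checked with a little arithmetic. The sketch below is illustrative only and assumes a black and white (1 bit per pixel) scan of a full A4 page (210 mm x 297 mm), with each pixel row padded to a whole byte; the function name and defaults are mine, not taken from any scanner software.

```python
import math

MM_PER_INCH = 25.4


def uncompressed_bytes(dpi: int, bits_per_pixel: int = 1,
                       width_mm: float = 210.0, height_mm: float = 297.0) -> int:
    """Raw bitmap size in bytes for one page, before any compression."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    # Assumption: each pixel row is padded up to a whole number of bytes.
    bytes_per_row = math.ceil(width_px * bits_per_pixel / 8)
    return bytes_per_row * height_px


for dpi in (200, 300):
    size = uncompressed_bytes(dpi)
    print(f"{dpi} DPI black and white A4: {size:,} bytes ({size / 1_000_000:.2f} MB)")
```

At 300 DPI this gives about 1.09 MB, in line with the 1 Meg figure, and about 0.48 MB at 200 DPI. Passing bits_per_pixel=8 shows how quickly greyscale scans grow, which is why compression matters so much for stored images.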
Most, if not all, document scanners use TIFF as their output format. TIFF was designed primarily for raster data interchange. Its main strengths are that it is a highly flexible, platform-independent format supported by numerous image processing applications. Because it was designed by developers of printers, scanners and monitors, it has an extraordinarily rich set of information elements for colorimetry calibration, gamut tables, etc. Such information is also extremely useful for remote sensing and multispectral applications.
- For more information, go to The Unofficial TIFF Home Page.
What happens after images are produced as a raster file format is open to question.
What is a raster image? Microsoft’s Computer Dictionary: Fourth Edition states the following:
raster image: n. A display image formed by patterns of light and dark or differently coloured pixels in a rectangular array. See also raster graphics.
raster graphics: n. A method of generating graphics that treats an image as a collection of small, independently controlled dots (pixels) arranged in rows and columns. Compare vector graphics.
By comparison, a vector graphic is an intelligent graphic file, and is defined as follows:
vector graphics: n. Images generated from mathematical descriptions that determine the position, length, and direction in which lines are drawn. Objects are created as collections of lines rather than as patterns of individual dots or pixels. Compare raster graphics.
A definition of .TIF OR TIFF:
.tif or .tiff: n. The file extension that identifies bitmap images in Tagged Image File Format (TIFF). See also TIFF. TIFF or TIF: n. Acronym for Tagged Image File Format or Tag Image File. A standard file format commonly used for scanning, storage, and interchange of gray-scale graphic images. TIFF may be the only format available for older programs (such as older versions of Mac Paint), but most modern programs are able to save images in a variety of other formats, such as GIF or JPEG. See also gray scale. Compare GIF, JPEG.
A definition of .PDF:
.pdf: n. The file extension that identifies documents encoded in the Portable Document Format developed by Adobe Systems. In order to display or print a .pdf file, the user should obtain the freeware Adobe Acrobat Reader. See also Acrobat, Portable Document Format.
Portable Document Format: n. The Adobe specification for electronic documents that use the Adobe Acrobat family of servers and readers. Acronym: PDF. See also Acrobat, .pdf.
Acrobat: n. A program from Adobe Systems, Inc., that converts a fully formatted document created on a Windows, Macintosh, MS-DOS, or UNIX platform into a Portable Document Format (PDF) file that can be viewed on several different platforms. Acrobat enables users to send documents that contain distinctive typefaces, colour, graphics, and photographs electronically to recipients, regardless of the application used to create the originals. Recipients need the Acrobat reader, which is available free, to view the files. Depending on version and platform, it also includes tools such as Distiller (which creates PDF files from PostScript files), Exchange (which is used for links, annotations, and security-related matters), and PDF Writer (which creates PDF files from files created with business software).
In essence, TIFF is the header which identifies the raster image bitmap file, compressed or uncompressed, in a multitude of versions. TIFF is a much more complicated technical issue than we need to go into here. .PDF is also a header, and much more, as it is also the actual encoded file.
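To make the "header" idea concrete, here is a minimal sketch of how software can tell the two formats apart from the first few bytes of a file. It relies only on the well-known opening bytes: a TIFF file starts with a byte-order mark ("II" or "MM"), the number 42, and the offset of its first image file directory (IFD), while a PDF file starts with "%PDF-". The function name is hypothetical, and this is a sniffing aid, not a validator.

```python
import struct


def identify_image_header(data: bytes) -> str:
    """Classify a file from its first bytes (a sketch, not a validator)."""
    if data[:5] == b"%PDF-":
        # A PDF file opens with "%PDF-" followed by the version number.
        return "PDF"
    if len(data) >= 8 and data[:2] in (b"II", b"MM"):
        # "II" means little-endian byte order, "MM" means big-endian.
        little = data[:2] == b"II"
        order = "<" if little else ">"
        magic, first_ifd_offset = struct.unpack(order + "HI", data[2:8])
        if magic == 42:
            endian = "little" if little else "big"
            return f"TIFF ({endian}-endian, first IFD at byte {first_ifd_offset})"
    return "unknown"


print(identify_image_header(b"II*\x00\x08\x00\x00\x00"))
print(identify_image_header(b"%PDF-1.4\n"))
```

Everything that describes the image itself (size, resolution, compression scheme) lives in the tags of the IFD that the header points to, which is where the multitude of TIFF versions comes from.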
As Glenn Sanders advises:
But integrity is a function of your processes, not the format itself.
Any format can be changed, even a .PDF file. As stated above, PDF is the Portable Document Format, which many people mistake for an unalterable format because the FREE SOFTWARE, the Acrobat Reader, will not allow alterations or additions. Pay your $’s and you can do anything with any file format, including .PDF. Once an image is scanned into a Records Management System, it should be locked so that it cannot be altered, as advised in the old Australian Records Management standard AS 4390, now replaced by AS/ISO 15489.1 and .2, which advised that a document should be INVIOLATE.
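One common way a process, rather than a file format, can support this is to record a cryptographic fingerprint of each image at the time of capture and re-check it later. The sketch below is an assumption of mine and is not drawn from AS 4390 or AS/ISO 15489; it uses SHA-256 from Python's standard library, and the sample bytes are a hypothetical stand-in for a real scanned file.

```python
import hashlib


def fingerprint(image_bytes: bytes) -> str:
    """Return the SHA-256 digest of an image file's contents as hex."""
    return hashlib.sha256(image_bytes).hexdigest()


def is_unaltered(image_bytes: bytes, recorded_fingerprint: str) -> bool:
    """True if the image still matches the fingerprint taken at capture."""
    return fingerprint(image_bytes) == recorded_fingerprint


# Hypothetical stand-in for the bytes of a scanned TIFF or PDF.
scanned = b"II*\x00 example image data"
recorded = fingerprint(scanned)  # stored in a register at scan time

tampered = scanned.replace(b"example", b"altered")
print(is_unaltered(scanned, recorded))   # True
print(is_unaltered(tampered, recorded))  # False
```

A fingerprint only detects a change; it does not prevent one. The register of fingerprints has to be protected by the same locked-down procedures as the images themselves.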
All of this is after the fact. You may ask why?
As many of you may know, I have a vested interest in people scanning images to meet published Standards.
I wish to state here and now that I am the Australian agent for a USA supplier of resolution test targets for scanners, A&P International.
Late last year I promoted the use of resolution test targets for document scanners as detailed in the ANSI/AIIM standard for document scanners. I had a response from a concerned person who asked how they could recover 600,000 images which were less than usable. Why did this situation arise? I do not know the answer, but I doubt that any Quality Assurance Policies, Procedures & Practices were in place to check the quality of the images produced at the point of scanning or soon after scanning. I was advised that, fortunately, the documents had not been destroyed. My advice was to rescan the 600,000 documents [a small job], only this time making certain that some Quality Assurance process was in place at the scanner.
The USA Standard ANSI/AIIM MS44-1988, titled “Recommended Practice for Quality Control of Image Scanners”, states the following, along with many other important criteria, in nine sections.
4.2 Why do I need Quality Control?
In the typical digital image management system, all incoming documents are scanned, indexing information is entered, and the original paper documents are eventually destroyed. In some systems the scanned image of the document may never be examined until it is needed. Strict quality control is required to assure that the images stored are of acceptable quality and are locatable by way of the index.
If a scanner is not operating properly, many useless images may be stored on the system. When the problem is discovered and corrected, the original documents will have to be scanned again. Procedures should be established so that any problems are discovered while the original documents are still available.
The quality control procedures described in this document allow the user to make sure that the system is performing today as well as it was when originally adjusted by the manufacturer. Used on a regular basis, these procedures can assure the user that the scanner will produce digital images of sufficient quality for their intended use.
5. Frequency of Testing
How often test runs should be made depends on how much scanning will take place, and the consequences of improperly scanning documents.
The best security is provided by doing a test run before and after each batch of documents scanned, where a batch is several documents scanned with the same settings. This may be ten documents or ten thousand documents, however the batch should be terminated at the end of a shift, and new test runs made, even if all documents have not been scanned. If the pre and post test runs are acceptable, the scanned documents will be acceptable.
If a scanner is known to be stable, the test runs after each batch can be eliminated. In this case it may be desirable to print out and examine the last document scanned to make sure it is acceptable. Testing only at the beginning and end of each scanning shift or workday may be acceptable in some operations, but this should be the minimum testing frequency.
Frequent testing is strongly recommended because it minimizes the risk of lost time or lost documents. Lack of frequent testing carries the risk of scanning documents which will be unusable and committing nonerasable storage to these documents. By the time a scanner problem is detected, thousands of documents may have been scanned, and will have to be scanned again. A worse risk is incurred if original documents are routinely destroyed after scanning.
End of information from ANSI/AIIM MS44-1988. **********************************
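The batch testing regime quoted above can be sketched as a simple decision rule. This is my own reading of the quoted sections, not code from the standard: the class, the field names and the status labels "accept", "rescan" and "hold" are all hypothetical, and "hold" here means keeping the paper originals until the batch has been verified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Batch:
    batch_id: str
    pre_test_ok: bool                    # test run before the batch
    post_test_ok: Optional[bool] = None  # None = post-batch test run skipped
    last_document_checked: bool = False  # last scanned image printed and examined


def batch_status(batch: Batch, scanner_known_stable: bool = False) -> str:
    """Apply the pre/post test-run rule to one batch of scanned documents."""
    if not batch.pre_test_ok or batch.post_test_ok is False:
        # A failed test run means the images cannot be trusted.
        return "rescan"
    if batch.post_test_ok is True:
        # Both test runs acceptable, so the batch is acceptable.
        return "accept"
    # Post-batch test skipped: only reasonable for a scanner known to be
    # stable, and then only if the last document was examined.
    if scanner_known_stable and batch.last_document_checked:
        return "accept"
    return "hold"


print(batch_status(Batch("2002-06-26-A", pre_test_ok=True, post_test_ok=True)))
print(batch_status(Batch("2002-06-26-B", pre_test_ok=True)))
```

The point of the rule is that originals are released for destruction only for batches that come back as "accept", so a scanner fault is caught while the paper is still available.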
How many organisations or Scanning Bureaus in Australia scan to the standards set out in ANSI/AIIM MS44-1988, titled “Recommended Practice for Quality Control of Image Scanners”, is questionable. If the number is in the hundreds I would be surprised. It is probably fewer than 100 across Australia.
Assuming that we are scanning to the standards, we now have the legislative ISSUE which others have mentioned. I note that the NSW, Victorian and NAA guidelines are to keep all long-term and archive material in paper format even after scanning. I have commented on this process previously in a posting to this and other Listservs titled “Standards for A4 & A3 size document scanning plus the long-term management of born electronic data” late in 2001.
Not only is this the advised situation in Western Australia, but the suggestion by Glenn Sanders that “You could scan in from paper, toss the paper in an archive box in date scanned order, and keep it somewhere far away and ridiculously cheap, just in case, for a few years till enough precedents have been set” is not acceptable as a process.
My understanding of the WA requirement, and I believe it may also be the case in other States and federally, is that files and their enclosed documents with either long-term or archival Retention and Disposal sentencing periods must be placed in a suitably identified file folder, with the enclosed documents in folio order. Day boxing of scanned documents, as suggested by Glenn Sanders and others, is, to my understanding, VERBOTEN! This situation may also apply to documents and files with short-term Retention and Disposal schedules.
Mark de Berg asked:
Where are the courts currently at, with regards to accepting scanned images as evidence?
I am aware of “The opinion issued by the Victorian Government solicitor” in respect to the scanning of documents. I personally have some difficulty with this advice, but who am I to question the Victorian Government solicitor? I would like to see the wording of the situation posed to him/her which resulted in this advice being given. A different set of questions or wording may have resulted in a different opinion on the issue. In any case, as Kathy Sinclair advises:
This Advice indicates that the paper originals of permanent public records CANNOT be destroyed after they have been scanned, despite the implications of the Electronic Transactions Act. The legal reasons for this are convoluted :-) but the Advice spells them out clearly if you are interested.
As the legislation on both best evidence and electronic transactions in other states and the Commonwealth may vary from that in Victoria, a request for an opinion from the Crown Solicitor in each jurisdiction may produce a different opinion from that of the Victorian Government solicitor. As stated above, how the request for advice is worded may also produce a different outcome for each variation in the wording.
What am I saying? Each situation posed will provide a potentially different outcome. No two lawyers ever have the same opinion on any given situation, so I do not think my views are far from reality in respect to the provision of opinions in the legal arena.
The WA Evidence Act 1906, recently revised by the Acts Amendment (Evidence) Bill 1999, states the following:
73A. Admissibility of reproductions (best evidence rule modified)
(1) A document that accurately reproduces the contents of another document is admissible in evidence before a court in the same circumstances, and for the same purposes, as that other document, whether or not that other document still exists.
(2) In determining whether a particular document accurately reproduces the contents of another, a court is not bound by the rules of evidence and — (a) may rely on its own knowledge of the nature and reliability of the processes by which the reproduction was made; (b) may make findings based on a certificate in the prescribed form signed by a person with knowledge and experience of the processes by which the reproduction was made; (c) may make findings based on a certificate in the prescribed form signed by a person who has compared the contents of both documents and found them to be identical; or (d) may act on any other basis it considers appropriate in the circumstances.
(3) This section applies to a reproduction made — (a) by an instantaneous process; (b) by a process in which the contents of a document are recorded by photographic, electronic, or other means, and the reproduction is subsequently produced from that record; (c) by a process prescribed for the purposes of this section; or (d) in any other way.
(4) If a reproduction is made by a process referred to in subsection (3)(c), the process shall be presumed to reproduce accurately the contents of the document reproduced unless the contrary is proved.
(5) If so requested by a party to the proceedings, a court shall give reasons for determining that a document is or is not an accurate reproduction.
(6) A person who signs a certificate for the purposes of this section knowing it to be false or misleading in any material particular commits an indictable offence and is liable on conviction to imprisonment for 7 years.
END OF SECTION 73A NEW BEST EVIDENCE RULE******
WA is also now in the final stages of passing the Electronic Transaction Bill 2001, which has across-the-board support and should be passed by the Upper House in the not too distant future. The Bill states:
10. Production of document
(1) If, under a law of this jurisdiction, a person is required to produce a document that is in the form of paper, an article or other material, that requirement is taken to have been met if the person produces, by electronic communication or otherwise, an electronic form of the document, where — (a) having regard to all the relevant circumstances at the time the document was produced, the method of generating the electronic form of the document provided a reliable means of assuring the maintenance of the integrity of the information contained in the document; (b) at the time the document was produced, it was reasonable to expect that the information contained in the electronic form of the document would be readily accessible then and for subsequent reference; and (c) the person to whom the document is required to be produced consents to the production of an electronic form of the document.
(2) If, under a law of this jurisdiction, a person is permitted to produce a document that is in the form of paper, an article or other material, then, instead of producing the document in that form, the person may produce, by electronic communication or otherwise, an electronic form of the document, where — (a) having regard to all the relevant circumstances at the time the document was produced, the method of generating the electronic form of the document provided a reliable means of assuring the maintenance of the integrity of the information contained in the document; (b) at the time the document was produced, it was reasonable to expect that the information contained in the electronic form of the document would be readily accessible then and for subsequent reference; and (c) the person to whom the document is permitted to be produced consents to the production of an electronic form of the document.
(3) For the purposes of this section, the integrity of information contained in a document is maintained if, and only if, the information has remained complete and unaltered, apart from — (a) the addition of any endorsement; or (b) any immaterial change, which arises in the normal course of communication, storage or display.
(4) This section does not affect the operation of any other law of this jurisdiction that makes provision for or in relation to requiring or permitting electronic forms of documents to be produced, in accordance with information technology requirements — (a) on a particular kind of data storage device; or (b) by a particular electronic communication.
Note: Section 12 sets out exemptions from this section.
END OF SECTION 10 PRODUCTION OF DOCUMENT***************
The updated rules of evidence and the WA Electronic Transaction Bill 2001 look pretty good to me, and would appear to allow for the scanning and then the destruction of documents and files, IF, and it is a big IF, the nature and reliability of the scanning processes can be verified, so that a court can act under Section 73A (2) (a) and “rely on its own knowledge of the nature and reliability of the processes by which the reproduction was made”. This is where our ANSI/AIIM Standard comes to the rescue for the scanning of documents and their legal admissibility.
No identified QA Process, and possibly no legitimacy for the scanned images produced. The scanning process is not the only process in question here. The whole set of Policy, Procedure & Practice issues, from the receipt of incoming correspondence to the short or long-term retention as a digital image, comes into question. If no such processes are in place or, more importantly, they are not followed to the letter on a day-to-day basis by all personnel, the best evidence rule may eat you alive.
Happy Scanning!
Laurie Varendorff ARMA
The Author
Laurie Varendorff, ARMA, a former RMAA Western Australia Branch president & national director, has been involved in records management and the micrographic industry for 37 years. Laurie has his own microfilm equipment sales & support organisation, Digital Microfilm Equipment (DME), and a records & information management (RIM) consulting & training business, The Varendorff Consultancy (TVC), located near Perth, Western Australia, & has tutored & written course material in recordkeeping & archival storage & preservation for Perth’s Edith Cowan University (ECU). Phone: +618 9286 3705; mobile: +61 417 094 147; email: Laurie Varendorff
The author, Laurie Varendorff, gives permission for the redistribution or republishing of this article by individuals and nonprofit professional organisations without cost, on the condition that he and the URL of the article are acknowledged at the start of the article when it is redistributed or republished.
SPECIAL NOTE: Use of this article by publishers, commercial, government, or educational organisations requires a financial agreement to be negotiated with Laurie as the copyright holder for this work.