Compressing huge images (Dealing with large element values)

Forum post by Marco Eichelberg, 2008-05-06:

Compressing large datasets (multiframe images) without keeping everything in memory is currently rather difficult to do with DCMTK.

The good news is that it is entirely possible to load a very large file. All attribute values larger than 4K remain in the file until the attribute value is accessed. At that point, however, the complete attribute value is loaded into main memory, which in the case of an uncompressed multiframe image means the complete blob of pixel data. If you work with the latest CVS snapshot of DCMTK, there is a new method DcmElement::getPartialValue() that has been designed to circumvent this problem. This method allows you to load a part of a (large) attribute value without allocating memory for the whole value. It does not yet exist in the DCMTK 3.5.4 release.

A further problem is that the current DcmCodec classes have not (yet) been re-engineered to use this new method. They force the complete pixel data blob into main memory and store the compressed result of all frames in main memory as well. It should not be too difficult to change the codecs to use DcmElement::getPartialValue(), which would address the first half of the problem (reading the uncompressed input), but this has not yet been done.
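The idea behind partial access can be illustrated without DCMTK. The following is a minimal sketch in standard C++, not DCMTK code: readPartialValue() and readFrame() are hypothetical helpers, and the byte position of the pixel data inside the file (pixelDataStart) and the size of one frame are assumed to be known already. In DCMTK itself the corresponding call would be DcmElement::getPartialValue(); check its documentation for the exact parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not a DCMTK function): reads numBytes bytes starting
// at `offset` within a value that begins at byte `valueStart` of the file.
// Only numBytes bytes are allocated, regardless of the size of the value.
// Returns false if the file cannot be opened or the range is not fully readable.
bool readPartialValue(const std::string& path,
                      std::uint64_t valueStart,
                      std::uint64_t offset,
                      std::size_t numBytes,
                      std::vector<char>& out)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.seekg(static_cast<std::streamoff>(valueStart + offset));
    if (!in) return false;
    out.resize(numBytes);
    in.read(out.data(), static_cast<std::streamsize>(numBytes));
    return static_cast<std::size_t>(in.gcount()) == numBytes;
}

// Hypothetical helper: loads a single frame of an uncompressed multiframe
// image, assuming all frames have the same size and are stored back to back.
bool readFrame(const std::string& path,
               std::uint64_t pixelDataStart,
               std::size_t frameSize,
               std::size_t frameIndex,
               std::vector<char>& frame)
{
    assert(frameSize > 0);
    const std::uint64_t offset =
        static_cast<std::uint64_t>(frameIndex) * frameSize;
    return readPartialValue(path, pixelDataStart, offset, frameSize, frame);
}
```

A codec that worked this way would need memory for one frame at a time instead of for the complete pixel data.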

As for the second half of the problem, class DcmFileFormat does not permit you to write a partial DICOM file and append to it later. The code expects the complete data structure to be established in memory. This is needed, for example, for group length calculations, trailing dataset padding, etc., and is not easily changed. The good news is that with the current CVS snapshot, DcmFileFormat::saveFile() and the underlying DcmObject::write() methods are able to save large files without forcing everything into main memory: if an attribute value still remains in a source file, the new DcmElement::getPartialValue() is used to copy it block-wise into the new file.
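The block-wise copy mentioned above can be sketched in standard C++ as follows. This is an illustration of the technique only, not the DCMTK implementation: copyBlockwise() and demoCopy() are hypothetical names, and the 64 KB default block size is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>   // only needed for the usage example below
#include <string>
#include <vector>

// Hypothetical helper (not a DCMTK function): copies `length` bytes starting
// at `srcOffset` from `in` to `out` through a fixed-size buffer, so memory
// use is bounded by blockSize rather than by the size of the value.
// Returns true only if all `length` bytes were copied.
bool copyBlockwise(std::istream& in,
                   std::uint64_t srcOffset,
                   std::uint64_t length,
                   std::ostream& out,
                   std::size_t blockSize = 64 * 1024)
{
    assert(blockSize > 0);
    std::vector<char> buffer(blockSize);
    in.seekg(static_cast<std::streamoff>(srcOffset));
    std::uint64_t remaining = length;
    while (remaining > 0 && in) {
        const std::size_t chunk = static_cast<std::size_t>(
            std::min<std::uint64_t>(remaining, blockSize));
        in.read(buffer.data(), static_cast<std::streamsize>(chunk));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;              // nothing more to read
        out.write(buffer.data(), got);
        if (!out) return false;           // write error
        remaining -= static_cast<std::uint64_t>(got);
    }
    return remaining == 0;
}

// Usage example: copy the 6 bytes that follow a 4-byte "header",
// using deliberately small 4-byte blocks.
std::string demoCopy()
{
    std::istringstream source("HEADpixels-tail");
    std::ostringstream target;
    copyBlockwise(source, 4, 6, target, 4);
    return target.str();
}
```

The source is never read as a whole; at most blockSize bytes are held in memory at any time.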

What you would need in addition is the capability to store the compressed pixel data fragments in temporary files, so that a call to DcmFileFormat::saveFile() (or the network send code) assembles everything into one proper DICOM file without having everything in memory. This would require some new code that manages reference counting: the temporary files should be deleted once no reference is left, but not earlier. You would also have to re-work the DcmCodec classes to use this new mechanism and store compressed fragments in temporary files instead of storing them in memory. All of the above is somewhere on the wish list for future DCMTK extensions (along with the capability to decompress frame by frame), but not implemented.
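One possible way to get the reference counting described above is to tie each temporary file to an object managed by std::shared_ptr. The following is a minimal sketch under that assumption, in standard C++ only; TempFragmentFile, FragmentRef, fileExists() and assembleFragments() are hypothetical names and nothing like them exists in DCMTK. The caller supplies the file name here; real code would have to generate unique temporary names and handle write errors.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <memory>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical class: owns one temporary file holding a compressed fragment.
// The file is removed when the object is destroyed.
class TempFragmentFile {
public:
    TempFragmentFile(const std::string& path, const std::vector<char>& data)
        : path_(path), size_(data.size())
    {
        std::ofstream out(path_, std::ios::binary | std::ios::trunc);
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
    }
    ~TempFragmentFile() { std::remove(path_.c_str()); }

    TempFragmentFile(const TempFragmentFile&) = delete;
    TempFragmentFile& operator=(const TempFragmentFile&) = delete;

    const std::string& path() const { return path_; }
    std::size_t size() const { return size_; }

private:
    std::string path_;
    std::size_t size_;
};

// shared_ptr does the reference counting: the destructor (and thus the file
// removal) runs when the last FragmentRef goes away, but not earlier.
using FragmentRef = std::shared_ptr<TempFragmentFile>;

// Hypothetical helper: true if the file can currently be opened for reading.
bool fileExists(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return in.good();
}

// Hypothetical helper: appends all fragments to `out` in order, streaming
// each file through the stream buffers instead of loading it as a whole.
bool assembleFragments(const std::vector<FragmentRef>& fragments,
                       std::ostream& out)
{
    for (const FragmentRef& fragment : fragments) {
        if (fragment->size() == 0) continue;  // nothing to copy
        std::ifstream in(fragment->path(), std::ios::binary);
        if (!in) return false;
        out << in.rdbuf();
        if (!out) return false;
    }
    return true;
}
```

With this design, a codec could hand out FragmentRef objects for each compressed frame, and the files would stay on disk until both the codec and the code that writes or sends the dataset have released them.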