Bug #1004

dcm2json produces invalid UTF-8 output for some incorrect DICOM files

Added by Marco Eichelberg about 4 years ago. Updated almost 4 years ago.

Status: Closed
Priority: Normal
Category: Library and Apps
Target version:
Start date: 2021-08-27
Due date:
% Done: 100%
Estimated time: 1:00 h
Module: dcmdata
Operating System:
Compiler:

Description

The JSON specifications require all JSON documents to be encoded in UTF-8. dcm2json therefore converts DICOM datasets to UTF-8 before writing them to JSON.
Currently, however, DICOM files that do not contain (0008,0005) SpecificCharacterSet but do contain extended characters are simply passed through to JSON, possibly resulting in invalid UTF-8.

dcm2json should check and report this case, just like dcm2xml or dcmconv +U8.
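The check requested here can be illustrated with a minimal sketch. This is not the DCMTK implementation (which is C++); the function names and the representation of a dataset as a mapping from tag to raw value bytes are assumptions made only for illustration. The idea is that when no SpecificCharacterSet is declared, any byte above 0x7F is an "extended character" that cannot safely be passed through to JSON.

```python
def has_extended_characters(raw_value):
    """Return True if any byte lies outside the 7-bit ASCII range."""
    return any(byte > 0x7F for byte in raw_value)


def is_valid_utf8(raw_value):
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        raw_value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_unconvertible_values(values):
    """List the tags whose values contain extended characters.

    `values` is a hypothetical mapping of tag string -> raw bytes for a
    dataset that declares no (0008,0005) SpecificCharacterSet.
    """
    return [tag for tag, raw in values.items() if has_extended_characters(raw)]


if __name__ == "__main__":
    dataset = {
        "00100010": b"M\xfcller",  # Latin-1 encoded name, no charset declared
        "00100020": b"12345",      # plain ASCII
    }
    for tag in find_unconvertible_values(dataset):
        print("extended characters without SpecificCharacterSet in", tag)
```

A converter following this approach would report the affected elements instead of writing the raw bytes, since a value such as the Latin-1 name above is not valid UTF-8 when passed through unchanged.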

Furthermore, DICOM files not containing (0008,0005) SpecificCharacterSet (or containing the value "ISO_IR 6") should be written to JSON without setting SpecificCharacterSet to ISO_IR 192.
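The rule for the character set written to JSON can be sketched as a small decision function. This is a hedged illustration, not the code from the fix; the function name is hypothetical, and stripping padding spaces from the value before comparison is an assumption.

```python
def json_specific_character_set(source_charset):
    """Choose the SpecificCharacterSet value for the JSON output.

    `source_charset` is the value of (0008,0005) in the source file,
    or None if the element is absent.
    """
    # Absent or "ISO_IR 6": the data is plain ASCII, which is already
    # valid UTF-8, so the original state is kept unchanged.
    if source_charset is None or source_charset.strip() == "ISO_IR 6":
        return source_charset
    # Any other character set is converted to UTF-8 and labelled as such.
    return "ISO_IR 192"
```

With this rule, a file without (0008,0005) stays without it in the JSON output, while a file declaring, for example, ISO_IR 100 is converted and labelled ISO_IR 192.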

Reported 2021-08-26 by Mathieu Malaterre.


Files

badUnc.dcm (513 KB) Example file containing extended characters but no SpecificCharacterSet (Marco Eichelberg, 2021-08-27 13:00)

#1

Updated by Marco Eichelberg almost 4 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100
  • Estimated time set to 1:00 h

Closed by commit #92da003ff.
