Feature #1077

open

Improve handling of extended character sets in "unexpected" places

Added by Marco Eichelberg about 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: Library and Apps
Target version: -
Start date: 2023-06-07
Due date:
% Done: 0%
Estimated time:
Module: dcmdata
Operating System:
Compiler:

Description

Some devices appear to permit free-text user input in fields that are stored with the DICOM value representation CS (Code String).
An example has surfaced where an accented ISO 8859-1 character is present in a CS attribute.

Currently, DCMTK silently ignores such fields when performing character set conversion,
and dcm2xml produces an XML file that is technically invalid because it contains byte sequences
that are not valid UTF-8. Although the root cause is clearly an invalid DICOM file, it might be possible
to improve the handling of such errors in DCMTK:

  • Attributes whose VR is not affected by Specific Character Set (0008,0005), such as CS, should cause a warning during character set conversion if any extended characters (byte values > 127) are present.
  • We should introduce an option that causes character set conversion to fail in this case.
  • We should introduce a command line option in dcm2xml (based on the option mentioned in the previous bullet point) that causes the XML generation to fail in this case.
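
The check proposed in the bullet points above could look roughly like the following standalone sketch. It does not use the DCMTK API; the names ExtendedCharPolicy, findExtendedByte, usesSpecificCharacterSet and checkValue are hypothetical and only illustrate the warn-or-fail idea. The list of VRs affected by Specific Character Set (SH, LO, ST, LT, PN, UC, UT) is taken from the DICOM standard.

```cpp
#include <cstddef>
#include <string>

// Hypothetical policy switch; not part of the DCMTK API.
enum class ExtendedCharPolicy { Warn, Fail };

// Returns the offset of the first byte with a value > 127,
// or std::string::npos if the value is pure 7-bit ASCII.
std::size_t findExtendedByte(const std::string& value)
{
    for (std::size_t i = 0; i < value.size(); ++i) {
        if (static_cast<unsigned char>(value[i]) > 127) {
            return i;
        }
    }
    return std::string::npos;
}

// True for the VRs whose values are affected by
// Specific Character Set (0008,0005).
bool usesSpecificCharacterSet(const std::string& vr)
{
    return vr == "SH" || vr == "LO" || vr == "ST" || vr == "LT" ||
           vr == "PN" || vr == "UC" || vr == "UT";
}

// Result of checking one value: 'ok' is false only when the
// policy is Fail and an extended character was found.
struct CheckResult {
    bool ok;
    std::string message;  // non-empty if something should be reported
};

// Hypothetical helper: checks a single string value of the given VR.
CheckResult checkValue(const std::string& vr,
                       const std::string& value,
                       ExtendedCharPolicy policy)
{
    if (usesSpecificCharacterSet(vr)) {
        return {true, ""};  // extended characters are legitimate here
    }
    const std::size_t pos = findExtendedByte(value);
    if (pos == std::string::npos) {
        return {true, ""};
    }
    const std::string msg = "extended character at offset " +
        std::to_string(pos) + " in value with VR " + vr;
    return {policy == ExtendedCharPolicy::Warn, msg};
}
```

With ExtendedCharPolicy::Warn the caller would log the message and continue; with ExtendedCharPolicy::Fail the conversion (and, in dcm2xml, the XML generation) would stop.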

Opinions in the team are split on whether we should also offer an option that performs character set conversion on such VRs. This might be useful for dcm2xml, but possibly harmful elsewhere.
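
For illustration only, the effect of such an option on the reported example could be sketched as follows, assuming the stray bytes really are ISO 8859-1. The function latin1ToUtf8 is hypothetical and standalone; an actual implementation would presumably go through DCMTK's existing character set conversion rather than a hand-written mapping.

```cpp
#include <string>

// Hypothetical helper: converts a byte string that is assumed to be
// ISO 8859-1 into UTF-8. Each byte value equals its Unicode code point,
// so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string latin1ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (char ch : in) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));  // ASCII is unchanged
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

The assumption about the source encoding is exactly what makes this risky outside of dcm2xml: a CS value carries no reliable indication of which character set the device actually used.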
