Feature #1073

open

Character set conversion should warn or reject in certain cases

Added by Marco Eichelberg over 2 years ago. Updated over 2 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Library and Apps
Target version:
-
Start date:
2023-04-14
Due date:
% Done:
0%

Estimated time:
Module:
dcmdata
Operating System:
Compiler:

Description

In certain cases, a character set conversion in DCMTK (e.g. using the dcmconv tool) does not convert all strings to the new character set:
  • the dataset is in implicit VR and an attribute containing an extended character set string is not in the data dictionary
  • the dataset is in implicit VR and the attribute containing an extended character set string is contained in a defined-length sequence that is not in the data dictionary
  • the dataset is in explicit VR and an attribute containing an extended character set string is encoded as UN
  • the dataset is in explicit VR and the attribute containing an extended character set string is contained in a defined-length sequence that is encoded as UN

In all of these cases, the result is a dataset in which some elements use the "new" character set while others remain encoded in the "old" one.
When another application that has an updated data dictionary, or that is configured to convert UN to the proper VR, reads and processes the file, it will encounter inconsistent character set encoding.
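
To illustrate how the mixed encoding arises, here is a minimal self-contained sketch. It is a toy model, not DCMTK code: `Element`, `isCharsetAffected`, `convertCharset` and `collectCharsets` are hypothetical names, and the "conversion" merely relabels the character set of each value instead of transcoding bytes. The assumption is that the converter only touches elements it has parsed with a text VR and treats UN values as opaque.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of a DICOM element; not a DCMTK class.
struct Element {
    std::string vr;              // VR as seen by the parser, e.g. "PN", "SQ", "UN"
    std::string charset;         // character set of the value bytes ("" if not text)
    std::vector<Element> items;  // nested content; a parser only sees it when vr == "SQ"
};

// VRs whose values depend on Specific Character Set (0008,0005).
bool isCharsetAffected(const std::string &vr) {
    static const std::set<std::string> vrs = {"LO", "LT", "PN", "SH", "ST", "UC", "UT"};
    return vrs.count(vr) > 0;
}

// Toy conversion: relabel every element recognised as text and descend into
// sequences. UN values are opaque, so they and anything inside them are skipped.
void convertCharset(std::vector<Element> &dataset, const std::string &target) {
    for (Element &e : dataset) {
        if (isCharsetAffected(e.vr)) {
            e.charset = target;
        } else if (e.vr == "SQ") {
            convertCharset(e.items, target);
        }
    }
}

// Collect every character set present, including inside UN content, which is
// what a reader with a newer dictionary (or UN-to-VR conversion) would see.
void collectCharsets(const std::vector<Element> &dataset, std::set<std::string> &out) {
    for (const Element &e : dataset) {
        if (!e.charset.empty()) {
            out.insert(e.charset);
        }
        collectCharsets(e.items, out);
    }
}
```

With a dataset holding a PN element, a UN element and a UN-encoded sequence containing an LO item, all in ISO_IR 100, converting to ISO_IR 192 relabels only the PN element, so `collectCharsets` reports both character sets.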

It would be desirable that
  • a configuration flag is introduced that causes the conversion to be rejected in these cases, returning an error code and leaving the dataset unmodified
  • with the current behaviour (no error), a warning is printed to the logger in these cases
  • dcmconv gets new command line options that allow the user to select between these two policies
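
The two policies requested above could look roughly like the following sketch. All names here (`UnconvertiblePolicy`, `Attribute`, `ConversionResult`, `convertWithPolicy`) are hypothetical and do not exist in DCMTK. The example works on a flat list of attributes, treats any attribute with VR "UN" as unconvertible, and only relabels the character set instead of transcoding bytes. Nested sequences and the real logger are left out.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical policy flag; not an existing DCMTK option.
enum class UnconvertiblePolicy { Warn, Reject };

// Hypothetical flat attribute model; not a DCMTK class.
struct Attribute {
    std::string tag;      // e.g. "(0010,0010)"
    std::string vr;       // VR as seen by the parser
    std::string charset;  // character set of the value ("" if not text)
};

struct ConversionResult {
    bool ok;       // false if the conversion was rejected
    int warnings;  // number of attributes left in the old character set
};

ConversionResult convertWithPolicy(std::vector<Attribute> &dataset,
                                   const std::string &target,
                                   UnconvertiblePolicy policy) {
    // First pass: count attributes that cannot be converted, without modifying anything.
    int skipped = 0;
    for (const Attribute &a : dataset) {
        if (a.vr == "UN") {
            ++skipped;
        }
    }

    // Reject policy: report an error and leave the dataset unmodified.
    if (skipped > 0 && policy == UnconvertiblePolicy::Reject) {
        return {false, 0};
    }

    // Warn policy (current behaviour plus a warning): convert what can be converted.
    for (Attribute &a : dataset) {
        if (a.vr == "UN") {
            std::cerr << "W: " << a.tag << " is encoded as UN and was not converted\n";
        } else if (!a.charset.empty()) {
            a.charset = target;
        }
    }
    return {true, skipped};
}
```

Scanning before modifying is what allows the rejecting policy to guarantee an unmodified dataset. A command line tool could then map two options onto the two enum values.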

Issue reported 2023-04-13 by Mathieu Maleterre.


Related issues 1 (1 open, 0 closed)

Blocked by DCMTK - Feature #870: Introduce separate class for VR=UN (New, 2019-02-08)

Actions #1

Updated by Marco Eichelberg over 2 years ago

One more case is missing from the list above:
  • the dataset is in explicit VR and an attribute containing an extended character set string is coded in a newly introduced VR unknown to the toolkit
Actions #2

Updated by Marco Eichelberg over 2 years ago

  • Related to Feature #870: Introduce separate class for VR=UN added
Actions #3

Updated by Marco Eichelberg over 2 years ago

  • Related to deleted (Feature #870: Introduce separate class for VR=UN)
Actions #4

Updated by Marco Eichelberg over 2 years ago

  • Blocked by Feature #870: Introduce separate class for VR=UN added