Feature #1073
Character set conversion should warn or reject in certain cases
Status: New
Priority: Normal
Assignee: -
Category: Library and Apps
Target version: -
Start date: 2023-04-14
Due date:
% Done: 0%
Estimated time:
Module: dcmdata
Operating System:
Compiler:
Description
There are certain cases in which a character set conversion in DCMTK (e.g. using the dcmconv tool) will not convert all strings to the new character set:
- the dataset is in implicit VR and an attribute containing an extended character set string is not in the data dictionary
- the dataset is in implicit VR and the attribute containing an extended character set string is contained in a sequence with defined length that is not in the data dictionary
- the dataset is in explicit VR and an attribute containing an extended character set string is coded as UN
- the dataset is in explicit VR and the attribute containing an extended character set string is contained in a sequence with defined length that is encoded as UN
In all cases the result is a dataset where some elements use the "new" character set and some elements remain encoded with the "old" character set.
When another application that has an updated data dictionary or is configured to convert UN to the proper VR reads and processes the file, it will encounter inconsistent character set encoding.
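The failure mode can be illustrated with a minimal, self-contained sketch. It does not use the DCMTK API: `Element`, `isCharsetAffectedVR` and `convertCharset` are hypothetical names, and the dataset is reduced to a flat list of elements. It only assumes that the converter touches elements whose VR is known to be affected by Specific Character Set (0008,0005) and leaves UN elements alone, as described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of a data element (NOT a DCMTK class).
struct Element {
    std::string tag;      // e.g. "(0010,0010)"
    std::string vr;       // "PN", "LO", "UN", ...
    std::string charset;  // character set the value is currently encoded in
};

// VRs whose values depend on Specific Character Set (0008,0005).
bool isCharsetAffectedVR(const std::string& vr) {
    return vr == "SH" || vr == "LO" || vr == "ST" || vr == "LT" ||
           vr == "PN" || vr == "UT" || vr == "UC";
}

// Converts every element with a known text VR to the target character set.
// Elements with VR UN are skipped, because the converter cannot tell that
// they contain text. Returns the number of skipped UN elements.
std::size_t convertCharset(std::vector<Element>& dataset,
                           const std::string& target) {
    std::size_t skipped = 0;
    for (auto& e : dataset) {
        if (isCharsetAffectedVR(e.vr)) {
            e.charset = target;   // stands in for the real re-encoding
        } else if (e.vr == "UN") {
            ++skipped;            // value keeps the "old" character set
        }
    }
    return skipped;
}
```

After converting a dataset that holds one PN element and one UN element, the PN element carries the new character set while the UN element still carries the old one. Note that the sketch counts every UN element as potentially affected, although a real UN element may not contain text at all.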
Proposed changes:
- introduce a configuration flag that causes the conversion to be rejected in these cases, returning an error code and leaving the dataset unmodified
- keep the current behaviour (no error) as the alternative policy, but print a warning to the logger in these cases
- add new command line options to dcmconv that allow the user to select between these two policies
Issue reported 2023-04-13 by Mathieu Malaterre <mathieu.malaterre@gmail.com>.