Bug #1113

closed

oficonv creates illegal characters when converting from ISO_IR 192 to ISO_IR 101

Added by Marco Eichelberg over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Category: Library and Apps
Target version:
Start date: 2024-03-08
Due date:
% Done: 100%
Estimated time: 4:00 h
Module: oficonv
Operating System:
Compiler:

Description

When converting a string that contains Greek letters from Unicode (ISO_IR 192) to Latin-2 (ISO_IR 101), oficonv does not report an error for characters that cannot be converted. Instead, it writes the garbage byte sequence "bd\b4\2f\34" for each such character.

The problem can be demonstrated by converting the attached sample file to Latin-2:

dcmconv +C "ISO_IR 101" unicode_with_greek_chars.dcm - | dcmdump - --search PatientName

Apparently, Latin-3 (ISO_IR 109) and Latin-4 (ISO_IR 110) are also affected, while Latin-1 (ISO_IR 100) is not.
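For reference, the expected behaviour can be illustrated independently of oficonv. The following is only a minimal sketch that uses Python's built-in `iso8859_2` codec, not the DCMTK code; the helper name `unconvertible` and the sample name string are made up for this illustration. It lists the characters for which a strict Unicode to Latin-2 conversion should fail rather than emit bytes:

```python
def unconvertible(text, encoding="iso8859_2"):
    """Return the characters of text that have no representation in encoding.

    Hypothetical helper for illustration; it relies on Python's own codec
    tables, not on the oficonv mapping tables discussed in this issue.
    """
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)  # strict mode: raises if ch is not mappable
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


if __name__ == "__main__":
    # Made-up sample value mixing Greek letters with ASCII characters.
    sample = "Ελληνικά^Test"
    print(unconvertible(sample))
```

A correct converter should report an error (or apply a documented substitution policy) for exactly these characters.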

Reported 2024-03-07 by Fabian Günther, see https://forum.dcmtk.org/viewtopic.php?t=5367


Files

unicode_with_greek_chars.dcm (5.25 KB), Marco Eichelberg, 2024-03-08 13:57
check_iso8859_mapping_table.pl (22.4 KB), Perl script for checking a Unicode to ISO 8859 mapping table, Marco Eichelberg, 2024-04-07 13:10

Actions #1

Updated by Marco Eichelberg over 1 year ago

Apparently, this is caused by incorrect translation tables, in this case oficonv/datasrc/csmapper/ISO-8859/UCS%ISO-8859-2.src.
This is remarkable, because these tables come from the latest FreeBSD source, without any modification.

Actions #2

Updated by Marco Eichelberg over 1 year ago

The iconv mapping tables from Unicode to ISO-8859-2 and ISO-8859-3 contained many incorrect mappings, in which characters not available in the ISO character set were mapped to four-byte sequences that essentially contained garbage. This has now been fixed by removing all mappings to four-byte sequences that were not also present in the other ISO-8859 mapping tables.
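The filtering rule described above can be sketched roughly as follows. This is not the attached Perl script and does not parse the csmapper `.src` format; it assumes the tables have already been loaded into plain dictionaries from Unicode code point to target value. The function names and all sample entries (including the multi-byte values) are invented for illustration, and "not also present in the other tables" is interpreted here as "the same source-to-target pair does not occur in any reference table":

```python
def suspicious_mappings(table, reference_tables):
    """Return entries of table whose target is wider than one byte and whose
    (source, target) pair does not occur in any reference table.

    Hypothetical helper: tables are dicts {unicode_code_point: target_value}.
    """
    suspicious = {}
    for ucs, dst in table.items():
        if dst <= 0xFF:
            continue  # ordinary single-byte ISO 8859 target
        shared = any(ref.get(ucs) == dst for ref in reference_tables)
        if not shared:
            suspicious[ucs] = dst
    return suspicious


def cleaned(table, reference_tables):
    """Return a copy of table without the suspicious entries."""
    bad = suspicious_mappings(table, reference_tables)
    return {ucs: dst for ucs, dst in table.items() if ucs not in bad}


if __name__ == "__main__":
    # Invented sample data; the multi-byte values are placeholders only.
    latin2 = {0x0041: 0x41, 0x0104: 0xA1, 0x03B1: 0xBDB42F34, 0x2015: 0x11223344}
    latin1 = {0x0041: 0x41, 0x2015: 0x11223344}
    for ucs, dst in suspicious_mappings(latin2, [latin1]).items():
        print(f"U+{ucs:04X} -> 0x{dst:X}")
```

In this sketch only the entry for U+03B1 is flagged, because the other multi-byte entry also appears in the reference table.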

These issues are also present in the original FreeBSD source from which oficonv has been ported. They have been reported to FreeBSD: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=278229

I have written a small (primitive) Perl script for visualizing the mapping tables. This is attached to this issue and may be useful for similar reports in the future.

Closed by commit #5d7495d8c.
