Character encodings and databases
Andrew Hill
london.pm at welikegoats.com
Fri Jun 20 09:37:07 BST 2014
On Thu, 19 Jun 2014, William Blunn wrote:
> When you read it back, you will get a string "Z\xFCrich", which will compare
> different to "Z\xC3\xBCrich", and from what you describe your program will
> update the database again.
Indeed.
I have done what I should have done in the first place and replicated the
problem in a simple form rather than try and describe what the original
code was doing. So:
$ENV{NLS_LANG} = "AMERICAN_AMERICA.AL32UTF8";
$ENV{ORACLE_SID} = "blah";
$ENV{ORAPIPES} = "V2";
$ENV{ORACLE_HOME} = "/home/oracle/app/oracle/product/11.2.0/dbhome_1";
my $dbh = DBI->connect("dbi:Oracle:host=127.0.0.1;sid=blah", "blah",
"blah", {ora_charset => 'AL32UTF8'});
$dbh->{AutoCommit} = 0; $dbh->{RaiseError} = 1; $dbh->{PrintError} = 1;
$dbh->{FetchHashKeyName} = 'NAME_lc';
my $zurich = "Zürich";
$dbh->do("delete from foo");
my $sth = $dbh->prepare("INSERT INTO foo (bar) values (?)");
$sth->execute($zurich);
$dbh->commit;
$sth = $dbh->prepare("select bar from foo");
$sth->execute;
my$foo = ($sth->fetchrow_array)[0];
$sth->finish;
$dbh->disconnect;
print HexDump $foo;
print HexDump $zurich;
This outputs:
00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F
0123456789ABCDEF
00000000 5A FC 72 69 63 68 Z.rich
00 01 02 03 04 05 06 07 - 08 09 0A 0B 0C 0D 0E 0F
0123456789ABCDEF
00000000 5A C3 BC 72 69 63 68 Z..rich
And:
SQL> select dump(bar) from foo;
DUMP(BAR)
--------------------------------------------------------------------------------
Typ=1 Len=7: 90,195,188,114,105,99,104
So I know that the string is stored correctly in the table, accidentally
or otherwise. I know that the string is being stored correctly in $zurich,
as I have eliminated IO. It appears it is specifically the act of fetching
the row that causes the "wrong" data to be returned. What have I missed?
Cheers,
Andrew
More information about the london.pm
mailing list