Subject: 8.1: import of 8.0 dump fails with UTF-8 error
Thomas Mueller

2005-12-11, 8:24 pm

Hello,

I did a pg_dumpall of all my Pg 8.0.3 databases, removed 8.0, installed
8.1 and tried to import the dump. One table in one database failed with:

ERROR: invalid UTF-8 byte sequence detected near byte 0x83
CONTEXT: COPY pwd_name, line 22428, column name: "t.tonnement"

So I exported that database with 8.0 as Inserts to a text file and tried
to fix it using iconv, but that fails as well:

# iconv -f UTF-8 -t UTF-8 dump.sql > dump-fixed.sql
iconv: illegal input sequence at position 2588882

How can I fix the sql script to import it?
I have Debian Linux 3.1.


Thanks a lot,
Thomas
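
Note that iconv without -c stops at the first invalid sequence, which is why the UTF-8 to UTF-8 run above aborts instead of fixing anything. To see which bytes are actually at fault, one option is to scan the dump for byte runs that do not decode as UTF-8. What follows is a minimal Python sketch, not something posted in this thread: the function name find_invalid_utf8 and its limit parameter are made up for illustration, offsets are 0-based, and Python's notion of valid UTF-8 may differ in corner cases from what iconv or the 8.1 server checks.

```python
def find_invalid_utf8(data: bytes, limit: int = 10):
    """Return up to `limit` (offset, offending_bytes) pairs for byte runs
    in `data` that Python's strict UTF-8 decoder rejects."""
    problems = []
    pos = 0
    while pos < len(data) and len(problems) < limit:
        try:
            # Decode the remainder; success means no further problems.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as exc:
            # exc.start / exc.end are relative to the slice we decoded.
            start = pos + exc.start
            end = pos + exc.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning just past the rejected bytes
    return problems


# Hypothetical usage on a dump small enough to read at once:
#   with open("dump.sql", "rb") as f:
#       for offset, raw in find_invalid_utf8(f.read()):
#           print(offset, raw)
```

The slice copy on every retry keeps the sketch short; for a very large dump with many bad sequences it would be better to read and scan the file in pieces.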



Bruce Momjian

2005-12-12, 3:24 am

Thomas Mueller wrote:
> Hello,
>
> I did a pg_dumpall of all my Pg 8.0.3 databases, removed 8.0, installed
> 8.1 and tried to import the dump. One table in one database failed with:
>
> ERROR: invalid UTF-8 byte sequence detected near byte 0x83
> CONTEXT: COPY pwd_name, line 22428, column name: "t.tonnement"
>
> So I exported that database with 8.0 as Inserts to a text file and tried
> to fix it using iconv, but that fails as well:
>
> # iconv -f UTF-8 -t UTF-8 dump.sql > dump-fixed.sql
> iconv: illegal input sequence at position 2588882
>
> How can I fix the sql script to import it?
> I have Debian Linux 3.1.


We have updated the 8.1.0 release notes to mention a fix:

Some users are having problems loading UTF-8 data into 8.1.X.
This is because previous versions allowed invalid UTF-8 byte
sequences to be entered into the database, and this release
properly accepts only valid UTF-8 sequences. One way to correct a
dumpfile is to run the command "iconv -c -f UTF-8 -t UTF-8 -o
cleanfile.sql dumpfile.sql". The -c option removes invalid character
sequences. A diff of the two files will show the sequences that are
invalid. iconv reads the entire input file into memory, so it might
be necessary to use split to break up the dump into multiple smaller
files for processing.

--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
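
If the dump is too large for iconv to hold in memory, the same kind of cleanup can be done in a streaming fashion. The sketch below is an assumption-laden alternative, not the method from the release notes: it uses Python's incremental UTF-8 decoder with errors="ignore", which drops undecodable bytes much as iconv -c does, though the two tools are not guaranteed to drop exactly the same bytes. The function name strip_invalid_utf8 and the chunk_size parameter are hypothetical.

```python
import codecs


def strip_invalid_utf8(src, dst, chunk_size=1 << 20):
    """Copy binary stream `src` to `dst`, dropping bytes that are not
    valid UTF-8. Returns the number of bytes dropped."""
    # The incremental decoder buffers a multi-byte character that is
    # split across two chunks, so chunk boundaries are safe.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    bytes_in = 0
    bytes_out = 0
    while True:
        chunk = src.read(chunk_size)
        at_end = not chunk
        text = decoder.decode(chunk, final=at_end)
        cleaned = text.encode("utf-8")
        dst.write(cleaned)
        bytes_in += len(chunk)
        bytes_out += len(cleaned)
        if at_end:
            break
    return bytes_in - bytes_out


# Hypothetical usage with the file names from the release note:
#   with open("dumpfile.sql", "rb") as src, open("cleanfile.sql", "wb") as dst:
#       print(strip_invalid_utf8(src, dst), "bytes dropped")
```

As with iconv -c, the dropped bytes are simply lost, so it is worth diffing the cleaned file against the original before importing to see which rows were changed.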
