Home > Archive > MySQL ODBC Connector > September 2005 > Another LOAD Infile Problem









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author Another LOAD Infile Problem
Jason Ferguson

2005-09-27, 3:23 am

------ =_Part_7451_21520236
.1127789882464
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Many thanks for the earlier response to why LOAD DATA INFILE wasnt working
for me. However, another problem has appeared.

In the file I am reading, 2 of the fields are SUPPOSED to be float values.
However, in several places, they are set to "UNKNOWN". This seems to cause
LOAD to abort.

Is there a way for me to tell it to ignore this problem and just use the
default value for the column?

Jason

------ =_Part_7451_21520236
.1127789882464--
Jasper Bryant-Greene

2005-09-27, 3:23 am

Jason Ferguson wrote:
> Many thanks for the earlier response to why LOAD DATA INFILE wasnt working
> for me. However, another problem has appeared.
>
> In the file I am reading, 2 of the fields are SUPPOSED to be float values.
> However, in several places, they are set to "UNKNOWN". This seems to cause
> LOAD to abort.
>
> Is there a way for me to tell it to ignore this problem and just use the
> default value for the column?


How about just replace occurrences of the string "UNKNOWN" in the
original file with NULL (the logical equivalent) or 0 (if you're using
NOT NULL columns) before doing LOAD DATA INFILE?

--
Jasper Bryant-Greene
Freelance web developer
http://jasper.bryant-greene.name/

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw

Jason Ferguson

2005-09-27, 3:23 am

------ =_Part_7467_24933093
.1127790445344
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

The data is split into about 60 files, average file size of 5 MB (varying
from 1 to 10 MB). Since there are many files, I'm trying to minimize the
required work (if there was just one consolidated file, no problem).

Jason

On 9/26/05, Jasper Bryant-Greene <jasper@bryant-greene.name> wrote:
>
> Jason Ferguson wrote:
> working
> values.
> cause
e[color=darkred]
>
> How about just replace occurrences of the string "UNKNOWN" in the
> original file with NULL (the logical equivalent) or 0 (if you're using
> NOT NULL columns) before doing LOAD DATA INFILE?
>
> --
> Jasper Bryant-Greene
> Freelance web developer
> http://jasper.bryant-greene.name/
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql? unsub...br /> n@gmail.=

com
>
>


------ =_Part_7467_24933093
.1127790445344--
Robert L Cochran

2005-09-27, 3:23 am

You'll have to edit your input file. There will always be instances
where some field is quirky and you need to fix it/them/entire rows.
Don't expect the input file to be perfect.

I'd also suggest that you have a test database on a test machine that is
devoted entirely to getting your tables set up correctly. It saves a lot
of stress by giving you a platform to experiment on.

Bob Cochran



Jasper Bryant-Greene wrote:

> Jason Ferguson wrote:
>
>
>
> How about just replace occurrences of the string "UNKNOWN" in the
> original file with NULL (the logical equivalent) or 0 (if you're using
> NOT NULL columns) before doing LOAD DATA INFILE?
>



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw

Robert L Cochran

2005-09-27, 3:23 am

Then you are in for quite a lot of editing work. I've done it a lot
myself. Don't expect your project to be easy. Look for automated ways to
edit the data according to your needs and the actual table structure.

Bob Cochran

Jason Ferguson wrote:

>The data is split into about 60 files, average file size of 5 MB (varying
>from 1 to 10 MB). Since there are many files, I'm trying to minimize the
>required work (if there was just one consolidated file, no problem).
>
>Jason
>
>On 9/26/05, Jasper Bryant-Greene <jasper@bryant-greene.name> wrote:
>
>
>
>
>



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw

Rory McKinley

2005-09-27, 3:23 am

Jason Ferguson wrote:
> The data is split into about 60 files, average file size of 5 MB (varying
> from 1 to 10 MB). Since there are many files, I'm trying to minimize the
> required work (if there was just one consolidated file, no problem).
>
> Jason

<snippety-snip>

Hi Jason

If it's not too late (aren't timezones wonderful?) ;).

Have you considered using an interim table into which you load your file
in its entirety? Load all fields and have each field set to something
like CHAR or VARCHAR big enough to accommodate the fields in the file.
Then just pick the columns that you are interested in:

Then you can do a :

INSERT INTO final_table (col_1, col_2..col_n)
SELECT col_1, IF(col_3="unknown", 0, col_3) AS col_2
FROM interim_table
.....

Also means that you don't have to necessarily upgrade to 5.x (as per
your previous problem) - unless you want to, of course ;)

Regards

Rory


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw

Jigal van Hemert

2005-09-27, 3:23 am

Jason Ferguson wrote:
> The data is split into about 60 files, average file size of 5 MB (varying
> from 1 to 10 MB). Since there are many files, I'm trying to minimize the
> required work (if there was just one consolidated file, no problem).


The work can be automated easily with the right tools ;-)

If you have for example perl installed on your system and the files all
have the '.dat' extension, you can use:

perl -pi -e 's/unknown/0/gi' *.dat

All instances of 'unknown' (without the quotes of course and case
insensitive) will be replaced with '0' in all of the .dat files; use
different wildcard constructions if your file have other names.

Regards, Jigal.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql? unsub...sie.nctu.edu.tw

SGreen@unimin.com

2005-09-27, 9:23 am

--=_alternative 00463A7D85257089_=
Content-Type: text/plain; charset="US-ASCII"

Jason Ferguson <fergusonjason@gmail.com> wrote on 09/26/2005 10:58:02 PM:

> Many thanks for the earlier response to why LOAD DATA INFILE wasnt

working
> for me. However, another problem has appeared.
>
> In the file I am reading, 2 of the fields are SUPPOSED to be float

values.
> However, in several places, they are set to "UNKNOWN". This seems to

cause

> LOAD to abort.
>
> Is there a way for me to tell it to ignore this problem and just use the
> default value for the column?
>
> Jason


One option is to stage that data into a table that has those columns
defined as varchars. Then copy the data from there into the original
destination table. This is also a way to get around your "ignored columns"
problem of your previous post. Import everything into a flexible staging
table (mostly varchars) that will accept the data. Then, only migrate from
your staging table those columns you actually wanted in your data.
Truncate or drop your staging table when you finish each batch.

Sure it takes up more room but you can scrub your data in MySQL which may
be easier for you to handle than trying to scrub the raw text files. I
never take raw text data and merge it into a production database in one
step. This is how I screen out bad inputs, malformed text, and otherwize
invalid data. It sometimes takes 4 or 5 times to get the raw data into the
staging table (depending on how messed up the raw data is). Once it's
there, it's much easier for me to screen and fix.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
--=_alternative 00463A7D85257089_=--
Sponsored Links





Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive | Programming forum archive

Copyright 2009 droptable.com