sh and MySQL LOAD DATA

I have a csv file in a folder that is roughly 500,000 rows long.

Rather than using PHP, I would like to use sh to run a MySQL LOAD DATA command to load the data, as I think it would be much faster and would avoid the memory problems I'd run into with PHP.

But the problem is, I am not sure how to combine sh and my MySQL LOAD DATA command as it's on several lines and is very, very long.

The following is a sample of the actual load data I use.

How do I make sh connect to MySQL and run the following command?

Any help on this would be great!

LOAD DATA LOCAL INFILE '/path/to/traveldata/traveldata.csv' 
INTO TABLE `offers` 
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 2 LINES
(
OfferProcessType,
TradingName,
OfferLabelDesc,
DestinationName,
DepAirportCode,
TelephoneNumber,
ReturnAirportCode,
ArrivalAirportCode,
ABTA,
ATOL,
OtherBonding,
BoardTypesShort,
AccomodationShort,
AllocationShort,
OperatorShort,
@DepartureDate,
Price,
ShareBasisAdult,
BulkLoaderRef,
Email,
PublicNotes,
CCC,
TOD,
DCC,
AMEX,
CDW,
BookingFee,
AccomodationName,
Duration,
ShareBasisChild,
Rating,
OfferLink,
AccomodationRef,
Transfers,
OperatorsRating,
QuoteRef,
Url,
OutDepTime,
OutArrTime,
InDepTime,
InArrTime
)
SET
DepartureDate = (str_to_date(@DepartureDate, '%d-%M-%y'))
;

On the command line, you can just use mysqlimport. It's a command-line interface to the LOAD DATA statement, and most of its options correspond directly to LOAD DATA clauses.

If you're familiar with sqlldr for Oracle, it works in a similar fashion.

Also, you must make sure that your MySQL configuration permits loading local data. I think this is normally the case, by default, but some folks turn it off with the following setting in my.cnf

local-infile=0

(off)

This looks like a good solution, however... it states;

So this means that the CSV file has to be named the same as the table -- but what if I want to import into a specific table?

I don't want to be reliant upon the CSV filename, especially since the mysqlimport call would fail if the name suddenly changed.

No, you have to rename the file to match the table. That's just the way mysqlimport works.

Oh... I guess I'll just have to use PHP or another solution as I don't have any control over the incoming filename.

Sure you do. If you're writing a shell script, have the script rename it. Of course you can always use a programming language to do the inserts, but it's probably going to be less efficient.
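To make the renaming idea concrete, here is a minimal sketch of a helper that copies the incoming file to a name matching the table, since mysqlimport takes the table name from the file name minus its extension. The function name `stage_csv`, the staging directory, and the database name `somedb` are assumptions for illustration, not anything from this thread. As far as I know, mysqlimport has no counterpart for the SET clause, so the date conversion in the original query would still need handling elsewhere.

```shell
#!/bin/sh
# Sketch: stage an incoming CSV under a name that matches the target table.
# stage_csv, the staging directory and "somedb" are hypothetical names.

# stage_csv SRC TABLE STAGE_DIR -> prints the path of the staged copy
stage_csv() {
    src=$1
    table=$2
    stage_dir=$3
    mkdir -p "$stage_dir" || return 1
    # Copy rather than move, so the original download stays untouched.
    cp "$src" "$stage_dir/$table.csv" || return 1
    printf '%s\n' "$stage_dir/$table.csv"
}

# Example call (not executed here; needs a running MySQL server):
# staged=$(stage_csv /path/to/traveldata/traveldata.csv offers /tmp/import_stage)
# mysqlimport --local --fields-terminated-by=',' --fields-enclosed-by='"' \
#     --lines-terminated-by='\n' --ignore-lines=2 \
#     -u yourname -p somedb "$staged"
```

Copying into a separate staging directory means the incoming filename can change freely; only the script needs to know what it is.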

Oh, I didn't know that.

Am I right in thinking all I need to do is rename the CSV filename to match the table and then do the mysqlimport?

I've done some tests. I do not have the FILE privilege in MySQL, and I can't seem to change that either.

Another easy option is to save your SQL query in a text file exactly as you would type it at the MySQL prompt, then:

mysql -u yourname -p < myquery.sql

If you can specify the filename in the query (and have permission), then you can do it that way.

Oh, I will try your solution with a dummy table and hopefully it will work!

Thanks!

Could you use this command to load CSV files? I'm going to investigate this.

The problem I see is that my CSV file isn't a SQL query, and secondly I do not have any control over the contents of the CSV file.

There are parts I need to change, such as the date which is stored as text and needs to be converted to a proper date format.

Also there are lines in the CSV file I want to ignore, such as the first 2 lines, and ensure it understands line breaks as being the end of a row.

I'm going to investigate whether you can use the command to load csv files.

You can create a 'LOAD DATA INFILE' query and place it in the ".sql" file (the name is irrelevant). This is what I meant earlier. I didn't spell it out because I mistakenly thought your first post already mentioned it, but I see you wrote "MYSQL LOAD DATA," which may have been referring to something else.

The LOAD DATA INFILE syntax allows you to ignore any number of lines from the beginning of the file, and you don't have to convert a string date to a datetime if it's in a reasonable format.

Obviously you can tell the query that your file delimiter is a comma, and if you have strings which may include commas, use the ENCLOSED BY option.

MySQL :: MySQL 5.1 Reference Manual :: 12.2.6 LOAD DATA INFILE Syntax

You keep saying that you have no control over the CSV file. That's not true in any way. If you're able to write a script to rename it or move it to a temp folder, there's no reason why you can't run it through a little sed, awk, or Perl to do a little cleanup first. You're on a *nix command line -- you're empowered!
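As an illustration of that kind of cleanup, here is a small sketch that strips Windows carriage returns and can optionally drop leading header lines before the load. The function name `clean_csv` and its arguments are made up for this example, and it assumes no field legitimately contains a carriage return.

```shell
#!/bin/sh
# Sketch: light CSV cleanup before handing the file to MySQL.
# clean_csv is a hypothetical helper, not part of any MySQL tooling.

# clean_csv SRC DST [SKIP]
#   SRC  - incoming CSV
#   DST  - cleaned copy to write
#   SKIP - number of leading lines to drop (default 0)
clean_csv() {
    src=$1
    dst=$2
    skip=${3:-0}
    # tr removes every carriage return; tail -n +N starts output at line N.
    tr -d '\r' < "$src" | tail -n "+$((skip + 1))" > "$dst"
}

# Example call:
# clean_csv /path/to/traveldata/traveldata.csv /tmp/traveldata.clean.csv
```

If you drop the header lines here, remember to remove IGNORE 2 LINES from the LOAD DATA statement, otherwise two real rows get skipped.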

Thanks for clarification! I appreciate it!

I've now had the chance to test this, and it does indeed work! Thanks a lot, I really appreciate the help.

My CSV file extracts its contents into a date-based folder (format: YYYY-MM-DD). What I'm wondering is whether it's possible to tell my `csv_import.sql` (which is where my big LOAD DATA statement sits) to use the latest date in the folder structure. I tried the following, but MySQL complained that there was an error.

 LOAD DATA LOCAL INFILE '/path/to/inbounddata/data/%Y-%m-%d/traveldata.csv'  

Is there a way of either passing a date argument to my .sql file, or telling this command to use the latest date as part of the folder structure? Thanks
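For the "latest date" part, one possible approach is to let the shell pick the newest folder rather than assuming it is today's. The sketch below relies on the fact that YYYY-MM-DD names sort chronologically when sorted as plain text. The function name `latest_dated_dir` is an assumption for illustration.

```shell
#!/bin/sh
# Sketch: find the most recent YYYY-MM-DD folder under a base directory.
# latest_dated_dir is a hypothetical helper name.

# latest_dated_dir BASE -> prints BASE/YYYY-MM-DD for the newest folder,
# or nothing if no dated folder exists.
latest_dated_dir() {
    base=$1
    for d in "$base"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/; do
        # If the glob matched nothing it stays literal and fails this test.
        [ -d "$d" ] && printf '%s\n' "${d%/}"
    done | sort | tail -n 1
}

# Example call:
# csv="$(latest_dated_dir /path/to/inbounddata/data)/traveldata.csv"
```

This only looks at folder names, not modification times, so a folder extracted late but named for an earlier date would not win.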

Maybe something like this:

 echo "LOAD DATA LOCAL INFILE '/path/to/inbounddata/data/$(date +%Y-%m-%d)/traveldata.csv'" | mysql -u yourname -p

Try this alone first to judge the output:

$ echo "LOAD DATA LOCAL INFILE '/path/to/inbounddata/data/$(date +%Y-%m-%d)/traveldata.csv'"
LOAD DATA LOCAL INFILE '/path/to/inbounddata/data/2009-03-13/traveldata.csv'

Hey, that worked great on an echo/output -- I'm going to do some tests on a dummy table and hopefully it'll be fine!

I've tried this in the actual .sql file which holds the LOAD DATA statement:

LOAD DATA LOCAL INFILE '/path/to/inbounddata/data/$(date +%Y-%m-%d)/traveldata.csv'

But I get this error;

It seems I cannot mix and match MySQL and shell commands inside .sql files. I tried using the MySQL date function in the path instead, but that doesn't appear to work either.

Just to clarify, this is what my process is.

  1. Use sh to extract contents of a zip file to a date based folder (YYYY-MM-DD)
  2. Use sh to run a mysql import command
mysql -u someroot -psomepassword -h localhost somedb < /path/to/csv_import.sql
  3. Import the CSV file into MySQL. The csv_import.sql is a really big long file where the LOAD DATA LOCAL INFILE exists. It needs to get the date of the latest csv file extracted.

Ideally I'd like shell to do all the heavy lifting rather than a PHP script, or perhaps there's a better way of making step 3 work?

I'm open to any ideas.

Thanks.
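The `$(date ...)` inside the .sql file fails because the mysql client sends the file's contents to the server as they are; only the shell expands `$(...)`, and only in text the shell itself evaluates. One way around it, sketched below, is to keep a placeholder in csv_import.sql and have the shell substitute the real path before piping the result to mysql. The placeholder `@@CSV_PATH@@` and the function name `render_sql` are assumptions for illustration, and the sketch assumes the path contains no single quotes (which would break the SQL string).

```shell
#!/bin/sh
# Sketch: fill a placeholder in a .sql template from the shell.
# render_sql and the @@CSV_PATH@@ placeholder are hypothetical names.

# render_sql TEMPLATE CSV_PATH -> prints the SQL with the placeholder replaced
render_sql() {
    template=$1
    csv_path=$2
    # Escape characters that are special in a sed replacement:
    # backslash, ampersand, and the | used as delimiter below.
    escaped=$(printf '%s\n' "$csv_path" | sed 's/[\\&|]/\\&/g')
    sed "s|@@CSV_PATH@@|$escaped|g" "$template"
}

# In csv_import.sql the first line would then read:
#   LOAD DATA LOCAL INFILE '@@CSV_PATH@@'
#
# Example call (not executed here; needs a running MySQL server):
# csv="/path/to/inbounddata/data/$(date +%Y-%m-%d)/traveldata.csv"
# render_sql /path/to/csv_import.sql "$csv" | mysql -u someroot -p -h localhost somedb
```

The rest of the long LOAD DATA statement stays in the file untouched, so the shell script only has to know about the one value that changes each day.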

/path/to/inbounddata/inbounddata/data/

... is not to be taken literally. You must substitute the correct path on your system.

I know that. I replace it with the path I use. I'm going to try again with a date command inside the .sql file, and hopefully I can resolve the issue.