How to import a .SQL database dump consisting of .txt files?

Freek Member
edited April 2012 in General

For an assignment at my university, we need to make use of a patent database. The problem is that this database is messed up and not ready to use. It was delivered to us as several .txt files and one .sql file. The .txt files are the tables of the database and contain data separated by TABs. The .sql file contains the query to create the appropriate tables with columns. I assume this is a MySQL dump.
Other students tried to open it in Access, but it was a big mess with lots of import errors and non-existent references.
So, I set up a local MySQL server, ran the SQL query in phpMyAdmin, and am now struggling to import the .txt files.
In phpMyAdmin I selected 'Import', chose CSV, and selected the .txt file. Sadly, that doesn't work; it gives me the following error: Invalid column count in CSV input on line 1.
If I import it as 'CSV using LOAD DATA' with '\t' as 'Columns terminated by', it imports only 1 row...
I contacted our professor but haven't received a reply from him yet. I assume he is abroad.
Does anyone have an idea how to successfully import this database?
Here's the SQL query: http://pastebin.com/1kVECmYU
Here's some sample data from the txt files: http://pastebin.com/fm8SiRzQ

Thank you very much!!

Comments

  • Why don't you write a small Java program that converts the .txt files to .sql?

  • vedran Veteran

    Open the .txt file in OpenOffice Calc, save it as comma-separated (CSV), select , as the separator, and import it from phpMyAdmin

    Thanked by djvdorp
  • Aldryic Member
    edited April 2012

    Quickest way I can think of would be importing the DB structure from the .sql, then using PHP or Perl to split up the .txt files.

    1 - replace all repeating instances of tab with a single tab or character (such as |)
    2 - explode/split each line based on the delimiter
    3 - INSERT the data based on the array from the explode/split

    Edit: vedran wins, his method is simpler :P
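The three steps above can be sketched in Python. The table and column names (`patents`, `id`, `assignee`) are placeholders, since the real ones come from the .sql dump:

```python
import re

def rows_to_inserts(lines, table="patents", columns=("id", "assignee")):
    """Turn tab-delimited lines into INSERT statements (steps 1-3 above).
    Table/column names are hypothetical stand-ins for the real schema."""
    stmts = []
    for line in lines:
        # 1 - replace repeating instances of tab with a single tab
        line = re.sub(r"\t+", "\t", line.rstrip("\n"))
        # 2 - split each line based on the delimiter
        fields = line.split("\t")
        # 3 - build an INSERT from the resulting list ('' escapes quotes)
        values = ", ".join("'%s'" % f.replace("'", "''") for f in fields)
        stmts.append("INSERT INTO %s (%s) VALUES (%s);"
                     % (table, ", ".join(columns), values))
    return stmts
```

Feeding the resulting statements to the mysql client (or writing them to a .sql file and sourcing it) would then populate the tables created by the dump.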

  • man mysqlimport

    There are options depending on how your text is formatted.

    Or you can try the other suggestions.

  • @vedran said: Open .txt file in OpenOffice calc, save the file as comma separated (CSV), select , as a separator and import it from phpMyAdmin

    Looks tab-delimited to me... That is something that MySQL already handles quite well.

  • Freek Member

    Thanks for the fast replies

    @gsrdgrdghd I don't have the knowledge to do so, and I think it would take a long time before I got a working program that does that ;)

    @Aldryic the .sql file only contains the query that makes empty tables with columns, so there isn't much to import?

    @yomero Text is formatted by TABs

    @vedran Using , as a separator is sadly not an option; the data already contains some commas. I could use _ though.

    @speckl That's correct, it is tab delimited.

  • Aldryic Member
    edited April 2012

    @Freek said: @Aldryic the .sql file only contains the query that makes empty tables with columns, so there isn't much to import?

    Yup, but importing that .sql file creates the structure for your DBs and tables (and is necessary before inserting the data). After that, you only have to parse the .txt files and INSERT the data into the already-existing tables.

    Thanked by yomero
  • vedran Veteran

    @Freek said: Using , as a separator is sadly not an option; the data already contains some commas. I could use _ though.

    Then separate the data with " and use , as field separator http://pastebin.com/VamubCU5
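For files too big for a spreadsheet, the same tab-to-quoted-CSV conversion can be done with a short streaming Python script (file names here are placeholders); the csv module wraps every field in ", so embedded commas stay safe:

```python
import csv

def tabs_to_quoted_csv(src_path, dst_path):
    """Stream a tab-delimited file into a comma-separated, "-quoted CSV.
    Streams row by row, so file size is not a problem."""
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL)  # wrap every field in "
        for row in reader:
            writer.writerow(row)
```

phpMyAdmin's CSV import should then accept the result with , as the separator and " as the enclosure character.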

  • raindog308 Administrator, Veteran

    First run the SQL to create the tables:

    mysql -u PatentDBUser -pLongLiveLET patent_db < some_file.sql

    Or

    mysql -u PatentDBUser -pLongLiveLET patent_db
    mysql> source some_file.sql

    Then you could use the LOAD DATA INFILE command

    mysql> LOAD DATA INFILE 'somefile.dat' INTO TABLE t1;

    Lather, rinse, repeat for the other tables. I'm assuming you get one file per table.

    I think tab-delimited is the default. If not, you can play with the FIELDS TERMINATED BY clause:

    http://dev.mysql.com/doc/refman/5.6/en/load-data.html

    I sincerely hope I'm not helping a patent troll.

  • Freek Member

    @vedran I downloaded and installed OpenOffice Calc, but the file is too large to be loaded completely (too many rows).

    @raindog308 Thanks for the reply. Using LOAD DATA INFILE I get this error:

    1062 - Duplicate entry '3970526-FIRMA HARTUNG KUHN & CO GMBH' for key 'PRIMARY'

    Which is strange, as the table is totally empty...
    PS: No you aren't helping a patent troll ;) It's just for an assignment

    Jeesh this is harder than I thought

  • Damian Member
    edited April 2012

    You'll get a 1062 error when MySQL tries to insert a value that duplicates an existing primary key. '3970526-FIRMA HARTUNG KUHN & CO GMBH' is not really a valid key anyway, unless your key field was set to some kind of TEXT field. And then it wouldn't really be a very good key.

    You should split the field so that the key is '3970526' and 'FIRMA HARTUNG KUHN & CO GMBH' goes into another field. The concept of a key will then work properly.

    http://www.softwareprojects.com/resources/programming/t-how-to-fix-mysql-duplicate-entry-for-key-primary-o-1844.html has more info

    The first answer of http://stackoverflow.com/questions/3845014/what-is-mysql-key-efficiency is a good bit of text about why keys are important.

  • yomero Member
    edited April 2012

    Probably something in your file is messed up

  • Freek Member

    The fields are split. I assume that one row is just messed up. Problem is, this file contains about 4 million rows. Checking them by hand is going to take a long time.
    I just mailed the prof. again. Let's see what he has to say about it.
    Others 'claim' to have the database working. In Excel. That's not what I call working. 6 different Excel sheets, one for each table; useless.
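Rather than checking 4 million rows by hand, a small script can report exactly which key values occur more than once in the file. A sketch in Python, assuming the primary key is the first tab-separated column (adjust `key_cols` for composite keys):

```python
from collections import Counter

def duplicate_keys(path, key_cols=1):
    """Count how often each key appears in a tab-delimited file;
    return only the keys seen more than once, with their counts."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            # Key = the first key_cols tab-separated fields of the row
            key = tuple(line.rstrip("\n").split("\t")[:key_cols])
            counts[key] += 1
    return {k: n for k, n in counts.items() if n > 1}
```

The file is read one line at a time, so even a multi-gigabyte dump fits in memory as long as the set of distinct keys does.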

  • @Freek said: Useless.

  • netomx Moderator, Veteran

    @Freek said: The fields are split. I assume that one row is just messed up. Problem is, this file contains about 4 million rows. Checking them by hand is going to take a long time.

    Just check that line with Notepad++, change it, then reimport.

  • Freek Member

    Too large to open in Notepad++, already tried :(

  • vedran Veteran

    Use sed!

  • @Freek said: Too large to import in Notepad++, already tried :(

    Try UltraEdit

  • Use vi and type : followed by the line number,
    like :537537,
    and you should be able to see it
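If the editors all choke on the file size, a specific line can also be pulled out with a couple of lines of Python, which streams the file instead of loading it all (line numbers 1-based, matching the error messages):

```python
from itertools import islice

def get_line(path, lineno):
    """Return line `lineno` (1-based) from a file without reading it
    into memory; returns None if the file has fewer lines."""
    with open(path) as f:
        return next(islice(f, lineno - 1, lineno), None)
```

This makes it practical to inspect (and then hand-fix) the single row that trips the 1062 error, even in a 4-million-row file.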
