Manipulating a data file

drmike Member
edited September 2011 in General

I hate doing these. With a passion.

Just got handed a text file made up of a couple thousand of these data records.

Some data record with a md5 on it#:#/1 - Keep/some file name.pdf;

All of the data runs together in one large paragraph. What I need to do is get rid of everything before the #/ and put a line break where the ending ; is, then carry on the same way for each record.

I think I've found how to do the line break. Any ideas for a command-line command to get rid of the first half of each record? We just need a file listing the file names with their trailing file type.

Thanks

You should be proud of me. I just spelled Manipulating correctly on the first try. Major shock.

Comments

  • Congratulations on achieving this, drmike!

    I'm looking for a sponsored KVM VPS, but I'm not in a hurry. Just want to hear what you might offer to me
  • dannix Member
    edited September 2011
    $ cat input.txt
    Some data record with a md5 on it#:#/1 - Keep/some file name1.pdf;Some data record with a md5 on it#:#/1 - Keep/some file name2.pdf;Some data record with a md5 on it#:#/1 - Keep/some file name3.pdf;
    $ sed -e 's?;?\n?g' input.txt > input_nl.txt
    
    $ cat input_nl.txt
    Some data record with a md5 on it#:#/1 - Keep/some file name1.pdf
    Some data record with a md5 on it#:#/1 - Keep/some file name2.pdf
    Some data record with a md5 on it#:#/1 - Keep/some file name3.pdf
    
    $ sed -e 's?[^/]*\(.*\)?#\1?' input_nl.txt > output.txt
    $ cat output.txt
    #/1 - Keep/some file name1.pdf
    #/1 - Keep/some file name2.pdf
    #/1 - Keep/some file name3.pdf
    #

    Not sure about the last line with the #, but you can easily get rid of it in input_nl.txt.
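
    The stray # most likely comes from the empty line that the trailing ; leaves behind after the split: the second sed matches nothing on that line and still writes the #. Below is a minimal one-pass sketch that avoids it. It assumes, as the command above does, that the part before the file path never contains a /. The sample records are made up to match the shape in the question, and the file names input.txt and output.txt are only placeholders.

    ```shell
    # Hypothetical sample input: three records run together on one line.
    printf '%s\n' 'Some data record with a md5 on it#:#/1 - Keep/some file name1.pdf;Some data record with a md5 on it#:#/1 - Keep/some file name2.pdf;Some data record with a md5 on it#:#/1 - Keep/some file name3.pdf;' > input.txt

    # 1. tr turns every ';' into a newline.
    # 2. The first sed expression deletes the empty line left by the trailing ';'.
    # 3. The second replaces everything up to the first '/' with '#/'.
    tr ';' '\n' < input.txt | sed -e '/^$/d' -e 's|^[^/]*/|#/|' > output.txt

    cat output.txt
    ```

    Using tr for the split also sidesteps \n in the sed replacement, which GNU sed accepts but some other sed versions treat as a literal n.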

  • I wound up doing a

    sed  's .\{53\}  ' test2.txt >> test3.txt

    to get rid of the first section of each line. That worked for me. Not perfect, but I can clean up the couple of lines where it messed up.

    Thanks
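
    The fixed count of 53 characters only works on lines whose first section is exactly that long, which is probably why a couple of lines came out wrong. A hedged alternative is to cut at the #:# marker instead of at a character position. The sketch below assumes the marker appears exactly once per line; the two sample lines are invented, with first sections of different lengths.

    ```shell
    # Hypothetical sample lines whose first sections differ in length.
    printf '%s\n' \
      'short md5 record#:#/1 - Keep/some file name1.pdf' \
      'a much longer md5 record#:#/1 - Keep/some file name2.pdf' > test2.txt

    # Delete everything up to and including the '#:#' marker,
    # however many characters come before it.
    sed 's/^.*#:#//' test2.txt > test3.txt

    cat test3.txt
    ```

    This writes with > rather than >>, so rerunning it replaces test3.txt instead of appending a second copy of the results.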
