Copying some part of file

anushree.a · December 2, 2008, 6:53am

Hey friends,
Here I am with another query. I have a TXT file.
For example:

EX ID : B-Mezine
....
...
...
Some lines of text (Not fixed in length n no of lines)..
...
...
..
END EX ID

Some blank lines in between two records(Not fixed in numbers)

EX ID : B-Mezine
....
...
...
Some lines of text (Not fixed in length n no of lines)
..
...
...
..END EX ID
Some blank lines in between two records(Not fixed in numbers)
EX ID : B-Mezine
....
...
...
Some lines of text (Not fixed in length n no of lines)
..
...
...
..END EX ID

If I treat the portion from EX ID : B-Mezine to END EX ID as one record, then this file has several thousand records in it.

I want to split this file into files of, say, 7700 records each. For example, if the file has 38700 records, the first 7700 records should be copied from the main file into a new file named part1.txt, the next 7700 records into part2.txt, and so on.

Can anyone help me with this?

Thanks in advance,
Anushree.

joeyg · December 2, 2008, 9:54am

In your example, you show many lines but three records. So, are you saying that the file has 38500 records laid out as in your example, and you want to make five 7700-record files from it?

anushree.a · December 2, 2008, 12:42pm

Dear joeyg,
The input file has 38700 records in the format I illustrated. My example shows 3 records, but the actual input file has 38700 records, which I want to split so that records 1-7700 go into the 1st file (output file name part1.txt), 7701-15400 into the 2nd file (part2.txt), 15401-23100 into the 3rd file (part3.txt), and so on.
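
The numbering described here (records 1-7700 in part1.txt, 7701-15400 in part2.txt, and so on) comes down to an integer division. A minimal sketch of that arithmetic, where `part_for` is a hypothetical helper name used only for illustration:

```shell
# part_for is a hypothetical helper: given a 1-based record number and a
# chunk size, print the name of the part file that record belongs to.
part_for() {
  echo "part$(( ($1 - 1) / $2 + 1 )).txt"
}

part_for 1 7700      # part1.txt
part_for 7700 7700   # part1.txt
part_for 7701 7700   # part2.txt
part_for 38700 7700  # part6.txt
```

With 38700 records and 7700 per file, this gives five full files plus a sixth holding the remaining 200 records.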

Franklin52 · December 2, 2008, 1:42pm

Try this:

awk 'NR%7700==1 {i++} {print > ("part" i ".txt")}' file

Regards

anushree.a · December 2, 2008, 11:46pm

Thanks for the solution,
but it splits the file by number of lines, i.e. 7700 lines in each output file, which is not what I am looking for. My input file has around 35,500 records. Each record starts with the pattern "EX ID : B-Mezine", contains some data that is not fixed in length or number of lines, and ends with the pattern "END EX ID".

Ideally the script should pick 7700 records from the input file and write them to an output file named part1.txt, then write the next 7700 records to part2.txt, and so on.

Waiting anxiously for your reply.
Anushree A
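
The difference between counting lines and counting records can be checked on a small file. The following is only a sketch under assumptions: `sample.txt` and its contents are made up, and a record is taken to begin at any line starting with `EX ID`.

```shell
# Build a tiny made-up sample with 3 records of different lengths.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
cat > sample.txt <<'EOF'
EX ID : B-Mezine
line a
END EX ID

EX ID : B-Mezine
line b
line c
END EX ID


EX ID : B-Mezine
END EX ID
EOF

# Lines in the file (this is what NR counts).
lines=$(awk 'END { print NR }' sample.txt)
# Records in the file: one per line that starts a record.
records=$(awk '/^EX ID/ { n++ } END { print n + 0 }' sample.txt)
echo "lines=$lines records=$records"
```

Because records vary in length and are separated by a varying number of blank lines, a test on NR cannot land on record boundaries; the counter has to advance on the `EX ID` line instead.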

anushree.a · December 3, 2008, 2:40am

Hey friends,
Here I am with another query. I have a TXT file.
For example:

EX ID : B-Mezine
....
...
...
Some lines of text (Not fixed in length n no of lines).....
...
..END EX ID

Some blank lines in between two records(Not fixed in numbers)

EX ID : B-Mezine
....
...
...
Some lines of text (Not fixed in length n no of lines)
..
...
...
..END EX ID
Some blank lines in between two records(Not fixed in numbers)
EX ID : B-Mezine
....
...
...
Some lines of text (Not fixed in length n no of lines)
..
...
...
..END EX ID
The portion from EX ID : B-Mezine to END EX ID makes up one RECORD.

If I treat the portion from EX ID : B-Mezine to END EX ID as one record, then this file has several thousand records in it.

I want to split this file into files of, say, 7700 RECORDS (not lines) each. For example, if the file has 38700 records, the first 7700 records should be copied from the main file into a new file named part1.txt, the next 7700 records into part2.txt, and so on.

Can anyone help me with this?

Thanks in advance,
Anushree.

Christoph_Spohr · December 3, 2008, 3:13am

Respect the forum rules:

(4) Do not 'bump up' questions if they are not answered promptly. No duplicate or cross-posting and do not report a post or send a private message where your goal is to get an answer more quickly.

anushree.a · December 3, 2008, 3:39am

Sorry about that, but if you look carefully, my earlier query was confusing, which is why I didn't get exactly what I wanted. Anyway, if it is still against the rules, then sorry once again.

Franklin52 · December 3, 2008, 3:48am

Try this one:

awk '
/^EX ID/{i++;p=1}
/END EX ID/{p=0;print > ("part" i ".txt")}
p{print > ("part" i ".txt")}' file

Regards

radoulov · December 3, 2008, 4:49am

You may try something like this:
(use nawk or /usr/xpg4/bin/awk on Solaris)

awk '/^EX/ && !(i++%n) { 
  if (c) close(fn)
  fn = ("part" ++c ".txt")
  }  
{ print > fn }
' n=7700 infile

Franklin52 · December 3, 2008, 4:55am

Duplicate of this post:

http://www.unix.com/shell-programming-scripting/91688-copying-some-part-file.html#post302263651

No duplicate or cross-posting, please read the rules.

summer_cherry · December 3, 2008, 4:59am

nawk '/EX ID : B-Mezine/{n++}
/^$/ { next}
{
	f=sprintf("%s.txt",n)
	print >> f
}' filename

radoulov · December 3, 2008, 5:00am

Threads merged.

anushree.a · December 4, 2008, 1:11am

Hey friends, first of all thanks for your efforts.
I tried radoulov's solution, as at a glance it is the only one that mentions 7700 records, i.e.:
awk '/^EX/ && !(i++%n) {
if (c) close(fn)
fn = ("part" ++c ".txt")
}
{ print > fn }
' n=7700

But I got the following error:

awk: A print or getline function must have a file name.
The input line number is 1. The file is test.txt.
The source line number is 5.

What may have gone wrong? Can you please guide me?
Waiting for your reply.

radoulov · December 4, 2008, 3:02am

OK,
try this:

awk '/^EX/ && !(i++%n) { 
  if (c) close(fn)
  fn = ("part" ++c ".txt")
  }  
fn { print > fn }
' n=7700 infile
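
A likely reading of the earlier error: it was raised on input line 1 at the `print` statement, which suggests the first line of the file did not start with EX, so `fn` was still empty when awk tried to write to it. The `fn` pattern in front of the print block skips such lines. A small sketch that reproduces the situation, assuming a made-up `infile` that begins with a blank line and using n=2 to keep it short:

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
# Made-up sample: note the blank line BEFORE the first record.
printf '%s\n' '' \
  'EX ID : B-Mezine 1' 'x' 'END EX ID' '' \
  'EX ID : B-Mezine 2' 'END EX ID' '' \
  'EX ID : B-Mezine 3' 'y' 'END EX ID' > infile

awk '
/^EX/ && !(i++ % n) {        # record starts number 1, n+1, 2n+1, ...
  if (c) close(fn)           # close the previous part file, if any
  fn = ("part" ++c ".txt")   # switch to the next part file
}
fn { print > fn }            # skip lines seen before the first record
' n=2 infile

ls part*.txt
```

Records 1 and 2 end up in part1.txt and record 3 in part2.txt; the leading blank line is dropped instead of triggering the error.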

anushree.a · December 4, 2008, 7:17am

Dear radoulov,
Thanks for the reply. I have tried it on my system, but it gives the following error:

vxfs: msgcnt 132 mesg 001: V-2-1: vx_nospace - /dev/report/lv_report file system full (1 block extent)

Also, the output file does not end at END EX ID.

Could you please spare some time to solve this problem?

Anushree A

radoulov · December 4, 2008, 8:14am

Well,
this is because you ran out of space. You should consider using a different filesystem for this operation if there is not enough space on the current one.

Could you post some sample data from such a file?
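
On the second point (output not ending at END EX ID), one possible cause is that the blank lines following the last record of a chunk are copied too. The sketch below is not any poster's exact method: it combines the in-record flag from Franklin52's suggestion with the record counter from the script above, so that only lines from EX ID through END EX ID are written. The `dir` variable and the `out` directory are assumptions, added so the parts can be written to a filesystem with free space.

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
# Made-up sample with a varying number of blank lines between records.
printf '%s\n' \
  'EX ID : B-Mezine 1' 'x' 'END EX ID' '' '' \
  'EX ID : B-Mezine 2' 'END EX ID' '' \
  'EX ID : B-Mezine 3' 'y' 'END EX ID' '' > infile
mkdir out   # hypothetical output directory; pick one with enough free space

awk '
/^EX ID/ {
  inrec = 1                              # a record starts here
  if (!(i++ % n)) {                      # first record of a new chunk
    if (fn != "") close(fn)
    fn = (dir "/part" (++c) ".txt")
  }
}
inrec { print > fn }                     # copy only lines inside a record
/END EX ID/ { inrec = 0 }                # record ended; skip what follows
' n=2 dir=out infile

tail -n 1 out/part1.txt
```

Each part file then ends with an END EX ID line, at the cost of dropping the blank separator lines between records.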

anushree.a · December 5, 2008, 12:03am

Yes buddy,

Here it is,

EX ID : B-Mezine Body Language (Tto Order Q=14)
Prmnt Tto: 6
Avg size: 2.07 Sq Inch
Ext loctn: Lower Back
Mono Ink code: 174
Color ink code: 132, 111, 202
Sff: 1.04 mg
SX: F
Age: 19
Artist: Ankita
Artist Code: AKT_Mum_Colaba
Dt: 07272008 00:02
DDt: 08072008
END EX ID

EX ID : B-Mezine Body Language (Tto Order Q=15)
Prmnt Tto: 6
Avg size: 3.00 Sq Inch
Ext Loctn: Arnd Belly Button
Mono Ink code: 174
Color ink code: None
Sff: 2.41 mg
CrcM: 12 mns
Dr: none
Pn: Brb
Status: Healing
SX: F
Age: 23
EDrtn: 1:30 Approx
Artst: Ankita
Artist Code: AKT_Mum_Colaba
Dt: 07272008 07:16
DDt: 08072008
END EX ID

EX ID : B-Mezine Body Language (Piercing Order Q=15)
Piercing: 2
Ext Loctn: Belly Button
Mono Ink code: None
Color ink code: None
CrcM: 10 mns
Dr: Shamila
Pn: Brb
Status: Healing
SX: F
Age: 24
Artst: Anushree
Artist Code: ANS_Mum_Colaba
Dt: 07282008 12:23
DDt: 08082008
END EX ID

As you can see, the length of each record (the text from EX ID : B-Mezine to END EX ID) is not fixed. The first record has 14 lines and the second has 19.

radoulov · December 5, 2008, 5:11am

I cannot reproduce the issue with the above sample.
Is the output below correct?

$ cat file
EX ID : B-Mezine Body Language (Tto Order Q=14)
Prmnt Tto: 6
Avg size: 2.07 Sq Inch
Ext loctn: Lower Back
Mono Ink code: 174
Color ink code: 132, 111, 202
Sff: 1.04 mg
SX: F
Age: 19
Artist: Ankita
Artist Code: AKT_Mum_Colaba
Dt: 07272008 00:02
DDt: 08072008
END EX ID


EX ID : B-Mezine Body Language (Tto Order Q=15)
Prmnt Tto: 6
Avg size: 3.00 Sq Inch
Ext Loctn: Arnd Belly Button
Mono Ink code: 174
Color ink code: None
Sff: 2.41 mg
CrcM: 12 mns
Dr: none
Pn: Brb
Status: Healing
SX: F
Age: 23
EDrtn: 1:30 Approx
Artst: Ankita
Artist Code: AKT_Mum_Colaba
Dt: 07272008 07:16
DDt: 08072008
END EX ID

EX ID : B-Mezine Body Language (Piercing Order Q=15)
Piercing: 2
Ext Loctn: Belly Button
Mono Ink code: None
Color ink code: None
CrcM: 10 mns
Dr: Shamila
Pn: Brb
Status: Healing
SX: F
Age: 24
Artst: Anushree
Artist Code: ANS_Mum_Colaba
Dt: 07282008 12:23
DDt: 08082008
END EX ID

$ nawk '/^EX/ && !(i++%n) {
  if (c) close(fn)
  fn = ("part" ++c ".txt")
  }
{ print > fn }
' n=2 file
$ head -200 part*
==> part1.txt <==
EX ID : B-Mezine Body Language (Tto Order Q=14)
Prmnt Tto: 6
Avg size: 2.07 Sq Inch
Ext loctn: Lower Back
Mono Ink code: 174
Color ink code: 132, 111, 202
Sff: 1.04 mg
SX: F
Age: 19
Artist: Ankita
Artist Code: AKT_Mum_Colaba
Dt: 07272008 00:02
DDt: 08072008
END EX ID


EX ID : B-Mezine Body Language (Tto Order Q=15)
Prmnt Tto: 6
Avg size: 3.00 Sq Inch
Ext Loctn: Arnd Belly Button
Mono Ink code: 174
Color ink code: None
Sff: 2.41 mg
CrcM: 12 mns
Dr: none
Pn: Brb
Status: Healing
SX: F
Age: 23
EDrtn: 1:30 Approx
Artst: Ankita
Artist Code: AKT_Mum_Colaba
Dt: 07272008 07:16
DDt: 08072008
END EX ID


==> part2.txt <==
EX ID : B-Mezine Body Language (Piercing Order Q=15)
Piercing: 2
Ext Loctn: Belly Button
Mono Ink code: None
Color ink code: None
CrcM: 10 mns
Dr: Shamila
Pn: Brb
Status: Healing
SX: F
Age: 24
Artst: Anushree
Artist Code: ANS_Mum_Colaba
Dt: 07282008 12:23
DDt: 08082008
END EX ID

anushree.a · December 8, 2008, 12:29am

Yes, that is exactly what I want to do. The only difference is that in part1.txt you have taken 2 records; instead, it should have the first 7700 records, with the next 7700 records in part2.txt, and so on.
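
Going from the 2-record demonstration to the real file should only require changing `n`. One way to confirm the result is to count the record starts in each part file afterwards. A minimal sketch, assuming five made-up records generated on the spot and n=2 so that it runs quickly:

```shell
tmp=$(mktemp -d) && cd "$tmp" || exit 1
# Generate 5 made-up records so the example runs on its own.
awk 'BEGIN { for (r = 1; r <= 5; r++) printf "EX ID : B-Mezine %d\nbody\nEND EX ID\n\n", r }' > infile

n=2   # use n=7700 for the real file
awk '/^EX/ && !(i++ % n) {
  if (c) close(fn)
  fn = ("part" ++c ".txt")
}
fn { print > fn }' n="$n" infile

# Report how many records landed in each part file.
for f in part*.txt; do
  echo "$f: $(grep -c '^EX ID' "$f") records"
done
```

Every part except possibly the last should report exactly `n` records.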