Remove the duplicate content in a file

Here are the contents of test.txt:

Dependencies Resolved
Changes in packages about to be updated:

ChangeLog for: 1:perl-Archive-Extract-0.38-131.el6_4.x86_64,

- Resolves: #915692 - CVE-2013-1667 (DoS in rehashing code)

Dependencies Resolved
Changes in packages about to be updated:

ChangeLog for: 1:perl-Archive-Extract-0.38-131.el6_4.x86_64,

- Resolves: #915692 - CVE-2013-1667 (DoS in rehashing code)

ChangeLog for: openldap-2.4.23-32.el6_4.1.x86_64
* Mon Apr 22 12:00:00 2013 Jan Synáček <jsynacek@redhat.com> 2.4.23-32.1
- fix: NSS related resource leak (#954299)


Expected result:

Dependencies Resolved
Changes in packages about to be updated:

ChangeLog for: 1:perl-Archive-Extract-0.38-131.el6_4.x86_64,

- Resolves: #915692 - CVE-2013-1667 (DoS in rehashing code)

ChangeLog for: openldap-2.4.23-32.el6_4.1.x86_64
* Mon Apr 22 12:00:00 2013 Jan Synáček <jsynacek@redhat.com> 2.4.23-32.1
- fix: NSS related resource leak (#954299)


Thanks

What have you tried and where are you stuck?


Hi Scott,

So far I have collected a list of the unique package names.

Then I tried this:

$ sed -n '/ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64/,/Dependencies Resolved/p' centos6.txt
ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64,
             : libuuid-2.17.2-12.9.el6_4.3.x86_64,
             : util-linux-ng-2.17.2-12.9.el6_4.3.x86_64
* Tue Apr 23 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.3
- fix #917678 - mount in RHEL 6.4 ignores user option

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.2
- make patch for #911756 more robust

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.1
- fix patch for #911756 to make it usable on big-endian machines

* Wed Apr 10 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4
- fix #911756 - Make silicon medley signature recognition more robust


Dependencies Resolved
ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64,
             : libuuid-2.17.2-12.9.el6_4.3.x86_64,
             : util-linux-ng-2.17.2-12.9.el6_4.3.x86_64
* Tue Apr 23 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.3
- fix #917678 - mount in RHEL 6.4 ignores user option

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.2
- make patch for #911756 more robust

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.1
- fix patch for #911756 to make it usable on big-endian machines

* Wed Apr 10 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4
- fix #911756 - Make silicon medley signature recognition more robust


Dependencies Resolved
ChangeLog for: libblkid-2.17.2-12.9.el6_4.3.x86_64,
             : libuuid-2.17.2-12.9.el6_4.3.x86_64,
             : util-linux-ng-2.17.2-12.9.el6_4.3.x86_64
* Tue Apr 23 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.3
- fix #917678 - mount in RHEL 6.4 ignores user option

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.2
- make patch for #911756 more robust

* Tue Apr 16 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4.1
- fix patch for #911756 to make it usable on big-endian machines

* Wed Apr 10 12:00:00 2013 Karel Zak <kzak@redhat.com> 2.17.2-12.9.el6_4
- fix #911756 - Make silicon medley signature recognition more robust

I want to print only one occurrence of each block.

Thanks

awk '!A[$0]++' test.txt

@Yoda: that might be dangerous, as it could remove single lines that happen to duplicate lines in other records. I have to admit this is a remote possibility. Still, you can make it safer by adding something like RS="\n\n\n" to your proposal, so that whole records are compared instead of single lines.
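
To illustrate the difference between comparing lines and comparing records, here is a minimal sketch that deduplicates whole blank-line-separated blocks. It uses awk's paragraph mode (RS="") rather than RS="\n\n\n", because a multi-character RS is treated as a regular expression by gawk and mawk but is left unspecified by POSIX. The sample data and the names pkg-a, pkg-b and seen are made up for the example; it also assumes blocks are separated by at least one blank line, which may not match the real file exactly.

```shell
# Hypothetical sample: the first three blocks appear twice, the last one once.
tmp=$(mktemp)
printf '%s\n' \
  'Dependencies Resolved' \
  'Changes in packages about to be updated:' \
  '' \
  'ChangeLog for: pkg-a' \
  '' \
  '- Resolves: #1' \
  '' \
  'Dependencies Resolved' \
  'Changes in packages about to be updated:' \
  '' \
  'ChangeLog for: pkg-a' \
  '' \
  '- Resolves: #1' \
  '' \
  'ChangeLog for: pkg-b' \
  '- fix: leak' > "$tmp"

# RS="" switches awk to paragraph mode: each run of non-blank lines is one record.
# ORS="\n\n" puts a blank line back between the records that are printed.
# !seen[$0]++ is true only the first time a given record text is encountered.
result=$(awk 'BEGIN { RS = ""; ORS = "\n\n" } !seen[$0]++' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Note that the line-based awk '!A[$0]++' also treats every blank line after the first as a duplicate, so the blank separators disappear from its output; comparing whole records avoids that.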


Each block is repeated a random number of times. All I need is to print the first occurrence of the block that begins at the start pattern.


awk '!A[$0]++' RS="\n\n\n" test.txt   # works great

Have a good day :)
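
For anyone who only needs the first occurrence of a single package's block, as in the sed attempt earlier in the thread, here is a hedged sketch. A sed range such as /start/,/end/p restarts at every later match of the start pattern, which is why the block was printed repeatedly. The awk below stops reading at the first end marker instead. The sample data is shortened and made up, and the flag name p is arbitrary.

```shell
# Hypothetical, shortened sample: the same block appears twice.
tmp=$(mktemp)
printf '%s\n' \
  'ChangeLog for: libblkid' \
  '- fix #917678' \
  '' \
  'Dependencies Resolved' \
  'ChangeLog for: libblkid' \
  '- fix #917678' \
  '' \
  'Dependencies Resolved' > "$tmp"

# Set the flag p at the start pattern, exit at the first end marker seen
# after that, and print every line while the flag is set.
first=$(awk '/^ChangeLog for: libblkid/ { p = 1 }
             p && /^Dependencies Resolved/ { exit }
             p' "$tmp")
printf '%s\n' "$first"
rm -f "$tmp"
```

Unlike the sed range, this does not print the "Dependencies Resolved" line itself, and it prints nothing at all if the start pattern never matches.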