Replacing tag based on condition

Hi All,

I have a file like the one below that holds information about records. It consists of a header followed by data: the header gives the number of tags per category (for example, 1 Men tag), and the tag IDs come after the header. I want to change the prefix of all Pets tags from P to X. I tried the sed command below, and it worked:

sed -i 's/P\([0-9]\)/X\1/g' $1

The issue is that it changes every P followed by a digit to X. For example, if a Women tag is WP0001, it becomes WX0001. I want to change only the Pets tags to X. I need to read the Pets count (11) from the header and then change only those tags in the file. Is there an easy way to do that?

Note: I have annotated the file below. The text after ----- on each line is a comment I added for readability; the actual file does not contain it.

Original File

N44361,
New Insu,
Men,1,                                             ----------------------  MEN (1 here)
Women,57,					----------------------- Women(57 here)
Kids,65,						----------------------- Kids (65 here))
Pets,11,						----------------------- Pets (11 here)
Home,1,						----------------------- Home (1 here)
M11,						----------------------- Men tag
WP770059634,WP770059639,WP770059644,P770059654,P770059655,P770059656,W770059657,W770059658,W770059659,W770059660,   --------- Women tag
W770059661,W770059662,W770059663,W770059664,W770059665,W770059666,W770059667,W770059668,W770059669,W770059670,   --------- Women tag
W770059671,W770059672,W770059673,W770059674,W770059675,W770059676,W770059677,W770059678,W770059679,W770059680,   --------- Women tag
W770059681,W770059682,W770059683,W770059684,W770492470,W770492472,W770492474,W770755150,W770755151,W770492477,   --------- Women tag
W770755128,W770755131,W770755132,W770755135,W770755139,W770755140,W770755141,W770755142,W770755143,W770755144,   --------- Women tag
W770755149,W770755148,W770755147,W770755146,W770755153,W770755154,W5002,			--------- Women tag
K70059634,K70059639,K70059644,K70059654,K70059655,K70059656,K70059657,K70059658,		----------- Kid tag
K70059659,K70059660,K70059661,K70059662,K70059663,K70059664,K70059665,K70059666,		----------- Kid tag
K70059667,K70059668,K70059669,K70059670,K70059671,K70059672,K70059673,K70059674,		----------- Kid tag
K70059675,K70059676,K70059677,K70059678,K70059679,K70059680,K70059681,K70059682,		----------- Kid tag
K70059683,K70059684,K70492470,K70492472,K70492474,K70755150,K70755151,K70492477,		----------- Kid tag
K70755128,K70755131,K70755132,K70755135,K70755139,K70755140,K70755141,K70755142,		----------- Kid tag
K70755143,K70755144,K70755149,K70755148,K70755147,K70755146,K70755153,K70755154,		----------- Kid tag
K5002,K5007,K5012,K5017,K5022,K5027,K5032,K5037,								----------- Kid tag
K5042,											-----------Kid tag
P300102,P2200,P2201,P2202,P2203,P2204,P2205,P2206,P2207,P2208,	  ----------- Pet tag
P2209,											----------- Pet tag
P1917079,										 ----------- Home tag

Expected output

N44361,
New Insu,
Men,1,                                             ----------------------  MEN (1 here)
Women,57,					----------------------- Women(57 here)
Kids,65,						----------------------- Kids (65 here))
Pets,11,						----------------------- Pets (11 here)
Home,1,						----------------------- Home (1 here)
M11,						----------------------- Men tag
WP770059634,WP770059639,WP770059644,P770059654,P770059655,P770059656,W770059657,W770059658,W770059659,W770059660,   --------- Women tag
W770059661,W770059662,W770059663,W770059664,W770059665,W770059666,W770059667,W770059668,W770059669,W770059670,   --------- Women tag
W770059671,W770059672,W770059673,W770059674,W770059675,W770059676,W770059677,W770059678,W770059679,W770059680,   --------- Women tag
W770059681,W770059682,W770059683,W770059684,W770492470,W770492472,W770492474,W770755150,W770755151,W770492477,   --------- Women tag
W770755128,W770755131,W770755132,W770755135,W770755139,W770755140,W770755141,W770755142,W770755143,W770755144,   --------- Women tag
W770755149,W770755148,W770755147,W770755146,W770755153,W770755154,W5002,			--------- Women tag
K70059634,K70059639,K70059644,K70059654,K70059655,K70059656,K70059657,K70059658,		----------- Kid tag
K70059659,K70059660,K70059661,K70059662,K70059663,K70059664,K70059665,K70059666,		----------- Kid tag
K70059667,K70059668,K70059669,K70059670,K70059671,K70059672,K70059673,K70059674,		----------- Kid tag
K70059675,K70059676,K70059677,K70059678,K70059679,K70059680,K70059681,K70059682,		----------- Kid tag
K70059683,K70059684,K70492470,K70492472,K70492474,K70755150,K70755151,K70492477,		----------- Kid tag
K70755128,K70755131,K70755132,K70755135,K70755139,K70755140,K70755141,K70755142,		----------- Kid tag
K70755143,K70755144,K70755149,K70755148,K70755147,K70755146,K70755153,K70755154,		----------- Kid tag
K5002,K5007,K5012,K5017,K5022,K5027,K5032,K5037,								----------- Kid tag
K5042,											-----------Kid tag
X300102,X2200,X2201,X2202,X2203,X2204,X2205,X2206,X2207,X2208,	  ----------- Pet tag
X2209,											----------- Pet tag
P1917079,										 ----------- Home tag

Thanks in Advance

perl -pe 's/\bP/X/g;' file

It also affects the comments but you say they aren't in the actual file.

It also affects the Home tag. How do you intend to fix that? Will the numeric string for the Home tag always be longer than for the Pet tags? If so,

perl -pe 's/\bP(?=\d{1,6},)/X/g;'
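
The same length heuristic can be approximated in awk by testing each comma-separated field. This is only a sketch: the function name retag_short is made up, and the 6-digit limit is the same assumption as in the perl version (Pet numbers are shorter than Home numbers). It uses length() instead of a {1,6} interval because some older awk versions do not support intervals.

```shell
retag_short() {
  awk -F, -v OFS=, '{
    # Change a field only if it is P followed by digits and is at most
    # 7 characters long (P plus 1 to 6 digits).
    for (i = 1; i <= NF; i++)
      if ($i ~ /^P[0-9]+$/ && length($i) <= 7)
        sub(/^P/, "X", $i)
    print
  }'
}

printf '%s\n' 'WP770059634,P770059654,P300102,P2200,' | retag_short
```

For this sample it prints WP770059634,P770059654,X300102,X2200, so the WP tag and the long P tag are left alone.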

Andrew

You would need to either:

  • extend the expression to include what is before the letter P to ensure it was prefixed by start of line or a comma
  • follow the existing substitution with another that changes WX back to WP
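
A rough sketch of both options in sed, using a sample line built from tags in the question. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed do); the function names are made up for illustration. Neither option knows about the header counts, so both would still change a bare P tag outside the Pets section, such as P770059654 or the Home tag.

```shell
# Option 1: only match a P that starts a tag (start of line or after a comma).
anchored() { sed -E 's/(^|,)P([0-9])/\1X\2/g'; }

# Option 2: keep the original substitution, then turn WX<digit> back into WP<digit>.
undo_wx() { sed -e 's/P\([0-9]\)/X\1/g' -e 's/WX\([0-9]\)/WP\1/g'; }

printf '%s\n' 'WP770059634,P2200,P2201,' | anchored
printf '%s\n' 'WP770059634,P2200,P2201,' | undo_wx
```

Both commands print WP770059634,X2200,X2201, for this sample.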

Which of these would seem most logical to you?

Is using sed a requirement, or could we explore other ways?

Kind regards,
Robin

I cannot use perl. I can use either awk or sed.

Thanks,
Arun

To come up with a fail-safe solution for this problem, you'll need to count tags until you reach the target ones. For this, additional info is required: Where does the header end? Does it always end with the "Home" entry, or does it always have a constant line count? Are all tag lines terminated by a comma, even where the type changes?
Assuming "Home" and "yes", try

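awk '
# Header line of the target category: remember its tag count and
# stop summing the counts of the categories listed before it.
$1 == SRCH      {CNT  = $2
                 PRCU = 1
                }

# SUM = number of tags that belong to the categories before the target.
!PRCU           {SUM += $2
                }

# Print the header unchanged; it ends with the "Home" line.
!HDDONE         {print
                 if (/^Home/) HDDONE = 1
                 next
                }

# Print the first SUM tags unchanged. NF - 1 because every line ends
# with a comma, which leaves an empty last field.
TAGS < SUM      {TAGS += NF - 1
                 print
                 next
                }
# Replace the prefix until CNT tags have been changed.
CURR < CNT      {CURR += gsub (PRFX, REPL)
                 print
                 next
                }

# Print everything else (here: the Home tag) unchanged.
1

' FS=, SRCH="Pets" PRFX="P" REPL="X" file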
N44361,
New Insu,
Men,1,
Women,57,
Kids,65,
Pets,11,
Home,1,
M11,
WP770059634,WP770059639,WP770059644,P770059654,P770059655,P770059656,W770059657,W770059658,W770059659,W770059660,
W770059661,W770059662,W770059663,W770059664,W770059665,W770059666,W770059667,W770059668,W770059669,W770059670,
W770059671,W770059672,W770059673,W770059674,W770059675,W770059676,W770059677,W770059678,W770059679,W770059680,
W770059681,W770059682,W770059683,W770059684,W770492470,W770492472,W770492474,W770755150,W770755151,W770492477,
W770755128,W770755131,W770755132,W770755135,W770755139,W770755140,W770755141,W770755142,W770755143,W770755144,
W770755149,W770755148,W770755147,W770755146,W770755153,W770755154,W5002,
K70059634,K70059639,K70059644,K70059654,K70059655,K70059656,K70059657,K70059658,
K70059659,K70059660,K70059661,K70059662,K70059663,K70059664,K70059665,K70059666,
K70059667,K70059668,K70059669,K70059670,K70059671,K70059672,K70059673,K70059674,
K70059675,K70059676,K70059677,K70059678,K70059679,K70059680,K70059681,K70059682,
K70059683,K70059684,K70492470,K70492472,K70492474,K70755150,K70755151,K70492477,
K70755128,K70755131,K70755132,K70755135,K70755139,K70755140,K70755141,K70755142,
K70755143,K70755144,K70755149,K70755148,K70755147,K70755146,K70755153,K70755154,
K5002,K5007,K5012,K5017,K5022,K5027,K5032,K5037,
K5042,
X300102,X2200,X2201,X2202,X2203,X2204,X2205,X2206,X2207,X2208,
X2209,
P1917079,

Home is always the end. I check for that in the other script. Please ignore my message below; the solution above worked. Thanks a lot.

Thanks a lot. The tags are always delimited by commas, but the header does not always end with Home. However, I can change the file to include the number of header entries, like below.

N44361,
New Insu,
5, --- new count added
Men,1,
Women,57,
Kids,65,
Pets,11,
Home,1,
M11,
WP770059634,WP770059639,WP770059644,P770059654,P770059655,P770059656,W770059657,W770059658,W770059659,W770059660,
W770059661,W770059662,W770059663,W770059664,W770059665,W770059666,W770059667,W770059668,W770059669,W770059670,
W770059671,W770059672,W770059673,W770059674,W770059675,W770059676,W770059677,W770059678,W770059679,W770059680,
W770059681,W770059682,W770059683,W770059684,W770492470,W770492472,W770492474,W770755150,W770755151,W770492477,
W770755128,W770755131,W770755132,W770755135,W770755139,W770755140,W770755141,W770755142,W770755143,W770755144,
W770755149,W770755148,W770755147,W770755146,W770755153,W770755154,W5002,
K70059634,K70059639,K70059644,K70059654,K70059655,K70059656,K70059657,K70059658,
K70059659,K70059660,K70059661,K70059662,K70059663,K70059664,K70059665,K70059666,
K70059667,K70059668,K70059669,K70059670,K70059671,K70059672,K70059673,K70059674,
K70059675,K70059676,K70059677,K70059678,K70059679,K70059680,K70059681,K70059682,
K70059683,K70059684,K70492470,K70492472,K70492474,K70755150,K70755151,K70492477,
K70755128,K70755131,K70755132,K70755135,K70755139,K70755140,K70755141,K70755142,
K70755143,K70755144,K70755149,K70755148,K70755147,K70755146,K70755153,K70755154,
K5002,K5007,K5012,K5017,K5022,K5027,K5032,K5037,
K5042,
X300102,X2200,X2201,X2202,X2203,X2204,X2205,X2206,X2207,X2208,
X2209,
P1917079,
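
With the extra count line, the end of the header no longer has to be detected by name. A possible awk sketch along those lines follows. It assumes the first two lines are always a fixed preamble, the third line holds the number of category lines, and every tag is followed by a comma; the function name retag and its three arguments (category, old prefix, new prefix) are made up for illustration, and the prefixes are assumed to be plain letters. It counts individual tags rather than lines, so it should also cope with a line that mixes two categories.

```shell
retag() {
  awk -F, -v OFS=, -v srch="$1" -v from="$2" -v to="$3" '
    NR <= 2 { print; next }              # assumed fixed two-line preamble
    NR == 3 { hdr = $1; print; next }    # number of category lines that follow
    NR <= 3 + hdr {                      # category lines: name,count,
      if ($1 == srch) { want = $2; found = 1 }
      else if (!found) skip += $2        # tags that come before the target
      print; next
    }
    {                                    # tag lines
      for (i = 1; i <= NF; i++) {
        if ($i == "") continue           # empty field after the trailing comma
        seen++
        if (seen > skip && seen <= skip + want) sub("^" from, to, $i)
      }
      print
    }
  '
}

printf '%s\n' 'N1,' 'New Insu,' '3,' 'Women,2,' 'Pets,3,' 'Home,1,' \
  'WP1,P2,' 'P3,P4,' 'P5,' 'P6,' | retag Pets P X
```

In this small sample only P3, P4 and P5 become X3, X4 and X5; the Women tags WP1 and P2 and the Home tag P6 stay as they are. On the real file it would be called as retag Pets P X < file.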