Need help with sed/awk command

Dear all,

I have a file named fileName with the following contents:

functions
{
    planeDictName
    {
        type            surfaces;
        functionObjectLibs ( "libsampling.so" );
        outputControl   timeStep;
        surfaceFormat   vtk;
        fields          ( p U );
        interpolationScheme cellPoint;
        surfaces
        (
            planeName
            {
                type plane;
                    basePoint (0.0 0.0 0.025);
                    normalVector (0 0 1);
                triangulate false;
                interpolate true;
            }
        );
    }


    planeDictName2
    {
        type            surfaces;
        functionObjectLibs ( "libsampling.so" );
        outputControl   timeStep;
        surfaceFormat   vtk;
        fields          ( p U );
        interpolationScheme cellPoint;
        surfaces
        (
            planeName
            {
                type plane;
                    basePoint (0.0 0.0 0.075);
                    normalVector (0 0 1);
                triangulate false;
                interpolate true;
            }
        );
    }
}

The entities in red colour (planeDictName, planeDictName2, planeName and the field names p and U) are user-defined names. I want to read these names using awk, grep or sed and store them in shell variables.

Can somebody help me?

Thanks & Regards,
linuxUser_

What have you tried?

That is a tough question.
I don't see any fixed marker to read the parameters by in this format, except the keyword functions.
There is no rule that it has to be laid out like this:

functions {     planeDictName     {

It can be something like this as well:

functions {





    planeDictName     {

Extracting the string between the two delimiters { and } will not work in this case, but it might be attempted as follows:

sed 's/.*{\(.*\)}\.*/\1/' file

but I am still stuck on my problem.

Thanks & Regards,
linuxUser_

---------- Post updated at 11:03 PM ---------- Previous update was at 10:54 PM ----------

An ideal solution might be:

create an array of length n (2 in this case, as only the data of 2 planes needs to be written).

shell parameters:

dictName[1] = planeDictName
dictName[2] = planeDictName2
:
:
dictName[n] = .... dicts as many as present.

planeName[1] = planeName
planeName[2] = planeName
:
:
planeName[n] = .... planes as many as present.

and likewise for fields; there is no limit on the number of fields.

First question:
Is a shell script a good choice for this problem?

Well, what you're asking for isn't trivial. There's not a one-liner I can wave at you to fix it, especially when you point out that your data isn't "pretty" the way you presented it. Neither awk nor sed nor most command-line tools are really suited for parsing a recursive grammar. You have to chew through it character by character.

Do you have a C compiler?

Yes, gcc

I guess the regex [Pp]lane.*[Nn]ame.* (although ideal for the example given) won't work for the general case. Try to describe in plain English WHAT you want to extract.

---------- Post updated at 20:02 ---------- Previous update was at 19:55 ----------

Maybe a first step :

awk '{CNT=CNT + gsub (/{/,"") - gsub(/}/,""); if (CNT==1 && !/^ *$/) print}' file
    planeDictName
    planeDictName2

Inside functions{} there will be many sub-dicts.
One of them is:

    planeDictName
     {
         type            surfaces;
         functionObjectLibs ( "libsampling.so" );
         outputControl   timeStep;
         surfaceFormat   vtk;
         fields          ( p U );
         interpolationScheme cellPoint;
         surfaces
         (
             planeName
             {                 type plane;
                     basePoint (0.0 0.0 0.025);
                     normalVector (0 0 1);
                 triangulate false;
                 interpolate true;
             }
         );
     }

From this I want to store the names (in red colour) in shell variables.

For this case, consider the shell variables
dictName, planeName, fieldNames

dictName = planeDictName
planeName = planeName
fieldNames[1] = p
fieldNames[2] = U

Another question. If these names are user defined, how will I know I'm in the right structure and not the wrong one, if not by the name?

Each name must be a continuous string containing only a-z and 0-9, otherwise the application will not work.

Typical names are as follows:
sampleData1
dataForPlane1
etc..,

sampleData1 contains more than a-z and 0-9.

Do you mean that D does not belong to that set?
Sorry, please add A-Z as well.

Try

awk     '                       {CNT=CNT+gsub(/{/,"")-gsub(/}/,""); if (CNT==1 && !/^ *$/) print "dictname[" ++dc "]=" $0}
         /fields/               {for (i=3; i<NF; i++) print "fieldname[" ++fc "]="$i}
         /^ *surfaces */        {S=1; next}
         S                      {gsub(/[^0-9A-Za-z]*/, ""); if (!/^ *$/) {print "planename[" ++pc "]=" $0; S=0}}
        ' file
dictname[1]=    planeDictName
fieldname[1]=p
fieldname[2]=U
planename[1]=planeName
dictname[2]=    planeDictName2
fieldname[3]=p
fieldname[4]=U
planename[2]=planeName

Thanks a lot, it is working fine up to this point.

I want to add one more delimiter to skip any other entities with {}, as the current script picks up strings inside any {}.
The ideal condition is functions{ "read here only" }

Use the "surfaces" line as an example to implement your own solution.

One more issue:

The number of fields is not fixed.
For example, in dict1 the fields may be p and U,
while in dict2 the only field may be T.

What I mean is that, instead of saving the field names in one continuous array, something like dict[1].fieldName[1] = p, dict[1].fieldName[2] = U etc.
would be less ambiguous.

---------- Post updated at 12:05 PM ---------- Previous update was at 01:30 AM ----------

Can you explain this condition?

if (CNT==1 && !/^ *$/)

Why don't you give it a try? Every fieldname will have its own array element.
EDIT: Oh, I get you now. Shells don't have those structures. Recent shells with associative arrays might allow for an approach coming close...
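
A minimal sketch of the associative-array idea, assuming bash 4 or later. The sample file contents (dictA, dictB) and the key format "dictIndex,fieldIndex" are made up for illustration and are not part of the original question. awk prints one triple per field and a bash loop stores it, so nothing has to be sourced.

```shell
#!/bin/bash
# Hypothetical sample input, written to a temp file so the sketch is self-contained.
sample=$(mktemp)
cat > "$sample" <<'EOF'
functions
{
    dictA
    {
        fields          ( p U );
    }
    dictB
    {
        fields          ( T );
    }
}
EOF

declare -A fieldName   # keys look like "dictIndex,fieldIndex" (an assumed convention)

# awk prints "dictIndex fieldIndex name" for every field; the loop stores each triple.
while read -r d f name; do
    fieldName["$d,$f"]=$name
done < <(awk '
    { CNT += gsub(/[{]/, "") - gsub(/[}]/, "") }      # track brace depth, dropping the braces
    CNT == 1 && !/^ *$/ { dc++; fc = 0 }              # non-blank line at depth 1: a new sub-dict
    /fields/ { for (i = 3; i < NF; i++) print dc, ++fc, $i }
' "$sample")
rm -f "$sample"

echo "${fieldName[1,1]} ${fieldName[1,2]} ${fieldName[2,1]}"
```

With this sample the fields of the first sub-dict end up under the keys 1,1 and 1,2, and the single field of the second one under 2,1, which is about as close as bash gets to dict[1].fieldName[1].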

CNT represents the "level" of "{...}" nestings. So, if the level is 1 deep, and if the line holds more than "nothing" (an empty line or only spaces), print it. This is, of course, heavily dependent on the structure of your file. If all the info were written in a single line, that logic would be doomed.
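
To make the role of CNT visible, here is a small sketch that prints the depth in effect after each line. The sample file is a shortened, hypothetical version of the one in the question. Unlike the one-liner above, the braces are replaced by themselves so the lines stay intact, and they are written as [{] and [}], which is only a portability precaution.

```shell
#!/bin/bash
# Hypothetical, shortened sample input.
sample=$(mktemp)
cat > "$sample" <<'EOF'
functions
{
    planeDictName
    {
        type surfaces;
    }
}
EOF

# gsub returns the number of replacements, so CNT goes up for "{" and down for "}".
depths=$(awk '{CNT = CNT + gsub(/[{]/, "{") - gsub(/[}]/, "}"); print CNT ": " $0}' "$sample")
printf '%s\n' "$depths"
rm -f "$sample"
```

The line holding planeDictName is reported at depth 1, while everything inside its own braces is at depth 2. That is why the CNT==1 test picks out the sub-dict names.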

#!/bin/bash
declare -a dictName;
declare -a planeName;
declare -a fieldName;
dCount=0;
pCount=0;
fCount=0;

awk     '/^ *functions */       {F=1; next}
         F                      {CNT=CNT+gsub(/{/,"")-gsub(/}/,""); if (CNT==1 && !/^ *$/) {dictName[dCount]=$0; dCount=$((dCount+1))} }
         /fields/               {for (i=3; i<NF; i++) {fieldName[fCount]=$i; fCount=$((fCount+1))}}
         /^ *surfaces */        {S=1; next}
         S                      {gsub(/[^0-9A-Za-z]*/, ""); if (!/^ *$/) {planeName[pCount]=$i; pCount=$((pCount+1)); S=0}}
        ' file

Hi, will something like the above work?

I am able to get whatever names I want, but I am unable to store them in a string array.

I don't think that will work. You can't use shell variables inside an awk script. There is a mechanism to pass variables in (cf. man awk), but you won't get any values back into shell variables except by command substitution. On top of that, you stopped using that F logical variable halfway. Try something like this as the second line

!F {next}

and then, somewhere reasonable, something containing

!CNT {exit}

This may fail if the input file structure is different from the one you presented.
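
One way the two suggested lines could fit together is sketched below. It assumes the keyword functions stands alone on its line, and the extra flag named opened is an addition so that !CNT does not trigger before the first brace has been seen. The blocks called other and trailing are made up to show that they are skipped.

```shell
#!/bin/bash
# Hypothetical input with unrelated blocks before and after functions.
sample=$(mktemp)
cat > "$sample" <<'EOF'
other
{
    notWanted
    {
    }
}
functions
{
    planeDictName
    {
        type surfaces;
    }
    planeDictName2
    {
        type surfaces;
    }
}
trailing
{
    alsoNotWanted
    {
    }
}
EOF

names=$(awk '
    /^ *functions *$/   { F = 1; next }                          # start of the block of interest
    !F                  { next }                                 # skip everything before it
                        { CNT += gsub(/[{]/, "") - gsub(/[}]/, "") }
    CNT == 1 && !/^ *$/ { gsub(/ /, ""); print }                 # sub-dict name at depth 1
    CNT                 { opened = 1 }                           # at least one brace was opened
    opened && !CNT      { exit }                                 # functions block is closed
' "$sample")
rm -f "$sample"

printf '%s\n' "$names"
```

Only planeDictName and planeDictName2 are printed here. As noted above, a differently laid out file (for example "functions {" on one line) would need a different start condition.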

You could try printing all the assignments to a file and then source that file from your shell. Or, in recent shells with "process substitution", something like

. <(awk 'BEGIN {print "X=17"}')
echo $X
17
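
For completeness, a small sketch of the other two mechanisms mentioned in this reply: awk's -v option to pass a shell value in, and command substitution to get a value back. The variable names wanted, n and second are made up for illustration.

```shell
#!/bin/bash
wanted=2    # hypothetical shell variable: which entry to pick

# -v copies the shell value into the awk variable n; $(...) captures what awk prints.
second=$(printf '%s\n' planeDictName planeDictName2 | awk -v n="$wanted" 'NR == n {print}')
echo "$second"
```

This returns a single value per awk call, which is why the sourcing approach is more convenient when many array elements have to be filled at once.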

OMG... I see there is a lot more to understand before I can finish this job.

Can you give me an example of storing a variable like this?

declare -a dictName
j=0
while(some condition that satisfies my requirement-for looping)
dictName[j] <(awk '/^ *functions */       {F=1; next}
         F                      {CNT=CNT+gsub(/{/,"")-gsub(/}/,""); if (CNT==1 && !/^ *$/) {dictName[dCount]=$0; dCount=$((dCount+1))} }')

Thanks and Regards,
linuxUser_

---------- Post updated at 04:51 PM ---------- Previous update was at 04:49 PM ----------

One more thing: why is there a space before the dictNames when printing?
Can I remove that space?

dictname[1]=[SPACE]planeDictName

As I said, awk does NOT use shell variables like that, nor the $((...+1)) shell construct. As you are using bash, the process substitution might work. If

awk '                       {CNT=CNT+gsub(/{/,"")-gsub(/}/,"");
                                 if (CNT==1 && !/^ *$/) {gsub (/ /,_); print "dictname[" ++dc "]=" $0}
                                }
         /fields/               {for (i=3; i<NF; i++) print "fieldname[" ++fc "]="$i}
         /^ *surfaces */        {S=1; next}
         S                      {gsub(/[^0-9A-Za-z]*/, ""); if (!/^ *$/) {print "planename[" ++pc "]=" $0; S=0}}
        ' file

produces

dictname[1]=planeDictName
fieldname[1]=p
fieldname[2]=U
planename[1]=planeName
dictname[2]=planeDictName2
fieldname[3]=p
fieldname[4]=U
fieldname[5]=T
planename[2]=planeName

, sourcing the process substitution

. <(awk    '            {CNT=CNT+... )

will assign all those variables:

echo ${dictname[@]} ${planename[@]} ${fieldname[@]} 
planeDictName planeDictName2 planeName planeName p U p U T

But - be careful with that sourcing, as every malicious result will be executed as well!
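
If the input file cannot be trusted, one alternative is to avoid sourcing altogether: let awk print plain names, one per line, and fill the array in a read loop that accepts only letters and digits (the rule for valid names given earlier in the thread). This is a sketch under those assumptions. The variable output stands in for real awk output, and its second line simulates hostile content.

```shell
#!/bin/bash
# Stand-in for awk output; the middle line imitates something malicious in the file.
output=$(printf '%s\n' 'planeDictName' '$(echo injected)' 'planeDictName2')

dictname=()
while IFS= read -r line; do
    # Keep only names consisting of letters and digits; everything else is dropped.
    if [[ $line =~ ^[A-Za-z0-9]+$ ]]; then
        dictname+=("$line")
    fi
done <<< "$output"

echo "${#dictname[@]}: ${dictname[*]}"
```

The lines are only ever treated as data, so the suspicious one is discarded instead of being executed. Note that this array starts at index 0, not 1.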
