JSON structure to table form in awk, bash

Hello guys,

I want to parse a JSON file in order to get the data in a table form.

My JSON file is like this:

{
   "document":{
      "page":[
         {
            "@index":"0",
            "image":{
               "@data":"ABC",
               "@format":"png",
               "@height":"620.00",
               "@type":"base64encoded",
               "@width":"450.00",
               "@x":"85.00",
               "@y":"85.00"
            }
         },
         {
            "@index":"1",
            "row":[
               {
                  "column":[
                     {
                        "text":""
                     },
                     {
                        "text":{
                           "#text":"Text1",
                           "@fontName":"Arial",
                           "@fontSize":"12.0",
                           "@height":"12.00",
                           "@width":"71.04",
                           "@x":"121.10",
                           "@y":"83.42"
                        }
                     }
                  ]
               },
               {
                  "column":[
                     {
                        "text":""
                     },
                     {
                        "text":{
                           "#text":"Text2",
                           "@fontName":"Arial",
                           "@fontSize":"12.0",
                           "@height":"12.00",
                           "@width":"101.07",
                           "@x":"121.10",
                           "@y":"124.82"
                        }
                     }
                  ]
               }
            ]
         },
         {
            "@index":"2",
            "row":[
               {
                  "column":{
                     "text":{
                        "#text":"Text3",
                        "@fontName":"Arial",
                        "@fontSize":"12.0",
                        "@height":"12.00",
                        "@width":"363.44",
                        "@x":"85.10",
                        "@y":"69.62"
                     }
                  }
               },
               {
                  "column":{
                     "text":{
                        "#text":"Text4",
                        "@fontName":"Arial",
                        "@fontSize":"12.0",
                        "@height":"12.00",
                        "@width":"382.36",
                        "@x":"85.10",
                        "@y":"83.42"
                     }
                  }
               },
               {
                  "column":{
                     "text":{
                        "#text":"Text5",
                        "@fontName":"Arial",
                        "@fontSize":"12.0",
                        "@height":"12.00",
                        "@width":"435.05",
                        "@x":"85.10",
                        "@y":"97.22"
                     }
                  }
               }
            ]
         },
         {
            "@index":"3"
         }
      ]
   }
}

I've been trying with awk as shown below, but the output is far from what I'd like:

awk -F: '
        /"#text"/      {a01=$2}
        /"@data"/      {a02=$2}
        /"@fontName"/  {a03=$2}
        /"@fontSize"/  {a04=$2}
        /"@fontStyle"/ {a05=$2}
        /"@format"/    {a06=$2}
        /"@height"/    {a07=$2}
        /"@type"/      {a08=$2}
        /"@width"/     {a09=$2}
        /"@x"/         {a10=$2}
        /"@y"/         {z=a01" "a01" "a02" "a03" "a04" "a05" "a06" "a07" "a08" "a09" "a10" "$2; print z}' input.json

|"85.00",,coded",
|"83.42"",coded",
|"124.82",coded",
|"69.62",,coded",
|"83.42",,coded",
|"97.22",,coded",

I'd like to get output like this (where NaN marks a parameter that has no value):

   #text @data @fontName @fontSize @format @height          @type  @width      @x      @y
0    NaN   ABC       NaN       NaN     png  620.00  base64encoded  450.00   85.00   85.00
1  Text1   NaN     Arial      12.0     NaN   12.00            NaN   71.04  121.10   83.42
2  Text2   NaN     Arial      12.0     NaN   12.00            NaN  101.07  121.10  124.82
3  Text3   NaN     Arial      12.0     NaN   12.00            NaN  363.44   85.10   69.62
4  Text4   NaN     Arial      12.0     NaN   12.00            NaN  382.36   85.10   83.42
5  Text5   NaN     Arial      12.0     NaN   12.00            NaN  435.05   85.10   97.22

Could someone help me out with this problem? Thanks.

Please search the forums for json shell utilities.

This same question has been asked and answered a few times already.

Thanks!

Thanks for your answer. Maybe you or another expert could show me an example that applies to my case, with awk or another specific tool.

Thanks in advance

Maybe you can do some of your own reading and study:

https://stedolan.github.io/jq/

jq is a lightweight and flexible command-line JSON processor.

There seem to be plenty of examples on the jq site.
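
As a starting point for the JSON in this thread, a minimal jq sketch might look like the following. It is only a sketch under some assumptions: jq 1.5 or later is installed, every object that should become a row carries an "@y" key (true for the sample posted above), and the function name json_to_tsv is made up for illustration. It walks the whole document, keeps the objects that have "@y", and prints one tab-separated row per object, with NaN for missing keys.

```shell
# Hypothetical helper: print the table as tab-separated values.
# Usage: json_to_tsv input.json
json_to_tsv() {
  jq -r '
    # Column order taken from the table requested in the question.
    ["#text","@data","@fontName","@fontSize","@format",
     "@height","@type","@width","@x","@y"] as $cols
    | ( [""] + $cols | @tsv ),                      # header, empty cell above the index
      ( [ .. | objects | select(has("@y")) ]        # every object that has an "@y" key
        | to_entries[]                              # pair each object with its position
        | .key as $i | .value as $o
        | [$i] + [ $cols[] as $c | ($o[$c] // "NaN") ]
        | @tsv )
  ' "$1"
}
```

Piping the result through "column -t" should line the columns up, because no cell in the data rows is empty.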

See also, here at unix.com:

JSON Output format

Small modifications to your awk approach get you pretty close to what you want:

awk -F: '
BEGIN           {a01 = a02 = a03 = a04 = a05 = a06 = a07 = a08 = a09 = a10 = "NaN"     # default for missing values
                 print "#text @data @fontName @fontSize @fontStyle @format @height @type @width @x @y"
                }
                {gsub (/[",]/, "")                                                     # strip quotes and commas
                }
/#text/         {a01=$2}
/@data/         {a02=$2}
/@fontName/     {a03=$2}
/@fontSize/     {a04=$2}
/@fontStyle/    {a05=$2}
/@format/       {a06=$2}
/@height/       {a07=$2}
/@type/         {a08=$2}
/@width/        {a09=$2}
/@x/            {a10=$2}
/@y/            {print CNT++ " " a01 " " a02 " " a03 " " a04 " " a05 " " a06 " " a07 " " a08 " " a09 " " a10 " " $2   # "@y" closes each record
                 a01 = a02 = a03 = a04 = a05 = a06 = a07 = a08 = a09 = a10 = "NaN"     # reset for the next record
                }
'  file
#text @data @fontName @fontSize @fontStyle @format @height @type @width @x @y
0 NaN ABC NaN NaN NaN png 620.00 base64encoded 450.00 85.00 85.00
1 Text1 NaN Arial 12.0 NaN NaN 12.00 NaN 71.04 121.10 83.42
2 Text2 NaN Arial 12.0 NaN NaN 12.00 NaN 101.07 121.10 124.82
3 Text3 NaN Arial 12.0 NaN NaN 12.00 NaN 363.44 85.10 69.62
4 Text4 NaN Arial 12.0 NaN NaN 12.00 NaN 382.36 85.10 83.42
5 Text5 NaN Arial 12.0 NaN NaN 12.00 NaN 435.05 85.10 97.22

Thanks so much RudyC. It works pretty nicely.
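
For anyone who finds this thread later, here is a slightly more defensive variant of the same awk idea, offered as a sketch rather than a finished tool. It makes several assumptions: the JSON is pretty-printed with one "key":"value" pair per line, as in the sample above; "@y" is always the last key of a record; and the function name json_to_table is invented for illustration. Compared with the approach above, it tolerates Windows (CRLF) line endings, which may be what garbled the output in the original post, keeps values that contain a colon intact, prints only the ten columns requested in the question (so no @fontStyle), and right-aligns them.

```shell
# Hypothetical helper: print the table for a pretty-printed JSON file.
# Usage: json_to_table input.json
json_to_table() {
  awk '
  BEGIN {
      # Columns requested in the question, in that order.
      n = split("#text @data @fontName @fontSize @format @height @type @width @x @y", col, " ")
      for (i = 1; i <= n; i++) want[col[i]] = 1
      line = sprintf("%2s", "")
      for (i = 1; i <= n; i++) line = line sprintf(" %13s", col[i])
      print line
  }
  {
      sub(/\r$/, "")                                   # tolerate CRLF line endings
      # Only lines of the form  "#key":"value"  or  "@key":"value"  are of interest.
      if ($0 !~ /^[ \t]*"[#@][^"]*"[ \t]*:[ \t]*"/) next
      line = $0
      sub(/^[ \t]*"/, "", line)                        # drop indentation and opening quote
      key = line; sub(/".*/, "", key)                  # key ends at the next quote
      val = line
      sub(/^[^"]*"[ \t]*:[ \t]*"/, "", val)            # drop the key, the colon, the opening quote
      sub(/",?[ \t]*$/, "", val)                       # drop the closing quote and optional comma
      if (key in want) v[key] = val
      if (key == "@y") {                               # assumption: "@y" closes each record
          row = sprintf("%2d", cnt++)
          for (i = 1; i <= n; i++)
              row = row sprintf(" %13s", (col[i] in v) ? v[col[i]] : "NaN")
          print row
          split("", v)                                 # portable way to empty the array
      }
  }' "$@"
}
```

This is still line-oriented text matching, not JSON parsing, so it will break on minified input or on keys in a different order; a real JSON tool is the safer choice for anything beyond files shaped like the sample.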