Split JSON to different data files

Hi Gurus,

I have below JSON file, now I want to rewrite this file into a new file.
I will appreciate if anyone can help me to provide the solution...I can't use jq.

{
   "_id": "3ad893cb4cf1560add7b4caffd4b6126",
   "_rev": "1-1f0ce165e1d210319cf6e9f9c6ff654f",
   "name": �couchdb_1.couchdb",
   "type": "couchdb",
   "ts": 1445785730,
   "couchdb": { 
       "auth_cache_misses": { "current": null, "sum": null, "mean":  null, "stddev": null, "min": null, "max": null },
       "database_writes":   { "current": 1955, "sum": 1955, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "open_databases":    { "current": 47, "sum": 47, "mean": 0, "stddev": 0.03, "min": 0, "max": 14 },
       "auth_cache_hits":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "request_time":      { "current": 934798.325, "sum": 934798.325, "mean": 247.236, "stddev": 9323.841, "min": 0, "max": 415733 },
       "database_reads":    { "current": 688315, "sum": 688315, "mean": 1.316, "stddev": 69.941, "min": 0, "max": 5497 },
       "open_os_files":     { "current": 101, "sum": 101, "mean": 0, "stddev": 0.061, "min": -1, "max": 28 }
   },
   "httpd_request_methods": {
       "PUT":    { "current": 18, "sum": 18, "mean": 0, "stddev": 0.009, "min": 0, "max": 1 },
       "GET":    { "current": 11172, "sum": 11172, "mean": 0.021, "stddev": 0.747, "min": 0, "max": 66 },
       "COPY":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "DELETE": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.003, "min": 0, "max": 1 },
       "POST":   { "current": 1948, "sum": 1948, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "HEAD":   { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 } 
   },
   "httpd_status_codes": { 
       "200": { "current": 9073, "sum": 9073, "mean": 0.017, "stddev": 0.589, "min": 0, "max": 53 },
       "201": { "current": 1949, "sum": 1949, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1},
       "202": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "301": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "304": { "current": 81, "sum": 81, "mean": 0, "stddev": 0.026, "min": 0, "max": 3 },
       "400": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.005, "min": 0, "max": 1 },
       "401": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "403": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "404": { "current": 1585, "sum": 1585, "mean": 0.007, "stddev": 0.375, "min": 0, "max": 33 },
       "405": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "409": { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "412": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.006, "min": 0, "max": 1 },
       "500": { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 }
   },
   "httpd": {
       "clients_requesting_changes": { "current": 0, "sum": 0, "mean": 0, "stddev": 0.033, "min": -2, "max": 2 },
       "temporary_view_reads":       { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "requests":                   { "current": 12186, "sum": 12186, "mean": 0.023, "stddev": 0.751, "min": 0, "max": 66 },
       "bulk_requests":              { "current": 1920, "sum": 1920, "mean": 0.004, "stddev": 0.06, "min": 0, "max": 1 },
       "view_reads":                 { "current": 206, "sum": 206, "mean": 0.003, "stddev": 0.062, "min": 0, "max": 2 }
   }
}

Now data file should be couchdb.txt
with content as below (if NULL then 0)

couchdb,couchdb=auth_cache_misses, current=0,sum=0, mean=0,stddev=0, min=0, max=0
couchdb,couchdb=database_writes, current=1955, sum=1955, mean=0.004, stddev=0.061, min=0, max=1
......until couchdb block finished.

Then in the same file, next block httpd_request_methods will write:

couchdb, httpd_request_methods=PUT, current=18, sum=18, mean=0, stddev=0.009, min=0, max=1
couchdb, httpd_request_methods=GET, current= 11172, sum= 11172, mean=0.021, stddev=0.747, min=0, max=66
.....until httpd_request_methods finished.

next httpd_status_codes will write

couchdb, httpd_status_codes=200, current=9073,sum=9073,mean=0.017, stddev=0.589,min=0, max=53
couchdb, httpd_status_codes=201, current=1949, sum=1949, mean=0.004, stddev=0.061, min=0, max=1
.......
couchdb,httpd_status_codes=500,current=1,sum=1,mean=0,stddev=0.004, min=0, max=1
until httpd_status_codes finished 

next httpd block will write

couchdb, httpd=clients_requesting_changes,current=0, sum=0, mean=0, stddev=0.033,min=-2, max=2
......
couchdb, httpd= view_reads,current=206,sum=206,mean=0.003,stddev=0.062,min=0, max=2
until httpd block finished.

Since it is a JavaScript object try JavaScript using your favorite browser (then save the page to create a file, or write JavaScript to save to file (if the browser supports it)):

<html>
<head>
<script>
obj = {
   "_id": "3ad893cb4cf1560add7b4caffd4b6126",
   "_rev": "1-1f0ce165e1d210319cf6e9f9c6ff654f",
   "name": "couchdb_1.couchdb",
   "type": "couchdb",
   "ts": 1445785730,
   "couchdb": {
       "auth_cache_misses": { "current": null, "sum": null, "mean":  null, "stddev": null, "min": null, "max": null },
       "database_writes":   { "current": 1955, "sum": 1955, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "open_databases":    { "current": 47, "sum": 47, "mean": 0, "stddev": 0.03, "min": 0, "max": 14 },
       "auth_cache_hits":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "request_time":      { "current": 934798.325, "sum": 934798.325, "mean": 247.236, "stddev": 9323.841, "min": 0, "max": 415733 },
       "database_reads":    { "current": 688315, "sum": 688315, "mean": 1.316, "stddev": 69.941, "min": 0, "max": 5497 },
       "open_os_files":     { "current": 101, "sum": 101, "mean": 0, "stddev": 0.061, "min": -1, "max": 28 }
   },
   "httpd_request_methods": {
       "PUT":    { "current": 18, "sum": 18, "mean": 0, "stddev": 0.009, "min": 0, "max": 1 },
       "GET":    { "current": 11172, "sum": 11172, "mean": 0.021, "stddev": 0.747, "min": 0, "max": 66 },
       "COPY":   { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "DELETE": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.003, "min": 0, "max": 1 },
       "POST":   { "current": 1948, "sum": 1948, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1 },
       "HEAD":   { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 }
   },
   "httpd_status_codes": {
       "200": { "current": 9073, "sum": 9073, "mean": 0.017, "stddev": 0.589, "min": 0, "max": 53 },
       "201": { "current": 1949, "sum": 1949, "mean": 0.004, "stddev": 0.061, "min": 0, "max": 1},
       "202": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "301": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "304": { "current": 81, "sum": 81, "mean": 0, "stddev": 0.026, "min": 0, "max": 3 },
       "400": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.005, "min": 0, "max": 1 },
       "401": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "403": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "404": { "current": 1585, "sum": 1585, "mean": 0.007, "stddev": 0.375, "min": 0, "max": 33 },
       "405": { "current": null, "sum": null, "mean": null, "stddev": null, "min": null, "max": null },
       "409": { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "412": { "current": 2, "sum": 2, "mean": 0, "stddev": 0.006, "min": 0, "max": 1 },
       "500": { "current": 1, "sum": 1, "mean": 0, "stddev": 0.004, "min": 0, "max": 1 }
   },
   "httpd": {
       "clients_requesting_changes": { "current": 0, "sum": 0, "mean": 0, "stddev": 0.033, "min": -2, "max": 2 },
       "temporary_view_reads":       { "current": 4, "sum": 4, "mean": 0, "stddev": 0.008, "min": 0, "max": 1 },
       "requests":                   { "current": 12186, "sum": 12186, "mean": 0.023, "stddev": 0.751, "min": 0, "max": 66 },
       "bulk_requests":              { "current": 1920, "sum": 1920, "mean": 0.004, "stddev": 0.06, "min": 0, "max": 1 },
       "view_reads":                 { "current": 206, "sum": 206, "mean": 0.003, "stddev": 0.062, "min": 0, "max": 2 }
   }
};

for (i in obj) {
   if ((typeof obj) === "object") {
      for (j in obj) {
         if ((typeof obj[j]) === "object") {
            document.write(obj.type + "," + i + "=" + j + ",");
            var c=0;
            for (k in obj[j]) {
               document.write(((c++==0) ? "" : ",") + k + "=" + ((! obj[j][k]) ? 0 : obj[j][k]));
            }
         }
         document.write("<br>");
      }
   }
}
</script>
</head>
</html>

If the browser cannot be used then run a utility that can interpret and run JavaScript in the command line or UI like PhantomJS or other.

1 Like

rdrtx1's approach is probably the way to go, since it can parse the JSON syntax properly.

Here is an example what might be done with awk:

awk '
  /\{[ \t]*$/{
    level++
  }
  /^[ \t]*\},?[ \t]*$/ {
    level--; object=""
  } 
  level==1 && $2=="type"{
    close(objfile)
    objtype=$4
    objfile=objtype ".txt"
  }
  level==2 && object=="" {
    object=$2
    next
  }
  level==2 {
    line=objtype OFS object "=" $2
    for(i=4; i<=NF; i+=2) {
      split($(i+1),F,"[:, \t]*")
      line=line OFS $i "=" (F[2]=="null" ? 0 : F[2])
    }
    print line > objfile
  }
' FS=\" OFS=", " file

Output:

couchdb, couchdb=auth_cache_misses, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, couchdb=database_writes, current=1955, sum=1955, mean=0.004, stddev=0.061, min=0, max=1
couchdb, couchdb=open_databases, current=47, sum=47, mean=0, stddev=0.03, min=0, max=14
couchdb, couchdb=auth_cache_hits, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, couchdb=request_time, current=934798.325, sum=934798.325, mean=247.236, stddev=9323.841, min=0, max=415733
couchdb, couchdb=database_reads, current=688315, sum=688315, mean=1.316, stddev=69.941, min=0, max=5497
couchdb, couchdb=open_os_files, current=101, sum=101, mean=0, stddev=0.061, min=-1, max=28
couchdb, httpd_request_methods=PUT, current=18, sum=18, mean=0, stddev=0.009, min=0, max=1
couchdb, httpd_request_methods=GET, current=11172, sum=11172, mean=0.021, stddev=0.747, min=0, max=66
couchdb, httpd_request_methods=COPY, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_request_methods=DELETE, current=2, sum=2, mean=0, stddev=0.003, min=0, max=1
couchdb, httpd_request_methods=POST, current=1948, sum=1948, mean=0.004, stddev=0.061, min=0, max=1
couchdb, httpd_request_methods=HEAD, current=1, sum=1, mean=0, stddev=0.004, min=0, max=1
couchdb, httpd_status_codes=200, current=9073, sum=9073, mean=0.017, stddev=0.589, min=0, max=53
couchdb, httpd_status_codes=201, current=1949, sum=1949, mean=0.004, stddev=0.061, min=0, max=1}
couchdb, httpd_status_codes=202, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=301, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=304, current=81, sum=81, mean=0, stddev=0.026, min=0, max=3
couchdb, httpd_status_codes=400, current=2, sum=2, mean=0, stddev=0.005, min=0, max=1
couchdb, httpd_status_codes=401, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=403, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=404, current=1585, sum=1585, mean=0.007, stddev=0.375, min=0, max=33
couchdb, httpd_status_codes=405, current=0, sum=0, mean=0, stddev=0, min=0, max=0
couchdb, httpd_status_codes=409, current=4, sum=4, mean=0, stddev=0.008, min=0, max=1
couchdb, httpd_status_codes=412, current=2, sum=2, mean=0, stddev=0.006, min=0, max=1
couchdb, httpd_status_codes=500, current=1, sum=1, mean=0, stddev=0.004, min=0, max=1
couchdb, httpd=clients_requesting_changes, current=0, sum=0, mean=0, stddev=0.033, min=-2, max=2
couchdb, httpd=temporary_view_reads, current=4, sum=4, mean=0, stddev=0.008, min=0, max=1
couchdb, httpd=requests, current=12186, sum=12186, mean=0.023, stddev=0.751, min=0, max=66
couchdb, httpd=bulk_requests, current=1920, sum=1920, mean=0.004, stddev=0.06, min=0, max=1
couchdb, httpd=view_reads, current=206, sum=206, mean=0.003, stddev=0.062, min=0, max=2

But this is only an approximation and it may work with this particular sample. However, if there is a variation in input file format, then it will likely fail, whereas rdtx1's approach will probably still work..

--
Note: the input file contains

   "name": �couchdb_1.couchdb",

which has a wrong quote character. It should be:

   "name": "couchdb_1.couchdb",
1 Like

thanks, geniuses... I couldn't be happy more. now, I need to see if I can store this datafile in influx. let you guys know.

Some python code for fun :slight_smile:

[highlight=python]#! /usr/bin/env python

from json import load

f_in = open('data.json')
data = load(f_in)
f_in.close()

f_out = open('output.dat', 'w')
for key in ('couchdb', 'httpd_request_methods', 'httpd_status_codes'):
for k1, v1 in data[key].iteritems():
string = "%s, %s=%s, " % ("couchdb", key, k1)
for k2, v2 in v1.iteritems():
string += "%s=%s, " % (k2, v2)
string = string.rstrip(', ') + '\n'
f_out.write(string)
f_out.close()
[/highlight]