Remove comma after last value inside an array (Linux)

akshay.kulkarni · August 12, 2022, 7:08am

I have this type of data. The "percents" array can hold multiple values, one after another.

I want to remove the comma after the last value inside the array, but have had no luck so far.

          "percentiles_value": {
             "percentiles": {
                 "field": "value",
                 "percents": [
                      25,
                      50,
                  ],
                 "keyed": true,
                 "tdigest": {
                     "compression": 100
                 }
             }
        }
    }

Can anybody suggest how to achieve this with sed or awk?

Neo · August 12, 2022, 7:57am

The larger question is "why?"

That comma should not cause an error in any JSON parser, so why spend time on such a task?

Is this homework or related to some school assignment?

akshay.kulkarni · August 12, 2022, 8:51am

This is neither homework nor a school assignment.

For your information, this is an OpenSearch transform job that handles a large volume of bank data.

And yes, it is as simple as that: the comma after the last value causes the problem.

akshay.kulkarni · August 12, 2022, 8:51am

The job executed successfully once the comma after the last value was removed.

MadeInGermany · August 12, 2022, 10:32am

Here is a sed "solution":

sed '$!N; s/,\( *\n *\]\)/\1/; P; D'

It puts two lines in the input buffer, looks for a comma in the first line and a ] in the second line.

N appends the next line to the input buffer; the embedded newline is matched by \n
P prints the first line of the buffer
D deletes the first line of the buffer and restarts the cycle without reading a new line
\( \) capture group
\1 reference to the match of the 1st capture group
The $! is needed for a non-GNU sed, where N on the last line exits without the default print.
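
To try the command on data shaped like the sample in the question, here is a minimal sketch. The wrapper name strip_trailing_commas and the trimmed-down input are illustrative assumptions, not part of the original job script.

```shell
#!/bin/sh
# Hypothetical wrapper name; the sed program itself is the one shown above.
# With no arguments it filters stdin, otherwise it reads the named files.
strip_trailing_commas() {
    sed '$!N; s/,\( *\n *\]\)/\1/; P; D' "$@"
}

# Trimmed-down sample modelled on the data in the question.
strip_trailing_commas <<'EOF'
"percents": [
     25,
     50,
 ],
"keyed": true
EOF
```

Only the comma after 50 is removed; the comma after the closing ] stays. The pattern matches only when the comma ends one line and the ] begins the next (spaces allowed), so an array written on a single line such as [25, 50,] is left untouched.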

Of course sed has no idea about the structure.
A real solution is to fix your json parser.
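
Since the question asked for sed or awk, the same one-line lookahead can also be sketched in awk. This is an illustrative alternative that was not posted in the thread, and the function name strip_trailing_commas_awk is hypothetical. Like the sed version, it is purely line-based and knows nothing about JSON structure.

```shell
#!/bin/sh
# Hypothetical helper: hold each line back by one, and when the current
# line starts with "]" strip a trailing comma from the held line.
strip_trailing_commas_awk() {
    awk '
        NR > 1 {
            # Current line opens with "]": drop the comma ending the previous line.
            if ($0 ~ /^[ \t]*\]/) sub(/,[ \t]*$/, "", prev)
            print prev
        }
        { prev = $0 }                     # remember this line for the next cycle
        END { if (NR > 0) print prev }    # flush the last held line
    ' "$@"
}

# Same trimmed-down sample as in the question.
strip_trailing_commas_awk <<'EOF'
"percents": [
     25,
     50,
 ],
"keyed": true
EOF
```

Unlike the sed pattern, this sketch also tolerates tabs around the comma and the bracket; it still ignores trailing commas before a closing } and arrays written on one line.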

Neo · August 12, 2022, 12:48pm

That is interesting.

Normally, in my experience with JSON in JavaScript, Ruby and PHP, a trailing comma in an array such as the one you described so well is ignored and does not cause an exception.

Thank you for the excellent reply and explanation @akshay.kulkarni

I guess the OpenSearch parser is "very restrictive" for reasons unknown to us, since other parsers just ignore a trailing comma in an array such as the one you have provided.

akshay.kulkarni · August 12, 2022, 1:05pm

Thanks @MadeInGermany

The suggested solution worked well with my transform_job_generator_script.

And yes, Elasticsearch/OpenSearch are very restrictive.

Here is the generated json output:

Neo · August 12, 2022, 1:08pm

Looks like in this case, @akshay.kulkarni is not in a position to change the (overly restrictive) behavior of the parser, so thanks to @MadeInGermany for his excellent suggestion, "saving the day" for Akshay.

Well done.

Neo · August 12, 2022, 1:19pm

Note: it seems there should be a configuration variable for that parser (I could not find one) that sets the JSON parser mode to "less restrictive", so that this comma is not an issue.

akshay.kulkarni · August 13, 2022, 9:54am

Thank you all for this useful information.