Convert common string to heading

SOM15 · January 23, 2025, 3:21pm

I want to convert a common string into a heading.
For example, the data is in long format, like below:

1.2.3.4 value1=10000
1.2.3.4 value2=20000
1.2.3.4 value3=30000
2.3.4.5 value1=00001
2.3.4.5 value2=01000
2.3.4.5 value3=02000
6.7.8.9 value1=20300
6.7.8.9 value2=20400
6.7.8.9 value3=20500

Expected output

-  , 1.2.3.4, 2.3.4.5, 6.7.8.9
value1, 10000, 00001, 20300
value2, 20000, 01000, 20400
value3, 30000, 02000, 20500

I tried something like the command below, but could not make it work:

awk '{print $3,$4}'| awk 'NR == 1 {printf "%s", $0; next} {printf ",%s", $0} END {print ""}'

Neo · January 24, 2025, 1:17am

Hi @SOM15

Here is a Ruby solution to transform your input data into the desired output format. I did not test it however:

# Input data as a string (you can replace this with reading from a file)
data = <<~DATA
1.2.3.4 value1=10000
1.2.3.4 value2=20000
1.2.3.4 value3=30000
2.3.4.5 value1=00001
2.3.4.5 value2=01000
2.3.4.5 value3=02000
6.7.8.9 value1=20300
6.7.8.9 value2=20400
6.7.8.9 value3=20500
DATA

# Parse the input
rows = data.split("\n").map { |line| line.split }

# Initialize data structures
headers = []
values = Hash.new { |hash, key| hash[key] = {} }

# Process each line
rows.each do |key, pair|
  # Extract key and value
  value_name, value = pair.split("=")
  
  # Collect headers and values
  headers << key unless headers.include?(key)
  values[value_name][key] = value
end

# Output the result
puts "- ," + headers.join(",")
values.each do |value_name, entries|
  row = headers.map { |header| entries[header] || "NA" }
  puts "#{value_name}," + row.join(",")
end

SOM15 · January 24, 2025, 1:22am

Is there a solution in bash/awk? This is part of my bash script.

Neo · January 24, 2025, 1:25am

You can easily convert this script to any programming language you desire.

Put some effort into your own work, please.

Paul_Pedant · January 24, 2025, 8:46am

It is fairly obvious that the heading line needs data from input rows 1, 4 and 7, and the last output row needs data from input rows 3, 6 and 9. So you need to design a solution which reads and stores all the input (in suitable arrays) and then reorders it for output. This is generally called "pivoting".

I'm not sure why you show an Awk attempt, reject a Ruby solution, but ask for a Bash solution. Please clarify your constraints.
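The store-then-reorder idea described above might be sketched in awk as follows. This is only a sketch under assumptions: a POSIX awk, whitespace-separated input with one name=value pair per line, and a made-up function name pivot. The "NA" placeholder for missing cells follows the Ruby version earlier in the thread.

```shell
#!/bin/bash
# pivot: hypothetical helper name; reads "key name=value" lines on stdin.
pivot() {
  awk '
  NF < 2 { next }                     # skip blank or incomplete lines
  {
    col = $1                          # becomes a column heading
    eq  = index($2, "=")              # split name=value at the first "="
    row = substr($2, 1, eq - 1)
    val = substr($2, eq + 1)
    if (!(col in seen_col)) { seen_col[col] = 1; cols[++ncol] = col }
    if (!(row in seen_row)) { seen_row[row] = 1; rows[++nrow] = row }
    cell[row, col] = val              # SUBSEP-joined key, works in any POSIX awk
  }
  END {
    printf "-"
    for (c = 1; c <= ncol; c++) printf ", %s", cols[c]
    print ""
    for (r = 1; r <= nrow; r++) {
      printf "%s", rows[r]
      for (c = 1; c <= ncol; c++) {
        v = ((rows[r], cols[c]) in cell) ? cell[rows[r], cols[c]] : "NA"
        printf ", %s", v
      }
      print ""
    }
  }'
}

printf '%s\n' '1.2.3.4 value1=10000' '1.2.3.4 value2=20000' \
              '2.3.4.5 value1=00001' '2.3.4.5 value2=01000' | pivot
```

Columns and rows are printed in the order they first appear in the input, which is why two small index arrays (cols and rows) are kept next to the cell array.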

MadeInGermany · January 24, 2025, 6:56pm

In bash, just as you can embed

awk 'awk code'

you can also embed

ruby -e 'ruby code'

or

perl -e 'perl code'

or

python -c 'python code'

The only restriction is that the code itself must not contain a single quote ('), because that would end the quoted string.
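Two common ways around that restriction are sketched below with awk: end the quoted string, add an escaped quote, and reopen the string, or pass the quote character in from the shell with -v. The function names and the variable name q are made up for illustration.

```shell
#!/bin/bash
# Both functions wrap each input line in single quotes.

# 1) Close the quoted awk program, add an escaped quote, and reopen it.
wrap_escaped() {
  awk '{ print "'\''" $0 "'\''" }'
}

# 2) Pass the quote character in as an awk variable (q is an arbitrary name).
wrap_var() {
  awk -v q="'" '{ print q $0 q }'
}

echo "it works" | wrap_escaped
echo "it works" | wrap_var
```

The second form is usually easier to read, because the awk program itself stays free of shell quoting.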

MadeInGermany · January 24, 2025, 7:21pm

In this case, here is @Neo's Ruby code, modified to read from the file "input.dat":

#!/bin/bash
ruby -e '
# Initialize data structures
rows = []
headers = []
values = Hash.new { |hash, key| hash[key] = {} }

# Read lines from stdin
while line = gets
  rows << line.chomp.split
end

# Process each line
rows.each do |key, pair|
  # Extract key and value
  value_name, value = pair.split("=")
  
  # Collect headers and values
  headers << key unless headers.include?(key)
  values[value_name][key] = value
end

# Output the result
puts "- ," + headers.join(",")
values.each do |value_name, entries|
  row = headers.map { |header| entries[header] || "NA" }
  puts "#{value_name}," + row.join(",")
end
' < input.dat

MadeInGermany · January 25, 2025, 4:01pm

And pure bash without external commands:

#!/bin/bash
# Initialize data structures
d=", " # delimiter
headers=""
declare -A values

# Read lines from stdin
while read key other
do
  vname=${other%%=*} val=${other#*=}
  values[$vname]+="$d$val"
  if [[ "$d$headers$d" != *"$d$key$d"* ]]
  then
    headers+="$d$key"
  fi
done < input.dat

# Print the variables
echo "- $headers"
for vname in "${!values[@]}"
do
  echo "$vname${values[$vname]}"
done

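If you know bash well, and the input file is not too long, then you could do the Print section in one command. A single expansion cannot pair each key with its value, so this only works if each stored row already starts with its name, for example by changing the assignment in the loop to values[$vname]="${values[$vname]:-$vname}$d$val":

printf "%s\n" "- $headers" "${values[@]}"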
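One thing to keep in mind with the associative array is that "${!values[@]}" has no defined order, so the rows may not come out in input order. A possible variant is sketched below; it assumes bash 4 or later, and row_order and pivot_bash are made-up names. Like the script above, it assumes every key has every value, so missing cells are not padded.

```shell
#!/bin/bash
# Sketch: same approach as above, plus an indexed array that remembers
# the order in which the value names were first seen.
pivot_bash() {
  local d=", " headers="" key other vname val
  local -A values=()
  local -a row_order=()
  while read -r key other
  do
    [[ -n $other ]] || continue            # skip blank or incomplete lines
    vname=${other%%=*} val=${other#*=}
    # First time this name is seen: remember its position.
    if [[ -z ${values[$vname]+set} ]]
    then
      row_order+=("$vname")
    fi
    values[$vname]+="$d$val"
    if [[ "$d$headers$d" != *"$d$key$d"* ]]
    then
      headers+="$d$key"
    fi
  done
  echo "- $headers"
  for vname in "${row_order[@]}"
  do
    echo "$vname${values[$vname]}"
  done
}

printf '%s\n' '1.2.3.4 value2=20000' '1.2.3.4 value1=10000' \
              '2.3.4.5 value2=01000' '2.3.4.5 value1=00001' | pivot_bash
```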

EmersonPrado · January 25, 2025, 10:54pm

Since you end up with a two-dimensional association, I recommend against pure Bash (or any Linux shell), even though I am more than a fan of it. Bash is quite powerful, but structured data is a serious weak spot.
You can get quite simple solutions with scripting languages like Python or Ruby, which are easy to call from your Bash script, as already recommended. The only thing I would change is to move the embedded code into a dedicated file for the Python, Ruby, or other code. That makes everything easier, including linting each language.
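As a sketch of that layout, the example below keeps the helper in its own file and only calls it from the Bash script. It uses awk rather than Python or Ruby simply to stay with a tool already used in this thread. The file name heading.awk and the function name heading are made up, and the file is written from a here-document purely so the sketch is self-contained; in practice it would live next to the script.

```shell
#!/bin/bash
# Hypothetical layout: the helper code sits in a separate file.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/heading.awk" <<'EOF'
# heading.awk: print each distinct first field once, in first-seen order.
!($1 in seen) { seen[$1] = 1; line = line ", " $1 }
END { print "-" line }
EOF

# The Bash side only needs to know the file name.
heading() {
  awk -f "$workdir/heading.awk"
}

printf '%s\n' '1.2.3.4 value1=10000' '1.2.3.4 value2=20000' \
              '2.3.4.5 value1=00001' | heading
```

Because the awk code is no longer inside a quoted shell string, it can contain single quotes and can be checked or linted on its own.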

Neo · January 28, 2025, 1:25am

Here is a reply from DeepSeek, as we study its capabilities:

DeepSeek's Reply

To convert the given string data into the desired heading format, you can use awk to process the input and format it accordingly. Here's a step-by-step solution to achieve the expected output:

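Solution using `awk` (this needs GNU awk 4.0 or later, because values[key][...] is an array of arrays, which is a gawk extension):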

awk '
BEGIN {
    # Initialize arrays and variables
    split("", headers)
    split("", values)
    header_count = 0
}

{
    # Extract the IP and value parts
    ip = $1
    split($2, parts, "=")
    key = parts[1]
    value = parts[2]

    # Store unique IPs as headers
    if (!(ip in headers)) {
        headers[ip] = ++header_count
        header_order[header_count] = ip
    }

    # Store values in a 2D array
    values[key][headers[ip]] = value
}

END {
    # Print the header row
    printf "-"
    for (i = 1; i <= header_count; i++) {
        printf ", %s", header_order[i]
    }
    print ""

    # Print the value rows
    for (key in values) {
        printf "%s", key
        for (i = 1; i <= header_count; i++) {
            printf ", %s", values[key][i]
        }
        print ""
    }
}
' input.txt

Explanation:

Initialization:
- headers array maps each unique IP address to its column position.
- values array stores the values associated with each key (value1, value2, etc.).
- header_count keeps track of the number of unique IPs.

Processing Each Line:
- Extract the IP address and the key-value pair.
- Store unique IPs in the headers array and maintain their order in header_order.
- Store the values in the values array, indexed by the key and the IP's position in the headers.

Printing the Output:
- Print the header row starting with - followed by the unique IPs.
- Print each key followed by its corresponding values for each IP. Note that awk does not guarantee the order of for (key in values), so the rows may not appear in input order.

Example Input (`input.txt`):

1.2.3.4 value1=10000
1.2.3.4 value2=20000
1.2.3.4 value3=30000
2.3.4.5 value1=00001
2.3.4.5 value2=01000
2.3.4.5 value3=02000
6.7.8.9 value1=20300
6.7.8.9 value2=20400
6.7.8.9 value3=20500

Expected Output:

-, 1.2.3.4, 2.3.4.5, 6.7.8.9
value1, 10000, 00001, 20300
value2, 20000, 01000, 20400
value3, 30000, 02000, 20500

This script should give you the desired output format. If you have any further questions or need additional modifications, feel free to ask!
