Convert common string to heading

I want to convert a common string into a heading.
For example, the data is long, like below:

1.2.3.4 value1=10000
1.2.3.4 value2=20000
1.2.3.4 value3=30000
2.3.4.5 value1=00001
2.3.4.5 value2=01000
2.3.4.5 value3=02000
6.7.8.9 value1=20300
6.7.8.9 value2=20400
6.7.8.9 value3=20500

Expected output

-  , 1.2.3.4, 2.3.4.5, 6.7.8.9
value1, 10000, 00001, 20300
value2, 20000, 01000, 20400
value3, 30000, 02000, 20500

I tried something like the below but was not able to make it work:

awk '{print $3,$4}'| awk 'NR == 1 {printf "%s", $0; next} {printf ",%s", $0} END {print ""}' 

Hi @SOM15

Here is a Ruby solution to transform your input data into the desired output format. I have not tested it, however:

# Input data as a string (you can replace this with reading from a file)
data = <<~DATA
1.2.3.4 value1=10000
1.2.3.4 value2=20000
1.2.3.4 value3=30000
2.3.4.5 value1=00001
2.3.4.5 value2=01000
2.3.4.5 value3=02000
6.7.8.9 value1=20300
6.7.8.9 value2=20400
6.7.8.9 value3=20500
DATA

# Parse the input
rows = data.split("\n").map { |line| line.split }

# Initialize data structures
headers = []
values = Hash.new { |hash, key| hash[key] = {} }

# Process each line
rows.each do |key, pair|
  # Extract key and value
  value_name, value = pair.split("=")
  
  # Collect headers and values
  headers << key unless headers.include?(key)
  values[value_name][key] = value
end

# Output the result
puts "- , " + headers.join(", ")
values.each do |value_name, entries|
  row = headers.map { |header| entries[header] || "NA" }
  puts "#{value_name}, " + row.join(", ")
end

Is there any solution in bash/awk, as this is part of my bash script?

You can easily convert this script to any programming language you desire.

Put some effort into your own work, please.

It is fairly obvious that the heading line needs data from input rows 1, 4 and 7, and that the last output row needs data from input rows 3, 6 and 9. So you need to design a solution which reads and stores all the input (in suitable arrays) and then reorders it for output. This is generally called "pivoting".
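
As a rough illustration of that store-then-reorder idea, here is a minimal sketch in portable awk wrapped in a shell function. The function name pivot is made up for this example, and printing NA for a missing cell is an assumption borrowed from the Ruby reply; treat it as one possible approach rather than a finished answer.

```shell
#!/bin/sh
# pivot: hypothetical helper name. Reads "key name=value" lines on stdin and
# prints one row per name and one column per key, both in first-seen order.
pivot() {
  awk '
    NF < 2 { next }                  # skip blank or incomplete lines
    {
      col = $1                       # e.g. 1.2.3.4, becomes a column heading
      eq = index($2, "=")            # split name=value at the first =
      name = substr($2, 1, eq - 1)
      val = substr($2, eq + 1)
      if (!(col in colseen)) { colseen[col] = 1; cols[++nc] = col }
      if (!(name in rowseen)) { rowseen[name] = 1; rows[++nr] = name }
      cell[name, col] = val          # 2-D lookup simulated with SUBSEP
    }
    END {
      printf "-"
      for (c = 1; c <= nc; c++) printf ", %s", cols[c]
      print ""
      for (r = 1; r <= nr; r++) {
        printf "%s", rows[r]
        for (c = 1; c <= nc; c++) {
          k = rows[r] SUBSEP cols[c]
          # NA for a missing cell is an assumption, not part of the question
          printf ", %s", ((k in cell) ? cell[k] : "NA")
        }
        print ""
      }
    }
  '
}
```

It would be called as pivot < input.dat. The row and column names are kept in separate numbered arrays because awk does not define the order of for (x in array).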

I'm not sure why you show an Awk attempt, reject a Ruby solution, but ask for a Bash solution. Please clarify your constraints.

In bash, just as you can embed

awk 'awk code'

you can also embed

ruby -e 'ruby code'

or

perl -e 'perl code'

or

python -c 'python code'

The only restriction is that the 'code' itself may not contain a ' character.
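
If a single quote is really needed inside the embedded code, there are common workarounds. The sketch below shows two of them with awk; the function names demo_escape and demo_variable are invented for illustration. The first trick is plain shell quoting, so it should carry over to ruby -e, perl -e and python -c as well, while the second relies on awk's -v option.

```shell
#!/bin/sh
# Workaround 1: close the quoted string, add an escaped quote, reopen it.
# The shell joins the three pieces into a single argument for awk.
demo_escape() {
  awk 'BEGIN { print "it'\''s quoted" }'
}

# Workaround 2 (awk only): pass the quote character in through a variable.
demo_variable() {
  awk -v q="'" 'BEGIN { print "it" q "s quoted" }'
}

demo_escape
demo_variable
```

Both calls print the same text with an apostrophe in it.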

In this case, I modified @Neo's Ruby code to read the "input.dat" file:

#!/bin/bash
ruby -e '
# Initialize data structures
rows = []
headers = []
values = Hash.new { |hash, key| hash[key] = {} }

# Read lines from stdin
while line = gets
  rows << line.chomp.split
end

# Process each line
rows.each do |key, pair|
  # Extract key and value
  value_name, value = pair.split("=")
  
  # Collect headers and values
  headers << key unless headers.include?(key)
  values[value_name][key] = value
end

# Output the result
puts "- , " + headers.join(", ")
values.each do |value_name, entries|
  row = headers.map { |header| entries[header] || "NA" }
  puts "#{value_name}, " + row.join(", ")
end
' < input.dat

And pure bash without external commands:

#!/bin/bash
# Initialize data structures
d=", " # delimiter
headers=""
declare -A values

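# Read lines from stdin
while read -r key other
do
  vname=${other%%=*} val=${other#*=}
  # Record each row name once, since associative arrays have no defined order
  [[ -n ${values[$vname]+set} ]] || names+=("$vname")
  values[$vname]+="$d$val"
  if [[ "$d$headers$d" != *"$d$key$d"* ]]
  then
    headers+="$d$key"
  fi
done < input.dat

# Print the variables, rows in first-seen order
echo "- $headers"
for vname in "${names[@]}"
do
  echo "$vname${values[$vname]}"
done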

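If you know bash well, the input file is not too long, and the row order does not matter, then you could store the row name at the start of each value by changing the assignment in the loop to

values[$vname]="${values[$vname]:-$vname}$d$val"

and do the Print section in one command:

printf "%s\n" "- $headers" "${values[@]}"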

Since you end up with a 2-D association, I recommend against pure Bash (or any Linux shell), despite being more than a fan of it. Bash is quite powerful, but structured data is a serious weak spot.
You can get quite simple solutions with scripting languages like Python or Ruby, which are easy to call from your Bash script, as already recommended. The only thing I would change is to move the embedded code into a dedicated Python, Ruby, or other file. That makes everything easier, including linting each language.
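
To make the dedicated-file suggestion concrete, here is a small sketch using awk. The file name count_keys.awk and the function name count_keys are hypothetical, and the file is written to a temporary directory only so that the example is self-contained; in a real project it would simply sit next to the shell script.

```shell
#!/bin/sh
# Sketch only: count_keys.awk is a hypothetical file name. It is created in a
# temporary directory here; normally it would be a separate file kept under
# version control alongside the shell script.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

cat > "$workdir/count_keys.awk" <<'EOF'
# Count how many lines each first-column key has.
{ count[$1]++ }
END { for (k in count) print k, count[k] }
EOF

# The shell side shrinks to a single call, and the awk file can be
# edited and checked on its own.
count_keys() {
  awk -f "$workdir/count_keys.awk" "$@"
}
```

It would be called as count_keys input.dat. Because the awk program is no longer wrapped in shell quotes, the single-quote restriction mentioned earlier in the thread no longer applies to it.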

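Here is a reply from DeepSeek, as we study its capabilities. Note that the script uses values[key][...] (arrays of arrays), which requires GNU awk 4.0 or later, and that for (key in values) does not guarantee the order of the rows: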

DeepSeek's Reply


To convert the given string data into the desired heading format, you can use awk to process the input and format it accordingly. Here's a step-by-step solution to achieve the expected output:

Solution using awk:

awk '
BEGIN {
    # Initialize arrays and variables
    split("", headers)
    split("", values)
    header_count = 0
}

{
    # Extract the IP and value parts
    ip = $1
    split($2, parts, "=")
    key = parts[1]
    value = parts[2]

    # Store unique IPs as headers
    if (!(ip in headers)) {
        headers[ip] = ++header_count
        header_order[header_count] = ip
    }

    # Store values in a 2D array
    values[key][headers[ip]] = value
}

END {
    # Print the header row
    printf "-"
    for (i = 1; i <= header_count; i++) {
        printf ", %s", header_order[i]
    }
    print ""

    # Print the value rows
    for (key in values) {
        printf "%s", key
        for (i = 1; i <= header_count; i++) {
            printf ", %s", values[key][i]
        }
        print ""
    }
}
' input.txt

Explanation:

  1. Initialization:

    • headers array stores the unique IP addresses.
    • values array stores the values associated with each key (value1, value2, etc.).
    • header_count keeps track of the number of unique IPs.
  2. Processing Each Line:

    • Extract the IP address and the key-value pair.
    • Store unique IPs in the headers array and maintain their order in header_order.
    • Store the values in the values array, indexed by the key and the IP's position in the headers.
  3. Printing the Output:

    • Print the header row starting with - followed by the unique IPs.
    • Print each key followed by its corresponding values for each IP.

Example Input (input.txt):

1.2.3.4 value1=10000
1.2.3.4 value2=20000
1.2.3.4 value3=30000
2.3.4.5 value1=00001
2.3.4.5 value2=01000
2.3.4.5 value3=02000
6.7.8.9 value1=20300
6.7.8.9 value2=20400
6.7.8.9 value3=20500

Expected Output:

-, 1.2.3.4, 2.3.4.5, 6.7.8.9
value1, 10000, 00001, 20300
value2, 20000, 01000, 20400
value3, 30000, 02000, 20500

This script should give you the desired output format. If you have any further questions or need additional modifications, feel free to ask!
