I am attempting to convert a CSV file into a Markdown table using bash. (I know there are tools online that do it for free; however, this is for learning purposes.)
#!/usr/bin/env bash
echo >table2.md
var=0
while IFS="\n" read -r table_f; do
printf "|" >>table2.md
# print the first value before comma
awk '{print $1}' FS='[,]' table.txt >>table2.md
printf "|" >>table2.md
# print the first value after comma
awk '{print $2}' FS='[,]' table.txt >>table2.md
if [[ var -eq 0 ]]; then
var=$var+1
echo "--------" >>table2.md
else
# echo "Test"
fi
done <table.txt
I am doing this, however it is producing weird output on new lines instead of side by side, although I am using the -n flag in echo/printf.
awk loops over all lines of the given file. You certainly want it to run on the current line only.
IFS="\n" might not work. Documented are IFS="" or IFS=.
It must be $var in [[ ]]. Only in a forced number context may you omit the $ before var. (( )) or $(( )) enforces arithmetic, i.e. a number context.
Correct expressions:
#!/bin/bash
var=0
while IFS="," read -r f1 f2; do
printf "|%s|%s\n" "$f1" "$f2" >&3
if (( var++ == 0 )); then
echo "--------" >&3
fi
done <table.txt 3>table2.md
Where (( var++ == 0 )) is shorter than [[ $(( var++ )) -eq 0 ]]
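For completeness, here is an end-to-end run of that corrected loop, with a made-up two-row table.txt created and cleaned up inline:

```shell
#!/usr/bin/env bash
# Hypothetical sample input with two CSV rows
printf 'int,Integer\nchar,Character\n' > table.txt

var=0
while IFS="," read -r f1 f2; do
    printf "|%s|%s\n" "$f1" "$f2" >&3
    if (( var++ == 0 )); then
        echo "--------" >&3
    fi
done <table.txt 3>table2.md

cat table2.md
rm -f table.txt table2.md
```

This prints the first row, then the separator, then the remaining row, each on its own line:

|int|Integer
--------
|char|Character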
I put your question to the ChatGPT o1 "reasoning model", since o3 is not yet available to the public. See if this helps you a bit @Ihattaren
ChatGPT o1
Here's a simpler approach that relies on reading one CSV line at a time and then parsing that line into two columns, rather than repeatedly running awk over the entire file.
Example:
#!/usr/bin/env bash
# Clear/create table2.md
echo "" > table2.md
# Print a header row in the Markdown table (optional)
echo "| primitive type | wrapper class |" >>table2.md
echo "| -------------- | ------------- |" >>table2.md
# Read the file line-by-line, splitting on comma
while IFS="," read -r col1 col2
do
# Build each Markdown row
echo "| $col1 | $col2 |" >>table2.md
done < table.txt
ChatGPT o1 Explanation
Use a single while loop
You only need one pass through table.txt. Let the Bash built-in read command handle splitting each line on commas.
Read columns directly
Instead of calling awk for each line, you can do:
while IFS="," read -r col1 col2
do
# ...
done < table.txt
This way, $col1 becomes everything before the comma, and $col2 becomes everything after the comma.
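A quick illustration of that splitting (the third field here is invented to show that any extra commas end up in the last variable):

```shell
# read splits on IFS; leftover fields are folded into the last name
IFS="," read -r col1 col2 <<< "int,Integer,extra"
echo "col1=$col1"   # col1=int
echo "col2=$col2"   # col2=Integer,extra
```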
Write to the output file
Each iteration prints its row in Markdown syntax. For example:
echo "| $col1 | $col2 |" >> table2.md
Header row logic
If you want the first line to be a header with a Markdown separator under it, you can just hardcode them to appear before reading the file (as shown in the example).
This avoids the "weird output" caused by repeatedly calling awk '{print $1}' FS="[,]" table.txt from inside the loop (which re-reads the entire file every time instead of just using the line that was read in the current iteration).
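To see that failure mode in isolation, here is what a single awk call from the original loop prints, sketched with a made-up two-line table.txt:

```shell
printf 'int,Integer\nchar,Character\n' > table.txt
# One call already prints the first column of *every* line,
# so running it once per loop iteration duplicates the output:
awk '{print $1}' FS='[,]' table.txt
rm -f table.txt
```

With two input lines the while loop body runs twice, and each iteration dumps both first-column values again, which is exactly the duplicated, stacked output the question describes.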
Using ANSI-C quoting IFS=$'\n' should work (although it wouldn't make much sense imo; in such a case it's better to use mapfile -t instead of while IFS=$'\n' read), as the default in Bash is <space><tab><newline>:
$ echo -n "$IFS" | cat -veT
^I$
$
Btw, where exactly is it "documented"?
It actually can be var in [[ ]], regardless of whether it's quoted or not (with the remark that the parameter expansion syntax, when quoted, always has to use double quotes; see the example below).
Referring to variables only by their names is allowed (yet not recommended), as arithmetic evaluation is still correctly performed on them (such evaluation also "enforces number context").
See the explanation from man bash in ARITHMETIC EVALUATION section:
Shell variables are allowed as operands; parameter expansion is performed
before the expression is evaluated. Within an expression, shell variables
may also be referenced by name without using the parameter expansion
syntax. A shell variable that is null or unset evaluates to 0 when
referenced by name without using the parameter expansion syntax.
The value of a variable is evaluated as an arithmetic expression when
it is referenced, or when a variable which has been given the integer
attribute using declare -i is assigned a value. A null value evaluates
to 0. A shell variable need not have its integer attribute turned on
to be used in an expression.
(( $var == 0 )); var=$(( $var + 1 )), on the other hand, is actually discouraged in most (though not all) cases.
The key lesson from this: you can use [[ var -eq 0 ]] but you shouldn't (for the sake of consistent use of the parameter expansion syntax), and you should probably use var=$(( var + 1 )) instead of var=$(( $var + 1 )), as the latter may sometimes be problematic.
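A small demonstration of why the bare-name form is safer for empty or unset variables (variable names here are made up):

```shell
unset n
echo $(( n + 1 ))     # 1 : an unset name evaluates to 0
m=""
echo $(( m * 2 ))     # 0 : an empty value is also handled
# echo $(( $m * 2 ))  # would expand to $(( * 2 )), a syntax error
[[ n -eq 0 ]] && echo "bare name inside [[ ]] with -eq works"
```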
Didn't you mean >>&3 (append) and not >&3 (overwrite)?
You are right, the -eq operator enforces the arithmetic evaluation.
>&3 just writes to the opened stream i.e. it appends.
The open() is performed by the 3>table2.md, overwrite. 3>>table2.md would append. (And close() is performed when the code block is left.)
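In other words, the open (and truncate) happens exactly once; a sketch:

```shell
# fd 3 is opened, and the file truncated, once by 3>out.txt;
# every >&3 inside the block then appends to that open stream
{
    echo "first"  >&3
    echo "second" >&3
} 3> out.txt
cat out.txt   # first, then second: nothing was overwritten
rm -f out.txt
```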
BTW, echo "" > table2.md is a bad suggestion (from ChatGPT) because it writes an empty line.
And again a block redirection is better:
{
echo ... >&3
while ...
do
echo ... >&3
done < table.txt
} 3> table2.md
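Filled in for the question's filenames, that block pattern might look like this (sample input created inline for illustration; the header text is invented):

```shell
#!/usr/bin/env bash
printf 'int,Integer\nchar,Character\n' > table.txt

{
    echo "| primitive type | wrapper class |" >&3
    echo "| -------------- | ------------- |" >&3
    while IFS=',' read -r f1 f2; do
        echo "| $f1 | $f2 |" >&3
    done < table.txt
} 3> table2.md

cat table2.md
rm -f table.txt table2.md
```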
You're correct that ChatGPT sometimes makes suboptimal coding choices, especially in initial iterations, much like humans:
In my experience, human collaborators often make just as many coding mistakes or suboptimal choices as ChatGPT 4. An iterative, collaborative process tends to yield the best results, whether working with humans or machine intelligence.
The key difference, however, is that human mistakes are generally met with more forgiveness, while there's a tendency to expect perfection from machines.
I've found great success working with ChatGPT 4o because I approach it as a collaborative partner, engaging in an interactive process rather than expecting the first reply to be flawless, just like in human collaboration.
What's particularly fascinating is that machine intelligence often makes different types of mistakes than humans. This creates a unique synergy where combining the two forms of intelligence, human and machine, can yield truly exceptional results, leveraging the unique strengths and perspectives of each.
Just like with humans, we "trust but verify." Machine intelligence is no different; we should verify everything, just as we do with human intelligence.
A >&3 writes to the stream that is opened with 3> filename. The stdout stream is untouched.
{
echo ...
while ...
do
echo ...
done < table.txt
} > table2.md
is possible but occupies the stdout stream which is &1.
Likewise the < table.txt occupies the stdin stream which is &0. (You could use read <&4 and 4< table.txt to allow an input from the terminal.)
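A sketch of that fd-4 trick, which keeps stdin free (the sample file is invented here):

```shell
printf 'a,1\nb,2\n' > table.txt
# Data comes in on fd 4, so the terminal (fd 0) stays available
while IFS=',' read -r k v <&4; do
    echo "| $k | $v |"
    # read -p "continue? " ans   # would still read from the terminal
done 4< table.txt
rm -f table.txt
```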
The stderr stream is &2.
The &3 is the first free stream. Some shells use &5 or &6 in debug mode so you should not use them.
A while-do-done is a code block that can be entirely redirected.
A { } is an explicit code block.