awk not outputting properly

Hi Everyone,

Long time lurker here. I have a project of bringing every one of our data centers to a newly enforced company standard. Standard naming conventions, domain migrations, etc. So, the people who are setting the standards are providing me with a CSV file. Column 1 has the old value, Column 2 has the new value. It looks like this example I pulled off the Internet:

President,Home State
George Washington,Virginia
John Adams,Massachusetts
Thomas Jefferson,Virginia
James Madison,Virginia
James Monroe,Virginia
John Quincy Adams,Massachusetts
Andrew Jackson,Tennessee
Martin Van Buren,New York
William Henry Harrison,Ohio
John Tyler,Virginia
James K. Polk,Tennessee
Zachary Taylor,Louisiana
Millard Fillmore,New York
Franklin Pierce,New Hampshire
James Buchanan,Pennsylvania
Abraham Lincoln,Illinois
Andrew Johnson,Tennessee
Ulysses S. Grant,Ohio
Rutherford B. Hayes,Ohio
James A. Garfield,Ohio
Chester A. Arthur,New York
Grover Cleveland,New York
Benjamin Harrison,Indiana
Grover Cleveland (2nd term),New York
William McKinley,Ohio
Theodore Roosevelt,New York
William Howard Taft,Ohio
Woodrow Wilson,New Jersey
Warren G. Harding,Ohio
Calvin Coolidge,Massachusetts
Herbert Hoover,Iowa
Franklin D. Roosevelt,New York
Harry S. Truman,Missouri
Dwight D. Eisenhower,Texas
John F. Kennedy,Massachusetts
Lyndon B. Johnson,Texas
Richard Nixon,California
Gerald Ford,Michigan
Jimmy Carter,Georgia
Ronald Reagan,California
George H. W. Bush,Texas
Bill Clinton,Arkansas
George W. Bush,Texas
Barack Obama,Illinois

I am using awk to parse the file, find the desired value in column one, and then print the value right after it. Here is my script, but I cannot for the life of me get one variable to output properly.

#!/bin/bash

# generate array of old file/folder/etc names

fileList=`commands to build array of data` # for argument's sake, the presidents CSV file printed above

oldFile="/path/to/my/file_list.txt" # source of text file with old and new values

# loop through fileList and move folders or files

for i in ${fileList} ; do

newFile=$(/usr/bin/awk -F, '/${i}/ { print $2;exit }' ${oldFile})

if [[ `/usr/bin/awk -F, '/${i}/ { print $2;exit }' ${oldFile}` == ${newFile} ]]

  then /bin/echo "${i} needs to be moved to ${newFile}"
         /bin/echo "bunch of commands to rename folders"
  
else /bin/echo "${i} does not have a match"
  
fi
done

exit 0

Sorry for being really vague, but I am trying to get a proof of concept here without giving out any confidential information that may or may not be in the actual scripts I am writing. Basically, it seems to work, but when I run the script via bash -x /path/to/script, the newFile variable never outputs correctly.

The most useful info for people here to help you is input and desired output, otherwise it will take a long time for people to figure out what exactly you want before they start thinking of a solution to your problem.

It doesn't make much sense to assign newFile from an awk command and then check whether that same command equals the answer you just stored.

Also, when you run under -x, do you notice it trying to match ${i} instead of the value inside $i? awk is a separate program; it doesn't know your shell variables unless you pass them. The pattern will also be treated as a regex in that context, so a dot will match any character.

for file in ${fileList}; do
   newFile=$(awk -F, -v "file=$file" '$1 == file { print $2; exit }' "${oldFile}")
done
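To make the difference concrete, here is a small sketch that uses a throwaway list file (the file contents and the variable names literal, passed, and old are made up for illustration). Inside single quotes the shell never expands ${i}, so awk is handed that literal text as its pattern; with -v, awk receives the value as a variable of its own.

```shell
#!/bin/bash
# Hypothetical sample data, written to a temp file for this demo only
list=$(mktemp)
printf '%s\n' 'oldvalue1,newvalue1' 'oldvalue2,newvalue2' > "$list"

i="oldvalue2"

# Single quotes: awk gets the literal text ${i} as its pattern, not the
# value of i, so nothing in this list matches and the result is empty
literal=$(awk -F, '/${i}/ { print $2; exit }' "$list")

# -v hands the shell value to awk as an awk variable;
# $1 == old is an exact string comparison on column 1
passed=$(awk -F, -v "old=$i" '$1 == old { print $2; exit }' "$list")

echo "literal: '$literal'"
echo "passed:  '$passed'"
rm -f "$list"
```

The first assignment reproduces the blank result from the original script; the second returns the second column of the matching row.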

Very confusing. What I interpreted is:
1) you have a list of old to new translations in the form: <old>,<new>.
2) you have a list of existing files in the form: /some/path/file-name
3) filenames can have embedded spaces
4) you need to take /some/path/file-name and convert it to /some/path/new-name and then execute the move command to do the work.

Based on the list above, here is a skeleton script. It assumes that the translations are in xlate.list and the existing files are in exist.list.

#!/usr/bin/ksh
typeset -A xlate
while IFS="," read -r oldv newv     # build translation from csv file
do
    xlate["$oldv"]="$newv"
done <xlate.list

# read list of existing files/directories
while IFS= read -r existing
do
    basename=${existing##*/}
    if [[ -n ${xlate[$basename]} ]]
    then
        echo "move '$existing' to '${existing%/*}/${xlate[$basename]}'"
        # command to move the file
    else
        echo "existing file has no new value: '$existing'"
    fi
done <exist.list  #</path/to/existing/list

Sorry for being so confusing.

The /path/to/file will be where the comma-delimited file lives. Each line in it has an existing value and, next to it, the value it should be changed to. So, for example, there will be entries like this in the source file:

oldvalue1,newvalue1
oldvalue2,newvalue2
oldvalue3,newvalue3

So, I am basically creating an array of folders with either the find command or the ls command, depending on the task. I was not the original administrator of any of these servers or file shares. The script then checks my comma-delimited source file, and if it finds a match in the first value, it renames the folder via the mv command to follow the new standardized naming conventions.

My only hang up with my script is I cannot get awk to output one command properly. The thing that kills me, is that it works fine in the terminal, but when I put it in a script it just has a blank output. So if I run this in terminal:

/usr/bin/awk -F, '/oldvalue1/ { print $2;exit }' /path/to/input/file

that works and it outputs newvalue1, going off the example I gave earlier. However, when I put that command in my script and use variables, it doesn't seem to output properly. I tested my output by doing this:

bash -x /path/to/script.sh

Every line prints out and there are no errors, but I am not seeing the expected output. The variable comes out as a blank value. This is what is totally baffling me.

Sorry if I am still being confusing.

I think you could do things more efficiently than executing an awk to read your CSV file for every old file. However, working with what you've given, you could try this:

/usr/bin/awk -F, 'match( $1, ov ) { print $2; exit }'  ov=$oldvalue /path/to/input/file

The assumption is that the shell variable oldvalue has the name of the file that your script is trying to find in the /path/to/input/file file.

Thanks, I need to sit down and read the book on awk, but never have the time. I will try this when I get to my hotel room. Currently traveling and on my hotspot. I will need to VPN to grab all my resources needed to test this.

Thanks,
ZB

Well, here is what I've got. I think maybe the problem is that the text file I was given has carriage returns in it. I am not sure how it was created, as it probably changed hands half a dozen times before coming down the pipe to me.

My script basically looks like this now:

#!/bin/bash

# generate list of all folders that need to be renamed

fileList=$(/bin/ls -l /path/to/a/bunch/of/folders)

# input file of old,new folder names

folderNames="/path/to/folders_list.txt"

/bin/echo ${folderNames}

# loop through fileList and move home folder 

for i in ${fileList} ; do

newFolder=$(/usr/bin/awk -F, '/${i}/ { print $2;exit }' ${folderNames})


if [[ `/usr/bin/awk -F, '/${i}/ { print $2;exit }' ${folderNames}` == "${newFolder}" ]]

  then /bin/echo "${i} needs to be moved to ${newFolder}"
          /bin/echo "place holder for work flow mv of commands..."
  
  else /bin/echo "${i} does not have a match"
  
fi
done

exit 0

So, in my testing, everything outputs when run from sh -x /path/to/script.sh or bash -x, except the newFolder variable, which never outputs properly. Now that I think about it, the source file containing the old,new names could have been created on a Windows box, and was definitely created in Microsoft Office.

I haven't tried running the script on any machine just yet because of the echo output not matching.

Any ideas? Sorry, I am not the strongest awk programmer.

Do you really need to check the file and the directory?

while IFS=, read -r olddir newdir; do
    newdir=${newdir%$'\r'} #remove dos line endings...?

    echo mv "$olddir" "$newdir"
done < folder_list.txt

or using globs and then awk:

for dir in /path/to/bunch/of/folders/*; do
    [[ -d $dir ]] || continue

    newFolder=$(awk -F, -v "olddir=${dir##*/}" '$1 == olddir {sub(/\r/,"");print $2;exit}' folder_list.txt)

    if [[ $newFolder ]]; then
        echo "got to move $dir to $newFolder"
    else
        echo "no match: $dir"
    fi
done

untested quick reply...

Yeah, the file contains the list of the new standard naming convention for shared folders on the network. Basically, I inherited other sysadmins' servers and am now merging remote offices and trying to enforce standards put forth by upper management.

They supply me a csv file of a list of all the old folders, and then what the new folder name should be in the column next to it. During my troubleshooting to get that newFolder to echo out right I just dumped everything in a plain text file, which is comma delimited.

Sorry if this isn't making much sense, I can use awk, but am not a grand master at it.

Try using dos2unix to convert your data file, then process it with your script.
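If dos2unix is not installed, the same cleanup can be sketched with tr. The example below fabricates a two-line file with Windows line endings purely for illustration (dosfile, unixfile, raw, and clean are made-up names). It shows the carriage return riding along on the second field, and one way to strip it before the lookup.

```shell
#!/bin/bash
# Hypothetical file with DOS (CRLF) line endings, built for this demo only
dosfile=$(mktemp)
printf 'olddir,newdir\r\nolddir2,Projects\r\n' > "$dosfile"

# Without cleanup, field 2 ends in an invisible carriage return
raw=$(awk -F, '$1 == "olddir" { print $2; exit }' "$dosfile")
printf '%s' "$raw" | od -c | head -n 1    # the \r is visible in the dump

# Strip every carriage return into a cleaned copy, then query that copy
unixfile=$(mktemp)
tr -d '\r' < "$dosfile" > "$unixfile"
clean=$(awk -F, '$1 == "olddir" { print $2; exit }' "$unixfile")
echo "clean: '$clean'"

rm -f "$dosfile" "$unixfile"
```

A stray carriage return like this would not blank the value by itself, but it would end up inside any folder name passed to mv.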

Well is that a reason why my echo output would come up blank?

Then I don't see much issue in trying the first code snippet. It works here:

mute@goflex:~/test$ ls
folder_list.txt  olddir  olddir2  olddir3  script
mute@goflex:~/test$ ./script
mute@goflex:~/test$ ls
Projects  Something Awesome  Web Site  folder_list.txt  script
mute@goflex:~/test$ cat folder_list.txt
olddir,Something Awesome
olddir2,Projects
olddir3,Web Site
mute@goflex:~/test$ cat script
#!/bin/bash
while IFS=, read -r olddir newdir; do
        newdir=${newdir%$'\r'} #remove dos line endings...?

        mv "$olddir" "$newdir"
done < folder_list.txt
mute@goflex:~/test$

I think this may not work, since there is one master list and a ton of shares, and the file shares will have tons of different sets of folders. To give a broad example, some offices will have say an accounting department with different folders, and other offices won't. So, I need to test whole file shares against a master naming convention list.

Does that make sense? Will this work?

See when I do this:

awk -F, '/testfolder1/ { sub(/\r/,"");print $2;exit }' ~/home/folders_list.txt 
testfolder2

So, I have a variable that grabs the new folder by comparing the folder name in the loop to the master text list, if it finds it, it prints the second value delimited by comma. I am sure I am not coding this the most efficient way, but I am not a total awk wizard here, just know enough to get jobs done. So, my variable is:

newFolder=`awk -F, '/t${i}/ { sub(/\r/,"");print $2;exit }' ~/home/folders_list.txt `
echo ${newFolder}
''

So, when I echo out the output of the newFolder it always turns up blank when I test it.

See that output works like a charm when I run it manually. When I pass my array of data through a loop and use ${i} in place of testfolder1 and run the script via bash -x /path/to/script, the output is blank. It outputs nothing. When I run the command manually, it outputs testfolder2. In my text file somewhere in the middle there is this entry:

testfolder1,testfolder2

So, are there any valid reasons why my echo will not output?

So the directories won't be there to move ...?
I think it'd be much better to read through the file once and perform the work than to spawn off awk for each directory to find it in a file. Maybe it's a style thing. In the end, directories that exist will be moved. You can suppress errors or check for existence first, I guess:

[[ -d $olddir ]] && {
    echo "Moving $olddir"
    mv "$olddir" "$newdir"
}
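
The read-the-list-once idea can also be sketched with a single awk run when the starting point is the folders on a share rather than the master list. This is only an illustration under assumptions: the temp directory, the share layout, and master_list.txt are invented here, and the master list is assumed to be non-empty (the NR == FNR idiom misbehaves on an empty first file).

```shell
#!/bin/bash
# Hypothetical share and master list, created in a temp directory for this demo
work=$(mktemp -d)
mkdir -p "$work/share/olddir" "$work/share/olddir3" "$work/share/unlisted"
printf '%s\n' 'olddir,Something Awesome' 'olddir2,Projects' 'olddir3,Web Site' \
    > "$work/master_list.txt"

# One awk run: load the master list into a map, then look up each folder name
plan=$(
    for dir in "$work"/share/*/; do
        dir=${dir%/}                 # drop the trailing slash left by the glob
        printf '%s\n' "${dir##*/}"   # folder name only, without the path
    done |
    awk -F, '
        # first input (master list): remember old -> new, minus any trailing CR
        NR == FNR { sub(/\r$/, "", $2); map[$1] = $2; next }
        # second input (stdin): folder names found on the share
        ($0 in map) { print $0 " -> " map[$0] }
    ' "$work/master_list.txt" -
)
echo "$plan"
rm -rf "$work"
```

Folders that are not in the master list (unlisted here) simply produce no line, and entries in the list that do not exist on this share (olddir2) are never looked up.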

Correct, I think. Basically between mergers and new policies we are migrating in tons of remote offices into one global standard. These mobile offices have been self managing for years. Someone has pieced together a master list of all the folders on all the servers and shares. They have input all the old names and what the new names should be. So, not every folder exists on every share, there is just one master list. That is why I was building an array of data from the share point, and then looping it through the master list and moving folders only when matches were found.

This isn't a very elegant solution but I am working with what I got here.

Thanks,

---------- Post updated at 12:46 PM ---------- Previous update was at 09:14 AM ----------

So, is the problem that I am passing ${i} to awk inside single quotes? Is that why it outputs blank, because awk is interpreting it as literal characters and not as a variable?

newFolder=`awk -F, '/${i}/ { sub(/\r/,"");print $2;exit }' ~/home/folders_list.txt `

Since ${i} is in single quotes inside the awk logic it is not passing it as a variable?

---------- Post updated at 01:01 PM ---------- Previous update was at 12:46 PM ----------

So, after reading through the man page I think this is what I want to do with awk

newFolder=`/usr/bin/awk -F, -v var=${i} '/var/ { sub(/\r/,"");print $2;exit }' ~/test/folders_list.txt`

That seems to pass the variable properly but doesn't grab the correct value.

Also, quote your shell expansions: -v var="$i"
If you REALLY want to use /var/ (which matches anywhere in the line, not just in the old directory column as it should), you cannot use that shorthand, because /var/ treats the letters var as the regex itself. You'd use $0 ~ var, which would still treat the variable's value as a regex (but you have a fixed string, and dots probably should not match ANY character). So you'd use index($0,var). But still, see post #3 :wink: I really think you want $1 == var (unless I'm still totally mistaken about what's going on here).
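
A quick way to see the four behaviours side by side is a sketch like the one below. The two-line list and the names as_literal, as_regex, as_index, and as_exact are invented for the demo; the dot in the file name is there on purpose so the regex form misfires.

```shell
#!/bin/bash
# Hypothetical list where a dot in the name makes regex matching misfire
list=$(mktemp)
printf '%s\n' 'fileXtxt,wrong' 'file.txt,right' > "$list"
name='file.txt'

# /var/ looks for the literal letters "var", not the variable's value
as_literal=$(awk -F, -v "var=$name" '/var/ { print $2; exit }' "$list")

# $0 ~ var uses the value, but as a regex: the dot matches the X in fileXtxt
as_regex=$(awk -F, -v "var=$name" '$0 ~ var { print $2; exit }' "$list")

# index() treats the value as a fixed string, though it still searches the whole line
as_index=$(awk -F, -v "var=$name" 'index($0, var) { print $2; exit }' "$list")

# $1 == var compares the first column exactly
as_exact=$(awk -F, -v "var=$name" '$1 == var { print $2; exit }' "$list")

printf '%s\n' "literal: '$as_literal'" "regex: $as_regex" "index: $as_index" "exact: $as_exact"
rm -f "$list"
```

Only the last two forms pick the intended row here, and only $1 == var restricts the match to the old-name column.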

So, yeah, I figured it out, and it is working. I guess I wasn't grasping at first that a bash variable inside single quotes never reaches the awk program.

Here is my new method, and the output works as expected. Thanks for all your suggestions, man, I super appreciate it. This isn't high priority, but I cannot let it sit any longer, otherwise I will have managers asking me why it is not done yet.

newFolder=`/usr/bin/awk -F, '/^'${i}'/ { sub(/\r/,"");print $2;exit }' ~/path/to/master_list.txt`

:wall:

Yeah I know probably not the most elegant solution, but it does work. Sorry, I am not quite an awk master here and am working with what they gave me.
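
One caveat for anyone reusing the spliced form: /^name/ is a prefix match, so a name that is the start of another name can pick up the wrong row, and an unquoted ${i} containing spaces would split the awk program into several arguments. The sketch below uses an invented two-line list to show the prefix problem next to the -v form suggested earlier in the thread (spliced, exact, and old are made-up names).

```shell
#!/bin/bash
# Hypothetical master list with similar-looking names
list=$(mktemp)
printf '%s\n' 'testfolder10,ten' 'testfolder1,one' > "$list"
i='testfolder1'

# Splicing the value into the program: /^testfolder1/ also matches testfolder10
spliced=$(awk -F, '/^'"$i"'/ { sub(/\r/, ""); print $2; exit }' "$list")

# Passing the value with -v and comparing column 1 exactly
exact=$(awk -F, -v "old=$i" '$1 == old { sub(/\r/, ""); print $2; exit }' "$list")

echo "spliced: $spliced"
echo "exact:   $exact"
rm -f "$list"
```

With this data the spliced pattern returns the entry for testfolder10, while the exact comparison returns the entry for testfolder1.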

---------- Post updated at 01:59 PM ---------- Previous update was at 01:52 PM ----------

yeah man, that is exactly what I did. I travel for work right now so I am in different time zones and hotels 6 days a week, and only off one day. I am pretty much half zombie right now.

Haha, I will totally paypal you money for a beer, you deserve it after dealing with me!