I need a bash script that monitors folders for new PDF files and creates an XML file for an RSS feed listing the newest files. I have a script, but it reports errors.
#!/bin/bash
SYSDIR="/var/www/html/Intranet"
HTTPLINK="http://TYPE.IP.ADDRESS.HERE/pdfs"
FEEDTITLE="Najnoviji dokumenti na Intranetu OUG"
FEEDLINK="http://TYPE.IP.ADDRESS.HERE/pdfs"
FEEDDESC="Novi dokumenti"
RSSDIR="/var/www/html/rss"
#DESC="`date`"
function testing_variables {
    if [ ! -d ${RSSDIR} ]; then
        echo -e 'ERROR: $RSSDIR does not exist!\nPlease create the directory and set the right path in the $RSSDIR variable!'
        exit 1
    fi
    if [ ! -d ${SYSDIR} ]; then
        echo -e 'ERROR: $SYSDIR does not exist!\nPlease create the directory and set the right path in the $SYSDIR variable!'
    fi
}
function rss_header {
    ### RSS HEADER
    echo "<?xml version=\"1.0\"?>
<rss version=\"2.0\">
<channel>
<title>${FEEDTITLE}</title>
<link>${FEEDLINK}
<description>${FEEDDESC}</description>" > $1
}
function rss_body {
    ### RSS BODY
    for FILES in `find ${SYSDIR} -type f -name "*.pdf" | xargs ls -t | grep -i ${2}`; do
        NAME="`basename $FILES`"
        #PARENTDIR="`dirname $FILES | awk -F "/" '{print $NF}'`"
        echo " <item>
 <title>${NAME}</title>
 <link>${HTTPLINK}/${2}/${NAME}
 <!-- <description>${DESC}</description> -->
 </item>" >> ${1}
    done
}
function rss_footer {
    ### RSS FOOTER
    echo "</channel></rss>" >> ${1}
}
### Main code ###
for FILES in `find ${SYSDIR} -type f -name "*.pdf" | xargs ls -t`; do
    PARENTDIR="`dirname $FILES | awk -F "/" '{print $NF}'`"
    rss_header ${RSSDIR}/${PARENTDIR}.xml
    rss_body ${RSSDIR}/${PARENTDIR}.xml ${PARENTDIR}
    rss_footer ${RSSDIR}/${PARENTDIR}.xml
done
It reports an error on line 13: syntax error near unexpected token `$'{\r'', and another at `function testing_variables {`.
I am not very familiar with the code in this script; I just adapted an existing script from a script library and changed the folder names and file type. Could you please review the script and correct the errors?
Now that you have made changes to your script based on the suggestions you have already received, what errors are you getting? What does your script look like now? What is it doing wrong?
Or, are you just saying that you want the UNIX and Linux Forums to serve as your unpaid programming staff?
If you try something and it doesn't work, it would help if you tell us it didn't work instead of having us assume that everything that was suggested worked.
I fixed the errors with the dos2unix command, but when I run the script there is no output in the rss folder, and no errors are reported. I put a PDF in SYSDIR before running the script.
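For reference, the same CRLF cleanup can be done without installing dos2unix. This is a minimal sketch assuming GNU sed (as shipped on Ubuntu), whose -i option edits in place and whose regexes understand \r:

```shell
#!/bin/sh
# A dos2unix-free way to strip the DOS carriage returns that cause
# errors like "unexpected token $'{\r'". Assumes GNU sed.
strip_crlf() {
    sed -i 's/\r$//' "$1"
}

# Self-contained demo on a throwaway file:
tmp=$(mktemp)
printf 'echo hello\r\n' > "$tmp"   # simulate a file saved with CRLF endings
strip_crlf "$tmp"
if grep -q "$(printf '\r')" "$tmp"; then echo "CR still present"; else echo "clean"; fi
rm -f "$tmp"
```

You can check whether a script still has DOS endings at all with `file script.sh`, which reports "with CRLF line terminators" when they are present.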
Your main loop iterates over all PDF files in and under $SYSDIR, and it calls rss_body, which itself loops over all PDF files in and under $SYSDIR. That is almost certainly not what you want, but without a better description of where the PDF files are located in your file hierarchy and what XML files you're trying to create, I'm not clear on what you want to accomplish.
What OS are you using? How are you invoking feedgen1.sh ?
Are you getting any output at all from feedgen1.sh ?
Are any files being created by your current script? (And, if so, what is in them?)
What is the output from the commands:
ls -l feedgen1.sh
find $SYSDIR -type f -name '*.pdf' -exec ls -l {} +
find $RSSDIR -type d -exec ls -ld {} +
find $RSSDIR -type f -name '*.xml' -exec ls -l {} +
Is the output you see from the above commands representative of the locations of the PDF files you want to report in your XML files (or do you just have 1 or 2 PDF files installed for testing)? Is the output you see from the above commands representative of the directory structure you hope to see under $RSSDIR ?
What XML files are you hoping to create from the output shown by the above commands?
The variable DESC is unset in this script, but $DESC is used in rss_body . What is the description tag in your XML files supposed to contain for your PDF files?
The files will be located in subfolders inside the Intranet folder specified in the script. I will upload them manually, daily. I want the script to create an XML file for the RSS feed covering the newly uploaded files. The OS is Ubuntu Server 14.04, and the script is invoked by a cron job. There is no output at all. The PDFs will be uploaded into folders, not created by the script. I will try your code on Monday, at work.
If you are on Linux, you might consider using inotifywait in your main section.
Something like:
#!/bin/bash
DIR=/dir/to/watch
inotifywait -m -e create --format %f $DIR | while read File
do
case ${File##*.} in
[Pp][Dd][Ff])
printf "%s\n" "Found pdf file $File with ext ${File##*.}" # here you will call function per detected pdf filename, log, handle errors
;;
esac
done
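The ${File##*.} expansion used in the case statement works independently of the watcher: it strips everything up to and including the last dot, leaving just the extension. A small self-contained sketch of that test:

```shell
#!/bin/sh
# is_pdf: succeed if the filename's extension is pdf in any case mix.
# ${1##*.} removes the longest prefix ending in "." from the argument.
is_pdf() {
    case ${1##*.} in
        [Pp][Dd][Ff]) return 0 ;;
        *)            return 1 ;;
    esac
}
```

Note that for a name with no dot at all, ${1##*.} leaves the name unchanged, so such files fall through to the `*)` branch.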
I need one XML file for all PDFs in and under the specified directory.
For the description tags I need the PDF name and the folder name two levels up. The info should be sorted by date, newest first, with links to the files.
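A minimal sketch of that requirement, assuming GNU find and bash (both present on Ubuntu 14.04). The feed title and URL are placeholders taken from the original script, and the two-levels-up description format is my reading of the request, not a confirmed spec:

```shell
#!/bin/bash
# Sketch: build ONE RSS file covering every PDF in and under a
# directory, newest first.
HTTPLINK="http://TYPE.IP.ADDRESS.HERE/pdfs"

build_feed() {   # usage: build_feed <pdf-root> <output-xml>
    local sysdir=$1 out=$2
    {
        printf '<?xml version="1.0"?>\n<rss version="2.0">\n<channel>\n'
        printf '<title>Najnoviji dokumenti na Intranetu OUG</title>\n'
        printf '<link>%s</link>\n<description>Novi dokumenti</description>\n' "$HTTPLINK"
        # GNU find emits "epoch-seconds<TAB>path"; sort -rn => newest first.
        find "$sysdir" -type f -name '*.pdf' -printf '%T@\t%p\n' |
        sort -rn |
        while IFS=$'\t' read -r _ path; do
            name=$(basename "$path")
            dir1=$(basename "$(dirname "$path")")              # parent folder
            dir2=$(basename "$(dirname "$(dirname "$path")")") # two levels up
            printf '<item>\n<title>%s</title>\n' "$name"
            printf '<link>%s/%s/%s</link>\n' "$HTTPLINK" "$dir1" "$name"
            printf '<description>%s - %s/%s</description>\n</item>\n' "$name" "$dir2" "$dir1"
        done
        printf '</channel>\n</rss>\n'
    } > "$out"
}

# Example call:
# build_feed /var/www/html/Intranet /var/www/html/rss/feed.xml
```

Note this version writes every item through printf rather than echo, closes its `</link>` tags, and never batches files through `xargs ls -t`, so the newest-first ordering holds across the whole tree.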
Your script is creating 3 XML tags per PDF file in the rss_body function:
the data stored between the <title> and </title> tags is the final component of the absolute pathname of the PDF file,
the data stored after the <link> tag (there is no closing </link> tag) is the string stored in the shell variable $HTTPLINK followed by a slash, the last directory in the absolute pathname of a PDF file (not necessarily the directory of the current PDF file's pathname), followed by a slash and the final component of the absolute pathname of the PDF file, and
the data stored between the <description> and </description> tags is an empty string.
Should there be a </link> tag after the data you insert following the <link> tag?
Please show an explicit example of the data that you want created for the following PDF file:
-rw-r--r-- 1 dwc staff 2895323 Oct 23 2013 /var/www/html/Intranet/pdf/IEEE/20601-Rev-D7r02-clean.pdf
Your current code is creating one XML file for each different final directory name in the PDF pathnames found. One of these XML files is created for each PDF file found. If another PDF file with the same final directory name is found, it overwrites the previous XML file. (This is just slow if there is only one directory under $SYSDIR with that name. If there are two or more directories with the same final component name, there could be several problems.)
Furthermore, if there are more PDF files than xargs will process in a single invocation of ls, your files will NOT be sorted from newest to oldest; there will be groups of PDF files sorted in timestamp order, but the complete list might not be correctly ordered. So, to get a time-ordered list of files, we either need to gather the data needed for each file into single lines in a file that we can sort by timestamp, or we need to create files in a single directory with the same timestamps as your PDF files that we can then sort with ls -t. Creating a single file will probably be faster if you have an easy way to convert file timestamps into text. If not, we can use touch to copy the file timestamps to other files.
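The first option, single lines we can sort by timestamp, can be sketched with GNU find's -printf (Ubuntu's find supports it; a strictly POSIX find does not, in which case the touch approach would be needed):

```shell
#!/bin/sh
# Emit every PDF under a directory, newest first, with a global
# ordering. GNU find prints "epoch-seconds<TAB>pathname" per file,
# sort -rn orders the whole list numerically descending, and cut
# drops the timestamp again. Nothing is batched through xargs, so
# the ordering holds no matter how many files there are.
newest_first() {
    find "$1" -type f -name '*.pdf' -printf '%T@\t%p\n' | sort -rn | cut -f2-
}

# Example: newest_first /var/www/html/Intranet
```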
The > at the start of the last two command lines you typed is a secondary prompt indicating that you probably mistyped or omitted a quote in the first find command. Hit the control (or cntl or ctl, depending on who made your keyboard) key and the c key at the same time to generate an interrupt signal and get back to your primary prompt. Then copy the find command I requested and paste it into your shell.
And, for the third time, what operating system are you using?
I'm sorry. I completely misunderstood your problem. I thought you wanted help with a bash shell script. Now it appears that you are unable to run simple bash shell commands on the system where you want to run that bash shell script with the environment variables set as they are used in that script.
If you look back through this thread, you'll see that I have asked several questions that you still have not answered. Without answers to those questions (and a real sample of the desired output based on at least two PDF files from different directories), I can't figure out what needs to be done to satisfy your requirements.