Download PDFs using wget and convert them to txt

wget -i genedx.txt

The code above will download multiple PDF files from a site, but how can I download these and convert them to .txt?

I have attached the master list (genedx.txt, which contains the URLs and file names),

as well as the two PDFs that get downloaded. I am trying to have those two files end up as text files. Thank you.

pdftotext

Is that a separate command, or can it be used with the wget command? Thanks.

It is a separate command, which -- like any other separate command -- you can use with wget, either by piping the output or by feeding the resulting file into it once wget is done.
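For example, the two steps could be wrapped in a small script. This is only a sketch: it assumes `pdftotext` (from poppler-utils) is installed and that genedx.txt lists the PDF URLs one per line.

```shell
#!/bin/sh
# Sketch: download everything listed in genedx.txt, then convert
# each downloaded PDF to a .txt file with the same base name.
download_and_convert() {
    wget -i genedx.txt || return 1
    for f in *.pdf; do
        [ -e "$f" ] || continue          # no PDFs matched the glob
        pdftotext "$f" "${f%.pdf}.txt"   # foo.pdf -> foo.txt
    done
}
```

You would then run `download_and_convert` in the directory where you want the files to land.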

So would the command be:

 wget -i genedx.txt | info_sheet_ube.pdf Info_Sheet_XomeDx.pdf 

and where do I download or access pdftotext? Thanks.

No, pipes do not work that way.

What you would actually do depends on the contents of genedx.txt, and what you want to do with it.
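`wget -i` simply reads URLs one per line, so genedx.txt presumably looks something like this (hypothetical URLs, not the real ones):

```
https://www.example.com/files/info_sheet_ube.pdf
https://www.example.com/files/Info_Sheet_XomeDx.pdf
```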

Here is the second Google hit.

After installing PDFMiner, you can do batch conversion with a for loop. No pipe is involved here.

$ for f in *.pdf; do pdf2txt.py "$f" > "${f%.pdf}.txt"; done

So just:

change to the directory containing the four PDF files:

 cd "C:\Users\cmccabe\Desktop\PDF" 

followed by:

 for f in *.pdf; do pdf2txt.py "$f" > "${f%.pdf}.txt"; done 
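Yes. The `${f%.pdf}` part is plain POSIX parameter expansion: it strips the `.pdf` suffix so the output file gets a matching `.txt` name. For example (using one of your file names):

```shell
# Strip the .pdf suffix and append .txt to build the output name.
f="Info_Sheet_XomeDx.pdf"
out="${f%.pdf}.txt"
echo "$out"   # -> Info_Sheet_XomeDx.txt
```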

Thanks.