Wget 403 Forbidden Error

Hi Friends,

I did an extensive search over the internet and tried all possible solutions that were recommended, but couldn't figure this out.

Please see this link

http://www.dli.gov.in/data6/upload/0159/808/PTIFF/00000007.tif

It works.

But, when I try the following command

wget -r -nd --no-parent -U firefox -A tif http://www.dli.gov.in/data6/upload/0159/808/PTIFF/

I get a 403 Forbidden error.

Could you please suggest a way around?

I cannot see that link. No such server.

In any case, there's no reason a server needs to permit you to see the index of a folder. If it's also forbidden in a browser, then it's just plain forbidden because they don't want you to do that.

A quick google search of that URL suggests the image you want is part of the "Brihatkathamanjari", available here in a variety of forms:

Hi Corona,

I could access that link.

But anyways, thanks for your response.

---------- Post updated at 10:13 PM ---------- Previous update was at 10:04 PM ----------

I figured out that the file numbers are zero-padded to 8 digits: they start with 7 leading zeroes, and each time the number gains a digit there is one fewer leading zero.

For ex:

00000001.tif. It goes like this until 00000009.tif

And then

00000010.tif until 00000099.tif (Note the 6 preceding zeroes)

And then

00000100.tif till 00000999.tif (Note the 5 preceding zeroes)

I used this command

wget http://www.dli.gov.in/data6/upload/0159/808/PTIFF/0000000{1..94}.tif

But I could only get the files up to 00000009.tif. Could you please suggest a for loop?

Thanks

It simply cannot be accessed from here. DNS returns nothing. Very very strange.

If it's somehow valid where you are, you could try playing with the referer settings:

wget --referer=http://www.dli.gov.in/ -U netscape

...which should pretend a little more to be a web browser and not a mining robot.
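A slightly fuller version of that idea might look like the sketch below. The file URL is the one from the first post; the user-agent string and the helper name fetch_as_browser are only illustrative assumptions, as is the WGET override used to preview the command without touching the network. None of this can get around a 403 that the server applies to the directory listing itself.

```shell
#!/bin/sh
# Sketch: request a known file while sending browser-like headers.
base="http://www.dli.gov.in"
file="$base/data6/upload/0159/808/PTIFF/00000007.tif"

# Assumed user-agent string, purely illustrative; any browser-like value would do.
ua="Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"

# Hypothetical helper: runs wget (or whatever $WGET names) with a referer
# and a user-agent, passing any further arguments straight through.
fetch_as_browser() {
    ${WGET:-wget} --referer="$base/" --user-agent="$ua" "$@"
}

# Preview only: with WGET=echo the arguments are printed instead of fetched.
WGET=echo fetch_as_browser "$file"
```

Dropping the WGET=echo prefix from the last line would make it perform the actual download.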

But actually, it would be simpler to go to http://www.dli.gov.in/data6/upload/0159/808/PTIFF/ in your browser, since you say it works from there, and then just save the list of URLs.

Corona,

Actually, only the tif files are made public. All the above folders are forbidden. :)

Could you please comment on the above for loop request?

Then, for your original question, you have your answer. It won't work with wget if it won't work with your browser.

for ((N=1; N<100; N++))
do
        printf "%s/%06d.tif\n" "http://www.dli.gov.in/data6/upload/0159/808/PTIFF" "$N"
done | wget -i -

Isn't that obvious? Those filenames have 8 meaningful digits, but the patterns "000000010" onward will have 9. Try

wget http://www.dli.gov.in/data6/upload/0159/808/PTIFF/0000000{1..9}.tif
wget http://www.dli.gov.in/data6/upload/0159/808/PTIFF/000000{10..94}.tif

That's also why you should use printf "%s/%08d.tif\n" in Corona688's proposal above.
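Putting the two suggestions together, a minimal sketch could look like this. The function name gen_urls is made up for illustration, and the range 1 to 94 is taken from the question; the script only prints a few sample URLs, so you can check the padding before downloading anything.

```shell
#!/bin/sh
# Sketch: build the list of 8-digit, zero-padded URLs for wget.
base="http://www.dli.gov.in/data6/upload/0159/808/PTIFF"

# Hypothetical helper: print one URL per line for file numbers $1 through $2.
# The body runs in a subshell, so the counter n does not leak to the caller.
gen_urls() (
    n=$1
    while [ "$n" -le "$2" ]; do
        # %08d pads the number with leading zeroes to a total of 8 digits.
        printf '%s/%08d.tif\n' "$base" "$n"
        n=$((n + 1))
    done
)

# Show the first, tenth and last URL as a sanity check.
# Once the list looks right:  gen_urls 1 94 | wget -i -
gen_urls 1 94 | sed -n '1p;10p;$p'
```

The three URLs printed end in 00000001.tif, 00000010.tif and 00000094.tif. In bash 4 or later, a zero-padded brace range such as {00000001..00000094} should expand to the same file names, but the printf form also works in older shells.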
