Google Webmaster Tools Shows Problems with Soft 404 Errors

Well,

Sorry, but I cannot seem to fix the steady decline in Google ranking for unix.com pages.

Google Webmaster Tools shows that Google is dropping more and more of our pages from the index because of "Soft 404" errors, which started after we moved to the new data center.

I cannot find the problem. The pages serve fine, but Google shows these "Soft 404" errors.

I'm lost.

This is why our traffic at unix.com continues to fall, and I cannot find a way to fix it.

I'm sure Neo knows this:

Soft 404s are not actual defined HTTP errors. They are usually caused by thin content. Examples: a URL that returns a useless error message, or a basically empty page. Or, in the case of forums only: a question with no posted response. Automatic generation of WordPress tags when someone looks for something new and non-existent is an example of the "oops page". Apparently a soft 404 is a Googlebot classification, not a real error. But you get dinged anyway.

The basic idea is:
I do a search on the word 'furkle' and UNIX.com returns a 200 with a page saying 'Oops'. Sort of like 'dangling URLs'. Googlebot then has the smarts to see that this is a thin page. For forums only: when Googlebot sees a question with zero replies (i.e., possibly lots of views, but no posted answers), this situation generates this kind of response. You also get this when someone writes a nice piece but nobody answers it. Even though it is high content, "likes" do not count.
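To make the "thin page" idea concrete, here is a minimal PHP sketch of a check for pages that return 200 but carry almost no visible text. The function name `looksLikeSoft404` and the 200-character threshold are assumptions for illustration only; Google does not publish how its classifier actually works.

```php
<?php
// Hypothetical heuristic, not Google's classifier: flag a page as a
// soft 404 candidate when the server says 200 but the visible text is thin.
function looksLikeSoft404(int $status, string $html, int $minChars = 200): bool
{
    if ($status !== 200) {
        return false; // a real 404 or 410 is a hard error, not a soft one
    }
    // Strip markup and collapse whitespace to approximate the visible text.
    $text = trim(preg_replace('/\s+/', ' ', strip_tags($html)));
    return strlen($text) < $minChars; // $minChars is an assumed threshold
}

// An "Oops" page served with a 200 status is flagged.
var_dump(looksLikeSoft404(200, '<h1>Oops</h1><p>Nothing found.</p>')); // bool(true)
```

A check like this only finds candidates for a human to review; a short page can still be perfectly useful.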

I helped clean up literally thousands of really old posts on a science site, for the very reason I mentioned above, per the site owner. The discussion sub-forum had about 1500 posts we removed, for example. Humans had to go in on zero-reply posts and do one of:

  1. delete the post
  2. add a small bit of content like a link to some relevant comment or a link to external/internal page
  3. flag the post for someone else who knows the subject, because the post has some merit

For Neo:
Do we have a way to track the original source of possible airball URL requests? Please share if you do.
How is our zero reply problem?
I can only play with ordering by view count on a given forum:

Lots of zero replies. Do not know if this is bad or not.

2 Likes

Hey Jim,

Thanks for replying,

Google Webmaster Tools provides the URLs of all the 404 errors.

When I test them, the URLs are all OK.

When I run a Chrome extension to read the HTTP headers, I see 404s which become 200s. It is as if our URL rewriting software is causing this, but when I ask DragonByte support, they tell me their software does not cause this problem, and they are not helpful.

The problem started when we moved to a new server and a new version of Apache, so my guess is the issue is in the Apache2 configuration, but I cannot find the problem.

It's been like this (going down and down in the search ranking and number of pages indexed) since I moved to the new data center last year.

I cannot figure it out.

Let me know if you need to be added to webmaster tools for access, if you are experienced in GWT.

Thanks.

I feel a bit guilty because a few months ago I noticed strange behaviour and, yes, 404 errors quite often when just following old posts and changing rooms. I believed it was due to the revamp of the site and so something normal; that is why I once mentioned I was a bit lost... I should have been more explicit and mentioned that I also noticed these 404s appearing from time to time, but now I see no more...
Is it possible that in Google's DB the references to links are now obsolete and need a good refresh? This reminds me of when HP restructured ITRC; even now you may land on pages not found... just like apps that had the (physical) server name or IP hard coded: you migrate and nothing works... though all is the same except for the server name or IP (HP did worse...)

1 Like

Hi Victor,

Yeah, I feel bad too. I've been developing new features using Vue.js and the main site is having some issues since we moved to the new server.

I have spent around 20 hours on the problem so far, and no joy.

Google has dropped many great links from the index. We really do not have many "expired links".

As mentioned, in GWT, when I click on a link that Google says is a soft 404 error, the page loads fine. But if I check with some header checking tools, the HTTP headers show 404 and then 200, which I would attribute to our URL rewriting software. However, it was not an issue before changing data centers, and DBSEO says "no way it is our software".

It's as if there is some small configuration error in Apache2, but I am at a loss to find it.

@neo......remember this (heated) discussion between us?

Copying our content????

In your post#4 you said that you had put a 302 on the old site.

Is that 302 still in place? Could it be causing this?

1 Like

Hey Dennis,

Thanks. That should not make a difference, but I will follow up anyway and look deeper to see if any 302 redirect I configured related to that might have some adverse effect or unintended consequence.

I doubt it, but anything is worth a try to fix this problem!

1 Like

I upgraded our version of DBSEO and checked many links, some 10 years old and older, and none show any soft 404 errors in DevTools (Chrome 73).

Maybe the issue was with the version of DBSEO we were running.

Will log into GWT soon and post some of the links GWT recently reported as "soft 404" and we can examine them one at a time.

OK.

I have rechecked three dozen or more links that Google Search Console had listed as "soft 404" errors, using three different HTTP header inspection tools (using the Chrome browser):

  • DevTools (Chrome 73)
  • HTTP Header Spy 2.0.49
  • HTTP Headers 1.1

None of them show any 404 errors of any kind, hard or soft, and all links are providing the correct 200 responses. Some of the URLs rewritten by DBSEO go 302 -> 200, but this is technically correct.
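For anyone who wants to repeat this check without a browser extension, here is a small PHP sketch that pulls the status chain (for example 302 -> 200) out of a list of raw header lines. The function name `statusChain` and the sample header lines are made up for illustration; they are not captured from unix.com.

```php
<?php
// Extract the sequence of HTTP status codes from a list of raw header lines,
// such as the array PHP's get_headers() returns after following redirects.
function statusChain(array $headerLines): array
{
    $chain = [];
    foreach ($headerLines as $line) {
        // Status lines look like "HTTP/1.1 302 Found" or "HTTP/2 200".
        if (is_string($line) && preg_match('#^HTTP/\d(?:\.\d)?\s+(\d{3})#', $line, $m)) {
            $chain[] = (int) $m[1];
        }
    }
    return $chain;
}

// Sample header lines (made up for illustration).
$sample = [
    'HTTP/1.1 302 Found',
    'Location: /example-thread.html',
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=UTF-8',
];
echo implode(' -> ', statusChain($sample)), "\n"; // prints "302 -> 200"
```

Against a live URL, something like `statusChain(get_headers($url) ?: [])` should give the same kind of chain, and `in_array(404, $chain, true)` would reveal a 404 hiding in front of a final 200.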

As far as all the tools in my toolbox can tell, we are running as we should.

Also, from Google Search Console:

Data anomalies in Search Console - Search Console Help

So, let's let it run for a while and see how things go and what direction things move.

This is interesting. Google Search Console lists this page as "Soft 404", with a cannot-index status, but when I check all the HTTP headers while Google checks the page, everything is "clean"...

https://www.unix.com/man-page/centos/9/idr_replace/
HTTP Header Spy

GET https://search.google.com/search-console?hl=en
HTTP/1.1 302 (394 ms, 172.217.194.101)

GET https://search.google.com/search-console?hl=en&resource_id=https://www.unix.com/
HTTP/1.1 200 (810 ms, 172.217.194.101)
status: 200
content-type: text/html; charset=utf-8
x-ua-compatible: IE=edge
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:27:07 GMT
content-security-policy: script-src 'report-sample' 'nonce-CS8CsEeKh+4vkhMHLDMl9w' 'unsafe-inline';object-src 'none';base-uri 'self';report-uri /_/SearchConsoleAggReportUi/cspreport;worker-src 'self'
content-security-policy: script-src 'nonce-CS8CsEeKh+4vkhMHLDMl9w' 'self' 'unsafe-eval' https://apis.google.com https://ssl.gstatic.com https://www.google.com https://www.gstatic.com https://www.google-analytics.com https://www.googleapis.com/appsmarket/v2/installedApps/;report-uri /_/SearchConsoleAggReportUi/cspreport
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/mutate?ds.extension=133948722&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=3444832&rt=c
HTTP/1.1 200 (1s 499 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:28 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

GET https://www.google.com/recaptcha/api2/bframe?hl=en&v=v1555968629716&k=6Ley2w8UAAAAAPOj6LHO_9ROattTY3rSLldc87NQ&cb=lun2cldlmsex
HTTP/1.1 200 (97 ms, 172.217.194.103)
status: 200
content-type: text/html; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:27 GMT
content-security-policy: script-src 'report-sample' 'nonce-hpZLk7PbCZAwTsdIEDtxBA' 'unsafe-inline' 'strict-dynamic' https: http: 'unsafe-eval';object-src 'none';base-uri 'self';report-uri https://csp.withgoogle.com/csp/recaptcha/1
content-encoding: gzip
x-content-type-options: nosniff
x-xss-protection: 1; mode=block
content-length: 1117
server: GSE
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://www.google.com/recaptcha/api2/reload?k=6Ley2w8UAAAAAPOj6LHO_9ROattTY3rSLldc87NQ
HTTP/1.1 200 (295 ms, 172.217.194.103)
status: 200
content-type: application/json; charset=utf-8
content-encoding: gzip
date: Tue, 07 May 2019 05:28:28 GMT
expires: Tue, 07 May 2019 05:28:28 GMT
cache-control: private, max-age=0
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
content-length: 9276
server: GSE
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/mutate?ds.extension=105397184&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=3544832&rt=c
HTTP/1.1 200 (859 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:34 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=3644832&rt=c
HTTP/1.1 200 (395 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:34 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=3744832&rt=c
HTTP/1.1 200 (303 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:36 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=3844832&rt=c
HTTP/1.1 200 (318 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:37 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=111586864.125286702.129561079.129631777.141416310.152262421.176495832.192628524&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=3944832&rt=c
HTTP/1.1 200 (679 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:37 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

GET https://support.google.com/apis/tooltips?v=1&key&helpcenter=webmasters&ids=9022846&hl=en
HTTP/1.1 200 (56 ms, 172.217.194.101)
status: 200
strict-transport-security: max-age=31536000; includeSubdomains
date: Tue, 07 May 2019 05:28:38 GMT
expires: Tue, 07 May 2019 05:28:38 GMT
cache-control: private, max-age=0
content-type: application/json; charset=UTF-8
access-control-allow-origin: https://search.google.com
access-control-allow-credentials: true
access-control-allow-methods: POST, GET, OPTIONS, PUT
access-control-allow-headers: X-SupportContent-XsrfToken, Authorization, X-SupportContent-AllowApiCookieAuth, Content-Type, If-None-Match
access-control-max-age: 3600
x-content-type-options: nosniff
content-disposition: attachment; filename="f.txt"
content-encoding: gzip
server: support-content-ui
content-length: 371
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/log?f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4044832&rt=j
HTTP/1.1 200 (89 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:39 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://play.google.com/log?format=json&hasfast=true
HTTP/1.1 200 (44 ms, 172.217.194.102)
status: 200
access-control-allow-origin: https://search.google.com
access-control-allow-credentials: true
access-control-allow-headers: X-Playlog-Web
p3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
content-type: text/plain; charset=UTF-8
content-encoding: gzip
date: Tue, 07 May 2019 05:28:38 GMT
server: Playlog
cache-control: private
content-length: 131
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"
expires: Tue, 07 May 2019 05:28:38 GMT

POST https://search.google.com/_/SearchConsoleAggReportUi/mutate?ds.extension=105397184&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4144832&rt=c
HTTP/1.1 200 (632 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:43 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4244832&rt=c
HTTP/1.1 200 (373 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:44 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4344832&rt=c
HTTP/1.1 200 (325 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:46 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4444832&rt=c
HTTP/1.1 200 (298 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:48 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4544832&rt=c
HTTP/1.1 200 (305 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:51 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4644832&rt=c
HTTP/1.1 200 (341 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:53 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4744832&rt=c
HTTP/1.1 200 (309 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:55 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4844832&rt=c
HTTP/1.1 200 (308 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:28:58 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=105397185&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=4944832&rt=c
HTTP/1.1 200 (531 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:29:00 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/data?ds.extension=111586864.125286702.129561079.129631777.141416310.152262421.176495832.192628524&f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=5044832&rt=c
HTTP/1.1 200 (504 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:29:00 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

GET https://support.google.com/apis/tooltips?v=1&key&helpcenter=webmasters&ids=9022846&hl=en
HTTP/1.1 200 (51 ms, 172.217.194.101)
status: 200
strict-transport-security: max-age=31536000; includeSubdomains
date: Tue, 07 May 2019 05:29:01 GMT
expires: Tue, 07 May 2019 05:29:01 GMT
cache-control: private, max-age=0
content-type: application/json; charset=UTF-8
access-control-allow-origin: https://search.google.com
access-control-allow-credentials: true
access-control-allow-methods: POST, GET, OPTIONS, PUT
access-control-allow-headers: X-SupportContent-XsrfToken, Authorization, X-SupportContent-AllowApiCookieAuth, Content-Type, If-None-Match
access-control-max-age: 3600
x-content-type-options: nosniff
content-disposition: attachment; filename="f.txt"
content-encoding: gzip
server: support-content-ui
content-length: 371
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://search.google.com/_/SearchConsoleAggReportUi/log?f.sid=4969859514164868889&bl=boq_searchconsoleserver_20190505.01_p0&hl=en&soc-app=629&soc-platform=1&soc-device=1&_reqid=5144832&rt=j
HTTP/1.1 200 (69 ms, 172.217.194.101)
status: 200
content-type: application/json; charset=utf-8
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Tue, 07 May 2019 05:29:01 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
content-encoding: gzip
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"

POST https://play.google.com/log?format=json&hasfast=true
HTTP/1.1 200 (205 ms, 172.217.194.102)
status: 200
access-control-allow-origin: https://search.google.com
access-control-allow-credentials: true
access-control-allow-headers: X-Playlog-Web
p3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
content-type: text/plain; charset=UTF-8
content-encoding: gzip
date: Tue, 07 May 2019 05:29:01 GMT
server: Playlog
cache-control: private
content-length: 131
x-xss-protection: 0
x-frame-options: SAMEORIGIN
alt-svc: quic=":443"; ma=2592000; v="46,44,43,39"
expires: Tue, 07 May 2019 05:29:01 GMT

And when I check the same URL outside of Google Search Console, I get a very clean and simple 200:

HTTP Header Spy

GET https://www.unix.com/man-page/centos/9/idr_replace/
HTTP/1.1 200 OK (1s 194 ms, 209.126.104.117)
Date: Tue, 07 May 2019 05:31:57 GMT
Server: Apache/2.4.29 (Ubuntu)
Cache-Control: private
Pragma: private
X-UA-Compatible: IE=7
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 8757
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8

The above post makes no sense.

Google declares this page is "not indexable" due to "soft 404 errors" but Google's own "Check Live URL tool" shows the page has no 404 errors, soft or not, and is fully 200 "clean".

When I repeat the test without Google, results are also "200" clean as a whistle.

Maybe it's a Google plot against this site, as all the SEO gurus tell me our on-page SEO is really great.

Hmmm.

2 Likes

Hi Dennis,

I checked, and that 302 has not been in place for some time. It was related to the site below, which was the user of the IP address before we changed hosting providers when we had the data center crisis:

-rw-r--r-- 1 root root 1244 Nov 17 05:53 armazem239.com.br.conf
-rw-r--r-- 1 root root 2256 Nov 17 05:53 armazem239.com.br-le-ssl.conf

According to Google Search Console, the site began having problems around mid March; and from end of March / early April until now we have lost a lot of organic Google referral traffic.

The graph over 16 months shows that, except for the end-of-year slowdown, Google search referrals were steady until mid March, when they started a steady decline which is still going in the wrong direction.

Google announced a bug in the first week of April, saying they had dropped links from sites, and we were seemingly affected by that bug.

I have checked Google Search Console and our site does not report errors or any issues, so maybe over time our index will increase and traffic will go back to normal.

Also, sometime in April, Revive Ad server notified customers that their ad server was subject to SQL injection attacks. We upgraded within a few days of the advisory, but after upgrading, I noticed our ad server had already been compromised and was serving adware invisible to the user. I fixed that problem as soon as I found out.

The "soft 404 errors" seem to be better; but I have requested Google to revalidate and increase crawl rate.

The SQL injection issue is fixed, so I'm guessing in April we got hit by a number of "whammys" all of which have been fixed by either Google or Revive; so my fingers are crossed things will recover slowly.

Also, I plan to move off Revive ad server and to another ad server platform because I've lost confidence in Revive. Currently looking for an alternative.

Cheers.

See also: How Bad Was Google's Deindexing Bug? - Moz

According to many SEO blogs, we are just one of many websites adversely affected by the deindexing bug; and when we combine that with the Revive bug, it was a very rough month for us in April.

Hi Jim,

I think you hit the nail on the head.

I was checking various links in the Google Search Console and noticed that short questions with no replies, what you refer to as "thin content", were being classified as "soft 404" by Google even though the server did not give any errors and the HTTP headers were fine.

This post is a clear example:

https://www.unix.com/shell-programming-and-scripting/177815-error-occurred-during-initialization-vm.html

This thread gave a "soft 404 error" in Google Search Console over and over. Then I wrote a reply (you can easily see the reply), retested, and Google gave the green light, all OK.

So, Jim McNamara was right on target.

Google is dropping pages with "thin content" from the index, and that includes very short forum questions without answers. Long questions with plenty of content appear to be fine.
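As a rough illustration of how such threads could be listed for review, here is a PHP sketch that picks out zero-reply threads whose first post is short. The record layout (`id`, `replies`, `text`), the function name and the 300-character threshold are all assumptions made up for this example; they are not the forum's real schema or Google's real cutoff.

```php
<?php
// Hypothetical thread records: id, reply count, and first-post text.
// Returns the ids of threads with zero replies and a short first post.
function thinZeroReplyThreads(array $threads, int $minChars = 300): array
{
    $flagged = [];
    foreach ($threads as $thread) {
        $isShort = strlen(trim($thread['text'])) < $minChars; // assumed threshold
        if ($thread['replies'] === 0 && $isShort) {
            $flagged[] = $thread['id'];
        }
    }
    return $flagged;
}

// Made-up sample data.
$threads = [
    ['id' => 1, 'replies' => 0, 'text' => 'Why does my script fail?'],
    ['id' => 2, 'replies' => 3, 'text' => 'Short, but answered.'],
    ['id' => 3, 'replies' => 0, 'text' => str_repeat('A detailed write-up. ', 40)],
];
print_r(thinZeroReplyThreads($threads)); // only thread 1 is flagged
```

The flagged ids are a to-do list for humans: reply, add content, or delete.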

Great work Jim.

I need to issue a new badge to you for this when I get a chance !!! :slight_smile:

2 Likes

Update:

Just issued Jim a blue "non coding" admin badge based on Jim nailing this "soft 404 error" issue first and correctly. I know it is not much, but it's one cool thing we have to show appreciation and merit here.

I was busy and did not have time to really look deeper into the "soft 404 error" until today, and Jim was correct and right on!

Well done!

1 Like

Update:

Oddly enough, Google seems to have added "AI" to evaluate link names and titles and classify them as "soft 404 errors".

For example, I have found a number of threads with titles and links with phrases like "file not found" when a user is asking a basic question about some file not being found.

Google's AI will also flag this as a "soft 404 error", I assume because of their AI classifier.
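To find threads that might trip this, a simple scan of titles and URL slugs for error-like phrases could look like the PHP sketch below. The phrase list and the function name `titleLooksLikeError` are guesses for illustration; which phrases Google actually reacts to is not known.

```php
<?php
// Assumed list of phrases; the ones Google actually reacts to are unknown.
const ERROR_PHRASES = ['not found', 'no such file', 'does not exist'];

// Returns true when a thread title or URL slug contains an error-like phrase.
function titleLooksLikeError(string $titleOrSlug): bool
{
    // Treat URL slugs like titles: hyphens and underscores become spaces.
    $normalized = strtolower(str_replace(['-', '_'], ' ', $titleOrSlug));
    foreach (ERROR_PHRASES as $phrase) {
        if (strpos($normalized, $phrase) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(titleLooksLikeError('file-not-found-when-running-script')); // bool(true)
var_dump(titleLooksLikeError('Sorting a file by column'));           // bool(false)
```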

Editorial Comment

This shows, yet again, why AI is bad and certainly not "artificially intelligent" ... it is definitely "artificial" and "systematic" and "stupid", so better to call it what it really is:

"Artificially and Systematically Stupid" OR "ASS" for short. :wink: :wink:

Or maybe even more accurate for this kind of AI:

Artificially and Systematically Stupid Having Only Low Endowment

:wink: :wink: :wink: :wink: :wink: :wink:

This problem with 302 -> 200 redirects has been solved.

See Discussion Below:

302 Redirects Issues Effected Google Search Console (GSC) Fixed - DBSEO Goto Rewrite Problems Solved

I noticed today some "Google 404 errors" for man pages with "thin content", so I am going to create "similar man pages" for "man pages" using a mysql full text search algorithm, for all man pages less than 3000 chars.

Created the DB entry and PHP code already and am running a cron in the background to create "similarman for man pages".

Still running this cron to create similar man pages from man pages, and just tested some of the results and the similar man pages are looking good.

FWIW, here is the PHP code I quickly put together for this caper:

<?php
// Cron job: for short man pages, find similar longer man pages using a
// MySQL full-text search and store their ids in the similarman column.
include_once '/var/www/global.php';
global $vbulletin, $neo_global, $quickload;

// Throttle the batch size according to the current server load.
if ($quickload > 5) {
    $makeLimit = 1;
} elseif ($quickload > 4) {
    $makeLimit = 10;
} elseif ($quickload > 3) {
    $makeLimit = 20;
} elseif ($quickload > 2) {
    $makeLimit = 25;
} else {
    $makeLimit = 30;
}

// Only man pages shorter than this many characters are processed.
$strlen = 2500;

$neo_conn = getManDBConn();
$sql = 'select manid, text from neo_man_page_entry where similarman = "blank" and strlen < ' . $strlen . ' order by strlen ASC LIMIT ' . $makeLimit;
$maninfo = mysqli_query($neo_conn, $sql);
$t1 = time();
mysqli_query($neo_conn, "SET sort_buffer_size = 2048000");
$counter = 0;
$resultMan = null;
while ($man = mysqli_fetch_assoc($maninfo)) {
    // Reduce the man page to plain text for use as the search string.
    $searchstr = $man['text'];
    $searchstr = html_entity_decode($searchstr);
    $searchstr = stripslashes($searchstr);
    $searchstr = strip_tags($searchstr);
    $searchstr = str_replace("'", " ", $searchstr);
    // Escape what is left (e.g. backslashes) so the text cannot break the query.
    $searchstr = mysqli_real_escape_string($neo_conn, $searchstr);
    $sql2 = "select manid, MATCH(text) AGAINST ('" . $searchstr . "' IN NATURAL LANGUAGE MODE) as score,strlen FROM neo_man_page_entry where strlen > 2000 AND strlen < 2000000  ORDER BY score DESC LIMIT 15";
    $resultMan = mysqli_query($neo_conn, $sql2);

    $counter++;
    // Collect the ids of the best matches as a comma-separated list.
    $ids = array();
    while ($manpage_raw = mysqli_fetch_assoc($resultMan)) {
        $ids[] = $manpage_raw['manid'];
    }
    $out = implode(',', $ids);
    $insert = "UPDATE neo_man_page_entry set similarman = '" . $out . "' where manid =" . $man['manid'];
    mysqli_query($neo_conn, $insert);
}

// Estimate the remaining work. Dividing by ($makeLimit * 60) gives hours,
// assuming the cron runs once per minute.
$sql2 = 'select count(1) as count from neo_man_page_entry where similarman = "blank" and strlen < ' . $strlen;
$countinfo = mysqli_query($neo_conn, $sql2);
$count_raw = mysqli_fetch_assoc($countinfo);
$remaining = number_format($count_raw['count'] / ($makeLimit * 60), 1);

$t2 = time();
$td = $t2 - $t1;
error_log(time() . " Time: " . $td . " Inserts: " . $counter . " Floor: " . $strlen . " Limit: " . $makeLimit . " ToDo: " . $count_raw['count'] . " RemainingTime: " . $remaining . " Hours QLoad: " . $quickload . "\n", 3, '/var/log/apache2/debug/neo_sim_man_man_pages_timing.log');
closeManDBConn($neo_conn, $maninfo);
closeManDBConn($neo_conn, $resultMan);

ubuntu@ tail -f neo_sim_man_man_pages_timing.log
1576821537 Time: 53 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56594 RemainingTime: 31.4 Hours QLoad: 1.56
1576821597 Time: 53 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56564 RemainingTime: 31.4 Hours QLoad: 1.59
1576821656 Time: 54 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56534 RemainingTime: 31.4 Hours QLoad: 1.59
1576821716 Time: 54 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56504 RemainingTime: 31.4 Hours QLoad: 1.79
1576821776 Time: 54 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56474 RemainingTime: 31.4 Hours QLoad: 1.77
1576821826 Time: 44 Inserts: 25 Floor: 2500 Limit: 25 ToDo: 56449 RemainingTime: 37.6 Hours QLoad: 2.07
1576821895 Time: 53 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56419 RemainingTime: 31.3 Hours QLoad: 1.83
1576821956 Time: 51 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56389 RemainingTime: 31.3 Hours QLoad: 1.84
1576822014 Time: 52 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56359 RemainingTime: 31.3 Hours QLoad: 1.53
1576822078 Time: 52 Inserts: 30 Floor: 2500 Limit: 30 ToDo: 56329 RemainingTime: 31.3 Hours QLoad: 1.12
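The RemainingTime column in the log can be reproduced by hand. Here is the arithmetic as a tiny PHP sketch; the function name `remainingHours` is made up, and the one-run-per-minute rate is an assumption read off the roughly 60-second gaps between the log timestamps.

```php
<?php
// Remaining hours = pages left / (pages per run * runs per hour).
// One run per minute is an assumption read off the log timestamps.
function remainingHours(int $todo, int $perRun, int $runsPerHour = 60): float
{
    return $todo / ($perRun * $runsPerHour);
}

echo number_format(remainingHours(56594, 30), 1), "\n"; // 31.4, as in the first log line
echo number_format(remainingHours(56449, 25), 1), "\n"; // 37.6, as in the "Limit: 25" line
```

So the jump from 31.4 to 37.6 hours in the middle of the log is just the batch size dropping from 30 to 25 when the load went above 2.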

As a side note,

I was a bit surprised to see how good the results are so far. When I checked about 20 similar man page entries, they were "spot on" and will be helpful for readers / future voyagers to the site. Of course, man pages with "nothing very similar" get more mixed results.

In a few days, I will add the code to the man pages that checks the strlen of the man page requested and if under 2000 (or maybe 1500), it will include another man page underneath in a section called something like "Check Out this Similar Man Page".. or something like that. This should mitigate the "soft 404" errors keeping certain man pages from being indexed by Google. (Status Update: This Todo is DONE)
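The length check described above could be sketched roughly like this in PHP. It is only an illustration, not the code running on the site: the function name `similarManIdFor` is made up, the 2000-character floor is one of the two values mentioned above, and it assumes `similarman` holds either "blank" or a comma-separated list of numeric manids, as written by the cron script.

```php
<?php
// Returns the id of the first similar man page to append, or null when the
// requested page is long enough or has no usable similarman entry.
// Assumes numeric manids and a comma-separated similarman list.
function similarManIdFor(int $pageLength, string $similarman, int $floor = 2000): ?int
{
    if ($pageLength >= $floor || $similarman === '' || $similarman === 'blank') {
        return null;
    }
    $first = trim(explode(',', $similarman)[0]);
    return ctype_digit($first) ? (int) $first : null;
}

var_dump(similarManIdFor(1200, '4711,42,7')); // int(4711): short page, append man page 4711
var_dump(similarManIdFor(5000, '4711,42,7')); // NULL: page is long enough on its own
```

Taking the first id relies on the list being ordered by score, best match first, which is how the cron script builds it.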

1 Like