UNIX.com is getting crushed in Google search these days

For over a decade, unix.com has been in the top tier for search referrals. For the keyword "unix", we used to rank #4, and when we were down, #9. At times, we were close to #2 on Google for the "unix" keyword. Now, in some geos (in the US yesterday, for example), a Google search for the "unix" keyword puts unix.com on page 4.

In other words, unix.com is getting crushed in Google search results based on recent changes to their search algorithm.

This problem started around the end of March, early April.

Our "on page SEO" is great. I have had it reviewed by a number of "SEO people" and they all say it's fine and offer no suggestions for "on page" SEO improvement.

Originally, I thought there was a bug in our Apache2 server causing "soft 404 errors", but after digging deeper into this, I confirmed:

Google Webmaster Tools Shows Problems with Soft 404 Errors

These "soft 404 errors" in Google Search Console are not really "errors". They are generated by Google's internal "AI" (or their algorithms), and they basically penalize the forum format, where people tend to post short questions and get replies that are often, but not always, short.

Yesterday I confirmed this "penalty" by examining dozens of links where Google Search Console reported "soft 404 errors" and in each case I could correct this "error" by adding more text to the posts or replies.

So, basically we are being penalized now for being a "forum" and not an "articles" site.

This seems to correspond to advice from SEO'ers recently (who were not aware of the soft 404 error issue on this site) that we transform the site to a "news article site"... something I have been very hesitant to do (and am not going to do); as I have always wanted unix.com to be a resource to help unix and linux users on a day to day basis when they are in need of help.

Unfortunately, our traffic from Google Search referrals is tanking because of changes in Google's algorithm, and this is bringing fewer and fewer users to the site with their questions.

This is not the case in Bing, where we rank very high and the "unix" keyword bounces between the #1 and #3 positions daily for this site. But Bing does not generate much traffic compared to Google, so unix.com is getting "killed" by Google now due to our decade-long forum format.

When I was testing the Google "soft 404 errors" yesterday, I found that if a user had posted a message with a title like:

"Please help me with my file not found error"

Google ranked this as a "soft 404 error" because of the "file not found" in the title, even if there were pages of replies.

In other words, Google's current indexing algorithm is penalizing unix.com and the overall forum format and using some kind of "AI" to even look at the content. If the content had words like "file not found", they classify this as a "soft 404 error". This really took me by surprise. I expected the "thin content" to be problematic, but not "file not found" in a forum user's unix filesystem question.

So, in a nutshell, this is the reason that our visitors have dropped around 50% since the beginning of March. We are getting crushed by Google's algo changes this year.

I do not have any good answer to this. I cannot change how Google does things. Our on-page SEO is fine.

My deep analysis yesterday confirmed that Google is penalizing us ("soft 404 errors", a Google term not a real HTTP error) because we are using a "forum format" and many of the questions and answers are short replies and not long articles with a lot of content. This is especially true of short questions with no answers or questions with deleted answers, etc.

Also, note this additional "ding" from a Google bug:

https://support.google.com/webmasters/answer/6211453#general

first mentioned around 5 April.

This also roughly corresponds to the decrease in Google referral traffic over the same period.

April 9-25, 2019

April 5, 2019

So the way the title is worded has an impact... hmmm
But asking a question, or asking for a suggestion or help, in plain English doesn't mean being verbose in the title; you expect that in the body...

This may relaunch the debate on compiling some of the knowledge we have here. I tried to see what that would involve, and was frightened by the amount of time and judgement it requires when there are many solutions... But it is not impossible. To avoid being drowned by the load, it would take serious teamwork to share the tasks. I believe the first thing to do is to create a list of questions, the most frequent and important ones that need to be answered; then each of us takes one question and searches the forum for the answers provided here...
The next group effort is to decide in what form we deliver the lot.
A fundamental question, links to good solutions, then variations on the theme? Do we vote for the best solution and provide alternatives?
This will only make sense if we choose the Q/A that represent most of the challenges beginners face in the UNIX world. If we word the titles correctly and make an effort on the body content, it might help unix.com regain lost popularity, and maybe also reduce the workload of replying again and again to the same questions...
This also makes me wonder whether replies like "Searching the forum, you would have found this solution to your question: <link to unix.com solution>" are now being penalised, so that we are doomed to answer in full every time, which would make the above Q/A effort pointless...

I think when a user asks a "short question", there is no problem if they get a "long reply".

One idea I had was to write some PHP code that would add a block with a "Fun UNIX Fact" or "Fun Linux Fact" and append it to any post which is:

  • A question without a reply, and;
  • A very short original question.

Yesterday, when checking the links in Google Search Console, I tried this technique on a few unanswered initial posts: I added a block of text to those "soft 404 error" pages, and then each thread passed.
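The rule described above can be sketched in a few lines. This is only an illustration, written in Python rather than the forum's PHP; the `Thread` class, the function names and the 40-word threshold are all hypothetical, since "very short" is not defined anywhere in this thread.

```python
from dataclasses import dataclass

# Hypothetical threshold: the post does not say what counts as "very short".
MIN_WORDS = 40


@dataclass
class Thread:
    first_post: str   # text of the original question
    reply_count: int  # number of replies in the thread


def needs_extra_block(thread: Thread, min_words: int = MIN_WORDS) -> bool:
    """True when the opening post is short AND nobody has replied yet."""
    word_count = len(thread.first_post.split())
    return thread.reply_count == 0 and word_count < min_words


def render_first_post(thread: Thread, extra_block: str) -> str:
    """Append the extra block only to short, unanswered questions."""
    if needs_extra_block(thread):
        return f"{thread.first_post}\n\n{extra_block}"
    return thread.first_post
```

Threads that already have replies, or that open with a long question, are returned unchanged, so the extra block never touches discussions that already carry enough text.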

Or, instead of a "Fun UNIX Fact", I could write some code that takes the "Similar Thread" data for an unanswered thread and either creates a forum reply from the post data of a similar thread, or appends a block of summary data to the original question: a kind of embedded "Topics Related to This" block, appended to short questions that have had no replies after a period of time (like a week).

I need to check to see if the current "soft 404 error" pages have similar threads at the bottom.

If they do, then another possibility would be to rewrite the similar thread block to include summary data for those similar threads, say the first 10 lines for each "similar thread" link in that block. That might work :)

Also, here is an example of how broken Google's algorithm is now.

This thread:

https://www.unix.com/unix-for-dummies-questions-and-answers/139931-executing-function.html

Until about 10 minutes ago, the thread above had the title:

Error Executing Function

It would not pass Google's algorithm and so it was rejected in the index (not included in the Google search index).

I then changed the title to:

Executing Function

and re-ran the Google "checker tool", and then it passed with flying colors.

This means that Google is doing simple algorithmic classification of titles (and maybe body text) for keywords like "error" or "not found", and they are rejecting those pages in their index!
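To make the suspected behavior concrete, here is a minimal sketch of the kind of naive keyword matching that would produce exactly these results. To be clear, this is not Google's algorithm, which is not public; the phrase list and the function name are assumptions based only on the two cases observed in this thread.

```python
import re

# Hypothetical trigger phrases, taken from the cases seen in this thread.
SUSPECT_PHRASES = ("error", "not found")


def looks_like_error_page(title: str) -> bool:
    """Flag a title if it contains any suspect phrase as a whole word or phrase."""
    lowered = title.lower()
    return any(
        re.search(rf"\b{re.escape(phrase)}\b", lowered)
        for phrase in SUSPECT_PHRASES
    )


for title in ("Error Executing Function", "Executing Function"):
    print(title, "->", looks_like_error_page(title))
```

A matcher like this has no way to tell a page that is an error from a page where a user is asking about an error, which is the false positive described here.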

This is very, very bad for unix.com, because people are always asking about errors and files not found, and using all these kinds of keywords, which Google's "very bad AI" thinks mean the page has an error or is a page not found, when actually it is a user asking about such things in a forum.

This is a perfect example of AI gone bad.

Failed Google's Index Checking:

Passed Google's Index Checking:

This is what these highly paid ($200K USD a year) Google AI engineers are doing with all their "AI and classification skills"?!!! ROTFL

Amazing how stupid AI actually is, isn't it? It's even more amazing that "big tech" thinks this is "the future of mankind"... They actually pay teams huge salaries to write these very simple, "child-like" classification algorithms?

Google, you should be ashamed of yourself!

Check this out!

Here I query the forum DB and quickly check the title of each thread for the word "error"... this alone is around 12,000 pages Google will probably reject in their index because they have the word "error" in the title:

mysql> select count(*) from thread where  title LIKE '%error%';
+----------+
| count(*) |
+----------+
|    12031 |
+----------+
1 row in set (0.10 sec)

These will probably get rejected too:

mysql> select count(*) from thread where  title LIKE '%not found%';
+----------+
| count(*) |
+----------+
|      350 |
+----------+
1 row in set (0.09 sec)
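One caveat with the two counts above: a title such as "file not found error" matches both queries, so the two numbers cannot simply be added. A single query with OR counts each thread once. The sketch below runs that query against an in-memory SQLite stand-in; only the `thread` table and `title` column come from the queries above, and the sample titles and helper names are made up for illustration.

```python
import sqlite3


def make_db(titles):
    """Build an in-memory stand-in for the forum's thread table (title column only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE thread (title TEXT)")
    conn.executemany(
        "INSERT INTO thread (title) VALUES (?)", [(t,) for t in titles]
    )
    return conn


def count_flaggable(conn):
    """Count threads whose title contains either phrase, counting each thread once."""
    query = (
        "SELECT COUNT(*) FROM thread "
        "WHERE title LIKE '%error%' OR title LIKE '%not found%'"
    )
    return conn.execute(query).fetchone()[0]


# Made-up sample titles; the second one matches both patterns.
sample_titles = [
    "Error Executing Function",
    "Please help me with my file not found error",
    "Executing Function",
    "ksh: command not found",
]
print(count_flaggable(make_db(sample_titles)))
```

The same WHERE clause should work unchanged in the mysql client against the real table.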

LOL, I guess we will need to use a thesaurus for "error" and write SQL REGEX queries to change "error" to "misunderstanding" or "transgression" or "screw-up" to get these pages to index now!

Misunderstanding Executing Function

Or maybe:

Screw-up Executing Function

We truly live in a dystopian world now... thanks to big tech.

Hi Neo
In any case, linking to similar threads is a very common practice. There are even options that are invisible to authorized users.

As a quick upgrade to "similar threads", I just added "thread preview" with the "forum title" in the similar thread code.

It looks much better and I am sure it will help out with SEO a bit.

So if someone writes something like "after updating with patch xxx.vv I get error YYY cannot resolve..." in any forum or open support site, it would be rejected by Google's AI... It is difficult to believe that MS, IBM or any other big vendor would accept that... I wonder if it is not, once again, a question of $$$ to be let through, which big accounts can afford but others cannot, and so the others are dropped from the search engine...
What do you see when you run the same query in Google and DuckDuckGo?
Just to see the impact... Just wondering...

This issue is only with Google search. I don't see the issue in Bing.

Anyway, as you can imagine, I do not have time to look at every search engine (duck, ask, baidu, etc) ; I only pay attention to the search engines that drive "real" traffic to the site.

If you are really interested in helping out, you can get access to the search console and webmaster tools for each of those search engines; but honestly, you will need to spend considerable time trying each possible keyword combo to see exactly what Google is doing. I only stumbled on this while working on "soft 404 errors", which I now understand are not "real errors" but "Google's optimization". The name "soft 404" is very misleading.

Anyway, Victor, I only report what I see and what is happening at unix.com, and why our index is getting hammered by Google with the "soft 404 errors" reported in Google Search Console, where Google has rejected the links and deleted them from their index, and where I can test them.

Also, the complete algorithm of what and how Google is matching these "error" and "not found" words is not known to me (it is a closely guarded Google secret), nor is it published by Google.

Anyway, it's wearisome to say the least.

Update:

I think I have "fixed" most of the "soft 404 errors" in Google Search Console by modifying the code and content. The revalidation is slow going, but so far the "soft 404 errors" have dropped from around 7000 to around 2000, and Google says "Looking Good" as they slowly revalidate.

I also added code to the posts block so that it overflows (adds a scroll bar) where users have posted long lines of code without code tags. There are too many posts without code tags, and all these posts caused problems with Google's mobile validator; Google then drops them from the index for being "not mobile friendly".

I also added the post summary to similar threads so Google now does not flag pages as "soft 404" for "thin content" when there are a number of similar threads listed.
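For anyone curious what "post summary in similar threads" amounts to, here is a rough sketch of the idea in Python (the site itself runs on PHP). The dictionary keys, the function names and the 30-word preview length are assumptions for illustration only, not the actual forum code.

```python
def preview(text: str, max_words: int = 30) -> str:
    """Return the first max_words words of a post, marking any truncation."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " ..."


def similar_threads_block(threads, max_words: int = 30) -> str:
    """Render each similar thread as a title line with forum name, then a preview line."""
    lines = ["Similar Threads"]
    for thread in threads:
        lines.append(f"{thread['title']} ({thread['forum']})")
        lines.append("  " + preview(thread["first_post"], max_words))
    return "\n".join(lines)
```

The point of the preview line is simply that a page with a short question and no replies now carries real, related text instead of a bare list of links.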

Also, looking at the Google search ranking, for the keyword "unix", I noticed that our site is not the only site which has dropped. A number of older sites, including the Open Group, who used to have two pages in the top ten, have dropped even lower than our site.

Also stackexchange for the same "unix" keyword used to be high on the second page, now it hangs around the bottom of page four.

So, it seems like changes in Google's algorithm have affected many sites, and not only this one. Many have benefited and others have fallen from grace.

Let's see if the changes I made reverse our downward trend, or if, as I suspect, we are just an "old and long in the tooth forum site" falling out of favor with Google because our content has "aged" over the years and there are lots of new sites with more modern formats coming up on the net who also have a much stronger social media presence.

Also, I found another problem today where a domain in Brazil was using our IP address and duplicating our content.

So, today I created an Apache2 fix for this problem.

Maybe fixing this problem will help a bit as well with Google search referrals and ranking. Maybe not.

Today I spoke with a consultant who went over unix.com in the Google Search Console and he concurred all is looking good (and the soft 404 errors seemed to be well on their way to being fixed) and the site should start to re-index over time; and reminded me it simply takes time, especially with all the changes in Google Search Console and the many millions and millions of websites.

If anyone needs some Google Search Console (GSC) help or advice, please return the kindness he has given to unix.com and contact him on Fiverr.

About Me:

I'm a self-employed freelance Web Developer and Digital Marketer. I've 06 years' experience Web Development and Digital Marketing.
I would love to support you with all my Tech Efforts & Experience. I believe in providing the best possible service for you, because your satisfaction is my motivation!
Looking Forward to Work With You.
Thanks..

I can help you with:

  • Website Development
  • Website Speed Optimization
  • Website Security
  • Search Engine Optimization (SEO)
  • Social Media Marketing (SMM)
  • Killer SEO Content Writing
Referring to Victor's post#9, I agree with him that, since this is a forum, we cannot (directly) control content. Given the size of the DB we must have with all the historical questions, surely we can't be expected to make thousands of modifications. A forum is what it is: a forum.

I do notice that searching for "Unix forum" we are top.

As you say Neo, let's see what happens but if what you've done doesn't fix it then we ought to put this problem in front of John Mueller at Google (Zurich) or, if we can't get to him, Gary Illyes at Google (Zurich) and see what they say/recommend.

The change of algo must have caused this issue for many forums.

Well, the many people I have discussed this with all seem to agree that Google is penalizing forums in one way or another, in their algorithm changes.

I seriously doubt Google's algo development team wakes up in the morning and says "let's penalize forums", but I am confident they are making their algorithms more "AI-like" and we can only see the symptoms of what they are doing.

In the last week, I have directly seen perfectly good forum discussions marked by Google as "soft 404" and not indexed because of a single keyword like "error" in the title of the discussion. When I manually change the title and remove the keyword "error", it passes Google's algorithm with flying colors. I have seen this for other phrases like "not found" as well.

This does make sense if you look at it from a global AI perspective. AI is not intelligence, and neither are methods like Bayesian classifiers. Google is collecting data globally, and I am sure many bad links on the net return responses that have "error" or "not found" in the text. So, speaking globally, Google then classifies links with text like "error" or "not found" in the title or the metadata as "bad", or in their case "soft 404", and does not index them.

So, if the website is a forum about dogs and cats, then that site probably does not have titles and metadata like "My Dog Has An Error" or "Please Help Me with My Cat Error". Based on my years of working with such classifiers (we also ran a Bayesian anti-spam classifier here at unix.com for many years), it is easy to see how a classifier could penalize a technology forum that deals with software errors as a matter of course. These are simply false positives in Google's algorithm.
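To show how such a false positive can fall out of a Bayesian classifier, here is a toy multinomial naive Bayes over title words with add-one smoothing. Everything in it is hypothetical: the six training titles are invented to mimic a "global" corpus where "error" and "not found" appear mostly on broken pages, and nothing here is known to resemble what Google actually runs.

```python
import math
from collections import Counter

# Invented training data: a tiny "global web" where error words mean broken pages.
TRAINING = [
    ("404 error page not found", "broken"),
    ("error page not found", "broken"),
    ("server error", "broken"),
    ("best dog food reviews", "ok"),
    ("cat grooming tips", "ok"),
    ("holiday photo gallery", "ok"),
]


def train(samples):
    """Collect per-label word counts and per-label document counts."""
    word_counts = {}
    doc_counts = Counter()
    for title, label in samples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(title.lower().split())
    return word_counts, doc_counts


def classify(title, word_counts, doc_counts):
    """Return the label with the highest log-probability (add-one smoothing)."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in title.lower().split():
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


WORD_COUNTS, DOC_COUNTS = train(TRAINING)
for title in ("Error Executing Function", "Executing Function"):
    print(title, "->", classify(title, WORD_COUNTS, DOC_COUNTS))
```

With this toy data the single word "error" is enough to tip a legitimate question into the "broken" class, while the same title without it lands in "ok". A classifier trained mostly on non-technical pages never sees "error" used in an ordinary question, which is the situation a technology forum finds itself in.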

The same is true of "thin content".

If someone asks a short question about grep and they get a short but accurate reply, and even if that reply is very helpful to everyone, Google's classifier cannot score that. Google will just score on the content "thinness".

I spent most of the week looking at all the links on our site which Google has classified as "soft 404 errors", and in each case either "thinness" or a keyword in the title or metadata like "error" or "not found" was the cause. In each case I confirmed it by double checking before and after I made the change.

So, to help with the soft 404s on posts and discussion threads, I added summary text to "similar threads". Now those pass Google's classifier and are currently being validated as "looking good".

As for all the "titles" and meta data with keywords like "error" and "not found".. that is a huge problem and of course we cannot change 12K threads and make the title and metadata senseless to pass Google's classifiers.

Unfortunately, this is how the classifiers work and it is really a very poor design which would classify a technology forum with links with "error" and "not found" in the metadata as "soft 404" but Google does not listen to me. In fact, since I left the US over 10 years ago and live on the seacoast in Thailand, very few people listen to me like they used to. People are mostly jealous! LOL

Anyway, I digress.

The good news is that I have made a lot of changes this week and learned a lot. The bad news is I cannot say with any assurance that the changes I make will have any immediate effect. Five or ten years ago, I could see changes have an effect very quickly; but as many have pointed out to me recently, the network is orders of magnitude larger now than it was back then, and it is growing at breakneck speed.

Also, one of the best ways for any site to have a better search engine ranking is to have many quality backlinks.

I have looked at the web pages of many of the members who have many posts at unix.com and you might be surprised to know that very few of our members who have web sites have a link to unix.com on their web pages.

Every credible link from relevant pages (technology related sites, personal pages, social media posts, posts in other tech sites) which points back to any unix.com page (homepage, your profile page, or a discussion thread or post) helps boost our search engine rankings.

Update:

Google Search Console shows the downward trend has reversed and traffic is on the increase again.

Whew! That was a lot of work! But it seems to be working.

I will do a video on this in about a week or so and show the downward trend, the various issues, changes I made to reverse the downward trend, and results.

Well, it has been around two weeks, and in many geographic areas, for example the USA, a Google search for the keyword "unix" has our site back on the first page.

Google Search Console shows the traffic from Google has increased to near mid-March levels (erasing around six weeks of decline) and is rising steadily.

This seems to indicate that we are "out of the woods" and that my long hours of reviewing our search optimization in Google Search Console, and making a few changes here and there, have paid off.

Yea!

In a week or so, if the upwards trend continues, I will do a YT video on this. If not, no video on this topic, LOL.

Well, at least we are number one on Bing for the "unix" keyword, but only back up to number 8 - 10 on Google (currently #8 or #9 for Google US)

Working on all the "soft 404" errors seems to be paying off.

Today:

Currently Active Users: 4168 (3 members and 4145 guests and 20 Spiders)

Edit: Or maybe not really... seeing a lot of abnormal bot traffic from Taiwan.
