I've noticed that Google searches on the Wiki using site:wiki.nesdev.com often act dumb. For example, when I search for MMC3, none of the 21 hits is the main MMC3 page, even though it has MMC3 in the title. Using intitle:MMC3 gives zero hits (even if I add MMC3 again as a normal search string).
Is there something telling Google to skip documents? At least a week or two ago, I was searching for things on nesdevwiki and Google had lots of hits to the now-nonexistent site. Maybe that has something to do with it, like Google thinks the new site is a spam mirror or something, I dunno. I see no robots.txt, so that wouldn't be it (some bad entry in one or something).
Google searches on wiki act dumb
-
blargg
- Posts: 3715
- Joined: Mon Sep 27, 2004 8:33 am
- Location: Central Texas, USA
-
Banshaku
- Posts: 2417
- Joined: Tue Jun 24, 2008 8:38 pm
- Location: Japan
I don't know enough about MediaWiki to give an answer, but could it be related to the recent essay spam links we received? That must be a common kind of spam link, so it could affect how Google treats the site.
Maybe there is a way to check on Google to see how nesdev is affected by this, but I don't know about that either. I will see if I can give it a look.
-
blargg
- Posts: 3715
- Joined: Mon Sep 27, 2004 8:33 am
- Location: Central Texas, USA
I just noticed that Google is also getting hits within the skins/ directory on the Wiki. I'm thinking that should have an empty index.html in it. Otherwise one gets useless hits, even when restricting via a site: as mentioned in an earlier message.
-
koitsu
- Posts: 4201
- Joined: Sun Sep 19, 2004 9:28 pm
- Location: A world gone mad
blargg wrote: I just noticed that Google is also getting hits within the skins/ directory on the Wiki. I'm thinking that should have an empty index.html in it. Otherwise one gets useless hits, even when restricting via a site: as mentioned in an earlier message.
This is because of the secure MPM we use for Apache. A more appropriate fix would be to place an .htaccess file in /home/ndwiki/www/w which contains:
Code:
    Options -Indexes
I've put said .htaccess in place; verified as working. It may take a few weeks before Google picks up the changes, as their crawler sometimes takes a while to notice such things.
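For anyone wanting to check a change like this on their own server, here is a rough Python sketch. It relies on two things: with Options -Indexes and no index file, Apache answers a directory request with 403 Forbidden, and its auto-generated listings normally carry an "Index of /..." page title. The function names and the example URL are made up for illustration, and the title check is only a heuristic, so treat this as a starting point rather than how the check was actually done here.

```python
import urllib.error
import urllib.request


def classify_listing(status: int, body: str) -> str:
    """Classify the response to a directory request.

    Heuristic (an assumption): Apache's generated listings use a page
    title that starts with "Index of /".
    """
    if status == 403:
        return "blocked"  # what Options -Indexes yields when no index file exists
    if status == 200 and "<title>index of /" in body.lower():
        return "listing"  # directory contents are still exposed
    return "other"        # a real page, a redirect target, a 404, etc.


def check_directory(url: str) -> str:
    """Fetch url and classify it. Needs network access; never runs on import."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            status = resp.status
            body = resp.read(65536).decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx, so the status code comes from the exception
        status, body = err.code, ""
    return classify_listing(status, body)
```

Calling something like check_directory("http://example.com/w/skins/") (a placeholder URL, substitute the real one) should return "blocked" once the .htaccess is in effect and "listing" if the directory index is still being served.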