
Solved: Google refuses to index the additional URLs created by vBET



myandy99
25-08-11, 04:07
vBET 3.4.1
vBSEO 3.6
vBulletin 3.6.8
vBSEO Sitemap

I integrated vBET with vBSEO Sitemap, and the number of URLs generated in the sitemap increased significantly because of the additional languages. However, Google refuses to index most of them and sent me the following message in Webmaster Tools:


Googlebot found an extremely high number of URLs on your site: http://xxxx.com/

Googlebot encountered problems while crawling your site http://xxxx.com/.

Googlebot encountered extremely large numbers of links on your site. This may indicate a problem with your site's URL structure. Googlebot may unnecessarily be crawling a large number of distinct URLs that point to identical or similar content, or crawling parts of your site that are not intended to be crawled by Googlebot. As a result Googlebot may consume much more bandwidth than necessary, or may be unable to completely index all of the content on your site.

More information about this issue

Here's a list of sample URLs with potential problems. However, this list may not include all problematic URLs on your site.

(Note: the sample URLs listed are the foreign-language URLs, such as http://xxx.com/forums/no/something-something.html, and they are working URLs, NOT broken URLs.)


As a result, although with vBET I submitted many more URLs than before (661,934) to Google through the sitemap, Google rejected most of them and indexed only 44,506.

The detailed message that explains why Google did not index those URLs reads as follows:


URLs not followed
When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL.

HTTP Error: 301
URL:
http://xxx.com/forums/sl/blah-blah-blah/314-what-ever-sample.html
Problem detected on: Aug 18, 2011

Shouldn't Google index more of my URLs, if not all? I have read the good stories on this forum about vBET increasing the number of indexed URLs and bringing in additional traffic. This has not happened for me.

How can I fix this problem?
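
The "URLs not followed" message quoted above is about sitemap entries that answer with a 301 instead of the final page. A minimal sketch of how such entries could be listed before submitting the sitemap, assuming the sitemap has already been downloaded and decompressed to plain XML (the thread's sitemap_index.xml.gz is an index, so each child sitemap would have to be checked in turn). The function names are made up for illustration and are not part of vBET or vBSEO:

```python
import http.client
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text):
    """Return every <loc> value from one sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]

def head_status(url):
    """Send one HEAD request and return its status code.

    http.client does not follow redirects, so a 301 is reported as 301.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        return conn.getresponse().status
    finally:
        conn.close()

def redirecting_urls(urls, get_status=head_status):
    """Keep only the URLs whose first response is a 3xx redirect."""
    return [u for u in urls if 300 <= get_status(u) < 400]
```

Any URL this returns is a candidate for replacement in the sitemap by its redirect target, which is what the Google message recommends.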

kamilkurczak
25-08-11, 08:21
Hello,
Can you paste the full message from Google here, with the full list of wrong URLs? If you don't want to paste it here, you can PM it to me. I need to check the links reported by Google :) Thanks

myandy99
26-08-11, 06:15
Just PM'ed you. Total 2 PMs because the Google message is very long. Thanks

kamilkurczak
26-08-11, 09:02
Hello,
I checked your links and didn't notice any wrong URLs or double redirections. Can you paste the content of your robots.txt file here?
Secondly, using the Google Chrome tools (Ctrl+Shift+J in Chrome) I noticed some errors (bad request = 400, image not found = 404). You can check them yourself using the Chrome browser.

I will try to find out why Google sent you that message.

Thirdly, I noticed some pages with a double language code in the URL, for example: forums/no/pl/photoplog/image-former-policeman-hold-tourist-bus-passengers-hostage-manila-photos-images-2944/ so there is an issue with redirection. Are you sure that you integrated vBET with vBSEO correctly?
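
A double language code like the one in the example URL above can be spotted mechanically. This is a rough sketch only: the set of codes is a small subset taken from the robots.txt posted in this thread, the "forums" root is this site's layout, and the function names are made up for illustration:

```python
from urllib.parse import urlparse

# Assumption: a subset of the language codes used on this forum.
LANG_CODES = {"af", "sq", "ar", "de", "es", "fr", "no", "pl", "sl", "zh-CN", "zh-TW"}

def language_segments(url, forum_root="forums"):
    """Return the consecutive language codes that follow the forum root."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if forum_root in segments:
        segments = segments[segments.index(forum_root) + 1:]
    found = []
    for segment in segments:
        if segment not in LANG_CODES:
            break
        found.append(segment)
    return found

def has_double_language(url):
    """True when more than one language code is stacked in the path."""
    return len(language_segments(url)) > 1
```

Running has_double_language over the URLs from a sitemap or a crawl log would give a list of pages to inspect by hand.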

myandy99
26-08-11, 17:33
I thought I followed the instructions to integrate vBET with vBSEO. What do you think I might have missed? I can double check.

forums/no/pl/photoplog is for a third party vBulletin mod and it has not been integrated with vBET (I don't know how to do it) so I am not surprised to see odd URLs for photoplog under vBET. Photoplog is installed under site root, not forum root so the URL really should be root/whatever-language/photoplog, not root/forums/whatever-language/photoplog. But anyway Google is complaining about my regular forum URLs being too many and problematic.

This is my robots.txt before I received the Google message:

User-agent: Slurp

User-agent: *
Disallow: /forums/cron.php
Disallow: /forums/login.php
Disallow: /forums/newreply.php
Disallow: /forums/newthread.php
Disallow: /forums/printthread.php
Disallow: /forums/private.php
Disallow: /forums/profile.php
Disallow: /forums/register.php
Disallow: /forums/report.php
Disallow: /forums/reputation.php
Disallow: /forums/search.php
Disallow: /forums/sendmessage.php
Disallow: /forums/payments.php
Disallow: /forums/cgi-bin/
Disallow: /forums/includes/
Disallow: /forums/living_avatars/
Disallow: /forums/customavatars/
Disallow: /forums/members.php
Disallow: /forums/memberlist.php
Disallow: /roommate
Disallow: /roommate-notused
Disallow: /3rdparty/
Disallow: /cgi-bin/
Disallow: /classes/
Disallow: /crons/
Disallow: /functions/
Disallow: /import/
Disallow: /includes/
Disallow: /jscript/
Disallow: /license/
Disallow: /modules/
Disallow: /moderator/
Disallow: /psystems/
Disallow: /smarty/
Disallow: /temp/
Disallow: /ajax/
Disallow: /news.php/
Disallow: /forums/members/

Sitemap: http://xxxx.com/forums/sitemap_index.xml.gz


This is my robots.txt after I received Google messages (I added disallow to /forums/language/members)


User-agent: Slurp

User-agent: *
Disallow: /forums/cron.php
Disallow: /forums/login.php
Disallow: /forums/newreply.php
Disallow: /forums/newthread.php
Disallow: /forums/printthread.php
Disallow: /forums/private.php
Disallow: /forums/profile.php
Disallow: /forums/register.php
Disallow: /forums/report.php
Disallow: /forums/reputation.php
Disallow: /forums/search.php
Disallow: /forums/sendmessage.php
Disallow: /forums/payments.php
Disallow: /forums/cgi-bin/
Disallow: /forums/includes/
Disallow: /forums/living_avatars/
Disallow: /forums/customavatars/
Disallow: /forums/members.php
Disallow: /forums/memberlist.php
Disallow: /roommate
Disallow: /roommate-notused
Disallow: /3rdparty/
Disallow: /cgi-bin/
Disallow: /classes/
Disallow: /crons/
Disallow: /functions/
Disallow: /import/
Disallow: /includes/
Disallow: /jscript/
Disallow: /license/
Disallow: /modules/
Disallow: /moderator/
Disallow: /psystems/
Disallow: /smarty/
Disallow: /temp/
Disallow: /ajax/
Disallow: /news.php/
Disallow: /forums/members/
Disallow: /forums/af/members/
Disallow: /forums/sq/members/
Disallow: /forums/ar/members/
Disallow: /forums/hy/members/
Disallow: /forums/az/members/
Disallow: /forums/eu/members/
Disallow: /forums/be/members/
Disallow: /forums/bg/members/
Disallow: /forums/ca/members/
Disallow: /forums/zh-CN/members/
Disallow: /forums/hr/members/
Disallow: /forums/cs/members/
Disallow: /forums/da/members/
Disallow: /forums/nl/members/
Disallow: /forums/en/members/
Disallow: /forums/et/members/
Disallow: /forums/tl/members/
Disallow: /forums/fi/members/
Disallow: /forums/fr/members/
Disallow: /forums/gl/members/
Disallow: /forums/ka/members/
Disallow: /forums/de/members/
Disallow: /forums/el/members/
Disallow: /forums/ht/members/
Disallow: /forums/iw/members/
Disallow: /forums/hi/members/
Disallow: /forums/hu/members/
Disallow: /forums/is/members/
Disallow: /forums/id/members/
Disallow: /forums/ga/members/
Disallow: /forums/it/members/
Disallow: /forums/ja/members/
Disallow: /forums/ko/members/
Disallow: /forums/lv/members/
Disallow: /forums/lt/members/
Disallow: /forums/mk/members/
Disallow: /forums/ms/members/
Disallow: /forums/mt/members/
Disallow: /forums/no/members/
Disallow: /forums/fa/members/
Disallow: /forums/pl/members/
Disallow: /forums/pt/members/
Disallow: /forums/ro/members/
Disallow: /forums/ru/members/
Disallow: /forums/sr/members/
Disallow: /forums/sk/members/
Disallow: /forums/sl/members/
Disallow: /forums/es/members/
Disallow: /forums/sw/members/
Disallow: /forums/sv/members/
Disallow: /forums/zh-TW/members/
Disallow: /forums/th/members/
Disallow: /forums/tr/members/
Disallow: /forums/uk/members/
Disallow: /forums/ur/members/
Disallow: /forums/vi/members/
Disallow: /forums/cy/members/
Disallow: /forums/yi/members/
Disallow: /forums/member.php
Disallow: /forums/af/member.php
Disallow: /forums/sq/member.php
Disallow: /forums/ar/member.php
Disallow: /forums/hy/member.php
Disallow: /forums/az/member.php
Disallow: /forums/eu/member.php
Disallow: /forums/be/member.php
Disallow: /forums/bg/member.php
Disallow: /forums/ca/member.php
Disallow: /forums/zh-CN/member.php
Disallow: /forums/hr/member.php
Disallow: /forums/cs/member.php
Disallow: /forums/da/member.php
Disallow: /forums/nl/member.php
Disallow: /forums/en/member.php
Disallow: /forums/et/member.php
Disallow: /forums/tl/member.php
Disallow: /forums/fi/member.php
Disallow: /forums/fr/member.php
Disallow: /forums/gl/member.php
Disallow: /forums/ka/member.php
Disallow: /forums/de/member.php
Disallow: /forums/el/member.php
Disallow: /forums/ht/member.php
Disallow: /forums/iw/member.php
Disallow: /forums/hi/member.php
Disallow: /forums/hu/member.php
Disallow: /forums/is/member.php
Disallow: /forums/id/member.php
Disallow: /forums/ga/member.php
Disallow: /forums/it/member.php
Disallow: /forums/ja/member.php
Disallow: /forums/ko/member.php
Disallow: /forums/lv/member.php
Disallow: /forums/lt/member.php
Disallow: /forums/mk/member.php
Disallow: /forums/ms/member.php
Disallow: /forums/mt/member.php
Disallow: /forums/no/member.php
Disallow: /forums/fa/member.php
Disallow: /forums/pl/member.php
Disallow: /forums/pt/member.php
Disallow: /forums/ro/member.php
Disallow: /forums/ru/member.php
Disallow: /forums/sr/member.php
Disallow: /forums/sk/member.php
Disallow: /forums/sl/member.php
Disallow: /forums/es/member.php
Disallow: /forums/sw/member.php
Disallow: /forums/sv/member.php
Disallow: /forums/zh-TW/member.php
Disallow: /forums/th/member.php
Disallow: /forums/tr/member.php
Disallow: /forums/uk/member.php
Disallow: /forums/ur/member.php
Disallow: /forums/vi/member.php
Disallow: /forums/cy/member.php
Disallow: /forums/yi/member.php
Disallow: /forums/hy/member.php
Disallow: /forums/az/member.php
Disallow: /forums/eu/member.php
Disallow: /forums/ka/member.php
Disallow: /forums/ur/member.php

Sitemap: http://xxxx.com/forums/sitemap_index.xml.gz
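
Writing two Disallow lines per language by hand is easy to get wrong (a few member.php lines are repeated at the end of the list above). As a hedged sketch, the same rules could be generated from a language list and then checked with Python's standard urllib.robotparser. The language list here is shortened for the example, and the helper names are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Assumption: shortened list; the real file covers every language enabled in vBET.
LANGS = ["af", "sq", "no", "pl", "sl", "zh-CN"]

def build_rules(langs):
    """Build robots.txt lines that block member pages for every language."""
    lines = [
        "User-agent: *",
        "Disallow: /forums/members/",
        "Disallow: /forums/member.php",
    ]
    for lang in langs:
        lines.append(f"Disallow: /forums/{lang}/members/")
        lines.append(f"Disallow: /forums/{lang}/member.php")
    return lines

def is_blocked(rules, path, agent="Googlebot"):
    """Check a path against the rules using the standard library parser."""
    parser = RobotFileParser()
    parser.parse(rules)
    return not parser.can_fetch(agent, path)
```

For example, is_blocked(build_rules(LANGS), "/forums/pl/members/someone.html") should be true, while a normal translated thread URL should stay crawlable.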

kamilkurczak
29-08-11, 09:12
You can compare it with our robots.txt file here:
http://www.vbenterprisetranslator.com/forum/general-discussions/243-vbet-performance.html#post1178

vBET
29-08-11, 20:10
Hi. I just asked Kamil to forward me your PM. As far as I can see, the issue is caused by double language codes in URLs. This is why you got the email from Google: you have many URLs with the same content.

First of all, such links shouldn't be created. The cause may be plugins that have not been integrated.

Next, even if such a URL is created, vBET is supposed to detect it and redirect. The redirection does not happen on your forum, and this is the problem. I do not know why it doesn't redirect: it may be a vBET bug, or it may be because you are using a vBulletin version that is not officially supported. vBET 3.x is for vBulletin 3.8, and you are using vBulletin 3.6. I need to check it on site. Please PM me your Admin CP and FTP access details.

myandy99
30-08-11, 04:47
Just PM'ed you

vBET
31-08-11, 04:33
OK, now I see the real issue: links for photoplog shouldn't include /forums at all. This was an integration issue. No integration was done, and the link was relative rather than absolute, so vBET assumed that it was still a forum link.

You may not see how this contributed to Google's complaint, but I do. Most of the links listed there are translated links, and each translated page included a broken link to photoplog (for example, DOMAIN/forums/pl/photoplog/index.php was included on every page with the Polish translation). So each translated page linked to duplicated content, because those broken links lead to a photoplog page which includes the main forum page content.

I just added photoplog to the ignored URLs, so vBET will NOT generate such broken links for photoplog anymore. It would be better to integrate it and have it translated as well, but that is not covered by free support. Please open a new thread in the integration section for help. It will still be tricky, if possible at all, because you have it combined with vBSEO rules, so most probably the standard integration hints will not be enough. Integration done by our staff is a paid service and costs $30 (http://www.vbenterprisetranslator.com/integration-service.php).

So the current solution solves the issue by keeping your photoplog pages out of the vBET system. Broken links will not be generated anymore, and broken links already recorded by Google will be redirected, so a link like:
DOMAIN/forums/pl/photoplog/gallery-view-3.html
will be redirected to:
DOMAIN/forums/photoplog/gallery-view-3.html

So you still need to add an .htaccess rule which redirects such requests to: DOMAIN/photoplog/gallery-view-3.html
This part is outside the vBET system (vBET now ignores photoplog on your forum); it is needed because your forum allows such broken links.
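
To make the two steps concrete, here is a small model of the redirect chain described above. It is only an illustration of the mapping, not vBET code and not the .htaccess rule itself. The regular expression assumes language codes look like "pl" or "zh-CN", and the function names are hypothetical:

```python
import re

# Assumption: a language code is two lowercase letters, optionally with a region.
LANG_PREFIX = re.compile(r"^/forums/[a-z]{2}(?:-[A-Z]{2})?/photoplog/")

def strip_language(path):
    """Step 1, described as handled by vBET: drop the language code."""
    return LANG_PREFIX.sub("/forums/photoplog/", path)

def strip_forum_root(path):
    """Step 2, left to the site's own .htaccess rule: drop the /forums prefix."""
    if path.startswith("/forums/photoplog/"):
        return path[len("/forums"):]
    return path

def final_target(path):
    """Where a broken photoplog link should end up after both redirects."""
    return strip_forum_root(strip_language(path))
```

A function like final_target can serve as a checklist when testing the .htaccess rule: each broken URL that Google reported should end at the path it returns.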

This should solve the issue. I do not see any other anomalies. If you do, please let us know where. Please pay special attention to all links going outside your forum directory and check that they are correct after translation. They should be absolute; otherwise vBET will treat them as part of the forum content and add translation tracking there. It is possible to track translation for pages outside the forum directory and support translation there if they are generated by the vBulletin engine, but that requires integration. By default vBET translates the forum.

Please tell us if you need more help in this area.

PS.
One more anomaly, not related to vBET: Google shows several links to ecards pages. When I followed them, I saw the same page content after a redirect but under a different URL. Please make sure that the redirect sends an appropriate header indicating that the content does not exist.

vBET
04-09-11, 19:48
Answering your question in PM about why you got a warning for a page which is now perfectly OK.

Possible reasons:
1. It had a link to duplicated content (the wrong photoplog link which was on each of your translated pages before we made the change).
2. When Google first came there, you could have had issues with Google translation itself, such as exceeded quota, temporarily blocked outgoing requests from your server, or a temporary block by the Google Translation API itself (for a reason other than translation). If any of those happened when the Google robot visited a page that was NOT cached yet, the robot saw only the layout with all texts blank (the translation result was an empty string for each translation), so all pages looked the same.

We cannot tell whether this happened, but those are the possible reasons we see right now. Also please note that now that your cache is filled, it is less likely that you will exceed the Google Translation API quota (the quota has been in place for some time, since Google announced that the Translation API would be closed).

At this very moment we do not see any issues with the rest of your URLs. If you see any issue, please let us know and we will react immediately to help you. At this moment it looks fine. Do you need more help with this issue? :)

vBET
07-09-11, 19:04
As you wrote in PM, no more help is needed here at this moment. If you need more help in this area, please just write here :)
