Archive | SharePoint Search

The SharePoint item being crawled returned an error when attempting to download the item…

I just came across this one with one of my SharePoint 2010 customers.

They were seeing this error message in their crawl log for 3 lists and couldn’t figure out what the issue was. When they browsed the lists everything looked fine, except that the view was an Access-based datasheet view.

So we started looking into this issue by drilling into the ULS logs and found the following:

CSTS3Accessor::Init fails, Url sts4s://yoursite/siteid={GUID}/weburl=yourweburl/webid={WebID}/listid={ListID}/viewid={viewI}, hr=8004FD0F  [sts3handler.cxx:312]  d:\office\source\search\native\gather\protocols\sts3\sts3handler.cxx
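If you have many of these entries, it can help to pull the interesting bits out of each line automatically. The following is only a rough sketch in Python, assuming the ULS line has the same shape as the excerpt above; the function name parse_accessor_failure is my own and not part of any SharePoint tooling.

```python
import re

# Sample line copied from the ULS excerpt above (placeholders left as-is).
ULS_LINE = (
    "CSTS3Accessor::Init fails, Url sts4s://yoursite/siteid={GUID}/"
    "weburl=yourweburl/webid={WebID}/listid={ListID}/viewid={viewI}, "
    "hr=8004FD0F  [sts3handler.cxx:312]"
)


def parse_accessor_failure(line):
    """Return the crawled URL, its key=value segments and the HRESULT, or None."""
    match = re.search(r"Url (?P<url>\S+), hr=(?P<hr>[0-9A-Fa-f]{8})", line)
    if match is None:
        return None
    url = match.group("url")
    # Segments such as "listid={ListID}" are separated by "/" in the sts4s URL.
    parts = dict(
        segment.split("=", 1) for segment in url.split("/") if "=" in segment
    )
    return {"url": url, "hr": match.group("hr").upper(), "parts": parts}


info = parse_accessor_failure(ULS_LINE)
print(info["hr"], info["parts"].get("listid"), info["parts"].get("viewid"))
```

With the list and view IDs in hand you know exactly which list and view to open in the browser when you reproduce the problem.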

We also checked the IIS logs on the crawl target server and found the request for this particular element in the IIS log returned a 200 OK.
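To find the crawler's requests in a large IIS log, a small filter can save some scrolling. This is a minimal sketch, assuming the W3C extended log format (a "#Fields:" header line followed by space-separated rows, with spaces in the user agent written as "+"). The sample rows, the URLs and the "MS+Search" marker are made up for illustration; check what your own crawler actually sends.

```python
def crawler_requests(log_lines, agent_marker="MS+Search"):
    """Return (uri, status) pairs for rows whose user agent contains agent_marker."""
    fields = []
    hits = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of every row that follows it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if agent_marker in row.get("cs(User-Agent)", ""):
            hits.append((row.get("cs-uri-stem"), row.get("sc-status")))
    return hits


# Hypothetical sample rows, not taken from the customer's logs.
SAMPLE_LOG = [
    "#Fields: date time cs-method cs-uri-stem cs(User-Agent) sc-status",
    "2012-01-01 10:00:00 GET /sites/demo/Lists/Tasks/AllItems.aspx "
    "Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+NT;+MS+Search+6.0+Robot) 200",
    "2012-01-01 10:00:05 GET /sites/demo/default.aspx Mozilla/5.0+(Windows+NT+6.1) 200",
]

print(crawler_requests(SAMPLE_LOG))
```

Keep in mind that a 200 in the IIS log only says the page was delivered; as this post shows, the page body can still contain a SharePoint error.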

One of the first things you will find on the internet when searching for this issue is the advice to check the user agent setting on your crawl servers, which you can find in the registry of the box under:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager\UserAgent

The common advice is to update the user agent string, or at least the Internet Explorer (MSIE) part of it, so that it represents a more up-to-date browser. (Out of the box it shows version 4, which we could also see in the IIS logs of the crawl target.)
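To illustrate what "updating the MSIE part" means, here is a small Python sketch that rewrites only the MSIE token of a user agent string. The default string used below is an assumption on my side (it matches the "version 4" we saw), and the target version 7.0 is just an example; verify both against your own registry before changing anything.

```python
import re


def bump_msie_version(user_agent, new_version="7.0"):
    """Replace the version in the first 'MSIE x.y' token, leaving the rest untouched."""
    return re.sub(
        r"MSIE \d+(?:\.\d+)*",
        lambda _match: "MSIE " + new_version,
        user_agent,
        count=1,
    )


# Assumed out-of-the-box value; check the registry value on your own server.
DEFAULT_UA = "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 6.0 Robot)"

print(bump_msie_version(DEFAULT_UA))
# prints: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT; MS Search 6.0 Robot)
```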

So we gave this approach a shot: we changed the registry values, rebooted the servers and started a full crawl… After all those steps we felt pretty confident we would get rid of those errors, but when we checked the crawl log after the crawl was done, sure enough, the errors still persisted.

So after checking one more time that we had set all the registry keys correctly, we cranked up verbose logging for search, ready to start drilling deeper into the issue. Unfortunately the ULS logs turned out not to be of great help.

So we went to the crawl server and tried browsing the page again, even looked at the zone settings and everything looked just fine.

On the edge of desperation we came up with the idea to log in as the search crawl account and try to browse the page, and BOOOM: SharePoint screamed at us with a nice big error message, which basically told us that the list had exceeded the threshold for lookup columns (which out of the box is set to 8).

This Access web database had 16 lookup columns. So for testing purposes we changed the threshold in Central Administration to 17 and started an incremental crawl, and voilà, all our crawl issues disappeared from the logs. Obviously this should not be the final fix, and my customer agreed to change the design of the list to reduce the number of lookup columns. In the end the crawl logs in SharePoint told us the truth, even though, as so often, they didn’t tell us the whole story…
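If you want to check a list design before the crawler trips over it, the arithmetic is simple enough to sketch. The Python snippet below assumes you have already exported the field types of the list by whatever means you prefer. The set of types I count as lookup-like is my own assumption, and the sample list is invented to mirror the 16 lookup columns from this case.

```python
LOOKUP_THRESHOLD = 8  # out-of-the-box value mentioned above

# Assumed set of field types that count against the threshold; adjust as needed.
LOOKUP_LIKE_TYPES = {"Lookup", "LookupMulti", "User", "UserMulti"}


def lookup_overage(field_types, threshold=LOOKUP_THRESHOLD):
    """Return (lookup column count, how many columns are above the threshold)."""
    count = sum(1 for field_type in field_types if field_type in LOOKUP_LIKE_TYPES)
    return count, max(0, count - threshold)


# Hypothetical list design: 3 text columns plus 16 lookup-like columns.
sample_fields = ["Text"] * 3 + ["Lookup"] * 12 + ["User"] * 4

print(lookup_overage(sample_fields))
# prints: (16, 8)
```

An overage of 8 means eight lookup columns would have to go (or the threshold would have to be raised, which, as said, should not be the final fix).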

Hope it helps.

List Threshold and Search crawl error–The Site cannot be downloaded

After doing a migration from SharePoint 2007 to SharePoint 2010 for one of my clients I experienced a funny Search crawl error. My client had a webpart in their SharePoint 2007 environment which displayed the recently changed documents on the site. Everything was working just fine in SharePoint 2007 and the migration to SharePoint 2010 ran successfully as well.

Once we configured a search crawl to index the content we encountered several errors in the crawl logs telling us that the site couldn’t be downloaded and therefore couldn’t be crawled.

After we ensured that the crawl account had appropriate permissions on the web application I used a little move I pull from time to time to troubleshoot search crawl issues: I logged in as the crawl account. We found that when the crawl account accesses the site, a threshold error gets thrown which tells us that the webpart is exceeding the allowed threshold of 5000 items (the default setting for the web application).

You can adjust the threshold in the web application general settings in Central Administration. I would not recommend it though, because you open the door to very severe performance issues within your environment. Rather, I would recommend rethinking the way you query your data from SharePoint so that it is more efficient.

My client was using his account with administrative privileges to verify that everything was working as expected after the migration, but for this account the threshold is not applied. The crawl account only has read permissions on the content and was therefore unable to crawl the content because of the threshold.

After removing the webpart from the site(s) crawling worked just fine. The webpart will need to be replaced by one that is a little bit more aware of the threshold in SharePoint 2010.

Crawl error: Crawling ganttview.aspx

A very interesting error crossed my path last week. You will find it in your SharePoint crawl log and it looks somewhat like this:

[Screenshot: crawl log entry showing the error]

It tells you that the crawler cannot crawl SharePoint 2010’s ganttview.aspx, which is the Gantt view you can create for task lists or project task lists.

As so often, you can find the answer to this issue in the Microsoft forums:

http://social.msdn.microsoft.com/Forums/en-US/sharepoint2010general/thread/f5bebbf9-028d-4c0a-bee6-3e5d0755c08c/

Here is the solution suggested in the post. (I tested it in my environment, and after changing the registry key suggested in the solution and rebooting, the error went away.)

Open Registry Editor on your SharePoint indexing server.

Navigate to the following path within Registry Editor:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager

Find the ‘UserAgent’ value and change it to:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C; MS-RTC LM 8; Tablet PC 2.0)

A restart of the server is required!
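If you have to make this change on several crawl servers, you may prefer a .reg file over clicking through Registry Editor each time. The following Python sketch only builds the text of such a file from the path and string given above. It assumes that UserAgent is a plain string value, and the helper name build_reg_file is my own. Export the existing key as a backup before importing anything.

```python
KEY_PATH = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server"
    r"\14.0\Search\Global\Gathering Manager"
)

# User agent string exactly as suggested in the forum post above.
USER_AGENT = (
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; "
    ".NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; "
    "Media Center PC 6.0; InfoPath.3; .NET4.0C; MS-RTC LM 8; Tablet PC 2.0)"
)


def build_reg_file(key_path, value_name, value_data):
    """Return .reg file text that sets one string value under key_path."""

    def esc(text):
        # Inside quoted .reg strings, backslashes and double quotes are escaped.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{key_path}]\n"
        f'"{esc(value_name)}"="{esc(value_data)}"\n'
    )


reg_text = build_reg_file(KEY_PATH, "UserAgent", USER_AGENT)
print(reg_text)
```

Save the printed text as a .reg file, review it, import it on the indexing server and restart as described above.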

A big thank you goes out to ssaz, the author of this solution in the Microsoft forums.