The Content Database Support and Remote BLOB Storage Myth

There’s a popular myth that keeps popping up that I wanted to post about.

Why is it so popular?
Well, because it seems intuitive if you aren’t working with SharePoint on a regular basis. If you are, then I’m sure you don’t think this… and if you did, well, shortly you’ll know the truth.

So here’s the myth
“We don’t need to split our content across separate content databases because if we need more than 200GB support for each database we will [1] move subsites around to different site collections in different databases or [2] use remote blob storage and put it all on file shares… then we’ll have a very small content database size.”

Why is this a myth?
Let’s address the second part of the statement first – “[2] use remote blob storage and put it all on file shares… then we’ll have a very small content database size”. This is a myth because the content database will still not be supported by Microsoft. The reason is that the actual database size PLUS the content offloaded and stored on file shares both count toward the 200GB limit (or 4TB if you meet additional requirements). This means that even if you had a 1GB database and 225GB offloaded onto file shares for this content database, you’re actually at 226GB and therefore not supported if you do not meet the additional requirements. If you do meet the requirements and have a 1GB database with 4.5TB offloaded onto file shares for a specific content database, you’re at roughly 4.5TB of content and again, not supported.
From http://technet.microsoft.com/en-us/library/cc262787.aspx: “If you are using Remote BLOB Storage (RBS), the total volume of remote BLOB storage and metadata in the content database must not exceed this limit.”
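As a sketch of the arithmetic behind this rule (the function name, figures and default limit here are illustrative, not an official tool):

```python
# Supportability check for a content database using RBS:
# the database size AND the externalized BLOB volume both count
# toward the limit (200 GB by default, 4 TB if the additional
# requirements are met).

def rbs_supported(db_gb: float, blob_gb: float, limit_gb: float = 200) -> bool:
    """Return True if DB size plus remote BLOB volume is within the limit."""
    return db_gb + blob_gb <= limit_gb

# The examples from the post:
print(rbs_supported(1, 225))                   # 1 + 225 = 226 GB > 200 GB
print(rbs_supported(1, 4500, limit_gb=4000))   # ~4.5 TB > 4 TB
```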

Now let’s address the first part of the statement – “we will [1] move subsites around to different site collections in different databases”. This is also a myth because although it is doable, that doesn’t make it a good idea. Do you remember that old Chris Rock line… “You can drive your car with your feet if you want to, but that don’t make it a good, [expletive] idea?” Yes? Well, it’s the same here. So why isn’t it a good idea? In this case it’s because subsites are contained within a site collection. There is a close relationship between a site collection and its subsites. Objects such as site columns and content types are associated with and shared by subsites. If you want to move an individual subsite, you have to consider how you are going to move these shared objects as well – and this is where it gets tricky, because a number of objects are difficult to move – for example, workflow history and approval field values. Even if you investigate using third-party tools to move your subsites, you will likely encounter issues.

Ok I get it, but what do you recommend… what is the fix?
Essentially, the architecture of the SharePoint environment should be considered carefully up front, as much as possible, in conjunction with the supportability limits discussed above.

In case you are thinking that surely Microsoft should just raise their support limits even higher, or that subsites should be able to be moved around in a more full-fidelity manner: I understand this point and was guilty of thinking this myself when I first started working with SharePoint. As time went on though, I also asked myself… is there any other product that offers all of the functionality that SharePoint does and has comparable supportability limits? Frankly, I couldn’t think of any. Besides, given that there is full transparency around the supportability limits and a wealth of information on TechNet making it clear (at least to IT pros) what to do and what not to do, I’m happy with this, at least for now.

SharePoint 2013: Search Suggestions Not Working After Configuration

Simple trap to avoid when configuring SP2013 Search – once you’ve set up search suggestions, they won’t automatically show when you do a search.

The reason for this is that behind the scenes there is a timer job that performs the processing of the search suggestions you’ve added.

To have the suggestions appear immediately, you need to run this command in PowerShell:

Start-SPTimerJob -Identity "prepare query suggestions"

Architectural Mistakes to Avoid #1 – Interstate Stretched Farm

In discussions with IT Pros at client sites, a few times I have seen them start off designing their farm to handle performance requirements for interstate users (e.g. Brisbane, Sydney, Melbourne) by having the core of the farm in Sydney, and then one web front end in Brisbane and another in Melbourne. Essentially, an architecture that looks like this:

SP Architecture - Unsupported

What’s the challenge here?

The challenge is that technically it won’t be supported by Microsoft, because what has essentially been created here is a stretched farm with a packet latency of > 1ms between the WFEs (W), App Servers (A) and SQL Servers (S). So why isn’t an environment like this supported? Because it will cause performance problems, as all the internal farm servers need to communicate with one another quickly. To get an idea of how significantly performance will be degraded, the typical statistic quoted is a 50% drop per 1ms of delay – ouch!

Also, occasionally I have heard the statement that, yes, it is possible to ping from Sydney to Melbourne in < 1ms. Well, with the help of Physics 101 we can prove that this cannot be the case. Enter Wolfram Alpha to save us some time – let’s check how long it would take for a beam of light to travel from Sydney to Melbourne (just in one direction, not bouncing back again):

WolframAlpha

2.38ms. How about light being sent through fibre? 3.34ms. What does this mean? In the absolute optimal case, it would take at least 3.34ms for data to be sent from Sydney to Melbourne – and in practice longer, because there is of course routing overhead and network congestion. And this is why an interstate stretched farm such as this cannot be supported by Microsoft.
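The back-of-the-envelope numbers above can be reproduced in a few lines of Python. This is a sketch – the ~713 km great-circle distance and the ~1.4 effective refractive index for fibre are my assumptions, chosen to match the quoted figures:

```python
# One-way propagation delay between Sydney and Melbourne,
# ignoring routing overhead and network congestion.
# Assumptions: ~713 km great-circle distance; light travels at
# roughly c/1.4 in optical fibre.

C_VACUUM_KM_S = 299_792.458   # speed of light in a vacuum (km/s)
DISTANCE_KM = 713             # approx. Sydney -> Melbourne great circle
FIBRE_INDEX = 1.4             # effective refractive index of fibre

vacuum_ms = DISTANCE_KM / C_VACUUM_KM_S * 1000
fibre_ms = DISTANCE_KM / (C_VACUUM_KM_S / FIBRE_INDEX) * 1000

print(f"Light in a vacuum: {vacuum_ms:.2f} ms")  # ~2.38 ms
print(f"Light in fibre:    {fibre_ms:.2f} ms")   # ~3.33 ms
```

Either way, the best physically possible one-way delay is several times the 1ms supportability threshold.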

So how do we fix the supportability issue?

To get the farm back into a usable (and supported) state we basically need to drop the idea of the web front ends in Brisbane and Melbourne. Then all requests for users in Brisbane and Melbourne are routed through Sydney.

SP Architecture - Supported

The other solution here, if you really must stretch the farm across data centres (usually for cheap(er) and simple(r) Disaster Recovery), is to ensure that the data centres are in the same city – e.g. Sydney CBD to Mascot. Note that this doesn’t address the original concern though – improving performance for interstate users.

How do we improve the performance for interstate users in a publishing (e.g. intranet / public website) scenario?

If you’re having performance issues where users in Brisbane and Melbourne are performing heavy reads of content and few writes (e.g. in an intranet scenario), then you’ll want to ensure that you are using the SharePoint publishing cache aggressively. This will give you a dramatic performance boost because SharePoint won’t constantly be fetching data out of SQL and then trying to render it. Users will just get a straight HTML dump of pages.

How do we improve the performance for interstate users in a collaboration scenario?

The most popular solution employed here is to use WAN Optimization (WANOp) devices such as those made by Riverbed and Silver Peak. These devices have the ability not only to cache data/content at each branch (i.e. Brisbane and Melbourne), but also to perform compression and de-duplication techniques to minimize the number of bytes actually sent to the client. Note that these capabilities beyond simple caching of the data are required because, in a collaboration scenario, the content is typically changing regularly.

Of course, Windows 7 and Windows 8 client machines also have Microsoft BranchCache built in, which provides similar capabilities to the WANOp devices, though it does have limitations (e.g. it only works with Windows devices). Here are some further details on BranchCache:

  • http://technet.microsoft.com/en-au/network/dd425028.aspx
  • http://www.enterprisenetworkingplanet.com/windows/article.php/3896131/Simplify-Windows-WAN-Optimization-With-BranchCache.htm

Of course, the overall number of servers and their specifications needs to be determined during the SharePoint infrastructure design process (e.g. in the above diagram, for a reasonably sized office it would be wise to add at least one more WFE for performance and high availability). However, hopefully I’ve at least shown you one critical design mistake to avoid.

How To Add a Custom Thesaurus in SharePoint 2013

Bella Engen recently posted a useful 101 walkthrough on how to use a custom thesaurus for SharePoint 2013 Search, which you can find here.

Adding a thesaurus can be incredibly useful to get the most out of SharePoint Search, because not all users search for content with the same keywords or phrases. Furthermore, as your business evolves, you will typically have product, service or business unit names change over time – and having a thesaurus can help here… however, it is important to know that you’ll often need a combination of a thesaurus + query rules, which Steve Mann explains in detail here.

New SharePoint 2013 Search Articles

It’s really great to see the wealth of knowledge regarding SharePoint 2013 Search being captured and shared by various bloggers in the community, and I’m also happy to see a lot of fresh content from Microsoft, which shows just how committed they are to ensuring the success of the product.

Here are a bunch of new Search related posts from Microsoft in the last month or so:

SharePoint 2013 and Office 365 Feature Matrix Spreadsheet

Andrew Connell has recently posted a fantastic feature matrix spreadsheet for SP2013 and Office 365 – based on this Microsoft TechNet article.

It’s much easier to digest because you can use Excel to filter for all of the “Yes” values or all of the “No” values and see very quickly what is or isn’t included in Foundation, Standard and Enterprise versions.

SharePoint 2013 Feature Comparison Matrix Spreadsheet


You can download it from here:
Andrew’s SP2013 / O365 Feature Matrix spreadsheet

You can also see Andrew’s original post here.

Anti-Virus Solutions for SharePoint 2013

Well, it seems that due to SP2013 being released earlier than many vendors expected, at the moment there is only one anti-virus vendor other than Microsoft that supports SharePoint 2013 – ESET. ESET’s product is also only in beta – so it isn’t really ideal for production usage just yet.

ESET Security for SharePoint 2013

Microsoft of course has Forefront Protection for SharePoint 2010; however, the whole Forefront product line has been discontinued, so you can no longer buy it. If you have an Enterprise Agreement and want to get it, perhaps speak with your Microsoft Account Manager – they may be able to help you out, depending on your agreement and when you speak with them. If you do already have it, you’ll be supported until 31st December 2015 and will receive anti-virus definition updates until then. From that point onward, you’ll need to migrate to another product. This was flagged by Spencer Harbar here.

You can also read Spencer’s summary of anti-virus products and their compatibility and supportability with SP2013.

Unsupported Installation Scenarios on SP2013

Understanding the scenarios in which SharePoint is not supported is extremely important when designing SharePoint farms; if you experience any trouble with your environment and need to get Microsoft support involved, they typically won’t be able to help you, and will instead ask you to get your environment into a supported state.

On SP2013 it is important to note that these scenarios are not supported:

  • Installation on a machine that is not joined to a domain (i.e. a machine in a workgroup)
  • Installation on a Virtual Machine (VM) that uses Dynamic Memory
  • Installation on a Domain Controller (only supported in development environments – not production)
  • Installation on Resilient File System (ReFS). ReFS is a new file system introduced with Windows 8 / Windows Server 2012 that is designed to be more resilient to common errors that would cause corruption or availability issues. Only NTFS is supported for SP2013 at the moment.
  • Installation on a Windows Web Server

Here’s the link to the original support article.

SharePoint 2013 vs. FAST Search for SharePoint 2010

Ok, so the key differences between SP2013 and FAST Search for SharePoint 2010 are officially up on TechNet.

In summary, a number of features are gone. Some people may be upset, though overall I think the majority of changes make sense because they simplify the platform. FAST Search for SharePoint (FS4SP) included a number of features that were carried over from FAST ESP (the standalone, pre-Microsoft product) and became redundant when SharePoint was added to the mix.

So, what are the key differences?

  • FAST Search Database Connector: Unsupported. The FS4SP DB connector was carried over when FS4SP was rebuilt from FAST ESP. Even with the release of FS4SP, Microsoft’s recommendation was to use BCS wherever possible (instead of the FAST connectors)… primarily because they would eventually be deprecated. In summary: you should be using BCS to index DB content (or a third-party connector)
  • FAST Search Lotus Notes connector: Unsupported. Use BCS or for enhanced security handling, if you have the budget you’ll want to consider BA-Insight
  • FAST Search Web Crawler: Unsupported. The SP2013 web crawler provides similar capabilities to the FAST Search web crawler.
  • Find Similar Results: Unsupported.  It was hardly used.
  • FQL Operators:
    • ANY: Now has the same effect as OR. Use WORDS instead – e.g. WORDS(TV, Television)
    • RANK: Use XRANK with updated syntax
    • XRANK: Updated syntax
  • Approach for Querying URLs: FS4SP provided the ability to query URLs using these operators: STARTS-WITH, ENDS-WITH and PHRASE. For performance reasons (at query time), this is no longer supported. Instead you must query the full URL or the leading part of the URL – or add managed properties yourself to search any other part of the URL
  • Search Scope Filters: FS4SP Scopes need to be converted to SP2013 result sources
  • Anti-Phrasing: Unsupported. FS4SP had the ability to filter out common words/phrases – e.g. “how can I”, “what is”, “who is”. SP2013 includes these phrases in search queries though. The workaround here is either to train your users not to enter redundant phrases (e.g. “how can I”), or to extend the search query web part to filter out the phrases before the query is submitted to SharePoint
  • Offensive Content Filtering: Unsupported.  This was not built into SP2013 given the limited usage it had on FS4SP.
  • Substring Search: Unsupported. This was not used much – primarily in situations where recall (the overall number of documents retrieved) was more important than precision (a high degree of relevance) – so this is not a big deal for most companies. Turning on substring search also had the drawback of bloating the search index.
  • Person and Location Entity Extraction: You need to use your own custom dictionaries. Typically each business is different and has its own people and locations it cares about. On FS4SP there was actually a lot of tuning necessary to get this working properly, because you would get overlap between people and location names.
  • Number of Custom Entity Extractors: This is now limited to 12, and for many businesses this won’t be an issue, primarily because on a given Search Centre, for performance and screen real-estate reasons, you will generally want to keep the number of search filters to an absolute maximum of 6 to 10. You really only want to include the filters that the majority of your end users will actually use, and not bloat the UI with ones that will benefit 1 user in 1,000. The limitation of 12 could be an issue for organisations that are well advanced in their FAST implementation and are using Search for multiple applications that depend on entity extraction.
  • Document Formats: FS4SP supported several hundred file types after enabling the Advanced Filter Pack. However, many of these were legacy file types, and investment has not been made in building them into SP2013. If you have a file type that is not supported, your best option is to look for a third-party iFilter/connector – e.g. from iFilterShop or BA Insight
  • Pipeline Extensibility: This feature allows you to perform dynamic calculations or manipulations of document metadata before it is indexed. On FS4SP you used to write your code, build it as an exe file and then put it on each FS4SP server that had a Document Processor role. Now on SP2013 the approach is to use web service calls. I haven’t tried this yet on SP2013, though with FS4SP there was a fairly fundamental performance problem with pipeline extensibility: your extension (the exe) was opened and closed for EVERY single document going through the index. In some cases, due to extending the pipeline, I’d see crawl performance drop from 30–40 docs/sec to 10 docs/sec or less! Due to that performance impact, on FS4SP it’s absolutely critical that the exe you write is as optimised as you can make it. I’m looking forward to testing this out on SP2013.
  • Custom XML Processing: Unsupported. This was another feature baked into FAST ESP that then made its way into FS4SP. It provided a way to manipulate XML files as they were going through the index – though it generally wasn’t super easy to configure. The approach now is to call out to a web service that will process the XML for you
  • Docpush: Docpush was used to add (mostly) test documents to the index from the command line. This was built into the original FAST ESP product and made its way into FS4SP – though it isn’t really needed now. If you just need to do a quick “is search at least partially working” test on SharePoint 2013 and you don’t have proper content sources to hook up, you can still do as you would on SP2010 or 2007 – just upload documents to a document library on a test site and run a crawl. Pretty straightforward.
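The anti-phrasing workaround mentioned above (filtering redundant phrases out of the query before it is submitted) could be sketched as follows. The phrase list and function are purely illustrative – in a real implementation this logic would live in your customised search query web part:

```python
# Sketch of client-side "anti-phrasing": strip common redundant
# lead-in phrases from a user's query before submitting it to
# SharePoint 2013 Search. The phrase list is illustrative only.

REDUNDANT_PHRASES = ["how can i", "how do i", "what is", "who is"]

def strip_redundant_phrases(query: str) -> str:
    """Remove one leading redundant phrase, if present."""
    cleaned = query.strip()
    lowered = cleaned.lower()
    for phrase in REDUNDANT_PHRASES:
        if lowered.startswith(phrase + " "):
            cleaned = cleaned[len(phrase):].strip()
            break
    return cleaned

print(strip_redundant_phrases("how can I submit a leave request"))
# -> "submit a leave request"
```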

Improved: Developer Dashboard in SP2013

The developer dashboard was a great new feature when SP2010 was released, and now with SP2013 it has been further improved, with the key changes being:

  • The Dashboard now appears in a separate window – it isn’t rendered out at the bottom of the page you are diagnosing any more.
  • Cumulative page requests are shown, instead of just the last request.  This certainly helps in SP2013 given the use of Minimal Download Strategy (MDS).
  • The execution plan of SharePoint stored procedures can now be viewed.
  • The dashboard is implemented as a separate WCF service named diagnosticsdata.svc
  • ULS logs can be viewed through the browser (rather than opening the logs in ULSViewer or WordPad) – but this should still be done from the server for security reasons
  • The Dashboard can now be extended with Javascript 🙂