ballardchalmers - home

Contact us 01342 410223

Blog

FILESTREAM and Partitioning

There is a requirement for a large number of large PDF documents to be managed by a SQL Server database without the contents of the documents actually being stored in the database, for size reasons. FILESTREAM meets these requirements.

There is an additional requirement to store documents of different types in different folders so as to be able to take advantage of multiple disk drives. Partitioning allows a single table to be allocated to multiple file groups according to values in certain columns and so again this meets the requirements.

Combining FILESTREAM and partitioning, however, turned out not to be as simple as expected.

I started by enabling FILESTREAM on SQL Server using:

sp_configure 'filestream access level', 2

RECONFIGURE

I then created a database with two FILESTREAM file groups using:

 CREATE DATABASE [FileStreamExample] ON  PRIMARY

( NAME = N'FileStreamTest', FILENAME =N'c:\FileStreamExample\FileStreamExample.mdf'),

FILEGROUP [FStream1] CONTAINS FILESTREAM 

( NAME = N'FStream1', FILENAME = N'c:\FileStreamExample\FStream1'),

 FILEGROUP [FStream2] CONTAINS FILESTREAM

( NAME = N'FStream2', FILENAME = N'c:\FileStreamExample\FStream2')

 LOG ON

( NAME = N'FileStreamTest_log', FILENAME =N'c:\FileStreamExample\FileStreamExample_log.ldf')

Note that each FILESTREAM file group can only one have one folder associated with it, hence the need for multiple file groups.

I then proceeded to setup partitioning by creating a partitioning function. The document type will be either 1 or 2 and each one of these should be assigned to a different file group:

CREATE PARTITION FUNCTION FSPartFunc ( int )

AS RANGE LEFT

FOR VALUES (1)

Next I assigned created two partitioning schemes, one for the FILESTREAM content and one for the other columns in the table. I need both because the table and the FILESTREAM need to follow the same partitioning scheme:

CREATE PARTITION SCHEME FSPartScheme

AS PARTITION FSPartFunc

TO (FStream1, FStream2)

 

CREATE PARTITION SCHEME PartScheme

AS PARTITION FSPartFunc

TO ([PRIMARY], [PRIMARY])

Then I created the table:

CREATE TABLE FSPartExampleTable (

      ID uniqueidentifier not null ROWGUIDCOL,

      DataType int NOT NULL,

      TheData varbinary(max) filestream NOT NULL)

ON PartScheme(DataType)

FILESTREAM_ON FSPartScheme

But I got error: A table with FILESTREAM column(s) must have a non-NULL unique ROWGUID column.

So I added a PRIMARY KEY on the ID column as below:

CREATE TABLE FSPartExampleTable (

      ID uniqueidentifier not null ROWGUIDCOL PRIMARY KEY,

      DataType int NOT NULL,

      TheData varbinary(max) filestream NOT NULL)

ON PartScheme(DataType)

FILESTREAM_ON FSPartScheme

But then I got the error: Partition columns, DataType in this case, for a unique index must be a subset of the index key. I tried again with a UNIQUE constraint and got the same error.

It seemed like there were two conflicting requirements here: Firstly the ID column must be unique and secondly the unique column must contain the partitioning column. So I tried to create a primary key based on both ID and DataType:

CREATE TABLE FSPartExampleTable (

      ID uniqueidentifier not null ROWGUIDCOL,

      DataType int NOT NULL,

      TheData varbinary(max) filestream NOT NULL,

PRIMARY KEY (ID, DataType ))

ON PartScheme(DataType)

FILESTREAM_ON FSPartScheme

But then I was back to the first error: But I got error: A table with FILESTREAM column(s) must have a non-NULL unique ROWGUID column.

So FILESTREAM is pretty insistent on having a unique index on the ID column, whereas partitioning wants to have a composite index if the index is to be partitioned. That’s when I realised what the issue was, the unique indexes are being associated with the partitioning scheme of the table and actually I do not need this to be the case. So I changed my primary key to be non-clustered, so it is separate from the table, and associated it with a different file group, [PRIMARY] in this example, and the issue was solved:
CREATE TABLE FSPartExampleTable (

      ID uniqueidentifier not null ROWGUIDCOL

            PRIMARY KEY NONCLUSTERED ON [PRIMARY],

      DataType int NOT NULL,

      TheData varbinary(max) filestream NOT NULL)

ON PartScheme(DataType)

FILESTREAM_ON FSPartScheme

 

Now the FILESTREAM data for DataType 1 and 2 are stored in different folders as required:

INSERT INTO FSPartExampleTable

VALUES (NEWID(), 1, CAST ('Part 1 ...data' AS varbinary(max)))

INSERT INTO FSPartExampleTable

VALUES (NEWID(), 2, CAST ('Part 2 ...data' AS varbinary(max)))

By Geoff Ballard, Chief Technical Officer

Have a comment? Let us know below, or send an email to enquiries@ballardchalmers.com

Comment on this article

About

Ballard Chalmers is one of the UK’s leading enterprise software engineering companies, specialising in development for the Microsoft Enterprise Application Platform. We are principally dedicated to software development and integration for the Microsoft Cloud, as well as on-premises and hybrid systems. Our expertise is in Microsoft .NET, SQL Server, BizTalk Server and SharePoint Server.

Let's talk

  01342 410223

Find out more

Thank you for contacting us.

We will be in touch with you shortly.