Search code examples
asp.netsql-server-2008searchfull-text-searchsqldatatypes

How to save documents like PDF,Docx,xls in sql server 2008


I develop a web application that let users to upload files like images and documents. this file divided into two parts :

  1. binary files
  2. document files

I want to allow users to search documents that uploaded. specialy using full text search. What data types I should use for these two file types?


Solution

  • You can store the data in binary and use full text search to interpret the binary data and extract the textual information: .doc, .txt, .xls, .ppt, .htm. The extracted text is indexed and becomes available for querying (make sure you use the CONTAINS keyword). Needless to say, full text search has to be enabled.Not sure how adding a full text index will affect your system - i.e., its size. You'll also need to look at the execution plan to ensure the index gets used at query time.

    For more information look at this:

    http://technet.microsoft.com/en-us/library/ms142499(SQL.90).aspx

    Pros: The main advantage of storing data in the database is that it makes the data "self-contained". Since all of the data is contained within the database, backing up the data, moving the data from one database server to another, replicating the database, and so on, is much easier.

    also you can enable versioning of files and also make it easier for load balanced web farms.

    Cons: you can read it here: https://dba.stackexchange.com/questions/3924/sql-server-2005-large-binary-storage. But this is something that you have to do in order to search through the files efficiently.

    Or the other thing that I could suggest is probably storing keywords in the database and then linking the same to file in the fileshare.

    Here is an article discussing abt using a FileStream and a database: http://blogs.msdn.com/b/manisblog/archive/2007/10/21/filestream-data-type-sql-server-2008.aspx