Friday, May 8, 2009

SQL Server 2008 Product Overview


Microsoft Data Platform Vision
A variety of factors are converging to create an information storage explosion. Enabled by new types of information, such as digitization of images and video, and sensor information from RFID tags, the amount of digital information within an organization is mushrooming. Growing regulatory compliance and globalization require that information be stored securely and available at all times. At the same time, the cost of disk storage has dramatically decreased, enabling organizations to store more data per dollar invested. Users must quickly sift through mountains of data to find relevant information. Furthermore, they want to use this information on any device and with the programs that they use every day, such as Microsoft Office System applications. Managing this data explosion and the increase in user expectations creates numerous challenges for the enterprise.

The Microsoft data platform vision meets these needs by providing a solution that organizations can use to store and manage many types of data, including XML, e‑mail, time/calendar, file, document, geospatial, and so on, while providing a rich set of services to interact with the data: search, query, data analysis, reporting, data integration, and robust synchronization. Users can access information from creation to archiving on any device, from a server to a desktop or mobile device.

What’s New in SQL Server 2008

SQL Server 2008 delivers on the Microsoft data platform vision by enabling organizations to run their most mission-critical applications while lowering the cost of managing the data infrastructure and delivering insights and information to all users. This platform has the following qualities:

  • Trusted—Enables organizations to run their most critical applications with very high levels of security, reliability, and scalability.
  • Productive—Enables organizations to reduce the time and cost required to develop and manage their data infrastructure.
  • Intelligent—Provides a comprehensive platform that delivers insights and information where your users want it.

Trusted
In today’s data-driven world, organizations need continuous access to their data. SQL Server 2008 provides robust security features, reliability, and scalability for mission-critical applications.

Protect Your Information
Building on the proven strengths of SQL Server 2005, SQL Server 2008 extends security capabilities with the following enhancements:

1. Transparent Data Encryption

SQL Server 2008 enables the encryption of entire databases, data files, and log files, without the need for application changes. Encryption enables organizations to meet the demands of regulatory compliance and overall concern for data privacy. Some of the benefits of transparent data encryption include securing data from unauthorized users and encrypting backups. These can be enabled without changing existing applications.

2. External Key Management

SQL Server 2008 provides a comprehensive solution for encryption and key management. To meet the growing need for greater security of information within data centers, organizations have invested in vendors to manage security keys within the enterprise. SQL Server 2008 provides excellent support for this need by supporting third-party key management and hardware security module (HSM) products.

3. Enhanced Auditing

SQL Server 2008 improves compliance and security by enabling you to audit activity on your data. Auditing can include information about when data has been read, in addition to any data modifications. SQL Server 2008 has features such as enhanced configuration and management of audits in the server, which enable organizations to meet varied compliance needs. SQL Server 2008 can also define audit specifications in each database, so audit configuration can be ported with databases. Filtering of audits to specific objects allows better performance in audit generation and flexibility in configuration.

Ensure Business Continuity

With SQL Server 2008, Microsoft continues to give organizations the ability to provide highly reliable applications with simplified management.

1. Enhanced Database Mirroring

SQL Server 2008 builds on SQL Server 2005 by providing a more reliable platform that has enhanced database mirroring. New features include:

  • Automatic page repairSQL Server 2008 enables the principal and mirror computers to transparently recover from 823 and 824 errors on data pages by requesting a fresh copy of the corrupted page from the mirroring partner.
  • Improved performanceSQL Server 2008 compresses the outgoing log stream in order to minimize the network bandwidth required by database mirroring.
  • Enhanced supportability
    - SQL Server 2008 includes additional performance counters to enable more granular accounting of the time spent across the different stages of Database Management System (DBMS) log processing.
    - SQL Server 2008 includes new Dynamic Management Views and extensions of existing views to expose additional information about mirroring sessions.

2. Hot Add CPU

Extending existing support in SQL Server for adding memory resources online, Hot Add CPU allows a database to be scaled on demand. In fact, CPU resources can be added to SQL Server 2008 on supported hardware platforms without requiring application downtime.

Optimized and Predictable System Performance

Organizations are faced with growing pressure to provide predictable response and to manage increasing volumes of data for growing numbers of users. SQL Server 2008 provides a comprehensive set of features to provide scalable and predictable performance for any workload on your data platform.

Resource Governor

SQL Server 2008 enables organizations to provide a consistent and predictable response to end users with the introduction of Resource Governor. Resource Governor enables database administrators to define resource limits and priorities for different workloads, which enables concurrent workloads to provide consistent performance to end users.

Plan Freezing

SQL Server 2008 enables greater query performance stability and predictability by providing new functionality to lock down query plans, enabling organizations to promote stable query plans across hardware server replacements, server upgrades, and production deployments.

Data Compression

Improved data compression enables data to be stored more effectively and reduces the storage requirements for your data. Data compression also provides significant performance improvements for large input/output-bound workloads such as data warehousing.

Backup Compression

Keeping disk-based backups online is expensive and time consuming. With SQL Server 2008 backup compression, less disk I/O is required, less storage is required to keep backups online, and backups run significantly faster.

Performance Data Collection

Performance tuning and troubleshooting are time-consuming tasks for the administrator. To provide actionable performance insights to administrators, SQL Server 2008 delivers more extensive performance data collection, a new centralized data repository for storing performance data, and new reporting and monitoring tools.

Extended Events

SQL Server Extended Events is a general event-handling system for server systems. The Extended Events infrastructure is a lightweight mechanism that supports capturing, filtering, and acting upon events generated by the server process. The ability to act upon events allows users to quickly diagnose run time problems by adding contextual data, such as Transact-SQL call stacks or query plan handles, to any event. Events can be captured into several different output types, including Event Tracing for Windows (ETW). When Extended Events are output to ETW, correlation with operating system and database applications is possible, allowing for more holistic system tracing.

Productive

SQL Server 2008 reduces the time and cost of managing systems and, along with the Microsoft .NET Framework and Visual Studio® Team System, enables developers to build powerful, next-generation database applications.

Policy-Based Management

As part of an ongoing effort by Microsoft to reduce the total cost of ownership, SQL Server 2008 introduces the Policy-Based Framework, a new management framework for the SQL Server Database Engine. Policy-Based Management delivers the following benefits:

  • Compliance with policies for system configuration
  • Monitors and prevents changes to the system by authoring policies against the configuration
  • Reduces total cost of ownership by simplifying administration tasks
  • Detects compliance issues in SQL Server Management Studio

The Policy-Based Framework is a system for managing one or more instances of SQL Server 2008. To use this framework, SQL Server policy administrators use SQL Server Management Studio to create policies that manage entities on the server, such as the instance of SQL Server, databases, and other SQL Server objects. The Policy-Based Framework consists of three components: policy management, policy administrators who create policies, and explicit administration. Administrators select one or more managed targets and explicitly check that the targets comply with a specific policy, or explicitly force the targets to comply with a policy.

Automated Administration

Policy administrators enable automated policy execution by using one of the following execution modes:

  • Enforce – Uses DDL triggers to prevent policy violations
  • Check on Changes – Uses event notification to evaluate a policy when a relevant change occurs
  • Check on Schedule – Uses a SQL Server Agent job to periodically evaluate a policy

Streamlined Installation

SQL Server 2008 introduces significant improvements to the service life cycle for SQL Server through its re-engineered installation, setup, and configuration architecture. These improvements separate the installation of the bits on the computer from the configuration of the SQL Server software, which enables organizations and software partners to provide recommended installation configurations.

Accelerated Development

SQL Server enables developers to create the next generation of data applications with an integrated development environment and a higher level of data abstraction, while simplifying access to data.

ADO.NET Entity Framework

A trend among database developers is to define high-level business objects, or entities, that they then map to the tables and columns stored in a database. Rather than programming against tables and columns in a database, developers use high-level entities such as ‘Customer’ or ‘Order’ to represent the underlying data. The ADO.NET Entity Framework enables developers to program against relational data in terms of such entities. Programming at this level of abstraction is highly productive and allows developers to take full advantage of entity-relationship modeling.

Language Integrated Query

Microsoft Language Integrated Query (LINQ) enables developers to issue queries against data by using a managed programming language such as Microsoft Visual C#® or Microsoft Visual Basic® NET, instead of SQL statements. LINQ enables seamless, strongly typed, set-oriented queries written in .NET Framework languages to run against ADO.NET (LINQ to SQL), ADO.NET DataSets (LINQ to DataSets), the ADO.NET Entity Framework (LINQ to Entities), and to the Entity Data Service Mapping Provider. SQL Server 2008 features a new LINQ to SQL Provider that enables developers to use LINQ directly on SQL Server 2008 tables and columns.

CLR Integration and ADO.NET Object Services

The object services layer of ADO.NET enables the materialization, change tracking, and persistence of data as Common Language Runtime (CLR) objects. Developers using the ADO.NET Entity Framework can program against a database by using CLR objects that are managed by ADO.NET. SQL Server 2008 introduces more efficient, optimized support that improves performance and simplifies development.

Service Broker Scalability

SQL Server 2008 continues to enhance the capabilities of Service Broker.

  • Conversation Priority – Enables you to configure priorities so that the most important data is sent and processed first.
  • Diagnostic tool – The diagnostic tool improves your ability to develop, configure, and manage solutions that use Service Broker, such as diagnosing missing route or incorrectly configured security issues prior to application deployment.

Transact-SQL Improvements

SQL Server 2008 enhances the development experience for Transact-SQL programmers with several key improvements.

  • Table Value Parameters - In many customer scenarios, it is necessary to pass a set of table-structured values (rows) to a stored procedure or function on the server. These values may be used for directly populating or updating a table or for more complex manipulation of data. Table valued parameters provide an easier way to define a table type and to allow applications to create, populate, and pass table-structured parameters to stored procedures and functions.
  • Object Dependencies - The object dependencies improvement provides reliable discovery of dependencies among objects through newly introduced catalog view and dynamic management functions. Dependency information is always up-to-date for both schema-bound and non-schema-bound objects. Dependencies are tracked for stored procedures, tables, views, functions, triggers, user-defined types, XML schema collections, and more.
  • DATE/TIME Data Types - SQL Server 2008 introduces new date and time data types:
    - DATE – a date only type
    - TIME – a time only type
    - DATETIMEOFFSET – a time zone-aware date/time type
    - DATETIME2 – a date/time type with larger fractional seconds and year range than the existing DATETIME type

The new data types enable applications to have separate date and time types, while providing large date ranges or user-defined precision for time values.

Occasionally Connected Systems

With mobile devices and on-the-go workers, occasionally connected has become a way of life. SQL Server 2008 delivers a unified synchronization platform that enables consistent synchronization across applications, data stores, and data types. In a joint effort with Visual Studio, SQL Server 2008 enables the rapid creation of occasionally connected applications by way of new synchronization services in ADO.NET and offline designers in Visual Studio. SQL Server 2008 provides support for change tracking, enabling customers to develop caching-based, synchronization-based, and notification-based applications using a robust implementation with minimal performance overhead.

Beyond Relational Data

Increasingly, applications are incorporating a much wider variety of data types than has been traditionally supported by a database. SQL Server 2008 builds on a strong legacy of supporting non-relational data by providing new data types that enable developers and administrators to efficiently store and manage unstructured data such as documents and images. Support for managing advanced geospatial data has also been added. In addition to new data types, SQL Server 2008 provides a rich set of services on the different data types while providing the reliability, security, and manageability of the data platform. The next section of this white paper covers some of the advances in non-relational data storage.

FILESTREAM Data

The new SQL Server 2008 FILESTREAM data type allows large binary data like documents and images to be stored directly in an NTFS file system; the document or image remains an integral part of the database and maintains transactional consistency. FILESTREAM enables the storage of large binary data, traditionally managed by the database, to be stored outside the database as individual files that can be accessed using an NTFS streaming API. Using NTFS streaming APIs allows efficient performance of common file operations while providing all of the rich database services, including security and backup.

Geographical Information

SQL Server 2008 provides comprehensive spatial support to consume, extend, and use location information in spatially enabled applications.

  • GEOGRAPHY data type
    This feature enables you to store planar spatial data that conforms to industry spatial standards like Open Geospatial Consortium (OGC). This enables developers to implement “Flat Earth” solutions by storing polygons, points, and lines that are associated with projected planar surfaces, as well as naturally planar data such as interior spaces.
  • GEOMETRY data type
    This feature enables you to store geodetic spatial data and perform operations on it. Use latitude and longitude coordinates to define areas on the earth’s surface and associate geographical data with industry-standard ellipsoid such as WGS84, which is used in GPS solutions worldwide.

HIERARCHY ID

SQL Server 2008 enables database applications to model tree structures in a more efficient way than previously possible. HierarchyId is a new system type that can store values that represent nodes in a hierarchy tree. This new type features a flexible programming model. It is implemented as a CLR user-defined type (UDT) that exposes several efficient and useful built-in methods for creating and operating on hierarchy nodes.

Integrated Full-Text Search

Integrated Full-Text Search makes the transition between Full-Text Search and relational data seamless, while enabling the use of full-text indexes to perform high-speed text searches on large text columns.

Sparse Columns

This feature provides a highly efficient way of managing empty data in a database by enabling NULL data to consume no physical space. For example, sparse columns allow object models that typically contain numerous null values to be stored in a SQL Server 2008 database without experiencing large space costs. Sparse Columns also enable administrators to create tables with more than 1,024 columns.

Large User-Defined Types

SQL Server 2008 eliminates the 8,000-byte limit for user-defined types, enabling users to dramatically expand the size of their UDTs.


Intelligent

Business Intelligence (BI) continues to be a key area of investment for most organizations and an invaluable source of information for users at all levels of the organization. SQL Server 2008 provides a comprehensive platform for delivering intelligence where your users want it.

Integrate Any Data

Organizations continue to invest in BI and data warehousing solutions in order to derive business value from their data. SQL Server 2008 provides a comprehensive and scalable data warehouse platform that enables powerful analyses with a single analytical store that meets the needs of thousands of users across terabytes of data. Following are some of the advances in SQL Server 2008 in data warehousing.

Data Compression

Data volumes in data warehouses continue to grow with the addition of myriad operational systems. Data compression built into SQL Server 2008 enables organizations to store data more efficiently, while also improving performance due to lower I/O demands.

Backup Compression

Keeping disk-based backups online is expensive and time consuming. With SQL Server 2008 backup compression, less storage is required to keep backups online, and backups run significantly faster since less disk I/O is required.

Partitioned Table Parallelism

Partitions enable organizations to manage large, growing tables more effectively by transparently breaking them into manageable blocks of data. SQL Server 2008 builds on the advances on partitioning in SQL Server 2005 by improving performance on large, partitioned tables.

Star Join Query Optimizations

SQL Server 2008 provides improved query performance for common data warehouse scenarios. Star join query optimizations reduce query response time by recognizing data warehouse join patterns.

Resource Governor

SQL Server 2008 enables organizations to provide a consistent and predictable response to end users with the introduction of Resource Governor. Resource Governor allows organizations to define resource limits and priorities for different workloads, which enables concurrent workloads to provide consistent performance.

GROUPING SETS

GROUPING SETS is an extension to the GROUP BY clause that enables users to define multiple groupings in the same query. GROUPING SETS produce a single result set that is equivalent to a UNION ALL of differently grouped rows, making aggregation querying and reporting easier and faster.

Change Data Capture

With change data capture, changes are captured and placed in change tables. It captures the complete content of changes, maintains cross-table consistency, and even works across schema changes. This enables organizations to integrate the latest information into the data warehouse.

MERGE SQL Statement

With the introduction of the MERGE SQL statement, developers can more effectively handle common data warehousing scenarios like checking whether a row exists and then executing an insert or update.

Scalable Integration Services

The key advances in scalability of Integration Services are:

  • SQL Server Integration Services (SSIS) pipeline improvements
    Data Integration packages can now scale more effectively, making use of available resources and managing the largest enterprise-scale workloads. The new design improves the scalability of run time into multiple processors.
  • SSIS persistent lookups
    The need to perform lookups is one of the most common extraction, transformation, and loading (ETL) operations. This is especially prevalent in data warehousing, where fact records must use lookups to transform business keys to their corresponding surrogates. SSIS increases the performance of lookups to support the largest tables.
  • SSIS high-performance connectors
    SSIS has new connectors for SAP Netweaver BI, Oracle and Teradata, especially designed for high-performance loading of data into an enterprise data warehouse.
  • Data profiling
    The new data profiling tool in SSIS enables users to analyze source data for a variety of properties such as data type, lengths, histograms of data values, and the strength of integrity relationships. This helps improve the quality of data going into a data warehouse.

Deliver Relevant Reports

Addressing the strong momentum in the business intelligence market, SQL Server 2008 provides a scalable BI infrastructure that enables IT to drive business intelligence throughout the organization by managing reports and analyses of any size or complexity. SQL Server 2008 enables organizations to efficiently deliver relevant, personalized reports to thousands of users in the format and the location that makes the most sense for them. By enabling the delivery of interactive, enterprise reports where users want them, the number of users who can be reached with Reporting Services is greatly expanded. This enables users to gain instant access to relevant information that provides insight into what is going on within their respective business areas, leading to in better, faster, and more relevant decisions. SQL Server 2008 empowers all audiences though authoring, management, and consumption of reports with the following reporting enhancements:

1. Enterprise Reporting Engine

Reports can easily be delivered throughout the organization with simplified deployment and configuration. This enables users to easily create and share reports of any size and complexity.

2. New Report Designer

An enhanced Report Designer is optimized for creating extensive reports, enabling organizations to accommodate all reporting needs. Unique layout capabilities allow the design of reports of any structure, while new visualization enhancements further enrich the user experience.

3. Report Builder

Easily build reports with any structure through a redesigned Report Builder end-user tool that has the familiar look and feel of Microsoft Office 2007, including a “ribbon” interface and the ability to integrate powerful visualizations into reports.

4. Powerful Visualization

SQL Server 2008 extends the visual components available within reports. Visualization tools such as charts and gauges make reports more accessible and understandable.

5. Microsoft Office Rendering

SQL Server 2008 provides new Microsoft Office rendering that enables users to access reports directly from within Word. In addition, the existing Excel® renderer has been greatly enhanced to support features such as nested data regions, sub-reports, and merged cells. This lets users maintain layout fidelity and improves the overall usability of reports created in Microsoft Office applications.

6. Microsoft SharePoint Integration

SQL Server 2008 Reporting Services offers deep integration with Microsoft Office SharePoint® Server 2007 and Microsoft SharePoint Services, providing central delivery and management of enterprise reports and other business insights. This empowers users to access reports that combine structured as well as unstructured information relevant to their decision making directly in their business portals.

Drive Actionable Insight

Instant access to accurate information that enables users to quickly answer even the most complex questions is the premise of Online Analytical Processing (OLAP). SQL Server 2008 builds on the strong OLAP capabilities of SQL Server 2005 by delivering faster query times for all users. This performance boost enables organizations to perform highly complex analyses with a large number of dimensions and aggregations. This speed, combined with deep integration with Microsoft Office, allows SQL Server 2008 Analysis Services to enable every user to derive actionable insights. SQL Server Analysis Services provides the following analytical advances:

1. Design to Scale

SQL Server 2008 drives broader analysis with enhanced analytical capabilities and with more complex computations and aggregations. New cube design tools help users streamline the development of the analysis infrastructure, enabling them to build solutions for optimized performance. Best Practice Design Alerts have been built into the design, enabling developers to integrate real-time alerts at design time, which optimizes design. The Dimension Designer enables easy viewing and editing of attribute relationships and also provides multiple built-in validations for support of ideal dimension design.

2. Block Computations

Block computations provide a significant improvement in processing performance, enabling users to increase the depth of their hierarchies and the complexity of the computations.

3. Write-back to MOLAP

New MOLAP-enabled write-back capabilities in SQL Server 2008 Analysis Services remove the need to query ROLAP partitions. This provides users with enhanced writeback scenarios from within analytical applications, without sacrificing OLAP performance.

4. Resource Monitor

A new Resource Monitor in SQL Server 2008 provides detailed insight into resource utilization. With this Resource Monitor, the database administrator can quickly and effortless monitor and control the analysis workload, including identifying which users are running which queries and how long they run, enabling the administrator to better optimize server utilization.

5. Predictive Analysis

SQL Server Analysis Services continues to deliver advanced data mining technologies. Better Time Series support extends forecasting capabilities. Enhanced mining structures deliver more flexibility to perform focused analysis through filtering as well as to deliver complete information in reports beyond the scope of the mining model. New cross-validation enables confirmation of both accuracy and stability for results that you can trust. Furthermore, the new features delivered with SQL Server 2008 Data Mining Add-ins for the 2007 Office System empower every user in the organization with even more actionable insight at the desktop.

Conclusion

SQL Server 2008 provides the technology and capabilities that organizations count on to manage the growing challenges of managing data and delivering actionable insights to users. With significant advancements in key areas, SQL Server 2008 is a trusted, productive, intelligent data platform. SQL Server 2008 is an integral part of the Microsoft Data Platform vision that is designed to meet the needs of managing and working with data today and beyond.
SQL Server 2008 is a significant product release that delivers many new features and key improvements, making it the most robust and comprehensive release of SQL Server to date.
This white paper provides only an overview of the many new features, benefits, and functionality in SQL Server 2008.

Thursday, May 7, 2009

Codd’s 12 Rules for an RDBMS

Although most of us think that any database that supports SQL is automatically considered a relational database, this isn’t always the case—at least not completely. In Chapter 1, I discussed the basics and foundations of relational theory, but no discussion on this subject would be complete without looking at the rules that were formulated in 1985 in a two-part article published by Computerworld magazine (“Is Your DBMS Really Relational?” and “Does Your DBMS Run By the Rules?” by E. F. Codd, Computerworld, October 14 and October 21, 1985). Many websites also outline these rules. These rules go beyond relational theory and defines more specific criteria that need to be met in an RDBMS, if it’s to be truly relational.

It might seem like old news, but the same criteria can still be used today to measure how relational a database is. These rules are frequently brought up in conversations when discussing how well a particular database server is implemented. I present the rules in this appendix, along with brief comments as to whether SQL Server 2008 meets each of them, or otherwise. Relational theory has come a long way since these rules were first published, and “serious” theorists have enhanced and refined relational theory tremendously since then, as you’d expect. A good place for more serious learning is the website http://www.dbdebunk.com/, run by C. J. Date and Fabian Pascal, or any of their books. If you want to see the debates on theory, the newsgroup comp.databases.theory is a truly interesting place. Of course, as the cover of this book states, my goal is practicality, with a foundation on theory, so I won’t delve too deeply into theory here at all. I present these 12 rules simply to set a basis for what a relational database started out to be and largely what it is and what it should be even today.

All these rules are based upon what’s sometimes referred to as the foundation principle, which states that for any system to be called a relational database management system, the relational capabilities must be able to manage it completely.

For the rest of this appendix, I’ll treat SQL Server 2008 specifically as a relational database engine, not in any of the other configurations in which it might be used, such as a plain data store, a document storage device, or whatever other way you might use SQL Server as a storage engine.


Rule 1: The Information Rule


"All information in the relational database is represented in exactly one and only one way—by values in tables."


This rule is an informal definition of a relational database and indicates that every piece of data that we permanently store in a database is located in a table.


In general, SQL Server fulfills this rule, because we cannot store any information in anything other than a table. We can’t use the variables in this code to persist any data, and therefore they’re scoped to a single batch.


Rule 2: Guaranteed Access Rule


"Each and every datum (atomic value) is guaranteed to be logically accessible by resorting to acombination of table name, primary key value, and column name."


This rule stresses the importance of primary keys for locating data in the database. The table namelocates the correct table, the column name finds the correct column, and the primary key valuefinds the row containing an individual data item of interest. In other words, each (atomic) piece ofdata is accessible by the combination of table name, primary key value, and column name. Thisrule exactly specifies how we access data using an access language such as Transact-SQL (T-SQL)in SQL Server.


Using SQL, we can search for the primary key value (which is guaranteed to be unique, basedon relational theory, as long as it has been defined), and once we have the row, the data is accessedvia the column name. We can also access data by any of the columns in the table, though we aren’talways guaranteed to receive a single row back.


Rule 3: Systematic Treatment of NULL Values


"NULL values (distinct from empty character string or a string of blank characters and distinctfrom zero or any other number) are supported in the fully relational RDBMS for representingmissing information in a systematic way, independent of data type."


This rule requires that the RDBMS support a distinct NULL placeholder, regardless of datatype. NULLsare distinct from an empty character string or any other number, and they are always to be consideredas unknown values.


NULLs must propagate through mathematic operations as well as string operations. NULL + = NULL, the logic being that NULL means “unknown.” If you add something known tosomething unknown, you still don’t know what you have, so it’s still unknown.


There are a few settings in SQL Server that can customize how NULLs are treated. Most of thesesettings exist because of some poor practices that were allowed in early versions of SQL Server:
  • ANSI_NULLS: Determines how NULL comparisons are handled. When OFF, then NULL = NULL isTrue for the comparison, and when ON (the default), NULL = NULL returns UNKNOWN.
  • CONCAT_NULL_YIELDS_NULL: When the CONCAT_NULL_YIELDS_NULL setting is set ON, NULLs aretreated properly, such that NULL + 'String Value' = NULL. If the CONCAT_NULL_YIELDS_NULLsetting is OFF, which is allowed for backward compatibility with SQL Server, NULLs are treatedin a nonstandard way such that NULL + 'String Value' = 'String Value'.
Rule 4: Dynamic Online Catalog Based on the Relational Model


"The database description is represented at the logical level in the same way as ordinary data,so authorized users can apply the same relational language to its interrogation as they applyto regular data."


This rule requires that a relational database be self-describing. In other words, the database mustcontain certain system tables whose columns describe the structure of the database itself, or alternatively,the database description is contained in user-accessible tables.


This rule is becoming more of a reality in each new version of SQL Server, as with the implementationof the INFORMATION_SCHEMA system views. The INFORMATION_SCHEMA is a schema that has aset of views to look at much of the metadata for the tables, the relationships, the constraints, andeven the code in the database.


Anything else you need to know can most likely be viewed in the system views (in the SYS schema).They’re the system views that replaced the system tables in 2005 that we had used since the beginningof SQL Server time. These views are far easier to read and use, and most all the data is self-explanatory,rather than requiring bitwise operations on some columns to find the value.

Rule 5: Comprehensive Data Sublanguage Rule


"A relational system may support several languages and various modes of terminal use. However,there must be at least one language whose statements are expressible, per somewell-defined syntax, as character strings and whose ability to support all of the following iscomprehensible:

a. data definition

b. view definition

c. data manipulation (interactive andby program)

d. integrity constraints

e. authorization

f. transaction boundaries (begin, commit,and rollback)."


This rule mandates the existence of a relational database language, such as SQL, to manipulatedata. SQL as such isn’t specifically required. The language must be able to support all the centralfunctions of a DBMS: creating a database, retrieving and entering data, implementing databasesecurity, and so on. T-SQL fulfils this function for SQL Server and carries out all the data definitionand manipulation tasks required to access data.


SQL is a nonprocedural language, in that you don’t specify “how” things happen, or evenwhere. You simply ask a question of the relational server, and it does the work.


Rule 6: View Updating Rule


"All views that are theoretically updateable are also updateable by the system."


All views that are theoretically updateable are also updateable by the system.This rule deals with views, which are virtual tables used to give various users of a database differentviews of its structure. It’s one of the most challenging rules to implement in practice, and no commercialproduct fully satisfies it today.


A view is theoretically updateable as long as it’s made up of columns that directly correspondto real table columns. In SQL Server, views are updateable as long as you don’t update more than asingle table in the statement; neither can you update a derived or constant field. SQL Server 2000also implemented INSTEAD OF triggers that you can apply to a view (see Chapter 6). Hence, this rulecan be technically fulfilled using INSTEAD OF triggers, but in what can be a less-than-straightforwardmanner. You need to take care when considering how to apply updates, especially if the view containsa GROUP BY clause and possibly aggregates.


Rule 7: High-Level Insert, Update, and Delete


"The capability of handling a base relation or a derived relation as a single operand appliesnot only to the retrieval of data but also to the insertion, update, and deletion of data."


This rule stresses the set-oriented nature of a relational database. It requires that rows be treated assets in insert, delete, and update operations. The rule is designed to prohibit implementations thatsupport only row-at-a-time, navigational modification of the database. The SQL language coversthis via the INSERT, UPDATE, and DELETE statements.


Even the CLR doesn’t allow you to access the physical files where the data is stored, but BCPdoes kind of go around this. As always, you have to be extra careful when you use the low-level tools that can modify the data without going through the typical SQL syntax, because it can ignore the rules you have set up, introducing inconsistencies into your data.


Rule 8: Physical Data Independence


"Application programs and terminal activities remain logically unimpaired whenever anychanges are made in either storage representation or access methods."


Applications must still work using the same syntax, even when changes are made to the way inwhich the database internally implements data storage and access methods. This rule implies that the way the data is stored physically must be independent of the logical manner in which it’s accessed. This is saying that users shouldn’t be concerned about how the data is stored or howit’s accessed. In fact, users of the data need only be able to get the basic definition of the data theyneed.


Other things that shouldn’t affect the user’s view of the data are as follows:
  • Adding indexes: Indexes determine how the data is stored, yet the user, through SQL, willnever know that indexes are being used.
  • Changing the filegroup of an object: Just moving a table to a new filegroup will not affect theapplication. You access the data in the same way no matter where it is physically located.
  • Using partitioning: Beyond moving entire tables around to different filegroups, you canmove parts of a table around by using partitioning technologies to spread access around todifferent independent subsystems to enhance performance.
  • Modifying the storage engine: From time to time, Microsoft has to modify how SQL Server operates (especially in major version upgrades). However, SQL statements must appear toaccess the data in the same manner as they did in any previous version, only (we hope) faster.
Microsoft has put a lot of work into this area, because SQL Server has a separate relational engine and storage engine, and OLE DB is used to pass data between the two. Further reading on this topic is available in SQL Server 2008 Books Online in the “Database Engine Components” topic or in Inside Microsoft SQL Server 2005: The Storage Engine by Kalen Delaney (Microsoft Press, 2006).

Rule 9: Logical Data Independence


"Application programs and terminal activities remain logically unimpaired when information preserving changes of any kind that theoretically permit unimpairment are made to the base tables."

Along with rule 8, this rule insulates the user or application program from the low-level implementation of the database. Together, they specify that specific access or storage techniques used by the RDBMS—and even changes to the structure of the tables in the database—shouldn’t affect the user’s ability to work with the data.

In this way, if you add a column to a table and if tables are split in a manner that doesn’t add or subtract columns, then the application programs that call the database should be unimpaired. For example, say you have the table in Figure A-1.













The user should be unaffected. If you were to implement INSTEAD OF triggers on the view that had the same number of columns with the same names, you could seamlessly meet the need to manage the view in the exact manner the table was managed. Note that the handling of identity columns can be tricky in views, because they require data to be entered, even when the data won’t be used. See Chapter 6 for more details on creating INSTEAD OF triggers.

Of course, you cannot always make this rule work if columns or tables are removed from the system, but you can make the rule work if columns and data are simply added.

Rule 10: Integrity Independence

"Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs."

The database must support a minimum of the following two integrity constraints:

  • Entity integrity: No component of a primary key is allowed to have a NULL value.
  • Referential integrity: For each distinct non-NULL foreign key value in a relational database, there must exist a matching primary key value from the same domain.

This rule says that the database language should support integrity constraints that restrict the data that can be entered into the database and the database modifications that can be made. In other words, the RDBMS must internally support the definition and enforcement of entity integrity (primary keys) and referential integrity (foreign keys).

It requires that the database be able to implement constraints to protect the data from invalid values and that careful database design is needed to ensure that referential integrity is achieved. SQL Server 2008 does a great job of providing the tools to make this rule a reality. We can protect our data from invalid values for most any possible case using constraints and triggers. Most of Chapter 6 was spent covering the methods that we can use in SQL Server to implement integrity independence.

Rule 11: Distribution Independence

"The data manipulation sublanguage of a relational DBMS must enable application programs and terminal activities to remain logically unimpaired whether and whenever data are physically centralized or distributed."

This rule says that the database language must be able to manipulate data located on other computer systems. In essence, we should be able to split the data on the RDBMS out onto multiple physical systems without the user realizing it. SQL Server 2008 supports distributed transactions among SQL Server sources, as well as other types of sources using the Microsoft Distributed Transaction Coordinator service.

Another distribution-independence possibility is a group of database servers working together more or less as one. Database servers working together like this are considered to be federated.With every new SQL Server version, the notion of federated database servers seamlessly sharing the load is becoming a definite reality. More reading on this subject can be found in the SQL Server 2008 Books Online in the “Federated Database Servers” topic.

Rule 12: Non-Subversion Rule.

If a relational system has or supports a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules or constraints expressed in the higher-level (multiple-records-at-a-time) relational language.

This rule requires that alternate methods of accessing the data are not able to bypass integrity constraints, which means that users can’t violate the rules of the database in any way. For most SQL Server 2008 applications, this rule is followed, because there are no methods of getting to the raw data and changing values other than by the methods prescribed by the database. However, SQL Server 2008 violates this rule in two places:

  • Bulk copy: By default, you can use the bulk copy routines to insert data into the table directly and around the database server validations.
  • Disabling constraints and triggers: There’s syntax to disable constraints and triggers, thereby subverting this rule.

It’s always good practice to make sure you use these two features carefully. They leave gaping holes in the integrity of your data, because they allow any values to be inserted in any column. Because you’re expecting the data to be protected by the constraint you’ve applied, data value errors might occur in the programs that use the data, without revalidating it first.