When building SQL Server data warehouses should user-defined schemas be used? Should tables and views be prefixed? Or, maybe we should do both. After talking to a few people about this and doing a little research, one thing is clear. Deciding to use schemas and/or prefixes is much more of a personal preference than it is a technical decision. So, I decided to compile a list of technical and not so technical things to consider when choosing to use schemas, prefixes, both, or none.
Security
Schemas can be used to assign rights to a group of objects. Logins are mapped to users and users (or roles) are granted permissions to schemas. The same can be accomplished by separating a group of objects into a separate database.
Backups
If all schema objects are in a single filegroup then you can backup the schema independently. If you separate objects at the database level, then the same is true. Since there is no work to be done when grouping objects into different databases, this option may have a slight advantage.
Performance
There is no significant performance difference between any of these options. It is best to include the schema name in queries so that the database engine doesn’t need to figure it out, but this applies regardless of the chosen path.
Database Properties
It is a quick and easy right click task to retrieve the size of a database or a table. To get the size of schema, however, requires slightly more work depending on whether the schema is isolated to a single filegroup. Some properties are defined only at the database level. One important setting is the database recovery model. Placing stage tables in a database using Simple Recovery Model and persistent objects in a separate database using Full Recovery Model is an important option that can only be available at the database level.
Organization
One real difference between using schemas, separating object by database, and using prefixes is that they provide unique organization options. Each option provides a namespace. Let’s take a look at the options individually.
Database Separation
[Server].[LOB_Stage].[dbo].[myTable]
[Server].[LOB_Transform].[dbo].[myTable]
[Server].[LOB_Mart].[dbo].[myTable]
User-Defined Schemas
[Server].[LOB].[Stage].[myTable]
[Server].[LOB].[Transform].[myTable]
[Server].[LOB].[Mart].[myTable]
Prefix Object Name
[Server].[LOB].[dbo].[s_myTable]
[Server].[LOB].[dbo].[t_myTable]
[Server].[LOB].[dbo].[d_myTable]
Schema & Prefix
[Server].[LOB].[Raw].[s_myTable]
[Server].[LOB].[Raw].[p_myTable]
[Server].[LOB].[Mart].[d_myTable]
Here are some points that I found interesting from around the web. Feel free to post your thoughts as well.
Buck Woody says…
“Leaving everything in the dbo schema is like putting everything in the root directory of your hard drive. We use folders there to group information and programs, and you should do the same using schemas with your database objects.”
Brent Ozar says…
“…schemas are naming conventions in lipstick. When I look at somebody’s code, the last thing I ever think is, “Man, if these tables just had schemas, I could understand everything better.”
What is a Schema?
“Microsoft SQL Server 2005 introduced the concept of database object schemas. Schemas are analogous to separate namespaces or containers used to store database objects. Security permissions apply to schemas, making them an important tool for separating and protecting database objects based on access rights. Schemas reduce the work required, and improve the flexibility, for security-related administration of a database.” Microsoft Learn
So there you have it. Clearly there are some strong opinions about schema usage out there. Personally, I could go either way. The good news is that the LeapFrogBI DWA platform makes all of this very easy to implement. The configuration deployment will generate all required databases and schemas that do not already exist. Each component is then responsible for creating the target table or view. Prefixing is defined at the project level and is also automatically applied during the build process.