Archive for the ‘SQL Server 2008’ Category

Have you heard and understand why processing a ‘select * from any table’ can be negative on the performance impact? While I was doing some demo prep for my upcoming PASS Summit Pre-Conference I saw this behavior at its worst.
Before we jump into why this is such a negative impact on your performance I should tell you about my test environment. I have a table with 4.6 million records in it. The table has the following structure:
CREATE TABLE [dbo].[Crimes](
[ID] [varchar](500) NULL,
[CaseNumber] [varchar](500) NULL,
[Date] [varchar](500) NULL,
[Block] [varchar](500) NULL,
[IUCR] [varchar](500) NULL,
[PrimaryType] [varchar](500) NULL,
[Description] [varchar](1000) NULL,
[LocationDescription] [varchar](500) NULL,
[Arrest] [varchar](500) NULL,
[Domestic] [varchar](500) NULL,
[Beat] [varchar](500) NULL,
[District] [varchar](500) NULL,
[Ward] [varchar](500) NULL,
[CommunityArea] [varchar](500) NULL,
[FBICode] [varchar](500) NULL,
[XCoordinate] [varchar](500) NULL,
[YCoordinate] [varchar](500) NULL,
[Year] [varchar](500) NULL,
[UpdatedOn] [varchar](500) NULL,
[Latitude] [varchar](500) NULL,
[Longitude] [varchar](500) NULL,
[Location] [varchar](500) NULL,
[BeatInt] [int] NULL
) ON [PRIMARY]

(One thing that you may have already noticed is all the varchar columns. The reason all the data types are variable characters is due to the poor source file I was working from. Data that I would expect to be all numeric had some text in it.)

I then added an index to improve query performance.
Create Index beat on crimes (beat)

If I want to get all the crimes in a beat, I can do this in a couple ways. First I can try the Select * method. The ‘*’ indicates I want all the columns returned.
Select * from Crimes
where beat like ‘1533’

My test execution took 17 seconds. When I changed the query to return only the specific columns that I needed by replacing the * with the specific column names:
Select PrimaryType,Description from Crimes
where beat like ‘1533’

The results were returned in 11 seconds rather than the 17 seconds (36% improvement on performance). One reason for this is simply the amount of data that needs to be read from storage is trimmed down. Both queries returned approximately 27 thousand rows.
Let us consider one more test. This time we will explicitly list out each of the columns in our select clause. This is more or less doing the exact same thing as Select *.
Select
[ID] ,[CaseNumber],[Date] ,[Block] ,[IUCR] ,[PrimaryType] ,[Description],[LocationDescription] ,[Arrest] ,
[Domestic] ,[Beat] ,[District] ,[Ward] ,
[CommunityArea] ,[FBICode] ,[XCoordinate] ,[YCoordinate] ,[Year] ,
[UpdatedOn] ,[Latitude] ,[Longitude] ,[Location] ,[BeatInt]
from Crimes
where beat like ‘1533’

When each of the columns are listed out it takes just as long as it does if I were to do a Select *. Even though two queries are comparable, I prefer to see the columns listed out. One of the reasons has to do with future compatibility, if sometime down the row a column gets added to the table then I don’t have to worry if I am sending more data to an application front end then in needs.
The cause of the performance increase is simple, there is less data to read from the storage and less data that needs to be transferred to the client. When retrieving data from SQL Server use the query to limit your results to just the data you need and the overall performance should be better than if you were to bring back all the data.
These examples have drastic differences. It is easy to see how when you are looking at return row set of 27,000 records how performance can be improved. Other areas you will see improvement in performance by removing the ‘Select *’, (that we have not talked about in this post) include: 

• Less data to be transferred over internal networks.
• Less data to load into applications
• Less data to send over the internet
This is one of many small adjustments which can make a big difference in how you’re SQL Server Performs.

Sometimes managing multiple file groups and the amount of data that is in them can cause a few logistic issues. Recently I found myself needing to know what tables are in what file groups and what the sizes of those tables are. With the database the size it is and the number of objects that exist in it, I needed a script. The goal of this script is only to define the tables, and the sizes of the tables/indexes and what file group the tables are on (not the location of indexes). Consider it a quick reference. I have not tested it against partitioned tables yet however I plan to in the future.

To get the below script I started a profiler trace, and then executed the standard “Disk Usage By Table” report. After I found the query that was executed by the report I copied it and then added some basic file group data collecting to it.

SELECT
       --(row_number() over(order by a3.name, a2.name))%2 as l1,
       a3.name AS [schemaname],
       a2.name AS [tablename],
       a5.FileGroup as [filegroup],
       a1.rows as row_count,
       (a1.reserved + ISNULL(a4.reserved,0))* 8 /1024 AS 'reserved (mb)',
       a1.data * 8 /1024 AS 'data (mb)',
       (CASE WHEN (a1.used + ISNULL(a4.used,0)) > a1.data THEN (a1.used + ISNULL(a4.used,0)) - a1.data ELSE 0 END) * 8 /1024 AS 'index_size (mb)',
       (CASE WHEN (a1.reserved + ISNULL(a4.reserved,0)) > a1.used THEN (a1.reserved + ISNULL(a4.reserved,0)) - a1.used ELSE 0 END) * 8 / 1024 AS 'unused (mb)'
FROM
       (SELECT
              ps.object_id,
              SUM (
                     CASE
                           WHEN (ps.index_id < 2) THEN row_count
                           ELSE 0
                     END
                     ) AS [rows],
              SUM (ps.reserved_page_count) AS reserved,
              SUM (
                     CASE
                           WHEN (ps.index_id < 2) THEN (ps.in_row_data_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count)
                           ELSE (ps.lob_used_page_count + ps.row_overflow_used_page_count)
                     END
                     ) AS data,
              SUM (ps.used_page_count) AS used
       FROM sys.dm_db_partition_stats ps
       GROUP BY ps.object_id) AS a1
LEFT OUTER JOIN
       (SELECT
              it.parent_id,
              SUM(ps.reserved_page_count) AS reserved,
              SUM(ps.used_page_count) AS used
       FROM sys.dm_db_partition_stats ps
       INNER JOIN sys.internal_tables it ON (it.object_id = ps.object_id)
       WHERE it.internal_type IN (202,204)
       GROUP BY it.parent_id) AS a4 ON (a4.parent_id = a1.object_id)
INNER JOIN sys.all_objects a2 ON ( a1.object_id = a2.object_id)
INNER JOIN sys.schemas a3 ON (a2.schema_id = a3.schema_id)
Join
       (
       Select groupid, data_space_id, indx.id, ds.name as 'FileGroup', ob.name as 'tblName'
       from sysindexes indx
       join sys.data_spaces ds ON indx.groupid = ds.data_space_id
       join sysobjects ob ON [ob].[id] = [indx].[id]
       where indx.indid = 1 or indx.indid = 0
       and ob.type = 'u'
       ) AS a5 on a2.name = a5.tblName
WHERE a2.type <> N'S' and a2.type <> N'IT'
ORDER BY a3.name, a2.name

December 2, 2013 I posted a blog titled, SQL Server Error Log. Often the things I post about are the items that I am using that day, or recently. Many of the topics, are simply based on how I may have made using SQL Server a bit easier for me to use one way or another. In the SQL Server Error Log post I had mentioned a few of the basic things I do when configuring a server to make the Error Log a little easier for me to navigate, or get the information I need quicker. It occurred to me after a question from a well-respected community member that an example on how I use xp_readerrorlog, could add to the usefulness of the post. Thanks for the suggestion.

In the post I recommend tabling your filtered error logs. I am fond of basing all my work on the negative (I will explain more of that in a later post). The term working with the negative for me means working on the things that I am not expecting. For example, I expect that every hour my transaction logs are going to be backed up, I don’t need an alert or an email each time that the transaction log completes without any issues. I do however want the alarms blasting every time they don’t execute as planned. I apply the same logic to my error logs, however even if I could remove entries from my error logs I always want them as they are. In order for me to have the best of both worlds, I like to take all the entries that I am not expecting and store them in a table so I can filter them with a query, join them to other tables to get other relevant events that were occurring at that time and perform other tasks that may be a bit difficult with the error logs as they stand.

Where to start?

When I start working with a new server, if I have the opportunity I like to create a database I call dbUtilities. The Tag on this post will show you some other posts where I have used this database. After that database has been created, or you have picked another location to store your table, start with a table that is similar to this one. I create this table so I can store my information long term. You may want to add some additional fields such as a PK.

On a quick side note, you should customize this code to meet your needs; the posted version here is in the simplest form. Make sure you look at the added notes at the bottom of this post before executing this script on your servers.

Create
Table
MyErrorlog
(LogDate
datetime,
ProcessorInfo VARCHAR (100),ErrorMSG
VARCHAR(2000))

Once I have my final destination created, I want to insert all my errors that exist in the error log into a temp table where I can start the filtering process.

DECLARE
@Errorlog
TABLE (LogDate
datetime,
ProcessorInfo VARCHAR (100),ErrorMSG
VARCHAR(2000))

INSERT
INTO
@Errorlog

EXEC
sp_executesql
N’xp_readerrorlog’

From this point I start to remove the entries that I expect to see, for each server this is on, the filtering could be different. I leave each of the delete statements as an individual statement, for ease of understanding and customizing. If performance is of any concern it would be better to combine the statements.

Delete

FROM
@Errorlog

WHERE
ErrorMSG
LIKE
‘%Log was backed up%’;

Delete

FROM
@Errorlog

WHERE
ErrorMSG
LIKE
‘%Setting database option COMPATIBILITY_LEVEL%’;

Delete

FROM
@Errorlog

WHERE
ErrorMSG
LIKE
‘%were backed up%’;


Delete

FROM
@Errorlog

WHERE
ErrorMSG
LIKE
‘%DBCC TRACEON%’;

Delete

FROM
@Errorlog

WHERE
ErrorMSG
LIKE
‘%without errors%’;

Once the filtering is done I insert these rows into a final table. (I create this all as one stored procedure, and execute it once a day. I am really only concerned with the last days’ worth of information.)

INSERT
INTO
MyErrorlog

        SELECT
Logdate,
‘Error Log’,
SUBSTRING(ErrorMSG, 1, 2000)

        FROM
@Errorlog

        WHERE
LogDate
>
DATEADD(dd,
1,
GETDATE())

Some additional notes:

  • In the post SQL Server Error Log I mentioned cycling your error log to keep it small and manageable. This can be key when executing xp_readerrorlog. Large logs could impact the performance on your system, and the way I use the code I am expecting a smaller log.
  • In my case, I normally complete an additional step where I insert the rows into a table that I truncate and rebuild each day. The purpose of this table is so I can use it as a base for a reporting services report and pull information from that table and then subscribe to the report. This way I can other items such as failed jobs, replication status, and job executions that have failed to name a few. This is why I use the ‘ProcessorInfo’ column in the tables. It gives me the space to identify the source of the data in my final table.
  • Don’t forget to have an archive plan for your old data. I am a pack rat, and hate the idea of getting rid of info, I often get values from trends. So I will copy the data on occasion and store it somewhere offline if I have to.

Often when I am requested to look at an issue with SQL Server I have to spend some time in the error log.

On a side note, I think the log itself is a commonly overlooked tool for troubleshooting. Often it can be a bit cumbersome to work with if the default settings are used.

When SQL Server is installed, the error log is configured to start a new log when the service is restarted and only a handful of logs are retained long term. In some cases where troubleshooting has lead people to restart the service, or server a number of times valuable information can be lost simply because of the number of error logs that are retained.

On the other side of the spectrum if the server has been on line for quite some time, the error log can grow to extremely large sizes. Depending on a few settings that may have been made elsewhere many of these log entries are not really much assistance in working though issues. Imagine having an error log with half a million rows, and even though these half a million rows are sorted by date time, you have to scan the log to see when problems started. While you are considering how you may filter the log, or copy the log to a format that is easier for you to manage and review. Don’t forget that a log that size will need to be copied or loaded and sometimes both. Many times this has to be done while there is an issue of some sort, that very well may be causing production issues, and time is critical (nothing like having management wanting a problem solved and you have to wait for a huge error log file to load). The good news is there are easier ways to do this. Here are a few tips I like to stick to with my error logs.

  • Cycle your error log – Large error logs can be avoided by taking a couple steps, one of these can be simply cycling the error log. When execute sp_cycle_errorlog will simply start a new log, I like to schedule this to happen on a regular basis for example every Sunday night. On servers that have a lot error log entries, because of security logging or frequent transaction log backups then I may even consider cycling the error log more often.
  • Change the retention – By default you are going to have 7 error logs, if you are considering cycling your error log, (I believe you should) I would recommend also changing the retention time of your error log, often I will keep 60 days’ worth of logs, but your requirements may be different. The easiest way to do this is to Right click on the Error Logs folder in Management Studio, select configure and increase the Max number of error logs before they are recycled.
  • Table Filtered Logs – One thing I like to do, purely for troubleshooting on super busy systems is to use the stored procedure xp_readerrorlog to insert the rows of data into a table. When I do this, I can then filter out the items that I don’t need to see. This is also an easy way to compile logs from multiple servers into one table, and used in combination with a mail profile send your self the results. This will keep you from having to review many error logs on many servers. Often what I do is use the stored procedure to insert into a temp table, and since this is a filtered error log I will remove known entries from that table that I am aware of, items like transaction log backups successful, log on attempts (I often count those before I remove them to make sure the failed ones are not indicating another issue) and other repeated entries. Once I filter the temp table of everything I am aware of, I then insert the log into a permanent table for safekeeping. The permanent table then is managed to a retention time as well. Remember this table is filtered view, I use it in place of the error log for quick access so I can filter it as I would any other table, but it does not replace having the detailed error log. It just gives me more filtering options (Like joining to job history when a job keeps failing).

The end result is that for my use when the log grows to a size like that, it is frustrating to use the log to find the information that is critical. Managing your error log is important and the end result is that it will make your systems easier to manage and troubleshoot.

** Note: Fixed a typo change sp_readerrorlog to xp_readerrorlog.

I can never be reminded enough that it does not hurt to just check in on the servers you manage. With SQL Server as stable as it is there are many times well… you can just forget about some of the servers. So if you have some time today, make sure that you are completing these actions on your servers:

  • Backup your databases, even on a development role server you need to make sure that you have backups and it meets the company’s needs.
  • Don’t just check to make sure your Backups are being done, Test the restores.
  • Check your patch levels; do you have the new patches from Microsoft? Have they been installed?
  • Check your SQL Server Error logs, is there anything looking just a bit off?
  • Check your system Logs, do you know what all the warnings are?
  • Check your SQL Server Job History, are they still running like they should?
  • Have you done a comparison between your databases in the different environments? One thing to remember he is many shops that use tools to do the deltas between production and development or UAT servers can accidently role something into production that should not be there. It is just as important to make sure your development servers have version of the database that matches production as it is to make sure you don’t develop on production servers.
  • Check your security accounts; is there anything there that you did not anticipate?
  • How is the disk space on the servers? Are you at risk for running out of space and taking an extended outage because you have to just find some place for that data?

You may want to consider creating an auto ticket or even a calendar reminder if you can that will remind you to check these and other items on a regular basis.

I caught a cold or something last Tuesday that hit me harder than anything else I have had in years. Just as the cold was getting started I have a migration that I had to complete. This migration had been planned for about 3 months with the last 4 weeks just getting all the users that were going to be impacted ready. As I started to get ready for the migration I had realized the importance of checklists. First off we were taking a stand alone Windows 2003 SQL Server 2005 to a Clustered Windows R2, SQL Server 2008. This alone is a violation of one of the core rules that I like to live by.

Change only one thing at a time. – The more you change at one point in time the more possibilities that something will go wrong, and if they do go wrong then the complexity of fixing these just increases…

Well to address the rule first. A number of years ago I would have been very aggressive about this rule, and many of the rules that I think are best practice for a database servers. But the reality is really harsh here. If I would have stuck to this rule I would have impacted the clients not once, but three times. Once for the hardware upgrade, once for the OS upgrade and once for the SQL Server upgrade. So it may be best practice, but it would also be a major impact to customers.

So in a shop where you are supporting a customer that is paying for your service and that service is to provide a database that is to be online with little interruption is it really considered best practice to take the server down time and time again to upgrade it? I think it could be argued that the business needs to have a big say in how this systems is upgraded. So when the requirements are that the database is not to go offline for an extended period of time a new set of challenges arise. This is where the checklist is priceless.

The Migration started on Wed. afternoon last week and we did have a bit of a late start, but the impact to the customer was as advertised. We had some issues that impacted the users down the road and if it were not for the great sys admins we would still be working on those.

This fall I hope to present a session on how I managed to move over 300 Gigs of data, across 2 servers with an upgrade of an OS, hardware and SQL Server with the database only being down for 14 minutes. (Ok 14 min and 12 seconds). It’s not as hard as you think it is, but it does require a special attention to detail.

 

In the United States of America you are innocent until found guilty. I want to make sure that I am very clear on this fact, because one of the many databases that are involved in protecting this country came under fire recently. I have done a bit of research and there is not a lot of information that has been released about this. But I can tell you that there is an individual who in his own description calls himself a Systems Analyst at the Department of Homeland Security, I will call him “Bob” to protect his rights.

From what I have gathered “Bob” who worked with the TSA database, found out that he was being let go. It appears that “Bob” was not too happy about this and tried to take the database down. According to a few articles that I have found it looks as if the database was the one that is used to protect us as we take flights around this country…

So I have to ask? Are we being serious enough about the permissions? Is our data secure? Is your company data secure? This is not a joke and the data that we keep nowadays is more important than ever. But I am always surprised at how many databases are left open. In this case “Bob” may have been a DBA, and had all the permissions that he needed.

But I ask you to remember:

  • When someone turns in two weeks notice as a DBA, don’t be surprised when they are walked out the door that second.
  • Next time you are audited and the auditors are being a real pain, hope they are the ones that audit the database where your data is stored.
  • If you wonder why developers are not in production and only Admin’s and Change Managers can change production, it is to limit liability, it’s not a matter of trust.

Just in case you were wondering… “Bob” is up on two charges of the Federal kind. These are the kind of charges that can put you in a very bad place located underground. According to the Linked In site that I found “Bob” is working as a consultant for the last 6 months. (I found “Bob’s” LinkedIn Page by Name, City, past Employment)

Have you checked your consultants recently?

Story in EarthTimes

Denver Post

Computer World