I really like open T-SQL Tuesday questions; they promote a lot of thinking, and I enjoy reading all the different perspectives. This month's question from Aaron Nelson (B|T) promotes some good discussion on logging. A simple question about your take on logging will, I hope, show that there are a lot of different ways people log what they are doing, and what their systems are doing. If you spend some time reviewing the answers, my guess is you will find a few tips on how you might log things differently than you do now, and maybe even cover some areas where your logging strategy has gaps. On a quick side note… I hosted T-SQL Tuesday #30, and it was an awesome experience. I would highly recommend that you talk to Adam Machanic (B|T) and seriously consider hosting one yourself.
There are a couple of different thoughts that come to mind when I start thinking about how I use logging. The Utility Database session that I am working on has a lot of logging in it, to help me understand what is going on and to start predicting things like growth, or patterns in when performance becomes an issue. The logging I want to mention in this post, though, is something I didn't design and really didn't do much work on. The developer who was working on it wanted to come up with a solution to an issue, and it turned out to be a great match. Now that it is in place and starting to be used, I see so many uses for it that I am going to push for its adoption enterprise wide.
The Case: The environment I work in has a number of different applications and services; each one does something drastically different, and most of them hit the database. There are so many of them, and the development process moves so fast, that it is really difficult to keep up on all of them. So from my viewpoint, when someone comes to me with an error, or something I need to address, there are many questions I need answered. For example, a month or so ago, the service I am talking about started to get very intermittent connection errors. Before the logging solution, the errors were logged in a text file, which made it difficult to see things like error patterns, or patterns in the time of day. Doing any sort of deep analysis would require loading the files into another tool just to see those patterns. End result: we had log files in multiple locations because of our HA solution, and tracking down issues was a pain at best.
The Solution: As basic and simple as it sounds, the solution was a table in a database. Rather than the application logging messages to a file somewhere, they were redirected into a database table. Once we saw some of the benefits this had, we were able to expand the capabilities to proactive monitoring, alerting, and a history of the application. Tables were added to normalize the database a little more, and then an archive plan was created. Now there is a single location for these messages, and they are already in a format that lets us use the tools we have to filter and review the data. One simple table cuts down on a lot of troubleshooting time, and gives us a new view into the health of the applications.
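To make the idea concrete, here is a minimal sketch of that pattern. The post doesn't show the actual schema, so the table and column names below are illustrative, and I'm using Python with an in-memory SQLite database purely so the example is self-contained; in the environment described, this would be a T-SQL table that the applications insert into directly. The point it demonstrates is the payoff: once every application writes to one table, a question like "at what hour do the errors cluster?" becomes a single query instead of grepping text files on multiple servers.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical central log table -- one place for every application's
# messages, instead of text files scattered across servers.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ApplicationLog (
        LogID    INTEGER PRIMARY KEY AUTOINCREMENT,
        LogDate  TEXT NOT NULL,   -- ISO 8601 timestamp
        AppName  TEXT NOT NULL,
        Severity TEXT NOT NULL,
        Message  TEXT NOT NULL
    )
""")

def log_message(app, severity, message, when=None):
    """Write one log entry to the central table."""
    when = when or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO ApplicationLog (LogDate, AppName, Severity, Message)"
        " VALUES (?, ?, ?, ?)",
        (when, app, severity, message),
    )

# Sample entries from two different (made-up) applications.
log_message("OrderService",   "ERROR", "Connection refused", "2012-06-01T02:15:00")
log_message("OrderService",   "ERROR", "Connection refused", "2012-06-02T02:20:00")
log_message("BillingService", "INFO",  "Batch complete",     "2012-06-01T03:00:00")

# Time-of-day error pattern: characters 12-13 of the ISO timestamp
# are the hour, so group error rows by hour with one query.
rows = conn.execute("""
    SELECT substr(LogDate, 12, 2) AS Hour, COUNT(*) AS Errors
    FROM ApplicationLog
    WHERE Severity = 'ERROR'
    GROUP BY Hour
    ORDER BY Errors DESC
""").fetchall()
print(rows)  # [('02', 2)] -- both intermittent errors hit around 2 AM
```

The same shape works for the proactive side mentioned above: an alerting job is just another query against this table (say, errors per application in the last ten minutes), and archiving is a scheduled move of old rows to a history table.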