So you need a new tool, or maybe you are evaluating whether your current tool is the right one for you. But does that mean you have to buy the most expensive tool on the market? Over the last few weeks I have been asked at least three times about the capabilities of SQL Server and whether it can handle a given job or situation. The questions have looked like:
- Can SQL Server handle a 30TB system?
- Can SQL Server handle all the processing that is needed for our data warehouse?
- Can a single SQL Server 2005 or 2008 instance on Windows 2005 handle 5,000-10,000 concurrent connections?
In every case, the person asking the question was already on SQL Server, and one of the observations I made was that in each case they had identified the database as the bottleneck. Each question also came with circumstances that added difficulty. Take the 30 TB question: can SQL Server handle that much information? The client was importing data via an ETL tool, and the primary complaint was that the end-to-end process was not importing records quickly enough. They were seeing an insert rate that was about 10% of what comparable machines achieve according to TPC benchmarks. With the hardware testing out at a much higher insert rate than what they were achieving, there were obviously some items that needed to be researched.

In this particular case they were measuring rows per second by taking the total number of rows inserted into the database and dividing it by the total elapsed time. The issue is that when you measure that way, there are too many steps in the process to single out SQL Server as the problem. It would be comparable to me saying it takes me 7 hours to fly from Denver to Tucson. The flight itself is only an hour and a half, but I have to get up early to drive a couple of hours to Denver, then I have to be at the airport early, and then I have to wait for my bag when I get to Tucson. All the surrounding time gets counted in the total travel; however, the flight is still only an hour and a half.
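One way to separate SQL Server's share of that end-to-end time from everything around it is to look at what the engine itself was waiting on during the load. This is a hypothetical sketch, not what the client actually ran: it snapshots `sys.dm_os_wait_stats` (available in SQL Server 2005 and later) before and after a load window, and the temp table name is my own invention.

```sql
-- Sketch: isolate SQL Server's part of the load window by comparing
-- engine wait statistics before and after, instead of dividing total
-- rows by total end-to-end time.

-- Snapshot waits before the ETL load runs.
SELECT wait_type, wait_time_ms, waiting_tasks_count
INTO #waits_before
FROM sys.dm_os_wait_stats;

-- ... run the ETL load here ...

-- Compare afterward. Large deltas on I/O waits (PAGEIOLATCH_*,
-- WRITELOG) point at the database/storage side; large deltas on
-- ASYNC_NETWORK_IO suggest the engine was waiting on the client or
-- ETL tool, i.e. SQL Server was not the bottleneck.
SELECT w.wait_type,
       w.wait_time_ms - b.wait_time_ms AS wait_ms_during_load
FROM sys.dm_os_wait_stats AS w
JOIN #waits_before AS b
     ON b.wait_type = w.wait_type
WHERE w.wait_time_ms - b.wait_time_ms > 0
ORDER BY wait_ms_during_load DESC;
```

If most of the accumulated wait time sits outside the engine, the "rows per second" number is really measuring the whole trip, not the flight.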
With the question about concurrent connections, the requestor was running SQL Server 2000 and looking to upgrade to SQL Server 2005 with the current workload. The question is a fair one; however, a number of issues can affect how many concurrent connections a server handles, and the answer is not always going to be tossing more hardware at the problem.
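Before asking whether a new version can handle the connection load, it helps to measure what the current instance actually sees and who is holding those connections. A small sketch against the SQL Server 2005+ DMVs (on SQL Server 2000 you would count rows in `master..sysprocesses` instead):

```sql
-- Sketch: how many user connections does the instance have right now?
SELECT COUNT(*) AS current_connections
FROM sys.dm_exec_connections;

-- Break the sessions down by host and application to see where the
-- connections come from; connection pooling problems in one app often
-- masquerade as a server capacity problem.
SELECT s.host_name,
       s.program_name,
       COUNT(*) AS connection_count
FROM sys.dm_exec_sessions AS s
WHERE s.is_user_process = 1
GROUP BY s.host_name, s.program_name
ORDER BY connection_count DESC;
```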
A few years ago I was in a shop where we were supporting close to 60 SQL Servers. I was with that company for about 5 years, and it felt like at least once a year I had to fight off the question of whether SQL Server could do what we needed it to do. This was a shop whose load could easily have been handled by SQL Server, but the shop just did not see the advantage of tuning the SQL statements or hiring expert SQL developers. The hard part of these cases is that when the issue is related to bad code, a bad design, or indexes that simply are not there where they should be, the answer is not to change the RDBMS; the answer is to fix the problem. I think too many times people look at a database that is performing slowly and start to think they need more hardware, or that they need to change RDBMSs. When I work with a poorly performing system, I look at the hardware; if I see a lot of disk I/O, it could be a missing index. That does not mean money needs to be spent upgrading the hardware; it just means we can use those measurements as key indicators.
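That "lots of disk I/O might just be a missing index" check doesn't have to be guesswork. SQL Server 2005 and later track candidate indexes the optimizer wished it had in the missing-index DMVs; a sketch like the following surfaces the likeliest suspects (the ranking expression is my own rough heuristic, not an official formula):

```sql
-- Sketch: look for missing indexes before concluding the hardware is
-- undersized. The optimizer records indexes it would have used here.
SELECT TOP (10)
       d.statement           AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks + s.user_scans AS times_needed,
       s.avg_user_impact     AS estimated_pct_improvement
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
     ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
     ON s.group_handle = g.index_group_handle
ORDER BY s.avg_user_impact * (s.user_seeks + s.user_scans) DESC;
```

These numbers reset when the instance restarts and the suggestions are not gospel, but if a heavily hit table tops this list, fixing the index is a lot cheaper than a new SAN or a new RDBMS.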
Sorry about the ramblings today. I think I am going to try to get out a new SQL Quiz tonight.