
SQL Server and Other Topics


I got my start with SQL Server in 1997 with SQL Server 6.5 at BellSouth Cellular. From that point on I've had the opportunity to work on Versions 6.0 to 2014. I specialize in Performance Tuning, High Availability and SQL Development. Contact me through this site or through my Twitter handle @SQLDiver


Landing a position as an operational DBA is not hard these days; you just have to be in the right place at the right time. The hard part is reaching the competence level of a senior DBA, where you are able to handle any issue you encounter.

Nothing is more frightening than receiving that first call in the middle of the night: "the database is down"! Holy $#!^, what do I do now? It is going to happen, and it is going to be frightening, but if you work hard to expand your knowledge before you get there, you'll be all right.

So this brings me to the "expand your knowledge" talk. There are several things you can do that will force you to expand your knowledge:

  1. Take a SQL Server class. If you're like me and don't yet have a good understanding of what it is you're trying to learn, a class isn't going to help much. You'll just have spent hundreds of dollars trying to increase your knowledge.
  2. Trial by fire. This is the "Oh Shit" class, where as an accidental DBA you learn from your own experiences.
  3. Mentors. With a mentor (or mentors) you will benefit the most. Having experienced, smart people around you gives you the opportunity to learn from others' mistakes. I've had some great mentors over the years, many of them subordinates who worked for me.

    There are rules about asking questions of your mentors:

    1. Google the question and try to figure it out on your own first. You will learn more by trying to find the solution yourself.
    2. Write down what you found and describe how it relates to your issue.
    3. Discuss the issue with your mentor. Talk through it at a whiteboard so you can walk through it again afterwards. Understand the solution.
  4. The SQL Server Community. I can't explain enough how important (and different) the SQL Server community is. Let me start by saying the Professional Association for SQL Server (PASS) is a differentiator and an advantage over competing platforms. PASS sponsors the PASS Summit, SQLRally, 24 Hours of PASS, and many SQL Saturday events every year (including my favorite, SQL Saturday Atlanta).
    1. The people you will meet will be life and career changing: Karla, Rob, Geoff, Denny, Paul, Phil, David, Kimberly, Jen, Julie, Audrey, Mark, Susan, Stuart, Aaron, Erin... and on and on! 
    2. The SQL Saturday and 24 Hours of PASS training is FREE.
    3. PASS Summit and SQLRally are phenomenal. The training is constant and covers all levels of experience, and the organizers don't get enough kudos for the work they do. It is worth every penny spent.
    4. Local user groups, most affiliated with PASS, are great places to meet on a monthly basis to discuss SQL Server; meetings usually involve a presentation and are also FREE.
  5. READ, READ, READ... 
  6. Online training... I love Pluralsight. Your head will pop from the knowledge you can obtain there.

 


Just playing with the idea of memory-optimized tables, I populated a "staging" table (77 columns) with 1,369,779 records.

The table uses a monotonically increasing value for the primary key. I created three separate target tables: one memory-optimized and durable, one memory-optimized and non-durable, and one that is not memory-optimized.

A simple INSERT INTO each of these tables (after running DBCC DROPCLEANBUFFERS) resulted in the following execution times:

Table                            Execution Time
Non-Memory Optimized             3 minutes 28 seconds
Memory Optimized - Durable       23 seconds
Memory Optimized - Non-Durable   9 seconds
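
For anyone who wants to repeat the experiment, here is a minimal sketch of the kind of setup I used. The table and column names are made up for illustration (the real staging table had 77 columns), and it assumes a SQL Server 2014 database that already has a MEMORY_OPTIMIZED_DATA filegroup:

-- Durable memory-optimized table: data survives a restart.
CREATE TABLE dbo.TargetDurable
(
    Id   INT IDENTITY(1,1) NOT NULL
         PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 2097152),
    Col1 INT NULL,
    Col2 DATETIME2 NULL
    -- ...remaining columns trimmed for brevity...
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

-- Non-durable memory-optimized table: schema only, data is lost on restart.
CREATE TABLE dbo.TargetNonDurable
(
    Id   INT IDENTITY(1,1) NOT NULL
         PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 2097152),
    Col1 INT NULL,
    Col2 DATETIME2 NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

-- Clear the buffer pool, then time the load from the disk-based staging table.
DBCC DROPCLEANBUFFERS;
SET STATISTICS TIME ON;

INSERT INTO dbo.TargetDurable (Col1, Col2)
SELECT Col1, Col2
FROM dbo.Staging;

SET STATISTICS TIME OFF;

I repeated the DBCC DROPCLEANBUFFERS and the INSERT for each of the three target tables to get the timings above.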

One of the first things a consultant usually does when starting a consulting contract is to document the environments. On my last contract, with one of the large investment banks in New York City, I was doing just that, and in my discovery phase I ran into a strange anomaly that didn't make sense.

This story is about the physical and logical CPU counts reported by the following query.

Most DBAs should recognize the query:

SELECT cpu_count AS [Logical CPU Count],
       hyperthread_ratio AS [Hyperthread Ratio],
       cpu_count / hyperthread_ratio AS [Physical CPU Count],
       physical_memory_in_bytes / 1048576 AS [Physical Memory (MB)]
FROM sys.dm_os_sys_info;

The first step was documenting the lower environments. In my discovery interview, the client told me that the UAT environment was almost identical to production; there should be very few differences.

The UAT environment had 10 HP DL580 G7 servers with Windows 2008 R2 Enterprise installed. All were running SQL Server 2008 Enterprise with SP3 and no CUs applied. Each machine had 4 sockets populated with Intel(R) Xeon(R) CPU E7-4870 @ 2.40GHz processors, which are 10-core processors.

One of the machines appeared to have only two sockets populated. This is where the story gets interesting: as a consultant, it is difficult to come in as an outsider and ask, "How many physical CPUs does this server have?" and then, "Are you sure?" without annoying people. Sorry, I had to ask, since I could only see two of the four processors.

The puzzle became more interesting when I looked at the production environment and noticed that 5 of its 10 servers had the same issue, which also gave us some clues. The production environment has 5 servers with Intel(R) Xeon(R) CPU X7560 @ 2.27GHz processors, which are 8-core processors, and 5 servers with the same E7-4870 processors used in UAT. The servers with 8-core processors were reporting the correct number of physical processors and 64 logical processors. The E7-4870 servers were showing 2 physical processors and 40 logical processors.
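
To put numbers on that: for a 4-socket server with 10-core processors and hyperthreading enabled, the query above should return roughly the first line below, while the affected servers returned the second (the expected values are my own arithmetic):

Expected:  Logical CPU Count = 80, Hyperthread Ratio = 20, Physical CPU Count = 4
Observed:  Logical CPU Count = 40, Hyperthread Ratio = 20, Physical CPU Count = 2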

All of the servers in the production environment have hyperthreading turned on; only 1 server in UAT has hyperthreading enabled... guess which one?

Working in a bank requires extreme patience. The change management process is very strict, and there is significant segregation of duties, which makes it hard to build trust with the server group and the database administrators quickly enough to grease the path for the change management ticket. In other words, lots of questions and lots of hesitation to change. The change management process is important, though, as the cost of downtime is high.

As soon as I realized that hyperthreading appeared to be the issue, and that it had something to do with 64 vs. 80 logical processors, a ticket was created to disable hyperthreading on the one UAT server that was having the processor issue.

After a week of patient waiting, hyperthreading was disabled on that one UAT server (which requires a reboot)... voila, I could see all 4 processors. Now came the need for an explanation, because an explanation would be necessary before implementing the change in production.

This is where having friends who are SQL Server MVPs with deep knowledge of Windows comes in handy. After brainstorming with Denny Cherry, he had a thought: "SQL Server 2008 doesn't understand what a processor group is." That was the ultimate clue to why we couldn't see more than 2 processors, but putting the puzzle together and explaining it to management was the trick. Why would we see only 2 physical processors and only 40 logical processors? It would have made more sense to me to see 64 logical processors rather than 40. By the way, I even verified the number of schedulers in the DMV sys.dm_os_schedulers.
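
A query along these lines is enough to check; on the affected server the count lined up with the 40 logical processors reported above rather than the 80 that were physically present:

-- Count the schedulers SQL Server actually has available for user work.
SELECT COUNT(*) AS [Visible Online Schedulers]
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';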

Working with one of the MVPs and a couple of CSS support engineers (and doing a little research), the following is the best answer to what we were experiencing:

First, let me explain the kernel group (K-group, aka processor group) concept. In Windows 2008, the maximum number of logical processors was 64. With core counts per processor reaching 10 and beyond, exceeding the 64-logical-processor maximum with hyperthreading became very likely. In Windows 2008 R2, Microsoft added 4 K-groups, each with a maximum of 64 logical processors, so the operating system could support up to 256 logical processors.

So, a processor group looks like this:

For a four-socket machine, a K-group can contain multiple NUMA nodes, each with a physical processor that contains multiple logical processors. The key is that each K-group has a maximum of 64 logical processors. More info about NUMA.

The following is the best representation of what happens when you have 4 sockets populated with 10-core processors:

As Windows 2008 R2 starts up, it determines it is going to need two processor groups, since 80 logical processors exceeds the 64 maximum per K-group. Windows "balances" the logical processors between the two groups, putting 40 in group 0 and 40 in group 1.

The trouble comes when SQL Server 2008 (and, as it happens, SQL Server 2008 R2) starts up. Since SQL Server 2008 does not understand K-groups, it only uses K-group 0, which has just 2 physical processors and 40 logical processors.
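
You can also see this from inside SQL Server by looking at the NUMA node layout the instance has built. A sketch of the kind of query I mean follows; on the affected servers, only the nodes belonging to K-group 0 showed up, accounting for 40 schedulers in total:

-- One row per NUMA node that SQL Server can see (excluding the DAC node).
SELECT node_id,
       memory_node_id,
       online_scheduler_count,
       cpu_affinity_mask
FROM sys.dm_os_nodes
WHERE node_state_desc <> 'ONLINE DAC';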

In summary, if you have an installation of Windows 2008 R2 with SQL Server 2008 or 2008 R2 and you're only seeing two of your four installed processors in the DMV sys.dm_os_sys_info, then you are most likely suffering from the same issue described in this blog.

By the way, you are also suffering from half of your memory being accessed as foreign memory instead of local memory (cross-NUMA access), which can be a performance issue. In addition, the graph below shows the results of disabling hyperthreading in this scenario.

The dotted line marks the point where hyperthreading was disabled. This was the weekend before a client migration that increased transactions considerably (shown as Load), yet waits decreased considerably (MS-SQL) and CPU usage decreased considerably (CPU).

Solution: Disable hyperthreading for the scenario where you are running Windows 2008 R2 with SQL Server 2008 R2 or below. Use this blog as documentation to justify the change to management. Justifying and convincing management to turn hyperthreading off was a challenge, and the system engineers' argument that all servers should have hyperthreading either enabled or disabled across the board made it even more of one.

The fact that the servers didn't have matching resources should have been justification enough to change only the servers experiencing the issue, until further testing could confirm the proper hyperthreading setting for the client's workload.

By the way, Denny blogged about this same issue here.

Related Post: As The World Turns: SQL Server NUMA Memory Node and the Operating System Proximity by Bob Dorr

Having worked for very large enterprise corporations, I expected the effort to gain access to the environments to be quite difficult, but it's still frustrating.

I've been brought in to do performance tuning following a migration from Sybase to SQL Server 2008. The migration will continue through the end of my contract. Two weeks in, I'm still not sure what the environments look like. Their UAT environment is an exact copy of production, according to the database team.

The team seems to be quite competent and very willing to help.

First lesson, patience. More to come once I dig into it.


Today I take the leap and join many of my friends in the big world of consulting. Having worked my whole life for "the man," I found an opportunity that pushed me out of my comfort zone while doing something I really enjoy: SQL Server.

The first contract out of the gate, starting August 12th, is with Barclays Capital in New York and should last about 90 days.

I'll check in frequently to let you know how it is going.