Hosting Partners  |  About Us  |  Blog  |  Legal  |  Portal Login

The Planet Blog

 
Archive for the ‘Tips and Tricks’ Category

Ben KeenerIn Know Thy Backups – Part I, we started discussing the most common strategies of backing up your data, and before we continue that discussion, I should clarify that we’re not talking about hardware configurations like RAID or backup products like Evault and Data Protection Servers. These backup schemes can be executed without spending a dime on additional equipment or resources. While there are best practices and recommendations for making backups and keeping them safe, if your budget is limited, you can protect and preserve your data using one of these schemes on your local workstation or on a secondary drive in your server.

When we looked at the full server and simple incremental backups in our previous post, we noticed a significant limitation: losing a single backup can be catastrophic to restoring data. In the next two schemes, we’ll evaluate solutions that protect us from this vulnerability.

Differential Incremental Backups

A differential scheme requires a full backup reference point and then makes a backup of all changes to the server from that reference point on each subsequent backup. This method requires more storage space than incremental backups but generally doesn’t need as much space as a full backup.

Based on the volume of changes made between the first backup, the reference point and the current backup, differential incremental backups may require additional server resources than an incremental backup. Simple and multi-level incremental backups constantly update the reference point with minimal load, while differential backups update the reference point with a new full backup.

Example: Differential Incremental Backups

As in the previous example, I am using a schedule of backups that starts with a full backup on Sunday, with additional backups on the following days. This time, I’m using differentials. Let’s say that on Thursday I find some inconsistencies in the database when compared to the paper files I received from a vendor. After investigating, I find that my database is corrupted. I determine that I will not be able to recover the database as it is, so I review my backups.

Somehow, I cracked the DVD that my Tuesday backup was stored on, but all of the other discs are here. I start by restoring the Sunday backup and then the Wednesday backup, hoping the corruption occurred after the backup was made. Thankfully, the restoration works, and we are up and running again after losing minimal data. If I had been using simple incremental backups, I would have been able to restore only up to Monday because Tuesday’s backup disc was broken.

Multi-level Incremental Backups

There’s a more granular and robust backup scheme that is less vulnerable than simple incremental backups and less server-intensive than differential backups: The multi-level incremental backup. Multi-level increments assign a level to each backup and then make a comparison against the last lower-level backup made. Only the changes between the reference point and the current data are saved.

This arrangement allows you to design a backup scheme around your needs and the capabilities of your server, and you can decide how many backups you will need for a full restoration to the latest restore point. You will control the number of backups required for a given restore by determining the number of levels in the system. In the event of a disaster, you need a single backup of each level, and each higher level backup must use the lower level as its reference point.

Example: Multi-Level Incremental Backups

This time I am in charge of a Sendmail server that is always under heavy stress. Because this server is extremely important to my business, I need to ensure both its availability and responsiveness at all times. I also need to maintain archives of the e-mail on the server. To do this, I decide to implement a multi-level incremental backup scheme since I need more granular backup configuration that does not generate a great deal of load on the server. This scheme meets that need. It still retains the weakness of incremental backups, but I partially mitigate those weaknesses with scheduling.

At the first of every month, a full backup is scheduled. This is my Level 0 backup, and it is named level0.name of the month. The following day I run a Level 1 backup. This backup holds only the changes since the most recent Level 0 copy called level1.first.name of the month. The subsequent days of that week, I create a Level 2 backup called level2.first.day of the week.name of the month. This process continues until the Sunday after the first Level 2 backup.

On the next Sunday, I make another Level 1 backup called level1.second.name of the month. The subsequent days of that week, I make Level 2 backups called level2.second.day of the week.name of the month. I continue in this vein with every Sunday being a Level 1 backup and the rest of the week being Level 2 backups until the end of the month. On the first day of the next month, I start all over with another Level 0 copy.

I make certain to save multiple copies of the files after I test the archive. I also check to be certain it’s not corrupted, to minimize the risk of data loss through a faulty archive. This scheme allows me to restore to any point within the month in just three steps, as long as all of the archived backups work.

If I need to restore the data from April 17, 2009, I would need the archives for level0.april, level1.third.april, and level2.friday.third.april. I would restore them in sequence from Level 0 to Level 1 to Level 2.

Choosing Your Backup Scheme

As I said in the beginning of this post, these backup schemes are available to you without the use of an additional server or any expensive backup management software. All of the above are viable options for making your backups; however, not every scheme is perfect for every situation. You should review your requirements and the available resources to determine which scheme best fits your needs.

-Ben

Ben KeenerMore often than not, server backups are misunderstood. With dozens of hardware options and hundreds of software options, finding the right backup can be intimidating. To assuage some of those fears and clear up a bit of that confusion, let’s go over a few of the most common backup schemes. This list isn’t all-inclusive, and the options presented shouldn’t be mistaken for backup plans. A backup scheme is simply a method of creating backups. A backup plan (or disaster recovery plan) is a scheduled implementation of a backup scheme. As we evaluate each scheme, we’ll look at the requirements, costs and benefits, and by the end of our tour, you can decide which best fits your business.

Before we get too far into the specifics of the different schemes, we should define some fundamental terms that we’ll use throughout the comparison:

  • An archive is a set of data that is being preserved
  • A reference point is a single archive against which comparisons are made
  • A restore point is the most recent working backup

The key question a backup scheme answers is this: “If a server suffers a catastrophic failure, what is needed to resume operations with minimal downtime and data loss?” Again, the backup scheme is not a complete disaster recovery plan — its focus is the restoration of data.

The four basic backup schemes we’ll compare are full-server backups, simple incremental backups, multi-level incremental backups and differential incremental backups. The primary considerations about the method that should be used are the server load generated by the backup process, the backup file size, and the speed with which a backup can be restored.

Full Server Backups

A full server backup is one of the simplest methods for a backup scheme. It takes only a single backup archive to create a restore point, which makes data restoration simple and fast. The drawbacks are the amount of time it takes to make the backup, the load it generates, and the total size of the backup. Each backup scheme we’re comparing uses a full backup of the server.

As we evaluate the other schemes, you’ll note they all start with a full backup as a reference point, and create their own restore points as they move forward.

Simple Incremental Backups

A simple incremental backup attempts to resolve some of the issues with full backups, and it does a good job. With an incremental backup, a single full backup is made that serves as both a restore point and the initial reference point. On subsequent backups, it becomes a little more complex. Instead of making a new full backup when it is updated, this scheme compares the current state of the server against the state of the server as it was in the reference point (the first full backup). If it locates any changes, it backs up those changes and generates a new snapshot of the drive as another reference point. This new reference point is then used for the next incremental backup.

This backup structure means the restore point on a server with this backup will consist of the initial reference point and all subsequent incremental backups that use this reference point. This dependency is the primary weakness in simple incremental backups: All of the backups — from the original reference point to the incremental additions recording changes from the reference point — must be uncorrupted and complete for the backup to fully restore the data. If any backup is missing, corrupt or incomplete, the restoration can’t be completed.

The server load created and storage space required for this type of backup is generally less than what you’ll see in a full backup scheme, especially when there aren’t many differences between the backup point and the reference point. On the other side of the spectrum, if the entire data set changes between backups, the storage requirements and server load will be the same as they were when full backups were being performed.

Example: Simple Incremental Backups

I am implementing incremental backups for a database that houses all of my users’ data. I decide I am going to start with a full backup each Sunday — the slowest day of the week for the database — and do an incremental backup on each subsequent day. This process starts over again every Sunday. On Friday, my server suffers a catastrophic hard drive failure. I am told by the technician who replaced the drive that the controller failed, and the heads were idly tapping the side of the drive cage. Everything on the drive is lost.

I gather my backups and begin to restore them on the new replacement drive. The backups from Sunday, Monday and Tuesday restore without a hitch, but Wednesday’s backup is corrupted and will not complete. This means I have lost all of the data from Wednesday and Thursday. Without Wednesday’s backup, the rest of my incremental backups are useless.

There are two incremental backup schemes that attempt to address this issue: the differential and the multi-level incremental backup schemes. In Part II of “Know Thy Backups,” we’ll explain the pros and cons of these methods, and you’ll be ready to plan your backup strategy.

-Ben

Matthew BoehmOften, developers and administrators come across situations in which the only apparent solutions are overly complex and require significant effort to implement. Most of the time this happens because they don’t know something else – and usually something easier – exists.

This practice was evident in a recent case where a customer didn’t know it was possible to do table joins inside an UPDATE statement.

Here’s an example of a task he was faced with: You have a user table and a user_subscription table; the former contains information about each user, while the latter contains start and end dates for particular subscription products for each user. You want to extend the access period for each user that lives in the UK using product #5.

A common way of tackling this is to create a temporary table and populate it with all of the userIds from the user table that has the country = 'UK' property, then update the subscription table specifying only those userIds in the temporary table:

CREATE TEMPORARY TABLE ukUsers (userId INT(11));
INSERT INTO ukUsers SELECT userId FROM user WHERE country = 'UK';
UPDATE user_subscription 
  SET expireDate = DATE_ADD(expireDate, INTERVAL 3 DAY)
  WHERE userId IN (SELECT * FROM ukUsers) AND productId = 5;

While the code outlined above gets the job done, it could be better written. Let’s see how the SELECT statement could be used to retrieve the user information (i.e. get all UK users with subscription to product #5):

SELECT u.firstName, u.lastName
FROM user u LEFT JOIN user_subscription s USING(userId)
WHERE u.region = 'UK' AND s.productId = 5;

Now, let’s translate the above get into an UPDATE statement:

UPDATE user u LEFT JOIN user_subscription s USING(userId)
SET s.expireDate = DATE_ADD(s.expireDate, INTERVAL 3 DAY)
WHERE u.region = 'UK' AND s.productId = 5;

It’s as simple as that! We no longer need the temporary table, and it saves us a couple of extra SQL commands.

For more information on MySQL UPDATE syntax, visit: http://dev.mysql.com/doc/refman/5.1/en/update.html

-Matthew

Kevin HazardThe top vote-getter in Katie’s quick survey about what you’d like to see more of on The Planet Blog was “Contests/Competitions,” so we’ve got a quick and easy way for all of our customers to enter to win a Flip MinoHD camcorder:

Take Our 6-Question Hosting History Survey!

Flip MinoHD

It is quite possible to thoughtfully complete the entire survey in about 25 seconds, so don’t miss your chance to join in on the fun!

Click Here to Take the Survey.

We’re always looking for ways to involve you, learn more about you and give away some cool stuff, so let us know if you’ve got any other ideas for competitions or games.

The random drawing for the camcorder will occur on April 8, so get your entry submitted before 11:59pm CDT on April 7 to be eligible.

One entry per customer. One account number submitted in Question 1 of the survey will be selected at random as the winner. The winner will be notified via an email to the master user on the account.

-Kevin

George Govantes“What counters are important when troubleshooting SQL Server with Windows’ Perfmon (Performance Monitor)?”

As a database administrator architect, I can’t tell you how many times I’ve gotten that question from Windows administrators and new database administrators, and it’s a pretty daunting question to answer. When a SQL instance is installed on a server, there may be more than a thousand new performance counters added to Perfmon and to the Performance Monitor section of the Reliability and Performance Monitor in Windows 2008, so where does one start?

When it comes to performance tuning and troubleshooting, SQL Server has not changed much over the years. In a nutshell, it is all about CPU, memory and I/O. To reduce the time it takes to determine the bottleneck in a given environment, we should isolate the counters to these specific areas.

Of the many performance counters that can be selected when troubleshooting a SQL Server, these key indicators can help to quickly isolate bottlenecks and direct your investigation to the appropriate resources for corrective action:

At the Operating System Level

CPU Counter

Server Work Queues > Queue Lengths
Description: Queue Length is the current length of the server work queue for this CPU.
Value: A sustained queue length greater than twice the number of CPU cores might indicate processor congestion.

Memory Counter

Memory > Pages/Sec
Description: A page fault occurs when the operating system (OS) cannot find the requested information in its physical memory, forcing the OS to seek the information at the disk level. A soft page fault is when a page is found elsewhere in the physical memory, and a hard fault requires disk access. Most processors can handle large numbers of soft faults without significant consequence. However, hard faults, which require disk access, can cause significant delays.
Value: This value should stay below 20-25 pages per second.

I/O Counters

Physical Disk > Average Read Queue Length
Physical Disk > Average Write Queue Length
Description: Avg. Disk Queue Length is the average number of read or write requests that were queued for the selected disk during the sample interval.
Value: The value for this counter should always be under 2. This is the most reliable counter to use when the SQL instance is using external SAN storage for its databases.

Determining Application Resource Usage

Process Counter:

From Perfmon, select “Process” then select the application in question – in this case, sqlservr.
SQL Perfmon
This will add all of the operating system counters associated with the application.
SQL Perfmon
For SQL Server, the counters usually reflect information about the internal operations of the SQL instance that may not show up as a problem at the operating system level. For example, a customer might call to report that the server hangs for several moments throughout the day, but there are no errors in the application or system logs, and the OS Performance counters show all counters within normal limits.

These SQL performance counters can provide deeper insight into how the database engine is working against the customer’s application to help in your investigation.

SQL Server

SQL Server: General Statistics > User Connections
Description: Number of users connected to the system.
Value: This number will vary between applications on a given server.

SQL Server

SQL Server: Locks > Lock Waits/Sec
Description: Number of lock requests that could not be satisfied immediately and required the caller to wait before being granted the lock.
Value: This number should be near 0. If the value is greater than 0 for a sustained period, the “system hanging” as described by the customer is most likely due to locking and blocking issues within the database.

SQL Server

SQL Server: Buffer Manager > Buffer cache hit ratio
Description: The percentage of database pages that were found in physical memory. SQL Server stores information inside of 8KB data pages.
Value: The buffer pool cache hit ratio should be 100%. Values below 90% indicate that SQL Server is experiencing memory pressure.

SQL Server

SQL Server: Buffer Manager > Page life expectancy
Description: The number of seconds that a database page remains in physical memory.
Value: Values under 300 seconds indicate that the SQL instance is experiencing memory pressure.

SQL Server

SQL Server: Buffer Manager > Page reads/sec
SQL Server: Buffer Manager > Page writes/sec
Description: These values show the number of database reads and writes requested.
Value: Values for these counters will vary between database applications, but this information is useful when determining if SQL Server is the primary application using the disk. If the Buffer Manager page read-writes are low but disk-queue lengths are high, there might be a disk bottleneck. If the Page read-writes are higher than normal, a memory shortage is likely to exist.

SQL Troubleshooting Summary Chart

Counter Preferred Value
Server Work Queues > Queue Lengths < 2 per processor
Memory > Pages /Sec < 20 page faults per second per processor
Physical Disk > Average Read Queue Length
Physical Disk > Average Write Queue Length
< 2 x the number of spindles
SQL Server: General Statistics > User Connections Varies
SQL Server: Locks > Lock Waits/Sec < 500MS
SQL Server: Buffer Manager > Buffer cache hit ratio > 95
SQL Server: Buffer Manager > Page life expectancy > 300
SQL Server: Buffer Manager > Page reads/sec Value should be used to confirm I/O or memory bottlenecks
SQL Server: Buffer Manager > Page writes/sec Value should be used to confirm I/O or memory bottlenecks

The performance counter suggested may not help resolve complex situations that involve application-level performance issues, but in my experience, it is a great starting point.

-George

Laurence SimonHi there. It’s me again. I’m still in the trenches, trying to find a decent Reuben sandwich in the tunnels.

Daylight saving time ended last month in Texas, and as I changed my clocks, I dutifully replaced the battery in my smoke alarm, just like the public service announcements advised. As I balanced on a chair on top of my coffee table, I pondered: Why is the smoke alarm running on a battery and not, say, plugged into the wall?

Smoke Alarm

Well, in a fire, power can get interrupted, so it’s best to have the smoke alarm running on an independent source of power that you can test and replace when necessary. In a way, this applies to your server, too.

Since my previous post, I’ve counted up a total of 18 incidents where a customer did not get notification of an update to a ticket or a critical message regarding a change to our services. Every one of those missed communications could be traced to an e-mail address residing on the affected server. Each one of those customers suffered downtime because they either didn’t receive an alert in a reasonable amount of time or they didn’t respond to a technician who was ready and waiting to assist them.

And those incidents were just the ones that I fielded … there’s no telling how many folks out there have their “server smoke detectors” hard-wired into the wall.

It’s critical that you keep your contact information up-to-date in Orbit and that you provide us with an external e-mail address. Same thing goes with the monitoring system: Use a contact address that will work if the server goes down.

The best solution I’ve found is to use a third-party provider like Gmail, Hotmail or Yahoo Mail, and forward those contacts/alerts to your primary mailbox (which can be on your server). By setting your contact address to a third-party provider first, you can ensure that you will always be able to access any important notifications. If you use a mail program like Outlook or Thunderbird, you can have the program check your third-party mailbox in parallel with your primary mailbox, eliminating the need to set up an auto-forward rule.

My advice: Do both.

Yes, I know that means you’ll get two messages for every one delivered to that emergency line, but in the end, isn’t it better to be over-notified of a potential emergency than to not hear about it at all?

-Laurence

Matthew BoehmLet’s face it — we’ve all seen some bad code in our time. During my eight-plus years of using PHP, I’ve inherited several legacy systems, and when reading over the previous programmer’s code, I often sit back and say, “What the heck was this guy thinking?”

Combine a programmer’s bad code with his or her lack of understanding about how databases work, and you’ve got one inefficient, unresponsive Web site.

The best example of this that I’ve encountered was one programmer’s attempt simply to get the number of rows in a table with a particular column value. This was his solution:

$rowCount = 0;
$res = mysql_query("SELECT * FROM users_activity");
while($row = mysql_fetch_row($res))
{
  if($row[‘activity] == ‘LOGIN')
  {
    $rowCount++;
  }
}

I hope all of you are cringing as badly as I did when I came across it the first time. For those of you that don’t understand why this is horribly inefficient, I’ll EXPLAIN.

MySQL has an often overlooked but extremely powerful built-in query analyzer that allows you to take any SELECT SQL statement, preface it with the keyword EXPLAIN, and get an analysis of the query with lots of information on how to optimize it.

Let’s use the above SQL statement on a table with over 4 million rows. Here’s the schema:

CREATE TABLE `users_activity` (
  `userId` int(11) NOT NULL,
  `activity` varchar(30) NOT NULL DEFAULT 'LOGIN',
  `notes` varchar(200) DEFAULT NULL,
  `timestamp` datetime NOT NULL DEFAULT '1970-01-01 00:00:00',
  KEY `userId_2` (`userId`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The first concern we have is that the query has no WHERE clause. Not having this clause requires MySQL to scan the entire table (full_table_scan), which usually involves reading the entire table from disk into memory, and if you only need to do this once, you just wasted all that memory. With EXPLAIN, MySQL can show us exactly how many rows it checks:

>EXPLAIN SELECT * FROM users_activity
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_activity
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4060232
        Extra:

MySQL will have to examine 4,060,232 rows to satisfy this query. Let’s add a WHERE clause and see how that helps:

>EXPLAIN SELECT * FROM users_activity WHERE activity = 'LOGIN'
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_activity
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4060232
        Extra: Using where

Hmm … Well, that didn’t change the number of rows evaluated, did it? The query optimizer still had to look at all the rows, one by one, to determine if the activity column was equal to ‘LOGIN’. Even with our WHERE clause, this query is still almost as bad as the original author’s PHP code. Why did MySQL have to do this? To answer that, take a look at the NULL value returned for possible_keys. This result shows that MySQL was unable to find an index for the column to help narrow down the search. Let’s add an index and see what happens:

>ALTER TABLE `users_activity` ADD INDEX (`activity`);

>EXPLAIN SELECT * FROM users_activity WHERE activity = 'LOGIN'
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users_activity
         type: ref
possible_keys: activity
          key: activity
      key_len: 92
          ref: const
         rows: 422436
        Extra: Using where

Very nice. We immediately reduced the number of rows that MySQL has to examine from 4,060,232 to 422,436. An 89 percent reduction in virtually no time at all.

This is just one of many scenarios in which using EXPLAIN can help you write better SQL statements. You can turn on the log_queries_not_using_indexes option to find out which SQL statements are not using indexes. By using this log and EXPLAIN, you can enhance the performance of your MySQL server by leaps and bounds.

For further information on EXPLAIN, visit http://dev.mysql.com/doc/refman/5.1/en/using-explain.html

-Matthew

 
 

Dedicated Servers

Managed Hosting

Colocation

Business Solutions

Why The Planet?

Contact Us