Server load problems are very common, even now that hardware is affordable. A high load can have a number of causes, such as inadequate RAM or CPU, slow hard drives, or simply unoptimized software. This article will help you identify where the bottleneck is and where you need to invest. However, please do not take it as a replacement for professional advice or service. If you can afford the associated costs, you should always seek professional help.

I) First of all, are you really in trouble?

Typically, people check the load in a control panel or with the “uptime” or “top” command. You could run “uptime” in your root shell to find out what the load is, but I’d like you to use “top” for now (please). It will also show you how many CPUs are being reported*. You should be able to see entries like cpu00, cpu01, etc.
A load of ~1 per CPU is reasonable. For example, it’s fine if the load is 3.50 and you have 4 CPUs.
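The rule of thumb above is easy to turn into a quick check. This is only a minimal sketch: `load_verdict` is a hypothetical helper name, and the live example in the comment assumes a Linux box where `nproc` and `/proc/loadavg` are available.

```shell
# load_verdict LOAD CPUS
# Prints "ok" when the load is at most 1 per reported CPU, "high" otherwise.
# (Hypothetical helper; the threshold is the article's rule of thumb.)
load_verdict() {
    awk -v avg="$1" -v ncpu="$2" 'BEGIN {
        if (avg / ncpu <= 1) print "ok"; else print "high"
    }'
}

# The article's example: a load of 3.50 on 4 reported CPUs.
load_verdict 3.50 4

# On a live Linux server (assumes nproc and /proc/loadavg exist):
#   load_verdict "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

Treat the verdict as a first hint only. The checks later in the article tell you what is actually causing the load.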

Another thing to consider when looking at the load through uptime or top is what the numbers actually show. For example (on a server with 2 HT CPUs, reported as 4):

18:30:55 up 17 days, 5:17, 2 users, load average: 4.76, 2.97, 2.62

The first figure (4.76) is the load average over the last 1 minute, while the second (2.97) and the third (2.62) are the averages over the last 5 and 15 minutes respectively. That’s probably just a spike that I wouldn’t be too worried about (a little careless?), but if you are, read on!
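If you want to pull those three figures out of the uptime line in a script, something like the following should do. It is a sketch under assumptions: `load_averages` is a hypothetical helper name, and it expects a Linux-style line ending in "load average: a, b, c" with dots as decimal separators. The three figures are the 1, 5 and 15 minute averages, in that order.

```shell
# load_averages LINE
# Prints the three load averages from an uptime-style line, space separated.
# (Hypothetical helper; assumes the Linux "load average: a, b, c" layout.)
load_averages() {
    printf '%s\n' "$1" | sed -n 's/.*load average: *//p' | tr -d ','
}

# The example line from the article.
line='18:30:55 up 17 days, 5:17, 2 users, load average: 4.76, 2.97, 2.62'
load_averages "$line" | {
    read -r one five fifteen
    echo "1 min: $one, 5 min: $five, 15 min: $fifteen"
}

# On a live server:
#   load_averages "$(uptime)"
```

A 1 minute figure well above the 15 minute figure, as here, points to a recent spike rather than a sustained overload.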

So you’ve established that your server really is overloaded? I’m sorry to hear that. Keep in mind, though, that servers can sometimes handle much more than the load figure suggests: load averages are a rough indicator and should not always be the final deciding factor. Confused? That was just a technical aside that you don’t have to worry about too much. Read on if your loads are anything to worry about.

*Note the use of the term “reported”. I used this term because a P4 CPU with HT technology will be reported as 2 CPUs even though you know your server has a single CPU.

II) Where is the problem?

To identify the problem, you need to run a series of logical tests (well, it’s not as scary as it sounds). All you need is some free time, probably 30-45 minutes, and root access to your server (don’t expect magic ;)). Ready to start? Let’s go!

Note: Do the checks several times to reach a good conclusion.

1. Check RAM usage (most common bottleneck!).

# free -m

The output should be similar to this:

# free -m

             total       used       free     shared    buffers     cached
Mem:          1963       1912         50          0         28        906
-/+ buffers/cache:        978        985
Swap:         1027        157        869

Any reaction like “Oh God, almost all the RAM is used up”? Do not panic. Take a look at the “-/+ buffers/cache” line: it says that 985 MB of RAM is still effectively free, because memory used for buffers and cache can be reclaimed when needed. As long as enough memory is available on that line and your server isn’t using much swap, you’re pretty good on RAM. When RAM runs out, your server starts using swap (just like the Windows pagefile), which is part of your disk allocated as memory. It is comparatively very slow and can slow down your system further if you have a busy hard drive (which you most likely do if you’re using that much RAM). In short: you want at least 175 MB free on the buffers/cache line and no more than 200 MB of swap in use.
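Those two thresholds can be checked mechanically. The sketch below is an illustration only: `ram_verdict` is a hypothetical helper name, and it understands only the older `free -m` layout with a "-/+ buffers/cache" line. Newer versions of free drop that line and print an "available" column instead, in which case this sketch just prints "unknown".

```shell
# ram_verdict: reads `free -m` output (older layout with a
# "-/+ buffers/cache" line) on stdin and applies the article's thresholds:
# at least 175 MB free after buffers/cache, at most 200 MB of swap in use.
# (Hypothetical helper name.)
ram_verdict() {
    awk '
        /buffers\/cache:/ { avail = $NF }      # last column: free incl. buffers/cache
        /^Swap:/          { swap_used = $3 }   # third column: swap in use
        END {
            if (avail == "" || swap_used == "") { print "unknown"; exit }
            if (avail >= 175 && swap_used <= 200) print "ok"; else print "low"
        }'
}

# The example output from the article.
ram_verdict <<'EOF'
             total       used       free     shared    buffers     cached
Mem:          1963       1912         50          0         28        906
-/+ buffers/cache:        978        985
Swap:         1027        157        869
EOF

# On a live server with the older layout:
#   free -m | ram_verdict
```

For the article's example the result is fine: 985 MB is free after buffers/cache and 157 MB of swap is in use.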

If RAM is the problem, you should probably look into optimizing your PHP/Perl scripts, your MySQL queries and server settings, and Apache.

2. Check if the I/O (input/output) usage is excessive

If there are too many read/write requests on a single hard drive, it will slow down and you will have to upgrade to a faster drive (with higher RPM and more cache). The alternative to a single faster drive is to split the load by spreading the most requested content across multiple drives, which can be easily achieved using “symlinks” (soft links to files/folders). To find out whether an I/O problem is slowing down your server:

# top

Read the “iowait” value for each CPU (newer versions of top abbreviate it as “wa”). Ideally, it should be close to 0%. However, if you are checking at the time of a peak load, recheck these values several times to reach a good conclusion. Anything above 15% is worrisome. Next, you can check your hard drive speed to see if it’s really lagging:

If you know that your hard drive is /dev/sda or /dev/hda, simply run the command below. Otherwise, run the “df -h” command first to check which drive your data resides on.

# hdparm -Tt /dev/sda

The output:

/dev/sda:
 Timing cached reads:   1484 MB in  2.01 seconds = 739.00 MB/sec
 Timing buffered disk reads:   62 MB in  3.00 seconds =  20.66 MB/sec

The cached reads figure looks impressive, but it mostly reflects the speed of the system’s memory and cache rather than the disk itself. The buffered disk reads, however, are only 20.66 MB/sec. Anything under 25 MB/sec is something you should be concerned about.
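To apply the 25 MB/sec rule without reading the numbers by eye, you could pick the buffered read speed out of the hdparm output. This is a sketch, not a definitive tool: `buffered_mb_per_sec` and `disk_verdict` are hypothetical helper names, and the parsing assumes the usual "Timing buffered disk reads: ... = N MB/sec" line.

```shell
# buffered_mb_per_sec: reads `hdparm -Tt` output on stdin and prints the
# buffered disk read speed (the number just before the trailing unit).
buffered_mb_per_sec() {
    awk '/buffered disk reads/ { print $(NF - 1) }'
}

# disk_verdict SPEED: applies the article's threshold of 25 MB/sec.
disk_verdict() {
    awk -v speed="$1" 'BEGIN { if (speed + 0 < 25) print "slow"; else print "ok" }'
}

# The example output from the article.
sample='/dev/sda:
 Timing cached reads:   1484 MB in  2.01 seconds = 739.00 MB/sec
 Timing buffered disk reads:   62 MB in  3.00 seconds =  20.66 MB/sec'

speed=$(printf '%s\n' "$sample" | buffered_mb_per_sec)
echo "$speed MB/sec: $(disk_verdict "$speed")"

# On a live server (as root, adjust the device name):
#   hdparm -Tt /dev/sda | buffered_mb_per_sec
```

As with the other checks, run the measurement a few times, preferably when the server is not at peak load, before drawing a conclusion.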

3. Has all the CPU power been consumed?

# top

Check the top output to find out whether you are using too much CPU power. Look at the “idle” value next to each CPU entry. Anything below 45% is something you really should be concerned about.
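Both the idle rule here and the iowait rule from step 2 can be checked from a single CPU line of top. The sketch below is an assumption-heavy illustration: `top_cpu_value` is a hypothetical helper name, the sample line and its numbers are made up, and the parsing only covers the common "98.3%id" and "98.3 id" styles. Very old versions of top spell the names out differently, so adjust the keys if yours does.

```shell
# top_cpu_value LINE KEY
# Prints the percentage for KEY (for example "id" or "wa") from one CPU
# line of top. (Hypothetical helper; handles "98.3%id" and "98.3 id".)
top_cpu_value() {
    printf '%s\n' "$1" | awk -v key="$2" '{
        sub(/^[^:]*:/, "")                  # drop the "Cpu0 :" label
        n = split($0, parts, ",")
        for (i = 1; i <= n; i++) {
            if (parts[i] ~ (key "$")) {     # entry ends with the key
                gsub(/[^0-9.]/, "", parts[i])
                print parts[i]
            }
        }
    }'
}

# A made-up sample line, for illustration only.
line='Cpu0  : 30.2%us,  5.1%sy,  0.0%ni, 40.5%id, 24.2%wa,  0.0%hi,  0.0%si,  0.0%st'
idle=$(top_cpu_value "$line" id)
iowait=$(top_cpu_value "$line" wa)

# Apply the article's thresholds: idle below 45%, iowait above 15%.
awk -v idle="$idle" -v wa="$iowait" 'BEGIN {
    if (idle + 0 < 45) print "idle " idle "%: CPU is under pressure"
    if (wa + 0 > 15)   print "iowait " wa "%: disk I/O looks like the bottleneck"
}'
```

On a live server you would feed it lines from something like `top -b -n 1`, but the exact layout depends on your version of top, so check the output by eye first.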

III) Problem identified, what is the solution?

To conclude, let me offer some solutions for each problem:

A global solution to all of these problems is to optimize MySQL and the web server, including your PHP/Perl scripts and queries. At the very least, you can tune the Apache and MySQL server parameters to make them perform better.

1. Too much CPU usage

In the output of “ps -auxf” or “top”, look for processes that are using too much CPU. If it’s the web server (httpd) or MySQL, you’re better off optimizing your scripts and queries, if possible. In most cases it is extremely difficult to optimize all scripts and queries, and a better option is simply to upgrade the CPU. A dual-CPU setup should work better, but the type of upgrade to look for depends on your current CPU.
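Rather than scanning the whole process list by eye, you can sort it by CPU usage. This is a small sketch: `top_cpu_procs` is a hypothetical helper name, and the sample rows (PIDs, percentages) are invented for illustration.

```shell
# top_cpu_procs [N]
# Reads "PID %CPU COMMAND" rows (no header) on stdin and prints the N rows
# with the highest %CPU first (default 5). (Hypothetical helper name.)
top_cpu_procs() {
    sort -k2,2nr | head -n "${1:-5}"
}

# Invented sample rows, for illustration only.
top_cpu_procs 2 <<'EOF'
 1201  3.1 sshd
 2377 61.4 mysqld
 2410 22.8 httpd
EOF

# On a live server (tail drops the header line printed by ps):
#   ps -eo pid,pcpu,comm | tail -n +2 | top_cpu_procs 5
```

If the same one or two processes stay at the top over several runs, that is where to start optimizing.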

2. RAM is exhausted

You’re in much the same situation as with the CPU. Optimize the web server, MySQL, scripts, etc., or go for a RAM upgrade. You can also install opcode caching software such as APC (from PECL) for PHP to make it perform better while lowering the load.

3. The drive is all used up (hey, I don’t mean space)

Here you have to go for a faster drive, such as SATA over plain IDE, or SCSI over SATA. That’s speaking in general, of course: you also have to consider factors like RPM and cache to end up with a worthwhile upgrade. The second option is to get multiple drives of the same class and spread the load across them. A common approach is to serve MySQL from a second drive.
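Spreading content across drives with symlinks boils down to moving a directory and leaving a soft link behind at the old path. The sketch below shows the idea in a scratch directory. `relocate_with_symlink` is a hypothetical helper name, and "drive1" and "drive2" merely stand in for the mount points of two real drives. For anything a running service uses (a MySQL data directory, for instance), stop the service first and check ownership and permissions afterwards.

```shell
# relocate_with_symlink SRC DEST_PARENT
# Moves directory SRC under DEST_PARENT and leaves a symlink at the old path,
# so existing paths keep working. (Hypothetical helper name.)
relocate_with_symlink() {
    src=$1
    dest="$2/$(basename "$src")"
    mv "$src" "$dest" && ln -s "$dest" "$src"
}

# Demo in a scratch directory; drive1 and drive2 stand in for mount points.
demo=$(mktemp -d)
mkdir -p "$demo/drive1/downloads" "$demo/drive2"
echo "big file" > "$demo/drive1/downloads/file.bin"

relocate_with_symlink "$demo/drive1/downloads" "$demo/drive2"

cat "$demo/drive1/downloads/file.bin"   # still readable through the old path
rm -rf "$demo"
```

The files now live on the second "drive", while everything that refers to the old path keeps working through the link.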

IV) Conclusion

Wasn’t that helpful? My article might be flawed, sorry about that. It’s my first article, and it really used up quite a few of my brain cells. That’s a bit personal, isn’t it? Let’s get back to business.

FYI, in the example above, the problem was I/O usage: the hard drive had become too slow.

A guide can never be complete in itself or offer you everything you need to reach expert level (you have to keep learning to get there). Whenever in doubt, hire experts to check your server. And if you don’t have money to spend, you’re still safe! You can head over to our server optimization help section for help with optimizing your server.
