Virtualizing MySQL, are you stupid?

Or time-bound? Or don't you have any load on your DB at all?

I personally don't see many reasons to virtualize your database, apart from "we plan to start small and scale out" or "we need it now and we don't have the hardware yet". Putting your database on a virtual platform where you have to share resources with other virtual machines doesn't really sound like a tempting proposition to me. Small, almost idle databases, maybe. But enterprise production-level databases? No thanks.

Sheeri Cabral also mentions the above reasons, and there too, enterprise production use isn't listed.

Databases typically require a good amount of memory and steady disk access.
So if you are in a production environment with a fairly loaded database, would you want a 4GB machine with full direct memory access, or 3.5GB of virtual memory that can be ballooned down to 3GB if underused? My pick is the 4GB of real memory.
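
To put numbers on that comparison, here is a small Python sketch. The 4GB, 3.5GB and 3GB figures are the ones from the paragraph above; the function name is made up for illustration, and this only does the arithmetic, it measures nothing.

```python
def shortfall_pct(physical_gb, virtual_gb):
    """Memory given up by going virtual, as a percentage of the physical box."""
    return (physical_gb - virtual_gb) / physical_gb * 100

PHYSICAL_GB = 4.0      # real memory on the dedicated machine
VM_NOMINAL_GB = 3.5    # what the virtual machine is given on paper
VM_BALLOONED_GB = 3.0  # what is left after the balloon driver reclaims memory

nominal_loss = shortfall_pct(PHYSICAL_GB, VM_NOMINAL_GB)
ballooned_loss = shortfall_pct(PHYSICAL_GB, VM_BALLOONED_GB)

print(f"nominal: {nominal_loss:.1f}% less, ballooned: {ballooned_loss:.1f}% less")
```

That is an eighth of the memory gone on paper, and a quarter once the balloon kicks in.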

The original article at Sun argues for using virtual hard disks to move workloads between different servers or even virtualization platforms. But it fails to describe the guaranteed performance penalty of using a filesystem on top of a loopback device instead of raw disks. How many layers do you want before you actually write to the disks? Good practice in a virtual environment is to dedicate full disks or LVM volumes to the virtual machine, hence lowering the overhead, but most (default) setups do the opposite.

And don't get me started about the myth of using virtualization for high-availability :)

Now can somebody please remove all the clueless marketing people from , thanks. (They can be identified by a source and posts that mainly talk about Sun products, including only a slight hint of MySQL.)

(PS. What's a Market Development Engineer's job description anyhow? That's just a different name for Marketing Assistant, right?)


#1 Luke Crawford : Virtualization is a tool to ease consolidation.

Virtualization is just a tool to make consolidation easier. Consolidation saves you money when the server with the best value per unit of CPU/RAM/power leaves you with a server that is larger than you need. Consolidation, though, tends to reduce redundancy. Be careful with it. However, virtualizing MySQL can make sense in some cases.

The only reason to virtualize MySQL that I can think of is if you have 2 servers with way more RAM than you need. Say that, like me, you buy servers based on the dollar-per-RAM metric. Right now, this means your servers have 32GB RAM each. (I use two quad-core Opterons in each 32GB RAM server.) So say I have 2 servers with 32GB RAM each, running Xen. (I do, in various states of assembly and burn-in.) Further, say I only need 1GB of data on my MySQL server. I can easily carve off a 4GB RAM server from each Xen hardware host, and I can set up a MySQL cluster server with redundancy for much less than it would cost to build two 4GB servers.
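
A rough Python sketch of that carve-up, for anyone who wants to plug in their own numbers. The 32GB host and 4GB guest sizes are from the comment above; the 1GB held back for the hypervisor host is purely an assumption, and the function names are hypothetical.

```python
HOST_RAM_GB = 32      # RAM per Xen hardware host (from the comment)
GUEST_RAM_GB = 4      # RAM carved off for each MySQL guest (from the comment)
HOST_RESERVE_GB = 1   # assumption: RAM held back for the hypervisor host itself

def guests_that_fit(host_ram_gb, guest_ram_gb, reserve_gb):
    """How many equally sized guests fit on one host after the reserve."""
    return (host_ram_gb - reserve_gb) // guest_ram_gb

def ram_left_after(host_ram_gb, reserve_gb, guest_sizes_gb):
    """RAM still free on one host once the listed guests are carved off."""
    return host_ram_gb - reserve_gb - sum(guest_sizes_gb)

max_guests = guests_that_fit(HOST_RAM_GB, GUEST_RAM_GB, HOST_RESERVE_GB)
left_for_others = ram_left_after(HOST_RAM_GB, HOST_RESERVE_GB, [GUEST_RAM_GB])

print(f"up to {max_guests} guests of {GUEST_RAM_GB}GB per host")
print(f"{left_for_others}GB left per host after one MySQL guest")
```

With one MySQL guest on each of the two hosts, the redundant pair lives on separate hardware and most of each host's RAM stays free for other guests.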

#2 Ryan Thiessen : wow that's needlessly hostile

Here is a use case that I believe lends itself to virtualization. I am already using virtualized servers for this to a limited extent and am strongly considering expanding it.

* Slave servers
* Not accessed by customers
* Allowed to fall behind masters during peak load
* Tuned to use low amounts of memory
* Used for ad-hoc queries and source for data warehouse
* Source of logical backups (for auditing)
* Using shared storage

In addition to this case, I would add that because of MySQL's suboptimal CPU usage on large multicore systems, I can see use cases where servers that require little RAM could be aggregated, especially if they require limited writes. For example, depending on workload and dataset size, you could likely get more out of 4 VMs using 4G memory + 2 procs each than you could with a single process on an 8 proc machine with 16G memory.
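
As a sanity check on that example, a minimal Python sketch that adds up both layouts. The VM and machine sizes are the ones quoted above; the `Box` type and `totals` helper are just illustrative names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    ram_gb: int
    cpus: int

def totals(boxes):
    """Return (total RAM in GB, total CPUs) across a list of boxes."""
    return sum(b.ram_gb for b in boxes), sum(b.cpus for b in boxes)

four_vms = [Box(ram_gb=4, cpus=2)] * 4    # 4 VMs with 4G memory + 2 procs each
one_machine = [Box(ram_gb=16, cpus=8)]    # a single 8 proc machine with 16G

print("4 VMs:     ", totals(four_vms))
print("1 machine: ", totals(one_machine))
```

Both layouts add up to the same hardware budget, so the claim is about how MySQL uses the cores when they are partitioned, not about having more resources.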

#3 Vide : Check your math..

You need less disk access if you have few writes AND you can cache almost all reads, so you need a good amount of RAM. You're setting 2 constraints that are mutually exclusive, at least if your DB isn't really small.
To sum it up, I pretty much agree with Kris: there are some use cases for virtualized DBs, but not for real, heavy workloads.

#4 Kris Buytaert : Slave servers ?

If I were you I'd check out some Slave vs Master load documentation ..

#5 Anonymous : More of this stupid stupidity

Sometimes you become what you criticize. You left out quite a few benefits of virtualization by holding on to and generalizing your narrow view of what databases are useful for ("enterprise production level databases" sounds like marketing crap).

Virtualization has great applicability for a wide variety of applications; performance is not always the most important factor. Resource partitioning, isolation, security, simulation, checkpointing, etc. are just some of the advantages one gains when using virtualization.

You also make some dubious statements about "full direct memory access" and such. There are various types of virtualization, and very few of them impose heavier restrictions or overhead on access to memory from a user space process (nothing more than the usual in a non-virtualized environment). There are virtualization techniques that yield an overhead of less than 5%.

#6 Kris Buytaert : Funny

Funny how people absolutely fail to read beyond 1 post ..

But the saddest part is they fail to cut through the hype that is Virtualization today and keep following the vendors that push virtualization for everything.

#7 Anonymous : Vendors hyping technology..

Vendors hyping technology.. news at eleven! I guess I also shouldn't be surprised if bloggers don't provide coherent arguments and use catchy titles like 'stupid' to draw an audience.

#8 Kris Buytaert : Attracting vs the real argument

Yes, it was the marketeers that pissed me off.. (I even tagged the article that way..)
And that was also the biggest part of the message ..

And I admit to using their (the marketeers) tricks to get your attention :)

#9 Serge van Ginderachter : Hey, if one can run a

Hey, if one can run a production system on Microsoft Access, I'm sure running it on a virtualised mysql would be a great step forward.

But then again, that might not be enterprisey enough of course :)

#10 Keith Murphy : wow

Can you inject a little more venom into the discussion? Because I don't think you had QUITE enough.

Seriously, you would be remiss to dismiss virtualization as you are doing. For one thing, not every server has to be a high-end setup. Secondly, you might be amazed at what virtual servers actually can do. I work with Sheeri and we do have customers right now who are running some pretty good loads on AWS. While it isn't suitable for every situation, it is suitable for many. As for the whole high-availability "issue" you brought up: someone is an idiot if they think one instance, whether it is a server or a virtual server, is going to be online 24x7x365. You design around that. Is a virtual server any more reliable than a physical server? I don't know, but I wouldn't trust either, so it doesn't matter.

That's all I have time for, but seriously, wow... slow down.

#11 Bill Karwin : Market Development Engineer

A Market Development Engineer is someone who works in Marketing but knows that "memory" is RAM, not disk space. :-)

#12 Mike : Virtualization is an entirely valid strategy

Most DB implementations are NOT high-load customer-facing instances. While our team's (internal) production (non-)cluster of 6 machines serves about 1900 qps 24/7, we have dozens of internal and departmental-level databases that are a waste of stand-alone hardware. Virtualization is a huge win for these systems, and they outnumber our hardware-dependent systems by about 10 to 1. Furthermore, in the case of hardware failure under a database, sensible virtualization does not depend on hardware repairs, hardware procurement, or OS buildout and configuration in order to get a system back up and running.

We've also encountered real-world scenarios where virtualization has eliminated downtime: for instance, about a year ago, we had a planned datacenter outage to upgrade power to our primary facility. We were able to transparently migrate DNS, LDAP, and other plant services to another center without incurring *any* downtime along the way.

Virtualization is a huge win under the right circumstances.

#13 Amedee : Enterprise is a Marketing lie


I told Kris almost exactly the same last week at a geek dinner. Why waste physical resources on 80% of the databases (small ones)? Virtualize them, and recycle resources for the 20% Real Resource Hogs.
Virtualization can be a huge win, but only when used appropriately.

I wonder, was it my story last week combined with the marketing lingo on that triggered his rant threshold?

#14 Kris Buytaert : Nope you didn't

It really is the Marketing Lingo that set me off ..

#15 Kris Buytaert : Not the only option

You mention low load machines, perfect opportunity.

But there are probably other ways to migrate DNS, LDAP and other services without using Virtualization that are a lot less complex. I'm just trying to cut through the "Virtualize Everything" hype :)

It is a huge win in the right circumstances indeed.

#16 bertdb : other reasons for going virtual

Hey Kris, I think you're leaving out some major reasons for virtualizing applications, including databases.
How about: being able to migrate from one hardware generation to the next without doing a complete migration of OS and application.
How about: making 100% sure that you can easily restart your applications on your DR servers (even if you don't have them yet, and would just buy them on the spot after disaster struck).
You argue that a virtual harddisk is much slower than raw access because of extra FS layers. Not if these extra layers have been designed for exactly that ... think VMFS in VMware ESX. I know, there's a couple of percent of performance to be lost even with VMFS. But if you were really looking for those last few percent, you were looking for dedicated hardware (and therefore not virtualization) anyway.
Arguing that virtualization isn't an option because of shared resources means that you're not sizing your hardware correctly. Expecting to get all the HW performance in one VM isn't realistic in any scenario.
And IMnsHO, High Availability with VMs _is_ real, even though it stands as a distant second to OS or application level HA clustering. It shines when those two aren't available or too complicated. Then it is your only option, and a damn fine one I think.

#17 Kris Buytaert : LTNC :)

Ha Bob, since when do you have a blog :)

Migrating from one hardware generation to another is a valid option, but as mentioned in an earlier comment, Virtualization is not the only option. You could also do that using an application HA solution if available. Sometimes Virtualization adds another layer of complexity, sometimes it makes it easier.

Testing a DR scenario.. definitely one of the options on my Why Virtualization matters list ..

And I think you got my point.. if you care about those last %'s of performance, indeed, then why are you virtualizing?

You will size your hardware properly, and so will we, but most of the "cloud" vendors and ISPs won't. I've seen high-profile blogs move from one shared hosting provider to another, only to realize that most of them underestimate the needs of their customers and give them fewer resources than needed. Obviously that's also the work of a Market Development Engineer who wants to upsell his products.

And as I mentioned before .. WholeSale HA is a valid alternative, just not for everybody ..
But you have to know what you buy into :)