Scaling Drupal

John Quinn is writing about scaling Drupal. He is taking a one-step-at-a-time approach and is still writing his 4th and 5th stages.

His first step, obviously, is separating Drupal from its database by moving the database to a separate server, and he chooses MySQL for this purpose. Moving your DB to a different machine is a good thing to do.

However, he then gets this crazy idea of using NFS to share his Drupal shared files :(
(He even dares to mention that the ease of setup is good.) Folks, we abandoned NFS in the late nineties. NFS is still a recipe for disaster: it has performance issues, it has stability issues (stale locks), and no security admin in his right mind will tolerate portmap running in his DMZ.
(Also think about the IO path that has to be followed to serve a static file to a surfer when the file is stored on a remote NFS volume.)

On top of that, he adds complexity in a phase where it isn't needed yet. Because he needs to manage and secure NFS, and because he is storing his critical files on the other side of the ethernet cable, he has created a single point of failure he didn't need to create yet.
Yes, as soon as you start to scale you need to look at a scalable and redundant way to share your files.
When those files are pretty static, you'll start out with a set of rsync scripts, or scripts that push them to the different servers upon deploying your application. When they change often, you start looking into filesystems or block devices that bring you replication, such as DRBD or Lustre.
But if his NFS server goes down today he is screwed, much harder than when his database has a hiccup.
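
The push-on-deploy approach for mostly static files can be sketched roughly as follows. This is only an illustration, not John's setup or anyone's production script: the function name, the host names and the paths are made up, and it only builds the rsync command lines rather than running them.

```python
import shlex


def build_rsync_push_commands(source_dir, hosts, dest_dir, delete=False):
    """Return one rsync argument list per target host (hypothetical helper)."""
    # A trailing slash tells rsync to copy the directory contents,
    # not the directory itself.
    src = source_dir.rstrip("/") + "/"
    commands = []
    for host in hosts:
        cmd = ["rsync", "-az"]  # -a: archive mode, -z: compress in transit
        if delete:
            # Remove files on the target that no longer exist in the source.
            cmd.append("--delete")
        cmd += [src, f"{host}:{dest_dir}"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Hypothetical web nodes and paths, for illustration only.
    web_nodes = ["web1.example.com", "web2.example.com"]
    for cmd in build_rsync_push_commands("/var/www/drupal/files", web_nodes,
                                         "/var/www/drupal/files/"):
        print(shlex.join(cmd))
```

Each returned list could be handed to `subprocess.run` from a deploy script. With a one-way push from a single deploy host, `--delete` is safe; it only becomes a problem when several servers accept new files.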

One could discuss the order of scaling: adding more webservers might not always be the first step to take, and depending on the application one might want to tackle the database first.
He decides to share the load of his application over multiple Drupal instances using Apache mod_proxy, then adds Linux-HA to make it highly available.
I'm interested in knowing why he chose Apache mod_proxy and not LVS.

Although using NFS, for me, belongs in a "How NOT to scale" tutorial, his other steps give you a good idea of the steps to take.

I'm looking forward to his next steps :) I hope that in part 4 he also removes NFS in favour of a solution with no performance and locking issues, one that really takes away a big fat single point of failure. In part 5 he will discuss how to scale your database environment. The actual order of implementing steps 2 and 5 will be different for each setup.

Anyway.. I'm following up on his next steps.. interesting reading

Comments

#1 Thomas Reitz : Hardware

"Thanks for providing me with this informative info. Great! I am really waiting for your next post.

Thanks for sharing!"


#2 ForexMan : I prefer WordPress

Thank you. Of course, this is everyone's own choice, but I prefer WordPress. I have a question for you: why do you prefer Drupal? Is it faster, or lighter on server requests? Do you have a comparative analysis of these systems?


#3 emule-server : Thanks

Hi m8,

thanks for your blog. I searched another things but found your blog. So thanks for your work!!!

Best wishes from a still reader from belgium :-]


#4 myspace proxy : proxy servers

Oh those dang DNS servers really annoy me sometimes. When I try to point a domain, sometimes I see I did it wrong and need to wait again. I have banged my head off the PC screen with DNS errors.


#5 Professional Web Design Company USA : tips on Joomla sought

I'm with Marina Andrew and would love to hear tips on Joomla from you.


#6 Artur : Very useful information for

Very useful information for me!
Thank you!


#7 immobilier maroc : GFS or OCFS2

Hi,

I'm considering different options for a SAN. Does anyone know
of a comparison of the GFS, GFS2 and OCFS2 filesystems?
Mostly I'm concerned about performance and stability (maturity).

It'll be running on RHEL 5.

Kind regards


#8 Marina Andrew : i've already try to use

i've already tried to use drupal for my website but i prefer to use joomla, do you have any article about joomla?


#9 Nayna : joomla

is joomla really that good. I thought it was pretty outdated?


#10 Invisalign Teen : Scaling Drupal

[...Setting up shared files without NFS could be done by GFS or OCFS2 if you like , heck even SMB or DAV is a better alternative to NFS...] Completely agree. Use OCFS2


#11 Tyrone : I have been commenting here

I have been commenting here for so long but where do my comments go?


#12 Kris Buytaert : Mollom

If your comments don't show up in the moderation queue .. Mollom has decided you're a spammer ..


#13 Dueces : What's NFS? Is it Need for

What's NFS? Is it Need for speed :P or what I can't get this abbreviation:(


#14 Warhammer Online : Thanks for the scaling tips.

Thanks for the scaling tips.


#15 Performance guy : Scaling

I would suggest memcache for sessions instead of NFS share. Content sync is pretty straightforward nowadays.


#16 David Magda : NFS and Linux

>[NFS] has performance issues, it has stability issues (stale locks)

Troll-ish comment, but:

Is this inherent in NFS or simply Linux's implementation of it? Solaris (client and server) and FreeBSD (client more so) have good NFS implementations.

This doesn't help you if you're using pure LAMP, but NFS has gotten a bad reputation because of one bad implementation IMHO.


#17 Gerhard Killesreiter : NFS not so bad

Drupal.org is using NFS to share both the PHP files and the uploaded files. I can remember one minor disruption of NFS service within the last year.
Cheers,
Gerhard


#18 FGM : No stale locks in NFS

Having written several NFS implementations myself, I feel you generalize too broadly: the NFS protocol does not have a "stale lock" problem, for the simple reason that NFS does not have locking at all, so no stale locks problem.

You can check this in RFC 1813.

When using NFS, locking is optionally handled by a separate service, most often the NLM protocol (version 4 of which accompanies NFSv3). But it remains an option, and alternative locking solutions are possible.

NFS' statelessness offers an interesting property in such situations: the ability to handle crashes and restarts without any reconfiguration or interrupted requests, everything continuing after the crash just like it was before, possibly hours later. Although admittedly unpleasant for the end user, who sees the system as unresponsive, it is an efficient way to avoid various error recovery techniques that can be required otherwise.


#19 Grugnog : Slashdot.org and NFS

Some random thoughts...

I had similar impressions of NFS until I read that slashdot.org has used NFS since 2002 without any major problems: http://meta.slashdot.org/article.pl?sid=07/10/22/145209

We have used DRBD quite a bit - and I think that is a very useful tool with both rsync and NFS type setups.

I agree that rsync is probably the better & simpler solution for most sites, although if you start having serious numbers of files you will probably need to rig up some kind of inotify setup for this to work efficiently.


#20 Anonymous : nfs vs rsync for a scalable drupal architecture

kris,

thanks a lot for taking the time to read and comment on my blog. while i agree that nfs creates probably an unnecessary choke point, it doesn't really add another single-point-of-failure, since if your database server goes down, you're SOL anyway.

i've had the NFS configuration running in a fairly high traffic production environment for 6 months now, without issue.

having said that, i still think that the rsync option that you outline is a better solution, despite luc's (valid) concerns about file system growth.

i'll update my blog with your thoughts.

thanks again.

p.s. my girlfriend (cailin) isn't happy with you, since i just made her discuss file system scaling techniques over our weekend brunch ;)


#21 Bart van Bragt : Single point of failure

>thanks a lot for taking the time to read and comment on my blog. while i agree that nfs creates probably an unnecessary choke point, it doesn't really add another single-point-of-failure, since if your database server goes down, you're SOL anyway.

This is true but only from a hardware perspective. If for some reason your NFS server dies (and your DB server doesn't) you still have a problem. So using NFS does add an extra single point of failure, at least in its default setup.

I'm currently looking into distributed/clustered/etc filesystems for a (small) webcluster that I'm building. It's not going to be using Drupal (yet) but that doesn't matter much for the problem at hand.

I've looked into OCFS2 and GFS but they seem to cause more problems than they solve. They are hardly a panacea. Hard to set up, problems with performance, bad support, etc, etc. I haven't been able to find a lot of success stories, mostly people who have abandoned implementing these systems and have reverted to NFS...

I don't really like NFS and the principles behind it but so far it seems to be the best thing out there (for a lot of situations anyway). I'm afraid I'm going to implement an HA NFS server with heartbeat and DRBD. It seems like GFS and OCFS2 are mainly interesting from an academic point of view but I'd like to be proven wrong by people who are actually successfully using those systems in production on busy sites/systems.


#22 Kris Buytaert : Failing GFS and OCFS setups ?

Bart,

I'd love to read about people rolling back from GFS/ OCFS2 to NFS and their reasons for doing so .. I haven't met any yet.
Back in the early days GFS indeed was a pain to set up and little or no documentation was available. Also OCFS wasn't really what people expected it to be. But if we look at OCFS2 and recent GFS versions today, there are lots of people out there implementing them, combined with Xen.

If you talk about academic setups then AFS and Lustre come to mind, because honestly I have yet to see a relevant AFS implementation outside an academic environment. (Please point me to other cases.)

The whole problem however is what you need as a feature set .. so there is no one good solution for all of the different cases.


#23 John : more on nfs vs rsync

kris, i had a crack at writing up an rsync based alternative based on your comments;

http://www.johnandcailin.com/blog/john/scaling-drupal-step-one-b-nfs-vs-...

i hope i captured your thoughts properly. it's great to hear from everyone on their different ideas and approaches.

john.


#24 Kris Buytaert : More to come

John,

Given the huge interest in this topic I'll be cranking out some more pointers on how to tackle this issue, taking into account different use cases.

No promise on a timeframe however :)


#25 Luc Stroobant : NFS without issues...

If your NFS share becomes unreachable and your webservers are not aware of that (so the system thinks the share is still mounted) your Apaches will try to read the files and choke. After a few minutes, the whole site will go down because your Apaches hit MaxClients. I've seen it happening...
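
One way to make a webserver aware of a hung share is to probe it with a timeout, so a monitoring script can take the node out of rotation before Apache piles up workers. The sketch below is an assumption-laden illustration (the function name and the two second timeout are made up): it runs `os.stat` in a daemon thread and treats "no answer in time" as unhealthy.

```python
import os
import threading


def path_responds(path, timeout=2.0):
    """Return True if os.stat(path) succeeds within `timeout` seconds.

    Hypothetical helper: on a hung hard-mounted share the stat call can
    block indefinitely, so it runs in a daemon thread we can give up on.
    """
    result = {}

    def probe():
        try:
            os.stat(path)
            result["ok"] = True
        except OSError:
            # Path missing or otherwise unreadable.
            result["ok"] = False

    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout)
    # If the probe is still blocked, "ok" was never set: report unhealthy.
    return result.get("ok", False)


if __name__ == "__main__":
    # Hypothetical mount point, for illustration only.
    print(path_responds("/var/www/drupal/files"))
```

A blocked probe thread stays stuck until the mount comes back, so this only detects the problem; what to do next (unmount, stop Apache, drop the node from the load balancer) is a separate decision.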

For the database server: we'll be using MySQL replication with a virtual IP (heartbeat) for HA. I did some tests with that and it seems to work fine. If everything goes well, we'll put that in production during the next few weeks.


#26 Kris Buytaert : Weekend Brunch

I need to find out where you can discuss filesystem scaling issues over Brunch..
Guess your Girlfriend needs to meet my wife.

I suggest we invite Dang also :)


#27 Kris Buytaert : rsync ocfs

Luc,

Memcached would probably be part 6 or 7 in his list, as he already mentioned where he will be going in his 5 articles :)

Setting up shared files without NFS could be done with GFS or OCFS2 if you like; heck, even SMB or DAV is a better alternative to NFS.
GFS or OCFS2 will give you really shared access to files, complete with locking etc.

If rsync is not an option .. csync2 might be. The other question is: where are the files coming from?
If you are the one generating them and pushing them to your servers, rsync stays an option, as you should be getting files out of your version control system onto your production platforms. It becomes a more complex story once you have UGC.


#28 Luc Stroobant : Re: rsync ocfs

On a community site there is almost always user generated content in your files directory; there's no way to avoid that (avatars...). For the sync story: once you start syncing between 3 or more servers, that might become a problem. And don't forget that the syncing will take more CPU once your files directory grows larger (we have a UGC site with 50GB of files...).

About GFS and OCFS2 (or GPFS), I know it's possible, but I wonder if those are really the solution. It adds a serious level of complexity to your "simple" webfarm nodes. Personally, I think it might be better to fix this by adapting Drupal to make it possible to store the files on other servers. After all, it's rather stupid to serve your static files with an expensive 15-20 MB preforked Apache with PHP in it. So if we could tell Drupal to put that content on another (redundant) system with a webserver optimized to serve static files, that could be a great improvement for the bigger Drupal sites.

Another thing I've been thinking about is to (NFS?)-mount the files directory on the webserver nodes, but use your reverse proxies to rewrite the http://yoursite/files/... requests to http://files.yoursite/... which is served from a different httpd on the fileserver. That way NFS will only be used to write the files and not to read them afterwards. (but it's still NFS...)
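
The rewrite idea in the last paragraph can be sketched as a small function. This is only a model of the rule, not Squid or Varnish configuration; the function name is made up, and it assumes the path after /files/ is kept unchanged on the files host.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_files_url(url, files_prefix="/files/", host_prefix="files."):
    """Map http://yoursite/files/... to http://files.yoursite/... (hypothetical)."""
    parts = urlsplit(url)
    if not parts.path.startswith(files_prefix):
        # Not a static file request: leave it for the Drupal webservers.
        return url
    # Assumption: the remainder of the path is served as-is by the files host.
    new_path = "/" + parts.path[len(files_prefix):]
    return urlunsplit((parts.scheme, host_prefix + parts.netloc,
                       new_path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(rewrite_files_url("http://yoursite/files/avatars/a.png"))
    print(rewrite_files_url("http://yoursite/node/1"))
```

Reads then go to the httpd on the fileserver, and the NFS mount on the web nodes is only touched when Drupal writes an upload.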


#29 Luc Stroobant : Scaling Drupal

One of the things in step 4 & 5 is probably adding memcached in the setup. We did some tests with it for stubru.be and the results are very impressive.

For NFS: you could create a redundant fileserver with NFS (DRBD, Sun Cluster...). We'll also add reverse proxies (Squid or Varnish, no mod_proxy...) to cache most of the files on NFS in memory. I'm very interested to hear other ideas on how to set up shared files on more than two webservers without NFS.
Rsync is not an option for Drupal shared files, unless you find a way to delete unused files. The rsync delete flag is not an option here. Without --delete, your files directory will keep growing; with --delete, you'll delete newly added files on webserver x if you rsync from webserver y...
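
The --delete dilemma above can be shown with a toy model that treats each server's files directory as a set of file names. It does not call rsync; the function and file names are made up, and it only mimics what a one-way sync does with and without deletion.

```python
def one_way_sync(source, target, delete=False):
    """Model a one-way sync of file names from source to target (toy model)."""
    if delete:
        # With deletion, the target ends up as an exact copy of the source.
        return set(source)
    # Without deletion, the target keeps everything it already had.
    return set(target) | set(source)


if __name__ == "__main__":
    # Webserver y: an unused file (old.png) has been cleaned up.
    server_y = {"a.png"}
    # Webserver x: still has old.png, and a user just uploaded new.png.
    server_x = {"a.png", "old.png", "new.png"}

    # Syncing y -> x without deletion: old.png is never removed.
    print(sorted(one_way_sync(server_y, server_x)))
    # Syncing y -> x with deletion: the fresh upload new.png is lost.
    print(sorted(one_way_sync(server_y, server_x, delete=True)))
```

Either way one of the two servers ends up wrong, which is why a plain rsync loop between peers doesn't fit files that are both added and removed.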