Creating a distributed, scalable WordPress Platform on Amazon Web Services (AWS)


For CloudVane.com we wanted a highly scalable, distributed and well-performing Platform that is also easy to maintain. These goals weren’t easy to achieve, and we first had to find a suitable system. As CloudVane is all about the Cloud, the choice was easy: it had to be a Cloud Provider. We selected Amazon Web Services to serve our Magazine.

To better understand the Performance of WordPress, we set ourselves a target: a System that can handle about 8 million hits per day. So we started with a standard WordPress Installation on Ubuntu with MySQL just to figure out what is possible (and what is not). We didn’t add any Plugins; the first tests ran against a really plain System.

For the Test, we used Blitz.io, which returns great statistics about the Test run. Our first Test gave us the following results:

  • Delay: 477 ms from Virginia
  • 60 Seconds Test Run with 20 Users per Second at maximum
  • Response with 20 Users per Second was about 1 Second

So what does this mean? First of all, we can handle about 20 Users per second. However, a response time of 1 second is not good. Per Day, we would handle about 560,000 hits, so we are still far away from our target of 8 Million Hits per day. The CPU Utilization wasn’t good either: our instance ran at 100%. So this is the very maximum of an out-of-the-box WordPress installation. Below you can see some graphics on the Test run.
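The gap between the measured result and the target can be made concrete with a little arithmetic. The following sketch converts a hits-per-day figure into the sustained rate it implies. It assumes, as a simplification, that traffic is spread evenly over 24 hours; the function name is our own and not part of Blitz.io.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def hits_per_second(hits_per_day: float) -> float:
    """Sustained request rate implied by a daily hit count (even traffic assumed)."""
    return hits_per_day / SECONDS_PER_DAY


target = hits_per_second(8_000_000)  # our goal
measured = hits_per_second(560_000)  # projection from the first test run

print(f"target:   {target:.1f} hits/s")
print(f"measured: {measured:.1f} hits/s")
print(f"factor:   {target / measured:.1f}x")
```

Under that assumption the target corresponds to roughly 93 hits per second, about 14 times what the plain installation was projected to deliver.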

Test Run #1:

60 Seconds, maximum of 20 Users per Second:

Performance for an AWS Micro Instance measured by a Load Test
Amazon Performance with WordPress and a Micro Instance on EC2

As you can imagine, this simply does not meet our requirements. As a first step, we wanted to achieve better scaling effects for CloudVane.com. Therefore, we started up another Micro Instance with Amazon RDS. On the RDS Instance, we took advantage of the ready-to-use MySQL Database and connected it as the primary database for our WordPress Platform. This gives us better scaling effects since the WordPress instance itself doesn’t store our data anymore. We can now scale out our database and Web frontend(s) independently of each other.

But what about images stored on the Platform? They are still stored on the Web Frontend. This is a tough problem! As long as we store our images on the instance, scaling out gets really tough. So we wanted to find a way to store those images on Blob Storage. Fortunately, Amazon Web Services offers a Service called “Simple Storage Service”, or “S3” for short. We integrated this service to replace the default storage system of WordPress. To boost performance, we also added a Content Distribution Network. There is another Service by Amazon Web Services for this, called “CloudFront”. With CloudFront, Content is delivered from various Edge Locations all over the Globe. This should boost the performance of our Platform.
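Conceptually, moving uploads to S3 and CloudFront means that every media URL that used to point at the Web Frontend now points at the CDN. The Python sketch below shows only that rewriting step. The host name `cdn.example.com`, the upload prefix default and the function name are placeholders chosen for illustration; it is not how WordPress or any particular plugin implements the integration.

```python
from urllib.parse import urlsplit, urlunsplit

CDN_HOST = "cdn.example.com"  # hypothetical CDN host name


def to_cdn_url(url: str, cdn_host: str = CDN_HOST,
               prefix: str = "/wp-content/uploads/") -> str:
    """Point media URLs at the CDN; leave every other URL untouched."""
    parts = urlsplit(url)
    if not parts.path.startswith(prefix):
        return url  # not an uploaded file, keep serving it from the frontend
    return urlunsplit((parts.scheme or "https", cdn_host,
                       parts.path, parts.query, parts.fragment))


print(to_cdn_url("https://example.com/wp-content/uploads/2012/05/chart.png"))
print(to_cdn_url("https://example.com/about/"))
```

Because the frontend no longer holds the files, any number of frontend instances can serve the same page without sharing a disk.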

As a final add-on, we installed “W3 Total Cache” to cache data, which should also improve performance significantly. But now let’s have a look at the new Load Test, again with Blitz.io. For our Test, we use the maximum we can do with our free tier: 250 concurrent users.
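To illustrate why caching takes so much load off the instance, here is a minimal page cache with a time-to-live, written in Python. It is a sketch of the general idea only and says nothing about how W3 Total Cache works internally; the class and parameter names are our own.

```python
import time


class PageCache:
    """Keep rendered pages for a fixed time so repeated hits skip rendering."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the cache can be tested
        self._entries = {}      # url -> (expires_at, html)

    def get(self, url, render):
        now = self.clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]     # cache hit: no PHP, no database query
        html = render(url)      # cache miss: do the expensive work once
        self._entries[url] = (now + self.ttl, html)
        return html


def slow_render(url):
    # Stand-in for WordPress building a page from the database.
    return f"<html>page for {url}</html>"


cache = PageCache(ttl_seconds=60)
print(cache.get("/", slow_render))  # rendered
print(cache.get("/", slow_render))  # served from the cache
```

With 250 concurrent users asking for the same few pages, almost every request becomes a cache hit.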

The output was:

  • An average of 15ms in delay
  • More than 10 million hits per day

Summing this up, it means that we achieved what we wanted: a fully scalable, distributed and performing WordPress platform. It is nice what you can do with a really great architecture and some really easy tweaks. Below are some graphics of our test run.

Load Testing an Amazon Web Service Micro Instance with Caching
Amazon CPU Load on a Micro Instance with Caching and CDN

Creating a simple WordPress Blog with the Bitnami Stack on Amazon EC2


It is really easy to create a simple WordPress Blog on Amazon EC2 with the Bitnami Stack. To do so, simply click on “Launch Instance” in the Console.

Launch a new AWS Instance

Next, we get a Dialog where we can select the Wizard. For our sample, we use the “Classic” Wizard.

 

Create a new AWS EC2 Instance with the Wizard

In the “Request Instances Wizard”, we now select the Tab “Community AMIs” and type “wordpress” in the Search Box. This lists several WordPress-enabled AMIs.

 

Available AWS Community AMIs

We select an AMI that has the most recent WordPress Version installed. In the current case, it is “ami-018c8875”, but this might change over time.

In the next Dialog, we make sure to have “Micro” as Instance Type selected. This is the cheapest available instance type on EC2.

 

Select an instance type on AWS

We simply confirm the next few Dialogs until we get to the point where we need to create a Key Pair. The Key Pair is required later, whenever we need to connect to the instance.

 

Create a new Key-Pair for an EC2 Instance on AWS

In the last Dialog simply click “Launch” and the Instance will be started.

Don’t forget to configure the security groups. If it is your first time with AWS, your Firewall rules might not yet allow HTTP Connections to the instance.
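If you prefer a script over the Console, the same rule can be added with the AWS SDK for Python (boto3). The sketch below assumes boto3 is installed and credentials are configured; the security group ID and the region are placeholders, and opening port 80 to `0.0.0.0/0` is simply the usual choice for a public blog.

```python
def http_ingress_rule(cidr="0.0.0.0/0"):
    """Describe an inbound rule that allows HTTP (TCP port 80) from `cidr`."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 80,
        "ToPort": 80,
        "IpRanges": [{"CidrIp": cidr}],
    }


def allow_http(group_id, region="us-east-1"):
    """Add the HTTP rule to a security group (region is an assumption)."""
    import boto3  # third-party SDK, imported here so the rule helper works without it

    ec2 = boto3.client("ec2", region_name=region)
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[http_ingress_rule()],
    )


# Example call with a hypothetical group ID:
# allow_http("sg-0123456789abcdef0")
print(http_ingress_rule())
```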

Amazon Web Services, the “Powered by Amazon Web Services” logo, are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries.

NoSQL as the Trend for databases in the Cloud?


SQL seems to be somewhat old fashioned when it comes to scalable databases in the cloud. Non-relational databases (also called NoSQL) seem to take over in most data storage fields. But why do those databases seem to be more popular than the “classic” relational databases? Is it due to the fact that professors at universities “tortured” us with relational databases and therefore reduced our interest – the interest of the “new” generation for relational databases? Or are there some hard facts that tell us why relational databases are somewhat out of date?

I was at a user group meeting in Vienna, Austria, one month ago where I talked about NoSQL databases. The topic seemed to be of interest to a lot of people: we sat together for about four hours (my talk was planned for one hour only) discussing NoSQL versus SQL. I decided to summarize some of the ideas in a short article as this is useful for cloud computing.

If we look at what NoSQL offers, we’ll find numerous NoSQL databases. Some of the most popular ones are MongoDB, Amazon Dynamo (Amazon SimpleDB), CouchDB, and Cassandra. Some people might think that non-relational databases are for those who are too “lazy” to do their complex business logic in the database. In fact, this logic reduces the performance of a system. If there is a need for a highly responsive and available system, SQL databases might not be your best choice. But why is NoSQL more responsive than SQL-based systems? And why is there this saying that NoSQL allows better scalability than SQL-based systems? To understand this topic, we need to go back 10 years.

In his keynote at the Symposium on Principles of Distributed Computing 2000 (“Towards Robust Distributed Systems”, 2000), Dr. Eric A. Brewer addressed a problem that arises when we need high availability and scalability. This was the birth of the so-called “CAP Theorem.” The CAP Theorem says that a distributed system can only achieve two out of three properties: “Consistency, Availability and Partition tolerance.” This means:

  • That every node in a distributed system should see the same data as all other nodes at the same time (consistency)
  • That the failure of a node must not affect the availability of the system (availability)
  • That the system stays tolerant to the loss of some messages (partition tolerance)

Nowadays when talking about databases we often use the term “ACID,” but NoSQL is related to another term: BASE. BASE stands for “Basically Available, Soft state, Eventually consistent.” If you want to go deeper into eventual consistency, read the post by Werner Vogels – Eventually Consistent revisited. BASE states that all updates to a distributed system will eventually become consistent after a period of no updates. For distributed systems such as cloud-based systems, it is simply not practical to keep a system consistent at all times, because doing so results in bad availability.
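A toy example helps to see what “eventually consistent” means in practice. In the Python sketch below, two replicas accept writes independently and exchange their data later. Conflicts are resolved with a simple last-write-wins rule based on a version number, which is an assumption made for brevity; real systems use more elaborate schemes, and none of the names below come from an actual database.

```python
class Replica:
    """One node holding a copy of the data: key -> (version, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        current = self.data.get(key)
        # Last-write-wins: keep the entry with the higher version.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        """Pull everything the other replica knows (runs in the background)."""
        for key, (version, value) in other.data.items():
            self.write(key, value, version)


a, b = Replica("a"), Replica("b")
a.write("title", "Hello", version=1)
b.write("title", "Hello Cloud", version=2)

# Both replicas answer immediately, but they disagree for a while.
print(a.read("title"), "|", b.read("title"))

a.sync_from(b)
b.sync_from(a)

# After a period without updates, both replicas hold the same value.
print(a.read("title"), "|", b.read("title"))
```

Neither replica ever had to wait for the other before answering, which is exactly the availability that strict consistency would cost.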

To understand eventually consistent, it might be helpful to look at how Facebook is handling their data. Facebook uses MySQL, which is a relational (SQL) database. However, they simply don’t use such features as joins that MySQL offers them; Facebook joins data on the web server. You might think “What, are they crazy?” However, the problem is that the joins Facebook needs will sooner or later result in a very slow system. David Recordon, Manager at Facebook, stated that joins are better performing on the web server [1]. Facebook must know what is good performance or not as they will store some 50 petabytes of data by the end of 2010. Twitter, another social platform that needs to scale their platform, should also think about switching to NoSQL platforms. This will hopefully reduce the “fail whale” to a minimum [2].
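What “joining on the web server” looks like can be sketched in a few lines of Python. The two dictionaries stand in for two database tables, and the helper functions stand in for two simple queries without a JOIN. All names and data are made up for illustration; this is not Facebook’s code.

```python
# Stand-ins for two tables that may live on different database servers.
USERS = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
POSTS = [
    {"id": 10, "author_id": 1, "text": "Hello"},
    {"id": 11, "author_id": 2, "text": "Hi"},
    {"id": 12, "author_id": 1, "text": "Bye"},
]


def fetch_posts():
    """Like: SELECT id, author_id, text FROM posts"""
    return list(POSTS)


def fetch_users(ids):
    """Like: SELECT id, name FROM users WHERE id IN (...)"""
    return {user_id: USERS[user_id] for user_id in ids}


def posts_with_authors():
    """Combine both result sets in application code instead of in SQL."""
    rows = fetch_posts()
    authors = fetch_users({row["author_id"] for row in rows})
    return [{**row, "author": authors[row["author_id"]]["name"]} for row in rows]


for post in posts_with_authors():
    print(post["author"], "wrote:", post["text"])
```

Each query stays simple and can be cached or sent to a different server, and the combining work moves to the web tier, which is easy to scale out.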

Summing it up, NoSQL is relevant for large-scale, global Internet applications. But are there any other benefits of NoSQL databases? Another benefit is that there is often no fixed schema associated with a table. This allows the database to adapt to new business requirements. I’ve seen a lot of projects where the requirements changed over the years. This is rather hard to handle with traditional databases, whereas NoSQL makes it easy to adapt to such changes. A good example of this is Amazon. Amazon stores a lot of data on their products. As they offer products of different types – such as personal computers, smartphones, music, home entertainment systems and books – they need a flexible database. This is a challenge for traditional databases. With NoSQL databases it’s easy to implement some kind of inheritance hierarchy – just by calling the table “product” and letting every product have its own fields. Databases such as Amazon Dynamo handle this with key/value storage. If you want to dig deeper into Amazon Dynamo, read Eventually Consistent [3] by Werner Vogels.
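The “product” idea can be shown with plain Python dictionaries, which behave much like documents or items in a schema-less store. The product data and field names below are invented for illustration and do not reflect Amazon’s actual data model.

```python
# One "product" collection; every item carries only the fields it needs.
products = [
    {"id": "p1", "type": "book", "title": "Cloud Basics", "pages": 320},
    {"id": "p2", "type": "smartphone", "title": "Phone X", "screen_inches": 4.3},
    {"id": "p3", "type": "music", "title": "Some Album", "tracks": 12},
]


def of_type(items, product_type):
    """Query by a shared field without caring about the other fields."""
    return [item for item in items if item["type"] == product_type]


def fields_in_use(items):
    """All field names that appear anywhere in the collection."""
    return sorted({field for item in items for field in item})


# A new requirement arrives: books get an ISBN. No migration is needed,
# existing items are simply left as they are.
products.append({"id": "p4", "type": "book", "title": "NoSQL Notes",
                 "pages": 210, "isbn": "000-0-00-000000-0"})

print([item["title"] for item in of_type(products, "book")])
print(fields_in_use(products))
```

In a relational schema the same change would mean either a new nullable column for every product or another table per product type.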

Will there be some sort of “war” between NoSQL and SQL supporters like the one of REST versus SOAP? The answer is maybe. Who will win this case? As with SOAP versus REST, there won’t be a winner or a loser. We will have more opportunities to choose our database systems in the future. For data warehousing and systems that require business intelligence to be in the database, SQL databases might be your choice. If you need high-responsive, scalable and flexible databases, NoSQL might be better for you.

Resources

  1. Facebook infrastructure
  2. Twitter switches to NoSQL
  3. Eventually Consistent
This post was originally posted by Mario Meir-Huber on Sys-Con Media.

 

Design Guidelines for Cloud Computing and Distributed Systems


Infrastructure as a Service and Platform as a Service offer us easy scaling of services. However, scaling is not as easy as it seems to be in the Cloud. If your software architecture isn’t done right, your services and applications might not scale as expected, even if you add new instances. As for most distributed systems, there are a couple of guidelines you should consider. I have summed up the ones I use most often for designing distributed systems.

Design for Failure

As Murphy stated, everything that can fail, will fail. So it is very clear that a distributed system will fail at a certain time, even though cloud computing providers tell us that it is very unlikely. Some of the major platforms had outages [1][2] in the last year, and there might be even more of them. Therefore, your application should be able to deal with an outage of your cloud provider. This can be done with different techniques, such as distributing an application across more than one availability zone (which should be done anyway). Netflix has a very interesting approach to continuously testing their software for errors – they have employed an army of “Chaos Monkeys” [3]. Of course, they are not real monkeys. It is software that randomly takes down different Instances. Netflix produces errors on purpose to see how their system reacts and if it is still performing well. The question is not if there will be another outage; the question is when the next outage will be.

Design for at Least Three Running Systems
For on-premise systems, we always used to do an “N+1” Design. This still applies in the cloud. There should always be one more system available than actually necessary. In the cloud, this can easily be achieved by running your instances in different geographical locations and availability zones. In case one region fails, the other region will take over. Some platforms offer intelligent routing and can easily forward traffic to another zone if one zone is down. However, there is this “rule of three,” that basically says you should have three systems available: one for me, one for the customer and one if there is a failure. This will minimize the risk of an outage significantly for you.
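A rough calculation shows why the third system matters. The sketch below assumes that systems fail independently of each other and that the service stays up as long as at least one system is running. Both are simplifications, and the 99% figure is only an example value, not a number from any provider.

```python
def combined_availability(single: float, systems: int) -> float:
    """Availability of `systems` redundant copies, assuming independent failures."""
    return 1.0 - (1.0 - single) ** systems


for count in (1, 2, 3):
    availability = combined_availability(0.99, count)
    print(f"{count} system(s): {availability:.6f}")
```

Under these assumptions each additional system cuts the expected downtime by a factor of one hundred. Correlated failures, such as a whole region going down, are the reason the systems should sit in different zones or regions.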

Design for Monitoring
We all need to know what is going on in our datacenters and on our systems. Therefore, monitoring is an important aspect of every application you build. If you want to design intelligent monitoring, I/O performance and similar metrics are not the only important things. It would be best if your system could “predict” your future load – this could be done either with statistical data from your application’s history or with knowledge of your application’s domain. If your application is about sports betting, you might have high load during major sports events. If it is for social games, your load might be higher during the day or when the weather is bad outside. In any case, your system should be monitored all the time and it should tell you when a major failure might be coming up.
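As a very small example of predicting load from history, the following Python sketch averages past measurements per hour of the day and warns when the expected load for an hour gets close to capacity. The data, the 80% headroom threshold and all names are assumptions for illustration; a real monitoring system would use far richer models.

```python
from collections import defaultdict
from statistics import mean


def hourly_profile(samples):
    """Average load per hour of day from (hour, requests_per_second) samples."""
    buckets = defaultdict(list)
    for hour, load in samples:
        buckets[hour].append(load)
    return {hour: mean(loads) for hour, loads in buckets.items()}


def needs_alert(profile, hour, capacity, headroom=0.8):
    """Warn when the expected load exceeds `headroom` of the capacity."""
    expected = profile.get(hour)
    return expected is not None and expected > capacity * headroom


# Hypothetical history: evenings are busy, nights are quiet.
history = [(20, 90), (20, 110), (3, 10), (3, 14)]
profile = hourly_profile(history)

print(needs_alert(profile, hour=20, capacity=100))  # busy hour
print(needs_alert(profile, hour=3, capacity=100))   # quiet hour
```

Such a check can trigger scaling before the load arrives instead of after the first timeouts.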

Design for Rollback
Large systems are typically owned by different teams in your company. This means that a lot of people work on your systems and rollouts happen often. Even though there should be a lot of testing involved, it will still happen that new features affect other services of your application. To guard against that, your application should provide an easy rollback mechanism.

Design No State
State kills. If you store state on your systems, load balancing becomes much more complicated. State should be eliminated wherever and whenever possible. There are several techniques to reduce or remove state in your application. Modern devices such as tablets or smartphones have sufficient performance to store state information on the client. Every service call should be independent and it shouldn’t be necessary to have a session state on the server. All session state should be transferred to the client, as described by Roy Fielding [4]. Architectural styles such as ROA support this idea and help you make your services stateless. I will dig into REST and ROA in one of my upcoming articles since this is really great for distributed systems.
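One common way to move session state to the client is to hand it out as a signed token that any server can verify. The Python sketch below uses an HMAC for that. The secret, the token layout and the function names are assumptions for illustration, and the token is signed but not encrypted, so it must not contain anything confidential.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"change-me"  # hypothetical secret shared by all servers


def encode_state(state: dict, secret: bytes = SECRET) -> str:
    """Serialize the session state and append a signature."""
    payload = base64.urlsafe_b64encode(
        json.dumps(state, sort_keys=True).encode("utf-8"))
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_state(token: str, secret: bytes = SECRET) -> dict:
    """Verify the signature and return the state; any server can do this."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("session state was modified")
    return json.loads(base64.urlsafe_b64decode(payload))


token = encode_state({"user": "mario", "cart": [1, 2]})
print(decode_state(token))
```

Since the server keeps nothing between calls, the load balancer can send each request to any instance.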

Design to Disable Services
It should be easy to disable services that are not performing well or that are poisoning the entire application. Therefore, it is important to isolate services from each other, so that one failing service does not affect the entire system’s functionality. Imagine the comment function of Amazon is not working – comments might be important for making up your mind about buying a book, but their absence wouldn’t prevent you from buying the book.
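The book example can be sketched as a simple switch around each optional service. In the Python code below, a disabled or failing comment service is replaced by an empty result while the rest of the page still renders. The class, the service name and the page layout are invented for illustration.

```python
class ServiceSwitch:
    """Lets operators turn off optional services at runtime."""

    def __init__(self):
        self._disabled = set()

    def disable(self, name):
        self._disabled.add(name)

    def enable(self, name):
        self._disabled.discard(name)

    def call(self, name, func, fallback):
        if name in self._disabled:
            return fallback      # switched off on purpose
        try:
            return func()
        except Exception:
            return fallback      # service is broken, degrade gracefully


def render_product_page(switch, load_comments):
    comments = switch.call("comments", load_comments, fallback=[])
    # Buying must work even when comments are unavailable.
    return {"title": "Some Book", "can_buy": True, "comments": comments}


switch = ServiceSwitch()
switch.disable("comments")
print(render_product_page(switch, lambda: ["Great read!"]))
```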

Design Different Roles
With distributed systems we have a lot of servers involved, and it’s necessary to scale individual services rather than an entire front-end or back-end server. If there is exactly one front-end system that hosts all roles and a specific service experiences high load, why would it be necessary to scale up all services, even those that have minor load? You might improve your systems by splitting them up into different roles. As already described by Bertrand Meyer [5] with Command Query Separation, your application should also be split into different roles. This is basically a key thing for SOA applications; however, I still see that most services are not separated. There should be more separation of concerns based on the services. Implement some kind of application role separation for your application and services to improve scaling.
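For readers who have not met Command Query Separation before, here is a minimal Python sketch of the principle on a single class: commands change state and return nothing, queries return data and change nothing. The article store is an invented example. Applying the same split at the level of whole services, so that read and write roles can be scaled independently, is my extension of the idea.

```python
class ArticleStore:
    """Commands modify state; queries only read it."""

    def __init__(self):
        self._articles = {}

    # Command: changes state, returns nothing.
    def publish(self, slug: str, title: str) -> None:
        self._articles[slug] = title

    # Query: returns data, has no side effects.
    def title_of(self, slug: str):
        return self._articles.get(slug)

    # Query: returns data, has no side effects.
    def count(self) -> int:
        return len(self._articles)


store = ArticleStore()
store.publish("nosql-trend", "NoSQL as the Trend for databases in the Cloud?")
print(store.title_of("nosql-trend"))
print(store.count())
```

Because queries never change anything, they can be served by as many read-only instances as the load requires, while the commands go to a smaller write role.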

There might be additional principles for distributed systems. I see this article as a rather “living” one and will extend it over time. I would be interested in your feedback on this. What are your thoughts on distributed systems? Email me at mario.mh@cloudvane.com, use the comment section here or get in touch with me via Twitter at @mario_mh

  1. Azure Management Outage, Ars Technica
  2. Amazon EC2 Outage, TechCrunch
  3. The Netflix Simian Army, Netflix Tech Blog
  4. Representational State Transfer (REST), Roy Fielding, 2000
  5. Command Query Separation, Wikipedia
This post was originally posted by Mario Meir-Huber on Sys-Con Media.
The Image displayed for this post is Licenced under the Creative Commons and further details about the picture can be found here.

 
