Why you should do performance tests

With this posting I want to start a series of small articles which deal with performance tests in projects with CQ5. Much of it can be applied to many other web projects as well.

This topic is one of my all-time favorites in a project plan. According to my experience with CQ projects in the last years, performance tests are usually planned to be executed 3 weeks before the golive, with some person days of preparation. No one knows what should be tested and to what extent, and no one knows what the results should be. And performance tests are also one of the first activities removed from the project plan if time is getting short.

But of course most members of the project team (mostly the technical ones) know that dropping the performance tests will cause heavy problems later in production. But the “told you so” afterwards isn’t really helpful …

Some reasons, why you should do a performance test:

  • During development the focus is on coding and functionality. Functional tests are executed by developers or small test teams. In a production use case the number of users is usually much higher. Just working on the assumption that the system works properly is highly risky.
  • When you have done a performance test, you know some of the conditions under which the system performs well or fails. In both cases you learn about your system, and you can either address deficits early or back your performance assumptions with data.
  • Operations will have an easier job if you can tell them about the capabilities and limits of the system. Then they can act accordingly in case of problems and don’t need to waste effort determining the root causes of issues in your shiny new application. (In the end they cannot do much about it anyway, especially on a production system. They need to bring it back online as soon as possible; they won’t debug it.)
  • Performance deficits sometimes go hand in hand with stability issues. Consider a request which requires some memory to execute. Five of these requests in parallel are not a problem, but if you have 100 of them running concurrently (because each of them is processed slowly), the stability of your JVM might be impacted.
  • The business aspect: bad performance can slow down your business and have a huge impact on KPIs like successful conversions. If you know that certain parts of your site are slow, you can address these deficits early in the project, long before they impact the KPIs.
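The stability point above is essentially Little’s Law: the number of requests in flight equals the arrival rate times the average response time, so slow processing alone multiplies concurrency. A minimal sketch (the traffic numbers and the 50 MB per request are invented for illustration):

```python
# Little's Law: requests in flight = arrival rate x average response time.
def concurrent_requests(arrival_rate_per_s, avg_response_time_s):
    return arrival_rate_per_s * avg_response_time_s

def heap_needed_mb(arrival_rate_per_s, avg_response_time_s, mb_per_request):
    # Combined memory footprint of all in-flight requests.
    return concurrent_requests(arrival_rate_per_s, avg_response_time_s) * mb_per_request

# 100 requests/s answered in 50 ms each: only 5 requests in flight.
print(concurrent_requests(100, 0.05))  # -> 5.0
# Same traffic, but each request now takes 1 s: 100 requests in flight.
print(concurrent_requests(100, 1.0))   # -> 100.0
# At an assumed 50 MB working set per request, that is 5000 MB of heap.
print(heap_needed_mb(100, 1.0, 50))    # -> 5000.0
```

The same throughput can thus mean a harmless or a fatal memory footprint, depending only on how long each request takes.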

So from a risk perspective it is very helpful to conduct performance tests. If the discussion is about cutting either a feature or a performance test, strongly recommend cutting the feature (unless it is one of the features of the minimum viable product). In terms of overall stress it is usually much easier to add a feature after golive than to ship a release which has performance and/or stability problems.

Joerg’s rules for loadtests

In the article “Everything is content (part 2)” I discussed the problem that every loadtest makes the performance of your CQ instance (a bit) worse. In a comment Jan Kuźniak proposed to disable versioning and to restore your loadtest environment before every loadtest. Thinking about Jan’s contribution revealed a number of topics I consider crucial for loadtests. I collected some of them and would like to share them.

  • Provide a reasonable amount of data in the system. This amount should be roughly equal to your production system, so the numbers are comparable. Being 20% off doesn’t matter, but don’t expect good results if your loadtest runs on 1000 handles while your production system heads towards 50k handles. You may optimize the wrong parts of your code then.
    If you have ever benchmarked a speedup of 20% in a loadtest but got nothing on the production system, you have already seen this effect.
  • When your loadtest environment is ready to run, create a backup of it. Drop the CQ loadtest installation(s) from time to time, restore them from the backup and re-run your loadtest on a clean installation to verify your results. This is the point I already mentioned.
  • Always have the same configuration in the production and loadtest environments. That’s the reason why I disagree with disabling versioning on the loadtest environment. The effect of a diverging configuration can be the same as in the point above: you may optimize the wrong parts of your code.
  • No error messages during the loadtest. If an error message indicates a code problem, it’s probably reproducible by re-running the loadtest (come on, reproducible bugs are the easiest ones to fix :-)). If it’s a content problem, you should adjust your content. A loadtest is also a very basic regression test, so take the results seriously (errors are results too!).
  • Be aware of resource virtualization! Today’s hype is to run as many applications as possible on virtualized environments (VMware, KVM, Solaris zones, LPARs, …) to increase the efficiency of hardware usage and lower costs. Doing so very often removes guarantees you need for comparing the results of different loadtests. For example, in one loadtest you have 4 CPUs available, while in the second one you have 6. Are the results comparable? Maybe they are, maybe not.
    Being pinned to exactly 4 CPUs gives you comparable loadtests, but if your production system requires 8 CPUs, you cannot drive your loadtest system with production-level numbers. Getting a decent loadtest environment is a hard job …
  • Have good test scenarios. Of course the most basic requirement. Don’t just grab the access.log and throw it at your load injector: re-running GET requests is easy, but that approach breaks down for POSTs. Modelling good scenarios is hard and takes a lot of time.
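One simple way to model scenarios instead of replaying an access.log is a weighted scenario mix. A minimal sketch (the scenario names and weights are invented; the actual HTTP steps of each scenario would live in the load injector and are out of scope here):

```python
import random
from collections import Counter

# Hypothetical scenario mix: model what users actually do,
# not just the GET lines of an access.log.
SCENARIOS = [
    ("browse_pages",   70),  # anonymous page views (plain GETs)
    ("search",         20),  # search form POST plus result page
    ("author_publish", 10),  # author login, edit and activate a handle (POSTs)
]

def pick_scenario(rng=random):
    """Weighted random choice over the scenario mix."""
    total = sum(weight for _, weight in SCENARIOS)
    r = rng.uniform(0, total)
    for name, weight in SCENARIOS:
        r -= weight
        if r <= 0:
            return name
    return SCENARIOS[-1][0]  # guard against floating-point edge cases

# Sanity-check the mix over many draws.
counts = Counter(pick_scenario() for _ in range(10_000))
print(counts)
```

Each virtual user then executes the chosen scenario end to end, including its POSTs, which is exactly what a replayed access.log cannot give you.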

Of course there are a lot more things to consider, but I will limit myself to these points for the moment. Maybe there will be a part 2.

Everything is content (part 2)

Recently I pointed out some differences in the handling of the “everything is content” paradigm of Communique. A few days ago I found a posting by David Nüscheler over at dev.day.com, in which he explained the details of performance tuning.

(His 5 rules do not apply exclusively to Day Communique, but to every performance tuning session).

In Rule 2 he states:

Try to implement an agile validation process in the optimization phase rather than a heavy-weight full-blown testing after each iteration. This largely means that the developer implementing the optimization has a quick way to tell if the optimization actually helped reach the goal.

In my experience this isn’t viable in many cases. Of course the developer can check quickly if his new algorithm performs better ( = is faster) than the old one. But in many cases the developer doesn’t have all the resources and infrastructure available, and doesn’t have all the content in his test system; this is the central reason why I do not trust tests performed on developer systems. So the project team relies on central environments which are built for load testing: environments with loadbalancers, access to directories, production-sized machines and content which is comparable to the production system. Once the code is deployed there, you can do loadtesting, either using commercial software or just something like JMeter. If you can use continuous integration and an auto-deployment system, you can run such tests every day.
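To evaluate such a daily JMeter run automatically, the CI job can parse the JTL result file and fail the build on regressions. A minimal sketch; the column names are a subset of JMeter’s default CSV output, and the sample data is invented:

```python
import csv
import io
import statistics

# Invented sample of a JMeter JTL result file in CSV format.
SAMPLE_JTL = """timeStamp,elapsed,label,responseCode,success
1000,120,home,200,true
1001,250,home,200,true
1002,90,search,200,true
1003,1500,search,500,false
1004,300,home,200,true
"""

def summarize(jtl_text):
    """Compute basic pass/fail numbers from a JTL CSV."""
    rows = list(csv.DictReader(io.StringIO(jtl_text)))
    elapsed = sorted(int(r["elapsed"]) for r in rows)
    errors = sum(1 for r in rows if r["success"] != "true")
    return {
        "samples": len(rows),
        "error_rate": errors / len(rows),
        "median_ms": statistics.median(elapsed),
    }

result = summarize(SAMPLE_JTL)
print(result)
```

In a real CI job you would read the file from disk and compare the error rate and response-time percentiles against thresholds or against the previous run’s numbers.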

Ok, where did I start? Right, “everything is content”. So you run your loadtest. You create handles, modify them, activate and drop them, you request pages to view, you perform activities on your site, and so on. Afterwards you look at your results and hopefully they are better than before. Ok. But …

But Communique is not built to forget data (of course it does sometimes, but that’s not the point here :-)), so all these activities are stored. Just take a look at the default.map, the zombie.map, the cmgr.hist file, … All your recent actions are persisted and CQ knows about them.

Of course handling more of this information doesn’t make CQ faster. In case you have long periods of time between your template updates: check the performance data directly after a template update and compare them to the data a few months later (assuming your CQ instance is actually used in between). You will see a decrease in performance; it may be small and nearly unmeasurable, but it is there. Some actions are slower.

Ok, back to our loadtest. If you run the loadtest again and again and again, a lot of actions are persisted. When you reproduce the code and the settings of the very first loadtest and run that first loadtest again on a system which has already faced 100 loadtests, you will see a difference. The result of this 101st loadtest differs from the first one, although the code, the settings and the loadtest are essentially the same. All the same except CQ and its memory (default.map and friends).

So you need a mechanism which allows you to undo all changes made by such a loadtest. Only then can you perfectly reproduce every loadtest and run it 10 times without any difference in the results. I’ll try to cover such methods (there are several, but not all are equally suitable) in some of the next posts.

And to get back to the title: everything is content, even the history. So contrary to my older posting, where I said:

Older versions of a handle are not content.

They are indeed content, but only when it comes to slowing down the system 🙂