Friday, July 16, 2010

PostgreSQL: Performance Tuning

"Need is the mother of discovery" -Harpreet Singh
I wrote that line just a few minutes before writing this blog post, as my need to optimize PostgreSQL's performance led me to discover some cool facts and features of Postgres and the tools around it.

For many Postgres users, the thought of 100 or more concurrent users is a nightmare. I will admit that some time back I was also a bit scared at the thought of 100 concurrent users on Postgres, but by the end of my search I was happy to have found a workable way to achieve it.
"Knowledge increases by sharing"
So I thought I would pass it on to everyone searching for it on the internet.
The need that triggered this search was a request to recommend hardware and software configurations to support 100-200 concurrent users on Openbravo ERP, with Postgres or Oracle as the database.
Being a Postgres supporter, I believed Postgres would be able to handle it. And yippee, I was right.
Coming back to the main point:
Postgres doesn't support that many concurrent users by default; it ships with a conservative configuration that represents a best guess at how an "average" database on "average" hardware should behave.
Postgres has configuration options for fine-tuning it, like (a sample snippet follows this list):
- max_connections
- shared_buffers
- effective_cache_size
- and so on.
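For illustration, tuning these in postgresql.conf might look like the snippet below. The values are hypothetical starting points for a dedicated server with around 4 GB of RAM, not universal recommendations:

    # postgresql.conf -- illustrative values only, tune for your hardware
    max_connections = 200        # upper limit on concurrent backends
    shared_buffers = 1GB         # commonly ~25% of RAM on a dedicated box
    effective_cache_size = 3GB   # planner's estimate of memory available for caching

Remember that every connection costs memory and its own backend process, which is exactly why raising max_connections alone does not scale.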
But tuning these alone is not enough for Postgres to support 100+ concurrent users.
In a reply to my query on the Postgres performance mailing list, I came to know about connection pooling.
The one and only con I saw in it is that it is external: we have to configure an external tool to do the connection pooling.
There are tools like pgpool to make the job easy for us (pgpool is middleware that sits between PostgreSQL servers and PostgreSQL database clients).
Connection pooling tools provide features like:
- Connection pooling: reduces connection overhead and improves the system's overall throughput.
- Replication: enables creating a real-time backup on two or more physical disks.
- Load balancing: as the name suggests, distributes queries across two or more replicated servers.
- Limiting exceeding connections: extra connections are queued instead of being rejected with an error immediately.
- Parallel query: data can be divided among multiple (replicated) servers, so a single query runs on all of them in parallel.
Configured properly, these features can tune Postgres to handle 100-200 concurrent users.
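As a rough sketch, the pooling side of a pgpool.conf could look like this (the values are illustrative, not recommendations):

    # pgpool.conf -- illustrative pooling settings
    listen_addresses = '*'      # clients connect to pgpool instead of Postgres
    port = 9999                 # pgpool's listening port
    num_init_children = 100     # pooling processes = concurrent client slots
    max_pool = 2                # cached backend connections per child process
    connection_cache = true    # reuse backend connections between clients

Clients beyond num_init_children are queued rather than rejected (the "limiting exceeding connections" feature above), and the backend's max_connections should be at least num_init_children * max_pool.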

Happy *postgresing*

To read more about performance tuning in PostgreSQL, read this.
For more on pgpool, click here.

Thursday, July 1, 2010

Module Integration with CI (Hudson)

A while back, we (the RM team at Openbravo) introduced a Continuous Integration (CI) tool, Hudson, for testing the code of our core ERP development branch.
It allowed our developers to do:
- Daily Builds (Full/Incremental).
- Smoke Tests.
- DB Consistency Tests.
- etc.

But as more and more developers went modular (Openbravo became modular with Openbravo ERP version 2.50), CI was not able to keep pace and provide similar help for module testing.

To set things right, we enabled our developers to integrate and test their modules using the new CI after committing even a single changeset to their module repository.

To my mind, the goal, or rather the purpose, of this whole effort was to let a developer have a nice and sound sleep after pushing a commit to the module repository.
Sounds confusing?

Let me explain.
Earlier, developers used to build a module and then run small but time-consuming manual tests to make sure their code was bug free.
From a developer's perspective, you cannot sleep properly until your module is tested and deployed properly.

To enable CI for modules and save developers the time spent on manual testing, we created a setup that helps them directly configure a new job in Hudson to test their changesets. This new setup empowers them to run (a sketch of such a job follows this list):
- Sanity check
- Source compilation check and OBX creation
- Database consistency test
- Module JUnit tests
- Installation of the generated OBX
- Uninstallation of the installed module
- Selenium tests (module smoke)
- Upgrade of the module from the previously published version in the CR (Central Repository) to the generated OBX
All of this runs for even a single new changeset in the module's repository.
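As a rough illustration only (the repository URL, module name, and ant targets below are placeholders, not our exact template-job configuration), the shell build step of such a Hudson job might look like:

    # Hypothetical Hudson "Execute shell" build step for a module job
    MODULE=org.mycompany.mymodule                      # job variable (placeholder)
    REPO=https://code.openbravo.com/erp/mods/$MODULE   # job variable (placeholder)

    hg clone $REPO modules/$MODULE        # fetch the latest changesets
    ant smartbuild                        # compile the sources into the ERP
    ant package.module -Dmodule=$MODULE   # create the OBX package
    # ...followed by DB consistency, JUnit, install/uninstall and Selenium steps

The template job described below pre-fills steps like these, so a developer only swaps in the module's own variables.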

And it still has endless possibilities: we can keep integrating new test cases into it.

We have also created a template job, pre-configured with all these test cases, to help developers configure and run tests for their modules easily.
A developer just has to copy the template job to a new job, change the variables to the module's own values, and run it. We have also written a simple wiki page with step-by-step instructions.

* Currently only Openbravo developers working on modules can take advantage of this tool (but the sky is the limit; maybe someday we can open it up to partners and the community).