Open-source Web applications, PHP vs. Java

Part 1:

It is common knowledge that PHP does well in the open-source Web applications space. PHP has numerous representatives for most application categories, and for some it provides a clear leader, like WordPress. On the other hand, most Java counterparts have apparently failed to reach the same popularity.

Below is an overview of the existing open-source PHP and Java implementations for the following categories of applications: forum, blog, wiki and content management systems (CMS). These are among the most commonly used types of on-line software.

Forum Engines

There are lots of PHP forum engines in the open-source software arena, including:

  • phpBB which has 700+ mods and is the winner of the 2007 Community Choice Awards for the Best Project for Communications category,
  • vBulletin with 1000+ mods,
  • punBB having 300+ projects and 150+ styles.

Notice the large amount of plugins for each of these projects. This denotes a large and healthy user base, devoted to both using and extending the core application.

Java has the JForum and JavaBB projects, but they seem to have very few users by comparison, not to mention plugin contributors.

Blog Engines

In the area of blogging, the PHP based WordPress is extremely well spread. Chances are that virtually any blog you read is powered by WordPress. The project has 2000+ plugins and at least 5 dedicated printed books. No other PHP blog engine comes close to the popularity of WordPress.

For Java, there are two relevant open-source blog applications: Apache Roller and Pebble. Roller is more feature-rich, but that comes at a considerable price: large footprint and difficult configuration. It is an enterprise application focused on very large blogging sites (e.g. the Sun blogs), but it seemingly fails to satisfy small-scale needs. I have yet to find a plugin developer community around it.

Pebble on the other hand is focused on the other end of the spectrum, providing a much simpler configuration and lower footprint. But I still couldn’t find plugins.

Wiki Engines

In this category, PHP seems to have an overwhelming advantage. The MediaWiki project powers Wikipedia, the largest wiki by far. The project has lots of available extensions, of which 300+ are stable. There are lots of other PHP wiki engines one can choose from, but I just wanted to point out the most prominent one.

Java does not have too many production-ready wiki projects. In my opinion, JspWiki (with 50+ plugins) and XWiki are the most relevant. I wanted to mention SnipSnap as well, however it’s development is officially stopped.

Content Management Systems

Content management systems are a handy way of building dynamic sites instead of starting from scratch. PHP seems to provide everything one needs, including a healthy competition among its foremost projects. These are Joomla (2900+ extensions, 14+ books published) and Drupal (3600+ modules, 9+ books). There are also other CMS projects e.g. Mambo and the ancient PHP-Nuke.

Because in most cases CMSs are deployed on a dedicated server, Java should not be at a disadvantage in this category. There are several open-source Java CMSs to choose from: Apache Jackrabbit, Apache Lenya, Alfresco, Liferay, OpenCms, Nuxeo, Magnolia, Jahia etc. Among these, Alfresco (2 printed books and 20+ stable extensions) and Liferay (portlets based, 1 printed book, 25+ portlet plugins) seem the more popular.

A special note for the Java Content Repository API defined by JSR-170. Both Jackrabbit and Magnolia implement it. This means that third-party tools can access their repositories in a standardized way. It is a very good step towards ensuring that information stored inside a JCR compliant repository can outlive a particular JCR implementation.

For now, the JCR API does not seem enough however. The PHP CMS projects are continuously gaining ground, probably because PHP hosting is cheap and easy to set up, and because the projects themselves are highly usable. Take into account that DZone and implicitly JavaZone (the former JavaLobby) are running on Drupal. And I’m sure that Rick and Matt tried to choose the best option while not easily dismissing the Java CMSs.

In the second part of this post, I will try to explore the reasons for the current state and to figure out a way to change it. Meanwhile feel free to add your input to the above, the list of projects is far from exhaustive.

Part 2:

The first part of this article reviewed some relevant open-source Web applications in both the PHP and Java worlds. The conclusion was that there are massively popular PHP projects and somewhat popular if not obscure Java counterparts. I’m a Java fan and it pains me to discover this reality. The user comments also underlined this feeling.

Is it the technical merits of PHP?

In my experience, the technical merits of PHP are below those of Java as a language and as a runtime environment (standard API, virtual machine).

Compared to Java, the code quality of PHP projects has a faster decreasing rate as the codebase size grows. The root cause is that PHP was created to solve small size problems and this makes it difficult to manage larger projects.

PHP 3 and 4 had basic object-oriented features, while PHP 5 improved them considerably, both at the language and the runtime level. There are several PHP MVC frameworks to ease the structuring of larger projects, but these are most effective when running on PHP 5. Most popular open-source PHP projects still run on PHP 4 and tend not to use MVC frameworks at all.

Looking at the staggering number of plugins available for the popular PHP open-source projects, one could conclude that their code is easily understandable and that PHP has well-rounded application extension mechanisms. Well, not exactly true.

The typical PHP extension mechanism is procedural and works like this:

  • list the subdirectories of the extensions directory,
  • analyze the predefined directory structure for each extension,
  • execute some predefined PHP files that should auto-register their resources and actions.

The greatest concern? no protection of the core code. All the important internal structures of the PHP runtime are map-like data structures to which the PHP code has full access: the global variables map, the functions map, the classes map etc.

As a long-time Java developer, I see these as drawbacks. But they don’t seem to reflect as such on these popular projects with very large user and developer communities. But if it’s not the language, what is it?

The execution models

Compared to Java, developing and hosting PHP projects is dead-simple because of the execution model.

In a typical setup, each request to a PHP application is handled by a separate Apache process that uses its own instance of the PHP interpreter. After handling the request, the process is killed with no garbage left behind. This sounds inefficient for high-concurrency usage, but works great in a shared hosting environment. If a hosted application has no active requests, it doesn’t use memory at all. Development-wise, each PHP script is written like every script instance (process) is the only one running.

The Java Web application model uses servlets and multiple threads to handle requests. It scales upwards very well, hence its success in the enterprise space. Problems arise if you want to host many smaller and less frequently used applications inside the same Web container, precisely because of the threads. Developers have to be careful about concurrency issues.

The process control mechanisms available at the OS level are vastly superior than those available for Java threads. The result is that a hosting provider has strict control over the resources used by each PHP request. On the other hand, a Java thread backing a request is an object that you cannot control once started: it stops when it wants to, it uses as much system resources as it pleases. The Web containers only have mechanisms for controlling the pool of threads.

The end result is that there are technical limitations in setting up a competitively priced reliable Java hosting solution. This brings us to two fundamental questions: do we need successful open-source Java Web projects suitable for non-enterprise use? Can Java survive without such projects?

Java can probably survive, but survival and flourishing are two different things. If the average present-day college student or hobbyist finds pleasure in using and extending open-source PHP (and generally non-Java) Web applications, I believe that it is only a matter of time until the effects are felt in the enterprise space: less enthusiastic Java specialists, less innovation, decreasing quality of products and so on. Some say that the effects are already present ? what do you think?

Instead, wouldn’t you like to run your blog on a Java-based highly extensible engine? Wouldn’t you like to build complex sites using a Java-based CMS with many high quality, readily available, easy to develop modules? Something that can be as small or as large as you want. I would.

Building successful open-source Java Web applications – your input is needed

In my opinion, there are three important premises for an open-source Web project to succeed:

  • the intrinsic value of the project,
  • the enthusiasm of its community (both developers and users),
  • easy hosting.

If Java presence is to increase in the area of open-source Web applications, all three are needed. The first two items are crucial because valuable projects and enthusiastic communities can even help improve the existing entry-level hosting options.

The fastest way to obtain Java projects with the same functionality as PHP ones is by automated software translation. The company I co-founded has developed technology that allows translating PHP applications to Java. I have already talked about this while presenting the nBB2 project, the Java equivalent of phpBB 2. The migration algorithms can be customized to produce various output flavors, ranging from plain servlets and JSP pages to Web frameworks like Struts or Spring MVC.

We would like to translate more open-source PHP projects to Java, but this would be just a first step. The Java community would then have to step in by using and extending them.

In the comments following the first part of this article it has been suggested that Sun act and sponsor key open-source Java Web projects. I my opinion, Sun already provides the Java platform and an open-source stack to support such endeavors. It is up to the Java community to make proper use of the technologies at hand. I look forward to hearing your thoughts on this topic, both online and in person. Next week I’ll be at JavaOne, so if you are around feel free to pass by the Numiton booth 1224-8 in the startup row.



7 thoughts on “Open-source Web applications, PHP vs. Java

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s