Under The Hood – PHP and MySQL

(This introductory article was written in 2001 to help explain to clients why LAMP (Linux, Apache, PHP and MySQL) was chosen as the infrastructure for certain kinds of web sites. We generally choose other tools now.)


Many of our dynamic web sites are built using PHP and MySQL. Although these products are frequently used together, they do not have to be, and each plays a separate role in the dynamic web content generation process. The two major components needed for that process are:

  1. A database system to hold the dynamic web site’s data. Although it is possible to build small sites on flat-file storage systems, a database server is a much more scalable, reliable foundation for a dynamic web site. The database server is not web-specific per se, although it may need to support a heavy load of queries, slanted heavily towards “SELECT” (data reading) queries rather than data updating queries.
  2. A web-database integration tool (scripting language or equivalent) for dynamically generating HTML pages, email messages, etc. based on data stored in the database, and processing user interactions to update that database.
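As a rough illustration of how these two components divide the work, here is a minimal sketch in Python, using the standard library's in-memory SQLite as a stand-in for the database server. The `articles` table, its columns, and the function names are hypothetical, chosen only for this example; they are not taken from any real site.

```python
import sqlite3
from html import escape


def create_sample_database():
    """Component 1: a database holding the site's data (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO articles (title) VALUES (?)",
        [("Under the Hood",), ("Tips & Tricks",)],
    )
    conn.commit()
    return conn


def render_article_list(conn):
    """Component 2: read rows with a SELECT and generate an HTML fragment."""
    rows = conn.execute("SELECT title FROM articles ORDER BY id").fetchall()
    # Escape the stored text so it cannot break the generated markup.
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return f"<ul>{items}</ul>"


def add_article(conn, title):
    """Process a user interaction by updating the database."""
    # A parameterized query keeps user input out of the SQL text.
    conn.execute("INSERT INTO articles (title) VALUES (?)", (title,))
    conn.commit()
```

On a typical site, something like `render_article_list` would run on nearly every page view, while `add_article` would run only occasionally, which is the read-heavy pattern described above.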

For the database, we often choose MySQL, for several reasons:

  • Reliability – based on reading comments from many users, MySQL appears to be a reliable product. We have experienced no MySQL failures of any kind ourselves.
  • Price – MySQL is free, keeping the budget to a minimum for smaller sites.
  • Speed – MySQL is designed to operate very quickly and efficiently for simple SELECT statements, at the expense of lacking some features that “heavier” databases have, such as transaction support, stored procedures, correlated subqueries, etc. Although these are very valuable features, most dynamic web sites do not need them, given the simple, read-heavy functionality such sites typically provide.

We are also familiar with a number of other database choices for other projects:

  • Microsoft SQL Server – this product, in its 7.0 and higher versions, is a very nice database server with excellent integration (of course) into the Windows NT environment. It would be an obvious choice to consider when working on a project which was required to be NT-based.
  • Oracle – Oracle is the market leader for large database servers, and runs behind the scenes at many of the internet’s busiest web sites. It has a mildly painful but extremely powerful stored procedure capability, and can be tuned for high performance under extreme load, if you can find a sufficiently skilled DBA. We are happy to build an Oracle-based solution where it is appropriate. Oracle is, however, quite expensive and requires much more hardware to run well than MySQL, for example.
  • PostgreSQL – This is the “other” free database server out there. It has only recently begun to approach the stability and speed of MySQL, and it offers a much richer set of features, more similar to the top-tier commercial database servers.
  • Interbase – Released at one point as open source by Borland, Interbase is also very fully featured. It seems not to be as fast as MySQL for simple SELECTs (the most common case in web apps), but can handle complex tasks effectively and does not require much DBA attention.
  • Various others.

For our programming / web-database integration tool, we are using PHP, a web-specific scripting language. There are literally hundreds of web-database tools to choose from, so not surprisingly there are many excellent choices. Some of the reasons we use PHP are:

  • Tight integration with Apache, the most popular web server on the internet.
  • Good performance, because of that tight integration. PHP is an excellent compromise between power and efficiency; it provides a flexible, expressive language, yet it is simple enough to teach to new team members quickly.
  • A very rich array of built-in features. In particular, PHP has many capabilities “in the box” that have to be added on to other popular solutions like ASP.
  • Built-in database access, with connection pooling.
  • It is Open Source, and under very active development; any bugs that appear get fixed quickly.
  • There are commercial performance enhancement mechanisms available if needed.

PHP is not without weaknesses, however. For example, its basic design encourages mixing the scripts in with the site’s HTML. This works well for small sites, but becomes very troublesome on large sites where different people are responsible for the HTML and the programming. Another weakness is that the language relies mostly on programming technique to remain manageable and structured; it has almost no support for modularization. Some other web-database integration tools that address these issues and others are:

  • Java Servlets, which leverage the increasingly dominant Java language.
  • A Java-based application server, such as WebLogic, WebSphere, etc. These can be used to build systems which include servlets, JSPs, EJB, JMS, and other powerful Java technologies.
  • mod_perl integrates the remarkably expressive Perl language into Apache, and the wide array of accompanying Perl modules can implement isolation between the application code and HTML if desired.
  • Zope, an application server based on the Python language, has a strong following and provides the desired separation between logic and HTML. It also offers total web-based management, scalability, and a vibrant developer community.

We are pleased and satisfied with the usability and performance of PHP and MySQL. They are an excellent combination for small to medium-scale read-intensive web application projects.

Recommended Reading

Design Patterns, Gamma et al. I have purchased and read several other books on this topic, including language-specific books and longer books. I still like this one best.

Programming Perl, Wall, Christiansen, and Orwant. The "camel book" is a clear and concise book about a perhaps unclear but very concise language.

UML Distilled, Martin Fowler, Kendall Scott, and Grady Booch. An excellent introduction to and explanation of the UML. This book can clear out the fog of an initially dizzying set of diagrams and distill out (!) their importance.

Refactoring: Improving the Design of Existing Code, Martin Fowler. This is a book that resonates strongly with experiences I have had where an existing code base had to be brought under control, cleaned up, and then enhanced. Such work is not totally ad hoc, but instead can be understood (and performed) as a series of well-defined transformations. By doing so, and talking about it, our ability to talk about and manipulate software at a higher level is enhanced.

Extreme Programming Explained, Kent Beck. There is a lot of merit in going to original sources… the first book on a topic, a specification document on a technology, etc. This is that book for Extreme Programming, and the subtitle on the cover expresses the core thrust of XP in two words.

 

The links all go to Amazon for simplicity’s sake; you can perhaps get them elsewhere at a lower cost.

Refactoring

Here is a common situation where refactoring can be used:

Problem:

You have a body of code which exposes an API (set of entry points). You want this code to have a different API.

Non-refactoring based solution:

Take the old code apart. From the pieces, write new code that has the desired API. The problem with this is that it’s a big leap… with little confidence along the way that the code will work.

Refactoring based solution:

Create an empty implementation of the desired API. Use the old code’s API to implement these functions – create a wrapper around the old code. Verify (with unit tests) that the code does what you want, using the desired API.

Now, consider the “wrapper” and the old code as a body of code to be refactored… incrementally change it to improve the design, maintainability, etc.

Alternatively, if the code is “throw-away” code not of long-term importance, you have the option to stop – just leave the code wrapped and un-refactored.
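The wrapper step can be sketched roughly as follows, in Python. Everything here is hypothetical and invented for illustration: `old_lookup` stands for the old code's API, `CustomerDirectory` for the desired API, and the pipe-delimited record format is just an example of an awkward legacy interface.

```python
import unittest


# --- Old code, exposing the API we want to move away from (hypothetical) ---
_RECORDS = {"42": "Ada|ada@example.com"}


def old_lookup(key):
    """Legacy entry point: returns a pipe-delimited string, or "" if missing."""
    return _RECORDS.get(key, "")


# --- Desired API, implemented for now as a thin wrapper over the old code ---
class CustomerDirectory:
    def find(self, customer_id):
        """Return a dict with 'name' and 'email', or None if not found."""
        raw = old_lookup(str(customer_id))
        if not raw:
            return None
        name, email = raw.split("|")
        return {"name": name, "email": email}


# --- Unit tests written against the desired API only ---
class CustomerDirectoryTest(unittest.TestCase):
    def test_find_existing_customer(self):
        self.assertEqual(
            CustomerDirectory().find(42),
            {"name": "Ada", "email": "ada@example.com"},
        )

    def test_find_missing_customer(self):
        self.assertIsNone(CustomerDirectory().find(7))
```

Because the tests touch only `CustomerDirectory`, they should keep passing while the wrapper and the old code underneath are incrementally merged and cleaned up. Saved as a module, the tests can be run with `python -m unittest`.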

Right now, most refactoring is happening “manually” – but in the future, one way IDEs can add more value is by automating the process. Imagine right-clicking a method and selecting “refactor up to superclass”. You can take a look at this idea with a tool that “adds in” refactoring to your existing IDE, like TransMorgify. A new Java IDE, IntelliJ IDEA, offers a limited set of refactoring tools, built into the IDE.

KCSM – Kyle Cordes’s Session Manager

Back in the dark ages of PHP3, session support was not built in.

KCSM implements an ASP-style session[] array in PHP3. There are two versions of KCSM – one works with files stored on the web server (this requires no database, and should work with any web hosting account), and one that works with MySQL.
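For readers unfamiliar with the idea, a file-backed session store can be sketched in a few lines. The following is not KCSM's code (which is PHP3); it is a minimal Python illustration of the general technique, and the `FileSession` class, the JSON file format, and the `sess_` file naming are all assumptions made for the example.

```python
import json
import os
import re
import uuid


class FileSession:
    """Session data kept in a dict and persisted to one JSON file per session."""

    def __init__(self, directory, session_id=None):
        # A new visitor gets a fresh random id; a returning visitor passes
        # theirs back (on a real site it would typically travel in a cookie).
        if session_id is None:
            session_id = uuid.uuid4().hex
        elif not re.fullmatch(r"[0-9a-f]{32}", session_id):
            # Reject anything that is not one of our ids, so a client cannot
            # point the store at an arbitrary file path.
            raise ValueError("invalid session id")
        self.session_id = session_id
        self._path = os.path.join(directory, f"sess_{session_id}.json")
        self.data = {}
        if os.path.exists(self._path):
            with open(self._path, encoding="utf-8") as f:
                self.data = json.load(f)

    def save(self):
        """Write the session data back to disk at the end of the request."""
        with open(self._path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
```

A page would create a `FileSession`, read and write `session.data["key"]` much like a session[] array, and call `save()` before finishing. A database-backed variant would replace the file reads and writes with a SELECT and an UPDATE.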

Download as a gzip’ed tar file: kcsm0.2.tgz

Download as a ZIP file: kcsm0.2.zip

An updated kcsm.php3 file, modified by a user (Garritt Grandberg), is available here.