Careers: Interviews
World Authority/Contributor to the PHP Project and Apache Module Author
This week, Stephen Ibaraki, I.S.P., has an
exclusive interview with an international authority in PHP, George
Schlossnagle.
George is a Principal at OmniTI Computer
Consulting, a Maryland-based tech company specializing in high-volume web and
email systems. Before joining OmniTI, George led technical operations at
several high-profile web sites where he gained experience managing PHP in very
large enterprise environments. George is a frequent contributor to the PHP
community. His work can be found in the PHP core, as well as in the PEAR and
PECL extension repositories.
George is a frequent speaker on PHP and web
technology. He is a regular speaker at PHP-Con, ApacheCon, the International
PHP Conference as well as many other local and regional engagements. His
writings have appeared in SELECT magazine (Journal of the International Oracle
Users Group) and PHP Magazine. He has forthcoming articles in php|architect and
Oracle Technology Network. His book, Advanced PHP Programming, published by
Sams Publishing, is garnering wide attention.
Before entering into information
technology, George trained to be an applied mathematician and served a two-year
stint as a teacher in the Peace Corps. His experience has taught him to value
an inter-disciplinary approach to problem solving that favors root-cause
analysis of problems over simply addressing symptoms.
Discussion:
Q: George, you are one of the world's
foremost authorities in PHP. We appreciate you taking the time to speak with
us.
A: Thanks, it's my pleasure.
Q: Give us a life history: how did you get into computing, and what valuable
lessons have you learned?
A: My first computer was an Apple 2e, which
I got during middle school. Unlike other more prolific hackers, I was only a
casual user and only did simple programming. My first 'hardcore' exposure to
computing was in 1993 when I interned in a summer program at Argonne National
Labs. I had been looking for an internship in pure mathematics (what I was
majoring in), but since that type of internship is very rare I ended up doing
research on wavelet techniques for solving fluid mechanics problems. This
involved writing numerical analysis software in Fortran.
Over the next several years I continued
studying mathematics in graduate school, crossing back and forth over the line
between theoretical and computational analysis. When I finally left graduate
school the tech bubble was still inflating and the internet seemed a natural
place to go. I had been maintaining my grad school department's UNIX systems
and was competent in Perl and teaching myself C (a big break from the Fortran
that many people still use in computational mathematics!), so I got a job doing
systems administration at iVillage.com. I enjoyed the high-pressure environment
of dot-com operations, so after a year there, I joined Community Connect, Inc. (CCI)
which runs the online community sites BlackPlanet.com, AsianAvenue.com and
MiGente.com.
At CCI I was Director of Operations and was
responsible for the architecture and all technical aspects of the site. The CCI
management gave me great leeway with infrastructure changes, and in the process
of growing the site to 130 million dynamic page requests per day, I learned a
tremendous amount about running PHP on a large site.
In 2002, I left CCI to join my brother at
OmniTI, a small consulting company he had formed and for which I had been doing
off-hours work for several years. OmniTI is a small shop and we specialize in
building scalable web and email architectures. For the past year I've devoted
most of my work time to developing Ecelerity, a high-speed MTA.
Q: What is your most surprising experience?
A: Gosh, that's a hard one. In the
technical venue, I would have to say the first time someone I didn't know
personally used a piece of open source software I wrote. On a personal front,
however, my wife and I are expecting our first child in the fall, so I've been
told that I will have many surprises ahead.
Q: Do you have any humorous stories to
share?
A: I have a rather dark sense of humor, so
for me a funny experience would be something like traveling to Portland for a
conference and finding my reservation cancelled and all the nearby hotels fully
booked. It was funny in retrospect at least.
Q: Please share your experiences in the
Peace Corps.
A: After two years in graduate school I began
to feel that the academic life was not for me; however, I was afraid that if I
took a job in industry I would become dependent on the income and never leave,
so I went looking for something with a fixed tenure. Offering two years abroad,
the Peace Corps seemed perfect. After a rather long application process, I had
the opportunity to serve as a secondary school Math teacher in Nepal. My fellow
volunteers and I underwent intensive in-country language training, and I was
stationed in the village of Khamlalung, in Tehrathum district, which lies in the
eastern hills of Nepal.
Khamlalung was pretty remote - it lies two
days from the nearest road and doesn't have electricity or running water. It
was also very beautiful. From the top of the ridgeline above my house, there
was an incredible view of Kangchenjunga (the third-highest mountain in the world). I
spent two years in Khamlalung teaching 4th through 10th grade Math and English,
and working on a number of community projects.
Living in the developing world is an
amazingly different experience from living in the US. Aside from the obvious
problems such as language difficulties, lack of amenities, and loneliness,
there are a wide range of cultural differences that you can either choose to
embrace or distance yourself from. I tried my best to embrace them and made a
number of great friends as a result. It was both the hardest thing I've ever done,
and the most rewarding.
Unfortunately, in the past couple years
Khamlalung (like many places in Nepal) has become embroiled in military actions
between Maoist insurgents and the Nepali army, hurting many friends,
acquaintances and students.
Q: You are a regular speaker at PHP-Con,
ApacheCon, and the International PHP Conferences. Please share some valuable
speaking and technical tips from these conferences.
A: I think too many speakers underestimate
their audiences. Speakers often target the difficulty of their topic at the
median audience member, and as a result end up going too slowly for half the
people there. I try to make my talks a bit more advanced, or at least more
fast-paced. I would rather have people ask me questions after the session than
bore everyone to death.
Q: Can you share your tips from your
writings in SELECT magazine and PHP Magazine?
A: SELECT was the first non-academic
journal I wrote for. In retrospect the article was pretty dry - it covered
capacity planning techniques for running Oracle on Solaris. Although the
material was narrowly focused, the basic idea was simple: in a web environment
where you have a large number of connections to a database, the key to avoiding
resource exhaustion is controlling connection concurrency. To do this, you
measure the average resource utilization for a single connection and then use
that to determine how many connections you can feasibly support. This general
strategy works for everything from sizing MySQL instances to webserver instances,
to any sort of client/server application.
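Illustration, not from the interview or the SELECT article: the sizing arithmetic described above can be sketched in a few lines of Python. The function name `max_connections` and all of the figures are hypothetical, and the sketch assumes memory is the limiting resource; the same division applies to any resource you can measure per connection.

```python
def max_connections(total_mb, reserved_mb, per_connection_mb):
    """Estimate how many concurrent connections a host can support.

    total_mb          -- memory installed on the host
    reserved_mb       -- memory set aside for the OS and shared caches
    per_connection_mb -- measured average footprint of one connection
    """
    if per_connection_mb <= 0:
        raise ValueError("per_connection_mb must be positive")
    usable = total_mb - reserved_mb
    if usable <= 0:
        return 0
    # Round down: a partial connection is not a connection.
    return int(usable // per_connection_mb)


# Hypothetical host: 8 GB of memory, 2 GB reserved, 12 MB per connection.
limit = max_connections(8192, 2048, 12)
print(limit)  # 512
```

A limit computed this way would then be enforced wherever concurrency is configured, for example as a cap on web server workers or database sessions.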
My PHP Magazine article described the inner
workings of compiler caches. In PHP, all data that is created during a request
is destroyed at the end of it, including the parse tree for the script that was
executed. This means that every time a script is run (or included by another
script), it must be read in from disk and parsed before it is executed. For
large scripts, or scripts with a large number of includes, this can be
extremely expensive. A compiler cache saves the results from the initial parse
of every script and allows you to avoid the parse overhead on subsequent
requests. Because it runs inside the Zend Engine (the scripting core of PHP), a
compiler cache is completely transparent to the programmer and requires no
modification of PHP code to function. It's about the closest you can come to a
configuration setting like 'fast = true'.
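Illustration, not from the interview: the idea of keeping the result of the parse step between runs can be shown by analogy in Python. This is not how APC or the Zend Engine is implemented; `load_compiled` and `run_script` are hypothetical helpers, and using the file's modification time to detect changes is an assumption made for the sketch.

```python
import os

# path -> (modification time in ns, compiled code object)
_cache = {}


def load_compiled(path):
    """Return compiled code for path, recompiling only if the file changed."""
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]  # cache hit: skip reading and parsing the file
    with open(path, "r", encoding="utf-8") as handle:
        source = handle.read()
    code = compile(source, path, "exec")  # the expensive parse step
    _cache[path] = (mtime, code)
    return code


def run_script(path, env=None):
    """Execute a script through the cache and return its namespace."""
    env = {} if env is None else env
    exec(load_compiled(path), env)
    return env
```

Calling `run_script` twice on an unchanged file parses it once and executes it twice, which is the saving a compiler cache provides on every request after the first.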
Q: What can you share from your articles in
php|architect and Oracle Technology Network?
A: I've written two articles for
php|architect, the most recent an introduction to regular expressions. Many PHP
programmers are scared off by regular expressions, mostly because they are a
relatively complex and terse language in and of themselves. That's unfortunate,
because when properly used regular expressions are an incredibly powerful tool
for analyzing and modifying text (which is a major part of web programming). Hopefully
my article dispels some of the mystery surrounding regular expressions.
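Illustration, not from the php|architect article: a small Python example of the two uses mentioned above, analyzing and modifying text. The log line and the pattern are made up for the sketch; the verbose flag lets each piece of an otherwise terse pattern carry its own comment.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows inline comments.
LOG_LINE = re.compile(r"""
    (?P<ip>\d{1,3}(?:\.\d{1,3}){3})   # client IPv4 address
    \s.*?                             # skip identity, user and timestamp
    "(?P<method>[A-Z]+)               # request method inside the quotes
    \s(?P<path>\S+)                   # requested path
""", re.VERBOSE)

line = '10.0.0.7 - - [12/Mar/2004:10:01:02 -0500] "GET /index.php HTTP/1.1" 200 512'

# Analyzing text: pull named fields out of the line.
m = LOG_LINE.match(line)
if m:
    print(m.group("ip"), m.group("method"), m.group("path"))

# Modifying text: collapse runs of whitespace into single spaces.
tidy = re.sub(r"\s+", " ", "too   many\t spaces")
print(tidy)  # too many spaces
```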
My OTN article is part of their series 'A
Hitchhiker's Guide To PHP'. It discusses how to avoid common pitfalls when
building large sites around PHP. The series also contains articles by core
developers Rasmus Lerdorf and Wez Furlong, so it's well worth checking out.
Q: Describe your collaborations with
Sterling Hughes, a core PHP contributor.
A: Sterling and I are good friends. We
co-presented a tutorial on performance tuning PHP at the International PHP
Conference in 2003, and he was a technical editor on Advanced PHP Programming. He's
made contributions to the APC and APD extensions for PHP. Like many of the best
technical relationships, our exchange of ideas has had much more impact on our
respective projects than any direct collaboration. We're hoping to write a book
together later this year.
Q: Your most recent book has a strong
endorsement from Rasmus Lerdorf, creator of PHP. Share your experiences with
the book and with Rasmus.
A: I met Rasmus for the first time at the
ApacheCon 2000 conference. Although I doubt he remembers the meeting, it was
the first inspiration I had for writing the APC compiler cache for PHP. Our
conversation went something like this:
GS: I'd like to use a compiler cache but
the Zend Accelerator is really expensive.
RL: So go write your own. It can't be that
hard, the hooks all have to be there.
GS: (panicked) Uhm... yeah, I guess so.
And a couple months later, after coming to
terms with the fact that he had to be right, APC was born.
Advanced PHP Programming is my first book. The
PHP book market is flooded with introductory texts that walk you up from a
basic level, but I wanted a book that would be interesting to people that
already know PHP well. In the personal reviews I've read on my book, a number
of people have said things along the lines of "I didn't think that there
was anything to learn about PHP that I didn't already know." If I can
please that sort of reviewer, I'll consider the book a success.
Q: What ten compelling tips can you share
from the book?
A: In many ways the most critical aspect of
being a professional programmer is being organized. These tips are pretty
obvious and not specific to PHP, but enough people ignore them that they are
worth mentioning:
1) Document your code. "Self-documenting
code" tends to mean that you're too lazy to correctly document your code. That
makes your code hard not only for others to understand but for you as well, if
you write any volume of code (at least when you become old and forgetful like
me). Documentation should be insightful, and guide the reader
through potentially confusing logic.
2) Keep your API consistent. If you have a
set of functions that require a resource (say a database handle), make sure it
always appears in the same position. This greatly reduces human error and keeps
people from constantly having to reference your documentation or source code.
3) Organize your code. Organization means
several things. First, reduce duplicated code by abstracting it into functions.
Second, group similar functions and classes together in include files so that
you know where to find what you are looking for. PHP parses and executes all
includes at runtime, so don't go overboard with creating a large library tree,
but also don't be afraid to have to include 5 or 10 files on every page. It's
always easier to combine things than to pull them apart.
4) Use a change-control system religiously.
I like CVS, because it's easy to find developers that know it; but Subversion,
BitKeeper and SourceSafe are fine options as well. The critical part is that
you should be able to easily roll your project back to any point in time and
have a complete record of all changes to the source.
5) Test your code. Using Unit Testing
consistently is a hard habit to acquire, but well worth it. The largest
obstacle to refactoring or enhancing in a large code base is that a small
change may have unexpected effects throughout the rest of the project. A
comprehensive Unit Testing suite is your insurance policy against this
uncertainty.
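Illustration, not from the book: tip 5 in miniature, written with Python's standard `unittest` module. The function `slugify` and its expected behavior are hypothetical; the point is only that each expectation is written down once and rechecked automatically after every change.

```python
import unittest


def slugify(title):
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Advanced PHP Programming"),
                         "advanced-php-programming")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Hello   World "), "hello-world")
```

Saved in a file, the suite can be run with `python -m unittest`. If a later refactoring of `slugify` changes its behavior, a failing test points to the problem before it reaches the rest of the project.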
On the performance/scalability side of
things:
6) Use a compiler cache. I talked a bit
about why this is useful above, but it's worth reinforcing that a compiler
cache is by far the easiest way to improve the performance of your site.
7) Look for opportunities to use caching
techniques in your code. Most dynamic websites are not wholly dynamic, in the
sense that their data is often static for seconds, minutes or even hours. Exploit
this short-term static-ness for performance benefits.
8) Profile your code. It is pure hubris to
think you know where all the bottlenecks are in your code. Using a profiler
(like APD or Xdebug) will help you gain insight into the code path taken
through your scripts, and where time is being spent.
9) Control your resource usage. Making 100
database calls on a page or making inline calls to remote SOAP services is just
crazy. Don't tie your own performance to that of third-party services which you
can't directly control.
10) Design your projects for horizontal
scalability. Websites grow and shrink in popularity, and your application
should be able to grow and shrink with incremental addition and subtraction of
hardware resources.
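Illustration, not from the book: tip 7 can be sketched as a small time-to-live cache in Python. The class `TTLCache` and its method names are hypothetical, and a real site would more likely keep such a cache in shared memory or an external store than in one process. The sketch only shows data that is static for a short period being computed once per period.

```python
import time


class TTLCache:
    """Keep computed values for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the cache is easy to test
        self._store = {}            # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() if it expired."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # still fresh: skip the expensive work
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


# Hypothetical use: rebuild a front page at most once every 30 seconds.
cache = TTLCache(30)
page = cache.get_or_compute("front-page", lambda: "<html>...</html>")
```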
Q: Can you provide debugging tips?
A: Many people like stepping debuggers
(which allow you to step through your code instruction by instruction, inspecting
and modifying variables as you go). Xdebug is one of the better PHP debuggers
in that sense. In an article I can no longer find a reference for, Martin
Fowler contends that the use of modern debuggers actually slows the development
of bug-free code. The idea is that by prematurely focusing on minutiae, you
tend to lose the forest for the trees. Instead, comprehensive test suites help
you quickly find the location of your potential bug.
Q: What future books can we expect from
you?
A: There are a few I'm thinking about, but
since the proposals aren't finished yet their details are secret for now. What
I can say is that they all revolve around PHP and Apache, which are the two
technologies I am most fond of.
Q: What are the most important trends to watch,
and what recommendations can you provide?
A: Well, I'm not a very good predictor of
technology trends. I remember looking at eBay back in 1999 and thinking 'Boy,
these guys have no product.' But since you asked ...
1) RSS and syndication formats. They change
the way the web is traversed.
2) Convergence of scripting languages. We
are already seeing this with many of the Microsoft languages compiling to CLR. Sterling
and Thies Arntzen are working on a PHP compiler that targets Parrot, the engine
to power Perl 6. This convergence is powerful as it helps pool development resources
to make all the concerned languages stronger.
3) Anti-Spam technologies are on the brink
of a revolution. There are many proposed standards on the table including Yahoo!'s
DomainKeys, Microsoft's Caller-ID, and SPF. It will be interesting to see where
the next year or two takes us there.
4) Look for PHP to continue penetrating the
corporate sphere. Although it tends to get little publicity, the number of
high-profile companies using PHP for both internal and external web-based
applications is growing.
Q: What are your top recommended resources
for both businesses and IT professionals?
A: I read tech news voraciously, so it's
hard to come up with a short list of resources. I subscribe to several hundred
RSS news feeds -- everything from major news portals such as Yahoo!'s
aggregated feeds, the BBC and Wired to small weblogs. Some of my favorite
sources of information (like the BoingBoing weblog or Bruce Schneier's
Cryptogram) aren't really technical resources, but are very interesting and
thoughtful reads. Basically, if it's smart and/or funny I'll read it. RSS feeds
make it easy to scan through news items efficiently to determine if I'm really
interested without spending the entire day surfing the web.
The only resource I can't imagine being
without is Google. How I functioned before its invention is lost on me. News
feeds provide a broad net for information gathering, and knowing how to use a
search engine well allows you to drill as deep as you want into any item.
Q: What kind of computer setup do you have?
A: I use a Linux desktop (Red Hat 7.3,
because it works and I don't want to break it by upgrading) and an Apple G4
PowerBook. I've tried to standardize on one or the other, but there are too
many aspects of each that I like.
Q: If you were doing this interview, what
three questions would you ask of someone in your position and what would be
your answers?
A: Q1: Why PHP?
A1: PHP has a number of things going for it
in the web space. It's highly portable, relatively fast, and has a very shallow
learning curve. Unlike a language like Java, where you really need skilled
technicians to program it, you can teach a non-programmer basic PHP in hours. PHP
also has an extremely strong community behind it, so there is good assurance
that the language will continue to evolve.
Q2: In Advanced PHP Programming, you've
chosen to target PHP5, which is still in beta. What motivated that choice?
A2: In truth, I hope the book is equally
applicable to both versions of PHP. My goal was to share my experiences on how
to write solid, manageable PHP applications. Ninety percent of the book is
completely agnostic to PHP versions. Aside from that general philosophy, PHP5's
object model is a real step up from that in PHP4, and I wanted to showcase some
of the changes.
Q3: When is a final release of PHP5 coming
out?
A3: When it's done.
Q: Do you have any more comments to add?
A: Nope.
Q: George, thank you again for your time
and consideration in doing this interview.
A: Thank you.