cfperl learns to talk
In this installment, Ted explains the workings of cfperl's top-level and compound-class parsers and moves us towards a clearer understanding of the important roles parsers play.
This chapter is from Ted's book, At the Helm, being published serially on the Web in the developerWorks column The road to better programming. Catch up on Ted's earlier chapters.
The top-level cfperl parser processes every configuration line before the line is granted meaning. The distinction between data and information is important here: the top-level parser receives data in the form of lines of text, and its job is to distill the information from that data. It will, for instance, remove all blank lines and all lines that contain only a comment. All data that has been processed, and is thus meaningful to the cfperl interpreter, is encapsulated for the second-level parsers. This makes it easier to write a second-level parser (for instance, a parser that interprets editing commands), because the top-level parser has already taken care of conditional execution, blank lines, and other information that is extraneous to editing.
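As a rough illustration of that first filtering step, here is a minimal sketch in Python. It is not cfperl's code (cfperl is written in Perl and uses Parse::RecDescent); the function name strip_noise and the sample configuration lines are hypothetical, and the sketch assumes that a comment-only line is one whose first non-blank character is "#".

```python
def strip_noise(lines):
    """Drop blank lines and lines that hold only a '#' comment.

    Hypothetical helper: it imitates only the filtering described in
    the text, not the actual cfperl grammar.
    """
    kept = []
    for line in lines:
        text = line.strip()
        # Skip empty lines and lines whose first non-blank character is "#".
        if not text or text.startswith("#"):
            continue
        kept.append(text)
    return kept


# Made-up sample input, loosely modeled on a cfengine configuration.
config = [
    "# a comment on a line by itself",
    "",
    "editfiles:",
    "   ",
    "  { /etc/motd AppendIfNoSuchLine 'hello' }",
]

meaningful = strip_noise(config)
```

Only the section name and the action line survive in meaningful, which is the kind of reduced input a second-level parser would then receive.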
The top-level cfperl parser is where cfengine sections, compound classes (sequences of classes joined by OR ["|"] and AND ["."] symbols to build a boolean statement), and configuration action lines (anything to be parsed by a second-level parser) are parsed. Each of those items will be explained in this article. Keep in mind that cfperl is a work in progress, and the cfperl version you see on the cfperl Web site may be different from the static version used in this article (see Resources for links to the Web site and the cfperl.pl download).
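To make the idea of a compound class concrete, here is a minimal sketch in Python of how such a boolean statement could be evaluated against a set of defined classes. It is not cfperl's implementation. The function name compound_class_is_true and the class names are hypothetical, and the sketch assumes that AND binds more tightly than OR and that no other operators (such as negation or parentheses) are present.

```python
def compound_class_is_true(expression, defined):
    """Evaluate a compound class such as "solaris.Monday|linux".

    Assumptions of this sketch: "." means AND, "|" means OR, and AND
    binds more tightly than OR. Negation and parentheses are ignored.
    """
    # Splitting on "|" first gives the OR alternatives; each alternative
    # is true only if every "."-separated class in it is defined.
    return any(
        all(name.strip() in defined for name in term.split("."))
        for term in expression.split("|")
    )


# Hypothetical set of classes that are currently defined.
defined_classes = {"linux", "Monday"}

both = compound_class_is_true("linux.Monday", defined_classes)
either = compound_class_is_true("solaris|linux", defined_classes)
neither = compound_class_is_true("solaris.Monday|linux.Tuesday", defined_classes)
```

With the classes above, both and either come out true, while neither is false because each of its OR alternatives contains an undefined class.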
To understand the following sections, you must understand the basics of the
Parse::RecDescent module. The Resources section has links to previous
articles on this very useful module, if you need that background. Also in Resources
you will find a link to a sample cfengine configuration, which is
essential to understanding the purpose and structure of cfperl.
On another note, I have abstained from actually naming the "#" character the way I name ":" (colon), for instance. The "#" character has at least 10 names, most commonly "pound" and "hash," but use of any particular name is guaranteed to confuse 50 percent of the readers.