The Lire::DlfConverter interface requires two kinds of methods. First, it requires methods which give the framework information about your converter. Second, it requires methods which actually implement the conversion process. It is this interface that this section documents.
The method name() should return the name of our DLF converter. It is this name that is passed to the lr_log2report command. This name must be unique among all the registered converters, and it should be restricted to alphanumeric characters (hyphens, periods, and underscores can also be used).
          
We will name our converter
            common_syslog:
            
sub name {
    return "common_syslog";
}            
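The naming rule above can be checked mechanically. The following is a minimal sketch, assuming the rule means "one or more letters, digits, hyphens, periods, or underscores"; is_valid_converter_name() is a hypothetical helper for illustration and is not part of Lire:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lire: checks a candidate converter
# name against the character set described above.
sub is_valid_converter_name {
    my ( $name ) = @_;
    return ( defined( $name ) && $name =~ /\A[A-Za-z0-9._-]+\z/ ) ? 1 : 0;
}

# The name used in this section passes; a name with a space does not.
print is_valid_converter_name( "common_syslog" )  ? "ok\n" : "rejected\n";
print is_valid_converter_name( "common syslog!" ) ? "ok\n" : "rejected\n";
```

Uniqueness among registered converters cannot be checked this way, since it depends on what is installed on the host.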
            
The next two required methods give users more verbose information about your converter. The converter's title() and description() can be used to display information about your converter in the user interface or to generate documentation.
          
The title() method should simply return a string:
            
sub title {
    return "Common Log Format embedded in Syslog DLF Converter";
}
            
The description() method should return a DocBook fragment describing your converter and the log formats it supports. If you don't know DocBook, just restrict yourself to using para elements to make paragraphs:
            
sub description {
   return <<EOD;
<para>This DLF Converter extracts web server's requests and error
information from a syslog file. 
</para>
<para>The requests and errors should be logged under the
<literal>httpd</literal> program name. The errors are mapped to the
<type>syslog</type> schema, the requests are mapped to the
<type>www</type> schema.
</para>
<para>Syslog records from programs other than
<literal>httpd</literal> are ignored.
</para>
EOD
}
            
Two other meta-data methods are used by the framework itself. The first one specifies which DLF schemas your DLF converter converts to:
sub schemas {
    return ( "www", "syslog" );
}
In our case, we are converting to the syslog and www schemas. As stated in our converter's description, we map the web server's error messages to the syslog schema and the request logs to the www schema. Other alternatives would have been to map only the request information to the www schema, or to map all the non-request records to the syslog schema. The rationale behind the current choice (besides this being an example) is that it makes it convenient to process one log file and obtain a report containing both the requests and the errors from our web server. For that use case, it is best to ignore records unrelated to the web server.
The other method affects how the conversion process is handled. Lire offers two modes of conversion: line-oriented and file-oriented. (Both are described in the next section.) If your log file is line-oriented (each line is one log record), as most log files are, you should use the line-oriented conversion mode:
sub handle_log_lines {
    return 1;
}
            
The actual conversion process is handled through three methods: init_dlf_converter(), finish_conversion() and either process_log_file() or process_log_line(), depending on the conversion mode (as determined by the return value of handle_log_lines()).
          
The method init_dlf_converter() will be called once before the log file is processed. It should be used to initialize the state of your converter. Since our DLF converter doesn't need any initialization or configuration, the method is simply empty:
              
sub init_dlf_converter {
    my ( $self, $process ) = @_;
    return;
}
              
The $process parameter which is
              passed to all the processing methods is an instance of
              Lire::DlfConverterProcess. This
              is the object which is driving the conversion process
              and it defines several methods which you will use in the
              actual conversion process.
            
The method finish_conversion() will be called once after the log file has been completely processed. This method is mostly of use to stateful converters, that is, DLF converters which generate DLF records from more than one line. Since this is not our case, we simply leave the method empty:
              
sub finish_conversion {
    my ( $self, $process ) = @_;
    return;
}
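The call order described so far can be sketched without Lire installed. In the following sketch, MockProcess, ToyConverter and run_conversion() are all hypothetical stand-ins written for this illustration; the real Lire::DlfConverterProcess does considerably more than record calls, and the real framework's driver is not shown in this section:

```perl
use strict;
use warnings;

# Hypothetical stand-in for Lire::DlfConverterProcess: it only records
# the calls a converter makes on it.
package MockProcess;
sub new { my $class = shift; return bless { calls => [] }, $class }
sub error           { my $s = shift; push @{ $s->{calls} }, [ error => @_ ]; return }
sub ignore_log_line { my $s = shift; push @{ $s->{calls} }, [ ignore_log_line => @_ ]; return }
sub write_dlf       { my $s = shift; push @{ $s->{calls} }, [ write_dlf => @_ ]; return }

# Toy converter exposing the method names used in this section.
package ToyConverter;
sub new { my $class = shift; return bless { seen => 0 }, $class }
sub handle_log_lines   { return 1 }
sub init_dlf_converter { my ( $self, $process ) = @_; $self->{seen} = 0; return }
sub finish_conversion  { my ( $self, $process ) = @_; return }
sub process_log_line {
    my ( $self, $process, $line ) = @_;
    $self->{seen}++;
    if ( $line =~ /\S/ ) {
        $process->write_dlf( "syslog", { message => $line } );
    } else {
        $process->ignore_log_line( $line, "blank line" );
    }
    return;
}

package main;

# Hypothetical driver: init once, one call per line, finish once.
sub run_conversion {
    my ( $converter, $process, @lines ) = @_;
    $converter->init_dlf_converter( $process );
    if ( $converter->handle_log_lines() ) {
        $converter->process_log_line( $process, $_ ) for @lines;
    }
    $converter->finish_conversion( $process );
    return $process;
}

my $converter = ToyConverter->new;
my $process   = run_conversion( $converter, MockProcess->new,
                                "httpd: started", "" );

# Show which reporting method each input line triggered.
print "$_->[0]\n" for @{ $process->{calls} };
```

Recording calls in a mock like this is also a convenient way to unit test a converter's process_log_line() in isolation.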
              
Whether you are using the file-oriented or line-oriented conversion mode, the principles are the same. You extract information from the log file and create DLF records from it. Your DLF converter communicates with the framework by calling methods on the Lire::DlfConverterProcess object which is passed as a parameter to your methods.
            
Here is the complete code of our conversion method:
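use Lire::Apache qw/parse_common/;

sub process_log_line {
    my ( $self, $process, $line ) = @_;

    # Parse the syslog envelope; a parse failure is trapped by eval.
    my $sys_rec = eval { $self->{syslog_parser}->parse( $line ) };
    if ( $@ ) {
        # Not a valid syslog record: report the error with its line.
        $process->error( $@, $line );
        return;
    } elsif ( $sys_rec->{process} ne 'httpd' ) {
        # Record from another program: tell the framework we skip it.
        $process->ignore_log_line( $line, "not an httpd record" );
        return;
    } else {
        my $common_dlf = {};
        eval { parse_common( $sys_rec->{content}, $common_dlf ) };
        if ( $@ ) {
            # Not a request line: treat it as a server error message.
            $sys_rec->{message} = $sys_rec->{content};
            $process->write_dlf( "syslog", $sys_rec );
        } else {
            # A request line: write it as a www record.
            $process->write_dlf( "www", $common_dlf );
        }
    }
    return;
}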
              
The first thing that should be noted is that in the
              line-oriented conversion mode, the method
              process_log_line() will be
              called once for each line in the log file.
            
Secondly, the actual parsing of the line is done using two functions: parse_common and Lire::Syslog's parse. These functions simply use regular expressions to extract the appropriate information from the line and put it in a hash reference. What is important is that they already use the schema's field names as key names.
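To illustrate the parsing style just described, here is a minimal sketch of a regular-expression parser that fills a hash reference and dies on failure. parse_clf_sketch() is a hypothetical function written for this illustration, not the actual parse_common, and its key names are illustrative rather than the real field names of the www schema:

```perl
use strict;
use warnings;

# Hypothetical parser in the spirit of parse_common(): it fills the
# given hash reference or dies. Key names are illustrative only.
sub parse_clf_sketch {
    my ( $line, $rec ) = @_;
    my ( $host, $user, $time, $request, $status, $size ) =
        $line =~ /\A(\S+) \S+ (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)\s*\z/
        or die "not a Common Log Format line\n";
    @{$rec}{qw/client_host who time request status size/} =
        ( $host, $user, $time, $request, $status,
          $size eq '-' ? undef : $size );
    return $rec;
}

my $rec = {};
my $ok  = eval {
    parse_clf_sketch(
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        . '"GET /index.html HTTP/1.0" 200 2326', $rec );
    1;
};
print $ok ? "request with status $rec->{status}\n" : "failed: $@";

# A line that is not a request makes the parser die, which is what
# lets a caller fall back to another schema.
my $is_request = eval { parse_clf_sketch( "[error] File does not exist", {} ); 1 };
print "second line is not a request\n" unless $is_request;
```

Because the parser writes directly under the names the caller will pass on, no renaming step is needed between parsing and writing the record.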
            
Finally, you can see that three different methods are called on the $process object to report different kinds of information:
              
The example uses the
                      eval statement to trap
                      errors during the syslog record parsing. If the
                      line cannot be parsed as a valid syslog record,
                      it is an error and it is reported through the
                      error() method. The
                      first parameter is the error message and the
                      second one is the line to which the error is
                      associated. This last parameter is optional.
                    
When the syslog event doesn't come from the httpd process, we ignore the line. Ignored lines are reported to the framework by using the ignore_log_line() method. The first parameter is the line which is ignored. The second, optional parameter gives the reason why the line was ignored.
                    
Finally, DLF records are created by using the write_dlf() method. Its first parameter is the schema to which the DLF record complies. This schema must be one that is listed by your converter's schemas() method. The second parameter is the DLF data contained in a hash reference. The DLF record will be created by taking, for each field in the schema, the value under the same name in the hash. (In the syslog schema, the field which contains the actual log message is called message; this is why we assign the content value to the message key.) Missing fields or fields whose value is undef will contain the special LR_NA missing value marker. Keys in the hash that don't map to a schema's field are simply ignored.
                    
In our example, we distinguish between the
                      server's error message (mapped to the
                      syslog schema) and the request
                      information (mapped to the www
                      schema) based on whether
                      parse_common succeeded in
                      parsing the line.
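The field-matching rule described for write_dlf() can be sketched as follows. This is an illustration under stated assumptions, not Lire's implementation: project_onto_schema() is a hypothetical helper, the field list is made up for the example rather than taken from the real syslog schema, and the string 'LR_NA' merely stands in for the framework's missing value marker:

```perl
use strict;
use warnings;

# Stand-in for the framework's missing value marker; its real
# representation is not shown in this section.
my $LR_NA = 'LR_NA';

# Hypothetical helper: keep only the schema's fields, in schema order,
# substituting the marker for missing or undef values.
sub project_onto_schema {
    my ( $fields, $dlf ) = @_;
    return [ map { defined $dlf->{$_} ? $dlf->{$_} : $LR_NA } @$fields ];
}

# Illustrative field list, not the actual syslog schema definition.
my @fields = qw/timestamp hostname process message/;

my $record = project_onto_schema( \@fields, {
    process  => 'httpd',
    message  => 'File does not exist',
    content  => 'File does not exist',   # not a schema field: dropped
    hostname => undef,                   # undef: replaced by the marker
} );

print join( " | ", @$record ), "\n";
# prints "LR_NA | LR_NA | httpd | File does not exist"
```

The timestamp key is absent and the hostname value is undef, so both come out as the marker, while the extra content key never reaches the record.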
                    
Another possibility, not shown in our example, is to ask that the line be saved for later processing. This is mostly of use to converters which maintain state between lines. In such converters, it is often the case that related lines are missing from the end of the log file. In that case, you save the lines and they will automatically be seen by the next run of your converter on the same DLF store. This option is only available in the line-oriented mode of conversion.
The same principles apply when you are using the file-oriented mode of conversion. This mode will usually be used for binary log formats or formats which aren't line-oriented, like XML.
For demonstration purposes, the following code could be added to transform our line-oriented converter into a file-oriented one:
sub handle_log_lines { 
    return 0;
}
sub process_log_file {
    my ( $self, $process, $fh ) = @_;
    
    my $line;
    while ( defined( $line = <$fh> ) ) {
        chomp $line;
        $self->process_log_line( $process, $line );
    }
}
                
The difference between the above code and using the line-oriented mode is that the framework won't be aware of the number of log lines processed, and your converter might have trouble processing log files which use a different line-ending convention than the host you are running on. The bottom line is that you should use the line-oriented conversion mode when your log format is line-oriented.
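The line-ending problem can be seen in isolation. With Perl's default input record separator, chomp removes only a trailing newline, so a record from a file with DOS-style endings keeps its carriage return. The following minimal sketch reads from an in-memory file and strips either ending explicitly; read_records() is a hypothetical helper for this illustration and is not something Lire provides:

```perl
use strict;
use warnings;

# With the default $/ of "\n", chomp leaves the "\r" of a CRLF ending.
my $chomped = "first record\r\n";
chomp $chomped;

# Hypothetical helper: read every line from a filehandle, stripping
# either LF or CRLF instead of relying on chomp.
sub read_records {
    my ( $fh ) = @_;
    my @records;
    while ( defined( my $line = <$fh> ) ) {
        $line =~ s/\r?\n\z//;
        push @records, $line;
    }
    return @records;
}

# Two records, the first one with a CRLF ending.
my $data = "first record\r\nsecond record\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my @records = read_records( $fh );
close $fh;

printf "chomp left a carriage return: %s\n",
    ( $chomped =~ /\r\z/ ? "yes" : "no" );
print "[$_]\n" for @records;
```

A file-oriented converter that reads text lines itself would need this kind of care, which is one more reason to prefer the line-oriented mode for line-oriented formats.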