Developer Forums | About Us | Site Map


Useful Lists

Web Host
site hosted by netplex

Online Manuals

Cultured Perl: Reading and writing Excel files with Perl
By Teodor Zlatanov - 2004-09-23 Page:  1 2 3

Using the Spreadsheet::WriteExcel and Spreadsheet::ParseExcel modules

Only recently have the doors been open to Microsoft Excel, the most popular spreadsheet application for the desktop. This article takes a look at reading and writing Excel files in Windows and Linux, using Perl and a few simple modules. The author of this article, Teodor Zlatanov, is an expert in Perl who has been working in the community since 1992 and who specializes in, among other things, open source work in text parsing.

Parsing Excel files presents a conundrum any way you look at it. Until last year, UNIX modules were completely unavailable, and data from Excel files for Windows could only be retrieved with the Win32::OLE modules. But things have finally changed, thanks to two Perl hackers and a lot of volunteer help and contributions!

Spreadsheet::WriteExcel and Spreadsheet::ParseExcel

In 2000, Takanori Kawai and John McNamara produced the Spreadsheet::WriteExcel and Spreadsheet::ParseExcel modules and posted them on CPAN, which made it possible, though not easy, to extract data from Excel files on any platform.

As we'll see later, Win32::OLE still offers a simpler, more reliable solution if you're working with Windows, and is recommended by the Spreadsheet::WriteExcel module for more powerful manipulations of data and worksheets. Win32::OLE comes with the ActiveState Perl toolkit, and can be used to drive a lot of other Windows applications through OLE. Note that to use this module, you still need to have the Excel engine (usually installed with Excel itself) installed and licensed on your machine.

The applications that need to parse Excel data number in the thousands, but here are a few examples: exporting Excel to CSV, interacting with a spreadsheet stored on a shared drive, moving financial data to a database for reporting, and analyzing data not provided in any other format.

To follow along with the examples given here, you must have Perl 5.6.0 installed on your system. Preferably, your system should be a recent (2000 or later) mainstream UNIX installation (Linux, Solaris, BSD). Although the examples may work with earlier versions of Perl and UNIX, and with other operating systems, you should consider cases where they fail to function as exercises to solve.

Windows example: parsing

This section applies to Windows machines only. All the other sections apply to Linux.

Before you proceed, install ActiveState Perl (version 628 used here) or the ActiveState Komodo IDE for editing and debugging Perl. Komodo comes with a free license for home users, which you can get in a matter of minutes. (See Resources later in this article for the download sites.)

Installing the Spreadsheet::ParseExcel and Spreadsheet::WriteExcel modules using the ActiveState PPM package manager is difficult. PPM has no history, options are hard to set, help scrolls off the screen, and the default is to install modules ignoring dependencies. You can invoke PPM from the command line by typing "ppm" and issuing the following commands:

ppm> install OLE::Storage_Lite
ppm> install Spreadsheet::ParseExcel
ppm> install Spreadsheet::WriteExcel

The module install will fail in this case, because IO::Scalar is not yet available, so you may want to give up trying to find the problem with PPM, and switch to the built-in Win32::OLE module. However, by the time you read this, ActiveState may have released a fix for this problem.

With Win32::OLE from the ActiveState toolkit, you can dump a worksheet, cell by cell, using the code listed below:


#!/usr/bin/perl -w

use strict;
use Win32::OLE qw(in with);
use Win32::OLE::Const 'Microsoft Excel';

$Win32::OLE::Warn = 3;                                # die on errors...

# get already active Excel application or open new
my $Excel = Win32::OLE->GetActiveObject('Excel.Application')
    || Win32::OLE->new('Excel.Application', 'Quit');  

# open Excel file
my $Book = $Excel->Workbooks->Open("c:/komodo projects/test.xls"); 

# You can dynamically obtain the number of worksheets, rows, and columns
# through the Excel OLE interface.  Excel's Visual Basic Editor has more
# information on the Excel OLE interface.  Here we just use the first
# worksheet, rows 1 through 4 and columns 1 through 3.

# select worksheet number 1 (you can also select a worksheet by name)
my $Sheet = $Book->Worksheets(1);

foreach my $row (1..4)
 foreach my $col (1..3)
  # skip empty cells
  next unless defined $Sheet->Cells($row,$col)->{'Value'};

 # print out the contents of a cell  
  printf "At ($row, $col) the value is %s and the formula is %s\n",

# clean up after ourselves

Note that you can assign values to cells very easily in the following way:

$sheet->Cells($row, $col)->{'Value'} = 1;

View Cultured Perl: Reading and writing Excel files with Perl Discussion

Page:  1 2 3 Next Page: Linux example: parsing

First published by IBM developerWorks

Copyright 2004-2024 All rights reserved.
Article copyright and all rights retained by the author.