Using the Spreadsheet::WriteExcel and Spreadsheet::ParseExcel modules
Only recently have the doors been opened to Microsoft Excel, the most popular spreadsheet application for the desktop. This article takes a look at reading and writing Excel files in Windows and Linux, using Perl and a few simple modules. The author of this article, Teodor Zlatanov, is an expert in Perl who has been working in the community since 1992 and who specializes in, among other things, open source work in text parsing.
Parsing Excel files presents a conundrum any way you look at it. Until last year, UNIX modules were completely unavailable, and data from Excel files for Windows could only be retrieved with the Win32::OLE modules. But things have finally changed, thanks to two Perl hackers and a lot of volunteer help and contributions!
Spreadsheet::WriteExcel and Spreadsheet::ParseExcel
In 2000, Takanori Kawai and John McNamara produced the Spreadsheet::WriteExcel and Spreadsheet::ParseExcel modules and posted them on CPAN, which made it possible, though not easy, to extract data from Excel files on any platform.
As we'll see later, Win32::OLE still offers a simpler, more reliable solution if you're working with Windows, and is recommended by the Spreadsheet::WriteExcel module for more powerful manipulations of data and worksheets. Win32::OLE comes with the ActiveState Perl toolkit, and can be used to drive a lot of other Windows applications through OLE. Note that to use this module, you still need the Excel engine (usually installed with Excel itself) present and licensed on your machine.
The applications that need to parse Excel data number in the thousands, but here are a few examples: exporting Excel to CSV, interacting with a spreadsheet stored on a shared drive, moving financial data to a database for reporting, and analyzing data not provided in any other format.
To follow along with the examples given here, you must have Perl 5.6.0 installed on your system. Preferably, your system should be a recent (2000 or later) mainstream UNIX installation (Linux, Solaris, BSD). Although the examples may work with earlier versions of Perl and UNIX, and with other operating systems, you should consider cases where they fail to function as exercises to solve.
Windows example: parsing
This section applies to Windows machines only. All the other sections apply to Linux.
Before you proceed, install ActiveState Perl (version 628 used here) or the ActiveState Komodo IDE for editing and debugging Perl. Komodo comes with a free license for home users, which you can get in a matter of minutes. (See Resources later in this article for the download sites.)
Installing the Spreadsheet::ParseExcel and Spreadsheet::WriteExcel modules using the ActiveState PPM package manager is difficult. PPM has no history, options are hard to set, help scrolls off the screen, and the default is to install modules ignoring dependencies. You can invoke PPM from the command line by typing "ppm" and then issuing an install command for each module.
The module install will fail in this case, because IO::Scalar is not yet available, so you may want to give up trying to find the problem with PPM, and switch to the built-in Win32::OLE module. However, by the time you read this, ActiveState may have released a fix for this problem.
With Win32::OLE from the ActiveState toolkit, you can dump a worksheet, cell by cell, using the code listed below:
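A minimal sketch of such a dump, assuming Windows with Excel installed. The helper names describe_cell and dump_sheet, the fixed range of 4 rows by 3 columns, and the workbook path taken from the command line are illustrative choices and not part of Win32::OLE itself.

```perl
use strict;
use warnings;

# Format one cell for display. Kept apart from the OLE calls so that it
# can be exercised on any platform. (Hypothetical helper.)
sub describe_cell {
    my ($row, $col, $value, $formula) = @_;
    return sprintf "At (%d, %d) the value is %s and the formula is %s",
        $row, $col, $value, $formula;
}

# Dump rows 1..$rows and columns 1..$cols of the first worksheet in $file.
# Needs Windows plus a licensed Excel installation. (Hypothetical helper.)
sub dump_sheet {
    my ($file, $rows, $cols) = @_;

    require Win32::OLE;
    Win32::OLE->Option(Warn => 3);    # die on OLE errors

    # Reuse a running Excel if there is one; otherwise start a new instance
    # and have it quit when the object goes away.
    my $excel = Win32::OLE->GetActiveObject('Excel.Application')
        || Win32::OLE->new('Excel.Application', 'Quit');

    my $book  = $excel->Workbooks->Open($file);
    my $sheet = $book->Worksheets(1);    # worksheets are numbered from 1

    for my $row (1 .. $rows) {
        for my $col (1 .. $cols) {
            my $cell = $sheet->Cells($row, $col);
            next unless defined $cell->{Value};    # skip empty cells
            print describe_cell($row, $col, $cell->{Value}, $cell->{Formula}), "\n";
        }
    }

    $book->Close;
}

# Only touch OLE on Windows, and only when a workbook path was supplied.
dump_sheet($ARGV[0], 4, 3) if $^O eq 'MSWin32' && @ARGV;
```

The row and column counts are hard-coded here for brevity; the Excel OLE interface can also report the number of worksheets, rows, and columns at run time.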
Note that you can assign values to cells very easily in the following way:
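A short sketch of the assignment, assuming $sheet is a worksheet object obtained through Win32::OLE (for example, from a workbook's Worksheets collection). The wrapper name set_cell is hypothetical; the essential part is the single assignment to the Value property.

```perl
use strict;
use warnings;

# Store $value in the cell at ($row, $col) of an Excel worksheet object.
# set_cell is an illustrative wrapper, not a Win32::OLE function.
sub set_cell {
    my ($sheet, $row, $col, $value) = @_;
    $sheet->Cells($row, $col)->{Value} = $value;
    return $value;
}
```

With a worksheet in hand, a call such as set_cell($sheet, 1, 1, 42) would put 42 into the top-left cell. The change affects only the open workbook until that workbook is saved.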