Useful Perl Scripts With Regular Expressions

By Matthew Drouin - 2003-12-18

Converting From Unix Files To Windows Files

Lots of people seem to be moving from the Windows world to the Unix world, or from Unix to Linux, or from one operating system to another and back again, for many reasons. The problem is that operating systems terminate lines differently: some use a linefeed, \n (Unix), others use a carriage return plus a linefeed, \r\n (Windows), and yet others use just a carriage return, \r (Mac prior to OS X). So when files move between these operating systems, weird characters can show up and you might not know why.
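To find out which convention a file uses before converting it, you can count each kind of terminator. The following is only a minimal sketch: detect_line_ending is a hypothetical helper name, not part of any module, and it assumes the text was read in binary mode so nothing was translated on the way in. For a file with mixed terminators it reports whichever kind is most common.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the line-ending convention used in a string.
# Assumes $text was read with binmode so no terminators were translated.
sub detect_line_ending {
    my ($text) = @_;

    my $crlf = () = $text =~ /\r\n/g;        # Windows: CR followed by LF
    my $lf   = () = $text =~ /(?<!\r)\n/g;   # Unix: LF not preceded by CR
    my $cr   = () = $text =~ /\r(?!\n)/g;    # classic Mac: CR not followed by LF

    return 'none'    if $crlf + $lf + $cr == 0;
    return 'windows' if $crlf >= $lf && $crlf >= $cr;
    return 'unix'    if $lf >= $cr;
    return 'mac';
}

print detect_line_ending("one\r\ntwo\r\n"), "\n";   # prints "windows"
```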

The previous script is easily modified to remove the line termination string and add a new one. If you are moving from a Unix system to a Windows-based system, you want to remove each \n and add back \r\n in its place. This allows the file to be read in Windows applications like Notepad. If you have ever opened a file in Notepad and seen everything on one line with weird boxes, that is because the lines are not terminated the way Notepad expects.

There is a program called flip that can convert a single file at a time, but it is not as easy to use when you need to convert many files, including files in subdirectories.

The code below goes through each line and chomps it, which removes the line terminator from the end, and then appends the terminator we want. Note that chomp removes only the current input record separator ($/), which is \n by default. On a Unix system it therefore strips \n but leaves behind the \r of a line that already ends in \r\n, and a file that uses only \r (Mac) is read as one long line. I did not get to test this script on Windows or Mac. If it does not work for your files, please let me know; you can easily do a replace instead. I used chomp because it seemed like it would make for much cleaner code.
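If chomp does not strip everything you expect, the replace mentioned above could look something like the following. This is a sketch, not a tested drop-in for the script: to_windows_eol is an illustrative name, and it works on a whole string rather than line by line, since a file terminated only with \r would not be split into lines when reading with the default record separator.

```perl
use strict;
use warnings;

# Illustrative helper: rewrite every line terminator in $text as \r\n.
# The alternation tries \r\n first so an existing Windows terminator
# is matched as one unit and not doubled.
sub to_windows_eol {
    my ($text) = @_;
    $text =~ s/\r\n|\r|\n/\r\n/g;
    return $text;
}

my $mixed = "unix\nwindows\r\nmac\rlast";
my $fixed = to_windows_eol($mixed);

# Show the result with control characters made visible
( my $visible = $fixed ) =~ s/\r/\\r/g;
$visible =~ s/\n/\\n/g;
print "$visible\n";   # prints unix\r\nwindows\r\nmac\r\nlast
```

Because the substitution never rescans the text it has just inserted, running it on a file that is already in Windows format leaves the file unchanged.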

#!/usr/bin/perl

use strict;
use warnings;
use File::Find;

# Top of the directory tree to convert (use an absolute path)
my $directory = "/home/directory/";

find (\&process, $directory);

# Called by File::Find once for every file and directory under $directory
sub process
{
    my @outLines;  #Data we are going to output
    my $line;      #Data we are reading line by line

    #  print "processing $_ / $File::Find::name\n";

    # Only parse files that end in .html
    if ( $File::Find::name =~ /\.html$/ ) {

        open ( my $in, '<', $File::Find::name ) or
            die "Cannot open file for reading: $!";

        print "\n" . $File::Find::name . "\n";
        while ( $line = <$in> ) {
            chomp ( $line );                  # strip the old terminator
            push(@outLines, $line . "\r\n");  # add the Windows terminator
        }
        close ( $in );

        # Overwrite the original file with the converted lines
        open ( my $out, '>', $File::Find::name ) or
            die "Cannot open file for writing: $!";

        print $out @outLines;
        close ( $out );
    }
}

The code above is just like the code we used to change the body tag, so it should be pretty straightforward. I have often used scripts like this to move code that was created on a Unix machine to a Windows machine, or the other way around. I have also used this code to move major things like Perforce or CVS versioning files from one operating system to another, so hopefully it proves as useful to you as it has to me.
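For the other direction, Windows to Unix, a per-file helper might look like the sketch below. The name to_unix_file is hypothetical, and the approach differs from the script above in two ways that are my assumptions rather than the original method: it reads the whole file at once instead of line by line, and it opens the file with the :raw layer so that Perl on Windows does not translate terminators on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: convert one file to Unix (\n) line endings in place.
# The :raw layer stops Perl on Windows from translating terminators itself.
sub to_unix_file {
    my ($path) = @_;

    open( my $in, '<:raw', $path ) or die "Cannot read $path: $!";
    my $text = do { local $/; <$in> };   # slurp the whole file
    close($in);

    $text =~ s/\r\n?/\n/g;               # \r\n or a lone \r becomes \n

    open( my $out, '>:raw', $path ) or die "Cannot write $path: $!";
    print {$out} $text;
    close($out) or die "Cannot close $path: $!";
}
```

Inside a File::Find callback like process, it could be called as to_unix_file($File::Find::name) for each file that matches the extension test.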
