Dealing with EOL chars

Alex Brelsfoard alex.brelsfoard at
Thu Jan 24 15:30:48 GMT 2008

Matt, All,

Here's a more detailed explanation:
We are reading in feeds (typically TSV or CSV files), parsing them,
reformatting the data, and spitting it out as another file.
These feeds can come from all sorts of people/places/things.
So sometimes they are wonderfully formatted and we understand their content.
Sometimes they are not, and we do not.

Imagine a feed that is a list of products.
Each row lists the name, type, description, and price of a product.
Now say you have someone using some sort of CMS to create their feed.
Now say that they have all of their information stored in Word files.
It is very easy to see the scenario where the person will just copy and
paste the description (with line breaks) into their CMS.
Their CMS may just enclose this field in quotes and move on.
So now we have a feed where the description column may have line breaks in it.
So I can't just split on any form of line break.
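
To make the problem concrete, here is a minimal sketch of splitting records
only on line breaks that fall outside quotes. It assumes fields are wrapped in
double quotes and that an embedded quote is written doubled (""), which not
every feed will honour. The name split_records and the sample data are made up
for illustration. Note that it also normalises the inline breaks to "\n", so
they are kept but not byte-for-byte.

```perl
use strict;
use warnings;

# Hypothetical helper: split raw feed text into records, keeping line
# breaks that fall inside double-quoted fields.
sub split_records {
    my ($text) = @_;

    # Normalise CRLF, CR and LF to "\n" before walking the text.
    $text =~ s/\015{1,2}\012|\015|\012/\n/g;

    my @records;
    my $current   = '';
    my $in_quotes = 0;
    for my $char (split //, $text) {
        if ($char eq '"') {
            # A doubled quote ("") toggles twice, so the state stays correct.
            $in_quotes = !$in_quotes;
        }
        if ($char eq "\n" && !$in_quotes) {
            # A break outside quotes ends the record.
            push @records, $current;
            $current = '';
        }
        else {
            $current .= $char;
        }
    }
    push @records, $current if length $current;
    return @records;
}

# Invented sample: two TSV rows, the first with a break inside its description.
my $feed = qq{Widget\ttoy\t"Line one\r\nline two"\t9.99\r\nGadget\ttool\t"Plain"\t4.50\r};
my @records = split_records($feed);
print scalar(@records), " records\n";
```

A feed with an unbalanced quote would swallow every following row into one
record, so real input probably needs a sanity check on top of this. The
Text::CSV module on CPAN, with its binary option turned on, is another way to
read fields that contain embedded newlines.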

Does this make a bit more sense?

btw, there's no chance that I could define $/ as a regex is there?
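
As far as perlvar documents it, $/ is always a plain string and never a regex.
The usual workaround is to unset $/ locally, read the whole file in one go, and
then apply the regex to the content. A minimal sketch, assuming the file fits
in memory; slurp_normalised is a hypothetical name and the temp file is only
there to make the example runnable:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file and normalise its line endings.
sub slurp_normalised {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "could not open $path for reading: $!";
    binmode($fh);    # leave CR/LF bytes untouched on every platform
    my $content = do { local $/; <$fh> };    # undef $/ reads everything at once
    close($fh);
    $content =~ s/\015{1,2}\012|\015|\012/\n/g;
    return $content;
}

# Write a small file that mixes PC, old Mac and *nix line endings.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "a\015\012b\015c\012";
close($tmp_fh);

my $text = slurp_normalised($tmp_path);
print join('|', split /\n/, $text), "\n";
```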

Thanks again for the help.

On Jan 24, 2008 5:50 AM, Matt Lawrence <matt.lawrence at> wrote:

> Alex Brelsfoard wrote:
> > Hi all,
> >
> > Sorry for making such an on-topic post, but seeing as this is my first post
> > with I figured I might pretend to be Perl-centric.
> >
> > I am currently trying to work on a system that reads in all kinds of feeds.
> > These feeds can be created on a PC, new/old mac, or a *nix machine.
> > And I need to be able to deal with them all.
> > Here's the kicker: these feeds sometimes have inline breaks, and we need to
> > keep them.
> >
> > Does anyone have any suggestions on how to deal with this?
> >
> > This works:
> > ---------------------------
> > my $newline = "\n";
> > my $file = '788_test.txt';
> > open(my $file_fh, '<', $file) or die "could not open $file for reading: $!";
> > # Slurp the whole file; a plain <$file_fh> in scalar context reads one line.
> > my $file_content = do { local $/; <$file_fh> };
> > $file_content =~ s/(?:\015{1,2}\012|\015|\012)/$newline/sg;
> > foreach (split(/\n/, $file_content)) {
> >     print "$_\n";
> > }
> > close($file_fh);
> > ---------------------------
> >
> > I can even split on that regular expression and save a line of code.
> > But I'm just concerned about losing inline breaks.
> >
> > Thoughts?
> >
> What do you mean by inline breaks?
> Matt
