Friday, June 27, 2008

Read a File Without Trimming Leading Whitespace [Shell Scripting]

This is how you would normally read a file, line-by-line, in a shell script:
file=/home/fahd/dummy
while read line
do
    echo "$line"
done < "$file"
The problem is that the "read" command strips leading and trailing whitespace from each line and, when a line ends with a backslash, joins it to the following line. This means that you cannot faithfully process a file in which leading whitespace matters, e.g. indented XML files.
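To see the behaviour described above, here is a small sketch that runs the plain loop over a throwaway file. The sample file created with mktemp and its three lines are assumptions made purely for illustration; they are not part of the original script.

```shell
# Create a throwaway sample file (hypothetical content) with an indented
# line and a line that ends in a backslash.
sample=$(mktemp)
printf '%s\n' '    <item>' 'one \' 'two' > "$sample"

# Plain read: the indentation is stripped, and the backslash-newline
# joins the second line to the third.
plain=$(while read line; do printf '[%s]\n' "$line"; done < "$sample")
printf '%s\n' "$plain"
# prints:
# [<item>]
# [one two]

rm -f "$sample"
```

The square brackets are only there to make any surviving whitespace visible in the output.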

The way around this is to redefine the IFS (Internal Field Separator) variable. By default, IFS contains the space, tab and newline characters, which the read command uses to split its input into words; this is also why whitespace is stripped from the start and end of each line. This is how you can amend your script:

file=/home/fahd/dummy
OIFS=$IFS
IFS=
while read -r line
do
    echo "$line"
done < "$file"
IFS=$OIFS
First save the current value of IFS in a temporary variable called OIFS, then set IFS to the empty string. When you have finished reading the file, restore IFS from OIFS.

We also pass the -r flag to the read command, so that it treats each backslash as part of the input line and does not join the line with the next one.
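Changing IFS for the whole script can affect other commands that run before it is restored. A common alternative is to set IFS only for the read command itself, so there is nothing to save or restore. The following is a minimal sketch of that variant, not the script from this post; the function name read_raw_lines and the sample file are hypothetical and used only for illustration.

```shell
# read_raw_lines is a hypothetical helper name; it prints the file given
# as its first argument, line by line, exactly as stored.
read_raw_lines() {
    # "IFS=" placed before read applies to that single read call only.
    # -r keeps backslashes literal. The "|| [ -n "$line" ]" part also
    # picks up a final line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]
    do
        printf '%s\n' "$line"
    done < "$1"
}

# Throwaway sample file (hypothetical content) to exercise the helper.
sample=$(mktemp)
printf '%s\n' '    <item>' 'one \' 'two' > "$sample"

result=$(read_raw_lines "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

With this form the indentation and the trailing backslash should come through unchanged, and the script's IFS keeps its original value outside the loop.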

7 comments:

  1. Awesome, worked perfectly. Thanks! I just needed to make sure the IFS setting/resetting was done right around my for loop because it seemed to mess up sed, mv, awk, etc. I also had to make it "IFS=\n". In your example it looks like it's set to nothing.

  2. Scratch my earlier comment about IFS=\n. "IFS=" is the proper solution; I had just put it in the wrong place.

  3. WOW GOLD, nice blog. its worth reading. I liked it very much

  4. Thanks. This was so helpful.

  5. You saved my day !!

    Awesome, I am kinda searching the whole internet from yesterday, and here you go, thanks pal !!

  6. thanks buddy i was searching this

  7. Thanks - help of the day !!
