Saturday, October 08, 2011

Splitting a large file into smaller pieces

If you have a large file and want to break it into smaller pieces, you can use the Unix split command. You give it a prefix for the output file names, and it appends an alphabetic (or numeric) suffix to each one.
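As a quick illustration of the default naming, here is a minimal sketch that splits a tiny stand-in file without asking for numeric suffixes. The scratch directory, the file name smallfile and the prefix smallfile.part. are made up for this illustration.

```shell
# Scratch directory and sample data are stand-ins for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 30 > smallfile

# No -d and no -a: split falls back to two-letter alphabetic suffixes.
split -l 10 smallfile smallfile.part.

# Collect and print the names of the pieces that were created.
pieces=$(ls smallfile.part.*)
echo "$pieces"
```

With 30 lines and 10 lines per piece, this should leave three files whose names end in aa, ab and ac.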

In the example below, I split a file containing 100,000 lines. I instruct split to use numeric suffixes (-d), put 10,000 lines in each split file (-l 10000) and use suffixes of length 3 (-a 3). As a result, ten split files are created, each with 10,000 lines.

$ ls
hugefile

$ wc -l hugefile
100000 hugefile

$ split -d -l 10000 -a 3 hugefile hugefile.split.

$ ls
hugefile                hugefile.split.005
hugefile.split.000      hugefile.split.006
hugefile.split.001      hugefile.split.007  
hugefile.split.002      hugefile.split.008
hugefile.split.003      hugefile.split.009
hugefile.split.004

$ wc -l *split*
 10000 hugefile.split.000
 10000 hugefile.split.001
 10000 hugefile.split.002
 10000 hugefile.split.003
 10000 hugefile.split.004
 10000 hugefile.split.005
 10000 hugefile.split.006
 10000 hugefile.split.007
 10000 hugefile.split.008
 10000 hugefile.split.009
100000 total
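If you later need the original back, the pieces can be concatenated in order. The sketch below is one way to check that nothing was lost. It assumes a split that supports -d (GNU coreutils does), and it uses generated sample data and a made-up output name, hugefile.rejoined, rather than the real file.

```shell
# Work in a scratch directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real file: 100,000 numbered lines (sample data, not from the post).
seq 1 100000 > hugefile

# Same split as above: numeric suffixes, 10,000 lines per piece, suffix length 3.
split -d -l 10000 -a 3 hugefile hugefile.split.

# The shell expands the glob in sorted order, and fixed-width
# suffixes keep 000 through 009 in the right sequence.
cat hugefile.split.* > hugefile.rejoined

# cmp -s prints nothing and exits 0 when the two files are byte-identical.
if cmp -s hugefile hugefile.rejoined; then
    result="identical"
else
    result="different"
fi
echo "$result"
```

The fixed suffix length matters here: because every suffix has the same width, the alphabetical order of the glob matches the order in which the pieces were written.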
