Thursday, February 3, 2011

Join files effectively on Linux

Is there a better way to join files that have been split than just using "cat" or "join"? These commands just copy the file streams into a new file on disk. A much better way would be to manipulate the filesystem pointers to join the files into one big continuous file. Of course, this would be filesystem-specific. Is there something available for ext2 or ext3?

  • No, the correct way to split files is:

    split bigfile
    

    and to conCATenate them:

    cat x* > newbigfile
    

    Trying to do this with the underlying filesystem is the wrong approach, if for no other reason than that it wouldn't be portable.

    Eddy Yosso : There are several use cases where portability is not the primary issue. Partition tools work with these kinds of issues all the time. I can't see why this shouldn't be available - at least for a root user.
    Dennis Williamson : See part of [Gary's answer](http://serverfault.com/questions/159397/join-files-effectively-on-linux/159415#159415). Filesystem abstraction would be broken. You're catting files, not manipulating filesystems. That's the difference between `split`/`cat` and `parted`. The rest of the OS wouldn't have any idea why all of a sudden something weird happened to some otherwise perfectly normal files. I wouldn't run such a utility except on an unmounted filesystem.
  • Yes. And it makes no sense for this kind of special case to be in userspace. It would break the whole idea of the filesystem abstraction.

    Dennis Williamson : Are you saying "yes" to the OP's question of whether there is a better way to join split files than "cat" or "join"? The rest of your answer seems to say "no".
    From Gary
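
The `split`/`cat` round trip from the first answer can be sketched end to end as follows. This is only an illustration under stated assumptions: the 5 MiB random test file, the 1 MiB piece size, and the temporary working directory are made up for the example and are not from the thread. It copies the data, exactly as the question describes, and does nothing at the filesystem level.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so the x* glob only matches our pieces.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical 5 MiB test file; in practice this would be the real big file.
head -c 5242880 /dev/urandom > bigfile

# Split into 1 MiB pieces named xaa, xab, ... ("x" is split's default prefix).
split -b 1048576 bigfile

# The shell expands x* in sorted order, which is the order split wrote the pieces.
cat x* > newbigfile

# cmp -s exits 0 only if the two files are byte-for-byte identical.
if cmp -s bigfile newbigfile; then
    result="identical"
else
    result="different"
fi
echo "$result"
```

Comparing the result with `cmp` is a cheap way to confirm that the glob picked up the pieces in the right order and that nothing else in the directory matched `x*`.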
