GNU tar 1.34: 8.4 Comparison of tar and cpio
8.4 Comparison of tar and cpio
(This message will disappear once this node is revised.)
The cpio archive formats, like tar, do have maximum file name lengths. The binary and old ASCII formats have a maximum file name length of 256, and the new ASCII and CRC ASCII formats have a maximum file name length of 1024. GNU cpio can read and write archives with arbitrary file name lengths, but other cpio implementations may crash inexplicably when trying to read them.
tar handles symbolic links in the form in which it comes in BSD; cpio doesn’t handle symbolic links in the form in which it comes in System V prior to SVR4, and some vendors may have added symlinks to their system without enhancing cpio to know about them. Others may have enhanced it in a way other than the way I did it at Sun, and which was adopted by AT&T (and which is, I think, also present in the cpio that Berkeley picked up from AT&T and put into a later BSD release; I think I gave them my changes).
(SVR4 does some funny stuff with tar; basically, its cpio can handle tar format input, and write it on output, and it probably handles symbolic links. They may not have bothered doing anything to enhance tar as a result.)
cpio handles special files; traditional tar doesn’t.
tar comes with V7, System III, System V, and BSD source; cpio comes only with System III, System V, and later BSD (4.3-tahoe and later).
tar’s way of handling multiple hard links to a file can handle file systems that support 32-bit i-numbers (e.g., the BSD file system); cpio’s way requires you to play some games (in its “binary” format, i-numbers are only 16 bits, and in its “portable ASCII” format, they’re 18 bits; it would have to play games with the “file system ID” field of the header to make sure that the file system ID/i-number pairs of different files were always different), and I don’t know which cpio implementations, if any, play those games. Those that don’t might get confused and think two files are the same file when they’re not, and make hard links between them.
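To see how truncated i-numbers can make unrelated files look like hard links to one another, here is a minimal Python sketch. The function name `colliding_inodes` and the sample i-numbers are invented for illustration; only the 16-bit and 18-bit widths come from the text above.

```python
from collections import defaultdict

def colliding_inodes(inodes, bits):
    """Group distinct i-numbers that become equal once truncated to `bits` bits."""
    mask = (1 << bits) - 1
    groups = defaultdict(set)
    for ino in inodes:
        groups[ino & mask].add(ino)
    # Keep only the truncated values shared by more than one real i-number.
    return {short: sorted(full) for short, full in groups.items() if len(full) > 1}

# Hypothetical i-numbers from a file system with 32-bit i-numbers.
sample = [5, 65541, 262149, 70000]

print(colliding_inodes(sample, 16))  # collisions in a 16-bit field
print(colliding_inodes(sample, 18))  # collisions in an 18-bit field
```

An archiver that keys its hard-link table on the truncated value alone would treat every group reported here as one file, which is the confusion described above.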
tar’s way of handling multiple hard links to a file places only one copy of the link on the tape, but the name attached to that copy is the only one you can use to retrieve the file; cpio’s way puts one copy for every link, but you can retrieve it using any of the names.
What type of checksum (if any) is used, and how is it calculated?
See the attached manual pages for tar and cpio format. tar uses a checksum which is the sum of all the bytes in the tar header for a file; cpio uses no checksum.
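As a rough illustration of the tar checksum, the following Python sketch recomputes it for a header written by Python's standard `tarfile` module. One detail not spelled out above: while the sum is taken, the eight bytes of the checksum field itself (at offset 148 in the 512-byte header) are counted as ASCII spaces. The helper names are illustrative only.

```python
import io
import tarfile

def header_checksum(block: bytes) -> int:
    """Sum all 512 header bytes, counting the 8-byte checksum field as spaces."""
    if len(block) != 512:
        raise ValueError("a tar header is exactly 512 bytes")
    return sum(block[:148]) + 8 * ord(" ") + sum(block[156:])

def stored_checksum(block: bytes) -> int:
    """Parse the octal checksum stored at offset 148 of the header."""
    return int(block[148:156].strip(b"\0 ").decode("ascii"), 8)

# Build a one-member archive in memory and take its first 512-byte block.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
    data = b"hello\n"
    info = tarfile.TarInfo("hello.txt")
    info.size = len(data)
    tf.addfile(info, io.BytesIO(data))
header = buf.getvalue()[:512]

print(header_checksum(header) == stored_checksum(header))
```

Note that the checksum covers the header only; damage to a member's data is not detected by it.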
If anyone knows why cpio was made when tar was present at the unix scene,
It wasn’t. cpio first showed up in PWB/UNIX 1.0; no generally-available version of UNIX had tar at the time. I don’t know whether any version that was generally available within AT&T had tar, or, if so, whether the people within AT&T who did cpio knew about it.
On restore, if there is a corruption on a tape, tar will stop at that point, while cpio will skip over it and try to restore the rest of the files.
The main difference is just in the command syntax and header format.
tar is a little more tape-oriented in that everything is blocked to start on a record boundary.
Are there any differences between the two in their ability to recover crashed archives? (Is there any chance of recovering crashed archives at all?)
Theoretically it should be easier under tar since the blocking lets you find a header with some variation of ‘dd skip=nn’. However, modern cpio implementations and variations have an option to just search for the next file header after an error, with a reasonable chance of resyncing. That said, lots of tape driver software won’t allow you to continue past a media error, which should be the only reason for getting out of sync unless a file changed size while you were writing the archive.
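The resynchronization idea for tar can be sketched in Python: step through the archive one 512-byte block at a time and stop at the first block whose stored checksum matches the checksum computed over it. This is only a heuristic under stated assumptions (file data can mimic a header by chance, and the archive is assumed to be readable as plain bytes); the function names and the in-memory demo archive are made up for the example.

```python
import io
import tarfile

BLOCK = 512  # tar header and data block size in bytes

def looks_like_header(block: bytes) -> bool:
    """True if the stored checksum matches the one computed over the block."""
    if len(block) != BLOCK:
        return False
    try:
        stored = int(block[148:156].strip(b"\0 "), 8)
    except ValueError:
        return False  # empty or non-octal checksum field
    computed = sum(block[:148]) + 8 * 0x20 + sum(block[156:])
    return stored == computed

def next_header_offset(archive: bytes, start: int) -> int:
    """Offset of the first plausible header at or after `start`, or -1."""
    offset = -(-start // BLOCK) * BLOCK  # round up to a block boundary
    while offset + BLOCK <= len(archive):
        if looks_like_header(archive[offset:offset + BLOCK]):
            return offset
        offset += BLOCK
    return -1

def build_demo_archive() -> bytes:
    """Two-member archive: a.txt (600 bytes of data) followed by b.txt."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for name, data in (("a.txt", b"x" * 600), ("b.txt", b"bye\n")):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()

archive = build_demo_archive()
# Pretend the first header is unreadable and look for the next one.
print(next_header_offset(archive, 1))
```

The offset divided by 512 is the block count one would hand to a ‘dd skip=nn’ style command to restart extraction from that member.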
If anyone knows why cpio was made when tar was present at the unix scene, please tell me about this too.
Probably because it is more media efficient (by not blocking everything and using only the space needed for the headers, where tar always uses 512 bytes per file header) and it knows how to archive special files.
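A back-of-the-envelope Python sketch of that space difference follows. It rests on assumptions not stated in the text: the tar figure counts one 512-byte header plus data padded to a multiple of 512, and the cpio figure uses the old portable ASCII format with its 76-byte header, the file name stored with a trailing NUL, and no padding. Archive trailers, record blocking, and long-name extensions are ignored, and the function names are illustrative.

```python
def tar_entry_size(data_len: int) -> int:
    """Bytes used by one tar member: 512-byte header plus padded data."""
    blocks = -(-data_len // 512)  # ceiling division
    return 512 + blocks * 512

def cpio_odc_entry_size(name: str, data_len: int) -> int:
    """Bytes used by one portable-ASCII cpio member (assumed 76-byte header)."""
    return 76 + len(name.encode()) + 1 + data_len

# A hypothetical small file: 100 bytes of data, 12-character name.
name, size = "file0001.txt", 100
print(tar_entry_size(size), cpio_odc_entry_size(name, size))
```

The gap is largest for many small files and shrinks toward nothing as file sizes grow well past the 512-byte block size.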
You might want to look at the freely available alternatives. The major ones are afio, GNU tar, and pax, each of which has its own extensions with some backwards compatibility.
Sparse files were tarred as sparse files (which you can easily test, because the resulting archive gets smaller, and GNU cpio can no longer read it).
This document was generated on March 24, 2021 using texi2html 5.0.