Opened 13 years ago

Last modified 6 years ago

#329 closed enhancement

add md5sums for spkgs — at Version 6

Reported by: was Owned by: pdenapo
Priority: blocker Milestone: sage-duplicate/invalid/wontfix
Component: scripts Keywords:
Cc: ohanar Merged in:
Authors: Reviewers:
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Description (last modified by mvngu)

I've noticed that sage has problems with the integrity of sage-
packages.

Suppose that you have partially downloaded a file, and for whatever reason
it gets truncated.
Then sage won't check its integrity before installing.

I would suggest adding to each file an md5 sum (or perhaps better a gpg
signature, but this could be difficult since we need anybody to be
able to build their own sage packages)
[in a file like package-name.spkg.md5 or package-name.spkg.signature]
and making sage check that this md5sum is correct
[and if not, download the file again].
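
The sidecar-file check proposed above could look roughly like the following. This is only a minimal sketch, not an existing Sage script: the function name verify_spkg and the demo file names are hypothetical, it assumes GNU md5sum is installed, and it assumes the .md5 file holds md5sum's own output format (digest first, then the file name).

```shell
# Hypothetical helper: compare FILE against the digest stored in FILE.md5.
# Returns 0 if they match, 1 if they differ, 2 if either file is missing.
verify_spkg() {
    spkg=$1
    [ -f "$spkg" ] && [ -f "$spkg.md5" ] || return 2
    expected=$(cut -d' ' -f1 < "$spkg.md5")      # digest recorded at packaging time
    actual=$(md5sum < "$spkg" | cut -d' ' -f1)   # digest of the file we actually have
    [ "$expected" = "$actual" ]
}

# Demo with a throwaway package file (names are made up for illustration).
tmp=$(mktemp -d)
printf 'example payload\n' > "$tmp/demo-1.0.spkg"
( cd "$tmp" && md5sum demo-1.0.spkg > demo-1.0.spkg.md5 )
verify_spkg "$tmp/demo-1.0.spkg" && echo "intact: safe to install"

# Simulate a truncated download by overwriting the file with a short prefix.
printf 'exam' > "$tmp/demo-1.0.spkg"
verify_spkg "$tmp/demo-1.0.spkg" || echo "corrupt: download again"
rm -rf "$tmp"
```

A caller would run the download step again whenever verify_spkg returns 1, and treat a return value of 2 as "no checksum available".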

Most Linux distributions do this somehow. For example, Gentoo keeps
md5sums in the manifests in the portage tree. I think that a good
model would also be Debian. For each package, the Debian sources consist
of 3 files:

- package.dsc: a description and the md5sums of
package.orig.tar.gz and package.diff.gz, for checking the integrity of
the package
- package.orig.tar.gz: the pristine sources from the upstream author
- package.diff.gz: the modifications specific to Debian

(by keeping the upstream sources and the Debian modifications
separate, Debian makes clear which modifications are specific to
Debian)

I think that sage could adopt a similar approach for its packages.

best regards,
Pablo
  • Ticket #7617 implements the integrity check procedure below for the SageTeX spkg.

Change History (6)

comment:1 Changed 12 years ago by mabshoff

  • Milestone set to sage-3.0

comment:2 Changed 12 years ago by pdenapo

  • Owner changed from was to pdenapo
  • Status changed from new to assigned

I'll try to implement this myself.

comment:3 Changed 12 years ago by was

> I think you can easily make tar-archives that contain a checksum, if
> you agree on some extremely mild file naming convention for such a
> checksum (i.e., the archive is not allowed to contain a filename that
> clashes with the file that stores the checksum). Of course, the key is
> that when you add something to the archive, the file changes, so the
> plain md5sum of the total archive changes. You have to md5sum
> something that is easily extracted and independent of the later added
> md5sum. The options -O (dump to stdout), -r (append file) and
> --exclude provide the necessary features for tar.
>
> Procedure for storing a checksum in a tar archive:
> ----------------------------------
> (tar xf file.tar --exclude md5sum.check -O; \
>     tar tvf file.tar --exclude md5sum.check ) | md5sum > md5sum.check
>
> tar -rf file.tar md5sum.check
> ----------------------------------
>
> Procedure for checking that the stored sum agrees with the computed
> one:
> ----------------------------------
> tar xf file.tar md5sum.check -O > storedcheck
> (tar xf file.tar --exclude md5sum.check -O; \
>     tar tvf file.tar --exclude md5sum.check ) | md5sum > computedcheck
>
> cmp storedcheck computedcheck
> ----------------------------------
>
> Note that we need to include the directory listing information as
> well, because the output of -O does not include file names
> (i.e., one could move files around and still have the same checksum)
>
> If it is ever decided that .spkgs should be signed, then you could
> include a .gpg-file via the same procedure.
>

I really like this idea a lot!  It's vastly better -- I think
-- from a usability point of view than having
to constantly pass around .spkg's and .md5 files together.
It will just work 100% automatically and transparently to users,
once we modify some scripts in local/bin/sage-*.

While we're at it, we should make the following work:

1)
  sage -unpkg packagename-version.spkg

which just does tar jxvf and does the above consistency checks.
I suggest sage -unpkg, since making a package is "sage -pkg".
Another option would be "sage -extract blah.spkg", or even
"sage -x blah.spkg".    Please note, sage spkg's can be either
bzip2'd or not, so that has to be taken into account.

2)

  sage -i packagename-version

where packagename-version is the name of a *directory*, does
sage -pkg on the directory, then installs it.
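
The extraction step in item 1 has to cope with both bzip2'd and plain spkgs. A minimal sketch of that part is given here; the function name unpkg and its optional destination argument are hypothetical, this is not what any existing sage-* script does, and it assumes a tar that accepts -j and -C plus a head that accepts -c.

```shell
# Hypothetical sketch of the extraction part of "sage -unpkg".
# Usage: unpkg FILE.spkg [DESTDIR]
unpkg() {
    spkg=$1
    dest=${2:-.}
    [ -f "$spkg" ] || { echo "unpkg: no such file: $spkg" >&2; return 2; }
    # A bzip2 stream starts with the three magic bytes "BZh"; a plain tar
    # archive starts with a file name instead. NUL bytes are dropped so the
    # shell can hold the result in a variable.
    magic=$(head -c 3 "$spkg" | tr -d '\0')
    if [ "$magic" = "BZh" ]; then
        tar -xjf "$spkg" -C "$dest"
    else
        tar -xf "$spkg" -C "$dest"
    fi
    # The consistency check quoted in this comment would run at this point,
    # before anything from the extracted tree is installed.
}
```

Checking the magic bytes rather than the file name matters here because both kinds of archive carry the same .spkg extension.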

comment:4 follow-up: Changed 11 years ago by khorton

It would be useful to somehow also have a way to check the integrity of tarballs for the whole sage install - i.e. the 200 MB tarball for each sage release. Twice in the last month I've gotten bitten by build failures caused by corrupted tarballs. It would have been nice to know the tarball was bad before investing the time to wait for the build + time to troubleshoot the failure.

One option would be to use tar.gz format to distribute sage releases. There would be no reduction in file size, as most of the tarball consists of already compressed sources, but there would be detection if the tarball had somehow gotten corrupted. The time required to compress and extract the tarballs is trivial - my 5 year old PPC PowerBook gzips the 200 MB sage tarball in about a minute, and extracts it in less than 30 seconds.

Rather than trying to reinvent the wheel for spkg formats, it may be worthwhile to consider simply using gzip format.

comment:5 in reply to: ↑ 4 Changed 11 years ago by ddrake

Replying to khorton:

> Rather than trying to reinvent the wheel for spkg formats, it may be worthwhile to consider simply using gzip format.

Right now, spkgs are just renamed .tar.bz2 files. How does using gzip instead of bzip2 give us corruption detection? Both gzip and bzip2 have --test flags; does gzip's test work better?
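
For reference, both formats store a CRC-32 of the uncompressed data, and -t verifies it in either tool, so both can detect a truncated file. The sketch below shows one way to try this out; the helper name stream_ok is hypothetical, the payload is random throwaway data, and a tool that is not installed is simply skipped.

```shell
# Hypothetical helper: succeed only if FILE passes TOOL's integrity test.
# Reads from stdin so the file name suffix does not matter.
stream_ok() {
    "$1" -t < "$2" 2>/dev/null
}

tmp=$(mktemp -d)
head -c 100000 /dev/urandom > "$tmp/data"
for tool in gzip bzip2; do
    command -v "$tool" >/dev/null 2>&1 || continue
    "$tool" -c "$tmp/data" > "$tmp/full.$tool"
    # Keep only the first half of the compressed file, as an interrupted
    # download would.
    size=$(wc -c < "$tmp/full.$tool")
    head -c $((size / 2)) "$tmp/full.$tool" > "$tmp/short.$tool"
    stream_ok "$tool" "$tmp/full.$tool"  && echo "$tool: intact file passes"
    stream_ok "$tool" "$tmp/short.$tool" || echo "$tool: truncated file rejected"
done
rm -rf "$tmp"
```

An uncompressed .tar archive, by contrast, carries checksums only for its header blocks and not for file contents, which may be the difference the previous comment had in mind for release tarballs.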

comment:6 Changed 10 years ago by mvngu

  • Description modified (diff)
  • Report Upstream set to N/A