Opened 5 years ago

Closed 5 years ago

#17308 closed defect (fixed)

Fix sws2rst

Reported by: jdemeyer Owned by:
Priority: minor Milestone: sage-6.4
Component: scripts Keywords:
Cc: kcrisman Merged in:
Authors: Karl-Dieter Crisman, Jeroen Demeyer Reviewers: Jeroen Demeyer, Karl-Dieter Crisman
Report Upstream: N/A Work issues:
Branch: 6eed79e (Commits) Commit: 6eed79ee5a93f42f10389978370ba4994a3d8a02
Dependencies: Stopgaps:

Description (last modified by jdemeyer)

sws2rst abuses tempfile.gettempdir() such that it breaks if os.path.join(tempfile.gettempdir(), file_name) happens to exist already (in particular, if the original .sws file was stored in /tmp).

Also, I don't like

    try:
        process_sws(file_name)
    except Exception as e:
        print "Unable to process file ('%s')" % file_name
        print "Error message: %s" % e
        print "Exiting."
        sys.exit(1)

I would much rather see a full traceback rather than a "beautified" useless error message.

This strange assumption should also be removed:

Remark:

    The sws file(s) should be in the current directory; using an
    absolute path will result in an error."""

Change History (22)

comment:1 Changed 5 years ago by jdemeyer

  • Description modified (diff)

comment:2 Changed 5 years ago by kcrisman

  • Cc kcrisman added
  • Priority changed from major to minor

comment:3 Changed 5 years ago by kcrisman

Are you suggesting something like the following?

  • src/bin/sage-sws2rst

    diff --git a/src/bin/sage-sws2rst b/src/bin/sage-sws2rst
    index 3028ef0..fc517b1 100755
    a b def process_sws(file_name): 
    5050    sws_file = tarfile.open(file_name, mode='r:bz2')
    5151    #TODO: python complains about using tempnam, but I don't
    5252    #know hot to fix it or see any danger
    53 #    tempname = os.tempnam('.')
    54     tempname = os.path.join(tempfile.gettempdir(), file_name)
     53    tempname = os.tempnam('.')
     54#    tempname = os.path.join(tempfile.gettempdir(), file_name)
    5555    sws_file.extractall(tempname)
    5656    base_name = os.path.split(os.path.splitext(file_name)[0])[1]
    5757    base_name_clean = base_name.replace(' ','_')
    if len(args) < 1: 
    180180
    181181for file_name in args:
    182182    print "Processing "+file_name
    183     try:
    184         process_sws(file_name)
    185     except Exception as e:
    186         print "Unable to process file ('%s')" % file_name
    187         print "Error message: %s" % e
    188         print "Exiting."
    189         sys.exit(1)
     183    process_sws(file_name)

I don't know whether the assumption is incorrect, so I would rather not remove it. Probably the guy who wrote sws2rst originally had problems with it.

comment:4 follow-up: Changed 5 years ago by kcrisman

I checked it out, and the sort of security implications that using os.tempnam has is very, very unlikely to hit someone creating a tutorial. Otherwise I guess we could use temp file.mkdtemp(). I'd appreciate your advice on which option seems better, I haven't tested either.

  • src/bin/sage-sws2rst

    diff --git a/src/bin/sage-sws2rst b/src/bin/sage-sws2rst
    index 3028ef0..d6402b8 100755
    a b def process_sws(file_name): 
    5151    #TODO: python complains about using tempnam, but I don't
    5252    #know hot to fix it or see any danger
    5353#    tempname = os.tempnam('.')
    54     tempname = os.path.join(tempfile.gettempdir(), file_name)
     54    tempdirname = tempfile.mkdtemp()
     55    tempname = os.path.join(tempdirname, file_name)
    5556    sws_file.extractall(tempname)
    5657    base_name = os.path.split(os.path.splitext(file_name)[0])[1]
    5758    base_name_clean = base_name.replace(' ','_')
    def process_sws(file_name): 
    101102    print "File at "+rst_file
    102103    print "Image directory at "+images_dir
    103104
    104     shutil.rmtree(tempname)
     105    shutil.rmtree(tempdirname)
Last edited 5 years ago by kcrisman (previous) (diff)

comment:5 in reply to: ↑ 4 Changed 5 years ago by jdemeyer

Replying to kcrisman:

I checked it out, and the sort of security implications that using os.tempnam has is very, very unlikely to hit someone creating a tutorial.

os.tempnam simply shouldn't be used. Why use something potentially dangerous if there are better alternatives?

Otherwise I guess we could use tempfile.mkdtemp().

Good!

comment:6 follow-up: Changed 5 years ago by jdemeyer

Wouldn't just

tempname = tempfile.mkdtemp() 

work, without the os.path.join(..., file_name) part?

comment:7 in reply to: ↑ 6 Changed 5 years ago by kcrisman

Wouldn't just

tempname = tempfile.mkdtemp() 

work, without the os.path.join(..., file_name) part?

Probably, but I was trying to keep things pretty minimal in change. Perhaps there is a reason to keep the filename involved.

comment:8 Changed 5 years ago by kcrisman

  • Authors set to Karl-Dieter Crisman
  • Branch set to u/kcrisman/updatesws2rst
  • Commit set to 5d0c639aa36c6e652221f833ec3d9037c5962b3d
  • Status changed from new to needs_review

Okay, hopefully this will be fine. I couldn't get it to break with various choices of directory and path, but please do try to break it.

Incidentally, I was able to get Trac to barf by accidentally putting in a branch name starting with a colon... that seems a bit excessive, but of course I have no idea how all the plugins work.


New commits:

5d0c639Update sws2rst temp files and a bit else

comment:9 Changed 5 years ago by jdemeyer

  • Branch changed from u/kcrisman/updatesws2rst to u/jdemeyer/ticket/17308
  • Created changed from 11/08/14 10:55:37 to 11/08/14 10:55:37
  • Modified changed from 11/13/14 20:58:52 to 11/13/14 20:58:52

comment:10 Changed 5 years ago by git

  • Commit changed from 5d0c639aa36c6e652221f833ec3d9037c5962b3d to 6eed79ee5a93f42f10389978370ba4994a3d8a02

Branch pushed to git repo; I updated commit sha1. New commits:

6eed79eModernize sage-sws2rst script

comment:11 Changed 5 years ago by jdemeyer

I ended up changing a lot of the sage-sws2rst script, but I think it is better now and less prone to breakage. If you agree, you can set this to positive_review.

comment:12 Changed 5 years ago by jdemeyer

Do you happen to know why the try/except was put here originally?

    #"data" dir
    data_path = os.path.join(tempname,'sage_worksheet','data')
    if os.path.exists(data_path):
        for image in os.listdir(data_path):
            try:
                shutil.move(os.path.join(data_path, image), os.path.join(images_dir, image.replace(' ','_')))
            except shutil.Error:
                pass

I cannot see how this move could generate an error which can be ignored, so I removed the exception handling.

Last edited 5 years ago by jdemeyer (previous) (diff)

comment:13 follow-up: Changed 5 years ago by kcrisman

  • Authors changed from Karl-Dieter Crisman to Karl-Dieter Crisman, Jeroen Demeyer
  • Reviewers set to Jeroen Demeyer, Karl-Dieter Crisman

I ended up changing a lot of the sage-sws2rst script, but I think it is better now and less prone to breakage. If you agree, you can set this to positive_review.

Yes, and you can add yourself to the author log too since you added me.

Regarding your original remark

This strange assumption should also be removed:
Remark:

    The sws file(s) should be in the current directory; using an
    absolute path will result in an error."""

see here and here - and Pablo said he did it, but somehow it didn't make it in. But it turned out that switching to mkdtemp() completely removed the problem anyway, since the only place two paths were joined (as opposed to a path and a string like 'data', or something from a list of a directory) was there. So you were okay to remove that, and I have tested it with the latest version.

I cannot see how this move could generate an error which can be ignored, so I removed the exception handling.

How about a permissions error or something about too big of a file, or weird characters, or...? (Just brainstorming.) That could be safely ignored; in the event of emergency, just don't copy that particular file, I guess that was the thinking. Because after all otherwise you could never really use sws2rst at all on that worksheet. Because of this, and because this was in the very first version Pablo put up, so I'm reluctant to have it removed.

Otherwise all should be well, I even tested some of the more unusual error possibilities by renaming/moving/not moving various files.

Last edited 5 years ago by kcrisman (previous) (diff)

comment:14 in reply to: ↑ 13 ; follow-up: Changed 5 years ago by jdemeyer

Replying to kcrisman:

How about a permissions error or something about too big of a file, or weird characters, or...? (Just brainstorming.) That could be safely ignored;

I don't see why such errors can be "safely ignored". The .rst file refers to (some of) these images! I would prefer to get an error instead of a silently broken .rst file.

comment:15 in reply to: ↑ 14 ; follow-up: Changed 5 years ago by kcrisman

How about a permissions error or something about too big of a file, or weird characters, or...? (Just brainstorming.) That could be safely ignored;

I don't see why such errors can be "safely ignored". The .rst file refers to (some of) these images!

Hmm, well, there is that. But that only is a problem when using Sphinx; those references can always be deleted. Indeed, there is usually at least a little tweaking needed anyway, though amazingly little. I guess it comes down to whether it's worse to have references that don't work in the rst file, or no rst file at all, and I would lean middling-heavy on the latter.

comment:16 in reply to: ↑ 15 ; follow-up: Changed 5 years ago by jdemeyer

Replying to kcrisman:

I guess it comes down to whether it's worse to have references that don't work in the rst file, or no rst file at all, and I would lean middling-heavy on the latter.

A compromize could be to have an option --rst-only to create just the .rst file without the images.

comment:17 in reply to: ↑ 16 Changed 5 years ago by kcrisman

I guess it comes down to whether it's worse to have references that don't work in the rst file, or no rst file at all, and I would lean middling-heavy on the latter.

A compromize could be to have an option --rst-only to create just the .rst file without the images.

We already have the help and sphinxify options, that seems reasonable. Something like

parser.add_option("--rst-only",
                  action="store_true", dest="rst_only",
                  help="Create just a .rst file, without images or other resources")

I'm in the middle of a different branch right now, unfortunately, and am not yet comfortable enough with git stash to deal with this... - we'd need to update usage as well.

Incidentally, the jmol images now start with pngs so it could be possible to extract those as well - obviously, not here!

comment:18 follow-up: Changed 5 years ago by jdemeyer

If you agree, we could add the --rst-only in a different ticket.

comment:19 in reply to: ↑ 18 Changed 5 years ago by kcrisman

If you agree, we could add the --rst-only in a different ticket.

Haha, true, except then I might do to you what you did to me in #17265 and make this ticket depend on that one ;-)

Last edited 5 years ago by kcrisman (previous) (diff)

comment:20 Changed 5 years ago by jdemeyer

Well, I don't consider it a bug that sage-sws2rst gives an error if there was an error copying files...

comment:21 Changed 5 years ago by kcrisman

  • Status changed from needs_review to positive_review

Hmm, good point. Though it's also a good point that this is a change in behavior.

However, enough bikeshedding; presumably this is used rarely enough in such cases (or at all?) that we should do that. Put your fix there and I'll get to it at some point soon, this is a green light.

comment:22 Changed 5 years ago by vbraun

  • Branch changed from u/jdemeyer/ticket/17308 to 6eed79ee5a93f42f10389978370ba4994a3d8a02
  • Resolution set to fixed
  • Status changed from positive_review to closed
Note: See TracTickets for help on using tickets.