Opened 10 years ago

Closed 10 years ago

Last modified 9 years ago

#11760 closed defect (fixed)

'sage-location' shouldn't "initialize" .pc (pkg-config) files more than once

Reported by: leif Owned by:
Priority: blocker Milestone: sage-4.7.2
Component: scripts Keywords: pkgconfig libpng Duplicate definition SAGE_ROOT
Cc: Merged in: sage-4.7.2.alpha4
Authors: Leif Leonhardy Reviewers: John Palmieri
Report Upstream: N/A Work issues:
Branch: Commit:
Dependencies: Stopgaps:

Status badges

Description (last modified by jhpalmieri)

The logic of "initializing" .pc files exactly once is currently broken, which at the moment "only" causes trouble with libpng if pkg-config is installed.

The main reason is that sage-location doesn't really check whether a .pc file already contains a definition of SAGE_ROOT, and unconditionally replaces any occurrences of $SAGE_ROOT (i.e., its current value) by ${SAGE_ROOT}, then adding a (in the case of update_pkgconfig_files(), replacing the first) line defining SAGE_ROOT to have the current value.

The specific problem with libpng (see e.g. #11686, which was originally intended to just again fix a completely unrelated issue with the matplotlib spkg) arises since it installs both a libpng12.pc file and a symbolic link libpng.pc, the latter pointing to the former. sage-location now "initializes" all of Sage's .pc files, not checking whether a .pc file is just a symbolic link to another one, nor checking whether a file already got modified as mentioned above.

Note that the "initialization" doesn't take place during a build, but when Sage is started for the first time, also when e.g. running tests, such that problems with modified .pc files occur later, when one tries to (re)install some package(s), which includes upgrading Sage.

In the long run, sage-location shouldn't (have to) deal with .pc files at all, but we IMHO need a quick fix until we have a better, more general solution to all issues with pkg-config. (There are a couple of other issues; cf. for example #10202, #9668, #11687, #11681.) Therefore this ticket isn't intended to fix any of the other issues.


Apply

  1. trac_11760-avoid_multiple_defs_of_SAGE_ROOT.scripts.patch
  2. trac_11760-additional_improvements_to_pkgconfig_funs.optional.scripts.patch

to the Sage scripts repository.

Attachments (3)

trac_11760-avoid_multiple_defs_of_SAGE_ROOT.scripts.patch (11.5 KB) - added by leif 10 years ago.
SCRIPTS repo. Based on Sage 4.7.2.alpha3.
trac_11760-additional_improvements_to_pkgconfig_funs.optional.scripts.patch (5.5 KB) - added by leif 10 years ago.
SCRIPTS repo. Optional improvements to {initialize,update}_pkgconfig_files(). Apply on top of other patch..
trac_11760-cumulative_diff_of_both_patches.diff (13.4 KB) - added by leif 10 years ago.
Cumulative diff of both patches against Sage 4.7.2.alpha3. For review only.

Download all attachments as: .zip

Change History (31)

comment:1 Changed 10 years ago by leif

  • Description modified (diff)

comment:2 Changed 10 years ago by leif

  • Keywords Duplicate definition SAGE_ROOT added

comment:3 Changed 10 years ago by jdemeyer

leif, this ticket is listed as blocker. Any ideas for a quick fix?

Changed 10 years ago by leif

SCRIPTS repo. Based on Sage 4.7.2.alpha3.

comment:4 Changed 10 years ago by leif

  • Authors set to Leif Leonhardy
  • Description modified (diff)
  • Status changed from new to needs_review

Couldn't resist to also do some clean-up.

Actual (trivial) fix starting on line 178.

comment:5 follow-up: Changed 10 years ago by leif

P.S.:

The fix doesn't cure already broken installations (i.e., .pc files already containing multiple definitions, like usually libpng[12].pc).

To test the patch, you can change the contents of or remove the file $SAGE_ROOT/local/lib/sage-current-location.txt (perhaps first removing the line SAGE_ROOT=${SAGE_ROOT} from libpng12.pc). The former triggers update_pkgconfig_files(), the latter initialize_pkgconfig_files().

comment:6 in reply to: ↑ 5 Changed 10 years ago by leif

Replying to leif:

The fix doesn't cure already broken installations (i.e., .pc files already containing multiple definitions, like usually libpng[12].pc).

I could more or less easily also add that, but it wouldn't help much (on e.g. upgrades) since neither update_pkgconfig_files() nor initialize_pkgconfig_files() is called later, or more precisely, before we run into potential errors due to multiple definitions.

comment:7 follow-up: Changed 10 years ago by jhpalmieri

In lines 178-181:

            if re.search("^SAGE_ROOT=", config, re.MULTILINE): 
	                # There's already a definition of SAGE_ROOT, 
	                # so skip this file. (Cf. #11760). 
	                continue 

what if the path to SAGE_ROOT has changed? Would it be better to delete this first line (containing SAGE_ROOT=...) and then proceed with the rest of the code in the loop? I guess that these files shouldn't have SAGE_ROOT in them already at this point, but just in case?

comment:8 in reply to: ↑ 7 ; follow-up: Changed 10 years ago by leif

Replying to jhpalmieri:

In lines 178-181:

            if re.search("^SAGE_ROOT=", config, re.MULTILINE): 
	        # There's already a definition of SAGE_ROOT, 
	        # so skip this file. (Cf. #11760). 
	        continue 

what if the path to SAGE_ROOT has changed?

This is initialize_pkgconfig_files(), not update_...().


Would it be better to delete this first line (containing SAGE_ROOT=...) and then proceed with the rest of the code in the loop? I guess that these files shouldn't have SAGE_ROOT in them already at this point, but just in case?

As mentioned in the description, I was aiming at a quick fix which solves the issue we have, not at curing arbitrary corrupted pkg-config files.

Also, initialize_pkgconfig_files() is only called once (unless one messes around with Sage's files as noted for testing the patch).

Nevertheless, any wrong definition of SAGE_ROOT present during initialize_pkgconfig_files() will be corrected in the subsequent call to update_pkgconfig_files().


Since sage-location won't have to deal with .pc files in the long term anyway, it doesn't make sense to implement sophisticated consistency checks and corrections there.

comment:9 in reply to: ↑ 8 Changed 10 years ago by leif

Replying to leif:

Nevertheless, any wrong definition of SAGE_ROOT present during initialize_pkgconfig_files() will be corrected in the subsequent call to update_pkgconfig_files().

Ah, I see. update_pkgconfig_files() is not called in the same invocation of sage-location if initialize_pkgconfig_files() was called, since install_moved() in this case returns False.

So we could in principle add

  • sage-location

    diff --git a/sage-location b/sage-location
    a b  
    6868        write_location_file()
    6969        update_library_files()
    7070        initialize_pkgconfig_files()
     71        update_pkgconfig_files()
    7172        return False
    7273    elif path != SAGE_ROOT:
    7374        # location moved

but that's IMHO really a minor issue since that situation doesn't occur unless the user damages his pkg-config files.

comment:10 Changed 10 years ago by leif

John, would the following make you happy?

  • sage-location

    diff --git a/sage-location b/sage-location
    a b  
    167167    the Sage installation has moved, i.e., paths have changed.
    168168    """
    169169    import re
     170    pat = re.compile(r"^SAGE_ROOT=.*\n", re.MULTILINE)
    170171    LIB = os.path.join(os.path.abspath(SAGE_ROOT), 'local', 'lib')
    171172    PKG = os.path.join(LIB, 'pkgconfig')
    172173    for name in os.listdir(PKG):
     
    175176            with open(filename) as file:
    176177                config = file.read()
    177178
    178             if re.search("^SAGE_ROOT=", config, re.MULTILINE):
    179                 # There's already a definition of SAGE_ROOT,
    180                 # so skip this file. (Cf. #11760).
    181                 continue
     179            first_match = pat.search(config)
     180            if first_match:
     181                # There are already one or more definitions of SAGE_ROOT,
     182                # which also happens if we previously processed the same
     183                # file (here) because of [symbolic or hard] links, so this
     184                # isn't necessarily an error. (Cf. #11760).
     185                # For simplicity, we just remove any definition, and only
     186                # issue a warning if there was more than one, and don't
     187                # check whether a previous definition was already current:
     188                config, n = pat.subn("", config)
     189                if n>1:
     190                    sys.stderr.write(
     191                        "Warning: sage_location: initialize_pkgconfig_files():\n" +
     192                        "  File \"%s\" already contained %d definitions of SAGE_ROOT.\n" %
     193                            (name, n) +
     194                        "  These will get replaced by a single, current definition.\n")
     195                    sys.stderr.flush()
    182196
    183197            new_config = config.replace(os.path.abspath(SAGE_ROOT), "${SAGE_ROOT}")
    184198            new_config = 'SAGE_ROOT=%s\n' % os.path.abspath(SAGE_ROOT) + new_config

If so, I'll just attach that patch, to be applied on top of the other one.

comment:11 Changed 10 years ago by leif

I could of course remove the temporary variable first_match; it's only there for documentation purposes. ;-)

I.e.,

            if pat.search(config):
                ...

would be sufficient.

comment:12 follow-up: Changed 10 years ago by leif

P.S.: We could do similar in update_pkgconfig_files().

comment:13 in reply to: ↑ 12 Changed 10 years ago by leif

Replying to leif:

P.S.: We could do similar in update_pkgconfig_files().

Like this, which IMHO is also somehow a bit cleaner:

  • sage-location

    diff --git a/sage-location b/sage-location
    a b  
    194207    ``initialize_pkgconfig_files()``, in particular, each of them contains a
    195208    definition of the variable ``SAGE_ROOT``, since only that is updated here.
    196209    """
     210    import re
     211    pat = re.compile(r"^SAGE_ROOT=.*\n", re.MULTILINE)
    197212    LIB = os.path.join(os.path.abspath(SAGE_ROOT), 'local', 'lib')
    198213    PKG = os.path.join(LIB, 'pkgconfig')
    199214    for name in os.listdir(PKG):
    200         filename = os.path.join(PKG,name)
     215        filename = os.path.join(PKG, name)
    201216        if os.path.splitext(filename)[1]==".pc":
    202217            with open(filename) as file:
    203218                config = file.read()
    204219
    205             prefix_start = config.find('SAGE_ROOT=')
    206             if prefix_start==-1:
     220            if not pat.search(config):
    207221                # This should never happen, unless the user modified the file.
    208222                sys.stderr.write(
    209223                    "Error: sage_location: update_pkgconfig_files():\n" +
     
    211225                    name + "  Skipping it...\n")
    212226                sys.stderr.flush()
    213227                continue
     228                # We could of course call initialize_pkgconfig_file(filename)
     229                # here instead to fix this issue, but unfortunately there's no
     230                # such function (for a single file) [yet].
    214231
    215             prefix_end = config.find('\n', prefix_start)
    216             new_prefix = 'SAGE_ROOT=%s' % os.path.abspath(SAGE_ROOT)
    217             new_config = config[:prefix_start] + new_prefix + config[prefix_end:]
     232            # Delete all previous definitions of SAGE_ROOT:
     233            config = pat.sub("", config)
     234
     235            definition = "SAGE_ROOT=%s\n" % os.path.abspath(SAGE_ROOT)
    218236            with open(filename, 'w') as file:
    219                 file.write(new_config)
     237                file.write(definition + config)
    220238
    221239
    222240def remove_files(path, remove_extensions):

comment:14 Changed 10 years ago by leif

Ok, I've further improved and normalized both functions...

Just wonder whether using use_pat.sub("${SAGE_ROOT}", config) (where use_pat is a compiled pattern matching os.path.abspath(SAGE_ROOT)) would be faster than config.replace(SAGE_ROOT_absolute, "${SAGE_ROOT}").

(I've factored out os.path.abspath(SAGE_ROOT) since it was not only used more than once, but also inside the loops. 8/ *shudder*)

Changed 10 years ago by leif

SCRIPTS repo. Optional improvements to {initialize,update}_pkgconfig_files(). Apply on top of other patch..

comment:15 Changed 10 years ago by leif

  • Description modified (diff)

Ok, I've attached another, optional patch, to be applied on top of the other one.

Changed 10 years ago by leif

Cumulative diff of both patches against Sage 4.7.2.alpha3. For review only.

comment:16 Changed 10 years ago by leif

Note that (as I already mentioned elsewhere), on a follow-up or one of the other tickets, we have to (re-)initialize .pc files whenever a package [providing a .pc file] gets (re)installed (preferably at the end of sage-spkg, just like sage-make_relative is called), since that's very likely to overwrite our "once" modified / initialized pkg-config files, until all spkgs fix their own .pc files in their spkg-install files (or just call a Sage script / shell function that does that for them from there).

This ticket just fixes the worst cases, and especially the one that currently always occurs (since more than a year by the way), namely duplicate definitions of SAGE_ROOT in a .pc file.


If sh*t happens during reinstallation of one or more packages, with the patches here, the user will now be able to repair all pkg-config files by simply deleting $SAGE_ROOT/local/lib/sage-current-location.txt, and afterwards running Sage (e.g. just ./sage -c quit), without having to check and manually edit any pkg-config files, which is certainly not perfect, but an improvement.

comment:17 follow-up: Changed 10 years ago by jhpalmieri

Regarding the comment

Will have weird effects in case SAGE_ROOT_absolute contains other characters having special
meaning in regular expressions. 

You could use re.escape to deal with this issue.

In update_pkgconfig_files, the code

# Delete all previous definitions of SAGE_ROOT:
config = def_pat.sub("", config)

The sub method only replaces one definition, not all of them. There should be only one at this point anyway, of course. Perhaps change the comment to reflect this?

Otherwise, the combined patch looks pretty good to me. I still have to actually test it...

comment:18 in reply to: ↑ 17 ; follow-ups: Changed 10 years ago by leif

Replying to jhpalmieri:

You could use re.escape to deal with this issue.

Yep, but I haven't tested [yet] whether using sub() is faster than str.replace().


The sub method only replaces one definition, not all of them.

Nope. The optional count parameter defaults to 0, which means replace all occurrences.

comment:19 in reply to: ↑ 18 Changed 10 years ago by leif

Replying to leif:

Replying to jhpalmieri:

You could use re.escape to deal with this issue.

Yep, but I haven't tested [yet] whether using sub() is faster than str.replace().

There seems to be no (measurable) difference (10.9--11 vs. 11 ns with %timeit -c ..., and the "largest" .pc file we have). I only tested substituting ${SAGE_ROOT} by ${SAGE_FOO} though, i.e., on an already "initialized" .pc file. Might differ with longer patterns (i.e. paths) to match, and the original file is slightly larger of course.

comment:20 Changed 10 years ago by leif

Btw., with re.escape() typical paths get much longer... 8&

comment:21 follow-up: Changed 10 years ago by leif

Looks like IPython's %timeit is broken; I always get 11 ns, regardless of the size of the file / string... XD

comment:22 in reply to: ↑ 21 Changed 10 years ago by leif

Replying to leif:

Looks like IPython's %timeit is broken; I always get 11 ns, regardless of the size of the file / string... XD

Oh, my bad, I quoted the statement... 8/

Btw., interestingly the timing framework somehow makes the braces special characters, so I have to escape them for the timing tests, but the backslashs also end up in the result... Weird.

comment:23 in reply to: ↑ 18 ; follow-up: Changed 10 years ago by jhpalmieri

  • Reviewers set to John Palmieri
  • Status changed from needs_review to positive_review

Replying to leif:

Replying to jhpalmieri:

The sub method only replaces one definition, not all of them.

Nope. The optional count parameter defaults to 0, which means replace all occurrences.

Sorry, I misread the documentation (thought it said "leftmost non-overlapping occurrence" instead of "leftmost non-overlapping occurrences").

Btw., with re.escape() typical paths get much longer... 8&

Yes, but you don't have to ever look at them, do you? Just pass them on to re.

Anyway, things look good to me. In testing, the old version acts badly if, for example, you delete "sage-location.txt", whereas the new version does a good job of cleaning up the resulting mess. I've tried to break the new version in other ways, but not successfully. There may still be holes, but I'm not sure where. In any case, it's an improvement over the previous version.

(If you feel like adding a comment about re.escape, in case someone else works on this, go ahead. No need for further review in that case.)

comment:24 Changed 10 years ago by jhpalmieri

  • Description modified (diff)

comment:25 in reply to: ↑ 23 Changed 10 years ago by leif

Replying to jhpalmieri:

Replying to leif:

Btw., with re.escape() typical paths get much longer... 8&

Yes, but you don't have to ever look at them, do you? Just pass them on to re.

Of course, but longer strings take longer to compile... ;-)

(If you feel like adding a comment about re.escape, in case someone else works on this, go ahead. No need for further review in that case.)

It seems sub() is indeed slower than str.replace() (about 7:4 in CPU time), so it's not worth using it there instead. Users won't notice any difference anyway I think; I was just curious. (And it's hardly worth updating the patch or providing a new one just to update the comment on that.)

comment:26 follow-up: Changed 10 years ago by leif

P.S.: Thanks for reviewing this; triple X when it's finally merged and we get rid of this long-lasting issue.

comment:27 in reply to: ↑ 26 Changed 10 years ago by jdemeyer

  • Merged in set to sage-4.7.2.alpha4
  • Resolution set to fixed
  • Status changed from positive_review to closed

Replying to leif:

P.S.: Thanks for reviewing this; triple X when it's finally merged and we get rid of this long-lasting issue.

So where is the party?

comment:28 Changed 9 years ago by jhpalmieri

See #12331 for a follow-up.

Note: See TracTickets for help on using tickets.