Opened 5 years ago

Last modified 5 years ago

#16854 new enhancement

Add bibtex functionality to citation management

Reported by: mraum
Owned by:
Priority: major
Milestone: sage-6.4
Component: misc
Keywords:
Cc:
Merged in:
Authors: Martin Raum
Reviewers:
Report Upstream: N/A
Work issues:
Branch: u/mraum/citation_bibtex
Commit: 2ac77ee3936d9f156c43a8b6e5df8fc9bd18b720
Dependencies: #16777
Stopgaps:

Description

Provide basic bibtex facilities for the new citation management system.

Change History (7)

comment:1 Changed 5 years ago by mraum

  • Branch set to u/mraum/citation_bibtex

comment:2 Changed 5 years ago by aapitzsch

  • Commit set to 4277fdb3c82a7ef2d837866d9fc04594029584bf
  • Dependencies changed from 16777 to #16777

IMO we shouldn't introduce new all.py files. See #6547.

comment:3 Changed 5 years ago by burcin

I haven't looked at the changes in this ticket in detail, but #3317 also adds some functionality to process citations in bibtex format.

comment:4 follow-up: Changed 5 years ago by mraum

Thank you for pointing out #3317. I should have done that myself.

The code there would require essential work, though. The author of #3317, however, did not accept that adding a new standard spkg from scratch is not possible. The code also introduces decorators, which slow code down. He argued that the slowdown is minor and would have to be accepted. But the community does not seem to agree. I thought about updating the code at #3317 for almost two months, and started twice. But it is quite different from the profiling approach that was taken by Mike Hansen. Overwriting code on a ticket is, I think, incompatible with the basic idea of Trac as we currently use it. So I opened a new ticket.


As for all.py: as far as I know, it has not been decided that all.py will be deprecated. I would be happy if it were, because, as stated in #6547, it is not Pythonic. However, I normally stick to the developer guide, which currently says otherwise.

comment:5 Changed 5 years ago by git

  • Commit changed from 4277fdb3c82a7ef2d837866d9fc04594029584bf to 2ac77ee3936d9f156c43a8b6e5df8fc9bd18b720

Branch pushed to git repo; I updated commit sha1. New commits:

2ac77ee Update ECM bibtex code.

comment:6 in reply to: ↑ 4 ; follow-up: Changed 5 years ago by burcin

Replying to mraum:

The code there would require essential work, though.

True, it's been bitrotting for about 2 years.

The author of #3317, however, did not accept that adding a new standard spkg from scratch is not possible.

If we had made pybtex an optional package then it wouldn't have been any trouble by now...

Are you suggesting that this limitation justifies reimplementing bibtex functionality in Sage when it is already provided by a pure Python package that is trivial to add to the distribution?

The code also introduces decorators, which slow code down. He argued that the slowdown is minor and would have to be accepted. But the community does not seem to agree.

IIRC (and it's been a few years since I looked at the code), the decorators only introduce one string insertion in a Python set. That's one hash table lookup, which is negligible compared to what Python goes through for each function call.

Where do you get the idea that "the community does not seem to agree"?

I thought about updating the code at #3317 for almost two months, and started twice. But it is quite different from the profiling approach that was taken by Mike Hansen. Overwriting code on a ticket is, I think, incompatible with the basic idea of Trac as we currently use it. So I opened a new ticket.

It might have helped to contact the authors on the ticket or comment on the ticket directly.

The profiling approach is broken for several reasons:

  • the code used for different problem sizes is often different.

Profiling a small example will not give you the correct information. If you are really working on the cutting edge of what is computable, then you don't want to run the whole computation under the profiler once more.

  • you have to guess what is being used from the data obtained from the profiler.

There is no clean way to associate citation information to functions this way.

  • it does not allow tracking more fine-grained information than function names.

If a Sage function wraps several algorithms by calling an external package with different arguments, you cannot differentiate these.
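The last point can be illustrated with a small sketch using the standard-library `cProfile`. The functions `solve` and `external_backend` are hypothetical stand-ins, and this is not the code from this ticket's branch; it only shows what kind of data a profiler records.

```python
import cProfile
import pstats


def external_backend(n, method):
    # Hypothetical single entry point of an external package that
    # selects between two algorithms based on an argument.
    if method == "a":
        return n + 1
    return n + 2


def solve(n, algorithm="a"):
    # Hypothetical wrapper that forwards the algorithm choice.
    return external_backend(n, method=algorithm)


profiler = cProfile.Profile()
profiler.enable()
solve(10, algorithm="a")
solve(10, algorithm="b")
profiler.disable()

# Profiler entries are keyed by (filename, line number, function name).
stats = pstats.Stats(profiler)
called = {name for (_filename, _lineno, name) in stats.stats}
print(sorted(called))
```

Both calls show up under the same two function names, so the recorded data alone does not reveal which algorithm each call selected.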

I'd really like it if Sage improved its citation capabilities and gave more credit to authors of underlying packages and the papers describing the algorithms used. Unfortunately, I don't think this is a step in that direction.

comment:7 in reply to: ↑ 6 Changed 5 years ago by mraum

In some sense you've got a point. I apologize if you have the feeling that I ignored the effort you and others made to write the code at #3317.

However, I'm surprised that you, as a coauthor of the old code at https://bitbucket.org/niels_mfo/sage-citation, never made any effort to include pybtex in Sage. The code at #3317, in my opinion, has no realistic chance of being accepted in its current state, because the author ignores several conventions in Sage, including all.py/__init__.py, declaration of code in __init__.py (which increases startup time, an important subject for Sage for years), and the inclusion of *.bib code in the Sage library. All of that requires decisions, and it seems to me that nobody made an effort to trigger them.

You can view the present code as a workaround. It's much better than what we have, and much worse than what we could have. It will be superseded by another implementation, maybe even yours. But I don't see #3317 being ready for inclusion anytime soon (please prove me wrong; I'm serious. I'm even glad to help if you decide either to adhere to Sage standards or to ask to change them, if necessary. E.g., why not push #6547? That's doable, and I'm in immediately if you decide to). I really want to say that the citation system on its own is very important to me. I feel that it is a moral obligation, if we use other people's work, to allow for proper citations of it.

As for the speed impact of decorators, I'm sorry, I exaggerated. I don't know of any specific place where it was stated that decorators for citations would be too slow to be integrated into Sage. However, I have had so many discussions about code being slowed down when introducing global constructions that I was convinced this would be the common opinion. But perhaps I'm wrong.
