Opened 17 months ago

Closed 8 months ago

Last modified 8 months ago

#14358 closed defect (fixed)

Notebook server should run Java for JMol inside temporary directory

Reported by: ppurka Owned by: jason, mpatel, was
Priority: major Milestone: sage-6.1
Component: graphics Keywords:
Cc: gutow, dimpase Merged in:
Authors: Jeroen Demeyer Reviewers: Punarbasu Purkayastha
Report Upstream: N/A Work issues:
Branch: u/vbraun/sagenb_java_tmp (Commits) Commit: dcd3c9e8d900e7f7853df5bb9ee67137a979cf1c
Dependencies: Stopgaps:

Description (last modified by jdemeyer)

When a Sage notebook server needs to create a static 3D image, it runs JMol server-side to create this image (if Java is available). However, it does this inside a directory to which the worker process might not have access.

Apply 14358_java.patch


Old report (before we diagnosed the problem):

We have been running the sage worksheets as an unprivileged user. The setup is as follows.

  1. A user, say user1, starts the sage server using
    sage -n port=8888 accounts=True interface='' server_pool=['sagenb@localhost'] openid=True automatic_login=False port_tries=0 directory=/home/user1/a.sagenb
    
  2. Because of the server_pool command, any worksheet is opened as an unprivileged user called sagenb. Here is some more info on how to set up the environment.
    1. Create user sagenb by doing (as root)
      $ groupadd sagenb
      $ useradd -g sagenb -m sagenb
      
    2. Create a ssh key for user1
      $ ssh-keygen -t rsa # run as user1 and make it passwordless
      
    3. Copy the id_rsa.pub to ~sagenb/.ssh/authorized_keys.
      $ cp -a ~user1/.ssh/id_rsa.pub /tmp/authorized_keys # Run as user1
      $ mkdir ~sagenb/.ssh # Run as sagenb
      $ cp /tmp/authorized_keys ~sagenb/.ssh/ # Run as sagenb
      
  3. With default sage-5.2 there is no problem with this and everything works.
  4. With sage-5.2 + "#13121 without new jmol", jmol works, but one needs to close and reopen the worksheet before jmol pops up.
  5. With sage-5.2 + #13121, jmol works partially:
    1. First, the static image cannot be created and there is a permission denied error. See permission_error.jpg.
    2. However, once the 3D output is toggled, jmol loads fine. The permission denied error is still there.
    3. The error obtained is of the form
      script ERROR: script ERROR: io error reading
      /home/user1/a.sagenb/home/__store__/0/09/098/098f/te\
      st/0/cells/7/sage0-size500-499889951.jmol.zip|SCRIPT:
      java.io.FileNotFoundException:
      /home/user1/a.sagenb/home/__store__/0/09/098/098f/te\
      st/0/cells/7/sage0-size500-499889951.jmol.zip (Permission denied)
      Sleeping...Make Interactive
      
  6. When run locally, i.e., without server_pool, the jmol loads and runs fine. For instance, this is fine:
    sage -n directory=a.sagenb
    

This problem has made it difficult for me to upgrade a server from 5.2, and I think the error was introduced by some change in jmol. The error itself stems from some Java component. A diff of the directory $SAGE_ROOT/devel/sagenb/sagenb/data/sage3d between sage-5.2 and sage-5.7.rc0 shows no differences. So, I believe the difference is really in the jmol spkg introduced in #12299.

I have traced it back and reproduced it in sage-5.2 + new notebook + new jmol. Although I wrote sage-5.2 above, the error is present in all the sage versions I have tried - sage-5.4 till sage-5.9.beta0.

Attachments (3)

permission_error.jpg (176.7 KB) - added by ppurka 17 months ago.
The output of a 3D command
sagenb_strace.log (1.5 MB) - added by ppurka 8 months ago.
strace output
14358_java.patch (11.3 KB) - added by jdemeyer 8 months ago.


Change History (48)

Changed 17 months ago by ppurka

The output of a 3D command

comment:1 Changed 17 months ago by ppurka

  • Cc dima added
  • Description modified (diff)

comment:2 Changed 17 months ago by ppurka

  • Description modified (diff)

comment:3 follow-up: Changed 17 months ago by gutow

I haven't done anything to the way Jmol is set up, although somebody else might have. The error actually looks like the errors associated with Jmol running in a web page and directly accessing local resources without a server. In that case all files accessed must be below Jmol in the directory tree to satisfy Java security requirements. Another possibility is that somebody has changed the way files are created in the notebook and Jmol can no longer find the files. A third possibility relates to MacOS and Windows having become so unfriendly to Java in browsers that they mess up file access as well. A pure JavaScript (although slower) alternative is in the works to deal with the Java-in-browsers problem.

Unfortunately, I do not have time to look at this particular problem in detail now. Maybe later this week.

comment:4 in reply to: ↑ 3 Changed 17 months ago by ppurka

Replying to gutow:

I haven't done anything to the way Jmol is set up, although somebody else might have. The error actually looks like the errors associated with Jmol running in a web page and directly accessing local resources without a server. In that case all files accessed must be below Jmol in the directory tree to satisfy Java security requirements. Another possibility is that somebody has changed the way files are created in the notebook and Jmol can no longer find the files.

The permissions in the worksheet directories haven't changed. They are the same as the worksheets from two years ago.

A third possibility relates to MacOS and Windows having become so unfriendly to Java in browsers that they mess up file access as well. A pure JavaScript (although slower) alternative is in the works to deal with the Java-in-browsers problem.

All these problems were replicated on Linux - Debian-6 and Gentoo, both 64-bit. Also, the bug was found with sage-5.2 + new notebook + new jmol, and not with vanilla sage-5.2. sage-5.4 onwards already have the new notebook and jmol and so all of them exhibit this bug.

Unfortunately, I do not have time to look at this particular problem in detail now. Maybe later this week.

There is no immediate rush. I have held up the upgrade for two months now and have been able to pinpoint the (hopefully) exact reason only now. Just wanted to bring this bug to your attention.

comment:5 Changed 12 months ago by jdemeyer

  • Milestone changed from sage-5.11 to sage-5.12

comment:6 follow-ups: Changed 12 months ago by novoselt

I've complained about it a while ago: #13978

jmol files for the image are stored in some cell directory with permissions 0700 for the server user, thus worker processes cannot get access to them.

comment:7 in reply to: ↑ 6 Changed 12 months ago by gutow

Replying to novoselt:

I've complained about it a while ago: #13978

jmol files for the image are stored in some cell directory with permissions 0700 for the server user, thus worker processes cannot get access to them.

Oh! That may be key. What should it be? I believe the code in jmoldata.py has permission to do chmod, so may be able to fix this. I didn't even think to look at those permissions.

comment:8 Changed 12 months ago by novoselt

I think group should have read/execute permission too, not sure about write. I tried to change it manually before in the filesystem, but doing anything with the worksheet would revert it to user-only, so I couldn't test if it works. If you know where to do the changes in Sage code - please do!

comment:9 in reply to: ↑ 6 Changed 12 months ago by ppurka

Replying to novoselt:

I've complained about it a while ago: #13978

jmol files for the image are stored in some cell directory with permissions 0700 for the server user, thus worker processes cannot get access to them.

This is not the problem. Or, at least, it wasn't the problem before sage-5.4. The permissions have always been 0700 (belonging to the server user), and it has remained the same for years. I am not sure how it worked earlier, but it did use to work.

The problem is that the worker processes may be restricted in their users and groups - for instance I have been running a server using sagenb:sagenb as the user/group for the worker process, and this user belongs to no other groups, and no other user shares this group sagenb.

If the change is only in granting of permissions, then the server user should have the same group as the worker and the permissions should be probably 0740 or 0750.
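
To make the permission arithmetic concrete, here is a minimal Python sketch. It is not sagenb code, and the helper name group_can_enter is hypothetical. It only checks which group bits a mode would need so that a member of the group could list a cell directory and open files inside it.

```python
import stat

def group_can_enter(mode):
    """Return True if the group bits of `mode` allow listing and entering a directory."""
    # Read lets the group list the directory; execute lets it traverse into it.
    needed = stat.S_IRGRP | stat.S_IXGRP
    return (mode & needed) == needed

# Compare the current mode with the two candidates mentioned above.
for mode in (0o700, 0o740, 0o750):
    print(oct(mode), group_can_enter(mode))
```

Under this check only 0750 passes: 0740 gives the group read permission on the directory but no execute bit, so files inside it still could not be opened.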

Last edited 12 months ago by ppurka (previous) (diff)

comment:10 Changed 12 months ago by novoselt

For the record using 5.12.beta4: it is possible to descend through "cells" directory, but numbered directories for each cell and files within are only accessible by the server user.

Meanwhile Java's appetite grew and it is not satisfied with 5Gb ulimit on virtual memory (10Gb was enough).

comment:11 Changed 11 months ago by ppurka

I have created a pull request at sagenb to address this bug - it changes some file permissions.

I need some sagenb expert to go through the patch; someone who knows/remembers why the file permissions were set so restrictively.

comment:12 Changed 11 months ago by ppurka

  • Cc dimpase added; dima removed

comment:13 Changed 8 months ago by jdemeyer

I'm also using a similar setup (worker accounts, even on different servers) with Sage 5.12 and I'm not seeing this problem. I am seeing a problem with published worksheets though, where JMol doesn't work (the "Make Interactive" button on a published worksheet simply does nothing).

Last edited 8 months ago by jdemeyer (previous) (diff)

comment:14 Changed 8 months ago by jdemeyer

I can confirm that the permissions are indeed restrictive enough that they prevent the worker account from accessing the server's .sage directory. But in my setup, this doesn't prevent JMol from working. Perhaps this problem has been fixed? I wouldn't simply relax permissions; it is probably a feature that not everybody can access everybody's worksheets.

comment:15 follow-up: Changed 8 months ago by ppurka

This problem has not been fixed on my side.

What system are you using? I have been able to reliably reproduce this on Linux (Debian and Gentoo) ever since I first reported it.

comment:16 Changed 8 months ago by ppurka

The one about published worksheets is known. It is https://github.com/sagemath/sagenb/issues/179 but I don't know the fix.

comment:17 in reply to: ↑ 15 Changed 8 months ago by jdemeyer

Replying to ppurka:

What system are you using? I have been able to reliably reproduce this on Linux (Debian and Gentoo) ever since I first reported it.

Linux sage4 3.2.1-gentoo-r2 #15 SMP Tue May 15 10:40:50 CEST 2012 x86_64 Intel(R) Xeon(R) CPU X5660 @ 2.80GHz GenuineIntel GNU/Linux
Sage Version 5.12, Release Date: 2013-10-07

I am using the patch from #11679, but I don't see how that could make a difference.

Notebook started with

server_pool = ['worker@sage3', 'worker@sage4']
ulimit='-v 1200000 -t 86400'


# Import sage library
from sage.all_cmdline import *

notebook(
        port=8081,
        secure=False,
        accounts=False,
        automatic_login=False,
        quiet=True,
        timeout=3600*3,
        ulimit=ulimit,
        server_pool=server_pool)

The workers most certainly do not have access to the server's .sage directory (it can even be on a different machine!) but JMol works...

Since we want the server_pool thing to keep working, changing permissions is not the right solution: you should assume that the workers can only access /tmp (or whatever temporary directory you choose with #11679).

Last edited 8 months ago by jdemeyer (previous) (diff)

comment:18 Changed 8 months ago by jdemeyer

I just ran an strace on the worker process and there is only one point where it even tries to access the .sage directory (called dotsage in my case):

open("/home/sagenb/dotsage/sage_notebook.sagenb/home/J.Demeyer/197/data/c_lib.pyx", O_RDONLY) = -1 EACCES (Permission denied)

(this is because that directory is added to sys.path of the worker, see sagenb/notebook/worksheet.py, see #3844)

The worker creates a jmol.zip file:

open("sage0-size500-863868640.jmol.zip", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3

but it does so after

chdir("/var/sage/tmpmJfHuO")            = 0

(/var/sage plays the role of /tmp in my setup and is shared between the server and the workers).

Last edited 8 months ago by jdemeyer (previous) (diff)

comment:19 follow-up: Changed 8 months ago by ppurka

  1. Can you show me the permissions of /var/sage?
    ls -ld /var/sage
    
  2. Does your worker belong to the same group as that in /var/sage?
  3. Who creates this directory? The server user?

Edit: Thanks for debugging this, by the way. I think you have provided the main reason why it works in your case and doesn't work in my case.

Last edited 8 months ago by ppurka (previous) (diff)

comment:20 in reply to: ↑ 19 ; follow-up: Changed 8 months ago by jdemeyer

Replying to ppurka:

  1. Can you show me the permissions of /var/sage?
jdemeyer@sage4:/opt/sage/sage-5.12/devel/sagenb$ ls -ld /var/sage
drwxrwxrwt 1424 sagenb sagenb 274432 Dec 11 11:49 /var/sage

(the same permissions as /tmp)
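
As an illustration of what that mode string means, the following Python sketch builds the numeric mode by hand (an assumption for the example; it does not inspect any real directory) and decodes it with the standard stat module.

```python
import stat

# Directory with rwx for owner, group and others, plus the sticky bit,
# which is the usual mode of /tmp.
mode = stat.S_IFDIR | 0o1777

print(stat.filemode(mode))        # symbolic form, in the style of ls -l
print(bool(mode & stat.S_IWOTH))  # writable by users outside the owning group?
print(bool(mode & stat.S_ISVTX))  # sticky bit set?
```

The "other" write bit is what lets a worker outside the sagenb group create files there, while the sticky bit restricts deleting or renaming entries to their owner (or the directory owner).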

  2. Does your worker belong to the same group as that in /var/sage?

No. But the worker does have write access to /var/sage.

  3. Who creates this directory? The server user?

It was created once by root but owned by the server user.

Edit: Thanks for debugging this, by the way. I think you have provided the main reason why it works in your case and doesn't work in my case.

Please share your analysis, because it's not yet clear to me...

Last edited 8 months ago by jdemeyer (previous) (diff)

comment:21 Changed 8 months ago by jdemeyer

The only thing which does not work in my setup is the DATA directory.

comment:22 in reply to: ↑ 20 ; follow-up: Changed 8 months ago by ppurka

Replying to jdemeyer:

Please share your analysis, because it's not yet clear to me...

The strace output here shows that it tries to create the jmol file in some other directory. I am trying to understand if that directory is in /home/sagenb or in the "dotsage" directory of the server process.

comment:23 Changed 8 months ago by ppurka

Hm.. I tried to recreate your setup but it still doesn't work for me.

comment:24 in reply to: ↑ 22 Changed 8 months ago by jdemeyer

Replying to ppurka:

The strace output here shows that it tries to create the jmol file in some other directory. I am trying to understand if that directory is in /home/sagenb or in the "dotsage" directory of the server process.

It's exactly where it should be: in /tmp (see the chdir on line 293 of the strace). The file is also successfully created, so the error must be coming from somewhere else (there are no EACCES errors in your strace).

Last edited 8 months ago by jdemeyer (previous) (diff)

comment:25 Changed 8 months ago by jdemeyer

Can you also try tracing the subprocesses of the worker process (use strace -ff)?

Changed 8 months ago by ppurka

strace output

comment:26 Changed 8 months ago by ppurka

I have updated the strace file. It shows the permission denied line.
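
For anyone repeating this, a small Python sketch for pulling the failing calls out of a large strace log follows. It assumes the log lines look like the ones quoted in comment:18; the function name denied_paths and the regular expression are illustrative and are not part of any Sage tool.

```python
import re

# Matches a syscall whose first argument is a quoted path and which failed
# with EACCES, e.g.: open("/some/path", O_RDONLY) = -1 EACCES (Permission denied)
EACCES_RE = re.compile(r'(\w+)\("([^"]*)".*=\s*-1\s+EACCES')

def denied_paths(lines):
    """Yield (syscall, path) for every traced call that failed with EACCES."""
    for line in lines:
        match = EACCES_RE.search(line)
        if match:
            yield match.group(1), match.group(2)

# Sample lines copied from comment:18 above.
sample = [
    'open("/home/sagenb/dotsage/sage_notebook.sagenb/home/J.Demeyer/197/data/c_lib.pyx", O_RDONLY) = -1 EACCES (Permission denied)',
    'open("sage0-size500-863868640.jmol.zip", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3',
    'chdir("/var/sage/tmpmJfHuO")            = 0',
]

for syscall, path in denied_paths(sample):
    print(syscall, path)
```

To scan a real log, pass an open file object instead of the sample list.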

comment:27 Changed 8 months ago by jdemeyer

Got it: you have Java installed and I do not. Therefore, my static image is generated by Tachyon, not JMol. For the dynamic image, everything is client-side and it doesn't matter that I don't have Java on the server.
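
The difference between the two setups can be sketched in Python as follows. This is only an illustration of the fallback described here: the helper name pick_static_renderer is hypothetical, and looking for a java executable on PATH is an assumption, not necessarily how Sage detects Java.

```python
import shutil

def pick_static_renderer(which=shutil.which):
    """Return the renderer name used for the static image.

    Hypothetical helper: "jmol" when a `java` executable is found on PATH,
    otherwise "tachyon". `which` is injectable so both cases can be tried.
    """
    return "jmol" if which("java") else "tachyon"

# On a server without Java this prints "tachyon", matching the setup above.
print(pick_static_renderer())
```

Only the "jmol" branch starts a server-side Java process, which is why the permission error never showed up on a machine without Java.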

comment:28 Changed 8 months ago by ppurka

Wow! That confirms my suspicion that it was the update of jmol that broke this feature. That means the fix is to always use Tachyon for generating the static image.

comment:29 Changed 8 months ago by jdemeyer

I think the more correct fix is simply to run JMol inside the temporary directory (that works for Tachyon).
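
A minimal Python sketch of that idea, under stated assumptions: it is not the code in 14358_java.patch, the helper name run_in_tmpdir is hypothetical, and a small Python child process stands in for the Java/JMol invocation, whose real command line is not shown in this ticket.

```python
import os
import subprocess
import sys
import tempfile

def run_in_tmpdir(cmd):
    """Run `cmd` with a fresh temporary directory as its working directory.

    Returns (tmpdir, completed_process). The caller collects any output
    files from tmpdir and removes the directory afterwards.
    """
    tmpdir = tempfile.mkdtemp()
    proc = subprocess.run(cmd, cwd=tmpdir, capture_output=True, text=True)
    return tmpdir, proc

# Stand-in for the external renderer: it writes a file via a relative path,
# so the file lands in the temporary directory rather than the cell directory.
child = [sys.executable, "-c",
         "import pathlib; pathlib.Path('out.txt').write_text('ok')"]
tmpdir, proc = run_in_tmpdir(child)
print(os.path.exists(os.path.join(tmpdir, "out.txt")))
```

Because the child only ever touches paths relative to its working directory, it needs no access to the server user's worksheet directories.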

Last edited 8 months ago by jdemeyer (previous) (diff)

comment:30 Changed 8 months ago by jdemeyer

  • Component changed from notebook to graphics
  • Description modified (diff)
  • Summary changed from Permission denied error with jmol and "remote" user to Notebook server should run Java for JMol inside temporary directory

comment:31 Changed 8 months ago by jdemeyer

  • Authors set to Jeroen Demeyer
  • Status changed from new to needs_review

comment:32 Changed 8 months ago by jdemeyer

  • Description modified (diff)

comment:33 Changed 8 months ago by jdemeyer

The changed R doctest has nothing to do with this ticket, but it's obvious that a doctest shouldn't try to install a package (I ran optional Java + Internet tests inside sage/interfaces).

Changed 8 months ago by jdemeyer

comment:34 follow-up: Changed 8 months ago by kcrisman

Please see #7771 with respect to the R doctest issue. I'd rather have that issue fixed there than here.

By the way, apparently it wasn't obvious before that a doctest shouldn't try to install a package... but this fix is fine too, though it's not always clear what R will output (see #7771 reviewer patch and discussion).

comment:35 follow-up: Changed 8 months ago by kcrisman

Also, does this change just plain old fix Github pull requests 183 and 185? Just checking, as that wasn't clear.

comment:36 in reply to: ↑ 34 Changed 8 months ago by jdemeyer

Replying to kcrisman:

Please see #7771 with respect to the R doctest issue. I'd rather have that issue fixed there than here.

In some sense you are right, although a fix for this small doctest now is better than a fix as part of a much bigger ticket later (which often implies never). Ideally, this small fix should be a new ticket (independent of #14358 and #7771) but then there is a good chance it will get stuck forever without review (that's unfortunately the fate of many tickets).

By the way, apparently it wasn't obvious before that a doctest shouldn't try to install a package...

It was an optional test, so I guess nobody noticed...

Last edited 8 months ago by jdemeyer (previous) (diff)

comment:37 in reply to: ↑ 35 Changed 8 months ago by jdemeyer

Replying to kcrisman:

Also, does this change just plain old fix Github pull requests 183 and 185?

Yes, it does.

comment:38 follow-up: Changed 8 months ago by ppurka

This patch looks good. I have tested it on sage-5.12.beta4 and sage-5.13.beta1 with sagenb git and it works well. I have just one comment:

The change to fg should be kept. I do not understand why it was set so low at fg=2. The default image size seems to be not fg=2 but rather fg=5. The change is needed to ensure that the java process does not run out of memory. For example cube(figsize=40) returns a memory error from jmol now.

Do you plan to include the fix in 5.13? Or, do you want to make a git patch out of it?

comment:39 in reply to: ↑ 38 ; follow-up: Changed 8 months ago by jdemeyer

Replying to ppurka:

The change to fg should be kept.

I only removed code which was commented out, so I see no problem.

For example cube(figsize=40) returns a memory error from jmol now.

That was already the case, it's not because of my patch.

Do you plan to include the fix in 5.13? Or, do you want to make a git patch out of it?

I haven't decided. It looks like a pretty safe patch, which isn't going to break anything else and only affects the notebook.

comment:40 in reply to: ↑ 39 Changed 8 months ago by ppurka

  • Reviewers set to Punarbasu Purkayastha
  • Status changed from needs_review to positive_review

Replying to jdemeyer:

Replying to ppurka:

The change to fg should be kept.

I only removed code which was commented out, so I see no problem.

Oops. Sorry. I read and re-read it several times, but failed to notice that it was commented code. That explains why the default size was 5 - I was wondering why that if statement wasn't getting executed.

Then, the patch looks good to me. It is up to you if you want to include it in 5.13 or later.

comment:41 follow-up: Changed 8 months ago by ppurka

@jdemeyer by the way, if you get the time, maybe you can look at https://github.com/sagemath/sagenb/issues/184

I am really not getting the time to look into that. It is quite a showstopper bug that is more easily reproduced in old notebook directories (at least in my experience). You are good at debugging all kinds of stuff!! :-)

comment:42 in reply to: ↑ 41 Changed 8 months ago by jdemeyer

Replying to ppurka:

@jdemeyer by the way, if you get the time, maybe you can look at https://github.com/sagemath/sagenb/issues/184

Sorry, but I don't know much about either OpenID or the Sage Notebook.

comment:43 Changed 8 months ago by vbraun

  • Branch set to u/vbraun/sagenb_java_tmp

comment:44 Changed 8 months ago by vbraun

  • Resolution set to fixed
  • Status changed from positive_review to closed

comment:45 Changed 8 months ago by vbraun

  • Commit set to dcd3c9e8d900e7f7853df5bb9ee67137a979cf1c

This breaks the commandline jmol, see #15579
