Ticket #8469: trac_8469-rsa.patch

File trac_8469-rsa.patch, 19.1 KB (added by mvngu, 11 years ago)

based on Sage 4.5.2.rc0

  • new file doc/en/thematic_tutorials/bibliography.rst

    # HG changeset patch
    # User Minh Van Nguyen <nguyenminh2@gmail.com>
    # Date 1268200610 28800
    # Node ID 2f09678738ba26f8c142aac81d86ba02a907f9c2
    # Parent  8f177bfc9a22c0c9ab23f64a5f6c1e0b05eeec9d
    #8469: tutorial: number theory and the RSA public key cryptosystem
    
    diff --git a/doc/en/thematic_tutorials/bibliography.rst b/doc/en/thematic_tutorials/bibliography.rst
    new file mode 100644
     1============
     2Bibliography
     3============
     4
     5.. [CormenEtAl2001] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and
     6   C. Stein. *Introduction to Algorithms*. The MIT Press, USA, 2nd
     7   edition, 2001.
     8
     9.. [MenezesEtAl1996] A. J. Menezes, P. C. van Oorschot, and
     10   S. A. Vanstone. *Handbook of Applied Cryptography*. CRC Press, Boca
     11   Raton, FL, USA, 1996.
     12
     13.. [Stinson2006] D. R. Stinson. *Cryptography: Theory and Practice*.
     14   Chapman & Hall/CRC, Boca Raton, USA, 3rd edition, 2006.
     15
     16.. [TrappeWashington2006] W. Trappe and L. C. Washington. *Introduction
     17   to Cryptography with Coding Theory*. Pearson Prentice Hall, Upper
     18   Saddle River, New Jersey, USA, 2nd edition, 2006.
  • doc/en/thematic_tutorials/index.rst

    diff --git a/doc/en/thematic_tutorials/index.rst b/doc/en/thematic_tutorials/index.rst
    1717 
    1818   functional_programming
    1919   group_theory
     20   numtheory_rsa
     21   bibliography
    2022
    2123Indices and tables
    2224==================
  • new file doc/en/thematic_tutorials/numtheory_rsa.rst

    diff --git a/doc/en/thematic_tutorials/numtheory_rsa.rst b/doc/en/thematic_tutorials/numtheory_rsa.rst
    new file mode 100644
     1.. -*- coding: utf-8 -*-
     2
     3=================================================
     4Number Theory and the RSA Public Key Cryptosystem
     5=================================================
     6
     7.. MODULEAUTHOR:: Minh Van Nguyen <nguyenminh2@gmail.com>
     8
     9
     10This tutorial uses Sage to study elementary number theory and the RSA
     11public key cryptosystem.  A number of Sage commands will be presented
     12that help us to perform basic number theoretic operations such as
     13greatest common divisor and Euler's phi function.  We then present the
     14RSA cryptosystem and use Sage's built-in commands to encrypt and
     15decrypt data via the RSA algorithm.  Note that this tutorial on RSA is
     16for pedagogical purposes only.  For further details on cryptography or
     17the security of various cryptosystems, consult specialized texts such
     18as
     19[MenezesEtAl1996]_,
     20[Stinson2006]_, and
     21[TrappeWashington2006]_.
     22
     23
     24Elementary number theory
     25========================
     26
     27We first review basic concepts from elementary number theory,
     28including the notion of primes, greatest common divisors, congruences
     29and Euler's phi function.  The number theoretic concepts and Sage
     30commands introduced will be referred to in later sections when we
     31present the RSA algorithm.
     32
     33
     34Prime numbers
     35-------------
     36
     37Public key cryptography uses many fundamental concepts from number
     38theory, such as prime numbers and greatest common divisors.  A
     39positive integer `n > 1` is said to be *prime* if its only positive
     40factors are 1 and itself.  In Sage, we can obtain the first 20 prime
     41numbers using the command ``primes_first_n``::
     42
     43    sage: primes_first_n(20)
     44    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
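
For readers who want to see what such a command has to do, here is a minimal sketch in plain Python rather than Sage.  The helper name ``first_primes`` is hypothetical and the trial-division strategy is an assumption chosen for clarity; it is not how Sage implements ``primes_first_n``.

```python
def first_primes(count):
    """Return the first `count` primes, found by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # A candidate is prime if no earlier prime up to its square root divides it.
        if all(candidate % p != 0 for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


print(first_primes(20))
```

Trial division is only practical for small inputs, which is all this illustration needs.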
     45
     46
     47Greatest common divisors
     48------------------------
     49
     50Let `a` and `b` be integers, not both zero. Then the greatest common
     51divisor (GCD) of `a` and `b` is the largest positive integer which is
     52a factor of both `a` and `b`. We use `\gcd(a,b)` to denote this
     53largest positive factor. One can extend this definition by setting
     54`\gcd(0,0) = 0`. Sage uses ``gcd(a, b)`` to denote the GCD of `a`
     55and `b`. The GCD of any two distinct primes is 1, and the GCD of 18
     56and 27 is 9. ::
     57
     58    sage: gcd(3, 59)
     59    1
     60    sage: gcd(18, 27)
     61    9
     62
     63If `\gcd(a,b) = 1`, we say that `a` is *coprime* (or relatively
     64prime) to `b`.  In particular, `\gcd(3, 59) = 1` so 3 is coprime to 59
     65and vice versa.
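
The GCD can be computed without factoring either number.  The following is a minimal sketch of the classical Euclidean algorithm in plain Python; the names ``gcd_euclid`` and ``is_coprime`` are hypothetical helpers for this illustration, not Sage commands.

```python
def gcd_euclid(a, b):
    """Greatest common divisor via the Euclidean algorithm; gcd(0, 0) is 0."""
    a, b = abs(a), abs(b)
    while b != 0:
        # gcd(a, b) equals gcd(b, a mod b), and the second argument shrinks each step.
        a, b = b, a % b
    return a


def is_coprime(a, b):
    """Return True when a and b share no common factor larger than 1."""
    return gcd_euclid(a, b) == 1


print(gcd_euclid(3, 59), gcd_euclid(18, 27), is_coprime(3, 59))
```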
     66
     67
     68Congruences
     69-----------
     70
     71When one integer is divided by a non-zero integer, we usually get a
     72remainder.  For example, upon dividing 23 by 5, we get a remainder of
     733; when 8 is divided by 5, the remainder is again 3.  The notion of
     74congruence helps us to describe the situation in which two integers
     75have the same remainder upon division by a non-zero integer.  Let
     76`a,b,n \in \ZZ` such that `n \neq 0`.  If `a` and `b` have the
     77same remainder upon division by `n`, then we say that `a` is
     78*congruent* to `b` modulo `n` and denote this relationship by
     79
     80.. MATH::
     81
     82    a \equiv b \pmod{n}
     83
     84This definition is equivalent to saying that `n` divides the
     85difference of `a` and `b`, i.e. `n \;|\; (a - b)`.  Thus
     86`23 \equiv 8 \pmod{5}` because when both 23 and 8 are divided by 5, we
     87end up with a remainder of 3.  The command ``mod`` allows us to
     88compute such a remainder::
     89
     90    sage: mod(23, 5)
     91    3
     92    sage: mod(8, 5)
     93    3
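
The two equivalent descriptions of congruence, equal remainders and divisibility of the difference, can be compared directly.  This is a small sketch in plain Python, where the helper name ``is_congruent`` is an assumption made for the illustration.

```python
def is_congruent(a, b, n):
    """Return True when n divides a - b, that is, a is congruent to b modulo n."""
    if n == 0:
        raise ValueError("the modulus must be non-zero")
    return (a - b) % n == 0


# Both descriptions agree for the example in the text.
print(23 % 5 == 8 % 5, is_congruent(23, 8, 5))
```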
     94
     95
     96Euler's phi function
     97--------------------
     98
     99Consider all the integers from 1 to 20, inclusive.  List all those
     100integers that are coprime to 20.  In other words, we want to find
     101those integers `n`, where `1 \leq n \leq 20`, such that
     102`\gcd(n,20) = 1`.  The latter task can be easily accomplished with a
     103little bit of Sage programming::
     104
     105    sage: for n in range(1, 21):
     106    ...       if gcd(n, 20) == 1:
     107    ...           print n,
     108    ...
     109    1 3 7 9 11 13 17 19
     110
     111The above programming statements can be saved to a text file called,
     112say, ``/home/mvngu/totient.sage``, organizing it as follows to enhance
     113readability. ::
     114
     115    for n in xrange(1, 21):
     116        if gcd(n, 20) == 1:
     117            print n,
     118
     119We refer to ``totient.sage`` as a Sage script, just as one would refer
     120to a file containing Python code as a Python script.  We indent
     121with 4 spaces instead of tabs, which is a coding convention in Sage
     122as well as in Python programming.
     123
     124The command ``load`` can be used to read the file containing our
     125programming statements into Sage and, upon loading the content of the
     126file, have Sage execute those statements::
     127
     128    load("/home/mvngu/totient.sage")
     129    1 3 7 9 11 13 17 19
     130
     131From the latter list, there are 8 integers in the closed interval
     132`[1, 20]` that are coprime to 20.  Without explicitly generating the
     133list ::
     134
     135    1  3  7  9  11  13  17  19
     136
     137how can we compute the number of integers in `[1, 20]` that are
     138coprime to 20?  This is where Euler's phi function comes in handy.
     139Let `n \in \ZZ` be positive.  Then *Euler's phi function* counts the
     140number of integers `a`, with `1 \leq a \leq n`, such that
     141`\gcd(a,n) = 1`.  This number is denoted by `\varphi(n)`.  Euler's phi
     142function is sometimes referred to as Euler's totient function, hence
     143the name ``totient.sage`` for the above Sage script.  The command
     144``euler_phi`` implements Euler's phi function.  To compute
     145`\varphi(20)` without explicitly generating the above list, we proceed
     146as follows::
     147
     148    sage: euler_phi(20)
     149    8
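
The definition of `\varphi(n)` translates directly into a counting loop.  The sketch below is plain Python and simply counts; the name ``phi_by_counting`` is hypothetical, and this brute-force approach is only meant to mirror the definition, not to suggest how ``euler_phi`` works internally.

```python
from math import gcd


def phi_by_counting(n):
    """Count the integers a with 1 <= a <= n and gcd(a, n) == 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


print(phi_by_counting(20))
```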
     150
     151
     152How to keep a secret?
     153=====================
     154
     155*Cryptography* is the science (some might say art) of concealing
     156data.  Imagine that we are composing a confidential email to
     157someone.  Having written the email, we can send it in one of two ways.
     158The first, and usually more convenient, way is to simply press the send
     159button and not care about how our email will be delivered.  Sending an
     160email in this manner is similar to writing our confidential message on
     161a postcard and posting it without enclosing our postcard inside an
     162envelope.  Anyone who can access our postcard can see our message.
     163On the other hand, before sending our email, we can scramble the
     164confidential message and then press the send button.  Scrambling our
     165message is similar to enclosing our postcard inside an envelope.
     166While not 100% secure, at least we know that anyone wanting to read
     167our postcard has to open the envelope.
     168
     169In cryptography parlance, our message is called *plaintext*.  The
     170process of scrambling our message is referred to as *encryption*.
     171After encrypting our message, the scrambled version is called
     172*ciphertext*.  From the ciphertext, we can recover our original
     173unscrambled message via *decryption*. The following figure
     174illustrates the processes of encryption and decryption.  A
     175*cryptosystem* consists of a pair of related encryption and
     176decryption processes. ::
     177
     178   +----------+   encrypt    +------------+   decrypt    +-----------+
     179   | plaintext| -----------> | ciphertext | -----------> | plaintext |
     180   +----------+              +------------+              +-----------+
     181
     182
     183The following table provides a very simple method of scrambling a
     184message written in English and using only upper case letters,
     185excluding punctuation characters. ::
     186
     187   +----------------------------------------------------+
     188   | A   B   C   D   E   F   G   H   I   J   K   L   M  |
     189   | 65  66  67  68  69  70  71  72  73  74  75  76  77 |
     190   +----------------------------------------------------+
     191   | N   O   P   Q   R   S   T   U   V   W   X   Y   Z  |
     192   | 78  79  80  81  82  83  84  85  86  87  88  89  90 |
     193   +----------------------------------------------------+
     194
     195Formally, let
     196
     197.. MATH::
     198
     199    \Sigma
     200    =
     201    \{ \texttt{A}, \texttt{B}, \texttt{C}, \dots, \texttt{Z} \}
     202
     203be the set of capital letters of the English alphabet. Furthermore,
     204let
     205
     206.. MATH::
     207
     208    \Phi
     209    =
     210    \{ 65, 66, 67, \dots, 90 \}
     211
     212be the American Standard Code for Information Interchange (ASCII)
     213encodings of the upper case English letters.  Then the above table
     214explicitly describes the mapping `f: \Sigma \longrightarrow \Phi`.
     215(For those familiar with ASCII, `f` is actually a common process for
     216*encoding* elements of `\Sigma`, rather than a cryptographic
     217"scrambling" process *per se*.)  To scramble a message written using
     218the alphabet `\Sigma`, we simply replace each capital letter of the
     219message with its corresponding ASCII encoding.  However, the
     220scrambling process described in the above table provides,
     221cryptographically speaking, very little to no security at all and we
     222strongly discourage its use in practice.
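
The mapping `f` and its inverse can be sketched in plain Python with the built-in functions ``ord`` and ``chr``.  The helper names ``encode`` and ``decode`` are hypothetical, and restricting the input to the capital letters ``A`` to ``Z`` is an assumption that mirrors the alphabet `\Sigma` above.

```python
def encode(message):
    """Map each capital letter to its ASCII code, as in the table above."""
    for ch in message:
        if not ("A" <= ch <= "Z"):
            raise ValueError("only the capital letters A to Z are supported")
    return [ord(ch) for ch in message]


def decode(codes):
    """Invert encode() by mapping each ASCII code back to its letter."""
    return "".join(chr(code) for code in codes)


codes = encode("SECRET")
print(codes, decode(codes))
```

Because anyone can apply ``decode``, this illustrates why the mapping is an encoding and offers no secrecy.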
     223
     224
     225Keeping a secret with two keys
     226==============================
     227
     228The Rivest, Shamir, Adleman (RSA) cryptosystem is an example of a
     229*public key cryptosystem*.  RSA uses a *public key* to
     230encrypt messages and decryption is performed using a corresponding
     231*private key*.  We can distribute our public keys, but for
     232security reasons we should keep our private keys to ourselves.  The
     233encryption and decryption processes draw upon techniques from
     234elementary number theory.  The algorithm below is adapted from page
     235165 of [TrappeWashington2006]_. It outlines the RSA procedure for
     236encryption and decryption.
     237
     238#. Choose two distinct primes `p` and `q` and let `n = pq`.
     239#. Let `e \in \ZZ` be positive such that
     240   `\gcd \big( e, \varphi(n) \big) = 1`.
     241#. Compute a value for `d \in \ZZ` such that
     242   `de \equiv 1 \pmod{\varphi(n)}`.
     243#. Our public key is the pair `(n, e)` and our private key is the
     244   triple `(p,q,d)`.
     245#. For any positive integer `m < n`, encrypt `m` using
     246   `c \equiv m^e \pmod{n}`.
     247#. Decrypt `c` using `m \equiv c^d \pmod{n}`.
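
Before working with large numbers, the six steps can be traced with deliberately tiny values.  The following plain Python sketch assumes Python 3.8 or later (for ``pow`` with a negative exponent), and the primes 61 and 53, the exponent 17 and the message 65 are illustrative choices that are not part of the tutorial.  It also relies on the standard fact that `\varphi(pq) = (p-1)(q-1)` for distinct primes `p` and `q`.

```python
from math import gcd

# Step 1: two small distinct primes (illustrative values only, far too small to be secure).
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)  # Euler's phi of n for distinct primes p and q

# Step 2: a public exponent coprime to phi(n).
e = 17
assert gcd(e, phi) == 1

# Step 3: the private exponent is the inverse of e modulo phi(n).
d = pow(e, -1, phi)

# Step 4: public key (n, e), private key (p, q, d).
# Steps 5 and 6: encrypt a message m < n, then decrypt the ciphertext.
m = 65
c = pow(m, e, n)
recovered = pow(c, d, n)

print(n, phi, d, c, recovered)
```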
     248
     249The next two sections will step through the RSA algorithm, using
     250Sage to generate public and private keys, and perform encryption
     251and decryption based on those keys.
     252
     253
     254Generating public and private keys
     255==================================
     256
     257Positive integers of the form `M_m = 2^m - 1` are called
     258*Mersenne numbers*.  If `p` is prime and `M_p = 2^p - 1` is also
     259prime, then `M_p` is called a *Mersenne prime*.  For example, 31
     260is prime and `M_{31} = 2^{31} - 1` is a Mersenne prime, as can be
     261verified using the command ``is_prime(p)``.  This command returns
     262``True`` if its argument ``p`` is precisely a prime number;
     263otherwise it returns ``False``.  By definition, a prime must be a
     264positive integer, hence ``is_prime(-2)`` returns ``False``
     265although we know that 2 is prime.  Similarly, the number
     266`M_{61} = 2^{61} - 1` is also a Mersenne prime.  We can use
     267`M_{31}` and `M_{61}` to work through step 1 in the RSA algorithm::
     268
     269    sage: p = (2^31) - 1
     270    sage: is_prime(p)
     271    True
     272    sage: q = (2^61) - 1
     273    sage: is_prime(q)
     274    True
     275    sage: n = p * q ; n
     276    4951760154835678088235319297
     277
     278A word of warning is in order here.  In the above code example,
     279choosing `p` and `q` to be Mersenne primes whose lengths differ by so
     280many digits is a very bad choice in terms of cryptographic
     281security.  However, we shall use the above numeric values of
     282`p` and `q` for the remainder of this tutorial, always bearing in mind
     283that they have been chosen for pedagogical purposes only.  Refer to
     284[MenezesEtAl1996]_,
     285[Stinson2006]_, and
     286[TrappeWashington2006]_
     287for in-depth discussions on the security of RSA, or consult other
     288specialized texts.
     289
     290For step 2, we need to find a positive integer that is coprime to
     291`\varphi(n)`.  The set of integers is implemented within the Sage
     292module ``sage.rings.integer_ring``.  Various operations on
     293integers can be accessed via the ``ZZ.*`` family of functions.
     294For instance, the command ``ZZ.random_element(n)`` returns a
     295pseudo-random integer uniformly distributed within the closed interval
     296`[0, n-1]`.  Using a simple programming loop, we can compute the
     297required value of `e` as follows::
     298
     299    sage: n = 4951760154835678088235319297
     300    sage: e = ZZ.random_element(euler_phi(n))
     301    sage: while gcd(e, euler_phi(n)) != 1:
     302    ...       e = ZZ.random_element(euler_phi(n))
     303    ...
     304    sage: e  # random
     305    1850567623300615966303954877
     306    sage: e < n
     307    True
     308
     309As ``e`` is a pseudo-random integer, its numeric value changes
     310after each execution of ``e = ZZ.random_element(euler_phi(n))``.
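
The same search loop can be sketched in plain Python.  The helper name ``choose_public_exponent`` is hypothetical, and, unlike the Sage session above, this sketch samples from `[2, \varphi(n) - 1]` so that the trivial exponent 1 is never returned.  It uses the standard ``random`` module, which is not cryptographically secure and is assumed here only because the example is pedagogical.

```python
import random
from math import gcd


def choose_public_exponent(phi):
    """Sample integers in [2, phi - 1] until one is coprime to phi."""
    while True:
        e = random.randrange(2, phi)
        if gcd(e, phi) == 1:
            return e


p = 2**31 - 1
q = 2**61 - 1
phi = (p - 1) * (q - 1)  # Euler's phi of n = p * q for distinct primes
e = choose_public_exponent(phi)
print(gcd(e, phi))
```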
     311
     312To calculate a value for ``d`` in step 3 of the RSA algorithm, we use
     313the extended Euclidean algorithm.  By definition of congruence,
     314`de \equiv 1 \pmod{\varphi(n)}` is equivalent to
     315
     316.. MATH::
     317
     318    de - k \cdot \varphi(n) = 1
     319
     320where `k \in \ZZ`.  From steps 1 and 2, we already know the numeric
     321values of `e` and `\varphi(n)`.  The extended Euclidean algorithm
     322allows us to compute `d` and `-k`.  In Sage, this can be accomplished
     323via the command ``xgcd``.  Given two integers `x` and `y`,
     324``xgcd(x, y)`` returns a 3-tuple ``(g, s, t)`` that satisfies
     325the Bézout identity `g = \gcd(x,y) = sx + ty`.  Having computed a
     326value for ``d``, we then use the command
     327``mod(d*e, euler_phi(n))`` to check that ``d*e`` is indeed congruent
     328to 1 modulo ``euler_phi(n)``. ::
     329
     330    sage: n = 4951760154835678088235319297
     331    sage: e = 1850567623300615966303954877
     332    sage: bezout = xgcd(e, euler_phi(n)); bezout  # random
     333    (1, 4460824882019967172592779313, -1667095708515377925087033035)
     334    sage: d = Integer(mod(bezout[1], euler_phi(n))) ; d  # random
     335    4460824882019967172592779313
     336    sage: mod(d * e, euler_phi(n))
     337    1
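
To see how the Bézout coefficients arise, here is a minimal iterative sketch of the extended Euclidean algorithm in plain Python.  The name ``xgcd_sketch`` is hypothetical, non-negative inputs are assumed, and the small values 17 and 3120 are illustrative rather than taken from the session above.

```python
def xgcd_sketch(x, y):
    """Return (g, s, t) with g = gcd(x, y) = s*x + t*y, for non-negative x and y."""
    s_prev, s_curr = 1, 0
    t_prev, t_curr = 0, 1
    while y != 0:
        quotient, remainder = divmod(x, y)
        x, y = y, remainder
        # Apply the same update to the coefficients that express x and y
        # as combinations of the original inputs.
        s_prev, s_curr = s_curr, s_prev - quotient * s_curr
        t_prev, t_curr = t_curr, t_prev - quotient * t_curr
    return x, s_prev, t_prev


e, phi = 17, 3120
g, s, t = xgcd_sketch(e, phi)
d = s % phi  # reduce the coefficient of e to a positive representative
print(g, d, (d * e) % phi)
```

Reducing the coefficient modulo `\varphi(n)` plays the same role as the call to ``mod`` in the Sage session.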
     338
     339Thus, our RSA public key is
     340
     341.. MATH::
     342
     343    (n, e)
     344    =
     345    (4951760154835678088235319297,\, 1850567623300615966303954877)
     346
     347and our corresponding private key is
     348
     349.. MATH::
     350
     351    (p, q, d)
     352    =
     353    (2147483647,\, 2305843009213693951,\, 4460824882019967172592779313)
     354
     355
     356Encryption and decryption
     357=========================
     358
     359Suppose we want to scramble the message ``HELLOWORLD`` using RSA
     360encryption.  From the above ASCII table, our message maps to integers
     361of the ASCII encodings as given below. ::
     362
     363    +----------------------------------------+
     364    | H   E   L   L   O   W   O   R   L   D  |
     365    | 72  69  76  76  79  87  79  82  76  68 |
     366    +----------------------------------------+
     367
     368Concatenating all the integers in the last table, our message can be
     369represented by the integer
     370
     371.. MATH::
     372
     373    m = 72697676798779827668
     374
     375There are other more cryptographically secure means for representing
     376our message as an integer.  The above process is used for
     377demonstration purposes only and we strongly discourage its use in
     378practice. In Sage, we can obtain an integer representation of our
     379message as follows::
     380
     381    sage: m = "HELLOWORLD"
     382    sage: m = map(ord, m); m
     383    [72, 69, 76, 76, 79, 87, 79, 82, 76, 68]
     384    sage: m = ZZ(list(reversed(m)), 100) ; m
     385    72697676798779827668
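
The same integer can be built in plain Python by treating each ASCII code as one base-100 digit.  The helper names ``message_to_integer`` and ``integer_to_message`` are hypothetical, and the sketch assumes every character has a two-digit code, which holds for the capital letters ``A`` to ``Z``.

```python
def message_to_integer(message):
    """Concatenate the two-digit ASCII codes of the message into one integer."""
    value = 0
    for ch in message:
        value = value * 100 + ord(ch)
    return value


def integer_to_message(value):
    """Split the integer into base-100 digits and map each back to a letter."""
    chars = []
    while value > 0:
        value, code = divmod(value, 100)
        chars.append(chr(code))
    return "".join(reversed(chars))


m = message_to_integer("HELLOWORLD")
print(m, integer_to_message(m))
```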
     386
     387To encrypt our message, we raise `m` to the power of `e` and reduce
     388the result modulo `n`.  The command ``mod(a^b, n)`` first computes
     389``a^b`` and then reduces the result modulo ``n``.  If the exponent
     390``b`` is a "large" integer, say with more than 20 digits, then
     391performing modular exponentiation in this naive manner takes quite
     392some time.  Brute force (or naive) modular exponentiation is
     393inefficient and, when performed using a computer, can quickly
     394consume a huge quantity of the computer's memory or result in overflow
     395messages.  For instance, if we perform naive modular exponentiation
     396using the command ``mod(m^e, n)``, where ``m``, ``n`` and ``e`` are as
     397given above, we would get an error message similar to the following::
     398
     399    mod(m^e, n)
     400    Traceback (most recent call last)
     401    /home/mvngu/<ipython console> in <module>()
     402    /home/mvngu/usr/bin/sage-3.1.4/local/lib/python2.5/site-packages/sage/rings/integer.so
     403    in sage.rings.integer.Integer.__pow__ (sage/rings/integer.c:9650)()
     404    RuntimeError: exponent must be at most 2147483647
     405
     406There is a trick to efficiently perform modular exponentiation, called
     407the method of repeated squaring, cf. page 879 of [CormenEtAl2001]_.
     408Suppose we want to compute `a^b \mod n`.  First, let
     409`d \mathrel{\mathop:}= 1` and obtain the binary representation of `b`,
     410say `(b_1, b_2, \dots, b_k)` with `b_1` the most significant bit and
     411each `b_i \in \ZZ/2\ZZ`.  For `i \mathrel{\mathop:}= 1, \dots, k`, let
     412`d \mathrel{\mathop:}= d^2 \mod n` and if `b_i = 1` then let
     413`d \mathrel{\mathop:}= da \mod n`.  This algorithm is implemented in
     414the function ``power_mod``. We now use the function ``power_mod`` to
     415encrypt our message::
     416
     417    sage: m = 72697676798779827668
     418    sage: e = 1850567623300615966303954877
     419    sage: n = 4951760154835678088235319297
     420    sage: c = power_mod(m, e, n); c
     421    630913632577520058415521090
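
The method of repeated squaring described above can be sketched in a few lines of plain Python.  The name ``power_mod_sketch`` is hypothetical and the code is an illustration of the technique, not the implementation behind Sage's ``power_mod``.  It scans the bits of the exponent from the most significant to the least significant, and assumes a non-negative exponent and a positive modulus.

```python
def power_mod_sketch(a, b, n):
    """Compute a**b mod n by repeated squaring, for b >= 0 and n >= 1."""
    d = 1 % n
    for bit in bin(b)[2:]:  # binary digits of b, most significant first
        d = (d * d) % n     # square for every bit
        if bit == "1":
            d = (d * a) % n  # multiply by a when the bit is set
    return d


m = 72697676798779827668
e = 1850567623300615966303954877
n = 4951760154835678088235319297
# Compare with Python's built-in three-argument pow.
print(power_mod_sketch(m, e, n) == pow(m, e, n))
```

Every intermediate value stays below `n^2`, which is why this approach avoids the memory problems of computing ``m^e`` in full.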
     422
     423Thus `c = 630913632577520058415521090` is the ciphertext.  To recover
     424our plaintext, we raise ``c`` to the power of ``d`` and reduce the
     425result modulo ``n``.  Again, we use modular exponentiation via
     426repeated squaring in the decryption process::
     427
     428    sage: m = 72697676798779827668
     429    sage: c = 630913632577520058415521090
     430    sage: d = 4460824882019967172592779313
     431    sage: n = 4951760154835678088235319297
     432    sage: power_mod(c, d, n)
     433    72697676798779827668
     434    sage: power_mod(c, d, n) == m
     435    True
     436
     437
     438Notice in the last output that the value 72697676798779827668 is the
     439same as the integer that represents our original message.  Hence we
     440have recovered our plaintext.
     441
     442
     443Acknowledgements
     444================
     445
     446#. 2009-07-25: Ron Evans (Department of Mathematics, UCSD) reported
     447   a typo in the definition of greatest common divisors. The revised
     448   definition incorporates his suggestions.
     449
     450#. 2008-11-04: Martin Albrecht (Information Security Group, Royal
     451   Holloway, University of London), John Cremona (Mathematics
     452   Institute, University of Warwick) and William Stein (Department of
     453   Mathematics, University of Washington) reviewed this tutorial. Many
     454   of their invaluable suggestions have been incorporated into this
     455   document.