RSS

A review of phase 1

22 Jun

Phase 1 of my project involved writing sparse matrix algorithms. For that purpose, I have written two classes, the DOKMatrix class and the LILMatrix class. DOKMatrix is good for add, sub, mult and easy fast editing of matrices. It currently has sparse cholesky decomposition, LDL decomposition algorithms implemented, and a fast matrix multiplication algorithm. But these decompositions only work for symmetric matrices. So I was multiplying by A.T and solving the equation A.T * A * x = A.T * B. There is an overhead of two matrix multiplications here. Further, this method could only give the absolute value of the determinant of a general matrix.

Enter LIL. LIL is a list of list of elements. It has N lists which store the N rows of a matrix. A row is a list which stores only the non-zero elements of that row in the form of (j, value) tuple pair. Row operations in LILMatrix are very fast and intuitive. This enable me to write sparsified gaussian elimination and reduced row echelon form of a matrix. This completes the bare essentials of sparse matrix algorithms !

Some benchmarks:

LDL decomposition on DOKMatrix

In [5]: A = randInvDOKMatrix(10,2); A = A.T * A
# creates a banded 10*10 matrix, with diagonals till 2nd diagonal full.

In [6]: %timeit A._LDL_sparse() # On python floats
1000 loops, best of 3: 894 us per loop

In [7]: A.applyfunc(S)

In [8]: %timeit A._LDL_sparse() # On sympy Integers and Rationals.
100 loops, best of 3: 2.34 ms per loop

In [9]: A.sparsity()
Out[9]: 0.42
In [11]: A = randInvDOKMatrix(100, 10); A = A.T * A 
# For a 100 * 100 Matrix, with a band of 2*10 full diagonals

In [12]: %timeit A._LDL_sparse()
1 loops, best of 3: 221 ms per loop

In [13]: A.applyfunc(S)

In [14]: %timeit A._LDL_sparse()
1 loops, best of 3: 712 ms per loop

In [15]: A.sparsity()
Out[15]: 0.3316

Reduction to Upper Triangular Matrix using Gaussian Elimination on LILMatrix

In [20]: %timeit A.gauss_col()
1000 loops, best of 3: 280 us per loop

In [21]: A = randInvLILMatrix(10,2)

In [22]: %timeit A.gauss_col()
1000 loops, best of 3: 259 us per loop

In [23]: A.applyfunc(S)

In [24]: %timeit A.gauss_col()
1000 loops, best of 3: 744 us per loop

In [25]: A.sparsity()
Out[25]: 0.26

In [26]: A = randInvLILMatrix(100,10)

In [27]: %timeit A.gauss_col()
10 loops, best of 3: 77.2 ms per loop

In [28]: A.applyfunc(S)

In [29]: %timeit A.gauss_col()
1 loops, best of 3: 241 ms per loop

In [30]: A.sparsity()
Out[30]: 0.1691

Finding inverse of a LILMatrix using RREF

In [8]: A = randInvLILMatrix(10,2)

In [9]: %timeit A.inv_rref()
1000 loops, best of 3: 1.75 ms per loop

In [10]: A.applyfunc(S)

In [11]: %timeit A.inv_rref()
100 loops, best of 3: 9.04 ms per loop

In [12]: A.sparsity()
Out[12]: 0.26

In [13]: A = randInvLILMatrix(100,10)

In [14]: %timeit A.inv_rref()
1 loops, best of 3: 851 ms per loop

In [15]: A.applyfunc(S)

In [16]: %timeit A.inv_rref()
1 loops, best of 3: 10.6 s per loop

In [17]: A.sparsity()
Out[17]: 0.1704
Dense Matrix speed (for comparison)

In [18]: B = A.toMatrix()
In [19]: %time C = B.inv()
CPU times: user 175.02 s, sys: 3.73 s, total: 178.75 s
Wall time: 186.35 s

Solving Ax=B by RREFing on a augmented matrix.

In [5]: A = randLILMatrix(10,11, sparsity=0.6)

In [6]: %timeit A.rref()
1000 loops, best of 3: 1.01 ms per loop

In [7]: A.applyfunc(S)

In [8]: %timeit A.rref()
100 loops, best of 3: 2.67 ms per loop

In [9]: A = randLILMatrix(100,101, sparsity=0.6)

In [10]: %timeit A.rref()
1 loops, best of 3: 999 ms per loop

In [11]: A = randLILMatrix(100,101, sparsity=0.3)

In [12]: %timeit A.rref()
1 loops, best of 3: 787 ms per loop

In [13]: A.applyfunc(S)

In [14]: %timeit A.rref()
1 loops, best of 3: 9.04 s per loop

In [18]: A.sparsity()
Out[18]: 0.304158415842

 

About these ads
 
5 Comments

Posted by on June 22, 2011 in Sympy

 

5 responses to “A review of phase 1

  1. asmeurer

    June 22, 2011 at 7:09 am

    The values of A appear to be Python ints, not floats.

    Was it intentional that you changed the slicing syntax? A[0][0] doesn’t work for me.

     
    • Sherjil Ozair

      June 22, 2011 at 2:41 pm

      Yes, they’re python ints. But I have division imported from __future__. So the ints become floats soon enough.

      Yes A[0] wouldn’t mean anything right now. Should it return the first row of the matrix ? The slicing syntax currently is A[r1:r2,c1:c2].

       
      • vks

        June 22, 2011 at 6:11 pm

        I think A[0] should at least work when A.cols or A.rows == 1, this way you can easily use such matrices as vectors.

         
  2. Sherjil Ozair

    June 23, 2011 at 2:09 pm

    More typechecks in the __getitem__. Performance regression, if you ask me. I’ll have to figure out something.

     
    • vks

      June 23, 2011 at 9:18 pm

      Yes, but it should be in the high level interface. It’s fine if the low-level interface does not have this syntactic sugar.

       

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

%d bloggers like this: