Create a sparse diagonal matrix from row of a sparse matrix

I process rather large matrices in Python/Scipy. I need to extract rows from large matrix (which is loaded to coo_matrix) and use them as diagonal elements. Currently I do that in the following fashion:

import numpy as np
from scipy import sparse

def computation(A):
  for i in range(A.shape[0]):
    diag_elems = np.array(A[i,:].todense())
    ith_diag = sparse.spdiags(diag_elems,0,A.shape[1],A.shape[1], format = "csc")
    #...

#create some random matrix
A = (sparse.rand(1000,100000,0.02,format="csc")*5).astype(np.ubyte)
#get timings
profile.run('computation(A)')

What I see from the profile output is that most of the time is consumed by get_csr_submatrix function while extracting diag_elems. That makes me think that I use either inefficient sparse representation of initial data or wrong way of extracting row from a sparse matrix. Can you suggest a better way to extract a row from a sparse matrix and represent it in a diagonal form?

EDIT

The following variant removes bottleneck from the row extraction (notice that simple changing 'csc' to csr is not sufficient, A[i,:] must be replaced with A.getrow(i) as well). However the main question is how to omit the materialization (.todense()) and create the diagonal matrix from the sparse representation of the row.

import numpy as np
from scipy import sparse

def computation(A):
  for i in range(A.shape[0]):
    diag_elems = np.array(A.getrow(i).todense())
    ith_diag = sparse.spdiags(diag_elems,0,A.shape[1],A.shape[1], format = "csc")
    #...

#create some random matrix
A = (sparse.rand(1000,100000,0.02,format="csr")*5).astype(np.ubyte)
#get timings
profile.run('computation(A)')

If I create DIAgonal matrix from 1-row CSR matrix directly, as follows:

diag_elems = A.getrow(i)
ith_diag = sparse.spdiags(diag_elems,0,A.shape[1],A.shape[1])

then I can neither specify format="csc" argument, nor convert ith_diags to CSC format:

Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/profile.py", line 70, in run
    prof = prof.run(statement)
  File "/usr/local/lib/python2.6/profile.py", line 456, in run
    return self.runctx(cmd, dict, dict)
  File "/usr/local/lib/python2.6/profile.py", line 462, in runctx
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "<stdin>", line 4, in computation
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/construct.py", line 56, in spdiags
    return dia_matrix((data, diags), shape=(m,n)).asformat(format)
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/base.py", line 211, in asformat
    return getattr(self,'to' + format)()
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/dia.py", line 173, in tocsc
    return self.tocoo().tocsc()
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/coo.py", line 263, in tocsc
    data    = np.empty(self.nnz, dtype=upcast(self.dtype))
  File "/usr/local/lib/python2.6/site-packages/scipy/sparse/sputils.py", line 47, in upcast
    raise TypeError,'no supported conversion for types: %s' % args
TypeError: no supported conversion for types: object`

Solution

Here's what I came up with:

def computation(A):
    for i in range(A.shape[0]):
        idx_begin = A.indptr[i]
        idx_end = A.indptr[i+1]
        row_nnz = idx_end - idx_begin
        diag_elems = A.data[idx_begin:idx_end]
        diag_indices = A.indices[idx_begin:idx_end]
        ith_diag = sparse.csc_matrix((diag_elems, (diag_indices, diag_indices)),shape=(A.shape[1], A.shape[1]))
        ith_diag.eliminate_zeros()

Python profiler said 1.464 seconds versus 5.574 seconds before. It takes advantage of the underlying dense arrays (indptr, indices, data) that define sparse matrices. Here's my crash course: A.indptr[i]:A.indptr[i+1] defines which elements in the dense arrays correspond to the non-zero values in row i. A.data is a dense 1d array of non-zero the values of A and A.indptr is the column where those values go.

I would do some more testing to make very certain this does the same thing as before. I only checked a few cases.