
info type discussions for mem-based optimizations #1

@spencerpatty

Description


operation_info_t info;
device_policy policy;
multiply_inspect(info, policy, a, x, y);
multiply_inspect(info, policy, transposed(a), x, y);
// Allocate more memory for y based on `info`
while (/* ... */) {
    multiply_execute(info, policy, a, x, y);
    // do something with y, update x...
    multiply_execute(info, policy, transposed(a), y, x);
    // Maybe do some more stuff...
}

I like this idea of having an info type that is directly associated with some matrix structure and is filled with zero or more inspection-based optimizations (which means it houses "stateful + read-only" optimizations). I wonder whether our multiply functions could take a hybrid matrix_obj object that consists of either a matrix_view alone or a matrix_view plus an associated matrix_info_t, used along the lines of the following snippet:

csr_view<T, I, O> A(...);
matrix_info_t A_info(...);
multiply_inspect(matrix_obj{A, A_info}, descriptor, x, y /*, backend stuff */);
multiply_execute(matrix_obj{A, A_info}, descriptor, x, y /*, backend stuff */);

or we might also skip the inspection, at the cost of some performance...

csr_view<T, I, O> A(...);
multiply_execute(matrix_obj{A}, descriptor, x, y /*, backend stuff */);
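
One way this could hang together is sketched below (the type and member names, like matrix_obj::has_info, are hypothetical, not a concrete proposal): a thin wrapper bundles the view with an optional, non-owning pointer to its info, so the inspected and non-inspected calls can go through the same overload.

// Minimal sketch (names hypothetical): bundle a matrix view with an
// optional, non-owning pointer to its inspection data.
struct matrix_info_t;  // the library's inspection-derived, read-only state; opaque here

template <typename MatrixView>
class matrix_obj {
public:
    explicit matrix_obj(MatrixView v) : view_(v) {}                      // matrix_obj{A}
    matrix_obj(MatrixView v, matrix_info_t& i) : view_(v), info_(&i) {}  // matrix_obj{A, A_info}

    const MatrixView& view() const { return view_; }
    bool has_info() const { return info_ != nullptr; }
    const matrix_info_t* info() const { return info_; }

private:
    MatrixView view_;
    matrix_info_t* info_ = nullptr;  // null when inspection was skipped
};

With C++17 class template argument deduction, matrix_obj{A} and matrix_obj{A, A_info} both compile as written above, and a single multiply_execute overload can branch on has_info() internally.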

The benefit of this shows up when we look at the sparse * sparse operation: we could have an A_info and a B_info carrying good (read-only, stateful) information about A and B that is useful while creating C, plus a separate multi-stage info type (multiply_info_t below) that is particular to the multi-stage operation itself (stateful + read/write data):

csr_view<T,I,O> A(...);
matrix_info_t A_info(...);

csr_view<T,I,O> B(...);
matrix_info_t B_info(...);

csr_view<T,I,O> C(...);

multiply_info_t mult_info; // C = A * B^T

multiply_inspect(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), desc /*, backend stuff */);                 // fills A_info and/or B_info
multiply_execute_stage1(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info /*, backend stuff */);     // fills mult_info and C
multiply_execute_stage2(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info /*, backend stuff */);     // fills mult_info and C
multiply_execute_stage3(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info /*, backend stuff */);     // fills mult_info and C

mult_info would house the stateful + read/write optimizations pertaining to the multi-stage multiply process, while A_info and B_info would hold the stateful + read-only optimizations about A and/or B.
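
To make that split concrete, here is a rough sketch of what each type might carry (all field names are hypothetical illustrations, not proposed contents):

#include <cstddef>
#include <vector>

// Per-matrix inspection results: filled once by multiply_inspect, then only read.
struct matrix_info_t {
    bool columns_sorted = false;             // e.g. column indices sorted within each row
    std::vector<std::size_t> row_partition;  // e.g. a precomputed load-balanced row split
};

// Per-operation state for the multi-stage C = A * B^T: written and re-read across stages.
struct multiply_info_t {
    std::size_t c_nnz_estimate = 0;       // stage 1: estimated/exact nonzero count of C
    std::vector<std::byte> workspace;     // stages 2-3: scratch buffers reused between calls
    int last_completed_stage = 0;         // tracks progress through the staged execute
};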

Does this idea make sense? Does anyone see any usability issues? Is it too ugly? I worry that we will end up with too many overloads if A and an optional A_info are passed as separate arguments, whereas bundling them lets us distinguish between matrix inputs + their info and operation-level info data.
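
For what it's worth, the overload concern can be counted out: if each matrix operand may or may not come with its own info argument, the separated form needs 2^k overloads for k matrix operands, while the bundled form needs one. A sketch with hypothetical signatures:

struct matrix_info_t;                     // per-matrix inspection data
template <class View> class matrix_obj;   // view + optional info bundle

// Separated form: 4 overloads already for C = A * B
// (no info, A_info only, B_info only, both).
template <class C, class A, class B>
void multiply_execute(C c, A a, B b);
template <class C, class A, class B>
void multiply_execute(C c, A a, matrix_info_t& a_info, B b);
template <class C, class A, class B>
void multiply_execute(C c, A a, B b, matrix_info_t& b_info);
template <class C, class A, class B>
void multiply_execute(C c, A a, matrix_info_t& a_info, B b, matrix_info_t& b_info);

// Bundled form: one signature; the info is optional inside each matrix_obj.
template <class C, class A, class B>
void multiply_execute(matrix_obj<C> c, matrix_obj<A> a, matrix_obj<B> b);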
