Loading...

algogeeks@googlegroups.com

[Prev] Thread [Next]  |  [Prev] Date [Next]

Re: [algogeeks] Re: optimisation Arun Vishwanathan Wed Feb 29 10:00:56 2012

@all: Thanks a lot

On Wed, Feb 29, 2012 at 9:02 AM, Gene <[EMAIL PROTECTED]> wrote:

> For big matrices, using all the caches well is very important.  The
> usual approach is block tiling as you say.  You want two blocks to fit
> nicely in the L1 cache.  The most highly optimized schemes will have a
> hierarchy of tiles where two tiles at the second level will fit in the
> L2 cache, etc. Additional levels can be based on memory interfaces,
> interleaving, page size, and even cylinder size on disk (for really
> huge matrices). The idea is _never_ to generate more cache misses than
> you need to. A miss causes a factor of 10 to 10000 performance
> multiple on that operation.
>
> Multiplying within the innermost blocks should also consider prefetch:
> you'll get best performance touching locations in contiguous ascending
> order.  The inner loops are
>
> for i = 1 to ma
>  for j = 1 to nb
>    for k = 1 to na
>      r[i,j] += a[i,k] * b[k,j]
>
> This ignores that r[i,j] needs to be zeroed out somewhere.  But with
> this assumption, the loops can be juxtaposed in any order without
> changing the result. You should explore the 6 possible orderings for
> the best one.  Of course you also have to zero out the sums in the
> best possible manner.
>
> FWIW, a GPU will normally outperform a general purpose CPU with ease
> on this problem.  Since even cell phones are getting GPUs these days,
> tweaking single-processor matrix code is a dying art.
>
> Cheers.
>
> On Feb 27, 6:57 pm, Arun Vishwanathan <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > We have this challenge to make the fastest executing serial matrix
> > multiplication code. I have tried using matrix transpose( in C for row
> > major ) and loop unrolling.I was able to obtain little speedup. Does
> anyone
> > have any hints/papers that I could read upon and try to speed up
> further?I
> > had tried a bit of block tiling but was not successful.
> >
> > Thanks
> > Arun
>
> --
> You received this message because you are subscribed to the Google Groups
> "Algorithm Geeks" group.
> To post to this group, send email to [EMAIL PROTECTED]
> To unsubscribe from this group, send email to
> [EMAIL PROTECTED]
> For more options, visit this group at
> http://groups.google.com/group/algogeeks?hl=en.
>
>


-- 
 "People often say that motivation doesn't last. Well, neither does bathing
- that's why we recommend it daily."

-- 
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To post to this group, send email to [EMAIL PROTECTED]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/algogeeks?hl=en.