Linux Users' Group of Davis (LUGOD)
 

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] gcc questions: inline and -ffast-math



On Mon 17 Jan 05, 12:21 PM, Josh Parsons <jbparsons@ucdavis.edu> said:
> 
> Well, you should be able to find out more about these with "man gprof"
> and "info gcc".  If you haven't thought of using profiling, I'd start
> there; that will help you write better, faster, code, and gain you much
> more speed than tweaking code generation options would.
 
Yeah -- I've been playing around with gprof for the past 3 hours.  There are
3 functions that take up 90% of the run time.  Unfortunately, there were no
surprises.  I could have guessed the results.

One of them is a straight implementation of the Thomas algorithm, which
solves tridiagonal systems of linear equations.  There's really nothing I can
do with that function; the algorithm is very straightforward to implement
and already exploits the sparse structure of the matrix.  It's an O(n)
operation -- one forward sweep and one back substitution, about 2n steps.
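
(For the curious, the textbook form of that solver looks roughly like the
sketch below.  This is a generic illustration, not my actual code -- the
array names and the in-place overwriting of c[] and d[] are just the usual
convention.)

/* Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i], i = 0..n-1,
 * for a tridiagonal system (a[0] and c[n-1] unused).
 * One forward sweep plus one back substitution: O(n) work. */
void thomas_solve(int n, const double *a, const double *b,
                  double *c, double *d, double *x)
{
    int i;

    /* forward sweep: eliminate the sub-diagonal */
    c[0] /= b[0];
    d[0] /= b[0];
    for (i = 1; i < n; i++) {
        double m = 1.0 / (b[i] - a[i] * c[i - 1]);
        c[i] *= m;
        d[i] = (d[i] - a[i] * d[i - 1]) * m;
    }

    /* back substitution */
    x[n - 1] = d[n - 1];
    for (i = n - 2; i >= 0; i--)
        x[i] = d[i] - c[i] * x[i + 1];
}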

The second function is something I've already optimized the heck out of,
both with pencil and paper and in the code itself.  It's an O(n^2) operation
that forms arrays of sums of sums.  I was very clever in writing it, caching
intermediate sums and reusing them wherever possible.  I'm fairly sure that
function is about as optimized as it can get.
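
(To illustrate the kind of caching I mean -- again a generic sketch, not my
actual function: if you build a table of range sums by carrying the previous
entry forward instead of re-summing, the whole table costs O(n^2) rather
than the O(n^3) a naive triple loop would.)

/* Illustrative only: s[i][j] = a[i] + a[i+1] + ... + a[j].
 * Reusing s[i][j-1] keeps the cost at O(n^2). */
void range_sums(int n, const double *a, double **s)
{
    int i, j;

    for (i = 0; i < n; i++) {
        s[i][i] = a[i];
        for (j = i + 1; j < n; j++)
            s[i][j] = s[i][j - 1] + a[j];   /* cached partial sum */
    }
}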

The third function I haven't paid much attention to yet, so I'll take a
look.  It's an O(n) operation, but it's a fairly straightforward
implementation of the difference equations for a PDE.  I'm not hopeful.
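
(For context, a generic example of that kind of update -- an explicit
finite-difference step for the 1-D heat equation, which is not my actual
PDE but has the same O(n)-per-step shape:)

/* Illustrative only: one explicit step of u_t = alpha * u_xx on n grid
 * points, with fixed boundary values.  O(n) per time step. */
void heat_step(int n, const double *u, double *u_new,
               double alpha, double dx, double dt)
{
    int i;
    double r = alpha * dt / (dx * dx);

    u_new[0]     = u[0];
    u_new[n - 1] = u[n - 1];
    for (i = 1; i < n - 1; i++)
        u_new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1]);
}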

I'm actually quite happy with the speed of this program; I was just trying
to push it more.  Greedy.  :)   It used to be much slower.  I shifted most
of the variables used to solve the PDE to preprocessor #defines, and that
cut run time, literally, by more than half.

That means I need to recompile the program whenever I want to change
*anything*.  But the run-time speed savings I get in return are WAY more
than worth the compilation time.
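
(Concretely, the trick is just this -- generic example, not my code: a
parameter that used to be a runtime variable becomes a compile-time
constant, so gcc can fold it into the arithmetic instead of loading it on
every iteration, at the price of a recompile whenever the value changes.)

/* Parameters baked in at compile time -- gcc can precompute
 * r = ALPHA*DT/(DX*DX) once and fold it into the loop. */
#define N      4096
#define DT     1.0e-4
#define DX     (1.0 / N)
#define ALPHA  0.5

static double u[N], u_new[N];

void step(void)
{
    int i;
    const double r = ALPHA * DT / (DX * DX);   /* compile-time constant */

    for (i = 1; i < N - 1; i++)
        u_new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1]);
}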

Thanks for the thread -- I learned a LOT from it!

Pete

-- 
The mathematics of physics has become ever more abstract, rather than more
complicated.  The mind of God appears to be abstract but not complicated.
He also appears to like group theory.  --  Tony Zee's "Fearful Symmetry"

GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E  70A9 A3B9 1945 67EA 951D
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech


