Page last updated: 2005 Jan 17 14:27
 The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

# Re: [vox-tech] gcc questions: inline and -ffast-math

```
On Mon 17 Jan 05, 12:21 PM, Josh Parsons <jbparsons@ucdavis.edu> said:
>
> Well, you should be able to find out more about these with "man gprof"
> and "info gcc".  If you haven't thought of using profiling, I'd start
> there; that will help you write better, faster, code, and gain you much
> more speed than tweaking code generation options would.

Yeah -- I've been playing around with gprof for the past 3 hours.  There are
3 functions that take up 90% of the run time.  Unfortunately, there were no
surprises.  I could have guessed the results.

One of them is a straight implementation of the Thomas Algorithm, which
solves linear equations in tridiagonal form.  There's really nothing I can
do with that function; the algorithm is very straightforward to implement
and already makes use of the "sparse" nature of the matrix.  It's an O(n)
operation (two passes over the n unknowns).

The second function is something I've already optimized the heck out of,
both with pencil-paper and with code.  It's an O(n^2) operation that forms
arrays of sums of sums.  I was very clever in writing it, caching results
whenever possible.  I'm fairly sure that function is about as optimized as
it can get.

The third function I haven't paid too much attention to, so I'll take a
look.  It's an O(n) operation, but it's a fairly straightforward
implementation of difference equations for a PDE.  I'm not hopeful.

I'm actually quite happy with the speed of this program; I was just trying
to push it more.  Greedy.  :)   It used to be much slower.  I shifted most
of the variables used to solve the PDE to preprocessor #defines, and that
cut run time, literally, by more than half.

That means I need to recompile the program whenever I want to change
*anything*.  But the run-time speed savings I get in return are WAY more than
worth the compilation time.

Thanks for the thread -- I learned a LOT from it!

Pete

--
The mathematics of physics has become ever more abstract, rather than more
complicated.  The mind of God appears to be abstract but not complicated.
He also appears to like group theory.  --  Tony Zee's "Fearful Symmetry"

GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E  70A9 A3B9 1945 67EA 951D
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech

```
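
The Thomas algorithm mentioned in the post can be sketched as below. This is a minimal illustration, not Pete's actual code: the name `thomas_solve`, the array layout, and the choice to overwrite `c` and `d` are all assumptions. It does no pivoting, so it assumes a well-behaved (for example, diagonally dominant) matrix.

```c
#include <assert.h>
#include <stddef.h>

/* Solve a tridiagonal system in place (hypothetical helper).
 *   a: sub-diagonal,   a[0] is unused
 *   b: main diagonal
 *   c: super-diagonal, c[n-1] is unused; overwritten as scratch
 *   d: right-hand side; overwritten with the solution
 * No pivoting is done, so b[i] and the modified pivots must be nonzero. */
void thomas_solve(size_t n, const double *a, const double *b,
                  double *c, double *d)
{
    assert(n > 0);

    /* Forward sweep: eliminate the sub-diagonal. */
    for (size_t i = 0; i < n; i++) {
        double m = b[i];
        double rhs = d[i];
        if (i > 0) {
            m -= a[i] * c[i - 1];
            rhs -= a[i] * d[i - 1];
        }
        if (i + 1 < n)
            c[i] /= m;
        d[i] = rhs / m;
    }

    /* Back substitution: walk from the last unknown to the first. */
    for (size_t i = n - 1; i-- > 0; )
        d[i] -= c[i] * d[i + 1];
}
```

Each of the two loops touches every unknown once, which is where the linear cost comes from. A caller would fill `a`, `b`, `c`, `d`, call `thomas_solve(n, a, b, c, d)`, and read the solution out of `d`.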
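
The post does not say which PDE is being solved, so the following is only an illustration of what an O(n) difference-equation update can look like. It assumes the 1D heat equation u_t = u_xx with an explicit (forward-time, centered-space) scheme and fixed endpoints; `heat_step` and `lambda` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One explicit time step of u_t = u_xx on a uniform grid (assumed PDE).
 *   lambda = dt / (dx * dx); the scheme is stable for lambda <= 0.5.
 *   u:      values at the current time level (n points)
 *   u_next: values at the next time level (n points, must not alias u)
 * The two endpoints are copied unchanged (Dirichlet boundaries). */
void heat_step(size_t n, double lambda, const double *u, double *u_next)
{
    assert(n >= 2);

    u_next[0] = u[0];
    u_next[n - 1] = u[n - 1];

    /* Interior points: one three-point stencil each, so O(n) per step. */
    for (size_t i = 1; i + 1 < n; i++)
        u_next[i] = u[i] + lambda * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
}
```

A loop this simple leaves little to restructure by hand, which fits the post's low expectations for the third function.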
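
Moving solver parameters into preprocessor #defines, as the post describes, might look roughly like this. The parameter names `NX`, `DT`, `DX`, and `LAMBDA` are made up for the example. With the values fixed at compile time, the compiler can fold expressions such as `DT / (DX * DX)` into a single constant instead of recomputing them from variables at run time. How much that helps depends entirely on the program.

```c
#include <assert.h>

/* Hypothetical solver parameters. They can be overridden at build time,
 * e.g. gcc -DNX=1024 -DDT=1e-4 ..., which is why changing them means
 * recompiling. */
#ifndef NX
#define NX 5
#endif
#ifndef DT
#define DT 0.01
#endif
#define DX     (1.0 / (NX - 1))
#define LAMBDA (DT / (DX * DX))   /* a compile-time constant expression */

/* Runtime version: dt and nx arrive as variables, so the division
 * and multiplication are carried out on every call. */
double lambda_runtime(double dt, int nx)
{
    assert(nx > 1);
    double dx = 1.0 / (nx - 1);
    return dt / (dx * dx);
}

/* Compile-time version: the whole expression is known to the compiler,
 * which can replace it with one literal value. */
double lambda_fixed(void)
{
    return LAMBDA;
}
```

Wrapping the defaults in `#ifndef` keeps a single source file usable for several parameter sets, at the cost of one build per set.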