Mailinglist Archive: opensuse (1558 mails)

Re: [opensuse] c question - how to tell how much memory you program/data structure uses?
  • From: G T Smith <grahamsmith@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
  • Date: Sat, 15 Aug 2009 12:00:23 +0100
  • Message-id: <4A869547.1020709@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>

David C. Rankin wrote:

A couple of questions. In c, I am working on a small program to read
deposition transcripts (text files) into a linked list and then step
through the list to classify the lines, search, etc. What I would
like to know is how do I tell how much memory the program uses? The
list structure is basically:

struct record {
        char *line;
        int lineno;
        int linetype;
        int pageno;
        int recordno;
        struct record *next;
};

The file is read with line lengths determined by "length = getline
(&lineptr, &n, depo)", with each char *line allocated as length+1 with
malloc. Is there a way to determine how much memory I'm using besides
doing some kind of sum over each node in the list, where memory for each
node would be something like ( length*char + int + int + int + int +
int + pointer )? Or could I just take something like the text file
size on disk + number of structures * ( int + int + int + int + int +
pointer )?
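A per-node sum is straightforward to sketch. This is a minimal, hedged example assuming a struct like the one quoted above (with a single lineno field); note it counts only the *requested* sizes, so the allocator's own padding and bookkeeping will make real usage somewhat larger:

```c
#include <stddef.h>
#include <string.h>

struct record {
    char *line;
    int lineno;
    int linetype;
    int pageno;
    int recordno;
    struct record *next;
};

/* Walk the list and sum the struct itself plus each malloc'd line
 * (strlen + 1 for the terminating NUL, matching the length+1 malloc). */
static size_t list_memory(const struct record *head)
{
    size_t total = 0;
    for (const struct record *p = head; p != NULL; p = p->next)
        total += sizeof *p + (p->line ? strlen(p->line) + 1 : 0);
    return total;
}
```

sizeof already accounts for any padding inside the struct, so this is closer to the truth than adding up int + int + pointer by hand.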

Also, more generally, for searching and working on the text, would
reading the text into something other than a linked list be
better? I guess I could use one big buffer. The text files for each
single transcript are rarely ever more than 500K. Eventually, I would
like to have the ability to search through any number of transcripts,
but they need not all be in memory at the same time (although for
speed that would help). Any text handling favorite data structures?
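For comparison, the "one big buffer" approach is only a few lines. A minimal sketch (function and file names here are illustrative, not from the original post):

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into one malloc'd, NUL-terminated buffer.
 * Returns NULL on any error; the caller frees the result. */
static char *slurp(const char *path, long *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (!fp) return NULL;
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long len = ftell(fp);
    if (len < 0) { fclose(fp); return NULL; }
    rewind(fp);
    char *buf = malloc((size_t)len + 1);
    if (buf && fread(buf, 1, (size_t)len, fp) != (size_t)len) {
        free(buf);
        buf = NULL;
    }
    if (buf) {
        buf[len] = '\0';          /* lets string functions work on it */
        if (out_len) *out_len = len;
    }
    fclose(fp);
    return buf;
}
```

At ~500K per transcript this fits comfortably in memory, and the memory accounting question answers itself: usage is essentially the file size plus one pointer.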

Next, and more importantly... who has a good link to a gdb tutorial?
(I have found a few, but I would love to have a good one
recommended). Thanks.

Sorry formatting lost in above as message for some reason did not wrap
properly in Thunderbird...


If you are trying to familiarise yourself with C, strings and lists, then
fine. I personally find associative arrays/hashes very useful for
this kind of activity, but C does not natively support this type of data
structure. C++ with the STL does include much of this functionality (in the
collection templates), though as the program is small I am not certain
that this is appropriate, and basic usage is somewhat like pulling one's
own teeth without pain relief.

To be honest, what you are describing is moving perilously close to a
relational database function, and it is not clear to me whether you want
the overall memory usage of the program's process (I thought there were
some system calls for that) or the memory usage of the program's data.
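For the process-wide figure on Linux, one option is to scan /proc/self/status. A minimal sketch (the "VmRSS:" field name and kB unit are Linux-specific, so this is not portable):

```c
#include <stdio.h>

/* Return the process's resident set size in kB by scanning
 * /proc/self/status for the "VmRSS:" line; returns -1 if the
 * field cannot be found or the file cannot be opened. */
static long rss_kb(void)
{
    FILE *fp = fopen("/proc/self/status", "r");
    if (!fp) return -1;
    char line[256];
    long kb = -1;
    while (fgets(line, sizeof line, fp)) {
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1)
            break;
    }
    fclose(fp);
    return kb;
}
```

getrusage(RUSAGE_SELF, ...) is the POSIX-flavoured alternative, though which of its fields are filled in varies by kernel.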

However, Perl or Python would be better for this kind of activity if
performance is not a key issue, and do not forget Java. They also take
care of the memory management headaches; C/C++ memory management, along
with C pointers, should really be in the 'Here be monsters, beware!'
section of any C reference text or manual :-) .

Also, there is a fair bit of corpora/concordance software (much of it in
the public domain) already out there which seems to cover what you are
aiming to do. (Why re-invent the wheel, except for the fun :-) ).
There used to be something at the Law School of the institution I used to
work for which I was told was used for this purpose (Micro-Concord, I think)...

As for gdb: probably useful to know, but I would use an IDE such as
Eclipse for debugging and development. Most IDEs include goodies like
syntax checking, code profiling, header reference checking, code
templates and the like. A good IDE can in some ways be a good tutorial
tool if you are trying to get to grips with an unfamiliar language. I have
not used the C/C++ support in Eclipse for a long time, so cannot comment
on what it is like now (it was quite nice)... but the Java and Perl
stuff is useful.


Choosing appropriate tools helps ensure a productive software project,
and choosing the right type of language is a useful start....


Shell scripts are good for stringing together sequences of system
commands where processing of the output is fairly simple and flow
decisions are also relatively simple. However, they are not so good for
situations where either the output processing or the flow control is
complex. (You can do more complex stuff, but being able to do something
and whether you should do something are two different questions.)

Interpreted script languages such as Perl or Python are more useful
for more complex projects, and offer some benefits that purely compiled
languages do not. For a static code base there is usually a
comparative performance hit, but they can perform operations which are
not easily done within a compiled language. They are also useful for
fast prototyping before a later compiled-language version (Perl is
possibly somewhat better than Python in this respect, as C-style Perl
syntax is a somewhat closer match to C/C++ syntax, though I am sure
some Python supporters will question this assertion).

Compiled languages will usually give the best-performing and most
efficient implementation of a particular program design (if you are
using an appropriate language, of course; fluid mechanics in Cobol,
anyone :-) ).

(BTW, Java is neither fish nor fowl on the last two counts and,
depending on one's viewpoint, either combines the best or the worst of
the strengths and weaknesses of interpreted script languages and
compiled ones.)

- --
I have always wished that my computer would be as easy to use as my
telephone. My wish has come true. I no longer know how to use my telephone.

Bjarne Stroustrup

