Discussion:
[Gc] Is there a way to walk the entire heap of live objects
Christian Schafmeister
2014-06-05 11:57:53 UTC
Hi there,

Is there any functionality in the Boehm GC that would allow me to freeze the garbage collector and walk the heap of live objects? I’d like to take an inventory of all of the live objects to determine the source of my current memory problems (30 GB of virtual memory used after 4 hours of runtime).

I recently incorporated the Boehm garbage collector into a Common Lisp system that I’ve written in C++ that uses LLVM as the back-end and interoperates with C++.

I’ve exposed the Clang compiler front end and the Clang AST Matcher library to the Common Lisp system and I’ve written a static analyzer to analyze and clean up the C++ source code for the Common Lisp system (165 C++ source files).

The static analyzer takes about 4 hours to run, and when it is done the REPL just sits there while the application consumes about 30 GB (gigabytes) of virtual memory.

I have made all of the objects self-describing, so if I could walk the live objects I could count the instances of each class (~500 classes) and measure their sizes.

Either I’ve got some enormous data structures in there that I have forgotten about, or heap fragmentation is very bad.

In case I’m not configuring the Boehm library correctly - here is my configuration.
boehm-setup:
(cd boehm-$(BOEHM_VERSION); \
export ALL_INTERIOR_PTRS=1; \
CFLAGS="-DUSE_MMAP -DMMAP_FIXED -DHEAP_START=0x7AE147AE1000" \
./configure --with-large-heap=yes --enable-cplusplus --prefix=$(CLASP_APP_RESOURCES_EXTERNALS_COMMON_DIR);)



Best,

Christian Schafmeister
Bruce Hoult
2014-06-06 03:22:09 UTC
It's pretty complicated to walk the live objects — there's an awful lot of
code in the marking engine to figure all that out, given different marking
procedures etc.

Really, the only sensible way is to hook into marking yourself. Maybe we
should have this facility but we don't, at least not in the sense of
calling arbitrary user code.

If we did have it, it would be a GC library compile time option, so as to
not slow down production code.

I believe the best thing for you to do as it stands is to look in
include/private/gc_pmark.h where you'll find two implementations of
the PUSH_CONTENTS_HDR macro. Only one is used on each platform, depending
on whether we use a bit or a byte as the object mark "bit".

At the end of each you'll see:

TRACE(source, GC_log_printf("GC #%u: previously unmarked\n", \
                            (unsigned)GC_gc_no)); \
TRACE_TARGET(base, \
             GC_log_printf("GC #%u: marking %p from %p instead\n", \
                           (unsigned)GC_gc_no, base, source)); \
INCR_MARKS(hhdr); \
GC_STORE_BACK_PTR((ptr_t)source, base); \
PUSH_OBJ(base, hhdr, mark_stack_top, mark_stack_limit); \

I'd add a call to your own function somewhere in there (it doesn't matter
exactly where); let's say just before the PUSH_OBJ.

christian_heap_trace(base); \

Maybe pass hhdr->hb_sz as well if it will help you. NB it will in general
be rounded up from the size that the client program actually asked for.
There is no record kept of the originally requested size.

You'll want to add a declaration for christian_heap_trace(ptr_t) somewhere
in gc_pmark.h also.

Obviously you need to be very careful in your christian_heap_trace()
function! For sure don't call any functions from the GC in it. I'd
recommend you do a printf() with what you find (e.g. the object address and
type) and analyse the emitted text file later using perl or whatever you
like.
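
For concreteness, a minimal sketch of what that trace function could look like (illustrative only; ptr_t is the GC's internal pointer typedef, already visible inside gc_pmark.h):

/* Sketch only: this runs inside the marker, so it must not allocate,
 * take locks, or call back into the collector.  Just log and return. */
void christian_heap_trace(ptr_t base)
{
    /* If the object header at 'base' is self-describing, its type
     * could be printed here as well. */
    GC_log_printf("marked %p\n", (void *)base);
}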

I think that should work :-)



Peter Wang
2014-06-06 04:16:27 UTC
Hi,

We do something like that in the Mercury project using the following
function.

I am not an expert on Boehm GC internals, and the code is only used in a
special memory profiling grade, so not heavily exercised.
Please let me know if there is something wrong with it.

Peter

(in reclaim.c)

STATIC void GC_mercury_do_enumerate_reachable_objects(struct hblk *hbp,
                                                      word dummy)
{
    struct hblkhdr * hhdr = HDR(hbp);
    size_t sz = hhdr -> hb_sz;
    size_t bit_no;
    char *p, *plim;

    if (GC_block_empty(hhdr)) {
        return;
    }

    p = hbp->hb_body;
    bit_no = 0;
    if (sz > MAXOBJBYTES) { /* one big object */
        plim = p;
    } else {
        plim = hbp->hb_body + HBLKSIZE - sz;
    }
    /* Go through all words in block. */
    while (p <= plim) {
        if (mark_bit_from_hdr(hhdr, bit_no)) {
            GC_mercury_callback_reachable_object((GC_word *)p,
                BYTES_TO_WORDS(sz));
        }
        bit_no += MARK_BIT_OFFSET(sz);
        p += sz;
    }
}

GC_INNER void GC_mercury_enumerate_reachable_objects(void)
{
    GC_ASSERT(GC_mercury_callback_reachable_object);
    GC_apply_to_all_blocks(GC_mercury_do_enumerate_reachable_objects, (word)0);
}
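
A rough usage sketch for anyone adapting this, assuming GC_mercury_callback_reachable_object is a plain function pointer defined alongside the code above, and that the mark bits are only trustworthy right after a collection (count_object and report_reachable are hypothetical names):

#include <stdio.h>
#include "gc.h"

/* Assumed to be declared alongside the code above. */
extern void (*GC_mercury_callback_reachable_object)(GC_word *, size_t);
extern void GC_mercury_enumerate_reachable_objects(void);

static size_t total_objects, total_words;

static void count_object(GC_word *obj, size_t sz_in_words)
{
    (void)obj;
    total_objects += 1;
    total_words += sz_in_words;
}

void report_reachable(void)
{
    GC_mercury_callback_reachable_object = count_object;
    GC_gcollect();  /* refresh the mark bits; allocate nothing after this */
    GC_mercury_enumerate_reachable_objects();
    printf("%zu reachable objects, %zu words\n", total_objects, total_words);
}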
Bruce Hoult
2014-06-06 04:41:16 UTC
Right. Without having thought hard about every line, that looks plausible
(and it must be pretty OK if it's working for you). It effectively uses the
end results of the last GC mark phase.

One thing: I expect it'll miss anything you've allocated since the last GC.

The object order will also be pretty random — in a fresh heap each block
will have objects in order of allocation of objects of that size. Once
there's been some churn it'll be a lot more mixed up.

My idea (which, I should mention, I haven't tested) will give some
approximation to a pre-order traversal of the object graph.

Actually, if you want to recreate the DAG, it might be more useful to put
your debug callback *before* the SET_MARK_BIT_EXIT_IF_SET line.
Ivan Maidanski
2014-06-06 07:14:13 UTC
Hi,
I think it would be good to add such an API upstream.
Christian Schafmeister
2014-06-06 06:30:02 UTC
Thank you both - that is very helpful. I’ll try it tomorrow.

Before posting to the mailing list I did find a very old email on this topic, and its code looks like the code you posted.
The old code didn’t work, but I’ll try yours tomorrow.

Again - thank you very much - I’ll post the results.

Best,

.Chris.
Christian Schafmeister
2014-06-06 16:59:01 UTC
Peter,

Thank you for the code - I incorporated it into reclaim.c and it works perfectly.
I set up the callback to accumulate results in std::maps keyed on the type of each object.

Now I can use this to help me identify my memory problems.
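
The tallying callback is straightforward; a condensed sketch, where header_to_type_name is a hypothetical stand-in for however the self-describing header yields a type name:

#include <cstddef>
#include <map>
#include <string>

struct KindStats {
    std::size_t total_size = 0;
    std::size_t count = 0;
    std::size_t largest = 0;
};
static std::map<std::string, KindStats> g_stats;

// Invoked once per reachable object with its base address and size.
static void tally_object(void* obj, std::size_t sz)
{
    KindStats& s = g_stats[header_to_type_name(obj)]; // hypothetical helper
    s.total_size += sz;
    s.count += 1;
    if (sz > s.largest) s.largest = sz;
}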

Here is an example of the output that I’m getting:
-------------------- Reachable StringKinds -------------------
strs: total_size: 1136720 count: 26858 avg.sz: 42 largest: 2128 N7gctools17GCString_moveableIcEE
-------------------- Reachable ContainerKinds -------------------
cont: total_size: 2359344 count: 2 avg.sz: 1179672 largest: 1179672 N7gctools17GCVector_moveableIN4core11CacheRecordEEE
cont: total_size: 1287048 count: 14386 avg.sz: 89 largest: 24816 N7gctools17GCVector_moveableINS_9smart_ptrIN4core3T_OEEEEE
cont: total_size: 160944 count: 1339 avg.sz: 120 largest: 288 N7gctools17GCVector_moveableIN4core16RequiredArgumentEEE
cont: total_size: 88920 count: 2632 avg.sz: 33 largest: 88 N7gctools16GCArray_moveableINS_9smart_ptrIN4core3T_OEEELi0EEE
cont: total_size: 51472 count: 73 avg.sz: 705 largest: 3296 N7gctools17GCVector_moveableINS_9smart_ptrIN4core8Symbol_OEEEEE
cont: total_size: 32256 count: 112 avg.sz: 288 largest: 288 N7gctools17GCVector_moveableIN4core16OptionalArgumentEEE
cont: total_size: 15232 count: 272 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableINS_9smart_ptrIN4core6Cons_OEEEEE
cont: total_size: 14784 count: 50 avg.sz: 295 largest: 672 N7gctools17GCVector_moveableIN4core15KeywordArgumentEEE
cont: total_size: 8216 count: 1 avg.sz: 8216 largest: 8216 N7gctools17GCVector_moveableIN4core14DynamicBindingEEE
cont: total_size: 3288 count: 1 avg.sz: 3288 largest: 3288 N7gctools17GCVector_moveableINS_9smart_ptrIN6clbind10ClassRep_OEEEEE
cont: total_size: 1560 count: 1 avg.sz: 1560 largest: 1560 N7gctools17GCVector_moveableINS_9smart_ptrIN4core11Character_OEEEEE
cont: total_size: 1560 count: 1 avg.sz: 1560 largest: 1560 N7gctools17GCVector_moveableINS_9smart_ptrIN4core5Str_OEEEEE
cont: total_size: 1024 count: 1 avg.sz: 1024 largest: 1024 N7gctools17GCVector_moveableIN4core14ExceptionEntryEEE
cont: total_size: 808 count: 13 avg.sz: 62 largest: 136 N7gctools17GCVector_moveableINS_9smart_ptrIN4core9Package_OEEEEE
cont: total_size: 392 count: 7 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableIPN4core15SequenceStepperEEE
cont: total_size: 168 count: 3 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableINS_9smart_ptrIN4core16SourceFileInfo_OEEEEE
cont: total_size: 152 count: 1 avg.sz: 152 largest: 152 N7gctools17GCVector_moveableIN4core11AuxArgumentEEE
-------------------- Reachable LispKinds -------------------
lisp: total_size: 2561936 count: 80050 avg.sz: 32 N4core6Cons_OE
lisp: total_size: 638880 count: 19965 avg.sz: 32 N4core5Str_OE
lisp: total_size: 592560 count: 14814 avg.sz: 40 N4core8Symbol_OE
lisp: total_size: 454960 count: 11374 avg.sz: 40 N4core15VectorObjects_OE
lisp: total_size: 335104 count: 10472 avg.sz: 32 N4core8Bignum_OE
lisp: total_size: 215432 count: 3847 avg.sz: 56 N4core18CompiledFunction_OE
lisp: total_size: 200112 count: 8338 avg.sz: 24 N4core14StandardChar_OE
lisp: total_size: 194272 count: 6071 avg.sz: 32 N4core14CompiledBody_OE
lisp: total_size: 183696 count: 7654 avg.sz: 24 N4core13DoubleFloat_OE
lisp: total_size: 146816 count: 2294 avg.sz: 64 N4core10Instance_OE
lisp: total_size: 137104 count: 1558 avg.sz: 88 N4core19LambdaListHandler_OE
lisp: total_size: 121392 count: 5058 avg.sz: 24 N4core8Fixnum_OE
lisp: total_size: 102408 count: 2560 avg.sz: 40 N4core12ValueFrame_OE
lisp: total_size: 88592 count: 1582 avg.sz: 56 N4core9BuiltIn_OE
lisp: total_size: 79000 count: 2468 avg.sz: 32 N4core15SourcePosInfo_OE
lisp: total_size: 64640 count: 1616 avg.sz: 40 N5llvmo10Function_OE
lisp: total_size: 64360 count: 1609 avg.sz: 40 N4core13HashTableEq_OE
lisp: total_size: 54360 count: 1359 avg.sz: 40 N4core14HashTableEql_OE
lisp: total_size: 37312 count: 583 avg.sz: 64 N4core31SingleDispatchGenericFunction_OE
Paul Bone
2014-06-16 12:03:59 UTC
I'm chasing down a bug that I think is in Boehm GC and affects Mercury.
I'm going through the various changes that we've made to the collector
including Peter's change and rebasing them onto version 7.4.2 (they were on
7.2).

Here is Peter's code for enumerating reachable objects as a patch on version
7.4.2, in case anyone finds it useful or would like to include it upstream.

https://github.com/PaulBone/bdwgc/tree/peter_trace_heap

Thanks.
--
Paul Bone
Ivan Maidanski
2015-08-06 08:23:30 UTC
Hi,
I have added a heap-walk feature (to master) based on the code developed by Peter.
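
The feature is exposed via GC_enumerate_reachable_objects_inner() in gc_mark.h; a minimal sketch of its use, roughly (check gc_mark.h in your tree for the exact declaration):

#include <stdio.h>
#include "gc.h"
#include "gc_mark.h"

static void print_obj(void *obj, size_t bytes, void *client_data)
{
    (void)client_data;
    printf("reachable %p (%zu bytes)\n", obj, bytes);
}

static void *do_walk(void *arg)
{
    (void)arg;
    /* Requires the allocation lock, hence the wrapper below. */
    GC_enumerate_reachable_objects_inner(print_obj, NULL);
    return NULL;
}

int main(void)
{
    GC_INIT();
    GC_gcollect();  /* make the mark bits current */
    GC_call_with_alloc_lock(do_walk, NULL);
    return 0;
}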
Christian Schafmeister
2014-06-06 23:07:26 UTC
Apologies if this is a duplicate - I ran into trouble with the size due to a graph I included.

I used the code that Peter sent me as-is - it works very well - thanks!

I’m writing a Common Lisp implementation that interoperates with C++ and uses LLVM as the back end. I’ve written a static analyzer for my C++ code to help me clean it up and add features. When I run the static analyzer on the 165 C++ source files using the Boehm GC it blows up.

At the end of processing 165 source files the process consumes about 30GB (gigabytes) of memory.

So I run the static analyzer on one source file 10 times, use the code that Peter sent me to walk the reachable objects, and add up their memory footprint. There are a handful of objects that don’t have valid headers - I filter them out and currently don’t count them. I find that the total reachable memory used by the system stays fairly constant, between 34 MB and 62 MB (it spikes and then drops back).

But the total memory used by my process goes up and up and up very quickly!

I ran the OS X program “heap” on the process and it reports “non-object” heap memory - which I assume is allocated by the Boehm GC (this assumption may be wrong - please correct me if you know better).

Here is a graph that shows all of the memory usage each time after parsing/generating an AST/searching the AST for the same C++ file. A bit of explanation: the Y axis is log(bytes). The red points are what “heap” reports as “non-object” memory - I assume this is total Boehm memory. The green points are the total reachable Boehm memory that has a valid header that I built. The blue points are the reachable Boehm memory that was allocated just since the start of loading the most recent C++ file. I measure this by writing an integer marker into the header of each newly allocated object and changing that marker each time I read the source file - this lets me track which objects were allocated since the last operation.

Graph link: http://imgur.com/tSN5jRL



This can’t be memory fragmentation, can it?
Is Boehm not reusing memory properly?
Am I not configuring Boehm properly?
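
A bare sketch of the generation-marker trick described above, with Header as a hypothetical stand-in for the real self-describing object header:

#include <cstdint>

// Hypothetical stand-in for the real object header.
struct Header {
    std::uint32_t kind;
    std::uint32_t generation;  // stamped at allocation time
};

static std::uint32_t g_current_generation = 0;

// Bump once per source file processed.
void next_generation() { ++g_current_generation; }

// Stamp each freshly allocated object's header.
void stamp(Header* h) { h->generation = g_current_generation; }

// In the reachable-object callback: true only for objects allocated
// since the last bump.
bool allocated_this_pass(const Header* h)
{
    return h->generation == g_current_generation;
}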
Bruce Hoult
2014-06-07 01:04:07 UTC
How are you reading the C++ files? Are they, or anything else, GC objects
larger than 4 KB?



Christian Schafmeister
2014-06-07 03:54:31 UTC
Bruce,

The C++ files are read by the Clang library - they are used to construct C++ abstract syntax trees that I search using the ASTMatcher facilities of the Clang library. I can monitor clang/llvm allocations on OS X using the “heap” command line program because they are allocated using new/delete. They don’t appear to leak memory.

I have some large arrays (largest is 393K, see below) but I don’t think there are that many - I could be wrong on that. I’ll count how many are larger than 4K.

After parsing/generating-AST/matching features in the AST for the same C++ file 10 times
A summary of the memory usage according to the Boehm GC library routines...

Done walk of memory 1093 ClassKinds 82 LispKinds 18 ContainerKinds 1 StringKinds
TOTAL invalidHeaderTotalSize = 8682352
TOTAL memory usage (bytes): 34446496
TOTAL GC_get_heap_size() 352399360
TOTAL GC_get_free_bytes() 35569664
TOTAL GC_get_bytes_since_gc() 141379680
TOTAL GC_get_total_bytes() 316783433851
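
Those totals come straight from the public accessors named above; a minimal sketch that prints the same summary:

#include <stdio.h>
#include "gc.h"

// Prints the same counters as the summary above.
void print_gc_summary(void)
{
    printf("TOTAL GC_get_heap_size()      %zu\n", GC_get_heap_size());
    printf("TOTAL GC_get_free_bytes()     %zu\n", GC_get_free_bytes());
    printf("TOTAL GC_get_bytes_since_gc() %zu\n", GC_get_bytes_since_gc());
    printf("TOTAL GC_get_total_bytes()    %zu\n", GC_get_total_bytes());
}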

So the Boehm GC says it’s using 0.3GB (GC_get_heap_size()) but the OS X Activity monitor reports that the process is consuming 6.34GB of memory.

If I can believe the Boehm GC number then I’m leaking memory somewhere else.


An inventory of the largest reachable objects…

-------------------- Reachable StringKinds -------------------
strs: total_size: 5401416 count: 88618 avg.sz: 60 largest: 1024 N7gctools17GCString_moveableIcEE
-------------------- Reachable ContainerKinds -------------------
cont: total_size: 3569344 count: 47460 avg.sz: 75 largest: 393240 N7gctools17GCVector_moveableINS_9smart_ptrIN4core3T_OEEEEE
cont: total_size: 2359344 count: 2 avg.sz: 1179672 largest: 1179672 N7gctools17GCVector_moveableIN4core11CacheRecordEEE
cont: total_size: 177512 count: 5112 avg.sz: 34 largest: 88 N7gctools16GCArray_moveableINS_9smart_ptrIN4core3T_OEEELi0EEE
cont: total_size: 160584 count: 1336 avg.sz: 120 largest: 288 N7gctools17GCVector_moveableIN4core16RequiredArgumentEEE
cont: total_size: 54040 count: 73 avg.sz: 740 largest: 3296 N7gctools17GCVector_moveableINS_9smart_ptrIN4core8Symbol_OEEEEE
cont: total_size: 32256 count: 112 avg.sz: 288 largest: 288 N7gctools17GCVector_moveableIN4core16OptionalArgumentEEE
cont: total_size: 15736 count: 281 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableINS_9smart_ptrIN4core6Cons_OEEEEE
cont: total_size: 14784 count: 50 avg.sz: 295 largest: 672 N7gctools17GCVector_moveableIN4core15KeywordArgumentEEE
cont: total_size: 8216 count: 1 avg.sz: 8216 largest: 8216 N7gctools17GCVector_moveableIN4core14DynamicBindingEEE
cont: total_size: 3288 count: 1 avg.sz: 3288 largest: 3288 N7gctools17GCVector_moveableINS_9smart_ptrIN6clbind10ClassRep_OEEEEE
cont: total_size: 2168 count: 1 avg.sz: 2168 largest: 2168 N7gctools17GCVector_moveableIN10asttooling12_GLOBAL__N_112RegistryMaps27SymbolMatcherDescriptorPairEEE
cont: total_size: 1560 count: 1 avg.sz: 1560 largest: 1560 N7gctools17GCVector_moveableINS_9smart_ptrIN4core11Character_OEEEEE
cont: total_size: 1560 count: 1 avg.sz: 1560 largest: 1560 N7gctools17GCVector_moveableINS_9smart_ptrIN4core5Str_OEEEEE
cont: total_size: 1448 count: 1 avg.sz: 1448 largest: 1448 N7gctools17GCVector_moveableIN4core14ExceptionEntryEEE
cont: total_size: 808 count: 13 avg.sz: 62 largest: 136 N7gctools17GCVector_moveableINS_9smart_ptrIN4core9Package_OEEEEE
cont: total_size: 784 count: 14 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableIPN10asttooling8internal17MatcherDescriptorEEE
cont: total_size: 152 count: 1 avg.sz: 152 largest: 152 N7gctools17GCVector_moveableIN4core11AuxArgumentEEE
cont: total_size: 56 count: 1 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableINS_9smart_ptrIN4core16SourceFileInfo_OEEEEE
-------------------- Reachable LispKinds -------------------
lisp: total_size: 11301888 count: 353160 avg.sz: 32 N4core6Cons_OE
lisp: total_size: 2454176 count: 76693 avg.sz: 32 N4core5Str_OE
lisp: total_size: 1724448 count: 53889 avg.sz: 32 N4core8Bignum_OE
lisp: total_size: 1607040 count: 25110 avg.sz: 64 N4core10Instance_OE
lisp: total_size: 1295120 count: 32378 avg.sz: 40 N4core13BranchSNode_OE
lisp: total_size: 852768 count: 26649 avg.sz: 32 N4core11LeafSNode_OE
lisp: total_size: 657440 count: 16436 avg.sz: 40 N4core8Symbol_OE
lisp: total_size: 446408 count: 11160 avg.sz: 40 N4core15VectorObjects_OE
lisp: total_size: 418920 count: 10473 avg.sz: 40 N4core26VectorObjectsWithFillPtr_OE
lisp: total_size: 268912 count: 4802 avg.sz: 56 N4core18CompiledFunction_OE
lisp: total_size: 229760 count: 7180 avg.sz: 32 N4core14CompiledBody_OE
lisp: total_size: 203680 count: 5092 avg.sz: 40 N4core12ValueFrame_OE
lisp: total_size: 194112 count: 8088 avg.sz: 24 N4core8Fixnum_OE
lisp: total_size: 148064 count: 4627 avg.sz: 32 N4core16StrWithFillPtr_OE
lisp: total_size: 136064 count: 1546 avg.sz: 88 N4core19LambdaListHandler_OE
lisp: total_size: 96376 count: 1721 avg.sz: 56 N4core9BuiltIn_OE
lisp: total_size: 74272 count: 2320 avg.sz: 32 N4core15SourcePosInfo_OE
Bruce Hoult
2014-06-07 04:33:51 UTC
You also seem to have
two N7gctools17GCVector_moveableIN4core11CacheRecordEEE of 1.2 MB each.

So when you allocate one of those, the GC has to find 300 contiguous pages.
That's a lot.

If those two are all you ever have, then no problem. If you're constantly
allocating and GCing them then that's potentially a big big problem.

Unfortunately, in a non-moving memory manager there is no perfect algorithm
to decide when or whether to split up or combine large memory blocks. That
applies to both malloc and gc.

In my own code, I try not to allocate anything larger than a heap block. I
try to avoid very large arrays (including strings), and when I do need them
I use an implementation that uses lots of small blocks and an index.

That would probably be a big change to your code.
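
For illustration, a sketch of that kind of structure: a two-level vector whose chunks each fit in one 4 KB heap block (the names and chunk size here are hypothetical):

#include <cstddef>
#include <vector>

// Sketch: a "chunked" vector.  No single allocation exceeds one heap
// block, so the GC never has to find hundreds of contiguous pages.
template <typename T, std::size_t ChunkBytes = 4096>
class ChunkedVector {
    static const std::size_t N = ChunkBytes / sizeof(T);
    std::vector<std::vector<T>> chunks_;  // the "index" of small blocks
    std::size_t size_ = 0;
public:
    void push_back(const T& v) {
        if (size_ % N == 0) {             // current chunk full (or none yet)
            chunks_.emplace_back();
            chunks_.back().reserve(N);
        }
        chunks_.back().push_back(v);
        ++size_;
    }
    T& operator[](std::size_t i) { return chunks_[i / N][i % N]; }
    std::size_t size() const { return size_; }
};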

While you have a locally-modified version of the GC, may I suggest you go
to GC_allochblk_nth() and add "may_split = FALSE;" as the first statement
(overriding the parameter's passed value). Let me know if that makes any
difference and if so we can try something a little more subtle :)



Ivan Maidanski
2014-06-07 08:15:33 UTC
Hi Christian,
Do you use the latest gc (from any bdwgc repository)?
I remember an out-of-memory issue that was resolved by fixing splitting in GC_allochblk_nth. See commits ec5130a and 701f219, but they both relate to bdwgc compiled with USE_MUNMAP.
Regards,
Ivan
Bruce Hoult
2014-06-07 08:44:48 UTC
Personally, I think block splitting (or combining for that matter) is
inherently a bad thing to do.

The only use-case it really helps with is a program allocating huge
temporary things at startup and then never again. At the same time it
pessimises many other common use-patterns. It's one thing to have a single
1 MB unused hole laying around forever. It's quite another to repeatedly
turn 1 MB holes into 999 KB holes that aren't quite big enough the next
time the 1 MB object is allocated.

Anyway, I'm interested to hear from Christian if disabling it has any
effect.
Christian Schafmeister
2014-06-07 15:23:18 UTC
Bruce,

I made the following change to my GC_allochblk_nth() function - it crashes whenever I run with that change.

Here’s what happens and a backtrace - it is followed by a section of GC_allochblk_nth() where I made the modification.

(lldb) run
run
There is a running process, kill it and restart?: [Y/n] y
y
Process 70812 launched: '/Users/meister/Development/cando/clasp/build/cando.app/Contents/MacOS/clasp_boehm_d' (x86_64)
Too many heap sections: Increase MAXHINCR or MAX_HEAP_SECTS
Process 70812 stopped
* thread #1: tid = 0x98121, 0x00007fff8ad77866 libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGABRT
frame #0: 0x00007fff8ad77866 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill + 10:
-> 0x7fff8ad77866: jae 0x7fff8ad77870 ; __pthread_kill + 20
0x7fff8ad77868: movq %rax, %rdi
0x7fff8ad7786b: jmpq 0x7fff8ad74175 ; cerror_nocancel
0x7fff8ad77870: ret
(lldb) bt
bt
* thread #1: tid = 0x98121, 0x00007fff8ad77866 libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGABRT
* frame #0: 0x00007fff8ad77866 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff8d71635c libsystem_pthread.dylib`pthread_kill + 92
frame #2: 0x00007fff9265bb1a libsystem_c.dylib`abort + 125
frame #3: 0x0000000105e3badd libgc.1.dylib`GC_abort + 157
frame #4: 0x0000000105e2c4ad libgc.1.dylib`GC_add_to_heap + 45
frame #5: 0x0000000105e2c9bc libgc.1.dylib`GC_expand_hp_inner + 556
frame #6: 0x0000000105e2cd0e libgc.1.dylib`GC_collect_or_expand + 478
frame #7: 0x0000000105e2cfc9 libgc.1.dylib`GC_allocobj + 345
frame #8: 0x0000000105e33f09 libgc.1.dylib`GC_generic_malloc_inner + 345
frame #9: 0x0000000105e33ec4 libgc.1.dylib`GC_generic_malloc_inner + 276
frame #10: 0x0000000105e340d4 libgc.1.dylib`GC_generic_malloc + 148
frame #11: 0x0000000105e347a0 libgc.1.dylib`GC_malloc_uncollectable + 384
frame #12: 0x0000000100b4087a clasp_boehm_d`core::Lisp_O* gctools::allocateRootClass<core::Lisp_O>() + 26 at /Users/meister/Development/cando/clasp/src/main/../../src/gctools/gcalloc.h:272
frame #13: 0x0000000100b0556b clasp_boehm_d`core::Lisp_O::createLispEnvironment(bool, int, int) + 27 at /Users/meister/Development/cando/clasp/src/main/../../src/core/lisp.cc:327
frame #14: 0x0000000100b2ed79 clasp_boehm_d`LispHolder + 73 at /Users/meister/Development/cando/clasp/src/main/../../src/core/lisp.cc:3782
frame #15: 0x0000000100b2ed21 clasp_boehm_d`LispHolder + 49 at /Users/meister/Development/cando/clasp/src/main/../../src/core/lisp.cc:3783
frame #16: 0x000000010000233a clasp_boehm_d`startup(int, char**, bool&, int&, int&) + 90 at /Users/meister/Development/cando/clasp/src/main/main.cc:74
frame #17: 0x000000010000311a clasp_boehm_d`main + 314 at /Users/meister/Development/cando/clasp/src/main/main.cc:199
(lldb)



Here’s the change I made:


STATIC struct hblk *
GC_allochblk_nth(size_t sz, int kind, unsigned flags, int n,
                 GC_bool may_split)
{
    struct hblk *hbp;
    hdr * hhdr;                /* Header corr. to hbp */
                               /* Initialized after loop if hbp !=0 */
                               /* Gcc uninitialized use warning is bogus. */
    struct hblk *thishbp;
    hdr * thishdr;             /* Header corr. to thishbp */
    signed_word size_needed;   /* number of bytes in requested objects */
    signed_word size_avail;    /* bytes available in this block */

    may_split = FALSE; // meister modification suggested by Bruce Hoult

    size_needed = HBLKSIZE * OBJ_SZ_TO_BLOCKS(sz);

    /* search for a big enough block in free list */
    hbp = GC_hblkfreelist[n];
    for (; 0 != hbp; hbp = hhdr -> hb_next) {
        GET_HDR(hbp, hhdr);
        size_avail = hhdr->hb_sz;
        if (size_avail < size_needed) continue;
        if (size_avail != size_needed) {
            signed_word next_size;
            ...
You also seem to have two N7gctools17GCVector_moveableIN4core11CacheRecordEEE of 1.2 MB each.
So when you allocate one of those, the GC has to find 300 contiguous pages. That's a lot.
If those two are all you ever have, then no problem. If you're constantly allocating and GCing them then that's potentially a big big problem.
Unfortunately, in a non-moving memory manager there is no perfect algorithm to decide when or whether to split up or combine large memory blocks. That applies to both malloc and gc.
In my own code, I try not to allocate anything larger than a heap block. I try to avoid very large arrays (including strings), and when I do need them I use an implementation that uses lots of small blocks and an index.
That would probably be a big change to your code.
While you have a locally-modified version of the GC, may I suggest you go to GC_allochblk_nth() and add "may_split = FALSE;" as the first statement (overriding the parameter's passed value). Let me know if that makes any difference and if so we can try something a little more subtle :)
Bruce,
The C++ files are read by the Clang library - they are used to construct C++ abstract syntax trees that I search using the ASTMatcher facilities of the Clang library. I can monitor clang/llvm allocations on OS X using the “heap” command line program because they are allocated using new/delete. They don’t appear to leak memory.
I have some large arrays (largest is 393K, see below) but I don’t think that many - I could be wrong on that. I’ll count how many are larger than 4K.
After parsing/generating-AST/matching features in the AST for the same C++ file 10 times
A summary of the memory usage according to the Boehm GC library routines...
Done walk of memory 1093 ClassKinds 82 LispKinds 18 ContainerKinds 1 StringKinds
TOTAL invalidHeaderTotalSize = 8682352
TOTAL memory usage (bytes): 34446496
TOTAL GC_get_heap_size() 352399360
TOTAL GC_get_free_bytes() 35569664
TOTAL GC_get_bytes_since_gc() 141379680
TOTAL GC_get_total_bytes() 316783433851
So the Boehm GC says it’s using 0.3GB (GC_get_heap_size()) but the OS X Activity monitor reports that the process is consuming 6.34GB of memory.
If I can believe the Boehm GC number then I’m leaking memory somewhere else.
An inventory of the largest reachable objects…
-------------------- Reachable StringKinds -------------------
strs: total_size: 5401416 count: 88618 avg.sz: 60 largest: 1024 N7gctools17GCString_moveableIcEE
-------------------- Reachable ContainerKinds -------------------
cont: total_size: 3569344 count: 47460 avg.sz: 75 largest: 393240 N7gctools17GCVector_moveableINS_9smart_ptrIN4core3T_OEEEEE
cont: total_size: 2359344 count: 2 avg.sz: 1179672 largest: 1179672 N7gctools17GCVector_moveableIN4core11CacheRecordEEE
cont: total_size: 177512 count: 5112 avg.sz: 34 largest: 88 N7gctools16GCArray_moveableINS_9smart_ptrIN4core3T_OEEELi0EEE
cont: total_size: 160584 count: 1336 avg.sz: 120 largest: 288 N7gctools17GCVector_moveableIN4core16RequiredArgumentEEE
cont: total_size: 54040 count: 73 avg.sz: 740 largest: 3296 N7gctools17GCVector_moveableINS_9smart_ptrIN4core8Symbol_OEEEEE
cont: total_size: 32256 count: 112 avg.sz: 288 largest: 288 N7gctools17GCVector_moveableIN4core16OptionalArgumentEEE
cont: total_size: 15736 count: 281 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableINS_9smart_ptrIN4core6Cons_OEEEEE
cont: total_size: 14784 count: 50 avg.sz: 295 largest: 672 N7gctools17GCVector_moveableIN4core15KeywordArgumentEEE
cont: total_size: 8216 count: 1 avg.sz: 8216 largest: 8216 N7gctools17GCVector_moveableIN4core14DynamicBindingEEE
cont: total_size: 3288 count: 1 avg.sz: 3288 largest: 3288 N7gctools17GCVector_moveableINS_9smart_ptrIN6clbind10ClassRep_OEEEEE
cont: total_size: 2168 count: 1 avg.sz: 2168 largest: 2168 N7gctools17GCVector_moveableIN10asttooling12_GLOBAL__N_112RegistryMaps27SymbolMatcherDescriptorPairEEE
cont: total_size: 1560 count: 1 avg.sz: 1560 largest: 1560 N7gctools17GCVector_moveableINS_9smart_ptrIN4core11Character_OEEEEE
cont: total_size: 1560 count: 1 avg.sz: 1560 largest: 1560 N7gctools17GCVector_moveableINS_9smart_ptrIN4core5Str_OEEEEE
cont: total_size: 1448 count: 1 avg.sz: 1448 largest: 1448 N7gctools17GCVector_moveableIN4core14ExceptionEntryEEE
cont: total_size: 808 count: 13 avg.sz: 62 largest: 136 N7gctools17GCVector_moveableINS_9smart_ptrIN4core9Package_OEEEEE
cont: total_size: 784 count: 14 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableIPN10asttooling8internal17MatcherDescriptorEEE
cont: total_size: 152 count: 1 avg.sz: 152 largest: 152 N7gctools17GCVector_moveableIN4core11AuxArgumentEEE
cont: total_size: 56 count: 1 avg.sz: 56 largest: 56 N7gctools17GCVector_moveableINS_9smart_ptrIN4core16SourceFileInfo_OEEEEE
-------------------- Reachable LispKinds -------------------
lisp: total_size: 11301888 count: 353160 avg.sz: 32 N4core6Cons_OE
lisp: total_size: 2454176 count: 76693 avg.sz: 32 N4core5Str_OE
lisp: total_size: 1724448 count: 53889 avg.sz: 32 N4core8Bignum_OE
lisp: total_size: 1607040 count: 25110 avg.sz: 64 N4core10Instance_OE
lisp: total_size: 1295120 count: 32378 avg.sz: 40 N4core13BranchSNode_OE
lisp: total_size: 852768 count: 26649 avg.sz: 32 N4core11LeafSNode_OE
lisp: total_size: 657440 count: 16436 avg.sz: 40 N4core8Symbol_OE
lisp: total_size: 446408 count: 11160 avg.sz: 40 N4core15VectorObjects_OE
lisp: total_size: 418920 count: 10473 avg.sz: 40 N4core26VectorObjectsWithFillPtr_OE
lisp: total_size: 268912 count: 4802 avg.sz: 56 N4core18CompiledFunction_OE
lisp: total_size: 229760 count: 7180 avg.sz: 32 N4core14CompiledBody_OE
lisp: total_size: 203680 count: 5092 avg.sz: 40 N4core12ValueFrame_OE
lisp: total_size: 194112 count: 8088 avg.sz: 24 N4core8Fixnum_OE
lisp: total_size: 148064 count: 4627 avg.sz: 32 N4core16StrWithFillPtr_OE
lisp: total_size: 136064 count: 1546 avg.sz: 88 N4core19LambdaListHandler_OE
lisp: total_size: 96376 count: 1721 avg.sz: 56 N4core9BuiltIn_OE
lisp: total_size: 74272 count: 2320 avg.sz: 32 N4core15SourcePosInfo_OE
How are you reading the C++ files? Are they, or anything else, GC objects larger than 4 KB?
Apologies if this is a duplicate - I ran into trouble with the size due to a graph I included.
I used the code that Peter sent me as-is - it works very well - thanks!
I’m writing a Common Lisp implementation that interoperates with C++ and uses LLVM as the back end. I’ve written a static analyzer for my C++ code to help me clean it up and add features. When I run the static analyzer on the 165 C++ source files using the Boehm GC it blows up.
At the end of processing 165 source files the process consumes about 30GB (gigabytes) of memory.
So I run the static analyzer on one source file 10 times, use the code that Peter sent me to walk the reachable objects, and add up their memory footprint. There are a handful of objects that don’t have valid headers - I filter them out and currently don’t count them. I find that the total reachable memory used by the system stays in a fairly narrow band, spiking to 62MB and settling back to 34MB.
But the total memory used by my process goes up and up and up very quickly!
I ran the OS X program “heap” on the process and it reports “non-object” heap memory - which I assume is allocated by the Boehm GC (this assumption may be wrong - please correct me if you know better).
Here is a graph that shows all of the memory usage each time after parsing/generating an AST/searching the AST for the same C++ file. A bit of explanation:
- The Y axis is log(bytes).
- The red points are what “heap” reports as “non-object” memory - I assume this is total Boehm memory.
- The green points are the total reachable Boehm memory that has a valid header.
- The blue points are the reachable Boehm memory that was allocated just since the start of loading the most recent C++ file. I measure this by writing an integer marker into the header of each newly allocated object and changing that marker each time I read a source file - this lets me track which objects were allocated since the last operation (see the sketch below).
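A sketch of that marker scheme, assuming a hypothetical per-object header word - ObjectHeader, next_generation, and the other names are illustrative, not part of the GC API:

#include <cstdint>

// Hypothetical object header: every GC-allocated object starts with one
// of these, so a heap walker can bucket objects by allocation "generation".
struct ObjectHeader {
    uint32_t kind;        // self-describing class tag
    uint32_t generation;  // stamped at allocation time
};

static uint32_t current_generation = 0;

// Called once per source file processed: everything allocated afterwards
// carries the new stamp.
void next_generation() { ++current_generation; }

// Called from the allocation path, right after the raw block is obtained.
void stamp_header(ObjectHeader* hdr, uint32_t kind) {
    hdr->kind = kind;
    hdr->generation = current_generation;
}

// During the reachable-object walk: true if this object was allocated
// since the most recent call to next_generation().
bool allocated_since_last_operation(const ObjectHeader* hdr) {
    return hdr->generation == current_generation;
}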
Graph link: http://imgur.com/tSN5jRL
This can’t be memory fragmentation, can it?
Is Boehm not reusing memory properly?
Am I not configuring Boehm properly?
Christian Schafmeister
2014-06-07 01:11:07 UTC
Permalink
I might also be leaking memory some other way. I’m trying to figure it out now.

Is the Boehm function GC_get_heap_size() an accurate measure of the size of the heap allocated by Boehm?

Best,

.Chris.
Bruce Hoult
2014-06-07 01:20:01 UTC
Permalink
I'm not sure whether that function you're using will see
GC_malloc_uncollectable things. I seem to recall that the mark bits aren't
set for those, maybe? (I'm away from the computer at the moment)

The reason I ask about large objects is that objects larger than 4 KB (the GC heap block size) require finding contiguous heap blocks. Finding 2 or 3 contiguous blocks isn't usually a problem, but if you need dozens or hundreds of contiguous blocks then that can require allocating new memory. If that object is then freed, and the GC uses a block or two of that space for small objects ... and then you allocate a large object again ... that space isn't contiguous any more and another region has to be allocated.

That's a very fast way to go through VM...
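For illustration, a minimal sketch of that failure mode - the sizes are made up, and it assumes a standard bdwgc install with the <gc.h> header; whether the heap actually grows depends on the collector's splitting decisions, which is exactly what the may_split experiment above probes:

#include <gc.h>
#include <cstdio>

// Alternate a ~1.2 MB allocation (about 300 contiguous heap blocks) with
// bursts of small allocations. If a small object lands inside a freed
// large run, the next large request can't reuse that run and the heap
// has to grow instead.
int main() {
    GC_INIT();
    for (int i = 0; i < 100; ++i) {
        void* big = GC_MALLOC(1200000);   // needs one contiguous run
        for (int j = 0; j < 1000; ++j)
            (void)GC_MALLOC(64);          // may land in previously freed runs
        (void)big;                        // silence unused warning; big dies
                                          // at the end of each iteration
        std::printf("iter %3d heap = %lu bytes\n",
                    i, (unsigned long)GC_get_heap_size());
    }
    return 0;
}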

There's a setting (USE_ENTIRE_HEAP? Something like that) which should be off if you don't want this to happen - i.e. it leaves the space from GC'd large objects intact in case you allocate another one.

