torquedev@supercluster.org

[torquedev] Memory leak in pbs_mom Steve Snelgrove Wed Nov 14 00:06:58 2007


There has been a report of a memory leak in pbs_mom. This becomes noticeable after running many thousands of jobs.

Tests run under valgrind point to a problem in post_epilogue, in catch_child.c.

==12438== 317 bytes in 4 blocks are still reachable in loss record 14 of 29
==12438==    at 0x4021AA4: calloc (vg_replace_malloc.c:279)
==12438==    by 0x807BB7D: attrlist_alloc (attr_func.c:316)
==12438==    by 0x807BC21: attrlist_create (attr_func.c:378)
==12438==    by 0x807ADDB: encode_size (attr_fn_size.c:201)
==12438==    by 0x806308C: encode_used (requests.c:1981)
==12438==    by 0x804CE80: post_epilogue (catch_child.c:1040)
==12438==    by 0x8078063: scan_for_terminated (mom_start.c:459)
==12438==    by 0x805FE2E: main (mom_main.c:5756)



In this routine, post_epilogue, the variable preq is assigned from alloc_br twice, and there do not seem to be corresponding calls to free_br.

The other similar routines in this file all seem to clean up preq with the following sequence of code.

   free_br(preq);
   shutdown(sock,SHUT_RDWR);
   close_conn(sock);
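
To make the suspected pattern concrete, here is a minimal standalone sketch of the difference between a handler that pairs every allocation with a free and one that reuses the pointer without freeing. It is not the pbs_mom code: struct request, request_alloc, request_free, handler_clean and handler_leaky are hypothetical stand-ins for the batch request type, alloc_br, free_br and the routines in catch_child.c, and the outstanding counter exists only to make the imbalance visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a batch request; not the real pbs_mom struct. */
struct request {
    int type;
};

/* Number of requests allocated but not yet freed (illustration only). */
static int outstanding = 0;

/* Stand-in for alloc_br: allocate a zeroed request and count it. */
static struct request *request_alloc(int type)
{
    struct request *r = calloc(1, sizeof(*r));
    if (r != NULL) {
        r->type = type;
        outstanding++;
    }
    return r;
}

/* Stand-in for free_br: release a request and uncount it. */
static void request_free(struct request *r)
{
    if (r == NULL)
        return;
    assert(outstanding > 0);
    free(r);
    outstanding--;
}

/* Clean pattern: each allocation is freed before preq is reused.
 * Returns how many requests this call left outstanding. */
int handler_clean(void)
{
    int before = outstanding;
    struct request *preq = request_alloc(1);
    request_free(preq);
    preq = request_alloc(2);
    request_free(preq);
    return outstanding - before;
}

/* Suspected pattern: preq is assigned twice and never freed.
 * Returns how many requests this call left outstanding. */
int handler_leaky(void)
{
    int before = outstanding;
    struct request *preq = request_alloc(1);
    preq = request_alloc(2);  /* the first request is now unreachable */
    (void)preq;               /* the second is dropped on return */
    return outstanding - before;
}
```

If post_epilogue really behaves like handler_leaky, each finished job would leave a small amount of memory behind, which would fit a leak that only becomes noticeable after many thousands of jobs.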

I am still new to this code and am wondering if someone with more experience could look at this and see if this is a problem.

Thanks,
Steve

