lcache coring with ndsd becoming unresponsive

  • 7023255
  • 09-Aug-2018
  • 09-Aug-2018

Environment

eDirectory 9.0.4

novell-AUDTedirinst NetIQ 9.0.4-0 
novell-AUDTplatformagent Novell 2.0.2-81

/etc/logevent.conf 
LogHost=xx.xx.xx.xx 
LogCachePort=1288 
LogEnginePort=1289 
LogCacheDir=/opt/novell/idm/naudit/cache
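
When troubleshooting, it can help to read these settings back from the file to confirm which host and ports the platform agent is using. The following is a minimal sketch, assuming the simple KEY=value layout shown above. The helper name logevent_value is hypothetical and is not part of any Novell or NetIQ tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one KEY=value setting from a
# logevent.conf-style file. Assumes one setting per line, as shown above.
logevent_value() {
    key=$1
    conf=${2:-/etc/logevent.conf}
    # Print what follows "KEY=" at the start of a line, strip trailing
    # whitespace, and keep only the first match.
    sed -n "s/^${key}=//p" "$conf" | sed 's/[[:space:]]*$//' | head -n 1
}

# Example: logevent_value LogCachePort
```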

Situation


The ndsstat command does not return and hangs.

/var/log/messages:

Error in `lcache': corrupted double-linked list: 0x00007fbf74b7c690 ***  

Errors in nproduct.log:

[Novell Audit Cache]: Authenticated to Server... 
[EndClientConnection]: Not Exiting thread due to STATE_ENDING for socket 0 
[Novell Audit Cache]: Server dropped the connection, Trying to connect again... 
[EndClientConnection]: Not Exiting thread due to STATE_ENDING for socket 0 
[Novell Audit Cache]: Server seems busy, wait for 5 Seconds and try again... 
[Novell Audit Cache]: Authenticated to Server... 
[EndClientConnection]: Not Exiting thread due to STATE_ENDING for socket 0 
[Novell Audit Cache]: Server dropped the connection, Trying to connect again... 
[EndClientConnection]: Not Exiting thread due to STATE_ENDING for socket 0 
[Novell Audit Cache]: Server seems busy, wait for 5 Seconds and try again.
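
To gauge how often the reconnect loop above occurs, the dropped-connection messages can be counted. This is a rough sketch only: the helper name count_dropped_connections is hypothetical, and the log path is passed as an argument because the location of nproduct.log is not stated here and may differ between installations.

```shell
#!/bin/sh
# Hypothetical helper: count how often lcache reported a dropped connection.
# The log file path is an argument; its location is an assumption of the caller.
count_dropped_connections() {
    log=$1
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'Server dropped the connection' "$log" || true
}

# Example: count_dropped_connections /path/to/nproduct.log
```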

The lcache process has a large number of open files:

lsof -p 12435   (where 12435 is the PID of the lcache process)
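
As an alternative to reading through the lsof output, the open descriptors can be counted directly. This is a minimal sketch, assuming a Linux host with /proc mounted. The helper name count_open_fds is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the open file descriptors of a process by
# listing /proc/<pid>/fd (Linux only; requires permission to read it).
count_open_fds() {
    pid=$1
    ls "/proc/${pid}/fd" 2>/dev/null | wc -l
}

# Example, assuming pidof is available and a single lcache process is running:
#   count_open_fds "$(pidof lcache)"
```

lsof also lists entries such as memory-mapped files and the working directory, so its line count is usually higher than the descriptor count shown here.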

Resolution

The platform agent (novell-AUDTplatformagent) had been updated, but the Sentinel collector had not.

Ensure that all components involved in event auditing are current for the version of eDirectory and Sentinel that is running.
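
On the eDirectory host, the installed audit package versions can be listed for comparison. This is a sketch, assuming an RPM-based system. The helper name filter_audit_packages is hypothetical, and the novell-AUDT prefix is taken from the package names listed under Environment. It covers only the packages on the eDirectory host; the Sentinel collector version must be checked on the Sentinel side.

```shell
#!/bin/sh
# Hypothetical helper: keep only the audit-related packages (novell-AUDT*)
# from a package list supplied on standard input.
filter_audit_packages() {
    grep -i '^novell-AUDT' || true
}

# If rpm is present, list installed packages as "name version-release"
# and keep only the audit-related ones.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE}\n' | filter_audit_packages || true
fi
```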

Find the latest Sentinel plugins at this location:


Cause

Connectivity issues between lcache and the Sentinel collector for eDirectory cause lcache to core and connections to fail, leaving a large number of open files.