[dev.icinga.com #11534] DowntimesExpireTimerHandler crashes Icinga2 with <unknown function> #4095
Comments
Updated by mfriedrich on 2016-04-06 15:58:03 00:00
Can you please install gdb and generate a backtrace once this issue happens again? Thanks. |
Updated by PowellEB on 2016-04-06 17:34:46 00:00 gdb is now installed. We will get a backtrace when it happens again. For resolving the downtime inconsistency, is this the correct direction? |
Updated by mfriedrich on 2016-04-06 17:38:07 00:00 The internal_downtime_id problem is discussed in #11382 and a possible fix is available through the snapshot packages. Please test them. Kind regards, |
Updated by mfriedrich on 2016-04-07 08:25:51 00:00
|
Updated by gbeutner on 2016-04-12 09:40:20 00:00 This might actually be a duplicate of #11559. |
Updated by gbeutner on 2016-04-12 09:40:50 00:00
|
Updated by gbeutner on 2016-04-12 10:06:07 00:00
|
Updated by gbeutner on 2016-04-12 10:10:03 00:00
Applied in changeset 974ca9f. |
Updated by gbeutner on 2016-04-20 06:35:34 00:00
|
Updated by gbeutner on 2016-04-20 08:15:55 00:00
|
Updated by mfriedrich on 2016-05-02 13:27:40 00:00
|
Updated by mfriedrich on 2016-05-02 13:28:17 00:00 I guess there is a problem with this patch for expiring downtimes. The check for active objects looks like this
but the downtime class overrides that function with the same signature.
That way the patch does not work. I'll create a follow-up issue for that. |
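The code snippet from the comment above did not survive on this page, and it is not reconstructed here. As a language-neutral illustration of the kind of problem described (a subclass overriding a base-class check with the same signature but a different meaning), here is a minimal Python sketch. All class, method, and attribute names are hypothetical and are not taken from the Icinga 2 source.

```python
class ConfigObject:
    """Hypothetical base class: 'active' means the object is activated."""

    def __init__(self):
        self.activated = True

    def is_active(self):
        return self.activated


class Downtime(ConfigObject):
    """Hypothetical subclass: 'active' means the downtime is in effect."""

    def __init__(self, in_effect):
        super().__init__()
        self.in_effect = in_effect

    def is_active(self):
        # Same name and signature as the base method, different meaning.
        return self.in_effect


def should_skip(obj):
    # Intended meaning: skip objects that are not activated.
    # For a Downtime this silently asks "is the downtime in effect?" instead.
    return not obj.is_active()


expired = Downtime(in_effect=False)
# The object is activated, yet the generic guard decides to skip it.
skipped = should_skip(expired)
# Calling the base implementation explicitly gives the intended answer.
base_answer = ConfigObject.is_active(expired)
```

In this sketch a guard written against the base-class meaning gets the subclass answer, which is one way a patch built on such a check can fail to behave as intended.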
Updated by mfriedrich on 2016-05-02 13:34:11 00:00
|
This issue has been migrated from Redmine: https://dev.icinga.com/issues/11534
Created by PowellEB on 2016-04-05 19:19:12 00:00
Assignee: gbeutner
Status: Resolved (closed on 2016-04-12 10:10:03 00:00)
Target Version: 2.4.5
Last Update: 2016-05-02 13:28:17 00:00 (in Redmine)
Moved from an old Icinga 2 (2.4.4) installation on CentOS to new hardware running Ubuntu 14.04 (2.4.4-1) with a two-node cluster.
Since the database on Ubuntu was new, we dumped all old downtimes (author_name, downtime_type, comment_data, scheduled_start_time, scheduled_end_time, name).
We added the old downtimes to the new database via external command (icinga2.cmd). All went in fine.
40 minutes later, icinga2 #01 crashed. Two of the crash reports are attached, and one is included in the message below.
We could not get #01 to start. We ran a config check on icinga2 #02; the check failed, and then icinga2 #02 also crashed.
It looked like icinga2 had processed (added) downtimes that had already ended; when DowntimesExpireTimerHandler then tried
to expire them, it crashed and caused icinga2 to stop.
The only way to get the config check to pass and run icinga2 was to remove the downtimes from /var/lib/icinga2/api/packages/_api/....../conf.d/downtimes
Both nodes are running now. However, we cannot delete old downtimes now.
After some time, checking icinga_downtimehistory shows that there are now duplicate "internal_downtime_id" entries.
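To confirm which ids are duplicated, a small helper along these lines could be run over rows exported from icinga_downtimehistory. This is only a sketch: it assumes the rows have already been loaded as dictionaries, and the function name and the sample data are hypothetical.

```python
from collections import Counter


def find_duplicate_ids(rows, key="internal_downtime_id"):
    """Return the values of `key` that occur more than once, sorted ascending."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample rows standing in for an export of icinga_downtimehistory.
sample_rows = [
    {"internal_downtime_id": 1, "name": "a"},
    {"internal_downtime_id": 2, "name": "b"},
    {"internal_downtime_id": 2, "name": "c"},
    {"internal_downtime_id": 3, "name": "d"},
]
duplicates = find_duplicate_ids(sample_rows)
```

The helper only reports the duplicated ids; it does not change anything in the database.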
(1) Reporting the issue; crash info is below and in the attachments.
(2) How do we repair the current inconsistencies for the API and icinga_downtimehistory?
We assume the following steps:
dump all downtimes from icinga_downtimehistory
clear ...../conf.d/downtimes on both nodes of icinga2
truncate icinga_downtimehistory
add downtimes back via external command (ensuring no incoming downtimes are expired)
If this works, the remaining question is how to reset the "internal_downtime_id" counter so we do not get duplicate ids again.
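The last step of the plan above (re-adding downtimes while skipping any that have already ended) could be sketched roughly as follows. The dictionary keys (`host`, `service`, `start`, `end`, `author`, `comment`) and the function name are assumptions for illustration. The command strings follow the SCHEDULE_HOST_DOWNTIME and SCHEDULE_SVC_DOWNTIME external command layout, and every downtime is treated as fixed with no trigger, which may not match the original data. The sketch does not touch the "internal_downtime_id" counter.

```python
import time


def build_downtime_commands(downtimes, now=None):
    """Build external command lines, skipping downtimes that already ended."""
    now = int(time.time()) if now is None else int(now)
    commands = []
    for d in downtimes:
        start, end = int(d["start"]), int(d["end"])
        if end <= now:
            # Already expired: do not hand it to the daemon again.
            continue
        if d.get("service"):
            target = "SCHEDULE_SVC_DOWNTIME;{};{}".format(d["host"], d["service"])
        else:
            target = "SCHEDULE_HOST_DOWNTIME;{}".format(d["host"])
        # Assumed values: fixed=1, trigger_id=0, duration=end-start.
        commands.append(
            "[{}] {};{};{};1;0;{};{};{}".format(
                now, target, start, end, end - start, d["author"], d["comment"]
            )
        )
    return commands
```

Each returned line would then be written, followed by a newline, to the command pipe (icinga2.cmd). Reviewing the generated lines before submitting them keeps the re-import easy to audit.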
Text from Crash:
**
Application information:
Application version: r2.4.4-1
Installation root: /usr
Sysconf directory: /etc
Run directory: /run
Local state directory: /var
Package data directory: /usr/share/icinga2
State path: /var/lib/icinga2/icinga2.state
Modified attributes path: /var/lib/icinga2/modified-attributes.conf
Objects path: /var/cache/icinga2/icinga2.debug
Vars path: /var/cache/icinga2/icinga2.vars
PID path: /run/icinga2/icinga2.pid
System information:
Platform: Ubuntu
Platform version: 14.04, Trusty Tahr
Kernel: Linux
Kernel version: 3.13.0-24-generic
Architecture: x86_64
Stacktrace:
(0) libpthread.so.0: <unknown function> (+0x10340) [0x7f64286fe340]
(1) libc.so.6: gsignal (+0x39) [0x7f6427497f79]
(2) libc.so.6: abort (+0x148) [0x7f642749b388]
(3) libc.so.6: <unknown function> (+0x2fe36) [0x7f6427490e36]
(4) libc.so.6: <unknown function> (+0x2fee2) [0x7f6427490ee2]
(5) libicinga.so: <unknown function> (+0x184ec3) [0x7f6422f5dec3]
(6) libicinga.so: icinga::Downtime::RemoveDowntime(icinga::String const&, bool, bool, boost::intrusive_ptr<icinga::MessageOrigin> const&) (+0x6e7) [0x7f6422f78c17]
(7) libicinga.so: icinga::Downtime::DowntimesExpireTimerHandler() (+0x359) [0x7f6422f79a69]
(8) libbase.so: boost::signals2::detail::signal_impl<void (boost::intrusive_ptr<icinga::Timer> const&), boost::signals2::optional_last_value<void>, int, std::less<int>, boost::function<void (boost::intrusive_ptr<icinga::Timer> const&)>, boost::function<void (boost::signals2::connection const&, boost::intrusive_ptr<icinga::Timer> const&)>, boost::signals2::mutex>::operator()(boost::intrusive_ptr<icinga::Timer> const&) (+0x1cc) [0x7f6428473f6c]
(9) libbase.so: icinga::Timer::Call() (+0x29) [0x7f64284216a9]
(10) libbase.so: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x326) [0x7f642841e496]
(11) libboost_thread.so.1.54.0: <unknown function> (+0xba4a) [0x7f6428d89a4a]
(12) libpthread.so.0: <unknown function> (+0x8182) [0x7f64286f6182]
(13) libc.so.6: clone (+0x6d) [0x7f642755c30d]
*****
Failed to launch GDB: No such file or directory
**
Attachments
Changesets
2016-04-12 10:05:43 00:00 by gbeutner 974ca9f
2016-04-20 08:09:34 00:00 by gbeutner 159681c
Relations: