Unexpected freeswitch startup failures

I posted this in a subforum and didn’t see any responses, sorry if this is considered a dupe but I thought I’d try in the general forum. If I’m not doing something wrong as to not see the errors in failures like this I’m surprised no one else has noticed. Thanks…

I’m using Debian 12 and running the current freeswitch packages from the signalwire repo.

I’ve got an issue where sometimes freeswitch won’t start at boot, and there’s nothing in the logs that indicates why it failed.

I don’t know why it’s failing at boot, but it’s easy to replicate the issue of freeswitch failing without any indication of why. It’s kind of contrived, but say for example /etc/freeswitch/freeswitch.xml doesn’t exist. freeswitch fails to start:

pbx-01 ~ # systemctl status freeswitch.service
× freeswitch.service - freeswitch
Loaded: loaded (/lib/systemd/system/freeswitch.service; enabled; preset: enabled)
Drop-In: /etc/systemd/system/freeswitch.service.d
└─nochown.conf, order.conf, realtime.conf
Active: failed (Result: exit-code) since Sat 2024-06-29 19:50:22 PDT; 30s ago
Duration: 52.239s
Process: 6000 ExecStart=/usr/bin/freeswitch -u ${USER} -g ${GROUP} -ncwait ${DAEMON_OPTS} (code=exited, s>
CPU: 44ms

Jun 29 19:50:22 pbx-01 systemd[1]: freeswitch.service: Scheduled restart job, restart counter is at 5.
Jun 29 19:50:22 pbx-01 systemd[1]: Stopped freeswitch.service - freeswitch.
Jun 29 19:50:22 pbx-01 systemd[1]: freeswitch.service: Start request repeated too quickly.
Jun 29 19:50:22 pbx-01 systemd[1]: freeswitch.service: Failed with result ‘exit-code’.
Jun 29 19:50:22 pbx-01 systemd[1]: Failed to start freeswitch.service - freeswitch.

There’s nothing in the freeswitch log, the last entry is from being shut down before this test:

2024-06-29 19:50:06.562165 99.27% [NOTICE] switch_loadable_module.c:1573 Deleting Say interface ‘en’

Obviously, in this case, it failed before logging was configured, so nothing in the logs is expected. But in other cases, such as when it randomly fails at boot, there is still nothing in the logs.

There’s also nothing in the journal:

pbx-01 ~ # journalctl -xeu freeswitch.service

A start job for unit freeswitch.service has finished with a failure.

The job identifier is 3988 and the job result is failed.

The only thing that provides any detail as to what failed is running freeswitch by hand rather than via the service and leaving out the -ncwait option:

pbx-01 ~ # /usr/bin/freeswitch -u freeswitch -g freeswitch -nonat
2024-06-29 19:58:00.500758 0.00% [INFO] switch_event.c:714 Activate Eventing Engine.
2024-06-29 19:58:00.511267 0.00% [WARNING] switch_event.c:685 Create additional event dispatch thread 0
2024-06-29 19:58:00.511497 0.00% [ERR] switch_xml.c:1439 Couldn’t open /etc/freeswitch/freeswitch.xml (No such file or directory)
Cannot Initialize [Cannot Open log directory or XML Root!]

If the problem is reproducible (I’ve run into a few other config issues that make this happen other than the sledgehammer of whacking a main config file) this is a solution. But when it fails to start randomly, it’s not.

From what I can tell, when started via systemd:

Jun 29 19:59:59 pbx-01 systemd[1]: Starting freeswitch.service - freeswitch…
Jun 29 19:59:59 pbx-01 freeswitch[6112]: 6113 Backgrounding.
Jun 29 20:00:03 pbx-01 freeswitch[6112]: FreeSWITCH[6112] Waiting for background process pid:6113 to be ready>
Jun 29 20:00:03 pbx-01 freeswitch[6112]: FreeSWITCH[6112] Waiting for background process pid:6113 to be ready>
Jun 29 20:00:03 pbx-01 freeswitch[6112]: FreeSWITCH[6112] System Ready pid:6113
Jun 29 20:00:03 pbx-01 systemd[1]: Started freeswitch.service - freeswitch.

The process systemd runs spawns another process, and when that process is ready, the original process exits with success, and things work. But if that child process fails, the original process fails, but without any reason why.

Am I missing something? Should the initial output as seen by running freeswitch by hand show up somewhere when running as a service? Presumably it’s that child process that knows about it but I can’t find anywhere it might log it.

Thanks much for any insight as to sorting out what’s broken when freeswitch fails to start early and there’s no evidence of why…

Id start by starting FreeSWITCH manually and see if any errors happen. malformed xml can cause FreeSWITCH to not start.

Thanks for the response. The malformed XML was just an example to demonstrate how the freeswitch service can fail to start but not provide any evidence as to why.

My specific problem is that sometimes freeswitch fails to start at boot. When I start it after boot, whether as a service or in the foreground manually, it works fine.

I need a way to make the freeswitch service log why it failed to start, otherwise I have no way to sort out why it occasionally won’t start at boot.

Unless I’m missing a way to do that, this seems to be a deficiency in the systemd service implementation for freeswitch. Ideally if it failed to start before it was in a position to generate a log entry in its own log, it would emit something via stderr that would be captured and displayed by systemctl status.

Thanks…

I agree with Len that starting FS manually (as user ‘freeswitch’ if you do it that way) tells you much of what is going on here. I’m not sure why it’s intermittent, unless it has to do with networking weirdness affecting FS binding to a network interface.

I offer a few semi-related suggestions for your systemd unit file that might help:

[Unit]
Wants=network-online.target
Requires=network.target local-fs.target
After=network.target network-online.target local-fs.target

[Service]
PIDFile=/usr/local/freeswitch/run/freeswitch.pid

TimeoutStartSec=45
TimeoutStopSec=45
; Restart=always will ALWAYS restart FreeSWITCH even when you wish to stop it to debug and configure it
Restart=on-failure

[Install]
WantedBy=multi-user.target

I’m pretty sure the lines under [Unit] are how they appear in the repository, but if not FS might start before those subsystems are ready.

The repo PIDFile= points to /var/run (which is really /run nowadays) which only allows write by root. I compile FS from source which ends up in /usr/local/freeswitch so I just point to the /run directory under there which is writeable by user ‘freeswitch’. This has nothing to do with your problem, but is useful to do.

The timeout lines prevent FS from being killed and restarted so rapidly as to make it useless to troubleshoot. That’s what systemd was complaining about, among other things.

The Restart= line allows you to stop FS without systemd starting it behind your back.

The WantedBy= line is what actually tells systemd to start FS at boot time IIRC.

I have an ancient wiki page that lists a number of steps after I build FS from scratch to set permissions and such that might be worthwhile to you:

Thanks for the thoughts.

I agree with Len that starting FS manually (as user ‘freeswitch’ if you do it that way) tells you much of what is going on here.

I agree with both of you that it seems the only way to find out why freeswitch fails to start sometimes is to run it manually. Which is a problem if it’s failing when you’re not in a position to do so. It really seems that the log output provided by freeswitch when run manually should also be passed to the systemd status when run as a service. That could be accomplished by simply dumping it to stderr on a failure.

I offer a few semi-related suggestions for your systemd unit file that might help:

The dependencies in the official freeswitch package service match those you list. The pid is set to:

PIDFile=/run/freeswitch/freeswitch.pid

which is a directory created with the right permissions to use:

$ cat /usr/lib/tmpfiles.d/freeswitch.conf
d /var/run/freeswitch 0755 freeswitch freeswitch - -

The timeout lines prevent FS from being killed and restarted so rapidly as to make it useless to troubleshoot. That’s what systemd was complaining about, among other things.

No, those lines control how long systemd will wait for a service to start or stop before it’s considered failed, they’ve got nothing to do with how long freeswitch waits after a service fails before it tries to restart it. To control the wait between restart tries you need to use RestartSec or RestartSteps/RestartMaxDelaySec.

The Restart= line allows you to stop FS without systemd starting it behind your back.

What do you mean by behind your back? If you explicitly shut it down via systemctl, it doesn’t randomly restart it, unless it’s activated by a matching socket service or something (in which case systemd will warn you when you shut it down). It only restarts it if it exits on its own. Telling it to only restart on failure just means it won’t auto restart if freeswitch exits unexpectly but with a 0 status code.

I’m sure there’s something simple going on behind the scenes. I guess I could just tweak the dependencies to make it start at the absolute end of the boot process and see if that clears up whatever race condition is happening. It would sure be nice if it could just tell me why it’s not starting though.

Thanks again…

Looking at the source, switch.c calls daemonize which closes stderr in the child process. If -ncwait is passed, the parent creates a pipe before spawning the child and waits for the child to send it a 1 to indicate it started.

In between the call to daemonize and the write to the pipe by the child, various things could go wrong which cause it to just write to stderr and then exit, so the parent knows it failed, but not why.

Given stderr is dropped in the child, these errors are dumped into /dev/null and there’s no way to know what they were.

If -ncwait is passed, couldn’t the child wait to drop stderr until it writes the success to the pipe, so any errors or failures would still be written to stderr and captured by systemd so they could be seen?

Hello,

I am investigating similar Issue. I have a debian 11 running freeswitch :
1.10.11~release~25~f24064f7c9~bullseye-1~bullseye+1

randomly service won’t start and is looping for ever.
Here is systemd logs :

-- Boot 37a2da1b8ca940c58c80ae8f5d6cfff5 --
Jan 15 17:58:14 server systemd[1]: Starting freeswitch...
Jan 15 17:58:15 server freeswitch[838]: 847 Backgrounding.
Jan 15 17:59:15 server systemd[1]: freeswitch.service: start operation timed out. Terminating.
Jan 15 17:59:15 server systemd[1]: freeswitch.service: Control process exited, code=exited, status=1/FAILURE
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server systemd[1]: freeswitch.service: Failed with result 'timeout'.
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Waiting for background process pid:847 to be ready.....
Jan 15 17:59:15 server freeswitch[838]: FreeSWITCH[838] Error starting system! pid:847
Jan 15 17:59:15 server systemd[1]: Failed to start freeswitch.
Jan 15 17:59:15 server systemd[1]: freeswitch.service: Scheduled restart job, restart counter is at 1.
Jan 15 17:59:15 server systemd[1]: Stopped freeswitch.
Jan 15 17:59:15 server systemd[1]: Starting freeswitch...
Jan 15 17:59:15 server freeswitch[998]: 999 Backgrounding.

Honestly I don’t understand the random part, sometimes freeswitch just start without any issue.
I see that systemd unit timeout I tried bigger value (from 45s to 60s) without any effect.

Any idea on how to takle this issue ?

thanks !

What does your freeswitch log say – what you put in the Email is not
Probably reading that will tell you whhat is going on.
your freeswitch log.

1 Like

As I already explained earlier in the thread, there are no errors in the freeswitch logs. When starting freeswitch in the background, stderr is directed to /dev/null. Any errors that occur from that point until the logging subsystem is initialized can not be seen and potentially just cause freeswitch to fail to start with no way to determine why 8-/. The only way to see this type of error is to start freeswitch in the foreground, so those errors are sent to stderr instead of /dev/null.

You can’t really do that when freeswitch randomly fails to start at boot…

Ideally the background startup code could be fixed so that these errors are sent to stderr so systemd would log them.

1 Like

I was able to reproduce the error from CLI, starting just get stuck :

root@server~  /usr/bin/freeswitch -u freeswitch -g freeswitch  -nonat
2025-01-17 11:02:35.916219 0.00% [INFO] switch_event.c:714 Activate Eventing Engine.
2025-01-17 11:02:35.926411 0.00% [WARNING] switch_event.c:685 Create additional event dispatch thread 0
2025-01-17 11:02:35.934228 0.00% [ERR] switch_xml.c:1395 Error including /etc/freeswitch/lang/en/demo/*.xml
2025-01-17 11:02:35.956793 0.00% [CONSOLE] switch_core.c:1593 Created ip list lan default (allow)
2025-01-17 11:02:35.956831 0.00% [CONSOLE] switch_core.c:1593 Created ip list domains default (deny)
2025-01-17 11:02:35.957411 0.00% [CONSOLE] switch_core.c:2468 Bringing up environment.
2025-01-17 11:02:35.957429 0.00% [CONSOLE] switch_core.c:2469 Loading Modules.
2025-01-17 11:02:35.957801 0.00% [CONSOLE] switch_time.c:1576 Clock calibration disabled.
2025-01-17 11:02:35.957815 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [CORE_SOFTTIMER_MODULE]
2025-01-17 11:02:35.957862 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [CORE_PCM_MODULE]
2025-01-17 11:02:35.957932 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [CORE_SPEEX_MODULE]
2025-01-17 11:02:35.957954 0.00% [CONSOLE] switch_loadable_module.c:2196 open of pre_load_modules.conf failed
2025-01-17 11:02:35.963395 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [CORE_VPX_MODULE]
2025-01-17 11:02:35.963670 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_console]
2025-01-17 11:02:35.964328 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_enum]
2025-01-17 11:02:35.964530 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_cdr_csv]
2025-01-17 11:02:35.964700 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_event_socket]
2025-01-17 11:02:35.969178 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_snmp]
2025-01-17 11:02:37.470507 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_sofia]
2025-01-17 11:02:37.470924 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_loopback]
2025-01-17 11:02:37.471650 0.00% [CONSOLE] mod_signalwire.c:1020 Welcome to
     _____ _                   ___       ___
    / ___/(_)___ _____  ____ _/ / |     / (_)_______
    \__ \/ / __ `/ __ \/ __ `/ /| | /| / / / ___/ _ \
   ___/ / / /_/ / / / / /_/ / / | |/ |/ / / /  /  __/
  /____/_/\__, /_/ /_/\__,_/_/  |__/|__/_/_/   \___/
         /____/
2025-01-17 11:02:37.471682 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_signalwire]
2025-01-17 11:02:37.472298 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_commands]
2025-01-17 11:02:37.473881 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_curl]
2025-01-17 11:02:37.474328 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_db]
2025-01-17 11:02:37.474720 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_dptools]
2025-01-17 11:02:37.476240 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_expr]
2025-01-17 11:02:37.478509 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_fifo]
2025-01-17 11:02:37.478790 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_hash]
2025-01-17 11:02:37.479030 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_httapi]
2025-01-17 11:02:37.479212 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_dialplan_xml]
2025-01-17 11:02:37.479364 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_dialplan_asterisk]
2025-01-17 11:02:37.500315 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_spandsp]
2025-01-17 11:02:37.500870 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_g723_1]
2025-01-17 11:02:37.501034 0.00% [CONSOLE] switch_loadable_module.c:1772 Successfully Loaded [mod_g729]

my command is still running and time is 11h06 currently so that is why systemd timeouts.
Any idea on how to get more logs ? should I try to strace the process ?

thanks for your help

I don’t know what I am looking for but if this help somehow I run a strace against the stuck process

# strace -f -s 200 -p 1194
[pid  1214] fcntl(35, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=1073741824, l_len=1}) = 0
[pid  1215] fcntl(36, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0} <unfinished ...>
[pid  1214] stat("/var/lib/freeswitch/db/sofia_reg_internal.db-journal",  <unfinished ...>
[pid  1215] <... fcntl resumed>)        = 0
[pid  1214] <... stat resumed>0x7f5c202ad650) = -1 ENOENT (No such file or directory)
[pid  1214] pread64(35, "\0\0\0U\0\0\0S\0\0\0\0\0\0\0\0", 16, 24) = 16
[pid  1214] stat("/var/lib/freeswitch/db/sofia_reg_internal.db-wal", 0x7f5c202ad650) = -1 ENOENT (No such file or directory)
[pid  1215] fcntl(36, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=1073741824, l_len=1} <unfinished ...>
[pid  1214] fstat(35,  <unfinished ...>
[pid  1215] <... fcntl resumed>)        = 0
[pid  1214] <... fstat resumed>{st_mode=S_IFREG|0640, st_size=339968, ...}) = 0
[pid  1215] fcntl(36, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=1073741826, l_len=510} <unfinished ...>
[pid  1214] fcntl(35, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0} <unfinished ...>
[pid  1215] <... fcntl resumed>)        = 0
[pid  1214] <... fcntl resumed>)        = 0
[pid  1215] fcntl(36, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=1073741824, l_len=1} <unfinished ...>
[pid  1214] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=131177000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1215] <... fcntl resumed>)        = 0
[pid  1215] stat("/var/lib/freeswitch/db/sofia_reg_external.db-journal", 0x7f5c20270650) = -1 ENOENT (No such file or directory)
[pid  1215] pread64(36, "\0\0\0U\0\0\0S\0\0\0\0\0\0\0\0", 16, 24) = 16
[pid  1215] stat("/var/lib/freeswitch/db/sofia_reg_external.db-wal", 0x7f5c20270650) = -1 ENOENT (No such file or directory)
[pid  1215] fstat(36, {st_mode=S_IFREG|0640, st_size=339968, ...}) = 0
[pid  1215] fcntl(36, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
[pid  1215] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=131471000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1209] <... epoll_wait resumed>[], 4, 395) = 0
[pid  1209] epoll_wait(27, [], 4, 0)    = 0
[pid  1209] epoll_wait(27,  <unfinished ...>
[pid  1210] <... epoll_wait resumed>[], 4, 457) = 0
[pid  1210] epoll_wait(30, [], 4, 0)    = 0
[pid  1210] epoll_wait(30,  <unfinished ...>
[pid  1214] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1214] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1214] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=231401000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1215] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1215] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1215] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=231597000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1214] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1214] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1214] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=331600000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1215] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1215] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1215] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=331741000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1214] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1214] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1214] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=431729000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1215] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1215] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1215] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=431872000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1199] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1199] futex(0x558b77e19520, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1199] futex(0x558b77e1959c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=846464000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1214] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1214] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1214] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=531918000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1215] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1215] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  1215] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=531997000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1214] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1215] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1214] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  1215] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  1214] <... futex resumed>)        = 0
[pid  1215] <... futex resumed>)        = 0
[pid  1214] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=632102000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1215] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=632108000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1221] <... select resumed>)       = 0 (Timeout)
[pid  1221] select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0} <unfinished ...>
[pid  1215] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1214] <... futex resumed>)        = -1 ETIMEDOUT (Connection timed out)
[pid  1215] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  1214] futex(0x7f5c218bdb20, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  1215] <... futex resumed>)        = 0
[pid  1214] <... futex resumed>)        = 0
[pid  1215] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=732227000}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid  1214] futex(0x7f5c218bdb98, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1737110213, tv_nsec=732232000}, FUTEX_BITSET_MAT

I am unable to understand this stack though

You need to fix any xjml errors and then try again.

I recommend opening a ticket on github to focus attention on this, unless you have a simple fix like swapping a couple lines of code to solve the problem. If you do have, then submit a Pull Request and they will incorporate that into the code after review.

1 Like

Yes I’ll have a look.
But has it is random issue this is not the root cause.

Done: