Actium Posted March 3 (edited)

The dedicated server (DCS_server.exe) cannot be used reliably as a non-interactive background service. For example, a failed login with the sometimes unreliable master server raises a "Login failed" popup (if a saved authentication is available). The popup must be closed manually. Afterwards, the server process initializes and the splash screen window opens, but the server refuses to start any mission via the WebGUI and appears to be bricked indefinitely with the following dcs.log error message:

    2025-03-03 21:20:24.025 ERROR ASYNCNET (Main): server_start failed: login is required

If no saved authentication is available, another popup opens that must also be acknowledged manually before the server process terminates (at least the process isn't bricked this time).

UPDATE: An already running dedicated server will terminate with a "Login session has expired" popup if the master server is unreachable for more than 20 minutes. The error occurs only when the connection to the master server is re-established (more details here). The dedicated server process terminates after the popup is acknowledged.

Last but not least, a mission scripting error raises another popup that freezes an already started dedicated server mid-mission until the popup is manually acknowledged (workaround available).

When using DCS_server.exe interactively, i.e., starting a single server instance on your local desktop only if and when you need it, these popup messages may provide useful debugging information. However, many DCS_server.exe instances run non-interactively on dedicated servers in data centers without constant supervision of the server GUI. If such a popup opens, e.g., after an automatic restart that leads to a temporary login failure or when a mission scripting error occurs, the server instance is bricked until an admin intervenes manually. This is highly impractical.
DCS_server.exe should support non-interactive use by exiting with a non-zero exit status code immediately after fatal errors, e.g., the login failure*. Non-fatal errors should be logged to dcs.log, but otherwise ignored (or better yet: communicated to the clients as done in my workaround). The operating system service manager or a restart script can then decide if and when to attempt restarting DCS_server.exe after an error occurred, based on the exit status code.

To not affect the aforementioned interactive desktop use of DCS_server.exe, a --quiet argument could be introduced, just like DCS_updater.exe already has. Such an option should inhibit all popups, including the crash popup, and therefore imply crash_report_mode = "silent", so DCS_server.exe can also be restarted automatically after it crashed.

*) Regarding the login failure, a superior solution to just exiting would be to repeatedly attempt to connect to the master server in the background. Meanwhile, the server should still be allowed to start a mission, so dedicated servers can still be started and used (by connecting via IP) even if the master server is temporarily unavailable.

Edited March 29 by Actium
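To illustrate what a restart script could do with meaningful exit status codes, here is a minimal Python sketch. It assumes the behavior requested above, which DCS_server.exe does not provide today. The function name supervise and its parameters are made up for this example.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=5, delay=10.0, sleep=time.sleep):
    """Run cmd and restart it after every non-zero exit status.

    Returns the list of observed exit codes. Stops after a clean
    exit (status 0) or once max_restarts restarts have been attempted.
    """
    codes = []
    for attempt in range(max_restarts + 1):
        code = subprocess.run(cmd).returncode
        codes.append(code)
        if code == 0:
            break  # clean shutdown: do not restart
        if attempt < max_restarts:
            print(f"exit status {code}, restarting in {delay} s", file=sys.stderr)
            sleep(delay)
    return codes
```

A call such as supervise(["DCS_server.exe", "--quiet"]) would only be useful once the proposed --quiet option exists. A service manager like systemd can make the same decision from the exit status without any custom script.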
Toumal Posted March 4

I don't understand why the dedicated server needs to authenticate in the first place. As it stands, I find myself unable to host a dedicated server, not just because of the gymnastics I have to do to check if the server is actually up (check PID, check port connection, etc.), but because any time the ED auth servers return a 401, the server quits.
Actium (Author) Posted March 5

On 3/4/2025 at 12:49 PM, Toumal said:
    I don't understand why the dedicated server needs to authenticate in the first place.

Absolutely. The DCS updater allows downloading all files without login anyway, so I see no reasonable justification for intentionally preventing the dedicated server from starting without login. It does not prevent software piracy at all. Of course, I'd be happy to learn about any logical reason to keep the login restriction (and the annoying issues it causes) in place.

However, I have a practical explanation: The dedicated server code base likely borrows largely, if not entirely, from the client. You can deduce that from the dedicated server's log messages:

    2025-03-03 22:25:33.969 ERROR EDCORE (Main): No suitable driver found to mount bazar/textures/f-15

It attempts but fails to load a bunch of textures, although a dedicated server never needs any textures because it does not render anything. Deriving the dedicated server from the client makes sense from the narrow perspective of code maintainability: It limits the overall amount of code. The unidiomatic login requirement, the login error popups, and the mission scripting error popup are presumably just leftovers from the client. Obviously, the slightly simplified code maintenance comes at the cost of added support effort, with the latter probably outweighing the former.

It shouldn't be too troublesome to #ifdef the entire login code so that it is only included in the client and not in the dedicated server. But that would be something for the DCS Code Wish List. I see no reason to invest time advocating for that change as long as the described dedicated server usability bug goes unacknowledged and the far more serious Login Failure has remained unfixed for 2 weeks as of now.
Actium (Author) Posted March 6

On 3/4/2025 at 12:49 PM, Toumal said:
    I have to do to check if the server is actually up (check PID, check port connection, etc)

@Toumal Could you elaborate a bit on what exactly you do for these checks, please? I was wondering how to detect a bricked DCS_server.exe for the purpose of terminating and restarting it. I've thought of a simple Python script that periodically polls the server status via its UDP query protocol and shoots it down if it fails to respond a couple of times in a row (got a working proof-of-concept). As I host my dedicated server on Linux, I also considered using some dark magic (swaymsg -t get_tree) to periodically check whether DCS_server.exe has opened any popups that would qualify it for immediate termination.
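The polling idea can be sketched roughly as follows. This is not the proof-of-concept mentioned above, only an illustration under assumptions: the names udp_probe and watch are hypothetical, and the request datagram is left as a parameter because the query protocol is not described here.

```python
import socket
import time


def udp_probe(payload, addr, timeout=3.0):
    """Return True if any UDP reply arrives from addr within timeout seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, addr)
            sock.recvfrom(65536)
            return True
        except OSError:  # covers timeouts and refused/unreachable errors
            return False


def watch(probe, max_failures=3, interval=30.0, sleep=time.sleep):
    """Poll probe() until it has failed max_failures times in a row.

    A single successful probe resets the failure counter. Returns the
    failure count once the server should be considered unresponsive.
    """
    failures = 0
    while failures < max_failures:
        failures = 0 if probe() else failures + 1
        if failures < max_failures:
            sleep(interval)
    return failures
```

Once watch() returns, the caller can terminate the server process and leave the restart to the service manager. Requiring several consecutive failures avoids killing a server that merely dropped a single datagram.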
Toumal Posted March 11 (edited)

Sure, this is my current script:

```bash
#!/bin/bash
# Restart DCS_server.exe when the process is gone or its WebGUI stops answering.
URL="http://127.0.0.1:8088/"

while true
do
    # PID of the first DCS_server.exe process (empty if none is running)
    dcspid=$(ps x | grep DCS_server.exe | grep -v grep | head -n1 | awk '{print $1;}')
    case $dcspid in
        ''|*[!0-9]*) echo "Restarting DCS..."; lutris lutris:rungameid/1; sleep 200;;
        *) echo "DCS is running as PID $dcspid";;
    esac

    # A healthy server answers a request for / on the WebGUI port with HTTP 404
    HTTP_RESPONSE=$(curl -s --max-time 3 --write-out "%{http_code}" -o /dev/null "$URL")

    if [[ "$HTTP_RESPONSE" -eq 404 ]]; then
        sleep 1
    else
        echo "The server at $URL did not respond with HTTP 404. Response code: $HTTP_RESPONSE."
        # Only kill if a PID was actually found above
        [[ -n "$dcspid" ]] && kill -9 "$dcspid"
    fi
    sleep 30
done
```

I noticed that if the server is non-responsive, the HTTP ports aren't open. So all I had to do is check whether the HTTP server on port 8088 is reachable. If I get a 404 on /, then all is good.

Edited March 11 by Toumal
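For anyone who prefers Python over curl, the same WebGUI check could look roughly like this. It is only a sketch based on the observation above that a healthy server answers / on port 8088 with HTTP 404. The function name webgui_alive is made up for this example, and proxies are disabled explicitly so that the request always goes straight to localhost.

```python
import urllib.error
import urllib.request

# Opener that ignores any http_proxy environment settings
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def webgui_alive(url="http://127.0.0.1:8088/", timeout=3.0):
    """Return True if url answers with HTTP 404, False in every other case."""
    try:
        with _opener.open(url, timeout=timeout):
            return False  # a successful response is not the expected 404
    except urllib.error.HTTPError as err:
        return err.code == 404
    except OSError:  # connection refused, timeout, other network errors
        return False
```

A watchdog loop would call webgui_alive() periodically and kill the server process when it returns False.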
Actium (Author) Posted March 11 (edited)

@Toumal Possibly of interest to you: Systemd user unit file to run and automatically restart a DCS server. More details on the whole work-in-progress thing in this forum thread (same link as above). Let me know if you have any comments on it.

1 hour ago, Toumal said:
    So all I had to do is check if you can access the http server on 8088.

Thanks! Love the idea! Guess I'll write another systemd service that'll supervise the server that way. Previously, I did it the crude, hard, and brittle way by querying the game port instead of the WebGUI port:

```python
#!/usr/bin/python3
import socket

# Send a pre-recorded query datagram to the game port and print the reply.
# Note: recvfrom() blocks indefinitely if the server never answers.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b"\000\222\274\330S\n\241\016u\233,\3325\256o\371;\304\261;Sa\213\205{\000'\252\334d\355", ("127.0.0.1", 10308))
print(sock.recvfrom(65536))
```

Edited March 11 by Actium
Actium (Author) Posted March 14

@Toumal Just pushed a watchdog that'll query the WebGUI port and force a server restart if it fails to respond. Also figured out that the server will keep responding normally even with a frozen simulation (infinite loop in onSimulationFrame()), unless queried with a valid encrypted request.
Toumal Posted March 25

@Actium Your checks are definitely more robust than mine. I should also note that my issue at the moment is that my server is stopped by authentication failures from ED's servers after a few hours. In that case, the web ports are no longer available. I will post in a different topic about that particular issue, which currently makes it impossible for me to host.
Actium (Author) Posted March 29

Updated the first post. The following issue also results in a dedicated server bricked by a popup that must be manually acknowledged: