BitMaster
Posted Saturday at 11:16 PM (edited)

As some may know, I service SoHo and smaller companies, up to ~25 people, nothing too big, so it's relatively comparable to what can happen to you, me, everybody.

Today was such a day, nothing really special: replace old HW running Win10 with 6 new Win11 Pro clients which I put together, GB-B850-9600x-1TB 990Pro-Seasonic Platinum 500w fanless-32GB_6000 and a DarkRock-5 in a nice and quiet be quiet! tower. The building was fun despite 1 MB being DOA, but they just flew. Did that 2 weeks ago and today I planned to roll them out.

- Save Data and Mail from old PC to server
- Set up local LAN static IP and DNS
- rename PC
- join AD
- set up accounts
- install all LAN printers
- restore Data and Mail
- rinse & repeat

...1..2..3..4..5..BANG !!!!!

I accidentally ended up on the wrong desktop. I was in RDP on the Server instead of on the local Desktop, both had 2 File Explorers open, Task Manager, windows over windows, and I opened the RENAME THIS PC dialog on the wrong Desktop !! ME IDIOT !!!!

I RENAMED THE Active Directory Controller, AD-DC, to PC-3 !!!!!!!!!!!!!!!!!!!!!!!! Hit OK and then found out. FUUUUUuuuuuuu§&

WRONG RDP --> WRONG FILE EXPLORER --> WRONG CONTROL PANEL -----> GOD DAMN WRONG MACHINE to rename, and I picked the worst machine you can possibly rename while it is in production and running: THE DOMAIN CONTROLLER

My blood froze, my pulse went to 180 from a chilled low 60s, my promise to be home at 19:30h THE LATEST suddenly became a lie and my biggest fear was: "You're gonna either have some damn luck, PRETTY SOON, or you end up with Acronis, putting the Backup from last night on." Ya, after you made the bootstick, 20min, and went upstairs to the server. A real ClusterFu%§, visualising what was ahead of me if I didn't come up with something quick, quickly.

So, I know how to rename a DC, you DON'T do it like you do with any other Windows machine, it's a Powershell command, 3 of them iirc, with reboots and DNS entries.
Not complicated, but it must be obeyed or you screw up the directory, or your access to it... if you only run 1 DC, that must be said to be fair.

OK..Google... it says...you did the WRONG thing, there is NO easy way out, you either use a Backup or REINSTITUTE the Domain, which basically means you might as well reinstall the whole damn thing, that's 1 day alone if ALL goes right. I built that server and it is not difficult, just a lot of stuff and it takes time and care.

So...I had no choice.. I had to reboot it while I was there, no way out, you don't want to explore the outcome of that from 40km away with only VPN, Teamviewer/RDP/VNC at hand. OK, the server has an Enterprise Dell iDRAC that I can access from here, that would have worked even from remote...but here it goes.

So I rebooted, there was no "rename again" back to SRV possible, it needed the reboot to commit. So I did. It booted to the login screen, so far so good. At least no AD DB error while booting, that would be a total mess.

OK, OK...so far great...try to login.. Login as admin and pwd --> Red Pop-Up --> No security entry in AD Database for this workstation. Login denied!

OHH SH!T

You cannot log into the local domain, aka the PC/Server itself, on a DC, it is ALWAYS a login as AD admin. So I could not do SRV\administrator to log into the Server locally w/o Domain. I tried even though I knew it wouldn't work, and it didn't, waah !

You're finally screwed !!!! You have a booting 2022 AD-Controller with a wrong name that doesn't connect to the network, doesn't show shares you could connect to and SOMEHOW work with, heck, if the SQLs would run, I'd have more time to sort out things...but this beast doesn't do ANYTHING but boot to the Welcome screen and sit there forever.

I also couldn't even finish the 6th PC and drive home, I was about to join it to AD when that happened.

There is no guide on Google or in AI that tells you how to undo that mishap. It's a nail in the server's coffin that you usually don't get out again.
That is what gave me headaches, not having any options to fix it, just FIX IT, god damn, it's only a few lines of code...ok maybe a few more, it's AD and thus deeply DNS integrated...WTF, how could you fall into exactly this pit, this No-Way-Out pit, with a big "How to Sink a DC" sign above it in Neon. Aaarghhh, a big client..and you screwed up his DC, it better run Monday morning 7am, no matter what.

So...I have it all on Acronis on a Linux Server as well as in Acronis Cloud, that would be the ultimate thing to fix it if I didn't come up with a solution, something dead simple, stupid simple, as quick&dirty as the renaming was. But what.

OK,...let's try F8 and Safe Mode, maybe there is something that can be done.....but wait....I cannot reboot or shut it down. The DC-Server Welcome Screen, in contrast to the Client Windows OS, has no Button to restart or shut down, it's a safeguard.

Well, let's see Dell's iDRAC....but there have never been drivers installed in the OS for any iDRAC, so I did not expect to see a graceful OS shutdown/reboot option, just the normal AC operations which I knew. I was right, no graceful options, just warm start, cold restart or OFF.

Yeah, what a freaking nightmare, I have to reset the server now as well, a Russian Roulette with the AD Database and the RAIDs, you never reset a server unless you really really really have to. I had to ! Show balls !! Grab it btP !

I hit warm restart and was then greeted with a UEFI Secure Boot error, luckily I could just say "screw you, I already have enough trouble!" and clicked it away, and it kept booting through the many Dell UEFI and FW screens while I kept hitting F8 over and over again. Made it, Boot options came up, 1 to 9..or 0, god knows. 4 is Safe Mode.... but wait....boot to LAST KNOWN WORKING CONFIGURATION ! ????

OK, that is not meant to fix any AD structural damage if there is any, but it should rewind the actual PC settings I changed a reboot ago !?
Let's try, there is really nothing to lose but the 5min it takes to boot. If it screws up and crashes I go the Acronis route, chill and wait until it's restored, if Acronis lets me down I am screwed, then I am gonna spend Sunday here, for free, fixing my sh!T. Monday 7am is the deadline.

The server booted, as before, login available, no errors so far....now try the usual DOMAIN\administrator PWD and see if it locks up.

TATTAA....It fixed it, a dead simple, decade-old rescue option fixed the misconfig, in a very lucky way I admit. On the narrow path with too much domain on one side and too severely misconfigured HW on the other side, this option was golden, it saved a lot of work, money and effort.

And I, I will, after 30 years, watch even closer now..on which RDP or Desktop I am before I rename a machine in a Directory ! I really felt baaaad the first 5 minutes after it happened, the "You screwed it Up!" hit hard.

Anyway, that server booted as if nothing had ever happened, DNS was correct, DHCP changes I made were also still present, all good. Fixed that last PC, printers, users, Mail, done.

When I got home, 2h late, my step daughter looked at me, said "You know how late you are!!?" went out and took off...LoL

I still felt great, still do, having a beer and typing this.... better than installing a 2022 SRV with AD, Roles, Software, Users and what not else till 2am and all Sunday too. Done that, thank you, my beer tastes a lot better than that !

God save the command "Revert to last known working configuration"

Prost

Edited Saturday at 11:25 PM by BitMaster

Gigabyte Aorus X570S Master - Ryzen 5900X - Gskill 64GB 3200/CL14@3600/CL14 - Sapphire Nitro+ 7800XT - 4x Samsung 980Pro 1TB - 1x Samsung 870 Evo 1TB - 1x SanDisc 120GB SSD - Heatkiller IV - MoRa3-360LT@9x120mm Noctua F12 - Corsair AXi-1200 - TiR5-Pro - Warthog Hotas - Saitek Combat Pedals - Asus XG27ACG QHD 180Hz - Corsair K70 RGB Pro - Win11 Pro/Linux - Phanteks Evolv-X
Aquorys
Posted 17 hours ago

You'd think that if they can disable the option to log in locally, they would also be able to disable the button that allows you to rename the domain controller and sink the entire ship, or maybe make it run those 3 magical commands instead, but no... top notch engineering I guess

F-16 / Su-33 / Ka-50
F-16 Checklists (Kneeboard compatible)
F-16 BVR training missions
Yurgon
Posted 2 hours ago

Nice job almost destroying that server. Will you be on alert 5 on Monday at 7 AM just to be on the safe side if the phone rings?

The timing with this thread is impeccable. Got a story to share as well.

So I run a few Linux servers, and I have a handwritten script that anonymizes IP addresses in their log files so I don't get in trouble with the GDPR. It's hooked into logrotate as a prerotate script. Super simple thing.

I'd noticed a while ago that one of the servers takes unusually long to process the log files and is under heavy load every night. Well, the websites shouldn't have that much more traffic than those on other servers, but it's probably some AI scraper crap that's filling them up again and again day by day, forcing the server to go through countless lines of logs to scan for IPv4 addresses and replace the last 2 octets with zeros before the backup runs. Not ideal, maybe I should optimize the script. Maybe tomorrow. You know, the "that was 6 months ago" kind of tomorrow.

Last week I found a smart article about optimizing backups and creating transparent snapshots using Linux filesystem hardlinks. Cool, I'm going to implement that. Start with this server first, it's got the least valuable data, just on the off chance I might mess something up.

Nah, worked okay, except... the daily snapshot on the backup server was about the same size as the previous one. They should appear to have the same size, but the copy should really be tiny. But the file system was filling up big time, something apparently wasn't properly hard linked and was copied instead. I checked a few random files, couldn't see anything wrong. Same file, same content, same inode number, properly hard linked.

One thing I noticed while looking at this snapshot snafu was that the server took 2:30 hours to get backed up. Similar servers took 2 minutes. That's... odd. Same OS, same software, same version of rsync, same custom backup script.
Why on earth does the server insist on copying 400 GB of data every night when there are hardly any changes in the file system...?

Ran a few more du -h to figure out where the largest amounts of data were hanging around. Err... the web host log file directories? Gigabytes over gigabytes of log data? That can't be right, there's no way I get so much traffic that the logs are this big day in and day out.

Well. Turns out I had two custom config files for logrotate here. One from another server, and the one adapted for this server. I had forgotten to delete the old one after adapting. Wasn't a big deal until almost exactly one year ago when I changed the location of my IP address anonymizer script. Put the proper location into the logrotate prerotate hook - but only in the "correct" config file. I hadn't noticed that extraneous config file. That one still had the old path to the prerotate script, this path no longer existed, but by virtue of alphabetical sorting, the old and now wrong logrotate config was processed before the correct one.

For almost 365 days, logrotate failed to rotate any of my webserver logs on this server, dutifully logged the failure to syslog every night where I never bothered to check, and so I have amassed numerous gigabytes of access_logs, dwarfed in one particular case by an error_log that has grown to 352 gigabytes on this server. Since the logs get a few new lines every day, they need to be unlinked and transferred in the backup process, blowing snapshots completely out of proportion, and the oldest entries have had their logged IP addresses dutifully anonymized almost 365 times.

And all it took to fix this was deleting one extraneous config file. As I write this, the IP address anonymizer should go through its gargantuan task one last time before the log rotation - hopefully - compresses the log files and rotates them, and now I can't wait to see my daily snapshots rotate those old logs into oblivion.
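For illustration, here is a minimal Python sketch of the kind of anonymizer described in this post: find IPv4 addresses in a log line and replace the last 2 octets with zeros. It is not the actual script, which isn't shown in the thread; the function name `anonymize_line` and the regular expression are assumptions made up for this example. It also shows why running it 365 times over the same lines changes nothing after the first pass: the operation is idempotent, it just costs time.

```python
import re

# One IPv4 octet, 0-255 (illustrative pattern, not taken from the original script).
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# A dotted quad that is not glued to further digits or dots on either side.
# The first two octets are captured so they can be kept in the replacement.
IPV4 = re.compile(
    rf"(?<![\d.])({_OCTET})\.({_OCTET})\.{_OCTET}\.{_OCTET}(?!\d|\.\d)"
)


def anonymize_line(line: str) -> str:
    """Replace the last 2 octets of every IPv4 address in the line with zeros."""
    return IPV4.sub(r"\1.\2.0.0", line)


def anonymize_lines(lines):
    """Lazily anonymize an iterable of log lines (e.g. an open file)."""
    for line in lines:
        yield anonymize_line(line)
```

With this sketch, `anonymize_line("203.0.113.77 - - GET /")` gives `"203.0.0.0 - - GET /"`, and feeding that result through again returns it unchanged. One caveat of matching purely by pattern: anything else in a log line that happens to look like a dotted quad, such as some version strings, would be rewritten too.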
You can probably tell I should also invest some time and effort into a more robust monitoring solution. On the plus side it seems I didn't need to dig into this server's Apache logs in at least a year, or else I might have noticed they're a little bigger than they need to be.
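As a rough idea of what the smallest possible monitoring check could look like, here is a hedged Python sketch that walks a log directory and reports files above a size limit, biggest first. The function name `find_large_logs`, the directory and the threshold are all assumptions for the example, not anything from the setup described above; a real monitoring solution would also watch syslog for the rotation failures themselves.

```python
import os
from pathlib import Path


def find_large_logs(root, max_bytes):
    """Return (path, size) pairs for files under root larger than max_bytes.

    Results are sorted by size, largest first. Files that disappear or
    cannot be read while scanning are skipped.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                continue  # vanished mid-scan or not readable
            if size > max_bytes:
                hits.append((str(path), size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Called from a nightly cron job with something like `find_large_logs("/var/log/apache2", 1024**3)` (path and 1 GiB limit are hypothetical), a non-empty result would have been a hint long before an error_log reached 352 gigabytes.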