Recently we had a few issues with our Hyper-V SAN where a number of virtual servers lost the connection to their disks. This is usually not a problem (they just reboot on the other cluster node), but we always have trouble with two virtual servers that are terminal servers.
When these servers restart after such a disk outage, they usually hang at the "Applying Computer Policy" screen. Users trying to log on hang at the "Applying User Policy" screen for 60 minutes, and the following event is logged:
Log Name: Application
Source: Microsoft-Windows-Winlogon
Date: ...
Event ID: 6005
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: ...
Description:
The winlogon notification subscriber <GPClient> is taking long time to handle the notification event (CreateSession).
In most cases, the following procedure gets the server working again:
- disable GPSVC (through PSEXEC -s) and reboot
- remove from domain and re-join
- delete C:\ProgramData\Microsoft\Group Policy and C:\Windows\System32\GroupPolicy
- schedule CHKDSK
- reboot
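For reference, the steps above could be written down as a dry-run checklist, roughly like the sketch below. It only builds and prints commands and executes nothing. The exact command lines (`sc config`, `rd`, `chkdsk /f`, `shutdown`) are my assumption of how each step maps to a command, and `recovery_plan` and `print_plan` are hypothetical helper names. The domain re-join is left as a manual step because it needs credentials. The sketch mirrors the list as written, so it does not re-enable the service afterwards.

```python
# Dry-run sketch: builds the recovery steps as text, nothing is executed.
GP_FOLDERS = [
    r"C:\ProgramData\Microsoft\Group Policy",
    r"C:\Windows\System32\GroupPolicy",
]


def recovery_plan(drive="C:"):
    """Return the manual steps as an ordered list of (description, command).

    A command of None marks a step that is done by hand.
    """
    plan = [
        # Assumed command line for "disable GPSVC (through PSEXEC -s)".
        ("disable GPSVC as SYSTEM", "psexec -s sc config gpsvc start= disabled"),
        ("reboot", "shutdown /r /t 0"),
        ("remove from domain and re-join", None),
    ]
    for folder in GP_FOLDERS:
        plan.append((f"delete {folder}", f'rd /s /q "{folder}"'))
    # On a volume that is in use, chkdsk /f offers to run at the next restart.
    plan.append(("schedule CHKDSK", f"chkdsk {drive} /f"))
    plan.append(("reboot", "shutdown /r /t 0"))
    return plan


def print_plan(plan):
    """Print each step with its command, or a note that it is manual."""
    for number, (description, command) in enumerate(plan, start=1):
        print(f"{number}. {description}: {command or '(manual step)'}")
```

Calling `print_plan(recovery_plan())` prints the seven steps in order, which is handy as a checklist when the problem shows up again.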
I really wonder why this issue happens only on the two RDP servers and not on the roughly 50 other servers on the same Hyper-V cluster. Of course the RDP servers have far more user profiles, but that is the only difference that comes to mind.
Does anyone have an idea what can cause this, or how to trace the issue if it happens again? I also enabled debug logging and checked GPSVC.LOG, but did not find any helpful information.
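As a starting point for tracing, here is a minimal sketch of how I would collect the Winlogon warnings next time. It assumes `wevtutil` is available on the server and that the message text matches the event quoted above. The function names (`winlogon_query_command`, `query_winlogon_events`, `slow_subscribers`) are hypothetical helpers, not part of any Windows tooling.

```python
import re
import subprocess
from collections import Counter

# Matches the Winlogon 6005 message text quoted in the event above.
SLOW_SUBSCRIBER = re.compile(
    r"notification subscriber <([^>]+)> is taking long time "
    r"to handle the notification event \(([^)]+)\)"
)


def winlogon_query_command(max_events=20):
    """Build a wevtutil call for the newest Winlogon 6005 events."""
    xpath = (
        "*[System[Provider[@Name='Microsoft-Windows-Winlogon'] "
        "and (EventID=6005)]]"
    )
    return [
        "wevtutil", "qe", "Application",
        f"/q:{xpath}",       # XPath filter on provider and event ID
        "/f:text",           # plain-text output
        "/rd:true",          # newest events first
        f"/c:{max_events}",  # limit the number of events returned
    ]


def query_winlogon_events(max_events=20):
    """Run the query locally (Windows only) and return the text output."""
    result = subprocess.run(
        winlogon_query_command(max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def slow_subscribers(event_text):
    """Count (subscriber, notification event) pairs found in the text."""
    return Counter(SLOW_SUBSCRIBER.findall(event_text))
```

On an affected server, `slow_subscribers(query_winlogon_events())` would show whether GPClient is the only subscriber that is slow, or whether other subscribers are stalling as well.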