Tuesday, July 4, 2017

The dreaded DCOM error

This one took me weeks to figure out, plus a little help from our friends at Microsoft. It started after upgrading a customer's SBA from 2013 to SfB, following the correct process:
  1. Move users to Front-End
  2. Remove SBA from topology
  3. Add SBA back into topology with the same name but under the SfB node
  4. Re-install SBA software using 2015 image
  5. Move users back to SBA
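
Steps 1 and 5 are plain user moves. A minimal sketch of how they might be scripted, assuming hypothetical pool FQDNs (`sba01.contoso.com`, `FE01.contoso.com`) and a hypothetical scratch file `sba-users.txt` to remember who was homed on the SBA:

```powershell
# Hypothetical FQDNs; replace with your own SBA and Front-End pool names
$sbaPool = 'sba01.contoso.com'
$fePool  = 'FE01.contoso.com'

# Step 1: record who lives on the SBA, then move them to the Front-End pool
$users = Get-CsUser -Filter "RegistrarPool -eq '$sbaPool'"
$users.SipAddress | Set-Content -Path 'sba-users.txt'
$users | Move-CsUser -Target $fePool -Confirm:$false

# Step 5 (after the SBA has been rebuilt): move the same users back
Get-Content -Path 'sba-users.txt' | ForEach-Object {
    Move-CsUser -Identity $_ -Target $sbaPool -Confirm:$false
}
```
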
Step five is where the problem appeared: the users would not move back, and the move failed with a DCOM error.

Interestingly enough, the error in the event log is even more confusing: it complains about a RoutingGroup already being assigned to a different pool (Event ID 32209).
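
To collect those events in one place, something like the following may help. This is only a sketch: the log name `Lync Server` is an assumption about where SfB 2015 writes its events on your servers, so adjust it if your event viewer shows a different name.

```powershell
# Assumption: SfB Server events are written to the "Lync Server" event log
Get-WinEvent -FilterHashtable @{ LogName = 'Lync Server'; Id = 32209 } -MaxEvents 20 |
    Format-List TimeCreated, MachineName, Message
```

Each message should name the routing group ID that is in conflict.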

Since this is a weird error, the fix has to be even weirder; it comes from MS internal documentation :-) (DISCLAIMER: I take no responsibility if you do this in your own environment.)
First, run the command below to find the users whose routing group ID matches the one in the error message:
(Get-CsPool FE01.contoso.com).Computers | ForEach-Object {
    # Query the rtc database in the RTCLOCAL instance of each Front-End server.
    # Replace FE01.contoso.com and <BAD ROUTING GROUP ID> with your own values.
    Sqlcmd.exe -E -S "$_\rtclocal" -d rtc -Q `
    "select UserAtHost,* from dbo.RoutingGroupAssignment as rga `
    inner join dbo.ServiceCluster as sc on (sc.ServiceClusterId = rga.ServiceClusterId) `
    inner join dbo.ServiceAssignment as sa on (sa.ServiceTagId = sc.ServiceTagId) `
    inner join dbo.Pool as p on (p.PoolId = sa.UscClusterId) `
    inner join dbo.ResourceDirectory as ResD on (ResD.RoutingGroupId = rga.RoutingGroupId) `
    inner join Resource as Res on (Res.ResourceId = ResD.ResourceId) `
    where rga.RoutingGroupName = '<BAD ROUTING GROUP ID>'" -W -s "," -o "$($_)-dumpuserinRG.csv"
}

The output files (one per server) will list several users for that routing group ID; these are the ones considered "defective". For each of those users, run the following against the rtc database on ALL FE servers in the pool:
use rtc
go
exec [RtcDeleteResource] 'user@domain.com'
go
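
With more than a handful of users, typing these by hand gets old. Here is a minimal sketch that turns the CSV dumps into a SQL script you can review first. It assumes the first comma-separated field of every data row is UserAtHost (as in the query above); the function name `ConvertTo-DeleteStatement` and the output file `delete-defective-users.sql` are made up for this example.

```powershell
# Sketch: build "exec [RtcDeleteResource]" statements from the sqlcmd CSV dumps.
# Assumption: the first field of each data row is UserAtHost (user@domain).
function ConvertTo-DeleteStatement {
    param([string[]]$Line)

    $Line |
        ForEach-Object { ($_ -split ',')[0].Trim() } |    # keep the first column only
        Where-Object { $_ -match '^[^@\s]+@[^@\s]+$' } |  # drop header, dashes, "(n rows affected)"
        Sort-Object -Unique |                             # the same user shows up in every dump
        ForEach-Object {
            $user = $_.Replace("'", "''")                 # escape single quotes for T-SQL
            "exec [RtcDeleteResource] '$user'"
        }
}

# Read every per-server dump in the current directory
$lines = Get-ChildItem -Filter '*-dumpuserinRG.csv' | Get-Content
$statements = @(ConvertTo-DeleteStatement -Line $lines)

# Write a script to review by hand before running it on each FE server
@('use rtc', 'go') + $statements + 'go' | Set-Content -Path 'delete-defective-users.sql'
```

Nothing is deleted by this snippet; it only writes the file. Once you are happy with its contents, it can be run on each FE server the same way as the manual commands, for example through Sqlcmd.exe with the `-i` option.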

Rinse and repeat for every routing group ID that shows up in the error message.
After that, the user move works!!!
