
verifying replication failure with admp and mom 2005

you’ve no doubt seen this error message if you’re monitoring active directory replication.

The following DCs have not updated their MOMLatencyMonitor objects within the specified time period (8 hours). This is probably caused by either replication not occurring, or because the 'AD Replication Monitoring' script is not running on the DC.

Format: DC, Naming Context, Hours since last update

My-Site
myDCserver, NDNC:DC=DomainDnsZones,DC=myDomain,DC=com, 16
 

typically, this alert is generated when a DC is no longer replicating.  the ADMP script tracks changes to an attribute called adminDescription.  under the MOMLatencyMonitors container, at the root of each watched naming context, there are objects that represent all of the DCs for that naming context.

for example:

myDCserver, NDNC:DC=ForestDnsZones,DC=myDomain,DC=com, 9
 

this line indicates that the domain controller myDCserver has gone 9 hours or more without the expected value replicating in the naming context DC=ForestDnsZones,DC=myDomain,DC=com.  there are two places this can fail:

  1. the domain controller is having trouble replicating.
  2. the MOM agent is not working correctly and is failing to write to the adminDescription attribute.
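
the alert lines follow the "DC, Naming Context, Hours since last update" format, but the naming context itself contains commas, so a plain split on commas won't work.  here's a minimal python sketch of one way to pull the three fields apart.  the function name parse_alert_line is made up for illustration, and treating the text before the colon (NDNC here) as a removable prefix is an assumption based only on the sample lines in this post.

```python
def parse_alert_line(line):
    """Split 'DC, Naming Context, Hours' into its three fields.

    The naming context contains commas, so take the first field from the
    left and the last field from the right; whatever is left is the context.
    """
    dc, rest = line.split(",", 1)
    naming_context, hours = rest.rsplit(",", 1)
    naming_context = naming_context.strip()

    # Assumption: a leading label such as 'NDNC:' is not part of the DN.
    prefix, sep, remainder = naming_context.partition(":")
    if sep and "=" not in prefix:
        naming_context = remainder

    return dc.strip(), naming_context, int(hours)


if __name__ == "__main__":
    sample = "myDCserver, NDNC:DC=ForestDnsZones,DC=myDomain,DC=com, 9"
    print(parse_alert_line(sample))
```

the naming context that comes back is the one to look for in the repadmin and dsquery output in the steps that follow.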

to narrow down the problem, follow the steps below.

the domain controller may be having trouble replicating.

to validate this condition, we can use repadmin.  issuing the following command gets us some usable data.

repadmin.exe /showrepl myDCserver
 
DC=ForestDnsZones,DC=myDomain,DC=com
    mySite\myDCserver2 via RPC
        DC object GUID: 67x4141y-x526-45xy-x32y-8x04yx041yx7
        Last attempt @ 2008-09-16 09:46:41 was successful. 
 

pay attention to the naming context: it needs to match the one named in the alert.  in this case, we see that the last attempt was successful and very recent.  this indicates that replication is not the issue.
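
if there are a lot of naming contexts to check, the same reading of the output can be scripted.  the python sketch below only understands the layout of the sample shown above (an unindented naming context line, a partner line ending in "via RPC", and a "Last attempt @" line).  the function name last_attempts is hypothetical, and anything after the timestamp other than "was successful" is simply treated as a failure rather than parsed.

```python
import re
from datetime import datetime

# Sample copied from the repadmin output shown in this post.
SAMPLE = """\
DC=ForestDnsZones,DC=myDomain,DC=com
    mySite\\myDCserver2 via RPC
        DC object GUID: 67x4141y-x526-45xy-x32y-8x04yx041yx7
        Last attempt @ 2008-09-16 09:46:41 was successful.
"""

CONTEXT = re.compile(r"(DC|CN)=", re.IGNORECASE)
ATTEMPT = re.compile(r"Last attempt @ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (.*)")


def last_attempts(text):
    """Return (naming_context, partner, timestamp, succeeded) tuples."""
    results = []
    context = None
    partner = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if CONTEXT.match(raw):
            # Unindented line starting with DC= or CN= is a naming context.
            context = line
        elif line.endswith("via RPC"):
            partner = line[: -len("via RPC")].strip()
        else:
            found = ATTEMPT.search(line)
            if found:
                when = datetime.strptime(found.group(1), "%Y-%m-%d %H:%M:%S")
                succeeded = found.group(2).startswith("was successful")
                results.append((context, partner, when, succeeded))
    return results


if __name__ == "__main__":
    for context, partner, when, succeeded in last_attempts(SAMPLE):
        print(context, partner, when.isoformat(), succeeded)
```

filtering the results down to the naming context from the alert gives the same answer as reading the output by eye: a recent, successful attempt means replication is fine.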

 

the mom agent may not be writing to the adminDescription attribute.

as stated above, the admp replication script uses the objects in these containers as a kind of synthetic replication test.  the mom agent running the script on each DC writes to its own object's adminDescription attribute.  to see where a problem may exist, we can use dsquery to list the current adminDescription values for all objects in the MOMLatencyMonitors container of the ForestDnsZones naming context.

dsquery * cn=momlatencymonitors,dc=forestdnszones,dc=myDomain,dc=com -scope onelevel -attr name admindescription
 

and receive the following results:

name admindescription
myDCserver 20080916.0301
myDCserver1 20080916.1301
myDCserver2 20080916.1401
myDCserver3 20080916.1501
myDCserver4 20080916.1301
myDCserver5 20080916.1101
myDCserver6 20080916.1301
myDCserver7 20080916.1301
myDCserver8 20080916.1201

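from the results, we can see that the value for myDCserver (20080916.0301) is well behind every other DC in the list, all of which fall between 1101 and 1501 on the same day.  since replication checked out above, this time the replication alert is actually a mom error: the delta between myDCserver and the rest of the list points at the agent on myDCserver not updating its own object.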
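
spotting the delta by eye is easy with nine DCs, less so with ninety.  here's a rough python sketch that does the same comparison on the dsquery output.  it assumes the adminDescription values are timestamps in YYYYMMDD.HHMM form, which is only a guess from how the values above look, and it measures each DC against the newest value in the list rather than the current time.  the function names are made up for illustration; the 8 hour threshold is the one from the alert text.

```python
from datetime import datetime

# Output copied from the dsquery results shown in this post.
DSQUERY_OUTPUT = """\
name admindescription
myDCserver 20080916.0301
myDCserver1 20080916.1301
myDCserver2 20080916.1401
myDCserver3 20080916.1501
myDCserver4 20080916.1301
myDCserver5 20080916.1101
myDCserver6 20080916.1301
myDCserver7 20080916.1301
myDCserver8 20080916.1201
"""


def parse_monitor_rows(text):
    """Map each DC name to its adminDescription value as a datetime."""
    rows = {}
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) != 2:
            continue
        name, stamp = parts
        # Assumption: values look like YYYYMMDD.HHMM.
        rows[name] = datetime.strptime(stamp, "%Y%m%d.%H%M")
    return rows


def lagging_dcs(rows, threshold_hours=8):
    """Return {name: hours behind the newest value} for DCs at or over the threshold."""
    newest = max(rows.values())
    behind = {}
    for name, stamp in rows.items():
        hours = (newest - stamp).total_seconds() / 3600
        if hours >= threshold_hours:
            behind[name] = hours
    return behind


if __name__ == "__main__":
    print(lagging_dcs(parse_monitor_rows(DSQUERY_OUTPUT)))
```

with the sample data, only myDCserver crosses the threshold, which lines up with the alert pointing at that DC.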
