# Robust logistic diagnostic for the identification of high leverage points in logistic regression model

Original by B.A Syaiba, M. Habshah, University Putra Malaysia, 2010, 9 pages


• High leverage points are observations that have outlying values in covariate space
• A popular recent method from Imon (2006) uses the distance from the mean (DM) diagnostic to identify these points
• It may however suffer from masking and swamping effects, due to low leverage points
• In a logistic regression, most of the extreme points in the covariate pattern may have the smallest leverage values
• Thus detecting high leverage points in logistic regression based on the leverage values method in linear regression is unsuccessful
• Cut-off point for any DM value $b_j$: $b_j \geq \mathrm{Median}(b_j) + c \cdot \mathrm{MAD}(b_j)$, with $\mathrm{MAD}(b_j) = \mathrm{Median}\{|b_j - \mathrm{Median}(b_j)|\}/0.6745$ and $c$ a constant chosen as 2 or 3
• Proposing the Robust Logistic Diagnostic (RLGD) mixing the DM technique and the Diagnostic Robust Generalized Potentials from Habshah (2009)
• The first stage identifies suspected high leverage points with a robust estimator, either the Minimum Covariance Determinant (MCD) or the Minimum Volume Ellipsoid (MVE) (Rousseeuw 1984); the diagnostic approach is then used to confirm them
• step 1 : For each $i$th point compute the robust Mahalanobis distance (RMD) using either the MCD or the MVE estimator
• step 2 : Points with $RMD_i > \mathrm{Median}(RMD_i) + c \cdot \mathrm{MAD}(RMD_i)$ are suspected as high leverage points, and two sets are created: one without the suspected high leverage points and one with only them
• step 3 : Compute $b_i$ for the set with high leverage points
• step 4 : Delete the points whose $b_i$ is greater than the cut-off value
• The performance of the DM method and the RLGD method was compared using the detection capacity and the false alarm rate.
• The RLGD method has a better detection probability and a false alarm rate up to 20%, better than the DM method.
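
The Median + c·MAD cut-off rule appears twice in the notes above, once for the DM values and once for the RMD values in step 2. The following is a minimal sketch of that rule only, using the Python standard library. The function names (`mad`, `cutoff`, `split_suspects`) and the sample distances are hypothetical and chosen for illustration; computing the RMD values themselves requires an MCD or MVE estimator and is not shown here.

```python
from statistics import median


def mad(values):
    """Normalised MAD: Median{|v - Median(v)|} / 0.6745."""
    med = median(values)
    return median([abs(v - med) for v in values]) / 0.6745


def cutoff(values, c=3.0):
    """Cut-off point Median(values) + c * MAD(values); c is usually 2 or 3."""
    return median(values) + c * mad(values)


def split_suspects(values, c=3.0):
    """Split point indices into (remaining, suspects) as in step 2.

    A point is a suspect when its value is strictly greater than the
    cut-off, following the strict inequality used for the RMD values.
    """
    limit = cutoff(values, c)
    suspects = [i for i, v in enumerate(values) if v > limit]
    remaining = [i for i, v in enumerate(values) if v <= limit]
    return remaining, suspects


# Made-up distances for illustration only: seven ordinary points
# followed by two points that lie far out in covariate space.
distances = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 6.5, 7.2]
remaining, suspects = split_suspects(distances, c=3.0)
print("cut-off:", round(cutoff(distances, c=3.0), 3))
print("suspected high leverage points:", suspects)
```

Because both the median and the MAD are barely affected by a few extreme values, the cut-off stays close to the bulk of the data even when the outlying points are present, which is the motivation for using them instead of the mean and standard deviation.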