The 82nd Annual Meeting of the American Association of Physical Anthropologists (2013)


Geometric Morphometrics and Statistical Classification: Size Matters

STEPHEN D. OUSLEY¹ and MICHAEL KENYHERCZ².

¹Department of Anthropology/Archaeology, Mercyhurst University, ²Department of Anthropology, University of Alaska, Fairbanks

Thursday Afternoon, 301E

Geometric Morphometrics (GM) includes powerful methods for exploring and understanding shape variation in human crania. Recently, software such as 3D-ID (Slice and Ross 2009) and MorphoJ (Klingenberg 2011) has used Linear Discriminant Function Analysis (LDFA) for classification using all Procrustes coordinates, which are landmark coordinates adjusted by translation, rotation, and scaling. Classification methods are especially useful in forensic and bioarchaeological settings. However, landmark data can be analyzed in other ways, and the goals of classification differ from those of GM. This study investigates the performance of classification techniques applied to various transformations of landmark data.
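As a rough illustration of two of the three adjustments named above, the sketch below centers a landmark configuration (translation) and divides it by its centroid size (scaling). The rotation step of a full Procrustes superimposition is deliberately omitted, and the function names and the four-landmark specimen are hypothetical, chosen only for illustration; this is not the procedure implemented in 3D-ID or MorphoJ.

```python
import math

def centroid(landmarks):
    """Mean of the landmark coordinates, one value per axis."""
    n = len(landmarks)
    return tuple(sum(p[i] for p in landmarks) / n for i in range(3))

def centroid_size(landmarks):
    """Square root of the summed squared distances from each landmark to the centroid."""
    c = centroid(landmarks)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in landmarks))

def center_and_scale(landmarks):
    """Translate to the origin and scale to unit centroid size (rotation omitted)."""
    c = centroid(landmarks)
    cs = centroid_size(landmarks)
    scaled = [tuple((p[i] - c[i]) / cs for i in range(3)) for p in landmarks]
    return scaled, cs

# Hypothetical four-landmark configuration in arbitrary units, for illustration only.
specimen = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
scaled, size = center_and_scale(specimen)
print(size)                   # centroid size of the raw configuration
print(centroid_size(scaled))  # close to 1 after scaling
```

Returning the centroid size alongside the scaled coordinates keeps the size information available, which scaling otherwise removes from the coordinates themselves.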

Three-dimensional landmark coordinate data from 155 black and white males and females in the Terry Collection were analyzed in four-way classifications using 4 to 40 landmarks. Procrustes coordinates (PCoos), PCoos with centroid size (PCoosCS), principal component scores of the PCoos, and interlandmark distances (ILDs) were employed in LDFA using all variables and using stepwise selection of up to 10 variables. Classification accuracy was assessed using leave-one-out cross-validation in Fordisc 3.1 (Jantz and Ousley 2007).
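Interlandmark distances can be derived directly from raw landmark coordinates as the Euclidean distance between every pair of landmarks. The following is a minimal sketch under that assumption; the function name and the example specimen are hypothetical, and the abstract does not state whether all pairwise distances or only a subset were used.

```python
import math
from itertools import combinations

def interlandmark_distances(landmarks):
    """Euclidean distance for every unordered pair of landmarks."""
    return [math.dist(a, b) for a, b in combinations(landmarks, 2)]

# Hypothetical four-landmark configuration in arbitrary units, for illustration only.
specimen = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
ilds = interlandmark_distances(specimen)
print(len(ilds))  # 4 landmarks give 4 * 3 / 2 = 6 distances
print(ilds)
```

Because the distances are computed on unscaled coordinates, they retain size information without any superimposition step.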

Results highlight the Curse of Dimensionality (overfitting due to too many variables) in all datasets and the value of using size in classifications involving the sexes. Accuracy using all variables peaked at 14 landmarks for all datasets, and PCoosCS generally showed the highest accuracy, but stepwise selection produced the highest accuracies of all, especially using ILDs or PCoosCS. Classification accuracy using all PCoos was generally low, reaching only 50% with 40 landmarks. When GM data are used for classification, optimized classification methods should be used.
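To give a sense of scale for the dimensionality problem, the sketch below counts how many variables each data type would produce for a given number of three-dimensional landmarks, set against the 155 specimens in the sample. The counts are simple arithmetic and assume that every coordinate and every pairwise distance is used; the abstract does not report the exact variable counts, and the number of principal component scores retained is not stated, so that data type is left out. The function name is hypothetical.

```python
def variable_counts(k, dims=3):
    """Number of candidate variables for k landmarks in `dims` dimensions.

    Assumes all coordinates and all pairwise distances are used. In 3D,
    Procrustes superimposition removes 7 degrees of freedom (3 translation,
    3 rotation, 1 scale), so the coordinates are not all independent.
    """
    return {
        "PCoos": k * dims,          # one variable per coordinate
        "PCoosCS": k * dims + 1,    # coordinates plus centroid size
        "ILDs": k * (k - 1) // 2,   # one distance per landmark pair
    }

N_SPECIMENS = 155  # sample size reported in the abstract

for k in (4, 14, 40):
    print(k, variable_counts(k), "specimens:", N_SPECIMENS)
```

At 40 landmarks the full coordinate set already approaches the number of specimens, and the full set of pairwise distances far exceeds it, which is consistent with the case made above for selecting a small subset of variables.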
