Abstract The demand for efficient image sorting methods has increased due to technological advancements that enable more intensive phytoplankton monitoring. Both statistical and machine learning algorithms can misidentify algal taxa in taxonomically diverse samples, in which phytoplankton morphology and image traits can vary. We evaluated the statistical filtering performance of the image processing software of an imaging flow cytometer (FlowCam) for two approaches to image library development; these were applied independently to seven commonly occurring algal shapes in mixed natural samples. The “intrinsic method” used a small selection of images (5–15 images of a target taxon) from the same sample being filtered (i.e., intrinsic), whereas the “compiled method” used a larger selection of images (30–80 images of a target taxon) compiled from multiple samples. Filter performance varied with the type of image library, image library size, and target taxon. The largest image libraries offered the highest recall (> 86% for intrinsic, > 94% for compiled), but precision was lower for the compiled method (< 20% for most taxa) than for the intrinsic method (3–85%; 75% for most taxa). Statistical filtering performance was higher for larger, solitary-celled taxa with relatively uniform features (e.g., Gyrosigma) than for small-celled colonial species with more complex or variable shapes (e.g., mucilaginous colonial cyanobacteria and Scenedesmus). Iterative use of the intrinsic statistical filtering method, with manual correction between iterations, can augment manual sample classification and reduce processing time.
Farrow et al. studied this question.
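The abstract reports filter performance as recall and precision without defining them. The sketch below uses the conventional definitions (recall is the share of a target taxon's images that the filter retrieves; precision is the share of retrieved images that actually belong to the target taxon), on the assumption that the study uses the terms in this standard sense. The function name `precision_recall` and the counts are hypothetical illustrations, not values or code from the study or from the FlowCam software.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) as fractions in [0, 1].

    true_positives:  target-taxon images the filter selected
    false_positives: non-target images the filter selected
    false_negatives: target-taxon images the filter missed
    """
    selected = true_positives + false_positives   # everything the filter returned
    relevant = true_positives + false_negatives   # every target image in the sample
    # Guard against empty denominators (e.g. the filter returned nothing).
    precision = true_positives / selected if selected else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


# Hypothetical counts for one target taxon in one sample (not from the study).
p, r = precision_recall(true_positives=45, false_positives=15, false_negatives=5)
print(f"precision = {p:.0%}, recall = {r:.0%}")  # precision = 75%, recall = 90%
```

Under these definitions, a filter with high recall but low precision finds nearly all images of the target taxon while also returning many non-target images, which is where manual correction between filtering iterations would come in.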