Enterobacter sakazakii is an emergent pathogen associated with the ingestion of infant formula milk; infection can lead to neonatal meningitis, necrotising enterocolitis and sepsis [1–5]. The International Commission for Microbiological Specifications for Foods has ranked E. sakazakii as 'severe hazard for restricted populations, life threatening or substantial chronic sequelae or long duration'. Because there is no accepted gold standard methodology, the correct definition and identification of E. sakazakii is important for powdered infant formula manufacturers, as well as regulators, clinicians and epidemiologists.
In 1980, Farmer and co-workers defined the species and described fifteen biogroups according to biochemical profiles. A defining characteristic has been activity of the α-glucosidase enzyme. Consequently, selective, differential media incorporating chromogenic or fluorogenic α-glucosides, such as the indolyl substrate 5-bromo-4-chloro-3-indolyl-α-D-glucopyranoside, have been developed [8, 9]. It has been reported that 100% of E. sakazakii (n = 129) were positive for α-glucosidase in comparison to 0% of other Enterobacter species (n = 97); however, a small number of other Enterobacteriaceae test positive for this enzyme.
Recently, 16S rDNA sequencing has revealed that commercial biochemical test kits identified more than one species as 'E. sakazakii', and that there are at least four genetically and biochemically distinct subgroups of E. sakazakii. In this study we applied Artificial Neural Networks (ANNs) [12–14] to biochemical and 16S rDNA data in order to identify key phenotypic characteristics and nucleotide sequences which could improve the identification of E. sakazakii with respect to (a) other Enterobacteriaceae, and (b) non-E. sakazakii α-glucosidase positive Enterobacteriaceae.
ANNs are adaptive, non-linear forms of Artificial Intelligence (AI) inspired by the way the human brain learns and processes information in order to solve specific problems, such as pattern recognition and classification. The multi-layer perceptron (MLP) is a feed-forward ANN architecture which contains several layers, with each node in one layer connected to every node in the next by a weighted link. When used with the back-propagation algorithm, this type of ANN learns in a fashion analogous to learning in the human brain, that is, by example. In humans, learning involves minor adjustments to the synaptic connections between neurons; in ANNs, learning is achieved by updating the weights between the processing elements that constitute the network topology.
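The weight-update idea described above can be illustrated with a minimal sketch of a one-hidden-layer MLP trained by back-propagation. This is not the network used in this study: the layer sizes, learning rate, number of epochs, sigmoid activation, squared-error loss and the toy data (all eight combinations of three binary inputs, with the first input as the target) are assumptions chosen only to keep the example small, and the names MLP, train_step and mean_squared_error are hypothetical.

```python
import math
import random
from itertools import product


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MLP:
    """Fully connected feed-forward network: inputs -> one hidden layer -> one output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # One weight per link; the last entry of each weight list is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # Every input feeds every hidden node; every hidden node feeds the output.
        hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0])))
                  for ws in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train_step(self, x, target, lr):
        hidden, out = self.forward(x)
        # Error signal at the output for squared error with a sigmoid unit.
        delta_out = (out - target) * out * (1.0 - out)
        # Propagate the error signal back through the output weights.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        # Learning = small adjustments to the weights on each link.
        for j, v in enumerate(hidden + [1.0]):
            self.w_out[j] -= lr * delta_out * v
        for j, ws in enumerate(self.w_hidden):
            for i, v in enumerate(x + [1.0]):
                ws[i] -= lr * delta_hidden[j] * v


def mean_squared_error(net, inputs, targets):
    return sum((net.forward(x)[1] - t) ** 2
               for x, t in zip(inputs, targets)) / len(inputs)


# Toy data (assumed for illustration): the target simply equals the first input.
inputs = [[float(b) for b in bits] for bits in product([0, 1], repeat=3)]
targets = [x[0] for x in inputs]

net = MLP(n_in=3, n_hidden=4)
loss_before = mean_squared_error(net, inputs, targets)
for _ in range(2000):  # learning by repeated presentation of examples
    for x, t in zip(inputs, targets):
        net.train_step(x, t, lr=0.5)
loss_after = mean_squared_error(net, inputs, targets)
print(loss_before, loss_after)
```

In this sketch the mean squared error after training should be lower than before training, which is the sense in which the network "learns by example": each presented example nudges the weights in the direction that reduces the output error.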
ANNs were applied to biochemical and 16S rDNA data derived from 282 strains of Enterobacteriaceae, including 189 E. sakazakii isolates, in order to identify key characteristics which could improve the identification of E. sakazakii. The results show that ANNs have the potential to identify key features in both the biochemical test data and the sequence data. These key features may then form the basis of novel rapid identification systems, which have the ability to classify samples by strain and eliminate the risk of false-positive and false-negative results.