A gzip-based algorithm to identify bacterial families by 16S rRNA

ROMANO SPICA V
2006-01-01

Abstract

Aims: To identify the bacterial family of 16S rDNA sequences using a strategy based on data compression algorithms. Methods and Results: Perl scripts were developed to analyse similarities between microbial sequences using the gzip data compression technique. For each bacterial family (n = 196), a 16S rRNA reference file was constructed, and new queries were compared against it on the basis of compression performance. A user-friendly online bioinformatics tool was built to assign a bacterial family to a 16S rRNA sequence. It was successfully applied to recognize different bacterial families, including Legionellaceae, Bacillaceae, Enterobacteriaceae, Acetobacteriaceae and Rhizobiaceae. The percentage of positive identifications is higher than 95% for fragments over 450 bp. Conclusions: A new bioinformatics approach has been developed to assign a taxonomic classification to a 16S rDNA sequence. An online tool provides quick and easy sequence attribution. The general principle can be applied to other genes of taxonomic interest. Significance and Impact of the Study: The availability of simple bioinformatics tools can support the development of molecular-based analysis and classification of bacteria, especially for environmental or uncultured strains.
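The compression-based comparison described in the abstract can be illustrated with a minimal sketch. The original work used Perl scripts and gzip; the Python version below is not the authors' code. It assumes that "compression performance" is scored as the number of extra compressed bytes needed when a query is appended to a family's reference text, with the lowest cost indicating the best match. The function names `compressed_size`, `compression_cost` and `assign_family` are hypothetical, and `zlib` is used because it implements DEFLATE, the algorithm underlying gzip.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the DEFLATE-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def compression_cost(reference: str, query: str) -> int:
    """Extra compressed bytes needed to encode the query after the reference.

    Assumption: a query that shares long substrings with the reference
    compresses cheaply, so a lower cost suggests higher similarity.
    """
    ref = reference.encode("ascii")
    return compressed_size(ref + query.encode("ascii")) - compressed_size(ref)


def assign_family(query: str, references: Mapping[str, str]) -> str:
    """Return the family whose reference text gives the lowest compression cost."""
    return min(references, key=lambda family: compression_cost(references[family], query))
```

One practical caveat of this sketch: DEFLATE only finds repeats within a 32 KB sliding window, so with a long reference text only the sequences nearest the appended query can contribute to the score. How the original tool handles reference files larger than this is not stated in the abstract.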
Keywords: environmental microbiology, bacteria, bioinformatics

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14244/3071