
NCTC in the media: Scientists map genetic codes of 3000 dangerous bacteria

Professor Julian Parkhill’s group at the Wellcome Sanger Institute (WSI) and Pacific Biosciences featured together with the NCTC team in multiple media articles in June 2018 after we reached the important milestone of having sequenced DNA from 3000 NCTC bacteria.

Professor Parkhill said: “Historical collections such as the NCTC are of enormous value in understanding current pathogens. Knowing very accurately what bacteria looked like before and during the introduction of antibiotics and vaccines, and comparing them to current strains from the same collection, shows us how they have responded to these treatments. This in turn helps us develop new antibiotics and vaccines. PacBio’s comprehensive DNA sequencing enables deep genomic analyses, and we are happy to be partnering with them for this important project.”

The NCTC team, working in partnership with Professor Julian Parkhill’s group at the Wellcome Sanger Institute (WSI), embarked on an ambitious project in 2014 to generate reference whole genome sequences (WGS) for 3000 of the bacterial pathogens preserved in NCTC. Further collaboration between the WSI and Pacific Biosciences of California Inc meant that the teams have been able to deliver long-read sequences, which are particularly important for reference genomes because this technology generates more complete genomic data.

In June 2018 we reached our target, having sequenced the DNA samples extracted from 3000 NCTC bacteria, and this milestone generated significant media interest. Perhaps unsurprisingly, headlines focused on the ‘dangerous and deadly’ bacteria preserved in the collection, referencing diseases such as plague, dysentery and cholera. Decoding emerging strains, such as those with multiple antimicrobial resistance mechanisms, also featured, with emphasis on evolving “superbugs” as one of the biggest threats facing medicine today.

Historically interesting strains were cited, including an isolate from Alexander Fleming’s nose (NCTC 4842 Haemophilus influenzae) and a dysentery-causing strain from a World War One soldier (NCTC 1 Shigella flexneri). Including strains of Mycobacterium tuberculosis and Neisseria gonorrhoeae in the project raised awareness that these two bacterial species alone infect nearly 90 million people a year, with 1.7 million deaths attributed to tuberculosis in 2016 and recent WHO reports stating that gonorrhoea is becoming almost untreatable.

The data generated by this project will enable researchers to better understand infectious diseases and how bacteria become resistant to antibiotics. The publicly available genomic maps could also lead to the development of new diagnostic tests, vaccines or treatments. That is why there is one further element to this incredible project: the delivery of an electronic information centre that will bring together all the metadata associated with the NCTC strains, including the raw and assembled genomic data. This final stage is well underway with a release date scheduled for 2019. In the meantime much of the data can be accessed via the NCTC website or the public EMBL databases at:

The media coverage of this impressive story largely overlooked the scientific significance of sequencing the type strains of 852 bacterial species associated with human infection, as well as the fact that at least 298 of those type strains had no WGS data available in any public database. The potential for using this data in phylogeny and population genetics studies wasn’t highlighted on this occasion either. Nevertheless, a Reuters article reporting this project was published all over the world, including by more than 34 US online outlets and in Bangladesh, British Virgin Islands, Canada, India, Indonesia, Ireland, Japan, Malaysia, New Zealand, Saudi Arabia and UAE, and multiple press outlets within the United Kingdom covered the story.


Read Wellcome Sanger Institute article 


Written by Julie Russell

January 2020