Balderton joins $30M Series D for big data biotech platform play, Sophia Genetics

Switzerland-based SaaS startup Sophia Genetics is hoping to give IBM Watson a run for its money in the healthcare diagnostics space. It’s built a big data analytics platform that harnesses clinicians’ medical expertise to enhance genomic diagnostics via AI algorithms, leading, it says, to better and faster diagnoses for patients with conditions such as cancer.

Hospitals that use the platform are intended to mutually benefit from expert-fed, algorithmic DNA sequencing diagnostic insights precisely because those insights are shared across the platform. So as the user base scales (it says it’s adding 10 new hospitals each month), Sophia Genetics’ AIs get smarter and more accurate, and patients anywhere can benefit from the pooled knowledge.

The company is announcing a $30 million Series D funding round today, adding UK-based VC firm Balderton Capital to its investor roster, along with 360 Capital Partners. Previous investors include UK tech entrepreneur Mike Lynch’s Invoke Capital, and Alychlo, started by Marc Coucke, a Belgian pharmaceutical entrepreneur.

According to Crunchbase, the biotech business had raised $28.75M since being founded back in 2011, so it has pulled in a total of $58.75M thus far: capital that’s been used to develop its platform proposition to a tipping point of utility, as co-founder and CEO Dr Jurgi Camblong explains.

As the cost of genome sequencing has come down, he says, the challenge for healthcare providers has been quickly and accurately reading and analyzing the more readily available DNA sequencing data. This is where Sophia Genetics’ analytics platform aims to assist, currently targeting oncology, hereditary cancer, metabolic disorders, pediatrics and cardiology.

“With the decreasing costs of these technologies that [are] basically digitalizing patients’ DNA information, we did see an opportunity to engage with hospitals to help them be part of a community and share experience and knowledge to continuously better diagnose and treat patients through the use of such digital technologies,” he tells TechCrunch.

“Since our dream was to have an impact on better diagnosing the maximum number of patients, we thought that in the end the best route was helping every hospital to leverage this genomic technology, rather than building a company that would end up competing with the hospitals. And so that’s why we constructed a software-as-a-service platform.”

However, for the platform play to work Camblong says the company needed to be able to attract hospitals to sign up even before it had algorithms that could offer accelerated diagnostic insights — so it needed to be able to offer them something of value right away to get them involved.

And while Camblong said the team’s initial thought was that processing and storage would likely be the major challenges for hospitals handling what are extremely large genomic data-sets, along with issues such as data integrity, privacy and visualization, they actually found the main problem hospitals were grappling with was data accuracy. So they set out to help with that, to offer early utility and win longer-term buy-in from clinicians.

“All of them [had purchased] those technologies to basically better diagnose patients, but the data they would render, although larger, would not be as accurate as what they would have with legacy technology; and this is where we were somehow forced as a startup … to develop algorithms that would correct the data so that clinicians would be able to rely on this data, and use this data to better diagnose patients,” he says.

“This is really how we started: from 2011, where we had nothing, to launching our platform in 2014, when we were 20 employees and working with I think 50 hospitals by the end of 2014. To today, where we are working with over 350 hospitals that are all connected through our SaaS platform, all pooling patients’ genome data and sharing knowledge to continuously get a better outcome from our algorithms, which by [now] have become an artificial intelligence.”

On the data accuracy issue, Camblong says the startup worked with hospitals to benchmark DNA samples investigated via their sequencing systems, with the aim of “getting the signal out of the noise”, as he puts it, and then training algorithms of its own to perform that de-noising process automatically, and to distinguish the salient/relevant patterns in the genome data. And thus, ultimately, to speed up diagnoses in the targeted health areas.
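The benchmarking idea Camblong describes can be pictured as comparing a sequencing run’s variant calls against a trusted reference set for the same sample: calls outside the reference set are candidate noise. A minimal, purely illustrative sketch — the variant tuples and the truth set below are invented, not Sophia Genetics’ actual data or method:

```python
# Hypothetical sketch: split a run's variant calls into signal and noise
# by comparing them against a benchmarked truth set for the same sample.

def benchmark_calls(called_variants, truth_variants):
    """Partition variant calls using a benchmarked truth set."""
    called = set(called_variants)
    truth = set(truth_variants)
    signal = called & truth   # confirmed by the benchmark
    noise = called - truth    # likely sequencing artifacts (false positives)
    missed = truth - called   # false negatives
    return signal, noise, missed

# Invented example: (chromosome, position, reference base, alternate base)
run = {("chr17", 41245466, "G", "A"),
       ("chr13", 32907530, "T", "C"),
       ("chr2", 1000000, "A", "T")}          # one spurious call
benchmark = {("chr17", 41245466, "G", "A"),
             ("chr13", 32907530, "T", "C")}

signal, noise, missed = benchmark_calls(run, benchmark)
print(len(signal), len(noise), len(missed))  # 2 1 0
```

Once the systematic error profile of a sequencer is measured this way, a filter can be trained to discard the noise automatically on future runs.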

Sophia Genetics refers to its business as sitting within the “fast-emerging field of data-driven medicine”, and is specifically applying AI to enhance relatively modern, so-called “Next Generation DNA Sequencing” (NGS) techniques, which are faster than, but not as accurate as, older-generation legacy systems, according to Camblong.

“All the AI technology that we’ve developed is based on statistical inference, pattern recognition, and some of it as well on machine learning,” he says of Sophia Genetics’ core tech.


“Data are not valuable any more once you have them,” he adds, fleshing out the startup’s relationship with its hospital customers/partners. “In any AI industry what is interesting is [having] the capacity to be exposed to the problem and teach an algorithm how to recognize and solve the problem. But once you have taught this AI [to do] that, you don’t require any more the data you’ve been computing. So it’s not so much the fact that we get access to this data; it’s because, unlike any other actor in the industry, we took this challenge of taking the pain.

“Unlike any other company, we understood that their problem was accuracy, and we took on the challenge of addressing those problems of accuracy.”

Commenting on why Sophia Genetics stood out for Balderton, partner James Wise told us: “On top of their easy-to-use workflow tool to annotate and use sequenced data (compared with unsupported open source software) and their active clinician community, Sophia’s real technological advantage comes out of its machine learning technology that analyses the genomic data and minimises the noise from the use of multiple different combinations of sequencers and diagnostic kits to identify variants (DNA modifications) with clinical-grade accuracy.”

“As the market for diagnostic kits continues to expand, and as new sequencers come to market, there will continue to be a plethora of different ways that clinicians can use genomic data to make a diagnosis. But this requires a sophisticated third-party platform to handle these many different inputs and to optimize their outcomes, in Sophia Genetics’ case by employing machine learning techniques across the huge datasets and through testing with their clinician network,” he added.

“While there are competing solutions for tertiary analysis that may work well with a certain type of sequencer, it is Sophia’s independent stance and its technical ability to incorporate any combination of diagnostic kit and sequencer that makes its technology universal and unique.”

Camblong says Sophia Genetics has benchmarked DNA sequencing data for more than 10,000 patients, and for over 500,000 unique variants at the present stage, and it has three “core” diagnostic technologies trained off of this data.

It says the process it uses has been validated with more than 340 different DNA sequencers, while its algorithms were built bottom-up from raw FASTQ data (aka the most common file format used in DNA sequencing). It claims its tech is universally applicable.
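FASTQ itself is simple enough to illustrate: each read occupies four lines (an identifier, the bases, a “+” separator, and per-base quality encoded as ASCII characters). The sketch below, with made-up reads and an arbitrary quality threshold, shows the kind of low-level filtering any pipeline built on raw FASTQ data has to perform; it is not Sophia Genetics’ code:

```python
# Minimal FASTQ parser with a mean-quality filter (illustrative only).

def parse_fastq(lines):
    """Yield (read_id, sequence, quality_scores) from FASTQ lines."""
    it = iter(lines)
    for header in it:
        seq = next(it)
        next(it)  # the "+" separator line
        qual = next(it)
        # Phred+33 encoding: quality score = ASCII code minus 33
        scores = [ord(c) - 33 for c in qual.strip()]
        yield header.strip()[1:], seq.strip(), scores

def mean_quality(scores):
    return sum(scores) / len(scores)

raw = ["@read1", "ACGT", "+", "IIII",   # 'I' = Phred 40: high confidence
       "@read2", "ACGT", "+", "!!!!"]   # '!' = Phred 0: pure noise
reads = list(parse_fastq(raw))
kept = [r for r in reads if mean_quality(r[2]) >= 20]
print([r[0] for r in kept])  # ['read1']
```

Real pipelines do far more than drop whole reads, but the principle — per-base confidence scores travel with the sequence — is what makes algorithmic de-noising possible at all.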

“You cannot use deep learning techniques in this industry,” says Camblong, elaborating on why the business took several years to develop algorithms manually, with human experts benchmarking and analyzing data. “You need to have the prior knowledge. Deep learning requires you to have millions and millions of data [points]. And then you can expect that, because of that, eventually the neurons you build are going to be able to find the route on their own. In this industry you need to have prior knowledge.

“First, for the accuracy phase, Sophia has been learning [from] our data scientists, because they have been exposed to the patterns [i.e. by analyzing the DNA sequencing data] … and then at the second stage, once you have a platform … the platform can evolve and learn with machine learning techniques.”

At this stage, he says, the business is in its second phase: utilizing the network of hospitals and clinicians it has signed up and connected via its platform, and drawing on the access to thousands of cases it’s been afforded, coupled with the continued elbow grease of clinicians feeding their diagnostic knowledge on the pathogenicity of variants into the platform on an ongoing basis, to be in a position to now apply machine learning techniques to accelerate utility and scale the business. Hence taking in more funding.

Camblong refers to what the platform does as a “democratization” of DNA sequencing expertise, asserting: “So that the next hospital that starts using [our] technology will enter at a level where it will require less competencies, less experience to be able to diagnose patients through the use of genomic information.”

It charges hospitals for use of the platform on an on-demand basis — so they pay per analysis performed, rather than having to shell out for a fixed monthly fee.

The workflow for using the platform involves a patient with one of the suspected conditions arriving at the hospital and having a sample taken. Their DNA is extracted and enriched using molecular biology techniques, and the selected genes are read by the hospital’s NGS machine.

The digitization of that data takes two days, after which users log in to Sophia Genetics’ platform and load in the raw data, which is transferred to the company’s datacenters (“in an anonymized way”, according to Camblong; he also confirms that the platform prompts hospitals to confirm it has patients’ consent for transferring their data to be processed by a third party). The startup’s AI algorithms then get to work to pull out unique genetic variants.

“These data are going to be annotated … it means that you add additional information that is out there in public databases, or as well in the databases of the users of Sophia DDM, and then the information is being ranked according to pathogenicity predictions,” he continues, noting that the data processing undertaken by its AI takes two hours.
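The annotate-then-rank step he describes can be sketched roughly as a lookup against a knowledge base followed by a sort on a pathogenicity score. The database entries and scores below are entirely invented for illustration; Sophia DDM’s actual annotation sources and scoring model are not public:

```python
# Hypothetical annotate-and-rank step: enrich each variant with whatever
# a shared knowledge base says about it, then sort by predicted
# pathogenicity so the most clinically relevant variants surface first.

KNOWN_VARIANTS = {  # invented entries, illustrative only
    ("BRCA1", "c.68_69delAG"): {"classification": "pathogenic", "score": 0.99},
    ("BRCA2", "c.9976A>T"):    {"classification": "benign",     "score": 0.02},
}

def annotate_and_rank(variants):
    annotated = []
    for v in variants:
        info = KNOWN_VARIANTS.get(v, {"classification": "unknown", "score": 0.5})
        annotated.append({"variant": v, **info})
    # Highest predicted pathogenicity first
    return sorted(annotated, key=lambda a: a["score"], reverse=True)

ranked = annotate_and_rank([("BRCA2", "c.9976A>T"), ("BRCA1", "c.68_69delAG")])
print([a["variant"][0] for a in ranked])  # ['BRCA1', 'BRCA2']
```

Note how an unseen variant defaults to “unknown” with a middling score — exactly the kind of case the article says clinicians then classify by hand, feeding their verdict back into the shared database.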

“Two hours later the user logs in and, given the genetic variants that have been seen, the user is going to take certain steps, so Sophia can learn as well from these actions. The expert is going to classify those variants as being pathogenic or benign.”

Camblong says the platform has moved from an accuracy rate of 85% for classification of variants for the first 10,000 patients, to 95% with the following 10,000, and 98% with the 10,000 after that.

“We are always between 99.9% and 100% for sensitivity, and between 99% and 100% for specificity,” he adds of the platform’s current median accuracy range.
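For readers unfamiliar with the two figures: sensitivity is the share of truly pathogenic variants the system catches, while specificity is the share of benign variants it correctly leaves alone. A quick worked example — the counts below are invented purely to show the arithmetic:

```python
# Sensitivity (true positive rate) and specificity (true negative rate),
# computed from confusion-matrix counts. Counts are invented examples.

def sensitivity(true_positives, false_negatives):
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    return true_negatives / (true_negatives + false_positives)

# e.g. 999 pathogenic variants flagged and 1 missed;
#      990 benign variants cleared and 10 wrongly flagged
print(round(sensitivity(999, 1), 3))   # 0.999
print(round(specificity(990, 10), 2))  # 0.99
```

The distinction matters clinically: a missed pathogenic variant (low sensitivity) and a benign variant flagged as dangerous (low specificity) are very different failure modes.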

As it evolves, he says, the wider vision is to add more layers to expand its capabilities: so it could, for example, process imaging data from medical scans together with molecular genomics data to support more powerful predictive analyses.

“If you combine two sequential images and molecular information about [a cancer] tumor you can predict how the tumor is going to evolve in the following months,” he suggests, saying surgeons could then make decisions about whether they need to operate immediately or whether they can wait. So the big push is towards the opportunity of an ever more personalized sort of healthcare, enabled by AI being able to shrink the timescales and costs of performing robust genomic analysis.

He says the new funding will be used to “fully deploy” Sophia’s SaaS platform globally, and to ramp up commercial activity, moving beyond its current focus on Europe to Latin America, AsiaPac, Canada and the U.S.

“We believe that the number of hospitals that will adopt our technology will dramatically ramp up over the following financial year, ” he says.

The investment will also go into oncology, specifically — towards developing what he calls “full management of a cancer case”, explaining this as all-encompassing: “From the first image that has been taken with a scan, up to the monitoring of the efficiency of the treatment and eventually adaptation of the treatment.”

It also intends to add additional capacity generally, so it can associate molecular info with metadata, such as imaging data — to start to push towards expanding the platform’s analytical abilities by supporting the co-processing of multiple types of healthcare data pertaining to its targeted conditions.

Though Camblong concedes that personal privacy challenges will step up as more highly sensitive medical data gets processed in concert.

“We took [privacy] very seriously. There are companies in the industry that have made very bad moves in the past. And we have never wanted to go for a DTC [direct-to-consumer] approach. For us it was very clear that if [you want to] impact on better diagnosing the maximum number of patients, trust by the institutions would be very important,” he says.

“You cannot roll out an AI unless you build it bottom-up. So everything you’ve been challenging me about, on how we’ve been able to build this AI and make it accurate, is actually what distinguishes Sophia from any other actor that may want to be important in this space. We have been the only one who made the effort of digging into this complexity of making the data accurate, and of making everything bottom-up, because that’s the only way you are able to build smart intelligence, or artificial intelligence,” he adds.

“To take a parallel, self-driving vehicles are not going to learn from speech recognition systems; they will learn from you, from me, from people that are going to drive cars, make mistakes, take right decisions. And by being able to know whether we have taken the right decision or whether we’ve made mistakes, we are going to be able to teach the cars how to drive themselves.”

Read more:

AI data-monopoly risks to be probed by UK parliamentarians

The UK’s upper house of parliament is asking for contributions to an enquiry into the socioeconomic and ethical impacts of artificial intelligence technology.

Among the questions the House of Lords committee will consider as part of the enquiry are:

Is the current level of excitement surrounding artificial intelligence warranted?
How can the general public best be prepared for more widespread use of artificial intelligence?
Who in society is gaining the most from the growth and use of artificial intelligence? Who is gaining the least?
Should the public’s understanding of, and engagement with, artificial intelligence be improved?
What are the key industry sectors that stand to benefit from the growth and use of artificial intelligence?
How can the data-based monopolies of some large corporations, and the winner-takes-all economics associated with them, be addressed?
What are the ethical implications of the growth and use of artificial intelligence?
In what situations is a relative absence of transparency in artificial intelligence systems (so-called black boxing) acceptable?
What role should the government take in the development and use of artificial intelligence in the UK? Should artificial intelligence be regulated?

Artificial Intelligence Is Setting Up the Internet for a Huge Clash With Europe

Neural networks are changing the Internet. Inspired by the networks of neurons inside the human brain, these deep mathematical models can learn discrete tasks by analyzing enormous amounts of data. They’ve learned to recognize faces in photos, identify spoken commands, and translate text from one language to another. And that’s just a start. They’re also moving into the heart of tech giants like Google and Facebook. They’re helping to choose what you see when you query the Google search engine or visit your Facebook News Feed.

All this is sharpening the behavior of online services. But it also means the Internet is poised for an ideological showdown with the European Union, the world’s single largest online market.

In April, the EU laid down new regulations for the collection, storage, and use of personal data, including online data. Ten years in the making and set to take effect in 2018, the General Data Protection Regulation guards the data of EU citizens even when collected by companies based in other parts of the world. It codifies the “right to be forgotten”, which lets citizens request that certain links not appear when their name is typed into Internet search engines. And it gives EU authorities the power to fine companies an enormous 20 million euros, or 4 percent of their global revenue, if they infringe.

But that’s not all. With a few paragraphs buried in the measure’s reams of bureaucrat-speak, the GDPR also restricts what the EU calls “automated individual decision-making.” And for the world’s biggest tech companies, that’s a potential problem. “Automated individual decision-making” is what neural networks do. “They’re talking about machine learning,” says Bryce Goodman, a philosophy and social science researcher at Oxford University who, together with a fellow Oxford researcher, recently published a paper investigating the potential effects of these new regulations.

Hard to Explain

The regulations prohibit any automated decision that “significantly affects” EU citizens. This includes techniques that evaluate a person’s “performance at work, economic situation, health, personal preferences, interests, reliability, behavior, location, or movements.” At the same time, the legislation provides what Goodman calls a “right to explanation.” In other words, the rules give EU citizens the option of reviewing how a particular service made a particular algorithmic decision.

Both of these stipulations could strike at the heart of major Internet services. At Facebook, for example, machine learning systems are already driving ad targeting, and these depend on vast amounts of personal data. What’s more, machine learning doesn’t exactly lend itself to that “right to explanation.” Explaining what goes on inside a neural network is a complicated process even for the experts. These systems operate by analyzing millions of pieces of data, and though they work quite well, it’s difficult to determine precisely why they work so well. You can’t easily trace their precise path to a final answer.

Viktor Mayer-Schönberger, an Oxford expert in Internet governance who helped draft parts of the new legislation, says that the GDPR’s description of automated decisions is open to interpretation. But at the moment, he says, the “big question” is how this language affects deep neural networks. Deep neural nets depend on vast amounts of data, and they produce complex algorithms that can be opaque even to the people who put these systems in place. “On both those levels, the GDPR has something to say,” Mayer-Schönberger says.

Poised for Conflict

Goodman, for one, believes the regulations strike at the heart of Facebook’s business model. “The legislation has these big multi-national companies in mind,” he says. Facebook did not respond to a request for comment on the matter, but the tension here is obvious. The company makes billions of dollars a year targeting ads, and it’s now using machine learning techniques to do so. All signs indicate that Google has also applied neural networks to ad targeting, just as it has applied them to “organic” search results. It too did not respond to a request for comment.

Neural networks themselves elude easy explanation, which likely makes some kind of conflict inevitable.

But Goodman isn’t just pointing at the big Internet players. The latest in machine learning is trickling down from these giants to the rest of the Internet. The new EU regulations, he says, could affect the progress of everything from ordinary online recommendation engines to credit card and insurance companies.

European courts may ultimately find that neural networks don’t fall into the automated decision category, that they’re more about statistical analysis, says Mayer-Schönberger. Even then, however, tech companies are left wrestling with the “right to explanation.” As he explains, part of the beauty of deep neural nets is that they’re “black boxes.” They work beyond the bounds of human logic, which means the myriad industries that will adopt this technology in the years to come will have trouble sussing out the kind of explanation the EU regulations seem to demand.

“It’s not impossible,” says Chris Nicholson, the CEO and founder of the neural networking startup Skymind. “But it’s complicated.”

Human Intervention

One way around this conundrum is for human decision-makers to intervene or override automated algorithms. In many cases, this already happens, since so many services use machine learning in tandem with other technologies, including rules explicitly defined by humans. This is how the Google search engine works. “A lot of the time, algorithms are only part of the solution, a human-in-the-loop answer,” Nicholson says.

But the Internet is moving towards more automation, not less. And in the end, human intervention isn’t necessarily the best answer. “Humans are far worse,” one commenter wrote on Hacker News, the popular tech discussion site. “We are incredibly biased.”


It’s a fair argument. And it will only become fairer as machine learning continues to improve. People tend to put their faith in humans over machines, but machines are growing more and more important. This is the same tension at the heart of ongoing discussions over the ethics of self-driving cars. Some say: “We can’t let machines make moral decisions.” But others say: “You’ll change your mind when you see how much safer the roads are.” Machines will never be human. But in some cases, they will be better than humans.

Beyond Data Protection

Ultimately, as Goodman suggests, the conundrums presented by the new EU regulations will extend to everything. Machine learning is the way of the future, whether the task is generating search results, navigating roads, trading stocks, or finding a romantic partner. Google is now on a mission to retrain its staff for this new world order. Facebook offers all sorts of tools that let anyone inside the company tap into the power of machine learning. Google, Microsoft, and Amazon are now offering their machine learning techniques to the rest of the world via their cloud computing services.

The GDPR deals in data protection. But this is just one area of potential conflict. How, for example, will anti-trust laws treat machine learning? Google is now facing a case that accuses the company of discriminating against certain competitors in its search results. But this case was brought years ago. What happens when companies complain that machines are doing the discriminating?

“Refuting the evidence becomes more problematic,” says Mayer-Schönberger, because even Google may have trouble explaining why a decision is made.

Read more:

Google’s AI Reads Retinas to Prevent Blindness in Diabetics


Google’s artificial intelligence can play the ancient game of Go better than any human. It can identify faces, recognise spoken words, and pull answers to your questions from the web. But the promise is that this same kind of technology will soon handle far more serious work than playing games and feeding smartphone apps. One day, it could help care for the human body.

Demonstrating this promise, Google researchers have worked with doctors to develop an AI that can automatically identify diabetic retinopathy, a leading cause of blindness among adults. Using deep learning, the same breed of AI that identifies faces, animals, and objects in pictures uploaded to Google’s online services, the system detects the condition by analyzing retinal photos. In a recent study, it succeeded at about the same rate as human ophthalmologists, according to a paper published today in the Journal of the American Medical Association.

“We were able to take something core to Google, classifying cats and dogs and faces, and apply it to another sort of problem,” says Lily Peng, the physician and biomedical technologist who oversees the project at Google.

But the idea behind this AI isn’t to replace doctors. Blindness is often preventable if diabetic retinopathy is caught early. The hope is that the technology can screen far more people for the condition than doctors could on their own, particularly in countries where healthcare is limited, says Peng. The project began, she says, when a Google researcher realized that doctors in his native India were struggling to screen all the locals that needed to be screened.

In many places, doctors are already using photos to diagnose the condition without seeing patients in person. “This is a well validated technology that can bring screening services to remote locations where diabetic retinal eye screening is less available,” says David McColloch, a clinical professor of medicine at the University of Washington who specializes in diabetes. That could provide a convenient on-ramp for an AI that automates the process.

Peng’s project is part of a much wider effort to detect disease and illness using deep neural networks, pattern recognition systems that can learn discrete tasks by analyzing vast amounts of data. Researchers at DeepMind, a Google AI lab in London, have teamed with Britain’s National Health Service to build various technologies that can automatically detect when patients are at risk of disease and illness, and several other companies, including a startup called Enlitic, are exploring similar systems. At Kaggle, an internet site where data scientists compete to solve real-world problems using algorithms, groups have worked to build their own machine learning systems that can automatically identify diabetic retinopathy.

Medical Brains

Peng is part of Google Brain, a team inside the company that offers AI software and services for everything from search to security to Android. Within this team, she now leads a group spanning dozens of researchers that focuses solely on medical applications for AI.

The work on diabetic retinopathy started as a “20 percent project” about two years ago, before becoming a full-time endeavor. Researchers began working with hospitals in India, Aravind and Sankara, that were already collecting retinal photos for doctors to examine. Then the Google team asked more than four dozen doctors in India and the US to identify photos where micro-aneurysms, hemorrhages, and other issues indicated that diabetic patients could be at risk for blindness. At least three doctors reviewed each photo, before Peng and team fed about 128,000 of these images into their neural network.
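A labeling scheme where several doctors grade each photo is typically reduced to one training label per image, for instance by majority vote. Below is a minimal sketch of that aggregation step, with invented photo IDs and grades; it is not Google’s actual protocol, which may resolve disagreements differently:

```python
# Reduce multiple doctors' grades per photo to one training label
# by majority vote. Photo IDs and grades are invented examples.
from collections import Counter

def majority_label(grades):
    """Return the most common grade among the doctors' assessments."""
    return Counter(grades).most_common(1)[0][0]

photo_grades = {
    "photo_001": ["referable", "referable", "non-referable"],
    "photo_002": ["non-referable", "non-referable", "non-referable"],
}
labels = {pid: majority_label(g) for pid, g in photo_grades.items()}
print(labels["photo_001"])  # referable
```

Aggregating several expert opinions this way reduces the noise from any single grader, which matters when the resulting labels are the ground truth a network trains against.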

Ultimately, the system identified the condition slightly more consistently than the original group of doctors. At its most sensitive, the system avoided both false negatives and false positives more than 90 percent of the time, exceeding the National Institutes of Health’s recommended standard of at least 80 percent accuracy and precision for diabetic retinopathy screens.

Given the success of deep learning algorithms with other machine vision tasks, the results of the original trial aren’t surprising. But Yaser Sheikh, a professor of computer science at Carnegie Mellon who is working on other forms of AI for healthcare, says that actually moving this kind of thing into the developing world can be difficult. “It is the kind of thing that sounds good, but actually making it work has turned out to be far more difficult,” he says. “Getting technology to actually help in the developing world, there are many, many systematic obstacles.”

But Peng and her team are pushing forward. She says Google is now running additional trials with photos taken specifically to develop its diagnostic AI. Preliminary results, she says, indicate that the system once again performs as well as trained doctors. The machines, it seems, are gaining new kinds of sight. And some day, they might save yours.
