In 1981, at a conference co-organized by MIT and IBM, the irreverent Nobel Prize-winning physicist Richard Feynman urged the world to build a quantum computer. He said, “Nature isn’t classical, and if you want to make a simulation of nature, you’d better make it quantum mechanical, and by golly, it’s a wonderful problem because it doesn’t look so easy.”
At that time, a quantum computer, which would operate according to the laws of quantum physics, was widely considered an impossible undertaking. However, in 1994, Chris Monroe, a postdoc in David Wineland’s lab at the National Institute of Standards and Technology (NIST), demonstrated the first quantum logic gate using multiple qubits (qubit is short for quantum bit, the basic unit of information in quantum computing).
It was an accomplishment that helped lead to the Nobel Prize for David Wineland in 2012 and launched a new industry.
Today there are about 65 quantum computing companies in the world, including Google, Amazon, and IBM, along with a large group of startups. There are currently nine different technologies for building a quantum computer, and we still do not know which of them will win out. Right now there are probably 10 different working quantum computers, all with small numbers of qubits, and no clear “winner.” Over the next 10 to 20 years, we will see many advances, and the industry hopes to develop a computer with a large number of “perfect” qubits (probably about 300,000 to 1 million).
Quantum computing is seen as a key to solving many NP-hard (non-deterministic polynomial-time hard) problems that are intractable for even a supercomputer. Probably the most famous such problem is the ‘traveling salesman’ problem, where the computer has to calculate the optimal route through 50 or so different geographical points.
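To make the scale concrete, the short sketch below (ours, not from the original article) counts the routes a brute-force search would have to compare as the number of stops grows; the route_count helper is purely illustrative.

```python
# Illustrative sketch: brute-force traveling-salesman search must compare a
# factorially growing number of candidate routes, which is why classical
# machines struggle as the number of stops grows.
import math

def route_count(n_stops: int) -> int:
    # Distinct round trips through n_stops cities (fixed start, direction ignored).
    return math.factorial(n_stops - 1) // 2

for n in (5, 10, 20, 50):
    print(f"{n} stops: about {route_count(n):.2e} candidate routes")
```

At 5 stops there are only a dozen routes; at 50 stops there are more routes than atoms in the observable universe, which is why exhaustive classical search is hopeless.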
We believe that in the future, quantum computers will be used to tackle problems in drug development, data security, optimization, supply chain logistics, fraud detection, portfolio optimization, and weather forecasting, and even to help improve the environment.
"Quantum is a ground-breaking technology that will require great data and data science and CDOs need to pay attention."
Quantum is a ground-breaking technology that will require great data and data science, and CDOs need to pay attention. We can draw on our experience with the other great leaps forward in data since the advent of the personal computer to understand how best to tackle quantum.
It is customary to think that data science began with Big Data. This is not entirely accurate; its slower cousins, statistics and econometrics, have been around since the nineteenth century. It is the speed and scale of personal computers, the internet, mobile devices, and data lakes that have accelerated the discipline into ‘Data Science.’
During the late twentieth and early twenty-first centuries, statisticians and econometricians were able to deploy models (not yet called algorithms) on the mainframe. During this period, creating a new ‘variable’ on the mainframe could take up to a year. Even with this slow timeline, models had a significant upside for business outcomes, particularly in early credit and fraud risk decisioning.
Data scientists used what were then considered large amounts of data (hundreds of thousands of rows with hundreds of columns) in tools like R, MATLAB, or SAS to build models, and worked with their technology counterparts to code, test, and deploy them in mainframe production environments.
Spring forward to 2005 and the advent of distributed systems and databases. The process for building models was nearly identical: find the data for the question of interest, clean and standardize the data, build the model, validate the model, and finally work with technology teams to deploy it into production.
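That workflow maps directly onto modern tooling. Below is a minimal, illustrative sketch in Python with scikit-learn (not the R/SAS stacks of the era); the file name and column names are hypothetical.

```python
# Illustrative sketch of the classic modeling workflow; dataset and columns
# are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# 1. Find the data for the question of interest (hypothetical extract).
df = pd.read_csv("credit_applications.csv")

# 2. Clean and standardize the data.
df = df.dropna()
X, y = df.drop(columns=["defaulted"]), df["defaulted"]

# 3. Build the model.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# 4. Validate the model before handing it to technology partners to deploy.
print("Holdout AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```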
The tools used by data scientists were largely the same: R and SAS. Wider use of SQL improved speed and flexibility. Yet the same problems occurred in deployment: if the data needed for the model did not exist in the production environment, the timeline could be prohibitive.
The next leap was the advent of Big Data.
Big Data environments unlocked greater speed and scale, yet often decreased flexibility for data scientists. The philosophy of quickly ingesting large amounts of unstructured data to reduce costs and increase throughput was sound, yet data scientists still needed to understand the data in detail to do their work. Unstructured data was a considerable hurdle to that understanding.
Many structured overlays were deployed to enable data scientists, often increasing costs instead of decreasing them, and data scientists needed to learn new tools yet again, this time Python, Hadoop, and the like. This is when the term ‘algorithm’ (in the machine learning and AI sense) came into common use.
The scale and speed of Big Data enabled statistical models (i.e., algorithms) to be recalculated in near real time to adjust to new trends and behaviors in the data. This meant data scientists needed to learn not only new tools to pull data but also new statistical and algorithmic methods. During this wave, the scale and usage of the data (petabytes) start to illuminate the cracks in less-than-professional data management.
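As one concrete illustration of that near real-time recalculation, the sketch below uses scikit-learn's incremental-learning API on a synthetic data stream; the batch sizes, features, and loss name are assumptions (older scikit-learn versions spell the loss differently).

```python
# Illustrative sketch: updating a model batch by batch as new data arrives,
# rather than retraining from scratch (synthetic data, assumed parameters).
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier(loss="log_loss")  # "log" in older scikit-learn versions
classes = np.array([0, 1])

for batch in range(100):
    # Each arriving batch nudges the existing model toward current behavior.
    X = rng.normal(size=(500, 10))
    y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)
    model.partial_fit(X, y, classes=classes)

print("Coefficient for the signal feature:", model.coef_[0, 0])
```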
Gaps in data availability, completeness, and quality start to impact business functions as data becomes embedded in business decisions. Professional disciplines for managing quality, consistency, and usability (data governance and management) become the norm in mature organizations. This enables data scientists to be more effective.
During the fourth leap, with the advent of the cloud, data science comes to the fore in major organizations around the world. The purpose of the cloud is to bring data together in near real time across different experiences to create cohesive and contextual decision-making. Remove data science, and this would be impossible. In the cloud, AI starts to truly join the discussion. The speed, scale, and interconnectedness of the data enable entirely new algorithmic approaches.
Data scientists must once again be provided with new tools and techniques. Fortunately, a few tools carry over from the third wave, such as Python and R. Yet a whole new industry of sophisticated processes and tools arises to foster the creation and deployment of data science models (MLOps).
And, once again, delays or even incorrect decisions occur when the data is not accessible, appropriately formatted, labeled, and understood. Good data management works tirelessly to resolve data quality and usability issues and enables great data science.
"The quantum era has just begun..."
The quantum era has just begun, yet from historical experience, we can form a plan to enable data scientists to embrace this new technology.
Firstly, we need to educate them and provide them with access to the different types of quantum computing available. As with the introduction of mainframe, distributed, Big Data, and cloud computing, organizations that work proactively to support the learning and development of their talent will have an advantage.
Industries that make complex decisions based on human behavior, natural science, or network dynamics will be the first to glean commercial value from quantum.
Secondly, organizations should seek out new ML algorithms that are appropriate to quantum computing and learn them. If none exist for the use case, entrepreneurial and research-oriented data scientists will need to create them.
Once these algorithms exist, they can be scaled out to more data scientists. The first companies to seek out open-source algorithms will benefit the most.
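To make that first step tangible, here is a minimal, hedged sketch using Qiskit, one widely used open-source quantum SDK; it builds the kind of two-qubit entangled state that quantum algorithms are composed from. Import paths and APIs vary across Qiskit versions, so treat this as illustrative rather than definitive.

```python
# Illustrative sketch (Qiskit assumed installed): build a two-qubit circuit
# that prepares a Bell state, the basic entanglement primitive behind many
# quantum algorithms.
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

qc = QuantumCircuit(2)
qc.h(0)      # Hadamard puts qubit 0 into superposition
qc.cx(0, 1)  # CNOT entangles qubit 0 with qubit 1

state = Statevector.from_instruction(qc)
print(state)  # amplitudes concentrated on |00> and |11>
```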
Lastly, the more complete, flexible, and usable the data lake and data model, the greater the likelihood of success. Regardless of the technology infrastructure, a strong foundation of data governance and management is necessary to enable great data science and, consequently, strong business outcomes.
Authors’ Bios:
Denise Ruffner serves as the President of Women In Quantum, a rapidly growing organization dedicated to highlighting and creating a community for women in the quantum field. She was previously Chief Business Officer at Atom Computing, where she was responsible for the company’s strategic customer and partner ecosystem and business development engagements.
Recognized for innovation, sales leadership, and strategic planning, Denise held a variety of leadership roles at IBM, including being part of the IBM Systems Quantum computing team.
Danielle Crop is a proven data leader with over 15 years of experience. Crop most recently served as the Senior Vice President and Chief Data Officer at Albertsons, responsible for building and executing a world-class central data strategy to benefit customers. Before Albertsons, Crop worked as Chief Data Officer for American Express, realizing the potential of the company’s data assets to create the world’s best customer experience.