Data science is the ‘fourth paradigm’ of science, posited Jim Gray, a Turing Award-winning computer scientist. He believed that the explosion of data and information technology would change the way we practice science. In the years since, data science has changed a lot more: it has affected everyday lives in ways never imagined before. In this blog post, we’ll explore some paradigm-shifting data science case studies from the recent past.

We’ll see how data science projects manifest themselves — say, through data analytics, machine learning, artificial intelligence, etc. We’ll also ponder a bit about where these data science projects might go in 2020.

Top Data Science Case Studies from around the world

1. Data Science Case Studies in Geosciences: Early Warning Systems for Tsunami

Key Area: Data analytics, computer vision

Geoscientists collect, process, and analyze weather data, and make predictions from it, for various reasons; the most significant of them is disaster prediction. Early this year, scientists from Stanford published a new method that combines data assimilation and simulation techniques to enable early tsunami warnings.

Leveraging data from offshore sensors and a technique called the ensemble Kalman filter, this method reconstructs the tsunami wavefield, builds propagation simulations, and generates forecasts for wave height and arrival time at the coast. But these sensors are expensive and unavailable in most countries. So, data scientists are also exploring alternative data sources, such as under-ocean fiber optic cables or GPS stations on commercial ships.
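To make the ensemble Kalman filter idea a little more concrete, here is a minimal sketch of a single update step for one scalar quantity (say, wave height at one sensor). It is not the Stanford team's implementation: the real method works on a full wavefield with many sensors, and the function name, the sensor noise level, and all the numbers below are assumptions made up for illustration.

```python
import random
import statistics

def enkf_update(ensemble, observation, obs_std, rng):
    """One ensemble Kalman filter analysis step for a scalar state.

    ensemble:    forecast values, e.g. simulated wave heights (m) at one sensor
    observation: the value the sensor actually measured
    obs_std:     assumed standard deviation of the sensor noise
    """
    forecast_var = statistics.variance(ensemble)          # spread of the forecast
    gain = forecast_var / (forecast_var + obs_std ** 2)   # Kalman gain (state observed directly)
    updated = []
    for member in ensemble:
        # Each member sees a slightly perturbed observation, so the
        # updated ensemble keeps a realistic spread.
        perturbed_obs = observation + rng.gauss(0.0, obs_std)
        updated.append(member + gain * (perturbed_obs - member))
    return updated

rng = random.Random(0)
forecast = [rng.gauss(1.0, 0.5) for _ in range(200)]   # prior guess: about 1.0 m
analysis = enkf_update(forecast, observation=2.0, obs_std=0.1, rng=rng)

print(round(statistics.mean(forecast), 2), round(statistics.mean(analysis), 2))
```

Because the hypothetical sensor is much less noisy than the forecast, the updated ensemble moves most of the way toward the measured value and becomes tighter.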

This is just one among many data science case studies in geosciences. Scientists are also building early warning systems by analyzing synthetic aperture radar (SAR) datasets and applying Wiener filters for image processing. Another group from the Australian National University is fine-tuning its Time Reverse Imaging Method, which is expected to deliver more accurate predictions.
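For readers curious about what such a filter does, below is a rough sketch of a simple adaptive (local-statistics) variant in plain Python. It assumes additive noise, which is a simplification (radar speckle is usually modelled as multiplicative), and the window size and test image are invented for the example, so treat it as an illustration rather than what any of these research groups actually run.

```python
import random

def wiener_filter(image, radius=1):
    """Adaptive Wiener filter on a 2D list of floats, using local mean and variance."""
    rows, cols = len(image), len(image[0])
    means = [[0.0] * cols for _ in range(rows)]
    variances = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbourhood, clipped at the image border.
            window = [
                image[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            mean = sum(window) / len(window)
            means[r][c] = mean
            variances[r][c] = sum((v - mean) ** 2 for v in window) / len(window)

    # Estimate the noise power as the average local variance.
    noise_var = sum(map(sum, variances)) / (rows * cols)

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            var = variances[r][c]
            if var <= noise_var:
                out[r][c] = means[r][c]       # flat region: use the local mean
            else:
                # Busy region: keep more of the original pixel.
                keep = 1.0 - noise_var / var
                out[r][c] = means[r][c] + keep * (image[r][c] - means[r][c])
    return out

def mse(image, target):
    values = [(v - target) ** 2 for row in image for v in row]
    return sum(values) / len(values)

rng = random.Random(1)
noisy = [[10.0 + rng.gauss(0.0, 1.0) for _ in range(12)] for _ in range(12)]
smoothed = wiener_filter(noisy)
print(mse(noisy, 10.0), mse(smoothed, 10.0))
```

The filter smooths strongly where the neighbourhood looks flat and backs off where local variance is high, which is why it tends to preserve edges better than a plain blur.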

2. Healthcare: Disease Detection

Key Area: Machine learning, computer vision

Curing cancer is one of healthcare’s biggest challenges. While it might remain so for a few more years to come, data science is making significant headway in cancer detection and prediction. Using deep learning, Google LYNA (LYmph Node Assistant) helps increase the diagnostic accuracy of nodal metastasis of breast cancer. The algorithm was able to “correctly distinguish a slide with metastatic cancer from a slide without cancer 99% of the time.”
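A statement phrased like that quote matches the usual pairwise reading of the area under the ROC curve (AUC): pick one cancerous slide and one healthy slide at random, and ask how often the model scores the cancerous one higher. The post does not name the metric, so that reading is an assumption, and the scores below are invented purely to show the arithmetic.

```python
def auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical model scores (higher means "more likely metastatic").
tumor_slides = [0.95, 0.90, 0.80, 0.40]
normal_slides = [0.70, 0.30, 0.20, 0.10, 0.05]

score = auc(tumor_slides, normal_slides)
print(score)  # 19 of the 20 pairs are ordered correctly -> 0.95
```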

Another team is working with pathologists to improve their deep neural network (DNN) predictions for prostate cancer diagnosis. 

3. Logistics: Mapping and Routing

Key Area: Data analytics

The e-commerce boom transformed the logistics industry, forcing it to find innovative ways to compete on price and speed. Map-based applications combine data from GPS, traffic, weather, historical trends, toll locations, street-level imagery, etc. to find the best or fastest route for delivery.
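One simple way to picture this is a shortest-path search over a road graph whose travel times are scaled by current conditions. The sketch below uses Dijkstra's algorithm; the road network, the base travel times, and the single "delay factor" standing in for traffic and weather are all made-up assumptions, and production routing systems are far more elaborate.

```python
import heapq

def fastest_route(graph, start, goal):
    """Dijkstra's algorithm. Each edge is (neighbor, base_minutes, delay_factor)."""
    queue = [(0.0, start, [start])]
    best = {start: 0.0}
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry; a faster way to this node was found
        for neighbor, base_minutes, delay_factor in graph.get(node, []):
            total = minutes + base_minutes * delay_factor
            if total < best.get(neighbor, float("inf")):
                best[neighbor] = total
                heapq.heappush(queue, (total, neighbor, path + [neighbor]))
    return float("inf"), []

# Hypothetical road network: the road via B is shorter, but congested today.
roads = {
    "depot": [("A", 10, 1.0), ("B", 6, 2.5)],
    "A": [("customer", 5, 1.0)],
    "B": [("customer", 2, 1.0)],
}

minutes, path = fastest_route(roads, "depot", "customer")
print(minutes, path)  # 15.0 ['depot', 'A', 'customer']
```

Without the congestion factor the route through B would take 8 minutes; with it, the longer road through A wins.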

For example, UPS uses Network Planning Tools (NPT) to route shipments to UPS facilities with the most capacity. It also uses autonomous drones, which can be launched from the rooftops of UPS trucks, to improve delivery efficiency in remote locations.

4. E-commerce: Understanding Customer Behavior 

Key Area: Data analytics, Machine learning

E-commerce is a goldmine for data science case studies. Everything you see on your Amazon home page today is curated by a machine learning algorithm. E-commerce players analyze every piece of data they have — user information, product information, customer behavior, market trends, etc. They leverage email marketing analytics to see how their promotions are yielding results, or cart abandonment rates to understand why a customer is leaving the site without completing the purchase — and use them to build hyper-personalized experiences for each customer.
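As a toy illustration of how purchase data can drive recommendations, the sketch below counts which items are bought together and suggests the most frequent companions. This is a deliberately simple co-occurrence approach, not Amazon's actual algorithm, and the orders are invented.

```python
from collections import Counter
from itertools import combinations

def co_purchase_counts(orders):
    """For each item, count how often every other item appears in the same order."""
    counts = {}
    for order in orders:
        for a, b in combinations(sorted(set(order)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts

def recommend(counts, item, top_n=2):
    """Return the items most often bought together with `item`."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]

# Hypothetical order history.
orders = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger", "earbuds"],
    ["case", "screen protector"],
    ["phone", "case", "earbuds"],
]

counts = co_purchase_counts(orders)
suggestions = recommend(counts, "phone")
print(suggestions)
```

Real systems layer much more on top (browsing behavior, ratings, recency), but the core intuition of "people who bought this also bought that" is the same.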

In fact, Amazon offers its recommendation technology to other businesses as a service, Amazon Personalize.

5. Banking: Risk Profiling

Key Area: Data analytics, Machine learning

Machine learning algorithms for credit risk assessments are now very common. Algorithms analyze various factors — financial information, relationships, employment, demographics, etc. — to gauge one’s creditworthiness. More recently, banks have found ways to also include an applicant’s social media and other publicly available profiles in their credit checks. 
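A common starting point for this kind of scoring is a logistic model that turns a weighted sum of applicant features into a probability of default. The sketch below is only that: the feature names, weights, and bias are hypothetical values chosen for illustration, whereas a real model would learn them from historical loan data.

```python
import math

def default_probability(applicant, weights, bias):
    """Logistic model: weighted sum of features squashed into the range (0, 1)."""
    z = bias + sum(weights[name] * applicant[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights: positive values raise the estimated risk, negative lower it.
weights = {"debt_to_income": 3.0, "late_payments": 0.8, "years_employed": -0.2}
bias = -2.0

applicant_a = {"debt_to_income": 0.2, "late_payments": 0, "years_employed": 8}
applicant_b = {"debt_to_income": 0.6, "late_payments": 3, "years_employed": 1}

p_a = default_probability(applicant_a, weights, bias)
p_b = default_probability(applicant_b, weights, bias)
print(round(p_a, 3), round(p_b, 3))
```

A lender would then compare the probability against a cutoff, or feed it into pricing, to decide on the application.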

But this kind of profiling has huge socio-cultural implications. In fact, just last month, customers complained that the Apple credit card algorithm was discriminating against women! Apple, and its banking partner Goldman Sachs, are under investigation. The immediate future of machine learning, especially in building people’s profiles and making predictions, will be in eliminating biases, both in datasets and algorithms.
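One basic check for this kind of bias is to compare approval rates across groups. The sketch below computes the ratio of the lowest to the highest approval rate, sometimes called the disparate impact ratio; a widely used rule of thumb flags ratios below 0.8. The decisions are fabricated for the example and say nothing about any real lender.

```python
def approval_rates(decisions):
    """decisions: iterable of (group, was_approved). Returns approval rate per group."""
    totals, approved = {}, {}
    for group, was_approved in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}

def disparate_impact(rates):
    """Lowest approval rate divided by the highest (1.0 means perfectly equal)."""
    return min(rates.values()) / max(rates.values())

# Hypothetical decisions: 4 of 10 women approved, 8 of 10 men approved.
decisions = (
    [("women", True)] * 4 + [("women", False)] * 6
    + [("men", True)] * 8 + [("men", False)] * 2
)

rates = approval_rates(decisions)
ratio = disparate_impact(rates)
print(rates, ratio)  # ratio is 0.5, well below the 0.8 rule of thumb
```

A low ratio does not prove discrimination on its own, but it is a signal that the dataset and the model deserve a closer look.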

Each area of data science and each of its applications is evolving rapidly, bringing its own possibilities as well as pressing concerns. And this is just the beginning. Data science, as we know it today, is neither perfect nor efficient, which is why there are millions of job opportunities all over the world for data scientists.

Stay at the forefront of future technologies like data science, machine learning / artificial intelligence and data analytics with Springboard’s 1:1 mentoring-led, project-driven online learning programs that come with a job guarantee.