In today’s world, any job interview can be intimidating. This is even more true in fields like artificial intelligence and machine learning, which are still nascent: jobs are competitive and the expectations of candidates are sky-high. At Springboard, we believe that a successful career transition into such fields can only happen through mentorship. In this blog post, Data Science Solution Architect Sami Ulla draws on his experience to help you prepare for your next job interview. Here are his top artificial intelligence and machine learning interview questions, along with how to answer them.

Artificial Intelligence and Machine Learning Interview Questions & Answers

We have split the questions into two sections: artificial intelligence interview questions and machine learning interview questions.

AI Interview Questions & the Best Ways to Answer Them

Before we get into the details, let’s start with the basics. This is a diagram of a neural network. Circles are nodes (neurons), arranged in layers. The input layer takes input from the dataset. The hidden layer does the computation. The output layer delivers the result. Arrows are the edges that connect the nodes of one layer to the nodes of the next.

Image credit: Wikipedia

Having understood that, let’s look at some of the questions that develop on these basics.

#1 What are activation functions?

A node computes a weighted sum of its inputs. An activation function is then applied to that sum to decide what value the node passes on to the next layer. Activation functions are typically non-linear, which is what lets a network learn more than a straight-line relationship. Different layers can use different activation functions; for example, the output layer of a classifier often uses Softmax. The choice of activation function affects both how well your model performs and how much computation it needs. Commonly used activation functions include ReLU, Leaky ReLU, sigmoid and Softmax.
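To make this concrete, here is a minimal sketch of two of these functions in plain Python. The function names and the `alpha` default of 0.01 are illustrative choices, not taken from any particular library.

```python
import math

def leaky_relu(x, alpha=0.01):
    """Pass positive inputs through; scale negative inputs by a small slope."""
    return x if x > 0 else alpha * x

def softmax(values):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

positive_out = leaky_relu(3.0)    # unchanged, because the input is positive
negative_out = leaky_relu(-2.0)   # scaled down by alpha
probabilities = softmax([2.0, 1.0, 0.1])
print(positive_out, negative_out, probabilities)
```

Leaky ReLU is usually found in hidden layers, while Softmax typically sits at the output of a multi-class classifier, where the largest score ends up with the largest probability.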

#2 What is the backpropagation algorithm?

In a neural network, data flows from left to right, as in the image above. To optimise learning and deliver the expected results, the backpropagation algorithm is used to tune the weights. It works out how much each weight contributed to the error in the output, that is, the gradient of the error with respect to that weight. In this algorithm, the inputs move from left to right (the forward pass), and the weight adjustments are computed from right to left (the backward pass).
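As a rough illustration, the sketch below runs one forward and one backward pass through the smallest possible network: one input, one hidden node with a sigmoid activation, and one output. The network shape, the squared-error loss and all the numbers are assumptions made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    """Left to right: input -> hidden node -> output."""
    h = sigmoid(w1 * x)
    y_hat = w2 * h
    return h, y_hat

def loss(x, y, w1, w2):
    """Squared error between the prediction and the target."""
    _, y_hat = forward(x, w1, w2)
    return 0.5 * (y_hat - y) ** 2

def backward(x, y, w1, w2):
    """Right to left: apply the chain rule from the output back to each weight."""
    h, y_hat = forward(x, w1, w2)
    d_y_hat = y_hat - y                  # d(loss)/d(y_hat)
    grad_w2 = d_y_hat * h                # d(loss)/d(w2)
    d_h = d_y_hat * w2                   # d(loss)/d(h)
    grad_w1 = d_h * h * (1.0 - h) * x    # sigmoid'(z) = h * (1 - h)
    return grad_w1, grad_w2

x, y = 1.5, 1.0        # one training example (made-up values)
w1, w2 = 0.4, -0.3     # initial weights (made-up values)
loss_before = loss(x, y, w1, w2)
grad_w1, grad_w2 = backward(x, y, w1, w2)

# One weight update with an assumed learning rate of 0.1.
new_w1 = w1 - 0.1 * grad_w1
new_w2 = w2 - 0.1 * grad_w2
loss_after = loss(x, y, new_w1, new_w2)
print(loss_before, loss_after)
```

Real networks repeat the same chain-rule step across many nodes and layers, which is why frameworks automate it.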

#3 What is an epoch?

The number of epochs is a hyperparameter defined for every model training run. One epoch is one complete pass of the entire training dataset through the model, with the weights modified along the way. This is a part of the backpropagation process: the whole loop of the training set going from left to right and back is one epoch. Training might take hundreds or thousands of epochs, depending on the complexity of the model.

#4 What is a deep neural network?

Don’t get misled by this question, which comes up at interviews for many artificial intelligence jobs. It is about the topology of the neural network. The number of hidden layers between the input and output layers tells you whether it is a deep or a shallow neural network. There is no single cut-off number; how many layers you use depends on the needs of the model.

#5 What is a convolution? How do you apply it to CNN?

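Convolution is an operation that combines two functions: one is slid across the other, and at each position the overlapping values are multiplied and summed. If you have f(x) and g(x), convolution shows how the shape of f(x) is modified by g(x). In convolutional neural networks (CNN), g is a small filter (kernel) that slides over the input image, and the results form matrices called feature maps. These matrices go through max pooling, which shrinks them while keeping the strongest responses, and layer by layer the network begins to recognise images.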
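The filter-then-pool idea can be sketched in a few lines of plain Python. The tiny image, the edge-detecting kernel and the function names are all made up for illustration. Like most deep learning libraries, this sketch slides the kernel without flipping it, which is strictly a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; multiply and sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# A 5x5 image that is dark (0) on the left and bright (1) on the right.
image = [[0, 0, 0, 1, 1] for _ in range(5)]
# A 2x2 kernel that responds where brightness increases from left to right.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = convolve2d(image, kernel)   # 4x4, non-zero only at the edge
pooled = max_pool(feature_map)            # 2x2 summary of the feature map
print(feature_map)
print(pooled)
```

The feature map lights up only in the column where the dark-to-bright edge sits, and pooling keeps that signal in a smaller matrix.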

CNN is one of many algorithms. You might get similar questions for others too. So, when you learn an algorithm, understand where it is derived from, its strengths and weaknesses, alternatives, etc.

Machine Learning Interview Questions & Answers

Now, let’s focus on the questions you might get at interviews for machine learning jobs.

#1 What is the law of large numbers?

The law of large numbers is a basic theorem in probability. In essence, it says that when you perform a large number of trials, the average of the results will be close to the expected value, and it will tend to get closer as you perform more trials. For instance, the probability of getting heads when you flip a fair coin is 50%. If you flip 10 times, you might get 7 heads. But it’s extremely unlikely that you’ll get 70,000 heads when you flip 100,000 times. It will be close to 50,000.

The law of large numbers is used a lot in data engineering when deciding how many observations are needed for optimum model performance.
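The coin-flip example can be simulated with a short sketch like the one below. The seed and the flip counts are arbitrary choices, and the exact fractions you see will depend on them.

```python
import random

def heads_fraction(num_flips, rng):
    """Flip a fair coin num_flips times and return the share of heads."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

rng = random.Random(42)  # fixed seed so that reruns give the same flips
for num_flips in (10, 1_000, 100_000):
    print(num_flips, heads_fraction(num_flips, rng))
```

With only 10 flips the fraction can land well away from 0.5, while with 100,000 flips it should sit very close to it.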

#2 What is an outlier? How do you identify it between univariate and multivariate cases?

An outlier is a value that deviates significantly from the other observations in the dataset. For univariate cases, use a box plot: points that fall beyond the whiskers are flagged as outliers. For multivariate cases, you can plot the data in several dimensions or use an algorithm such as one-class SVM.
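For the univariate case, a box plot conventionally draws its whiskers at 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule directly. The sample values and the function name are made up, and 1.5 is the usual convention rather than a fixed law.

```python
from statistics import quantiles

def iqr_outliers(values, whisker=1.5):
    """Return the values that fall outside the box-plot whiskers."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]

delivery_days = [10, 12, 11, 13, 12, 11, 14, 95]  # hypothetical data
outliers = iqr_outliers(delivery_days)
print(outliers)
```

On this sample only the value 95 falls outside the whiskers.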

#3 What is an imputation? Provide an example application for each type of imputation.

Imputation is the process of replacing missing values in a dataset with substitutes. If the column is a continuous number, use the mean. If it is categorical, use the mode (the most frequent value). You can also use KNN, which fills in the missing fields from the nearest neighbours of the row.
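Mean and mode imputation can be sketched as follows. This assumes missing values are represented as `None`, and the column names and values are hypothetical.

```python
from statistics import mean, mode

def impute(column, strategy):
    """Fill None entries with the mean or the mode of the known values."""
    present = [v for v in column if v is not None]
    fill = mean(present) if strategy == "mean" else mode(present)
    return [fill if v is None else v for v in column]

ages = [25, None, 35, 30]                   # continuous column
cities = ["Pune", "Delhi", None, "Delhi"]   # categorical column

ages_filled = impute(ages, "mean")      # the gap becomes the average age
cities_filled = impute(cities, "mode")  # the gap becomes the commonest city
print(ages_filled)
print(cities_filled)
```

KNN imputation follows the same pattern, except that the fill value comes from the most similar rows rather than from the whole column.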

#4 How do you define a cost function? What is the cost function of linear regression?

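A cost function, also known as an error function or loss function, is used to measure model performance. It quantifies the difference between predicted and actual values. For linear regression, it is usually the mean squared error: the average of the squared differences between the predicted values and the actual values. Squaring stops positive and negative errors from cancelling each other out.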
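A small sketch shows why the differences are squared before they are averaged. The numbers are made up, and mean squared error is used here as the most common choice for linear regression, not the only possible cost function.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between predictions and targets."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual = [3.0, 5.0, 7.0]
predicted = [4.0, 5.0, 6.0]   # one too high, one exact, one too low

# Plain differences cancel out, hiding the two wrong predictions.
raw_error_sum = sum(p - a for a, p in zip(actual, predicted))
mse = mean_squared_error(actual, predicted)
print(raw_error_sum, mse)
```

Here the plain sum of errors is zero even though two of the three predictions are wrong, while the mean squared error stays above zero.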

#5 If logistic regression is a classification algorithm, why is it called regression?

Logistic regression helps determine whether a data point corresponds to an event that will occur or not. So, the end result is either 0 or 1. Internally, however, the model fits a linear equation to the log-odds of the event, which converts to the probability of the event occurring, a continuous value between 0 and 1. Only when you decide a threshold on this probability do you convert that continuous prediction into a classification. So, internally, it’s a regression process.
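The two stages, a continuous prediction followed by a threshold, can be sketched like this. The weight, bias and threshold are made-up values standing in for what a trained model would supply.

```python
import math

def predict_probability(x, weight, bias):
    """Regression step: a linear log-odds score mapped to a probability."""
    log_odds = weight * x + bias
    return 1.0 / (1.0 + math.exp(-log_odds))  # sigmoid of the log-odds

def classify(probability, threshold=0.5):
    """Classification step: apply a threshold to the probability."""
    return 1 if probability >= threshold else 0

weight, bias = 1.5, -2.0   # hypothetical fitted parameters
for x in (0.0, 1.0, 3.0):
    p = predict_probability(x, weight, bias)
    print(x, round(p, 3), classify(p))
```

Changing the threshold changes the predicted classes without retraining the model, which is a handy point to raise in an interview.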

#6 What is class imbalance? How do you resolve it?

Class imbalance is when the class distribution is highly skewed. For instance, if fraudulent transactions make up only 0.001% of the data, the learning algorithm will be poor at predicting fraud, because it doesn’t have enough fraudulent observations to learn from. So, it will learn normal transaction dynamics rather than fraudulent transaction dynamics.

You can resolve it by undersampling: take a random sample of normal transactions equal in number to the fraudulent transactions. If you have 15 fraudulent transactions, take 15 random samples from the normal transactions while training the algorithm. This balances the input data.
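Random undersampling along these lines might look like the sketch below. The record layout, the `is_fraud` key, the assumption that label 1 marks the minority class, and the dataset size are all invented for the example.

```python
import random

def undersample(rows, label_key, rng):
    """Keep every minority row plus an equal-sized random sample of the rest."""
    minority = [r for r in rows if r[label_key] == 1]   # assumes 1 = minority
    majority = [r for r in rows if r[label_key] == 0]
    sampled = rng.sample(majority, len(minority))
    balanced = minority + sampled
    rng.shuffle(balanced)   # avoid feeding the model one class after the other
    return balanced

# Hypothetical dataset: 15 fraudulent and 985 normal transactions.
transactions = (
    [{"id": i, "is_fraud": 1} for i in range(15)]
    + [{"id": i, "is_fraud": 0} for i in range(15, 1000)]
)

balanced = undersample(transactions, "is_fraud", random.Random(7))
fraud_count = sum(r["is_fraud"] for r in balanced)
print(len(balanced), fraud_count)
```

The trade-off is that most of the normal transactions are thrown away, so it is worth mentioning alternatives such as oversampling the minority class when you answer.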

#7 What is the best way to decide the value ‘K’ in K-means clustering?

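You can use the elbow curve technique. Run K-means for a range of K values and plot the within-cluster sum of squared distances against K. The curve drops sharply at first and then flattens; the K at the bend (the “elbow”) is a good choice. You would start with a reasonable range based on instinct or business knowledge, and then slowly increase or decrease K to optimise.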
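The numbers behind an elbow curve can be produced with a sketch like this one. It uses a deliberately simplified one-dimensional K-means with a deterministic starting position, both assumptions made to keep the example short. In practice you would use a library implementation on real, multi-dimensional data.

```python
def kmeans_inertia(points, k, iterations=20):
    """Run a simple 1-D K-means and return the within-cluster sum of squares."""
    ordered = sorted(points)
    n = len(ordered)
    # Deterministic start: pick k evenly spread values from the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in ordered:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return sum(min((p - c) ** 2 for c in centroids) for p in ordered)

# Made-up data with three obvious groups, around 1, 5 and 9.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, inertia in inertias.items():
    print(k, round(inertia, 2))
```

On this toy data the sum of squares falls steeply up to K = 3 and barely changes afterwards, so the elbow sits at K = 3.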

#8 Explain the gradient descent algorithm.

Gradient descent is an optimisation algorithm. It minimises a function by repeatedly taking a step in the direction of steepest descent, that is, against the gradient. In machine learning, the function being minimised is the cost function, so gradient descent helps find the set of weights for the independent variables that best explains the dependent variable.
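As a minimal sketch, here is gradient descent fitting a straight line by minimising the mean squared error. The data, learning rate and number of epochs are assumptions chosen so that this small example converges.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by stepping against the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w   # step downhill for the weight
        b -= learning_rate * grad_b   # step downhill for the bias
    return w, b

# Made-up data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

If the learning rate is too large the steps overshoot and the weights diverge; if it is too small, training needs many more epochs to get close.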

Listen to Sami Ulla answer these artificial intelligence and machine learning questions with examples in his YouTube session on the Springboard channel. There are also bonus tips on interview preparation at the end. For a strong foundational understanding of artificial intelligence and machine learning, consider Springboard’s online learning program. It is the only AI/ML program in India with a job guarantee!