[MCQs] Machine Learning Questions & Answers

Question 1 :
Various ____ methods and techniques are used to detect outliers.

distance calculation
prediction
optimization
integration

Question 2 :
Which of the following is a disadvantage of decision trees?

Decision trees require less preprocessing.
Decision trees are robust to outliers.
Decision trees are prone to overfitting.
Decision tree traces all possible alternatives.

Question 3 :
Which of the following techniques would perform worst for reducing dimensions of a data set?

Removing columns which have high variance in data
Removing columns which have too many missing values
Removing columns with redundant data
Removing columns with similar data trends

Question 4 :
Below are the 8 actual values of the target variable in the train file: [0,0,0,1,1,1,1,1]. What is the entropy of the target variable?

-(5/8 log(5/8) + 3/8 log(3/8))
5/8 log(5/8) + 3/8 log(3/8)
3/8 log(5/8) + 5/8 log(3/8)
5/8 log(3/8) – 3/8 log(5/8)
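
As a numerical companion to Question 4, here is a minimal sketch that computes the Shannon entropy of the eight target values. The function name `entropy` and the base-2 logarithm are assumptions made for illustration; the question does not state a base.

```python
import math
from collections import Counter

def entropy(labels, base=2):
    """Shannon entropy of a list of class labels (the base is an assumption)."""
    n = len(labels)
    counts = Counter(labels)
    # H = -sum(p * log(p)) over the classes that actually occur
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())

target = [0, 0, 0, 1, 1, 1, 1, 1]
print(round(entropy(target), 4))  # roughly 0.9544 bits
```

Three of the eight values are 0 and five are 1, so the two class probabilities entering the sum are 3/8 and 5/8.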

Question 5 :
Predicting whether it will rain tomorrow evening at a particular time is a type of _________ problem.

Clustering
Regression
Unsupervised learning
Supervised learning

Question 6 :
Which of the following problems can be solved by supervised learning too? Assume appropriate dataset is available.

From a large collection of spam emails, discover if there are sub types of spam emails.
Given data on how 1000 medical patients respond to an experimental medicine, discover whether there are different categories of patients in terms of how they respond, and if so, what these categories are.
Given a large dataset of medical records from patients suffering from heart disease, try to learn whether there might be different groups of patients for which customised treatment is required
Given genetic (DNA) data from a person, predict the odds of the person developing diabetes over the next 10 years

Question 7 :
You ran gradient descent for 20 iterations with learning rate = 0.2 and computed the cost at each iteration. You observe that the cost decreases after each iteration. Based on this, which conclusion is most suitable?

0.2 is an effective choice of learning rate.
Try larger values of learning rate like 1.
0.2 is not an effective choice of learning rate.
The model is overfitting.

Question 8 :
A Support Vector Machine (SVM) performs well in _____ dimensional spaces.

high
low
wide
single

Question 9 :
The running time of K-fold cross-validation is ____.

linear in K
quadratic in K
cubic in K
exponential in K
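
To make the cost in Question 9 concrete, the sketch below splits a dataset into K folds and counts how many train/validation rounds result. The helper `k_fold_indices` is a hypothetical name written for this example, not a library function, and the contiguous split is an assumption.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size

for k in (2, 5, 10):
    rounds = sum(1 for _ in k_fold_indices(20, k))
    print(k, rounds)  # one model fit per fold, so rounds equals k
```

Each fold serves as the validation set exactly once, so the model is trained once per fold.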

Question 10 :
Why are SVMs more accurate than logistic regression?

SVM gives more weight to wrongly classified data points.
SVM gives more weight to data points that are correctly classified.
SVM uses all the data points, assuming a probabilistic model.
SVM uses the concept of a large-margin separator, and for non-linearity it uses kernel functions.

Question 11 :
What is the approach of the basic algorithm for decision tree induction?

Greedy
Bottom up
Procedural
Step by Step

Question 12 :
Which of the following is not a supervised learning algorithm?

PCA
Decision Tree
Bayes Theorem
Linear regression

Question 13 :
When comparing reinforcement learning and supervised learning, which of the following statements is true?

Both in reinforcement and supervised learning decisions are taken sequentially
Supervised learning is best suited where human interaction is prevalent, whereas reinforcement learning is best suited for software systems.
Reinforcement learning works by interacting with the environment, whereas supervised learning works on sample data.
In both reinforcement and supervised learning, decisions taken at one time step are independent of the previous time step.

Question 14 :
For a trained logistic classifier, given a sample x, the prediction is 0.8. This means that ___.

P(Y=0|x)=0.8
P(Y=1|x)=0.8
P(Y=0|x)=0.2
P(Y=1|x)=0.2
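
For Question 14, a minimal sketch of how a logistic classifier turns a linear score into a probability. The weights, bias and input below are hypothetical values chosen so that the output is about 0.8; they do not come from any trained model.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Return both class probabilities for a single sample x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    p1 = sigmoid(z)  # by convention, the model output is P(Y=1|x)
    return {"P(Y=1|x)": p1, "P(Y=0|x)": 1.0 - p1}

# Hypothetical parameters: a score of ln(4) gives 1 / (1 + 1/4), about 0.8
probs = predict_proba(weights=[1.0], bias=0.0, x=[math.log(4)])
print({name: round(p, 3) for name, p in probs.items()})
```

The two probabilities always sum to 1, so an output of 0.8 for the positive class leaves 0.2 for the negative class.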

Question 15 :
Which algorithm is a state-transition-based algorithm?

K-Nearest neighbor
Hidden markov model
Bayes theorem
Linear regression

Question 16 :
Principal component analysis (PCA) is used for ___.

Dimensionality Enhancement
LU Decomposition
QR Decomposition
Dimensionality Reduction

Question 17 :
What is true about the discount factor in reinforcement learning?

discount factor should be greater than 1
discount factor should always be negative
discount factor should be in the range 0 to 1
discount factor can be any real number
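
The role of the discount factor in Question 17 can be illustrated with a short sketch of the discounted return, G = r_0 + γ r_1 + γ² r_2 + ... The function name `discounted_return` and the constant reward sequence are assumptions made for this example.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards weighted by gamma**t, accumulated from the last step back."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

rewards = [1.0] * 50  # hypothetical: a reward of 1 at every step
print(discounted_return(rewards, 0.9))  # stays below 1 / (1 - 0.9) = 10
print(discounted_return(rewards, 1.1))  # keeps growing as the horizon grows
```

With a factor below 1, later rewards count for less and the sum stays bounded. With a factor above 1, distant rewards dominate and the return grows without limit as more steps are added.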

Question 18 :
What are support vectors?

These are the datapoints which help the SVM to generate optimal hyperplane.
It is an intermediate vector generated during calculation of optimal hyperplane
In SVM all the data points are called support vectors.
These are predefined vectors used in calculating the hyperplane.

Question 19 :
The process of obtaining the best result under given constraints is called

Optimization
Generalization
Summation
Regularization

Question 20 :
A and B are two events. If P(A, B) decreases while P(A) increases, which of the following is true?

P(A|B) decreases
P(B|A) decreases
P(B) decreases
P(B|A) increases
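
Question 20 rests on the definition P(B|A) = P(A, B) / P(A). The sketch below evaluates it for one made-up pair of scenarios. The probabilities are hypothetical numbers chosen only so that the joint probability falls while P(A) rises.

```python
def conditional(p_joint, p_condition):
    """P(B|A) = P(A, B) / P(A), assuming P(A) > 0."""
    if p_condition <= 0:
        raise ValueError("conditioning event must have positive probability")
    return p_joint / p_condition

# Hypothetical values: P(A, B) drops from 0.20 to 0.15, P(A) rises from 0.40 to 0.50
before = conditional(0.20, 0.40)
after = conditional(0.15, 0.50)
print(before, after)
```

A smaller numerator over a larger denominator always gives a smaller ratio. Nothing comparable can be said about P(A|B) or P(B) without knowing how P(B) changes.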

Question 21 :
Machine Learning comes under which of the following domain?

Artificial Intelligence
Network Security
Engineering sciences
System programming

Question 22 :
Which of the following statement(s) is/are true? 1. You need to randomize parameters in PCA. 2. You don’t need to randomize parameters in PCA. 3. PCA can be trapped in the local maxima problem. 4. PCA can’t be trapped in the local minima problem.

1 and 3
1 and 4
2 and 3
2 and 4

Question 23 :
What is the major component of PCA?

all the eigen vectors for the projection space
The average of eigen vectors for the projection space
Value of the last among the eigen vectors for the projection space
Value of the first among the eigen vectors for the projection space

Question 24 :
Which of the following is a clustering algorithm in machine learning?

Expectation Maximization
CART
Gaussian Naïve Bayes
Apriori

Question 25 :
A machine learning model gives 95% accuracy on an unbalanced dataset. What can be concluded about the classifier?

Since accuracy is 95% the classifier will perform well in real life scenario
Classifier will give good accuracy on the validation of the dataset.
Unbalanced Dataset will not affect the performance of classifier
Because of an unbalanced dataset the classifier will predict only one class of samples accurately.

Question 26 :
Which of the following is an application of reinforcement learning?

Aircraft Control
Sentiment analysis
House price prediction
Spam Email Filtering

Question 27 :
Which algorithm is used for performing probabilistic reasoning on temporal data?

Hill-climbing search
Hidden markov model
Naïve Method
Support Vector Machine

Question 28 :
You are training an RBF SVM with the following parameters: C (slack penalty) and γ = 1/(2σ²) (where σ² is the variance of the RBF kernel). How should you tweak the parameters to reduce overfitting?

Increase C and/or reduce γ
Reduce C and/or increase γ
Reduce C and/or reduce γ
Reduce C only (γ has no predictable effect on overfitting)
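
To see what γ controls in Question 28, here is a minimal sketch of the RBF kernel k(x, y) = exp(-γ ||x - y||²) with γ = 1/(2σ²). The function name `rbf_kernel` and the sample points are assumptions made for illustration.

```python
import math

def rbf_kernel(x, y, sigma):
    """RBF kernel value with gamma = 1 / (2 * sigma**2)."""
    gamma = 1.0 / (2.0 * sigma ** 2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [0.0, 0.0], [1.0, 1.0]  # hypothetical points at squared distance 2
for sigma in (0.5, 1.0, 2.0):
    gamma = 1.0 / (2.0 * sigma ** 2)
    print(sigma, gamma, round(rbf_kernel(x, y, sigma), 4))
```

A larger σ (smaller γ) makes the similarity decay more slowly with distance, so each training point influences a wider region and the decision boundary is smoother. A smaller σ (larger γ) makes the kernel very local, which lets the boundary follow individual points closely.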

Question 29 :
The ___________ phenomenon refers to a model that neither fits the training data well nor generalizes properly to new data.

good fitting
overfitting
moderate fitting
underfitting

Question 30 :
Neural networks:

Optimize a convex objective function
Can use a mix of different activation functions
Are not suitable for learning
Can only be trained with stochastic gradient descent
