# Data Science Technical Interview Questions (MCQ): Test Your Knowledge!

What is the F1 score?

The F1 score is the harmonic mean of precision and recall

The F1 score is the sum of precision and recall

The F1 score is the difference between precision and recall

The F1 score is the product of precision and recall
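The definition can be checked numerically: F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The following is a minimal sketch, not a reference implementation; the function name `f1_score` and the example counts are assumptions made for illustration.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the harmonic mean of precision and recall from raw counts."""
    if tp == 0:
        # With no true positives, precision or recall is zero (or undefined),
        # so this sketch reports F1 as 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
print(f1_score(tp=8, fp=2, fn=4))  # approximately 0.727
```

Because the harmonic mean is pulled toward the smaller of the two values, a model cannot reach a high F1 by maximizing only precision or only recall.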

What is the role of machine learning in chatbots?

To enable the chatbot to learn and improve its responses over time

To hardcode responses for the chatbot

To perform data visualization

To generate static content

What is the primary goal of data normalization in databases?

To make data abnormal

To organize data to reduce redundancy

To increase data complexity

To encrypt normal forms

What is a common technique for feature scaling in machine learning?

Standardization

Normalization

One-hot encoding

Label encoding
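Standardization and normalization are both scaling techniques, so it helps to see how they differ. The sketch below assumes a single non-constant numeric feature held in a plain list; the helper names `standardize` and `min_max_normalize` are illustrative, not from any library.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumes the feature is not constant (sigma > 0)
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)  # assumes hi > lo
    return [(v - lo) / (hi - lo) for v in values]


feature = [2, 4, 6]
print(min_max_normalize(feature))  # [0.0, 0.5, 1.0]
print(standardize(feature))
```

One-hot encoding and label encoding, by contrast, convert categorical values to numbers and do not change the scale of numeric features.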

Which algorithm is commonly used for anomaly detection in images?

Linear regression

Naive Bayes

Isolation Forest

Decision Trees

What does GUI stand for in software design?

General User Integration

Graphical User Interface

Guided Universal Interaction

Generated Utility Implementation

Which of these is NOT a common method for natural language generation?

Rule-based

Statistical

Neural

Random generation

What is the main purpose of data lineage?

To create family trees for data

To track the origin and transformations of data

To increase data volume

To encrypt data paths

What is spaCy, and how does it differ from NLTK?

spaCy is an NLP library

It focuses on production use

It is efficient and supports deep learning

All of the above

What is the main purpose of generative adversarial networks?

To generate adversaries

To learn data distributions and generate new samples

To reduce network complexity

To encrypt generated data

What is the primary goal of LIME (Local Interpretable Model-agnostic Explanations)?

To increase model accuracy

To explain individual predictions

To speed up model training

To reduce model complexity

What is the difference between precision and accuracy?

Precision is consistency, accuracy is correctness

Accuracy is consistency, precision is correctness

They are the same

Neither relates to consistency

Which of these is an example of unsupervised learning?

Linear regression

Logistic regression

Principal Component Analysis

Random forest

What is the main purpose of dropout?

To drop out of training

To prevent overfitting

To reduce model size

To increase dropout rate
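To make the mechanism concrete, here is a minimal sketch of "inverted" dropout on a list of activations, written in plain Python rather than a deep learning framework. The function name `dropout` and its parameters (`rate`, `training`, `seed`) are assumptions for illustration.

```python
import random


def dropout(activations, rate=0.5, training=True, seed=None):
    """Randomly zero activations during training; pass them through otherwise."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time nothing is dropped.
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - rate
    # Surviving units are scaled by 1/keep so the expected activation is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, seed=0))
print(dropout([1.0, 2.0, 3.0, 4.0], training=False))  # [1.0, 2.0, 3.0, 4.0]
```

Because a different random subset of units is silenced on each training step, the network cannot rely on any single unit, which is how dropout discourages overfitting.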

Which statement about machine learning is true?

It cannot predict outcomes

It always needs labeled data

It can learn from past data

It does not require any data

What is a real-world application of clustering?

Customer segmentation for targeted marketing

Predicting stock prices

Classifying email as spam or not spam

Reducing data dimensionality

What is the difference between precision and recall?

They are the same

Precision focuses on false positives, recall on false negatives

Recall focuses on false positives, precision on false negatives

Both measure the same thing
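The distinction is easiest to see from the counts involved: precision divides true positives by everything predicted positive (so false positives lower it), while recall divides true positives by everything actually positive (so false negatives lower it). A small sketch, assuming binary labels encoded as 0 and 1; the function name `precision_recall` and the sample labels are hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Compute (precision, recall) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: one hit, one false alarm, three misses.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
print(precision_recall(y_true, y_pred))  # (0.5, 0.25)
```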

What is boosting in machine learning?

A technique to reduce model complexity

An ensemble method that combines weak learners

A method to increase training data size

A way to decrease model training time

What is the difference between machine learning and general programming?

Machine learning uses data to learn

General programming uses explicit instructions

Machine learning adapts to new data

General programming follows fixed rules

What does MAE stand for in regression analysis?

Mean Absolute Error

Maximum Average Error

Mean Absolute Estimation

Median Average Error
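Mean Absolute Error is the average of the absolute differences between predictions and true values. A minimal sketch in plain Python; the function name `mean_absolute_error` and the sample numbers are assumptions for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values: absolute errors are 0.5, 0.0 and 2.0.
print(mean_absolute_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0]))  # approximately 0.833
```

Unlike squared-error metrics, MAE weights every unit of error equally, so a single large miss does not dominate the score.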

What is time series analysis?

Time series analysis is a set of techniques for analyzing data points ordered in time

Time series analysis is a type of database

Time series analysis is a tool for data visualization

Time series analysis is a type of machine learning algorithm

Which of these is NOT a common technique for handling imbalanced datasets?

Oversampling

Undersampling

SMOTE

Overfitting
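As an illustration of one of the genuine techniques, the sketch below performs plain random oversampling: minority-class samples are duplicated at random until every class matches the largest one. This is not SMOTE, which synthesizes new points by interpolation. The function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until all classes have equal counts."""
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw extra copies (with replacement) to reach the majority count.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Toy data: four samples of class 0 and one of class 1.
X = ["a", "b", "c", "d", "e"]
y = [0, 0, 0, 0, 1]
new_X, new_y = random_oversample(X, y)
print(Counter(new_y))
```

Oversampling should be applied only to the training portion of the data, otherwise duplicated samples can leak into the evaluation set.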

Which of these is NOT a type of data visualization chart?

Bar chart

Line chart

Pie chart

Quantum chart

What is the process of dividing a dataset into smaller, more manageable subsets for training and testing a model?

Data splitting

Data cleaning

Data transformation

Data aggregation
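A simple way to picture data splitting is a shuffled hold-out split. The sketch below uses only the standard library; the function name `split_dataset` and its parameters (`test_fraction`, `seed`) are assumptions for illustration and do not mirror any particular library's API.

```python
import random


def split_dataset(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) lists with no overlap."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_dataset(range(10), test_fraction=0.2)
print(len(train), len(test))  # 8 2
```

Shuffling before the cut is appropriate for independent samples; for time-ordered data a chronological split is usually preferred so the model is not evaluated on the past.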

What does the Bias-Variance tradeoff address?

The speed of the model

The balance between model complexity and generalization

The size of the training data

The number of layers in a neural network
