Data Science Technical Interview Questions (MCQ): Test Your Knowledge!

What is a decision tree?
- A decision tree is a tree-like model that is used to make predictions
- A decision tree is a type of database
- A decision tree is a tool for data visualization
- A decision tree is a type of machine learning algorithm

What is the purpose of the activation function in neural networks?
- To speed up training
- To introduce non-linearity
- To reduce overfitting
- To normalize inputs

What is the purpose of the weight initialization in neural networks?
- To speed up training
- To set initial values for model parameters
- To reduce overfitting
- To improve accuracy

The main purpose of A/B testing is:
- To compare two versions of something
- To analyze big data
- To perform clustering
- To reduce dimensionality

What is the purpose of the discriminator in GANs?
- To generate fake samples
- To classify real vs fake samples
- To reduce model complexity
- To speed up training

What is an example of supervised learning?
- Spam email detection
- Grouping similar customers
- Finding patterns in data
- Reducing data dimensionality

What is the primary goal of optical character recognition (OCR)?
- To generate characters
- To recognize and convert text in images
- To classify fonts
- To reduce image complexity

Which of these is not a type of data sampling method?
- Random sampling
- Stratified sampling
- Cluster sampling
- Quantum sampling

How do you approach a data analytics project?
- Define problem and objectives
- Collect and explore data
- Model data and interpret results
- All of the above

What is the main goal of anomaly detection?
- To generate anomalies
- To identify unusual patterns
- To classify normal data
- To increase data complexity

What is the role of feature scaling in clustering?
- Feature scaling: standardizes data
- Improves clustering performance
- Handles varying data scales
- All of the above

What is the main purpose of k-means clustering?
- To classify labeled data
- To group similar data points
- To reduce data dimensions
- To predict continuous values

In deep learning, GAN stands for:
- General Adversarial Network
- Generative Adversarial Network
- Grouped Analytical Nodes
- Guided Attention Normalization

What does IoT stand for in data science context?
- Internet of Things
- Integration of Technologies
- Inventory of Tools
- None of the above

In the context of multicollinearity, what does a Variance Inflation Factor (VIF) greater than 10 typically indicate?
- No multicollinearity
- Moderate multicollinearity
- Severe multicollinearity
- Perfect multicollinearity

What is the main purpose of the Jensen-Shannon divergence?
- To diverge Jensen's theories
- To measure similarity between probability distributions
- To classify Shannon entropy
- To reduce divergence complexity

Which of these is not a type of data governance framework?
- COBIT
- ITIL
- TOGAF
- QUANTUM

What is the main difference between bagging and boosting?
- Bagging reduces bias, boosting reduces variance
- Bagging reduces variance, boosting reduces bias
- They are the same
- Neither affects bias or variance

Explain Principal Component Analysis (PCA).
- Dimensionality reduction technique
- Transforms data to lower dimension
- Preserves most variance
- All of the above

Which algorithm is best suited for regression tasks?
- Linear Regression
- K-Means Clustering
- Decision Tree
- Naive Bayes

What's the primary purpose of treating outlier values?
- To remove all extreme values
- To improve model accuracy
- To increase data volume
- To visualize data distribution

What is the purpose of cross-validation in machine learning?
- To increase model complexity
- To assess model performance
- To speed up training
- To visualize results

What is spaCy, and how does it differ from NLTK?
- spaCy: NLP library
- Focuses on production use
- Efficient, supports deep learning
- All of the above

Why is cross-validation important in machine learning?
- Increases training speed
- Provides more training data
- Gives a better estimate of model performance
- Simplifies the model architecture

What is the primary purpose of hypothesis testing?
- To prove a theory
- To make inferences about a population
- To calculate probabilities
- To determine causation
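
The Jensen-Shannon question above describes the divergence as a way to compare probability distributions. As a rough illustration only, here is a minimal standard-library sketch. The function names `kl_divergence` and `js_divergence` and the example distributions are assumptions made for this example and are not part of the quiz. The sketch uses base-2 logarithms, so the result falls between 0 (identical distributions) and 1 (no overlap), and lower values mean the distributions are more similar.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits. Terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q from their midpoint."""
    # The midpoint m is positive wherever p or q is positive,
    # so the KL terms below never divide by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical two-outcome distributions, chosen only for illustration.
same = js_divergence([0.5, 0.5], [0.5, 0.5])      # identical distributions
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])  # no shared support
partial = js_divergence([0.9, 0.1], [0.1, 0.9])   # some overlap

print(same, disjoint, partial)
```

Unlike the plain Kullback-Leibler divergence, this quantity is symmetric in its two arguments and stays finite even when one distribution assigns zero probability to an outcome.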