#### Question No.31

You are creating a model to predict the price of a student's artwork depending on the following variables: the student's length of education, degree type, and art form.

You start by creating a linear regression model. You need to evaluate the linear regression model.

Solution: Use the following metrics: Accuracy, Precision, Recall, F1 score and AUC. Does the solution meet the goal?

1. Yes

2. No

# Explanation:

Those are metrics for evaluating classification models. For a regression model, use these instead: Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error, Relative Squared Error, and the Coefficient of Determination.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/evaluate-model
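
As a rough illustration of the regression metrics named in the explanation, the following sketch computes them from their standard textbook definitions in plain Python. The function name `regression_metrics` and the sample prices are hypothetical; this is not the Evaluate Model module's own implementation.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute common regression metrics from their standard definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    sq_err = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    # Baseline errors of a model that always predicts the mean of y_true.
    abs_dev = sum(abs(t - mean_true) for t in y_true)
    sq_dev = sum((t - mean_true) ** 2 for t in y_true)
    rse = sum(sq_err) / sq_dev
    return {
        "mae": sum(abs_err) / n,               # Mean Absolute Error
        "rmse": math.sqrt(sum(sq_err) / n),    # Root Mean Squared Error
        "rae": sum(abs_err) / abs_dev,         # Relative Absolute Error
        "rse": rse,                            # Relative Squared Error
        "r2": 1 - rse,                         # Coefficient of Determination
    }


# Hypothetical artwork prices and model predictions.
prices = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
metrics = regression_metrics(prices, predicted)
print(metrics)
```

All five values are computed on continuous errors, which is why they suit a price-prediction model where Accuracy or AUC would not.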

#### Question No.32

You are analyzing a numerical dataset which contains missing values in several columns.

You must clean the missing values using an appropriate operation without affecting the dimensionality of the feature set.

You need to analyze a full dataset to include all values.

Solution: Remove the entire column that contains the missing data point. Does the solution meet the goal?

1. Yes

2. No

# Explanation:

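Removing a column changes the dimensionality of the feature set, so this solution does not meet the goal. Use the Multiple Imputation by Chained Equations (MICE) method instead.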

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data

#### Question No.33

You are analyzing a numerical dataset which contains missing values in several columns.

You must clean the missing values using an appropriate operation without affecting the dimensionality of the feature set.

You need to analyze a full dataset to include all values.

Solution: Replace each missing value using the Multiple Imputation by Chained Equations (MICE) method.

Does the solution meet the goal?

1. Yes

2. No

# Explanation:

Replace using MICE: For each missing value, this option assigns a new value, which is calculated by using a method described in the statistical literature as "Multivariate Imputation using Chained Equations" or "Multiple Imputation by Chained Equations". With a multiple imputation method, each variable with missing data is modeled conditionally using the other variables in the data before filling in the missing values.

Note:

Multivariate imputation by chained equations (MICE), sometimes called "fully conditional specification" or "sequential regression multiple imputation", has emerged in the statistical literature as one principled method of addressing missing data. Creating multiple imputations, as opposed to single imputations, accounts for the statistical uncertainty in the imputations. In addition, the chained equations approach is very flexible and can handle variables of varying types (e.g., continuous or binary) as well as complexities such as bounds or survey skip patterns.

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clean-missing-data
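
The chained-equations idea can be sketched in a few lines of plain Python. This is a deliberately simplified, deterministic illustration restricted to two numeric columns: it starts from column means and then repeatedly regresses each column on the other to refresh the missing entries. It is not the Clean Missing Data module's implementation, and unlike full MICE it produces a single completed dataset without random draws. The helper names `fit_line` and `chained_impute` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def chained_impute(rows, n_rounds=10):
    """Fill None entries in a two-column dataset by chained regressions."""
    missing = [[value is None for value in row] for row in rows]
    filled = [list(row) for row in rows]

    # Step 1: initialise every missing entry with its column mean.
    for j in (0, 1):
        observed = [row[j] for row in rows if row[j] is not None]
        mean_j = sum(observed) / len(observed)
        for i, row in enumerate(filled):
            if missing[i][j]:
                row[j] = mean_j

    # Step 2: cycle through the columns, modelling each one conditionally
    # on the other and overwriting only the originally missing entries.
    for _ in range(n_rounds):
        for target in (0, 1):
            predictor = 1 - target
            train = [r for i, r in enumerate(filled) if not missing[i][target]]
            intercept, slope = fit_line(
                [r[predictor] for r in train], [r[target] for r in train]
            )
            for i, row in enumerate(filled):
                if missing[i][target]:
                    row[target] = intercept + slope * row[predictor]
    return filled


# Hypothetical data in which the observed rows follow y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, None], [None, 8.0], [6.0, 12.0]]
completed = chained_impute(data)
print(completed)
```

The completed dataset keeps every row and both columns, so the dimensionality of the feature set is unchanged, and the imputed entries should settle near the values implied by the observed relationship.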

#### Question No.34

You are implementing a machine learning model to predict stock prices. The model uses a PostgreSQL database and requires GPU processing.

You need to create a virtual machine that is pre-configured with the required tools. What should you do?

1. Create a Data Science Virtual Machine (DSVM) Windows edition.

2. Create a Geo AI Data Science Virtual Machine (Geo-DSVM) Windows edition.

3. Create a Deep Learning Virtual Machine (DLVM) Linux edition.

4. Create a Deep Learning Virtual Machine (DLVM) Windows edition.

5. Create a Data Science Virtual Machine (DSVM) Linux edition.

#### Question No.35

You are moving a large dataset from Azure Machine Learning Studio to a Weka environment. You need to format the data for the Weka environment.

Which module should you use?

1. Convert to CSV

2. Convert to Dataset

3. Convert to ARFF

4. Convert to SVMLight

# Explanation:

Use the Convert to ARFF module in Azure Machine Learning Studio to convert datasets and results in Azure Machine Learning to the attribute-relation file format (ARFF) used by the Weka toolset.

The ARFF data specification for Weka supports multiple machine learning tasks, including data preprocessing, classification, and feature selection. In this format, data is organized by entities and their attributes, and is contained in a single text file.

References:

https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/convert-to-arff
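
To make the single-text-file layout concrete, here is a minimal sketch that writes a small table as ARFF text: a `@RELATION` line, one `@ATTRIBUTE` line per column, and a `@DATA` section with comma-separated rows, using `?` for missing values. The function name `to_arff`, the relation name, and the column names are hypothetical; the sketch does not quote or escape values and is not the Convert to ARFF module itself.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes is a list of (name, kind) pairs, where kind is either a
    type keyword such as "NUMERIC" or a list of nominal values.
    """
    lines = [f"@RELATION {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            # Nominal attributes list their allowed values in braces.
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@ATTRIBUTE {name} {kind}")
    lines += ["", "@DATA"]
    for row in rows:
        # ARFF marks a missing value with a question mark.
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


arff_text = to_arff(
    "artwork",
    [
        ("education_years", "NUMERIC"),
        ("art_form", ["painting", "sculpture"]),
        ("price", "NUMERIC"),
    ],
    [(2, "painting", 120.0), (4, "sculpture", None)],
)
print(arff_text)
```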

#### Question No.36

DRAG DROP

You are creating an experiment by using Azure Machine Learning Studio.

You must divide the data into four subsets for evaluation. There is a high degree of missing values in the data. You must prepare the data for analysis.

You need to select appropriate methods for producing the experiment.

Which three modules should you run in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

#### Question No.37

HOTSPOT

You are working on a classification task. You have a dataset indicating whether a student would like to play soccer and associated attributes. The dataset includes the following columns:

You need to classify variables by type.

Which variable should you add to each category? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

#### Question No.38

HOTSPOT

You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).

The remaining 1,000 rows represent class 1 (10 percent).

The training set is imbalanced between the two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.

You need to configure the module.

Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

#### Question No.39

DRAG DROP

You have a dataset that contains over 150 features. You use the dataset to train a Support Vector Machine (SVM) binary classifier.

You need to use the Permutation Feature Importance module in Azure Machine Learning Studio to compute a set of feature importance scores for the dataset.

In which order should you perform the actions? To answer, move all actions from the list of Actions to the answer area and arrange them in the correct order.

#### Question No.40

You are performing clustering by using the K-means algorithm. You need to define the possible termination conditions.

Which three conditions can you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

1. A fixed number of iterations is executed.

2. The residual sum of squares (RSS) rises above a threshold.

3. The sum of distances between centroids reaches a maximum.

4. The residual sum of squares (RSS) falls below a threshold.

5. Centroids do not change between iterations.
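
Termination conditions commonly described for K-means are an iteration limit, an RSS that falls below a threshold, and centroids that stop moving. The sketch below shows how those three checks might sit inside a one-dimensional K-means loop. The function name `kmeans_1d`, its parameters, and the sample points are hypothetical, and this is a plain-Python illustration, not the Azure Machine Learning Studio implementation.

```python
def kmeans_1d(points, centroids, max_iter=100, rss_threshold=0.0):
    """One-dimensional K-means that reports why it stopped."""
    centroids = list(centroids)
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda c: (p - centroids[c]) ** 2
            )
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else old for c, old in zip(clusters, centroids)
        ]

        # Residual sum of squares around the updated centroids.
        rss = sum(
            (p - m) ** 2 for c, m in zip(clusters, new_centroids) for p in c
        )

        if rss < rss_threshold:
            return new_centroids, iteration, "rss below threshold"
        if new_centroids == centroids:
            return new_centroids, iteration, "centroids unchanged"
        centroids = new_centroids

    return centroids, max_iter, "iteration limit reached"


points = [1.0, 2.0, 9.0, 10.0]
converged = kmeans_1d(points, [0.0, 5.0])
capped = kmeans_1d(points, [0.0, 5.0], max_iter=1)
early = kmeans_1d(points, [0.0, 5.0], rss_threshold=2.0)
print(converged, capped, early)
```

Because each K-means iteration cannot increase the RSS, a check for the RSS rising above a threshold would not serve as a convergence test, which is why the sketch only tests for it falling below one.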