Data Analyst Interview Questions and Answers - 2

9. Name some statistical methods used by data analysts.

Some of the statistical methods used by data analysts are:

i.) Cluster processes
ii.) Imputation techniques
iii.) Markov process
iv.) Mathematical optimization
v.) Simplex Algorithm

10. What is imputation?

Imputation is the process of replacing missing data with substituted values. Its main types are:

A.) Single Imputation - Each missing value is filled in once.

i.) Hot-deck imputation: A missing value is filled in from a randomly selected similar record (the donor) in the same dataset. The name dates back to the time when data was stored on punch cards.
ii.) Cold-deck imputation: It works like hot-deck imputation, except that donors are selected from another dataset.
iii.) Mean imputation: It replaces missing values with the mean of the observed values of that variable.
iv.) Regression imputation: A missing value is replaced by a value predicted from other variables.
v.) Stochastic regression imputation: It is similar to regression imputation, except that it adds a random residual term to each predicted value so that the imputed values keep the variability of the observed data.

B.) Multiple Imputation - Each missing value is estimated several times, producing several completed datasets whose analysis results are then combined.
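
A few of these single-imputation techniques can be sketched in plain Python. This is only a minimal illustration, not a production method: the function names and sample values are made up for the example, missing values are represented as None, and the hot-deck version treats every observed value as a possible donor instead of matching on similar records.

```python
import random
import statistics


def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]


def hot_deck_impute(values, rng):
    """Replace each None with a value drawn at random from the observed values.

    Simplification: every observed value is a potential donor.
    """
    donors = [v for v in values if v is not None]
    return [rng.choice(donors) if v is None else v for v in values]


def regression_impute(x, y, rng=None):
    """Fill missing y values from a least-squares line fitted on complete pairs.

    If rng is given, a random residual is added to each prediction,
    which turns this into stochastic regression imputation.
    """
    pairs = [(a, b) for a, b in zip(x, y) if b is not None]
    mean_x = statistics.mean([a for a, _ in pairs])
    mean_y = statistics.mean([b for _, b in pairs])
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in pairs) / sxx
    intercept = mean_y - slope * mean_x
    # Spread of the observed points around the fitted line
    residuals = [b - (intercept + slope * a) for a, b in pairs]
    residual_sd = statistics.pstdev(residuals)

    filled = []
    for a, b in zip(x, y):
        if b is None:
            b = intercept + slope * a
            if rng is not None:
                b += rng.gauss(0, residual_sd)
        filled.append(b)
    return filled


# Hypothetical sample data
ages = [20, None, 30, 40]
print(mean_impute(ages))
print(hot_deck_impute(ages, random.Random(0)))
print(regression_impute([1, 2, 3, 4], [2, 4, None, 8]))
```

Passing a random.Random instance to regression_impute switches it from plain regression imputation to the stochastic variant.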

11. What do you know about KPI?

KPI stands for Key Performance Indicator. It is a measurable value that shows how well a business process is performing against its goals. KPIs are usually tracked through reports, spreadsheets and charts.

12. What do you know about the 80/20 rule?

Also known as the Pareto principle, this rule states that roughly 80% of your outcomes result from 20% of your actions. For example, about 80% of revenue may come from 20% of customers.
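
One way to check whether the rule holds for a given dataset is to measure how much of the total comes from the top 20% of contributors. The sketch below assumes a hypothetical list of revenue figures per customer; top_share is an illustrative helper, not a standard function.

```python
def top_share(amounts, top_fraction=0.2):
    """Return the share of the total contributed by the largest items.

    top_fraction is the fraction of items to count as "top" (0.2 = top 20%).
    """
    ranked = sorted(amounts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical revenue per customer
revenue_by_customer = [500, 300, 40, 30, 30, 25, 25, 20, 15, 15]
share = top_share(revenue_by_customer)
print(f"Top 20% of customers bring in {share:.0%} of revenue")
```

Real data rarely splits exactly 80/20, so the result is best read as an indication of how concentrated the outcome is.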

13. What is Clustering?

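Clustering is an unsupervised method of grouping data; unlike classification, it needs no predefined labels. A clustering algorithm divides a data set into natural groups, called clusters, so that items in the same cluster are more similar to each other than to items in other clusters.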
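
As an illustration, here is a minimal sketch of k-means, one common clustering algorithm (chosen for this example; the answer above does not name a specific one). It works on one-dimensional data, and the starting centers and sample points are assumptions made for the example.

```python
import statistics


def kmeans_1d(points, centers, iterations=10):
    """Basic k-means on 1-D data.

    Repeats two steps: assign each point to its nearest center,
    then move each center to the mean of the points assigned to it.
    """
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # A center with no assigned points stays where it is
        centers = [statistics.mean(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers


# Hypothetical data with two visible groups
points = [1, 2, 3, 10, 11, 12]
groups, centers = kmeans_1d(points, centers=[0, 5])
print(groups)
print(centers)
```

The number of clusters is fixed by how many starting centers are passed in; choosing that number well is a separate problem that this sketch does not address.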

14. Explain Data capturing and Data mining.

Data Capturing: Data capturing is the process of collecting data from many sources in different formats. This data is later made usable by converting it into a readable format with the help of various software programs.

Data Mining: Data mining is also known as the knowledge discovery process. In this process, useful information is extracted from the captured data and used for forecasting and for making business decisions.

15. What are the steps of data mining?

The important steps of data mining are:

i.) Identify the source of information from which you want to retrieve the data.
ii.) Identify the data that needs to be analyzed.
iii.) Extract the information you are looking for from the data.
iv.) Identify the important values in the extracted data.
v.) Present and report the results.
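
The five steps can be walked through on a toy example. The sketch below is only an illustration under stated assumptions: the transaction records, field names and the top_products helper are all hypothetical, and a real project would read from a database or file and use far richer analysis.

```python
from collections import Counter

# Step 1: identify the source (a hypothetical in-memory list standing in
# for a database table or file)
transactions = [
    {"customer": "A", "product": "tea", "amount": 4},
    {"customer": "B", "product": "coffee", "amount": 6},
    {"customer": "A", "product": "tea", "amount": 4},
    {"customer": "C", "product": "cake", "amount": 3},
    {"customer": "B", "product": "coffee", "amount": 6},
    {"customer": "C", "product": "coffee", "amount": 6},
]


def top_products(records, n=2):
    """Return the n products with the highest total revenue."""
    # Step 2: identify the data to analyze (product and amount only)
    # Step 3: extract that information from each record
    revenue = Counter()
    for record in records:
        revenue[record["product"]] += record["amount"]
    # Step 4: identify the important values (the best-selling products)
    return revenue.most_common(n)


# Step 5: present and report the results
report = top_products(transactions)
for product, total in report:
    print(f"{product}: {total}")
```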

16. What are the advantages and disadvantages of data mining?

The advantages of data mining are as follows:

i.) Data mining helps to predict future trends.

ii.) It helps to keep track of customers' behavior and buying habits.

iii.) It helps in the process of decision making.

iv.) Because it gives useful insights about customers, it helps to serve them better, which eventually increases the revenue of the company.

v.) It helps the analysts to understand complex data.

The disadvantages of data mining are as follows:

i.) Personal information is collected and may be leaked, causing privacy issues.

ii.) Data mining may not always be accurate. Inaccurate information may lead to wrong decisions.

iii.) Data mining also raises security concerns, because hackers steal data from many big organizations that collect it.