1. Differentiate between data cleaning and data transformation.
Data cleaning means fixing errors, removing duplicates, and handling missing values. Data transformation means converting data into the format and type needed for analysis, such as changing text to numbers or standardising date formats.
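As a minimal sketch of the difference, assuming a few made-up records (the field names and values are hypothetical), cleaning and transformation might look like this in Python:

```python
# Hypothetical raw records; field names and values are made up for illustration.
raw = [
    {"id": "1", "city": " London ", "amount": "10.5"},
    {"id": "1", "city": " London ", "amount": "10.5"},  # exact duplicate
    {"id": "2", "city": "paris", "amount": ""},          # missing amount
    {"id": "3", "city": "Berlin", "amount": "7"},
]


def clean(records):
    """Cleaning: drop exact duplicates and rows with a missing amount."""
    seen = set()
    result = []
    for row in records:
        key = tuple(sorted(row.items()))
        if key in seen or row["amount"].strip() == "":
            continue
        seen.add(key)
        result.append(row)
    return result


def transform(records):
    """Transformation: convert types and normalise text formatting."""
    return [
        {
            "id": int(row["id"]),
            "city": row["city"].strip().title(),
            "amount": float(row["amount"]),
        }
        for row in records
    ]


cleaned = clean(raw)
transformed = transform(cleaned)
print(transformed)
```

Cleaning decides which rows are trustworthy enough to keep; transformation changes how the kept rows are represented.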
2. How do you work with missing data?
I first check how much data is missing and whether there is a pattern to it. Depending on that, I either fill in values using the mean or median, or remove the affected rows if doing so doesn't impact the results.
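A rough sketch of those options, assuming a hypothetical numeric column where `None` marks a missing value:

```python
from statistics import median

# Hypothetical column with gaps; None marks a missing value.
ages = [34, None, 29, 41, None, 38]


def missing_share(values):
    """Fraction of entries that are missing."""
    return sum(v is None for v in values) / len(values)


def fill_with_median(values):
    """Replace missing entries with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def drop_missing(values):
    """Remove missing entries entirely."""
    return [v for v in values if v is not None]


print(missing_share(ages))
print(fill_with_median(ages))
print(drop_missing(ages))
```

The median is often preferred over the mean when the column has outliers, since a few extreme values shift it less.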
3. What is exploratory data analysis?
Exploratory data analysis (EDA) helps me understand the data’s structure, detect patterns, and test assumptions using summary statistics and visualisations before applying any models.
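As an illustration of a first EDA pass, here is a small sketch that computes summary statistics for a made-up list of order values (the data is hypothetical):

```python
from statistics import mean, median, stdev

# Hypothetical order values; one entry is much larger than the rest.
order_values = [12.0, 15.5, 14.0, 13.5, 90.0, 16.0]


def summarise(values):
    """Return basic summary statistics for a numeric column."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
    }


summary = summarise(order_values)
for name, value in summary.items():
    print(name, value)
```

A mean well above the median, as in this made-up sample, is a hint that the distribution is skewed or contains an outlier worth investigating before modelling.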
4. How do you decide which visualisation to use?
The choice depends on the goal and the type of data. Bar charts work for comparing categories, line charts for trends over time, scatter plots for relationships between two variables, and histograms for distributions.
5. What are correlation and causation?
Correlation measures how strongly two variables move together, while causation means a change in one variable directly produces a change in the other. Two variables can be correlated without either one causing the other.
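To make the distinction concrete, here is a minimal sketch of the Pearson correlation coefficient applied to two made-up series (the numbers are hypothetical and chosen only for illustration):

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical daily figures; both could plausibly rise with hot weather.
ice_cream_sales = [20, 35, 50, 65, 80]
sunburn_cases = [2, 4, 5, 7, 8]

r = pearson(ice_cream_sales, sunburn_cases)
print(round(r, 3))
```

The coefficient comes out close to 1, yet ice cream sales do not cause sunburn. A third factor, such as temperature, could drive both, which is why a high correlation on its own is not evidence of causation.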
6. What data analysis tools do you use?
I use Excel, Python, and SQL for analysis, and visualisation tools like Power BI for dashboards and reporting.
7. How do you make sure your data is accurate?
I clean the data, cross-check results using more than one method, review my formulas or code, and sanity-check the results against what I expect to make sure the analysis is reliable.
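One simple way to cross-check is to compute the same figure by two independent routes and confirm they agree. A small sketch, assuming hypothetical order rows of `(category, amount)`:

```python
from collections import defaultdict
from math import isclose

# Hypothetical order rows: (category, amount).
orders = [("books", 12.5), ("games", 30.0), ("books", 7.5), ("music", 10.0)]


def total_direct(rows):
    """Method 1: sum every amount directly."""
    return sum(amount for _, amount in rows)


def total_by_category(rows):
    """Method 2: build per-category subtotals, then sum the subtotals."""
    subtotals = defaultdict(float)
    for category, amount in rows:
        subtotals[category] += amount
    return sum(subtotals.values())


def totals_agree(rows):
    """True when both methods give the same total, within float tolerance."""
    return isclose(total_direct(rows), total_by_category(rows))


print(total_direct(orders), total_by_category(orders), totals_agree(orders))
```

If the two totals disagree, something in the grouping, filtering, or joins is likely dropping or duplicating rows.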
8. Is statistical significance important in analysis?
Yes. Statistical significance tells us whether an observed result reflects a real effect or could plausibly be due to chance. It adds confidence to our conclusions in A/B tests and other experiments.
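As one possible illustration for an A/B test, here is a sketch of a two-sided two-proportion z-test. The conversion counts are made up, and the z-test is only one of several tests that could be used here:

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    Returns (z, p_value). Uses the pooled proportion under the null
    hypothesis that both groups share the same underlying rate.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 100/1000 conversions in A, 130/1000 in B.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(round(z, 2), round(p_value, 3))
```

A p-value below a threshold chosen in advance (0.05 is a common convention) suggests the difference is unlikely to be coincidental, though it says nothing about whether the effect is large enough to matter.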
9. Tell me about a time when your analysis led to a good decision.
My analysis showed that customer drop-off increased after a specific website promotion. Based on that insight, the team changed the promotion, which improved conversion rates.
10. What are the steps in a data analysis process?
The key steps are: defining the problem, collecting data, cleaning data, exploring the data, performing analysis, visualising results, and sharing insights.