mahajan07/ANZ-DATA

Data@ANZ Virtual Internship

The Data@ANZ Program gives a peek into Exploratory and Predictive Analytics

Data Science is a contact sport at ANZ: it’s about mining and linking datasets to develop stories that matter and challenge the status quo, and about distilling unique insights from the data that enable decision-makers to take action. The company wants to know: “What does all this data actually mean for us and our clients, and what should the next steps be?” This program gives you a small peek into the exciting Data@ANZ world.

TASKS

The tasks below are based on a synthesised transaction dataset containing 3 months’ worth of transactions for 100 hypothetical customers. It contains purchases, recurring transactions, and salary transactions. The dataset is designed to simulate realistic transaction behaviours observed in ANZ’s real transaction data, so many of the insights you gather from these tasks will be genuine.

EXPLORATORY ANALYTICS

  1. Load the transaction dataset below into an analysis tool of your choice (Excel, R, SAS, Tableau, or similar)
  2. Start by doing some basic checks – are there any data issues? Does the data need to be cleaned?
  3. Gather some interesting overall insights about the data. For example: what is the average transaction amount? How many transactions do customers make each month, on average?
  4. Segment the dataset by transaction date and time. Visualise transaction volume and spending over the course of an average day or week. Consider the effect of any outliers that may distort your analysis.
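
The exploratory steps above could be sketched in Python roughly as follows. This is a minimal illustration rather than the program's reference solution: the rows are toy values invented here, and the field names `customer_id`, `timestamp` and `amount` are assumptions that may not match the columns in the real file.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median

# Toy rows invented for illustration; the real dataset's columns may differ.
ROWS = [
    {"customer_id": "C1", "timestamp": "2018-08-01 09:15:00", "amount": 12.50},
    {"customer_id": "C1", "timestamp": "2018-08-03 18:40:00", "amount": 40.00},
    {"customer_id": "C1", "timestamp": "2018-09-01 12:05:00", "amount": 7.50},
    {"customer_id": "C2", "timestamp": "2018-08-01 20:30:00", "amount": 100.00},
    {"customer_id": "C2", "timestamp": "2018-09-05 08:00:00", "amount": None},
]

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def parse(ts):
    """Parse the assumed 'YYYY-MM-DD HH:MM:SS' timestamp format."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def count_missing(rows, field):
    """Basic data check: how many rows lack a value for `field`?"""
    return sum(1 for row in rows if row.get(field) is None)


def drop_missing_amounts(rows):
    """Simplest possible cleaning step: discard rows with no amount."""
    return [row for row in rows if row["amount"] is not None]


def avg_monthly_txns_per_customer(rows):
    """Mean transaction count per (customer, month) pair with any activity."""
    counts = defaultdict(int)
    for row in rows:
        ts = parse(row["timestamp"])
        counts[(row["customer_id"], ts.year, ts.month)] += 1
    return mean(list(counts.values()))


def volume_and_spend_by_weekday(rows):
    """Transaction count and total spend for each day of the week."""
    summary = defaultdict(lambda: {"count": 0, "total": 0.0})
    for row in rows:
        day = WEEKDAYS[parse(row["timestamp"]).weekday()]
        summary[day]["count"] += 1
        summary[day]["total"] += row["amount"]
    return dict(summary)


clean = drop_missing_amounts(ROWS)
amounts = [row["amount"] for row in clean]
print("rows missing an amount:", count_missing(ROWS, "amount"))
# A mean far above the median hints that outliers are distorting the average.
print("mean amount:", mean(amounts), "median amount:", median(amounts))
print("avg monthly transactions:", avg_monthly_txns_per_customer(clean))
print("by weekday:", volume_and_spend_by_weekday(clean))
```

Comparing the mean with the median is one inexpensive way to see whether a few large transactions dominate the average. The same grouping idea extends to hour of day by keying on `parse(...).hour` instead of the weekday.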

PREDICTIVE ANALYTICS

  1. Using the same transaction dataset, identify the annual salary for each customer
  2. Explore correlations between annual salary and various customer attributes (e.g. age). These attributes could be those that are readily available in the data (e.g. age) or those that you construct or derive yourself (e.g. those relating to purchasing behaviour). Visualise any interesting correlations using a scatter plot.
  3. Build a simple regression model to predict the annual salary for each customer using the attributes you identified above
  4. How accurate is your model? Should ANZ use it to segment customers (for whom it does not have this data) into income brackets for reporting purposes?
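
One way to approach the first three predictive steps is sketched below. It rests on stated assumptions: salary payments are identifiable by a hypothetical `txn_type` value of `"salary"`, the three months of payments are representative of the whole year (so the total is scaled by 12/3), and age is the only attribute used. The customers and amounts are toy values invented for illustration, not results from the dataset.

```python
from collections import defaultdict
from math import sqrt

MONTHS_COVERED = 3  # the dataset spans three months

# Toy data invented for illustration; field names are assumptions.
AGES = {"C1": 25, "C2": 35, "C3": 45}
MONTHLY_PAY = {"C1": 4000.0, "C2": 5000.0, "C3": 6000.0}
TRANSACTIONS = [
    {"customer_id": c, "txn_type": "salary", "amount": pay}
    for c, pay in MONTHLY_PAY.items()
    for _ in range(MONTHS_COVERED)
] + [{"customer_id": "C1", "txn_type": "purchase", "amount": 19.99}]


def annual_salary(transactions):
    """Sum each customer's salary payments and scale them up to 12 months."""
    totals = defaultdict(float)
    for txn in transactions:
        if txn["txn_type"] == "salary":
            totals[txn["customer_id"]] += txn["amount"]
    return {c: total * 12 / MONTHS_COVERED for c, total in totals.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def rmse(xs, ys, slope, intercept):
    """Root-mean-square error of the fitted line on the given points."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return sqrt(sum(errors) / len(errors))


salaries = annual_salary(TRANSACTIONS)
customers = sorted(salaries)
xs = [AGES[c] for c in customers]
ys = [salaries[c] for c in customers]
slope, intercept = fit_line(xs, ys)
print("annual salaries:", salaries)
print("correlation(age, salary):", pearson(xs, ys))
print("fit:", slope, intercept, "training RMSE:", rmse(xs, ys, slope, intercept))
```

The toy data is perfectly linear, so the fit here is far cleaner than anything real data would give. An RMSE computed on the same customers used for fitting is also optimistic, which matters when judging whether the model is accurate enough to assign income brackets.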

For a challenge: build a decision-tree-based model to predict salary.

Does it perform better? How would you accurately test the performance of this model?
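
As a rough sketch of how the testing question might be tackled, the code below fits a depth-one regression tree (a "stump", written from scratch here as a stand-in for a full decision-tree library) and scores it with k-fold cross-validation, so every prediction is made for a customer the model did not see during fitting. The ages and salaries are toy values invented for illustration, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean

# Toy data invented for illustration only.
AGES = [22, 25, 28, 41, 45, 50]
SALARIES = [40000.0, 42000.0, 41000.0, 80000.0, 82000.0, 81000.0]


def fit_stump(xs, ys):
    """Pick the single threshold on x that minimises squared error.

    Returns (threshold, mean_of_left_leaf, mean_of_right_leaf).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # every x is identical: predict the overall mean
        overall = mean(ys)
        return (pairs[0][0], overall, overall)
    return best[1:]


def predict_stump(model, x):
    threshold, left_mean, right_mean = model
    return left_mean if x <= threshold else right_mean


def training_rmse(xs, ys, fit, predict):
    """Error on the same data used for fitting (optimistic)."""
    model = fit(xs, ys)
    return sqrt(mean([(predict(model, x) - y) ** 2 for x, y in zip(xs, ys)]))


def k_fold_rmse(xs, ys, k, fit, predict):
    """Error on held-out folds: each point is predicted by a model
    that was fitted without it."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        for i in test_idx:
            squared_errors.append((predict(model, xs[i]) - ys[i]) ** 2)
    return sqrt(mean(squared_errors))


model = fit_stump(AGES, SALARIES)
print("stump:", model)
print("training RMSE:", training_rmse(AGES, SALARIES, fit_stump, predict_stump))
print("3-fold CV RMSE:", k_fold_rmse(AGES, SALARIES, 3, fit_stump, predict_stump))
```

Because `k_fold_rmse` accepts any `fit` and `predict` pair, a regression model can be passed through the same folds, which gives a like-for-like answer to whether the tree performs better. With only 100 customers, the held-out error is likely to vary noticeably between fold assignments, so repeating the procedure with different splits is worth considering.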
