
All companies keep databases of their customers. This assignment is based on a credit-providing financial company that holds a database of information supplied by its customers when they use their credit. These customer records include information on each customer's age, employment, other available credits, etc., as shown in the German credit database.

The company is interested in using these variables to predict credit amounts for new applicants. The data set contains relevant information for 1,000 customers. The analysis has been outsourced to a business analytics company. As the analyst, your job is to develop a regression model that predicts credit limits for new applicants.

The data are provided in the German credit Excel file, which contains 2 sheets:
1.    The Data;
2.    The Codelist with the data description.

As part of that business analytics team, you are expected to produce a managerial report that answers the following questions:

1.    What type of variables are Duration and Credit History?

2.    Use methods of descriptive statistics to summarise the numerical variables. Provide an explanation of your findings.

3.    Produce a pivot table and bar chart for Employment comparing Credit Amount. Provide an explanation of your findings.

4.    Produce a pivot table and histogram for the Age variable. Provide an explanation of your findings.

5.    Produce a pivot table and line graph showing the Credit Amount related to Age. Provide an explanation of your findings.

6.    Produce a correlation table for all numerical variables. Which variables do you think would have an impact on the provided Credit Amount? Provide an explanation of your findings.

7.    Add a dummy variable (REdummy) for the Real Estate variable and a dummy variable (ORdummy) for whether a customer owns residence. Use 1 for the Yes value and 0 for the No value. Please provide a picture of your data in the appendices to prove this has been completed.

8.    Develop a multiple regression model using Credit Amount as the dependent variable and Duration, Age, Number of credits and Number of dependents as independent variables. Discuss your findings.

9.    Now develop a multiple regression model using Credit Amount as the dependent variable and REdummy and ORdummy as independent variables. Discuss your findings, comparing them to the previous regression in question 8.

10.    Based on your analysis, what conclusions and recommendations can you provide for the credit company?

a.    Is there another regression model that could be developed? Discuss.
b.    Are there any variables included that have no impact on predicting Credit Amount?
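
For orientation only, the dummy coding in question 7, the correlation in question 6 and the regressions in questions 8 and 9 can be sketched in a few lines of Python. This is a minimal sketch, not the required method: the brief itself works from the Excel file, the six records below are made up for illustration, and the helper names to_dummy and fit_ols are hypothetical. Only NumPy is assumed.

```python
import numpy as np

# Hypothetical toy records: the variable names mirror the brief,
# but the values are invented and are NOT taken from the German credit file.
credit_amount = np.array([1200.0, 5900.0, 2100.0, 7900.0, 4900.0, 9100.0])
duration = np.array([6.0, 48.0, 12.0, 42.0, 24.0, 36.0])
real_estate = ["Yes", "No", "Yes", "No", "No", "Yes"]
owns_residence = ["Yes", "Yes", "No", "Yes", "No", "Yes"]


def to_dummy(values, yes_label="Yes"):
    """Code a Yes/No column as 1/0, as question 7 asks."""
    return np.array([1.0 if v == yes_label else 0.0 for v in values])


def fit_ols(y, *predictors):
    """Ordinary least squares with an intercept.

    Returns the coefficient vector (intercept first) and R squared.
    """
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coefs
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coefs, 1.0 - ss_res / ss_tot


re_dummy = to_dummy(real_estate)
or_dummy = to_dummy(owns_residence)

# Question 6 style check: Pearson correlation between two numerical variables.
corr_duration_amount = np.corrcoef(duration, credit_amount)[0, 1]

# Question 8 style model (one numerical predictor here to keep the toy data small).
coefs_numeric, r2_numeric = fit_ols(credit_amount, duration)

# Question 9 style model: the two dummy variables as predictors.
coefs_dummy, r2_dummy = fit_ols(credit_amount, re_dummy, or_dummy)

print("Correlation (Duration, Credit Amount):", round(corr_duration_amount, 3))
print("Numeric model coefficients:", coefs_numeric, "R squared:", round(r2_numeric, 3))
print("Dummy model coefficients:", coefs_dummy, "R squared:", round(r2_dummy, 3))
```

Comparing the two R squared values is one simple way to frame the comparison question 9 asks for; the full models would use all the predictors named in questions 8 and 9 on the 1,000 real records.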

The managerial report should contain:
    Abstract: this should be brief (no more than 100 words) and answer the following question:
o    What did I do in a nutshell?
    Introduction
o    What is the problem? (no more than 150 words)
    Literature Review
o    Whose work is related to mine? Find 3 pieces of literature that you can link to this topic (no more than 300 words).
    Methods
o    How did I solve the problem? (no more than 100 words)
    Results & Discussion (around 600 words for this section)
o    What did I find out? Describe the results.
o    What do the results mean? Interpret the results.
o    Tables/graphs/figures are not included in the word count
o    This is where you answer questions 1-9
    Conclusion (around 250 words for this section)
o    Bringing everything together, what does it all mean?
o    This is where you answer question 10
    References
o    Whose work did I refer to?
o    Include the 3 references from the literature review, and remember to include any textbooks that you used to help with analysis and interpretation.
o    Not included in word count
    Appendices
o    Extra information
o    Not included in word count