USYD (University of Sydney) | Business School Data Analysis EDA | Python Jupyter Notebook (.ipynb) and SQLite assignment writing service

Description

Please note that the scenario for this assignment is fictional.

A group of coffee-loving alumni had the business idea of setting up a cafe on campus that only opens at night. To test their business plan and attract potential investors, they opened a pop-up store next to the Fisher Library during the second semester of 2019. The pop-up store trades from 7pm to 1am, 7 days a week. They named the shop ‘Café Insomnia’.

Individual transaction records of the pop-up store were collected from 22 July 2019 to 22 December 2019. In addition to purchase-related information, each customer was asked to select where on campus she/he was studying and whether it was raining (via an iPad at the cashier’s counter). The founders of Café Insomnia are interested in knowing what affects their coffee sales on an hourly basis.

data_dictionary.txt

The database consists of three tables “ci_transaction”, “study_area” and “drink”.

Below is the data dictionary and notes on variables.

——————————————————

ci_transaction

– id: unique transaction id (int)

– date: transaction date (string)

– days_after_open: number of days since opening the pop-up store on 2019-07-22 (int)

– day_of_week: day of the week (string, ‘Mon’ – ‘Sun’)

– hours_after_open: number of hours since opening at 7pm (int, 0 – 5)

– drink_id: id of the drink being purchased (int, 0 – 16)

– quantity: quantity of purchase (int)

– raining: whether it is raining at the time of purchase (string, ‘Yes’, ‘No’, missing indicated by ‘NA’)

– study_area_id: id of the customer’s study area (int, 0 – 6, missing indicated by -1)

——————————————————

study_area

– id: unique study area id (int, 0 – 6)

– name: building name (string)

– dist_to_cafe: distance to Café Insomnia in meters (int)

——————————————————

drink

– id: unique drink id (int, 0 – 16)

– name: drink name (string)

– unit_price: unit price of the drink in AUD (float)
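The three tables above can be combined into the target quantity, hourly sales in AUD. The following is a minimal sketch, not the assignment's reference solution. It builds a toy in-memory SQLite database with the same columns as the data dictionary, since the real database file name is not given here. Every row is a made-up illustrative value. It also assumes that "coffee sales" means the revenue from all drinks, which the brief does not spell out.

```python
import sqlite3

# Toy in-memory database mirroring the three tables in the data dictionary.
# All rows below are made-up illustrative values, not the real data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drink (id INTEGER PRIMARY KEY, name TEXT, unit_price REAL);
CREATE TABLE study_area (id INTEGER PRIMARY KEY, name TEXT, dist_to_cafe INTEGER);
CREATE TABLE ci_transaction (
    id INTEGER PRIMARY KEY, date TEXT, days_after_open INTEGER,
    day_of_week TEXT, hours_after_open INTEGER, drink_id INTEGER,
    quantity INTEGER, raining TEXT, study_area_id INTEGER);
INSERT INTO drink VALUES (0, 'Drink A', 4.0), (1, 'Drink B', 5.0);
INSERT INTO study_area VALUES (0, 'Building A', 100);
INSERT INTO ci_transaction VALUES
    (1, '2019-07-22', 0, 'Mon', 0, 0, 2, 'No', 0),
    (2, '2019-07-22', 0, 'Mon', 0, 1, 1, 'NA', -1),
    (3, '2019-07-22', 0, 'Mon', 1, 1, 3, 'Yes', 0);
""")

# Hourly sales in AUD: revenue (quantity * unit_price) summed per date and hour.
# Assumption: every drink counts towards "coffee sales".
HOURLY_SALES_SQL = """
SELECT t.date, t.hours_after_open,
       SUM(t.quantity * d.unit_price) AS sales_aud
FROM ci_transaction AS t
JOIN drink AS d ON d.id = t.drink_id
GROUP BY t.date, t.hours_after_open
ORDER BY t.date, t.hours_after_open
"""
hourly_sales = conn.execute(HOURLY_SALES_SQL).fetchall()
print(hourly_sales)
```

Each result row is a (date, hours_after_open, sales_aud) tuple. Hours with no transactions produce no row at all, so they would need to be filled with zero before averaging if closed or empty hours should count.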

Max length: 500 words +/- 10% (excluding code)

In this task you are required to produce a short EDA vignette that explores how one of the following attributes is related to hourly coffee sales (in AUD).

– Number of days since the pop-up store opened.

– Seasonal factors (such as day-of-the-week and hour-of-the-day).

– Where on campus their customers are studying and whether it is raining.

Your vignette will consist of a Jupyter Notebook, in which you will use Markdown cells to provide commentary on your EDA process.

Due to the small word limit we do not expect a large amount of detail. Use this vignette to share your preliminary findings with your group; they will be expanded on in the group report.

Suggested Structure

Heading – include which attribute you are exploring

Main body (approx. 400 words) – this should contain the plots, tables, corresponding commentary and your Python code that make up your exploratory analysis

Further work (approx. 100 words) – this is a small note to your teammates about how your analysis could be further refined and used in your final report

Submission Items

A Jupyter Notebook

Mark:

Analysis (8 marks)

A detailed and comprehensive analysis. The analysis explores all variables which could be involved. Justification is provided as to the inclusion or exclusion of variables from analysis.

Appropriate techniques are used and accompanied by justification. Shortcomings of techniques used and limitations of analysis are discussed.

The context is carefully considered and has clearly informed the analysis.

Results Presentation (4 marks)

The figures or tables produced are appropriate and support the analysis.

The figures produced are of high visual quality, are well appointed and easy to read.

Written language demonstrates outstanding precision, clarity, and concision.

Notebook Presentation (3 marks)

Features of the Jupyter Notebook have been used to pleasing effect. The notebook is well laid out and formatted.

Code is clear, consistent, and follows best practices; descriptions and comments are excellent.

Jupyter Notebook is runnable without error.

Word count

Main body

The pop-up store first opened in July.

The table and bar chart above show that average hourly sales grew steadily from 67 AUD in July to 1218 AUD in November, and then fell slightly to 1063 AUD in December.

This finding is consistent with the usual expectation that, as time passes, more customers learn about the cafe and sales rise accordingly. During the final exam week in November, demand for coffee peaked, and so did the shop's hourly sales.
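The monthly comparison can be reproduced with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the input is a list of (date, hours_after_open, sales_aud) tuples, the date is assumed to be an ISO 'YYYY-MM-DD' string, and the sample rows are made up. Note that July and December are partial months, because records run from 22 July to 22 December, and that the mean is taken over hours that have at least one sale.

```python
from collections import defaultdict
from statistics import mean


def average_hourly_sales_by_month(hourly_sales):
    """Average the hourly sales figures within each calendar month.

    hourly_sales: iterable of (date, hours_after_open, sales_aud) tuples,
    where date is assumed to be an ISO 'YYYY-MM-DD' string.
    Returns a dict mapping 'YYYY-MM' to the mean hourly sales in AUD.
    """
    by_month = defaultdict(list)
    for iso_date, _hour, sales in hourly_sales:
        by_month[iso_date[:7]].append(sales)  # 'YYYY-MM' prefix identifies the month
    return {month: mean(values) for month, values in sorted(by_month.items())}


# Made-up illustrative rows, not the real data.
rows = [
    ("2019-07-22", 0, 10.0),
    ("2019-07-22", 1, 30.0),
    ("2019-08-01", 0, 50.0),
]
monthly = average_hourly_sales_by_month(rows)
print(monthly)
```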

The cafe opens 7 days a week.

Across the days of the week, hourly sales rise to a single peak and then fall. The peak is on Wednesday, the middle of the working week. A likely explanation is that Wednesday is the busiest day for students on campus, with the most courses scheduled, so demand for coffee is highest then.

On weekends, some students study at home rather than on campus or in Fisher Library, so sales fall on Saturday and Sunday.
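A day-of-week comparison is easier to read when the days appear in calendar order rather than alphabetical order. The sketch below derives the day from the date, which also gives a way to cross-check the day_of_week column. The function names are hypothetical and the sample rows are made up.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Calendar order matching the 'Mon' - 'Sun' labels in the data dictionary.
DAY_ORDER = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_of_week(iso_date):
    """Return the 'Mon'..'Sun' label for an ISO 'YYYY-MM-DD' date string."""
    return DAY_ORDER[date.fromisoformat(iso_date).weekday()]  # weekday(): Monday is 0


def average_hourly_sales_by_day(hourly_sales):
    """Mean hourly sales per day of week, returned in Mon..Sun order."""
    by_day = defaultdict(list)
    for iso_date, _hour, sales in hourly_sales:
        by_day[day_of_week(iso_date)].append(sales)
    return [(day, mean(by_day[day])) for day in DAY_ORDER if day in by_day]


# Made-up illustrative rows, deliberately out of calendar order.
rows = [
    ("2019-07-24", 0, 40.0),
    ("2019-07-22", 0, 10.0),
    ("2019-07-22", 1, 20.0),
    ("2019-07-27", 0, 5.0),
]
by_day = average_hourly_sales_by_day(rows)
print(by_day)
```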

The pop-up store trades between 7pm and 1am.

The chart shows that coffee sales are highest in hour 1 (hours_after_open = 1, from 8pm to 9pm), and that high sales are concentrated between 7pm and 10pm. After 10pm, the later the hour, the lower the sales, because students are heading home.

We know that the latest courses and tutorials end before 10pm. After that, most students leave campus, so far fewer people buy coffee. The sales trend is therefore consistent with what actually happens on campus.
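Because hours_after_open is an offset from the 7pm opening time, chart labels are clearer when converted to clock times. A minimal sketch follows; the function name is hypothetical and the 7pm opening hour comes from the data dictionary.

```python
def hour_label(hours_after_open, opening_hour=19):
    """Convert an hours_after_open value (0 to 5) to a label such as '8pm-9pm'.

    opening_hour is the 24-hour clock opening time; 19 means 7pm.
    """
    def fmt(hour_24):
        suffix = "am" if hour_24 < 12 else "pm"
        return f"{(hour_24 % 12) or 12}{suffix}"  # 0 and 12 both display as 12

    start = (opening_hour + hours_after_open) % 24
    end = (start + 1) % 24
    return f"{fmt(start)}-{fmt(end)}"


labels = [hour_label(h) for h in range(6)]
print(labels)
```

The last trading hour crosses midnight, so hour 5 is labelled 12am-1am.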

The line charts of hourly sales for each day of the week share a common trend: sales peak between 7pm and 10pm, and then fall as the hour gets later. This is consistent with the findings of 4.1.3, and no day is an exception.

Only Thursday differs slightly. Its 7-8pm and 9-10pm sales are higher than its 8-9pm sales, whereas the other six days peak at 8-9pm. This might result from the course schedule on Thursday.
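Reading the peak hour off seven line charts by eye is error-prone, so it can help to compute it directly. The sketch below assumes the averages have already been arranged as a dict keyed by (day_of_week, hours_after_open). The function name is hypothetical and the numbers are made up, chosen only to mimic one day that peaks in hour 1 and one that does not.

```python
def peak_hour_by_day(avg_sales):
    """Return the hours_after_open value with the highest average sales per day.

    avg_sales: dict mapping (day_of_week, hours_after_open) to average sales in AUD.
    Ties keep the hour that was seen first.
    """
    peaks = {}
    for (day, hour), sales in avg_sales.items():
        if day not in peaks or sales > avg_sales[(day, peaks[day])]:
            peaks[day] = hour
    return peaks


# Made-up illustrative averages, not the real data.
avg_sales = {
    ("Wed", 0): 100.0, ("Wed", 1): 120.0, ("Wed", 2): 90.0,
    ("Thu", 0): 110.0, ("Thu", 1): 100.0, ("Thu", 2): 105.0,
}
peaks = peak_hour_by_day(avg_sales)
print(peaks)
```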

Further work

Average hourly coffee sales (AUD) are highest in November, which includes the final exam weeks.

Within the week, average hourly sales peak on Wednesday.

Within the day, average hourly sales are highest from 8pm to 9pm and drop after this period. This pattern holds for every day of the week except Thursday, whose 8-9pm sales are slightly lower than its 7-8pm and 9-10pm sales.

– Limitations of my analysis.

The analysis derives only one additional seasonal factor, the month of the transaction. It considers no seasonal factors other than month, day_of_week, and hours_after_open.

It also relies on simple bar and line charts to reach its findings. No advanced methods, such as machine learning or similar techniques, are applied.

– Things to do next.

Derive more seasonal factors from the original dataset and analyse them.
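As a starting point, two further factors can be derived from columns that already exist: the week number since opening (from days_after_open) and a weekend flag (from day_of_week). This is only a sketch of the idea. The function and key names are hypothetical, and other factors, such as teaching weeks versus exam weeks, would need an academic calendar that is not part of the dataset.

```python
def derive_seasonal_factors(days_after_open, day_of_week):
    """Derive extra seasonal factors from two existing ci_transaction columns.

    days_after_open: days since the store opened on 2019-07-22 (a Monday).
    day_of_week: 'Mon'..'Sun' label from the data dictionary.
    """
    return {
        # Week 0 is the opening week; each week runs Monday to Sunday
        # because the opening day was a Monday.
        "week_after_open": days_after_open // 7,
        "is_weekend": day_of_week in {"Sat", "Sun"},
    }


opening_day = derive_seasonal_factors(0, "Mon")
second_sunday = derive_seasonal_factors(13, "Sun")
print(opening_day, second_sunday)
```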