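Life is short, I use Python

Data visualization analysis

Contents
I. Data description
  1. Data overview
II. Data preprocessing
  0. Import packages and data
  1. Rename columns
  2. Extract time fields for later analysis and plotting
III. Data visualization
  1. Map of sales by US state
  2. Bar chart comparing sales by product category
  3. Pie charts comparing sales by customer segment
  4. Monthly top-10 products by sales
  5. Line chart of sales and profit over time
  6. Sales

I. Data description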
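The dataset contains 9,994 records covering 1,237 days. Total sales are $2,297,200.8603 and total profit is $286,397.0217. The inventory holds 1,862 unique items in 3 categories, all sold across 49 states in 4 regions of the United States, through 5,009 orders placed by 793 customers. The dataset, Superstore.csv, comes from Kaggle and has 21 columns, described below: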
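Row ID: unique ID of each row.
Order ID: unique order ID for each customer.
Order Date: date the product was ordered.
Ship Date: date the product was shipped.
Ship Mode: shipping mode chosen by the customer.
Customer ID: unique ID identifying each customer.
Customer Name: name of the customer.
Segment: segment the customer belongs to.
Country: country where the customer lives.
City: city where the customer lives.
State: state where the customer lives.
Postal Code: postal code of each customer.
Region: region the customer belongs to.
Product ID: unique ID of the product.
Category: category of the ordered product.
Sub-Category: sub-category of the ordered product.
Product Name: name of the product.
Sales: sales amount of the product.
Quantity: quantity of the product.
Discount: discount applied.
Profit: profit or loss incurred.

1. Data overview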
9,994 rows and 21 columns.

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9994 entries, 0 to 9993
Data columns (total 21 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Row ID         9994 non-null   int64
 1   Order ID       9994 non-null   object
 2   Order Date     9994 non-null   object
 3   Ship Date      9994 non-null   object
 4   Ship Mode      9994 non-null   object
 5   Customer ID    9994 non-null   object
 6   Customer Name  9994 non-null   object
 7   Segment        9994 non-null   object
 8   Country        9994 non-null   object
 9   City           9994 non-null   object
 10  State          9994 non-null   object
 11  Postal Code    9994 non-null   int64
 12  Region         9994 non-null   object
 13  Product ID     9994 non-null   object
 14  Category       9994 non-null   object
 15  Sub-Category   9994 non-null   object
 16  Product Name   9994 non-null   object
 17  Sales          9994 non-null   float64
 18  Quantity       9994 non-null   int64
 19  Discount       9994 non-null   float64
 20  Profit         9994 non-null   float64
dtypes: float64(3), int64(3), object(15)
memory usage: 1.6+ MB
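The headline figures in the data description can be recomputed with a few pandas calls. The sketch below runs on a made-up four-row frame that reuses the original column names, so the numbers are illustrative only. Whether "1,237 days" means distinct order dates or the calendar span is an assumption here, so both are computed.

```python
import pandas as pd

# Toy stand-in for Superstore.csv; every value is invented for illustration.
df = pd.DataFrame({
    "Order ID": ["A-1", "A-1", "B-2", "C-3"],
    "Order Date": ["1/3/2014", "1/3/2014", "2/1/2014", "1/13/2014"],
    "Customer ID": ["c1", "c1", "c2", "c1"],
    "Product ID": ["p1", "p2", "p1", "p3"],
    "Sales": [10.0, 20.5, 5.0, 4.5],
    "Profit": [1.0, -2.0, 0.5, 1.5],
})

# Parse the text dates once so they can be compared and subtracted.
dates = pd.to_datetime(df["Order Date"], format="%m/%d/%Y")

summary = {
    "rows": len(df),
    "distinct_order_dates": int(dates.nunique()),
    "calendar_span_days": (dates.max() - dates.min()).days,
    "orders": int(df["Order ID"].nunique()),
    "customers": int(df["Customer ID"].nunique()),
    "products": int(df["Product ID"].nunique()),
    "sales": round(float(df["Sales"].sum()), 2),
    "profit": round(float(df["Profit"].sum()), 2),
}
print(summary)
```

Running the same dictionary against the real file is a quick sanity check that the CSV loaded completely before any chart is drawn.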

II. Data preprocessing
0. Import packages and data

import pandas as pd
from pyecharts.charts import Bar, Line, Map, Pie, Tab, Timeline
from pyecharts import options as opts
from pyecharts.commons.utils import JsCode

data = pd.read_csv(r'./data/Superstore.csv')

1. Rename columns
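The renaming in this step assigns a complete list to `data.columns`, which silently depends on the column order of the CSV. A name-based alternative is sketched here with `DataFrame.rename`. The three-entry `mapping` is a hypothetical excerpt, not the article's full 21-column list, and the toy frame is invented.

```python
import pandas as pd

# Toy frame with three of the original English column names.
df = pd.DataFrame({"Order ID": ["A-1"], "Sales": [10.0], "Profit": [1.0]})

# Hypothetical partial mapping from original names to the article's names.
mapping = {"Order ID": "订单ID", "Sales": "销售额", "Profit": "利润/亏损"}

# Fail loudly if the file does not contain a column we expect to rename.
missing = set(mapping) - set(df.columns)
if missing:
    raise KeyError(f"columns not found: {sorted(missing)}")

renamed = df.rename(columns=mapping)
print(renamed.columns.tolist())
```

With a mapping, a reordered or extended CSV still gets the right labels, at the cost of a longer declaration.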
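Column names after renaming:

# Same order as the original 21 columns, from Row ID through Profit
data.columns = ['行ID', '订单ID', '订单日期', '发货日期', '发货方式', '客户ID', '客户名称', '客户类型', '国家', '城市', '州', '邮政编码', '所属区域', '产品ID',
                '产品类别', '产品子类别', '产品名称', '销售额', '产品数量', '提供折扣', '利润/亏损']

2. Extract time fields for later analysis and plotting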
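This step derives a year, a month, and a year-month key from the order date text. A minimal sketch on three invented dates follows. The extra `year_month_padded` column is an addition for illustration, not part of the article's code. It shows that an unpadded key such as `2014-2` does not sort chronologically as plain text, which is why ordering by the parsed date matters later on.

```python
import pandas as pd

# Invented dates in the same month/day/year text format as the CSV.
orders = pd.DataFrame({"订单日期": ["11/8/2016", "1/3/2014", "12/30/2017"]})

# Parse once, then derive every time field from the parsed column.
orders["日期"] = pd.to_datetime(orders["订单日期"], format="%m/%d/%Y")
orders["年份"] = orders["日期"].dt.year.astype(str)
orders["月份"] = orders["日期"].dt.month
orders["年-月"] = orders["年份"] + "-" + orders["月份"].astype(str)

# Zero-padded variant (hypothetical extra column) that sorts correctly as text.
orders["year_month_padded"] = orders["日期"].dt.strftime("%Y-%m")

print(orders[["年-月", "year_month_padded"]])
# Unpadded keys sort lexicographically, not chronologically.
print(sorted(["2014-2", "2014-10"]))
```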
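# The year is the last four characters of the order date text, so it is a string
data['年份'] = data['订单日期'].apply(lambda x: x[-4:])
data['日期'] = pd.to_datetime(data['订单日期'], format='%m/%d/%Y')
data['月份'] = data['日期'].dt.month
data['年-月'] = data['年份'].astype(str) + '-' + data['月份'].astype(str)

III. Data visualization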
1. Map of sales by US state
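The map in this section is fed a list of `[region, value]` pairs, and its color scale is capped at the largest value. The sketch below builds both from plain lists without any charting library. The state figures are invented, and `to_data_pairs` is a hypothetical helper name.

```python
# Invented per-state totals, for illustration only.
states = ["California", "Texas", "New York"]
sales = [120.5, 80.0, 95.25]


def to_data_pairs(names, values):
    """Zip two equal-length lists into [name, value] pairs."""
    if len(names) != len(values):
        raise ValueError("names and values must have the same length")
    return [[name, value] for name, value in zip(names, values)]


pairs = to_data_pairs(states, sales)
# Upper bound for the color scale, truncated to an integer.
visual_max = int(max(sales))

print(pairs)
print(visual_max)
```

The length check is worth keeping because `zip` alone truncates silently when one list is shorter, which would drop states from the map without any error.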
# Total sales per state
usa_sale = data[['州', '销售额']].groupby('州').sum().round(2).reset_index()
print(usa_sale.head())

def echarts_map(province, data, title='主标题', subtitle='副标题', label='图例'):
    """
    province: list of state names
    data: list of values, one per state
    title: main title
    subtitle: subtitle
    label: legend label
    """
    map_ = Map(
        init_opts=opts.InitOpts(
            bg_color='#080b30',
            theme='dark',
            width='980px',
            height='700px',
        )
    )
    map_.add(label, [list(i) for i in zip(province, data)], maptype='美国')
    map_.set_global_opts(
        title_opts=opts.TitleOpts(
            title=title,
            subtitle=subtitle,
            pos_left='center',
            title_textstyle_opts=dict(color='#fff'),
        ),
        legend_opts=opts.LegendOpts(
            is_show=True,
            pos_left='right',
            pos_top='3%',
            orient='horizontal',
        ),
        visualmap_opts=opts.VisualMapOpts(max_=int(max(data)), is_piecewise=False),
    )
    return map_.render(title + '-' + subtitle + '.html')

echarts_map(usa_sale['州'].tolist(), usa_sale['销售额'].tolist(), title='美国各地区销售额分布', subtitle='销售额分布地图', label='销售额')

2. Bar chart comparing sales by product category
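The bar chart in this section is driven by one group-by aggregation that sums sales and profit per product category. A minimal sketch of that aggregation on an invented four-row frame, using the renamed column names:

```python
import pandas as pd

# Invented rows; only the column names match the article.
toy = pd.DataFrame({
    "产品类别": ["Furniture", "Technology", "Furniture", "Office Supplies"],
    "销售额": [100.126, 250.0, 50.0, 20.5],
    "利润/亏损": [10.0, 40.5, -5.25, 3.0],
})

# Select the key plus the numeric columns, sum per category, round to cents,
# and turn the group key back into an ordinary column.
by_category = (
    toy[["产品类别", "销售额", "利润/亏损"]]
    .groupby("产品类别")
    .sum()
    .round(2)
    .reset_index()
)
print(by_category)
```

`groupby` sorts the keys by default, so the bars appear in alphabetical category order unless the result is re-sorted.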
pro_category = data[['产品类别', '销售额', '利润/亏损']].groupby('产品类别').sum().round(2).reset_index()
pro_category.head()

def echarts_bar(x, y, y2, title='主标题', subtitle='副标题', label='图例', label2='图例2'):
    """
    x: x-axis labels
    y: values of the first series
    y2: values of the second series
    title: main title
    subtitle: subtitle
    label: legend label of the first series
    label2: legend label of the second series
    """
    bar = Bar(
        init_opts=opts.InitOpts(
            bg_color='#080b30',
            theme='dark',
            width='900px',
            height='600px',
        )
    )
    bar.add_xaxis(x)
    bar.add_yaxis(label, y, label_opts=opts.LabelOpts(is_show=True), category_gap='70%', yaxis_index=0)
    bar.add_yaxis(label2, y2, label_opts=opts.LabelOpts(is_show=True), category_gap='70%', yaxis_index=1)
    bar.set_series_opts(
        label_opts=opts.LabelOpts(
            is_show=True,
            position='top',
            font_size=15,
            color='white',
            font_weight='bolder',
            font_style='oblique',
        ),
        itemstyle_opts={
            'normal': {
                # Vertical gradient fill for the bars
                'color': JsCode(
                    """new echarts.graphic.LinearGradient(0, 0, 0, 1, [{offset: 0, color: 'rgba(0, 244, 255, 1)'}, {offset: 1, color: 'rgba(0, 77, 167, 1)'}], false)"""
                ),
                'shadowBlur': 15,
                'barBorderRadius': [100, 100, 100, 100],
                'shadowColor': '#0EEEF9',
                'shadowOffsetY': 2,
                'shadowOffsetX': 2,
            }
        },
    )
    bar.set_global_opts(
        title_opts=opts.TitleOpts(
            title=title,
            subtitle=subtitle,
            pos_left='center',
            title_textstyle_opts=dict(color='#fff'),
        ),
        legend_opts=opts.LegendOpts(
            is_show=True,
            pos_left='right',
            pos_top='3%',
            orient='horizontal',
        ),
        tooltip_opts=opts.TooltipOpts(
            is_show=True,
            trigger='axis',
            is_show_content=True,
            trigger_on='mousemove|click',
            axis_pointer_type='cross',
        ),
        yaxis_opts=opts.AxisOpts(
            is_show=True,
            splitline_opts=opts.SplitLineOpts(is_show=False),
            axistick_opts=opts.AxisTickOpts(is_show=False),
            axislabel_opts=opts.LabelOpts(font_size=13, font_weight='bolder'),
        ),
        xaxis_opts=opts.AxisOpts(
            boundary_gap=True,
            axistick_opts=opts.AxisTickOpts(is_show=True),
            splitline_opts=opts.SplitLineOpts(is_show=False),
            axisline_opts=opts.AxisLineOpts(is_show=True),
            axislabel_opts=opts.LabelOpts(font_size=13, font_weight='bolder'),
        ),
    )
    # Second y-axis, used by the series added with yaxis_index=1
    bar.extend_axis(yaxis=opts.AxisOpts())
    return bar.render(title + '-' + subtitle + '.html')

echarts_bar(pro_category['产品类别'].tolist(), pro_category['销售额'].tolist(),
            pro_category['利润/亏损'].tolist(), title='不同产品类别销售额对比', subtitle='销售额对比柱状图',
            label='销售额', label2='利润')

3. Pie charts comparing sales by customer segment
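A pie chart encodes each segment's share of the total, so it can help to compute those shares explicitly before plotting. The sketch below uses invented segment totals and plain Python. The `shares` dictionary is an illustrative addition and is not part of the article's code, which passes raw totals to the chart.

```python
# Invented totals per customer segment, for illustration only.
segment_sales = {"Consumer": 60.0, "Corporate": 30.0, "Home Office": 30.0}

total = sum(segment_sales.values())

# Percentage share of each segment, rounded to one decimal place.
shares = {
    name: round(value / total * 100, 1)
    for name, value in segment_sales.items()
}

# [name, value] pairs in the shape a pie series expects.
pie_pairs = [[name, value] for name, value in segment_sales.items()]

print(shares)
print(pie_pairs)
```

Note that a segment with a negative total, which is possible for the profit column, has no meaningful slice, so checking the sign of each value before drawing a profit pie is a reasonable precaution.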
customer_sale = data[['客户类型', '销售额', '利润/亏损']].groupby('客户类型').sum().round(2).reset_index()

def echarts_pie(x, y, title='主标题', subtitle='副标题', label='图例'):
    pie = Pie(
        init_opts=opts.InitOpts(
            bg_color='#080b30',
            theme='dark',
            width='900px',
            height='600px',
        )
    )
    pie.add('', [list(z) for z in zip(x, y)])
    pie.set_series_opts(
        label_opts=opts.LabelOpts(
            formatter='{b}: {c}',
            font_size=15,
            font_style='oblique',
            font_weight='bolder',
        )
    )
    pie.set_global_opts(
        title_opts=opts.TitleOpts(
            title=title,
            subtitle=subtitle,
            pos_left='center',
            title_textstyle_opts=dict(color='white'),
            subtitle_textstyle_opts=dict(color='white'),
        ),
        legend_opts=opts.LegendOpts(
            is_show=True,
            pos_left='right',
            pos_top='3%',
            orient='vertical',
            textstyle_opts=opts.TextStyleOpts(color='white', font_size=13, font_weight='bolder'),
        ),
    )
    return pie.render(title + '-' + subtitle + '.html')

echarts_pie(customer_sale['客户类型'], customer_sale['销售额'], title='不同客户类别销售额对比', subtitle='', label='销售额')
echarts_pie(customer_sale['客户类型'], customer_sale['利润/亏损'], title='不同客户类别利润对比', subtitle='', label='利润/亏损')

4. Monthly top-10 products by sales
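The ranking in this section boils down to two steps per month: sum sales per product, then keep the largest N and order them ascending so that the biggest bar ends up on top once the axes are flipped. A minimal sketch on invented rows follows. `top_n_per_month` is a hypothetical helper, and it uses `nlargest` where the article sorts and slices.

```python
import pandas as pd

# Invented rows; only the column names match the article.
toy = pd.DataFrame({
    "年-月": ["2014-1", "2014-1", "2014-1", "2014-2", "2014-2"],
    "产品名称": ["Desk", "Lamp", "Desk", "Phone", "Lamp"],
    "销售额": [100.0, 30.0, 50.0, 200.0, 10.0],
})


def top_n_per_month(frame, n):
    """Return {month: frame of the n best-selling products, ascending}."""
    totals = (
        frame.groupby(["年-月", "产品名称"], sort=False)["销售额"]
        .sum()
        .round(2)
        .reset_index()
    )
    result = {}
    # sort=False keeps months in order of first appearance in the input.
    for month, group in totals.groupby("年-月", sort=False):
        top = group.nlargest(n, "销售额")
        result[month] = top.sort_values(by="销售额", ascending=True)[["产品名称", "销售额"]]
    return result


top = top_n_per_month(toy, n=2)
print(top["2014-1"])
print(top["2014-2"])
```

Because month order here follows first appearance, the input should already be sorted by the parsed date, as the article does before calling `unique()`.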
# Year-month keys in chronological order
month_lis = data.sort_values(by='日期')['年-月'].unique().tolist()
month_sale = []
for i in month_lis:
    # Top 10 products of the month by total sales
    month_data = data[data['年-月'] == i][['产品名称', '销售额']].groupby(['产品名称']). \
        sum().round(2).reset_index().sort_values(by='销售额', ascending=False)[:10]
    # Ascending order so the largest bar is drawn on top after the axes are flipped
    month_data = month_data.sort_values(by='销售额', ascending=True)
    month_sale.append(month_data)

def echart_line(x, y, title='主标题', subtitle='副标题', label='图例'):
    tl = Timeline(
        init_opts=opts.InitOpts(
            bg_color='#080b30',
            theme='dark',
            width='1200px',
            height='700px',
        )
    )
    tl.add_schema(is_auto_play=True, play_interval=1500, is_loop_play=True)
    for i, data1 in zip(x, y):
        day = i
        bar = Bar(
            init_opts=opts.InitOpts(
                bg_color='#080b30',
                theme='dark',
                width='1200px',
                height='700px',
            )
        )
        bar.add_xaxis(data1.iloc[:, 0].tolist())
        bar.add_yaxis(label, data1.iloc[:, 1].round(2).tolist(), category_gap='40%')
        bar.reversal_axis()
        bar.set_series_opts(
            label_opts=opts.LabelOpts(
                is_show=True,
                position='right',
                font_style='oblique',
                font_weight='bolder',
                font_size=13,
            ),
            itemstyle_opts={
                'normal': {
                    # Horizontal gradient fill for the bars
                    'color': JsCode(
                        """new echarts.graphic.LinearGradient(1, 0, 0, 0, [{offset: 0, color: 'rgba(0, 244, 255, 1)'}, {offset: 1, color: 'rgba(0, 77, 167, 1)'}], false)"""
                    ),
                    'shadowBlur': 8,
                    'barBorderRadius': [100, 100, 100, 100],
                    'shadowColor': '#0EEEF9',
                    'shadowOffsetY': 6,
                    'shadowOffsetX': 6,
                }
            },
        )
        bar.set_global_opts(
            title_opts=opts.TitleOpts(
                title=title,
                subtitle=subtitle,
                pos_left='center',
                title_textstyle_opts=dict(color='white'),
                subtitle_textstyle_opts=dict(color='white'),
            ),
            legend_opts=opts.LegendOpts(
                is_show=True,
                pos_left='right',
                pos_top='3%',
                orient='vertical',
                textstyle_opts=opts.TextStyleOpts(
                    color='white',
                    font_size=13,
                    font_weight='bolder',
                    font_style='oblique',
                ),
            ),
            tooltip_opts=opts.TooltipOpts(
                is_show=True,
                trigger='axis',
                is_show_content=True,
                trigger_on='mousemove|click',
                axis_pointer_type='cross',
            ),
            yaxis_opts=opts.AxisOpts(
                is_show=True,
                splitline_opts=opts.SplitLineOpts(is_show=False),
                axistick_opts=opts.AxisTickOpts(is_show=False),
                axislabel_opts=opts.LabelOpts(font_size=13, font_weight='bolder'),
            ),
            xaxis_opts=opts.AxisOpts(
                boundary_gap=True,
                axistick_opts=opts.AxisTickOpts(is_show=True),
                splitline_opts=opts.SplitLineOpts(is_show=False),
                axisline_opts=opts.AxisLineOpts(is_show=True),
                axislabel_opts=opts.LabelOpts(font_size=13, font_weight='bolder'),
            ),
        )
        tl.add(bar, day)
    return tl.render(title + '-' + subtitle + '.html')

echart_line(month_lis, month_sale, title='每月各产品销售额top10榜单', subtitle='', label='销售额')

5. Line chart of sales and profit over time
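The line chart in this section needs one daily series per year. A sketch of that split on invented rows is given here, using a dictionary built from `groupby` rather than one hand-written filter per year. That is an alternative formulation, not the article's code. Note that the year column holds strings, because it is sliced from the date text, so a comparison against the integer `2014` would match nothing.

```python
import pandas as pd

# Invented rows; only the column names match the article.
toy = pd.DataFrame({
    "年份": ["2014", "2014", "2015"],
    "日期": pd.to_datetime(["2014-01-03", "2014-01-03", "2015-02-01"]),
    "销售额": [10.0, 5.5, 7.0],
    "利润/亏损": [1.0, -0.5, 2.0],
})

# Daily totals of sales and profit, keyed by year and date.
daily = (
    toy.groupby(["年份", "日期"])[["销售额", "利润/亏损"]]
    .sum()
    .round(2)
    .reset_index()
)

# One frame per year; keys are strings such as "2014".
frames_by_year = {
    year: frame.reset_index(drop=True)
    for year, frame in daily.groupby("年份")
}
year_lis = list(frames_by_year)

print(year_lis)
print(frames_by_year["2014"])
# An integer comparison silently selects no rows.
print(len(daily[daily["年份"] == 2014]))
```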
sale_data = data.sort_values(by='日期')[['年份', '日期', '销售额', '利润/亏损']]. \
    groupby(['年份', '日期']).sum().round(2).reset_index()
year_lis = sale_data['年份'].unique().tolist()
# The year column holds strings, so compare against strings
sale_data1 = sale_data[sale_data['年份'] == '2014']
sale_data2 = sale_data[sale_data['年份'] == '2015']
sale_data3 = sale_data[sale_data['年份'] == '2016']
sale_data4 = sale_data[sale_data['年份'] == '2017']
sale_data_lis = [sale_data1, sale_data2, sale_data3, sale_data4]
print(sale_data4.head())

def echarts_two_line(x, y, title='主标题', subtitle='副标题', label='图例', label2='图例2'):
    """
    x: tab names, one per DataFrame
    y: list of DataFrames, one per tab
    title: main title
    subtitle: subtitle
    label: legend label of the first series
    label2: legend label of the second series
    """
    tab = Tab()
    for table, data in zip(x, y):
        line1 = Line(
            init_opts=opts.InitOpts(
                bg_color='#080b30',  # background color
                theme='dark',  # theme
                width='1200px',  # chart width
                height='700px',  # chart height
            )
        )
        line1.add_xaxis(data['日期'].tolist())
        line1.extend_axis(yaxis=opts.AxisOpts())  # add a second y-axis
        line1.add_yaxis(
            label,
            data['销售额'].tolist(),
            yaxis_index=0,
            is_symbol_show=False,  # hide the data point markers
            is_smooth=True,  # smooth the curve
            label_opts=opts.LabelOpts(
                is_show=True,  # show the data labels
            ),
            # line width and shadow settings
            linestyle_opts={
                'normal': {
                    'color': '#E47085',  # line color
                    'shadowColor': '#E4708560',  # shadow color and opacity
                    'shadowBlur': 8,  # shadow blur size
                    'shadowOffsetY': 20,  # shadow y offset
                    'shadowOffsetX': 20,  # shadow x offset
                    'width': 7,  # line width
                },
            },
        )
        line1.set_global_opts(
            # title settings
            title_opts=opts.TitleOpts(
                title=title,  # main title
                subtitle=subtitle,  # subtitle
                pos_left='center',  # title position
                title_textstyle_opts=dict(color='white'),  # title font color
                subtitle_textstyle_opts=dict(color='white'),
            ),
            # legend settings
            legend_opts=opts.LegendOpts(
                is_show=True,  # show the legend
                pos_left='right',  # legend position
                pos_top='3%',  # distance from the top
                orient='horizontal',  # horizontal layout
                textstyle_opts=opts.TextStyleOpts(
                    color='white',  # color
                    font_size=13,  # font size
                    font_weight='bolder',  # bold
                ),
            ),
            tooltip_opts=opts.TooltipOpts(
                is_show=True,  # show the tooltip
                trigger='axis',  # trigger type
                is_show_content=True,
                trigger_on='mousemove|click',  # trigger on hover or click
                axis_pointer_type='cross',  # pointer type, visible when hovering over the chart
                # formatter='{a}<br>{b}:{c}人'  # tooltip text
            ),
            datazoom_opts=opts.DataZoomOpts(
                range_start=0,  # start of the range
                range_end=25,  # end of the range
                # orient='vertical',  # vertical layout
                type_='slider',  # slider style
                is_zoom_lock=False,  # do not lock the window size
                # pos_left='1%'  # position
            ),
            yaxis_opts=opts.AxisOpts(
                is_show=True,
                splitline_opts=opts.SplitLineOpts(is_show=False),  # hide split lines
                axistick_opts=opts.AxisTickOpts(is_show=False),  # hide ticks
                axislabel_opts=opts.LabelOpts(  # axis label settings
                    font_size=13,  # font size
                    font_weight='bolder',  # font weight
                ),
            ),
            xaxis_opts=opts.AxisOpts(
                boundary_gap=False,  # no gap at either end
                axistick_opts=opts.AxisTickOpts(is_show=True),  # show ticks
                splitline_opts=opts.SplitLineOpts(is_show=False),  # hide split lines
                axisline_opts=opts.AxisLineOpts(is_show=True),  # show the axis line
                axislabel_opts=opts.LabelOpts(  # axis label settings
                    font_size=13,  # font size
                    font_weight='bolder',  # font weight
                ),
            ),
        )
        # Second line chart, overlaid on the first
        line2 = Line()
        line2.add_xaxis(data['日期'].tolist())
        # yaxis_index=0 draws profit on the same axis as sales;
        # yaxis_index=1 would point it at the axis added by extend_axis
        # line2.extend_axis(yaxis=opts.AxisOpts())
        line2.add_yaxis(
            label2,
            data['利润/亏损'].tolist(),
            yaxis_index=0,
            is_symbol_show=False,  # hide the data point markers
            is_smooth=True,  # smooth the curve
            label_opts=opts.LabelOpts(
                is_show=True,  # show the data labels
            ),
            # line width and shadow settings
            linestyle_opts={
                'normal': {
                    'color': '#44B2BE',  # line color
                    'shadowColor': '#44B2BE60',  # shadow color and opacity
                    'shadowBlur': 8,  # shadow blur size
                    'shadowOffsetY': 20,  # shadow y offset
                    'shadowOffsetX': 20,  # shadow x offset
                    'width': 7,  # line width
                },
            },
        )
        line1.overlap(line2)
        tab.add(line1, table)
    return tab.render(title + '-' + subtitle + '.html')

echarts_two_line(year_lis, sale_data_lis, title='销售额、利润在时间维度的变化', subtitle='',
                 label='销售额', label2='利润/亏损')

6. Sales
sale_sum = int(data['销售额'].sum())
num_count = int(data['产品数量'].sum())
profit_sum = int(data['利润/亏损'].sum())
print(profit_sum)

def big_data(title='主标题', subtitle='副标题'):
    # An empty pie chart is used only as a canvas for a large title
    c = Pie(
        init_opts=opts.InitOpts(
            chart_id=1,
            bg_color='#080b30',
            theme='dark',
            width='300px',
            height='300px',
        )
    )
    c.set_global_opts(
        title_opts=opts.TitleOpts(
            title=title,
            subtitle=subtitle,
            title_textstyle_opts=opts.TextStyleOpts(
                font_size=36,
                color='#FFFFFF',
            ),
            pos_left='center',
            pos_top='middle',
        )
    )
    return c.render(str(title) + '-' + subtitle + '.html')

big_data(title=sale_sum, subtitle='销售额')