
All categories

(102)

07_07_32. Building a time series for every group at once

    # Reindex on the date
    datelist = pd.date_range(start=df.oper_dt.min(), end=df.oper_dt.max(), freq='D')
    df_01 = (
        df.set_index('oper_dt')
          .groupby(['origin_bizpl_cd', 'ctgr_cd'])
          .apply(lambda x: x.reindex(datelist, fill_value=0))
          .drop(columns=['origin_bizpl_cd', 'ctgr_cd'])
          .reset_index()
    )  # groupby reindex
    df_01 = df_01.rename(columns={'level_2': 'oper_dt'})
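As a minimal sketch of the same idea on toy data: each group's value column is reindexed onto the full daily calendar, so missing days become 0. The sample rows and the `qty` column are assumptions for illustration only. Selecting the value column before `apply` is a variation on the snippet above that avoids having to drop the grouping columns afterwards.

```python
import pandas as pd

# Hypothetical toy data: two groups with gaps in their daily records.
df = pd.DataFrame({
    "oper_dt": pd.to_datetime(["2021-01-01", "2021-01-03", "2021-01-02"]),
    "origin_bizpl_cd": ["A", "A", "B"],
    "ctgr_cd": ["c1", "c1", "c2"],
    "qty": [5, 7, 3],  # assumed value column
})

# Full daily calendar between the first and last observed date.
datelist = pd.date_range(df["oper_dt"].min(), df["oper_dt"].max(), freq="D")

# Reindex each group's series onto the calendar; missing days become 0.
filled = (
    df.set_index("oper_dt")
      .groupby(["origin_bizpl_cd", "ctgr_cd"])["qty"]
      .apply(lambda s: s.reindex(datelist, fill_value=0))
      .rename_axis(["origin_bizpl_cd", "ctgr_cd", "oper_dt"])
      .reset_index()
)
print(filled)
```

Every group ends up with one row per calendar day, which is usually what downstream time-series code expects.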

07_01_31. Renaming columns while aggregating a DataFrame with groupby

    df_4w_ag = (
        df_4w.groupby(["origin_bizpl_cd", "goods_cd"])
             .agg({"bg_purch_qty": [("sum_4w", "sum"), ("mean_4w", "mean"), ("max_4w", "max")],
                   "bg_purch_qty_yn": [("count_4w", "sum")]})
             .droplevel(0, axis=1)
             .reset_index()
    )
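pandas also offers keyword-style named aggregation, which produces flat column names directly and makes the `droplevel` step unnecessary. The sketch below uses made-up rows; only the column names come from the snippet above.

```python
import pandas as pd

# Hypothetical toy data using the column names from the snippet above.
df_4w = pd.DataFrame({
    "origin_bizpl_cd": ["A", "A", "B"],
    "goods_cd": ["g1", "g1", "g2"],
    "bg_purch_qty": [2, 4, 10],
    "bg_purch_qty_yn": [1, 1, 1],
})

# Named aggregation: output_name=(input_column, function)
df_4w_ag = (
    df_4w.groupby(["origin_bizpl_cd", "goods_cd"])
         .agg(
             sum_4w=("bg_purch_qty", "sum"),
             mean_4w=("bg_purch_qty", "mean"),
             max_4w=("bg_purch_qty", "max"),
             count_4w=("bg_purch_qty_yn", "sum"),
         )
         .reset_index()
)
print(df_4w_ag)
```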

07_01_30. Convert a multi-index to a single index (Series)

    # Convert a multi-index to a single index: reset_index
    s_roll_incdec = df_covid.groupby(["gubun"], group_keys=False)["incdec"].rolling(7).sum().reset_index(drop=True)
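One thing to watch with `reset_index(drop=True)`: the result of a grouped rolling operation is ordered by group, so the new positional index only lines up with the original rows if the frame is already sorted by group. A possible alternative, sketched here on made-up data with a window of 2, is to drop only the group level so the original row labels survive and assignment aligns on them.

```python
import pandas as pd

# Hypothetical toy data; rows of the two groups are interleaved on purpose.
df_covid = pd.DataFrame({
    "gubun": ["Seoul", "Busan", "Seoul", "Busan", "Seoul"],
    "incdec": [1, 10, 2, 20, 3],
})

# The result carries a two-level index: (gubun, original row label).
rolled = df_covid.groupby("gubun")["incdec"].rolling(2).sum()

# Dropping the group level keeps the original row labels,
# so the assignment below aligns row by row.
df_covid["roll_2"] = rolled.droplevel(0)
print(df_covid)
```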

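07_01_29. isocalendar(): getting the week number of a date

    import datetime

    ## To find the week number of a given date:
    ## isocalendar() returns a tuple of (ISO year, ISO week number, ISO weekday)
    datetime.date(2019, 12, 31).isocalendar()[0]  # ISO year
    datetime.date(2019, 12, 31).isocalendar()[1]  # ISO week number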
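The example date is a useful edge case: 2019-12-31 belongs to ISO week 1 of ISO year 2020, so the ISO year can differ from the calendar year. A short sketch, including the pandas `.dt.isocalendar()` accessor for whole columns (the sample Series is an assumption):

```python
import datetime
import pandas as pd

d = datetime.date(2019, 12, 31)
iso_year, iso_week, iso_weekday = d.isocalendar()
print(iso_year, iso_week, iso_weekday)  # 2020 1 2

# For a whole column, pandas exposes the same fields as a DataFrame
# with columns "year", "week" and "day".
s = pd.Series(pd.to_datetime(["2019-12-31", "2020-06-15"]))
iso = s.dt.isocalendar()
print(iso)
```

When grouping by week across a year boundary, pairing the ISO week with the ISO year (not the calendar year) keeps the weeks in the right order.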

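07_01_28. Removing specific characters from the right

    ### Remove '.' from the right end of the string
    df_covid["qurrate"] = df_covid["qurrate"].str.rstrip('.')

Remove specific characters from the left: lstrip('.')
Remove specific characters from both sides: strip('.')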
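A small sketch comparing the three methods on made-up strings. Note that they remove every consecutive matching character at the given end, not just one, and never touch characters in the middle.

```python
import pandas as pd

# Hypothetical sample values.
s = pd.Series(["12.5.", "3..", ".7", "ab.c"])

right = s.str.rstrip(".")  # strips trailing dots only
left = s.str.lstrip(".")   # strips leading dots only
both = s.str.strip(".")    # strips dots at both ends

print(right.tolist())  # ['12.5', '3', '.7', 'ab.c']
print(left.tolist())   # ['12.5.', '3..', '7', 'ab.c']
print(both.tolist())   # ['12.5', '3', '7', 'ab.c']
```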

07_01_27. groupby rolling sum & renaming a Series

    ## How to compute a rolling sum/average per group
    ## rolling(n): roll over n rows, i.e. n is the window size
    df_tmp[df_tmp["oper_dt"].str[:6] == '201901'].groupby(["origin_bizpl_cd", "ctgr_cd"], as_index=False)["buyget_sale_qty"].rolling(3).sum()
    # min_periods=n: use this when a value should be computed as long as at least n observations exist
    df_tmp[df_tmp["oper_dt"].str[:6] == '201901'].groupby(["origin_bizpl_cd", "ctgr_cd"], as_index=False)["buyget_sale_qty"].r..
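The excerpt is cut off, so the following is only a sketch of what `min_periods` changes, on made-up data. The final `rename` call and the name `roll_sum_3` are assumptions about how the Series name might be changed, not the original post's code.

```python
import pandas as pd

# Hypothetical toy data with a single group.
df_tmp = pd.DataFrame({
    "origin_bizpl_cd": ["A"] * 4,
    "ctgr_cd": ["c1"] * 4,
    "buyget_sale_qty": [1, 2, 3, 4],
})

grouped = df_tmp.groupby(["origin_bizpl_cd", "ctgr_cd"])["buyget_sale_qty"]

# Default: the first window_size - 1 rows of each group are NaN.
strict = grouped.rolling(3).sum()

# min_periods=1: a value is produced as soon as one observation exists.
relaxed = grouped.rolling(3, min_periods=1).sum()

# Assumed way to rename the resulting Series (hypothetical name).
relaxed = relaxed.rename("roll_sum_3")

print(strict.tolist())   # [nan, nan, 6.0, 9.0]
print(relaxed.tolist())  # [1.0, 3.0, 6.0, 9.0]
```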

07_01_26. date_range: generating consecutive dates

    s_start_dt = pd.date_range('2018-01-01', periods=38, freq='MS').strftime('%Y%m%d')
    s_end_dt = pd.date_range('2018-01-01', periods=38, freq='M').strftime('%Y%m%d')
    # df_date_range = pd.DataFrame({"start_dt": s_start_dt, "end_dt": s_end_dt})
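Recent pandas releases deprecate the `'M'` alias for month-end in `date_range` in favor of `'ME'`. A version-independent sketch is to generate the month starts with `'MS'` and derive each month end with the `MonthEnd` offset; the variable names follow the snippet above.

```python
import pandas as pd

# First day of each month, 38 months starting from January 2018.
starts = pd.date_range("2018-01-01", periods=38, freq="MS")

# MonthEnd(0) rolls each date forward to the end of its own month.
ends = starts + pd.offsets.MonthEnd(0)

s_start_dt = starts.strftime("%Y%m%d")
s_end_dt = ends.strftime("%Y%m%d")

df_date_range = pd.DataFrame({"start_dt": s_start_dt, "end_dt": s_end_dt})
print(df_date_range.head())
```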

07_01_25. Building a list from a numeric range, converting to str, and left-padding with zeros

    ## Build a numeric range, then convert each value to str
    year = list(map(str, range(2019, 2022)))  # 2019, 2020, 2021
    ## Build a numeric range, convert each value to str, then left-pad with zeros
    month = []
    for x in range(1, 13):
        month.append(str(x).zfill(2))
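The same lists can be written as comprehensions, and an f-string format spec is an alternative to `zfill`. The combined `yyyymm` list is an assumed follow-up use, shown only to illustrate why the zero padding matters.

```python
# Years as strings: '2019', '2020', '2021'
year = [str(y) for y in range(2019, 2022)]

# Months zero-padded to two characters: '01' ... '12'
month = [str(m).zfill(2) for m in range(1, 13)]

# Equivalent padding with an f-string format spec.
month_fmt = [f"{m:02d}" for m in range(1, 13)]

# Assumed follow-up: combine into sortable 'YYYYMM' keys.
yyyymm = [y + m for y in year for m in month]
print(yyyymm[:3])  # ['201901', '201902', '201903']
```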

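07_01_24. Converting DataFrame column names to lowercase

    # str.lower() returns a new Index; assign it back to apply the change
    df_dw_bizpl.columns = df_dw_bizpl.columns.str.lower()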
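`columns.str.lower()` on its own returns a new Index and leaves the DataFrame untouched, so the result has to be assigned back or produced through `rename`. A minimal sketch with made-up column names:

```python
import pandas as pd

# Hypothetical frame with mixed-case column names.
df_dw_bizpl = pd.DataFrame({"BIZPL_CD": [1], "Bizpl_NM": ["x"]})

# Returns a new Index; the frame itself is not modified yet.
lowered = df_dw_bizpl.columns.str.lower()
still_original = list(df_dw_bizpl.columns)

# Option 1: non-mutating, returns a renamed copy.
df_renamed = df_dw_bizpl.rename(columns=str.lower)

# Option 2: assign the lowercased Index back in place.
df_dw_bizpl.columns = lowered
print(list(df_dw_bizpl.columns))  # ['bizpl_cd', 'bizpl_nm']
```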

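07_01_23. Getting the name of the column with the largest value in each row: idxmax

    ## For each row, get the name of the promotion column with the largest value
    prmt_cols = ["prmt_1", "prmt_2", "prmt_3", "prmt_4", "prmt_5"]
    df_prmt_tmp = df_sub_01[df_sub_01["prmt_tot"] > 0][prmt_cols].idxmax(axis=1).to_frame(name="max_prmt")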
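A small sketch on made-up data with three promotion columns. It also shows the tie behavior: when several columns share the row maximum, `idxmax` returns the first one in column order. Treating `prmt_tot` as the row sum of the promotion columns is an assumption.

```python
import pandas as pd

# Hypothetical promotion amounts per row.
df_sub_01 = pd.DataFrame({
    "prmt_1": [0, 3, 0],
    "prmt_2": [5, 1, 0],
    "prmt_3": [2, 3, 0],
})
prmt_cols = ["prmt_1", "prmt_2", "prmt_3"]

# Assumption: prmt_tot is the row total of the promotion columns.
df_sub_01["prmt_tot"] = df_sub_01[prmt_cols].sum(axis=1)

# Keep rows with any promotion, then take the column name of the row maximum.
df_prmt_tmp = (
    df_sub_01.loc[df_sub_01["prmt_tot"] > 0, prmt_cols]
             .idxmax(axis=1)
             .to_frame(name="max_prmt")
)
print(df_prmt_tmp)
```

Filtering out all-zero rows first matters, because `idxmax` would otherwise report the first column for a row where every value is 0.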

