09_06. NEIS Education Information Open Portal (나이스교육정보개방포털) API - Academic Schedule

This post pulls academic schedule data for schools nationwide.

 

1. Import modules

import requests
import re
import numpy as np  # needed for np.nan in step 5
import pandas as pd
import json
import os
import time, datetime
from dfply import *
from urllib.request import urlopen
import math

 

2. Set today's date

today = datetime.date.today().strftime('%Y%m%d')

 

3. API key

api_key = 'enter your API key here'
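
Hard-coding the key is fine for a quick notebook run. As an alternative, the key could be read from an environment variable so it does not end up in a shared notebook. A minimal sketch, assuming a variable named `NEIS_API_KEY` (a hypothetical name chosen here, not something the portal defines) and a hypothetical helper `load_api_key`:

```python
import os

def load_api_key(env_var="NEIS_API_KEY", default=None):
    """Return the API key stored in the environment variable `env_var`.

    NEIS_API_KEY is a hypothetical variable name used only in this sketch.
    Falls back to `default`; raises if neither one is set.
    """
    key = os.environ.get(env_var, default)
    if not key:
        raise RuntimeError("Set the {} environment variable first".format(env_var))
    return key
```

With that in place, `api_key = load_api_key()` would replace the hard-coded assignment.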

 

4. Load the school list collected in the earlier post 09_05(1) (file attached)

→ The school code (sd_schul_code) from that list is used

https://based-infos.tistory.com/58

df_school = pd.read_parquet('20210524_school_info.parquet')
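
Before starting the long loop in step 5, it may be worth confirming that the loaded file has the columns the loop reads. A minimal sketch, assuming those are `atpt_ofcdc_sc_code`, `sd_schul_code`, and `schul_nm`; `missing_columns` is a hypothetical helper name:

```python
# Columns that the collection loop in step 5 reads from each school row.
REQUIRED_COLUMNS = ("atpt_ofcdc_sc_code", "sd_schul_code", "schul_nm")

def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]
```

Calling `missing_columns(df_school.columns)` should give an empty list when the file is usable as-is.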

 

5. Collect academic schedule data from the API

%%time
df_sch = pd.DataFrame()   # schedule rows with per-grade event flags
backup = pd.DataFrame()   # slim copy: school code, academic year, event name
l_no = []                 # school codes that returned no usable data

for i in range(0, len(df_school)):
    l_info = df_school.iloc[i]
    
    v_atpt = l_info['atpt_ofcdc_sc_code']
    v_schul = l_info['sd_schul_code']
    
    # progress log every 500 schools
    if i % 500 == 0:
        print( "#{}: {}".format(i, l_info['schul_nm']))

    api_schedule_url = "https://open.neis.go.kr/hub/SchoolSchedule?Type={}&Key={}&pIndex=1&pSize=1000&ATPT_OFCDC_SC_CODE={}&SD_SCHUL_CODE={}&AA_FROM_YMD=20210301&AA_TO_YMD=20210630".format("JSON", api_key, v_atpt, v_schul)

    api_schedule_response = requests.get(api_schedule_url)
    api_schedule_info = api_schedule_response.json()

    try: 
        tmp_sch = pd.DataFrame(api_schedule_info['SchoolSchedule'][1]['row'])

        # '초등학교' (elementary school) is the value returned by the API; these schools have grades 1-6
        if tmp_sch['SCHUL_CRSE_SC_NM'][0] == '초등학교':
            # optional event-name filter (l_event is not defined in this post):
#             tmp_sch_01 = tmp_sch[tmp_sch.EVENT_NM.str.contains('|'.join(l_event))].reset_index(drop=True)
            tmp_sch_01 = tmp_sch[['ATPT_OFCDC_SC_CODE', 'SD_SCHUL_CODE', 'SCHUL_NM', 'AY', 'AA_YMD', 'EVENT_NM'
                                     , 'ONE_GRADE_EVENT_YN', 'TW_GRADE_EVENT_YN', 'THREE_GRADE_EVENT_YN', 'FR_GRADE_EVENT_YN', 'FIV_GRADE_EVENT_YN', 'SIX_GRADE_EVENT_YN']]
        else:
            # other school types: keep grades 1-3 and fill grades 4-6 with NaN
#             tmp_sch_01 = tmp_sch[tmp_sch.EVENT_NM.str.contains('|'.join(l_event))].reset_index(drop=True)
            tmp_sch_01 = tmp_sch[['ATPT_OFCDC_SC_CODE', 'SD_SCHUL_CODE', 'SCHUL_NM', 'AY', 'AA_YMD', 'EVENT_NM'
                                     , 'ONE_GRADE_EVENT_YN', 'TW_GRADE_EVENT_YN', 'THREE_GRADE_EVENT_YN']].copy()  # copy avoids SettingWithCopyWarning
            tmp_sch_01['FR_GRADE_EVENT_YN'] = np.nan
            tmp_sch_01['FIV_GRADE_EVENT_YN'] = np.nan
            tmp_sch_01['SIX_GRADE_EVENT_YN'] = np.nan

        df_sch = pd.concat([df_sch, tmp_sch_01], axis = 0)
        backup = pd.concat([backup, tmp_sch[['SD_SCHUL_CODE', 'AY', 'EVENT_NM']]], axis = 0)

    except Exception:
        # no 'SchoolSchedule' rows in the response: record the school and move on
        l_no.append( v_schul )
        # df_school uses lowercase column names, so this must be 'schul_nm'
        print( "=== NO INFORMATION: #{} {} {} ===".format( i, v_schul, l_info['schul_nm']))
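
The loop above leans on the `except` branch to detect schools with no schedule data. The same two steps, building the query and pulling the rows out of the response, can be sketched as small standalone functions. This assumes a successful payload has the shape the loop indexes (`payload['SchoolSchedule'][1]['row']`) and treats anything else as "no rows"; `build_schedule_params` and `extract_rows` are hypothetical helper names, not part of any library.

```python
def build_schedule_params(api_key, atpt_code, school_code,
                          from_ymd="20210301", to_ymd="20210630"):
    """Query parameters equivalent to the URL string used in the loop."""
    return {
        "Type": "JSON",
        "Key": api_key,
        "pIndex": 1,
        "pSize": 1000,
        "ATPT_OFCDC_SC_CODE": atpt_code,
        "SD_SCHUL_CODE": school_code,
        "AA_FROM_YMD": from_ymd,
        "AA_TO_YMD": to_ymd,
    }

def extract_rows(payload):
    """Return the list of schedule rows, or [] when the payload has none.

    Assumption: rows live in a block with a 'row' key under 'SchoolSchedule',
    which is the structure the loop reads with [1]['row'].
    """
    blocks = payload.get("SchoolSchedule") if isinstance(payload, dict) else None
    if not isinstance(blocks, list):
        return []
    for block in blocks:
        if isinstance(block, dict) and "row" in block:
            return block["row"]
    return []
```

In the loop, this could be used as `requests.get("https://open.neis.go.kr/hub/SchoolSchedule", params=build_schedule_params(api_key, v_atpt, v_schul))` followed by `rows = extract_rows(response.json())`, appending `v_schul` to `l_no` whenever `rows` is empty. That way the `except` branch would only have to cover genuinely unexpected errors.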

 

Attachment: 20210524_school_info.parquet (0.54MB)