software_development:python_pandas

Differences

This shows you the differences between two versions of the page.

software_development:python_pandas [2022/08/04 06:00] – [order of columns] prgram
software_development:python_pandas [2025/07/07 14:12] (current) – external edit 127.0.0.1
Line 27: Line 27:
 <code python> <code python>
 df.groupby([COLUMNS]).agg({'COLUMN':sum}).reset_index() df.groupby([COLUMNS]).agg({'COLUMN':sum}).reset_index()
 +
 +df.groupby([COLUMNS])['COLUMN'].max().reset_index()
  
 df = df.assign(date=pd.to_numeric(df['date'], errors='coerce')).groupby(['코드', '종목명']).agg({'date':np.min}).reset_index().drop_duplicates() df = df.assign(date=pd.to_numeric(df['date'], errors='coerce')).groupby(['코드', '종목명']).agg({'date':np.min}).reset_index().drop_duplicates()
Line 105: Line 107:
 iloc: Select by position iloc: Select by position
 loc: Select by label loc: Select by label
 +  
 +df.loc[:,~df.columns.isin(['a','b'])]  
 +
 +# Parenthesize each comparison: & binds tighter than ==
 +df[~( df['a'].isin(['1','2','3']) & (df['b']=='3') )] #row-wise
 +df.loc[~( df['a'].isin(['1','2','3']) & (df['b']=='3') ), 8] #row-wise & column (8 is a column label)
 </code> </code>
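The row and column filters above can be tried on a small frame. This is only a minimal sketch: the column names `a`, `b`, `c` and the values are made up for illustration. It shows the mask built once and reused, with each comparison in its own parentheses because `&` binds tighter than `==` in Python.

```python
import pandas as pd

# Hypothetical data, only for illustration.
df = pd.DataFrame({
    "a": ["1", "2", "9", "3"],
    "b": ["3", "0", "3", "3"],
    "c": [10, 20, 30, 40],
})

# True where a is one of '1','2','3' AND b equals '3'.
mask = df["a"].isin(["1", "2", "3"]) & (df["b"] == "3")

kept_rows = df[~mask]            # rows where the combined condition is False
kept_c = df.loc[~mask, "c"]      # same rows, one column selected by label
without_ab = df.loc[:, ~df.columns.isin(["a", "b"])]  # every column except a and b
```

Building the mask as a named variable keeps the negation `~` readable and avoids repeating the condition for the row-only and row-plus-column cases.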
  
Line 116: Line 123:
      
 =====I/O file===== =====I/O file=====
 +
 +=== encoding_errors - 'ignore'===
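 +If reading still fails even though the encoding is set correctly, try this.
 +Public (government) datasets often have this problem.
 +
 +Error tokenizing data. C error: EOF inside string starting at row 0 | pandas error
 +https://con2joa.tistory.com/m/60
 +For that error, pass the quoting=csv.QUOTE_NONE parameter to pd.read_csv.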
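A minimal sketch of the `quoting=csv.QUOTE_NONE` workaround, assuming the tokenizing error comes from a quote character that is opened but never closed. The CSV text and column names here are hypothetical. With `QUOTE_NONE`, quote characters are treated as ordinary data, so the stray quote no longer swallows the rest of the file.

```python
import csv
import io

import pandas as pd

# Hypothetical content: the second data row opens a quote that is never closed.
raw = 'name,note\nA,ok\nB,"unterminated\nC,ok\n'

# QUOTE_NONE disables quote handling; the '"' stays in the field as a plain character.
data = pd.read_csv(io.StringIO(raw), quoting=csv.QUOTE_NONE)
```

The trade-off is that correctly quoted fields containing commas will be split, so this is best kept for files that are known to be malformed.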
 +
 +<code python>
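 +import chardet
 +import pandas as pd
 +
 +# Guess the encoding from the first 100,000 bytes (file is the path to the CSV)
 +with open(file, 'rb') as rawdata:
 +    result = chardet.detect(rawdata.read(100000))
 +result
 +
 +# Drop bytes that cannot be decoded instead of raising (pandas 1.3+)
 +data = pd.read_csv(file, encoding='cp949', encoding_errors='ignore')
 +# Other options: on_bad_lines='skip' (pandas 1.3+)
 +# or error_bad_lines=False (older pandas; removed in 2.0)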
 +</code>
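To see what `encoding_errors='ignore'` does, here is a small self-contained sketch. The file name and contents are made up, and it assumes pandas 1.3 or later. It writes a cp949 file with two bytes (0xFF) that cp949 cannot decode, then reads it back with the undecodable bytes dropped.

```python
import os
import tempfile

import pandas as pd

# Hypothetical content: valid cp949 text plus two bytes that cp949 cannot decode.
payload = "코드,종목명\n1,가나\n".encode("cp949") + b"2,\xff\xff\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")  # illustrative file name
    with open(path, "wb") as fh:
        fh.write(payload)

    # 'ignore' silently drops the bytes that fail to decode.
    data = pd.read_csv(path, encoding="cp949", encoding_errors="ignore")
```

Dropped bytes are lost without warning, so it may be worth comparing row counts against the raw file when using this on real data.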
  
 === to_numeric === === to_numeric ===
  • Last modified: 2025/07/07 14:12
  • (external edit)