pandas & csv

Pandas

官方文件

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_numpy.html#pandas.DataFrame.to_numpy

1.讀取csv : data=pandas.read_csv('filename')

2.瀏覽資料內容 : data.head()

3.讀取特定欄位資料(列,直直的) : detail=data[['name1','name2']......]

3-1. 轉為一般陣列: data['name'].tolist()

3-2. 可用numpy將series或者dataframe型態轉為numpy.array

4.計算各欄位的多種統計資訊(count,mean,std,min,max,25%,50%,75%) : data.describe()

5.所有欄位名稱: data.colums

6.直接選取各筆資料 data[1:] (list切割方式)

Source: pandas 資料處理

iloc

data.iloc[2] --> 第三個row (第三筆資料,橫的)

data.iloc[2:,0:2] --> 切割區塊(row,colums),抓取第一筆之後所有資料,但只包含第一、二項特徵

data.iloc[1:4,:] --> 以row選取

data.iloc[:,1:4] --> 以colums選取

iloc[0] --> Series

iloc[[0]] --> DataFrame

Source:Python Pandas DataFrame 的 iloc[0] 與 iloc[[0]]

讀取

import pandas as pd
df = pd.read_csv("City_Zhvi_AllHomes.csv")
df.head(15)
# df.地址,在read_excel()進來時,第一列就會被當成他的屬性。所以df.地址就抓到地址這一行

https://tinycorner.tw/2018/04/24/使用python讀取csv和xlsx檔/

https://tinycorner.tw/2018/04/24/使用python讀取csv和xlsx檔/

處理空值

test['name'] = test['name'].fillna('我是空的!')

判斷nan

import math
x = float('nan')
math.isnan(x)  # True

四捨五入

df = pd.DataFrame(np.random.random([3, 3]),
     columns=['A', 'B', 'C'], index=['first', 'second', 'third']

df.round(2)  # 取至小數點第二位

https://blog.csdn.net/shuibuzhaodeshiren/article/details/84324885

Series轉換

scores = pd.Series({'小明':90, '小華':80, '小李':70, '小強':55})
# 把Series轉成Dataframe只需要使用指令就好
score_df = scores.to_frame() #搞定

https://medium.com/@weilihmen/python-pandas-%E5%9F%BA%E6%9C%AC%E6%93%8D%E4%BD%9C%E6%95%99%E5%AD%B8-%E6%88%90%E7%B8%BE%E8%A1%A8-f6d0ec4f89

轉置矩陣

df.transpose()

http://violin-tao.blogspot.com/2017/06/pandas-1-indexing.html

row

用loop讀取

for (index_label, row_series) in empDfObj.iterrows():
    print('Row Index label : ', index_label)
    print('Row Content as Series : ', row_series.values)

https://thispointer.com/pandas-6-different-ways-to-iterate-over-rows-in-a-dataframe-update-while-iterating-row-by-row/

直接轉成numpy array

df.to_numpy()

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_numpy.html#pandas.DataFrame.to_numpy

colum

# 新增
df.insert(loc, column, value)

#刪除
del df['A']

https://www.delftstack.com/zh-tw/howto/python-pandas/how-to-delete-pandas-dataframe-column/

dataframe轉dict

可是value會是list ([0])

dataframe.to_dict('list')
# {key: [value]}

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_dict.html

https://stackoverflow.com/questions/26716616/convert-a-pandas-dataframe-to-a-dictionary

日期轉換

df['DATE'] = df['Date'].apply(lambda x: x.strftime('%Y-%m-%d'))

https://t.codebug.vip/questions-1537063.htm

CSV

1.讀取csv : inputfile=open('filepath','r')

2.讀取內容 : data=csv.reader(inputfile) (list, 包含欄位名稱(第一項[0]))

Source : CSV讀寫

Python之xlsx檔案與csv檔案相互轉換

Workbook轉csv

from openpyxl import Workbook
for sheet in wb.get_sheet_names():
    with open("export/" + file_csv, 'w', newline='') as op:
        csv_file_writer = csv.writer(op)
        sheet_ranges = wb[sheet]
        for row in sheet_ranges.rows:
            row_container = []
            for cell in row:
                if cell.value == None:
                    row_container.append('')
                else:
                    row_container.append(str(cell.value).replace('\n', ''))

            csv_file_writer.writerow(row_container)

https://codertw.com/%E7%A8%8B%E5%BC%8F%E8%AA%9E%E8%A8%80/434360/

https://blog.csdn.net/qq_33689414/article/details/78307031

https://codertw.com/%E7%A8%8B%E5%BC%8F%E8%AA%9E%E8%A8%80/369201/

issue: 輸出csv多一行空白

writefile = open('result.csv','w',newline =‘’)
writer = csv.writer(writefile)

https://blog.csdn.net/pfm685757/article/details/47806469

openpyxl

import openpyxl 

wb = openpyxl.Workbook() 
sheet = wb.active
sheet.title = "New Title"

# 新增資料表
ws1 = wb.create_sheet("Mysheet")

sheet.cell(row = 2, column = 2).value = ' everyone '

# 設定欄位長度與寬度
sheet.row_dimensions[1].height = 70
sheet.column_dimensions['B'].width = 20

https://www.geeksforgeeks.org/python-adjusting-rows-and-columns-of-an-excel-file-using-openpyxl-module/

https://www.itread01.com/content/1544850004.html

自動調整寬度尚未測試

Last updated