python_note
  • Introduction
  • My Python
    • Anaconda
    • argparse
    • datetime
    • json
    • logging
    • numpy
    • open
    • openCC
    • pandas & csv
    • Socket & SocketServer
    • re
    • yaml
    • smtp
    • 物件操作
    • unittest
    • 線程
    • prettytable
    • IO
    • pycurl
    • sys
    • pickle
    • auto-python-to-exe
    • cython
    • nuitka
  • Crawler
    • Urllib & Requests
      • User-agent
      • Percent-Encoding
      • mail code
    • Selenium
    • TCP & UDP
    • 控制字符(control character)
  • Web Development
    • Flask
      • RESTful api
      • Template
      • blueprint
    • Django
      • 環境佈署(windows)
    • 檢查Port
    • Apache
    • 使用者行為
    • jQuery
    • 壓力測試
    • DataTable
    • Bootstrap
    • CSS
    • JavaScript
    • Chart.js
  • Deep Learning
    • Keras 設定
    • RNN
    • LSTM
  • Test
    • T-Test
  • 資料結構
    • Hash
    • 時間複雜度
  • NLP
    • N-gram
    • CKIP
    • 中文轉數字
    • CRF
    • Mutual Information
    • 模糊比對
  • Linebot
    • Heroku
    • 圖文選單
    • channel
  • Linux
    • 常用指令
    • shell script
    • sshfs
    • ssh
    • nodejs & npm
    • debug
  • GCP
    • app engine
    • ssh(gcp)
    • gsutil
    • brabrabra
    • Load Balancer
    • k8s
  • Database
    • mysql
    • elasticsearch
      • Query
      • Backup and Restore
      • elasticdump
      • es2csv
      • ELK
    • mongodb
      • install
      • authentication
      • pymongo
    • sql server
  • go
    • Swarm
  • Docker
    • Kitematic
    • Dockerfile
    • Swarm
  • Git
  • 其他
    • USB軟體保護
    • Windows效能監視器
  • Blockchain
Powered by GitBook
On this page
  • Pandas
  • iloc
  • 讀取
  • 處理空值
  • 判斷nan
  • 四捨五入
  • Series轉換
  • 轉置矩陣
  • row
  • colum
  • dataframe轉dict
  • 日期轉換
  • CSV
  • Python之xlsx檔案與csv檔案相互轉換
  • openpyxl

Was this helpful?

  1. My Python

pandas & csv

PreviousopenCCNextSocket & SocketServer

Last updated 5 years ago

Was this helpful?

Pandas

官方文件

1.讀取csv : data=pandas.read_csv('filename')

2.瀏覽資料內容 : data.head()

3.讀取特定欄位資料(列,直直的) : detail=data[['name1','name2']......]

3-1. 轉為一般陣列: data['name'].tolist()

3-2. 可用numpy將series或者dataframe型態轉為numpy.array

4.計算各欄位的多種統計資訊(count,mean,std,min,max,25%,50%,75%) : data.describe()

5.所有欄位名稱: data.colums

6.直接選取各筆資料 data[1:] (list切割方式)

Source:

iloc

data.iloc[2] --> 第三個row (第三筆資料,橫的)

data.iloc[2:,0:2] --> 切割區塊(row,colums),抓取第一筆之後所有資料,但只包含第一、二項特徵

data.iloc[1:4,:] --> 以row選取

data.iloc[:,1:4] --> 以colums選取

iloc[0] --> Series

iloc[[0]] --> DataFrame

讀取

import pandas as pd
df = pd.read_csv("City_Zhvi_AllHomes.csv")
df.head(15)
# df.地址,在read_excel()進來時,第一列就會被當成他的屬性。所以df.地址就抓到地址這一行

處理空值

test['name'] = test['name'].fillna('我是空的!')

判斷nan

import math
x = float('nan')
math.isnan(x)  # True

四捨五入

df = pd.DataFrame(np.random.random([3, 3]),
     columns=['A', 'B', 'C'], index=['first', 'second', 'third']

df.round(2)  # 取至小數點第二位

Series轉換

scores = pd.Series({'小明':90, '小華':80, '小李':70, '小強':55})
# 把Series轉成Dataframe只需要使用指令就好
score_df = scores.to_frame() #搞定

轉置矩陣

df.transpose()

row

用loop讀取

for (index_label, row_series) in empDfObj.iterrows():
    print('Row Index label : ', index_label)
    print('Row Content as Series : ', row_series.values)

直接轉成numpy array

df.to_numpy()

colum

# 新增
df.insert(loc, column, value)

#刪除
del df['A']

dataframe轉dict

可是value會是list ([0])

dataframe.to_dict('list')
# {key: [value]}

日期轉換

df['DATE'] = df['Date'].apply(lambda x: x.strftime('%Y-%m-%d'))

CSV

1.讀取csv : inputfile=open('filepath','r')

2.讀取內容 : data=csv.reader(inputfile) (list, 包含欄位名稱(第一項[0]))

Python之xlsx檔案與csv檔案相互轉換

Workbook轉csv

from openpyxl import Workbook
for sheet in wb.get_sheet_names():
    with open("export/" + file_csv, 'w', newline='') as op:
        csv_file_writer = csv.writer(op)
        sheet_ranges = wb[sheet]
        for row in sheet_ranges.rows:
            row_container = []
            for cell in row:
                if cell.value == None:
                    row_container.append('')
                else:
                    row_container.append(str(cell.value).replace('\n', ''))

            csv_file_writer.writerow(row_container)

issue: 輸出csv多一行空白

writefile = open('result.csv','w',newline =‘’)
writer = csv.writer(writefile)

openpyxl

import openpyxl 

wb = openpyxl.Workbook() 
sheet = wb.active
sheet.title = "New Title"

# 新增資料表
ws1 = wb.create_sheet("Mysheet")

sheet.cell(row = 2, column = 2).value = ' everyone '

# 設定欄位長度與寬度
sheet.row_dimensions[1].height = 70
sheet.column_dimensions['B'].width = 20

自動調整寬度尚未測試

Source:

Source :

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_numpy.html#pandas.DataFrame.to_numpy
pandas 資料處理
Python Pandas DataFrame 的 iloc[0] 與 iloc[[0]]
https://tinycorner.tw/2018/04/24/使用python讀取csv和xlsx檔/
https://tinycorner.tw/2018/04/24/使用python讀取csv和xlsx檔/
https://blog.csdn.net/shuibuzhaodeshiren/article/details/84324885
https://medium.com/@weilihmen/python-pandas-%E5%9F%BA%E6%9C%AC%E6%93%8D%E4%BD%9C%E6%95%99%E5%AD%B8-%E6%88%90%E7%B8%BE%E8%A1%A8-f6d0ec4f89
http://violin-tao.blogspot.com/2017/06/pandas-1-indexing.html
https://thispointer.com/pandas-6-different-ways-to-iterate-over-rows-in-a-dataframe-update-while-iterating-row-by-row/
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_numpy.html#pandas.DataFrame.to_numpy
https://www.delftstack.com/zh-tw/howto/python-pandas/how-to-delete-pandas-dataframe-column/
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_dict.html
https://stackoverflow.com/questions/26716616/convert-a-pandas-dataframe-to-a-dictionary
https://t.codebug.vip/questions-1537063.htm
CSV讀寫
https://codertw.com/%E7%A8%8B%E5%BC%8F%E8%AA%9E%E8%A8%80/434360/
https://blog.csdn.net/qq_33689414/article/details/78307031
https://codertw.com/%E7%A8%8B%E5%BC%8F%E8%AA%9E%E8%A8%80/369201/
https://blog.csdn.net/pfm685757/article/details/47806469
https://www.geeksforgeeks.org/python-adjusting-rows-and-columns-of-an-excel-file-using-openpyxl-module/
https://www.itread01.com/content/1544850004.html