Python Data Analysis
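An index makes looking up data more efficient. For a data object that has no explicit index, pandas uses natural numbers starting from 0 as the default index, as in the following example: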
    import pandas as pd

    df = pd.read_excel(r"D:\PythonTestFile\exam_no_head.xlsx", header=None)
    print(df, '\n')
The output is:

              0       1   2   3   4
    0      Noah    male  90  50  66
    1      Emma  female  56  56  55
    2      Noah    male  90  50  66
    3    Olivia  female  86  87  44
    4      Liam    male  55  88  69
    5    Sophia  female  90  66  96
    6      Liam    male  55  88  69
    7  Isabella  female  66  85  55
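As a minimal sketch of the default index, the snippet below builds a small table in memory rather than reading the Excel file (the inline rows are an assumption that mirrors the first three rows of the output above) and inspects the labels pandas assigns when none are given.

```python
import pandas as pd

# Hypothetical inline data standing in for exam_no_head.xlsx (first three rows only).
rows = [
    ["Noah", "male", 90, 50, 66],
    ["Emma", "female", 56, 56, 55],
    ["Noah", "male", 90, 50, 66],
]
df = pd.DataFrame(rows)

# With no labels supplied, both axes get a RangeIndex counting from 0.
print(df.index)    # RangeIndex(start=0, stop=3, step=1)
print(df.columns)  # RangeIndex(start=0, stop=5, step=1)

# Label-based lookup therefore uses these integers: row 0, column 0.
first_name = df.loc[0, 0]
print(first_name)  # Noah
```

The same integers appear as the row and column headers in the printed table above.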
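You can also give a data object an explicit index: assign to df.columns to set the column index, and assign to df.index to set the row index. For example: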
    import pandas as pd

    df = pd.read_excel(r"D:\PythonTestFile\exam_no_head.xlsx", header=None)
    print(df, '\n')
    df.columns = ['Name', 'Sex', 'Chinese', 'English', 'Math']  # set the column index
    df.index = [1, 2, 3, 4, 5, 6, 7, 8]                         # set the row index
    print(df)
The output is:

              0       1   2   3   4
    0      Noah    male  90  50  66
    1      Emma  female  56  56  55
    2      Noah    male  90  50  66
    3    Olivia  female  86  87  44
    4      Liam    male  55  88  69
    5    Sophia  female  90  66  96
    6      Liam    male  55  88  69
    7  Isabella  female  66  85  55

           Name     Sex  Chinese  English  Math
    1      Noah    male       90       50    66
    2      Emma  female       56       56    55
    3      Noah    male       90       50    66
    4    Olivia  female       86       87    44
    5      Liam    male       55       88    69
    6    Sophia  female       90       66    96
    7      Liam    male       55       88    69
    8  Isabella  female       66       85    55
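Labels can also be supplied when the table is created, and an assigned label list has to match the length of the axis. The sketch below illustrates both points on a three-row table built in memory; the inline rows and the variable names are assumptions for illustration, not part of the original example.

```python
import pandas as pd

# Hypothetical inline data instead of the Excel file.
rows = [
    ["Noah", "male", 90, 50, 66],
    ["Emma", "female", 56, 56, 55],
    ["Olivia", "female", 86, 87, 44],
]
labels = ["Name", "Sex", "Chinese", "English", "Math"]

# Supply column and row labels when the DataFrame is created.
df = pd.DataFrame(rows, columns=labels, index=[1, 2, 3])
print(df.loc[2, "Name"])  # Emma

# Assigning a label list of the wrong length is rejected with a ValueError.
try:
    df.index = [1, 2]  # three rows, but only two labels
    length_checked = False
except ValueError:
    length_checked = True
print(length_checked)  # True
```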
In practice, you sometimes need to make one of the table's columns the new row index. The set_index() method does this.
    import pandas as pd

    df = pd.read_excel(r"D:\PythonTestFile\exam.xlsx")
    print(df, '\n')
    print(df.set_index('Name'))
The output is:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2      Noah    male       90       50    66
    3    Olivia  female       86       87    44
    4      Liam    male       55       88    69
    5    Sophia  female       90       66    96
    6      Liam    male       55       88    69
    7  Isabella  female       66       85    55

                 Sex  Chinese  English  Math
    Name
    Noah        male       90       50    66
    Emma      female       56       56    55
    Noah        male       90       50    66
    Olivia    female       86       87    44
    Liam        male       55       88    69
    Sophia    female       90       66    96
    Liam        male       55       88    69
    Isabella  female       66       85    55
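The example above prints the result of set_index() without storing it. The following sketch, which uses a small hypothetical table built in memory, shows that set_index() returns a new DataFrame by default, that rows can then be looked up by label, and that drop=False keeps the column in place as well.

```python
import pandas as pd

# Hypothetical inline data instead of the Excel file.
df = pd.DataFrame({
    "Name": ["Noah", "Emma", "Olivia"],
    "Sex": ["male", "female", "female"],
    "Math": [66, 55, 44],
})

# set_index() returns a new DataFrame; df itself keeps its 0..n-1 index.
by_name = df.set_index("Name")
print(list(df.columns))       # ['Name', 'Sex', 'Math']
print(list(by_name.columns))  # ['Sex', 'Math']

# Rows can now be looked up by name.
emma_math = by_name.loc["Emma", "Math"]
print(emma_math)  # 55

# drop=False keeps 'Name' as a regular column as well as the index.
kept = df.set_index("Name", drop=False)
print("Name" in kept.columns)  # True
```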
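Strictly speaking, pandas does not require index values to be unique, which is why the previous example could use Name as the index even though Noah and Liam each appear twice. A column with repeated values is usually a poor index, however, because looking up one label then returns several rows. One way to narrow the key is to use several columns together as the index, which is called a hierarchical index (a MultiIndex in pandas). This does not guarantee uniqueness by itself: in this table the repeated rows have the same Name and the same Sex. To build a hierarchical index, pass the column names to set_index() as a list: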
    import pandas as pd

    df = pd.read_excel(r"D:\PythonTestFile\exam.xlsx")
    print(df, '\n')
    print(df.set_index(['Name', 'Sex']))  # use Name and Sex together as the index
The output is:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2      Noah    male       90       50    66
    3    Olivia  female       86       87    44
    4      Liam    male       55       88    69
    5    Sophia  female       90       66    96
    6      Liam    male       55       88    69
    7  Isabella  female       66       85    55

                     Chinese  English  Math
    Name     Sex
    Noah     male         90       50    66
    Emma     female       56       56    55
    Noah     male         90       50    66
    Olivia   female       86       87    44
    Liam     male         55       88    69
    Sophia   female       90       66    96
    Liam     male         55       88    69
    Isabella female       66       85    55
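To check whether an index is actually unique, one option is the is_unique attribute, and set_index() can be asked to reject duplicates with verify_integrity=True. The sketch below uses a hypothetical four-row table in which one row is repeated, as in the exam data; removing duplicates with drop_duplicates() first is only one possible way to obtain a unique key.

```python
import pandas as pd

# Hypothetical inline data; the first and third rows are identical.
df = pd.DataFrame({
    "Name": ["Noah", "Emma", "Noah", "Olivia"],
    "Sex": ["male", "female", "male", "female"],
    "Math": [66, 55, 66, 44],
})

# Neither Name alone nor (Name, Sex) is unique in this data.
multi = df.set_index(["Name", "Sex"])
print(multi.index.nlevels)    # 2
print(multi.index.is_unique)  # False

# verify_integrity=True makes set_index() raise instead of accepting duplicates.
try:
    df.set_index(["Name", "Sex"], verify_integrity=True)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True

# After dropping the repeated row, the two-level key is unique.
unique = df.drop_duplicates().set_index(["Name", "Sex"], verify_integrity=True)
emma_math = unique.loc[("Emma", "female"), "Math"]
print(int(emma_math))  # 55
```

With a unique hierarchical index, a tuple with one value per level identifies exactly one row.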
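Use the rename() method to change index labels. Its columns and index parameters each take a mapping from old label to new label, for the column index and the row index respectively.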
    import pandas as pd

    df = pd.read_excel(r"D:\PythonTestFile\exam.xlsx")
    print(df, '\n')
    print(df.rename(columns={'Name': 'NewName'},
                    index={0: 'first', 1: 'second', 2: 'third'}))
The output is:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2      Noah    male       90       50    66
    3    Olivia  female       86       87    44
    4      Liam    male       55       88    69
    5    Sophia  female       90       66    96
    6      Liam    male       55       88    69
    7  Isabella  female       66       85    55

             NewName     Sex  Chinese  English  Math
    first       Noah    male       90       50    66
    second      Emma  female       56       56    55
    third       Noah    male       90       50    66
    3         Olivia  female       86       87    44
    4           Liam    male       55       88    69
    5         Sophia  female       90       66    96
    6           Liam    male       55       88    69
    7       Isabella  female       66       85    55
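A few details of rename() are easy to miss: it returns a new object, labels that are not in the mapping are left unchanged, and a function can be passed instead of a dictionary. The sketch below illustrates these on a hypothetical two-row table; the label 'Score' is deliberately one that does not exist.

```python
import pandas as pd

# Hypothetical inline data instead of the Excel file.
df = pd.DataFrame({"Name": ["Noah", "Emma"], "Math": [66, 55]})

# rename() returns a new object; labels missing from the mapping are left alone.
renamed = df.rename(columns={"Name": "NewName"}, index={0: "first"})
print(list(renamed.columns))  # ['NewName', 'Math']
print(list(renamed.index))    # ['first', 1]
print(list(df.columns))       # ['Name', 'Math'] (original unchanged)

# A function can be used instead of a mapping.
lowered = df.rename(columns=str.lower)
print(list(lowered.columns))  # ['name', 'math']

# Unknown keys are ignored by default; errors='raise' turns them into a KeyError.
try:
    df.rename(columns={"Score": "Total"}, errors="raise")
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```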
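Resetting the index turns index columns back into ordinary columns, and is often used with hierarchical indexes. It is done with the reset_index() method, which typically involves three parameters:
level  Specifies which level or levels of a hierarchical index to reset, by position or by name. Positions count from 0, so the first index level is level 0 and the second is level 1. By default all levels are reset.
drop  Specifies whether to discard the reset index instead of keeping it as ordinary columns. The default is False (keep it).
inplace  Specifies whether to modify the original table. The default is False (return a new table and leave the original unchanged).
Here is an example: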
    import pandas as pd

    df = pd.read_excel(r"D:\PythonTestFile\exam.xlsx")
    # sep='\n' puts each table on its own lines so the header row stays aligned
    print('Original table:', df, sep='\n', end='\n\n')
    df.set_index(['Name', 'Sex'], inplace=True)  # make Name and Sex the row index, modifying df itself
    print('After setting the row index:', df, sep='\n', end='\n\n')
    print('After the default reset:', df.reset_index(), sep='\n', end='\n\n')
    print('After resetting the second index level:', df.reset_index(level=1), sep='\n', end='\n\n')
    print('Without keeping the reset index columns:', df.reset_index(drop=True), sep='\n')

The output is:

    Original table:
           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2      Noah    male       90       50    66
    3    Olivia  female       86       87    44
    4      Liam    male       55       88    69
    5    Sophia  female       90       66    96
    6      Liam    male       55       88    69
    7  Isabella  female       66       85    55

    After setting the row index:
                     Chinese  English  Math
    Name     Sex
    Noah     male         90       50    66
    Emma     female       56       56    55
    Noah     male         90       50    66
    Olivia   female       86       87    44
    Liam     male         55       88    69
    Sophia   female       90       66    96
    Liam     male         55       88    69
    Isabella female       66       85    55

    After the default reset:
           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2      Noah    male       90       50    66
    3    Olivia  female       86       87    44
    4      Liam    male       55       88    69
    5    Sophia  female       90       66    96
    6      Liam    male       55       88    69
    7  Isabella  female       66       85    55

    After resetting the second index level:
                 Sex  Chinese  English  Math
    Name
    Noah        male       90       50    66
    Emma      female       56       56    55
    Noah        male       90       50    66
    Olivia    female       86       87    44
    Liam        male       55       88    69
    Sophia    female       90       66    96
    Liam        male       55       88    69
    Isabella  female       66       85    55

    Without keeping the reset index columns:
       Chinese  English  Math
    0       90       50    66
    1       56       56    55
    2       90       50    66
    3       86       87    44
    4       55       88    69
    5       90       66    96
    6       55       88    69
    7       66       85    55
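The level parameter also accepts a level name, which can be easier to read than a position. The sketch below, on a hypothetical three-row table built in memory, checks that level=1 and level='Sex' give the same result, that a full reset restores the original table, and that drop=True discards the index values.

```python
import pandas as pd

# Hypothetical inline data instead of the Excel file.
df = pd.DataFrame({
    "Name": ["Noah", "Emma", "Olivia"],
    "Sex": ["male", "female", "female"],
    "Math": [66, 55, 44],
})
indexed = df.set_index(["Name", "Sex"])

# Levels are numbered from 0, so level=1 and level='Sex' select the same level.
by_position = indexed.reset_index(level=1)
by_name = indexed.reset_index(level="Sex")
print(by_position.equals(by_name))  # True
print(list(by_name.columns))        # ['Sex', 'Math']
print(by_name.index.name)           # Name

# A full reset turns both levels back into columns and restores the 0..n-1 index.
restored = indexed.reset_index()
print(restored.equals(df))  # True

# drop=True discards the index values instead of turning them into columns.
dropped = indexed.reset_index(drop=True)
print(list(dropped.columns))  # ['Math']
```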
Copyright © 2017-Now pnotes.cn. All Rights Reserved.