
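A Python pandas DataFrame is a table-like data structure made up of rows and columns. Each column holds data of a single type, and you can use the row index to iterate over the elements of a column. This article shows how to create a pandas DataFrame object and how to read its column and row data.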
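As a quick illustration of the "one type per column" point, the sketch below builds a small DataFrame and inspects the data type of each column. The sample values are hypothetical and only mirror the data used later in this article; the exact dtype names that pandas reports can differ between pandas versions and platforms.

```python
import pandas as pd

# Hypothetical sample data, chosen to mirror the examples in this article.
df = pd.DataFrame({
    'Position': [1, 5, 8],
    'Programming Language': ['python', 'java', 'c++'],
})

# dtypes reports one data type per column.
print(df.dtypes)

# Read a single element of a column by its 0-based row position.
first_language = df['Programming Language'].iloc[0]
print(first_language)
```

Here `iloc` selects by position, so it works regardless of which row labels the DataFrame uses.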
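1. How to create a pandas DataFrame object.
- Call the pandas module's DataFrame(data, index=index, columns=columns) constructor to create a pandas DataFrame object.
- The data parameter holds the DataFrame's data. It can be a 2-dimensional array or a Python dictionary object.
- The index parameter gives the row index labels of the DataFrame. It is a Python list object.
- The columns parameter gives the column labels of the DataFrame. You can use each column label to get that column's data as a pandas.Series object.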
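The three parameters described above can be sketched in a few lines. This is a minimal illustration, not the article's own example: the row labels 'a' and 'b' and the shortened column names are assumptions picked to make the roles of index and columns easy to tell apart.

```python
import pandas as pd

# data: a 2-dimensional list, one inner list per row.
data = [[1, 'python'], [5, 'java']]

# index labels the rows, columns labels the columns (hypothetical labels).
df = pd.DataFrame(data, index=['a', 'b'], columns=['Position', 'Language'])

# Selecting by column label returns that column as a pandas.Series.
language_column = df['Language']
print(type(language_column).__name__)
print(list(df.index))
print(list(df.columns))
```

The resulting Series keeps the row labels, so `language_column['b']` looks up the value in row 'b'.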
1.1 Create a pandas DataFrame object from a 2-dimensional array.
- The example below creates a pandas DataFrame object from a 2-dimensional array.
    import pandas as pd


    def create_dataframe_from_2_dimension_array():
        '''Create a pandas DataFrame object from a 2-dimensional array.'''
        pd.set_option('display.unicode.east_asian_width', True)

        # Define a 2-dimensional array. Each element of the first dimension
        # is a list that contains the position number, the programming
        # language and the operating system.
        data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']]

        # Define the column list, each element in the list is a column label.
        columns = ['Position', 'Programming Language', 'Operating System']

        # Define the row index list.
        index = [1, 2, 3]

        # Create the pandas DataFrame object (note the capital F in DataFrame).
        df = pd.DataFrame(data, index=index, columns=columns)

        # Print out the DataFrame object data.
        print(df)

        # Return the pandas DataFrame object.
        return df

- When you run the function above, it prints the data below to the console.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
1.2 Create a pandas DataFrame object from a Python dictionary object.
- The example below creates a pandas DataFrame object from a Python dictionary object.
    import pandas as pd


    def create_dataframe_from_dictionary_object():
        '''Create a pandas DataFrame object from a Python dictionary object.'''
        pd.set_option('display.unicode.east_asian_width', True)

        # Define a Python dictionary object. Each key is a column name and
        # each value is a list that holds that column's value for every row.
        dict_obj = {'Position': [1, 5, 8],
                    'Programming Language': ['python', 'java', 'c++'],
                    'Operating System': ['Windows', 'Linux', 'macOS']}

        # Create a list object to store the row index numbers.
        index = [1, 2, 3]

        # Create the pandas DataFrame object (note the capital F in DataFrame).
        df = pd.DataFrame(dict_obj, index=index)

        # Print the DataFrame object's data in the console.
        print(df)

        # Return the created DataFrame object.
        return df


    # The function already prints the DataFrame, so call it without print().
    create_dataframe_from_dictionary_object()

- Below is the result of running the example function above in the console.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
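2.1 Iterate over DataFrame columns.
- The columns attribute of a pandas DataFrame object returns the labels of all the DataFrame's columns as a pandas Index object.
- You can iterate over the returned column labels and use each label to get the column data as a pandas Series object. Below is an example.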
    def iterate_dataframe_object(dataframe_object):
        '''Iterate over the DataFrame columns and print each column's data.'''
        print('=================== iterate_dataframe_object ======================')

        # Loop over the DataFrame object's column labels.
        for column in dataframe_object.columns:
            # Print out the column name.
            print(column)

            # Get the column data as a pandas Series object.
            column_data_series = dataframe_object[column]

            # Print out the column data Series object.
            print(column_data_series)
            print('=======================================')


    if __name__ == '__main__':
        # create_dataframe_from_dictionary_object() is defined in section 1.2.
        #create_dataframe_from_2_dimension_array()
        df = create_dataframe_from_dictionary_object()
        iterate_dataframe_object(df)

- Below is the result of running the example above.
    =================== iterate_dataframe_object ======================
    Position
    1    1
    2    5
    3    8
    Name: Position, dtype: int64
    =======================================
    Programming Language
    1    python
    2      java
    3       c++
    Name: Programming Language, dtype: object
    =======================================
    Operating System
    1    Windows
    2      Linux
    3      macOS
    Name: Operating System, dtype: object
    =======================================
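As an alternative to looping over the column labels and indexing the DataFrame yourself, pandas also provides DataFrame.items(), which yields (column label, Series) pairs directly. The sketch below is not part of the original example; the two-column sample data and the name `collected` are assumptions used only for illustration.

```python
import pandas as pd

# Hypothetical sample data that mirrors two of the article's columns.
df = pd.DataFrame(
    {'Position': [1, 5, 8], 'Operating System': ['Windows', 'Linux', 'macOS']},
    index=[1, 2, 3],
)

# items() yields one (column label, column Series) pair per column.
collected = {}
for label, series in df.items():
    collected[label] = series.tolist()

print(collected)
```

Both approaches visit the columns in the same order, so the choice is mostly a matter of readability.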
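2.2 Iterate over DataFrame rows.
- You can call the iterrows() method of a pandas DataFrame object to get an iterator over the DataFrame's rows.
- You can then call the Python next() function on the iterator to step through its items and get the data of each DataFrame row. Below is the example source code.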
    '''
    Created on Oct 23, 2021

    @author: songzhao
    '''
    import pandas as pd


    def create_dataframe_from_dictionary_object():
        '''Create a pandas DataFrame object from a Python dictionary object.'''
        pd.set_option('display.unicode.east_asian_width', True)

        # Define a Python dictionary object. Each key is a column name and
        # each value is a list that holds that column's value for every row.
        dict_obj = {'Position': [1, 5, 8],
                    'Programming Language': ['python', 'java', 'c++'],
                    'Operating System': ['Windows', 'Linux', 'macOS']}

        # Create a list object to store the row index numbers.
        index = [1, 2, 3]

        # Create the pandas DataFrame object (note the capital F in DataFrame).
        df = pd.DataFrame(dict_obj, index=index)

        # Print the DataFrame object's data in the console.
        print(df)

        # Return the created DataFrame object.
        return df


    def iterate_dataframe_rows(df_obj):
        '''Iterate over the DataFrame rows and print each row's data.'''
        print('=================== iterate_dataframe_rows ======================')

        # iterrows() returns an iterator of (row label, row Series) pairs.
        iterator = df_obj.iterrows()

        # Get the first item, or None if the iterator has no rows.
        row = next(iterator, None)

        # Loop while there are rows left in the iterator.
        while row is not None:
            row_number = row[0]
            series_obj = row[1]

            print('row number = ', row_number)
            print(series_obj.index)
            print(series_obj.values)
            # Print a blank line between rows.
            print()

            # Get the next row from the iterator.
            row = next(iterator, None)


    if __name__ == '__main__':
        df = create_dataframe_from_dictionary_object()
        iterate_dataframe_rows(df)

- When you run the example source code above, you get the following output.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
    =================== iterate_dataframe_rows ======================
    row number =  1
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [1 'python' 'Windows']

    row number =  2
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [5 'java' 'Linux']

    row number =  3
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [8 'c++' 'macOS']
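The explicit next() calls make the iterator protocol visible, but the same traversal is usually written as a plain for loop that unpacks each (row label, Series) pair. The sketch below is one way to do that, not the article's original code; the sample data repeats the article's values and the name `rows` is an assumption for illustration.

```python
import pandas as pd

# Sample data repeating the values used in this article.
df = pd.DataFrame(
    {'Position': [1, 5, 8],
     'Programming Language': ['python', 'java', 'c++'],
     'Operating System': ['Windows', 'Linux', 'macOS']},
    index=[1, 2, 3],
)

rows = []
# Each iteration unpacks the row label and the row data (a pandas Series).
for row_label, row in df.iterrows():
    # Inside a row, values are looked up by column label.
    rows.append((row_label, row['Programming Language']))
    print(row_label, row['Programming Language'], row['Operating System'])
```

Note that each row comes back as a single Series, so a row that mixes numbers and strings is returned with a common dtype rather than the per-column dtypes of the DataFrame.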