How to Use DataFrame in Pandas

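A pandas DataFrame is a table-like data structure object in Python. It contains rows and columns, and each column holds data of the same type. For each column, you can use the row index to iterate over the column's elements. This article shows how to create a pandas DataFrame object and how to get its column and row data.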

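1. How to create a Pandas DataFrame object.
  1. Call the pandas module's DataFrame(data, index=index, columns=columns) constructor to create a pandas DataFrame object.
  2. The data parameter holds the data of the DataFrame object. It can be a 2-dimensional array or a Python dictionary object.
  3. The index parameter holds the row index labels of the DataFrame object, given here as a Python list object.
  4. The columns parameter holds the column labels of the DataFrame object. You can use each column label to get that column's data from the DataFrame as a pandas.Series object.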
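
The constructor parameters described above can be sketched as follows. This is a minimal illustration, not the only way to call the constructor; the sample data and the variable names (data, df_default, df_labeled, position) are assumptions made up for this sketch.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
data = [[1, 'python'], [5, 'java']]

# When index and columns are omitted, pandas numbers rows and columns from 0.
df_default = pd.DataFrame(data)
print(list(df_default.index))    # [0, 1]
print(list(df_default.columns))  # [0, 1]

# With explicit row index labels and column labels.
df_labeled = pd.DataFrame(data, index=[1, 2],
                          columns=['Position', 'Programming Language'])

# Selecting one column by its label returns a pandas.Series object.
position = df_labeled['Position']
print(type(position))
print(position.tolist())
```

Passing explicit labels is optional, but it makes later lookups by label easier to read.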

1.1 Create a Pandas DataFrame object from a 2-dimensional array.

  1. The example below creates a pandas DataFrame object from a 2-dimensional array.
    import pandas as pd


    def create_dataframe_from_2_dimension_array():
        """Create a pandas DataFrame object from a 2-dimensional array."""

        # Use Unicode East Asian width rules so printed columns stay aligned
        # when the data contains wide characters.
        pd.set_option('display.unicode.east_asian_width', True)

        # Define a 2-dimensional array. Each inner list is one row that holds
        # the position number, the programming language, and the operating system.
        data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']]

        # Define the column list. Each element in the list is a column label.
        columns = ['Position', 'Programming Language', 'Operating System']

        # Define the row index list.
        index = [1, 2, 3]

        # Create the pandas DataFrame object.
        df = pd.DataFrame(data, index=index, columns=columns)

        # Print out the DataFrame object data.
        print(df)

        # Return the pandas DataFrame object.
        return df

  2. When you run the function above, it prints the following data to the console.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
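
A 2-dimensional array does not have to be a list of lists. As a hedged sketch, the same constructor also accepts a NumPy array; the numeric data and the 'Score' column below are hypothetical and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data: 3 rows and 2 columns.
array_2d = np.array([[1, 10], [5, 50], [8, 80]])

# Build the DataFrame from the NumPy array with explicit labels.
df = pd.DataFrame(array_2d, index=[1, 2, 3], columns=['Position', 'Score'])

# shape is a (row count, column count) tuple.
print(df.shape)
print(df['Score'].tolist())
```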

1.2 Create a Pandas DataFrame object from a Python dictionary object.

  1. The example below creates a pandas DataFrame object from a Python dictionary object.
    import pandas as pd


    def create_dataframe_from_dictionary_object():
        """Create a pandas DataFrame object from a Python dictionary object."""

        pd.set_option('display.unicode.east_asian_width', True)

        # Define a Python dictionary object. Each key is a column name, and each
        # value is a list that contains the column's value for every row.
        dict_obj = {'Position': [1, 5, 8], 'Programming Language': ['python', 'java', 'c++'],
                    'Operating System': ['Windows', 'Linux', 'macOS']}

        # Create a list object to store the row index labels.
        index = [1, 2, 3]

        # Create the pandas DataFrame object.
        df = pd.DataFrame(dict_obj, index=index)

        # Print the DataFrame object's data in the console.
        print(df)

        # Return the created DataFrame object.
        return df


    # The function prints the DataFrame itself, so no extra print() is needed here.
    create_dataframe_from_dictionary_object()
    

  2. Below is the console output of the example function above.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
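
When the data is a dictionary, the index and columns parameters are both optional. The sketch below reuses the dictionary from the example above; the variable names df_default_index and df_subset are assumptions introduced for illustration.

```python
import pandas as pd

dict_obj = {'Position': [1, 5, 8],
            'Programming Language': ['python', 'java', 'c++'],
            'Operating System': ['Windows', 'Linux', 'macOS']}

# Without an index argument, the rows are labeled 0, 1, 2.
df_default_index = pd.DataFrame(dict_obj)
print(list(df_default_index.index))

# With dictionary data, the columns argument selects and orders the keys to keep.
df_subset = pd.DataFrame(dict_obj, index=[1, 2, 3],
                         columns=['Operating System', 'Position'])
print(list(df_subset.columns))
```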

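2. How to iterate over a Pandas DataFrame object.

2.1 Iterate over the DataFrame columns.

  1. The columns attribute of a pandas DataFrame object returns all of the DataFrame's column labels as a pandas Index object, which you can iterate over like a list.
  2. You can then iterate over the returned column labels and get each column's data as a pandas Series object. Below is an example.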
    def iterate_dataframe_object(dataframe_object):
        """Iterate over the DataFrame columns and print each column's data."""
        
        
        print('=================== iterate_dataframe_object ======================')
        
        
        # Loop the Dataframe object's columns.
        for column in dataframe_object.columns:
            
            # Print out the column name.
            print(column)
            
            # Get the column data in a pandas Series object.
            column_data_series = dataframe_object[column]
            
            # Print out the column data Series object.
            print(column_data_series)
    
            print('=======================================')
    
    
    if __name__ == '__main__':
        
        #create_dataframe_from_2_dimension_array()
        
        df = create_dataframe_from_dictionary_object()
        
        iterate_dataframe_object(df)

  3. Below is the output of the example above.
    =================== iterate_dataframe_object ======================
    Position
    1    1
    2    5
    3    8
    Name: Position, dtype: int64
    =======================================
    Programming Language
    1    python
    2      java
    3       c++
    Name: Programming Language, dtype: object
    =======================================
    Operating System
    1    Windows
    2      Linux
    3      macOS
    Name: Operating System, dtype: object
    =======================================
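
As an alternative to looping over the column labels and indexing the DataFrame, the items() method yields each column label together with its Series. The following is a minimal sketch with a shortened, hypothetical two-column version of the sample data; column_values is a name introduced here.

```python
import pandas as pd

df = pd.DataFrame({'Position': [1, 5, 8],
                   'Programming Language': ['python', 'java', 'c++']},
                  index=[1, 2, 3])

# items() yields (column label, pandas Series) pairs, one pair per column.
column_values = {}
for label, series in df.items():
    column_values[label] = series.tolist()

print(column_values)
```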

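2.2 Iterate over the DataFrame rows.

  1. You can call the iterrows() method of a pandas DataFrame object to get a row iterator for the DataFrame.
  2. You can then call the Python next() function on the iterator to step through its items and get each row of the DataFrame. Each item is a tuple that holds the row index label and the row data as a pandas Series object. Below is the example source code.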
    '''
    Created on Oct 23, 2021
    
    @author: songzhao
    '''
    
    import pandas as pd
    
    '''
    This function create a python pandas Dataframe object with a python dictionary object.
    '''
    def create_dataframe_from_dictionary_object():
        
        pd.set_option('display.unicode.east_asian_width', True)
        '''
        Define a python dictionary object, the key is the column name, the value is a list that contains the column value of each row in the column.
        '''
        dict_obj = {'Position':[1, 5, 8], 'Programming Language':['python', 'java', 'c++'], 'Operating System':['Windows', 'Linux', 'macOS']}
        
        # Create a list object to store the row index number.
        index = [1, 2, 3]
        
        # Create the python pandas Dataframe object 
        df = pd.DataFrame(dict_obj, index=index)
        
        # Print the Dataframe object's data in the console.
        print(df)
        
        # Return the created Dataframe object.
        return df
            
        
    '''
    This function will iterate the Dataframe object rows and print each row data.
    '''  
    def iterate_dataframe_rows(df_obj):
        
        print('=================== iterate_dataframe_rows ======================')
        
        # Call the Dataframe object's iterrows() function to get row iterator.
        iterator = df_obj.iterrows()
        
        # Get the next item in the iterator.
        row = next(iterator, None)
       
        # While there are rows in the iterator.
        while row is not None:
            
            row_number = row[0]
            
            series_obj = row[1]
            
            print('row number = ', row_number)
            
            print(series_obj.index)
            
            print(series_obj.values)
            
            print('\r\n')
            
            # Get the next row from the iterator.
            row = next(iterator, None)
                
    
    if __name__ == '__main__':
        
        df = create_dataframe_from_dictionary_object()
        
        iterate_dataframe_rows(df)

  3. When you run the example source code above, you get the following output.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
    =================== iterate_dataframe_rows ======================
    row number =  1
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [1 'python' 'Windows']
    
    
    row number =  2
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [5 'java' 'Linux']
    
    
    row number =  3
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [8 'c++' 'macOS']
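
The explicit next() calls above can also be written as a for loop, which calls next() on the iterator internally. The sketch below additionally shows itertuples() and label-based access with loc as alternatives; the shortened two-column data and all variable names are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Position': [1, 5, 8],
                   'Programming Language': ['python', 'java', 'c++']},
                  index=[1, 2, 3])

# A for loop unpacks each (row index label, row Series) item from iterrows().
rows_from_iterrows = []
for row_label, row_series in df.iterrows():
    rows_from_iterrows.append((row_label, row_series.tolist()))

# itertuples(name=None) yields plain tuples: (index label, value, value, ...).
rows_from_itertuples = list(df.itertuples(index=True, name=None))

# loc returns a single row as a Series when given one row index label.
second_row = df.loc[2]

print(rows_from_iterrows)
print(rows_from_itertuples)
print(second_row['Programming Language'])
```

Because iterrows() builds a Series for every row, itertuples() is usually the lighter choice when only the values are needed.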

Feel free to share. When reposting, please credit the source: 内存溢出

Original URL: https://54852.com/zaji/5562971.html