Subclassing Pandas DataFrame: any updates?


This is how I've done it. I followed the advice found in:

  • Subclassing pandas data structures
  • Fixing the finalization issues

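The example below only shows how to construct and use a new subclass of pandas.DataFrame. If you follow the advice in the first reference above, you may also want to subclass pandas.Series, so that one-dimensional slices of your pandas.DataFrame subclass keep a matching type.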
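
The Series advice can be sketched roughly as follows. pandas documents the `_constructor_sliced` and `_constructor_expanddim` properties for linking a DataFrame subclass to a Series subclass. The names `SlicedFrame` and `SlicedSeries` are hypothetical and not part of the original answer, and metadata handling is left out to keep the sketch short.

```python
import pandas as pd


class SlicedSeries(pd.Series):
    """Hypothetical Series subclass returned for 1-D slices."""

    @property
    def _constructor(self):
        return SlicedSeries

    @property
    def _constructor_expanddim(self):
        # Used when a Series is expanded back into a frame (e.g. to_frame()).
        return SlicedFrame


class SlicedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass paired with SlicedSeries."""

    @property
    def _constructor(self):
        return SlicedFrame

    @property
    def _constructor_sliced(self):
        # Used when a single column or row is taken from the frame.
        return SlicedSeries


frame = SlicedFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
column = frame["A"]            # 1-D slice of the frame
round_trip = column.to_frame() # back to two dimensions

print(type(column).__name__, type(round_trip).__name__)
```

With both properties in place, column selection should yield the Series subclass and `to_frame()` should return the DataFrame subclass.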

Defining SomeData

    import pandas as pd
    import numpy as np

    class SomeData(pd.DataFrame):
        # This class variable tells Pandas the name of the attributes
        # that are to be ported over to derivative DataFrames.  There
        # is a method named `__finalize__` that grabs these attributes
        # and assigns them to newly created `SomeData`
        _metadata = ['my_attr']

        @property
        def _constructor(self):
            """This is the key to letting Pandas know how to keep
            derivative `SomeData` the same type as yours.  It should
            be enough to return the name of the Class.  However, in
            some cases, `__finalize__` is not called and `my_attr` is
            not carried over.  We can fix that by constructing a callable
            that makes sure to call `__finalize__` every time."""
            def _c(*args, **kwargs):
                return SomeData(*args, **kwargs).__finalize__(self)
            return _c

        def __init__(self, *args, **kwargs):
            # grab the keyword argument that is supposed to be my_attr
            self.my_attr = kwargs.pop('my_attr', None)
            super().__init__(*args, **kwargs)

        def my_method(self, other):
            return self * np.sign(self - other)

Demonstration

    mydata = SomeData(dict(A=[1, 2, 3], B=[4, 5, 6]), my_attr='an attr')
    print(mydata, type(mydata), mydata.my_attr, sep='\n' * 2)

       A  B
    0  1  4
    1  2  5
    2  3  6

    <class '__main__.SomeData'>

    an attr

    newdata = mydata.mul(2)
    print(newdata, type(newdata), newdata.my_attr, sep='\n' * 2)

       A   B
    0  2   8
    1  4  10
    2  6  12

    <class '__main__.SomeData'>

    an attr

    newerdata = mydata.my_method(newdata)
    print(newerdata, type(newerdata), newerdata.my_attr, sep='\n' * 2)

       A  B
    0 -1 -4
    1 -2 -5
    2 -3 -6

    <class '__main__.SomeData'>

    an attr
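
To see the `_metadata` mechanism on its own, here is a minimal sketch that returns the class directly from `_constructor` and relies on pandas calling `__finalize__` for column selection and `copy()`. The class name `TaggedFrame` and the attribute `source` are hypothetical. As the docstring in `SomeData` notes, not every operation calls `__finalize__`, so this simpler form may drop the attribute in other cases.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical subclass carrying one extra attribute."""

    # Attributes named here are copied by `__finalize__` to derived frames.
    _metadata = ["source"]

    @property
    def _constructor(self):
        # Returning the class keeps derived objects of the same type.
        return TaggedFrame


frame = TaggedFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
frame.source = "sensor-1"  # set after construction; no custom __init__ here

subset = frame[["A"]]      # column selection
duplicate = frame.copy()   # explicit copy

print(type(subset).__name__, subset.source, duplicate.source)
```

Compared with `SomeData`, this variant keeps `_constructor` a real type, at the cost of depending on pandas to call `__finalize__` wherever the attribute must survive.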

Gotchas

This approach breaks on pd.DataFrame.equals:

    newerdata.equals(newdata)  # Should be `False`

    TypeError                                 Traceback (most recent call last)
    in ()
    ----> 1 newerdata.equals(newdata)

    ~/anaconda3/envs/3.6.ml/lib/python3.6/site-packages/pandas/core/generic.py in equals(self, other)
       1034         the same location are considered equal.
       1035         """
    -> 1036         if not isinstance(other, self._constructor):
       1037             return False
       1038         return self._data.equals(other._data)

    TypeError: isinstance() arg 2 must be a type or tuple of types

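What happens is that, in the pandas version shown in the traceback, this method expects to find an object of type `type` in the `_constructor` attribute. Instead, it finds the callable I placed there to work around the `__finalize__` issue I ran into.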
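
The failure can be reproduced without pandas at all. The sketch below uses a hypothetical class `Frame` and a hypothetical function `frame_factory` as stand-ins for `SomeData` and the `_c` callable; it only illustrates how `isinstance()` treats a function as its second argument.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame subclass."""


def frame_factory():
    """Hypothetical stand-in for the `_c` callable returned by `_constructor`."""
    return Frame()


obj = Frame()

# A class is a valid second argument to isinstance().
check_against_class = isinstance(obj, Frame)

# A plain function is not a type, so the same check raises TypeError.
try:
    isinstance(obj, frame_factory)
    raised_type_error = False
except TypeError:
    raised_type_error = True

print(check_against_class, raised_type_error)  # prints: True True
```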

Workaround

Override the `equals` method by adding the following to the class definition.

        def equals(self, other):
            try:
                pd.testing.assert_frame_equal(self, other)
                return True
            except AssertionError:
                return False

    newerdata.equals(newdata)  # Should be `False`
    False


