
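Here is my approach. I followed the advice I found on:
- subclassing pandas data structures
- fixing the finalize issue

The example below only shows the use of a new subclass of pandas.DataFrame. If you follow the advice in my first link, you may also want to subclass pandas.Series so that one-dimensional slices of your pandas.DataFrame subclass are handled as well.

Defining SomeData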
    import pandas as pd
    import numpy as np

    class SomeData(pd.DataFrame):
        # This class variable tells pandas the names of the attributes
        # that are to be ported over to derivative DataFrames. There
        # is a method named `__finalize__` that grabs these attributes
        # and assigns them to newly created `SomeData`
        _metadata = ['my_attr']

        @property
        def _constructor(self):
            """This is the key to letting pandas know how to keep
            derivative `SomeData` the same type as yours. It should
            be enough to return the name of the class. However, in
            some cases, `__finalize__` is not called and `my_attr` is
            not carried over. We can fix that by constructing a callable
            that makes sure to call `__finalize__` every time."""
            def _c(*args, **kwargs):
                return SomeData(*args, **kwargs).__finalize__(self)
            return _c

        def __init__(self, *args, **kwargs):
            # grab the keyword argument that is supposed to be my_attr
            self.my_attr = kwargs.pop('my_attr', None)
            super().__init__(*args, **kwargs)

        def my_method(self, other):
            return self * np.sign(self - other)

Demonstration
    mydata = SomeData(dict(A=[1, 2, 3], B=[4, 5, 6]), my_attr='an attr')
    print(mydata, type(mydata), mydata.my_attr, sep='\n' * 2)

       A  B
    0  1  4
    1  2  5
    2  3  6

    <class '__main__.SomeData'>

    an attr

    newdata = mydata.mul(2)
    print(newdata, type(newdata), newdata.my_attr, sep='\n' * 2)

       A   B
    0  2   8
    1  4  10
    2  6  12

    <class '__main__.SomeData'>

    an attr

    newerdata = mydata.my_method(newdata)
    print(newerdata, type(newerdata), newerdata.my_attr, sep='\n' * 2)

       A  B
    0 -1 -4
    1 -2 -5
    2 -3 -6

    <class '__main__.SomeData'>

    an attr
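The earlier remark about also subclassing pandas.Series, so that one-dimensional slices keep a custom type, can be sketched roughly as follows. This is a minimal sketch, assuming the `_constructor_sliced` and `_constructor_expanddim` hooks described in the pandas guide on extending pandas. The names `SomeSeries` and `SomeFrame` are hypothetical and deliberately separate from `SomeData`, so the block runs on its own and does not carry `my_attr`.

```python
import pandas as pd


class SomeSeries(pd.Series):
    """Hypothetical Series subclass returned for 1-D slices."""

    @property
    def _constructor(self):
        # Keep results of Series operations as SomeSeries.
        return SomeSeries

    @property
    def _constructor_expanddim(self):
        # Used when a Series is expanded to two dimensions (e.g. to_frame()).
        return SomeFrame


class SomeFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass paired with SomeSeries."""

    @property
    def _constructor(self):
        # Keep results of DataFrame operations as SomeFrame.
        return SomeFrame

    @property
    def _constructor_sliced(self):
        # Used when a single column or row is pulled out of the frame.
        return SomeSeries


frame = SomeFrame(dict(A=[1, 2, 3], B=[4, 5, 6]))
column = frame['A']        # 1-D slice of the subclassed frame
doubled = frame.mul(2)     # 2-D result of an operation

print(type(column).__name__, type(doubled).__name__)
```

Without `_constructor_sliced`, selecting a column would be expected to fall back to a plain pandas.Series.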
Gotchas
This approach breaks on pd.DataFrame.equals:
    newerdata.equals(newdata)  # Should be `False`

    TypeError                                 Traceback (most recent call last)
     in ()
    ----> 1 newerdata.equals(newdata)

    ~/anaconda3/envs/3.6.ml/lib/python3.6/site-packages/pandas/core/generic.py in equals(self, other)
       1034         the same location are considered equal.
       1035         """
    -> 1036         if not isinstance(other, self._constructor):
       1037             return False
       1038         return self._data.equals(other._data)

    TypeError: isinstance() arg 2 must be a type or tuple of types
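What happens is that this method expects to find an object of type type in the _constructor attribute. Instead, it finds the callable I put there to fix the __finalize__ issue I ran into.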
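The failure can be reproduced without pandas. The sketch below uses hypothetical names (`Frame`, `make_frame`) to show that `isinstance` accepts a class as its second argument but raises `TypeError` when given a plain function, which is what the `_constructor` property returns in the class above.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame subclass."""


def make_frame(*args, **kwargs):
    # Hypothetical factory, playing the role of the `_c` closure.
    return Frame(*args, **kwargs)


obj = Frame()

# A class is a valid second argument to isinstance.
class_check = isinstance(obj, Frame)

# A function is not a type, so isinstance raises TypeError.
raised = False
try:
    isinstance(obj, make_frame)
except TypeError as exc:
    raised = True
    print('TypeError:', exc)

print(class_check, raised)
```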
Workaround
Override the equals method in the class definition with the following.
    def equals(self, other):
        try:
            pd.testing.assert_frame_equal(self, other)
            return True
        except AssertionError:
            return False

    newerdata.equals(newdata)  # Should be `False`

    False
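One caveat with this override: with its default settings, pd.testing.assert_frame_equal compares floating-point values approximately, so it may not match an exact comparison. As a possible alternative, and only a sketch under assumptions, the override could check the type explicitly and then delegate to the stock equals on plain DataFrame views. The class name `SomeDataLite` is hypothetical and is a trimmed-down stand-in for `SomeData`, without `my_attr`, so the block runs on its own.

```python
import pandas as pd


class SomeDataLite(pd.DataFrame):
    """Hypothetical, trimmed-down stand-in for SomeData."""

    @property
    def _constructor(self):
        return SomeDataLite

    def equals(self, other):
        # Check the type against the class itself rather than against
        # whatever `_constructor` happens to return.
        if not isinstance(other, type(self)):
            return False
        # Delegate the value comparison to the stock implementation
        # by viewing both operands as plain DataFrames.
        return pd.DataFrame(self).equals(pd.DataFrame(other))


a = SomeDataLite(dict(A=[1, 2, 3], B=[4, 5, 6]))
b = SomeDataLite(dict(A=[1, 2, 3], B=[4, 5, 6]))
c = a.mul(2)

print(a.equals(b), a.equals(c))
```

This version never passes `_constructor` to `isinstance`, so it does not depend on `_constructor` being a class.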