pandas: How do I create a datetime object from a "week" and a "year"?


Try this:

In [19]: pd.to_datetime(df.Year.astype(str), format='%Y') + pd.to_timedelta(df.Week.mul(7).astype(str) + ' days')
Out[19]:
0   2016-10-28
1   2016-11-04
2   2016-12-23
3   2017-01-15
4   2017-02-05
5   2017-03-26
dtype: datetime64[ns]
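As a minimal, self-contained sketch of the approach above: the `Year` and `Week` column names come from the snippet, but the sample values below are assumptions chosen for illustration, and passing integers to `pd.to_timedelta` with `unit="D"` is a swapped-in alternative to the string concatenation. Note that this treats week N as an offset of N * 7 days from January 1. It is not the ISO 8601 week definition, which the standard library's `date.fromisocalendar` (Python 3.8+) implements.

```python
from datetime import date

import pandas as pd

# Hypothetical sample data; only the column names come from the snippet above.
df = pd.DataFrame({
    "Year": [2016, 2016, 2016, 2017, 2017, 2017],
    "Week": [43, 44, 51, 2, 5, 12],
})

# January 1 of each year, parsed from the year as a string.
year_start = pd.to_datetime(df["Year"].astype(str), format="%Y")

# Week N is treated as an offset of N * 7 days from January 1.
df["Date"] = year_start + pd.to_timedelta(df["Week"] * 7, unit="D")
print(df)

# For comparison: the Monday of ISO week 43 of 2016 is a different day.
iso_monday = date.fromisocalendar(2016, 43, 1)
print(iso_monday)
```

If the week numbers in your data follow ISO 8601, the offset-from-January-1 result will not match the ISO dates, so it is worth checking which convention the data uses before picking either approach.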

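Originally I had the timestamps in seconds (s).

In that case it is much easier to parse the date directly from the UNIX epoch timestamp: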

df['Date'] = pd.to_datetime(df['UNIX_Time'], unit='s')
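A minimal sketch of that conversion on a few hand-picked values. The `UNIX_Time` and `Date` column names come from the line above; the sample timestamps are assumptions. The last few lines are not part of the original answer: they show that once the column is a datetime, the ISO year and week can be read back with `.dt.isocalendar()` (available in pandas 1.1 and later).

```python
import pandas as pd

# Hypothetical sample: seconds since 1970-01-01 00:00:00 UTC.
df = pd.DataFrame({"UNIX_Time": [0, 60, 599_999_940]})

# unit="s" tells pandas the integers are seconds, not the default nanoseconds.
df["Date"] = pd.to_datetime(df["UNIX_Time"], unit="s")

# ISO calendar fields (year, week, day) derived from the parsed dates.
iso = df["Date"].dt.isocalendar()
df["Year"] = iso["year"]
df["Week"] = iso["week"]
print(df)
```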

Timing for a 10M-row DataFrame

Setup:

In [26]: df = pd.DataFrame(pd.date_range('1970-01-01', freq='1min', periods=10**7), columns=['date'])

In [27]: df.shape
Out[27]: (10000000, 1)

In [28]: df['unix_ts'] = df['date'].astype(np.int64)//10**9

In [30]: df
Out[30]:
                       date    unix_ts
0       1970-01-01 00:00:00          0
1       1970-01-01 00:01:00         60
2       1970-01-01 00:02:00        120
3       1970-01-01 00:03:00        180
4       1970-01-01 00:04:00        240
5       1970-01-01 00:05:00        300
6       1970-01-01 00:06:00        360
7       1970-01-01 00:07:00        420
8       1970-01-01 00:08:00        480
9       1970-01-01 00:09:00        540
...                     ...        ...
9999990 1989-01-05 10:30:00  599999400
9999991 1989-01-05 10:31:00  599999460
9999992 1989-01-05 10:32:00  599999520
9999993 1989-01-05 10:33:00  599999580
9999994 1989-01-05 10:34:00  599999640
9999995 1989-01-05 10:35:00  599999700
9999996 1989-01-05 10:36:00  599999760
9999997 1989-01-05 10:37:00  599999820
9999998 1989-01-05 10:38:00  599999880
9999999 1989-01-05 10:39:00  599999940

[10000000 rows x 2 columns]

Check:

In [31]: pd.to_datetime(df.unix_ts, unit='s')
Out[31]:
0         1970-01-01 00:00:00
1         1970-01-01 00:01:00
2         1970-01-01 00:02:00
3         1970-01-01 00:03:00
4         1970-01-01 00:04:00
5         1970-01-01 00:05:00
6         1970-01-01 00:06:00
7         1970-01-01 00:07:00
8         1970-01-01 00:08:00
9         1970-01-01 00:09:00
                  ...
9999990   1989-01-05 10:30:00
9999991   1989-01-05 10:31:00
9999992   1989-01-05 10:32:00
9999993   1989-01-05 10:33:00
9999994   1989-01-05 10:34:00
9999995   1989-01-05 10:35:00
9999996   1989-01-05 10:36:00
9999997   1989-01-05 10:37:00
9999998   1989-01-05 10:38:00
9999999   1989-01-05 10:39:00
Name: unix_ts, Length: 10000000, dtype: datetime64[ns]

Timing:

In [32]: %timeit pd.to_datetime(df.unix_ts, unit='s')
10 loops, best of 3: 156 ms per loop

Conclusion: I think 156 milliseconds to convert 10 million rows is not that slow.
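Outside IPython, a comparable measurement can be sketched with the standard `timeit` module. This is a rough reconstruction under stated assumptions rather than the original benchmark: it uses 1 million rows instead of 10 million to keep the run short, builds the `unix_ts` column directly with NumPy, and the figure it prints depends entirely on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

n_rows = 10**6  # assumption: smaller than the 10M rows above, for a quick run

# One timestamp per minute, expressed as seconds since the UNIX epoch.
df = pd.DataFrame({"unix_ts": np.arange(n_rows, dtype="int64") * 60})


def convert():
    """Parse the integer column as seconds since the epoch."""
    return pd.to_datetime(df["unix_ts"], unit="s")


# Best of 3 single runs, mirroring the "best of 3" reported by %timeit.
best = min(timeit.repeat(convert, number=1, repeat=3))
print(f"{n_rows} rows converted in {best * 1000:.1f} ms (best of 3)")

result = convert()
```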



Source: 内存溢出, https://54852.com/zaji/5623913.html
