Web Crawler: pandas


import pandas as pd

df = pd.read_csv('123.csv')
# print(df)

# Drop rows that contain missing values
# df2 = df.dropna()
# print(df2)

# Check which values are missing
# print(df['NUM_BEDROOMS'].isnull())

# Specify which strings count as missing values
# missing_values = ["n/a", "na", "--", "NaN"]
# df = pd.read_csv('123.csv', na_values=missing_values)
# # df.dropna(inplace=True)
# # Drop rows only when specific columns are missing
# df.dropna(subset=['ST_NUM'], inplace=True)
# print(df)

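# Specify which strings count as missing values
missing_values = ["n/a", "na", "--", "NaN"]
df = pd.read_csv('123.csv', na_values=missing_values)
# Replace every missing value with a fixed value
# df.fillna(123456, inplace=True)
# print(df)
# Fill a single column
# df['ST_NUM'] = df['ST_NUM'].fillna('66666')
# print(df)
# Replace missing values with the mean, median, or mode
avg = df['ST_NUM'].mean()
med = df['ST_NUM'].median()
# Assign the result back; calling fillna(inplace=True) on a selected column
# may not update df in recent pandas versions
df['ST_NUM'] = df['ST_NUM'].fillna(avg)
print(df)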
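The script above depends on a local file, 123.csv, that is not included here. As a minimal sketch of the same steps that runs without that file, the block below reads a small in-memory CSV and compares filling with the mean, the median, and the mode. The rows in `csv_text` are invented for illustration, and `mode()[0]` simply takes the first mode when several values tie.

```python
import io

import pandas as pd

# Hypothetical stand-in for 123.csv; these rows are invented for illustration.
csv_text = """ST_NUM,NUM_BEDROOMS
104,3
197,n/a
--,2
201,na
203,3
na,1
203,--
"""

# Treat these strings as missing values while reading.
missing_values = ["n/a", "na", "--", "NaN"]
df = pd.read_csv(io.StringIO(csv_text), na_values=missing_values)

mean_value = df["ST_NUM"].mean()
median_value = df["ST_NUM"].median()
mode_value = df["ST_NUM"].mode()[0]  # mode() returns a Series; take the first entry

# Put the three fill strategies side by side without modifying df.
comparison = pd.DataFrame({
    "original": df["ST_NUM"],
    "mean_fill": df["ST_NUM"].fillna(mean_value),
    "median_fill": df["ST_NUM"].fillna(median_value),
    "mode_fill": df["ST_NUM"].fillna(mode_value),
})
print(comparison)
```

Which statistic is appropriate depends on the column: the mean is sensitive to outliers, while the median and the mode always return values that are typical of the data.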
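import pandas as pd

data = {
    "Date": ['2020/12/01', '2020/12/02', '20201226'],
    "duration": [50, 40, 45]
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])
print(df)

# Convert the date column and overwrite the original column.
# The strings use two different layouts, so pandas 2.0+ needs format='mixed'
# to parse each value on its own; older versions do not accept this option.
df['Date'] = pd.to_datetime(df['Date'], format='mixed')
print(df)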
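The Date column above mixes two layouts ('2020/12/01' and '20201226'), and how pandas handles that differs between versions. One way to avoid relying on pandas' format inference is to try an explicit list of formats for each value. The sketch below assumes those two layouts are the only ones present; `KNOWN_FORMATS` and `parse_date` are hypothetical names introduced for this example.

```python
from datetime import datetime

import pandas as pd

# Assumption: only these two layouts occur in the data.
KNOWN_FORMATS = ("%Y/%m/%d", "%Y%m%d")


def parse_date(text):
    """Try each known format in turn; return NaT if none of them matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return pd.NaT


data = {
    "Date": ["2020/12/01", "2020/12/02", "20201226"],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

# Parse each string individually, then make sure the column is a datetime column.
df["Date"] = pd.to_datetime(df["Date"].map(parse_date))
print(df)
```

Values that match neither format become NaT instead of raising an error, so they can be located afterwards with `df["Date"].isnull()`.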

Feel free to share; when reposting, please credit the source: 内存溢出

Original article: https://54852.com/langs/794485.html
