
需求:Hive分区根据时间,年先分区,然后分月,然后分日(三级分区)。需要针对性调整SQL,使得查询速度加快。
思路:
例如:2019-03-05 到 2021-01-10 1、按年分段 2019-03-05 2019-12-31 2020-01-01 2020-12-31(不用管)pyear=2020 2021-01-01 2021-01-10 2、处理每段 (1)按月分隔 (不是整月)2019-03-05 2019-03-31//按每个月31天,不考虑月份天数不同 pyear=2019 and pmonth=03 and pday>=05 and pday<=31 (整月)2019-04-01 2019-12-31 pyear=2019 and pmonth>=04 and pmonth<=12 (整年)2020-01-01 2021-12-31 pyear=2020 (不是整月)2021-01-01 2021-01-10 pyear=2021 and pmonth=01 and pday>=01 and pday<=10
示例输入:startime = 20210102,endtime =20211231
示例输出:
(pyear=2021 and ((pmonth=01 and pday>=02) or (pmonth>01 and pmonth<12) or (pmonth=12 and pday<=31)))
示例输入:startime = 20190102,endtime =20211231
示例输出:
(pyear=2019 and ((pmonth=01 and pday>=02) or (pmonth>01 and pmonth<12) or (pmonth=12 and pday<=31))) or (pyear=2020) or (pyear=2021)
使用:使用该工具类就可以在原SQL上精准定位分区,加快了查询速度。
package com.koma.gantryprofile.Utils;
import java.util.ArrayList;
import java.util.List;
public class DateToSqlUtil {
public static String getTimeSpanSqlStr(String begintime, String endtime) {
//result_sql是最终结果添加分区的SQL子句
String result_sql = "";
List timeSqlList = new ArrayList<>();
//1、按年分段
List subtimeList = getTimeListByYear(begintime,endtime);
//System.out.println("按年分段后的结果(subtimeList):"+subtimeList+"n");
//2、处理每段
for (int i = 0; i < subtimeList.size(); i++) {
String timeStr = subtimeList.get(i);
System.out.print("正在处理 "+timeStr+" 这一时间段");
timeSqlList.add(getTimeSql(timeStr));
//System.out.println("这一时间段拼接日期分区sql为:"+timeSqlList);
}
if(timeSqlList.size()>0) {
result_sql = String.join(" or ", timeSqlList);
}
return result_sql;
}
public static List getTimeListByYear(String begintime, String endtime) {
List subTimeList = new ArrayList<>();
String begin_year = begintime.substring(0,4);
String end_year = endtime.substring(0,4);
List yearList = new ArrayList<>();
int start = Integer.parseInt(begin_year);
int end = Integer.parseInt(end_year);
while(start<=end) {
yearList.add(String.valueOf(start));
start++;
}
//判断输入是否同一年
if(yearList.size()==1) {
subTimeList.add(begintime+"-"+endtime);
}
else {
subTimeList.add(begintime+"-"+yearList.get(0)+"1231");
for (int i = 1; i < yearList.size()-1; i++) {
subTimeList.add(yearList.get(i)+"0101-"+yearList.get(i)+"1231");
}
subTimeList.add(yearList.get(yearList.size()-1)+"0101-"+endtime);
}
return subTimeList;
}
public static String getTimeSql(String timeStr) {
StringBuffer sql = new StringBuffer("");
String[] timeArray = timeStr.split("-");
String start = timeArray[0];
String end = timeArray[1];
//判断分段的日期是否为整年
if(start.endsWith("0101") && end.endsWith("1231")) {
sql.append("(pyear=").append(start.substring(0,4)).append(")");
}
//非整年
else {
sql.append("(pyear=").append(start.substring(0,4))
.append(" and (").append("(pmonth=").append(start.substring(4,6))
.append(" and ").append("pday>=").append(start.substring(6,8));
List monthList = getMonthList(start, end);
//同一月
if(monthList.size()==1) {
sql.append(" and ").append("pday<=").append(end.substring(6,8)).append(")");
}
else {
sql.append(")");
sql.append(" or (pmonth>").append(monthList.get(0))
.append(" and pmonth<").append(monthList.get(monthList.size()-1)).append(")");
sql.append(" or (pmonth=").append(monthList.get(monthList.size()-1)).append( " and ")
.append("pday<=").append(end.substring(6,8)).append(")");
}
sql.append("))");
}
return sql.toString();
}
public static List getMonthList(String start, String end) {
List monthList = new ArrayList<>();
int startnum = Integer.valueOf(start.substring(4,6));
int endnum = Integer.valueOf(end.substring(4,6));
while(startnum<=endnum) {
if(startnum<10) {
monthList.add("0"+startnum);
} else {
monthList.add(String.valueOf(startnum));
}
startnum++;
}
return monthList;
}
public static void main(String[] args) {
String startTime="20210102";
String finalTime="20211231";
System.out.println("n"+getTimeSpanSqlStr(startTime,finalTime));
}
}
欢迎分享,转载请注明来源:内存溢出
微信扫一扫
支付宝扫一扫
评论列表(0条)