(1)Hive 数仓中一些常用的dt与日期的转换操作

下面总结了自己工作中经常用到的一些日期转换,这类日期转换经常用于报表的时间粒度和统计周期的控制中

日期变换:
(1)dt转日期
to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd')))
(2)日期转dt
regexp_replace('${date}','-','')
(3)dt转当月1号日期
to_date(from_unixtime(unix_timestamp(concat(substr('${dt}',1,6),'01'),'yyyyMMdd')))
trunc(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),'MM')
-- 下月1号日期
trunc(add_months(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),1),'MM')
(4)dt转当周星期一日期
next_day(date_add(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))), -7), 'Mo')
date_sub(next_day(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),'MO'),7)
-- 下周星期一日期
next_day(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))),'MO')
(5)dt前六天日期(dt为星期天时得到的是本周周一的日期)
date_add(to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd'))), -6)
(5)dt转当季第一天日期
if(length(floor(substr('${dt}',5,2)/3.1)*3+1)=1,concat(substr('${dt}',1,4),'-0',floor(substr('${dt}',5,2)/3.1)*3+1,'-01'),concat(substr('${dt}',1,4),'-',floor(substr('${dt}',5,2)/3.1)*3+1,'-01'))
(6)dt转半年第一天日期
if(length(floor(substr('${dt}',5,2)/6.1)*6+1)=1,concat(substr('${dt}',1,4),'-0',floor(substr('${dt}',5,2)/6.1)*6+1,'-01'),concat(substr('${dt}',1,4),'-',floor(substr('${dt}',5,2)/6.1)*6+1,'-01'))
(7)dt转当年1号日期
concat(substr('${dt}',1,4),'-01-01')(8)在同时有日周月粒度时要注意数据的时间范围,有时每月的第一个自然周会跨月,比如2019年3月的第一周的日期是20190225-20190303where agent_business_date between date_add_day('${dt}',-31) and to_date(from_unixtime(unix_timestamp('${dt}','yyyyMMdd')))where dt between regexp_replace(date_add_day('${dt}',-31),'-','') and '${dt}'

------------------------------------------------------------------------------------------
-- 日期维度表表结构edw_public.dim_esf_edw_pub_date
------------------------------------------------------------------------------------------
col_name	                data_type	comment
------------------------------------------------------------------------
calendar_date	              string	    日期,格式为"YYYY-MM-DD"
week_english_name	        string	    星期英文名
week_chinese_name	        string	    星期中文名
day_of_week_number	        int	    所属一周当中的第几天
calendar_month_code	        string	    日期所属月份,格式为"YYYY-MM"
calendar_month_number	      int	    所属月份数字
month_english_name	        string	    月份英文名
month_chinese_name	        string	    月份中文名
day_of_month_number	        int	    所属月份当中的第几天
calendar_quater_code	      string	    日期所属季度,格式为"YYYY-QT"
calendar_quater_number	      int	    所属季度数字
day_of_quater_number	      int	    所属季度当中的第几天
calendar_half_year_code	      string	    日期所属半年,格式为"YYYY-HY"
calendar_half_year_number	int	    所属半年数字,1为上半年,2为下半年
calendar_year_code	        string	    日期所属年份,格式为"YYYY"
day_of_year_number	        int	    所属年份当中的第几天
work_day_flag	              string	    工作日标志: Y - 是/ N - 否
holiday_flag	              string	    节假日标志: Y - 是/ N - 否

-- 日期维度表的使用
-- 当天日期
SELECT
	calendar_date
FROM
	edw_public.dim_esf_edw_pub_date
WHERE
	calendar_date = regexp_replace('${dt}','(\\d{4})(\\d{2})(\\d{2})','$1-$2-$3')

-- Finereport中日周月季半年年 各周期末日期的算法
select
    ${if(粒度 == 1," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date","")}
    ${if(粒度 == 2," distinct case when day_of_week_number = 1 and date_add('day',6,date(calendar_date)) >=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date))  when day_of_week_number = 7 and date(calendar_date) >=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) when day_of_week_number = 1 then date_add('day',6,date(calendar_date))  when day_of_week_number = 7 then date(calendar_date) else date(calendar_date) end as period_end_date ","")}
    ${if(粒度 == 3," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
    ${if(粒度 == 4," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
    ${if(粒度 == 5," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
    ${if(粒度 == 6," case when date(max(calendar_date))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date ","")}
from
    edw_public.dim_esf_edw_pub_date
where calendar_date >= '${开始时间}' and calendar_date <= '${结束时间}'
${if(粒度 == 1," group by calendar_date ","")}
${if(粒度 == 2," and day_of_week_number in (1,7) ","")}
${if(粒度 == 3," group by calendar_month_code  ","")}
${if(粒度 == 4," group by calendar_quater_code  ","")}
${if(粒度 == 5," group by calendar_year_code  ","")}
${if(粒度 == 6," group by calendar_half_year_code  ","")}

-- Finereport中日周月季半年年 各周期期初期末日期的算法(这种计算方法当前日期是20190330,输入的日期范围是2019-03-01至2091-03-28则输出的月日期范围是2019-03-29)
select
    ${if(粒度 == 1,"date(calendar_date) as period_start_date, date(calendar_date) as period_end_date ","")}
    ${if(粒度 == 2,"case when day_of_week_number = 1 then  date(calendar_date) when day_of_week_number = 7 then date_add('day',-6, date(calendar_date)) end as period_start_date, case  when day_of_week_number = 1 and date_add('day',6, date(calendar_date)) >=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) when day_of_week_number = 7 and date(calendar_date)>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) when day_of_week_number = 1 then date_add('day',6, date(calendar_date))  when day_of_week_number = 7 then date(calendar_date)  end as period_end_date ","")}
    ${if(粒度 == 3,"date(calendar_date) as period_start_date, case when date_add('day',-day(date(calendar_date)),date_add('month',1,(date(calendar_date))))>=date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date_add('day',-day(date(calendar_date)),date_add('month',1,(date(calendar_date)))) end as period_end_date ","")}
    ${if(粒度 == 4,"calendar_date as period_start_date,date_add('day',-1,date_add('month',1,date(substr(calendar_date,1,4)||'-'||cast(cast(floor(cast(substr(calendar_date,6,2) as int)/3.1)*3+3 as int) as varchar)||'-01')))  as  period_end_date ","")}
    ${if(粒度 == 5,"date(concat(substr(calendar_date,1,4),'-01','-01')) as period_start_date,case when date(concat(substr(calendar_date,1,4),'-12','-31'))>= date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(concat(substr(calendar_date,1,4),'-12','-31')) end as  period_end_date","")}
    ${if(粒度 == 6,"date(min(calendar_date)) as period_start_date,case when date(max(calendar_date))>= date(date_add('day',-1,current_date)) then date(date_add('day',-1,current_date)) else date(max(calendar_date)) end as period_end_date","")}
from
    edw_public.dim_esf_edw_pub_date
where  calendar_date >= '${开始时间}' and calendar_date <= '${结束时间}'
${if(粒度 == 1," and 1 = 1 ","")}
${if(粒度 == 2," and day_of_week_number in (1,7) ","")}
${if(粒度 == 3," and day_of_month_number = 1","")}
${if(粒度 == 4," and day_of_quater_number = 1","")}
${if(粒度 == 5," and day_of_year_number = 1","")}
${if(粒度 == 6," group by calendar_half_year_code ","")}

------------------------------------------------------------------------------------------------
-- 根据输入的时间范围计算期末日期
------------------------------------------------------------------------------------------------
select t1.*
from
-- 日周月季年半年不同粒度的统计数据各存为了一张表
    edw_reports.adm_xf_edw_house_sub_project_report_00${dtype}ly_di t1--日报
join
(
-- 日
SELECT
    calendar_date
FROM
    edw_public.dim_esf_edw_pub_date
WHERE
    calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '1_dai'
UNION
-- 月
SELECT
    MAX(calendar_date) AS calendar_date
FROM
    edw_public.dim_esf_edw_pub_date
WHERE
    calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '2_dai'
GROUP BY
    calendar_month_number

UNION
-- 周
SELECT
    calendar_date
FROM
    edw_public.dim_esf_edw_pub_date
WHERE
    calendar_date BETWEEN '${bdt}' AND '${edt}'
AND day_of_week_number = 7
AND '${dtype}' = '3_dai'

UNION
-- 季
SELECT
    MAX(calendar_date) AS calendar_date
FROM
    edw_public.dim_esf_edw_pub_date
WHERE
    calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '4_dai'
GROUP BY
    calendar_quater_code

UNION
-- 年
SELECT
    MAX(calendar_date) AS calendar_date
FROM
    edw_public.dim_esf_edw_pub_date
WHERE
    calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '5_dai'
GROUP BY
    calendar_year_code

UNION
-- 半年
SELECT
    MAX(calendar_date) AS calendar_date
FROM
    edw_public.dim_esf_edw_pub_date
WHERE
    calendar_date BETWEEN '${bdt}' AND '${edt}'
AND '${dtype}' = '6_dai'
GROUP BY
    calendar_half_year_code

UNION
SELECT
    MAX(calendar_date) AS calendar_date
FROM
    edw_public.dim_esf_edw_pub_date
WHERE
    calendar_date BETWEEN '${bdt}' AND '${edt}'
ORDER BY
    calendar_date
) t2
on t1.statistic_date = t2.calendar_date
where
statistic_date between '${bdt}' and '${edt}'
${if(len(tenant_name) == 0,"","and house_sub_project_organization_short_name = '" + tenant_name + "'")}
${if(len(status) == 0,"","and house_sub_project_cooperation_status_code = " + status)}
${if(len(tenant_type) == 0,"","and house_sub_project_organization_business_type_code= " + tenant_type)}
${if(len(project_type) == 0,"","and house_sub_project_cooperation_type_code= " + project_type)}
order by statistic_date

(2)Hive 计算指定日期在本周的第几天和指定日期的本周指定天数的日期

注意这里需要先明确本周的第一天到底是星期一还是星期天?dayofweek函数定义星期天是一周中的第一天,另外dayofweek在hive2.2.0才开始支持,低版本的hive不支持dayofweek函数,需要使用其他方法实现,请见我的博客Hive和sparksql中的dayofweek

-- 计算指定日期本周的第一天和最后一天
select
     day
    , dayofweek(day)                                                                    as dw1
    , date_add(day,1 - dayofweek(day))                                                  as Su_s -- 周日_start
    , date_add(day,7 - dayofweek(day))                                                  as Sa_e -- 周六_end
    , case when dayofweek(day) = 1 then 7 else dayofweek(day) - 1 end                   as dw2
    , date_add(day,1 - case when dayofweek(day) = 1 then 7 else dayofweek(day) - 1 end) as Mo_s -- 周一_start
    , date_add(day,7 - case when dayofweek(day) = 1 then 7 else dayofweek(day) - 1 end) as Su_e -- 周日_end
    , trunc(day,'YY')                                                                   as yearly_first_day
    , trunc(day,'MM')                                                                   as monthly_first_day    -- 本月1号日期
    , last_day(day)                                                                     as monthly_last_day     -- 本月最后一天日期
    , date_add(next_day(day,'MO'), -7)                                                  as weekly_first_day     -- 本周一日期
    , next_day(date_add(day, -7),'MO')                                                  as weekly_first_day     -- 本周一日期
    , case when (7 - datediff(next_day(day,'SU'),day)) <> 0 then next_day(day,'SU') else day end as weekly_end_day      -- 本周日日期

from (
    select '2018-11-01' as day union all
    select '2018-11-02' as day union all
    select '2018-11-03' as day union all
    select '2018-11-04' as day union all
    select '2018-11-05' as day union all
    select '2018-11-06' as day union all
    select '2018-11-07' as day union all
    select '2018-11-08' as day union all
    select '2018-11-09' as day union all
    select '2018-11-10' as day union all
    select '2018-11-11' as day union all
    select '2018-11-12' as day union all
    select '2018-11-13' as day union all
    select '2018-11-14' as day union all
    select '2018-11-15' as day union all
    select '2018-11-16' as day union all
    select '2018-11-17' as day union all
    select '2018-11-18' as day union all
    select '2018-11-19' as day union all
    select '2018-11-20' as day union all
    select '2018-11-21' as day union all
    select '2018-11-22' as day union all
    select '2018-11-23' as day union all
    select '2018-11-24' as day union all
    select '2018-11-25' as day union all
    select '2018-11-26' as day union all
    select '2018-11-27' as day union all
    select '2018-11-28' as day union all
    select '2018-11-29' as day union all
    select '2018-11-30' as day union all
) t1
;

Hive 数仓中常见的日期转换操作的更多相关文章

  1. 在HUE中将文本格式的数据导入hive数仓中

    今天有一个需求需要将一份文档形式的hft与fdd的城市关系关系的数据导入到hive数仓中,之前没有在hue中进行这项操作(上家都是通过xshell登录堡垒机直接连服务器进行操作的),特此记录一下. - ...

  2. 使用Oozie中workflow的定时任务重跑hive数仓表的历史分期调度

    在数仓和BI系统的开发和使用过程中会经常出现需要重跑数仓中某些或一段时间内的分区数据,原因可能是:1.数据统计和计算逻辑/口径调整,2.发现之前的埋点数据收集出现错误或者埋点出现错误,3.业务数据库出 ...

  3. js中时间戳与日期转换-js日期操作

    常用的一些日期操作. 用js获取一个时间戳. <script type="text/javascript"> var date = new Date();//当前时间 ...

  4. Hive数仓之快速入门(二)

    上次已经讲了<Hive数据仓库之快速入门一>不记得的小伙伴可以点击回顾一下,接下来我们再讲Hive数据仓库之快速入门二 DQL hive中的order by.distribute by.s ...

  5. hive数仓客户端界面工具

    1.Hive的官网上介绍了三个可以在Windows中通过JDBC连接HiveServer2的图形界面工具,包括:SQuirrel SQL Client.Oracle SQL Developer以及Db ...

  6. python中常见的日期换算

    time模块提供各种操作时间的函数  说明:一般有两种表示时间的方式:       第一种是时间戳的方式(相对于1970.1.1 00:00:00以秒计算的偏移量),时间戳是惟一的       第二种 ...

  7. 大数据学习——hive数仓DML和DDL操作

    1 创建一个分区表 create table t_partition001(ip string,duration int) partitioned by(country string) row for ...

  8. python中常见的日期处理方法

    今天:today = datetime.date.today() 昨天:yesterday = today - datetime.timedelta(days=1) 上个月:last_month = ...

  9. scala中的数组的转换操作

    1.共有两种操作 转换成一种新的数组 2.yield转换 3.函数式编程转换

随机推荐

  1. CSS实现背景透明,文字不透明(兼容各浏览器)

    在 FF/Chrome 等较新的浏览器中可以使用css属性background- color的rgba轻松实现背景透明,而文字保持不透明.而IE6/7/8浏览器不支持rgba,只有使用IE的专属滤镜f ...

  2. 去掉文件夹的.svn文件

    加注册表 代码为: Windows Registry Editor Version 5.00 [HKEY_LOCAL_MACHINE"SOFTWARE"Classes"F ...

  3. .net 禁止远程查看应用程序错误的详细信息,服务器上出现应用程序错误

    打开页面时出现以下错误   "/"应用程序中的服务器错误. 运行时错误 说明: 服务器上出现应用程序错误.此应用程序的当前自定义错误设置禁止远程查看应用程序错误的详细信息(出于安全 ...

  4. RFID电子标签的二次注塑封装

    生活当中,RFID电子标签具有明显的优势,随着RFID电子标签成本的降低.读写距离的提高.标签存储容量增大及处理时间缩短的发展趋势,R F I D电子标签的应用将会越来越广泛. RFID电子标签的应用 ...

  5. 【java基础】==与equals()方法比较

    1.八种基本类型 只有"=="方法进行判断是否相等,比较的是基本数据类型的值,没有equals()方法 2.Java类库中的类 "=="比较的是对象的句柄,即内 ...

  6. Function.caller、arguments.caller、argument.callee

    caller.callee是与javascript函数相关的两个属性,今天来总结下. Function.caller caller是javascript函数的一个属性,它指向调用当前函数的函数,如果函 ...

  7. django - 总结 - 用户认证组件

    用户认证组件 from django.contrib import auth 从auth_user表中获取对象,没有返回None,其中密码为密文,使用了加密算法 user = auth.authent ...

  8. foreach和List.Foreach 退出循环相关问题

    foreach: continue;:退出本次循环 break;:退出循环 return;:退出循环 List.Foreach: return;:退出本次循环 小结:list.Foreach中不能退出 ...

  9. devtools进行热部署

    热部署的形式这里只介绍一种devtools devtools可以实现页面热部署(即页面修改后会立即生效,这个可以直接在application.properties文件中配置spring.thymele ...

  10. ngx-moment汉化

    1.导入汉化文件 import '../../../node_modules/moment/locale/zh-cn.js' 2.使用汉化 <span>{{item.time|amLoca ...