一、相关参考

官网参考链接:Spark SQL, Built-in Functions

时间日期格式符号: Datetime patterns - Spark 3.3.1 Documentation

标准日期格式符号: yyyy-MM-dd HH:mm:ss、yyyy-M-ddTHH:mm:ss

二、时区影响

  1. 时间相关的函数,大多会受时区的影响;
  2. 不特殊指定时区,会默认使用服务器的时区;
  3. spark中处理时间时,最好指定session级时区,并且最好指定为0时区,便于进行时区偏移;
set spark.sql.session.timeZone=GMT+8;
set spark.sql.session.timeZone=UTC;


 

三、时间函数

1. 受时区影响


1. current_date获取当前日期

select current_date() as cur_date;

2018-04-09

2. current_timestamp/now()获取当前时间

select current_timestamp() as cur_time;

2018-04-09 15:20:49.247

3. unix_timestamp/to_unix_timestamp从当前时区转换到unix时间

这两个函数从官方文档看是相同的;

从指定格式的时间转换到Unix时间

unix_timestamp([timeExp[, fmt]])

Returns the UNIX timestamp of current or specified time.

Arguments:

  • timeExp - A date/timestamp or string. If not provided, this defaults to current time.
  • fmt - Date/time format pattern to follow. Ignored if timeExp is not a string. Default value is "yyyy-MM-dd HH:mm:ss". See Datetime Patterns for valid date and time format patterns.

Examples:

> SELECT unix_timestamp();
 1476884637
> SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');
 1460041200

to_unix_timestamp(timeExp[, fmt])

Returns the UNIX timestamp of the given time.

Arguments:

  • timeExp - A date/timestamp or string which is returned as a UNIX timestamp.
  • fmt - Date/time format pattern to follow. Ignored if timeExp is not a string. Default value is "yyyy-MM-dd HH:mm:ss". See Datetime Patterns for valid date and time format patterns.

Examples:

> SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd');
 1460098800

4. from_unixtime从unix时间转换到当前时区的时间戳

注意: 第一个参数是10位的时间戳

from_unixtime(unix_time[, fmt])

Returns unix_time in the specified fmt.

Arguments:

  • unix_time - UNIX Timestamp to be converted to the provided format.
  • fmt - Date/time format pattern to follow. See Datetime Patterns for valid date and time format patterns. The 'yyyy-MM-dd HH:mm:ss' pattern is used if omitted.

Examples:

> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss');
 1970-01-01 00:00:00

> SELECT from_unixtime(0);
 1970-01-01 00:00:00

5. cast从unix时间强转到时间戳(转到设定的时区)

select cast(1614815429 as timestamp); 

2021-03-04 07:50:29

6. date_format日期格式转换

第一个参数必须是标准格式的时间

date_format(timestamp, fmt)

Converts timestamp to a value of string in the format specified by the date format fmt.

Arguments:

  • timestamp - A date/timestamp or string to be converted to the given format.
  • fmt - Date/time format pattern to follow. See Datetime Patterns for valid date and time format patterns.

Examples:

> SELECT date_format('2016-04-08', 'y');
 2016

1) 不受时区影响

SELECT date_format('2016-04-08 18', 'yyyy-MM-dd');

2016-04-09

2) 这种特定时间戳格式受时区影响

select date_format('2021-06-24T22:04:50+00:00', 'yyyy-MM-dd');

2021-06-25

2. 不受时区影响

1. 返回当前时间的unix时间戳
SELECT unix_timestamp();
1614237507

2. to_utc_timestamp时间戳从指定的时区转换到0时区

to_utc_timestamp(timestamp, timezone)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in the given time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.

Examples:

> SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul');
 2016-08-30 15:00:00

3. from_utc_timestamp从0时区转换到指定时区

from_utc_timestamp(timestamp, timezone)

Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

Examples:

> SELECT from_utc_timestamp('2017-07-14 02:40:00.0', 'GMT+1');
2017-07-14 03:40:00

4.  to_date时间从指定格式转换到标准格式日期yyyy-MM-dd

to_date(date_str[, fmt])

Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted.

Arguments:

  • date_str - A string to be parsed to date.
  • fmt - Date format pattern to follow. See Datetime Patterns for valid date and time format patterns.

Examples:

> SELECT to_date('2009-07-30 04:17:52');
 2009-07-30
> SELECT to_date('2016-12-31', 'yyyy-MM-dd');
 2016-12-31
> select to_date('20161231', 'yyyyMMdd');
 2016-12-31

5. 日期的加减(最终转成日期格式)
SELECT date_add('2016-07-30 22:12:30', 1);
2016-07-31

6. to_timestamp从指定时间格式转换到标准格式

to_timestamp(timestamp_str[, fmt])

Parses the timestamp_str expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted. The result data type is consistent with the value of configuration spark.sql.timestampType.

Arguments:

  • timestamp_str - A string to be parsed to timestamp.
  • fmt - Timestamp format pattern to follow. See Datetime Patterns for valid date and time format patterns.

Examples:

> select to_timestamp('2023/5/14 2:11:00', 'yyyy/M/d h:mm:ss');
2023-05-14 02:11:00
> select to_timestamp('20200726', 'yyyyMMdd');
2020-07-26 00:00:00


 

3.  客户端显示影响

前端显示date、time类型的时候,有可能收时区的影响,为了避免不必要的麻烦可存成字符串;

select
     typeof( from_utc_timestamp(cast(cast(1697992306453 as bigint)/1000 as timestamp), 'GMT+00:00') ) as time_ori -- timestamp类型, 客户端展示时可能转到本地时区
    ,typeof( date_format(from_utc_timestamp(cast(cast(1697992306453 as bigint)/1000 as timestamp),'GMT+00:00'), 'yyyy-mm-dd hh:mm:ss') ) as time_utc  -- 字符串类型, 不受时区影响
;
time_ori    time_utc
timestamp    string

四、获取时间差

1. 获取某个时段数据

限定时间最好要闭合, 否则回溯历史数据会出问题.

-- 1. sql 函数推算(日期必须是标准格式:yyyy-MM-dd)
select
    *
from
    tbl_name
where
    dt >= date_format(date_sub('2020-07-26', 2),'yyyyMMdd')
    and dt <= date_format('2020-07-26','yyyyMMdd')
;

-- 2. 调度直接传入
select
    *
from
    tbl_name
where
    dt >= '20200726'
    and dt <= '20200727'
;

2. 获取时间差 

-- 指定时间格式,获取两个时间相差的秒数
select unix_timestamp('2023/5/14 9:09:05', 'yyyy/M/d h:mm:ss') - unix_timestamp('2023/5/14 9:09:00', 'yyyy/M/d h:mm:ss') as diff_seconds;

五、格式转换

指定时区:set spark.sql.session.timeZone=UTC;

不受时区影响函数:to_timestamp、

时区影响函数:from_unixtime、from_utc_timestamp

-- 1 从指定时间格式转换到标准格式
select to_timestamp('2023/5/14 2:11:00', 'yyyy/M/d h:mm:ss');
2023-05-14 02:11:00
select to_timestamp('20200726', 'yyyyMMdd');
2020-07-26 00:00:00

-- 2 从标准格式时间(yyyy-MM-dd 和 yyyy-MM-dd HH:mm:ss.SSSS)转到指定格式时间
select date_format('2016-04-08', 'yyyyMMdd');
20160408

-- 3 yyyyMMdd--->yyyy-MM-dd 时间格式转换
select to_date('20161231', 'yyyyMMdd');
2016-12-31
select concat_ws('-',substr('20200726',1,4),substr('20200726',5,2),substr('20200726',7,2));
2020-07-26
select date_format(to_timestamp('20200726', 'yyyyMMdd'), 'yyyy-MM-dd');
2020-07-26


-- 4 从unix时间戳(10位)转换到当前时区时间戳,从当前时区转换到指定时区
-- session时区是0时区,转换到东八区
> set spark.sql.session.timeZone=UTC;
> select from_utc_timestamp(from_unixtime(0),'GMT+8') as dt;
  1970-01-01 08:00:00

日期格式符 - 格式符的数量影响显示格式,具体要什么样式可以实际测试;

Datetime patterns - Spark 3.5.1 Documentation

-- 季度的格式符号
> select date_format(date '2024-05-01', "q") as time;
2

> select date_format(date '2024-05-01', "qq") as time;
02

> select date_format(date '2024-05-01', "Q") as time;
2

> select date_format(date '2024-05-01', "QQ") as time;
02

> select date_format(date '2024-05-01', "QQQ") as time;
Q2

> select date_format(date '2024-05-01', "QQQQ") as time;
2nd quarter


-- 月份的格式符号
> select date_format(date '1970-01-01', "M");
1

> select date_format(date '1970-09-01', "MM");
09

> select date_format(date '1970-01-01', "MMM");
Jan

> select date_format(date '1970-01-01', "MMMM");
January

六、获取特定日期

-- 1 获取周一日期
SELECT date_format(date_trunc('WEEK', '2020-09-15'),'yyyy-MM-dd');
2020-09-14

-- 2 获取季初日期
SELECT date_format(date_trunc('QUARTER', '2020-09-15'),'yyyy-MM-dd');
2020-07-01

-- 3 获取年初日期
SELECT trunc('2020-09-15', 'YEAR');
2020-01-01
SELECT date_format(date_trunc('YEAR', '2020-09-15'),'yyyy-MM-dd');
2020-01-01

-- 4 获取月初日期
SELECT trunc('2020-09-15', 'MM');
2020-09-01
SELECT date_format(date_trunc('MM', '2020-09-15'),'yyyy-MM-dd');
2020-09-01
SELECT concat(substring('2020-09-15',1,8),'01');
2020-09-01

-- 5 获取月末日期
select last_day('2009-01-12');
2009-01-31

-- 6 获取昨天的日期
select date_format(date_sub(current_date, 1),'yyyMMdd');

-- 7 yyyyMMdd格式日期增加1天
select date_format( date_add( to_timestamp('20240410', 'yyyyMMdd'),1),'yyyyMMdd') as st;


Logo

开放原子开发者工作坊旨在鼓励更多人参与开源活动,与志同道合的开发者们相互交流开发经验、分享开发心得、获取前沿技术趋势。工作坊有多种形式的开发者活动,如meetup、训练营等,主打技术交流,干货满满,真诚地邀请各位开发者共同参与!

更多推荐