Hi there,
I'm having trouble starting HiveServer2 Interactive and LLAP in ODP 1.2.4.0 (possibly related to #46).
Once enabled and installed, the HiveServer2 Interactive service fails to start. The other Hive services (HMS and HS2) start without issues.
Ambari initially reported errors in
/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py
and
/usr/odp/1.2.4.0-102/hive/scripts/llap/yarn/package.py
(mostly related to the Python 2 to Python 3 conversion). I fixed all the Python 2 artifacts I found, and those errors were eventually resolved.
Now the LLAP daemon appears to start, but it immediately transitions to the COMPLETE state and HS2 Interactive enters a failed state.
Error log in Ambari when starting/restarting HiveServer2 Interactive:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 502, in check_llap_app_status
return self._verify_llap_app_status(llap_app_info, llap_app_name, return_immediately_if_stopped, curr_time)
File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 566, in _verify_llap_app_status
raise Fail(status_str)
resource_management.core.exceptions.Fail: LLAP app 'llap0' current state is COMPLETE.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 573, in <module>
HiveServerInteractive().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 350, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 130, in start
status = self._llap_start(env)
File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 322, in _llap_start
status = self.check_llap_app_status(params.llap_app_name, params.num_retries_for_checking_llap_status)
File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 504, in check_llap_app_status
Logger.info(e.message)
AttributeError: 'Fail' object has no attribute 'message'
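The final AttributeError in this traceback looks like one more Python 2 leftover rather than the root cause: Python 3 exceptions do not carry a `.message` attribute, so the logging call in `check_llap_app_status` fails while handling the original `Fail`. A minimal sketch of the pattern, using a stand-in `Fail` class and the standard `logging` module in place of Ambari's `resource_management` classes (both are assumptions, made only so the snippet runs on its own):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llap_status_sketch")


class Fail(Exception):
    """Stand-in for resource_management.core.exceptions.Fail (assumption)."""


def describe_failure(exc):
    # Python 3 exceptions have no .message attribute; str(exc) returns the text
    # as long as the exception class does not override __str__.
    return str(exc)


def check_status():
    try:
        raise Fail("LLAP app 'llap0' current state is COMPLETE.")
    except Fail as e:
        # Calling e.message here would raise AttributeError on Python 3.
        text = describe_failure(e)
        logger.info(text)
        return text


if __name__ == "__main__":
    check_status()
```

Replacing `e.message` with `str(e)` in the script would presumably only clean up the error reporting; the underlying `Fail` (LLAP reaching COMPLETE) would still be raised.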
stdout:
..............
2025-12-02 16:32:06,409 - INFO [main:o.a.h.h.l.c.LlapStatusServiceDriver@862] - --------------------------------------------------------------------------------
2025-12-02 16:32:06,409 - INFO [main:o.a.h.h.l.c.LlapStatusServiceDriver@865] - Awaiting LLAP launch
2025-12-02 16:32:06,409 - INFO [main:o.a.h.h.l.c.LlapStatusServiceDriver@866] - --------------------------------------------------------------------------------
2025-12-02 16:32:06,409 - WARN [main:o.a.h.h.l.c.LlapStatusServiceDriver@733] - COMPLETE state reached while waiting for RUNNING state. Failing.
Final diagnostics: null
2025-12-02 16:32:06,410 - INFO [main:o.a.h.h.l.c.LlapStatusServiceDriver@865] - Awaiting LLAP launch
2025-12-02 16:32:06,410 - INFO [main:o.a.h.h.l.c.LlapStatusServiceDriver@866] - --------------------------------------------------------------------------------
2025-12-02 16:32:06,410 - INFO [main:o.a.h.h.l.c.LlapStatusServiceDriver@811] -
{
"amInfo" : {
"appName" : "llap0",
"appType" : "yarn-service",
"appId" : "application_1764721607818_0002"
},
"state" : "COMPLETE",
"appStartTime" : 1764721919988,
"appFinishTime" : 1764721925952,
"runningThresholdAchieved" : false
}
2025-12-02 16:32:06,615 - LLAP app 'llap0' current state is COMPLETE.
Command failed after 1 tries
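The two timestamps in the status JSON show how briefly the application lived. A small sketch that parses the fields as printed above and computes the lifetime, assuming both values are epoch timestamps in milliseconds (inferred from their magnitude, not confirmed by the log):

```python
import json

# Status document copied from the LlapStatusServiceDriver output above.
status_json = """
{
  "amInfo": {
    "appName": "llap0",
    "appType": "yarn-service",
    "appId": "application_1764721607818_0002"
  },
  "state": "COMPLETE",
  "appStartTime": 1764721919988,
  "appFinishTime": 1764721925952,
  "runningThresholdAchieved": false
}
"""


def app_lifetime_ms(status):
    # Assumes both fields are epoch timestamps in milliseconds.
    return status["appFinishTime"] - status["appStartTime"]


status = json.loads(status_json)
print(f"{status['amInfo']['appName']} ran for {app_lifetime_ms(status)} ms")
```

The application finished roughly six seconds after it started, which is consistent with an application master that exits at launch rather than a daemon that ran for a while and then crashed.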
[yarn@odp-gateway01 ~]$ yarn logs -applicationId application_1764721607818_0002 > llap_yarn_crash.log
Container: container_e19_1764721607818_0002_02_000001 on odp-slave13.hadoop-ech.[REDACTED]_45454_1764721927072
LogAggregationType: AGGREGATED
============================================================================================================
LogType:directory.info
LogLastModifiedTime:Tue Dec 02 16:32:07 -0800 2025
LogLength:6864
LogContents:
ls -l:
total 36
........
Container: container_e19_1764721607818_0002_02_000001 on odp-slave13.hadoop-ech.[REDACTED]_45454_1764721927072
LogAggregationType: AGGREGATED
============================================================================================================
LogType:serviceam-err.txt
LogLastModifiedTime:Tue Dec 02 16:32:07 -0800 2025
LogLength:86
LogContents:
Error: Could not find or load main class org.apache.hadoop.yarn.service.ServiceMaster
End of LogType:serviceam-err.txt
**********************************************************************************
It looks like the JVM cannot find the Hadoop YARN services jar that contains the ServiceMaster class:
Error: Could not find or load main class org.apache.hadoop.yarn.service.ServiceMaster
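One way to check this is to scan the stack's jars for the class file. A minimal sketch, assuming the jars live somewhere under `/usr/odp/1.2.4.0-102` (the prefix taken from the `package.py` path above; the actual jar name and location are not known here). `find_jars_with_class` is a hypothetical helper, not part of Ambari or Hadoop:

```python
import os
import sys
import zipfile

# Path of the class inside a jar, derived from the name in the error message.
CLASS_ENTRY = "org/apache/hadoop/yarn/service/ServiceMaster.class"


def find_jars_with_class(root, class_entry=CLASS_ENTRY):
    """Return sorted paths of .jar files under root that contain class_entry."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".jar"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with zipfile.ZipFile(path) as jar:
                    if class_entry in jar.namelist():
                        matches.append(path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt archives (e.g. broken symlinks).
                continue
    return sorted(matches)


if __name__ == "__main__":
    # Default root is an assumption; pass another directory as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "/usr/odp/1.2.4.0-102"
    for jar_path in find_jars_with_class(root):
        print(jar_path)
```

If the scan finds a jar, the next thing to compare would be whether that path is on the classpath the service AM container is launched with; if it finds nothing, the jar is probably missing from the installation.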
My environment:
- Rocky Linux 9.3
- Python 3.9.16
- Postgres server 15.14
[root@odp-master01 ~]# dnf if ambari-server
Installed Packages
Name : ambari-server
Version : 2.7.11.0
Release : 158
Architecture : x86_64
Size : 847 M
Source : ambari-server-2.7.11.0-158.src.rpm
ODP 1.2.4.0 stack installed:
HDFS
YARN+MapReduce2
ZooKeeper
Ambari Metrics
HBase
Hive
Tez
Spark3
Kafka
Flink
Oozie
Infra Solr
Hue
No Kerberos
No Ranger/Atlas/Knox
Thank you for your help!