failure in HiveServer2 Interactive Start #142

@dsmftw


Hi there,

I'm having trouble starting HiveServer2 Interactive and LLAP on ODP 1.2.4.0 (possibly related to #46).

Once enabled and installed, the HiveServer2 Interactive service fails to start. The other Hive services (HMS and HS2) start without issues.

Ambari initially reported errors in
/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py
and
/usr/odp/1.2.4.0-102/hive/scripts/llap/yarn/package.py
(for the most part related to the Python 2 to Python 3 conversion). I fixed the Python 2 artifacts I found, and those errors were eventually resolved.

Now it appears that the LLAP application is able to start but immediately transitions to the COMPLETE state, and HS2 Interactive enters a failed state.

Error log in Ambari when starting/restarting HiveServer2 Interactive:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 502, in check_llap_app_status
    return self._verify_llap_app_status(llap_app_info, llap_app_name, return_immediately_if_stopped, curr_time)
  File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 566, in _verify_llap_app_status
    raise Fail(status_str)
resource_management.core.exceptions.Fail: LLAP app 'llap0' current state is COMPLETE.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 573, in <module>
    HiveServerInteractive().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 350, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 130, in start
    status = self._llap_start(env)
  File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 322, in _llap_start
    status = self.check_llap_app_status(params.llap_app_name, params.num_retries_for_checking_llap_status)
  File "/var/lib/ambari-agent/cache/stacks/ODP/1.0/services/HIVE/package/scripts/hive_server_interactive.py", line 504, in check_llap_app_status
    Logger.info(e.message)
AttributeError: 'Fail' object has no attribute 'message'

 stdout:
..............
2025-12-02 16:32:06,409 - INFO  [main:o.a.h.h.l.c.LlapStatusServiceDriver@862] - --------------------------------------------------------------------------------
2025-12-02 16:32:06,409 - INFO  [main:o.a.h.h.l.c.LlapStatusServiceDriver@865] - Awaiting LLAP launch
2025-12-02 16:32:06,409 - INFO  [main:o.a.h.h.l.c.LlapStatusServiceDriver@866] - --------------------------------------------------------------------------------
2025-12-02 16:32:06,409 - WARN  [main:o.a.h.h.l.c.LlapStatusServiceDriver@733] - COMPLETE state reached while waiting for RUNNING state. Failing.
Final diagnostics: null
2025-12-02 16:32:06,410 - INFO  [main:o.a.h.h.l.c.LlapStatusServiceDriver@865] - Awaiting LLAP launch
2025-12-02 16:32:06,410 - INFO  [main:o.a.h.h.l.c.LlapStatusServiceDriver@866] - --------------------------------------------------------------------------------
2025-12-02 16:32:06,410 - INFO  [main:o.a.h.h.l.c.LlapStatusServiceDriver@811] - 



{
  "amInfo" : {
    "appName" : "llap0",
    "appType" : "yarn-service",
    "appId" : "application_1764721607818_0002"
  },
  "state" : "COMPLETE",
  "appStartTime" : 1764721919988,
  "appFinishTime" : 1764721925952,
  "runningThresholdAchieved" : false
}
2025-12-02 16:32:06,615 - LLAP app 'llap0' current state is COMPLETE.

Command failed after 1 tries
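
The second traceback above (AttributeError on e.message at line 504) looks like one more Python 2 leftover: exceptions lost the .message attribute in Python 3, so the logging call itself fails and hides the original Fail. A minimal sketch of the portable pattern is below. The Fail class here is only a stand-in (I am assuming the real resource_management.core.exceptions.Fail behaves like a plain Exception subclass), and exception_text is a hypothetical helper name, not something from the Ambari scripts.

```python
class Fail(Exception):
    """Stand-in for resource_management.core.exceptions.Fail.

    Assumption: the real class behaves like a plain Exception subclass.
    """


def exception_text(exc):
    """Return the exception's message in a way that works on Python 2 and 3."""
    # BaseException.message does not exist in Python 3; str(exc) is portable.
    return str(exc)


def main():
    try:
        raise Fail("LLAP app 'llap0' current state is COMPLETE.")
    except Fail as e:
        # On Python 3 the attribute used by the failing line is absent.
        assert not hasattr(e, "message")
        print(exception_text(e))


if __name__ == "__main__":
    main()
```

Under that assumption, replacing Logger.info(e.message) with Logger.info(str(e)) in check_llap_app_status should at least let the real LLAP status error surface instead of the AttributeError. It does not address why the app reaches COMPLETE.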

[yarn@odp-gateway01 ~]$ yarn logs -applicationId application_1764721607818_0002 > llap_yarn_crash.log

Container: container_e19_1764721607818_0002_02_000001 on odp-slave13.hadoop-ech.[REDACTED]_45454_1764721927072
LogAggregationType: AGGREGATED
============================================================================================================
LogType:directory.info
LogLastModifiedTime:Tue Dec 02 16:32:07 -0800 2025
LogLength:6864
LogContents:
ls -l:
total 36

........

Container: container_e19_1764721607818_0002_02_000001 on odp-slave13.hadoop-ech.[REDACTED]_45454_1764721927072
LogAggregationType: AGGREGATED
============================================================================================================
LogType:serviceam-err.txt
LogLastModifiedTime:Tue Dec 02 16:32:07 -0800 2025
LogLength:86
LogContents:
Error: Could not find or load main class org.apache.hadoop.yarn.service.ServiceMaster

End of LogType:serviceam-err.txt
**********************************************************************************

It looks like the JVM that launches the YARN service application master cannot find the Hadoop YARN services JAR, which contains the ServiceMaster class:
Error: Could not find or load main class org.apache.hadoop.yarn.service.ServiceMaster
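
To narrow this down, one thing that can be checked on the NodeManager host is whether any JAR under the stack directory actually ships that class. The sketch below is only an illustration: find_class_in_jars is a hypothetical helper, and the root directory is taken from the package.py path earlier in this report, so it is an assumption that the relevant JARs live under it.

```python
import os
import zipfile


def find_class_in_jars(root_dir, class_name):
    """Return a sorted list of JAR paths under root_dir that contain class_name."""
    # A fully qualified class a.b.C is stored in a JAR as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.endswith(".jar"):
                continue
            jar_path = os.path.join(dirpath, filename)
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    if entry in jar.namelist():
                        matches.append(jar_path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable files, broken symlinks, and non-zip files.
                continue
    return sorted(matches)


if __name__ == "__main__":
    # Assumed stack root, based on the paths quoted in this report.
    root = "/usr/odp/1.2.4.0-102"
    cls = "org.apache.hadoop.yarn.service.ServiceMaster"
    for path in find_class_in_jars(root, cls):
        print(path)
```

If this prints nothing, the JAR is probably missing from the install. If it prints a path, the JAR exists and the problem is more likely that it is not on the classpath used to launch the service AM.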

My environment:

  • Rocky Linux 9.3
  • Python 3.9.16
  • Postgres server 15.14
[root@odp-master01 ~]# dnf if ambari-server
Installed Packages
Name         : ambari-server
Version      : 2.7.11.0
Release      : 158
Architecture : x86_64
Size         : 847 M
Source       : ambari-server-2.7.11.0-158.src.rpm

ODP 1.2.4.0 stack installed:
HDFS
YARN+MapReduce2
ZooKeeper
Ambari Metrics
HBase
Hive
Tez
Spark3
Kafka
Flink
Oozie
Infra Solr
Hue

No Kerberos
No Ranger/Atlas/Knox

Thank you for your help!
