Skip to content

[Bug]: Append to stream projects/my-project/datasets/my-dataset/tables/my-table/streams/_default failed with Status Code NOT_FOUND. The stream may not exist. #37140

@mkuthan

Description

@mkuthan

What happened?

While the streaming job was starting, I observed the following exception from the workers:

Error message from worker: java.lang.RuntimeException: Append to stream projects/my-project/datasets/my-dataset/tables/my-table/streams/_default failed with Status Code NOT_FOUND. The stream may not exist.
        org.apache.beam.sdk.io.gcp.bigquery.StorageApiWriteUnshardedRecords$WriteRecordsDoFn$DestinationState.lambda$flush$8(StorageApiWriteUnshardedRecords.java:848)
        org.apache.beam.sdk.io.gcp.bigquery.RetryManager.await(RetryManager.java:311)
        org.apache.beam.sdk.io.gcp.bigquery.StorageApiWriteUnshardedRecords$WriteRecordsDoFn.flushAll(StorageApiWriteUnshardedRecords.java:1053)
        org.apache.beam.sdk.io.gcp.bigquery.StorageApiWriteUnshardedRecords$WriteRecordsDoFn.finishBundle(StorageApiWriteUnshardedRecords.java:1209)
Caused by: com.google.api.gax.rpc.NotFoundException: io.grpc.StatusRuntimeException: NOT_FOUND: Requested entity was not found. Entity: projects/my-project/datasets/my-dataset/tables/my-table/streams/_default
        com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:90)
        com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41)
        com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86)
        com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
        com.google.api.gax.grpc.ExceptionResponseObserver.onErrorImpl(ExceptionResponseObserver.java:82)
        com.google.api.gax.rpc.StateCheckingResponseObserver.onError(StateCheckingResponseObserver.java:84)
        com.google.api.gax.grpc.GrpcDirectStreamController$ResponseObserverAdapter.onClose(GrpcDirectStreamController.java:148)
        io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
        io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
        io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
        com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:569)
        io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
        io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
        io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
        com.google.api.gax.grpc.GrpcLoggingInterceptor$1$1.onClose(GrpcLoggingInterceptor.java:98)
        io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
        io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
        io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
        io.grpc.census.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:814)
        io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
        io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
        io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
        io.grpc.census.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:494)
        io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:565)
        io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:72)
        io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:733)
        io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:714)
        io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
        io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
        java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        java.base/java.lang.Thread.run(Thread.java:833)

After 10 seconds, the processing stabilized on its own and was working correctly.
Looks like a race-condition in the BigQuery Storage Write API IO initialization phase.

  • Apache Beam version: 2.68
  • Dataflow runner
  • Streaming engine enabled
  • At least once mode

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions