首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >在嵌入式连接器中使用pub/sub运行束流管道时出错(apache_beam [GCP])

在嵌入式连接器中使用pub/sub运行束流管道时出错(apache_beam [GCP])
EN

Stack Overflow用户
提问于 2021-07-12 04:58:31
回答 1查看 565关注 0票数 6

在Apache上运行流管道(python)时,我面临以下错误。该管道包含一个GCP pub/sub源和pub/sub目标。

代码语言:javascript
复制
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.6 interpreter.

ERROR:root:java.lang.IllegalArgumentException: PCollectionNodes [PCollectionNode{id=ref_PCollection_PCollection_1, PCollection=unique_name: "23 Read from Pub/Sub/Read.None"
coder_id: "ref_Coder_BytesCoder_1"
is_bounded: UNBOUNDED
windowing_strategy_id: "ref_Windowing_Windowing_1"
}] were consumed but never produced
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/usr/local/lib64/python3.6/site-packages/apache_beam/pipeline.py", line 586, in __exit__
    self.result.wait_until_finish()
  File "/usr/local/lib64/python3.6/site-packages/apache_beam/runners/portability/portable_runner.py", line 599, in wait_until_finish
    raise self._runtime_exception
RuntimeError: Pipeline BeamApp-swarna0kpaul-0712135603-763999c_45da372e-757d-4690-8e25-1a5ed0a5cc84 failed in state FAILED: java.lang.IllegalArgumentException: PCollectionNodes [PCollectionNode{id=ref_PCollection_PCollection_1, PCollection=unique_name: "23 Read from Pub/Sub/Read.None"
coder_id: "ref_Coder_BytesCoder_1"
is_bounded: UNBOUNDED
windowing_strategy_id: "ref_Windowing_Windowing_1"
}] were consumed but never produced

我试图在Python中运行以下代码,我试图使用我在GCP帐户中创建的两个发布/子主题({input },{output })运行- projects/{project }/ topics /{ topics }

代码语言:javascript
复制
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
input_topic=<input topic>
output_topic=<output topic>
options = PipelineOptions(["--runner=FlinkRunner", "--checkpointing_interval=1000","--streaming"])
with beam.Pipeline(options=options ) as pipeline:
  input1 = pipeline | " Read from Pub/Sub" >> beam.io.ReadFromPubSub(topic=input_topic).with_output_types(bytes)
  output = (input1
            |beam.WindowInto(beam.transforms.window.FixedWindows(5))
            |"Write to Pub/Sub" >>beam.io.WriteToPubSub(topic=output_topic, with_attributes=False).with_input_types(bytes))

以下版本的软件可在系统中使用

代码语言:javascript
复制
Python 3.6.8
apache_beam [gcp]==2.30.0
java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)

我试着按照页面中的规范使用flink集群和可移植的flink运行器来运行,但是得到了相同的错误。

当我使用以下选项时,相同的代码运行良好

代码语言:javascript
复制
options = PipelineOptions(["--streaming"])
EN

回答 1

Stack Overflow用户

发布于 2022-05-20 14:36:46

apache_beam.io.ReadFromPubsub()转换只适用于DirectRunner和Dataflow Runner,但是有一个外部转换可以尝试:apache_beam.io.external.gcp.pubsub.ReadFromPubSub,请参阅:beam/io/external/gcp/pubsub.py#L39

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/68342095

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档