Background
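FlinkKafkaConsumer can emit a watermark when it receives a particular record from a Kafka partition, for example a special record that marks the end of a complete record. This article walks through the source code that emits these punctuated watermarks.
Source code walkthrough: emitting punctuated watermarks
1. Start with the runFetchLoop method in KafkaFetcher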
```java
public void runFetchLoop() throws Exception {
    try {
        // kick off the actual Kafka consumer
        consumerThread.start();

        while (running) {
            // this blocks until we get the next records
            // it automatically re-throws exceptions encountered in the consumer thread
            final ConsumerRecords<byte[], byte[]> records = handover.pollNext();

            // get the records for each topic partition
            for (KafkaTopicPartitionState<T, TopicPartition> partition :
                    subscribedPartitionStates()) {

                List<ConsumerRecord<byte[], byte[]>> partitionRecords =
                        records.records(partition.getKafkaPartitionHandle());

                // called for every partition consumed by this operator subtask
                partitionConsumerRecordsHandler(partitionRecords, partition);
            }
        }
    } finally {
        // this signals the consumer thread that no more work is to be done
        consumerThread.shutdown();
    }
    // ... rest of the method omitted
}
```

2. Next, look at partitionConsumerRecordsHandler, which handles the watermark of every partition assigned to the current operator subtask. It delegates to emitRecordsWithTimestamps:

```java
protected void emitRecordsWithTimestamps(
        Queue<T> records,
        KafkaTopicPartitionState<T, KPH> partitionState,
        long offset,
        long kafkaEventTimestamp) {
    // emit the records, using the checkpoint lock to guarantee
    // atomicity of record emission and offset state update
    synchronized (checkpointLock) {
        T record;
        while ((record = records.poll()) != null) {
            long timestamp = partitionState.extractTimestamp(record, kafkaEventTimestamp);
            // send the Kafka record to the downstream operator
            sourceContext.collectWithTimestamp(record, timestamp);

            // this might emit a watermark, so do it after emitting the record
            // handle the partition's watermark: record it for this partition and,
            // when the conditions are met, update the watermark of the whole operator subtask
            partitionState.onEvent(record, timestamp);
        }
        partitionState.setOffset(offset);
    }
}
```

3. Handling the watermark of each partition

```java
// partition state: forwards the event to the watermark generator
public void onEvent(T event, long timestamp) {
    watermarkGenerator.onEvent(event, timestamp, immediateOutput);
}

// watermark generator: asks the punctuated assigner (wms) whether this event yields a watermark
public void onEvent(T event, long eventTimestamp, WatermarkOutput output) {
    final org.apache.flink.streaming.api.watermark.Watermark next =
            wms.checkAndGetNextWatermark(event, eventTimestamp);

    if (next != null) {
        output.emitWatermark(new Watermark(next.getTimestamp()));
    }
}
```

Here the call output.emitWatermark(new Watermark(next.getTimestamp())) resolves to the following method:

```java
public void emitWatermark(Watermark watermark) {
    long timestamp = watermark.getTimestamp();
    // update the watermark of this partition and learn whether it actually advanced
    boolean wasUpdated = state.setWatermark(timestamp);

    // if it's higher than the max watermark so far we might have to update the
    // combined watermark. The combined watermark is the minimum across the subtask's
    // partitions, i.e. the operator-subtask-level watermark, not a per-partition one.
    if (wasUpdated && timestamp > combinedWatermark) {
        updateCombinedWatermark();
    }
}

// the per-partition watermark is updated as follows
public boolean setWatermark(long watermark) {
    this.idle = false;
    final boolean updated = watermark > this.watermark;
    this.watermark = Math.max(watermark, this.watermark);
    return updated;
}
```
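The per-partition step can be tried out in isolation. The following is a minimal sketch, not Flink code: `PunctuatedSketch`, `Event`, `endMarker` and `run` are hypothetical names made up for illustration, and only the shape of `checkAndGetNextWatermark` (return null for "no watermark") and `setWatermark` follows the methods discussed in this step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of the per-partition step; none of these types are Flink classes. */
class PunctuatedSketch {

    /** Hypothetical record type: endMarker flags the special "end of record" event. */
    static final class Event {
        final String payload;
        final long timestamp;
        final boolean endMarker;

        Event(String payload, long timestamp, boolean endMarker) {
            this.payload = payload;
            this.timestamp = timestamp;
            this.endMarker = endMarker;
        }
    }

    /** Plays the role of checkAndGetNextWatermark: null means "no watermark for this event". */
    static Long checkAndGetNextWatermark(Event event, long extractedTimestamp) {
        return event.endMarker ? Long.valueOf(extractedTimestamp) : null;
    }

    /** Per-partition state; setWatermark only reports true when the watermark advances. */
    static final class PartitionState {
        long watermark = Long.MIN_VALUE;

        boolean setWatermark(long candidate) {
            boolean updated = candidate > watermark;
            watermark = Math.max(candidate, watermark);
            return updated;
        }
    }

    /** Feeds events through the generator and collects the watermarks that advanced the partition. */
    static List<Long> run(List<Event> events) {
        PartitionState state = new PartitionState();
        List<Long> advanced = new ArrayList<>();
        for (Event e : events) {
            Long next = checkAndGetNextWatermark(e, e.timestamp);
            if (next != null && state.setWatermark(next)) {
                advanced.add(next);
            }
        }
        return advanced;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("a", 100L, false),
                new Event("b", 150L, false),
                new Event("END", 200L, true),
                new Event("c", 180L, false),
                new Event("END", 190L, true),  // older marker: does not advance the watermark
                new Event("END", 300L, true));
        System.out.println(run(events)); // prints [200, 300]
    }
}
```

Ordinary records never move the partition's watermark in this model; only marker records do, and a marker with an older timestamp is ignored because the watermark never goes backwards.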
4. Finally, the method that emits the operator-subtask-level watermark
```java
private void updateCombinedWatermark() {
    long minimumOverAllOutputs = Long.MAX_VALUE;

    boolean hasOutputs = false;
    boolean allIdle = true;
    for (OutputState outputState : watermarkOutputs) {
        if (!outputState.isIdle()) {
            minimumOverAllOutputs = Math.min(minimumOverAllOutputs, outputState.getWatermark());
            allIdle = false;
        }
        hasOutputs = true;
    }

    // if we don't have any outputs minimumOverAllOutputs is not valid, it's still
    // at its initial Long.MAX_VALUE state and we must not emit that
    if (!hasOutputs) {
        return;
    }

    if (allIdle) {
        underlyingOutput.markIdle();
    } else if (minimumOverAllOutputs > combinedWatermark) {
        combinedWatermark = minimumOverAllOutputs;
        underlyingOutput.emitWatermark(new Watermark(minimumOverAllOutputs));
    }
}
```

Looking at this flow, you might ask whether it means the punctuated approach does not support idleness (an idle timeout). The answer is yes. In the path shown here, a partition's watermark only moves when one of its records triggers a watermark, and nothing marks a silent partition as idle, so that partition keeps holding back the combined minimum.
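To make that conclusion concrete, here is a small self-contained model of the combined-watermark logic. It is a sketch under assumptions rather than Flink's implementation: `CombinedWatermarkSketch`, `register`, and the `emitted` list (standing in for `underlyingOutput`) are hypothetical, and `markIdle` is included only to show the contrast, since nothing in the punctuated path walked through here calls such a hook.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the combined-watermark logic; not Flink's actual classes. */
class CombinedWatermarkSketch {

    /** Per-partition output state, modelled on the fields used in the walkthrough. */
    static final class OutputState {
        long watermark = Long.MIN_VALUE;
        boolean idle = false;

        boolean setWatermark(long candidate) {
            idle = false;
            boolean updated = candidate > watermark;
            watermark = Math.max(candidate, watermark);
            return updated;
        }
    }

    final List<OutputState> outputs = new ArrayList<>();
    final List<Long> emitted = new ArrayList<>(); // stands in for underlyingOutput
    long combinedWatermark = Long.MIN_VALUE;

    OutputState register() {
        OutputState state = new OutputState();
        outputs.add(state);
        return state;
    }

    /** Update one partition, then recompute the combined value if it might have moved. */
    void emitWatermark(OutputState state, long timestamp) {
        boolean wasUpdated = state.setWatermark(timestamp);
        if (wasUpdated && timestamp > combinedWatermark) {
            updateCombinedWatermark();
        }
    }

    /** Hypothetical hook, present only to show what idleness support would change. */
    void markIdle(OutputState state) {
        state.idle = true;
        updateCombinedWatermark();
    }

    void updateCombinedWatermark() {
        long min = Long.MAX_VALUE;
        boolean hasOutputs = false;
        boolean allIdle = true;
        for (OutputState s : outputs) {
            if (!s.idle) {
                min = Math.min(min, s.watermark);
                allIdle = false;
            }
            hasOutputs = true;
        }
        if (!hasOutputs || allIdle) {
            return; // this sketch emits nothing when there are no active outputs
        }
        if (min > combinedWatermark) {
            combinedWatermark = min;
            emitted.add(min);
        }
    }

    public static void main(String[] args) {
        CombinedWatermarkSketch mux = new CombinedWatermarkSketch();
        OutputState busy = mux.register();
        OutputState silent = mux.register();

        mux.emitWatermark(busy, 100L);
        mux.emitWatermark(busy, 200L);
        System.out.println(mux.emitted); // prints [] because the silent partition pins the minimum

        mux.markIdle(silent);
        System.out.println(mux.emitted); // prints [200]
    }
}
```

As long as the silent partition is neither advanced nor marked idle, its initial watermark stays the minimum and the subtask emits nothing, which matches the behaviour described above for punctuated watermarks.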