Understanding Dataflow – what it can and cannot do

Google Cloud Dataflow is a popular technology these days for building streaming data pipelines. However, it is useful to remember what it can and cannot do.

What Dataflow can do:

In the above, x1, x2 and x3 are three streaming inputs and T is the time window. f cannot be an arbitrary function; there are constraints on what f can be.
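To make this concrete, here is a minimal plain-Python sketch (not Dataflow code) of a windowed computation over three streams. The post does not spell out the constraints on f; the sketch assumes one common one, namely that f is an associative and commutative combiner such as a sum, which is what combine-style transforms generally expect. The function name `fixed_window_combine`, the sample data and the 10-second window are all hypothetical.

```python
from collections import defaultdict


def fixed_window_combine(streams, window_size, combine):
    """Group (timestamp, value) events from several streams into fixed
    windows of length window_size and reduce each window with combine."""
    windows = defaultdict(list)
    for stream in streams:
        for timestamp, value in stream:
            # The window key is the start time of the window the event falls in.
            start = (timestamp // window_size) * window_size
            windows[start].append(value)
    return {start: combine(values) for start, values in sorted(windows.items())}


# Hypothetical sample data: three streams of (timestamp_seconds, value).
x1 = [(1, 10), (12, 20)]
x2 = [(3, 1), (14, 2)]
x3 = [(7, 5), (25, 7)]

# f is assumed to be sum: associative and commutative, so the order in
# which events arrive within a window does not change the result.
result = fixed_window_combine([x1, x2, x3], window_size=10, combine=sum)
print(result)  # {0: 16, 10: 22, 20: 7}
```

Because sum does not care about ordering or grouping, each window can be reduced piece by piece as events arrive. A function that needs every event of a window in a particular order would not fit this pattern as easily.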

What Dataflow cannot do:

\Psi is reference data held in an external database. It is not available as a stream, and it also evolves with t. g is an arbitrary function.
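The shape of this second computation can also be sketched in plain Python (again, not Dataflow code). Everything here is an illustrative assumption: `VersionedReference` is a hypothetical in-memory stand-in for the external database, the "rate" field is invented, and g is an arbitrary example function. The point is only that each event must be evaluated against the reference data as it stood at that event's time.

```python
import bisect


class VersionedReference:
    """Hypothetical stand-in for Psi: reference data that changes over time."""

    def __init__(self):
        self._times = []
        self._snapshots = []

    def update(self, effective_time, snapshot):
        # Assumes updates are recorded in increasing effective_time order.
        self._times.append(effective_time)
        self._snapshots.append(snapshot)

    def as_of(self, t):
        # Return the latest snapshot whose effective time is <= t.
        i = bisect.bisect_right(self._times, t) - 1
        if i < 0:
            raise LookupError(f"no reference data available at t={t}")
        return self._snapshots[i]


def g(value, reference):
    # Arbitrary example function of an event value and the reference data.
    return value * reference["rate"]


psi = VersionedReference()
psi.update(0, {"rate": 2})
psi.update(10, {"rate": 3})

# Hypothetical (timestamp, value) events; each one needs Psi as of its own t.
events = [(5, 100), (15, 100)]
outputs = [g(value, psi.as_of(t)) for t, value in events]
print(outputs)  # [200, 300]
```

The two events carry the same value but produce different outputs, because the reference data changed between them. That time dependence on data living outside the stream is what makes this case awkward.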
