Tag Archives: dataflow

Understanding Dataflow – what it can and cannot do

Google Cloud Dataflow is a popular technology these days for building streaming data pipelines. However, it is useful to remember what it can and cannot do. What Dataflow can do: In the above, x1, x2, and x3 are three streaming inputs. … Continue reading

Posted in Computers, programming, Software

Deleting Entities in Bulk from Google Datastore

The easiest way to do this seems to be to use Dataflow. Here is a sample Dataflow job to delete all entities of kind foo in namespace bar: As an example, a job to delete 44,951,022 entities with default autoscaling took 1 hr … Continue reading

Posted in Computers, programming, Software