python - Exporting Many BigQuery Tables to One
We are kicking off BigQuery queries that pull data from many tables and write the results to a single table. We're using Python to run the jobs with these parameters:
job = bq_client.run_async_query(jobname, select_query)
job.destination = bq_table
job.write_disposition = 'WRITE_APPEND'
500 jobs are kicked off, 20 at a time (roughly as in the dispatch sketch below). Unfortunately, we periodically run into this error:
Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors
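For context, here is a minimal sketch of how the "20 at a time" dispatch might look, assuming the per-job setup above is wrapped in a helper (run_one_job is a hypothetical name) and concurrent.futures is used to cap the number of jobs in flight:

from concurrent.futures import ThreadPoolExecutor, as_completed

def run_one_job(jobname, select_query):
    # Hypothetical wrapper: create the query job exactly as in the snippet
    # above (destination table, WRITE_APPEND), start it, and wait for it to
    # finish. Left unimplemented here to keep the sketch client-agnostic.
    raise NotImplementedError

def run_all(jobs):
    # jobs: iterable of (jobname, select_query) tuples, ~500 in total
    with ThreadPoolExecutor(max_workers=20) as pool:
        futures = [pool.submit(run_one_job, name, query) for name, query in jobs]
        for future in as_completed(futures):
            future.result()  # re-raises errors such as the rate-limit error above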
Questions:
- Is this due to too many writes to a single destination table from BigQuery jobs/queries? If so, I'm surprised that Google does not throttle the write throughput of BigQuery jobs internally.
- If that is the case, I have a hard time gauging how many jobs can be executed in parallel, because there are large differences in the amount of data each select query loads. How can we avoid or handle these errors? Backoff and retry (see the sketch after this list)?
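To illustrate the backoff-and-retry idea (an assumption on my part, not a confirmed fix), here is a minimal sketch that retries a single job submission with exponential backoff plus jitter when a rate-limit style error comes back; submit_job is a hypothetical zero-argument callable that creates, starts, and waits for one query job:

import random
import time

def run_with_backoff(submit_job, max_attempts=5, base_delay=1.0):
    for attempt in range(max_attempts):
        try:
            return submit_job()
        except Exception as exc:  # ideally narrow this to the client's rate-limit exception
            if "rate limits" not in str(exc).lower() or attempt == max_attempts - 1:
                raise
            # exponential backoff with jitter: ~1s, 2s, 4s, ... plus up to 1s of noise
            time.sleep(base_delay * (2 ** attempt) + random.random())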
Edit:
The queries are against Google Analytics 360 data. For example:
SELECT ... FROM [{datasetname}.ga_sessions_{date}];
where each {datasetname} is a different dataset, and we are querying across many dates.
I can't share the calculations against the columns, but perhaps this is enough to come up with a way to consolidate these queries.
Assuming the SELECT list is the same across all queries:
For BigQuery Standard SQL:
#standardSQL
SELECT ... FROM `{datasetname1}.ga_sessions_{date}` UNION ALL
SELECT ... FROM `{datasetname2}.ga_sessions_{date}` UNION ALL
...
SELECT ... FROM `{datasetnamen}.ga_sessions_{date}`
For BigQuery Legacy SQL:
#legacySQL
SELECT ... FROM
  (SELECT ... FROM [{datasetname1}.ga_sessions_{date}]),
  (SELECT ... FROM [{datasetname2}.ga_sessions_{date}]),
  ...
  (SELECT ... FROM [{datasetnamen}.ga_sessions_{date}])
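Not part of the original answer, but here is a small sketch of how the consolidated Standard SQL query could be built in Python under the same "identical select list" assumption; build_union_query, dataset_names, and the example column list are made-up names for illustration:

def build_union_query(select_list, dataset_names, date):
    parts = [
        "SELECT {cols} FROM `{ds}.ga_sessions_{date}`".format(
            cols=select_list, ds=ds, date=date)
        for ds in dataset_names
    ]
    return "#standardSQL\n" + "\nUNION ALL\n".join(parts)

# Example: three datasets collapse into one query, and therefore one job
# against the destination table instead of three table update operations.
print(build_union_query("visitId, totals.visits", ["ds_a", "ds_b", "ds_c"], "20170601"))

The point of the consolidation is that the unioned query touches the destination table through a single job, so N per-dataset jobs become one table update operation.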