Google BigQuery

Google Cloud BigQuery

  • A fully managed data warehouse into which you can load petabyte-scale datasets and query them with SQL.

Features

  • Cloud BigQuery is a serverless data warehousing technology.
  • It integrates with the Apache big data ecosystem, allowing Hadoop, Spark, and Beam workloads to read data from and write data to BigQuery directly through the Storage API.
  • BigQuery supports a standard SQL dialect that is ANSI SQL:2011 compliant, which reduces the need for code rewrites.
  • BigQuery automatically replicates data and keeps a seven-day history of changes, which makes it possible to restore data and to compare data from different points in time.
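
The seven-day history can be read with a time travel query that uses the FOR SYSTEM_TIME AS OF clause. The following is a minimal sketch that only builds the SQL text; the helper name and the table name are hypothetical, and the 168-hour limit simply restates the seven-day window described above.

```python
def time_travel_query(table: str, hours_ago: int) -> str:
    """Build a query that reads `table` as it was `hours_ago` hours in the past."""
    # The seven-day history window is 7 * 24 = 168 hours.
    if not 0 <= hours_ago <= 7 * 24:
        raise ValueError("hours_ago must be between 0 and 168")
    return (
        f"SELECT * FROM `{table}` "
        f"FOR SYSTEM_TIME AS OF "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {hours_ago} HOUR)"
    )


# Hypothetical table name, for illustration only.
print(time_travel_query("my_project.my_dataset.orders", 24))
```

Running the generated statement against the current table and comparing the two result sets is one way to see what changed in the last day.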

Loading data into BigQuery

Before you can query data stored in BigQuery, you must load it. To do this, you can:

  • Load a set of data records from Cloud Storage or from a local file. The records can be in Avro, CSV, JSON (newline delimited only), ORC, or Parquet format.
  • Export data from Datastore or Firestore and load the exported data into BigQuery.
  • Load data from other Google services, such as:
    • Google Ad Manager
    • Google Ads
    • Google Play
    • Cloud Storage
    • YouTube Channel reports
    • YouTube Content Owner reports
  • Stream data one record at a time using streaming inserts.
  • Write data from a Dataflow pipeline to BigQuery.
  • Use DML statements to perform bulk inserts. Note that BigQuery charges for DML queries. See Data Manipulation Language pricing.
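
Because JSON input must be newline delimited, records usually need to be written one object per line before they are loaded. The sketch below uses only the Python standard library; the function name and the sample records are hypothetical.

```python
import json


def to_ndjson(records):
    """Serialize an iterable of dicts as newline-delimited JSON, one object per line."""
    # json.dumps emits no line breaks by default, so each record stays on one line.
    return "".join(json.dumps(record) + "\n" for record in records)


# Hypothetical sample records, for illustration only.
rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]
ndjson = to_ndjson(rows)
print(ndjson, end="")
```

The resulting text can be saved to a local file or uploaded to Cloud Storage and then used as the source of a load job.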

Querying from external data sources

  • BigQuery offers support for querying data directly from:
    • Cloud Bigtable
    • Cloud Storage
    • Cloud SQL
  • Supported formats are:
    • Avro
    • CSV
    • JSON (newline delimited only)
    • ORC
    • Parquet
  • To query data in an external source, you must create an external table definition file that contains the schema definition and metadata.
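
A table definition file is a JSON document. The sketch below builds one for CSV files in Cloud Storage. The key names (sourceFormat, sourceUris, schema.fields) follow the layout that the bq command-line tool generates, as far as we are aware; treat them as an assumption and check the current documentation. The helper name, bucket, and column names are hypothetical.

```python
import json


def external_table_definition(source_format, source_uris, fields):
    """Return a table definition dict: data format, data location, and schema."""
    return {
        "sourceFormat": source_format,
        "sourceUris": list(source_uris),
        "schema": {
            # Each field is described by a column name and a column type.
            "fields": [{"name": name, "type": type_} for name, type_ in fields]
        },
    }


# Hypothetical bucket and columns, for illustration only.
definition = external_table_definition(
    "CSV",
    ["gs://example-bucket/sales/*.csv"],
    [("order_id", "INTEGER"), ("amount", "FLOAT")],
)
print(json.dumps(definition, indent=2))
```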

Monitoring

  • BigQuery creates log entries for actions such as creating or deleting a table, purchasing slots, or running a load job.

Pricing

  • On-demand pricing lets you pay only for the storage and compute that you use.
  • Flat-rate pricing with reservations lets high-volume users pay a predictable, fixed price for their workloads.
  • To estimate query costs, it is best practice to obtain the estimated bytes read, either from the query validator in the Cloud Console or by submitting a query job through the API with the dryRun parameter. Enter that figure in the Pricing Calculator to calculate the query cost.
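
As a rough sketch, the dry-run estimate can be turned into a cost figure as follows. The helper names are hypothetical, the price passed in is a placeholder rather than a published rate, and billing per TiB is an assumption; billing minimums and free-tier allowances are ignored. The dry-run helper assumes the google-cloud-bigquery package and valid credentials, so it is defined here but not called.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_query_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Convert an estimated byte count into an on-demand cost estimate."""
    return bytes_processed / TIB * price_per_tib


def dry_run_bytes(sql: str) -> int:
    """Return the bytes a query would process, without running it.

    Needs the google-cloud-bigquery package and credentials, so the import
    is kept local and the rest of this sketch runs without them.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)
    return job.total_bytes_processed


# 512 GiB is half a TiB; 5.0 is a placeholder price, not a quoted rate.
print(estimate_query_cost(512 * 2 ** 30, price_per_tib=5.0))
```

Look up the current on-demand rate for your region before relying on the result, or enter the byte count in the Pricing Calculator.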

References:
https://cloud.google.com/bigquery
https://cloud.google.com/bigquery/docs/introduction
