Google BigQuery SQL Statement


I am trying to pull some data from the GitHub Archive using Google BigQuery. The amount of data I am currently requesting is too much for BigQuery to process (at least on the free tier), so I am trying to limit the scope of my request.

I want to limit the data so that historical data is returned only for repositories that currently have more than 1000 stars. This is more complicated than just saying repository_watchers > 1000, because that would exclude the historical data for the first 1000 stars a repository gained.

  SELECT repository_name, repository_owner, created_at, type, repository_url, repository_watchers
  FROM [githubarchive:github.timeline]
  WHERE type = "WatchEvent"
  ORDER BY created_at DESC

EDIT: The solution I used (based on the answer from @Brian):

  SELECT y.repository_name, y.repository_owner, y.created_at, y.type, y.repository_url, y.repository_watchers
  FROM [githubarchive:github.timeline] y
  JOIN (
    SELECT repository_url, MAX(repository_watchers)
    FROM [githubarchive:github.timeline]
    WHERE type = 'WatchEvent'
    GROUP BY repository_url
    HAVING MAX(repository_watchers) > 1000
  ) x
  ON y.repository_url = x.repository_url
  WHERE y.type = 'WatchEvent'
  ORDER BY y.repository_name, y.repository_owner, y.created_at DESC

Try:

  select y.*
  from [githubarchive:github.timeline] y
  join (
    select repository_name, max(repository_watchers)
    from [githubarchive:github.timeline]
    where type = 'WatchEvent'
    group by repository_name
    having max(repository_watchers) > 1000
  ) x
  on y.repository_name = x.repository_name
  order by y.created_at desc

If that syntax is not supported, you can use a 3-step solution like this:

Step 1: Find the repository_name values that have at least one record with a repository_watchers count > 1000

  select repository_name, max(repository_watchers) as curr_watchers
  from [githubarchive:github.timeline]
  where type = 'WatchEvent'
  group by repository_name
  having max(repository_watchers) > 1000

Step 2: Store that result as a table; call it SUB

Step 3: Run the following against SUB (and your original table)

  select y.*
  from [githubarchive:github.timeline] y
  join sub x on y.repository_name = x.repository_name
  order by y.created_at desc
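
To see why joining against the aggregated repository list behaves differently from a plain repository_watchers > 1000 filter, here is a minimal sketch that can be run locally. It rests on assumptions not in the original post: Python's built-in sqlite3 stands in for BigQuery, the table is called timeline, and the sample rows are invented purely for illustration. SQLite's dialect differs from BigQuery's, so treat this as a demonstration of the query shape rather than something to paste into BigQuery.

```python
import sqlite3

# Invented sample rows standing in for [githubarchive:github.timeline].
# Columns: repository_name, type, repository_watchers, created_at
ROWS = [
    ("big-repo", "WatchEvent", 1, "2012-01-01"),
    ("big-repo", "WatchEvent", 900, "2012-06-01"),
    ("big-repo", "WatchEvent", 1500, "2013-01-01"),
    ("small-repo", "WatchEvent", 10, "2012-03-01"),
    ("small-repo", "WatchEvent", 40, "2013-02-01"),
]


def build_db():
    """Create an in-memory SQLite database holding the toy timeline table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE timeline ("
        "repository_name TEXT, type TEXT, "
        "repository_watchers INTEGER, created_at TEXT)"
    )
    conn.executemany("INSERT INTO timeline VALUES (?, ?, ?, ?)", ROWS)
    return conn


# Naive filter: keeps only events logged after the repository passed 1000 stars.
NAIVE = """
SELECT repository_name, repository_watchers
FROM timeline
WHERE type = 'WatchEvent' AND repository_watchers > 1000
ORDER BY created_at DESC
"""

# Join against the repositories whose maximum watcher count exceeds 1000,
# so every WatchEvent for those repositories is kept, including early ones.
JOINED = """
SELECT y.repository_name, y.repository_watchers
FROM timeline y
JOIN (
    SELECT repository_name, MAX(repository_watchers) AS curr_watchers
    FROM timeline
    WHERE type = 'WatchEvent'
    GROUP BY repository_name
    HAVING MAX(repository_watchers) > 1000
) x ON y.repository_name = x.repository_name
WHERE y.type = 'WatchEvent'
ORDER BY y.created_at DESC
"""

conn = build_db()
naive_rows = conn.execute(NAIVE).fetchall()
joined_rows = conn.execute(JOINED).fetchall()

print("naive :", naive_rows)
print("joined:", joined_rows)
```

With this toy data the naive filter returns only the single big-repo event recorded above 1000 watchers, while the joined query returns all three big-repo events and none from small-repo.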
