Archivarix · Echo

code-archive

GH Archive

A running record of public activity on GitHub: the public event timeline has been archived hourly since 2011, amounting to billions of events. It is a bulk dataset rather than a search site — you download gzipped JSON files per hour or query the public BigQuery dataset; there is no per-record search API or web search form. Researchers reach for it to analyze or reconstruct historical GitHub activity offline.

Search this archive

No programmatic check. The link opens the archive’s own site; GH Archive has no search form, so lookups go through the hourly files or BigQuery.

Why it’s useful & how it works

GH Archive is a bulk dataset with no per-record query API. It is reachable via proxy. It suits offline or BigQuery analysis, not handing someone a link to a single record.

What’s inside

Billions of GitHub events.

API access

Bulk gzipped JSON, one file per hour: https://data.gharchive.org/YYYY-MM-DD-H.json.gz (H is the hour, 0 to 23, without zero-padding). Also available as a BigQuery public dataset.
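As a minimal sketch of working with the hourly files, the Python below builds the URL for a given hour and tallies event types from an already-downloaded file. It assumes each file holds one JSON event per line and that each event carries a "type" field such as "PushEvent". The helper names (hourly_url, iter_events, count_event_types) are hypothetical and not part of any GH Archive tooling.

```python
import gzip
import io
import json
from collections import Counter
from datetime import datetime, timezone
from typing import Iterator

BASE_URL = "https://data.gharchive.org"  # host taken from the URL pattern above


def hourly_url(when: datetime) -> str:
    """Build the archive URL for the hour containing `when`."""
    # Timezone-aware datetimes are converted to UTC; naive ones are
    # assumed to be UTC already.
    if when.tzinfo is not None:
        when = when.astimezone(timezone.utc)
    # Assumption: the hour is written without zero-padding (0..23).
    return f"{BASE_URL}/{when:%Y-%m-%d}-{when.hour}.json.gz"


def iter_events(gz_bytes: bytes) -> Iterator[dict]:
    """Yield one event dict per non-empty line of a gzipped JSON-lines payload."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_event_types(gz_bytes: bytes) -> Counter:
    """Tally events by their "type" field (assumed present on each event)."""
    return Counter(event.get("type", "unknown") for event in iter_events(gz_bytes))
```

Downloading is left out on purpose. The bytes could come from any HTTP client pointed at hourly_url(...), or from a file already on disk. Hourly files can be large, so a real pipeline would probably stream them rather than hold each one in memory.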

Access

Freely reachable — no key, login, or captcha.

Homepage

https://www.gharchive.org/