Nodes represent web pages and directed edges represent hyperlinks between them. The data was released by Google in 2002 as part of the Google Programming Contest.
Dataset statistics | |
---|---|
Nodes | 875713 |
Edges | 5105039 |
Nodes in largest WCC | 855802 (0.977) |
Edges in largest WCC | 5066842 (0.993) |
Nodes in largest SCC | 434818 (0.497) |
Edges in largest SCC | 3419124 (0.670) |
Average clustering coefficient | 0.5143 |
Number of triangles | 13391903 |
Fraction of closed triangles | 0.01911 |
Diameter (longest shortest path) | 21 |
90-percentile effective diameter | 8.1 |
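
The parenthesised values in the WCC and SCC rows are fractions of the whole graph (for example, 855802 / 875713 ≈ 0.977). As a rough illustration of what those rows measure, the sketch below computes the size of the largest weakly and strongly connected components on a tiny hand-made graph. The function names `largest_wcc_size` and `largest_scc_size` are hypothetical, and the use of union-find and Kosaraju's two-pass search is a choice made for this example; it is not necessarily how the published statistics were produced.

```python
from collections import defaultdict


def largest_wcc_size(nodes, edges):
    """Size of the largest weakly connected component (edge direction ignored)."""
    parent = {n: n for n in nodes}
    size = {n: 1 for n in nodes}

    def find(x):
        # Walk to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru  # attach the smaller tree under the larger one
            size[ru] += size[rv]
    return max(size[find(n)] for n in nodes)


def largest_scc_size(nodes, edges):
    """Size of the largest strongly connected component (Kosaraju, iterative)."""
    forward, backward = defaultdict(list), defaultdict(list)
    for u, v in edges:
        forward[u].append(v)
        backward[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                stack.pop()
                order.append(node)

    # Pass 2: sweep the reversed graph in decreasing completion order;
    # each sweep collects exactly one strongly connected component.
    assigned, best = set(), 0
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        stack, count = [start], 0
        while stack:
            node = stack.pop()
            count += 1
            for prev in backward[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        best = max(best, count)
    return best


# Toy graph (not taken from the dataset): a 3-cycle with a tail, plus a separate edge.
toy_nodes = [1, 2, 3, 4, 5, 6]
toy_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (5, 6)]
wcc = largest_wcc_size(toy_nodes, toy_edges)
scc = largest_scc_size(toy_nodes, toy_edges)
print(f"largest WCC: {wcc} ({wcc / len(toy_nodes):.3f})")
print(f"largest SCC: {scc} ({scc / len(toy_nodes):.3f})")
```

Both functions avoid recursion, which matters for a graph of this size: a recursive depth-first search over hundreds of thousands of nodes would likely exceed Python's default recursion limit.
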
File | Description |
---|---|
web-Google.txt.gz | Web graph from the Google Programming Contest, 2002 |
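
A minimal sketch of reading the file into an adjacency structure follows. It assumes the archive is a gzip-compressed plain-text edge list with one `source target` pair of integer ids per line and `#` marking comment lines; that layout is not stated on this page, so check the file header before relying on it. The helper names `parse_edges`, `build_adjacency`, and `load_graph` are hypothetical.

```python
import gzip
from collections import defaultdict


def parse_edges(lines):
    """Yield (source, target) integer pairs from edge-list lines."""
    for line in lines:
        line = line.strip()
        # Assumed format: '#' starts a comment line; data lines hold two ids.
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        yield int(src), int(dst)


def build_adjacency(edges):
    """Map each page id to the set of page ids it links to."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency.setdefault(dst, set())  # keep pages that have no out-links
    return adjacency


def load_graph(path="web-Google.txt.gz"):
    """Read the gzipped edge list at `path` (assumed to be a local copy)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return build_adjacency(parse_edges(handle))


# Small in-memory example in the assumed format (not real dataset lines).
sample = ["# FromNodeId\tToNodeId", "0\t1", "0\t2", "", "2\t0"]
graph = build_adjacency(parse_edges(sample))
print(len(graph), "nodes,", sum(len(targets) for targets in graph.values()), "edges")
```

Storing targets in a set collapses repeated links between the same pair of pages, so the edge count from this structure can differ from a raw line count if the file contains duplicates.
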