Flickr image relationships
Dataset information
This dataset is built by forming links between Flickr images that share common metadata. Edges are formed between images taken at the same location, images submitted to the same gallery, group, or set, images sharing common tags, images taken by friends, etc. The original images are collected from PASCAL, ImageCLEF, MIR, and NUS-WIDE.
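As a rough illustration of how such links can be formed, the sketch below connects every pair of images that share at least one metadata value. It is not the authors' construction pipeline: the function name `build_edges`, the `image_metadata` mapping, and the sample values are all hypothetical, and treating every kind of metadata as a plain string is an assumption.

```python
from collections import defaultdict
from itertools import combinations

def build_edges(image_metadata):
    """Link every pair of images that share at least one metadata value.

    image_metadata is a hypothetical mapping: image id -> set of metadata
    strings (tags, groups, locations, ...). Not the dataset's real format.
    """
    # Invert the mapping: metadata value -> images carrying that value.
    images_by_value = defaultdict(set)
    for image_id, values in image_metadata.items():
        for value in values:
            images_by_value[value].add(image_id)

    # Every pair of images under the same value becomes an undirected edge.
    edges = set()
    for images in images_by_value.values():
        for a, b in combinations(sorted(images), 2):
            edges.add((a, b))
    return edges

# Made-up toy input, for illustration only.
sample = {
    "img1": {"tag:sunset", "group:landscapes"},
    "img2": {"tag:sunset"},
    "img3": {"group:landscapes", "tag:beach"},
    "img4": {"tag:city"},
}
print(sorted(build_edges(sample)))
```

In this toy input, img1 is linked to img2 through a shared tag and to img3 through a shared group, while img4 shares nothing and stays isolated.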
Dataset statistics |
Nodes | 105938 |
Edges | 2316948 |
Nodes in largest WCC | 105722 (0.998) |
Edges in largest WCC | 2316668 (1.000) |
Nodes in largest SCC | 105722 (0.998) |
Edges in largest SCC | 2316668 (1.000) |
Average clustering coefficient | 0.0891 |
Number of triangles | 107987357 |
Fraction of closed triangles | 0.1828 |
Diameter (longest shortest path) | 9 |
90-percentile effective diameter | 4.8 |
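The parenthesized values in the table are the fractions of all nodes and edges that fall inside the largest component (for example, 105722 / 105938 ≈ 0.998). The sketch below shows one way such fractions could be computed from an edge list with a union-find structure. The function name `largest_component_fraction` is hypothetical, and it assumes nodes are numbered 0 to num_nodes - 1 and that the edge list is non-empty; the source does not describe the layout of the edge file.

```python
from collections import Counter

def largest_component_fraction(num_nodes, edges):
    """Return (node fraction, edge fraction) of the largest connected component.

    Assumes node ids are 0..num_nodes-1 and edges is a non-empty list of pairs.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the components of the two endpoints of every edge.
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    # Count nodes per component root and pick the largest component.
    sizes = Counter(find(n) for n in range(num_nodes))
    largest_root, largest_size = sizes.most_common(1)[0]
    edges_inside = sum(1 for u, _ in edges if find(u) == largest_root)
    return largest_size / num_nodes, edges_inside / len(edges)

# Toy graph: a triangle, one separate edge, and one isolated node.
print(largest_component_fraction(6, [(0, 1), (1, 2), (2, 0), (3, 4)]))
```

For the toy graph, the triangle holds 3 of the 6 nodes and 3 of the 4 edges.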
Source (citation)
Files
How to parse (in Python)
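import sys
import xml.etree.ElementTree as ET

def parsePhotos(path):
    """Yield one XML element per photo record in the file at `path`."""
    with open(path, 'r') as f:
        f.readline()  # skip the first line of the file
        content = ""
        for l in f:
            content += l
            # The end-of-record marker was lost from this page (an empty
            # string matches every line); "</photo>" is an assumption based
            # on the function name. Adjust it to the closing tag in the file.
            if l.startswith("</photo>"):
                yield ET.fromstring(content)
                content = ""

for x in parsePhotos(sys.argv[1]):
    # Python 3: print is a function; encoding="unicode" returns str, not bytes.
    print(ET.tostring(x, encoding="unicode"))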