Open MIC (Open Museum Identification Challenge) contains photos of exhibits captured in 10 distinct exhibition spaces of several museums, showcasing paintings, timepieces, sculptures, glassware, relics, science exhibits, natural history pieces, ceramics, pottery, tools and indigenous crafts. The goal of Open MIC is to stimulate research in domain adaptation, egocentric recognition and few-shot learning by providing a testbed complementary to the well-known Office 31 dataset, on which accuracies already reach ~90%.
INTRODUCTION
- For the source domain, we captured the photos in a controlled fashion with Android phones, e.g., we ensured that each exhibit is centered and non-occluded in the photos. We avoided adverse capturing conditions and did not mix multiple objects in one photo unless they were all part of a single exhibit. We captured 2–30 photos of each art piece from different viewpoints and distances in its natural setting.
- For the target domain, we employed an egocentric setup to ensure an in-the-wild capturing process. We equipped several volunteers with cheap wearable cameras and let them stroll around and interact with the artworks at their discretion. Open MIC contains 10 distinct source-target subsets of images from 10 different kinds of museum exhibition spaces, each posing various photometric and geometric challenges. We annotated each image with the labels of the art pieces visible in it. The wearable cameras were set to capture an image every 10 s and operated in the wild, e.g., volunteers had no control over shutter, focus, centering, etc.
- Therefore, the collected target subsets exhibit many realistic challenges, e.g., sensor noise, motion blur, occlusions, background clutter, varying viewpoints, scale changes, rotations, glares, transparency, non-planar surfaces, clipping, multiple exhibits per image, active lighting, color inconstancy, and very large or small exhibits, to name but a few phenomena.
- Every subset (10 distinct exhibition spaces) contains 37–166 exhibits to identify. We provide 5 train, 5 validation, and 5 test splits per exhibition. In total, our dataset contains 866 unique exhibit labels, 8560 source and 7596 target images.
- Shown below are sample source images from our dataset:
- Shown below are sample target images from our dataset:
EXHIBITIONS
Open MIC contains 10 distinct source-target subsets of images from 10 different kinds of museum exhibition spaces. They include:
- Paintings from Shenzhen Museum (Shn),
- Clocks and Watch Gallery (Clk), and the Indian and Chinese Sculptures (Scl) from the Palace Museum,
- Xiangyang Science Museum (Sci),
- European Glass Art (Gls) and the Collection of Cultural Relics (Rel) from the Hubei Provincial Museum,
- Nature, Animals and Plants in Ancient Times (Nat) from Shanghai Natural History Museum,
- Comprehensive Historical and Cultural Exhibits from Shaanxi History Museum (Shx),
- Sculptures, Pottery and Bronze Figurines from the Cleveland Museum of Arts (Clv),
- Indigenous Arts from the Honolulu Museum of Arts (Hon).
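For convenience, the exhibition codes above can be enumerated in code. Below is a minimal Python sketch; the directory layout used by split_dirs() is purely hypothetical (the actual folder structure is documented in the dataset readme), and only the code-to-name mapping comes from the list above:

    # Minimal sketch: map the exhibition codes above to names and iterate over
    # the 5 train/validation/test splits per exhibition. The directory layout
    # in split_dirs() is hypothetical; see the dataset readme for the actual
    # folder structure inside the archives.
    import os

    EXHIBITIONS = {
        "shn": "Paintings, Shenzhen Museum",
        "clk": "Clocks and Watch Gallery, Palace Museum",
        "scl": "Indian and Chinese Sculptures, Palace Museum",
        "sci": "Xiangyang Science Museum",
        "gls": "European Glass Art, Hubei Provincial Museum",
        "rel": "Collection of Cultural Relics, Hubei Provincial Museum",
        "nat": "Nature, Animals and Plants in Ancient Times, Shanghai Natural History Museum",
        "shx": "Comprehensive Historical and Cultural Exhibits, Shaanxi History Museum",
        "clv": "Sculptures, Pottery and Bronze Figurines, Cleveland Museum of Arts",
        "hon": "Indigenous Arts, Honolulu Museum of Arts",
    }

    def split_dirs(root, exhibition, num_splits=5):
        """Yield hypothetical (train, validation, test) folders for one exhibition."""
        for i in range(1, num_splits + 1):
            yield tuple(os.path.join(root, exhibition, f"{part}{i}")
                        for part in ("train", "val", "test"))

    for code, name in EXHIBITIONS.items():
        print(code, "->", name)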
BASELINES
To demonstrate the intrinsic difficulty of the Open MIC dataset, we provide the community with baseline accuracies obtained from:
- fine-tuning CNNs on the source subsets (S) and testing on the randomly chosen target splits (see the sketch after this list),
- fine-tuning on the target only (T) and evaluating on the remaining disjoint target splits,
- fine-tuning on the source+target (S+T) and evaluating on the remaining disjoint target splits,
- training the state-of-the-art So-HoT domain adaptation algorithm (with Euclidean and non-Euclidean distances).
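For illustration, here is a minimal PyTorch sketch of the (S) baseline: fine-tune an ImageNet-pretrained CNN on source images and report top-1 accuracy on a target split. The paths, backbone and hyper-parameters are placeholders rather than the exact settings used in our experiments; the (T) and (S+T) baselines follow by swapping or concatenating the training sets.

    # Minimal PyTorch sketch of the (S) baseline: fine-tune on a source subset,
    # evaluate on a target split. Paths, backbone and hyper-parameters are
    # illustrative placeholders; class folders are assumed to match between
    # the source and target directories.
    import torch, torch.nn as nn
    from torchvision import datasets, models, transforms

    tf = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224),
                             transforms.ToTensor()])
    source = datasets.ImageFolder("openmic/shn/source_train", transform=tf)  # hypothetical path
    target = datasets.ImageFolder("openmic/shn/target_test", transform=tf)   # hypothetical path

    model = models.resnet50(weights="IMAGENET1K_V1")
    model.fc = nn.Linear(model.fc.in_features, len(source.classes))          # exhibit labels
    opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(10):
        for x, y in torch.utils.data.DataLoader(source, batch_size=32, shuffle=True):
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in torch.utils.data.DataLoader(target, batch_size=32):
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
    print("target top-1 accuracy:", correct / total)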
DOMAIN ADAPTATION
We include the following evaluation protocols for Domain Adaptation (see the ECCV'18 paper cited below for more details). Kindly note that if you use our dataset, you do not have to run your algorithm on all these protocols and combinations; just choose one protocol you like:
- protocol (i): training/evaluation per exhibition subset (one experiment per exhibition),
- protocol (ii): training/testing on the combined set with 866 identity labels (one experiment for 10 combined exhibitions),
- protocol (iii): testing w.r.t. 12 scene factors annotated by us: object clipping (clp), low lighting (lgt), blur (blr), light glares (glr), background clutter (bgr), occlusions (ocl), in-plane rotations (rot), zoom (zom), tilted viewpoint (vpc), small size/far away (sml), object shadows (shd), reflections (rfl), plus the clean view (ok); a per-factor tallying sketch is given at the end of this section,
- protocol (iv): training/evaluation per exhibition subset (unsupervised Domain Adaptation).
- Below we illustrate (left) results for training on all 866 identity labels and (right) how the adaptation accuracy depends on photometric and geometric distortions of target images:
- Below we illustrate results for training/evaluation per exhibition subset for protocol (iv) (unsupervised Domain Adaptation):
The above results are based on recent popular algorithms: Invariant Hilbert Space (IHS), Unsupervised Domain Adaptation with Residual Transfer Networks (RTN) and Joint Adaptation Networks (JAN).
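For protocol (iii), the per-factor breakdown amounts to tallying correct predictions separately for every scene-factor tag attached to a target image. A minimal sketch is given below; the records container is an assumed format for illustration, not part of our release:

    # Sketch for protocol (iii): break down target accuracy per scene factor.
    # `records` is an assumed list of (predicted_label, true_label, tags)
    # triples, where `tags` holds the factor codes annotated for that image.
    from collections import defaultdict

    FACTORS = ["clp", "lgt", "blr", "glr", "bgr", "ocl", "rot",
               "zom", "vpc", "sml", "shd", "rfl", "ok"]

    def per_factor_accuracy(records):
        hits, counts = defaultdict(int), defaultdict(int)
        for pred, true, tags in records:
            for tag in tags:
                counts[tag] += 1
                hits[tag] += int(pred == true)
        return {f: hits[f] / counts[f] for f in FACTORS if counts[f] > 0}

    # Example with two tagged predictions.
    print(per_factor_accuracy([(3, 3, {"blr", "glr"}), (7, 2, {"ok"})]))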
FEW-SHOT LEARNING
We include the following evaluation protocols for One-shot Learning. Kindly note that if you use our dataset, you do not have to run your algorithm on all these protocols and combinations; just choose one protocol you like (an episode-sampling sketch is given after this list):
- protocol (v): 1-shot L-way training on each combined target split (p1: shn+hon+clv, p2: clk+gls+scl, p3: sci+nat, p4: shx+rel) and testing on the remaining combined target splits (this protocol checks the ability of task-to-task generalisation),
- protocol (vi): 1-shot L-way training on each source split (the ten exhibitions defined above) and testing on the corresponding target split (this protocol checks the ability of source-to-target generalisation),
- protocol (vii): 1-shot L-way training on each combined source split (p1, ..., p4) and testing on the remaining combined target splits (this protocol checks the ability of task-to-task and source-to-target generalisation).
- Also, see our CVPR 2019 and 2020 papers listed below for the latest FSL results and protocols.
- Below we illustrate results for the protocol (v) using our SoSN network (84x84/224x224 image crops):
- Below we illustrate results for the protocol (vi) using our SoSN network (84x84/224x224 image crops):
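The 1-shot L-way protocols above reduce to sampling episodes: L classes, one labelled support image per class, and a set of query images to classify. Below is a minimal sketch of such episode sampling; the images_by_class dictionary is an assumed format for illustration:

    # Minimal sketch of 1-shot L-way episode sampling (protocols v-vii).
    # `images_by_class` is an assumed dict: exhibit label -> list of image paths,
    # with at least 1 + n_query images per class.
    import random

    def sample_episode(images_by_class, l_way=5, n_query=3):
        classes = random.sample(sorted(images_by_class), l_way)
        support, query = [], []
        for episode_label, cls in enumerate(classes):
            imgs = random.sample(images_by_class[cls], 1 + n_query)
            support.append((imgs[0], episode_label))               # one shot per class
            query += [(img, episode_label) for img in imgs[1:]]    # images to classify
        return support, query

    # For protocol (vi)-style source-to-target episodes, the support images would
    # be drawn from a source split and the queries from the matching target split,
    # i.e., two dictionaries instead of one.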
PUBLICATIONS
For more details on the data, protocols, evaluations and algorithms, see the following publications. We would ask you to kindly cite the following paper(s) when using our dataset:
- Museum Exhibit Identification Challenge for Domain Adaptation and Beyond,
P. Koniusz, Y. Tas, H. Zhang, M. Harandi, F. Porikli, R. Zhang,
European Conference on Computer Vision (ECCV), 2018, bibtex.
(oral, ~2% acceptance rate, ECCV'18 talk /YouTube/)
- Power Normalizing Second-order Similarity Network for Few-shot Learning,
H. Zhang, P. Koniusz,
Winter Conference on Applications of Computer Vision (WACV), 2019, bibtex. Also, see the GitHub code.
- Few-Shot Learning via Saliency-guided Hallucination of Samples,
H. Zhang, J. Zhang, P. Koniusz,
Computer Vision and Pattern Recognition (CVPR), 2019. Also, see the GitHub code.
- Adaptive Subspaces for Few-Shot Learning,
C. Simon, P. Koniusz, R. Nock, M. Harandi,
Computer Vision and Pattern Recognition (CVPR), 2020. Also, see the Supp. Mat. and the GitHub code.
REQUEST FORM
Our dataset license mostly follows fair-use regulations, making the dataset available for academic, non-commercial use only. The license grants royalty-free, non-exclusive, non-transferable, attribution, 'no derivatives' rights. Please read the license carefully and fill in the requested details below. We will verify the request and send you an e-mail with a password. Send an e-mail to Open MIC providing the following details (cut, paste, fill in and send to us):
DATASET/DOWNLOAD (SMALL SIZE)
Once you have obtained a valid password, you will be able to instantly download our files (enter your e-mail as the login, followed by the password from our e-mail). Firstly, go through the following 'readme' file for details of what is contained in which folders of our archives:
Below we provide versions of our dataset in resolutions of 256, 512 and 1024 px. You can choose the quality needed for your experiments, but we expect that 256 or 512 px should be sufficient if you work with CNNs. The following archives contain full images and crops. We used the crops in our ECCV'18 paper as well as for one-shot learning:
- 256_OpenMIC.zip (bilinear interpolation)
- 512_OpenMIC.zip (bilinear interpolation)
- 1024_OpenMIC.zip (bicubic interpolation)
- target_splits_eccv2018.zip (labels used by us in the ECCV'18 paper)
See also the MULTILABELS at the bottom of this page.
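If you require a resolution other than those provided, the full images can be rescaled in the same spirit. A minimal Pillow sketch follows, assuming the stated resolutions refer to the longer image side; the file names are placeholders:

    # Sketch: downscale an image so its longer side matches a given resolution,
    # mirroring the interpolation choices listed above (bilinear for 256/512 px,
    # bicubic for 1024 px). File names are placeholders.
    from PIL import Image

    def downscale(path_in, path_out, longer_side=256, resample=Image.BILINEAR):
        img = Image.open(path_in)
        scale = longer_side / max(img.size)
        new_size = (round(img.width * scale), round(img.height * scale))
        img.resize(new_size, resample=resample).save(path_out)

    downscale("exhibit_full.jpg", "exhibit_256.jpg")                        # bilinear
    downscale("exhibit_full.jpg", "exhibit_1024.jpg", 1024, Image.BICUBIC)  # bicubic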
DATASET/DOWNLOAD (LARGE CROPS)
Below are crops (3 per image) in high resolution of approximately 2048x2048 px. Note that each exhibition archive is large, e.g., 1-3 GB per file, and to evaluate your algorithm on any of the protocols listed above, you will need to download all 10 of the following files:
- crops_clk.zip
- crops_gls.zip
- crops_nat.zip
- crops_sci.zip
- crops_shn.zip
- crops_clv.zip
- crops_hon.zip
- crops_rel.zip
- crops_scl.zip
- crops_shx.zip
DATASET/DOWNLOAD (FULL IMAGES)
Below are full-resolution whole images (over 2048 px). Note that each exhibition archive is large, e.g., 1-3 GB per file:
- full_clk.zip
- full_gls.zip
- full_nat.zip
- full_sci.zip
- full_shn.zip
- full_clv.zip
- full_hon.zip
- full_rel.zip
- full_scl.zip
- full_shx.zip