Visual Re-Identification Across Large, Distributed Camera Networks
README file

(C) 2013-2015 Machine Vision Laboratory and
    Rok Mandeljc
    Janez Pers
    Vildana Sulic Kenk


1. About

This code was used to obtain all the results in the following paper:

    Vildana Sulic Kenk, Rok Mandeljc, Stanislav Kovacic, Matej Kristan,
    Melita Hajdinjak, Janez Pers. "Visual Re-Identification Across Large,
    Distributed Camera Networks". Image and Vision Computing, 34(0): 11-26,
    February 2015.
    DOI: http://dx.doi.org/10.1016/j.imavis.2014.11.002


2. System requirements

The code was tested under Linux with Matlab 2012a and later versions. The
limitation regarding the Matlab version is the function p = randperm(n,k),
which has to be available in the two-argument form (not available in R2011a
and earlier).

During its run, the code produces a distance matrix of N_objects * N_objects
values of type single (4 bytes each). In the case of the Dana36 dataset, this
means 23683 * 23683 * 4 bytes = ~2.2 GB, so having several times that amount
of RAM is advisable - we ran the code on a machine with 32 GB of RAM.


3. Running the code

The purpose of the code is threefold:
  1) to recreate the results from the above paper (if you are interested),
  2) to test your own or a 3rd party object descriptor for surveillance
     applications, and
  3) to test your own or a 3rd party surveillance dataset.


3.1 Replicating the experiments from the paper

a) If you have not done so already, download the code from:
     http://vision.fe.uni-lj.si/RESEARCH/reid/reid_code.zip

b) Unpack the ZIP file into its own folder (e.g. '/home/user/reid'). Add this
   folder to the Matlab path (e.g. addpath('/home/user/reid')).

c) Download the Dana36 dataset and the SAIVT-SoftBio dataset from the
   following locations:
     http://vision.fe.uni-lj.si/RESEARCH/dana36/
     https://wiki.qut.edu.au/display/saivt/SAIVT-SoftBio+Database

d) Unpack the datasets and adjust the paths to the unpacked datasets at the
   beginning of the file /home/user/reid/paper_experiment_step1.m (in the
   section called "Create dataset adapter").

e) Select the Dana36 dataset by editing the file
   /home/user/reid/paper_experiment_config.m - uncomment the line
     dataset_name = 'dana36'; % Dana36

f) Run the script paper_experiment_step1.m from the command line and wait.
   The calculation of all distances may take up to several hours on a slow
   computer. The distance matrix will be calculated using the basic segmented
   color descriptor from the paper. The descriptors and the distances will be
   stored in the subdirectory 'dana36-puzzle' of the current directory.

h) Check that the rng_do_reseed variable is set to true at the beginning of
   /home/user/reid/paper_experiment_config.m. This initializes the random
   generator with the exact seed that was used for the paper, so the results
   should be reproduced exactly. You may set it to false; the results on
   Dana36 should then be roughly the same, but (of course) not exactly the
   same. If you are testing something repeatedly and comparing the results,
   it is advisable to keep it set to true. The random generator does not
   affect experiments on SAIVT SoftBio.

i) Run all the experiments in sequence. First run:
     paper_experiment_step2

   NOTE: The next two experiments assume that you have the Parallel Computing
   Toolbox installed - they use a parfor loop for speedup. To take advantage
   of this, execute
     matlabpool(N)
   where N is the number of cores of your CPU. If you don't have the Parallel
   Computing Toolbox, edit paper_experiment_step3.m and
   paper_experiment_step4.m and simply substitute the 'parfor' loop with a
   'for' loop.

   In any case, then run:
     paper_experiment_step3
     paper_experiment_step4

   At the end, there will be a text file with the results from the table and
   a number of .fig and .eps figures in the subdirectory 'dana36-puzzle',
   below the current directory.

   If you change the parameters or make any other changes, delete all files
   from this subdirectory, EXCEPT the distances*.mat files and features.mat.
   If you delete these, you will have to re-run paper_experiment_step1.m and
   wait again.

j) Select the SAIVT SoftBio dataset by editing the file
   /home/user/reid/paper_experiment_config.m and rerun everything from
   paper_experiment_step1.m on.


3.2 Testing your own descriptor

The code can easily be adapted to your own descriptor or any other descriptor
you can implement in Matlab. You have to write the interface code. A good
place to start is the implementation of the basic histogram descriptor from
the paper (internally called 'puzzle histogram') in the file
+surv/+descriptor/PuzzleHistogram.m; examine it and adapt it to your needs.

The puzzle histogram supports operation on single images and on image
sequences (tracklets), so you can implement your descriptor in the same way.
If you don't support image sequences, you will not be able to run experiments
on the SAIVT SoftBio dataset, at least not in the configuration from the
paper.

There are two important parts of the descriptor, which are encapsulated in
two functions in the adapter code: the first one calculates the descriptor
(feature vector) from the image or sequence data, and the second one
implements the distance calculation (more accurately, the calculation of the
whole distance matrix).

You can also choose the messier way: calculate the distance matrix outside
our code and place it in the appropriate subfolder. The distance matrix file
should contain the matrix of all-versus-all feature distances (NxN, where N
is the number of all samples in the dataset) and two Nx1 vectors, which hold
the camera and person identity for each sample:

    Name                    Size              Bytes         Class
    distances               23683x23683       2243537956    single
    identities_cameras      23683x1           189464        double
    identities_objects      23683x1           189464        double

After implementing your own descriptor, introduce the new descriptor into the
/home/user/reid/paper_experiment_config.m file (at the moment there is only
one selection, 'puzzle').
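
The layout of an externally computed distance matrix file, as listed above,
can be sketched as follows. This is only an illustration under stated
assumptions: the histograms are random toy data, the Hellinger distance is
used as a stand-in for your own distance function, and the file name
'distances_example.mat' is hypothetical (check which distances*.mat name the
scripts write and expect in the results subdirectory). Only the three
variable names, sizes and classes are taken from the listing above.

```matlab
% Toy stand-in for real descriptors: N normalized histograms with B bins.
N = 6;
B = 16;
rng(0);                                   % fixed seed, for repeatability only
H = rand(N, B);
H = bsxfun(@rdivide, H, sum(H, 2));       % each row now sums to 1

% All-versus-all Hellinger distance: sqrt(1 - sum_i sqrt(p_i * q_i)).
% (Assumed variant; replace with your own distance function.)
BC = sqrt(H) * sqrt(H)';                  % Bhattacharyya coefficients, N x N
distances = single(sqrt(max(0, 1 - BC))); % clamp tiny negative rounding errors

% Camera and person identity for each sample (N x 1, class double).
identities_cameras = [1; 1; 2; 2; 3; 3];
identities_objects = [1; 2; 1; 2; 1; 2];

% Hypothetical file name. The -v7.3 flag is needed once a variable exceeds
% 2 GB, which is the case for the full Dana36 matrix (2243537956 bytes).
save('distances_example.mat', 'distances', ...
     'identities_cameras', 'identities_objects', '-v7.3');
```

For the toy sizes used here the -v7.3 flag is not strictly necessary; it is
included because older MAT-file versions cannot store a variable as large as
the full Dana36 distance matrix.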

NOTE1: The usability of a particular descriptor ***in the distributed
framework*** (if you are interested in this) depends on the mathematical
properties of the distance function. For details, see the paper on
Hierarchical Feature Distribution (HFD):

    Sulic, V.; Pers, J.; Kristan, M.; Kovacic, S., "Efficient Feature
    Distribution for Object Matching in Visual-Sensor Networks," Circuits and
    Systems for Video Technology, IEEE Transactions on, vol. 21, no. 7,
    pp. 903-916, July 2011.
    doi: 10.1109/TCSVT.2011.2133330

(The paper contains proofs for several combinations of descriptors and
distances, including the histogram/Hellinger combination used in this code.)
However, this is not a prerequisite for using a descriptor in this code!

NOTE2: If your descriptor needs training on real-world surveillance images,
do not use Dana36 for training, as your results will be overly optimistic.
Try to use some unrelated dataset, e.g. VIPeR
( http://vision.soe.ucsc.edu/node/178 ).


3.3 Testing your own dataset

a) Go through the steps described in 3.1 and make sure everything works.

b) Organize your dataset in a manner similar to Dana36 or SAIVT SoftBio and
   write the dataset adapter - you have three examples in the +surv/+dataset
   folder:
     Dana36.m
     SAIVT.m
     ThreeDPeS.m

   Note that the 3DPeS dataset (http://www.openvisor.org/3dpes.asp) was not
   used in the paper, but the code is ready to use it (the code supports only
   a subset of 3DPeS, called 3DPeS_ReId_Snap, since there are no bounding
   boxes for the whole dataset).

c) The rest of the procedure should be similar to using Dana36 or SAIVT.
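
When bringing in a new dataset, it may help to check the computed distance
data before running the later experiment steps. The sketch below is not part
of the distributed code; it only verifies the layout described in 3.2 (an
NxN matrix plus two Nx1 identity vectors) on hand-made toy values. With real
data, you would instead load the three variables from the distances*.mat
file produced for your dataset.

```matlab
% Toy values standing in for the contents of a distances*.mat file.
distances = single([0.0 0.4 0.7; ...
                    0.4 0.0 0.5; ...
                    0.7 0.5 0.0]);
identities_cameras = [1; 2; 2];
identities_objects = [1; 1; 2];

% The matrix must be square, and both identity vectors must be N x 1.
N = size(distances, 1);
assert(size(distances, 2) == N, 'distances must be N x N');
assert(isequal(size(identities_cameras), [N 1]), ...
       'identities_cameras must be N x 1');
assert(isequal(size(identities_objects), [N 1]), ...
       'identities_objects must be N x 1');

% Missing values usually point to a failed descriptor computation.
assert(~any(isnan(distances(:))), 'distances contains NaN values');

% Short summary: does it match what you expect from your dataset?
num_cameras = numel(unique(identities_cameras));
num_objects = numel(unique(identities_objects));
fprintf('%d samples, %d cameras, %d identities\n', ...
        N, num_cameras, num_objects);
```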