<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Online Joint Multi-Metric Adaptation from Frequent Sharing-Subset Mining for Person Re-Identification</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>June 2020</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10161339</idno>
					<idno type="doi">10.1109/CVPR42600.2020.00298</idno>
					<title level='j'>IEEE/CVF Conf. on Computer Vision and Pattern Recognition</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Jiahuan Zhou</author><author>Bing Su</author><author>Ying Wu</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Person Re-IDentification (P-RID), as an instance-level recognition problem, remains challenging in the computer vision community. Many P-RID works aim to learn faithful and discriminative features/metrics from offline training data and directly apply them to unseen online testing data. However, their performance is largely limited by the severe data-shift issue between training and testing data. Therefore, we propose an online joint multi-metric adaptation model that adapts offline-learned P-RID models to the online data by learning a series of metrics for all the sharing-subsets. Each sharing-subset is obtained from the proposed novel frequent sharing-subset mining module and contains a group of testing samples that share strong visual similarity relationships with each other. Unlike existing online P-RID methods, our model simultaneously takes both the sample-specific discriminant and the set-based visual similarity among testing samples into consideration, so that the adapted multiple metrics can jointly refine the discriminant of all the given testing samples via a multi-kernel late fusion framework. Our proposed model is generally applicable to any offline-learned P-RID baseline for online boosting. The performance improvement by our model is not only verified by extensive experiments on several widely used P-RID benchmarks (CUHK03, Market1501, DukeMTMC-reID and MSMT17) and state-of-the-art P-RID baselines, but also guaranteed by the provided in-depth theoretical analyses.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>testing samples. Their performance relies entirely on the offline-learned models from training data while treating different testing samples equally and ignoring their individual characteristics; hence the improvement is neither significant nor stable. The other category is query-specific metric adaptation <ref type="bibr">[17,</ref><ref type="bibr">38,</ref><ref type="bibr">45]</ref>, which aims to enhance the discriminant of each query individually. The generic offline-learned metric is adapted to an instance-specific local metric for each query. Compared with the set-centric methods, the individual discriminant of queries is enhanced, but the visual similarity relationships among the given testing samples are ignored. Moreover, existing query-specific models <ref type="bibr">[17,</ref><ref type="bibr">38,</ref><ref type="bibr">45]</ref> completely ignore the counterpart gallery data during adaptation. Even if a discriminative probe-specific metric can be learned, the "hard" gallery samples with large intra-class and small inter-class variances will tremendously degrade its performance since they remain indistinguishable under the learned query-specific metric (Fig. <ref type="figure">1</ref>).</p><p>In order to tackle the aforementioned issues, we propose a novel online joint multi-metric adaptation algorithm which not only takes the individual characteristics of testing samples into consideration but also fully explores the visual similarity relationships among both query and gallery samples. As shown in Fig. <ref type="figure">2</ref>, at the online P-RID testing stage, the redundant intrinsic visual similarity relationships within the unlabeled query (gallery) set are utilized by our proposed frequent sharing-subset mining model to automatically mine concise and strong visual sharing associations among samples. 
Since a sharing-subset contains a group of queries (galleries) sharing strong visual similarity with each other, their local distributions will be jointly adjusted by efficiently learning a Mahalanobis metric for all of them. Once a series of such sharing-subset-based Mahalanobis metrics is learned, for each query (gallery), its instance-specific local metric is obtained via a multi-metric late fusion of all the sharing-subset-based Mahalanobis metrics. Therefore, our proposed online joint Multi-Metric adaptation model based on frequent sharing-subset Mining (denoted as M^3) is able to refine the ranking performance online. The success of learning from sharing relies on discovering the latent sharing relationships among samples, which cannot be found by treating each instance independently <ref type="bibr">[4]</ref>. Learning from sharing is good at handling the condition in which only a limited amount of learning data is available, by taking the sharing relationships as data augmentation. Therefore the sharing strategy is particularly suitable for online P-RID learning, where each testing sample itself is the only positive sample available for learning.</p><p>The main contributions of this paper are as follows: <ref type="bibr">(1)</ref> To handle the severe training-testing distribution shift issue in P-RID, we leap from offline global learning to online instance-specific metric adaptation. We propose a general and flexible learning objective to simultaneously enhance the local discriminant of testing query and gallery data. (2) By mining various frequent sharing-subsets, the intrinsic visual similarity sharing relationships are fully explored. Therefore the online time cost of learning metrics from sharing is much smaller than that of learning local metrics independently. 
(3) To fulfill the time-efficiency requirement of online testing, a theoretically sound optimization solution is proposed for efficient learning, which is also proven to guarantee the improvement of performance. (4) Our proposed model can be readily applied to any existing offline P-RID baseline for online performance improvement. The efficiency and effectiveness of our method are further verified by extensive experiments on four challenging P-RID benchmarks (CUHK03, Market1501, DukeMTMC-reID and MSMT17) based on various state-of-the-art P-RID models.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Online Re-Ranking in P-RID: In recent years, increasing effort has been devoted to online P-RID re-ranking. Ye et al. <ref type="bibr">[35]</ref> revised the ranking list by considering the nearest neighbors of both global and local features. An unsupervised re-ranking model proposed by Garcia et al. <ref type="bibr">[9]</ref> takes advantage of the content and context information in the ranking list. Zhong et al. <ref type="bibr">[43]</ref> proposed a k-reciprocal encoding approach for re-ranking, which relies on the hypothesis that if a gallery image is similar to the probe in the k-reciprocal nearest neighbors, it is more likely to be a true match. Zhou et al. <ref type="bibr">[45]</ref> proposed to learn an instance-specific Mahalanobis metric for each query sample by using extra negative learning samples at the online stage. Barman et al. <ref type="bibr">[3]</ref> focused on making a consensus-based decision for retrieval by aggregating the ranking results from multiple algorithms, requiring only the matching scores. Bai et al. <ref type="bibr">[1]</ref> concentrated on re-ranking with the capacity of metric fusion for P-RID by proposing a Unified Ensemble Diffusion (UED) framework. However, the aforementioned online re-ranking methods either treat different testing samples equally without considering the instance-specific characteristics or completely ignore the intrinsic visual similarity relationships among testing samples, so the performance improvement is neither stable nor significant.</p><p>CNN-based Feature Extraction in P-RID: CNN-based feature extraction has achieved state-of-the-art performance in P-RID. A novel Harmonious Attention CNN (HA-CNN) proposed by Li et al. 
<ref type="bibr">[18]</ref> tries to jointly learn attention selection and feature representation in a CNN by maximizing the complementary information of different levels of visual attention (soft attention and hard attention). Wang et al. <ref type="bibr">[30]</ref> proposed a novel deeply supervised fully attentional block that can be plugged into any CNN.</p><p>Therefore, the mined SSSets not only keep the strong and reliable visual similarity sharing information but also significantly alleviate the redundancy. Compared with the original combinatorial problem suffering from exponential computational complexity O(2^n), the time complexity of our proposed algorithm is O(n^2), which is much more efficient when a large number of testing samples are given.</p><p>Considering Q as the given Item set, we firstly prepare a Transaction set T = {t_i} (i = 1, ..., n_t) from Q, where each t_i is a subset of Q. The affinity matrix A ∈ R^(n_q×n_q) of Q is defined as:</p><p>A_{i,j} = exp(−d(q_i, q_j)^2/σ^2) / Σ_{k≠i} exp(−d(q_i, q_k)^2/σ^2) for j ≠ i, and A_{i,j} = 0 for j = i, (1)</p><p>where σ is the variance parameter of the distance matrix of Q, so that A_{i,j} represents the soft-max normalized visual similarity between q_i and q_j. The i-th row of A represents the similarity distribution between q_i and the other samples in Q. To keep only the most reliable sharing relationships, a threshold Θ defined as the average affinity of Q is used for outlier filtering: Θ = Σ_{i=1}^{n_q} Σ_{j=1}^{n_q} A_{i,j} / (n_q · n_q). Therefore, a binary index map B is obtained by:</p><p>B_{i,j} = 1 if A_{i,j} ≥ Θ, and B_{i,j} = 0 otherwise. (2)</p><p>A non-zero B_{i,j} implies a strong similarity sharing relationship between q_i and q_j. Therefore each non-zero row B_i of B can be considered as a transaction t_i.</p><p>Once the transaction set T is obtained, we propose to mine the frequent sharing-subsets from T, where each sharing-subset is represented by a frequent pattern mined by the classical FP-Close algorithm <ref type="bibr">[10]</ref>. 
To do so, a Closed Frequent Itemset Tree (CFI-Tree) is firstly constructed from T under a minimum support of 5 (Fig. <ref type="figure">3</ref>); then the FP-Close mining algorithm in <ref type="bibr">[10]</ref> is performed on the constructed CFI-Tree to obtain all the closed frequent patterns {S_i} (i = 1, ..., n_s), where each S_i represents a sharing-subset.</p></div>
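As a concrete illustration of Sec. 3.2, the affinity-threshold-transaction pipeline can be sketched in a few lines of NumPy. The Gaussian kernel form and the naive closed-itemset miner below are our assumptions for illustration only; a real implementation would use the FP-Close algorithm [10] for efficiency:

```python
import numpy as np
from itertools import combinations

def sharing_subset_transactions(D, sigma=1.0):
    """Turn a pairwise distance matrix D (n x n) into transactions.

    Follows the recipe in Sec. 3.2: soft-max normalize the distances
    into an affinity A (Gaussian kernel assumed), threshold by the
    average affinity, and read each non-zero row of the binary map B
    as one transaction.
    """
    n = D.shape[0]
    W = np.exp(-(D ** 2) / sigma ** 2)
    np.fill_diagonal(W, 0.0)                 # A_{i,i} = 0
    A = W / W.sum(axis=1, keepdims=True)     # soft-max normalization per row
    theta = A.sum() / (n * n)                # average-affinity threshold
    B = A > theta                            # binary index map
    return [frozenset(np.flatnonzero(row)) for row in B if row.any()]

def mine_sharing_subsets(transactions, min_support=2):
    """Naive closed-frequent-itemset miner (brute-force FP-Close stand-in)."""
    items = sorted(set().union(*transactions))
    support = {}
    for r in range(1, len(items) + 1):
        for cand in combinations(items, r):
            s = frozenset(cand)
            sup = sum(1 for t in transactions if s <= t)
            if sup >= min_support:
                support[s] = sup
    # closed = no frequent proper superset with the same support
    return [s for s, v in support.items()
            if not any(s < s2 and v == support[s2] for s2 in support)]
```

On a toy set with two tight clusters, the transactions recover each sample's strong neighbours, and mining them yields the sharing-subsets on which the metrics of Sec. 3.3 are then learned.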
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Joint Multi-Metric Adaptation From SSSets</head><p>Once all the frequent SSSets {S_i} (i = 1, ..., n_s) are obtained, our goal is to jointly learn n_s SSSet-based local Mahalanobis metrics for {S_i} by optimizing Eqn. 4:</p><p>The learned metric M_i from Eqn. 4 is shared by all the samples in S_i. Suppose we have n_s SSSets and O(n) samples in each S_i; then there are in total O(n_s^2 n^2) inequality constraints and O(n_s n^2) equality constraints in Eqn. 4, which is intractable, so we aim to reduce the constraint size of Eqn. 4. We find that Eqn. 4 has an exactly equivalent form obtained by keeping only the constraints related to one anchor sample s^i in S_i, where s^i can be any sample in S_i. The equivalent form is given by Eqn. 5:</p><p>Revisiting Eqn. 4, its equality constraints collapse all s^i_u ∈ S_i together. Therefore, keeping only the equality constraints related to the anchor sample s^i achieves the same collapsing effect, and the same holds for the inequality constraints in Eqn. 4. Finally, we can reduce the constraint size by keeping only the constraints related to s^i, as in Eqn. 5. The re-formed objective Eqn. 5 has only O(n_s^2 n) inequality and O(n_s n) equality constraints, respectively. An important merit of Eqn. 5 is that it can be efficiently optimized:</p></div>
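The display equations for Eqn. 4 and Eqn. 5 were lost in extraction. A hedged reconstruction, consistent with the constraint counts stated in the text (collapse pairs within each SSSet, separate samples across SSSets, then keep only the anchor's constraints), might read:

```latex
% Hypothetical reconstruction -- symbols follow the surrounding text,
% not a verified copy of the paper's Eqn. 4 / Eqn. 5.
% Joint objective over all SSSet metrics (Eqn. 4 analogue):
\min_{\{M_i \succeq 0\}} \; \sum_{i=1}^{n_s} \operatorname{reg}(M_i)
\quad \text{s.t.} \quad
d_{M_i}(s^i_u, s^i_v) = 0 \;\; \forall\, s^i_u, s^i_v \in S_i, \qquad
d_{M_i}(s^i_u, s^j_v) \ge 1 \;\; \forall\, s^j_v \in S_j,\; j \neq i.

% Anchor-reduced form (Eqn. 5 analogue), with one anchor s^i per S_i:
d_{M_i}(s^i, s^i_u) = 0 \;\; \forall\, s^i_u \in S_i, \qquad
d_{M_i}(s^i, s^j_v) \ge 1 \;\; \forall\, s^j_v \in S_j,\; j \neq i.
```

Counting these constraints reproduces the text's figures: the full form has O(n_s n^2) equalities and O(n_s^2 n^2) inequalities, and anchoring reduces them to O(n_s n) and O(n_s^2 n).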
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Replace the inequality constraints in Eqn. 5 by h</head><p>Now Eqn. 5 has an equivalent form as:</p><p>Finally, we prove that Eqn. 7 has the same solution to Eqn. 4 by eliminating its PSD and equality constraints.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Theorem 2</head><p>The solution to Eqn. 4 is exactly the same as that obtained by solving Eqn. 7 with its equality and PSD constraints relaxed, since these constraints are satisfied automatically.</p><p>Proof 2 If we remove the PSD and equality constraints from Eqn. 7, the new form is:</p><p>Eqn. 8 is exactly in the form of a multi-kernel SVM problem, so it can be efficiently solved.</p><p>Thus the positive semi-definiteness of M_i is guaranteed since</p><p>For the equality constraints in Eqn. 7, given a member s of S, we have:</p><p>which proves that the solution to Eqn. 8 satisfies the equality constraints as well.</p></div>
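The PSD claim can be sanity-checked numerically. If, as is typical for multi-kernel SVM solutions (and is our reading of the truncated step above), the learned metric has the expansion form M_i = Σ_j α_j z_j z_jᵀ with α_j ≥ 0, then M_i is PSD by construction. A minimal sketch with hypothetical data:

```python
import numpy as np

def metric_from_expansion(alphas, Z):
    """Assemble M = sum_j alpha_j * z_j z_j^T from nonnegative dual
    weights alphas and sample vectors Z (rows). Any such sum of
    rank-one outer products with alpha_j >= 0 is PSD by construction."""
    d = Z.shape[1]
    M = np.zeros((d, d))
    for a, z in zip(alphas, Z):
        M += a * np.outer(z, z)
    return M

rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 3))          # 5 hypothetical samples in R^3
alphas = rng.uniform(0, 1, size=5)   # nonnegative dual weights
M = metric_from_expansion(alphas, Z)
eigvals = np.linalg.eigvalsh(M)      # all (numerically) nonnegative
```

Because every eigenvalue of such an M is nonnegative, no explicit PSD projection is needed after solving the relaxed Eqn. 8.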
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Bi-Directional Discriminant Enhancement</head><p>At the online testing stage, the gallery set G, the counterpart of the query set Q, also plays an important role. As shown in Fig. <ref type="figure">1</ref>, the re-ranking performance obtained by using only query-centric metric adaptation may suffer from ambiguous gallery distractors. Similar gallery images from different identities will significantly degrade the discriminant of M_p since these gallery distractors remain indistinguishable under M_p. Therefore, we handle these indistinguishable gallery samples by performing gallery-centric local discriminant enhancement as in Eqn. 4. The SSSets of G and the corresponding joint metrics are obtained via Sec. 3.2 and Eqn. 4, respectively.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.5.">Multi-Metric Late Fusion For Re-Ranking</head><p>A query probe q may be contained in multiple SSSets, so there may be multiple learned metrics M_i associated with q. The final metric M_q for q is obtained via a boosting-form multi-metric late fusion <ref type="bibr">[24,</ref><ref type="bibr">23]</ref>:</p><p>where γ_i = 1 if q ∈ S_i and γ_i = 0 otherwise. For a gallery sample g, a similar fused metric M_g can be obtained likewise. Therefore the refined distance between q and g is defined as in Eqn. 11, based on which the re-ranking list of q is obtained.</p></div>
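The late-fusion step is simple enough to sketch directly. The averaging form below follows Eqn. 10 as given in Sec. 4; the exact combination rule of Eqn. 11 was lost in extraction, so the additive form with λ weighting the gallery side is our assumption, consistent with Sec. 5.5 (λ balances M_q and M_g):

```python
import numpy as np

def fuse_metrics(metrics, membership):
    """Eqn. 10: M = (sum_i gamma_i * M_i) / (sum_i gamma_i), where
    gamma_i = 1 iff the sample belongs to sharing-subset S_i."""
    picked = [M for M, inside in zip(metrics, membership) if inside]
    if not picked:                              # no SSSet contains the sample
        return np.eye(metrics[0].shape[0])      # fall back to Euclidean
    return sum(picked) / len(picked)

def refined_distance(q, g, M_q, M_g, lam=1.0):
    """Bi-directional refined distance (our reading of Eqn. 11):
    the query-centric and gallery-centric Mahalanobis distances are
    combined, with lam balancing the two max-normalized sides."""
    diff = q - g
    return float(diff @ M_q @ diff + lam * (diff @ M_g @ diff))
```

Re-ranking then simply sorts the gallery by `refined_distance` for each probe.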
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Theoretical Analyses and Justifications</head><p>As demonstrated by Theorem 2, the solution of our joint multi-metric adaptation objective can be readily transformed into an equivalent form as in <ref type="bibr">[45]</ref>. Therefore, the appealing theoretical properties in <ref type="bibr">[45]</ref> are inherited by our learned M_i, as presented in Theorem 3. Moreover, our late multi-kernel fusion metric, Eqn. 10, guarantees a further reduction of the generalization error bound, as shown in Theorem 4.</p><p>Theorem 3 (The reduction of both asymptotic and practical error bounds by the learned M_i): As demonstrated by Theorem 2 in <ref type="bibr">[45]</ref>, for an input x, its asymptotic error P_a(e|x) by using extra negative data D_a is:</p><p>where q is a probability scalar with 0 ≤ q ≤ 1 and P(e|x) is the Bayesian error. Moreover, the asymptotic error P_a(e|x) can be best approximated by the practical error rate P_n(e|x) (n is finite) by finding a local metric M_x, which turns out to be the one solving our Eqn. 4.</p><p>Theorem 4 (The reduction of the generalization error bound by using M_q/g in Eqn. 10): Our fused multi-kernel metric M_q = (Σ_{i=1}^{n_s} γ_i M_i) / Σ_i γ_i is a linear combination of several base kernels M_i from the family of finite Gaussian kernels:</p><p>which is bounded by B_k. Therefore, for a fixed δ ∈ (0, 1), let n_s &lt; n_k be the number of metrics (kernels) involved in our final joint multi-metric learning solution. With probability at least 1 − δ over the choice of a random training set X = {x_i} (i = 1, ..., n) of size n, we have:</p><p>In our work, we have n_s ≪ n_k, i.e., the number of selected kernels is much smaller than the total number of kernels, so the generalization error by using M_q is much smaller than that of using any single M_i. The same conclusion can be obtained for M_g likewise.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Proof 3</head><p>The classification rule of our learned M_i can be defined as ζ_j (q^T M_i x_j − 1) ≥ 1, so the margin is 1. Motivated by <ref type="bibr">[25]</ref>, for M_q, a linear combination of all the M_i from the family of finite Gaussian kernels K^d_G, the generalization error E_est(M_q) is bounded by O((log n_k + B_k + 2 n_s) / n), which is guaranteed by Theorem 2 in <ref type="bibr">[14]</ref>. For this kernel family in our work, d ≈ 10^3 so that n_k ≈ 10^6. The number of kernels selected for combination is about 20 on average, so n_s ≪ n_k, which means E_est(M_q) ≪ E_est(M_i).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.">Experimental Settings</head><p>Datasets. We evaluate our proposed M^3 model on the CUHK03 <ref type="bibr">[17]</ref>, Market1501 <ref type="bibr">[39]</ref>, DukeMTMC-reID <ref type="bibr">[41]</ref> and MSMT17 <ref type="bibr">[33]</ref> benchmarks. The statistical details of the above datasets are summarized in Table 1. For CUHK03 (the detected dataset is utilized in our experiment), the new splitting protocol proposed by <ref type="bibr">[43]</ref> is adopted, so that 767 identities are used for training and the remaining 700 identities are used for testing. For the other three benchmarks, Market1501, DukeMTMC-reID and MSMT17, the pre-determined probe and gallery sets are utilized directly with no modification. Baselines. Our proposed M^3 method is evaluated based on several state-of-the-art CNN-based P-RID models: ResNet50 <ref type="bibr">[11]</ref>, DenseNet121 <ref type="bibr">[13]</ref>, HA-CNN <ref type="bibr">[18]</ref>, MLFN <ref type="bibr">[5]</ref> and ABDNet <ref type="bibr">[6]</ref>. The general CNN models, ResNet50 and DenseNet121, are well trained on each benchmark for feature extraction. HA-CNN, MLFN and ABDNet are P-RID-specific CNNs, so the original works are directly utilized in our experiments. Besides, other state-of-the-art P-RID methods <ref type="bibr">[15,</ref><ref type="bibr">21,</ref><ref type="bibr">37,</ref><ref type="bibr">27,</ref><ref type="bibr">28,</ref><ref type="bibr">5,</ref><ref type="bibr">26,</ref><ref type="bibr">40,</ref><ref type="bibr">46,</ref><ref type="bibr">6]</ref> are further compared. Moreover, related online P-RID methods including <ref type="bibr">[45]</ref> (OL) and <ref type="bibr">[43]</ref> (RR) are compared with our M^3 method.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Evaluation. We follow the same official evaluation protocols as in <ref type="bibr">[39,</ref><ref type="bibr">41,</ref><ref type="bibr">17,</ref><ref type="bibr">33]</ref>: the single-shot evaluation setting is adopted, and all results are reported as the Cumulated Matching Characteristic (CMC) at several selected ranks and the mean Average Precision (mAP). Various ablation studies of our proposed model are explored in Sec. 5.5.</p></div>
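For reference, both reported numbers can be computed from a distance matrix in a few lines. This is a simplified single-shot sketch; it omits the camera-ID filtering that the official protocols of [39, 41, 17, 33] additionally apply:

```python
import numpy as np

def cmc_map(dist, q_ids, g_ids, ranks=(1, 5, 10)):
    """CMC at the selected ranks and mAP from an (n_q, n_g) distance
    matrix, with q_ids / g_ids the ground-truth identity labels."""
    q_ids, g_ids = np.asarray(q_ids), np.asarray(g_ids)
    cmc = np.zeros(len(ranks))
    aps = []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])              # ranking list for query i
        matches = (g_ids[order] == q_ids[i]).astype(float)
        if matches.sum() == 0:
            continue                             # query has no true match
        first_hit = np.flatnonzero(matches)[0]
        cmc += [first_hit < r for r in ranks]    # CMC: hit within top-r
        prec = np.cumsum(matches) / (np.arange(matches.size) + 1)
        aps.append((prec * matches).sum() / matches.sum())
    return cmc / dist.shape[0], float(np.mean(aps))
```

For example, a query whose first true match sits at rank 2 misses CMC@1 but counts toward CMC@5, and its average precision is taken over all of its true matches in the ranked list.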
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.">Comparison with the State-of-the-arts</head><p>Evaluation on CUHK03: The comparison results on CUHK03 (767/700 splitting protocol) are presented in Table 2. Our M^3 model significantly boosts the baseline Rank@1 (mAP) performance of ResNet50, DenseNet121, HA-CNN and MLFN to 66.9% (60.7%), 61.6% (54.4%), 69.8% (63.5%) and 73.4% (71.2%), with relative improvements of 40.0% (29.7%), 50.2% (35.7%), 45.4% (33.4%) and 34.2% (44.7%) respectively. Even compared with the state-of-the-art method MGN <ref type="bibr">[31]</ref>, our results outperform it by 5% at Rank@1. The reason for such a large improvement is that the "hard" gallery distractors which are still indistinguishable under M_q are well handled by our M^3 method (Fig. <ref type="figure">4</ref>), so the ranking of true-match gallery targets is significantly improved.</p><p>Evaluation on Market1501: The superiority of our M^3 method is further verified by the experiments on Market1501. Table 2 demonstrates that although the state-of-the-art approach ABDNet <ref type="bibr">[6]</ref> has already achieved a very high performance (≥ 94%) on Market1501, the improvement by our M^3 is still over 3.7% (10%) on Rank@1 (mAP) based on ABDNet (visualization results in Fig. <ref type="figure">4</ref>).</p><p>Evaluation on DukeMTMC-reID: DukeMTMC-reID is a recent benchmark proposed for P-RID, but the latest methods have obtained promising performance. As shown in Table 2, the recently published OSNet <ref type="bibr">[46]</ref> has raised the state-of-the-art to 87.0% (70.2%). Our ABDNet+M^3 improves the Rank@1 (mAP) result to 87.5% (73.3%), which beats OSNet by a large margin on mAP.</p><p>Evaluation on MSMT17: MSMT17 is the latest and largest benchmark so far, which is particularly challenging due to the extremely large scale of identities and distractors. We evaluate the performance of the selected baselines on the MSMT17 dataset with (w/) and without (w/o) our M^3 model in Table 3. For all the baselines, our M^3 model significantly improves the Rank@1 (mAP) performance. The performance of ABDNet is boosted from 82.3% (60.8%) to a state-of-the-art level of 85.7% (64.2%). Table 3 verifies the scalability of our proposed M^3 model: even for extremely large-scale query/gallery sets, our method is still able to consistently improve the baseline performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>[Comparison results (Method, R@1, mAP) on CUHK03 (767/700), Market1501 and DukeMTMC-reID; the table content was lost in extraction.]</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 3</head><p>[Baseline vs. Baseline+M^3 results (R@1, mAP) on MSMT17; the table content was lost in extraction.]</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 4</head><p>Rank@1 (mAP) comparison with online re-ranking methods on CUHK03 / Market1501 / DukeMTMC-reID: HA-CNN <ref type="bibr">[18]</ref> 48.0 (47.6) / 90.6 (75.3) / 80.7 (64.4); HA-CNN+RR <ref type="bibr">[43]</ref> 54.8 (55.7) / 91.4 (79.0) / 82.5 (69.9); HA-CNN+OL <ref type="bibr">[45]</ref> [remaining rows lost in extraction].</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.">Comparison with Online P-RID Re-ranking</head><p>Two state-of-the-art online P-RID re-ranking methods, OL <ref type="bibr">[45]</ref> and RR <ref type="bibr">[43]</ref>, are compared with our M^3, since all three methods can be readily utilized at the online testing stage for further performance improvement. The comparison results in Table <ref type="bibr">4</ref> show that the query-specific method OL <ref type="bibr">[45]</ref> works better on improving Rank@1 performance but yields little improvement on mAP due to the lack of gallery-specific local discriminant enhancement. In contrast, since RR <ref type="bibr">[43]</ref> considers the k-reciprocal nearest neighbors of both query and gallery data, it achieves a large improvement on mAP but a limited improvement on Rank@1, owing to the lack of instance-specific local adaptation. Our M^3 outperforms the other two approaches significantly at both Rank@1 and mAP due to the full utilization of both the group-level visual similarity sharing information and instance-specific local discriminant enhancement.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 5</head><p>[Cross-set results (R@1, R@5, R@10, R@20, mAP) for Market1501 → DukeMTMC and DukeMTMC → Market1501; the table content was lost in extraction.]</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.4.">Cross-Set Generalization Ability Validation</head><p>We explore the generalization ability of our proposed M^3. We claim that the improvement by M^3 comes from the testing sample itself and is independent of how the baseline models are trained. Therefore we conduct a cross-set generalization validation experiment, as shown in Table 5. Following the setting in <ref type="bibr">[44]</ref>, the baseline model trained on Market1501 with our M^3 is evaluated on DukeMTMC-reID, and vice versa. The results show that our M^3 model is able to consistently and significantly improve the baseline performance regardless of whether the baseline is trained on same-source data or not.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.5.">Ablation Study</head><p>The Influence of Model Components: The final retrieval performance of Eqn. 11 relies on bi-directional retrieval matching, so the influence of each component is shown in Table 6. As can be seen, by keeping only the query-specific metric adaptation M_q or the gallery-centric one M_g, we can still achieve a significant improvement. By performing full-model bi-directional matching, however, the performance is further boosted by a large margin, which demonstrates the necessity of bi-directional local discriminant enhancement. More visualizations are shown in Fig. <ref type="figure">4</ref>.</p><p>The Influence of λ in Eqn. 11: The weighting parameter λ in Eqn. 11 balances the importance of M_q and M_g. The full CMC curves w.r.t. λ of HA-CNN on CUHK03, Market1501 and DukeMTMC-reID are plotted in Fig. <ref type="figure">5</ref>. As can be seen, setting λ = 1 gives the best performance: since we perform max-normalization on both M_q and M_g, over-weighting either side tends to suppress the other side's impact.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>Unlike previous online P-RID works, in this paper we propose a novel online joint multi-metric adaptation algorithm which not only takes the individual characteristics of testing samples into consideration but also fully utilizes the visual similarity relationships among both query and gallery samples. Our M^3 method can be readily applied to any existing P-RID baseline with a guarantee of performance improvement, and a theoretically sound optimization solution to M^3 keeps the online computational burden low. Compared with other state-of-the-art online P-RID refinement approaches, our method achieves significant improvements in Rank@1 (mAP) performance. Moreover, by applying our method to state-of-the-art baselines, their performance is further boosted by a large margin on four challenging large-scale P-RID benchmarks.</p></div>
		</text>
</TEI>
