[nPPE, vPPE, uPPE] = ccl_error_ppe(U_t, N_p, Pi)

Compute the normalised projected policy error (nPPE). This error measures the difference between the policy subject to the true constraints and the policy subject to the estimated constraints.

Input:

    U_t     True observations (one column per sample)
    N_p     Learnt projection matrix
    Pi      True nullspace policy (one column per sample)

Output:

    nPPE    Normalised projected policy error (scalar)
    vPPE    Variance of the nullspace policy (one value per dimension)
    uPPE    Projected policy error, unnormalised (one value per dimension)
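The three outputs can be written as formulas, read directly off the function body. The symbols below are notation introduced here for illustration and do not appear in the library: $D$ is the number of rows and $N$ the number of columns of `U_t`, $\mathbf{u}_n$ and $\boldsymbol{\pi}_n$ are the $n$-th columns of `U_t` and `Pi`, and $\hat{\mathbf{N}}$ stands for `N_p`.

```latex
\begin{align}
  \mathrm{uPPE}_d &= \frac{1}{N} \sum_{n=1}^{N}
      \left( u_{d,n} - \big[\hat{\mathbf{N}}\,\boldsymbol{\pi}_n\big]_d \right)^2 ,
      \qquad d = 1,\dots,D \\
  \mathrm{vPPE}_d &= \frac{1}{N-1} \sum_{n=1}^{N}
      \left( \pi_{d,n} - \bar{\pi}_d \right)^2 ,
      \qquad \bar{\pi}_d = \frac{1}{N} \sum_{n=1}^{N} \pi_{d,n} \\
  \mathrm{nPPE} &= \frac{\sum_{d=1}^{D} \mathrm{uPPE}_d}{\sum_{d=1}^{D} \mathrm{vPPE}_d}
\end{align}
```

The $1/(N-1)$ factor in the variance follows from the call `var(Pi,0,2)`, where a weight flag of 0 selects the unbiased estimator. The error term is divided by $N$ rather than $N-1$, so the two normalisations differ slightly for small sample counts.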
function [nPPE, vPPE, uPPE] = ccl_error_ppe(U_t, N_p, Pi)
% [nPPE, vPPE, uPPE] = ccl_error_ppe(U_t, N_p, Pi)
%
% Compute the normalised projected policy error (nPPE). This error measures the
% difference between the policy subject to the true constraints, and that of
% the policy subject to the estimated constraints.
%
% Input:
%
%     U_t     True observation
%     N_p     Learnt projection matrix
%     Pi      True nullspace policy
%
% Output:
%
%     nPPE    Normalised projected policy error
%     vPPE    Variance of the nullspace policy
%     uPPE    Projected policy error

% CCL: A MATLAB library for Constraint Consistent Learning
% Copyright (C) 2007  Matthew Howard
% Contact: matthew.j.howard@kcl.ac.uk
%
% This library is free software; you can redistribute it and/or
% modify it under the terms of the GNU Lesser General Public
% License as published by the Free Software Foundation; either
% version 2.1 of the License, or (at your option) any later version.
%
% This library is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
% Lesser General Public License for more details.
%
% You should have received a copy of the GNU Library General Public
% License along with this library; if not, write to the Free
% Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

dim_n = size(U_t,2) ;                   % number of samples (columns of U_t)
U_p   = zeros(size(U_t)) ;              % policy projected with the learnt matrix
for n = 1:dim_n
    U_p(:,n) = N_p*Pi(:,n) ;            % project the n-th policy output
end
uPPE = sum((U_t-U_p).^2,2) / dim_n ;    % mean squared error per dimension
vPPE = var(Pi,0,2) ;                    % variance of the policy per dimension
nPPE = sum(uPPE) / sum(vPPE) ;          % total error normalised by total variance
end
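A minimal usage sketch follows, assuming `ccl_error_ppe.m` is on the MATLAB path. The toy data, the variable names `t`, `N_t`, `N_bad`, and the choice of a projection that zeroes the first dimension are all hypothetical and serve only to show the expected argument shapes (samples stored as columns).

```matlab
% Hypothetical toy data: 2-D policy outputs, 100 samples stored as columns.
t   = linspace(0, 2*pi, 100);
Pi  = [sin(t); cos(t)];            % unconstrained (nullspace) policy outputs

% Hypothetical true projection: the constraint removes the first dimension.
N_t = [0 0; 0 1];
U_t = N_t * Pi;                    % observations under the true constraint

% Case 1: the learnt projection equals the true one, so the error vanishes.
[nPPE_exact, vPPE, uPPE] = ccl_error_ppe(U_t, N_t, Pi);

% Case 2: a poor estimate (identity, i.e. "no constraint") gives a larger error.
N_bad    = eye(2);
nPPE_bad = ccl_error_ppe(U_t, N_bad, Pi);

fprintf('nPPE (exact projection): %g\n', nPPE_exact);
fprintf('nPPE (identity estimate): %g\n', nPPE_bad);
```

Because nPPE is divided by the total variance of `Pi`, it is dimensionless, which should make values comparable across data sets with differently scaled policies.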