Risk based evaluations of competing agronomic climate adaptation strategies: Workflow example for rice planting strategies from crop growth simulation evidence
Maxwell Mkondiwa (m.mkondiwa@cgiar.org), Anton Urfels and Terry Hurley
1 Introduction
This notebook presents a novel approach for developing and targeting robust recommendations from crop simulation, long term experiments or multi-season survey data based on a risk optimization model.
For the computational model, Terry Hurley at University of Minnesota wrote the first Matlab code from which we have adapted to Octave and R.
To reproduce the analyses, one requires Octave installation. You can download Octave from here: https://octave.org/download.
Purpose: The commonly used risk measures focus on central moments (e.g., variance, conditional value at risk, skewness) of the distribution. Yield distributions overtime and space are however more complicated such that one may need to consider the whole distribution when evaluating which agronomic practice will likely work where and when. The use of stochastic dominance especially second order stochastic dominance allows the relationship between the cumulative distribution function of the outcome and the expected utility maximization behaviour under risk aversion. A computational approach developed by Hurley et al (2018) allows one to compute willingness to pay lower and upper bounds for a new technology to second order stochastically dominate an old practice such that any risk averse farmer will choose the new technology.
Advantages
Unlike mean-variance optimization, this optimization strategy considers distributional comparisons
Disadvantages
Computationally expensive especially when implementing across a large area of interest.
The comparisons are pairwise thereby requiring many combinations to come up with the best alternative for each pixel.
Difficult to apply with survey or agronomic datasets are it requires long timeseries. However, it is possible to implement the approach with monte-carlo simulated survey or agronomic trial datasets.
Stylized use case: Where to target sowing date advisories?
We use gridded crop growth simulation model results to identify scenarios that would be agronomically and economically beneficial even for a risk averse farmer.
Input data requirements
For the spatial exante (economic) component of the model, one only needs gridded crop simulation results for each of the scenarios.
Toolkit workflow
Figure 1 shows the workflow for implementation the computational second order stochastic dominance analysis.
2 Stylized example
In order to run this step, one needs to install Octave. The folder should contain the different functions that used for the optimization including CreateWTPBoundsbyCell and CreateTableData.
library(rio)ID=rep(1,1000)g=as.data.frame(g*1000)q=as.data.frame(q*1000)f=as.data.frame(f*1000)Rep=1:nrow(g)Area=rep(1,1000)QvsG=data.frame(ID,Rep,Area,q,g)export(QvsG,colNames=F,"QvsG.xlsx")QvsG=as.matrix(QvsG)colnames(QvsG) <-c(1,2,3,4,5)export(QvsG,"QvsG.csv")library(rhdf5)h5save(QvsG,file="QvsG.hdf")QvsF=data.frame(ID,Rep,Area,q,f)export(QvsF,colNames=F,"QvsF.xlsx")QvsF=as.matrix(QvsF)colnames(QvsF) <-c(1,2,3,4,5)export(QvsF,"QvsF.csv")h5save(QvsF,file="QvsF.hdf")FvsG=data.frame(ID,Rep,Area,f,g)export(FvsG,colNames=F,"FvsG.xlsx")FvsG=as.matrix(FvsG)colnames(FvsG) <-c(1,2,3,4,5)export(FvsG,"FvsG.csv")h5save(FvsG,file="FvsG.hdf")# Experiments -------------# G vs Q: Q as baseline -----------------GvsQ=data.frame(ID,Rep,Area,g,q)export(GvsQ,colNames=F,"GvsQ.xlsx")GvsQ=as.matrix(GvsQ)colnames(GvsQ) <-c(1,2,3,4,5)export(GvsQ,"GvsQ.csv")h5save(GvsQ,file="GvsQ.hdf")## F vs Q: Q as baseline ---------------FvsQ=data.frame(ID,Rep,Area,f,q)export(FvsQ,colNames=F,"FvsQ.xlsx")FvsQ=as.matrix(FvsQ)colnames(FvsQ) <-c(1,2,3,4,5)export(FvsQ,"FvsQ.csv")h5save(FvsQ,file="FvsQ.hdf")## G vs F: F as baseline ----------------GvsF=data.frame(ID,Rep,Area,g,f)export(GvsF,colNames=F,"GvsF.xlsx")GvsF=as.matrix(GvsF)colnames(GvsF) <-c(1,2,3,4,5)export(GvsF,"GvsF.csv")h5save(GvsF,file="GvsF.hdf")
2.3 Q vs G
The computational model consists of three main functions.
2.3.0.1 SOSDIntegralTest function
function [result] = SOSDIntegralTestv3(DOMINANT,DOMINATED)
% Tests for second-order stochastic dominance assuming uniform distribution
% over vector elements
% Algorithm Adapted from Levy, H. (2006). Stochastic Dominance: Investment Decision
% Making Under Uncertainty (Second Edition). Springer. New York, NY.
% Pages 180-182.
% result = 1 implies suspected DOMINANT distribution is in fact SOSD over DOMINATED
% result = 0 implies suspected DOMINANT distribution is not SOSD over DOMINATED
len1 = length(DOMINANT);
len2 = length(DOMINATED);
if len1 ~= len2
error('Lengths do not match!');
end
DOMINANTSORT = sortrows(DOMINANT,1);
DOMINATEDSORT = sortrows(DOMINATED,1);
work = zeros(len1,2);
flag1 = 1; % stays 1 if cumulative of suspected dominated >= cumulative of suspected dominant
flag2 = 0; % stays 0 unless cumulative of supected dominated > cumulative suspected dominant for some value
for ind = 1:len1 % Check to see if suspected dominant distribution satisfies SOSD integral condition
% relative to the suspected dominated distribution
if ind == 1
work(ind,1) = DOMINANTSORT(ind,1);
work(ind,2) = DOMINATEDSORT(ind,1);
else
work(ind,1) = DOMINANTSORT(ind,1) + work(ind - 1,1);
work(ind,2) = DOMINATEDSORT(ind,1) + work(ind - 1,2);
end
if work(ind,1) < work(ind,2)
flag1 = 0; % Not SOSD because suspected dominant distribution has higher cumulative value than suspected dominated
end
if work(ind,1) > work(ind,2)
flag2 = 1; % To be SOSD the suspected dominated distribution must have higher cumulative
% distribution than suspected dominant distribution for at least one value
end
end
result = flag1 * flag2;
2.3.0.2 ConstRiskThresh function
function [Result] = ConstRiskThresh(Data, CompVarNum, BaseVarNum)
% INPUT Matrices: ndataseq01060216 and ndataseq10060216 from crop
% simulation models with Columns
% 1 = cell30m/Unique Cell ID
% 2 = Replication
% 3 = Maize Area in Cell (ha)
% 4 = Maize Yield (kg/ha) for CM1510 with 40 kg N / yldCM1040
% 5 = Maize Yield (kg/ha) for CM1510 with 0 kg N / yldCM1000
CompManNum = 1;
if CompVarNum == 5
CompManNum = 0;
end
BaseManNum = 1;
if BaseVarNum == 4
BaseManNum = 0;
end
% Outputs
% 1 = cell30m
% # = Comparison Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)
% 2 = Comparison Nitrogen ID
% # = Base Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)
% 3 = Base Nitrogen Management ID
% 4 = Mean Yield for Comp
% 5 = Standard Deviation of Yield for Comp
% 6 = CV of Yield for Comp
% 7 = Maximum Yield for Comp
% 8 = Minimum Yield for Comp
% 9 = Probability of Crop Failure for Comp
% 10 = Min Proportion for Comp to SOSD Base
% 11 = Mean Yield for Base
% 12 = Standard Deviation of Yield for Base
% 13 = CV of Yield for Base
% 14 = Maximum Yield for Base
% 15 = Minimum Yield for Base
% 16 = Probability of Crop Failure for Base
% 17 = Min Proportion for Base to SOSD Comp
% 18 = Difference in mean Comp - Base
% 19 = Difference in standard deviation Comp - Base
% 20 = Difference in CV Comp - Base
% 21 = Difference in Prob of Crop Failure Comp - Base
% 22 = Min Proportion for Comp to SOSD Base divided by average base yield
% 23 = Min Proportion for Base to SOSD Comp divided by average base yield
% 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base
% 25 = Maize Area
% Get list of cell30m
CELLIDS = Data;
CELLIDScondy = CELLIDS(:,2) ~= 1;
CELLIDS(CELLIDScondy,:) = [];
CELLIDS(:,2:3) = [];
LEN = length(CELLIDS);
Result = ones(LEN, 25);
count = 1;
%while count <= LEN
id = CELLIDS(count,1);
Yields = Data;
cellcond = Yields(:,1) ~= id;
Yields(cellcond,:) = [];
TEMP = SOSDConstBoundsv3(Yields(:,CompVarNum),Yields(:,BaseVarNum));
Result(count, 1) = id;
% Result(count, 2) = CompVar;
Result(count, 2) = CompManNum;
% Result(count, 4) = BaseVar;
Result(count, 3) = BaseManNum;
Result(count, 4:24) = TEMP.';
Result(count, 25) = Yields(1,3);
% count = count + 1;
%end
2.3.0.3 SOSDConstBounds
function [Result] = SOSDConstBoundsv3(DIST1,DIST2)
% Finds the min number where DIST1 (the comparison distribution) SOSD DIST2 (the base distribution) and
% the min number where DIST2 SOSD DIST1 in addition to a variety of
% descriptive statistics for DIST1 and DIST2
if size(DIST1,2) > 1
error('DIST1 must be a column vector!')
end
if size(DIST2,2) > 1
error('DIST2 must be a column vector!')
end
% Output Result row definitions
MEANDIST1 = 1;
SDDIST1 = 2;
CVDIST1 = 3;
MAXDIST1 = 4;
MINDIST1 = 5;
PRCROPFAILDIST1 = 6;
DIST1MINPROP = 7;
MEANDIST2 = 8;
SDDIST2 = 9;
CVDIST2 = 10;
MAXDIST2 = 11;
MINDIST2 = 12;
PRCROPFAILDIST2 = 13;
DIST2MINPROP = 14;
MEANDIFF = 15;
SDDIFF = 16;
CVDIFF = 17;
CFDIFF = 18;
RELCOMPSOSDBASE = 19;
RELBASESOSDCOMP = 20;
DELTARISK = 21;
Result = -999999 * ones(21,1); % Initialize Results
% Calculate the mean, standard deviation, coeffcient of variation, minimum
% and maximum for DIST1
Result(MEANDIST1,1) = mean(DIST1);
Result(SDDIST1,1) = std(DIST1);
if Result(MEANDIST1,1) > 0
Result(CVDIST1,1) = Result(SDDIST1,1) / Result(MEANDIST1,1);
end
Result(MAXDIST1,1) = max(DIST1);
Result(MINDIST1,1) = min(DIST1);
ldist1 = length(DIST1);
% Calculate the mean, standard deviation, coeffcient of variation, minimum
% and maximum for DIST2
Result(MEANDIST2,1) = mean(DIST2);
Result(SDDIST2,1) = std(DIST2);
if Result(MEANDIST2,1) > 0
Result(CVDIST2,1) = Result(SDDIST2,1) / Result(MEANDIST2,1);
end
Result(MAXDIST2,1) = max(DIST2);
Result(MINDIST2,1) = min(DIST2);
ldist2 = length(DIST2);
% Calculate differences in descriptive statistics between DIST2 and DIST1
if Result(MEANDIST1,1) == 0 && Result(MEANDIST2,1) == 0
Result(MEANDIFF,1) = -999999;
else
Result(MEANDIFF,1) = Result(MEANDIST1,1) - Result(MEANDIST2,1);
end
if Result(SDDIST1,1) == 0 && Result(SDDIST2,1) == 0
Result(SDDIFF,1) = -999999;
else
Result(SDDIFF,1) = Result(SDDIST1,1) - Result(SDDIST2,1);
end
if Result(CVDIST1,1) >= 0 && Result(CVDIST2,1) >= 0
Result(CVDIFF,1) = Result(CVDIST1,1) - Result(CVDIST2,1);
else
Result(CVDIFF,1) = -999999;
end
% Initialize Thresholds for Stopping Golden Section Search
thresh = 0.00001;
bailthresh = 1000000;
if Result(MEANDIST1,1) > 0 || Result(MEANDIST2,1) > 0
% Calculate Probability of crop failiure for DIST1
CropFailDIST1 = DIST1;
CropFailDIST1cond = DIST1(:,1) ~= 0;
CropFailDIST1(CropFailDIST1cond,:) = [];
Result(PRCROPFAILDIST1,1) = length(CropFailDIST1) / ldist1;
% Calculate Probability of crop failiure for DIST1
CropFailDIST2 = DIST2;
CropFailDIST2cond = DIST2(:,1) ~= 0;
CropFailDIST2(CropFailDIST2cond,:) = [];
Result(PRCROPFAILDIST2,1) = length(CropFailDIST2) / ldist2;
tcomp = Result(MAXDIST2,1) - Result(MINDIST1,1); % Maximum amount that DIST1 can be shifted to ensure it is SOSD
tbase = Result(MINDIST2,1) - Result(MAXDIST1,1); % Maximum amount that DIST1 can be shifted back to ensure DIST2 is SOSD
% Initialize Golden Section Search upper and lower starting points
upper = 0;
lower = 0;
if tcomp > tbase
upper = tcomp;
lower = tbase;
elseif tcomp < tbase
upper = tbase;
lower = tcomp;
else
upper = tbase + 10;
lower = tbase - 10;
end
% Initialize Flags to Test for Convergence
bail = 0;
converge = 0;
while converge ~= 1 && bail < bailthresh % Find the minimum proportion that makes DIST1 SOSD DIST2
middle = (lower + upper) / 2;
if SOSDIntegralTestv3(DIST1 + middle, DIST2) == 1
upper = middle;
else
lower = middle;
end
if lower > upper
error('lower > upper!')
end
if (upper - lower) <= thresh % Convergence acheived when upper and lower are within thresh tolerance
converge = 1;
end
bail = bail + 1;
end
if converge == 1
Result(DIST1MINPROP,1) = upper;
if Result(MEANDIST2,1) > 0
Result(RELCOMPSOSDBASE,1) = upper / Result(MEANDIST2,1);
end
else % Golden Section Search Failed to converge because Threshold not met
Result(DIST1MINPROP,1) = -777777;
Result(RELCOMPSOSDBASE,1) = -777777;
end
% Initialize Golden Section Search upper and lower starting points
if tcomp > tbase
upper = tcomp;
lower = tbase;
elseif tcomp < tbase
upper = tbase;
lower = tcomp;
else
upper = tbase + 10;
lower = tbase - 10;
end
% Initialize Flags to Test for Convergence
bail = 0;
converge = 0;
while converge ~= 1 && bail < bailthresh % Find the minimum proportion that makes DIST2 SOSD DIST1
middle = (lower + upper) / 2;
if SOSDIntegralTestv3(DIST2, DIST1 + middle) == 1
lower = middle;
else
upper = middle;
end
if lower > upper
error('lower > upper!')
end
if (upper - lower) <= thresh % Convergence acheived when upper and lower are within thresh tolerance
converge = 1;
end
bail = bail + 1;
end
if converge == 1
Result(DIST2MINPROP,1) = lower;
if Result(MEANDIST2,1) > 0
Result(RELBASESOSDCOMP,1) = lower / Result(MEANDIST2,1);
end
else
Result(DIST2MINPROP,1) = -777777;
Result(RELBASESOSDCOMP,1) = -777777;
end
if Result(PRCROPFAILDIST1,1) >= 0 && Result(PRCROPFAILDIST2,1) >= 0
Result(CFDIFF,1) = Result(PRCROPFAILDIST1,1) - Result(PRCROPFAILDIST2,1);
else
Result(CFDIFF,1) = -999999;
end
% Categorize Risk: DIST1 More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to DIST2
if Result(DIST1MINPROP,1) <= -777777 || Result(DIST2MINPROP,1) <= -777777
Result(DELTARISK,1) = -999999;
elseif Result(DIST1MINPROP,1) > 0 && Result(DIST2MINPROP,1) > 0
Result(DELTARISK,1) = -1;
elseif Result(DIST1MINPROP,1) < 0 && Result(DIST2MINPROP,1) < 0
Result(DELTARISK,1) = 1;
else
Result(DELTARISK,1) = 0;
end
end
2.3.0.4 CreateWTPBoundsbyCell
% INPUT Matrices: ndataseq01060216 and ndataseq10060216 from crop
% simulation models with Columns
% 1 = Unique Cell ID
% 2 = Replication
% 3 = Area
% 4 = Base
% 5 = New
%
%
% OUTPUT Matrix: RA with columns
% 1 = cell30m
% 2 = Comparison ID
% 3 = Base ID
% 4 = Mean Yield for Comp
% 5 = Standard Deviation of Yield for Comp
% 6 = CV of Yield for Comp
% 7 = Maximum Yield for Comp
% 8 = Minimum Yield for Comp
% 9 = Probability of Crop Failure for Comp
% 10 = Min Proportion for Comp to SOSD Base
% 11 = Mean Yield for Base
% 12 = Standard Deviation of Yield for Base
% 13 = CV of Yield for Base
% 14 = Maximum Yield for Base
% 15 = Minimum Yield for Base
% 16 = Probability of Crop Failure for Base
% 17 = Min Proportion for Base to SOSD Comp
% 18 = Difference in mean Comp - Base
% 19 = Difference in standard deviation Comp - Base
% 20 = Difference in CV Comp - Base
% 21 = Difference in Prob of Crop Failure Comp - Base
% 22 = Min Proportion for Comp to SOSD Base divided by average base yield
% 23 = Min Proportion for Base to SOSD Comp divided by average base yield
% 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base
% 25 = Area
% Compare Q to G
RA_yearseq_01 = ConstRiskThresh(ndataseq01060216, 4, 5);
LEN = length(RA_yearseq_01);
apones = ones(LEN,1);
clear apones LEN;
RA = [RA_yearseq_01];
RA=RA(1,:)
Variables visible from the current scope:
variables in scope: top scope
Attr Name Size Bytes Class
==== ==== ==== ===== =====
ndataseq01060216 1001x5 40040 double
Total is 5005 elements using 40040 bytes
RA =
Columns 1 through 6:
1.0000e+00 1.0000e+00 1.0000e+00 5.2463e+03 7.7658e+02 1.4802e-01
Columns 7 through 12:
7.8655e+03 4.0000e+00 0 7.6559e+02 5.9915e+03 7.9440e+02
Columns 13 through 18:
1.3259e-01 7.9778e+03 5.0000e+00 0 1.0000e+00 -7.4515e+02
Columns 19 through 24:
-1.7814e+01 1.5436e-02 0 1.2778e-01 1.6690e-04 -1.0000e+00
Column 25:
3.0000e+00
3.0.0.1 Calculate descriptive statistics
This part can be done in R or octave. In octave, the function we use is CreateTableData
% Using RA
% 1 = yearseq
% 2 = cell30m
% 3 = Comparison Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)
% 4 = Comparison Nitrogen ID
% 5 = Base Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)
% 6 = Base Nitrogen Management ID
% 7 = Mean Yield for Comp
% 8 = Standard Deviation of Yield for Comp
% 9 = CV of Yield for Comp
% 10 = Maximum Yield for Comp
% 11 = Minimum Yield for Comp
% 12 = Probability of Crop Failure for Comp
% 13 = Min Proportion for Comp to SOSD Base
% 14 = Mean Yield for Base
% 15 = Standard Deviation of Yield for Base
% 16 = CV of Yield for Base
% 17 = Maximum Yield for Base
% 18 = Minimum Yield for Base
% 19 = Probability of Crop Failure for Base
% 20 = Min Proportion for Base to SOSD Comp
% 21 = Difference in mean Comp - Base
% 22 = Difference in standard deviation Comp - Base
% 23 = Difference in CV Comp - Base
% 24 = Difference in Prob of Crop Failure Comp - Base
% 25 = Min Proportion for Comp to SOSD Base divided by average base yield
% 26 = Min Proportion for Base to SOSD Comp divided by average base yield
% 27 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base
% 28 = Harvest Acres
% Create DescriptiveStat
% Rows
% 1 = Weighted Mean UB
% 2 = Weighted S.D. UB
% 3 = Minimum UB
% 4 = 10th Percentile UB
% 5 = 25th Percentile UB
% 6 = Median UB
% 7 = 75th Percentile UB
% 8 = 90th Percentile UB
% 9 = Maximum UB
% 10 = Weighted Mean LB
% 11 = Weighted S.D. LB
% 12 = Minimum LB
% 13 = 10th Percentile LB
% 14 = 25th Percentile LB
% 15 = Median LB
% 16 = 75th Percentile LB
% 17 = 90th Percentile LB
% 18 = Maximum LB
% 19 = Proportion of Acres in Green
% 20 = Proportion of Acres in Yellow
% 21 = Proportion of Acres in Red
% 22 = Total Acres
% 23 = Number of Cells
% Columns
% 1 = RC_01_00_00_09_00
% 2 = RC_01_00_00_10_00
% 3 = RC_01_00_40_00_00
% 4 = RC_01_00_40_09_00
% 5 = RC_01_00_40_10_00
% 6 = RC_01_09_40_09_00
% 7 = RC_01_10_40_10_00
% 8 = RC_10_00_00_09_00
% 9 = RC_10_00_00_10_00
% 10 = RC_10_00_40_00_00
% 11 = RC_10_00_40_09_00
% 12 = RC_10_00_40_10_00
% 13 = RC_10_09_40_09_00
% 14 = RC_10_10_40_10_00
% Define Scenarios of interest
scenarios = [ 1, 0];
ScenariosLEN = length(scenarios);
DescriptiveStat = -999999 * ones(23, 1);
for ind = 1
datatemp = RA;
CW = scenarios(ind, 1);
if (scenarios(ind, 1) == 1)
cellcond = (datatemp(:,2) ~= CW | datatemp(:,25) <= 0);
end
if (scenarios(ind, 1) == 0 )
cellcond = (datatemp(:,2) ~= CW | datatemp(:,25) <= 0);
end
datatemp(cellcond,:) = [];
if CW == 1
COL = 1;
elseif CW == 0
COL = 2;
else
error('Something is wrong');
end
SampleN = size(datatemp, 1);
totalacres = sum(datatemp(:, 25));
wsumacresxLB = 0;
wsumacresxUB = 0;
wsumacresxLB2 = 0;
wsumacresxUB2 = 0;
percentilesLB = -99999999 * ones(2, SampleN);
percentilesUB = -99999999 * ones(2, SampleN);
propred = 0;
propgreen = 0;
for statind = 1:SampleN
acres = datatemp(statind, 25);
wtpLB = -1 * datatemp(statind, 17) / 1000;
wtpUB = -1 * datatemp(statind, 10) / 1000;
wsumacresxLB = wsumacresxLB + wtpLB * (acres/totalacres);
wsumacresxUB = wsumacresxUB + wtpUB * (acres/totalacres);
wsumacresxLB2 = wsumacresxLB2 + (wtpLB ^ 2) * (acres/totalacres);
wsumacresxUB2 = wsumacresxUB2 + (wtpUB ^ 2) * (acres/totalacres);
percentilesLB(statind, 1) = wtpLB;
percentilesLB(statind, 2) = acres/totalacres;
percentilesUB(statind, 1) = wtpUB;
percentilesUB(statind, 2) = acres/totalacres;
if datatemp(statind, 24) == -1
propred = propred + acres/totalacres;
elseif datatemp(statind, 24) == 1
propgreen = propgreen + acres/totalacres;
end
clear acres wtpLB wtpUB;
end
percentilesLB = sortrows(percentilesLB);
percentilesUB = sortrows(percentilesUB);
cumLB = 0;
cumUB = 0;
for perind = 1:SampleN
cumLBLast = cumLB;
cumUBLast = cumUB;
intervalLB = percentilesLB(perind, 2);
intervalUB = percentilesUB(perind, 2);
cumLB = cumLB + intervalLB;
cumUB = cumUB + intervalUB;
if cumLBLast <= 0.1 && cumLB >= 0.1
DescriptiveStat( 4, COL) = ((0.1 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.1) / intervalLB) * percentilesLB(perind, 1) ;
end
if cumUBLast <= 0.1 && cumUB >= 0.1
DescriptiveStat(13, COL) = ((0.1 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.1) / intervalUB) * percentilesUB(perind, 1) ;
end
if cumLBLast <= 0.25 && cumLB >= 0.25
DescriptiveStat( 5, COL) = ((0.25 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.25) / intervalLB) * percentilesLB(perind, 1) ;
end
if cumUBLast <= 0.25 && cumUB >= 0.25
DescriptiveStat(14, COL) = ((0.25 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.25) / intervalUB) * percentilesUB(perind, 1) ;
end
if cumLBLast <= 0.5 && cumLB >= 0.5
DescriptiveStat( 6, COL) = ((0.5 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.5) / intervalLB) * percentilesLB(perind, 1) ;
end
if cumUBLast <= 0.5 && cumUB >= 0.5
DescriptiveStat(15, COL) = ((0.5 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.5) / intervalUB) * percentilesUB(perind, 1) ;
end
if cumLBLast <= 0.75 && cumLB >= 0.75
DescriptiveStat( 7, COL) = ((0.75 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.75) / intervalLB) * percentilesLB(perind, 1) ;
end
if cumUBLast <= 0.75 && cumUB >= 0.75
DescriptiveStat(16, COL) = ((0.75 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.75) / intervalUB) * percentilesUB(perind, 1) ;
end
if cumLBLast <= 0.9 && cumLB >= 0.9
DescriptiveStat( 8, COL) = ((0.9 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.9) / intervalLB) * percentilesLB(perind, 1) ;
end
if cumUBLast <= 0.9 && cumUB >= 0.9
DescriptiveStat(17, COL) = ((0.9 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.9) / intervalUB) * percentilesUB(perind, 1) ;
end
clear cumLBLast cumUBLast intervalLB intervalUB;
end
DescriptiveStat( 1, COL) = wsumacresxLB;
DescriptiveStat( 2, COL) = sqrt(wsumacresxLB2 - wsumacresxLB ^ 2);
DescriptiveStat( 3, COL) = percentilesLB(1, 1);
DescriptiveStat( 9, COL) = percentilesLB(SampleN, 1);
DescriptiveStat(10, COL) = wsumacresxUB;
DescriptiveStat(11, COL) = sqrt(wsumacresxUB2 - wsumacresxUB ^ 2);
DescriptiveStat(12, COL) = percentilesUB(1, 1);
DescriptiveStat(18, COL) = percentilesUB(SampleN, 1);
DescriptiveStat(19, COL) = propgreen;
DescriptiveStat(20, COL) = 1 - propred - propgreen;
DescriptiveStat(21, COL) = propred;
DescriptiveStat(22, COL) = totalacres;
DescriptiveStat(23, COL) = SampleN;
clear datatemp CW cellcond COL SampleN totalacres ...
wsumacresxLB wsumacresxUB wsumacresxUB2 wsumacresxLB2 ...
percentilesLB percentilesUB propgreen propred perind cumLB cumUB statind;
end
clear scenarios ScenariosLEN yearseq ind;
Variables visible from the current scope:
variables in scope: top scope
Attr Name Size Bytes Class
==== ==== ==== ===== =====
ndataseq01060216 1001x5 40040 double
Total is 5005 elements using 40040 bytes
RA =
Columns 1 through 6:
1.0000e+00 1.0000e+00 1.0000e+00 5.9915e+03 7.9441e+02 1.3259e-01
Columns 7 through 12:
7.9778e+03 4.0000e+00 0 1.0000e+00 5.2463e+03 7.7658e+02
Columns 13 through 18:
1.4802e-01 7.8655e+03 5.0000e+00 0 -7.6559e+02 7.4515e+02
Columns 19 through 24:
1.7828e+01 -1.5434e-02 0 1.9061e-04 -1.4593e-01 0
Column 25:
3.0000e+00
DescriptiveStat_GvsQ =
7.6559e-01
0
-1.0000e+08
-1.0000e+06
-1.0000e+06
-1.0000e+06
-1.0000e+06
-1.0000e+06
-1.0000e+08
-1.0000e-03
0
-1.0000e+08
-1.0000e+06
-1.0000e+06
-1.0000e+06
-1.0000e+06
-1.0000e+06
-1.0000e+08
0
1.0000e+00
0
3.0000e+00
1.0000e+00
We can see here now that G stochastically dominates Q.
5 Bihar Model Application
The example above has shown for a case where there is one location in which there are multiple years of data. In the case of when one has run a crop growth model and has ncdf4 outputs, then one can use the same approach to estimate the model for the whole state.
Code
# Subset for each scenario and for Bihar and then send cell in row and var_year in columns# load some librarieslibrary(rasterVis)library(raster)library(rgdal)library(colorspace)library(RColorBrewer)#library(tidyverse)library(ggplot2)library(reshape2)library(ggtext)library(plyr)# define key files#load files to be used laterfile1 <-"data/simulations/s001.nc4"file2 <-"data/simulations/s002.nc4"file_aoi <-"data/aoi_mask/aoi_mask.tif"india_states <-"data/gadm40_IND_shp/gadm40_IND_1.shp"# loading key dataaoi <-raster(file_aoi)bihar <-readOGR(india_states)
OGR data source with driver: ESRI Shapefile
Source: "C:\Users\MMKONDIWA\OneDrive - CIMMYT\Documents\GitHub\WTP_Bounds_SOSD_Risk_Model\data\gadm40_IND_shp\gadm40_IND_1.shp", layer: "gadm40_IND_1"
with 36 features
It has 11 fields
Code
aoi <- bihar[bihar$NAME_1=="Bihar",]plot(aoi)
Code
# functions for loading data load_sims <-function(files,var,aoi,operator,var2) {# check if operators is used, is NA, just load one varif (is.na(operator)==TRUE) { l <-list()#cycle through file names to load all different scenariosfor (i in files) { j <-which(i==files)#load file and add to list l[[j]] <- raster::stack(i,varname= var)[[3:35]] }} else { # if operator is used, load to vars with operators l <-list()for (i in files) { j <-which(i==files)# important: need to pass operator with backticks, not as stringif (var=="rice_yield") { a <- raster::stack(i,varname= var)[[3:35]] *3600}else {a <- raster::stack(i,varname= var)[[3:35]] }if (var=="wheat_yield") { b <- raster::stack(i,varname= var2)[[3:35]] *3340}else {b <- raster::stack(i,varname= var2)[[3:35]] } l[[j]] <-operator(a,b) } } l <-lapply(l,function(x){raster::mask(x,aoi)}) l <-lapply(l,function(x){raster::crop(x,aoi)})return(l)}# start extraction for ricecreate_csv <-function(crop,var) {files <-c(file1,file2)raster_df <-load_sims(files,var,aoi,NA)list_df <-lapply(raster_df, getValues)list_df <-lapply(list_df, na.omit)lapply(list_df, nrow)list_df <-lapply(list_df, "colnames<-", paste0(var,"_",1983:2015))#Scenarios# Fixed/onset is the rice planting strategy# Medium/long is the variety# suppl is where we ran supplementary irrigation instead of full irrigation. Use with caution.scen <-c("baseline","fixed_long")for (i in1:length(scen)) {write.csv(list_df[[i]],paste0("",crop,"_",scen[i],".csv"))}}create_csv("rice","rice_yield")create_csv("wheat","wheat_yield")
5.1 Clean for Octave
Code
# packages ---------------------------------------------------------------------library(dplyr)library(data.table)library(tidyr)library(rio)# RICE--------------------------------------------------------------------------## RICE-- Fixed long ---------------------------------------rice_fixed_long=read.csv("rice_fixed_long.csv")#rice_fixed_long <- na_if(rice_fixed_long, '-99') #Replace all -99 with NA#rice_fixed_long <- na_if(rice_fixed_long, '-999999')rice_fixed_long=rename(rice_fixed_long,ID=X)rice_fixed_long$rice_area=rep(1,3386)rice_fixed_long_long <-melt(setDT(rice_fixed_long), id.vars =c("ID","rice_area"),value.name=c("fixed_long_rice_yield") ,variable.name ="year")rice_fixed_long_long$year=as.numeric(gsub('[^0-9]', '',rice_fixed_long_long$year))## RICE-- Baseline ---------------------------rice_baseline=read.csv("rice_baseline.csv")#rice_baseline <- na_if(rice_baseline, '-99') #Replace all -99 with NA#rice_baseline <- na_if(rice_baseline, '-999999')rice_baseline=rename(rice_baseline,ID=X)rice_baseline_long <-melt(setDT(rice_baseline), id.vars =c("ID"),value.name=c("baseline_rice_yield") ,variable.name ="year")rice_baseline_long$year=as.numeric(gsub('[^0-9]', '',rice_baseline_long$year))rice_baseline_fixedlong=merge(rice_fixed_long_long,rice_baseline_long,by=c("ID","year"))rice_baseline_fixedlong$year=rice_baseline_fixedlong$year-1982rice_baseline_fixedlong=as.data.frame(rice_baseline_fixedlong)rice_baseline_fixedlong$baseline_rice_yield[is.na(rice_baseline_fixedlong$baseline_rice_yield)]=0rice_baseline_fixedlong$fixed_long_rice_yield[is.na(rice_baseline_fixedlong$fixed_long_rice_yield)]=0rice_baseline_fixedlong=rice_baseline_fixedlong[,c(1,2,3,5,4)]export(rice_baseline_fixedlong,colNames=F,"rice_baseline_fixedlong_s.xlsx")library(rhdf5)rice_baseline_fixedlong=as.matrix(rice_baseline_fixedlong)colnames(rice_baseline_fixedlong) <-c(1,2,3,4,5)export(rice_baseline_fixedlong,"rice_baseline_fixedlong.csv")h5save(rice_baseline_fixedlong,file="rice_baseline_fixedlong.hdf",createnewfile =TRUE)rice_baseline_fixedlong=as.data.frame(rice_baseline_fixedlong)# Remove all zeros and nasrice_baseline_fixedlong_nonzero=subset(rice_baseline_fixedlong, rice_baseline_fixedlong$fixed_long_rice_yield!=0)rice_baseline_fixedlong_nonzero=subset(rice_baseline_fixedlong_nonzero, rice_baseline_fixedlong_nonzero$baseline_rice_yield!=0)export(rice_baseline_fixedlong_nonzero,colNames=F,"rice_baseline_fixedlong_nonzero.xlsx")library(rhdf5)rice_baseline_fixedlong_nonzero=as.matrix(rice_baseline_fixedlong_nonzero)colnames(rice_baseline_fixedlong_nonzero) <-c(1,2,3,4,5)h5save(rice_baseline_fixedlong_nonzero,file="rice_baseline_fixedlong_nonzero.hdf",createnewfile =TRUE)# WHEAT ---------------------------------------------------------------------------------------------------## wheat-- fixedlong ---------------------------------------wheat_fixedlong=read.csv("wheat_fixed_long.csv")#wheat_fixedlong <- na_if(wheat_fixedlong, '-99') #Replace all -99 with NA#wheat_fixedlong <- na_if(wheat_fixedlong, '-999999') #Replace all -99 with NAwheat_fixedlong=rename(wheat_fixedlong,ID=X)wheat_fixedlong$wheat_area=rep(1,3386)wheat_fixedlong_long <-melt(setDT(wheat_fixedlong), id.vars =c("ID","wheat_area"),value.name=c("fixedlong_wheat_yield") ,variable.name ="year")wheat_fixedlong_long$year=as.numeric(gsub('[^0-9]', '',wheat_fixedlong_long$year))## wheat-- Baseline ---------------------------wheat_baseline=read.csv("wheat_baseline.csv")#wheat_baseline <- na_if(wheat_baseline, '-99') #Replace all -99 with NA#wheat_baseline <- na_if(wheat_baseline, '-999999') #Replace all -99 with NAwheat_baseline=rename(wheat_baseline,ID=X)wheat_baseline_long <-melt(setDT(wheat_baseline), id.vars =c("ID"),value.name=c("baseline_wheat_yield") ,variable.name ="year")wheat_baseline_long$year=as.numeric(gsub('[^0-9]', '',wheat_baseline_long$year))wheat_baseline_long=merge(wheat_fixedlong_long,wheat_baseline_long,by=c("ID","year"))wheat_baseline_long$year=wheat_baseline_long$year-1982wheat_baseline_long=as.data.frame(wheat_baseline_long)wheat_baseline_long$baseline_wheat_yield[is.na(wheat_baseline_long$baseline_wheat_yield)]=0wheat_baseline_long$baseline_wheat_yield[is.na(wheat_baseline_long$baseline_wheat_yield)]=0wheat_baseline_long=wheat_baseline_long[,c(1,2,3,5,4)]export(wheat_baseline_long,colNames=F,"wheat_baseline_long_s.xlsx")library(rhdf5)wheat_baseline_long=as.matrix(wheat_baseline_long)colnames(wheat_baseline_long) <-c(1,2,3,4,5)export(wheat_baseline_long,"wheat_baseline_fixedlong.csv")h5save(wheat_baseline_long,file="wheat_baseline_long.hdf",createnewfile =TRUE)# Remove all zeros and naswheat_baseline_long=as.data.frame(wheat_baseline_long)wheat_baseline_long_nonzero=subset(wheat_baseline_long, wheat_baseline_long$baseline_wheat_yield!=0)wheat_baseline_long_nonzero=subset(wheat_baseline_long_nonzero, wheat_baseline_long_nonzero$baseline_wheat_yield!=0)export(wheat_baseline_long_nonzero,colNames=F,"wheat_baseline_long_nonzero.xlsx")#library(rhdf5)#wheat_baseline_long_nonzero=as.matrix(wheat_baseline_long_nonzero)#colnames(wheat_baseline_long_nonzero) <- c(1,2,3,4,5)#h5save(wheat_baseline_long_nonzero,file="wheat_baseline_long_nonzero.hdf",createnewfile = TRUE)#h5closeAll()
Variables visible from the current scope:
variables in scope: top scope
Attr Name Size Bytes Class
==== ==== ==== ===== =====
ndataseq01060216 111739x5 4469560 double
Total is 558695 elements using 4469560 bytes
DescriptiveStat_rice_baseline_fixedlong =
-8.8478e-01
1.5947e+00
-2.2825e+00
-2.0022e+00
-1.9445e+00
-1.7215e+00
-4.1993e-01
1.8642e+00
4.8809e+00
-3.2533e+00
1.4588e+00
-6.6922e+00
-5.4027e+00
-4.3245e+00
-3.0953e+00
-2.1248e+00
-1.8775e+00
7.5890e-01
1.2987e-02
1.9362e-01
7.9339e-01
3.3880e+03
3.3860e+03
For about 79% of the cells, farmer practice is sub-optimal.
5.2.1.1 Mapping the results
Code
RA_rice_baseline_fixedlong<-import("RA_rice_baseline_fixedlong.txt", col_names =FALSE)# RA column names# % 1 = cellID# % 2 = Comparison Technology ID# % 3 = Base Technology ID# % 4 = Mean Yield for Comp# % 5 = Standard Deviation of Yield for Comp# % 6 = CV of Yield for Comp# % 7 = Maximum Yield for Comp# % 8 = Minimum Yield for Comp# % 9 = Probability of Crop Failure for Comp# % 10 = Min Proportion for Comp to SOSD Base# % 11 = Mean Yield for Base# % 12 = Standard Deviation of Yield for Base# % 13 = CV of Yield for Base# % 14 = Maximum Yield for Base# % 15 = Minimum Yield for Base# % 16 = Probability of Crop Failure for Base# % 17 = Min Proportion for Base to SOSD Comp# % 18 = Difference in mean Comp - Base# % 19 = Difference in standard deviation Comp - Base# % 20 = Difference in CV Comp - Base# % 21 = Difference in Prob of Crop Failure Comp - Base# % 22 = Min Proportion for Comp to SOSD Base divided by average base yield# % 23 = Min Proportion for Base to SOSD Comp divided by average base yield# % 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base# % 25 = Wheat Areacolumnnames=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD","CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min","BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase","Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","RiceArea")names(RA_rice_baseline_fixedlong)[1:25]=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD", "CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min", "BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase", "Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","RiceArea")# Based on the raster extraction in the initial stepsfiles <-c(file1)r <-load_sims(files,"rice_yield",aoi,NA)r1 <- r[[1]][[1]]# Simulated rice yield for the farmer practiceplot(r1)
Code
# Revenue ------------------------------------------### Fixed long versus farmer practice -------------------m <- RA_rice_baseline_fixedlongv <-as.vector(getValues(r1))t <-which(!is.na(v))r1_RA_rice_baseline_fixedlong_c<- r1values(r1_RA_rice_baseline_fixedlong_c)[t]<- m$Riskiness_Complibrary(rasterVis)levelplot(r1_RA_rice_baseline_fixedlong_c,par.settings=RdBuTheme(),margin=F, main="Rice yield optimal strategy for FP vs FL")
Code
RA_rice_baseline_fixedlong_plot=gplot(r1_RA_rice_baseline_fixedlong_c) +geom_raster(aes(fill =factor(value)))+scale_fill_manual(values =c("#F8766D","#619CFF", "#00BA38"),na.value ="transparent",labels=c("Clearly worse","Better or worse","Clearly better",""))+labs(x="Longitude",y="Latitude",title="Farmer practice",fill="Willingness to pay")previous_theme <-theme_set(theme_bw())RA_rice_baseline_fixedlong_plot
Variables visible from the current scope:
variables in scope: top scope
Attr Name Size Bytes Class
==== ==== ==== ===== =====
ndataseq01060216 111739x5 4469560 double
Total is 558695 elements using 4469560 bytes
Variables visible from the current scope:
variables in scope: top scope
Attr Name Size Bytes Class
==== ==== ==== ===== =====
RA 3386x25 677200 double
RA_yearseq_01 3386x25 677200 double
ndataseq01060216 111739x5 4469560 double
Total is 727995 elements using 5823960 bytes
DescriptiveStat_wheat_baseline_long =
-3.6501e-01
1.0115e+00
-2.6960e+00
-2.0722e+00
-1.2017e+00
4.4870e-02
3.1988e-01
5.8641e-01
1.5900e+00
-1.7805e+00
1.2718e+00
-3.8207e+00
-3.2327e+00
-2.7430e+00
-2.2142e+00
-1.0827e-02
-9.4492e-10
1.3358e+00
6.4640e-02
5.1446e-01
4.2090e-01
3.3880e+03
3.3860e+03
5.3.0.1 Mapping the results
Code
RA_wheat_baseline_fixedlong<-import("RA_wheat_baseline_fixedlong.txt", col_names =FALSE)# RA column names# % 1 = cellID# % 2 = Comparison Technology ID# % 3 = Base Technology ID# % 4 = Mean Yield for Comp# % 5 = Standard Deviation of Yield for Comp# % 6 = CV of Yield for Comp# % 7 = Maximum Yield for Comp# % 8 = Minimum Yield for Comp# % 9 = Probability of Crop Failure for Comp# % 10 = Min Proportion for Comp to SOSD Base# % 11 = Mean Yield for Base# % 12 = Standard Deviation of Yield for Base# % 13 = CV of Yield for Base# % 14 = Maximum Yield for Base# % 15 = Minimum Yield for Base# % 16 = Probability of Crop Failure for Base# % 17 = Min Proportion for Base to SOSD Comp# % 18 = Difference in mean Comp - Base# % 19 = Difference in standard deviation Comp - Base# % 20 = Difference in CV Comp - Base# % 21 = Difference in Prob of Crop Failure Comp - Base# % 22 = Min Proportion for Comp to SOSD Base divided by average base yield# % 23 = Min Proportion for Base to SOSD Comp divided by average base yield# % 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base# % 25 = Wheat Areacolumnnames=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD","CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min","BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase","Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","wheatArea")names(RA_wheat_baseline_fixedlong)[1:25]=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD", "CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min", "BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase", "Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","wheatArea")# Based on the raster extraction in the initial stepsfiles <-c(file1)r <-load_sims(files,"wheat_yield",aoi,NA)r1 <- r[[1]][[1]]# Simulated wheat yield for the farmer practiceplot(r1)
Code
# Revenue ------------------------------------------### Fixed long versus farmer practice -------------------m <- RA_wheat_baseline_fixedlongv <-as.vector(getValues(r1))t <-which(!is.na(v))r1_RA_wheat_baseline_fixedlong_c<- r1values(r1_RA_wheat_baseline_fixedlong_c)[t]<- m$Riskiness_Complibrary(rasterVis)levelplot(r1_RA_wheat_baseline_fixedlong_c,par.settings=RdBuTheme(),margin=F, main="wheat yield optimal strategy for FP vs FL")
Code
RA_wheat_baseline_fixedlong_plot=gplot(r1_RA_wheat_baseline_fixedlong_c) +geom_raster(aes(fill =factor(value)))+scale_fill_manual(values =c("#F8766D","#619CFF", "#00BA38"),na.value ="transparent",labels=c("Clearly worse","Better or worse","Clearly better",""))+labs(x="Longitude",y="Latitude",title="Farmer practice",fill="Willingness to pay")previous_theme <-theme_set(theme_bw())RA_wheat_baseline_fixedlong_plot
6 Conclusion
In conclusion, we have demonstrated a workflow for generating willingness to pay measures so that even a risk averse farmer can find the technology beneficial. This is important as it can allow targeting of scenarios and prioritizing investments.
7 References
Hurley, T., Koo, J., & Tesfaye, K. (2018). Weather risk: how does it change the yield benefits of 395 nitrogen fertilizer and improved maize varieties in sub-Saharan Africa? Agricultural 396 Economics, 49(6), 711-723. https://doi.org/https://doi.org/10.1111/agec.12454.
Levy, H. (2006). Stochastic Dominance: Investment Decision Making Under Uncertainty (Second Edition). Springer. New York, NY.Pages 180-182.
Mkondiwa, M., and Urfels, A. 2023. “Risk-based evaluations of competing agronomic climate adaptation strategies: The case of rice planting strategies in the Indo Gangetic Plains”. Working Paper.
Source Code
---title: "Risk based evaluations of competing agronomic climate adaptation strategies: Workflow example for rice planting strategies from crop growth simulation evidence"format: html: code-fold: true code-tools: truefig-dpi: 300fig-width: 8.88fig-align: centerfig-height: 5self-contained: trueauthor: Maxwell Mkondiwa (m.mkondiwa@cgiar.org), Anton Urfels and Terry Hurleyeditor: visualtoc: truetoc-location: leftnumber-sections: trueexecute: message: false warning: false echo: true---# IntroductionThis notebook presents a novel approach for developing and targeting robust recommendations from crop simulation, long term experiments or multi-season survey data based on a risk optimization model.For the computational model, Terry Hurley at University of Minnesota wrote the first Matlab code from which we have adapted to Octave and R.To reproduce the analyses, one requires Octave installation. You can download Octave from here: https://octave.org/download.**Purpose:** The commonly used risk measures focus on central moments (e.g., variance, conditional value at risk, skewness) of the distribution. Yield distributions overtime and space are however more complicated such that one may need to consider the whole distribution when evaluating which agronomic practice will likely work where and when. The use of stochastic dominance especially second order stochastic dominance allows the relationship between the cumulative distribution function of the outcome and the expected utility maximization behaviour under risk aversion. A computational approach developed by Hurley et al (2018) allows one to compute willingness to pay lower and upper bounds for a new technology to second order stochastically dominate an old practice such that any risk averse farmer will choose the new technology.***Advantages***- Unlike mean-variance optimization, this optimization strategy considers distributional comparisons***Disadvantages***- Computationally expensive especially when implementing across a large area of interest.- The comparisons are pairwise thereby requiring many combinations to come up with the best alternative for each pixel.- Difficult to apply with survey or agronomic datasets are it requires long timeseries. However, it is possible to implement the approach with monte-carlo simulated survey or agronomic trial datasets.***Stylized use case: Where to target sowing date advisories?***We use gridded crop growth simulation model results to identify scenarios that would be agronomically and economically beneficial even for a risk averse farmer.***Input data requirements***For the spatial exante (economic) component of the model, one only needs gridded crop simulation results for each of the scenarios.***Toolkit workflow***Figure 1 shows the workflow for implementation the computational second order stochastic dominance analysis.![](WTP_picture.png)# Stylized exampleIn order to run this step, one needs to install Octave. The folder should contain the different functions that used for the optimization including CreateWTPBoundsbyCell and CreateTableData.## Stochastic dominance analysis using graphics```{r}# Hypothetical Comparisons ---------------------------------------------------# Theoretical graphs ------------------------------------------------------------# Truncated normal -----------------------library(truncnorm)g <-rtruncnorm (n=1000,a=4,b=8, mean=6,sd=0.8 )q <-rtruncnorm (n=1000,a=4,b=8, mean=5,sd=1 )f <-rtruncnorm (n=1000,a=3,b=9, mean=5,sd=2 )df <-data.frame(x =c(g, q, f),Scenarios =factor(rep(c("G","Q","F"), c(1000,1000,1000))))df <- df[order(df$x), ]df$ecdf <-ave(df$x, df$Scenarios, FUN=function(x) seq_along(x)/length(x))library(ggplot2)stochasticdominance=ggplot(df, aes(x, ecdf, colour = Scenarios,linetype=Scenarios)) +geom_line(lwd=c(1.5))+scale_linetype_manual(values =c(1,2,3))+# theme(axis.ticks = element_blank(),axis.text.x = element_blank()) +xlab("Yield (tons/ha)") +ylab("Cumulative probability") +scale_color_manual(values =c("grey","black","black"))+theme(panel.grid.major.x =element_blank())+theme(panel.grid.minor.x =element_blank())previous_theme <-theme_set(theme_bw())stochasticdominance```## Clean data for octave```{r}library(rio)ID=rep(1,1000)g=as.data.frame(g*1000)q=as.data.frame(q*1000)f=as.data.frame(f*1000)Rep=1:nrow(g)Area=rep(1,1000)QvsG=data.frame(ID,Rep,Area,q,g)export(QvsG,colNames=F,"QvsG.xlsx")QvsG=as.matrix(QvsG)colnames(QvsG) <-c(1,2,3,4,5)export(QvsG,"QvsG.csv")library(rhdf5)h5save(QvsG,file="QvsG.hdf")QvsF=data.frame(ID,Rep,Area,q,f)export(QvsF,colNames=F,"QvsF.xlsx")QvsF=as.matrix(QvsF)colnames(QvsF) <-c(1,2,3,4,5)export(QvsF,"QvsF.csv")h5save(QvsF,file="QvsF.hdf")FvsG=data.frame(ID,Rep,Area,f,g)export(FvsG,colNames=F,"FvsG.xlsx")FvsG=as.matrix(FvsG)colnames(FvsG) <-c(1,2,3,4,5)export(FvsG,"FvsG.csv")h5save(FvsG,file="FvsG.hdf")# Experiments -------------# G vs Q: Q as baseline -----------------GvsQ=data.frame(ID,Rep,Area,g,q)export(GvsQ,colNames=F,"GvsQ.xlsx")GvsQ=as.matrix(GvsQ)colnames(GvsQ) <-c(1,2,3,4,5)export(GvsQ,"GvsQ.csv")h5save(GvsQ,file="GvsQ.hdf")## F vs Q: Q as baseline ---------------FvsQ=data.frame(ID,Rep,Area,f,q)export(FvsQ,colNames=F,"FvsQ.xlsx")FvsQ=as.matrix(FvsQ)colnames(FvsQ) <-c(1,2,3,4,5)export(FvsQ,"FvsQ.csv")h5save(FvsQ,file="FvsQ.hdf")## G vs F: F as baseline ----------------GvsF=data.frame(ID,Rep,Area,g,f)export(GvsF,colNames=F,"GvsF.xlsx")GvsF=as.matrix(GvsF)colnames(GvsF) <-c(1,2,3,4,5)export(GvsF,"GvsF.csv")h5save(GvsF,file="GvsF.hdf")```## Q vs GThe computational model consists of three main functions.#### SOSDIntegralTest function``` function [result] = SOSDIntegralTestv3(DOMINANT,DOMINATED)% Tests for second-order stochastic dominance assuming uniform distribution% over vector elements% Algorithm Adapted from Levy, H. (2006). Stochastic Dominance: Investment Decision% Making Under Uncertainty (Second Edition). Springer. New York, NY.% Pages 180-182.% result = 1 implies suspected DOMINANT distribution is in fact SOSD over DOMINATED% result = 0 implies suspected DOMINANT distribution is not SOSD over DOMINATEDlen1 = length(DOMINANT);len2 = length(DOMINATED);if len1 ~= len2 error('Lengths do not match!');endDOMINANTSORT = sortrows(DOMINANT,1);DOMINATEDSORT = sortrows(DOMINATED,1);work = zeros(len1,2);flag1 = 1; % stays 1 if cumulative of suspected dominated >= cumulative of suspected dominantflag2 = 0; % stays 0 unless cumulative of supected dominated > cumulative suspected dominant for some valuefor ind = 1:len1 % Check to see if suspected dominant distribution satisfies SOSD integral condition % relative to the suspected dominated distribution if ind == 1 work(ind,1) = DOMINANTSORT(ind,1); work(ind,2) = DOMINATEDSORT(ind,1); else work(ind,1) = DOMINANTSORT(ind,1) + work(ind - 1,1); work(ind,2) = DOMINATEDSORT(ind,1) + work(ind - 1,2); end if work(ind,1) < work(ind,2) flag1 = 0; % Not SOSD because suspected dominant distribution has higher cumulative value than suspected dominated end if work(ind,1) > work(ind,2) flag2 = 1; % To be SOSD the suspected dominated distribution must have higher cumulative % distribution than suspected dominant distribution for at least one value endendresult = flag1 * flag2;```#### ConstRiskThresh function``` function [Result] = ConstRiskThresh(Data, CompVarNum, BaseVarNum)% INPUT Matrices: ndataseq01060216 and ndataseq10060216 from crop% simulation models with Columns% 1 = cell30m/Unique Cell ID% 2 = Replication% 3 = Maize Area in Cell (ha)% 4 = Maize Yield (kg/ha) for CM1510 with 40 kg N / yldCM1040% 5 = Maize Yield (kg/ha) for CM1510 with 0 kg N / yldCM1000CompManNum = 1;if CompVarNum == 5 CompManNum = 0;endBaseManNum = 1;if BaseVarNum == 4 BaseManNum = 0;end% Outputs% 1 = cell30m% # = Comparison Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)% 2 = Comparison Nitrogen ID% # = Base Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)% 3 = Base Nitrogen Management ID% 4 = Mean Yield for Comp% 5 = Standard Deviation of Yield for Comp% 6 = CV of Yield for Comp% 7 = Maximum Yield for Comp% 8 = Minimum Yield for Comp% 9 = Probability of Crop Failure for Comp% 10 = Min Proportion for Comp to SOSD Base% 11 = Mean Yield for Base% 12 = Standard Deviation of Yield for Base% 13 = CV of Yield for Base% 14 = Maximum Yield for Base% 15 = Minimum Yield for Base% 16 = Probability of Crop Failure for Base% 17 = Min Proportion for Base to SOSD Comp% 18 = Difference in mean Comp - Base% 19 = Difference in standard deviation Comp - Base% 20 = Difference in CV Comp - Base% 21 = Difference in Prob of Crop Failure Comp - Base% 22 = Min Proportion for Comp to SOSD Base divided by average base yield% 23 = Min Proportion for Base to SOSD Comp divided by average base yield% 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base% 25 = Maize Area% Get list of cell30mCELLIDS = Data;CELLIDScondy = CELLIDS(:,2) ~= 1;CELLIDS(CELLIDScondy,:) = [];CELLIDS(:,2:3) = [];LEN = length(CELLIDS);Result = ones(LEN, 25);count = 1;%while count <= LEN id = CELLIDS(count,1); Yields = Data; cellcond = Yields(:,1) ~= id; Yields(cellcond,:) = []; TEMP = SOSDConstBoundsv3(Yields(:,CompVarNum),Yields(:,BaseVarNum)); Result(count, 1) = id; % Result(count, 2) = CompVar; Result(count, 2) = CompManNum; % Result(count, 4) = BaseVar; Result(count, 3) = BaseManNum; Result(count, 4:24) = TEMP.'; Result(count, 25) = Yields(1,3);% count = count + 1;%end```#### SOSDConstBounds``` function [Result] = SOSDConstBoundsv3(DIST1,DIST2)% Finds the min number where DIST1 (the comparison distribution) SOSD DIST2 (the base distribution) and % the min number where DIST2 SOSD DIST1 in addition to a variety of% descriptive statistics for DIST1 and DIST2if size(DIST1,2) > 1 error('DIST1 must be a column vector!')endif size(DIST2,2) > 1 error('DIST2 must be a column vector!')end% Output Result row definitionsMEANDIST1 = 1;SDDIST1 = 2;CVDIST1 = 3;MAXDIST1 = 4;MINDIST1 = 5;PRCROPFAILDIST1 = 6;DIST1MINPROP = 7;MEANDIST2 = 8;SDDIST2 = 9;CVDIST2 = 10;MAXDIST2 = 11;MINDIST2 = 12;PRCROPFAILDIST2 = 13;DIST2MINPROP = 14;MEANDIFF = 15;SDDIFF = 16;CVDIFF = 17;CFDIFF = 18;RELCOMPSOSDBASE = 19;RELBASESOSDCOMP = 20;DELTARISK = 21;Result = -999999 * ones(21,1); % Initialize Results% Calculate the mean, standard deviation, coeffcient of variation, minimum% and maximum for DIST1Result(MEANDIST1,1) = mean(DIST1);Result(SDDIST1,1) = std(DIST1);if Result(MEANDIST1,1) > 0 Result(CVDIST1,1) = Result(SDDIST1,1) / Result(MEANDIST1,1);endResult(MAXDIST1,1) = max(DIST1);Result(MINDIST1,1) = min(DIST1);ldist1 = length(DIST1);% Calculate the mean, standard deviation, coeffcient of variation, minimum% and maximum for DIST2Result(MEANDIST2,1) = mean(DIST2);Result(SDDIST2,1) = std(DIST2);if Result(MEANDIST2,1) > 0 Result(CVDIST2,1) = Result(SDDIST2,1) / Result(MEANDIST2,1);endResult(MAXDIST2,1) = max(DIST2);Result(MINDIST2,1) = min(DIST2);ldist2 = length(DIST2);% Calculate differences in descriptive statistics between DIST2 and DIST1if Result(MEANDIST1,1) == 0 && Result(MEANDIST2,1) == 0 Result(MEANDIFF,1) = -999999;else Result(MEANDIFF,1) = Result(MEANDIST1,1) - Result(MEANDIST2,1);endif Result(SDDIST1,1) == 0 && Result(SDDIST2,1) == 0 Result(SDDIFF,1) = -999999;else Result(SDDIFF,1) = Result(SDDIST1,1) - Result(SDDIST2,1);endif Result(CVDIST1,1) >= 0 && Result(CVDIST2,1) >= 0 Result(CVDIFF,1) = Result(CVDIST1,1) - Result(CVDIST2,1);else Result(CVDIFF,1) = -999999;end% Initialize Thresholds for Stopping Golden Section Searchthresh = 0.00001;bailthresh = 1000000;if Result(MEANDIST1,1) > 0 || Result(MEANDIST2,1) > 0 % Calculate Probability of crop failiure for DIST1 CropFailDIST1 = DIST1; CropFailDIST1cond = DIST1(:,1) ~= 0; CropFailDIST1(CropFailDIST1cond,:) = []; Result(PRCROPFAILDIST1,1) = length(CropFailDIST1) / ldist1; % Calculate Probability of crop failiure for DIST1 CropFailDIST2 = DIST2; CropFailDIST2cond = DIST2(:,1) ~= 0; CropFailDIST2(CropFailDIST2cond,:) = []; Result(PRCROPFAILDIST2,1) = length(CropFailDIST2) / ldist2; tcomp = Result(MAXDIST2,1) - Result(MINDIST1,1); % Maximum amount that DIST1 can be shifted to ensure it is SOSD tbase = Result(MINDIST2,1) - Result(MAXDIST1,1); % Maximum amount that DIST1 can be shifted back to ensure DIST2 is SOSD % Initialize Golden Section Search upper and lower starting points upper = 0; lower = 0; if tcomp > tbase upper = tcomp; lower = tbase; elseif tcomp < tbase upper = tbase; lower = tcomp; else upper = tbase + 10; lower = tbase - 10; end % Initialize Flags to Test for Convergence bail = 0; converge = 0; while converge ~= 1 && bail < bailthresh % Find the minimum proportion that makes DIST1 SOSD DIST2 middle = (lower + upper) / 2; if SOSDIntegralTestv3(DIST1 + middle, DIST2) == 1 upper = middle; else lower = middle; end if lower > upper error('lower > upper!') end if (upper - lower) <= thresh % Convergence acheived when upper and lower are within thresh tolerance converge = 1; end bail = bail + 1; end if converge == 1 Result(DIST1MINPROP,1) = upper; if Result(MEANDIST2,1) > 0 Result(RELCOMPSOSDBASE,1) = upper / Result(MEANDIST2,1); end else % Golden Section Search Failed to converge because Threshold not met Result(DIST1MINPROP,1) = -777777; Result(RELCOMPSOSDBASE,1) = -777777; end % Initialize Golden Section Search upper and lower starting points if tcomp > tbase upper = tcomp; lower = tbase; elseif tcomp < tbase upper = tbase; lower = tcomp; else upper = tbase + 10; lower = tbase - 10; end % Initialize Flags to Test for Convergence bail = 0; converge = 0; while converge ~= 1 && bail < bailthresh % Find the minimum proportion that makes DIST2 SOSD DIST1 middle = (lower + upper) / 2; if SOSDIntegralTestv3(DIST2, DIST1 + middle) == 1 lower = middle; else upper = middle; end if lower > upper error('lower > upper!') end if (upper - lower) <= thresh % Convergence acheived when upper and lower are within thresh tolerance converge = 1; end bail = bail + 1; end if converge == 1 Result(DIST2MINPROP,1) = lower; if Result(MEANDIST2,1) > 0 Result(RELBASESOSDCOMP,1) = lower / Result(MEANDIST2,1); end else Result(DIST2MINPROP,1) = -777777; Result(RELBASESOSDCOMP,1) = -777777; end if Result(PRCROPFAILDIST1,1) >= 0 && Result(PRCROPFAILDIST2,1) >= 0 Result(CFDIFF,1) = Result(PRCROPFAILDIST1,1) - Result(PRCROPFAILDIST2,1); else Result(CFDIFF,1) = -999999; end % Categorize Risk: DIST1 More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to DIST2 if Result(DIST1MINPROP,1) <= -777777 || Result(DIST2MINPROP,1) <= -777777 Result(DELTARISK,1) = -999999; elseif Result(DIST1MINPROP,1) > 0 && Result(DIST2MINPROP,1) > 0 Result(DELTARISK,1) = -1; elseif Result(DIST1MINPROP,1) < 0 && Result(DIST2MINPROP,1) < 0 Result(DELTARISK,1) = 1; else Result(DELTARISK,1) = 0; endend```#### CreateWTPBoundsbyCell``` % INPUT Matrices: ndataseq01060216 and ndataseq10060216 from crop% simulation models with Columns% 1 = Unique Cell ID% 2 = Replication% 3 = Area% 4 = Base% 5 = New%%% OUTPUT Matrix: RA with columns% 1 = cell30m% 2 = Comparison ID% 3 = Base ID% 4 = Mean Yield for Comp% 5 = Standard Deviation of Yield for Comp% 6 = CV of Yield for Comp% 7 = Maximum Yield for Comp% 8 = Minimum Yield for Comp% 9 = Probability of Crop Failure for Comp% 10 = Min Proportion for Comp to SOSD Base% 11 = Mean Yield for Base% 12 = Standard Deviation of Yield for Base% 13 = CV of Yield for Base% 14 = Maximum Yield for Base% 15 = Minimum Yield for Base% 16 = Probability of Crop Failure for Base% 17 = Min Proportion for Base to SOSD Comp% 18 = Difference in mean Comp - Base% 19 = Difference in standard deviation Comp - Base% 20 = Difference in CV Comp - Base% 21 = Difference in Prob of Crop Failure Comp - Base% 22 = Min Proportion for Comp to SOSD Base divided by average base yield% 23 = Min Proportion for Base to SOSD Comp divided by average base yield% 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base% 25 = Area% Compare Q to GRA_yearseq_01 = ConstRiskThresh(ndataseq01060216, 4, 5);LEN = length(RA_yearseq_01);apones = ones(LEN,1);clear apones LEN;RA = [RA_yearseq_01];RA=RA(1,:)```# Q vs G run```{octave, engine.path='C:/Program Files/GNU Octave/Octave-8.2.0/mingw64/bin/octave', warning=F}%load -hdf5 "QvsG.hdf"%ndataseq01060216=QvsGndataseq01060216=dlmread('QvsG.csv');whosCreateWTPBoundsbyCell%dlmwrite('RA.txt',RA,'-append','roffset',1,'delimiter',' ')dlmwrite('RA.csv',RA)%save -hdf5 "RA.hdf"```#### Calculate descriptive statisticsThis part can be done in R or octave. In octave, the function we use is CreateTableData``` % Using RA% 1 = yearseq% 2 = cell30m% 3 = Comparison Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)% 4 = Comparison Nitrogen ID% 5 = Base Cultivar ID (CM1510 = 10, CM1509 = 9, Improved = 0)% 6 = Base Nitrogen Management ID% 7 = Mean Yield for Comp% 8 = Standard Deviation of Yield for Comp% 9 = CV of Yield for Comp% 10 = Maximum Yield for Comp% 11 = Minimum Yield for Comp% 12 = Probability of Crop Failure for Comp% 13 = Min Proportion for Comp to SOSD Base% 14 = Mean Yield for Base% 15 = Standard Deviation of Yield for Base% 16 = CV of Yield for Base% 17 = Maximum Yield for Base% 18 = Minimum Yield for Base% 19 = Probability of Crop Failure for Base% 20 = Min Proportion for Base to SOSD Comp% 21 = Difference in mean Comp - Base% 22 = Difference in standard deviation Comp - Base% 23 = Difference in CV Comp - Base% 24 = Difference in Prob of Crop Failure Comp - Base% 25 = Min Proportion for Comp to SOSD Base divided by average base yield% 26 = Min Proportion for Base to SOSD Comp divided by average base yield% 27 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base% 28 = Harvest Acres% Create DescriptiveStat% Rows% 1 = Weighted Mean UB% 2 = Weighted S.D. UB% 3 = Minimum UB% 4 = 10th Percentile UB% 5 = 25th Percentile UB% 6 = Median UB% 7 = 75th Percentile UB% 8 = 90th Percentile UB% 9 = Maximum UB% 10 = Weighted Mean LB% 11 = Weighted S.D. LB% 12 = Minimum LB% 13 = 10th Percentile LB% 14 = 25th Percentile LB% 15 = Median LB% 16 = 75th Percentile LB% 17 = 90th Percentile LB% 18 = Maximum LB% 19 = Proportion of Acres in Green% 20 = Proportion of Acres in Yellow% 21 = Proportion of Acres in Red% 22 = Total Acres% 23 = Number of Cells% Columns% 1 = RC_01_00_00_09_00% 2 = RC_01_00_00_10_00% 3 = RC_01_00_40_00_00% 4 = RC_01_00_40_09_00% 5 = RC_01_00_40_10_00% 6 = RC_01_09_40_09_00% 7 = RC_01_10_40_10_00% 8 = RC_10_00_00_09_00% 9 = RC_10_00_00_10_00% 10 = RC_10_00_40_00_00% 11 = RC_10_00_40_09_00% 12 = RC_10_00_40_10_00% 13 = RC_10_09_40_09_00% 14 = RC_10_10_40_10_00% Define Scenarios of interestscenarios = [ 1, 0];ScenariosLEN = length(scenarios);DescriptiveStat = -999999 * ones(23, 1); for ind = 1 datatemp = RA; CW = scenarios(ind, 1); if (scenarios(ind, 1) == 1) cellcond = (datatemp(:,2) ~= CW | datatemp(:,25) <= 0); end if (scenarios(ind, 1) == 0 ) cellcond = (datatemp(:,2) ~= CW | datatemp(:,25) <= 0); end datatemp(cellcond,:) = []; if CW == 1 COL = 1; elseif CW == 0 COL = 2; else error('Something is wrong'); end SampleN = size(datatemp, 1); totalacres = sum(datatemp(:, 25)); wsumacresxLB = 0; wsumacresxUB = 0; wsumacresxLB2 = 0; wsumacresxUB2 = 0; percentilesLB = -99999999 * ones(2, SampleN); percentilesUB = -99999999 * ones(2, SampleN); propred = 0; propgreen = 0; for statind = 1:SampleN acres = datatemp(statind, 25); wtpLB = -1 * datatemp(statind, 17) / 1000; wtpUB = -1 * datatemp(statind, 10) / 1000; wsumacresxLB = wsumacresxLB + wtpLB * (acres/totalacres); wsumacresxUB = wsumacresxUB + wtpUB * (acres/totalacres); wsumacresxLB2 = wsumacresxLB2 + (wtpLB ^ 2) * (acres/totalacres); wsumacresxUB2 = wsumacresxUB2 + (wtpUB ^ 2) * (acres/totalacres); percentilesLB(statind, 1) = wtpLB; percentilesLB(statind, 2) = acres/totalacres; percentilesUB(statind, 1) = wtpUB; percentilesUB(statind, 2) = acres/totalacres; if datatemp(statind, 24) == -1 propred = propred + acres/totalacres; elseif datatemp(statind, 24) == 1 propgreen = propgreen + acres/totalacres; end clear acres wtpLB wtpUB; end percentilesLB = sortrows(percentilesLB); percentilesUB = sortrows(percentilesUB); cumLB = 0; cumUB = 0; for perind = 1:SampleN cumLBLast = cumLB; cumUBLast = cumUB; intervalLB = percentilesLB(perind, 2); intervalUB = percentilesUB(perind, 2); cumLB = cumLB + intervalLB; cumUB = cumUB + intervalUB; if cumLBLast <= 0.1 && cumLB >= 0.1 DescriptiveStat( 4, COL) = ((0.1 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.1) / intervalLB) * percentilesLB(perind, 1) ; end if cumUBLast <= 0.1 && cumUB >= 0.1 DescriptiveStat(13, COL) = ((0.1 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.1) / intervalUB) * percentilesUB(perind, 1) ; end if cumLBLast <= 0.25 && cumLB >= 0.25 DescriptiveStat( 5, COL) = ((0.25 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.25) / intervalLB) * percentilesLB(perind, 1) ; end if cumUBLast <= 0.25 && cumUB >= 0.25 DescriptiveStat(14, COL) = ((0.25 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.25) / intervalUB) * percentilesUB(perind, 1) ; end if cumLBLast <= 0.5 && cumLB >= 0.5 DescriptiveStat( 6, COL) = ((0.5 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.5) / intervalLB) * percentilesLB(perind, 1) ; end if cumUBLast <= 0.5 && cumUB >= 0.5 DescriptiveStat(15, COL) = ((0.5 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.5) / intervalUB) * percentilesUB(perind, 1) ; end if cumLBLast <= 0.75 && cumLB >= 0.75 DescriptiveStat( 7, COL) = ((0.75 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.75) / intervalLB) * percentilesLB(perind, 1) ; end if cumUBLast <= 0.75 && cumUB >= 0.75 DescriptiveStat(16, COL) = ((0.75 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.75) / intervalUB) * percentilesUB(perind, 1) ; end if cumLBLast <= 0.9 && cumLB >= 0.9 DescriptiveStat( 8, COL) = ((0.9 - cumLBLast) / intervalLB) * percentilesLB(perind - 1, 1) + ((cumLB - 0.9) / intervalLB) * percentilesLB(perind, 1) ; end if cumUBLast <= 0.9 && cumUB >= 0.9 DescriptiveStat(17, COL) = ((0.9 - cumUBLast) / intervalUB) * percentilesUB(perind - 1, 1) + ((cumUB - 0.9) / intervalUB) * percentilesUB(perind, 1) ; end clear cumLBLast cumUBLast intervalLB intervalUB; end DescriptiveStat( 1, COL) = wsumacresxLB; DescriptiveStat( 2, COL) = sqrt(wsumacresxLB2 - wsumacresxLB ^ 2); DescriptiveStat( 3, COL) = percentilesLB(1, 1); DescriptiveStat( 9, COL) = percentilesLB(SampleN, 1); DescriptiveStat(10, COL) = wsumacresxUB; DescriptiveStat(11, COL) = sqrt(wsumacresxUB2 - wsumacresxUB ^ 2); DescriptiveStat(12, COL) = percentilesUB(1, 1); DescriptiveStat(18, COL) = percentilesUB(SampleN, 1); DescriptiveStat(19, COL) = propgreen; DescriptiveStat(20, COL) = 1 - propred - propgreen; DescriptiveStat(21, COL) = propred; DescriptiveStat(22, COL) = totalacres; DescriptiveStat(23, COL) = SampleN; clear datatemp CW cellcond COL SampleN totalacres ... wsumacresxLB wsumacresxUB wsumacresxUB2 wsumacresxLB2 ... percentilesLB percentilesUB propgreen propred perind cumLB cumUB statind; endclear scenarios ScenariosLEN yearseq ind;``````{octave, engine.path='C:/Program Files/GNU Octave/Octave-8.2.0/mingw64/bin/octave', warning=F}%load -hdf5 "RA.hdf"RA=dlmread('RA.csv');RACreateTableDataDescriptiveStat2=DescriptiveStat```We can see here that Q does not stochastically dominate G. In fact, it shows that for our case G stochastically dominate Q# G vs Q \[base\]```{octave, engine.path='C:/Program Files/GNU Octave/Octave-8.2.0/mingw64/bin/octave', warning=F}%load -hdf5 "GvsQ.hdf"ndataseq01060216=dlmread('GvsQ.csv');%ndataseq01060216=GvsQwhosCreateWTPBoundsbyCell%save -hdf5 "RA.hdf"%load -hdf5 "RA.hdf"CreateTableDataDescriptiveStat_GvsQ=DescriptiveStat```We can see here now that G stochastically dominates Q.# Bihar Model ApplicationThe example above has shown for a case where there is one location in which there are multiple years of data. In the case of when one has run a crop growth model and has ncdf4 outputs, then one can use the same approach to estimate the model for the whole state.```{r}# Subset for each scenario and for Bihar and then send cell in row and var_year in columns# load some librarieslibrary(rasterVis)library(raster)library(rgdal)library(colorspace)library(RColorBrewer)#library(tidyverse)library(ggplot2)library(reshape2)library(ggtext)library(plyr)# define key files#load files to be used laterfile1 <-"data/simulations/s001.nc4"file2 <-"data/simulations/s002.nc4"file_aoi <-"data/aoi_mask/aoi_mask.tif"india_states <-"data/gadm40_IND_shp/gadm40_IND_1.shp"# loading key dataaoi <-raster(file_aoi)bihar <-readOGR(india_states)aoi <- bihar[bihar$NAME_1=="Bihar",]plot(aoi)# functions for loading data load_sims <-function(files,var,aoi,operator,var2) {# check if operators is used, is NA, just load one varif (is.na(operator)==TRUE) { l <-list()#cycle through file names to load all different scenariosfor (i in files) { j <-which(i==files)#load file and add to list l[[j]] <- raster::stack(i,varname= var)[[3:35]] }} else { # if operator is used, load to vars with operators l <-list()for (i in files) { j <-which(i==files)# important: need to pass operator with backticks, not as stringif (var=="rice_yield") { a <- raster::stack(i,varname= var)[[3:35]] *3600}else {a <- raster::stack(i,varname= var)[[3:35]] }if (var=="wheat_yield") { b <- raster::stack(i,varname= var2)[[3:35]] *3340}else {b <- raster::stack(i,varname= var2)[[3:35]] } l[[j]] <-operator(a,b) } } l <-lapply(l,function(x){raster::mask(x,aoi)}) l <-lapply(l,function(x){raster::crop(x,aoi)})return(l)}# start extraction for ricecreate_csv <-function(crop,var) {files <-c(file1,file2)raster_df <-load_sims(files,var,aoi,NA)list_df <-lapply(raster_df, getValues)list_df <-lapply(list_df, na.omit)lapply(list_df, nrow)list_df <-lapply(list_df, "colnames<-", paste0(var,"_",1983:2015))#Scenarios# Fixed/onset is the rice planting strategy# Medium/long is the variety# suppl is where we ran supplementary irrigation instead of full irrigation. Use with caution.scen <-c("baseline","fixed_long")for (i in1:length(scen)) {write.csv(list_df[[i]],paste0("",crop,"_",scen[i],".csv"))}}create_csv("rice","rice_yield")create_csv("wheat","wheat_yield")```## Clean for Octave```{r}# packages ---------------------------------------------------------------------library(dplyr)library(data.table)library(tidyr)library(rio)# RICE--------------------------------------------------------------------------## RICE-- Fixed long ---------------------------------------rice_fixed_long=read.csv("rice_fixed_long.csv")#rice_fixed_long <- na_if(rice_fixed_long, '-99') #Replace all -99 with NA#rice_fixed_long <- na_if(rice_fixed_long, '-999999')rice_fixed_long=rename(rice_fixed_long,ID=X)rice_fixed_long$rice_area=rep(1,3386)rice_fixed_long_long <-melt(setDT(rice_fixed_long), id.vars =c("ID","rice_area"),value.name=c("fixed_long_rice_yield") ,variable.name ="year")rice_fixed_long_long$year=as.numeric(gsub('[^0-9]', '',rice_fixed_long_long$year))## RICE-- Baseline ---------------------------rice_baseline=read.csv("rice_baseline.csv")#rice_baseline <- na_if(rice_baseline, '-99') #Replace all -99 with NA#rice_baseline <- na_if(rice_baseline, '-999999')rice_baseline=rename(rice_baseline,ID=X)rice_baseline_long <-melt(setDT(rice_baseline), id.vars =c("ID"),value.name=c("baseline_rice_yield") ,variable.name ="year")rice_baseline_long$year=as.numeric(gsub('[^0-9]', '',rice_baseline_long$year))rice_baseline_fixedlong=merge(rice_fixed_long_long,rice_baseline_long,by=c("ID","year"))rice_baseline_fixedlong$year=rice_baseline_fixedlong$year-1982rice_baseline_fixedlong=as.data.frame(rice_baseline_fixedlong)rice_baseline_fixedlong$baseline_rice_yield[is.na(rice_baseline_fixedlong$baseline_rice_yield)]=0rice_baseline_fixedlong$fixed_long_rice_yield[is.na(rice_baseline_fixedlong$fixed_long_rice_yield)]=0rice_baseline_fixedlong=rice_baseline_fixedlong[,c(1,2,3,5,4)]export(rice_baseline_fixedlong,colNames=F,"rice_baseline_fixedlong_s.xlsx")library(rhdf5)rice_baseline_fixedlong=as.matrix(rice_baseline_fixedlong)colnames(rice_baseline_fixedlong) <-c(1,2,3,4,5)export(rice_baseline_fixedlong,"rice_baseline_fixedlong.csv")h5save(rice_baseline_fixedlong,file="rice_baseline_fixedlong.hdf",createnewfile =TRUE)rice_baseline_fixedlong=as.data.frame(rice_baseline_fixedlong)# Remove all zeros and nasrice_baseline_fixedlong_nonzero=subset(rice_baseline_fixedlong, rice_baseline_fixedlong$fixed_long_rice_yield!=0)rice_baseline_fixedlong_nonzero=subset(rice_baseline_fixedlong_nonzero, rice_baseline_fixedlong_nonzero$baseline_rice_yield!=0)export(rice_baseline_fixedlong_nonzero,colNames=F,"rice_baseline_fixedlong_nonzero.xlsx")library(rhdf5)rice_baseline_fixedlong_nonzero=as.matrix(rice_baseline_fixedlong_nonzero)colnames(rice_baseline_fixedlong_nonzero) <-c(1,2,3,4,5)h5save(rice_baseline_fixedlong_nonzero,file="rice_baseline_fixedlong_nonzero.hdf",createnewfile =TRUE)# WHEAT ---------------------------------------------------------------------------------------------------## wheat-- fixedlong ---------------------------------------wheat_fixedlong=read.csv("wheat_fixed_long.csv")#wheat_fixedlong <- na_if(wheat_fixedlong, '-99') #Replace all -99 with NA#wheat_fixedlong <- na_if(wheat_fixedlong, '-999999') #Replace all -99 with NAwheat_fixedlong=rename(wheat_fixedlong,ID=X)wheat_fixedlong$wheat_area=rep(1,3386)wheat_fixedlong_long <-melt(setDT(wheat_fixedlong), id.vars =c("ID","wheat_area"),value.name=c("fixedlong_wheat_yield") ,variable.name ="year")wheat_fixedlong_long$year=as.numeric(gsub('[^0-9]', '',wheat_fixedlong_long$year))## wheat-- Baseline ---------------------------wheat_baseline=read.csv("wheat_baseline.csv")#wheat_baseline <- na_if(wheat_baseline, '-99') #Replace all -99 with NA#wheat_baseline <- na_if(wheat_baseline, '-999999') #Replace all -99 with NAwheat_baseline=rename(wheat_baseline,ID=X)wheat_baseline_long <-melt(setDT(wheat_baseline), id.vars =c("ID"),value.name=c("baseline_wheat_yield") ,variable.name ="year")wheat_baseline_long$year=as.numeric(gsub('[^0-9]', '',wheat_baseline_long$year))wheat_baseline_long=merge(wheat_fixedlong_long,wheat_baseline_long,by=c("ID","year"))wheat_baseline_long$year=wheat_baseline_long$year-1982wheat_baseline_long=as.data.frame(wheat_baseline_long)wheat_baseline_long$baseline_wheat_yield[is.na(wheat_baseline_long$baseline_wheat_yield)]=0wheat_baseline_long$baseline_wheat_yield[is.na(wheat_baseline_long$baseline_wheat_yield)]=0wheat_baseline_long=wheat_baseline_long[,c(1,2,3,5,4)]export(wheat_baseline_long,colNames=F,"wheat_baseline_long_s.xlsx")library(rhdf5)wheat_baseline_long=as.matrix(wheat_baseline_long)colnames(wheat_baseline_long) <-c(1,2,3,4,5)export(wheat_baseline_long,"wheat_baseline_fixedlong.csv")h5save(wheat_baseline_long,file="wheat_baseline_long.hdf",createnewfile =TRUE)# Remove all zeros and naswheat_baseline_long=as.data.frame(wheat_baseline_long)wheat_baseline_long_nonzero=subset(wheat_baseline_long, wheat_baseline_long$baseline_wheat_yield!=0)wheat_baseline_long_nonzero=subset(wheat_baseline_long_nonzero, wheat_baseline_long_nonzero$baseline_wheat_yield!=0)export(wheat_baseline_long_nonzero,colNames=F,"wheat_baseline_long_nonzero.xlsx")#library(rhdf5)#wheat_baseline_long_nonzero=as.matrix(wheat_baseline_long_nonzero)#colnames(wheat_baseline_long_nonzero) <- c(1,2,3,4,5)#h5save(wheat_baseline_long_nonzero,file="wheat_baseline_long_nonzero.hdf",createnewfile = TRUE)#h5closeAll()```## Octave optimization### Rice```{octave, engine.path='C:/Program Files/GNU Octave/Octave-8.2.0/mingw64/bin/octave', warning=F}%load -hdf5 "rice_baseline_fixedlong.hdf"%ndataseq01060216=rice_baseline_fixedlongndataseq01060216=dlmread('rice_baseline_fixedlong.csv');whosCreateWTPBoundsbyCell_Bihar%save -hdf5 "RA.hdf"%load -hdf5 "RA.hdf"dlmwrite('RA_rice_baseline_fixedlong.txt',RA,'-append','roffset',1,'delimiter',' ')CreateTableData_BiharDescriptiveStat_rice_baseline_fixedlong=DescriptiveStat```For about 79% of the cells, farmer practice is sub-optimal.#### Mapping the results```{r}RA_rice_baseline_fixedlong<-import("RA_rice_baseline_fixedlong.txt", col_names =FALSE)# RA column names# % 1 = cellID# % 2 = Comparison Technology ID# % 3 = Base Technology ID# % 4 = Mean Yield for Comp# % 5 = Standard Deviation of Yield for Comp# % 6 = CV of Yield for Comp# % 7 = Maximum Yield for Comp# % 8 = Minimum Yield for Comp# % 9 = Probability of Crop Failure for Comp# % 10 = Min Proportion for Comp to SOSD Base# % 11 = Mean Yield for Base# % 12 = Standard Deviation of Yield for Base# % 13 = CV of Yield for Base# % 14 = Maximum Yield for Base# % 15 = Minimum Yield for Base# % 16 = Probability of Crop Failure for Base# % 17 = Min Proportion for Base to SOSD Comp# % 18 = Difference in mean Comp - Base# % 19 = Difference in standard deviation Comp - Base# % 20 = Difference in CV Comp - Base# % 21 = Difference in Prob of Crop Failure Comp - Base# % 22 = Min Proportion for Comp to SOSD Base divided by average base yield# % 23 = Min Proportion for Base to SOSD Comp divided by average base yield# % 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base# % 25 = Wheat Areacolumnnames=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD","CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min","BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase","Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","RiceArea")names(RA_rice_baseline_fixedlong)[1:25]=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD", "CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min", "BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase", "Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","RiceArea")# Based on the raster extraction in the initial stepsfiles <-c(file1)r <-load_sims(files,"rice_yield",aoi,NA)r1 <- r[[1]][[1]]# Simulated rice yield for the farmer practiceplot(r1)# Revenue ------------------------------------------### Fixed long versus farmer practice -------------------m <- RA_rice_baseline_fixedlongv <-as.vector(getValues(r1))t <-which(!is.na(v))r1_RA_rice_baseline_fixedlong_c<- r1values(r1_RA_rice_baseline_fixedlong_c)[t]<- m$Riskiness_Complibrary(rasterVis)levelplot(r1_RA_rice_baseline_fixedlong_c,par.settings=RdBuTheme(),margin=F, main="Rice yield optimal strategy for FP vs FL")RA_rice_baseline_fixedlong_plot=gplot(r1_RA_rice_baseline_fixedlong_c) +geom_raster(aes(fill =factor(value)))+scale_fill_manual(values =c("#F8766D","#619CFF", "#00BA38"),na.value ="transparent",labels=c("Clearly worse","Better or worse","Clearly better",""))+labs(x="Longitude",y="Latitude",title="Farmer practice",fill="Willingness to pay")previous_theme <-theme_set(theme_bw())RA_rice_baseline_fixedlong_plot```## Wheat```{octave, engine.path='C:/Program Files/GNU Octave/Octave-8.2.0/mingw64/bin/octave', warning=F}%load -hdf5 "wheat_baseline_long.hdf"%ndataseq01060216=wheat_baseline_longndataseq01060216=dlmread('wheat_baseline_fixedlong.csv');whosCreateWTPBoundsbyCell_Bihar%save -hdf5 "RA.hdf"%load -hdf5 "RA.hdf"dlmwrite('RA_wheat_baseline_fixedlong.txt',RA,'-append','roffset',1,'delimiter',' ')whos%save -hdf5 "RA.hdf"%load -hdf5 "RA.hdf"CreateTableData_BiharDescriptiveStat_wheat_baseline_long=DescriptiveStat```#### Mapping the results```{r}RA_wheat_baseline_fixedlong<-import("RA_wheat_baseline_fixedlong.txt", col_names =FALSE)# RA column names# % 1 = cellID# % 2 = Comparison Technology ID# % 3 = Base Technology ID# % 4 = Mean Yield for Comp# % 5 = Standard Deviation of Yield for Comp# % 6 = CV of Yield for Comp# % 7 = Maximum Yield for Comp# % 8 = Minimum Yield for Comp# % 9 = Probability of Crop Failure for Comp# % 10 = Min Proportion for Comp to SOSD Base# % 11 = Mean Yield for Base# % 12 = Standard Deviation of Yield for Base# % 13 = CV of Yield for Base# % 14 = Maximum Yield for Base# % 15 = Minimum Yield for Base# % 16 = Probability of Crop Failure for Base# % 17 = Min Proportion for Base to SOSD Comp# % 18 = Difference in mean Comp - Base# % 19 = Difference in standard deviation Comp - Base# % 20 = Difference in CV Comp - Base# % 21 = Difference in Prob of Crop Failure Comp - Base# % 22 = Min Proportion for Comp to SOSD Base divided by average base yield# % 23 = Min Proportion for Base to SOSD Comp divided by average base yield# % 24 = Comp More Risky (-1)/less Risky (1)/ Indeterminant (0) compared to Base# % 25 = Wheat Areacolumnnames=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD","CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min","BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase","Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","wheatArea")names(RA_wheat_baseline_fixedlong)[1:25]=c("CellID","CompID","BaseID","CompYield_Mean","CompYield_SD", "CompYield_CV","CompYield_Max","CompYield_Min","CompCropFailureProb","CompMinPropSOSDBase","BaseYield_Mean","BaseYield_SD","BaseYield_CV","BaseYield_Max","BaseYield_Min", "BaseCropFailureProb","BaseMinPropSOSDComp","Diff_Mean_CompBase","Diff_SD_CompBase","Diff_CV_CompBase", "Diff_CropFailureProbCompBase","MinProp_CompSOSDBase_Divdbaseyield","MinProp_BaseSOSDComp_Divdbaseyield","Riskiness_Comp","wheatArea")# Based on the raster extraction in the initial stepsfiles <-c(file1)r <-load_sims(files,"wheat_yield",aoi,NA)r1 <- r[[1]][[1]]# Simulated wheat yield for the farmer practiceplot(r1)# Revenue ------------------------------------------### Fixed long versus farmer practice -------------------m <- RA_wheat_baseline_fixedlongv <-as.vector(getValues(r1))t <-which(!is.na(v))r1_RA_wheat_baseline_fixedlong_c<- r1values(r1_RA_wheat_baseline_fixedlong_c)[t]<- m$Riskiness_Complibrary(rasterVis)levelplot(r1_RA_wheat_baseline_fixedlong_c,par.settings=RdBuTheme(),margin=F, main="wheat yield optimal strategy for FP vs FL")RA_wheat_baseline_fixedlong_plot=gplot(r1_RA_wheat_baseline_fixedlong_c) +geom_raster(aes(fill =factor(value)))+scale_fill_manual(values =c("#F8766D","#619CFF", "#00BA38"),na.value ="transparent",labels=c("Clearly worse","Better or worse","Clearly better",""))+labs(x="Longitude",y="Latitude",title="Farmer practice",fill="Willingness to pay")previous_theme <-theme_set(theme_bw())RA_wheat_baseline_fixedlong_plot```# ConclusionIn conclusion, we have demonstrated a workflow for generating willingness to pay measures so that even a risk averse farmer can find the technology beneficial. This is important as it can allow targeting of scenarios and prioritizing investments.# ReferencesHurley, T., Koo, J., & Tesfaye, K. (2018). Weather risk: how does it change the yield benefits of 395 nitrogen fertilizer and improved maize varieties in sub-Saharan Africa? Agricultural 396 Economics, 49(6), 711-723. https://doi.org/https://doi.org/10.1111/agec.12454.Levy, H. (2006). Stochastic Dominance: Investment Decision Making Under Uncertainty (Second Edition). Springer. New York, NY.Pages 180-182.Mkondiwa, M., and Urfels, A. 2023. "Risk-based evaluations of competing agronomic climate adaptation strategies: The case of rice planting strategies in the Indo Gangetic Plains". Working Paper.