ข้อผิดพลาดของค่าเฉลี่ยสัมบูรณ์ร้อยละของข้อผิดพลาด (MAPE) คืออะไร

เปอร์เซ็นต์ความผิดพลาด Mean แอบโซลูท ( MAPE ) เป็นความถูกต้องทั่วไปหรือมาตรการข้อผิดพลาดสำหรับชุดเวลาหรือการคาดการณ์อื่น ๆ

MAPE = \frac{100}{n} \sum_{t = 1}^{n} \frac{| A_{t} - F_{t} |}{A_{t}} %,

$\text{MAPE} = \frac{100}{n}\sum_{t=1}^n\frac{|A_t-F_t|}{A_t}\%,$

โดยที่คือจำนวนจริงและคาดการณ์หรือการทำนายที่สอดคล้องกัน $A_t$ $F_t$

MAPE คือเปอร์เซ็นต์ดังนั้นเราสามารถเปรียบเทียบระหว่างซีรีย์ได้อย่างง่ายดายและผู้คนสามารถเข้าใจและตีความเปอร์เซ็นต์ได้อย่างง่ายดาย

อย่างไรก็ตามฉันได้ยินมาว่า MAPE มีข้อเสีย ฉันต้องการทำความเข้าใจข้อเสียเหล่านี้ให้ดีขึ้นเพื่อให้ฉันสามารถตัดสินใจได้อย่างชาญฉลาดว่าจะใช้ MAPE หรือทางเลือกอื่นเช่น MSE ( mse ), Mae ( mae ) หรือ MASE ( mase )

accuracy mape

— S. Kolassa - Reinstate Monica
แหล่งที่มา

ข้อบกพร่องของ MAPE

MAPE ในฐานะเปอร์เซ็นต์นั้นสมเหตุสมผลสำหรับค่าที่หน่วยงานและอัตราส่วนเข้ากันได้เท่านั้น มันไม่สมเหตุสมผลเลยที่จะคำนวณเปอร์เซ็นต์ของอุณหภูมิดังนั้นคุณไม่ควรใช้ MAPE เพื่อคำนวณความแม่นยำของการพยากรณ์อุณหภูมิ
ถ้าค่าจริงเพียงค่าเดียวคือศูนย์ $A_t=0$ คุณก็หารด้วยศูนย์ในการคำนวณ MAPE ซึ่งไม่ได้กำหนด

ปรากฎว่าซอฟต์แวร์การพยากรณ์บางอย่างยังคงรายงาน MAPE สำหรับชุดดังกล่าวเพียงแค่วางช่วงเวลาโดยไม่มีศูนย์จริง ( Hoover, 2006 ) ไม่จำเป็นต้องบอกว่านี่ไม่ใช่ความคิดที่ดีเพราะมันบอกเป็นนัยว่าเราไม่สนใจเลยเกี่ยวกับสิ่งที่เราคาดการณ์หากที่จริงเป็นศูนย์ - แต่การคาดการณ์ของ $F_t=100$ และหนึ่งใน $F_t=1000$ อาจมี ผลกระทบที่แตกต่างกัน ดังนั้นตรวจสอบว่าซอฟต์แวร์ของคุณทำอะไร

หากมีค่าศูนย์ไม่กี่ค่าคุณสามารถใช้ MAPE แบบถ่วงน้ำหนัก ( Kolassa & Schütz, 2007 ) ซึ่งยังมีปัญหาของตัวเองอยู่ สิ่งนี้ยังใช้กับ MAPE แบบสมมาตร ( Goodwin & Lawton, 1999 )
MAPE ที่มากกว่า 100% สามารถเกิดขึ้นได้ หากคุณต้องการทำงานด้วยความแม่นยำซึ่งบางคนกำหนดเป็น 100% -MAPE นี่อาจนำไปสู่ความแม่นยำในเชิงลบซึ่งผู้คนอาจเข้าใจยาก ( ไม่การตัดความแม่นยำที่ศูนย์ไม่ใช่ความคิดที่ดี )
หากเรามีข้อมูลในเชิงบวกอย่างเคร่งครัดที่เราต้องการคาดการณ์ (และต่อไปข้างต้น MAPE ไม่สมเหตุสมผล) จากนั้นเราจะไม่คาดการณ์ต่ำกว่าศูนย์ MAPE โชคไม่ดีที่ถือว่า overforecast แตกต่างจาก underforecasts: underforecast จะไม่สนับสนุนมากกว่า 100% (เช่นถ้า $F_t=0$ และ $A_t=1$ ) แต่การมีส่วนร่วมของ overforecast นั้นไม่ จำกัด (เช่นถ้า $F_t=5$ และ ) ซึ่งหมายความว่า MAPE อาจต่ำกว่าสำหรับการลำเอียงมากกว่าการคาดการณ์ที่เป็นกลาง การลดขนาดมันอาจนำไปสู่การคาดการณ์ที่มีอคติต่ำ $A_t=1$

โดยเฉพาะอย่างยิ่งกระสุนนัดสุดท้ายทำบุญคิดสักหน่อย สำหรับสิ่งนี้เราต้องย้อนกลับไป

$F_t$ $t$ $t=1, \dots, n$

The problem here is that people rarely explicitly say what a good one-number-summary of a future distribution is.

When you talk to forecast consumers, they will usually want $F_t$ to be correct "on average". That is, they want $F_t$ to be the expectation or the mean of the future distribution, rather than, say, its median.

Here's the problem: minimizing the MAPE will typically not incentivize us to output this expectation, but a quite different one-number-summary (McKenzie, 2011, Kolassa, 2020). This happens for two different reasons.

Asymmetric future distributions. Suppose our true future distribution follows a stationary $(\mu=1,\sigma^2=1)$ lognormal distribution. The following picture shows a simulated time series, as well as the corresponding density.

The horizontal lines give the optimal point forecasts, where "optimality" is defined as minimizing the expected error for various error measures.
- The dashed line at $F_t=\exp(\mu+\frac{\sigma^2}{2})\approx 4.5$ minimizes the expected MSE. It is the expectation of the time series.
- The dotted line at $F_t=\exp\mu\approx 2.7$ minimizes the expected MAE. It is the median of the time series.
- The dash-dotted line at $F_t=\exp(\mu-\sigma^2)=1.0$ minimizes the expected MAPE. It is the (-1)-median of the time series (Gneiting, 2011, p. 752 with $\beta=-1$ ), which in the specific case of a lognormal distribution happens to coincide with the mode of the distribution.
We see that the asymmetry of the future distribution, together with the fact that the MAPE differentially penalizes over- and underforecasts, implies that minimizing the MAPE will lead to heavily biased forecasts. (Here is the calculation of optimal point forecasts in the gamma case.)
Symmetric distribution with a high coefficient of variation. Suppose that $A_t$ comes from rolling a standard six-sided die at each time point $t$ . The picture below again shows a simulated sample path:

In this case:
- The dashed line at $F_t=3.5$ minimizes the expected MSE. It is the expectation of the time series.
- Any forecast $3\leq F_t\leq 4$ (not shown in the graph) will minimize the expected MAE. All values in this interval are medians of the time series.
- The dash-dotted line at $F_t=2$ minimizes the expected MAPE.
We again see how minimizing the MAPE can lead to a biased forecast, because of the differential penalty it applies to over- and underforecasts. In this case, the problem does not come from an asymmetric distribution, but from the high coefficient of variation of our data-generating process.

This is actually a simple illustration you can use to teach people about the shortcomings of the MAPE - just hand your attendees a few dice and have them roll. See Kolassa & Martin (2011) for more information.

R code

Lognormal example:

mm <- 1
ss.sq <- 1
SAPMediumGray <- "#999999"; SAPGold <- "#F0AB00"

set.seed(2013)
actuals <- rlnorm(100,meanlog=mm,sdlog=sqrt(ss.sq))

opar <- par(mar=c(3,2,0,0)+.1)
    plot(actuals,type="o",pch=21,cex=0.8,bg="black",xlab="",ylab="",xlim=c(0,150))
    abline(v=101,col=SAPMediumGray)

    xx <- seq(0,max(actuals),by=.1)
    polygon(c(101+150*dlnorm(xx,meanlog=mm,sdlog=sqrt(ss.sq)),
      rep(101,length(xx))),c(xx,rev(xx)),col="lightgray",border=NA)

    (min.Ese <- exp(mm+ss.sq/2))
    lines(c(101,150),rep(min.Ese,2),col=SAPGold,lwd=3,lty=2)

    (min.Eae <- exp(mm))
    lines(c(101,150),rep(min.Eae,2),col=SAPGold,lwd=3,lty=3)

    (min.Eape <- exp(mm-ss.sq))
    lines(c(101,150),rep(min.Eape,2),col=SAPGold,lwd=3,lty=4)
par(opar)

Dice rolling example:

SAPMediumGray <- "#999999"; SAPGold <- "#F0AB00"

set.seed(2013)
actuals <- sample(x=1:6,size=100,replace=TRUE)

opar <- par(mar=c(3,2,0,0)+.1)
    plot(actuals,type="o",pch=21,cex=0.8,bg="black",xlab="",ylab="",xlim=c(0,150))
    abline(v=101,col=SAPMediumGray)

    min.Ese <- 3.5
    lines(c(101,150),rep(min.Ese,2),col=SAPGold,lwd=3,lty=2)

    min.Eape <- 2
    lines(c(101,150),rep(min.Eape,2),col=SAPGold,lwd=3,lty=4)
par(opar)

References

Gneiting, T. Making and Evaluating Point Forecasts. Journal of the American Statistical Association, 2011, 106, 746-762

Goodwin, P. & Lawton, R. On the asymmetry of the symmetric MAPE. International Journal of Forecasting, 1999, 15, 405-408

Hoover, J. Measuring Forecast Accuracy: Omissions in Today's Forecasting Engines and Demand-Planning Software. Foresight: The International Journal of Applied Forecasting, 2006, 4, 32-35

Kolassa, S. Why the "best" point forecast depends on the error or accuracy measure (Invited commentary on the M4 forecasting competition). International Journal of Forecasting, 2020, 36(1), 208-211

Kolassa, S. & Martin, R. Percentage Errors Can Ruin Your Day (and Rolling the Dice Shows How). Foresight: The International Journal of Applied Forecasting, 2011, 23, 21-29

Kolassa, S. & Schütz, W. Advantages of the MAD/Mean ratio over the MAPE. Foresight: The International Journal of Applied Forecasting, 2007, 6, 40-43

McKenzie, J. Mean absolute percentage error and bias in economic forecasting. Economics Letters, 2011, 113, 259-262

— S. Kolassa - Reinstate Monica
แหล่งที่มา

Excellent Q&A. I would add that all of these metrics have two big underlying assumptions -- the series is iid and stationary. If one or both of these assumptions are not met, which happens frequently in practice, then their validity is questionable.

— Mike Hunter

I agree with most of this, however, wouldn't it be legitimate to deal with ratios of temperatures so long as they are on their proper scale (i.e., the Kelvin scale)?

— Reinstate Monica

@Ben: in that case, we won't divide by zero. However, the asymmetry is still a slight problem. If your forecast is 293K and the actual is 288K, you have an APE of 1.74%, and if the forecast is 288K while the actual is 293K, the APE is 1.71%, so the second forecast looks better, though both are off by 5K. (Translate into C or F as needed.) Essentially, the same absolute errors are penalized more strongly for lower actuals. Plus, interpretation of percentage errors for temperatures is not easy.

— S. Kolassa - Reinstate Monica

@Ben Percentages of absolute temperature are legitimate, but differences of temperature are easier to understand - at least, when we deal with temperatures in the everyday range; when forecasting star core temperature it may be the other way.

— Pere

ข้อผิดพลาดของค่าเฉลี่ยสัมบูรณ์ร้อยละของข้อผิดพลาด (MAPE) คืออะไร

ข้อบกพร่องของ MAPE

Related CrossValidated questions

R code

References