Significance Tests for Event Studies

Event studies are concerned with the question of whether abnormal returns on an event date or, more generally, during a window around an event date (called the event window) are unusually large (in magnitude). To answer this question one carries out a formal hypothesis test where the null hypothesis specifies that the expected value of a certain random variable is zero; if the null hypothesis is rejected, one concludes that the event had an ‘impact’. It is customary in the literature to use two-sided tests, which specify as alternative hypothesis that the expected value is different from zero (as opposed to larger, or smaller, than zero). We follow this convention.

If there is only one instance under study, the random variable is the abnormal return on the event day itself (AR) or, more generally, the cumulative abnormal return during the event window (CAR). If there are multiple instances under study, the respective quantities are averaged across instances. Thus, the random variable is the average abnormal return on the respective event day (AAR) or the average cumulative abnormal return during the respective event window, which can alternatively be expressed as the cumulative average abnormal return (CAAR).

In terms of terminology, by an instance we mean a given event for a given firm. In the case of multiple instances, there are two possibilities: (i) a given event (type), such as inclusion to an index or a merger, for multiple firms or (ii) multiple repetitions of a given event (type) for a given firm. An example of the first possibility would be studying the effect of being included in the S&P500 index for multiple firms; an example of the second possibility would be studying the effect of mergers for a given firm. In terms of the statistical methodology, both possibilities are handled in the same way.

For the computation of the abnormal return of firm $i$ on day $t$, denoted by $AR_{i,t}$, we refer the user to the introduction. If more than one instance is considered, let $N$ denote the number of instances and define

$$AAR_t = \frac{1}{N}\sum_{i=1}^{N} AR_{i,t}$$

$$CAR_i = \sum_{t=T_1+1}^{T_2} AR_{i,t}$$

$$CAAR = \frac{1}{N}\sum_{i=1}^{N} CAR_i$$
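As an illustration, these quantities can be computed directly from a matrix of abnormal returns. The following is a minimal sketch in Python (assuming NumPy is available; the array `ar` and all numbers are purely illustrative):

```python
import numpy as np

# Illustrative abnormal returns for N = 3 instances over an event
# window of L2 = 2 days, arranged as an (N x L2) array.
ar = np.array([[0.01, 0.02],
               [0.00, 0.01],
               [0.02, -0.01]])

aar = ar.mean(axis=0)   # AAR_t: average across instances, day by day
car = ar.sum(axis=1)    # CAR_i: sum over the event window, per instance
caar = car.mean()       # CAAR: average of the CAR_i

# CAAR equals the sum of the AAR_t over the event window
assert np.isclose(caar, aar.sum())
```

The final assertion illustrates why the average cumulative abnormal return can alternatively be expressed as the cumulative average abnormal return: averaging the $CAR_i$ and cumulating the $AAR_t$ yield the same number.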

The literature on event-study hypothesis testing covers a wide range of tests. Generally, significance tests can be classified into parametric and nonparametric tests. Parametric tests (at least in the field of event studies) assume that the individual firm's abnormal returns are normally distributed, whereas nonparametric tests do not rely on any such assumption. Applied researchers typically carry out both parametric and nonparametric tests to verify that the research findings are not driven by non-normal returns or outliers, which tend to affect the results of parametric tests but not the results of nonparametric tests; for example, see Schipper and Smith (1983).

Table 1 lists the various tests according to the null hypothesis for which they can be used. Table 2 lists them by name and presents strengths and weaknesses compiled from Kolari and Pynnönen (2011).

Table 1: Tests by use

$H_0: E(AR) = 0$ (single instance)
  Parametric: T Test
  Nonparametric: Permutation Test

$H_0: E(AAR) = 0$ (multiple instances)
  Parametric: Cross-Sectional Test, Time-Series Standard Deviation Test, Patell Test, Adjusted Patell Test, Standardized Cross-Sectional Test, Adjusted Standardized Cross-Sectional Test, and Skewness-Corrected Test
  Nonparametric: Generalized Sign Test, Generalized Rank T Test, Generalized Rank Z Test, and Wilcoxon Test

$H_0: E(CAR) = 0$ (single instance)
  Parametric: T Test
  Nonparametric: Permutation Test

$H_0: E(CAAR) = 0$ (multiple instances)
  Parametric: Cross-Sectional Test, Time-Series Standard Deviation Test, Patell Test, Adjusted Patell Test, Standardized Cross-Sectional Test, Adjusted Standardized Cross-Sectional Test, and Skewness-Corrected Test
  Nonparametric: Generalized Sign Test, Generalized Rank T Test, and Generalized Rank Z Test

 

Table 2: Tests by name (1-9 are parametric, 10-16 are non-parametric)
# Name Key Reference EST Abbreviation Strengths and Weaknesses
1 T Test    
  • Simplicity
  • Sensitive to cross-sectional  and event-induced volatility; also sensitive to deviations from normality
2 Cross-Sectional Test   CSect T  
3 Time-Series Standard Deviation Test   CDA T  
4 Patell Test Patell (1976) Patell Z
  • Robust against the way in which ARs are distributed across the (cumulated) event window
  • Sensitive to cross-sectional correlation and event-induced volatility
5 Adjusted Patell Test Kolari and Pynnönen (2010) Adjusted Patell Z
  • Same as Patell; accounts for cross-sectional correlation
6 Standardized Cross-Sectional Test Boehmer, Musumeci and Poulsen (1991) StdCSect Z
  • Robust against the way in which ARs are distributed across the (cumulated) event window. Accounts for event-induced volatility and serial correlation
  • Sensitive to cross-sectional correlation
7 Adjusted Standardized Cross-Section Test Kolari and Pynnönen (2010) Adjusted StdCSect Z
  • Accounts additionally for cross-correlation
8 Skewness Corrected Test Hall (1992) Skewness-Corrected T
  • Corrects the test statistics for potential skewness (in the return distribution)
9 Jackknife Test Giaccotto and Sfiridis (1996) Jackknife T  
10 Corrado Rank Test Corrado and Zivney (1992) Rank Z
  • Loses power for longer event windows (e.g., [-10,10])
11 Generalized Rank T Test Kolari and Pynnönen (2011) Generalized Rank T
  • Accounts for cross-sectional and serial correlation of returns, as well as for event-induced volatility
12 Generalized Rank Z Test Kolari and Pynnönen (2011) Generalized Rank Z
  • Less robust against the cross-sectional correlation of returns than Generalized Rank T
13 Sign Test Cowan (1992) Sign Z
  • Robust against skewness (in the return distribution)
  • Inferior performance for longer event windows
14 Cowan Generalized Sign Test Cowan (1992) Generalized Sign Z   
15 Wilcoxon signed-rank Test Wilcoxon (1945) Wilcoxon
  • Takes into account both the sign and the magnitude of ARs
16 Permutation Test Nguyen and Wolf (2023) Permutation
  • Robust against non-normality  of abnormal returns, unlike the T Test
  • Computationally more expensive

In describing the formulas for the test statistics and their (approx.) distributions under the null, which are used to compute p-values, we follow the order in Table 2.

Some Preliminaries

The estimation window is given by $\{T_0,\dots,T_1\}$ and thus has length $L_1 = T_1 - T_0 + 1$. The event window is given by $\{T_1+1,\dots,T_2\}$ and thus has length $L_2 = T_2 - T_1$. This convention implies that the estimation window ends immediately before the event window begins. We stick to this convention in all the formulas below for simplicity, but note that our methodology also allows for an arbitrary gap between the two windows, as specified by the user.

If the event window is of length one (that is, contains a single day only), we use the convention $T_1 + 1 = 0 = T_2$. More generally, it always holds that $T_1 + 1 \le 0 \le T_2$.

If multiple instances are considered, N denotes the number of instances.

For any given firm $i$, $S_{AR_i}$ denotes the sample standard deviation of the abnormal returns during the estimation window, given as the square root of the corresponding sample variance

$$S^2_{AR_i} = \frac{1}{M_i - K}\sum_{t=T_0}^{T_1} AR^2_{i,t}$$

Here, $M_i$ denotes the number of non-missing returns during the estimation window; for example, $M_i = T_1 - T_0 + 1$ in case of no missing observations. Furthermore, $K$ denotes the degrees of freedom (given by the number of free parameters) in the benchmark model used to compute the abnormal returns; for example, $K = 1$ for the constant-expected-return model, $K = 2$ for the market model, and $K = 4$ for the Fama-French three-factor model (which also contains a constant in addition to the three stochastic factors).
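As a small illustration of this estimator, the following Python sketch (assuming NumPy; the function name and inputs are ours, purely for illustration) computes $S^2_{AR_i}$ from estimation-window abnormal returns that may contain missing values:

```python
import numpy as np

def ar_sample_variance(ar_est, k):
    """Sample variance of estimation-window abnormal returns, using
    M_i non-missing observations and K model degrees of freedom."""
    ar_est = np.asarray(ar_est, dtype=float)
    ar_est = ar_est[~np.isnan(ar_est)]   # drop missing returns
    m = ar_est.size                      # M_i
    return np.sum(ar_est ** 2) / (m - k)

# Illustrative call: K = 2 corresponds to the market model
s2 = ar_sample_variance([0.01, -0.02, 0.005, np.nan, 0.01], k=2)
```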

Finally, $N(0,1)$ denotes the standard normal distribution and $t_k$ denotes the $t$-distribution with $k$ degrees of freedom.

 

Parametric Tests

[1] T Test

[1.1] Null hypothesis of interest: $H_0: E(AR_{i,0}) = 0$

Test statistic:

$$t = \frac{AR_{i,0}}{S_{AR_i}}$$

Approximate null distribution: $t \sim t_{M_i - K}$

[1.2] Null hypothesis of interest: $H_0: E(CAR_i) = 0$

Test statistic:

$$t = \frac{CAR_i}{S_{CAR_i}} \quad\text{with}\quad S^2_{CAR_i} = L_2\,S^2_{AR_i}$$

Approximate null distribution: $t \sim t_{M_i - K}$
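The single-instance test in [1.2] can be sketched as follows in Python (assuming NumPy and SciPy; all inputs are simulated for illustration, and $K = 2$ corresponds to the market model):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k = 2                                   # market model: K = 2
ar_est = rng.normal(0, 0.01, size=120)  # estimation-window ARs (simulated)
ar_evt = np.array([0.015, 0.02, 0.01])  # event-window ARs, L2 = 3

m = ar_est.size                         # M_i (no missing data here)
s2_ar = np.sum(ar_est ** 2) / (m - k)   # S^2_{AR_i}

car = ar_evt.sum()                      # CAR_i
s_car = np.sqrt(ar_evt.size * s2_ar)    # S_{CAR_i} = sqrt(L2 * S^2_{AR_i})

t = car / s_car
p = 2 * stats.t.sf(abs(t), df=m - k)    # two-sided p-value
```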

 

[2] Cross-Sectional Test (Abbr.: CSect T) 

[2.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$t = \sqrt{N}\,\frac{AAR_0}{S_{AAR,0}} \quad\text{with}\quad S^2_{AAR,0} = \frac{1}{N-1}\sum_{i=1}^{N}\left(AR_{i,0} - AAR_0\right)^2$$

Approximate null distribution: $t \sim t_{N-1}$

[2.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$t = \sqrt{N}\,\frac{CAAR}{S_{CAAR}} \quad\text{with}\quad S^2_{CAAR} = \frac{1}{N-1}\sum_{i=1}^{N}\left(CAR_i - CAAR\right)^2$$

Approximate null distribution: $t \sim t_{N-1}$
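The cross-sectional test for $H_0: E(CAAR) = 0$ is an ordinary one-sample t-test on the $CAR_i$; a compact Python sketch (assuming NumPy and SciPy; the function name and the example CARs are illustrative):

```python
import numpy as np
from scipy import stats

def cross_sectional_test(car):
    """CSect T for H0: E(CAAR) = 0, from per-instance CARs (sketch)."""
    car = np.asarray(car, dtype=float)
    n = car.size
    caar = car.mean()
    s = car.std(ddof=1)                      # S_CAAR
    t = np.sqrt(n) * caar / s
    p = 2 * stats.t.sf(abs(t), df=n - 1)     # two-sided p-value
    return t, p

t, p = cross_sectional_test([0.03, 0.01, 0.02, 0.04, 0.00])
```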

 

[3] Time-Series Standard Deviation or Crude Dependence Test (Abbr.: CDA T) 

[3.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$t = \sqrt{N}\,\frac{AAR_0}{S_{AAR}} \quad\text{with}\quad S^2_{AAR} = \frac{1}{M-1}\sum_{t=T_0}^{T_1}\left(AAR_t - \frac{1}{M}\sum_{t=T_0}^{T_1} AAR_t\right)^2$$

where $M$ denotes the number of non-missing $AAR_t$ during the estimation window.

Approximate null distribution: $t \sim t_{M-1}$

[3.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$t = \sqrt{N}\,\frac{CAAR}{S_{CAAR}} \quad\text{with}\quad S^2_{CAAR} = \frac{1}{M-1}\sum_{t=T_0}^{T_1}\left(CAAR_t - \frac{1}{M}\sum_{t=T_0}^{T_1} CAAR_t\right)^2$$

where $M$ denotes the number of non-missing $CAAR_t$ during the estimation window.

Approximate null distribution: $t \sim t_{M-1}$
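The CDA test in [3.1] replaces the cross-sectional standard deviation with a time-series estimate from the estimation window. A Python sketch (assuming NumPy and SciPy; function name and inputs illustrative):

```python
import numpy as np
from scipy import stats

def cda_test(aar_est, aar_event_day, n):
    """Time-series (CDA) test for H0: E(AAR_0) = 0 (sketch).

    aar_est: AAR_t over the estimation window (may contain NaN);
    aar_event_day: AAR_0; n: number of instances."""
    aar_est = np.asarray(aar_est, dtype=float)
    aar_est = aar_est[~np.isnan(aar_est)]
    m = aar_est.size                      # M: non-missing AAR_t
    s_aar = aar_est.std(ddof=1)           # time-series standard deviation
    t = np.sqrt(n) * aar_event_day / s_aar
    p = 2 * stats.t.sf(abs(t), df=m - 1)  # two-sided p-value
    return t, p

t, p = cda_test([0.01, -0.01, 0.00, 0.02, -0.02], 0.01, n=4)
```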

 

[4] Patell or Standardized Residual Test (Abbr.: Patell Z) 

[4.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$z = \frac{ASAR_0}{S_{ASAR}}$$

The underlying idea is to standardize each $AR_{i,t}$ by the so-called forecast-error-corrected standard deviation before calculating the test statistic; for example, for the market model,

$$SAR_{i,0} = \frac{AR_{i,0}}{S_{AR_i,0}} \quad\text{with}\quad S^2_{AR_i,0} = S^2_{AR_i}\left(1 + \frac{1}{M_i} + \frac{\left(R_{m,0} - \bar R_m\right)^2}{\sum_{t=T_0}^{T_1}\left(R_{m,t} - \bar R_m\right)^2}\right) \quad\text{and}\quad \bar R_m = \frac{1}{L_1}\sum_{t=T_0}^{T_1} R_{m,t}$$

where $R_{m,t}$ denotes the market return on day $t$. (The standardization is analogous for any other day $t$ in the event window.)

Then compute

$$ASAR_0 = \sum_{i=1}^{N} SAR_{i,0}$$

Under the null, this statistic has expectation zero and variance

$$S^2_{ASAR} = \sum_{i=1}^{N}\frac{M_i - 2}{M_i - 4}$$

Approximate null distribution: $z \sim N(0,1)$

[4.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$z = \frac{1}{\sqrt{N}}\sum_{i=1}^{N}\frac{CSAR_i}{S_{CSAR_i}}$$

where $CSAR_i$ denotes the cumulative standardized abnormal return of firm $i$:

$$CSAR_i = \sum_{t=T_1+1}^{T_2} SAR_{i,t}$$

which under the null has expectation zero and variance

$$S^2_{CSAR_i} = L_2\,\frac{M_i - 2}{M_i - 4}$$

Approximate null distribution: $z \sim N(0,1)$
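Given the per-firm standardized abnormal returns on the event day, the Patell statistic in [4.1] reduces to a short computation. A Python sketch (assuming NumPy and SciPy; the function name and all inputs are illustrative):

```python
import numpy as np
from scipy import stats

def patell_z_aar(sar_event_day, m):
    """Patell Z for H0: E(AAR_0) = 0 (sketch).

    sar_event_day: standardized abnormal returns SAR_{i,0}, one per firm;
    m: the M_i (non-missing estimation-window returns per firm)."""
    sar = np.asarray(sar_event_day, dtype=float)
    m = np.asarray(m, dtype=float)
    asar = sar.sum()                       # ASAR_0
    s2_asar = np.sum((m - 2) / (m - 4))    # variance under the null
    z = asar / np.sqrt(s2_asar)
    p = 2 * stats.norm.sf(abs(z))          # two-sided p-value
    return z, p

z, p = patell_z_aar([1.2, 0.8, 1.5, -0.2], m=[120, 120, 118, 121])
```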

 

[5] Kolari and Pynnönen adjusted Patell or Standardized Residual Test (Abbr.: Adjusted Patell Z) 

[5.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$z_{adj} = z\,\sqrt{\frac{1 - \bar r}{1 + (N-1)\,\bar r}}$$

where $z$ is defined as in [4.1] and $\bar r$ denotes the average of the (pairwise) sample cross-correlations of the estimation-period abnormal returns.

Approximate null distribution: $z_{adj} \sim N(0,1)$

[5.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$z_{adj} = z\,\sqrt{\frac{1 - \bar r}{1 + (N-1)\,\bar r}}$$

where $z$ is defined as in [4.2] and $\bar r$ denotes the average of the (pairwise) sample cross-correlations of the estimation-period abnormal returns.

Approximate null distribution: $z_{adj} \sim N(0,1)$
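The adjustment itself is a one-liner. The following Python sketch (assuming NumPy; function name and inputs illustrative) applies it to a given Patell statistic; note that a positive average cross-correlation shrinks the statistic toward zero:

```python
import numpy as np

def kolari_pynnonen_adjust(z, n, r_bar):
    """Adjust a Patell (or BMP) statistic for the average pairwise
    cross-correlation r_bar among N firms' estimation-period
    abnormal returns (sketch)."""
    return z * np.sqrt((1 - r_bar) / (1 + (n - 1) * r_bar))

# Illustrative call: with r_bar > 0 the statistic is deflated
z_adj = kolari_pynnonen_adjust(2.5, n=50, r_bar=0.02)
```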

 

[6] Standardized Cross-Sectional or BMP Test (Abbr.: StdCSect T) 

[6.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$t = \frac{ASAR_0}{\sqrt{N}\,S_{ASAR,0}} \quad\text{with}\quad S^2_{ASAR,0} = \frac{1}{N-1}\sum_{i=1}^{N}\left(SAR_{i,0} - \frac{1}{N}\sum_{j=1}^{N} SAR_{j,0}\right)^2$$

with $SAR_{i,0}$ and $ASAR_0$ defined as in [4.1].

Approximate null distribution: $t \sim t_{N-1}$

[6.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$t = \sqrt{N}\,\frac{\overline{SCAR}}{S_{\overline{SCAR}}}$$

where

$$\overline{SCAR} = \frac{1}{N}\sum_{i=1}^{N} SCAR_i \quad\text{and}\quad S^2_{\overline{SCAR}} = \frac{1}{N-1}\sum_{i=1}^{N}\left(SCAR_i - \overline{SCAR}\right)^2$$

These statistics are based on

$$SCAR_i = \frac{CAR_i}{S_{CAR_i}}$$

where $S_{CAR_i}$ denotes the forecast-error-corrected standard deviation; for example, for the market model,

$$S^2_{CAR_i} = S^2_{AR_i}\left(L_2 + \frac{L_2}{M_i} + \frac{\sum_{t=T_1+1}^{T_2}\left(R_{m,t} - \bar R_m\right)^2}{\sum_{t=T_0}^{T_1}\left(R_{m,t} - \bar R_m\right)^2}\right)$$

Approximate null distribution: $t \sim t_{N-1}$
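Once the $SCAR_i$ are in hand, the BMP test in [6.2] reduces to an ordinary one-sample t-test on them. A Python sketch (assuming NumPy and SciPy; function name and inputs illustrative):

```python
import numpy as np
from scipy import stats

def bmp_test(scar):
    """Standardized cross-sectional (BMP) test for H0: E(CAAR) = 0,
    from per-firm standardized CARs, SCAR_i (sketch)."""
    scar = np.asarray(scar, dtype=float)
    n = scar.size
    mean = scar.mean()                    # mean of the SCAR_i
    s = scar.std(ddof=1)                  # cross-sectional std deviation
    t = np.sqrt(n) * mean / s
    p = 2 * stats.t.sf(abs(t), df=n - 1)  # two-sided p-value
    return t, p

t, p = bmp_test([1.0, 0.5, 1.5, 2.0, 0.0])
```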

 

[7] Kolari and Pynnönen Adjusted Standardized Cross-Sectional or BMP Test (Abbr.: Adjusted StdCSect T)

[7.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$t_{adj} = t\,\sqrt{\frac{1 - \bar r}{1 + (N-1)\,\bar r}}$$

where $t$ is defined as in [6.1] and $\bar r$ denotes the average of the (pairwise) sample cross-correlations of the estimation-period abnormal returns.

Approximate null distribution: $t_{adj} \sim t_{N-1}$

[7.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$t_{adj} = t\,\sqrt{\frac{1 - \bar r}{1 + (N-1)\,\bar r}}$$

where $t$ is defined as in [6.2] and $\bar r$ denotes the average of the (pairwise) sample cross-correlations of the estimation-period abnormal returns.

Approximate null distribution: $t_{adj} \sim t_{N-1}$

 

[8] Skewness-Corrected Test (Abbr.: Skewness-Corrected T) 

[8.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$t = \sqrt{N}\,\left(S + \frac{1}{3}\gamma S^2 + \frac{1}{27}\gamma^2 S^3 + \frac{1}{6N}\gamma\right)$$

As far as the ingredients are concerned, first recall the cross-sectional sample variance

$$S^2_{AAR,0} = \frac{1}{N-1}\sum_{i=1}^{N}\left(AR_{i,0} - AAR_0\right)^2$$

Next, the corresponding sample skewness is given by

$$\gamma = \frac{N}{(N-2)(N-1)}\sum_{i=1}^{N}\frac{\left(AR_{i,0} - AAR_0\right)^3}{S^3_{AAR,0}}$$

Finally, let

$$S = \frac{AAR_0}{S_{AAR,0}}$$

Approximate null distribution: $t \sim t_{N-1}$

[8.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$t = \sqrt{N}\,\left(S + \frac{1}{3}\gamma S^2 + \frac{1}{27}\gamma^2 S^3 + \frac{1}{6N}\gamma\right)$$

As far as the ingredients are concerned, first recall the cross-sectional sample variance

$$S^2_{CAAR} = \frac{1}{N-1}\sum_{i=1}^{N}\left(CAR_i - CAAR\right)^2$$

Next, the corresponding sample skewness is given by

$$\gamma = \frac{N}{(N-2)(N-1)}\sum_{i=1}^{N}\frac{\left(CAR_i - CAAR\right)^3}{S^3_{CAAR}}$$

Finally, let

$$S = \frac{CAAR}{S_{CAAR}}$$

Approximate null distribution: $t \sim t_{N-1}$
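The construction in [8.2] can be sketched in Python as follows (assuming NumPy and SciPy; function name and inputs illustrative). When the cross-section is symmetric ($\gamma = 0$), the statistic reduces to the ordinary cross-sectional t-statistic $\sqrt{N}\,S$:

```python
import numpy as np
from scipy import stats

def skewness_corrected_t(car):
    """Hall's skewness-corrected test for H0: E(CAAR) = 0 (sketch)."""
    car = np.asarray(car, dtype=float)
    n = car.size
    caar = car.mean()
    s_caar = car.std(ddof=1)
    # sample skewness gamma
    gamma = (n / ((n - 2) * (n - 1))) * np.sum((car - caar) ** 3) / s_caar ** 3
    s = caar / s_caar
    t = np.sqrt(n) * (s + gamma * s ** 2 / 3
                      + gamma ** 2 * s ** 3 / 27 + gamma / (6 * n))
    p = 2 * stats.t.sf(abs(t), df=n - 1)  # two-sided p-value
    return t, p

# Symmetric toy cross-section, so gamma = 0 here
t, p = skewness_corrected_t([0.01, 0.02, 0.03])
```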

 

[9] Jackknife Test (Abbr.: Jackknife T) 

This test will be added in a future version.

 

Nonparametric Tests

[10] Corrado Rank Test (Abbr.: Rank Z) 

[10.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$z = \frac{\bar K_0 - 0.5}{S_{\bar K}}$$

Start by computing, for any $i$, a vector of `scaled' ranks based on the combined sample $\{AR_{i,t}\}_{t=T_0}^{T_2}$:

$$K_{i,t} = \frac{\operatorname{rank}(AR_{i,t})}{1 + M_i + L_{2,i}}$$

where $L_{2,i}$ denotes the number of non-missing $AR_{i,t}$ during the event window.

Then, for any $t$, denote the number of non-missing $K_{i,t}$ by $N_t$ and define

$$\bar K_t = \frac{1}{N_t}\sum_{i=1}^{N} K_{i,t} \quad\text{and}\quad S^2_{\bar K} = \frac{1}{L_1 + L_2}\sum_{t=T_0}^{T_2}\left(\bar K_t - 0.5\right)^2$$

Approximate null distribution: $z \sim N(0,1)$

[10.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$z = \sqrt{L_2}\,\left(\frac{\bar K_{T_1+1,T_2} - 0.5}{S_{\bar K}}\right) \quad\text{with}\quad \bar K_{T_1+1,T_2} = \frac{1}{L_2}\sum_{t=T_1+1}^{T_2}\bar K_t$$

Approximate null distribution: $z \sim N(0,1)$
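The rank construction in [10.1] can be sketched in Python (assuming NumPy and SciPy; function name and inputs illustrative; no missing data, with the event window occupying the last `l2` columns and a single-day event window in the example call):

```python
import numpy as np
from scipy import stats

def corrado_rank_z(ar, l2):
    """Corrado-Zivney rank test for H0: E(AAR_0) = 0 (sketch).

    ar: (N x (L1 + L2)) abnormal returns, estimation window followed
    by event window; no missing data assumed."""
    ar = np.asarray(ar, dtype=float)
    m = ar.shape[1] - l2                            # M_i = L1 here
    k = stats.rankdata(ar, axis=1) / (1 + m + l2)   # scaled ranks K_{i,t}
    k_bar = k.mean(axis=0)                          # K-bar_t
    s_k = np.sqrt(np.mean((k_bar - 0.5) ** 2))      # S_{K-bar}
    z = (k_bar[-1] - 0.5) / s_k                     # event day = last column
    p = 2 * stats.norm.sf(abs(z))                   # two-sided p-value
    return z, p

z, p = corrado_rank_z([[0.01, -0.02, 0.03, 0.05]], l2=1)
```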

 

[11] Generalized Rank T Test (Abbr.: Generalized Rank T) 

[11.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$t = Z\,\sqrt{\frac{L_1 - 1}{L_1 - Z^2}} \quad\text{with}\quad Z = \frac{\bar U_{L_1+1}}{S_{\bar U}}$$

Arguably, this is the most complicated test statistic of them all, so it takes a while to describe its construction. For simplicity, we assume no missing data anywhere.

For any $t$ during the estimation window, let $SAR_{i,t} = AR_{i,t}/S_{AR_i}$ and then compute $SAR_{i,0}$ as described in [4.1]. Next, use cross-sectional standardization to compute

$$SAR^*_{i,0} = \frac{SAR_{i,0}}{S_{SAR_0}} \quad\text{with}\quad S^2_{SAR_0} = \frac{1}{N-1}\sum_{i=1}^{N}\left(SAR_{i,0} - \overline{SAR}_0\right)^2 \quad\text{and}\quad \overline{SAR}_0 = \frac{1}{N}\sum_{i=1}^{N} SAR_{i,0}$$

This, for any $i$, gives a time series of length $L_1 + 1$:

$$\{GSAR_{i,1},\dots,GSAR_{i,L_1},GSAR_{i,L_1+1}\} = \{SAR_{i,T_0},\dots,SAR_{i,T_1},SAR^*_{i,0}\}$$

Next, for any $i$, let

$$U_{i,t} = \frac{\operatorname{rank}(GSAR_{i,t})}{L_1 + 2} - 0.5$$

where the ranks are across $t \in \{1,\dots,L_1+1\}$.

Next, for any $t$, let

$$\bar U_t = \frac{1}{N}\sum_{i=1}^{N} U_{i,t}$$

and then let

$$S^2_{\bar U} = \frac{1}{L_1+1}\sum_{t=1}^{L_1+1}\bar U_t^2$$

noting that, necessarily, the average of the values $\{\bar U_t\}_{t=1}^{L_1+1}$ is zero.

Approximate null distribution: $t \sim t_{L_1-1}$

[11.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$t = Z\,\sqrt{\frac{L_1 - 1}{L_1 - Z^2}} \quad\text{with}\quad Z = \frac{\bar U_{L_1+1}}{S_{\bar U}}$$

The construction parallels the one in [11.1]; again, we assume no missing data anywhere for simplicity.

Compute $SCAR_i$ as described in [6.2]; then use cross-sectional standardization to compute

$$SCAR^*_i = \frac{SCAR_i}{S_{SCAR}} \quad\text{with}\quad S^2_{SCAR} = \frac{1}{N-1}\sum_{i=1}^{N}\left(SCAR_i - \overline{SCAR}\right)^2 \quad\text{and}\quad \overline{SCAR} = \frac{1}{N}\sum_{i=1}^{N} SCAR_i$$

This, for any $i$, gives a time series of length $L_1 + 1$:

$$\{GSAR_{i,1},\dots,GSAR_{i,L_1},GSAR_{i,L_1+1}\} = \{SAR_{i,T_0},\dots,SAR_{i,T_1},SCAR^*_i\}$$

Next, for any $i$, let

$$U_{i,t} = \frac{\operatorname{rank}(GSAR_{i,t})}{L_1 + 2} - 0.5$$

where the ranks are across $t \in \{1,\dots,L_1+1\}$.

Next, for any $t$, let

$$\bar U_t = \frac{1}{N}\sum_{i=1}^{N} U_{i,t}$$

and then let

$$S^2_{\bar U} = \frac{1}{L_1+1}\sum_{t=1}^{L_1+1}\bar U_t^2$$

noting that, necessarily, the average of the values $\{\bar U_t\}_{t=1}^{L_1+1}$ is zero.

Approximate null distribution: $t \sim t_{L_1-1}$
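Once the $GSAR$ series have been assembled, the remaining steps are mechanical. A Python sketch of this final stage (assuming NumPy and SciPy; function name and inputs illustrative, with simulated data under the null):

```python
import numpy as np
from scipy import stats

def grank_t(gsar):
    """Final stage of the generalized rank (GRANK) T test (sketch).

    gsar: (N x (L1 + 1)) array -- estimation-window SARs followed by
    the re-standardized event-window quantity; no missing data."""
    gsar = np.asarray(gsar, dtype=float)
    l1 = gsar.shape[1] - 1
    u = stats.rankdata(gsar, axis=1) / (l1 + 2) - 0.5   # U_{i,t}
    u_bar = u.mean(axis=0)                              # U-bar_t
    s2_u = np.mean(u_bar ** 2)                          # S^2_{U-bar}
    z = u_bar[-1] / np.sqrt(s2_u)
    t = z * np.sqrt((l1 - 1) / (l1 - z ** 2))
    p = 2 * stats.t.sf(abs(t), df=l1 - 1)               # two-sided p-value
    return t, p

# Simulated GSAR series for N = 10 firms and L1 = 60 estimation days
rng = np.random.default_rng(1)
t, p = grank_t(rng.normal(size=(10, 61)))
```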

 

[12] Generalized Rank Z Test (Abbr.: Generalized Rank Z) 

[12.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$z = \frac{\bar U_{L_1+1}}{S_{\bar U_{L_1+1}}} \quad\text{with}\quad S^2_{\bar U_{L_1+1}} = \frac{L_1}{12\,N\,(L_1+2)}$$

where the ingredients are defined as in [11.1].

Approximate null distribution: $z \sim N(0,1)$

[12.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic: 

$$z = \frac{\bar U_{L_1+1}}{S_{\bar U_{L_1+1}}} \quad\text{with}\quad S^2_{\bar U_{L_1+1}} = \frac{L_1}{12\,N\,(L_1+2)}$$

where the ingredients are defined as in [11.2].

Approximate null distribution: $z \sim N(0,1)$

 

[13] Sign Test (Abbr.: Sign Z) 

[13.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$z = \frac{w - N \cdot 0.5}{\sqrt{N \cdot 0.5 \cdot 0.5}}$$

where $w$ is the number of the $AR_{i,0}$ that are positive.

Approximate null distribution: $z \sim N(0,1)$

[13.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$z = \frac{w - N \cdot 0.5}{\sqrt{N \cdot 0.5 \cdot 0.5}}$$

where $w$ is the number of the $CAR_i$ during the event window that are positive.

Approximate null distribution: $z \sim N(0,1)$

 

[14] Generalized Sign Test (Abbr.: Generalized Sign Z) 

[14.1] Null hypothesis of interest: $H_0: E(AAR_0) = 0$

Test statistic:

$$z = \frac{w - N\hat p}{\sqrt{N\hat p\,(1-\hat p)}}$$

where $w$ is the number of the $AR_{i,0}$ that are positive and $\hat p$ is the fraction of the $AR_{i,t}$ during the estimation window (across both $i$ and $t$) that are positive.

Approximate null distribution: $z \sim N(0,1)$

[14.2] Null hypothesis of interest: $H_0: E(CAAR) = 0$

Test statistic:

$$z = \frac{w - N\hat p}{\sqrt{N\hat p\,(1-\hat p)}}$$

where $w$ is the number of the $CAR_i$ during the event window that are positive and $\hat p$ is the fraction of the $AR_{i,t}$ during the estimation window (across both $i$ and $t$) that are positive.

Approximate null distribution: $z \sim N(0,1)$
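The generalized sign statistic is a simple binomial calculation; with $\hat p = 0.5$ it reduces to the ordinary sign test of [13]. A Python sketch (assuming NumPy and SciPy; function name and inputs illustrative):

```python
import numpy as np
from scipy import stats

def generalized_sign_z(w, n, p_hat):
    """Generalized sign test (sketch): w positive CARs out of n,
    with p_hat the estimation-window fraction of positive ARs."""
    z = (w - n * p_hat) / np.sqrt(n * p_hat * (1 - p_hat))
    p = 2 * stats.norm.sf(abs(z))   # two-sided p-value
    return z, p

# Illustrative call: 32 of 50 CARs positive, baseline fraction 0.48
z, p = generalized_sign_z(32, 50, 0.48)
```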

 

[15] Wilcoxon Test (Abbr.: Wilcoxon)

[15.1] Null hypothesis of interest: H0:E(AAR0)=0

The Wilcoxon test is a nonparametric test based on the ranks of the $AR_{i,0}$ across $i$. The (exact) distribution of the test statistic under the null, upon which we base the p-value, is nonstandard and we refer the user to the original paper of Wilcoxon (1945) or any suitable textbook for the details.

[15.2] Null hypothesis of interest: H0:E(CAAR)=0

The Wilcoxon test is not available for this null hypothesis.

 

[16] Permutation Test (Abbr.: Permutation) 

[16.1] Null hypothesis of interest: $H_0: E(AR_{i,0}) = 0$

The permutation test is a non-parametric test that computes the p-value in a data-dependent (or resampling-based) fashion. We refer the user to Nguyen and Wolf (2023) for the details.

[16.2] Null hypothesis of interest: $H_0: E(CAR_i) = 0$

The permutation test is a non-parametric test that computes the p-value in a data-dependent (or resampling-based) fashion. We refer the user to Nguyen and Wolf (2023) for the details.
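To convey the general flavor, the following Python sketch implements a generic exchangeability-based permutation p-value for the single-instance null; this is our own illustrative variant, not necessarily the exact procedure of Nguyen and Wolf (2023):

```python
import numpy as np

def permutation_pvalue(ar_est, ar_event):
    """Generic permutation p-value for H0: E(AR_{i,0}) = 0 (sketch).

    Under the null, the event-day AR is treated as exchangeable with
    the estimation-window ARs; the two-sided p-value is the fraction
    of all days (event day included) at least as extreme in magnitude."""
    pool = np.abs(np.append(np.asarray(ar_est, dtype=float), ar_event))
    return float(np.mean(pool >= abs(ar_event)))

# Illustrative call: event-day AR of 0.03 against 4 estimation-window ARs
p = permutation_pvalue([0.01, -0.02, 0.005, -0.01], 0.03)
```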